In web development, HTML and CSS define a page's content and presentation. In some cases, such as storing user-submitted data or displaying it as plain text, you need to remove HTML tags and CSS styles to keep the content clean and secure. PHP offers several ways to do this, and this article walks through them.
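PHP's built-in strip_tags() function is the most common way to remove HTML tags.
The strip_tags() function takes a string and, by default, removes all HTML and PHP tags as well as HTML comments. An optional second argument lists the tags to keep, either as a string such as '<b><i>' or, since PHP 7.4, as an array of tag names. Attributes on the kept tags are left untouched.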
// Remove all HTML tags
$str = '<p>This is <b>bold</b> and this is <i>italic</i></p>';
echo strip_tags($str); // Output: This is bold and this is italic
// Keep <b> and <i> tags
$str = '<p>This is <b>bold</b> and this is <i>italic</i></p>';
echo strip_tags($str, '<b><i>'); // Output: This is <b>bold</b> and this is <i>italic</i>
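One caveat is worth illustrating: strip_tags() keeps every attribute on a tag that is on the allow-list, so a kept tag can still carry inline styles or event handlers. The following is a minimal sketch of that behavior, using a made-up input string and the array form of the second argument (which assumes PHP 7.4 or later).

```php
<?php
// Hypothetical input: an allowed <b> tag carrying an event-handler attribute,
// plus an HTML comment.
$html = '<p>Hello <b onclick="alert(1)">world</b><!-- note --></p>';

// Array form of the allow-list (PHP 7.4+); equivalent to passing '<b>'.
$kept = strip_tags($html, ['b']);

// The <p> tag and the comment are removed, but the onclick attribute survives.
echo $kept, PHP_EOL; // Output: Hello <b onclick="alert(1)">world</b>
```

For this reason, strip_tags() with an allow-list is probably best treated as a formatting aid rather than a security filter for untrusted input.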
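The htmlspecialchars() function does not remove tags. Instead, it converts special characters (&, <, > and, depending on the flags, quotes) to HTML entities, so browsers display them as text rather than interpreting them as markup. This helps prevent XSS attacks when outputting untrusted data.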
$str = 'This is <b>bold</b> and this is <i>italic</i>';
echo htmlspecialchars($str); // Output: This is &lt;b&gt;bold&lt;/b&gt; and this is &lt;i&gt;italic&lt;/i&gt;
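The default flags of htmlspecialchars() have changed between PHP versions, so it can help to pass the flags and encoding explicitly. The sketch below wraps the call in a small helper; the name escape_html() is hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical helper name; escapes text for safe output inside HTML.
function escape_html(string $text): string
{
    // ENT_QUOTES escapes both double and single quotes; ENT_SUBSTITUTE replaces
    // invalid byte sequences instead of returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo escape_html('<a href="x">Tom & "Jerry"</a>'), PHP_EOL;
// Output: &lt;a href=&quot;x&quot;&gt;Tom &amp; &quot;Jerry&quot;&lt;/a&gt;
```

Escaping both quote types matters mainly when the value ends up inside an HTML attribute.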
Common methods to remove CSS styles include using regular expressions or third-party libraries.
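You can use a regular expression to match and remove style attributes from HTML tags. The simple pattern shown here only handles lowercase, double-quoted style attributes and does not touch <style> elements, so it suits markup whose shape you already know.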
// Remove style attribute using regex
$str = '<p style="color: red; font-size: 12px;">This is a paragraph</p>';
$str = preg_replace('/ style="[^"]*"/', '', $str);
echo $str; // Output: <p>This is a paragraph</p>
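For markup that a simple pattern may miss, such as single-quoted attributes or <style> blocks, parsing the HTML is usually more robust than pattern matching. The following is a minimal sketch using PHP's DOM extension, assuming it is enabled; the function name remove_inline_styles() is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): strips <style> elements and
// style attributes from an HTML fragment using the DOM extension.
function remove_inline_styles(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a full document; the meta tag hints that the input is UTF-8.
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>'
        . '<body>' . $html . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Copy the live node list to an array before removing nodes from it.
    foreach (iterator_to_array($doc->getElementsByTagName('style')) as $styleElement) {
        $styleElement->parentNode->removeChild($styleElement);
    }

    // Drop the style attribute wherever it occurs.
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//*[@style]') as $element) {
        $element->removeAttribute('style');
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $result = '';
    foreach ($body->childNodes as $child) {
        $result .= $doc->saveHTML($child);
    }
    return $result;
}

echo remove_inline_styles('<p style="color: red; font-size: 12px;">This is a paragraph</p>'), PHP_EOL;
```

Note that the parser may normalize the markup it is given, so the result is not guaranteed to be byte-for-byte identical to the input apart from the removed styles.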
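HTMLPurifier is a third-party library that filters HTML against an allow-list, removing malicious code and producing standards-compliant markup. Its default configuration keeps inline CSS that it considers safe, so to strip style attributes you need to forbid them explicitly.
require_once '/path/to/library/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
// Forbid the style attribute on every element; the default config would keep validated CSS
$config->set('HTML.ForbiddenAttributes', ['*@style']);
$purifier = new HTMLPurifier($config);
$str = '<p style="color: red; font-size: 12px;">This is a paragraph</p>';
echo $purifier->purify($str); // Output: <p>This is a paragraph</p>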
In PHP, HTML tags are usually removed with strip_tags(), or escaped with htmlspecialchars() when the markup should be shown as text. CSS styles can be removed with regular expressions or a third-party library such as HTMLPurifier. Choosing the method that matches your input and your security requirements improves both data safety and display quality.