How to Use the strip_tags Function to Handle Strings with Nested HTML Tags and Avoid Tag Interference?

gitbox 2025-06-22

Basic Usage of strip_tags()

The strip_tags() function is used to remove all HTML and PHP tags from a string. Its basic syntax is as follows:

    strip_tags(string $string, array|string|null $allowed_tags = null): string

  • $string: The string to be processed.

  • $allowed_tags: An optional parameter specifying the tags to keep, given either as a string such as '<b><i>' or, as of PHP 7.4, as an array of tag names such as ['b', 'i']. If omitted, all tags are removed.

Example:

    $html = '<p>Hello <b>world</b>!</p>';
    $clean_text = strip_tags($html);
    echo $clean_text;  // Output: Hello world!

As shown above, strip_tags() removes all HTML tags by default. But when the string contains nested tags, how can we ensure tags are removed correctly without errors?


Tips for Handling Nested HTML Tags

When dealing with complex HTML structures, the behavior of strip_tags() requires special attention. Well-formed nesting, however deep, is not a problem in itself: strip_tags() does not parse or validate HTML, it simply scans for tag delimiters. Problems arise when the markup is malformed. An unclosed or broken tag can cause more text to be removed than expected, and because removed tags leave nothing behind, text from adjacent elements can run together in the result.
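Both failure modes are easy to reproduce. The sketch below uses made-up input strings. How much text a broken tag swallows depends on where the next > appears, if at all, so that case is checked only loosely. The last step shows one possible workaround, which is an assumption for illustration and not a feature of strip_tags().

```php
<?php
// Adjacent block elements: the tags vanish and nothing separates the text.
$blocks = '<p>One</p><p>Two</p>';
$joined = strip_tags($blocks);
echo $joined, PHP_EOL;  // Output: OneTwo

// An unclosed tag: strip_tags() treats what follows "<b" as part of the
// tag, so the trailing text is dropped along with it.
$broken = 'Price list <b unfinished and the rest of the sentence';
$truncated = strip_tags($broken);
var_dump(strlen($truncated) < strlen($broken));

// Illustrative workaround (an assumption, not built into strip_tags()):
// add a space before each tag so element boundaries survive as whitespace,
// then collapse repeated whitespace.
$spaced = trim(preg_replace('/\s+/', ' ', strip_tags(str_replace('<', ' <', $blocks))));
echo $spaced, PHP_EOL;  // Output: One Two
```

The workaround also inserts spaces around inline tags such as <b>, so it suits plain-text extraction better than output where exact spacing matters.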

1. Ensure Proper HTML Tag Structure

Nested HTML that is malformed (for example, an unclosed or mis-ordered tag) can make strip_tags() produce unexpected output. To avoid this, normalize the markup first: PHP's DOMDocument class can load the HTML, repair its structure, and serialize it again.

    $html = '<div><b>Hello <i>world</i></b>!</div>';
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);  // Collect parse warnings internally instead of emitting them
    $dom->loadHTML($html);
    $clean_html = $dom->saveHTML();  // Serialized with DOCTYPE, <html>, and <body> wrappers added
    $clean_text = trim(strip_tags($clean_html));  // Remove tags, then trim the newlines left by the wrappers
    echo $clean_text;  // Output: Hello world!

With DOMDocument, we can first load and fix the HTML code, then use strip_tags() to clean the tags.
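Once the markup is in a DOM tree, the text can also be read straight from the tree instead of being serialized and passed to strip_tags(). The following is a minimal sketch of that variation; the helper name html_to_text() is hypothetical, and the sample input is deliberately mis-nested to show the parser repairing it.

```php
<?php
// html_to_text() is a hypothetical helper name used for illustration.
function html_to_text(string $html): string
{
    if (trim($html) === '') {
        return '';  // loadHTML() rejects empty input
    }

    $previous = libxml_use_internal_errors(true);  // silence parse warnings
    $dom = new DOMDocument();
    // Note: loadHTML() assumes ISO-8859-1 unless the markup declares a charset.
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);  // restore the caller's setting

    $root = $dom->documentElement;
    return $root === null ? '' : trim($root->textContent);
}

// The <i> element is never closed; the parser repairs the tree on load.
echo html_to_text('<div><b>Hello <i>world</b>!</div>');  // Output: Hello world!
```

Reading textContent skips the serialization step, so no DOCTYPE or wrapper elements need to be stripped afterwards.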

2. Allow Certain Tags to Remain

If you want to keep only specific tags, list them in the second parameter. For example, to keep only <b> and <i> tags and remove all others:

    $html = '<p><b>Hello <i>world</i>!</b></p>';
    $clean_text = strip_tags($html, '<b><i>');
    echo $clean_text;  // Output: <b>Hello <i>world</i>!</b>

This way, strip_tags() removes every tag not in the allowed list and keeps only the <b> and <i> tags, preventing interference from other tags.
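Two details of the allowed list are worth seeing in action: as of PHP 7.4 the list may be passed as an array of tag names, and an allowed tag is kept exactly as written, attributes included. The sample markup and the onmouseover handler below are invented for illustration.

```php
<?php
$html = '<p><b onmouseover="steal()">Hello</b> <a href="#">world</a></p>';

// Array form of the allowed list (PHP 7.4+): tag names without angle brackets.
$from_array = strip_tags($html, ['b', 'i']);
// Equivalent string form.
$from_string = strip_tags($html, '<b><i>');

var_dump($from_array === $from_string);  // bool(true)
echo $from_array, PHP_EOL;  // Output: <b onmouseover="steal()">Hello</b> world
```

Because attributes on allowed tags pass through untouched, an allowed list alone should not be treated as protection against untrusted markup.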

3. Combine with Regular Expressions to Filter Extra Tags

Sometimes strip_tags() alone is not fine-grained enough, especially with complex HTML structures. In such cases, a regular expression can run as a second pass over the output to remove whatever tags remain, or other unwanted parts.

    $html = '<div><b>Hello <i>world</i></b>!</div>';
    $clean_text = strip_tags($html, '<b><i>');  // First remove every tag except <b> and <i>
    $clean_text = preg_replace('/<[^>]+>/', '', $clean_text);  // Then use a regex to remove the remaining tags
    echo $clean_text;  // Output: Hello world!

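This two-pass approach gives finer control over tag cleaning, because the pattern can be narrowed to target specific tags. Keep in mind that a regular expression does not parse HTML, so it is best applied to simple, predictable patterns.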
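A regex pass is particularly useful for something strip_tags() does not do: it removes the <script> and <style> tags themselves but leaves their contents in the output as text. The sketch below, with a hypothetical helper name strip_with_content(), removes those elements together with their contents before stripping the remaining tags. It assumes reasonably well-formed script and style blocks.

```php
<?php
// strip_with_content() is a hypothetical helper name used for illustration.
function strip_with_content(string $html): string
{
    // Drop <script> and <style> elements together with their contents.
    // "i" = case-insensitive, "s" = let "." match newlines.
    $pattern = '#<(script|style)\b[^>]*>.*?</\1\s*>#is';
    $without_blocks = preg_replace($pattern, '', $html) ?? $html;

    // Then remove whatever tags remain.
    return strip_tags($without_blocks);
}

echo strip_with_content('<div>Hi<script>alert(1)</script><style>p{}</style> there</div>');
// Output: Hi there
```

Without the regex pass, the same input would come out of strip_tags() with the script and style bodies still embedded in the text.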
