
htmlspecialchars

Convert special characters to HTML entities
Name: htmlspecialchars
Category: String
Programming Language: PHP
One-line Description: Convert some predefined characters to HTML entities.

Definition and usage

The htmlspecialchars() function converts predefined characters into HTML entities.

The predefined characters are:

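  • & (ampersand) becomes &amp;
  • " (double quote) becomes &quot; (unless ENT_NOQUOTES is set)
  • ' (single quote) becomes &#039; (for ENT_HTML401) or &apos; (for ENT_XML1, ENT_XHTML, or ENT_HTML5), but only when ENT_QUOTES is set
  • < (less than) becomes &lt;
  • > (greater than) becomes &gt;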
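
As an illustration of the conversions listed above, here is a minimal sketch. The sample input string is hypothetical, and ENT_QUOTES is passed explicitly so that the single quote is encoded regardless of the PHP version's default flags.

```php
<?php
// Hypothetical sample input containing all five predefined characters.
$input = '<a href="page.php?a=1&b=2">Tom\'s link</a>';

// ENT_QUOTES is passed explicitly so single quotes are encoded as well.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// Prints: &lt;a href=&quot;page.php?a=1&amp;b=2&quot;&gt;Tom&#039;s link&lt;/a&gt;
```

The escaped string can be placed in HTML text or inside a quoted attribute value without being parsed as markup.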

Tip: If you need to convert special HTML entities back to characters, please use the htmlspecialchars_decode() function.
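
A short round-trip sketch of the tip above, assuming the same flags are passed to both functions (the sample string is hypothetical):

```php
<?php
// Hypothetical sample text.
$original = 'Fish & "Chips" <b>';

// Encode, then decode with matching flags.
$encoded = htmlspecialchars($original, ENT_QUOTES, 'UTF-8');
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);

echo $encoded, PHP_EOL; // Fish &amp; &quot;Chips&quot; &lt;b&gt;
var_dump($decoded === $original); // bool(true)
```

Using different quote flags for encoding and decoding may leave some quote entities undecoded.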

Syntax

htmlspecialchars(string, flags, character-set, double_encode)

Parameters

string

Required. Specifies the string to be converted.
flags

Optional. Specifies how to deal with quotes, invalid encodings, and which document type to use.

Available quote types:

  • ENT_COMPAT - Encodes double quotes only. Default before PHP 8.1.
  • ENT_QUOTES - Encodes both double and single quotes. Part of the default flags since PHP 8.1.
  • ENT_NOQUOTES - Encodes neither double nor single quotes.
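
The three quote flags can be compared side by side. This is a minimal sketch with a hypothetical sample string; the flags are always passed explicitly, so the result does not depend on the PHP version's default.

```php
<?php
// Hypothetical sample containing both quote styles.
$text = 'He said "hi" & it\'s <ok>';

$compat   = htmlspecialchars($text, ENT_COMPAT, 'UTF-8');
$quotes   = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
$noquotes = htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8');

echo $compat, PHP_EOL;   // He said &quot;hi&quot; &amp; it's &lt;ok&gt;
echo $quotes, PHP_EOL;   // He said &quot;hi&quot; &amp; it&#039;s &lt;ok&gt;
echo $noquotes, PHP_EOL; // He said "hi" &amp; it's &lt;ok&gt;
```

ENT_QUOTES is the usual choice when the output may end up inside a single-quoted HTML attribute.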

Invalid encoding handling:

  • ENT_IGNORE - Silently discards invalid code unit sequences instead of returning an empty string. Avoid this flag, as it may have security implications.
  • ENT_SUBSTITUTE - Replaces invalid code unit sequences with the Unicode replacement character U+FFFD (UTF-8) or &#xFFFD; (other character sets) instead of returning an empty string.
  • ENT_DISALLOWED - Replaces code points that are invalid for the given document type with the Unicode replacement character U+FFFD (UTF-8) or &#xFFFD; (other character sets) instead of leaving them as is.
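
To make the difference between these flags concrete, the sketch below feeds a string that is not valid UTF-8 to the function. The input is a hypothetical example: the byte \xE9 is "é" in ISO-8859-1 but an incomplete sequence in UTF-8.

```php
<?php
// Hypothetical input: \xE9 followed by a space is not a valid UTF-8 sequence.
$bad = "caf\xE9 <b>";

// Without ENT_IGNORE or ENT_SUBSTITUTE the whole result is an empty string.
$strict = htmlspecialchars($bad, ENT_COMPAT, 'UTF-8');

// ENT_SUBSTITUTE replaces the invalid byte with U+FFFD (bytes EF BF BD in UTF-8).
$substituted = htmlspecialchars($bad, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

// ENT_IGNORE drops the invalid byte silently (discouraged).
$ignored = htmlspecialchars($bad, ENT_COMPAT | ENT_IGNORE, 'UTF-8');

var_dump($strict);                                       // string(0) ""
var_dump($substituted === "caf\xEF\xBF\xBD &lt;b&gt;");  // bool(true)
var_dump($ignored === "caf &lt;b&gt;");                  // bool(true)
```

ENT_SUBSTITUTE keeps the rest of the text visible and marks the damaged spot, which is generally safer than dropping bytes with ENT_IGNORE.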

Additional flags for the document type used:

  • ENT_HTML401 - Default. Process code as HTML 4.01.
  • ENT_HTML5 - Process code as HTML 5.
  • ENT_XML1 - Process code as XML 1.
  • ENT_XHTML - Process code as XHTML.
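
For htmlspecialchars(), the most visible effect of the document type flag is how the single quote is encoded. A minimal sketch comparing ENT_HTML401 and ENT_HTML5, with a hypothetical sample string:

```php
<?php
// Hypothetical sample containing a single quote.
$text = "It's";

// HTML 4.01 has no named entity for the apostrophe, so a numeric one is used.
$html401 = htmlspecialchars($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// HTML 5 uses the named entity &apos;.
$html5 = htmlspecialchars($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $html401, PHP_EOL; // It&#039;s
echo $html5, PHP_EOL;   // It&apos;s
```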
character-set

Optional. A string that specifies the character set to be used.

Allowed values:

  • UTF-8 - Default. ASCII-compatible multi-byte 8-bit Unicode
  • ISO-8859-1 - Western European
  • ISO-8859-15 - Western European; adds the Euro sign and the French and Finnish letters missing from ISO-8859-1
  • cp866 - DOS-specific Cyrillic character set
  • cp1251 - Windows-specific Cyrillic character set
  • cp1252 - Windows-specific Western European character set
  • KOI8-R - Russian
  • BIG5 - Traditional Chinese, mainly used in Taiwan
  • GB2312 - Simplified Chinese, national standard character set
  • BIG5-HKSCS - Big5 with Hong Kong expansion
  • Shift_JIS - Japanese
  • EUC-JP - Japanese
  • MacRoman - Character set that was used by Mac OS

Note: Unrecognized character sets are ignored. Before PHP 5.4 they are replaced by ISO-8859-1; since PHP 5.4 they are replaced by UTF-8.
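
The character-set argument should match the actual encoding of the input bytes. The following sketch uses two hypothetical byte strings that represent the same text in ISO-8859-1 and in UTF-8:

```php
<?php
// The same hypothetical text in two encodings.
$latin1 = "caf\xE9 & <b>";     // "é" as one ISO-8859-1 byte
$utf8   = "caf\xC3\xA9 & <b>"; // "é" as two UTF-8 bytes

// Each call declares the encoding that the input really uses.
$fromLatin1 = htmlspecialchars($latin1, ENT_QUOTES, 'ISO-8859-1');
$fromUtf8   = htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8');

var_dump($fromLatin1 === "caf\xE9 &amp; &lt;b&gt;");     // bool(true)
var_dump($fromUtf8 === "caf\xC3\xA9 &amp; &lt;b&gt;");   // bool(true)
```

In both cases the non-ASCII bytes pass through unchanged; only the predefined characters are converted.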

double_encode

Optional. A boolean that specifies whether existing HTML entities are encoded again.

  • TRUE - Default. Everything is converted, including the & of existing entities.
  • FALSE - Existing HTML entities are left as they are.
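
The effect of double_encode shows up when the input already contains entities. A minimal sketch with a hypothetical, partly encoded string:

```php
<?php
// Hypothetical input that is already partly encoded.
$text = 'Tom &amp; Jerry & friends &lt;3';

// Default (true): the & of existing entities is encoded again.
$twice = htmlspecialchars($text, ENT_QUOTES, 'UTF-8', true);

// false: existing entities are kept, only the bare & is encoded.
$once = htmlspecialchars($text, ENT_QUOTES, 'UTF-8', false);

echo $twice, PHP_EOL; // Tom &amp;amp; Jerry &amp; friends &amp;lt;3
echo $once, PHP_EOL;  // Tom &amp; Jerry &amp; friends &lt;3
```

Passing false is convenient for text that may already be escaped, but the default is the safer choice when the input is raw user data.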