Definition and usage
The htmlspecialchars() function converts predefined characters into HTML entities.
The predefined characters are:
- & (ampersand) becomes &amp;
- " (double quote) becomes &quot;
- ' (single quote) becomes &#039;
- < (less than) becomes &lt;
- > (greater than) becomes &gt;
Tip: To convert special HTML entities back to characters, use the htmlspecialchars_decode() function.
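The conversion and its inverse can be sketched as follows. This is a minimal illustration, not a complete escaping strategy; the input string and the variable names ($raw, $escaped, $restored) are made up for the example, and the flags and character set are passed explicitly so the result does not depend on the defaults of the installed PHP version.

```php
<?php
// Hypothetical input chosen for illustration.
$raw = 'Tom & "Jerry" <b>cartoon</b>';

// Convert the predefined characters into HTML entities.
$escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
echo $escaped, PHP_EOL;
// Tom &amp; &quot;Jerry&quot; &lt;b&gt;cartoon&lt;/b&gt;

// htmlspecialchars_decode() reverses the conversion.
$restored = htmlspecialchars_decode($escaped, ENT_QUOTES);
var_dump($restored === $raw); // bool(true)
```

Escaping at output time, just before the value is written into the page, keeps the stored data unchanged.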
Syntax
htmlspecialchars(string, flags, character-set, double_encode)
Parameter |
Description |
string
|
Required. Specifies the string to be converted. |
flags
|
Optional. Specifies how to handle quotes and invalid encodings, and which document type to use.
Available quote types:
- ENT_COMPAT - Encodes double quotes only. Default before PHP 8.1.
- ENT_QUOTES - Encodes both double and single quotes. Part of the default (ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401) since PHP 8.1.
- ENT_NOQUOTES - Encodes neither double nor single quotes.
Invalid encoding:
- ENT_IGNORE - Silently discards invalid encodings instead of having the function return an empty string. Avoid it, as it may have security implications.
- ENT_SUBSTITUTE - Replaces invalid encodings with the Unicode replacement character U+FFFD (UTF-8) or &#xFFFD; (other character sets) instead of returning an empty string.
- ENT_DISALLOWED - Replaces code points that are invalid in the specified document type with the Unicode replacement character U+FFFD (UTF-8) or &#xFFFD; (other character sets).
Additional flags for the document type used:
- ENT_HTML401 - Default. Handles the input as HTML 4.01.
- ENT_HTML5 - Handles the input as HTML 5.
- ENT_XML1 - Handles the input as XML 1.
- ENT_XHTML - Handles the input as XHTML.
|
character-set
|
Optional. A string that specifies the character set to be used.
Allowed values:
- UTF-8 - Default. ASCII-compatible multi-byte 8-bit Unicode
- ISO-8859-1 - Western Europe
- ISO-8859-15 - Western Europe (adds the Euro sign plus the French and Finnish letters missing from ISO-8859-1)
- cp866 - DOS-specific Cyrillic character set
- cp1251 - Windows-specific Cyrillic character set
- cp1252 - Windows-specific Western European character set
- KOI8-R - Russian
- BIG5 - Traditional Chinese, mainly used in Taiwan
- GB2312 - Simplified Chinese, national standard character set
- BIG5-HKSCS - Big5 with Hong Kong expansion
- Shift_JIS - Japanese
- EUC-JP - Japanese
- MacRoman - Character set used by the classic Mac operating system
Note: In versions prior to PHP 5.4, an unrecognized character set is ignored and ISO-8859-1 is used instead. Since PHP 5.4, an unrecognized character set is ignored and UTF-8 is used instead.
|
double_encode
|
Optional. A Boolean value that specifies whether to encode existing HTML entities.
- TRUE - Default. Converts everything, including existing entities.
- FALSE - Leaves existing HTML entities unencoded.
|
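The effect of the flags and double_encode parameters described in the table can be sketched as follows. The sample strings and variable names are hypothetical, and every call passes its flags and character set explicitly, because the default flags differ between PHP versions.

```php
<?php
// Hypothetical sample containing both quote types.
$s = 'He said "hi" & it\'s <ok>';

// Quote handling.
$compat   = htmlspecialchars($s, ENT_COMPAT, 'UTF-8');
// He said &quot;hi&quot; &amp; it's &lt;ok&gt;
$quotes   = htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
// He said &quot;hi&quot; &amp; it&#039;s &lt;ok&gt;
$noquotes = htmlspecialchars($s, ENT_NOQUOTES, 'UTF-8');
// He said "hi" &amp; it's &lt;ok&gt;

// With ENT_HTML5 the single quote is written as &apos;.
$html5 = htmlspecialchars("it's", ENT_QUOTES | ENT_HTML5, 'UTF-8');
// it&apos;s

// Invalid encoding: "\xE9" is "é" in ISO-8859-1 but not valid UTF-8.
$bad   = "caf\xE9";
$empty = htmlspecialchars($bad, ENT_COMPAT, 'UTF-8');
// empty string, because the input is not valid UTF-8
$subst = htmlspecialchars($bad, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');
// "caf" followed by the replacement character U+FFFD

// double_encode: whether an existing entity is encoded again.
$t      = 'Fish &amp; chips & peas';
$twice  = htmlspecialchars($t, ENT_QUOTES, 'UTF-8', true);
// Fish &amp;amp; chips &amp; peas
$single = htmlspecialchars($t, ENT_QUOTES, 'UTF-8', false);
// Fish &amp; chips &amp; peas

foreach ([$compat, $quotes, $noquotes, $html5, $subst, $twice, $single] as $line) {
    echo $line, PHP_EOL;
}
```

Setting double_encode to FALSE is mainly useful when the input may already be partly escaped; for raw user input the default of TRUE is usually the safer choice, since it guarantees that every ampersand in the output came from the escaping step.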