A character set maps each character to a specific code value. An encoding is the method of converting those code values into binary data. Common encodings, which MySQL and PHP also refer to as character sets, include UTF-8, GBK, and ISO-8859-1.
For multilingual websites, UTF-8 is usually preferred because it can represent nearly all written languages in the world and is widely supported. UTF-8 is a variable-length encoding of Unicode: each character takes one to four bytes, and plain ASCII text is already valid UTF-8.
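To make the variable-length property concrete, the following sketch prints how many bytes a few sample characters occupy in UTF-8. The sample characters and the `$samples` array are illustrative choices, not part of any API.

```php
<?php
// Byte length of a few characters when encoded as UTF-8.
// The "\u{...}" escapes (PHP 7.0+) always produce UTF-8 bytes,
// regardless of how this source file itself is saved.
$samples = [
    'A'       => 'A',          // U+0041, Latin letter
    'e-acute' => "\u{00E9}",   // U+00E9, accented Latin letter
    'zhong'   => "\u{4E2D}",   // U+4E2D, CJK ideograph
    'emoji'   => "\u{1F600}",  // U+1F600, emoji
];

foreach ($samples as $label => $char) {
    // strlen() counts bytes, not characters.
    printf("%-8s %d byte(s)\n", $label, strlen($char));
}
// Prints 1, 2, 3 and 4 bytes respectively.
```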
The character encoding between the database and PHP must match; otherwise, data transmission may result in garbled text or lost information. For example, if the database uses UTF-8 encoding but PHP does not set the encoding correctly when connecting, Chinese content stored in the database may appear as unreadable characters on the website.
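The effect of such a mismatch can be simulated in pure PHP, without a database. The sketch below assumes the mbstring extension is available and deliberately misreads UTF-8 bytes as ISO-8859-1. It only imitates what a misconfigured connection does and is not a reproduction of MySQL's own conversion logic.

```php
<?php
// "中文" written with escapes so the bytes are UTF-8 regardless of file encoding.
$original = "\u{4E2D}\u{6587}";   // 2 characters, 6 bytes

// Misread the UTF-8 bytes as ISO-8859-1 and re-encode them as UTF-8.
// Each of the 6 bytes is now treated as a separate character.
$garbled = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

printf("original: %d characters\n", mb_strlen($original, 'UTF-8')); // 2
printf("garbled:  %d characters\n", mb_strlen($garbled, 'UTF-8'));  // 6

// In this simulation the wrong conversion can be undone without loss.
$restored = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');
var_dump($restored === $original); // bool(true)
```

In this simulation the damage is reversible. In a real application it may not be, for example when unrepresentable characters have already been replaced with `?` on the way into the database.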
mysqli::set_charset ensures consistency of character encoding between PHP and the MySQL database. By using this method, we can set the MySQL connection to a specific character set after connecting, avoiding display issues caused by encoding mismatches.
Using mysqli::set_charset is straightforward. Suppose we have already connected to the MySQL database via mysqli. Here’s how to set the character set to UTF-8:
```php
<?php
// Create database connection
$mysqli = new mysqli("localhost", "username", "password", "database_name");

// Check connection
if ($mysqli->connect_error) {
    die("Connection failed: " . $mysqli->connect_error);
}

// Set character set to UTF-8
if (!$mysqli->set_charset("utf8")) {
    printf("Error: Unable to set character set %s\n", $mysqli->error);
    exit();
}

// Perform database queries and other operations...
?>
```
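In the code above, $mysqli->set_charset("utf8") sets the connection’s character set to UTF-8, so data read from or written to the database over this connection is handled as UTF-8 and garbled text is avoided. Note that MySQL’s utf8 character set stores at most three bytes per character; to handle four-byte characters such as emojis, pass "utf8mb4" instead.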
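As a variation, the connection setup can be wrapped in a small helper that switches to utf8mb4 and reports failures as exceptions. This is a minimal sketch: `open_utf8mb4_connection()` is a hypothetical name, and the credentials in the usage comment are the same placeholders used above.

```php
<?php
/**
 * Open a mysqli connection and switch it to utf8mb4.
 * Hypothetical helper for illustration; not part of the mysqli API.
 */
function open_utf8mb4_connection(string $host, string $user, string $pass, string $db): mysqli
{
    // Ask mysqli to throw mysqli_sql_exception on errors instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $mysqli = new mysqli($host, $user, $pass, $db);

    // set_charset() returns false on failure when exceptions are not raised.
    if (!$mysqli->set_charset('utf8mb4')) {
        throw new RuntimeException('Unable to set character set: ' . $mysqli->error);
    }

    return $mysqli;
}

// Usage (placeholder credentials):
// $mysqli = open_utf8mb4_connection('localhost', 'username', 'password', 'database_name');
// echo $mysqli->character_set_name(); // reports the connection's current character set
```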
Multilingual websites often store content in several languages, such as Chinese, English, and Japanese. To display it properly, the database’s character set and the PHP connection character set must match, which you can ensure by calling mysqli::set_charset in your connection code. The front-end pages should also declare UTF-8 encoding using the HTML tag:
```html
<meta charset="UTF-8">
```
This ensures that front-end pages interpret content as UTF-8, allowing all language characters to display correctly.
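When the page is generated by PHP, the same declaration can also be sent in the HTTP response header, which browsers generally give priority over the meta tag. The following is a sketch under that assumption; `render_page()` is a hypothetical helper, not an existing function.

```php
<?php
/**
 * Build a minimal HTML page whose markup declares UTF-8.
 * Hypothetical helper for illustration.
 */
function render_page(string $title, string $body): string
{
    // Escape with an explicit encoding so multibyte text passes through intact.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
    $safeBody  = htmlspecialchars($body, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n"
        . "<html>\n<head>\n"
        . "<meta charset=\"UTF-8\">\n"
        . "<title>{$safeTitle}</title>\n"
        . "</head>\n<body>\n<p>{$safeBody}</p>\n</body>\n</html>\n";
}

// Declare the encoding in the HTTP response header as well.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

// Chinese title, mixed English and Japanese body.
echo render_page("\u{4F60}\u{597D}", "Hello, \u{3053}\u{3093}\u{306B}\u{3061}\u{306F}");
```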
To ensure that different language characters are correctly stored in the database, the tables and fields must use UTF-8 encoding. You can specify UTF-8 when creating tables with the following SQL command:
```sql
CREATE TABLE `content` (
    `id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    `title` VARCHAR(255) NOT NULL,
    `description` TEXT
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
```
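By specifying DEFAULT CHARSET=utf8, the table’s text columns default to MySQL’s utf8 character set and can store characters from multiple languages. To also store four-byte characters such as emojis, specify DEFAULT CHARSET=utf8mb4 instead.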
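Before inserting into a column such as `title`, it can help to confirm on the PHP side that the value is valid UTF-8 and fits the column. MySQL measures VARCHAR length in characters rather than bytes, so the check below counts characters. This is a sketch that assumes the mbstring extension; `fits_varchar()` is a hypothetical helper.

```php
<?php
/**
 * Check that a value is valid UTF-8 and fits a VARCHAR(N) column.
 * Hypothetical helper for illustration; assumes the mbstring extension.
 */
function fits_varchar(string $value, int $maxChars = 255): bool
{
    if (!mb_check_encoding($value, 'UTF-8')) {
        return false; // not valid UTF-8, would be stored incorrectly
    }
    // Count characters, not bytes: strlen() would overcount multibyte text.
    return mb_strlen($value, 'UTF-8') <= $maxChars;
}

var_dump(fits_varchar(str_repeat("\u{4E2D}", 255))); // bool(true): 255 characters, 765 bytes
var_dump(fits_varchar(str_repeat("\u{4E2D}", 256))); // bool(false): one character too many
var_dump(fits_varchar("\xFF"));                      // bool(false): invalid UTF-8
```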
Garbled Text:
If garbled text appears even after setting the character set, first check whether the database default character set is UTF-8. Also, make sure the HTML encoding on the page is UTF-8. If both are correct but the problem persists, check the character set settings for the database tables and fields.
Character Set Mismatch:
If the database and PHP connection use different character sets, characters may not display correctly. Using mysqli::set_charset ensures consistency between the two.
MySQL Version Support:
Ensure your MySQL version supports utf8 or utf8mb4. In MySQL, utf8 stores at most three bytes per character, while utf8mb4 covers the full UTF-8 range, including four-byte characters such as emojis. You can list the character sets supported by the current server with:
```sql
SHOW CHARACTER SET;
```
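To decide whether existing content actually needs utf8mb4, you can test strings for characters above U+FFFF, which take four bytes in UTF-8. The following is a minimal sketch; `needs_utf8mb4()` is a hypothetical helper name.

```php
<?php
/**
 * Return true if $text contains a code point above U+FFFF,
 * i.e. a character that needs four bytes in UTF-8.
 * Hypothetical helper for illustration.
 */
function needs_utf8mb4(string $text): bool
{
    // The /u modifier makes PCRE treat the pattern and the subject as UTF-8.
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

var_dump(needs_utf8mb4("Hello \u{4E16}\u{754C}")); // bool(false): CJK text fits in three bytes
var_dump(needs_utf8mb4("Hi \u{1F600}"));           // bool(true): the emoji needs four bytes
```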