When using PHP's mysqli extension for database operations, character set mismatches are among the most common causes of corrupted data or failed inserts. mysqli::set_charset is the officially recommended way to set the connection character set (the PHP manual prefers it over running SET NAMES through a query), but the table definitions, the way values are sent, and the file encoding also have to agree for data to be stored correctly.
```php
$mysqli = new mysqli("localhost", "username", "password", "database");

if ($mysqli->connect_errno) {
    die("Connection failed: " . $mysqli->connect_error);
}

// Set character set to utf8mb4
if (!$mysqli->set_charset("utf8mb4")) {
    die("Failed to set character set: " . $mysqli->error);
}

echo "Character set successfully set: " . $mysqli->character_set_name();
```
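utf8mb4 is recommended here because it covers all of Unicode, including emoji and other characters that need four bytes in UTF-8. MySQL's older utf8 character set (an alias for utf8mb3) stores at most three bytes per character and cannot hold them.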
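To make the four-byte point concrete, here is a minimal sketch that tests whether a string contains any character beyond U+FFFF, which is the range that needs four bytes in UTF-8. The helper name needsUtf8mb4 is hypothetical, and the example assumes PHP 7.0 or later for the \u{...} escape.

```php
<?php
// Hypothetical helper: true when the string contains at least one character
// above U+FFFF, i.e. one that takes four bytes in UTF-8.
function needsUtf8mb4(string $text): bool
{
    // The /u modifier makes PCRE treat the pattern and subject as UTF-8.
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

$ascii    = "abc";
$accented = "caf\u{E9}";      // U+00E9, two bytes in UTF-8
$emoji    = "hi \u{1F60A}";   // U+1F60A, four bytes in UTF-8

var_dump(needsUtf8mb4($ascii));      // bool(false)
var_dump(needsUtf8mb4($accented));   // bool(false)
var_dump(needsUtf8mb4($emoji));      // bool(true)
var_dump(strlen("\u{1F60A}"));       // int(4), strlen counts bytes
```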
Even if the connection character set is correct, mismatched table or column character sets can still cause corruption. Check them using SQL:
```sql
SHOW CREATE TABLE your_table;
```
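Reading that output by eye gets tedious with many tables. The following is a rough sketch, not a full SQL parser, that pulls every character set named in a SHOW CREATE TABLE result so that anything other than utf8mb4 stands out. The helper name charsetsInCreateTable and the sample table definition are made up for illustration.

```php
<?php
// Hypothetical helper: collects every character set named in the text that
// SHOW CREATE TABLE returns, covering column-level "CHARACTER SET x" clauses
// and the table-level "DEFAULT CHARSET=x" option.
function charsetsInCreateTable(string $createSql): array
{
    preg_match_all('/\b(?:CHARACTER\s+SET\s+|CHARSET\s*=\s*)(\w+)/i', $createSql, $matches);
    return array_values(array_unique(array_map('strtolower', $matches[1])));
}

// Sample output, invented for this example.
$createSql = <<<'SQL'
CREATE TABLE `your_table` (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` varchar(100) CHARACTER SET latin1 DEFAULT NULL,
  `content` text,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
SQL;

// Anything left after removing utf8mb4 is a mismatch worth investigating.
$mismatched = array_values(array_diff(charsetsInCreateTable($createSql), ['utf8mb4']));
print_r($mismatched);
```

In this sample the name column is the odd one out, so $mismatched contains only latin1.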
It is recommended to standardize tables and columns to utf8mb4:
```sql
ALTER TABLE your_table CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```
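When several tables need the same conversion, the statements can be generated instead of typed by hand. This is only a sketch: the helper name buildConvertStatements and the second table name are hypothetical, and the table names are assumed to come from trusted code rather than user input.

```php
<?php
// Hypothetical helper: builds one ALTER TABLE ... CONVERT TO statement per
// table name. Backticks inside a name are doubled so the identifier stays quoted.
function buildConvertStatements(array $tables): array
{
    $statements = [];
    foreach ($tables as $table) {
        $quoted = '`' . str_replace('`', '``', $table) . '`';
        $statements[] = "ALTER TABLE {$quoted} CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;";
    }
    return $statements;
}

foreach (buildConvertStatements(['your_table', 'another_table']) as $sql) {
    echo $sql, "\n";
}
```

Printing the statements first, rather than executing them straight away, leaves room to review them and take a backup, since CONVERT TO rewrites the stored data.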
Concatenating values directly into SQL strings risks SQL injection and can cause encoding problems. Prepared statements avoid both by sending the values separately from the SQL text, where the server reads them in the connection character set:
```php
$stmt = $mysqli->prepare("INSERT INTO your_table (name, content) VALUES (?, ?)");
if ($stmt === false) {
    die("Prepare failed: " . $mysqli->error);
}
$name = "Test User";
$content = "This is some content with special characters 😊";
$stmt->bind_param("ss", $name, $content);
if (!$stmt->execute()) {
    die("Insert failed: " . $stmt->error);
}
$stmt->close();
```
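mysqli sends bound parameters to the server separately from the SQL text, and the server interprets them in the connection character set. This reduces the risk of character corruption, provided the PHP strings themselves are valid UTF-8.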
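The connection character set tells the server how to read the bytes it receives; it does not check that those bytes really are UTF-8. A minimal sketch of a guard that could run before bind_param follows. The helper name requireUtf8 is hypothetical, and the code assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper: rejects strings that are not valid UTF-8 before they
// are bound to a statement. Assumes the mbstring extension is available.
function requireUtf8(string $value): string
{
    if (!mb_check_encoding($value, 'UTF-8')) {
        throw new InvalidArgumentException('Value is not valid UTF-8');
    }
    return $value;
}

$valid = requireUtf8("caf\u{E9}");   // valid UTF-8, returned unchanged

try {
    requireUtf8("caf\xE9");          // ISO-8859-1 bytes: a lone 0xE9 is invalid UTF-8
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}

var_dump($rejected);                 // bool(true)
```

Throwing early keeps badly encoded input out of the table, where it would be much harder to tell apart from correctly stored text.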
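PHP files should be saved as UTF-8 without a BOM. A BOM at the start of a file is sent as output before any PHP code runs, which can break later header() calls.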
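A BOM is invisible in most editors, so it can help to check for it in code. The sketch below tests a byte string for the three-byte UTF-8 BOM and strips it; the names UTF8_BOM, hasUtf8Bom, and stripUtf8Bom are hypothetical.

```php
<?php
// Hypothetical helpers for spotting and removing a UTF-8 byte order mark.
const UTF8_BOM = "\xEF\xBB\xBF";

function hasUtf8Bom(string $bytes): bool
{
    // Compare only the first three bytes against the BOM sequence.
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

function stripUtf8Bom(string $bytes): string
{
    return hasUtf8Bom($bytes) ? substr($bytes, 3) : $bytes;
}

$withBom = UTF8_BOM . "<?php echo 1;";

var_dump(hasUtf8Bom($withBom));                        // bool(true)
var_dump(stripUtf8Bom($withBom) === "<?php echo 1;");  // bool(true)
var_dump(hasUtf8Bom("<?php echo 1;"));                 // bool(false)
```

To check a real file, the contents could be read with file_get_contents and passed to hasUtf8Bom.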
HTML pages should declare the character set:
```html
<meta charset="UTF-8">
```
This ensures that user input, PHP string processing, and database character sets remain consistent.
Use $mysqli->set_charset("utf8mb4") to set the connection character set.
Ensure the database, tables, and columns use the same character set as the connection.
Prefer prepared statements for data insertion.
Save PHP files and pages as UTF-8.
Following these steps minimizes the risk of character errors or corruption when writing data to the database.