Character encoding problems are a common headache when building multilingual websites or applications. When the database and the PHP program use different character sets, Chinese text turns into garbled output and special characters display incorrectly. PHP offers several ways to control character set settings, but one practical function in the mbstring extension is easy to overlook: mb_get_info().
This article will introduce how to use the mb_get_info() function to check the multibyte character encoding currently used by PHP and compare it with the database settings to determine whether the two are consistent.
mb_get_info() is a function in PHP's multibyte string extension (mbstring) that returns the extension's current settings.
mb_get_info(string $type = "all"): array|string|int|false
When called without arguments, it returns an array containing all mbstring settings.
When called with a setting name such as "internal_encoding", it returns only that value, here the current internal encoding.
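As a quick illustration of the two call styles, the sketch below reads a few settings from the full array and then fetches a single one. The helper name describeMbstring() is hypothetical, and the keys it reads besides internal_encoding (language, detect_order) are assumed to exist in your PHP build, so any missing key falls back to a placeholder.

```php
<?php
// Hypothetical helper: summarise a few mbstring settings as printable strings.
function describeMbstring(array $keys = ['internal_encoding', 'language', 'detect_order']): array
{
    $info = mb_get_info(); // no argument: every setting as an associative array
    $summary = [];
    foreach ($keys as $key) {
        $value = $info[$key] ?? '(not available)';
        // Some settings, such as detect_order, are arrays of encoding names.
        $summary[$key] = is_array($value) ? implode(', ', $value) : (string) $value;
    }
    return $summary;
}

foreach (describeMbstring() as $key => $value) {
    echo $key . ': ' . $value . PHP_EOL;
}

// With an argument: only that one setting is returned.
echo 'internal_encoding only: ' . mb_get_info('internal_encoding') . PHP_EOL;
```

Printing the whole summary once during setup can reveal unexpected settings before they show up as garbled output.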
Assuming you set the connection encoding to utf8mb4 when connecting to the database, the following steps confirm whether PHP's multibyte string settings are consistent with it.
$mysqli = new mysqli('localhost', 'user', 'password', 'my_database');
$mysqli->set_charset('utf8mb4');
Make sure the character set of the database connection is set to the target encoding you want, such as utf8mb4 .
$mbInfo = mb_get_info();
echo "Current mbstring internal encoding: " . $mbInfo['internal_encoding'] . PHP_EOL;
Or a more concise way of writing:
echo "Current mbstring internal encoding: " . mb_get_info("internal_encoding") . PHP_EOL;
Since PHP 5.6 the mbstring internal encoding follows the default_charset ini setting, which defaults to UTF-8. A server configuration can override that default, so set the encoding explicitly to prevent inconsistencies:
mb_internal_encoding("UTF-8");
The complete detection code is as follows:
<?php
$mysqli = new mysqli('localhost', 'user', 'password', 'my_database');
$mysqli->set_charset('utf8mb4');
mb_internal_encoding("UTF-8");
$dbCharset = $mysqli->character_set_name(); // Character set of the database connection, e.g. "utf8mb4"
$phpCharset = mb_get_info("internal_encoding"); // mbstring internal encoding, e.g. "UTF-8"
// MySQL writes "utf8mb4" where PHP writes "UTF-8", so a direct comparison never matches.
// Drop the hyphen from the PHP name and check that the MySQL name starts with it.
$phpNormalized = str_replace('-', '', $phpCharset);
if (stripos($dbCharset, $phpNormalized) === 0) {
    echo "OK: the database character set ($dbCharset) matches the PHP mbstring encoding ($phpCharset)" . PHP_EOL;
} else {
    echo "Mismatch: the database uses $dbCharset, PHP mbstring uses $phpCharset" . PHP_EOL;
}
?>
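Because MySQL and mbstring name the same encoding differently (utf8mb4 versus UTF-8), comparing the two strings directly is fragile. One alternative is an explicit lookup table. The sketch below is only an illustration: the constant MYSQL_TO_MBSTRING and the function charsetsMatch() are hypothetical names, and the table covers only a handful of common charsets.

```php
<?php
// Hypothetical lookup from MySQL charset names to mbstring encoding names.
// Only a few common charsets are listed; extend it for your own project.
const MYSQL_TO_MBSTRING = [
    'utf8mb4' => 'UTF-8',
    'utf8mb3' => 'UTF-8',
    'utf8'    => 'UTF-8',
    'latin1'  => 'Windows-1252', // MySQL's latin1 is cp1252
    'ascii'   => 'ASCII',
];

// Returns true when the MySQL charset maps to the given mbstring encoding.
function charsetsMatch(string $mysqlCharset, string $mbEncoding): bool
{
    $expected = MYSQL_TO_MBSTRING[strtolower($mysqlCharset)] ?? null;
    if ($expected === null) {
        return false; // unknown charset: report a mismatch rather than guess
    }
    return strcasecmp($expected, $mbEncoding) === 0;
}

var_dump(charsetsMatch('utf8mb4', 'UTF-8')); // bool(true)
var_dump(charsetsMatch('latin1', 'UTF-8'));  // bool(false)
```

In the detection script, the two arguments would be $mysqli->character_set_name() and mb_get_info("internal_encoding"). Note that MySQL's utf8 and utf8mb3 store at most three bytes per character, so they map to UTF-8 here but cannot hold every UTF-8 character.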
Always set the default encoding: call mb_internal_encoding() once when the project initializes, for example in the entry file.
Check whether the mbstring extension is enabled:
if (!extension_loaded('mbstring')) {
    die("The mbstring extension is not enabled. Please enable it in php.ini.");
}
Keep the front end consistent: HTML pages should declare the same character set, for example:
<meta charset="UTF-8">
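The recommendations above could be combined into a single startup routine. The following is a minimal sketch, assuming a web entry file; the function name initEncoding() is hypothetical, and sending a Content-Type header is one possible way to keep the HTTP response consistent with the meta tag.

```php
<?php
// Hypothetical bootstrap helper for an entry file such as index.php.
function initEncoding(string $encoding = 'UTF-8'): string
{
    if (!extension_loaded('mbstring')) {
        throw new RuntimeException('The mbstring extension is not enabled.');
    }
    // PHP 7 returns false for an unknown encoding name; PHP 8 throws a ValueError instead.
    if (mb_internal_encoding($encoding) === false) {
        throw new RuntimeException("Unsupported encoding: $encoding");
    }
    // Tell the browser the same charset, unless output has already started.
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=' . $encoding);
    }
    return mb_internal_encoding(); // the encoding now in effect
}

echo 'Internal encoding: ' . initEncoding() . PHP_EOL;
```

Throwing an exception instead of calling die() lets the calling code decide how to report the failure.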
mb_get_info() does not read the database's character set settings itself, but it does show how PHP's multibyte string environment is configured. Comparing its output with the character set of the database connection makes inconsistent encodings much easier to track down. Running this check early, before garbled text reaches users, saves debugging time later.