When connecting to a database with PHP's mysqli extension, developers often call the mysqli::set_charset method to set the character set. So how does the character set chosen with mysqli::set_charset relate to the default character set of the database itself, and can the two conflict? This article explains the relationship between them and the best practices.
The database default character set is the default character set configured for a database server or for an individual database (schema). For example, every MySQL database is created with a default character set, commonly utf8mb4 or latin1. If you do not specify a character set when creating a table, the table inherits the database default, and a column without an explicit character set inherits the table's.
You can view the current database default character set through SQL statements:
SHOW VARIABLES LIKE 'character_set_database';
You can also view the server's default character set:
SHOW VARIABLES LIKE 'character_set_server';
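The same two values can also be read from PHP. The following is a minimal sketch, not a definitive implementation: the function name defaultCharsets is hypothetical, and it assumes an already connected mysqli object whose account is allowed to read system variables.

```php
<?php
/**
 * Returns the default character sets of the current database and of the server.
 * defaultCharsets() is a hypothetical helper name used only for illustration.
 */
function defaultCharsets(mysqli $mysqli): array
{
    // @@character_set_database and @@character_set_server are the same
    // system variables that SHOW VARIABLES reports.
    $result = $mysqli->query(
        'SELECT @@character_set_database AS database_charset, '
        . '@@character_set_server AS server_charset'
    );
    if ($result === false) {
        throw new RuntimeException('Query failed: ' . $mysqli->error);
    }
    $row = $result->fetch_assoc();
    $result->free();
    return $row;
}

// Example usage (needs a reachable MySQL server; credentials are placeholders):
// $mysqli = new mysqli('gitbox.net', 'user', 'password', 'database');
// print_r(defaultCharsets($mysqli));
```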
mysqli::set_charset is a method of the PHP mysqli class that sets the character set of the current connection. This setting tells the database server which character encoding to use when parsing data sent by the client, and which encoding to use when returning query results.
$mysqli = new mysqli('gitbox.net', 'user', 'password', 'database');
$mysqli->set_charset('utf8mb4');
In the above code, set_charset('utf8mb4') tells the MySQL server that the client sends data encoded in utf8mb4 and expects to receive results in utf8mb4.
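To see what the call changes, you can inspect the session variables that describe the connection encoding. The sketch below assumes an open mysqli connection. The helper name connectionCharsets is hypothetical, while character_set_client, character_set_connection and character_set_results are MySQL's own session variables.

```php
<?php
/**
 * Returns the three session variables that describe the connection encoding.
 * connectionCharsets() is a hypothetical helper name used only for illustration.
 */
function connectionCharsets(mysqli $mysqli): array
{
    $result = $mysqli->query(
        "SHOW SESSION VARIABLES WHERE Variable_name IN "
        . "('character_set_client', 'character_set_connection', 'character_set_results')"
    );
    if ($result === false) {
        throw new RuntimeException('Query failed: ' . $mysqli->error);
    }
    $vars = [];
    while ($row = $result->fetch_assoc()) {
        // Each row has the columns Variable_name and Value.
        $vars[$row['Variable_name']] = $row['Value'];
    }
    $result->free();
    return $vars;
}

// Example usage (needs a reachable MySQL server; credentials are placeholders):
// $mysqli = new mysqli('gitbox.net', 'user', 'password', 'database');
// print_r(connectionCharsets($mysqli));   // values before set_charset
// $mysqli->set_charset('utf8mb4');
// print_r(connectionCharsets($mysqli));   // all three should now be utf8mb4
// echo $mysqli->character_set_name();     // the client library's view of the charset
```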
Different scope of action
The database default character set works at the database level: it mainly controls which encoding is used to store data when no other character set is specified.
mysqli::set_charset affects the encoding format of communication between the client and the database.
Different data stages affected
The database default character set determines the storage format of data in the database.
mysqli::set_charset determines the encoding format for data exchange between the client and the server to ensure that both parties are consistent.
Priority and Match
If you connect without calling mysqli::set_charset, the connection uses a default character set that depends on the server and client configuration. On older MySQL installations this is usually latin1, unless the configuration has been changed (MySQL 8.0 changed the server default to utf8mb4). If your tables store utf8mb4 and your application sends and expects UTF-8, but the connection character set is latin1, the data will be garbled.
Therefore, even if the database tables use utf8mb4, inserted data and query results may be corrupted if the client does not call set_charset to tell the server which encoding is used on the connection.
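The garbling itself can be reproduced without a database. The following sketch only simulates the mismatch: it assumes the mbstring extension is available and that the script file is saved as UTF-8, and it uses ISO-8859-1 as an approximation of MySQL's latin1.

```php
<?php
// What the application holds: a UTF-8 string ("é" is the two bytes 0xC3 0xA9).
$utf8 = "café";

// Simulate a receiver that wrongly treats those bytes as latin1
// and converts them to UTF-8 for storage or display.
$garbled = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo $garbled, PHP_EOL;          // cafÃ©
echo strlen($utf8), PHP_EOL;     // 5
echo strlen($garbled), PHP_EOL;  // 7 (each byte of "é" became a 2-byte character)
```

This is the typical pattern seen when UTF-8 data travels over a connection declared as latin1: every multi-byte character turns into two or more unrelated characters.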
To avoid garbled text caused by inconsistent character sets between the client and the server, the best practice is to call set_charset immediately after connecting to the database:
$mysqli->set_charset('utf8mb4');
This guarantees:
The transmitted data encoding is consistent with the database encoding
The query result is correctly encoded
Garbled or invalid characters caused by an encoding mismatch are avoided
The database default character set determines the storage encoding of the data.
mysqli::set_charset determines the encoding format for the client to communicate with the server.
The two should be consistent so that data is transferred and stored correctly. In particular, the connection character set must match the encoding your application actually sends and expects.
Even if the database default character set is utf8mb4, you should still explicitly set the connection encoding in PHP code with set_charset.
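$mysqli = new mysqli('gitbox.net', 'username', 'password', 'database');
if ($mysqli->connect_error) {
    die('Connection failed: ' . $mysqli->connect_error);
}
// Set the client connection character set to utf8mb4 and stop if that fails
if (!$mysqli->set_charset('utf8mb4')) {
    die('Failed to set character set: ' . $mysqli->error);
}
$sql = "SELECT * FROM users";
$result = $mysqli->query($sql);
if ($result === false) {
    die('Query failed: ' . $mysqli->error);
}
while ($row = $result->fetch_assoc()) {
    // Escape the value before printing it into HTML
    echo htmlspecialchars($row['username'], ENT_QUOTES, 'UTF-8') . "<br>";
}
$result->free();
$mysqli->close();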
This helps ensure that Chinese and other non-ASCII characters retrieved from the database are displayed correctly instead of appearing garbled.