Use convert_cyr_string to unify database reading, writing and encoding

gitbox 2025-05-29

What is the convert_cyr_string function?

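convert_cyr_string is a built-in PHP function that converts a string from one single-byte Cyrillic encoding to another (KOI8-R, Windows-1251, ISO-8859-5, CP866). It does not understand UTF-8. The function was deprecated in PHP 7.4 and removed in PHP 8.0, so it is only available on older PHP versions. The function prototype is as follows:

 convert_cyr_string(string $str, string $from, string $to): string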
  • $str : The string to be converted

  • $from : The encoding type of the current string

  • $to : Target encoding type

Supported encoding identifiers include:

  • k — KOI8-R

  • w — Windows-1251

  • i — ISO-8859-5

  • a — CP866
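
On PHP 8, where convert_cyr_string no longer exists, the one-letter identifiers above can be mapped onto iconv charset names. The following is a minimal sketch rather than a drop-in replacement: the names CYR_CHARSETS and cyr_convert are hypothetical, and it assumes the local iconv build recognizes these charset names.

```php
<?php
// Hypothetical map from convert_cyr_string identifiers to iconv charset names.
const CYR_CHARSETS = [
    'k' => 'KOI8-R',
    'w' => 'Windows-1251',
    'i' => 'ISO-8859-5',
    'a' => 'CP866',
];

// Hypothetical helper: converts between single-byte Cyrillic encodings via iconv.
function cyr_convert(string $str, string $from, string $to): string
{
    if (!array_key_exists($from, CYR_CHARSETS) || !array_key_exists($to, CYR_CHARSETS)) {
        throw new InvalidArgumentException("Unknown encoding identifier");
    }
    $result = iconv(CYR_CHARSETS[$from], CYR_CHARSETS[$to], $str);
    if ($result === false) {
        throw new RuntimeException("Conversion failed");
    }
    return $result;
}

// Usage: build a Windows-1251 string, then re-encode it as KOI8-R.
$win1251 = iconv('UTF-8', 'Windows-1251', 'Привет');
$koi8r   = cyr_convert($win1251, 'w', 'k');

// Decode back to UTF-8 to check that the text survived the conversion.
echo iconv('KOI8-R', 'UTF-8', $koi8r), PHP_EOL;
```

Both strings stay one byte per letter; only the byte values change between the two encodings.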

Why do database operations need to focus on encoding conversion?

If the character sets of the database, the connection, and the PHP application do not match, or if PHP does not convert data when reading and writing, the stored bytes are interpreted in the wrong encoding and the text displays as garbled characters (mojibake). For example:

  • The database column is utf8, but the application sends Windows-1251 bytes when writing, so the stored text is garbled.

  • The application does not convert the encoding after reading from the database, so the data displays incorrectly.

Keeping the encoding consistent at every step is the key to avoiding garbled text.
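
To make the mismatch concrete, the sketch below encodes the same word in UTF-8 and in Windows-1251 and then decodes bytes with the wrong charset on purpose. The helper is_valid_utf8 is a hypothetical name introduced for illustration; the example assumes the iconv and PCRE extensions are available.

```php
<?php
// Hypothetical helper: true when $bytes is well-formed UTF-8.
function is_valid_utf8(string $bytes): bool
{
    // With the /u modifier, preg_match fails on malformed UTF-8 input.
    return preg_match('//u', $bytes) === 1;
}

$utf8    = 'Привет';
$win1251 = iconv('UTF-8', 'Windows-1251', $utf8);

// Same text, different bytes: UTF-8 needs two bytes per Cyrillic letter,
// Windows-1251 needs one.
echo strlen($utf8), ' bytes vs ', strlen($win1251), ' bytes', PHP_EOL;

// Windows-1251 bytes are not valid UTF-8, so a UTF-8 page cannot show them as-is.
var_dump(is_valid_utf8($win1251));

// Decoding UTF-8 bytes as if they were Windows-1251 succeeds technically,
// but every byte becomes its own character, which is the classic garbled output.
$garbled = iconv('Windows-1251', 'UTF-8', $utf8);
var_dump($garbled === $utf8);
```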

Example scenario using convert_cyr_string

Assume the database stores Cyrillic text in Windows-1251, while the PHP scripts work with UTF-8 strings. convert_cyr_string cannot perform this conversion on its own, because it only maps between single-byte Cyrillic encodings, so the UTF-8 step goes through iconv. convert_cyr_string is needed only when data moves between two of the single-byte encodings, for example from a legacy KOI8-R source into the Windows-1251 database.

Encoding before writing to database

 <?php
// Original UTF-8 string
$utf8_string = "Пример строки на русском";

// Convert UTF-8 to Windows-1251 before writing to the database.
// convert_cyr_string is not needed here: it only maps between single-byte
// Cyrillic encodings and cannot read UTF-8.
$win1251_string = iconv("UTF-8", "Windows-1251//IGNORE", $utf8_string);
if ($win1251_string === false) {
    throw new RuntimeException("Could not convert the string to Windows-1251");
}

// Database write operation
// Assumes a PDO connection $pdo has already been established
$sql = "INSERT INTO example_table (text_column) VALUES (:text)";
$stmt = $pdo->prepare($sql);
$stmt->bindValue(':text', $win1251_string);
$stmt->execute();
?>

Converting the encoding after reading from the database

 <?php
// Read the string from the database; it is assumed to be Windows-1251 encoded
$sql = "SELECT text_column FROM example_table WHERE id = 1";
$stmt = $pdo->query($sql);
$row = $stmt->fetch(PDO::FETCH_ASSOC);

if ($row !== false) {
    // Convert Windows-1251 to UTF-8 with iconv for front-end display.
    // convert_cyr_string is not needed here: it cannot produce UTF-8.
    $win1251_string = $row['text_column'];
    $utf8_string = iconv("Windows-1251", "UTF-8//IGNORE", $win1251_string);

    echo $utf8_string;
}
?>

Practical tips

  1. Confirm the database character set: use SQL statements to check the character set configured for the database and its tables, and keep it consistent with the encoding the PHP script works in wherever possible.

  2. Use one conversion tool consistently: convert_cyr_string only covers single-byte Cyrillic encodings, while iconv and mb_convert_encoding handle UTF-8 and most other encodings.

  3. Order the conversions sensibly: when UTF-8 is involved, do the main conversion with iconv or mb_convert_encoding, and apply convert_cyr_string only to move between single-byte Cyrillic encodings (for example KOI8-R to Windows-1251).

  4. Handle conversion failures: iconv returns false when it meets a character it cannot convert. The "//IGNORE" flag asks it to drop such characters instead of aborting, at the cost of silently losing data, so check the return value either way.

  5. Specify the character set when connecting to the database: for MySQL, include the character set parameter in the connection string, such as charset=cp1251, to avoid confusion caused by automatic conversion during reads.
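
Tips 1 and 5 can be sketched with PDO as follows. The helper names build_mysql_dsn and fetch_charset_settings, the host, the database name, and the credentials are all placeholders introduced for illustration; the connection lines are commented out so the snippet runs without a MySQL server.

```php
<?php
// Hypothetical helper: builds a MySQL DSN that pins the connection character set.
function build_mysql_dsn(string $host, string $dbname, string $charset): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical helper: returns the server's character_set_* variables
// as a name => value array so they can be inspected.
function fetch_charset_settings(PDO $pdo): array
{
    $stmt = $pdo->query("SHOW VARIABLES LIKE 'character_set%'");
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}

$dsn = build_mysql_dsn('localhost', 'example_db', 'cp1251');
echo $dsn, PHP_EOL;

// With a reachable server (placeholder credentials):
// $pdo = new PDO($dsn, 'user', 'password', [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// print_r(fetch_charset_settings($pdo));
```

Comparing character_set_client, character_set_connection, and character_set_results against the table's own character set shows where an implicit conversion would take place.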

Summary

convert_cyr_string is a simple tool for converting between specific single-byte Cyrillic encodings on PHP versions that still ship it (it was removed in PHP 8.0), but broader conversion tasks, including anything involving UTF-8, are better handled by the iconv and mbstring extensions. Combining these tools sensibly and configuring the database and connection character sets correctly keeps the encoding consistent across reads and writes and prevents garbled text.

With the encoding consistent end to end, the data stays readable and the user experience improves accordingly.


 <?php
// Example: convert UTF-8 to Windows-1251 before writing to the database,
// then convert back to UTF-8 after reading.
function saveStringToDb(PDO $pdo, string $utf8_string): void {
    // Transcoding: UTF-8 -> Windows-1251 (iconv handles this step by itself)
    $win1251_string = iconv("UTF-8", "Windows-1251//IGNORE", $utf8_string);
    if ($win1251_string === false) {
        throw new RuntimeException("Could not convert the string to Windows-1251");
    }

    $sql = "INSERT INTO example_table (text_column) VALUES (:text)";
    $stmt = $pdo->prepare($sql);
    $stmt->bindValue(':text', $win1251_string);
    $stmt->execute();
}

function getStringFromDb(PDO $pdo, int $id): string {
    $sql = "SELECT text_column FROM example_table WHERE id = :id";
    $stmt = $pdo->prepare($sql);
    $stmt->bindValue(':id', $id, PDO::PARAM_INT);
    $stmt->execute();
    $row = $stmt->fetch(PDO::FETCH_ASSOC);

    // No matching row: return an empty string
    if (!$row) {
        return '';
    }

    // Transcoding: Windows-1251 -> UTF-8
    $utf8_string = iconv("Windows-1251", "UTF-8//IGNORE", $row['text_column']);

    // iconv returns false on failure; fall back to an empty string
    return $utf8_string === false ? '' : $utf8_string;
}
?>