
Using mb_get_info and mb_check_encoding to verify string encoding

gitbox 2025-05-29

When building multilingual websites or applications that must keep string encoding consistent, developers often run into encoding problems. PHP's mbstring extension provides multibyte string functions, and two of them, mb_get_info and mb_check_encoding, are useful for inspecting the encoding environment and verifying the encoding of strings.

This article explains how to combine the two functions so that strings stay consistently encoded during processing, which helps avoid garbled text and encoding-related security issues.

1. mb_get_info: Get the current multibyte environment setting information

mb_get_info() is a function provided by PHP to obtain the current mbstring environment configuration.

<?php
$info = mb_get_info();
print_r($info);
?>

The output includes the internal encoding (internal_encoding), the HTTP input and output encodings, the language setting, and more. It shows which encoding settings the current string operations rely on.

To get a single setting, such as the internal encoding, pass its name as the argument:

<?php
$encoding = mb_get_info("internal_encoding");
echo "Current internal encoding: " . $encoding;
?>
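
As a small sketch of the two calling styles, the following reads a few entries from the full array and cross-checks the single-setting form against mb_internal_encoding(). The choice of keys is only illustrative, and the "(not set)" fallback is an assumption for keys that may be absent in a given environment.

```php
<?php
// Read selected settings from the full array returned by mb_get_info().
$info = mb_get_info();

foreach (['internal_encoding', 'language', 'http_input'] as $key) {
    // Fall back to a placeholder if this environment does not report the key.
    $value = $info[$key] ?? '(not set)';
    echo $key . ': ' . (is_array($value) ? implode(', ', $value) : $value) . PHP_EOL;
}

// The single-setting form should agree with mb_internal_encoding().
$sameEncoding = mb_get_info('internal_encoding') === mb_internal_encoding();
echo 'Consistent internal encoding: ' . ($sameEncoding ? 'yes' : 'no') . PHP_EOL;
```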

2. mb_check_encoding: Verify that a string is valid in a given encoding

mb_check_encoding() checks whether a string is validly encoded. It is well suited to validating user input and to guarding against unexpected or malformed byte sequences.

If no encoding is given, it checks against the current internal encoding:

<?php
$str = "Hello,world";
if (mb_check_encoding($str)) {
    echo "The string is validly encoded.";
} else {
    echo "Invalid string encoding!";
}
?>

You can also specify the encoding to check against:

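<?php
$str = file_get_contents('https://gitbox.net/data/sample.txt');

// file_get_contents() returns false on failure, so check before validating
if ($str === false) {
    echo "Could not read the file.";
} elseif (mb_check_encoding($str, 'UTF-8')) {
    echo "The string is valid UTF-8.";
} else {
    echo "The string is not valid UTF-8.";
}
?>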
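
To make the behavior concrete, here is a minimal sketch with hand-written byte strings (the sample values are assumptions chosen for illustration). It shows that mb_check_encoding() validates bytes against the encoding you name rather than detecting which encoding the text was written in.

```php
<?php
$utf8   = "caf\xC3\xA9"; // "café" encoded as UTF-8
$latin1 = "caf\xE9";     // "café" encoded as ISO-8859-1
$broken = "\xC3\x28";    // UTF-8 lead byte followed by a non-continuation byte

$results = [
    'utf8 as UTF-8'         => mb_check_encoding($utf8, 'UTF-8'),
    'latin1 as UTF-8'       => mb_check_encoding($latin1, 'UTF-8'),
    'broken as UTF-8'       => mb_check_encoding($broken, 'UTF-8'),
    // Every byte value is valid in ISO-8859-1, so this check cannot fail.
    'latin1 as ISO-8859-1'  => mb_check_encoding($latin1, 'ISO-8859-1'),
];

foreach ($results as $label => $ok) {
    echo $label . ': ' . ($ok ? 'valid' : 'invalid') . PHP_EOL;
}
```

Because single-byte encodings such as ISO-8859-1 accept any byte sequence, a successful check against them says little about the real origin of the data.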

3. A practical example combining mb_get_info and mb_check_encoding

The following example reads remote text content and verifies that it is valid in the current internal encoding:

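<?php
$url = 'https://gitbox.net/data/content.txt';
$content = file_get_contents($url);

if ($content === false) {
    exit("Error: could not read the remote content.");
}

// Get the current internal encoding; ?: also covers a false return, which ?? would let through
$currentEncoding = mb_get_info("internal_encoding") ?: 'UTF-8';

// Verify that the content is valid in the current internal encoding
if (mb_check_encoding($content, $currentEncoding)) {
    echo "Content encoding verified; encoding is: {$currentEncoding}";
} else {
    echo "Warning: the remote content's encoding does not match the system default!";
}
?>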
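
When the check fails, one possible follow-up is to try converting the content. The sketch below is not part of the original example: normalizeEncoding() is a hypothetical helper, and the list of candidate source encodings is an assumption you would have to supply from knowledge of your data sources.

```php
<?php
/**
 * Hypothetical helper (not part of mbstring): return $content in the $target
 * encoding, or null if it is valid in none of the candidate encodings.
 */
function normalizeEncoding(string $content, string $target, array $candidates): ?string
{
    if (mb_check_encoding($content, $target)) {
        return $content; // already valid in the target encoding
    }
    foreach ($candidates as $candidate) {
        // Take the first candidate in which the bytes are valid, then convert.
        if (mb_check_encoding($content, $candidate)) {
            return mb_convert_encoding($content, $target, $candidate);
        }
    }
    return null; // no candidate matched
}

$target     = mb_get_info('internal_encoding') ?: 'UTF-8';
$latin1     = "caf\xE9"; // "café" in ISO-8859-1 (illustrative sample)
$normalized = normalizeEncoding($latin1, $target, ['ISO-8859-1']);

echo $normalized === null ? 'Could not normalize.' : 'Normalized to ' . $target . PHP_EOL;
```

The order of candidates matters: a single-byte encoding such as ISO-8859-1 accepts any input, so it should come last and be treated as a guess rather than a confirmed source encoding.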

4. Summary

Using mb_get_info() to read the current encoding environment and mb_check_encoding() to verify that a string is valid in that encoding improves the stability and security of PHP programs that handle multilingual content. This kind of verification matters most when processing user input or remote data.

When building international applications, consider adding this encoding verification at both the input and output stages so that your system always runs in the expected character set.
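
A minimal sketch of what such input-stage and output-stage checks might look like is shown below. The helper names findInvalidFields() and safeOutput() are hypothetical, UTF-8 is assumed as the expected encoding, and replacing invalid bytes with mb_scrub() (available since PHP 7.2) is just one possible policy; rejecting the data outright is another.

```php
<?php
/** Input stage: return the keys of $fields whose values are not valid in $encoding. */
function findInvalidFields(array $fields, string $encoding = 'UTF-8'): array
{
    $invalid = [];
    foreach ($fields as $name => $value) {
        if (!is_string($value) || !mb_check_encoding($value, $encoding)) {
            $invalid[] = $name;
        }
    }
    return $invalid;
}

/** Output stage: replace invalid byte sequences before the text is sent out. */
function safeOutput(string $text, string $encoding = 'UTF-8'): string
{
    return mb_check_encoding($text, $encoding) ? $text : mb_scrub($text, $encoding);
}

// Illustrative input; "bio" contains an invalid UTF-8 byte sequence.
$input = ['name' => 'Alice', 'bio' => "\xC3\x28"];

$invalidFields = findInvalidFields($input);
$cleaned       = safeOutput($input['bio']);

echo 'Invalid fields: ' . implode(', ', $invalidFields) . PHP_EOL;
```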