Runtime Configuration

The behaviour of these functions is affected by settings in php.ini.

mbstring configuration options
Name                                 Default                            Changeable  Changelog
mbstring.language                    "neutral"                          INI_ALL
mbstring.detect_order                NULL                               INI_ALL
mbstring.http_input                  "pass"                             INI_ALL     Deprecated
mbstring.http_output                 "pass"                             INI_ALL     Deprecated
mbstring.internal_encoding           NULL                               INI_ALL     Deprecated
mbstring.substitute_character        NULL                               INI_ALL
mbstring.func_overload               "0"                                INI_SYSTEM  Deprecated as of PHP 7.2.0; removed as of PHP 8.0.0.
mbstring.encoding_translation        "0"                                INI_PERDIR
mbstring.http_output_conv_mimetypes  "^(text/|application/xhtml\+xml)"  INI_ALL
mbstring.strict_detection            "0"                                INI_ALL
mbstring.regex_retry_limit           "1000000"                          INI_ALL     Available as of PHP 7.4.0.
mbstring.regex_stack_limit           "100000"                           INI_ALL     Available as of PHP 7.3.5.
For further details and definitions of the INI_* modes, see the page Where a configuration setting may be set.

Here is a short explanation of the configuration directives.

mbstring.language string

The default national language setting (NLS) used in mbstring. Note that this option automatically defines mbstring.internal_encoding, so mbstring.internal_encoding should be placed after mbstring.language in php.ini.

mbstring.encoding_translation bool

Enables the transparent character encoding filter for the incoming HTTP queries, which performs detection and conversion of the input encoding to the internal character encoding.

mbstring.internal_encoding string
Warning

This deprecated feature will certainly be removed in the future.

Defines the default internal character encoding.

Users should leave this empty and set default_charset instead.
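
As a rough illustration of this recommendation, the sketch below sets default_charset at run time and reads back the encoding that mbstring reports. It assumes the mbstring extension is loaded and that mbstring.internal_encoding is left empty; the variable names are only for illustration.

```php
<?php
// Assumption: mbstring is loaded and mbstring.internal_encoding is not set in php.ini.
ini_set('default_charset', 'UTF-8');

// Read back the charset PHP will use by default.
$charset = ini_get('default_charset');

// With mbstring.internal_encoding left empty, mbstring is expected to
// follow default_charset rather than a separate mbstring setting.
$internal = mb_internal_encoding();

var_dump($charset, $internal);
```

In a deployed application the same value would normally be set once in php.ini rather than per script.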

mbstring.http_input string
Warning

This deprecated feature will certainly be removed in the future.

Defines the default HTTP input character encoding.

Users should leave this empty and set default_charset instead.

mbstring.http_output string
Warning

This deprecated feature will certainly be removed in the future.

Defines the default HTTP output character encoding (output will be converted from the internal encoding to the HTTP output encoding upon output).

Users should leave this empty and set default_charset instead.

mbstring.detect_order string

Defines default character code detection order. See also mb_detect_order().
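
The detection order can also be set per request with mb_detect_order(). The following is a minimal sketch, assuming the mbstring extension is loaded; the particular list of encodings is only an example.

```php
<?php
// Assumption: the mbstring extension is loaded.
// Set the detection order for the current request. This list is used
// when mb_detect_encoding() is called without an explicit encoding list.
mb_detect_order(['ASCII', 'UTF-8', 'EUC-JP']);

// Called without arguments, mb_detect_order() returns the current order as an array.
$order = mb_detect_order();
print_r($order);
```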

mbstring.substitute_character string

Defines character to substitute for invalid character encoding. See mb_substitute_character() for supported values.
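
To make the effect of the substitute character concrete, here is a small sketch that runs a string containing an invalid byte through mb_convert_encoding() under two settings. It assumes the mbstring extension is loaded; the sample input and variable names are illustrative only.

```php
<?php
// Assumption: the mbstring extension is loaded.
// "\xFF" can never appear in well-formed UTF-8, so it triggers substitution.
$input = "abc\xFFdef";

// "none": invalid bytes are dropped from the output.
mb_substitute_character('none');
$stripped = mb_convert_encoding($input, 'UTF-8', 'UTF-8');

// An integer is taken as a Unicode code point; 0x3013 is 12307 in decimal.
mb_substitute_character(0x3013);
$marked = mb_convert_encoding($input, 'UTF-8', 'UTF-8');

var_dump($stripped, $marked);
```

Converting from UTF-8 to UTF-8 is used here only as a simple way to push the string through the conversion filter.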

mbstring.func_overload string
Warning

This feature has been DEPRECATED as of PHP 7.2.0, and REMOVED as of PHP 8.0.0. Relying on this feature is highly discouraged.

Overloads a set of single byte functions by the mbstring counterparts. See Function overloading for more information.

This setting can only be changed from the php.ini file.

mbstring.http_output_conv_mimetypes string

mbstring.strict_detection bool

Enables strict encoding detection. See mb_detect_encoding() for a description and examples.
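
The sketch below shows what strict detection changes in practice. It passes the $strict argument of mb_detect_encoding() explicitly instead of relying on the ini directive, and assumes the mbstring extension is loaded; the sample bytes are an illustrative choice.

```php
<?php
// Assumption: the mbstring extension is loaded.
// "áéóú" encoded as ISO-8859-1: four bytes that are neither valid ASCII nor valid UTF-8.
$bytes = "\xE1\xE9\xF3\xFA";

// Strict mode: no candidate encoding is valid for the input, so detection fails.
$strictMiss = mb_detect_encoding($bytes, ['ASCII', 'UTF-8'], true);

// Once a valid candidate is listed, strict mode reports it.
$strictHit = mb_detect_encoding($bytes, ['ASCII', 'UTF-8', 'ISO-8859-1'], true);

var_dump($strictMiss, $strictHit);
```

Without strict mode, the first call may instead return the closest matching candidate even though the input is not valid in it.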

mbstring.regex_retry_limit int

Limits the amount of backtracking that may be performed during one mbregex match.

This setting only takes effect when linking against oniguruma >= 6.8.0.

mbstring.regex_stack_limit int

Limits the stack depth of mbstring regular expressions.

According to the HTML 4.01 specification, web browsers are allowed to encode a form being submitted with a character encoding different from the one used for the page. See mb_http_input() to detect the character encoding used by browsers.

Although popular browsers can make a reasonably accurate guess at the character encoding of a given HTML document, it is better to set the charset parameter in the Content-Type HTTP header to the appropriate value, using header() or the default_charset ini setting.

Example #1 php.ini setting examples

; Set default language
mbstring.language        = Neutral; Set default language to Neutral(UTF-8) (default)
mbstring.language        = English; Set default language to English 
mbstring.language        = Japanese; Set default language to Japanese

;; Set default internal encoding
;; Note: Make sure to use a character encoding that works with PHP
mbstring.internal_encoding    = UTF-8  ; Set internal encoding to UTF-8

;; HTTP input encoding translation is enabled.
mbstring.encoding_translation = On

;; Set default HTTP input character encoding
;; Note: Script cannot change http_input setting.
mbstring.http_input           = pass    ; No conversion. 
mbstring.http_input           = auto    ; Set HTTP input to auto
                                ; "auto" is expanded according to mbstring.language
mbstring.http_input           = SJIS    ; Set HTTP input to SJIS
mbstring.http_input           = UTF-8,SJIS,EUC-JP ; Specify order

;; Set default HTTP output character encoding 
mbstring.http_output          = pass    ; No conversion
mbstring.http_output          = UTF-8   ; Set HTTP output encoding to UTF-8

;; Set default character encoding detection order
mbstring.detect_order         = auto    ; Set detect order to auto
mbstring.detect_order         = ASCII,JIS,UTF-8,SJIS,EUC-JP ; Specify order

;; Set default substitute character
mbstring.substitute_character = 12307   ; Specify Unicode value
mbstring.substitute_character = none    ; Do not print character
mbstring.substitute_character = long    ; Long Example: U+3000,JIS+7E7E

Example #2 php.ini setting for EUC-JP users

;; Disable Output Buffering
output_buffering      = Off

;; Set HTTP header charset
default_charset       = EUC-JP    

;; Set default language to Japanese
mbstring.language = Japanese

;; HTTP input encoding translation is enabled.
mbstring.encoding_translation = On

;; Set HTTP input encoding conversion to auto
mbstring.http_input   = auto 

;; Convert HTTP output to EUC-JP
mbstring.http_output  = EUC-JP    

;; Set internal encoding to EUC-JP
mbstring.internal_encoding = EUC-JP    

;; Do not print invalid characters
mbstring.substitute_character = none   

Example #3 php.ini setting for SJIS users

;; Enable Output Buffering
output_buffering     = On

;; Set mb_output_handler to enable output conversion
output_handler       = mb_output_handler

;; Set HTTP header charset
default_charset      = Shift_JIS

;; Set default language to Japanese
mbstring.language = Japanese

;; Set http input encoding conversion to auto
mbstring.http_input  = auto 

;; Convert to SJIS
mbstring.http_output = SJIS    

;; Set internal encoding to EUC-JP
mbstring.internal_encoding = EUC-JP    

;; Do not print invalid characters
mbstring.substitute_character = none   

User Contributed Notes 2 notes

ASchmidt at Anamera dot net
6 years ago
The documentation is vague on WHAT precisely the valid "NLS" language strings are for "mbstring.language".

According to http://php.net/manual/en/function.mb-language.php the values are "Japanese", "ja", "English", "en", or "uni" for UTF-8.
On the other hand, the sample on this current page omits "uni" but introduces "Neutral" as an undocumented option - which is also the default value:

<?php
var_dump( mb_language() ); // "neutral" (default if not set)
var_dump( mb_language( 'uni' ) ); // TRUE, valid language string
var_dump( mb_language() ); // "uni"
var_dump( mb_language( 'neutral' ) ); // TRUE, valid language string
var_dump( mb_language() ); // "neutral"
?>
Hayley Watson
6 years ago
String literals in the PHP script are encoded with the same encoding that the PHP file was saved with. This is not affected by default_charset or other .ini settings.

Scenario: The default_charset is KOI8-R, and there is a text file "input.txt" containing the string "Это текст для поиска." in KOI8-R encoding.

A PHP script is written:
<?php

// mb_internal_encoding('KOI8-R');

// This literal is stored in whatever encoding the source file was saved with.
$string = 'текст';

$data = file_get_contents('input.txt');

echo mb_strpos($data, $string);

?>
But unfortunately it was saved as UTF-8.

It doesn't work; mb_strpos() returns false because it can't find the UTF-8-encoded "текст" inside the KOI8-R-encoded "Это текст для поиска.".

Adjusting the default_charset had no effect. Not even fiddling with mb_internal_encoding could fix it, simply because the strings involved had *different* encodings and without actually changing one of them they just weren't going to match.

Either re-save the source file as KOI8-R to match the data file, or re-save the data file as UTF-8 to match the source code. Only then will the script properly echo '4'.