Supported Character Encodings

The mbstring module currently supports the following character encodings. Any of them can be specified in the encoding parameter of mbstring functions.

  • UCS-4*
  • UCS-4BE
  • UCS-4LE*
  • UCS-2
  • UCS-2BE
  • UCS-2LE
  • UTF-32*
  • UTF-32BE*
  • UTF-32LE*
  • UTF-16*
  • UTF-16BE*
  • UTF-16LE*
  • UTF-7
  • UTF7-IMAP
  • UTF-8*
  • ASCII*
  • EUC-JP*
  • SJIS*
  • eucJP-win*
  • SJIS-win*
  • ISO-2022-JP
  • ISO-2022-JP-MS
  • CP932
  • CP51932
  • SJIS-mac (alias: MacJapanese)
  • SJIS-Mobile#DOCOMO (alias: SJIS-DOCOMO)
  • SJIS-Mobile#KDDI (alias: SJIS-KDDI)
  • SJIS-Mobile#SOFTBANK (alias: SJIS-SOFTBANK)
  • UTF-8-Mobile#DOCOMO (alias: UTF-8-DOCOMO)
  • UTF-8-Mobile#KDDI-A
  • UTF-8-Mobile#KDDI-B (alias: UTF-8-KDDI)
  • UTF-8-Mobile#SOFTBANK (alias: UTF-8-SOFTBANK)
  • ISO-2022-JP-MOBILE#KDDI (alias: ISO-2022-JP-KDDI)
  • JIS
  • JIS-ms
  • CP50220
  • CP50220raw
  • CP50221
  • CP50222
  • ISO-8859-1*
  • ISO-8859-2*
  • ISO-8859-3*
  • ISO-8859-4*
  • ISO-8859-5*
  • ISO-8859-6*
  • ISO-8859-7*
  • ISO-8859-8*
  • ISO-8859-9*
  • ISO-8859-10*
  • ISO-8859-13*
  • ISO-8859-14*
  • ISO-8859-15*
  • ISO-8859-16*
  • byte2be
  • byte2le
  • byte4be
  • byte4le
  • BASE64
  • HTML-ENTITIES (alias: HTML)
  • 7bit
  • 8bit
  • EUC-CN*
  • CP936
  • GB18030
  • HZ
  • EUC-TW*
  • CP950
  • BIG-5*
  • EUC-KR*
  • UHC (alias: CP949)
  • ISO-2022-KR
  • Windows-1251 (alias: CP1251)
  • Windows-1252 (alias: CP1252)
  • CP866 (alias: IBM866)
  • KOI8-R*
  • KOI8-U*
  • ArmSCII-8 (alias: ArmSCII8)

* denotes encodings that can also be used in regular expressions.
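As a rough illustration of passing an encoding name to mbstring functions, the sketch below uses two names from the list above (UTF-8 and ISO-8859-1). The sample string and variable names are made up for the example, and the last step assumes mbstring was built with its regular expression support (the mb_ereg family).

```php
<?php
// Sample text: "café" written as UTF-8 (the "é" takes two bytes in UTF-8).
$utf8 = "caf\u{00E9}";

// The encoding parameter tells mbstring how to interpret the bytes.
$charsUtf8 = mb_strlen($utf8, 'UTF-8'); // counts characters
$bytesUtf8 = strlen($utf8);             // counts bytes

// Convert to ISO-8859-1, where "é" is a single byte.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
$bytesLatin1 = strlen($latin1);

// UTF-8 carries a * in the list, so it can also be selected for the
// multibyte regular expression functions.
mb_regex_encoding('UTF-8');
$matched = (bool) mb_ereg('^caf.$', $utf8); // "." matches one whole character

var_dump($charsUtf8, $bytesUtf8, $bytesLatin1, $matched);
```

Comparing the character count with the byte counts is a quick way to confirm that the encoding name passed in matches how the string is actually encoded.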

Any php.ini entry which accepts an encoding name can also use the values "auto" and "pass". mbstring functions which accept an encoding name can also use the value "auto".

If "pass" is set, no character encoding conversion is performed.

If "auto" is set, it is expanded to the list of encodings defined per the NLS. For instance, if the NLS is set to Japanese, the value is assumed to be "ASCII,JIS,UTF-8,EUC-JP,SJIS".

See also mb_detect_order()
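
The detection order mentioned above can be inspected and changed at runtime. The following is a minimal sketch, assuming the mbstring extension is loaded; the candidate list "ASCII,UTF-8" and the sample string are chosen only for the example.

```php
<?php
// Remember the current detection order so it can be restored afterwards.
$previous = mb_detect_order();

// Set an explicit list of candidate encodings (comma-separated names).
mb_detect_order('ASCII,UTF-8');
$order = mb_detect_order();

// "café" as UTF-8 contains bytes outside the ASCII range,
// so only UTF-8 remains a valid candidate in strict mode.
$detected = mb_detect_encoding("caf\u{00E9}", $order, true);

// Restore the original order.
mb_detect_order($previous);

var_dump($order, $detected);
```

Passing an explicit candidate list keeps the result independent of the configured language setting.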

User Contributed Notes (3 notes)

akniep at rayo dot info
12 years ago
Use mb_list_encodings() to check if an encoding is supported by mbstring before using its functions for it.
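
A check along those lines might look like the sketch below. The helper name is_supported_encoding() is hypothetical (it is not part of mbstring); it compares names case-insensitively and also looks through the aliases reported by mb_encoding_aliases().

```php
<?php
// Hypothetical helper, not provided by mbstring itself.
function is_supported_encoding(string $name): bool
{
    foreach (mb_list_encodings() as $encoding) {
        // Match against the primary name.
        if (strcasecmp($encoding, $name) === 0) {
            return true;
        }
        // Match against any aliases registered for this encoding.
        $aliases = mb_encoding_aliases($encoding) ?: [];
        foreach ($aliases as $alias) {
            if (strcasecmp($alias, $name) === 0) {
                return true;
            }
        }
    }
    return false;
}

var_dump(is_supported_encoding('UTF-8'));
var_dump(is_supported_encoding('CP1252'));          // listed above as an alias
var_dump(is_supported_encoding('NOT-AN-ENCODING'));
```

Checking the name up front avoids the errors or warnings that mbstring functions raise for unknown encodings.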
Anonymous
10 years ago
CP850 (DOS-Latin-1) is also supported.
Tomolimo (olivier dot moron at raynet-it dot com)
11 years ago
Apart from this list, the GB2312 encoding is also supported.
It is a Simplified Chinese encoding that has now been superseded by GB18030, but GB2312 is not in the list.
If you try to use it, the result will be all right even though it is not in the list.
Regards,
Tomolimo