mb_split

(PHP 4 >= 4.2.0, PHP 5, PHP 7, PHP 8)

mb_split使用正则表达式分割多字节字符串

说明

mb_split(string $pattern, string $string, int $limit = -1): array|false

使用正则表达式 pattern 分割多字节 string 并返回结果 array

参数

pattern

正则表达式。

string

待分割的 string

limit
如果指定了可选参数 limit,将最多分割为 limit 个元素。

返回值

结果为 array, 或者在失败时返回 false

注释

注意:

The character encoding specified by mb_regex_encoding() will be used as the character encoding for this function by default.

参见

  • mb_regex_encoding() - Set/Get character encoding for multibyte regex
  • mb_ereg() - Regular expression match with multibyte support
  • explode() - 使用一个字符串分割另一个字符串

添加备注

用户贡献的备注 8 notes

up
62
Stas Trefilov, Vertilia
10 years ago
a (simpler) way to extract all characters from a UTF-8 string to array with a single call to a built-in function:<?php  $str = 'Ма-руся';  print_r(preg_split('//u', $str, null, PREG_SPLIT_NO_EMPTY));?>Output:Array(    [0] => М    [1] => а    [2] => -    [3] =>     [4] => р    [5] => у    [6] => с    [7] => я)
up
51
boukeversteegh at gmail dot com
14 years ago
The $pattern argument doesn't use /pattern/ delimiters, unlike other regex functions such as preg_match.<?php   # Works. No slashes around the /pattern/   print_r( mb_split("\s", "hello world") );   Array (      [0] => hello      [1] => world   )   # Doesn't work:   print_r( mb_split("/\s/", "hello world") );   Array (      [0] => hello world   )?>
up
48
adjwilli at yahoo dot com
17 years ago
I figure most people will want a simple way to break-up a multibyte string into its individual characters. Here's a function I'm using to do that. Change UTF-8 to your chosen encoding method.

<?php
function mbStringToArray ($string) {
    $strlen = mb_strlen($string);
    while ($strlen) {
        $array[] = mb_substr($string,0,1,"UTF-8");
        $string = mb_substr($string,1,$strlen,"UTF-8");
        $strlen = mb_strlen($string);
    }
    return $array;
}
?>
up
22
boukeversteegh at gmail dot com
14 years ago
In addition to Sezer Yalcin's tip.

This function splits a multibyte string into an array of characters. Comparable to str_split().

<?php
function mb_str_split( $string ) {
    # Split at all position not after the start: ^
    # and not before the end: $
    return preg_split('/(?<!^)(?!$)/u', $string );
}

$string   = '火车票';
$charlist = mb_str_split( $string );

print_r( $charlist );
?>

# Prints:
Array
(
    [0] => 火
    [1] => 车
    [2] => 票
)
up
3
thflori at gmail
8 years ago
I agree that some people might want a mb_explode('', $string);this is my solution for it:<?php$string = 'Hallöle';$array = array_map(function ($i) use ($string) {     return mb_substr($string, $i, 1); }, range(0, mb_strlen($string) -1));expect($array)->toEqual(['H', 'a', 'l', 'l', 'ö', 'l', 'e']);?>
up
7
gunkan at terra dot es
13 years ago
To split an string like this: "日、に、本、ほん、語、ご" using the "、" delimiter i used:     $v = mb_split('、',"日、に、本、ほん、語、ご");but didn't work.The solution was to set this before:       mb_regex_encoding('UTF-8');      mb_internal_encoding("UTF-8");      $v = mb_split('、',"日、に、本、ほん、語、ご");and now it's working:Array(    [0] => 日    [1] => に    [2] => 本    [3] => ほん    [4] => 語    [5] => ご)
up
0
qdb at kukmara dot ru
15 years ago
an other way to str_split multibyte string:<?php$s='әӘөүҗңһ';//$temp_s=iconv('UTF-8','UTF-16',$s);$temp_s=mb_convert_encoding($s,'UTF-16','UTF-8');$temp_a=str_split($temp_s,4);$temp_a_len=count($temp_a);for($i=0;$i<$temp_a_len;$i++){    //$temp_a[$i]=iconv('UTF-16','UTF-8',$temp_a[$i]);    $temp_a[$i]=mb_convert_encoding($temp_a[$i],'UTF-8','UTF-16');}echo('<pre>');print_r($temp_a);echo('</pre>');//also possible to directly use UTF-16:define('SLS',mb_convert_encoding('/','UTF-16'));$temp_s=mb_convert_encoding($s,'UTF-16','UTF-8');$temp_a=str_split($temp_s,4);$temp_s=implode(SLS,$temp_a);$temp_s=mb_convert_encoding($temp_s,'UTF-8','UTF-16');echo($temp_s);?>
up
-3
gert dot matern at web dot de
16 years ago
We are talking about Multi Byte ( e.g. UTF-8) strings here, so preg_split will fail for the following string: 

'Weiße Rosen sind nicht grün!'

And because I didn't find a regex to simulate a str_split I optimized the first solution from adjwilli a bit:

<?php
$string = 'Weiße Rosen sind nicht grün!'
$stop   = mb_strlen( $string);
$result = array();

for( $idx = 0; $idx < $stop; $idx++)
{
   $result[] = mb_substr( $string, $idx, 1);
}
?>

Here is an example with adjwilli's function:

<?php
mb_internal_encoding( 'UTF-8');
mb_regex_encoding( 'UTF-8');  

function mbStringToArray
( $string
)
{
  $stop   = mb_strlen( $string);
  $result = array();

  for( $idx = 0; $idx < $stop; $idx++)
  {
     $result[] = mb_substr( $string, $idx, 1);
  }

  return $result;
}

echo '<pre>', PHP_EOL, 
print_r( mbStringToArray( 'Weiße Rosen sind nicht grün!', true)), PHP_EOL,
'</pre>'; 
?>

Let me know [by personal email], if someone found a regex to simulate a str_split with mb_split.
To Top