strcmp

(PHP 4, PHP 5, PHP 7, PHP 8)

strcmp — Binary safe string comparison

Description

strcmp(string $string1, string $string2): int

Note that this comparison is case sensitive. For a case-insensitive comparison, see strcasecmp().

Note that this comparison is not locale aware. For a locale-aware comparison, see strcoll() or Collator::compare().
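
As a quick illustration of the case-sensitivity note, the sketch below compares the same pair of strings with strcmp() and strcasecmp(). The variable names are only for illustration, and only the sign of the strcmp() result is relied on.

```php
<?php
// Illustrative values; any pair that differs only in case behaves the same way.
$a = "Hello";
$b = "hello";

// Byte-wise comparison: "H" (0x48) sorts before "h" (0x68), so the result is negative.
$caseSensitive = strcmp($a, $b);

// The case-insensitive comparison treats the two strings as equal.
$caseInsensitive = strcasecmp($a, $b);

var_dump($caseSensitive < 0);      // bool(true)
var_dump($caseInsensitive === 0);  // bool(true)
```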

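Parameters

string1: The first string.
string2: The second string.

Return Values

Returns a value less than 0 if string1 is less than string2, a value greater than 0 if string1 is greater than string2, and 0 if they are equal. The absolute value of the returned value has no particular meaning.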
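
Because only the sign of the result is meaningful, code that needs a fixed -1, 0 or 1 can normalize it. The following is a minimal sketch; strcmp_sign() is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: collapse any strcmp() result to -1, 0 or 1.
function strcmp_sign(string $a, string $b): int
{
    return strcmp($a, $b) <=> 0;
}

// usort() only looks at the sign, so strcmp() can be passed directly as the callback.
$words = ["pear", "apple", "fig"];
usort($words, 'strcmp');

var_dump(strcmp_sign("a", "z"));     // int(-1)
var_dump(strcmp_sign("z", "a"));     // int(1)
var_dump(strcmp_sign("abc", "abc")); // int(0)
print_r($words);                     // apple, fig, pear
```

Written this way, the code does not depend on the magnitude of the return value in any PHP version.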

Changelog

Version	Description
8.2.0	This function is no longer guaranteed to return `strlen($string1) - strlen($string2)` when the two strings differ in length; it may return `-1` or `1` instead.

Examples

Example #1 strcmp() example

<?php
$var1 = "Hello";
$var2 = "hello";
if (strcmp($var1, $var2) !== 0) {
    echo '$var1 is not equal to $var2 in a case sensitive string comparison';
}
?>

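See Also

Full string comparison
- strcasecmp() - Binary safe case-insensitive string comparison
- Collator::compare() - Compare two Unicode strings
- strcoll() - Locale based string comparison
Partial string comparison
- substr_compare() - Binary safe comparison of two strings from an offset, up to a given length
- strncmp() - Binary safe string comparison of the first n characters
- strstr() - Find the first occurrence of a string
String similarity / other string comparison
- preg_match() - Perform a regular expression match
- levenshtein() - Calculate the Levenshtein distance between two strings
- metaphone() - Calculate the metaphone key of a string
- similar_text() - Calculate the similarity between two strings
- soundex() - Calculate the soundex key of a string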

User Contributed Notes 16 notes

jendoj at gmail dot com ¶

13 years ago

If you rely on strcmp for safe string comparisons, both parameters must be strings; otherwise the result is extremely unpredictable.
For instance, you may get an unexpected 0, or return values of NULL, -2, 2, 3 and -3.

strcmp("5", 5) => 0
strcmp("15", 0xf) => 0
strcmp(61529519452809720693702583126814, 61529519452809720000000000000000) => 0
strcmp(NULL, false) => 0
strcmp(NULL, "") => 0
strcmp(NULL, 0) => -1
strcmp(false, -1) => -2
strcmp("15", NULL) => 2
strcmp(NULL, "foo") => -3
strcmp("foo", NULL) => 3
strcmp("foo", false) => 3
strcmp("foo", 0) => 1
strcmp("foo", 5) => 1
strcmp("foo", array()) => NULL + PHP Warning
strcmp("foo", new stdClass) => NULL + PHP Warning
strcmp(function(){}, "") => NULL + PHP Warning
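
One way to avoid these surprises is to refuse to compare anything that is not already a string. The sketch below assumes PHP 8.0 or later (for the `mixed` type); strict_strcmp() is a hypothetical helper, not a PHP function.

```php
<?php
// Hypothetical wrapper: only compare when both arguments really are strings.
function strict_strcmp(mixed $a, mixed $b): ?int
{
    if (!is_string($a) || !is_string($b)) {
        return null; // the caller must treat null as "not comparable", never as "equal"
    }
    return strcmp($a, $b) <=> 0;
}

var_dump(strict_strcmp("5", "5"));   // int(0)
var_dump(strict_strcmp("5", 5));     // NULL
var_dump(strict_strcmp("foo", []));  // NULL
var_dump(strict_strcmp(null, ""));   // NULL
```

Callers should test the result with `=== 0`, so that a null result is never mistaken for a match.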

lehal2 at hotmail dot com ¶

12 years ago

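I hope this gives you a clear idea of how strcmp works internally. The return values shown in the comments predate PHP 8.2; from that version on, the non-zero results may be just -1 or 1.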

<?php
$str1 = "b";
echo ord($str1); //98
echo "<br/>";
$str2 = "t";
echo ord($str2); //116
echo "<br/>";
echo ord($str1)-ord($str2);//-18
$str1 = "bear";
$str2 = "tear";
$str3 = "";
echo "<pre>";
echo strcmp($str1, $str2); // -18
echo "<br/>";
echo strcmp($str2, $str1); //18
echo "<br/>";
echo strcmp($str2, $str2); //0
echo "<br/>";
echo strcmp($str2, $str3); //4
echo "<br/>";
echo strcmp($str3, $str2); //-4
echo "<br/>";
echo strcmp($str3, $str3); // 0
echo "</pre>";
?>
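
The same idea can be written out as a loop. This is a rough sketch of byte-wise comparison for illustration only, not PHP's actual implementation; byte_compare() is a hypothetical name, and it returns only the sign.

```php
<?php
// Hypothetical re-implementation of the idea behind strcmp(), for illustration only.
function byte_compare(string $a, string $b): int
{
    $common = min(strlen($a), strlen($b));
    for ($i = 0; $i < $common; $i++) {
        $diff = ord($a[$i]) <=> ord($b[$i]);
        if ($diff !== 0) {
            return $diff; // the first differing byte decides the result
        }
    }
    // One string is a prefix of the other (or they are equal): the shorter one sorts first.
    return strlen($a) <=> strlen($b);
}

var_dump(byte_compare("bear", "tear")); // int(-1)
var_dump(byte_compare("tear", ""));     // int(1)
var_dump(byte_compare("tear", "tear")); // int(0)
```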

Rob Wiesler ¶

16 years ago

One big caveat: strings retrieved from the backtick operator carry the command's trailing newline, and therefore will not be equal to the same text written as a PHP literal. (The example below labels the two strings "zero-terminated" and "Pascal-style", but the extra byte at index 8 is most likely the newline printed by pwd, not a NUL terminator.) The workaround is to wrap every `` pair or shell_exec() call in trim(). This is likely to be an issue with other functions that invoke shells; I haven't bothered to check.

On Debian Lenny (and RHEL 5, with minor differences), I get this:

====PHP====
<?php
$sz = `pwd`;
$ps = "/var/www";

echo "Zero-terminated string:<br />sz = ".$sz."<br />str_split(sz) = "; print_r(str_split($sz));
echo "<br /><br />";

echo "Pascal-style string:<br />ps = ".$ps."<br />str_split(ps) = "; print_r(str_split($ps));
echo "<br /><br />";

echo "Normal results of comparison:<br />";
echo "sz == ps = ".($sz == $ps ? "true" : "false")."<br />";
echo "strcmp(sz,ps) = ".strcmp($sz,$ps);
echo "<br /><br />";

echo "Comparison with trim()'d zero-terminated string:<br />";
echo "trim(sz) = ".trim($sz)."<br />";
echo "str_split(trim(sz)) = "; print_r(str_split(trim($sz))); echo "<br />";
echo "trim(sz) == ps = ".(trim($sz) == $ps ? "true" : "false")."<br />";
echo "strcmp(trim(sz),ps) = ".strcmp(trim($sz),$ps);
?>

====Output====
Zero-terminated string:
sz = /var/www 
str_split(sz) = Array ( [0] => / [1] => v [2] => a [3] => r [4] => / [5] => w [6] => w [7] => w [8] => ) 

Pascal-style string:
ps = /var/www
str_split(ps) = Array ( [0] => / [1] => v [2] => a [3] => r [4] => / [5] => w [6] => w [7] => w ) 

Normal results of comparison:
sz == ps = false
strcmp(sz,ps) = 1

Comparison with trim()'d zero-terminated string:
trim(sz) = /var/www
str_split(trim(sz)) = Array ( [0] => / [1] => v [2] => a [3] => r [4] => / [5] => w [6] => w [7] => w ) 
trim(sz) == ps = true
strcmp(trim(sz),ps) = 0
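
The effect can be reproduced without a shell. In the sketch below, a literal with a trailing "\n" stands in for command output (an assumption made for illustration), and bin2hex() shows which byte is actually at the end of the string.

```php
<?php
// Stand-in for the output of `pwd`; shell commands typically end their output with "\n".
$fromShell = "/var/www\n";
$expected  = "/var/www";

// The extra trailing byte makes the first string compare as greater.
var_dump(strcmp($fromShell, $expected) > 0);                  // bool(true)

// Stripping trailing line breaks makes the strings equal.
var_dump(strcmp(rtrim($fromShell, "\r\n"), $expected) === 0); // bool(true)

// Inspect the last byte: 0a is a line feed, not a NUL (00).
var_dump(bin2hex(substr($fromShell, -1)));                    // string(2) "0a"
```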

kgun ! mail ! com ¶

6 years ago

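If you always want -1, 0 or 1 as the result (like the spaceship operator <=>):

<?php
function cmp(string $str1, string $str2): int {
    // Normalize strcmp()'s result to its sign. Comparing the strings directly
    // with < and > would compare numeric strings such as "10" and "9" as numbers.
    return strcmp($str1, $str2) <=> 0;
}

$str1 = 'a';
$str2 = 'z';
var_dump(cmp($str1, $str2), strcmp($str1, $str2));
//=> int(-1) int(-25)   (strcmp() may return -1 here as of PHP 8.2)

$str1 = 'a';
$str2 = '1';
var_dump(cmp($str1, $str2), strcmp($str1, $str2));
//=> int(1) int(48)   (strcmp() may return 1 here as of PHP 8.2)
?>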

frewuill at merlin-corp dot com ¶

25 years ago

Make sure the strings you are comparing do not contain stray special characters such as '\n'.

luizvid at gmail dot com ¶

10 years ago

strcmp returns -1 or 1 if the two strings are not identical, and 0 when they are, except when comparing a string with an empty string ($a = ""), in which case it returns the length of the non-empty string (negated if the empty string comes first).

For instance:
<?php
        $a = "foo"; // length 3
        $b = ""; // empty string
        $c = "barbar"; // length 6

        echo strcmp($a, $a); // outputs 0
        echo strcmp($a, $c); // outputs 1
        echo strcmp($c, $a); // outputs -1
        echo strcmp($a, $b); // outputs 3
        echo strcmp($b, $a); // outputs -3
        echo strcmp($c, $b); // outputs 6
        echo strcmp($b, $c); // outputs -6
?>

hm2k at php dot net ¶

15 years ago

Don't forget the similar_text() function...

http://php.net/manual/en/function.similar-text.php

erik at eldata dot se ¶

5 years ago

strcmp and strcasecmp do not work well with multibyte (UTF-8) strings, and there is no mb_strcmp or mb_strcasecmp. Instead, look at the wonderful Collator class and its compare method (search for Collator above), which supports not only UTF-8 but also different national collations (sort orders).

Natural sort is also supported: use setAttribute to set Collator::NUMERIC_COLLATION to Collator::ON.
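
A small sketch of the difference, assuming the intl extension is installed (the Collator part is skipped otherwise). The 'en_US' locale and the sample strings are arbitrary choices for illustration.

```php
<?php
// "\u{E9}" is "é" encoded as UTF-8 (bytes 0xC3 0xA9).
$accented = "\u{E9}";

// Byte-wise, the lead byte 0xC3 is larger than "z" (0x7A), so "é" sorts after "z".
var_dump(strcmp($accented, "z") > 0);     // bool(true)

// Byte-wise, "file10" sorts before "file2" because "1" is smaller than "2".
var_dump(strcmp("file10", "file2") < 0);  // bool(true)

if (class_exists('Collator')) {
    $collator = new Collator('en_US'); // locale chosen only for illustration

    // A collator is expected to place "é" near "e", that is, before "z".
    var_dump($collator->compare($accented, "z"));

    // Numeric collation compares digit runs by value, so "file10" should come after "file2".
    $collator->setAttribute(Collator::NUMERIC_COLLATION, Collator::ON);
    var_dump($collator->compare("file10", "file2"));
}
```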

hrodicus at gmail dot com ¶

14 years ago

Note a difference between versions 5.2 and 5.3:

echo (int)strcmp('pending',array());
outputs -1 in PHP 5.2.16 (probably in all versions prior to 5.3)
but outputs 0 in PHP 5.3.3.

Of course, you should never need to pass an array as a parameter in string comparisons.

jcanals at totsoft dot com ¶

21 years ago

Some notes about the Spanish locale. I've read some notes saying that "CH", "RR" or "LL" must be considered a single letter in Spanish. That's not really true. "CH", "RR" and "LL" were considered single letters in the past (many years ago); for that you must use the "Traditional Sort". Nowadays, the Academy uses the Modern Sort and recommends no longer treating "CH", "RR" and "LL" as single letters. They must be considered two separate letters, and sorted and compared that way.

You just have to take a look at the Official Spanish Language Dictionary: for many years now there has been no separate section for "CH", "LL" or "RR". That is, words starting with CH come after the ones starting with CG and before the ones starting with CI.

chris at unix-ninja dot com ¶

12 years ago

Since it may not be obvious to some people, please note that there is another possible return value for this function.

strcmp() will return NULL on failure, for example when an argument is an array. (This applies before PHP 8.0; from 8.0 on, a TypeError is thrown instead.)

This has the side effect of equating to a match when using an equals comparison (==).
Instead, you may wish to test matches using the identical comparison (===), which does not treat a NULL return as a match.

---------------------
  Example
---------------------

$variable1 = array();
$variable2 = "expected"; // any string
$ans = (strcmp($variable1, $variable2) === 0);

This stops $ans from reporting a match.

Please use strcmp() carefully when comparing user input, as this may have potential security implications in your code.

mikael1 at mail dot ru ¶

6 years ago

1) If the two strings have identical BEGINNING parts, those parts are truncated from both strings.
2) The resulting strings are compared, with two possible outcomes:
a) if one of the resulting strings is an empty string, then the length of the non-empty string is returned (the sign depending on the order in which you pass the arguments to the function)
b) in any other case, just the numerical values of the FIRST characters are compared. The result is +1 or -1, no matter how big the difference between the numerical values is.

<?php
$str = array('','a','afox','foxa');
$size = count($str);

echo '<pre>';
for($i=0; $i<$size; $i++)
{
  for($j=$i+1; $j<$size; $j++)
  {
    echo '<br>('.$str[$i].','.$str[$j].') = '.strcmp($str[$i], $str[$j]);
    echo '<br>('.$str[$j].','.$str[$i] .') = '.strcmp($str[$j], $str[$i]);
  }
}
echo '</pre>'; 
?>

In Apache/2.4.37 (Win32) OpenSSL/1.1.1 PHP/7.2.12 produces the following results:

(,a) = -1 //comparing with an empty string produces the length of the NON-empty string
(a,) = 1 // ditto
(,afox) = -4 // ditto
(afox,) = 4 // ditto
(,foxa) = -4 // ditto
(foxa,) = 4 // ditto
(a,afox) = -3 // The identical BEGINNING part ("a") is truncated from both strings. Then the remaining "fox" is compared to the remaining empty string in the other argument. Produces the length of the NON-empty string. Same as in all the above examples.
(afox,a) = 3 // ditto
(a,foxa) = -1 // Nothing to truncate. Just the numerical values of the first letters are compared
(foxa,a) = 1 // ditto
(afox,foxa) = -1 // ditto
(foxa,afox) = 1 // ditto

Anonymous ¶

23 years ago

In summary, locale-aware comparison (strcoll(); note that strcmp() itself always compares raw bytes, whatever the locale) does not necessarily use the ASCII code order of each character as in the 'C' locale, but instead parses each string to match language-specific character entities (such as 'ch' in Spanish, or 'dz' in Czech), whose collation order is then compared. When both character entities have the same collation order (such as 'ss' and 'ß' in German), they are compared relative to their code, or considered equal in a case-insensitive comparison.
The LC_COLLATE locale setting is what matters here: only if LC_COLLATE=C or LC_ALL=C does strcoll() compare strings by character code.
Generally, most locales define the following order:
control, space, punctuation and underscore, digit, alpha (lower then upper with Latin scripts; or final, middle, then isolated, initial with Arabic script), symbols, others...
With a case-insensitive comparison, the alpha subclass is ignored and all forms of a letter are considered equal.
Note also that some locales behave differently with accented characters: some consider them the same letter as the unaccented letter (with a minor collation order, e.g. French, Italian, Spanish), while others consider them distinct letters with an independent collation order (e.g. in the C locale, or in Nordic languages).
Finally, the collation does not consider individual characters but instead groups of characters that form a single letter:
- for example "ch" or "CH" in Spanish, which always comes after all other strings beginning with 'c' or 'C', including "cz", but before 'd' or 'D';
- 'ss' and 'ß' in German;
- 'dz', 'DZ' and 'Dz' in some Central European languages written with the Latin script...
- with the UTF-8, UTF-16 (Unicode), S-JIS, Big5 or ISO2022 character encoding of a locale (the suffix in the locale name), the characters are first decoded into their UCS4/ISO10646 code positions before the rules of the language indicated by the main locale are applied...
So be extremely careful about what you consider a "character", as it may just mean an encoding byte with no significance in the string collation algorithm: the first character of the string "cholera" in Spanish is "ch", not "c"!

wsogmm at seznam dot cz ¶

16 years ago

Just a short comment on the note by arnar at hm dot is: md5() is a hash function, so it may happen (although it is very unlikely) that the md5() checksums of two different strings are equal (a hash collision).

kamil dot k dot kielczewski at gmail dot com ¶

8 years ago

Vulnerability (in PHP >= 5.3 and before 8.0, where strcmp() returns NULL for an array argument, and NULL == 0 evaluates to true):

<?php
if (strcmp($_POST['password'], 'sekret') == 0) {
    echo "Welcome, authorized user!\n";
} else {
    echo "Go away, imposter.\n";
}
?>

$ curl -d password=sekret http://andersk.scripts.mit.edu/strcmp.php
Welcome, authorized user!

$ curl -d password=wrong http://andersk.scripts.mit.edu/strcmp.php
Go away, imposter.

$ curl -d password[]=wrong http://andersk.scripts.mit.edu/strcmp.php
Welcome, authorized user!

SRC of this example: https://www.quora.com/Why-is-PHP-hated-by-so-many-developers
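
A more defensive version of the check above might look like the following sketch. It assumes PHP 8.0 or later (for the `mixed` type); password_matches() is a hypothetical helper, and the hard-coded secret is kept only to mirror the example.

```php
<?php
// Hypothetical helper; the name and the plain-text secret are for illustration only.
function password_matches(mixed $input, string $expected): bool
{
    // Reject arrays, null and other non-string input before comparing anything.
    if (!is_string($input)) {
        return false;
    }
    // hash_equals() compares in constant time and returns a strict bool.
    return hash_equals($expected, $input);
}

var_dump(password_matches('sekret', 'sekret'));   // bool(true)
var_dump(password_matches('wrong', 'sekret'));    // bool(false)
var_dump(password_matches(['wrong'], 'sekret'));  // bool(false)
```

In a real application the password would normally be stored as a hash and checked with password_verify() rather than compared as plain text.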

francis at flourish dot org ¶

20 years ago

If you want to compare strings according to locale, use strcoll() instead.
