PHP Conference Nagoya 2025

stats_stat_correlation

(PECL stats >= 1.0.0)

stats_stat_correlationReturns the Pearson correlation coefficient of two data sets

Beschreibung

stats_stat_correlation(array $arr1, array $arr2): float

Returns the Pearson correlation coefficient between arr1 and arr2.

Parameter-Liste

arr1

The first array

arr2

The second array

Rückgabewerte

Returns the Pearson correlation coefficient between arr1 and arr2, or false on failure.

add a note

User Contributed Notes 3 notes

up
17
non at dot com
9 years ago
undefined for me, thus I've implemented my own correlation which is much faster and simpler than the one provided above.

function Corr($x, $y){

$length= count($x);
$mean1=array_sum($x) / $length;
$mean2=array_sum($y) / $length;

$a=0;
$b=0;
$axb=0;
$a2=0;
$b2=0;

for($i=0;$i<$length;$i++)
{
$a=$x[$i]-$mean1;
$b=$y[$i]-$mean2;
$axb=$axb+($a*$b);
$a2=$a2+ pow($a,2);
$b2=$b2+ pow($b,2);
}

$corr= $axb / sqrt($a2*$b2);

return $corr;
}
up
1
admin at maychu dot net (Le Cong)
15 years ago
Please note that this function is reserved for two arrays with continued numbers inside (just integers).
I tested this function and found that it calculate the Pearson's Correlation Coefficient of two arrays.
---
Here's suggested documentation:

stats_stat_correlation — Calculate the Pearson's Correlation Coefficient of two arrays of continued numbers.

Parameters:
arr1 = array (integer1a, interger2a ...)
arr2 = array (integer1b, interger2b ...))
(Note that the count of elements in two arrays must be equal)

Return value: Pearson's Correlation Coefficient in decimal format (ex. 0.934399822094)

Code examples:

<?php
// Provided by admin@maychu.net
$array_x = array(5,3,6,7,4,2,9,5);
$array_y = array(4,3,4,8,3,2,10,5);
$pearson = stats_stat_correlation($array_x,$array_y);
echo
$pearson;
?>
up
2
umar dot anjum at ymail dot com
15 years ago
I tried to use this function, but got a not-defined error. Anyway, I have created a set of functions to replace this:

<?php

//Since Correlation needs two arrays, I am hardcoding them
$array1[0] = 59.3;
$array1[1] = 61.2;
$array1[2] = 56.8
$array1
[3] = 97.55;

$array2[0] = 565.82;
$array2[1] = 54.568;
$array2[2] = 84.22;
$array2[3] = 483.55;

//To find the correlation of the two arrays, simply call the
//function Correlation that takes two arrays:

$correlation = Correlation($array1, $array2);

//Displaying the calculated Correlation:
print $correlation;

//The functions that work behind the scene to calculate the
//correlation

function Correlation($arr1, $arr2)
{
$correlation = 0;

$k = SumProductMeanDeviation($arr1, $arr2);
$ssmd1 = SumSquareMeanDeviation($arr1);
$ssmd2 = SumSquareMeanDeviation($arr2);

$product = $ssmd1 * $ssmd2;

$res = sqrt($product);

$correlation = $k / $res;

return
$correlation;
}

function
SumProductMeanDeviation($arr1, $arr2)
{
$sum = 0;

$num = count($arr1);

for(
$i=0; $i<$num; $i++)
{
$sum = $sum + ProductMeanDeviation($arr1, $arr2, $i);
}

return
$sum;
}

function
ProductMeanDeviation($arr1, $arr2, $item)
{
return (
MeanDeviation($arr1, $item) * MeanDeviation($arr2, $item));
}

function
SumSquareMeanDeviation($arr)
{
$sum = 0;

$num = count($arr);

for(
$i=0; $i<$num; $i++)
{
$sum = $sum + SquareMeanDeviation($arr, $i);
}

return
$sum;
}

function
SquareMeanDeviation($arr, $item)
{
return
MeanDeviation($arr, $item) * MeanDeviation($arr, $item);
}

function
SumMeanDeviation($arr)
{
$sum = 0;

$num = count($arr);

for(
$i=0; $i<$num; $i++)
{
$sum = $sum + MeanDeviation($arr, $i);
}

return
$sum;
}

function
MeanDeviation($arr, $item)
{
$average = Average($arr);

return
$arr[$item] - $average;
}

function
Average($arr)
{
$sum = Sum($arr);
$num = count($arr);

return
$sum/$num;
}

function
Sum($arr)
{
return
array_sum($arr);
}

?>
To Top