PHP Conference Nagoya 2025

stats_stat_correlation

(PECL stats >= 1.0.0)

stats_stat_correlationReturns the Pearson correlation coefficient of two data sets

Description

stats_stat_correlation(array $arr1, array $arr2): float

Returns the Pearson correlation coefficient between arr1 and arr2.

Parameters

arr1

The first array

arr2

The second array

Return Values

Returns the Pearson correlation coefficient between arr1 and arr2, or false on failure.

add a note

User Contributed Notes 3 notes

up
17
non at dot com
9 years ago
undefined for me, thus I've implemented my own correlation which is much faster and simpler than the one provided above.

function Corr($x, $y){

$length= count($x);
$mean1=array_sum($x) / $length;
$mean2=array_sum($y) / $length;

$a=0;
$b=0;
$axb=0;
$a2=0;
$b2=0;

for($i=0;$i<$length;$i++)
{
$a=$x[$i]-$mean1;
$b=$y[$i]-$mean2;
$axb=$axb+($a*$b);
$a2=$a2+ pow($a,2);
$b2=$b2+ pow($b,2);
}

$corr= $axb / sqrt($a2*$b2);

return $corr;
}
up
1
admin at maychu dot net (Le Cong)
15 years ago
Please note that this function is reserved for two arrays with continued numbers inside (just integers).
I tested this function and found that it calculate the Pearson's Correlation Coefficient of two arrays.
---
Here's suggested documentation:

stats_stat_correlation — Calculate the Pearson's Correlation Coefficient of two arrays of continued numbers.

Parameters:
arr1 = array (integer1a, interger2a ...)
arr2 = array (integer1b, interger2b ...))
(Note that the count of elements in two arrays must be equal)

Return value: Pearson's Correlation Coefficient in decimal format (ex. 0.934399822094)

Code examples:

<?php
// Provided by admin@maychu.net
$array_x = array(5,3,6,7,4,2,9,5);
$array_y = array(4,3,4,8,3,2,10,5);
$pearson = stats_stat_correlation($array_x,$array_y);
echo
$pearson;
?>
up
2
umar dot anjum at ymail dot com
15 years ago
I tried to use this function, but got a not-defined error. Anyway, I have created a set of functions to replace this:

<?php

//Since Correlation needs two arrays, I am hardcoding them
$array1[0] = 59.3;
$array1[1] = 61.2;
$array1[2] = 56.8
$array1
[3] = 97.55;

$array2[0] = 565.82;
$array2[1] = 54.568;
$array2[2] = 84.22;
$array2[3] = 483.55;

//To find the correlation of the two arrays, simply call the
//function Correlation that takes two arrays:

$correlation = Correlation($array1, $array2);

//Displaying the calculated Correlation:
print $correlation;

//The functions that work behind the scene to calculate the
//correlation

function Correlation($arr1, $arr2)
{
$correlation = 0;

$k = SumProductMeanDeviation($arr1, $arr2);
$ssmd1 = SumSquareMeanDeviation($arr1);
$ssmd2 = SumSquareMeanDeviation($arr2);

$product = $ssmd1 * $ssmd2;

$res = sqrt($product);

$correlation = $k / $res;

return
$correlation;
}

function
SumProductMeanDeviation($arr1, $arr2)
{
$sum = 0;

$num = count($arr1);

for(
$i=0; $i<$num; $i++)
{
$sum = $sum + ProductMeanDeviation($arr1, $arr2, $i);
}

return
$sum;
}

function
ProductMeanDeviation($arr1, $arr2, $item)
{
return (
MeanDeviation($arr1, $item) * MeanDeviation($arr2, $item));
}

function
SumSquareMeanDeviation($arr)
{
$sum = 0;

$num = count($arr);

for(
$i=0; $i<$num; $i++)
{
$sum = $sum + SquareMeanDeviation($arr, $i);
}

return
$sum;
}

function
SquareMeanDeviation($arr, $item)
{
return
MeanDeviation($arr, $item) * MeanDeviation($arr, $item);
}

function
SumMeanDeviation($arr)
{
$sum = 0;

$num = count($arr);

for(
$i=0; $i<$num; $i++)
{
$sum = $sum + MeanDeviation($arr, $i);
}

return
$sum;
}

function
MeanDeviation($arr, $item)
{
$average = Average($arr);

return
$arr[$item] - $average;
}

function
Average($arr)
{
$sum = Sum($arr);
$num = count($arr);

return
$sum/$num;
}

function
Sum($arr)
{
return
array_sum($arr);
}

?>
To Top