Binary search algorithm in PHP
Binary search is a search algorithm that is dramatically faster than PHP’s internal functions (such as array_search) when searching ordered data.
How does it work?
PHP’s internal function array_search is an example of a linear search; it will iterate over the entire data set until it finds a match working from front to back. This is great if your data set is small and unordered, but is incredibly inefficient when working over large data sets, especially if your match is toward the back of the set, or doesn’t exist at all.
A different approach – divide and conquer
Binary search approaches this problem in a different way. It divides the data set to find the match starting from the middle, to narrow the range that the match can be found within (hence the requirement for your data to be ordered).
A PHP implementation
Below is a PHP example of how to implement a binary search.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
function binarySearch($needle, array $haystack, $compare, $high, $low = 0, $containsDuplicates = false) { $key = false; // Whilst we have a range. If not, then that match was not found. while ($high >= $low) { // Find the middle of the range. $mid = (int)floor(($high + $low) / 2); // Compare the middle of the range with the needle. This should return <0 if it's in the first part of the range, // or >0 if it's in the second part of the range. It will return 0 if there is a match. $cmp = call_user_func($compare, $needle, $haystack[$mid]); // Adjust the range based on the above logic, so the next loop iteration will use the narrowed range if ($cmp < 0) { $high = $mid  1; } elseif ($cmp > 0) { $low = $mid + 1; } else { // We've found a match if ($containsDuplicates) { // Find the first item, if there is a possibility our data set contains duplicates by comparing the // previous item with the current item ($mid). while ($mid > 0 && call_user_func($compare, $haystack[($mid  1)], $haystack[$mid]) === 0) { $mid; } } $key = $mid; break; } } return $key; } 
We can utilise this function in the following example by searching an array of email addresses for a specific one. Any test data that appears here was generated by Faker.
1 2 3 
$emails = [/* array of emails */]; $searchEmail = 'wkeeling@gmail.com'; $key = binarySearch($searchEmail, $emails, 'strcmp', count($emails)  1, 0, true); 
Benchmarks
So let’s benchmark the performance of binary search against PHP’s internal array_search function over a variety of data set sizes and match positions.
Small data set – 100 items
Item exists as the first entry in the data set:
 PHP’s array_search: 0.02599999999997ms
 Binary search: 0.018999999999991ms
 Binary search is 1.37 times faster than array_search
Item exists around the middle of the data set:
 PHP’s array_search: 0.029999999999974ms
 Binary search: 0.020999999999993ms
 Binary search is 1.43 times faster than array_search
Item does not exist in the data set:
 PHP’s array_search: 0.03000000000003ms
 Binary search: 0.019000000000047ms
 Binary search is 1.58 times faster than array_search
Medium data set – 10,000 items
Item exists as the first entry in the data set:
 PHP’s array_search: 0.032000000000032ms
 Binary search: 0.023999999999968ms
 Binary search is 1.33 times faster than array_search
Item exists around the middle of the data set
 PHP’s array_search: 0.12000000000001ms
 Binary search: 0.02000000000002ms
 Binary search is 6 times faster than array_search
Item does not exist in the data set:
 PHP’s array_search: 0.19099999999994ms
 Binary search: 0.021999999999966ms
 Binary search is 8.68 times faster than array_search
Large data set – 1,000,000 items
Item exists as the first entry in the data set:
 PHP’s array_search: 0.037000000000009ms
 Binary search: 0.035000000000007ms
 Binary search is 1.06 times faster than array_search
Item exists around the middle of the data set
 PHP’s array_search: 8.734ms
 Binary search: 0.026000000000082ms
 Binary search is 335.92 times faster than array_search
Item does not exist in the data set:
 PHP’s array_search: 15.676ms
 Binary search: 0.031999999999921ms
 Binary search is 489.87 times faster than array_search
In summary
The results of the benchmarks show that binary search is slightly faster than array_search in most scenarios, but as the data set grows, the performance difference becomes huge. Binary search should be used when you know the data set is large and ordered.

mannion007