Binary search algorithm in PHP

Binary search is a search algorithm that can be dramatically faster than PHP’s built-in linear search functions (such as array_search) when searching large, ordered data sets.

How does it work?

PHP’s internal function array_search is an example of a linear search: it iterates over the data set from front to back until it finds a match. This is fine if your data set is small or unordered, but it becomes very inefficient over large data sets, especially if the match is toward the back of the set or doesn’t exist at all, because in those cases most or all of the items have to be checked.
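
To make the front-to-back behaviour concrete, here is a minimal sketch of a linear search in plain PHP. The function name linearSearch is hypothetical and this is not how array_search is implemented internally; it only mirrors the behaviour of returning the key of the first match or false. It uses strict comparison, which corresponds to calling array_search with its third argument set to true.

```php
<?php
// Hypothetical helper: a hand-written linear search for illustration only.
function linearSearch(array $haystack, $needle)
{
    // Visit every element from front to back until one matches.
    foreach ($haystack as $key => $value) {
        if ($value === $needle) {
            return $key; // key of the first match
        }
    }

    // Every element was checked and none matched.
    return false;
}
```

A miss is the worst case here, because the loop has to look at every element before it can give up.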

A different approach – divide and conquer

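Binary search approaches this problem in a different way. It starts with the item in the middle of the data set and compares it with the value being searched for. Because the data is ordered, that single comparison shows which half the match must lie in, so the other half can be discarded. The same step is repeated on the remaining half until the match is found or nothing is left to check (hence the requirement for your data to be ordered).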
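
To put numbers on that narrowing, the sketch below counts how many times a range of n items can be halved before nothing is left, which is an upper bound on how many middle items a binary search has to inspect. The helper name maxProbes is hypothetical and used only for illustration.

```php
<?php
// Hypothetical helper: worst-case number of middle items a binary
// search inspects for a sorted list of $n items.
function maxProbes(int $n): int
{
    $probes = 0;

    // Each probe discards at least half of the remaining range.
    while ($n > 0) {
        $n = intdiv($n, 2);
        $probes++;
    }

    return $probes;
}

foreach ([100, 10000, 1000000] as $size) {
    echo $size, ' items: at most ', maxProbes($size), " probes\n";
}
```

For 100, 10,000 and 1,000,000 items this gives 7, 14 and 20 probes respectively, whereas a linear search may need to check every item.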

Binary search divides the data set to find the match.

A PHP implementation

Below is a PHP example of how to implement a binary search.
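
A minimal sketch of such an implementation, assuming a zero-indexed array that is already sorted in ascending order and PHP 7 or later (for intdiv and the <=> operator). The function name binarySearch and the choice to return false on a miss, mirroring array_search, are illustrative assumptions rather than the author’s original listing.

```php
<?php
// Sketch: iterative binary search over a sorted, zero-indexed array.
// Returns the index of $needle, or false if it is not present.
function binarySearch(array $sorted, $needle)
{
    $low = 0;
    $high = count($sorted) - 1;

    while ($low <= $high) {
        // Probe the middle of the range that can still contain the match.
        $mid = intdiv($low + $high, 2);
        $comparison = $sorted[$mid] <=> $needle;

        if ($comparison === 0) {
            return $mid;            // found it
        }

        if ($comparison < 0) {
            $low = $mid + 1;        // match can only be in the upper half
        } else {
            $high = $mid - 1;       // match can only be in the lower half
        }
    }

    // The range is empty, so the value is not in the array.
    return false;
}
```

The <=> operator follows PHP’s standard comparison rules, so the array needs to have been sorted with the same rules (for example with sort() and its default flags) for the halving to be valid.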

We can utilise this function in the following example by searching an array of email addresses for a specific one. Any test data that appears here was generated by Faker.
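
A hedged sketch of that usage follows. The addresses are made-up example.com placeholders written for this illustration (they were not generated by Faker), and the binarySearch function is an assumed string-only variant defined inside the snippet so that it runs on its own.

```php
<?php
// Sketch: binary search for strings, compared byte-wise with strcmp().
// Returns the index of $needle, or false if it is not present.
function binarySearch(array $sorted, string $needle)
{
    $low = 0;
    $high = count($sorted) - 1;

    while ($low <= $high) {
        $mid = intdiv($low + $high, 2);
        $comparison = strcmp($sorted[$mid], $needle);

        if ($comparison === 0) {
            return $mid;
        }

        if ($comparison < 0) {
            $low = $mid + 1;   // keep the upper half
        } else {
            $high = $mid - 1;  // keep the lower half
        }
    }

    return false;
}

// Placeholder data; binary search requires it to be sorted first.
$emails = [
    'zoe@example.com',
    'alice@example.com',
    'mia@example.com',
    'bob@example.com',
    'carol@example.com',
];
sort($emails, SORT_STRING); // same ordering that strcmp() uses

$index = binarySearch($emails, 'mia@example.com');

if ($index !== false) {
    echo "Found at index {$index}\n";
} else {
    echo "Not found\n";
}
```

The result is checked with !== false rather than a loose truthiness test, because a match at index 0 would otherwise be mistaken for a miss.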

Benchmarks

So let’s benchmark the performance of binary search against PHP’s internal array_search function over a variety of data set sizes and match positions.
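
The harness used for the figures below is not shown, so the following is only a sketch of how such a comparison could be run. It assumes integer test data built with range(), a single timed run per case using microtime(true), and hypothetical helper names (binarySearch, timeMs); single runs at this scale are noisy, so absolute numbers will differ from machine to machine.

```php
<?php
// Sketch: iterative binary search over a sorted, zero-indexed array.
function binarySearch(array $sorted, $needle)
{
    $low = 0;
    $high = count($sorted) - 1;

    while ($low <= $high) {
        $mid = intdiv($low + $high, 2);
        $comparison = $sorted[$mid] <=> $needle;

        if ($comparison === 0) {
            return $mid;
        }

        if ($comparison < 0) {
            $low = $mid + 1;
        } else {
            $high = $mid - 1;
        }
    }

    return false;
}

// Hypothetical helper: run a callable once and return elapsed milliseconds.
function timeMs(callable $fn): float
{
    $start = microtime(true);
    $fn();

    return (microtime(true) - $start) * 1000;
}

$size = 10000;
$data = range(0, $size - 1); // already sorted ascending

// Needles for the three match positions used in the benchmarks.
$targets = [
    'first'   => 0,
    'middle'  => intdiv($size, 2),
    'missing' => -1,
];

$results = [];
foreach ($targets as $label => $needle) {
    $linearMs = timeMs(function () use ($data, $needle) {
        array_search($needle, $data, true);
    });
    $binaryMs = timeMs(function () use ($data, $needle) {
        binarySearch($data, $needle);
    });

    $results[$label] = ['array_search' => $linearMs, 'binary' => $binaryMs];
    printf("%-8s array_search: %.6fms  binary: %.6fms\n", $label, $linearMs, $binaryMs);
}
```

Changing $size to 100 or 1000000 gives the other two data set sizes; averaging over many runs would give steadier figures than a single measurement.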

Small data set – 100 items

Item exists as the first entry in the data set:

  • PHP’s array_search: 0.026ms
  • Binary search: 0.019ms
  • Binary search is 1.37 times faster than array_search

Item exists around the middle of the data set:

  • PHP’s array_search: 0.030ms
  • Binary search: 0.021ms
  • Binary search is 1.43 times faster than array_search

Item does not exist in the data set:

  • PHP’s array_search: 0.030ms
  • Binary search: 0.019ms
  • Binary search is 1.58 times faster than array_search

Medium data set – 10,000 items

Item exists as the first entry in the data set:

  • PHP’s array_search: 0.032ms
  • Binary search: 0.024ms
  • Binary search is 1.33 times faster than array_search

Item exists around the middle of the data set:

  • PHP’s array_search: 0.120ms
  • Binary search: 0.020ms
  • Binary search is 6 times faster than array_search

Item does not exist in the data set:

  • PHP’s array_search: 0.191ms
  • Binary search: 0.022ms
  • Binary search is 8.68 times faster than array_search

Large data set – 1,000,000 items

Item exists as the first entry in the data set:

  • PHP’s array_search: 0.037ms
  • Binary search: 0.035ms
  • Binary search is 1.06 times faster than array_search

Item exists around the middle of the data set:

  • PHP’s array_search: 8.734ms
  • Binary search: 0.026ms
  • Binary search is 335.92 times faster than array_search

Item does not exist in the data set:

  • PHP’s array_search: 15.676ms
  • Binary search: 0.032ms
  • Binary search is 489.87 times faster than array_search

In summary

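The benchmarks show that binary search is only slightly faster than array_search on small data sets, or when the match is the first item, but as the data set grows the difference becomes huge: at 1,000,000 items it was roughly 336 to 490 times faster when the match was around the middle or did not exist. Binary search should be used when you know the data set is large and ordered.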

  • mannion007

    Really impressive. I’m currently profiling some large PHP applications in need of a performance boost and this will be massively helpful.