Faster Arrays

Project

Arrays, long considered the work horse of PHP have one flaw: they can be incredibly slow. There is however an alternative — at least, for a small subset of use cases. SplFixedArray.

You use SplFixedArray like so:

The SplFixedArray class provides a super-fast, fixed size array implementation. There are some limitations however, first you must use numeric keys and secondly you cannot use anonymous assignment (i.e. $array[] = 'value';).

You’ll notice one requirement was missing, that it should have a fixed size. While having a fixed size is what will bring you the speed increase it’s actually not a requirement that the size be fixed. Though you must specify a size to the constructor, you can change it (and lose most — if not all — speed benefits) at any time using SplFixedArray->setSize().

So, what sort of speed increase are we talking about? In my testing of arrays 100, 100… 1,000,000 elements, you will see a speed increase about 20-25%; for arrays smaller than 100, it will actually be slower by 25-40%.

The benchmarking was very simple, a comparison of a read and write iteration for both normal and fixed arrays of different sizes like so:

Additionally, the memory usage to run the benchmarks for array vs SplFixedArray is significantly different, regular arrays clock in at 198MB while SplFixedArray uses a mere 83MB, that’s a 59% memory saving.

In practical terms, you’re only going to be worried about the speed of arrays when you’re dealing with larger arrays anyway, so the speed loss for the lower digits isn’t a big concern… but where exactly could this be useful?

There is one common scenario where you will commonly be dealing with large numerically indexed arrays of data: Your database result sets. Using PDO, you can tell how many results you have before you retrieve the row data using PDOStatement->rowCount().

Unfortunately, it is not possible to set the result set container for PDOStatement->fetchAll() to use SplFixedArray — however, if someone wants to help (that is, someone who knows internals and… well, C), I’ve got an opening for a coach!

At the urging of my co-worker Helgi, I threw the arrays into a FilterIterator and got some pretty interesting results. Using similar code to the first benchmark, but instead of just reading out the array, we created and used a custom FilterIterator:

For regular arrays, we must first create an iterator:

For the SplFixedArray, we passed it straight into the EvenFilterIterator, otherwise the code is the same.

Even with the extra overhead of creating the ArrayIterator, the SplFixedArray is only marginally (1%) faster till it reaches the 10000 elements mark, and then it starts to become marginally slower (again 1%). So, I guess the take-away is: use with caution.

5 thoughts on “Faster Arrays

  1. Interesting idea for PDOStatement::fetchAll() to return a SplFixedArray for large datasets. Could probably add another class constant to PDO to indicate the desired behavior, i.e. PDO::FETCH_MODE_FIXED_ARRAY.

    • Davey Shafik

      It would have to be something like: PDO::FETCH_ALL_MODE_FIXED_ARRAY and you would also need the MODE_ARRAY to undo it. It would also be a PDO option not a fetchAll() option like emulated prepares etc.

  2. Implementing PDO fetchAll zo return a “fixedArray” would be trivial. I’m not sure whether SPL provides the APIs but even doing it from hand should take less than a day for somebody who knows PHP’s workings.

    But a few words of notice: An SplFixedArray is no array but an object. This means it is not using copy semantics but pass-by-handle semantics. It’s not possible to use all array functions etc.

    I also wonder how useful this really is. If you care about performance why do you copy the complete result set around? – Read the data and process it dirctly (using some fetch loop etc.) instead of copying it and iterating multiple times (at least one time for the fetchAll and then a second time for processing) Proper optimisation there should bring more benefit than using this not-a-real-array thing.

    • Yeah I also do not think that this will be useful in that many cases, but there might be some with an importer. However if we start doing stuff like this “read-only” arrays would likely be something we should look into first.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>