Why using PHP arrays is a huge mistake

 

PHP arrays have been very popular since the early versions of PHP. The reason for their popularity is their versatility, but that also makes it easy to overuse them. Just like there is a term for using strings or integers instead of value objects, called stringly typed, I think there could be a term like 'array typed' programming.

For more details on stringly typed I suggest you read my article about the value of value objects.

So what is array typed coding? It means that instead of using proper objects you use arrays for everything. It's very common in beginners' code:


function example(array $input): array
{
    $resolvedAddress = $this->zipcodeChecker->checkZipcode($input['zipcode']);
    $input['address']['street'] = $resolvedAddress['street'];
    $input['address']['number'] = $resolvedAddress['street_number'] . $resolvedAddress['street_number_suffix'];
    return $input;
}
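
For contrast, here is a minimal sketch of what the same address data could look like as a value object. The Address class and its fromResolvedArray method are hypothetical names chosen for illustration, and the array keys are taken from the example above; it assumes PHP 8.1 or newer for readonly properties.

```php
<?php
declare(strict_types=1);

// Hypothetical value object; the class and method names are illustrative only.
final class Address
{
    public function __construct(
        public readonly string $street,
        public readonly string $number,
    ) {
    }

    /**
     * Builds an Address from the array shape used in the example above.
     *
     * @param array{street: string, street_number: string, street_number_suffix: string} $resolved
     */
    public static function fromResolvedArray(array $resolved): self
    {
        return new self(
            $resolved['street'],
            $resolved['street_number'] . $resolved['street_number_suffix'],
        );
    }
}

$address = Address::fromResolvedArray([
    'street' => 'Main Street',
    'street_number' => '12',
    'street_number_suffix' => 'A',
]);
```

A caller now gets $address->street and $address->number with known types, instead of having to guess which keys an array contains.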

But if the problem is beginners' code, then you should just learn not to do array typed programming, right? Wrong! There are several other problems with arrays in PHP, and they all come down to the fact that arrays are too versatile and flexible!

Lists and maps

Building a REST API in PHP is very common, and since json_encode and json_decode are so easy to use, supporting JSON is easy as well. The only problem is that json_encode has to guess whether an array should be rendered as a list or as a hashmap. In JSON we have lists, [1,2,3], and hashmaps (objects), {"0": 1, "1": 2, "2": 3}. Decoded to PHP arrays, both result in the same array, but in JavaScript they are treated as an array and an object respectively. An empty array is always considered a list, so it results in []. If I want to return an empty object {}, I can work around this by using a stdClass object. For Apie I could not use arrays for generating an OpenAPI spec because of this inconsistency: how should I know whether an array typehint results in a list or an object?
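
The behaviour described above can be reproduced with a few lines. This is a small sketch using only built-in functions; array_is_list requires PHP 8.1 or newer, and the variable names are only illustrative.

```php
<?php
declare(strict_types=1);

$list = [1, 2, 3];
$map = [1 => 1, 2 => 2, 3 => 3]; // keys do not start at 0, so this is not a list

$listJson = json_encode($list);                 // [1,2,3]
$mapJson = json_encode($map);                   // {"1":1,"2":2,"3":3}
$emptyArrayJson = json_encode([]);              // []
$emptyObjectJson = json_encode(new stdClass()); // {}

// Decoding into arrays loses the distinction again:
// the JSON list and the JSON object end up as the same PHP array.
$fromList = json_decode('[1,2,3]', true);
$fromObject = json_decode('{"0":1,"1":2,"2":3}', true);
$same = $fromList === $fromObject;

// array_is_list() tells you which of the two json_encode will pick.
$listIsList = array_is_list($list);
$mapIsList = array_is_list($map);
```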

Typed lists

Arrays can contain anything, including other arrays. This makes them very versatile, but again I ran into the issue that I cannot know what an array contains, as everything is allowed. There are several workarounds you could apply.

Variadic type arguments

You can have variadic arguments with a typehint. The downside is that a variadic argument cannot be a promoted argument in the constructor. Another downside is that you can only use one variadic argument per method and it must always be the last argument, so you cannot provide two typed lists to the same method.

class Example {
    /** @var string[] $items */
    private array $items;
    
    public function __construct(string ...$items)
    {
        $this->items = $items;
    }
}
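
As a usage sketch: an existing array can be passed with the spread operator, and a value of the wrong type is rejected with a TypeError. The StringBag class and its all() method are hypothetical and only mirror the example above.

```php
<?php
declare(strict_types=1);

// Hypothetical class name, mirroring the example above.
final class StringBag
{
    /** @var string[] */
    private array $items;

    public function __construct(string ...$items)
    {
        $this->items = $items;
    }

    /** @return string[] */
    public function all(): array
    {
        return $this->items;
    }
}

$names = ['alice', 'bob'];
$bag = new StringBag(...$names); // the spread operator unpacks the array

$rejected = false;
try {
    new StringBag('alice', new stdClass()); // the second argument is not a string
} catch (TypeError $error) {
    $rejected = true;
}
```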

Reading PHP docblocks

The Symfony serializer component does this if used inside a Symfony application. Internally it relies on the Symfony property info component, which uses the phpdocumentor composer package to read docblocks. It's quite simple. All you have to do is add the item type in a phpdoc, for example string[]:

class Example {
    /**
     * @param string[] $items
     */
    public function __construct(private array $items)
    {
    }
}
The upside is that any automated doc generator will pick up these docblocks as well. The downside is that behaviour and metadata now rely on comments. Some code optimizers strip docblocks, in which case relying on them will fail. Also, if typehints differ between constructor, setter and getter, the Symfony property info component will only output one of the possible types.
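
To make the mechanism concrete, here is a deliberately naive sketch that reads a docblock with reflection and a regular expression. It is not how Symfony or phpdocumentor do it; the DocblockExample class and the documentedParamType function are hypothetical names.

```php
<?php
declare(strict_types=1);

final class DocblockExample
{
    /**
     * @param string[] $items
     */
    public function __construct(private array $items)
    {
    }
}

/**
 * Hypothetical helper: returns the documented type of a constructor
 * parameter, or null if it cannot be found.
 */
function documentedParamType(string $class, string $param): ?string
{
    $doc = (new ReflectionMethod($class, '__construct'))->getDocComment();
    if ($doc === false) {
        return null; // no docblock, or the comments were stripped
    }
    $pattern = '/@param\s+(\S+)\s+\$' . preg_quote($param, '/') . '\b/';

    return preg_match($pattern, $doc, $matches) === 1 ? $matches[1] : null;
}

$type = documentedParamType(DocblockExample::class, 'items'); // "string[]"
```

The null branch is exactly the failure mode mentioned above: once the comments are gone, the type information is gone too.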

PHP Attributes

Since PHP 8 you can use PHP attributes. They are the native replacement for the docblock annotations often found in Doctrine entities before PHP 8. They can be read and instantiated natively through reflection and will not be removed by code optimizers. The downside is that they are not part of the phpdoc specification, so documentation tools will not get any type information from them.

class Example {
    public function __construct(
        #[ListType('string')]
        private array $items
    ) {
    }
}
Besides the lack of tooling support, since it's non-standard, it is also hard to use attributes in combination with more advanced typehints such as union types or intersection types. In those cases you end up making your own DSL or having lots of attribute classes, which could be considered an inner-platform effect.
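
A minimal sketch of how such an attribute could be declared and read back. The ListType attribute here is hypothetical: it is not part of PHP or of any library mentioned in this article, and the example assumes PHP 8.1 or newer.

```php
<?php
declare(strict_types=1);

// Hypothetical attribute; it only carries the item type as a string.
#[Attribute(Attribute::TARGET_PROPERTY | Attribute::TARGET_PARAMETER)]
final class ListType
{
    public function __construct(public readonly string $type)
    {
    }
}

final class AttributeExample
{
    public function __construct(
        #[ListType('string')]
        private array $items
    ) {
    }
}

// Attributes on promoted constructor arguments are also available on the property.
$property = new ReflectionProperty(AttributeExample::class, 'items');
$attributes = $property->getAttributes(ListType::class);
$itemType = $attributes[0]->newInstance()->type; // "string"
```

Note that the item type is still just a string here, which is where the DSL problem starts as soon as you want to express something like string|int.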

Array objects or Array access

The last option is to replace native PHP arrays with array objects, or to write an object from scratch that implements the ArrayAccess interface. We can add full typehints, even union types or intersection types, and internally we can make the object immutable or only allow unique items if we need this logic. We can make them exactly the way we want. The downside? We cannot use any of the native array functions, and we even need to write code (by implementing Iterator or IteratorAggregate) to make the object work with foreach.
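
As a rough sketch of what that involves, here is a minimal list of strings built from scratch. TypedStringList is a hypothetical class written for this illustration and is not Apie's implementation.

```php
<?php
declare(strict_types=1);

// Hypothetical class; a minimal typed list, not Apie's implementation.
final class TypedStringList implements ArrayAccess, IteratorAggregate, Countable
{
    /** @var list<string> */
    private array $items;

    public function __construct(string ...$items)
    {
        $this->items = array_values($items);
    }

    public function offsetExists(mixed $offset): bool
    {
        return is_int($offset) && isset($this->items[$offset]);
    }

    public function offsetGet(mixed $offset): string
    {
        if (!$this->offsetExists($offset)) {
            throw new OutOfBoundsException('No item at the given offset');
        }

        return $this->items[$offset];
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        if (!is_string($value)) {
            throw new InvalidArgumentException('Only strings are allowed');
        }
        if ($offset === null) {
            $this->items[] = $value; // handles $list[] = 'value'
            return;
        }
        if (!$this->offsetExists($offset)) {
            throw new OutOfBoundsException('Can only overwrite existing offsets');
        }
        $this->items[$offset] = $value;
    }

    public function offsetUnset(mixed $offset): void
    {
        if ($this->offsetExists($offset)) {
            unset($this->items[$offset]);
            $this->items = array_values($this->items); // keep the keys sequential
        }
    }

    // Required to make the object work with foreach.
    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->items);
    }

    public function count(): int
    {
        return count($this->items);
    }
}

$list = new TypedStringList('a', 'b');
$list[] = 'c';
```

That is a fair amount of code for something a native array gives you for free, but the list can no longer contain anything other than strings.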

Lists with only unique items: sets

Sometimes we only want a list with unique items. These types of lists are called sets. We can use array_unique for this, but if we want to keep it a list we also need array_values to get sequential numeric keys again. Another easy way to filter out duplicates is by misusing the keys of the array, making each key the same as its value. The downside is of course that this only works for primitive values that are valid array keys, and not for objects.
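
Both approaches in a short sketch, using only built-in functions. The last lines show a limitation worth knowing: with the default comparison, array_unique treats the integer 1 and the string '1' as the same item.

```php
<?php
declare(strict_types=1);

$input = ['a', 'b', 'a', 'c', 'b'];

// array_unique keeps the first occurrence with its original key: [0 => 'a', 1 => 'b', 3 => 'c']
$unique = array_unique($input);
// array_values renumbers the keys so the result is a list again
$uniqueList = array_values($unique);

// Key trick: array keys are unique, so using the value as key removes duplicates
$byKey = [];
foreach ($input as $value) {
    $byKey[$value] = $value;
}
$uniqueViaKeys = array_values($byKey);

// The default comparison of array_unique does not distinguish 1 from '1'
$mixed = array_values(array_unique([1, '1', 2]));
```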

The Apie solution: ItemSet, ItemList and ItemHashmap objects

Weighing all the options above, I think the most versatile one is creating my own objects that implement ArrayAccess, as it leaves the most room to add the behaviour I need.

ItemList

An ItemList can only represent a list and will always be encoded as a list. Because of this, you can only unset the last value in the list, and only if the list is mutable. So how do you get a typed list? The best solution is to extend ItemList and override offsetGet with a narrower return type, which PHP allows because return types are covariant.


use Apie\Core\Lists\ItemList;
final class StringOrIntList extends ItemList
{
    protected bool $mutable = false; // this makes our list immutable in all cases.
    public function offsetGet(mixed $offset): string|int
    {
        return parent::offsetGet($offset);
    }
}
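
The covariance trick can be shown without Apie. The sketch below uses a hypothetical MixedList base class as a stand-in for a generic list; it is not Apie's ItemList and it only checks the type when an item is read.

```php
<?php
declare(strict_types=1);

// Hypothetical base class standing in for a generic list; this is not Apie's ItemList.
class MixedList
{
    public function __construct(protected array $items)
    {
    }

    public function offsetGet(mixed $offset): mixed
    {
        return $this->items[$offset];
    }
}

// Narrowing mixed to string|int is allowed because return types are covariant.
final class IntOrStringList extends MixedList
{
    public function offsetGet(mixed $offset): string|int
    {
        return parent::offsetGet($offset);
    }
}

$list = new IntOrStringList([1, 'two', new stdClass()]);
$first = $list->offsetGet(0);  // 1
$second = $list->offsetGet(1); // "two"

$failed = false;
try {
    $list->offsetGet(2); // stdClass is neither string nor int
} catch (TypeError $error) {
    $failed = true;
}
```

The declared return type is also visible through reflection, so tooling can discover the item type without docblocks or attributes.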
In apie/cms an ItemList will result in a list of forms with an 'add' button to add a new record. In apie/rest-api you will get typehints for the array item types. minItems and maxItems are not supported yet, but they could easily be added in future versions.

ItemHashmap

An ItemHashmap can only represent a hashmap. The key is part of the data. I decided not to store the order of the keys, as the JSON specification does not guarantee the order of object keys either.
An ItemHashmap accepts any value, but if you only want a specific type you can use the same trick as with ItemList.

use Apie\Core\Lists\ItemHashmap;
final class StringOrIntMap extends ItemHashmap
{
    public function offsetGet(mixed $offset): string|int
    {
        return parent::offsetGet($offset);
    }
}
In apie/cms a hashmap results in almost the same form, but next to the 'add' button there is an input field to enter a valid key. In apie/rest-api it will be properly typehinted with the additionalProperties schema setting.

ItemSet

After I made a first version for storing ItemList and ItemHashmap in a database, I ran into the issue that for a list I had no idea whether duplicate items are allowed. Technically I could make an order with the same order item in the list twice. There is also the uniqueItems schema setting in an OpenAPI spec that I could not determine dynamically. For that reason there is a third object: ItemSet. It's even smart enough to see an integer 1 and a string "1" as different data. The only downside of this solution is that offsetExists acts a bit differently, as the item itself is the key.
Normally I can store any type, but with inheritance I can reduce the number of types allowed the same way:

use Apie\Core\Lists\ItemSet;

final class StringOrIntSet extends ItemSet
{
    public function offsetGet(mixed $offset): string|int
    {
        return parent::offsetGet($offset);
    }
}
Apie/cms will render an item set the same way as an item list. There is no easy way to check for uniqueness client-side, so duplicate values can only be detected by the backend. Apie/rest-api will also return the same response as for item lists, except that it marks the list as containing only unique items.
Because of the way keys work with sets and the strictness between 1 and '1', you will sometimes get an odd syntax:

$set = new StringOrIntSet([1, "1", 2, 3]);
var_dump(isset($set[1])); // true
var_dump(isset($set["1"])); // true
var_dump(isset($set["2"])); // false
$set[] = 2; // does nothing, 2 is already there
unset($set[1]);
var_dump(isset($set[1])); // false
var_dump(isset($set["1"])); // true
Item sets also look at objects. If the object is implementing EntityInterface, it will use the id as unique value. In other cases it will use the native spl_object_hash() to index objects.
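
To illustrate the idea of strict uniqueness, here is a rough sketch that builds a lookup key from the type and the value, and falls back to spl_object_hash() for objects. StrictSet is a hypothetical class for this illustration, not Apie's ItemSet, and it leaves out the EntityInterface handling.

```php
<?php
declare(strict_types=1);

// Hypothetical class; a rough sketch of strict uniqueness, not Apie's ItemSet.
final class StrictSet implements Countable
{
    /** @var array<string, mixed> */
    private array $items = [];

    public function add(mixed $value): void
    {
        $key = self::keyFor($value);
        if (!array_key_exists($key, $this->items)) {
            $this->items[$key] = $value;
        }
    }

    public function contains(mixed $value): bool
    {
        return array_key_exists(self::keyFor($value), $this->items);
    }

    public function count(): int
    {
        return count($this->items);
    }

    // The type is part of the key, so int 1 and string "1" never collide.
    private static function keyFor(mixed $value): string
    {
        if (is_object($value)) {
            return 'object:' . spl_object_hash($value);
        }

        return get_debug_type($value) . ':' . serialize($value);
    }
}

$set = new StrictSet();
$set->add(1);
$set->add('1');
$set->add(1); // ignored, the integer 1 is already present
$object = new stdClass();
$set->add($object);
$set->add($object); // ignored, same object instance
```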

Conclusion

So in the end I used my own array objects to be able to distinguish sets, lists and hashmaps, as PHP arrays cannot tell them apart. Even if you do not use Apie, these objects can be used in any PHP application without much effort.
