Writing your own phpstan rules

 

When I started the Apie project, I began with the spec I had designed, to see whether it was enough to build entities and value objects. The spec was fine, but it still allowed a few small constructs that introduce subtle bugs in Apie or produce less accurate data. I found a few of these, and for each of them I wrote a PHPStan rule.

For example, here are a few things that are possible but should not be used:

// I cannot guarantee that all code will consider it a value object, DTO or entity.
class EntityAndValueObject implements ValueObjectInterface, EntityInterface, DTOInterface
{
}

class EntityExample implements EntityInterface {
    public function __construct(private readonly EntityExampleIdentifier $id)
    {
    }
    
    /**
     * apie/rest-api can not specify the return type of getId() so the OpenAPI spec will be less accurate.
     */
    public function getId(): IdentifierInterface
    {
        return $this->id;
    }
}

// I can call new CompositeObjectExample() and $field1 and $field2 will not be filled in.
class CompositeObjectExample implements ValueObjectInterface {
    use IsCompositeObject;
    
    private string $field1;
    
    private string $field2;
}

class ArrayValueObjectExample implements ValueObjectInterface {
    // what is the structure of this array? We can not type this in OpenAPI
    private array $data;
    
    public function toNative(): array
    {
        return $this->data;
    }
}

All these examples are valid PHP, but they contain ambiguities that could result in inconsistent behaviour. For that reason I wrote my own PHPStan rules to catch these common mistakes. For some of them it is not obvious what the programmer is doing wrong. Possible solutions would be:

class EntityExample implements EntityInterface {
    public function __construct(private readonly EntityExampleIdentifier $id)
    {
    }
    
    /**
     * apie/rest-api can now specify the return type of getId() better, so the OpenAPI spec will be more accurate.
     */
    public function getId(): EntityExampleIdentifier
    {
        return $this->id;
    }
}
 
class CompositeObjectExample implements ValueObjectInterface {
    use IsCompositeObject;
    
    private string $field1;
    
    private string $field2;
    
    /**
     * by making it private you force having to use fromNative()
     */
    private function __construct() {
    }
}
// or...........

class CompositeObjectExample implements ValueObjectInterface {
    use IsCompositeObject;
    public function __construct(
        private string $field1,   
        private string $field2
    ) {
    }
}


class ArrayValueObjectExample implements ValueObjectInterface {
    use IsCompositeObject;
    private array $data;
    
    public function toNative(): array
    {
        return $this->data;
    }
}

// or....
#[SchemaMethod('createSchema')]
class ArrayValueObjectExample implements ValueObjectInterface {
    private array $data;
    
    public function toNative(): array
    {
        return $this->data;
    }
    
    /**
     * provide the JSON schema with a schema method.
     */
    public static function createSchema(): array
    {
        return [
            'type' => 'object',
            'properties' => [
                'data' => [
                    'type' => 'object',
                    'additionalProperties' => ['type' => 'string'],
                ],
            ],
        ];
    }
}

What is phpstan?

For people who do not know PHPStan: it is a tool that analyses PHP code without actually executing it. For example, the following code will not wipe its own file when PHPStan analyses it, because PHPStan only parses the PHP code and does not load it. PHPStan also uses its own reflection API, because the native PHP reflection API does load the file: the PHP autoloader runs the PHP file that contains the class definition.
file_put_contents(__FILE__, ''); // phpstan would not execute this if it analyses this file

class Example {}
You do have to be careful, as some PHPStan extensions do use native PHP reflection or even boot an entire application to check whether a property or method exists (Larastan is a big offender here, for figuring out macroable facades). But in general PHPStan is meant to be a static code analyzer.
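
To make the difference between parsing and loading concrete, here is a minimal sketch in plain PHP. It is not how PHPStan works internally (PHPStan builds a full syntax tree); it only uses PHP's built-in tokenizer, and findDeclaredClassNames() is a hypothetical helper name made up for this illustration. The point is that the class name can be discovered from the source text while the file_put_contents() call never runs and the class is never loaded.

```php
<?php

/**
 * Lists the names of classes declared in a PHP source string.
 * The source is only tokenized, never included or executed.
 * (Hypothetical helper for illustration, not part of PHPStan.)
 *
 * @return list<string>
 */
function findDeclaredClassNames(string $source): array
{
    $names = [];
    $tokens = token_get_all($source);
    $count = count($tokens);
    for ($i = 0; $i < $count; $i++) {
        // Only "class" keywords are interesting.
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_CLASS) {
            continue;
        }
        // Skip the whitespace between "class" and the name.
        $j = $i + 1;
        while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
            $j++;
        }
        // Anonymous classes and Foo::class have no name token here.
        if ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
            $names[] = $tokens[$j][1];
        }
    }
    return $names;
}

$source = <<<'CODE'
<?php
file_put_contents(__FILE__, ''); // never runs: the source is only tokenized

class Example {}
CODE;

$classNames = findDeclaredClassNames($source);
var_dump($classNames);                    // the declared class name was found...
var_dump(class_exists('Example', false)); // ...but the class itself was never loaded
```

Native reflection (new ReflectionClass('Example')) would instead need the class to exist, which means the file defining it has to be executed first.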

Writing custom rules

Writing custom rules is not easy. Since PHPStan is a static analysis tool, it has to parse PHP code. After parsing your source code, PHPStan ends up with a so-called Abstract Syntax Tree (AST). Basically, an AST is a tree data structure that represents your source code, similar to the one in this image.

In most cases a rule listens for a specific node type, for example 'class', 'if', 'while', etc. These nodes have subnodes, for example the name of the class or the condition of the if statement. Rules can read the subnodes to draw conclusions. For example, it is very easy to prevent certain class names from being used. But you could also count the number of if statements inside the body of a function or method.
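
As a rough illustration of nodes and subnodes, the sketch below counts if statements in a hand-written toy tree. The array shape ('type' plus 'children') and the countNodesOfType() helper are assumptions made up for this example; the real node objects PHPStan hands to a rule look different and carry much more information.

```php
<?php

/**
 * Counts how many nodes of the given type occur in a toy AST.
 * Assumed node shape: ['type' => string, 'children' => list<node>].
 * This is NOT the node structure PHPStan uses; it only shows the idea.
 */
function countNodesOfType(array $node, string $type): int
{
    $count = $node['type'] === $type ? 1 : 0;
    foreach ($node['children'] ?? [] as $child) {
        // Recurse into the subnodes, exactly like walking a tree.
        $count += countNodesOfType($child, $type);
    }
    return $count;
}

// Toy tree for a function body with a nested if and an if inside a while loop.
$functionNode = [
    'type' => 'function',
    'children' => [
        ['type' => 'if', 'children' => [
            ['type' => 'if', 'children' => []],
        ]],
        ['type' => 'while', 'children' => [
            ['type' => 'if', 'children' => []],
        ]],
    ],
];

$ifCount = countNodesOfType($functionNode, 'if');
var_dump($ifCount);
```

A rule that limits the number of if statements per function would do the same kind of walk, starting from the function node it listens for.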

On top of that, PHPStan collects type information about all the methods and variables it finds. It also reads docblocks with generics information, but using generics in your own rules is largely undocumented. In fact, the PHPStan documentation is terrible if you lack background knowledge of how PHPStan works.

Another issue I ran into is that PHPStan itself is installed as a phar file. Most PHP IDEs will not index the class declarations inside the phar, so you have to look up a lot in the documentation when you write your own rules. An alternative is adding phpstan-src as a dev dependency through Composer's repositories setting, but that is not recommended either. PhpStorm does index phar contents for autocomplete.

Structure of a phpstan rule

You have to implement the PHPStan\Rules\Rule interface to make a rule:

use PhpParser\Node;
use PhpParser\Node\Stmt\Class_;
use PHPStan\Analyser\Scope;
use PHPStan\Reflection\ClassReflection;
use PHPStan\Reflection\ReflectionProvider;
use PHPStan\Rules\Rule;
/**
 * @implements Rule<Class_>
 */
final class EntityGetIdShouldBeSpecific implements Rule
{
    public function __construct(
        private ReflectionProvider $reflectionProvider
    ) {
    }

    public function getNodeType(): string
    {
        return Class_::class;
    }

    /**
     * @param Class_ $node
     */
    public function processNode(Node $node, Scope $scope): array
    {
        // TODO
        return [
        ];
    }
}
PHPStan offers its own dependency injection logic, and this rule requires a ReflectionProvider: a service that provides information about classes, similar to the built-in Reflection API, except that it does not load the PHP code. getNodeType() tells PHPStan which node type this rule applies to, in this case the class declaration. Any error the rule returns is reported on the line where this node is found.
Whenever PHPStan encounters a node of the type returned by getNodeType() in the PHP code, it calls processNode(). Time to provide the implementation of processNode():

    /**
     * @param Class_ $node
     */
    public function processNode(Node $node, Scope $scope): array
    {
        $nodeName = $node->name->toString();
        if ($node->isAbstract() || str_starts_with($nodeName, 'Anonymous') || $node->isAnonymous()) {
            return [];
        }
        $class = $this->getClass($node, $scope);
        if ($class->implementsInterface(EntityInterface::class)) {
            $method = $class->getMethod('getId', $scope);
            foreach ($method->getVariants() as $variant) {
                $type = $variant->getNativeReturnType();
                if ($type instanceof \PHPStan\Type\ObjectType) {
                    if ($type->getClassName() === IdentifierInterface::class) {
                        return [
                            __CLASS__ => sprintf(
                                "Class '%s' is an entity, but the getId() implementation has still IdentifierInterface return type.",
                                $nodeName
                            )
                        ];
                    }
                }
            }
        }
        return [
        ];
    }

    private function getClass(Class_ $node, Scope $scope): ClassReflection
    {
        return $this->reflectionProvider->getClass($scope->getNamespace() . '\\' . $node->name->toString());
    }
The first part consists of some extra checks: I want to skip abstract and anonymous classes. According to the documentation, $node->isAnonymous() should return true for anonymous classes, but I noticed that for some of them it did not (their names did start with 'Anonymous', though). Next I want to get all the class information, but for that I need the fully qualified class name. Luckily, the $scope variable contains the current namespace and our class node contains the short name of the class.
As you can see, we can use ReflectionProvider to get all the metadata. If the class implements EntityInterface, we look up the 'getId' method and check that its return type is not the generic IdentifierInterface.
The getVariants() part differs from the native PHP Reflection API, because a class method can have several sources for its return type: for example, you could add a docblock, or you could implement more than one interface where one of the declared return types is more precise.
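
To show what "several sources of a return type" can mean, here is a small sketch using only native PHP reflection. The Demo* classes are hypothetical stand-ins for the Apie types, and the regular expression is a deliberately naive way to read a docblock; PHPStan has a proper PHPDoc parser and does not work like this. The sketch only shows that the signature and the docblock can disagree about how specific the return type is.

```php
<?php

// Hypothetical stand-ins for the Apie interfaces; only the shape matters here.
interface DemoIdentifierInterface {}

final class DemoOrderIdentifier implements DemoIdentifierInterface {}

final class DemoOrder
{
    public function __construct(private DemoOrderIdentifier $id)
    {
    }

    /**
     * @return DemoOrderIdentifier
     */
    public function getId(): DemoIdentifierInterface
    {
        return $this->id;
    }
}

$method = new ReflectionMethod(DemoOrder::class, 'getId');

// Source 1: the native return type declared in the method signature.
$nativeType = $method->getReturnType();
$nativeReturn = $nativeType instanceof ReflectionNamedType ? $nativeType->getName() : null;

// Source 2: the @return tag in the docblock (naive regex, for illustration only).
$docComment = $method->getDocComment();
$docReturn = null;
if (is_string($docComment) && preg_match('/@return\s+(\S+)/', $docComment, $matches) === 1) {
    $docReturn = $matches[1];
}

var_dump($nativeReturn, $docReturn);
```

The rule above asks each variant for getNativeReturnType(), so in a case like this it would presumably still report the entity: the docblock is more precise, but the signature is what gets checked.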

Testing phpstan rules

PHPStan uses fixture files for testing (see my article about keeping the data providers in your test classes small and using files for test cases), and your test has to extend a specific base class, RuleTestCase.

use Apie\ApiePhpstanRules\EntityGetIdShouldBeSpecific;
use PHPStan\Rules\Rule;
use PHPStan\Testing\RuleTestCase;

/**
 * @extends RuleTestCase<EntityGetIdShouldBeSpecific>
 */
class EntityGetIdShouldBeSpecificTest extends RuleTestCase
{
    protected function getRule(): Rule
    {
        return new EntityGetIdShouldBeSpecific($this->createReflectionProvider());
    }

    /**
     * @dataProvider ruleProvider
     */
    public function testLegacyRule(array $rules, string... $fileToAnalyse): void
    {
        $this->analyse($fileToAnalyse, $rules);
    }

    public function ruleProvider(): iterable
    {
        yield [
            [
                ["Class 'EntityWithNoSpecificIdentifier' is an entity, but the getId() implementation has still IdentifierInterface return type.", 8]
            ],
            __DIR__ . '/Fixtures/EntityWithNoSpecificIdentifier.php',
        ];
    }
}
Basically, you test only one PHPStan rule per test case. In this case we test our rule that reports an entity whose getId() is still type-hinted as IdentifierInterface. We provide all the test cases in fixture files, and for each one we have to specify the exact error message together with the exact line number on which the error is reported.

Conclusion

So in a nutshell, making your own PHPStan rules requires some technical knowledge of how a PHP file is parsed. It also requires reading the documentation thoroughly or diving into the vendor folder to see how PHPStan works internally. Once you have created a rule, writing a test is relatively easy as soon as you are used to the setup of a PHPStan rule test. If I find other problems with the Apie object specification, I will probably add more checks.
It is also possible to write fixers, but the problems my rules report cannot be fixed automatically, so I did not dive into this.
Not everything can be checked or set up with PHPStan: I wish it were possible to use implicit generics. As it stands, you always have to declare generics explicitly in docblocks, which forces you to write docblocks.
