From Doctrine Annotations Parser to Static Reflection

Until recently, we used doctrine/annotations to parse the class annotations you know, such as @ORM\Entity or @Route. Over the last 2 weeks, we rewrote this parser from scratch into a custom solution to improve the handling of spaces and constants, and to use static reflection.

During refactoring, the parser got reduced from 6700 lines to just 2700.
What did we change, why, and how can you benefit from static reflection in annotations?

The doctrine/annotations package has been an excellent help for Rector for the past couple of years. Symfony, Doctrine, JMS, or Gedmo use the same package. We used it to parse the following code to a custom value object:

use Doctrine\ORM\Mapping as ORM;

// ...

/**
 * @ORM\Column(type="text")
 */
private $config;

Here the object was Rector\Doctrine\PhpDoc\Node\Property_\ColumnTagValueNode. This object provided data about all inner values:

$columnTagValueNode->getType(); // "text"

That way, we could modify the content, get the value type to add @var string etc. Straightforward object API with method IDE auto-complete. So far, so good?


Ups and Downs of Doctrine Annotations

The problem was that for every such annotation, we had to have a custom object. That means a lot of classes.

Also, each class had its factory service that mapped annotation class to our custom *TagValueNode. Phew.


No Static Reflection

The Doctrine parser uses class_exists() and native reflection to load the properties of the Column annotation class.

As a result, the static reflection we added in Rector 0.10 cannot be used here, and you have to include the annotation classes in your autoloader. It's very confusing.
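
To make the limitation concrete, here is a minimal sketch in plain PHP. The class name Acme\Missing\Column is made up for illustration and stands for any annotation class that is not reachable by the autoloader. It is not code from the Doctrine parser, only the two native calls it relies on:

```php
<?php

// Hypothetical annotation class name that is NOT available to the autoloader.
$annotationClass = 'Acme\Missing\Column';

// class_exists() triggers autoloading; when the class cannot be found, it returns false.
$isLoaded = class_exists($annotationClass);

// Native reflection cannot inspect a class that cannot be loaded.
$reflectionFailed = false;
try {
    new ReflectionClass($annotationClass);
} catch (ReflectionException $exception) {
    $reflectionFailed = true;
}

var_dump($isLoaded);         // bool(false)
var_dump($reflectionFailed); // bool(true)
```

Static reflection avoids both calls by reading the source code instead of loading the class.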


Constants are Replaced by their Values

The Doctrine parser is used only for reading the values. In Rector, we need to print the docblock back, e.g., change the type from "text" to "number".

That worked most of the time, but what if there was a constant? The Doctrine parser replaces constants with their values.

This causes bugs like these:

public const VALUES = [
    '4star' => FourStar::class,
];

 /**
- * @Assert\Choice(choices=self::VALUES)
+ * @Assert\Choice({"4star":"App\Entity\Rating\FourStar"})
  */

Instead, we need to keep the original value of "self::VALUES", a bare constant reference. To overcome this, we had to create a set of Rector rules that copy the Doctrine parser class code from /vendor, replace the constant() lines with code that preserves the identifier plus a value collector, and apply a few more ugly hacks.

This solution was terrible, but it did the job.
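
The difference between resolving and preserving a constant can be sketched in a few lines. Both the Rating class and the ConstantReference value object below are hypothetical names used only for this illustration; they are not part of Doctrine or Rector:

```php
<?php

final class Rating
{
    public const VALUES = ['4star' => 'App\Entity\Rating\FourStar'];
}

// What a value-resolving parser does: the reference is swapped for its value.
$resolved = constant('Rating::VALUES');

// Hypothetical alternative: keep the reference exactly as written in the docblock.
final class ConstantReference
{
    /** @var string */
    private $identifier;

    public function __construct(string $identifier)
    {
        $this->identifier = $identifier;
    }

    public function __toString(): string
    {
        return $this->identifier;
    }
}

$preserved = new ConstantReference('Rating::VALUES');

echo json_encode($resolved), PHP_EOL; // the array content is printed, the constant name is gone
echo $preserved, PHP_EOL;             // Rating::VALUES
```

Once constant() has run, there is no way back from the array to the name, which is why the identifier has to be kept at parse time.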

Broken Spaces on Reprint

Last but not least, spaces were completely removed on re-print:

-* @ORM\Table(name = "my_entity", indexes = {@ORM\Index(
-*     name = "my_entity_xxx_idx", columns = {
-*         "xxx"
-*     }
-* )})
+* @ORM\Table(name="my_entity", indexes={@ORM\Index(name="my_entity_xxx_idx", columns={"xxx"})})

We tried to compensate for this with regular expressions, but it was a very crappy solution.


Why Did we Use doctrine/annotations?

You may be wondering why we even used doctrine/annotations if it caused so much trouble.

"There are no solutions,
only trade-offs"

The only other option was phpdoc-parser from PHPStan, the most advanced docblock parser we now have in PHP. Its first downside is that it parses Doctrine annotations as GenericTagValueNode, with all values joined into one long string.

Do you need to change "my_entity" to "our_entity"? Use regular expression and good luck.
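
Here is a small sketch of why that is painful. The string below is an assumed example of an annotation body held as one plain string; it only uses native PHP string functions, no phpdoc-parser classes:

```php
<?php

// The whole annotation body as one string, the way a generic tag value would hold it.
$value = '(name="my_entity", indexes={@ORM\Index(name="my_entity_xxx_idx")})';

// Naive replacement: it also rewrites the index name, which we did not want.
$naive = str_replace('my_entity', 'our_entity', $value);

// Anchoring on the first name= key works here, but only for this exact layout.
$anchored = preg_replace('#^\(name="my_entity"#', '(name="our_entity"', $value);

echo $naive, PHP_EOL;
echo $anchored, PHP_EOL;
```

Add a space around the equals sign or reorder the keys and the anchored pattern silently stops matching.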


1. Nodes with Attributes

To make it work, we had to do 2 things. The first was adding attributes to the PhpDoc nodes, the same way nikic/php-parser does:

$phpDocNode->setAttribute('key', 'value');
$phpDocNode->getAttribute('key'); // "value"

That would enable format-preserving printing and the juggling of nested values, which Doctrine Annotations are known for.
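
The idea behind node attributes fits in a tiny class. SketchNode and the 'orig_content' key are made-up names for this illustration, not the phpdoc-parser implementation:

```php
<?php

// Hypothetical node class; it only illustrates the attribute-bag idea.
final class SketchNode
{
    /** @var array<string, mixed> */
    private $attributes = [];

    /** @param mixed $value */
    public function setAttribute(string $key, $value): void
    {
        $this->attributes[$key] = $value;
    }

    /** @return mixed */
    public function getAttribute(string $key)
    {
        return $this->attributes[$key] ?? null;
    }
}

$node = new SketchNode();

// e.g. remember the original printed form, so an untouched node can be reprinted as-is
$node->setAttribute('orig_content', 'name = "my_entity"');

echo $node->getAttribute('orig_content'), PHP_EOL; // name = "my_entity"
var_dump($node->getAttribute('missing'));          // NULL
```

Metadata such as original content or token positions can travel with the node this way without changing its public properties.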

We proposed the attributes in phpdoc-parser 2 years ago, but it didn't get any traction as phpdoc-parser was also a read-only tool like Doctrine Annotations.

Luckily, the idea got revived: a month ago we contributed attributes on each node, and they were released in phpdoc-parser 0.5!

2. Rewrite doctrine/annotations in phpdoc-parser

We also needed to parse the annotation values themselves, using a custom lexer based on phpdoc-parser. This parser should:

  • keep constants
  • cover nested values, like annotation in an annotation
  • cover nested spaces, quotes, : or =
  • keep the original format


To make it happen, we had to rewrite DocParser from Doctrine to phpdoc-parser syntax. That included parsing values, arrays, curly arrays with keys, constants, brackets, quotes, and newlines between them.
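
To show the goal rather than the real token-based parser, here is a deliberately simplified regex stand-in. The function parseFlatAnnotationValues() is a hypothetical helper that only handles a flat key=value list, with no nesting, arrays, or newlines:

```php
<?php

/**
 * Parses a flat list such as: type="text", choices=self::VALUES
 * Quoted strings keep their quotes and bare identifiers (constants) stay
 * unresolved, so the values can be printed back exactly as written.
 *
 * @return array<string, string>
 */
function parseFlatAnnotationValues(string $content): array
{
    // value is either a double-quoted string or an identifier with optional ::CONSTANT
    $pattern = '#(?<key>\w+)\s*=\s*(?<value>"[^"]*"|[\w\\\\]+(?:::\w+)?)#';
    preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

    $values = [];
    foreach ($matches as $match) {
        $values[$match['key']] = $match['value'];
    }

    return $values;
}

$values = parseFlatAnnotationValues('type="text", choices=self::VALUES');

echo $values['type'], PHP_EOL;    // "text"
echo $values['choices'], PHP_EOL; // self::VALUES
```

A regex like this breaks as soon as annotations are nested, which is exactly why the real solution had to be a proper parser.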


11 days later, the final result is here:


Now every Doctrine-like annotation has:

  • single object to work with
  • annotation class with fully qualified class name
  • way to modify its values, both quoted and silent (unnamed) ones
  • way to modify nested annotations
  • automated reprint on a modified node, e.g., if we change string to int

👍

How does it help the Rector Community?

With static reflection in annotations, you can now refactor old projects that use Doctrine Annotations without loading the annotation classes.

Refactoring PHPDoc tag nodes is now super easy. E.g., if we wanted to modify @Method from Sensio before this refactoring, we had to create a node class, a factory class, register it, autoload the Doctrine annotation in a stub, and prepare custom methods for the custom properties of that specific class.

And now?

use Rector\BetterPhpDocParser\PhpDoc\DoctrineAnnotationTagValueNode;

$methodTagValueNode = $phpDocInfo->getByAnnotationClass(
    'Sensio\Bundle\FrameworkExtraBundle\Configuration\Method'
);

if ($methodTagValueNode instanceof DoctrineAnnotationTagValueNode) {
    $values = $methodTagValueNode->getValues();
    // ...
    $methodTagValueNode->changeValue('key', ['value']);
}


Happy coding!