How to Improve Code and Avoid Fights on Review

It's not a secret that when it comes to forming a new team, leaders (Team Leader, Tech Leader) face the challenge of establishing a unified programming style, as all team members are new, and each has their own approach to organizing code and selecting practices. Typically, this leads to lengthy debates during code reviews, which eventually escalate into various interpretations of well-known practices such as SOLID, KISS, DRY, etc. The principles behind these practices are quite blurry, and with sufficient persistence, it's easy to find paradoxes where one contradicts another. For example, let's consider Single Responsibility and DRY.

One variation of defining the Single Responsibility Principle (the "S" in SOLID) states that each object should have one responsibility, and this responsibility should be fully encapsulated within the class. The DRY principle (Don't Repeat Yourself) suggests avoiding code duplication. However, if we have a data transfer object (DTO) in our code that can be used in different layers/services/modules, which of these principles should we follow? Undoubtedly, many programming books address similar situations, typically stating that if we're dealing with different objects/functions with the same set of properties and logic but belonging to different domains, it doesn't constitute duplication. However, how can one prove that these objects SHOULD belong to different domains, and, most importantly, is the leader ready (and confident) to assert and prove this statement?

One frequently practiced approach is making categorical statements like "This is our way/It's the leader's word and we take it for granted" and similar authoritative declarations that emphasize the authority and expertise of the person who came up with these rules. This approach undoubtedly succeeds when dealing with an established team and a project with an existing codebase upon which development continues. But what should be done when the team is new, and the project has just begun? Appeals to authority may not work, as the Team/Tech Leader has not yet established their authority, and each team member believes that their knowledge and approach will be the optimal solution for the future project.

This article suggests an approach that allows avoiding most of such contentious situations. Furthermore, each developer in practice (without objections from the leader) will understand what they are doing wrong and how to improve it.

To begin with, let's introduce several additional conditions and definitions:

At the time of submission for review, the task is considered complete, and if it passes the review, it can be released without any changes. In other words, we do not consider the possibility of pre-planned changes/additions in the code.
The team consists of equally experienced and qualified specialists who face no challenges in implementing tasks; the only discrepancy lies in their approaches.
The code style is consistent and is verified by code checkers.
Development time is not critical, at least less critical than the reliability of the product.

We will consider the necessity of the first condition later, although it is quite obvious on its own, as it is illogical to submit an unfinished task for review. With the second condition, we ensure that each team member has no issues with choosing an algorithm and implementing the assigned task. In the third condition, we assume that the team adheres to a specific style (PSR), and questions like "what is better, CamelCase or snake_case" do not arise. And the final condition refrains from calculating changes in effort for task completion in this work.

Unit tests

Many readers are aware that unit testing improves code quality. Typically, after stating this, the Test-driven development (TDD) methodology is mentioned as an example, which indeed enhances code quality but is relatively rarely applied in practice because writing tests before implementation requires a high-level programming skill set.

How can unit testing help improve code without relying on the previously mentioned well-known practices? First, let's recall that unit tests are applied to test a specific method/module/class using mock objects/modules as dependencies.

Following the first condition, the task should be considered completed at the time of submission for review. Therefore, let's introduce a definition for what we consider a completed task. A task is deemed completed only when it satisfies all the conditions listed below:

The requirements of the assigned task are met.
All new code must be covered by unit tests, including various algorithmic conditions in the program.
The new code does not break existing tests.

Since we have unlimited time for writing new tests and maintaining old ones (condition 4), and each developer can write these tests and meet the task requirements (condition 2), we can consider that any task can potentially be completed. Now, since we've introduced the definition of a completed task, we can justify condition 1: the code cannot be submitted for review if it is not covered by tests; otherwise, the code will be rejected without review. Therefore, a developer knows that fixing code issues after feedback involves fixing tests. This seemingly minor point becomes a fundamental driving force for writing good code.

Let's consider the following code example (in this article, the PHP language is used for examples, but it can be any C-like language with support for the object-oriented programming paradigm):

class SomeFactory {
   public function __construct(
       private readonly ARepository $aRepository,
   ) { }

   /**
    * @throws ErrorException
    */
   public function createByParameters(ObjectType $type, array $parameters): ObjectE|ObjectD|ObjectA|ObjectB|ObjectC {
       switch ($type) {
           case ObjectType::A:
               if (!array_key_exists('id', $parameters) || !is_int($parameters['id'])) {
                   throw new ErrorException('Some message');
               }
               $aEntity = $this->aRepository->findById($parameters['id']);
               $data = [];
               if ($aEntity !== null) {
                   $data = $aEntity->getSomeParams();
               }
               if (count($data) === 0) {
                   if (array_key_exists('default', $parameters) && is_array($parameters['default'])) {
                       $data = $parameters['default'];
                   } else {
                       throw new ErrorException('Some message');
                   }
               }
               return new ObjectA($data);
           case ObjectType::B:
               // some code
               return new ObjectB($parameters);
           case ObjectType::C:
               // some code
               return new ObjectC($parameters);
           case ObjectType::D:
               // some code
               return new ObjectD($parameters);
           case ObjectType::E:
               // some code
               return new ObjectE($parameters);
       }

       throw new RuntimeException('some message');
   }
}

Here, we intentionally violated all practices to demonstrate the effectiveness of the proposed approach. However, note that the presented algorithm is functional; depending on the type, an entity with specific parameters is created. Nevertheless, our main task is to ensure that this code doesn't reach the review stage, prompting the developer to improve it independently. Following condition 1, to submit the code for review, we need to write tests. Let's write one such test:

class SomeFactoryTest extends TestCase {
   public function testCreateByParametersReturnsObjectAWithDefaultMethods(): void {
       $someFactory = new SomeFactory(
           $aRepository = $this->createMock(ARepository::class),
       );
       $parameters = [
           'id' => $id = 5,
           'default' => ['someData'],
       ];
       $aRepository->expects($this->once())
           ->method('findById')
           ->with($id)
           ->willReturn(null);

       $actualResult = $someFactory->createByParameters(ObjectType::A, $parameters);
       $this->assertInstanceOf(ObjectA::class, $actualResult);
       // additional checkers for $actualResult
   }
}

It turned out to be quite simple, but this is just one of the necessary eight tests for one of the five types. After all tests are written, any feedback during the review that requires changes may break these tests, and the developer will have to rewrite or adjust them. For instance, adding a new dependency (let's say, a logger) will result in changes to the factory initialization in all tests:

$someFactory = new SomeFactory(
   $aRepository = $this->createMock(ARepository::class),
   $this->createMock(LoggerInterface::class)
);

Note how the cost of a comment has increased: if previously adding/changing a dependency only required modifications to the SomeFactory class, now all tests (which could be more than 40) will also need to be altered. Naturally, after several iterations of such changes, a developer would want to minimize the effort required to address feedback. How can this be done? The answer is obvious - isolate the entity creation logic for each type into a separate class. Please note that we are not relying on SOLID/DRY principles, etc., and we are not engaging in abstract discussions about code readability and debugging, as each of these arguments can be disputed. We are simply simplifying the writing of tests, and there are no counterarguments for a developer against this.

After the modification, we will have 5 factories for each type (ObjectType::A, ObjectType::B, ObjectType::C, ObjectType::D, ObjectType::E). Below is an example of the factory for ObjectType::A (FactoryA):

class FactoryA {
   public function __construct(
       private readonly ARepository $aRepository,
   ) { }

   public function createByParameters(array $parameters): ObjectA {
       if (!array_key_exists('id', $parameters) || !is_int($parameters['id'])) {
           throw new ErrorException('Some message');
       }
       $aEntity = $this->aRepository->findById($parameters['id']);
       $data = [];
       if ($aEntity !== null) {
           $data = $aEntity->getSomeParams();
       }
       if (count($data) === 0) {
           if (array_key_exists('default', $parameters) && is_array($parameters['default'])) { // 6 7
               $data = $parameters['default'];
           } else {
               throw new ErrorException('Some message');
           }
       }
       return new ObjectA($data);
   }
}

And the general factory will look like this:

class SomeFactory {
   public function __construct(
       private readonly FactoryA $factoryA,
       private readonly FactoryB $factoryB,
       private readonly FactoryC $factoryC,
       private readonly FactoryD $factoryD,
       private readonly FactoryE $factoryE,
   ) { }

   /**
    * @throws ErrorException
    */
   public function createByParameters(ObjectType $type, array $parameters): ObjectE|ObjectD|ObjectA|ObjectB|ObjectC {
       switch ($type) {
           case ObjectType::A:
               return $this->factoryA->createByParameters($parameters);
           case ObjectType::B:
               return $this->factoryB->createByParameters($parameters);
           case ObjectType::C:
               return $this->factoryC->createByParameters($parameters);
           case ObjectType::D:
               return $this->factoryD->createByParameters($parameters);
           case ObjectType::E:
               return $this->factoryE->createByParameters($parameters);
       }

       throw new RuntimeException('some message');
   }
}

As we can see, the overall code has increased. Let's look at the tests for FactoryA and the modified test for SomeFactory.

class FactoryATest extends TestCase {
   public function testCreateByParametersReturnsObjectAWithDefaultMethods(): void {
       $factoryA = new FactoryA(
           $aRepository = $this->createMock(ARepository::class),
       );
       $parameters = [
           'id' => $id = 5,
           'default' => ['someData'],
       ];
       $aRepository->expects($this->once())
           ->method('findById')
           ->with($id)
           ->willReturn(null);

       $actualResult = $factoryA->createByParameters($parameters);
       $this->assertInstanceOf(ObjectA::class, $actualResult);
       // additional checkers for $actualResult
   }
}

class SomeFactoryTest extends TestCase {
   public function testCreateByParametersReturnsObjectA(): void {
       $someFactory = new SomeFactory(
           $factoryA = $this->createMock(FactoryA::class),
           $this->createMock(FactoryB::class),
           $this->createMock(FactoryC::class),
           $this->createMock(FactoryD::class),
           $this->createMock(FactoryE::class),
       );
       $parameters = ['someParameters'];


       $factoryA->expects($this->once())
           ->method('createByParameters')
           ->with($parameters)
           ->willReturn($objectA = $this->createMock(ObjectA::class));


       $this->assertSame($objectA, $someFactory->createByParameters(ObjectType::A, $parameters));
   }

   // the same test for another types and fabrics
}

The total number of tests increased by 5 (the number of possible types), while the number of tests for factories remained the same. So, what makes this code better? The main advantage is the reduction in effort required for corrections after a code review. Indeed, when changing dependencies in FactoryA, only the tests for FactoryA are affected.

Agreed, the code already looks better, and, perhaps unintentionally, we have partly adhered to the single responsibility principle. Is this the end of it? As mentioned earlier, we still need to write 5 tests for each entity. Moreover, we would have to endlessly pass the factories into the constructor as arguments for this service, and introducing a new type (or removing an old one) would lead to changes in all tests (though they are now only 5) for SomeFactory. Therefore, a logical solution, which most developers will likely see, is to create a registry (especially if there is native support for class registration by interface) and declare interfaces for DTOs and factories like:

interface ObjectInterface { }


class ObjectA implements ObjectInterface {
   // some logic
}

interface FactoryInterface {
   public function createByParameters(array $parameters): ObjectInterface;


   public static function getType(): ObjectType;
}

class FactoryB implements FactoryInterface {
   public static function getType(): ObjectType {
       return ObjectType::B;
   }

   public function createByParameters(array $parameters): ObjectB {
       // some logic
       return new ObjectB($parameters);
   }
}

Let's highlight the choice of defining the getType method as static. In the current implementation, there is no difference whether this method is static or dynamic. However, if we start writing a test for this method (no matter how absurd this idea may seem), we may notice that in the case of a dynamic method, the test would look like:

public function testGetTypeReturnsTypeA(): void {
   $mock = $this->getMockBuilder(FactoryA::class)
       ->disableOriginalConstructor()
       ->onlyMethods([])
       ->getMock();


   $this->assertSame($mock->getType(), ObjectType::A);
}

While for a static method, it would look much shorter:

public function testGetTypeReturnsTypeA(): void {
   $this->assertSame(FactoryA::getType(), ObjectType::A);
}

Thus, thanks to laziness, we chose the right solution (perhaps unknowingly) and prevented the getType method from potentially depending on the state of the FactoryB class object.

Let's look at the registry code:

class SomeRegistry {
   /** @var array<int, FactoryInterface> */
   private readonly array $factories;


   /**
    * @param FactoryInterface[] $factories
    */
   public function __construct(array $factories) {
       $mappedFactory = [];
       foreach ($factories as $factory) {
           if (array_key_exists($factory::getType()->value, $mappedFactory)) {
               throw new RuntimeException('Duplicate message');
           }
           $mappedFactory[$factory::getType()->value] = $factory;
       }
       $this->factories = $mappedFactory;
   }

   public function createByParams(ObjectType $type, array $parameters): ObjectInterface {
       $factory = $this->factories[$type->value] ?? null;
       if ($factory === null) {
           throw new RuntimeException('Not found exception');
       }


       return $factory->createByParameters($parameters);
   }
}

As we can see, we have to write 3 tests: 1) a test for duplication, 2) a test when the factory is not found, and 3) a test when the factory is found. The SomeFactory class now looks like a proxy method and can thus be removed.

class SomeFactory {
   public function __construct(
       private readonly SomeRegistry $someRegistry,
   ) { }


   public function createByParameters(ObjectType $type, array $parameters): ObjectInterface {
       return $this->someRegistry->createByParams($type, $parameters);
   }
}

In addition to the reduction in the number of tests (from 5 to 3), any addition/removal of a new factory does not entail changes to old tests (assuming that the registration of new factories is native and integrated into the framework).

To sum up our progress: in the pursuit of a solution to reduce the cost of addressing feedback after a code review, we completely revamped the generation of objects based on types. Our code now adheres to the Single Responsibility and Open/Closed principles (the "S" and "O" in the SOLID acronym), even though we did not explicitly mention them anywhere.

Next, let's make the task more complex and perform the same job with less obvious changes in the code. Let's examine the code in theFactoryA class:

class FactoryA implements FactoryInterface {
   public function __construct(
       private readonly ARepository $aRepository,
   ) { }


   public static function getType(): ObjectType {
       return ObjectType::A;
   }


   public function createByParameters(array $parameters): ObjectA {
       if (!array_key_exists('id', $parameters) || !is_int($parameters['id'])) {
           throw new ErrorException('Some message');
       }
       $aEntity = $this->aRepository->findById($parameters['id']);
       $data = [];
       if ($aEntity !== null) {
           $data = $aEntity->getSomeParams();
       }
       if (count($data) === 0) {
           if (array_key_exists('default', $parameters) && is_array($parameters['default'])) {
               $data = $parameters['default'];
           } else {
               throw new ErrorException('Some message');
           }
       }
       return new ObjectA($data);
   }
}

Can we simplify the writing of tests for this code? Let's break down the first if-block:

if (!array_key_exists('id', $parameters) || !is_int($parameters['id'])) {
   throw new ErrorException('Some message');
}

Let's try to cover it with tests:

public function testCreateByParametersThrowsErrorExceptionWhenParameterIdDoesntExist(): void {
   $this->expectException(ErrorException::class);
  
   $factoryA = new FactoryA(
       $this->createMock(ARepository::class),
   );

   $factoryA->createByParameters([]);
}

public function testCreateByParametersThrowsErrorExceptionWhenParameterIdNotInt(): void {
   $this->expectException(ErrorException::class);

   $factoryA = new FactoryA(
       $this->createMock(ARepository::class),
   );

   $factoryA->createByParameters(['id' => 'test']);
}

If the question of existence is covered easily, the test for the type contains many pitfalls. In this test, we passed a string, but what about other types? Is a large number considered an integer or a floating-point number (for example, in PHP, 10 to the power of 100 will return a short representation like 1.0E+100 of the float type)? You could write a DataProvider for all possible scenarios, or you could extract the validation logic into a separate class and get something like:

class FactoryA implements FactoryInterface {
   public function __construct(
       private readonly ARepository        $aRepository,
       private readonly ExtractorFactory   $extractorFactory
   ) { }

   public static function getType(): ObjectType {
       return ObjectType::A;
   }

   public function createByParameters(array $parameters): ObjectA {
       $extractor = $this->extractorFactory->createByArray($parameters);
       try {
           $id = $extractor->getIntByKey('id');
       } catch (ExtractorException $extractorException) {
           throw new ErrorException('Some message', previous: $extractorException);
       }

       $aEntity = $this->aRepository->findById($id);
       $data = [];
       if ($aEntity !== null) {
           $data = $aEntity->getSomeParams();
       }
       if (count($data) === 0) {
           if (array_key_exists('default', $parameters) && is_array($parameters['default'])) {
               $data = $parameters['default'];
           } else {
               throw new ErrorException('Some message');
           }
       }
       return new ObjectA($data);
   }
}

On the one hand, we added a new dependency, and perhaps we even had to create it. But in return, in all other factories, we don't have to worry about such problems. The test in the current factory is just one, and it covers all possible variations of the id parameter:

public function testCreateByParametersThrowsErrorExceptionWhenParameterIdDoesntExist(): void {
   $this->expectException(ErrorException::class);

   $factoryA = new FactoryA(
       $this->createMock(ARepository::class),
       $extractorFactory = $this->createMock(ExtractorFactory::class),
   );
   $parameters = ['someParameters'];

   $extractorFactory->expects($this->once())
       ->method('createByArray')
       ->with($parameters)
       ->willReturn($extractor = $this->createMock(Extractor::class));
   $extractor->expects($this->once())
       ->method('getIntByKey')
       ->with('id')
       ->willThrowException($this->createMock(ExtractorException::class));


   $factoryA->createByParameters($parameters);
}

Let's look at the next code block, namely:

$aEntity = $this->aRepository->findById($id);
$data = [];
if ($aEntity !== null) {
   $data = $aEntity->getSomeParams();
}
if (count($data) === 0) {
// next code

In this block, the method of the dependency aRepository (findById) is called, which returns either null or an entity with the getSomeParams method. The getSomeParams method, in turn, returns an array of data.

As we can see, the variable $aEntity is only needed to call the getSomeParams method. So, why not get the result of getSomeParams directly if it exists and an empty array if it doesn't?

$data = $this->aRepository->findSomeParamsById($id);

if (count($data) === 0) {

Let's compare the tests before and after. Before the changes, we had 3 possible behaviors: 1) when the entity is found, and getSomeParams returns a non-empty array of data, 2) when the entity is found, and getSomeParams returns an empty array of data, 3) when the entity is not found.

// case 1
$aRepository->expects($this->once())
   ->method('findById')
   ->with($id)
   ->willReturn($this->createConfiguredMock(SomeEntity::class, [
       'getSomeParams' => ['not empty params']
   ]));

// case 2
$aRepository->expects($this->once())
   ->method('findById')
   ->with($id)
   ->willReturn($this->createConfiguredMock(SomeEntity::class, [
       'getSomeParams' => []
   ]));

// case 3
$aRepository->expects($this->once())
   ->method('findById')
   ->with($id)
   ->willReturn(null);

In the modified code, there are only two possible scenarios: findSomeParamsById returns an empty array or it doesn't.

// case 1
$aRepository->expects($this->once())
   ->method('findSomeParamsById')
   ->with($id)
   ->willReturn([]);

// case 2
$aRepository->expects($this->once())
   ->method('findSomeParamsById')
   ->with($id)
   ->willReturn(['not empty params']);

In addition to reducing the number of tests, we got rid of $this->createConfiguredMock(SomeEntity::class, [..].
Next, let's look at the block:

if (count($data) === 0) {
   if (array_key_exists('default', $parameters) && is_array($parameters['default'])) {
       $data = $parameters['default'];
   } else {
       throw new ErrorException('Some message');
   }
}

Since we already have a class that can extract data of the required type, we can use it, removing the checks from the factory code:

if (count($data) === 0) {
   try {
       $data = $extractor->getArrayByKey('default');
   } catch (ExtractorException $extractorException) {
       throw new ErrorException('Some message', previous: $extractorException);
   }
}

In the end, we get a class like:

class FactoryA implements FactoryInterface {
   public function __construct(
       private readonly ARepository        $aRepository,
       private readonly ExtractorFactory   $extractorFactory
   ) { }

   public static function getType(): ObjectType {
       return ObjectType::A;
   }

   public function createByParameters(array $parameters): ObjectA {
       $extractor = $this->extractorFactory->createByArray($parameters);
       try {
           $id = $extractor->getIntByKey('id');
       } catch (ExtractorException $extractorException) {
           throw new ErrorException('Some message', previous: $extractorException);
       }

       $data = $this->aRepository->findSomeParamsById($id);

       if (count($data) === 0) {
           try {
               $data = $extractor->getArrayByKey('default');
           } catch (ExtractorException $extractorException) {
               throw new ErrorException('Some message', previous: $extractorException);
           }
       }

       return new ObjectA($data);
   }
}

The createByParameters method will have only 4 tests, namely:

a test for the first exception (getIntByKey)
a test when findSomeParamsById returned a non-empty result
a test when findSomeParamsById returned an empty result, and the second exception (getArrayByKey) is triggered
a test when findSomeParamsById returned an empty result and ObjectA was created with values from the defaultarray

However, if the task requirements allow it, and ErrorException can be replaced with ExtractorException, the code will be even shorter:

class FactoryA implements FactoryInterface {
   public function __construct(
       private readonly ARepository        $aRepository,
       private readonly ExtractorFactory   $extractorFactory
   ) { }

   public static function getType(): ObjectType {
       return ObjectType::A;
   }

   /**
    * @throws ExtractorException
    */
   public function createByParameters(array $parameters): ObjectA {
       $extractor = $this->extractorFactory->createByArray($parameters);
       $id = $extractor->getIntByKey('id');
       $data = $this->aRepository->findSomeParamsById($id);

       if (count($data) === 0) {
           $data = $extractor->getArrayByKey('default');
       }

       return new ObjectA($data);
   }
}

And there will be only two tests:

a test when findSomeParamsById returned a non-empty result
a test when findSomeParamsById returned an empty result, and ObjectA was created with values from the default array

Let's summarize the work done.

Initially, we had poorly written code that needed test coverage. Since any developer is confident in their code (until something crashes with an error), writing tests for it is a long and monotonous task that no one likes. The only way to write fewer tests is to simplify the code that needs to be covered by these tests. In the end, by simplifying and reducing the number of tests, the developer improves the code, without necessarily following any specific theoretical practices.

Discussion (20)

Not yet any reply