Code generation

One of the technologies that helps our project follow the Inversion of Control (IoC) principle is the Dagger framework (https://google.github.io/dagger/). It implements Dependency Injection (DI), which is one form of IoC. Most DI frameworks rely on reflection to scan the code for annotations at run time, and this is how the early DI frameworks used on Android, such as Guice, were implemented. On the back end, the extra time and memory that such a framework needs is usually not a problem: dependencies are resolved and created when the application starts, not while a request is being served, so user experience is not degraded. On mobile devices, however, this extra time and memory can be critical and can significantly hurt user experience. Therefore Dagger, which is the recommended DI framework (https://developer.android.com/topic/performance/memory), takes a different approach: it resolves dependencies at compile time by generating extra classes. Most of our team members are skeptical of code generation, since generated code is very rarely of the same quality as code written by a seasoned developer. So we decided to take a closer look at code generation and at whether it should concern us.
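To make the difference between the two approaches concrete, here is a minimal plain-Java sketch. It does not use Dagger and does not reproduce Dagger's actual generated code. The names Inject, Engine, Car, CarFactory and createByReflection are hypothetical and exist only for this illustration. The first style discovers dependencies by scanning annotations through reflection at run time; the second is the kind of straightforward factory class that a compile-time generator could emit instead.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;

final class InjectionStyles {

    // Hypothetical marker annotation for this sketch (not javax.inject.Inject).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.CONSTRUCTOR)
    @interface Inject {}

    static class Engine {
        @Inject Engine() {}
        String start() { return "engine started"; }
    }

    static class Car {
        private final Engine engine;
        @Inject Car(Engine engine) { this.engine = engine; }
        String drive() { return "car: " + engine.start(); }
    }

    // Style 1: reflection-based injection. The annotation scan and the
    // recursive construction of dependencies both happen at run time.
    static <T> T createByReflection(Class<T> type) {
        for (Constructor<?> c : type.getDeclaredConstructors()) {
            if (!c.isAnnotationPresent(Inject.class)) {
                continue;
            }
            Class<?>[] paramTypes = c.getParameterTypes();
            Object[] args = new Object[paramTypes.length];
            for (int i = 0; i < paramTypes.length; i++) {
                args[i] = createByReflection(paramTypes[i]);
            }
            try {
                c.setAccessible(true);
                return type.cast(c.newInstance(args));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot create " + type.getName(), e);
            }
        }
        throw new IllegalStateException("No @Inject constructor on " + type.getName());
    }

    // Style 2: the kind of class a compile-time generator could emit.
    // The dependency graph is already resolved, so only plain constructor
    // calls remain and no reflection is needed at run time.
    static final class CarFactory {
        static Car create() {
            return new Car(new Engine());
        }
    }

    public static void main(String[] args) {
        System.out.println(createByReflection(Car.class).drive()); // car: engine started
        System.out.println(CarFactory.create().drive());           // car: engine started
    }
}
```

Both calls build the same object graph. The difference is where the work happens: the reflective version pays for the annotation scan on every start-up, while the factory version moves that work to build time and leaves ordinary code that nobody has to write by hand.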

The whole history of programming languages is about raising the level of abstraction and generating the lower-level code. It started with assembly languages, when developers no longer had to write programs in binary code (zeros and ones) but could use human-readable keywords. An assembler was used to convert the assembly code into machine language. Since the conversion was quite straightforward, developers did not worry much about the resulting machine code and cared mostly about the assembly code. The story was different with the next generation of programming languages, such as Fortran, Cobol, Pascal, and C, known as high-level programming languages. The main advantage of high-level languages over low-level languages is that they are easier to read, write, and maintain. But translating such a language into machine code (the job of a compiler) is much more complex, and therefore the generated machine code was at first less efficient than hand-written assembly. After a while compilers improved, and most developers stopped worrying about the compiled code and focused solely on the high-level source code.

The main conclusion we drew from the history of programming languages is that as long as the generated code is not part of the code base that developers have to read, extend, and maintain, and it does not impose a performance penalty, there is no risk in using generated code. On the other hand, the benefit it brings is significant: it saves a lot of time otherwise spent on boilerplate code and lets developers focus on the application's main logic.