JVM Pit Stop: Garbage Collector Performance, Annotation Processors, and Energy Efficiency - JVM Weekly vol. 115
Today, we have just one topic but a wealth of content, as we dive into a range of VM-related subjects. Treat it as a refreshing change of pace to kickstart the new year.
Let’s start with the topic of Garbage Collectors. Or rather, let’s begin with Mill.
Mill is a lightweight build tool for the JVM ecosystem, created by Haoyi Li in response to his dissatisfaction with the complexity and heaviness of traditional tools like Maven and Gradle, which are often overkill for smaller projects (keep that in mind - this will be one of the key topics in today’s edition). The first version of Mill was released in 2018, with the idea of providing a more minimalist tool that is still flexible enough to meet the needs of everyday Java and Scala projects.
In addition to developing an interesting tool, its creators have recently begun publishing very compelling technical articles. For example, last week Haoyi Li published Understanding JVM Garbage Collector Performance, an article that discusses the basic principles of garbage collection mechanisms. Li presents a simplified GC model, explaining how it manages memory by identifying and removing unused objects. He describes various strategies, such as dividing the heap into segments and copying active objects, to manage memory efficiently and minimize fragmentation. Additionally, the article touches on performance issues, pointing out how different GC implementations can impact the performance of JVM applications.
There’s no shortage of articles about GC online, so why highlight this one? Li draws attention to several counterintuitive conclusions about garbage collection performance. For example, adding more memory doesn’t necessarily improve GC pause times and may even worsen them in some cases. Similarly, caching data in memory increases the size of the live object set, which can extend pause times - especially with Least-Recently Used caches. There’s also no definitive answer to the question of the ideal memory size for an application: more memory reduces GC frequency at the cost of a larger memory footprint, while less memory causes more frequent garbage collection.
Another important observation is the impact of process structure on GC performance. Fewer but larger processes can generate longer GC pauses due to the larger live object set. Reducing the size of the live set, for instance by offloading large data structures to tools like Redis or SQLite, significantly shortens pause times. Meanwhile, short-lived objects are quicker to collect due to the generational approach of GC. Finally, using a modern collector like ZGC can dramatically reduce pause times (to as little as 1-5 ms), which may be critical for latency-sensitive applications, but at the cost of higher memory consumption.
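To make these trade-offs a bit more tangible, here is a back-of-the-envelope sketch in Java. It is not code from Li's article. It assumes a deliberately naive stop-the-world collector whose pause grows linearly with the live set and which runs whenever the free headroom (heap minus live set) fills up. The class name, the allocation rate, and the cost constant are all made up for illustration, and the model does not capture the case where extra memory makes pauses worse.

```java
public class GcModel {

    /** Assumption: a collection is triggered once the free headroom (heap minus live set) fills up. */
    public static double secondsBetweenCollections(double heapMb, double liveSetMb, double allocMbPerSec) {
        return (heapMb - liveSetMb) / allocMbPerSec;
    }

    /** Assumption: pause length grows linearly with the live set; msPerLiveMb is a made-up constant. */
    public static double pauseMillis(double liveSetMb, double msPerLiveMb) {
        return liveSetMb * msPerLiveMb;
    }

    static void report(String label, double heapMb, double liveSetMb) {
        double allocMbPerSec = 100.0; // hypothetical allocation rate
        double msPerLiveMb = 0.5;     // hypothetical cost of tracing/copying 1 MB of live data
        System.out.printf("%-28s GC every %5.1f s, pause ~%5.0f ms%n", label,
                secondsBetweenCollections(heapMb, liveSetMb, allocMbPerSec),
                pauseMillis(liveSetMb, msPerLiveMb));
    }

    public static void main(String[] args) {
        report("1 GB heap, 200 MB live", 1000, 200);
        // More heap: collections become rarer, but each pause stays the same length.
        report("4 GB heap, 200 MB live", 4000, 200);
        // A large in-memory cache grows the live set, so every pause gets longer.
        report("4 GB heap, 2 GB cache live", 4000, 2000);
    }
}
```

Even in this toy model, the heap size only changes how often the collector runs, while the live set alone decides how long each pause lasts. That is why moving a big cache out of the process helps more than adding RAM.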
Another great publication from Li is How JVM Executable Assembly Jars Work. Mill enables the creation of so-called "executable assembly jars," which can be executed directly from the command line without needing to use java -jar. These .jar files contain all the code, dependencies, and metadata, allowing them to be run with a single command. The key trick is prepending a Bash script (or a universal script compatible with both Windows and Linux), so that the resulting file functions both as a shell command and as a standard .jar file.
The text contains a wealth of interesting technical details about this process. Mill relies on the fact that .zip files are read from the end, where the archive's index lives, which allows a launcher script to be added at the beginning. These scripts pass JVM configuration, arguments, and environment settings, for example through JAVA_OPTS. This solution works on macOS, Linux, and Windows (with appropriate file preparations). While tools like JLink or JPackage also allow for creating native binary files, Mill simplifies the process, enabling quick testing and deployment of Java applications without unnecessary configuration, making them more terminal-friendly. And the ability to delve into these kinds of internals always provides a better understanding of the entire process.
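The "read from the end" trick is easy to verify with nothing but the JDK. The sketch below is not Mill's implementation: it builds a tiny zip in memory, glues a hypothetical launcher script in front of it, and checks that java.util.zip.ZipFile still finds the entry. The class name, the script contents, and the demo.Main entry point are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class PrependedLauncherDemo {

    // Hypothetical launcher: "$0" is the file itself, so the JVM gets it as the classpath.
    static final String LAUNCHER =
            "#!/usr/bin/env sh\n"
          + "exec java $JAVA_OPTS -cp \"$0\" demo.Main \"$@\"\n";

    /** Builds an ordinary zip archive with a single text entry, entirely in memory. */
    public static byte[] buildZip(String entryName, String content) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
                zip.putNextEntry(new ZipEntry(entryName));
                zip.write(content.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes launcher script + zip bytes into one temp file, like `cat launcher.sh app.jar`. */
    public static Path writeWithLauncher(byte[] zipBytes) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            out.write(LAUNCHER.getBytes(StandardCharsets.UTF_8));
            out.write(zipBytes);
            Path file = Files.createTempFile("assembly-demo", ".jar");
            Files.write(file, out.toByteArray());
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads an entry back through ZipFile, which locates the archive index from the end of the file. */
    public static String readEntry(Path file, String entryName) {
        try (ZipFile zip = new ZipFile(file.toFile());
             InputStream in = zip.getInputStream(zip.getEntry(entryName))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = writeWithLauncher(buildZip("hello.txt", "hello from inside the archive"));
        System.out.println(readEntry(file, "hello.txt"));
        Files.delete(file);
    }
}
```

A shell, on the other hand, reads the very same file from the top, so it sees a script and never reaches the binary part because of the exec. On a Unix-like system the file would additionally need its executable bit set.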
It won't be a very original thought, but in general, I believe learning from field experts is the way to go, and Li definitely has significant hands-on experience.
Continuing with a similar theme, though not from Mill this time, the next article will focus on energy efficiency.
Comparing Java Virtual Machine Energy Efficiency by Mirko Stocker demonstrates that the version of the JVM has a significant impact on energy consumption. Using the broadly known Spring Boot Petclinic application as an example, energy usage was measured with the JoularJX tool for JVM versions 17, 21, and 22 (the test focuses on the Temurin OpenJDK distribution).
Interestingly, the data also revealed differences in energy consumption between various versions of Spring Boot, with a noticeable jump between versions 3.0 and 3.1. However, the authors remain uncertain about the cause of this phenomenon - perhaps a topic for future publications.
The key takeaway from the article is the benefit of upgrading to the latest JVM versions, which can help reduce energy consumption in Java applications.
So next time you’re updating dependencies, explain that you’re doing it for the planet. Not every hero wears a cape.
The next article I have for you presents a rather unconventional approach to Java, as showcased in Java in the Small.
Java is traditionally recognized as a solid choice for large-scale projects. However, Cay Horstmann demonstrates that it is surprisingly effective for small, everyday programming tasks as well. Features like static typing (often dismissed as unnecessary ceremony for small-scale tasks), excellent API support, and recent improvements such as JEP 330 (running .java files directly) and JEP 477 (eliminating the need for verbose public static void main) make Java more accessible for scripting. These updates address longstanding issues with boilerplate code (a trait commonly associated with build tools, too) and enable simpler workflows without sacrificing the language’s advantages. This makes Java a competitive option for tasks typically handled by Python or Bash scripts.
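As a taste of what such a "small Java" script can look like, here is a minimal sketch of my own (it is not taken from Horstmann's article): a word counter living in a single file. The file name WordCount.java and the fallback sample sentence are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Thanks to JEP 330 there is no separate javac step:
//   java WordCount.java some-file.txt
public class WordCount {

    /** Counts how often each lower-cased word occurs in the text. */
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Falls back to a sample sentence when no file is given.
        String text = args.length > 0 ? Files.readString(Path.of(args[0])) : "to be or not to be";
        count(text).forEach((word, n) -> System.out.println(n + "\t" + word));
    }
}
```

With the preview features from JEP 477 enabled, the class wrapper and the public static modifiers could be dropped as well, leaving little more than the two methods.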
Cay also illustrates how tools like JBang have improved Java scripting, allowing seamless integration of external libraries, as features such as dependency management directly within the script file simplify development and accelerate deployment. This is especially useful in scripts requiring JSON handling or command-line argument parsing - areas where Java’s standard library is somewhat limited.
The author also points out the growing popularity of notebook environments. Years ago, my colleagues working with Clojure often pointed out to me that the REPL environment was a significant advantage of the language. Today, Java’s potential in exploratory programming, which has now entered the mainstream, is increasingly being recognized. Although Java is still in its early stages compared to Python, tools like Jupyter (augmented with Java kernels such as IJava and Ganymede) and the emergence of libraries like Tablesaw paint an intriguing future for Java in this area as well.
Speaking of notebooks, I can’t help but share awesome-kotlin-notebook. This repository collects resources, examples, and tools supporting Kotlin programming in notebook environments like Jupyter. I highly recommend checking it out, especially if you’ve already sipped the current Kool-Aid and switched from traditional programs to notebooks.
And finally, let’s talk about Lombok (which, by the way, finally received support for JDK 23 at the end of November - so if you’ve been holding off on updates because of that, you’re good to go now) and cover the article Beyond Lombok: Modern Java Code Generation Tools by Egor Voronianskii, which, while behind a Medium paywall, can be accessed using tools I won’t write about here.
Everyone knows Lombok - it’s one of the most popular Java libraries, as recent statistics confirm. However, if you’ve been living under a rock (or behind a corporate VPN), it’s a tool that automatically generates boilerplate code such as getters, setters, and constructors at compile time. Still, not everyone realizes that Lombok achieves this using an annotation processor, which modifies the abstract syntax tree before bytecode is generated.
But Lombok isn’t the only tool leveraging annotation processors. Another solution featured in the article is MapStruct — a lesser-known tool designed for declarative mapping of Java Bean objects. It uses annotations to define mapping rules between the properties of different classes, generating a complete mapper implementation at compile time. This approach ensures high performance and type safety, with validation occurring at the compiler level, allowing errors caused by object structure changes to be caught early.
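If you have never seen an annotation processor from the inside, the following sketch shows the bare mechanics using only the public JDK APIs (javax.annotation.processing and javax.tools). It is neither Lombok's nor MapStruct's code. The demo.Describe annotation, the processor, and the generated PersonDescription class are all hypothetical. Like MapStruct, it generates a new source file. Lombok goes a step further and modifies existing classes through compiler internals, which this sketch does not attempt.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class ProcessorDemo {

    /** Hypothetical processor: for every type annotated with demo.Describe, emit <Name>Description. */
    @SupportedAnnotationTypes("demo.Describe")
    public static class DescribeProcessor extends AbstractProcessor {
        @Override
        public SourceVersion getSupportedSourceVersion() {
            return SourceVersion.latestSupported();
        }

        @Override
        public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
            for (TypeElement annotation : annotations) {
                for (Element element : roundEnv.getElementsAnnotatedWith(annotation)) {
                    String name = element.getSimpleName() + "Description";
                    // The Filer hands the new source back to the compiler for the next round.
                    try (Writer w = processingEnv.getFiler()
                            .createSourceFile("demo." + name, element).openWriter()) {
                        w.write("package demo;\n"
                              + "public final class " + name + " {\n"
                              + "    public static final String TEXT = \"generated for "
                              + element.getSimpleName() + "\";\n"
                              + "}\n");
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }
            }
            return true;
        }
    }

    /** Wraps a string as an in-memory source file for the compiler. */
    static JavaFileObject source(String className, String code) {
        URI uri = URI.create("string:///" + className.replace('.', '/') + ".java");
        return new SimpleJavaFileObject(uri, JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return code;
            }
        };
    }

    /** Compiles a tiny annotated class with the processor and returns the generated source text. */
    public static String generate() {
        try {
            Path out = Files.createTempDirectory("processor-demo");
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // requires a JDK, not a bare JRE
            List<JavaFileObject> sources = List.of(
                    source("demo.Describe", "package demo; public @interface Describe {}"),
                    source("demo.Person", "package demo; @Describe public class Person {}"));
            JavaCompiler.CompilationTask task = compiler.getTask(
                    null, null, null, List.of("-d", out.toString(), "-s", out.toString()), null, sources);
            task.setProcessors(List.of(new DescribeProcessor()));
            if (!task.call()) {
                throw new IllegalStateException("compilation failed");
            }
            return Files.readString(out.resolve("demo/PersonDescription.java"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(generate());
    }
}
```

In a real project the processor would be registered on the compiler's processor path instead of being wired in by hand, but the process method would look much the same.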
The article also mentions JavaPoet, a library for dynamically generating entire Java source files using a fluent API. It allows developers to create packages, classes, methods, and fields based on runtime data. Compared to Lombok or MapStruct, JavaPoet offers greater flexibility, though the generated code can be more verbose and harder to maintain. Nevertheless, it’s particularly useful for building code generation tools or annotation processors, thanks to its versatility. A Kotlin version, KotlinPoet, is also available.
And speaking of JavaPoet, I have something cool to wrap up. Jacek Dubikowski once created a series titled Build your own framework using an annotation processor, detailing the process of building a hobby framework from scratch - not necessarily for production use, but for learning and exploration. Java is not JavaScript, after all.
The first article covers implementing Dependency Injection, the second tackles transaction handling, and the third explains creating a WebController. The articles are accompanied by source code and guide you step-by-step through the mechanisms of creating a functional framework using JavaPoet, while also addressing potential challenges along the way. Highly recommended - reverse engineering and tinkering are some of the best educational tools. Having worked with Jacek personally in the past, I consider him one of the top experts in this niche topic, and I’ve already shared my opinion about learning from experts.
BTW: If you understand Polish, I even have a recording of Jacek’s presentation from Toruń JUG for you.
I hope you enjoyed this more focused edition. This year, I’ll try to occasionally offer such cohesive, focused issues dedicated to specific topics, especially those that particularly interest me.
And as some regular readers know, I have my darlings.