Official Java Client for OpenAI & Spring’s Model Context Protocol Client - JVM Weekly vol. 113
The last one this year
1. Official Java Client from OpenAI and Spring’s Model Context Protocol Implementation
Timeo Samus Altmanus et dona ferentes
How else could we wrap up 2024 if not with AI? It just feels inevitable! In December, most activity has gone into holiday hibernation — except for AI itself, where we’re seeing a cascade of announcements.
And maybe crypto too, but it has long been playing by its own rules.
Let’s start with OpenAI, which launched its 12 Days of OpenAI initiative in December. Over 12 business days, they’ve been rolling out new features, products, and updates related to their AI technology. On Tuesday, during Day 9, they introduced the o1 model and the GPT-4o Realtime API...
...and unveiled another product: the OpenAI Java SDK (Beta). Let me set expectations: it’s not a heavyweight framework like Langchain4j, with its own memory model, conversational contexts, and tools. It’s simply an API client—precisely what most people will reach for before diving into more advanced solutions that come with a steeper learning curve. Think of the Java SDK as a bicycle rather than a space rocket. Nobody rides a rocket to the grocery store, but, in my opinion, a bicycle is perfect for scripting tasks.
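To give you a taste, here is a minimal sketch of a chat completion call with the new SDK, based on its README at launch; it’s a beta, so treat the exact package and builder names as subject to change (fromEnv() picks the API key up from the OPENAI_API_KEY environment variable):

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.ChatCompletion;
import com.openai.models.ChatCompletionCreateParams;
import com.openai.models.ChatModel;

public class HelloOpenAI {
    public static void main(String[] args) {
        // Builds a client from environment variables instead of hardcoding secrets
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        ChatCompletionCreateParams params = ChatCompletionCreateParams.builder()
                .model(ChatModel.GPT_4O_MINI)
                .addUserMessage("Summarize this week's JVM news in one sentence.")
                .build();

        ChatCompletion completion = client.chat().completions().create(params);
        completion.choices().forEach(choice ->
                System.out.println(choice.message().content().orElse("")));
    }
}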
These OpenAI announcements are always worth keeping an eye on, even if I can’t help but associate “12 Days of Christmas” with this classic:
This is not the end of Java implementations of APIs and protocols. While the OpenAI API is simple enough that anyone can implement a client with a plain HTTP client, the Model Context Protocol from Anthropic is far more interesting from an engineering standpoint. That’s why we’ll start with a quick introduction, and then I’ll explain where the connection with the Java community comes in.
AI models can generate incomplete or inaccurate responses when they lack context or user-specific data. As assistants and AI systems evolve, ensuring access to current, contextual data from various sources becomes an ever-growing challenge. I’ve personally participated in more than one (or two xD) projects where every integration required a dedicated solution for each data source, which drives up maintenance costs and the overall "brittleness" of the result.
This is where Anthropic, the creators of Claude.ai, step in. The Model Context Protocol (MCP) they developed is an open standard that standardizes how AI systems connect with external data sources. It enables bidirectional communication between AI applications (MCP clients) and MCP servers that provide the data, so developers no longer have to build a dedicated integration for every source: they write MCP servers that expose data and MCP clients that consume it, giving AI agents access to current, contextual data in real time. If I were to compare it to something, it reminds me of a mix of the Language Server Protocol and JDBC, somewhat in the style of the Kafka Connect API.
The downside is that the protocol is not trivial to implement, especially if every project had to do it on its own. Believe me, I started a test implementation for my own needs at the beginning of December here.
I will finish it, but I suspect it will serve more for my own understanding and general fun, because the Spring team beat me to it (and likely did it better 😀).
The Spring AI team developed Spring AI MCP, a Java SDK that implements the Model Context Protocol client. It allows Spring AI-based applications to connect easily to MCP servers, which provide data from local and remote sources such as file systems, databases, or web services, through a unified data-retrieval interface.
The Spring AI MCP implementation supports both synchronous and asynchronous communication patterns. Its modular architecture allows for easy functionality expansion by adding new MCP servers.
Now I’m waiting for a module that would make it easy to expose a server of my own, because it is rather amusing that in the examples Spring AI MCP talks (even if not directly) to the Node.js MCP filesystem server from npm:
// Spawn the filesystem MCP server from npm as a child process; the client will talk to it over stdio
var stdioParams = ServerParameters.builder("npx")
        .args("-y", "@modelcontextprotocol/server-filesystem", getDbPath())
        .build();
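From there (paraphrasing the experimental examples, so treat the exact class and method names as illustrative rather than gospel), you wrap those parameters in a stdio transport, build a synchronous or asynchronous client, perform the protocol handshake, and ask the server what it can do:

// Hypothetical wiring, paraphrased from the experimental Spring AI MCP examples:
var transport = new StdioClientTransport(stdioParams); // talks to the child process over stdin/stdout
var mcpClient = McpClient.sync(transport);             // an async variant exists as well
mcpClient.initialize();                                // MCP handshake: exchange protocol versions and capabilities
var tools = mcpClient.listTools();                     // discover the tools the server exposes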
We’ve covered OpenAI and Anthropic — don’t think Google has been idle. On December 11, 2024, Google introduced Gemini 2.0, the latest generation of its AI model. This iteration integrates advanced multimodal capabilities, enabling it to process text, images, and audio. Gemini 2.0 also introduces “agentic AI,” systems capable of independently performing tasks such as planning and decision-making.
Additionally, Google released Gemini 2.0 Flash, an experimental version of the model offering low latency and enhanced performance, available to developers through Google AI Studio and Vertex AI. In this context, check out the article Detecting objects with Gemini 2.0 and LangChain4j, in which Guillaume Laforge, one of Groovy’s creators, demonstrates how Gemini 2.0 Flash enables precise object recognition and localization in images, opening new possibilities for AI-powered image analysis, and explains how to implement it in Java (though Java plays more of a supporting role here).
To wrap it all up, one last publication related to Langchain4j: Creating pure Java LLM-infused applications with Quarkus, Langchain4j, and Jlama by Mario Fusco. The author explains how Jake Luciani’s Jlama library enables LLM inference directly on the JVM, using models like Llama 3.2 with 4-bit quantization. Thanks to the integration with Quarkus and LangChain4j, the application runs without any external model server, handling inference directly in Java. Moreover, this setup allows easy model switching by editing a single configuration entry, and it leverages the Vector API for faster numerical operations on the JVM.
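For flavor, that model swap comes down to a single entry in application.properties; I’m quoting the property keys from memory of the article, so treat them as illustrative:

# application.properties: point the Quarkus LangChain4j Jlama extension at a
# (4-bit quantized) model; switching models means changing just this value
quarkus.langchain4j.jlama.chat-model.model-name=tjake/Llama-3.2-1B-Instruct-JQ4
quarkus.langchain4j.jlama.chat-model.temperature=0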
Cool stuff.
2. GraalVM: BytecodeDSL and support in AWS Common Runtime
Now we’re diving into a very niche topic—not just GraalVM, but some of the most obscure aspects of it. But hey, 🤷 I'm in my own space – and I hope I'm not the only one who likes it...
Since not everyone might be up to speed, let’s start with a brief recap. Truffle is a framework that’s part of GraalVM, enabling the creation of custom programming language implementations by interpreting abstract syntax trees (ASTs). Thanks to Truffle, developers can build interpreters for various languages, which are automatically optimized by the Graal compiler, achieving performance levels close to those of native virtual machines.
While Truffle's AST interpreters offer excellent peak performance, they consume a lot of memory, since entire tree structures have to be allocated, and they leave limited room for optimization during interpretation. Bytecode interpreters, on the other hand, are more memory-efficient and offer better optimization potential, but they are harder to implement due to the complexity of bytecode handling, control flow, and optimizations such as quickening.
To simplify this process, the Truffle team created the Bytecode DSL. It automates the generation of bytecode interpreters from high-level specifications called "operations", letting developers focus on language-specific semantics while the DSL takes care of the tedious details, such as instruction encoding and control flow. Like the Truffle DSL for AST interpreters, the Bytecode DSL abstracts away the interpreter's complexity, and it supports performance features such as quickening, boxing elimination, and tiered interpretation that optimize bytecode execution.
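To make this more concrete, here is a condensed sketch in the spirit of the getting-started example from the docs (MyLanguage stands in for your TruffleLanguage, and the details should be treated as illustrative): you declare operations with their specializations, and the DSL generates the bytecode encoding, a builder for emitting it, and the interpreter loop.

import com.oracle.truffle.api.bytecode.BytecodeRootNode;
import com.oracle.truffle.api.bytecode.GenerateBytecode;
import com.oracle.truffle.api.bytecode.Operation;
import com.oracle.truffle.api.dsl.Specialization;
import com.oracle.truffle.api.frame.FrameDescriptor;
import com.oracle.truffle.api.nodes.RootNode;

@GenerateBytecode(languageClass = MyLanguage.class) // MyLanguage: a placeholder TruffleLanguage
public abstract class MyBytecodeRootNode extends RootNode implements BytecodeRootNode {

    protected MyBytecodeRootNode(MyLanguage language, FrameDescriptor frameDescriptor) {
        super(language, frameDescriptor);
    }

    @Operation // one high-level operation; each @Specialization becomes a bytecode handler
    public static final class Add {
        @Specialization // if only ints ever flow through, quickening can pin the instruction to this variant
        static int doInts(int a, int b) { return a + b; }

        @Specialization
        static String doStrings(String a, String b) { return a + b; }
    }
}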
You will find great examples here in the original docs.
While not many readers may use this directly, I hope you can appreciate it — if only out of self-preservation instinct. The world is simply better when the creators of widely-used tools have an easier life.
But we’re about to go even deeper...
Continuing the GraalVM topic, here’s some news from Amazon Web Services: AWS CRT Client for Java, which has recently gained improved support for GraalVM Native Image. AWS CRT (Common Runtime) is a low-level runtime designed for maximum performance and shared across various programming languages supported by AWS SDKs, including Java, Python, C++, and Rust. Written in C and C++, CRT provides optimized support for key networking features such as HTTP/2, TLS, and multithreading. The runtime acts as a shared layer used by SDK implementations in various languages, enabling access to the same efficient low-level libraries. Integration is achieved using mechanisms specific to each language, such as JNI in the case of Java.
Thanks to this approach, AWS CRT replaces traditional, standalone networking implementations for each language, creating a single, unified runtime environment. For example, instead of writing separate solutions for HTTP/2 support in Java, Python, and C++, CRT provides one highly optimized library that SDKs across different languages use as a shared base. In practice, this means that all languages supported by AWS SDKs can rely on the same high-performance implementation of networking features, significantly reducing maintenance and optimization costs for the entire ecosystem.
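If you have never seen CRT from the Java side, it usually shows up indirectly, for example through the CRT-based S3 client in the AWS SDK for Java 2.x. A minimal sketch (the region is a placeholder; credentials come from the SDK's default chain):

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3AsyncClient;

public class CrtS3Example {
    public static void main(String[] args) {
        // crtBuilder() backs the async client with the native Common Runtime,
        // which handles HTTP, TLS and multithreading through JNI
        try (S3AsyncClient s3 = S3AsyncClient.crtBuilder()
                .region(Region.EU_WEST_1) // placeholder region
                .build()) {
            s3.listBuckets().join()
              .buckets()
              .forEach(bucket -> System.out.println(bucket.name()));
        }
    }
}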
Support for GraalVM Native Image in the AWS CRT Client for Java was introduced in version v0.29.20, enabling applications using AWS CRT to be compiled into GraalVM native binaries without requiring additional configuration. Starting with version v0.31.1, this support has been significantly improved. Instead of embedding shared libraries as resources within the native binary, the corresponding libraries are now stored in the same directory as the compiled native image.
This change has two key effects. First, it reduces the size of native binaries by approximately 30%—for example, from 142 MB to 101 MB in a sample application. Second, it eliminates the startup cost of extracting the JNI libraries to temporary paths. As a result, applications start even faster and take up less space, which is particularly important in resource-constrained environments like serverless functions in AWS Lambda. You can learn more from the article AWS CRT Client for Java adds GraalVM Native Image support by Maximilian Schellhorn and Dengke Tang.
I even checked how many GitHub projects use aws-crt-java. While the numbers aren’t impressive, and many are "zero-star" repositories (often AWS’s own tools), there are still some interesting gems, like Astra by Slack, AsterixDB by Apache, and GeoMesa for geospatial queries.
To bring things back down to earth, let’s end with something more "for the people," so you don’t think GraalVM is just a toy for toolmakers. Alina Yurenko 🇺🇦, GraalVM Developer Advocate, recently published an article as part of the advent calendar: 5 Cool Applications You Can Build with Java and GraalVM. In this article, she describes (as the title suggests) five inspiring examples of how GraalVM can be used in practice, providing ideas for both beginner and advanced developers.
Among the projects described are concepts such as building serverless applications with instant startup times using GraalVM Native Image, optimizing microservices by minimizing resource usage, and creating multilingual applications that combine Java with languages like Python, JavaScript, or Ruby, leveraging GraalVM Polyglot. Alina also touches on efficient developer tools that can run faster with a smaller footprint and demonstrates IoT platform integrations, where low memory usage and fast startup times are critical.
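As a taste of the polyglot part, GraalVM's Context API is all it takes to call another language from Java (a minimal sketch, assuming the JavaScript component is installed alongside your GraalVM or pulled in as a dependency):

import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Value;

public class PolyglotHello {
    public static void main(String[] args) {
        try (Context context = Context.create()) {
            // Evaluate JavaScript in-process and hand the result back to Java
            Value doubled = context.eval("js", "[1, 2, 3].map(x => x * 2)");
            System.out.println(doubled); // prints (2, 4, 6)
        }
    }
}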
The article is a great reminder that technologies like GraalVM can be used not only for large infrastructure projects but also for more everyday, creative applications. It’s worth a look, especially if you’re seeking inspiration to experiment with GraalVM.
Finally, wrapping up the GraalVM topics, I wish I could wholeheartedly recommend this week’s AirHacks.fm episode, in which Adam Bien talks with Alfonso² Peterssen about Espresso, a Java implementation written using the Truffle framework mentioned earlier. Unfortunately, Alfonso had some microphone issues, so you’ll occasionally hear static. Still, the content is as strong as always; I just wanted to flag the technical hiccup so you don’t get the wrong impression about the show’s usual quality. 😃
3. Year-End Recap: Java Trends Report 2024 by InfoQ
To wrap up this final edition of the year (more on that in a moment), I bring you the perfect summary of 2024: Java Trends Report by InfoQ.
This report, published annually, summarizes the most important trends and events in the Java world, based on analyses and opinions from key experts. The report’s authors, including Ben Evans, Holly Cummins, A N M Bazlur Rahman, Grace Jansen, Emily Jiang, Ivar Grimstad, Andrea Peruffo, Erik Costlow, Johan Janssen, and Karsten Silz, led by Michael Redlich, InfoQ’s Lead Editor for Java, gathered their observations to create a comprehensive picture of Java’s evolution over the past year.
Key conclusions from the report include the growing significance of JDK 21 as the new LTS and the widespread adoption of features like records, pattern matching, and virtual threads. Experts also noted increased interest in GraalVM and Native Image, particularly for serverless and cloud applications where performance and minimal resource usage are critical. Special attention was given to the adoption of new tools and practices supporting the modernization of existing applications, as well as the evolution of the community around OpenJDK. The report also offers intriguing insights into Java’s role in AI environments and its future in cloud technologies.
Additionally, the report is accompanied by a podcast in which Ixchel Ruiz, Senior Software Developer at Karakun, and Gunnar Morling, Software Engineer at Decodable and creator of #1BRC, discuss the report’s key themes with Michael Redlich. The episode touches on topics like the benefits of Java’s six-month release cycle, Project Lilliput and compact object headers, nullability in Java, and the impact of Python’s growing popularity. Interestingly, the discussion highlights why Java, despite Python’s current dominance, has a solid chance to strengthen its position in AI, thanks to its ecosystem, tools, and long-term support in enterprise projects.
One exciting highlight from the report is the mention of GraalVM, which has now entered the “Early Majority” phase (translation: stable and ready for use). There are also some fascinating entries in the “Early Adopters” section (like CRaC and Nima) and in the “Innovators” section (GraalPy and GraalWASM). The last one is especially interesting — at the beginning of December, Safari, the last of the major browsers, introduced support for WebAssembly GC.
It’s become a running joke that every year is supposed to be the year of WebAssembly, but as we enter 2025, we’re truly closer than ever. Here’s hoping my predictions aren’t off the mark this time.
The only thing I struggle to agree with is Scala 3 being relegated to the “Late Majority” phase due to its "slow release train." Scala 3.6 was just released, and Scala 3 now has an LTS (Long-Term Support) version. Interestingly, during this week’s Scala Survey announcements, the community actually complained that new versions are coming out too quickly. It seems some developers feel overwhelmed by the pace of changes and the frequency of new releases 😅.
But I won’t ramble on; you can find the report via the link, and this edition has turned out surprisingly long—and there’s more to come.
As the year draws to a close, I’ll break my usual rule of sharing only publications in English to recommend a fantastic Polish podcast: Patoarchitekci (I'll leave translating the name to you, dear reader), hosted by Szymon Warda and Lukasz Kaluzny. In their latest episode, “JVM Now and Its Future”, they dive deep into the past, present, and future of the JVM with Jarosław Pałka.
The conversation is packed with not only technical insights but also practical advice, such as optimizing garbage collection in large systems and Java’s role in modern cloud-native environments. Listening to it, you’ll understand why Java continues to reign supreme in databases (and Jarosław, being a Senior Staff Performance Engineer at Neo4j, knows what he’s talking about and has the arguments to back it up). They also touch on the impact of licensing changes and the challenges posed by the growing worlds of big data and AI. If you’re looking for a great recap of what’s happening in the JVM world (and want it in Polish), this episode is a must-listen to close out the year... as is the entire podcast.
And with this paragraph, I’d like to bid you farewell for 2024 — Christmas falls at such an odd time this year that our next edition won’t be until January 9th. This will be the longest break from writing I’ve taken since starting this, but I promise to return with renewed energy 🎄.
May your GC always run without stop-the-world pauses, your applications start faster than a native image, and your code compile without errors on the first try, outdoing GitHub Copilot (which, since yesterday, is free for VS Code users). Merry Christmas and a Happy New Year!