I’d like to introduce two new projects in the Spring AI Community GitHub organization: Spring AI Agents and Spring AI Bench. Both focus on agentic coding tools, tools you likely already have in your enterprise.

In 2025, AI coding agents have matured to the point that they deserve serious consideration for enterprise Java development and general SDLC tasks. CLI tools like Claude Code, Google’s Gemini CLI, Amazon Q Developer, and OpenAI’s assistants are examples from the leading AI labs, but there are also smaller startups and open-source options. These agentic coding tools can reason about architecture, grok large codebases, and hold great promise for helping developers ship software faster. They are often used in a “human in the loop” style, but they can also be instructed to execute autonomously until they determine the goal has been completed.

Spring AI Agents defines a lightweight but powerful portable abstraction, the AgentClient, which acts as a consistent interface for invoking autonomous CLI-based agents. This lets developers use the agentic tools they already have while staying flexible enough to avoid lock-in with a single provider. However, the AgentClient is only one piece of the toolbox you need to be effective with agentic tools. Spring AI Agents provides the following abstractions, which, when combined, can produce the most effective results:
- Goals: the objectives that the agent is to complete, such as increasing code coverage, labeling issues, or reviewing and merging pull requests.
- Context: the data and environment the agent reasons over - source files, logs, structured datasets, and documentation.
- Tools: custom capabilities made available to the model to invoke when needed, most often exposed through the Model Context Protocol.
- Judges: evaluators that verify outcomes and assess quality against predefined criteria. These can be deterministic (e.g., a code coverage number) or AI-driven, using the LLM-as-Judge pattern.
- Sandbox: an abstraction of where the agent executes its work safely and reproducibly. Local execution and Docker containers are currently supported.
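To make the abstractions above concrete, here is a minimal sketch of how a goal, context, and judge might fit together around an agent invocation. The types and method names below are illustrative stand-ins, not the actual Spring AI Agents API; a real run would shell out to a CLI tool such as Claude Code rather than return a canned result.

```java
import java.nio.file.Path;
import java.util.function.Predicate;

public class AgentClientSketch {

    // Goal: the objective handed to the agent (hypothetical type).
    record Goal(String description) {}

    // Context: the files and environment the agent reasons over.
    record Context(Path workingDirectory) {}

    // Judge: verifies the outcome; here a simple deterministic check.
    record Judge(Predicate<String> passes) {}

    // Stand-in for invoking a CLI-based agent inside a sandbox. A real
    // implementation would launch the agent process and capture its output.
    static String run(Goal goal, Context context, Judge judge) {
        String result = "completed: " + goal.description();
        return judge.passes().test(result) ? result : "failed";
    }

    public static void main(String[] args) {
        Goal goal = new Goal("raise test coverage to 80%");
        Context ctx = new Context(Path.of("."));
        Judge judge = new Judge(r -> r.startsWith("completed"));
        // prints "completed: raise test coverage to 80%"
        System.out.println(run(goal, ctx, judge));
    }
}
```

The point of the sketch is the separation of concerns: the goal says what to achieve, the context says where, and the judge independently decides whether the outcome is acceptable, so the same verification logic works regardless of which agent provider did the work.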
Resources
Projects:
- Spring AI Bench - GitHub repository
- Spring AI Agents - Documentation
- Spring AI Community - Community portal
- Developer Productivity AI Arena (DPAIA) - Industry initiative for modern agent benchmarking
- SWE-bench - Original benchmark suite
- SWE-bench-Live - Fresh issues benchmark showing 60%→19% drop
- SWE-bench-Java - Multi-language benchmark showing Java ~7-10% vs Python ~75%
- mini-SWE-agent - Minimal agent achieving competitive results
- Model Context Protocol - MCP specification
- BetterBench - Benchmark quality framework
- Devoxx 2025: Spring AI Agents and Spring AI Bench - Mark Pollack’s talk introducing both projects