Spring AI Agents
Spring AI Agents is the pragmatic integration layer for autonomous agents in Java enterprise development.
GitHub Repository: github.com/spring-ai-community/spring-ai-agents
1. What Is an Agent?
"An agent is AI-powered software that accomplishes a goal. Period." — Dharmesh Shah, HubSpot CTO and Agents.ai co-founder
At the core, every agent is software that pursues a goal. The common pattern is an LLM executing a loop: think → act (via tools) → observe → repeat until the goal is achieved. But building effective agents is far harder than this simple description suggests.
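A minimal sketch of that loop, using hypothetical `Llm`, `Tool`, and `Decision` types that belong to no SDK, looks roughly like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the think -> act -> observe loop.
// None of these types come from Spring AI Agents; they only illustrate the shape.
interface Llm {
    Decision decide(String goal, List<String> observations); // "think"
}

interface Tool {
    String run(String input); // "act"
}

record Decision(boolean goalAchieved, String toolName, String toolInput, String answer) {}

class SimpleAgentLoop {

    String pursue(String goal, Llm llm, Map<String, Tool> tools) {
        List<String> observations = new ArrayList<>();
        while (true) {
            Decision decision = llm.decide(goal, observations); // think
            if (decision.goalAchieved()) {
                return decision.answer();                       // goal reached, stop looping
            }
            Tool tool = tools.get(decision.toolName());         // act via a tool
            String result = tool.run(decision.toolInput());
            observations.add(result);                           // observe, then repeat
        }
    }
}
```

Everything hard about real agents (what to keep in context, how to recover from failed tool calls, when to stop) lives inside that deceptively simple loop.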
In Spring AI Agents, agents are modeled around a small set of components, described in section 1.3 below.
1.1. The Paradigm Shift
After hundreds of hours using agentic CLI tools and conversations with engineers at Google, Amazon, and Netflix, a clear pattern emerged: these tools are incredibly effective at their jobs. The real paradigm shift isn’t any single feature—it’s capability moving into the models themselves. As models evolved from completions → function calling → reasoning → planning, and as protocols like MCP (Model Context Protocol—named that for a reason) standardized and enriched tool and context capabilities, the scaffolding we built to compensate for weaker models became unnecessary. We’ve reached a new tipping point with these tools.
Before reasoning models, we built harnesses and scaffolding—complex multi-step systems to coax capabilities from weaker models. We coded workflows step-by-step because models couldn’t plan. Reasoning models now handle what used to require elaborate client-side engineering: planning which steps to take and in what order, capabilities that traditionally belonged to workflow engines in application code.
The shift: from imperative (code every workflow step) to declarative (describe the goal and let the model plan the steps). What remains critical: context engineering and tool design.
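To make the contrast concrete, here is a hypothetical imperative workflow of the kind described above, with every step, retry, and ordering decision hand-coded in the application (the abstract helper methods stand in for whatever model and build APIs you would use). The declarative equivalent, a single goal handed to an agent, appears in the AgentClient example in section 1.3.

```java
// Hypothetical imperative workflow: the application plans every step itself.
// The abstract helpers are placeholders; the point is the shape of the code, not an API.
abstract class ImperativeCoverageWorkflow {

    void run() {
        // Step 1: the application decides to ask the model for untested classes.
        for (String className : askModel("List classes with no tests, one per line").lines().toList()) {
            // Step 2: the application decides to generate one test class at a time.
            String test = askModel("Write a JUnit 5 test for " + className);
            writeFile("src/test/java/" + className + "Test.java", test);
            // Step 3: the application decides when to build and how to react to failures.
            if (!buildPasses()) {
                String fixed = askModel("The build failed; fix this test:\n" + test);
                writeFile("src/test/java/" + className + "Test.java", fixed);
            }
        }
        // Retries, ordering, and error recovery all live here, in application code.
    }

    abstract String askModel(String prompt);

    abstract void writeFile(String path, String content);

    abstract boolean buildPasses();
}
```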
"Before the reasoning models emerged, there was all of this work that went into engineering these agentic systems that made a lot of calls to GPT-4… to get reasoning behavior. And then it turns out… we just created reasoning models and you don’t need this complex behavior. In fact, in many ways it makes it worse." "There are a lot of things that people are building right now that will eventually be washed away by scale." — Noam Brown, OpenAI Research Lead (Latent Space podcast) |
Agentic Search > Semantic Search "Semantic search is usually faster than agentic search, but less accurate, more difficult to maintain, and less transparent… we suggest starting with agentic search, and only adding semantic search if you need faster results." — Anthropic, Building Agents with the Claude Agent SDK "I actually found that lm.txt with good descriptions… just that passed to the code agent, with a simple tool just to grab files, is extremely effective… I actually personally don’t do vector store indexing." — Lance Martin, LangChain (Latent Space podcast) This upends the traditional RAG pattern: simple file-based search with agent tools often outperforms complex vector indexing. |
Spring AI Agents leans into this direction: trust the model to plan and execute, validate through benchmarking.
1.2. The Journey: From Building to Using
I started wrapping the Claude Code CLI myself. Then I discovered Anthropic had created the Claude Code SDK (now Claude Agent SDK)—a Python wrapper providing a clean interface to their CLI tool. Similar SDKs emerged from Google (Gemini CLI), Amazon (Q Developer), and others.
As Anthropic explains in their blog:
The Claude Agent SDK enables developers to build powerful, flexible agents by giving Claude access to a computer where it can write files, run commands, and iterate on its work.
The realization: You can build custom agents with Spring AI’s @Tool annotations and MCP support. Mini-swe-agent proves a simple "LLM in a loop" works. And we will build custom agents for domain-specific needs.
But why reinvent? Building effective agents is hard. You’re solving problems that Anthropic, Google, and OpenAI invest heavily in: context management, error recovery, planning, tool selection, performance optimization. Why not leverage that R&D?
1.3. The Spring AI Agents Approach
The pattern looked familiar: Just like database access before JDBC—many powerful tools doing similar things, but all slightly different. Spring AI solved this for LLM completions with ChatClient, providing portability and a higher-level developer experience.
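For reference, a minimal ChatClient usage looks like this (a sketch of the core Spring AI API; the service class and method names are ours):

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

// Minimal sketch of the ChatClient pattern that AgentClient mirrors.
@Service
class SummaryService {

    private final ChatClient chatClient;

    SummaryService(ChatClient.Builder builder) { // builder auto-configured by Spring Boot
        this.chatClient = builder.build();
    }

    String summarize(String text) {
        return chatClient.prompt()
                .user("Summarize in two sentences: " + text)
                .call()
                .content();
    }
}
```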
Spring AI Agents applies the same principle to autonomous agents:
- Agent SDK portability layer - Java wrappers for leading agentic CLI tools: Claude Agent SDK, Gemini CLI Agent SDK, Amp CLI SDK, OpenAI Codex CLI SDK, Amazon Q Developer CLI SDK, mini-swe-agent, with planned support for Goose and GitHub Copilot Agent
- Familiar Spring AI patterns - AgentClient API following ChatClient design
- Advisor pattern - Extend agent behavior with context and judges as advisors, just like ChatClient
- Agent sandbox - Isolated execution
Example - Declarative agent execution:
CoverageJudge judge = new CoverageJudge(80.0); (1)
AgentClientResponse response = agentClient (2)
.goal("Increase JaCoCo test coverage to 80%") (3)
.workingDirectory(projectRoot) (4)
.advisors(JudgeAdvisor.builder().judge(judge).build()) (5)
.run(); (6)
// Real results: 0% → 71.4% coverage in 6 minutes
1. Judge - Automated verification of coverage target
2. Start with AgentClient instance (auto-configured by Spring Boot)
3. Goal - What you want to accomplish (the "what", not the "how")
4. Working directory - Where the agent executes (sandbox isolation)
5. Verification - JudgeAdvisor verifies 80% coverage achieved
6. Execute - Run autonomously until goal achieved
Declarative approach: You describe the goal and provide context. The LLM plans the workflow, decides which tools to use, and adapts when things go wrong. No coding workflows, no predefined steps—just the goal and context.
The code coverage agent increased test coverage from 0% to 71.4% on Spring’s gs-rest-service tutorial. Claude Code followed all Spring WebMVC best practices (@WebMvcTest, jsonPath(), AssertJ) while Gemini achieved the same coverage but used slower patterns (@SpringBootTest). Same coverage, different quality—model choice matters for enterprise standards.
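In a Spring Boot application, the AgentClient from the example above is typically injected rather than constructed by hand. A sketch, reusing only calls shown in the example (the service class, method name, and the assumption that workingDirectory accepts a java.nio.file.Path are ours; imports for the Spring AI Agents types are omitted since they depend on your setup):

```java
import java.nio.file.Path;

import org.springframework.stereotype.Service;

// Sketch: wiring the auto-configured AgentClient (callout 2) into application code.
@Service
class CoverageService {

    private final AgentClient agentClient;

    CoverageService(AgentClient agentClient) {
        this.agentClient = agentClient;
    }

    AgentClientResponse raiseCoverage(Path projectRoot) {
        return agentClient
                .goal("Increase JaCoCo test coverage to 80%")
                .workingDirectory(projectRoot)
                .advisors(JudgeAdvisor.builder().judge(new CoverageJudge(80.0)).build())
                .run();
    }
}
```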
Or run agents directly with JBang - no build required:
jbang agents@springai coverage target_coverage=80
Zero setup - the agent runs on your local codebase, pulls context as needed, and achieves the goal. Once you see it working, tweak the configuration or create your own agents.[1]
See the Getting Started guide for complete examples.
2. Why CLI Agents?
Spring AI Agents focuses specifically on autonomous CLI agents - agents that execute goals by directly interacting with your computer through command-line interfaces.
CLI agents are uniquely effective because they:
- Manage context through the file system - Write intermediate state to files, read when needed, avoiding context window limitations (see Context Engineering)
- Execute bash commands - Run builds, tests, searches—anything you can type in a terminal
- Iterate autonomously - Keep working until the goal is achieved, no human intervention required
Human-in-the-Loop vs Autonomous: Chatbots like ChatGPT and code completion tools like Copilot excel at exploration and pair programming. Autonomous CLI agents excel at executing well-defined goals end-to-end without human intervention. Different tools for different needs.
The space is evolving. Both paths coexist: use agentic CLI tools (like Claude Agent SDK, Gemini CLI, Amp) for general development tasks, or build custom agents with Spring AI’s @Tool/MCP for specialized needs. Leading companies invest heavily in context engineering, planning strategies, and continuous model improvements—Spring AI Agents lets you leverage that R&D while maintaining flexibility to build custom solutions when appropriate.
Spring AI Agents makes autonomous agents as easy to use in Spring Boot as ChatClient is for conversational AI.
3. Key Features
- Zero-Setup Quick Start - Try agents via JBang catalog without cloning or building
- ChatClient-style API - Same fluent patterns Spring developers already know
- JBang Agent Runner - Primary developer entry point for trying agents locally with LocalSandbox
- Multiple agent providers - Claude Code, Gemini CLI, Amp, and SWE Agent support (more to come!)
- Fluent API design - Clean, intuitive interface following Spring patterns
- Spring Boot ready - Auto-configuration and dependency injection support
- Production essentials - Built-in error handling, timeouts, and metadata
- Evaluation-first design - Judge API for deterministic and AI-powered verification (see the sketch after this list)
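To give a feel for the evaluation-first idea, the sketch below shows what a deterministic check conceptually does: inspect the agent's workspace and decide whether the goal was really met. The interface shape and class names are hypothetical and do not reflect the actual Judge API; see the AgentClient example in section 1.3 for the real CoverageJudge and JudgeAdvisor usage.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical shape only -- the real Judge API in Spring AI Agents may differ.
// Conceptually, a deterministic judge is a check over the agent's workspace.
interface HypotheticalJudge {
    boolean pass(Path workspace);
}

// Example check: a JaCoCo report must exist before coverage can even be compared
// against the target (the actual threshold comparison is left as a comment).
class HypotheticalCoverageCheck implements HypotheticalJudge {

    private final double targetPercent;

    HypotheticalCoverageCheck(double targetPercent) {
        this.targetPercent = targetPercent;
    }

    @Override
    public boolean pass(Path workspace) {
        Path report = workspace.resolve("target/site/jacoco/jacoco.csv");
        // A real check would parse the report and compare measured coverage to targetPercent.
        return Files.exists(report);
    }
}
```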
4. Better Benchmarks for Java
How do you know if your agent is effective?
The agent ecosystem has a Python bias. Most benchmarks, research, and tooling assume Python workflows. But enterprise software development is multi-language, and Java remains the backbone of mission-critical systems.
4.1. The Benchmark Problem
- SWE-bench: Python-centric, curated dataset with inflated scores
- SWE-bench-Live: More realistic fresh issues—scores drop significantly
- Multi-SWE-bench & SWE-PolyBench (2025): Added Java, revealed Python bias—Java agents score lower not because they’re worse, but because benchmarks don’t reflect Java workflows
For a detailed analysis of these benchmarking issues, see the Spring AI Bench documentation.
4.2. Spring AI Bench
We’re building Spring AI Bench—an open-source benchmark suite for Java that evaluates agents on goal-directed, enterprise workflows, following Stanford’s BetterBench principles for reproducibility and contamination resistance.
Spring AI Bench and Spring AI Agents work hand-in-hand: Spring AI Agents provides the integration layer, making it easy to run different agents (Claude, Gemini, Amp, custom solutions). Spring AI Bench provides the measurement framework, evaluating agents across multiple dimensions.
Philosophy: Let the best agent per use case win. Benchmark ALL approaches—annotation-based tools, CLI agents, custom solutions—and measure what actually matters.
As Dharmesh Shah frames it on the Latent Space podcast, evaluating agents is like hiring for a job: effectiveness depends on your specific constraints and goals. Spring AI Bench measures across multiple axes:
Objective metrics:

- Success rate - Can it achieve the goal?
- Cost - Token usage, API costs
- Speed - Execution time, latency
- Reliability - Consistency across runs

Qualitative factors:

- Quality vs. cost tradeoff - Is the premium model worth it for this task?
- Time-to-value - How quickly does it deliver results?
- Workflow fit - Does it integrate cleanly into your process?
Different scenarios optimize for different combinations:
- Fastest at least cost - Routine tasks, CI/CD automation
- Highest quality regardless of cost - Critical migrations, security audits
- Balanced tradeoffs - Most development tasks
We’ll learn which agent wins for which scenario. That’s the point of benchmarking.
5. Agent Providers
Spring AI Agents provides Java integration for leading autonomous agentic CLI tools:
Provider | Status | Description
---|---|---
Claude Agent SDK | ✅ Available | Agent SDK for Anthropic’s autonomous coding agent. Renamed from Claude Code SDK (Sept 2025) to reflect broader applications beyond coding.
Gemini CLI | ✅ Available | Agent SDK for Google’s command-line coding agent with multimodal capabilities.
Amp CLI | ✅ Available | Agent SDK for Sourcegraph’s autonomous coding agent. Full-featured CLI tool for code generation, refactoring, and debugging.
mini-swe-agent | ✅ Available | Agent SDK for a lightweight, 100-line autonomous agent for benchmarking. Simpler alternative to the original SWE-agent (thousands of lines of Python).
Goose | 🚧 Planned | Agent SDK for Block’s open-source extensible AI agent. Runs locally, automates engineering tasks from start to finish, builds entire projects autonomously.
GitHub Copilot Agent | 🚧 Planned | Agent SDK for GitHub’s autonomous coding agent. Assign issues to Copilot and it creates PRs autonomously in a GitHub Actions environment.
Amazon Q Developer CLI | ✅ Available | Agent SDK for AWS’s autonomous /dev agent. Multi-file implementation with natural language, autonomous planning and execution across codebases.
OpenAI Codex CLI | ✅ Available | Agent SDK for OpenAI’s GPT-5-Codex optimized for agentic coding. Handles both quick sessions and long autonomous tasks.
6. Requirements
- Java 17 or higher
- Maven 3.6.3 or higher
- Agent CLI tools installed (Claude, Gemini, Amp, etc.)
- Valid API keys for your chosen providers
7. Getting Started
Get started using Spring AI Agents by following our Getting Started guide.
8. Documentation
- JBang Agent Runner - Primary developer entry point for trying agents locally
- AgentClient API - Learn the core API for running autonomous tasks
- AgentClient vs ChatClient - See how AgentClient follows ChatClient patterns
- Sample Agents - Real-world agent examples and patterns
9. Contributing
We welcome contributions to Spring AI Agents! Please see our Contribution Guidelines for more information on how to get involved.
10. Resources
- Spring AI Agents
  - Documentation: This site
- Spring AI Bench
  - Documentation: spring-ai-bench documentation