Web Application¶

The Web Application tab reads the MeterRegistry directly; its values are not historized through SystemMetricsCollector. The Active LLM ops gauge tracks in-flight ChatClient / Advisor / VectorStore operations as the agent runs.

Purpose - covers the servlet container, HTTP traffic, Logback level counts, and live in-flight Spring AI operations. It differs from Host because these metrics are operational traffic signals (rate, in-flight counts, status distribution), not resource consumption.

When to look here¶

"Is something blocking HTTP threads?" - HTTP in-flight (server) gauge climbing without proportional throughput.
"How many concurrent provider calls are in flight right now?" - HTTP in-flight (client) - outbound HTTP to model providers.
"How many LLM operations are running this second?" - Active LLM ops gauge (ChatClient + Advisor + VectorStore active LongTaskTimers).
"Are sessions piling up?" - Active sessions + Longest session alive.
"Did we hit a wave of 4xx / 5xx responses?" - HTTP requests by status chart.
"Is the WARN/ERROR rate climbing?" - Warn / Error event KPIs + Logback events chart.

Data source¶

Direct MeterRegistry read. There is no parallel pipeline: values are live, instantaneous readings and are not historized through SystemMetricsCollector.

Controls¶

Web Application uses the Observability global refresh interval and ignores the time window, because gauges are live and counters are lifetime-cumulative. There are no tab-specific controls.
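
Because counters are lifetime-cumulative, a per-interval figure such as throughput has to be computed as the difference between two refreshes, while a gauge can be compared directly. The sketch below combines the two to express the "in-flight climbing without proportional throughput" signal. It is an illustration under assumptions, not the tab's actual logic: it assumes equal refresh intervals, and the class `InFlightCheck`, its method, and its parameters are hypothetical names.

```java
// Sketch only: class, method, and parameter names are hypothetical.
final class InFlightCheck {

    /**
     * Compares two consecutive refresh intervals of equal length.
     * The request arguments are lifetime-cumulative counts, so the
     * difference between neighbours is the throughput of that interval.
     * The in-flight arguments are gauge readings and are compared as-is.
     */
    static boolean climbingWithoutThroughput(int inFlightT1, int inFlightT2,
                                             long requestsT0, long requestsT1, long requestsT2) {
        long earlierThroughput = requestsT1 - requestsT0;
        long laterThroughput = requestsT2 - requestsT1;
        boolean inFlightClimbing = inFlightT2 > inFlightT1;
        boolean throughputFlatOrFalling = laterThroughput <= earlierThroughput;
        return inFlightClimbing && throughputFlatOrFalling;
    }

    public static void main(String[] args) {
        // In-flight rises 8 -> 40 while requests per interval drop 500 -> 120.
        System.out.println(climbingWithoutThroughput(8, 40, 1_000, 1_500, 1_620)); // true
        // In-flight rises 8 -> 16 and throughput rises with it (500 -> 1,000).
        System.out.println(climbingWithoutThroughput(8, 16, 1_000, 1_500, 2_500)); // false
    }
}
```

A real check would probably smooth over several refreshes before raising a flag, since a single noisy interval can trip this comparison.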

KPI cards (thirteen)¶

Card	Shows	Source
HTTP in-flight (server)	Servlet requests currently being handled	`tomcat.threads.busy` or LongTaskTimer active count
HTTP in-flight (client)	Outbound HTTP requests in flight (to model providers, MCP servers)	`http.client.requests` active LongTaskTimer
Active LLM ops	In-flight Spring AI operations	Active LongTaskTimers for `chatClient`, `advisor`, `vectorStore`
Active sessions	Currently active Tomcat sessions	`tomcat.sessions.active.current`
Longest session alive	Longest-lived active session age	`tomcat.sessions.alive.max`
Sessions created (lifetime)	Cumulative session create count	`tomcat.sessions.created`
Sessions expired	Cumulative session expiry count	`tomcat.sessions.expired`
Sessions rejected	Cumulative session rejection count	`tomcat.sessions.rejected`
HTTP requests (lifetime)	Cumulative HTTP request count	`http.server.requests` count
Logback events (lifetime)	Cumulative logback events across all levels	`logback.events` count
Error events	Cumulative ERROR-level logback events	`logback.events{level=error}`
Warn events	Cumulative WARN-level logback events	`logback.events{level=warn}`
Error/Warn rate	Combined ERROR+WARN rate per minute	Derived from above counters
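
The Error/Warn rate card is described only as derived from the ERROR and WARN counters. One plausible derivation, sketched here under assumptions, divides the growth of the combined counter between two samples by the elapsed time. The class `ErrorWarnRate`, the method `perMinute`, and the sample values are hypothetical; the product may compute the rate differently.

```java
// Sketch only: names and sample values are hypothetical.
final class ErrorWarnRate {

    /**
     * Events per minute between two samples of a lifetime-cumulative counter.
     * A negative delta (for example after a restart reset the counter) is
     * clamped to zero instead of being reported as a negative rate.
     */
    static double perMinute(long earlierCount, long laterCount, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        long delta = Math.max(0L, laterCount - earlierCount);
        return delta * 60_000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        long earlier = 12 + 30; // ERROR + WARN at the first sample
        long later = 15 + 39;   // ERROR + WARN 30 seconds later
        System.out.println(perMinute(earlier, later, 30_000)); // 24.0
    }
}
```

Here 12 new events in 30 seconds scale to 24 per minute. Shorter sampling intervals make the figure more responsive but also noisier.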

Charts (four)¶

Chart	Type	Reading
HTTP requests by status	Horizontal bar (2xx / 3xx / 4xx / 5xx, lifetime)	Sudden 4xx spike → bad request pattern; 5xx → server-side regression
Outbound HTTP latency by host	Horizontal bar (ms by host)	The provider hosts your agent talks to most - useful for diagnosing slow providers
Logback events	Horizontal bar by level (lifetime)	Disproportionate ERROR/WARN → check Logs tab for context
Active LLM operations	Horizontal bar by operation type (ChatClient / Advisor / VectorStore)	Long-running operation types indicate where the agent is currently blocked
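
The HTTP requests by status chart groups individual status codes into classes. A minimal sketch of that grouping follows, assuming per-status lifetime counts are already available as a map. The class `StatusClassBuckets`, its methods, and the sample counts are hypothetical and only illustrate the bucketing.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch only: names and sample counts are hypothetical.
final class StatusClassBuckets {

    /** Maps a status code to its class label, e.g. 404 -> "4xx". */
    static String statusClass(int status) {
        if (status < 100 || status > 599) {
            throw new IllegalArgumentException("not an HTTP status code: " + status);
        }
        return (status / 100) + "xx";
    }

    /** Sums lifetime counts per status class; TreeMap keeps the labels sorted. */
    static Map<String, Long> bucket(Map<Integer, Long> countsByStatus) {
        Map<String, Long> buckets = new TreeMap<>();
        for (Map.Entry<Integer, Long> entry : countsByStatus.entrySet()) {
            buckets.merge(statusClass(entry.getKey()), entry.getValue(), Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        Map<Integer, Long> counts = new TreeMap<>();
        counts.put(200, 940L);
        counts.put(204, 10L);
        counts.put(302, 5L);
        counts.put(404, 30L);
        counts.put(429, 12L);
        counts.put(503, 3L);
        System.out.println(bucket(counts)); // {2xx=950, 3xx=5, 4xx=42, 5xx=3}
    }
}
```

Because the counts are lifetime totals, a sudden 4xx or 5xx wave shows up as a change in bar length between refreshes rather than as a spike on a time axis.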

Cross-references¶

Host - sibling tab for resource consumption (heap / GC / threads / disk)
Logs - drill into individual log lines when Warn / Error KPI climbs
Observability Architecture → External export → Metrics - same MeterRegistry is scraped by Prometheus at /actuator/prometheus
