
AI in monitoring: enhancing alert management

Author: Gergő Al-Nuwaihi

Date: 2026.03.05

AI is increasingly used in operations to reduce monitoring noise, accelerate incident triage, and support faster decision-making during outages.

Why central monitoring matters

Central monitoring is essential because it:

  • provides a single view of systems, devices and services,
  • enables real-time operational awareness,
  • detects faults early and triggers alerts,
  • supports long-term proactive operations.

It also enables service and SLA monitoring, trend analysis, capacity planning and integrations such as CMDB, ticketing systems and log management.

Why Zabbix is widely used for infrastructure monitoring

Zabbix is a strong choice for infrastructure monitoring for several reasons:

  • template-based deployment,
  • dashboards and visualization,
  • scalable architecture (including proxies),
  • platform independence,
  • agent and agentless monitoring,
  • open-source code with enterprise LTS versions.

Monitoring goals remain the same: status visibility, anomaly detection, alert prioritization and performance monitoring. The main challenge in large environments is the volume of alerts teams must process.


Using AI to accelerate alert handling

Operational teams typically need two capabilities:

  1. Rapid understanding of individual alerts and suggested next actions
  2. Prioritization among large numbers of simultaneous alerts


1) Alert Assist: “What does this alert mean?”

Alert Assist is a Zabbix UI module that uses a language model to explain a selected problem and suggest actions in a concise format.

The structure of the response deliberately follows operational logic:

  • likely causes (e.g., outage, network issue, firewall rule, misconfiguration, overload),
  • verification steps with example commands,
  • recommended troubleshooting order,
  • prevention suggestions.

This shortens investigation time and supports onboarding or cross-team collaboration.
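
To make the response structure concrete, here is a minimal sketch of how a prompt for such a module might be assembled. It is not Alert Assist's actual implementation: the `Problem` fields, the `SECTIONS` list and the `build_alert_prompt` function are all hypothetical names chosen for illustration, and the call to the language model itself is left out.

```python
from dataclasses import dataclass

# Hypothetical section list mirroring the four-part answer structure above.
SECTIONS = (
    "Likely causes",
    "Verification steps (with example commands)",
    "Recommended troubleshooting order",
    "Prevention suggestions",
)


@dataclass
class Problem:
    """Illustrative subset of the fields a monitoring problem might carry."""
    host: str
    name: str
    severity: str
    duration_minutes: int


def build_alert_prompt(problem: Problem, max_words: int = 200) -> str:
    """Return a prompt asking for a short, operator-focused explanation."""
    numbered = "\n".join(
        f"{i}. {title}" for i, title in enumerate(SECTIONS, start=1)
    )
    return (
        "You are assisting an operations engineer.\n"
        f"Explain the following monitoring problem in at most {max_words} words.\n"
        f"Host: {problem.host}\n"
        f"Problem: {problem.name}\n"
        f"Severity: {problem.severity}\n"
        f"Active for: {problem.duration_minutes} minutes\n"
        "Answer using exactly these sections:\n"
        f"{numbered}\n"
    )


if __name__ == "__main__":
    example = Problem("web-01", "Zabbix agent is not available", "High", 12)
    print(build_alert_prompt(example))
```

Fixing the section list in the prompt is one way to keep answers short and comparable between alerts; the word limit is an assumed parameter, not a documented setting.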


2) Priority Manager: focusing on the critical alerts

In larger environments, monitoring systems may generate dozens or hundreds of alerts simultaneously.

Priority Manager evaluates all active problems together and ranks the most critical ones using a language model (optionally enriched with internal knowledge).

Typical capabilities include:

  • configurable Top N critical alert list,
  • impact/affected environment highlighted per item,
  • plain-language action suggestions,
  • history and response-length filtering,
  • grouping of similar problems instead of purely host-based views.

The result is faster identification of the most urgent incidents.
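
The grouping and Top N ideas can be sketched without any language model. The example below is only an illustration under stated assumptions: the `Problem` tuple, `group_similar` and `top_n` are hypothetical names, the severity scale follows the usual Zabbix numbering from 0 (not classified) to 5 (disaster), and the ordering is a plain heuristic standing in for the model-based ranking described above.

```python
from collections import defaultdict
from typing import NamedTuple


class Problem(NamedTuple):
    host: str
    name: str
    severity: int  # assumed scale: 0 (not classified) to 5 (disaster)


def group_similar(problems):
    """Group problems by name so one fault on many hosts becomes one entry."""
    groups = defaultdict(list)
    for p in problems:
        groups[p.name].append(p)
    return groups


def top_n(problems, n=3):
    """Return up to n (name, max_severity, hosts) tuples, most urgent first.

    Ordering is a simple heuristic (highest severity, then number of
    affected hosts), not the language-model ranking from the article.
    """
    summary = [
        (name, max(p.severity for p in items), sorted(p.host for p in items))
        for name, items in group_similar(problems).items()
    ]
    summary.sort(key=lambda g: (-g[1], -len(g[2]), g[0]))
    return summary[:n]


if __name__ == "__main__":
    active = [
        Problem("db-01", "Disk space low", 3),
        Problem("web-01", "Host unreachable", 4),
        Problem("web-02", "Host unreachable", 4),
        Problem("app-01", "High CPU load", 2),
    ]
    for name, severity, hosts in top_n(active, n=2):
        print(severity, name, hosts)
```

In a real deployment, a pre-grouping step like this could also shrink the list of problems before it is handed to a model, which keeps prompts smaller.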


Architecture options and deployment models

Different organizations require different deployment models, so both options can be supported:

  • On-premises LLMs running on local GPU infrastructure.
  • Cloud LLM services through major provider APIs.

Security controls may include:

  • anonymization of sensitive identifiers such as hostnames or IP addresses,
  • structured prompting designed for short, operator-focused responses.
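
As an illustration of the anonymization step, the following sketch replaces known hostnames and IPv4 addresses with stable placeholders before text leaves the monitoring environment, and restores them in the model's answer. The `Anonymizer` class and the placeholder format are assumptions made for this example; the IPv4 pattern is deliberately simplified and IPv6 or fully qualified domain names are not handled.

```python
import re

# IPv4 only; a simplified pattern that does not validate octet ranges.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


class Anonymizer:
    """Replace known hostnames and IPv4 addresses with stable placeholders."""

    def __init__(self, hostnames):
        self._forward = {}  # real value -> placeholder
        self._reverse = {}  # placeholder -> real value
        # Longest names first so "web-01-backup" wins over "web-01".
        names = sorted(set(hostnames), key=len, reverse=True)
        self._host_re = None
        if names:
            alternatives = "|".join(re.escape(n) for n in names)
            # Lookarounds avoid matching inside longer identifiers.
            self._host_re = re.compile(rf"(?<![\w-])(?:{alternatives})(?![\w-])")

    def _placeholder(self, value, prefix):
        """Return the placeholder for value, creating it on first use."""
        if value not in self._forward:
            token = f"{prefix}_{len(self._forward) + 1}"
            self._forward[value] = token
            self._reverse[token] = value
        return self._forward[value]

    def anonymize(self, text):
        """Mask hostnames first, then any remaining IPv4 addresses."""
        if self._host_re is not None:
            text = self._host_re.sub(
                lambda m: self._placeholder(m.group(0), "HOST"), text
            )
        return IPV4_RE.sub(lambda m: self._placeholder(m.group(0), "IP"), text)

    def restore(self, text):
        """Put the real values back, longest placeholders first."""
        for token in sorted(self._reverse, key=len, reverse=True):
            text = text.replace(token, self._reverse[token])
        return text
```

Because the mapping is kept in memory, the same host always receives the same placeholder within one session, so the model can still reason about which alerts belong together.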

Prioritization workflows can also incorporate internal operational knowledge and rules.

Practical operational advantages

AI-assisted monitoring can deliver several practical improvements:

  • Faster incident response by highlighting the most critical issues
  • Reduced operator workload during first-level investigation
  • Stronger L1/helpdesk capability before escalation
  • Controlled AI usage within an approved operational framework


Future outlook: agentic operations

Future developments may introduce agentic capabilities, where AI can perform limited automated actions or reduce alert noise based on historical patterns.
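
One simple form of pattern-based noise reduction is spotting triggers that repeatedly fire and clear within minutes. The sketch below is a hypothetical illustration only: the `noisy_triggers` function, its thresholds and the shape of the history data are assumptions, not part of any existing product.

```python
from collections import Counter
from datetime import timedelta


def noisy_triggers(history, max_duration=timedelta(minutes=5), min_occurrences=3):
    """Return names of triggers that repeatedly fired and cleared quickly.

    history: iterable of (trigger_name, duration) pairs for resolved problems.
    """
    short_lived = Counter(
        name for name, duration in history if duration <= max_duration
    )
    return {name for name, count in short_lived.items() if count >= min_occurrences}
```

Triggers flagged this way would be candidates for review or suppression rather than automatic removal.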

Another direction is interacting with monitoring systems through natural language, for example via a Model Context Protocol (MCP) server, potentially improving usability and shortening the learning curve for complex monitoring tools.
