Univariate vs Multivariate Anomaly Detection
Explore the differences between single and multiple variable anomaly detection approaches.
Overview
This guide demonstrates anomaly detection across multiple time series using univariate and multivariate methods. You will learn:
• How to detect anomalies in each time series independently (univariate).
• How to detect anomalies across multiple correlated time series (multivariate).
Both univariate and multivariate methods rely on the Nixtla API for anomaly detection. The main difference is how anomalies are identified: individually per time series vs. collectively across multiple series at the same timestamp.
Setup
1. Install and Import Dependencies
If you haven’t already, install the nixtla Python package (pip install nixtla) and import your dependencies.
2. Connect to the Nixtla API
Create a NixtlaClient instance. Replace 'my_api_key_provided_by_nixtla' with your actual API key.
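A minimal sketch of this step, assuming the nixtla package is installed. The helper names resolve_api_key and make_client, and the fallback to an environment variable called NIXTLA_API_KEY, are choices made for this illustration rather than part of the guide. The import sits inside the function so the snippet still loads where the package is missing.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Return the explicit key, else the NIXTLA_API_KEY environment variable."""
    key = api_key or os.environ.get("NIXTLA_API_KEY")
    if not key:
        raise ValueError("Pass api_key or set the NIXTLA_API_KEY environment variable.")
    return key


def make_client(api_key: Optional[str] = None):
    """Create a NixtlaClient (hypothetical convenience wrapper)."""
    from nixtla import NixtlaClient  # requires: pip install nixtla

    return NixtlaClient(api_key=resolve_api_key(api_key))


# Usage (placeholder key, replace with your own):
# nixtla_client = make_client("my_api_key_provided_by_nixtla")
```

Keeping the key in an environment variable avoids committing it to a notebook or repository.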
Use an Azure AI Endpoint
To use an Azure AI endpoint, set the base_url argument explicitly:
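For illustration, the call might look like the sketch below. Only the base_url and api_key arguments come from the text above; the function name make_azure_client and the placeholder strings are hypothetical.

```python
def make_azure_client(base_url: str, api_key: str):
    """Create a NixtlaClient that targets an Azure AI endpoint."""
    from nixtla import NixtlaClient  # requires: pip install nixtla

    # base_url overrides the default API address; api_key is the key
    # issued for that endpoint.
    return NixtlaClient(base_url=base_url, api_key=api_key)


# Usage (placeholders, replace with your own values):
# nixtla_client = make_azure_client("your-azure-ai-endpoint-url", "your-api-key")
```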
1. Dataset
We use an example from the Server Machine Dataset (SMD), a benchmark for anomaly detection across correlated server-performance metrics (CPU, memory, disk I/O, network throughput, etc.).
File Used: SMD_test.csv
Data Size: 38 unique time series
Frequency: Hourly (freq='h')
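Before running detection, it can help to confirm that a file matches this description. The sketch below uses only the standard library. The column names unique_id and ts are assumptions about the CSV layout, the function name summarize_long_csv is hypothetical, and the inline sample is invented toy data, not SMD.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timedelta


def summarize_long_csv(handle, id_col="unique_id", time_col="ts"):
    """Count distinct series and check that each one is spaced one hour apart."""
    stamps = defaultdict(list)
    for row in csv.DictReader(handle):
        stamps[row[id_col]].append(datetime.fromisoformat(row[time_col]))

    hourly = True
    for series in stamps.values():
        ordered = sorted(series)
        for earlier, later in zip(ordered, ordered[1:]):
            if later - earlier != timedelta(hours=1):
                hourly = False
    return {"n_series": len(stamps), "hourly": hourly}


# Toy long-format sample: one row per (series, timestamp) pair.
sample = io.StringIO(
    "unique_id,ts,y\n"
    "series_a,2024-01-01 00:00:00,0.10\n"
    "series_a,2024-01-01 01:00:00,0.12\n"
    "series_b,2024-01-01 00:00:00,0.50\n"
    "series_b,2024-01-01 01:00:00,0.48\n"
)
summary = summarize_long_csv(sample)
print(summary)  # {'n_series': 2, 'hourly': True}
```

For the real file, pass open("SMD_test.csv", newline="") instead of the sample; if the layout assumption holds, n_series should come back as 38.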
2. Univariate vs. Multivariate Methods
Definition: Univariate anomaly detection analyzes each time series in isolation. It flags anomalies based on each series’ individual deviation from its expected behavior.
Pros: Efficient for individual metrics or when correlations between metrics are not relevant.
Cons: May miss large-scale, system-wide anomalies that are only apparent when multiple series deviate simultaneously.
Definition: Multivariate anomaly detection considers all time series collectively. A time step is flagged as anomalous if the aggregate deviation across all series at that time exceeds a threshold.
Pros: Captures systemic or correlated anomalies that might be missed when analyzing each series in isolation.
Cons: Slightly higher complexity and computational overhead. May require careful threshold tuning.
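To make the contrast concrete, here is a toy sketch. It is not Nixtla's implementation: it assumes each value is already a standardized deviation (a z-score) from expected behavior, combines series with an L2 norm, and uses thresholds of 3.0 and 4.0 picked purely for illustration. The numbers are invented.

```python
import math

# Toy standardized deviations (z-scores) per series and time step.
# Invented for illustration; these are not SMD values.
scores = {
    "cpu":     [0.2, -0.5, 2.6, 0.1, 3.4],
    "memory":  [-0.3, 0.4, 2.7, -0.2, 0.3],
    "disk_io": [0.1, -0.1, 2.5, 0.5, -0.4],
}


def univariate_flags(scores, threshold=3.0):
    """Flag (series, step) pairs whose own |z| exceeds the threshold."""
    return [
        (name, t)
        for name, zs in scores.items()
        for t, z in enumerate(zs)
        if abs(z) > threshold
    ]


def multivariate_flags(scores, threshold=4.0):
    """Flag steps whose combined deviation (L2 norm across series) exceeds the threshold."""
    steps = zip(*scores.values())  # one tuple of z-scores per time step
    return [
        t
        for t, zs in enumerate(steps)
        if math.sqrt(sum(z * z for z in zs)) > threshold
    ]


print(univariate_flags(scores))    # [('cpu', 4)]
print(multivariate_flags(scores))  # [2]
```

At step 2 every series deviates moderately, so no single series crosses its own threshold, yet the combined deviation does. At step 4 only cpu spikes, which the per-series check catches while the aggregate stays under its threshold. This is the trade-off described above, and it also shows why the multivariate threshold needs tuning.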
Summary
Univariate:
Best for detecting anomalies in a single metric or uncorrelated metrics. Low computational overhead, but may overlook cross-series patterns.
Multivariate:
Considers correlations across metrics, capturing system-wide issues. More complex and computationally intensive than univariate methods.
Both detection approaches use Nixtla’s online anomaly detection method. Choose the strategy that best fits your use case and data characteristics.
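A hedged sketch of how that choice might look in code. It assumes client is a NixtlaClient and df is a long-format DataFrame; the wrapper name detect_online, the ts column name, and the numeric settings (h, level, detection_size) are illustrative assumptions, not values prescribed by this guide.

```python
def detect_online(client, df, threshold_method="univariate"):
    """Run online anomaly detection with the chosen thresholding strategy."""
    if threshold_method not in ("univariate", "multivariate"):
        raise ValueError(f"unknown threshold_method: {threshold_method!r}")
    return client.detect_anomalies_online(
        df,
        time_col="ts",        # assumed timestamp column name
        freq="h",             # hourly data, as in the dataset section
        h=24,                 # forecast horizon (illustrative)
        level=95,             # confidence level for the bounds (illustrative)
        detection_size=475,   # trailing steps to scan (illustrative)
        threshold_method=threshold_method,
    )


# Usage, given an existing client and DataFrame:
# per_series = detect_online(nixtla_client, df, "univariate")
# system_wide = detect_online(nixtla_client, df, "multivariate")
```

Only the threshold_method argument changes between the two runs, which makes it straightforward to compare their outputs on the same data.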