Overview

This guide demonstrates anomaly detection across multiple time series using univariate and multivariate methods. You will learn:
• How to detect anomalies in each time series independently (univariate).
• How to detect anomalies across multiple correlated time series (multivariate).

Both univariate and multivariate methods rely on the Nixtla API for anomaly detection. The main difference is how anomalies are identified: individually per time series vs. collectively across multiple series at the same timestamp.

Setup

1. Install and Import Dependencies

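If you haven't already, install the nixtla package, then import the dependencies used throughout this guide.

pip install nixtla

import pandas as pd
import matplotlib.pyplot as plt
from nixtla import NixtlaClient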

2. Connect to the Nixtla API

Create a NixtlaClient instance. Replace 'my_api_key_provided_by_nixtla' with your actual API key.

nixtla_client = NixtlaClient(
    api_key='my_api_key_provided_by_nixtla'
)

Use an Azure AI Endpoint
To use an Azure AI endpoint, set the base_url argument explicitly:

nixtla_client = NixtlaClient(
    base_url="your azure ai endpoint",
    api_key="your api_key"
)
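Hard-coding a key in a notebook makes it easy to leak. The following is a minimal sketch of reading it from an environment variable instead. The helper name load_api_key is hypothetical and not part of the Nixtla SDK, and the variable name NIXTLA_API_KEY is an assumption you can change to match your setup.

```python
import os


def load_api_key(var_name="NIXTLA_API_KEY"):
    """Return the API key stored in an environment variable.

    The function name and the default variable name are illustrative
    assumptions, not part of the Nixtla SDK.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set or is empty."
        )
    return key


# Usage (requires the nixtla package and a valid key):
# nixtla_client = NixtlaClient(api_key=load_api_key())
```

Failing early with a clear message is usually easier to debug than an authentication error returned by the first API call.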

1. Dataset

We use an example from the Server Machine Dataset (SMD), a benchmark for anomaly detection across correlated server-performance metrics (CPU, memory, disk I/O, network throughput, etc.).

File Used: SMD_test.csv
Data Size: 38 unique time series
Frequency: Hourly (freq='h')

df = pd.read_csv(
    'https://datasets-nixtla.s3.us-east-1.amazonaws.com/SMD_test.csv',
    parse_dates=['ts']
)
df.unique_id.nunique()
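Before sending data to the API, it can help to confirm that each series is regularly spaced. The sketch below is one way to do that, assuming the long format used in this guide (columns unique_id, ts, y). The helper summarize_series is hypothetical, and the small frame is synthetic stand-in data rather than SMD values.

```python
import pandas as pd


def summarize_series(df, id_col="unique_id", time_col="ts", freq="h"):
    """Per-series row count, time range, missing and duplicated timestamps."""
    rows = []
    for uid, group in df.groupby(id_col):
        ts = group[time_col]
        # Timestamps we would expect if the series had no gaps.
        expected = pd.date_range(ts.min(), ts.max(), freq=freq)
        rows.append({
            id_col: uid,
            "n_rows": len(group),
            "start": ts.min(),
            "end": ts.max(),
            "n_missing": len(expected.difference(pd.DatetimeIndex(ts))),
            "n_duplicates": int(ts.duplicated().sum()),
        })
    return pd.DataFrame(rows)


# Synthetic stand-in: two hourly series, the second one with a one-hour gap.
toy = pd.DataFrame({
    "unique_id": ["a"] * 4 + ["b"] * 3,
    "ts": list(pd.date_range("2024-01-01", periods=4, freq="h"))
    + list(pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 03:00"]
    )),
    "y": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
})
summary = summarize_series(toy)
print(summary)
```

On the real data, summarize_series(df) should report the same frequency assumption (freq='h') that is later passed to the detection call.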

2. Univariate vs. Multivariate Methods

2.1 Univariate Method

Definition: Univariate anomaly detection analyzes each time series in isolation. It flags anomalies based on each series' individual deviation from its expected behavior.

Pros: Efficient for individual metrics or when correlations between metrics are not relevant.

Cons: May miss large-scale, system-wide anomalies that are only apparent when multiple series deviate simultaneously.

2.1.1 Example Usage

Univariate detection code:

anomaly_online = nixtla_client.detect_anomalies_online(
    df[['ts', 'y', 'unique_id']],
    time_col='ts',
    target_col='y',
    freq='h',
    h=24,
    level=95,
    detection_size=475,
    threshold_method='univariate'  # Univariate anomaly detection
)

Sample output logs:

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
WARNING:nixtla.nixtla_client:Detection size is large. Using the entire series to compute the anomaly threshold...
INFO:nixtla.nixtla_client:Calling Online Anomaly Detector Endpoint...
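Once the call returns, a quick per-series tally shows where the flags concentrate. This is a sketch that relies only on the unique_id and anomaly columns, which the plotting code in this guide also uses. The helper anomaly_counts is hypothetical, and the frame below is synthetic stand-in data, not real API output.

```python
import pandas as pd


def anomaly_counts(result, id_col="unique_id", flag_col="anomaly"):
    """Count flagged points and the flagged share for each series."""
    flags = result[flag_col].astype(bool)
    grouped = flags.groupby(result[id_col])
    out = pd.DataFrame({
        "n_points": grouped.size(),
        "n_anomalies": grouped.sum(),
    })
    out["anomaly_rate"] = out["n_anomalies"] / out["n_points"]
    # Series with the most flags come first.
    return out.sort_values("n_anomalies", ascending=False)


# Synthetic stand-in with the columns the plotting code relies on.
toy_result = pd.DataFrame({
    "unique_id": ["s1", "s1", "s1", "s1", "s2", "s2", "s2", "s2"],
    "ts": list(pd.date_range("2024-01-01", periods=4, freq="h")) * 2,
    "y": [1.0, 1.1, 9.0, 1.0, 2.0, 2.1, 2.0, 2.2],
    "TimeGPT": [1.0, 1.0, 1.1, 1.0, 2.0, 2.0, 2.1, 2.1],
    "anomaly": [False, False, True, False, False, False, False, False],
})
counts = anomaly_counts(toy_result)
print(counts)
```

With real output, the equivalent call would be anomaly_counts(anomaly_online).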

2.1.2 Visualization

# Utility function to plot anomalies
def plot_anomalies(df, unique_ids, rows, cols):
    # squeeze=False keeps `axes` a 2-D array even for a 1x1 grid
    fig, axes = plt.subplots(rows, cols, figsize=(12, rows * 2), squeeze=False)

    for ax, uid in zip(axes.flatten(), unique_ids):
        filtered_df = df[df['unique_id'] == uid]
        ax.plot(filtered_df['ts'], filtered_df['y'], color='navy', alpha=0.8, label='y')
        ax.plot(filtered_df['ts'], filtered_df['TimeGPT'], color='orchid', alpha=0.7, label='TimeGPT')
        ax.scatter(
            filtered_df.loc[filtered_df['anomaly'] == 1, 'ts'],
            filtered_df.loc[filtered_df['anomaly'] == 1, 'y'],
            color='orchid', label='Anomalies Detected'
        )
        ax.set_title(f"Unique_id: {uid}", fontsize=8)
        ax.tick_params(axis='x', labelsize=6)

    fig.legend(loc='upper center', ncol=3, fontsize=8, labels=['y', 'TimeGPT', 'Anomaly'])
    plt.tight_layout(rect=[0, 0, 1, 0.95])
    plt.show()


display_ids = ['machine-1-1_y_0', 'machine-1-1_y_1', 'machine-1-1_y_6', 'machine-1-1_y_29']
plot_anomalies(anomaly_online, display_ids, rows=2, cols=2)

Univariate Anomaly Detection Results

2.2 Multivariate Method

Definition: Multivariate anomaly detection considers all time series collectively. A time step is flagged as anomalous if the aggregate deviation across all series at that time exceeds a threshold.

Pros: Captures systemic or correlated anomalies that might be missed when analyzing each series in isolation.

Cons: Slightly higher complexity and computational overhead. May require careful threshold tuning.

2.2.1 Example Usage

Multivariate detection code:

anomaly_online_multi = nixtla_client.detect_anomalies_online(
    df[['ts', 'y', 'unique_id']],
    time_col='ts',
    target_col='y',
    freq='h',
    h=24,
    level=95,
    detection_size=475,
    threshold_method='multivariate'  # Multivariate anomaly detection
)

Sample output logs:

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
WARNING:nixtla.nixtla_client:Detection size is large. Using the entire series to compute the anomaly threshold...
INFO:nixtla.nixtla_client:Calling Online Anomaly Detector Endpoint...

2.2.2 Visualization

plot_anomalies(anomaly_online_multi, display_ids, rows=2, cols=2)

Multivariate Anomaly Detection Results
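To see where the two runs agree and disagree, the flags can be cross-tabulated. The sketch below assumes both result frames share the unique_id, ts, and anomaly columns. The helper compare_flags is hypothetical, and the two small frames are synthetic stand-ins for the real outputs.

```python
import pandas as pd


def compare_flags(uni, multi, keys=("unique_id", "ts"), flag_col="anomaly"):
    """Cross-tabulate anomaly flags from two detection runs on the same data."""
    cols = list(keys) + [flag_col]
    merged = uni[cols].merge(
        multi[cols], on=list(keys), suffixes=("_uni", "_multi")
    )
    return pd.crosstab(
        merged[f"{flag_col}_uni"].astype(bool),
        merged[f"{flag_col}_multi"].astype(bool),
        rownames=["univariate"],
        colnames=["multivariate"],
    )


# Synthetic stand-ins for the univariate and multivariate results.
toy_uni = pd.DataFrame({
    "unique_id": ["s1"] * 3 + ["s2"] * 3,
    "ts": list(pd.date_range("2024-01-01", periods=3, freq="h")) * 2,
    "anomaly": [False, True, False, False, False, True],
})
toy_multi = toy_uni.assign(anomaly=[False, True, False, False, True, False])

table = compare_flags(toy_uni, toy_multi)
print(table)
```

With real output, the equivalent call would be compare_flags(anomaly_online, anomaly_online_multi). The off-diagonal cells are the points that only one of the two methods flags.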

In multivariate anomaly detection, anomaly scores from all series at each time step are aggregated. A step is anomalous if the combined score exceeds the threshold. This reveals systemic anomalies that may go unnoticed if each series is considered alone.
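The idea of aggregating scores can be illustrated with a toy calculation. This is only a sketch: the scores are made up, and the choice of summing them and the two thresholds are assumptions for illustration, not a description of how the Nixtla API computes or combines its scores.

```python
import numpy as np

# Hypothetical anomaly scores (for example, absolute standardized residuals).
# Rows are time steps, columns are series. Values are made up.
scores = np.array([
    [0.5, 0.8, 0.3],
    [1.0, 0.4, 0.6],
    [0.2, 0.9, 0.7],
    [2.4, 2.5, 2.6],  # every series moderately elevated at the same step
    [0.6, 0.3, 3.4],  # one series spikes on its own
    [0.7, 0.5, 0.4],
])

PER_SERIES_THRESHOLD = 3.0  # assumed cutoff for a single series
COMBINED_THRESHOLD = 5.0    # assumed cutoff for the summed score

# Univariate rule: each score is compared with the per-series cutoff.
uni_flags = scores > PER_SERIES_THRESHOLD
uni_steps = np.flatnonzero(uni_flags.any(axis=1)).tolist()

# Multivariate rule: sum the scores across series, then threshold the total.
combined = scores.sum(axis=1)
multi_steps = np.flatnonzero(combined > COMBINED_THRESHOLD).tolist()

print("steps with at least one univariate flag:", uni_steps)  # [4]
print("steps flagged by the combined score:", multi_steps)    # [3]
```

Step 3 is flagged only by the combined score because no single series crosses its own cutoff, while the isolated spike at step 4 is caught only by the per-series rule.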

Summary

Univariate:
Best for detecting anomalies in a single metric or uncorrelated metrics. Low computational overhead, but may overlook cross-series patterns.

Multivariate:
Considers correlations across metrics, capturing system-wide issues. More complex and computationally intensive than univariate methods.

Both detection approaches use Nixtla’s online anomaly detection method. Choose the strategy that best fits your use case and data characteristics.
