Cross-validation

Cross-validation is an effective model validation method for time series forecasting. It uses historical data to evaluate the stability and performance of your model before deployment, helping you make confident predictions in real-world scenarios.

Cross-validation is harder for time series than for general machine learning tasks. Observations are ordered in time, so the data cannot be shuffled randomly: every validation set must come after the data used to produce its forecast.

Rolling-window cross-validation conceptually splits your dataset into multiple training and validation sets over time.
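To make the idea concrete, here is a minimal sketch of how rolling windows can be cut from a single series. It is a conceptual illustration only and not a description of what TimeGPT does internally. The function name rolling_window_splits, its step_size argument, and the toy data are assumptions made for this example.

```python
import pandas as pd


def rolling_window_splits(df, h, n_windows, step_size=None, time_col="ds"):
    """Yield (cutoff, train, valid) tuples, oldest window first.

    Hypothetical helper for illustration; not part of the nixtla package.
    """
    step_size = h if step_size is None else step_size
    df = df.sort_values(time_col).reset_index(drop=True)
    n = len(df)
    for i in range(n_windows):
        # Exclusive end of this window's validation block.
        valid_end = n - (n_windows - 1 - i) * step_size
        valid_start = valid_end - h
        if valid_start <= 0:
            raise ValueError("Not enough rows for the requested windows")
        train = df.iloc[:valid_start]            # everything up to the cutoff
        valid = df.iloc[valid_start:valid_end]   # the next h observations
        yield train[time_col].iloc[-1], train, valid


# Toy daily series, made up for this example.
toy = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=30, freq="D"),
    "y": range(30),
})

splits = list(rolling_window_splits(toy, h=7, n_windows=3))
for cutoff, train, valid in splits:
    print(cutoff.date(), len(train), len(valid))
```

Each window trains only on rows up to its cutoff and validates on the h rows that follow, so no future information leaks into training.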

The NixtlaClient class provides a cross_validation method tailored to time series forecasting. This tutorial shows how to use NixtlaClient to run cross-validation with TimeGPT and check the reliability of your forecasting models.


For best results, ensure your data is properly formatted: you must have a time column (e.g., ds), a target column (e.g., y), and, if necessary, an identifier column (e.g., unique_id) for multiple time series.
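As a rough sketch of that preparation step, the snippet below reshapes a small table into the ds, y, and unique_id layout using pandas. The raw column names (date, store, sales) and the values are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical raw data with arbitrary column names.
raw = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-01", "2024-01-02"],
    "store": ["A", "A", "B", "B"],
    "sales": [10.0, 12.0, 7.0, 9.0],
})

df = (
    raw.rename(columns={"date": "ds", "sales": "y", "store": "unique_id"})
       .assign(ds=lambda d: pd.to_datetime(d["ds"]))  # parse timestamps
       .sort_values(["unique_id", "ds"])              # keep each series in time order
       .reset_index(drop=True)
)

# Basic sanity checks before forecasting.
has_missing_target = df["y"].isna().any()
has_duplicate_stamps = df.duplicated(["unique_id", "ds"]).any()
print(df.dtypes)
print(has_missing_target, has_duplicate_stamps)
```

Checking for missing targets and duplicated timestamps up front tends to surface formatting problems before any forecasting call is made.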

Goal: Validate your forecasting model systematically across different time segments.

Key Benefit:

Outcome:

Tutorial Steps

1. Import Packages and Initialize NixtlaClient

Cross-validation starts with installing and importing the required packages, then creating an instance of NixtlaClient.

import-packages
import pandas as pd
from nixtla import NixtlaClient
from IPython.display import display

nixtla_client = NixtlaClient(
    api_key='my_api_key_provided_by_nixtla'
)

Use this variant if you’re connecting to Nixtla’s standard API.

standard-client
nixtla_client_standard = NixtlaClient(
    api_key='my_api_key_provided_by_nixtla'
)

If you are using Azure AI, include base_url with your endpoint.

azure-client
nixtla_client_azure = NixtlaClient(
    base_url="your azure ai endpoint",
    api_key="your api_key"
)

2. Load Example Data

Use the Peyton Manning dataset as an example. The dataset can be loaded directly from Nixtla’s S3 bucket:

load-data
pm_df = pd.read_csv(
    'https://datasets-nixtla.s3.amazonaws.com/peyton-manning.csv'
)

3. Perform Cross-Validation

Why Rolling-window Cross-validation?

Important Parameters

Use cross_validation on the Peyton Manning dataset:

cross-validation
timegpt_cv_df = nixtla_client.cross_validation(
    pm_df,
    h=7,
    n_windows=5,
    freq='D'
)
timegpt_cv_df.head()

The logs below indicate successful cross-validation calls and data preprocessing.

Cross-validation Log Output

Log Output
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Querying model metadata...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Cross Validation Endpoint...

Cross-validation output includes the forecasted values (TimeGPT) aligned with historical values (y).

ds	cutoff	y	TimeGPT
2015-12-17	2015-12-16	7.591862	7.939553
2015-12-18	2015-12-16	7.528869	7.887512
2015-12-19	2015-12-16	7.171657	7.766617
2015-12-20	2015-12-16	7.891331	7.931502
2015-12-21	2015-12-16	8.360071	8.312632
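Because each row pairs a forecast with the observed value, the output can be scored directly. The sketch below computes the mean absolute error per cutoff with pandas, using only the five rows shown above. The choice of MAE and the abs_error column name are assumptions for this example, not something the tutorial prescribes.

```python
import pandas as pd

# The five rows shown above, re-typed by hand for a self-contained example.
cv = pd.DataFrame({
    "ds": pd.to_datetime([
        "2015-12-17", "2015-12-18", "2015-12-19", "2015-12-20", "2015-12-21",
    ]),
    "cutoff": pd.to_datetime(["2015-12-16"] * 5),
    "y": [7.591862, 7.528869, 7.171657, 7.891331, 8.360071],
    "TimeGPT": [7.939553, 7.887512, 7.766617, 7.931502, 8.312632],
})

# Absolute error per row, then averaged within each cross-validation window.
cv["abs_error"] = (cv["y"] - cv["TimeGPT"]).abs()
mae_by_cutoff = cv.groupby("cutoff")["abs_error"].mean()
print(mae_by_cutoff)
```

On the full output, the same groupby would give one score per window, which makes it easier to see whether accuracy is stable across time.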

If you are using an Azure AI endpoint, remember to specify model="azureai" in cross_validation. Also refer to this tutorial to explore other supported models.

4. Plot Cross-Validation Results

Visualize forecast performance for each cutoff period. Here’s an example plotting the last 100 rows of actual data along with cross-validation forecasts for each cutoff.

plot-results
cutoffs = timegpt_cv_df['cutoff'].unique()

for cutoff in cutoffs:
    fig = nixtla_client.plot(
        pm_df.tail(100),
        timegpt_cv_df.query('cutoff == @cutoff').drop(columns=['cutoff', 'y']),
    )
    display(fig)

An example visualization of predicted vs. actual values in the Peyton Manning dataset.

5. Use Prediction Intervals, Exogenous Variables, and Model Variations

You can customize your cross-validation further:

Features to Enhance Your Model

Example Usage

advanced-cross-validation
timegpt_cv_custom_df = nixtla_client.cross_validation(
    df=pm_df,
    h=7,
    freq='D',
    level=[80, 90],
    date_features=['month'],
    model='timegpt-1-long-horizon'
)

timegpt_cv_custom_df.head()

unique_id	ds	cutoff	y	TimeGPT	TimeGPT-lo-90	TimeGPT-hi-90	TimeGPT-lo-80	TimeGPT-hi-80
0	2015-12-17	2015-12-16	7.591862	7.939553	7.112531	8.730458	7.316611	8.562029
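One way to sanity-check the interval columns is to measure how often the observed value falls inside them. The sketch below does this for the single row shown above. The helper name interval_coverage is hypothetical, and the column naming pattern is taken from the output header.

```python
import pandas as pd

# The single row shown above, re-typed by hand for a self-contained example.
row = pd.DataFrame({
    "y": [7.591862],
    "TimeGPT": [7.939553],
    "TimeGPT-lo-90": [7.112531],
    "TimeGPT-hi-90": [8.730458],
    "TimeGPT-lo-80": [7.316611],
    "TimeGPT-hi-80": [8.562029],
})


def interval_coverage(df, level, model="TimeGPT", target="y"):
    """Fraction of rows whose target lies inside the given interval.

    Hypothetical helper for illustration; not part of the nixtla package.
    """
    lo = df[f"{model}-lo-{level}"]
    hi = df[f"{model}-hi-{level}"]
    inside = (df[target] >= lo) & (df[target] <= hi)
    return inside.mean()


for level in (80, 90):
    print(level, interval_coverage(row, level))
```

With many rows, an empirical coverage close to the nominal level (for example, about 0.9 for the 90% interval) suggests the intervals are reasonably calibrated. A single row cannot show that on its own.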

Conclusion

By systematically testing your forecasting models over multiple time windows, cross_validation in NixtlaClient helps you check that TimeGPT predictions are accurate and stable before deployment. Adding prediction intervals, exogenous variables, and different model variants lets you evaluate richer configurations in the same way.

Ready to take the next step? Explore other tutorials in the Nixtla documentation for more details on custom models, hyperparameter tuning, and advanced visualization techniques.
