Cross-validation is an effective model validation method for time series forecasting. It uses historical data to evaluate the stability and performance of your model before deployment, helping you make confident predictions in real-world scenarios.

Cross-validation can be especially challenging for time series data, due to the inherent uncertainty and variability over time. Unlike general machine learning tasks where data can be shuffled randomly, time series data requires an approach that respects temporal ordering.

Rolling-window cross-validation conceptually splits your dataset into multiple training and validation sets over time.
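
To make the idea concrete, here is a small, illustrative sketch (plain pandas only, not part of the Nixtla API) of how rolling windows split a daily series into training and validation sets; the series length, horizon, and number of windows are arbitrary values chosen for the example.

# Illustrative only: build rolling train/validation splits by hand.
# TimeGPT's cross_validation method handles this splitting for you.
import pandas as pd

series = pd.DataFrame({
    'ds': pd.date_range('2023-01-01', periods=30, freq='D'),
    'y': range(30),
})

h = 7          # validation horizon per window
n_windows = 3  # number of rolling windows

for i in range(n_windows, 0, -1):
    cutoff = len(series) - i * h
    train = series.iloc[:cutoff]             # all observations up to the cutoff
    valid = series.iloc[cutoff:cutoff + h]   # the next h observations
    print(f"cutoff={train['ds'].iloc[-1].date()}, "
          f"train size={len(train)}, validation size={len(valid)}")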

The NixtlaClient class in Nixtla provides a cross_validation method tailored to time series forecasting. This tutorial shows how to use it to run cross-validation and reinforce the reliability of your forecasting models.

For best results, ensure your data is properly formatted: you must have a time column (e.g., ds), a target column (e.g., y), and, if necessary, an identifier column (e.g., unique_id) for multiple time series.

Goal: Validate your forecasting model systematically across different time segments.

Key Benefit: Evaluates your model's stability and performance on historical data before deployment.

Outcome: Greater confidence in the reliability of your forecasts in real-world scenarios.


Tutorial Steps

1. Import Packages and Initialize NixtlaClient

Cross-validation starts with installing and importing the required packages, then creating an instance of NixtlaClient.
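
If the SDK is not installed yet, it can typically be installed from PyPI (the package is published as nixtla; exact install steps may vary with your environment):

pip install nixtla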

import pandas as pd
from nixtla import NixtlaClient
from IPython.display import display

nixtla_client = NixtlaClient(
    api_key='my_api_key_provided_by_nixtla'
)
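
Optionally, verify that the client can reach the API before continuing (the key above is a placeholder you must replace with your own):

# Optional check: returns True if the API key is accepted.
nixtla_client.validate_api_key()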


2. Load Example Data

Use the Peyton Manning dataset as an example. The dataset can be loaded directly from Nixtla’s S3 bucket:

pm_df = pd.read_csv(
    'https://datasets-nixtla.s3.amazonaws.com/peyton-manning.csv'
)
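
Optionally, inspect the first rows to confirm the data matches the layout described above (a ds time column and a y target); a quick check:

# Peek at the first rows and the overall size of the dataset.
print(pm_df.head())
print(pm_df.shape)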

3. Perform Cross-Validation

Use cross_validation on the Peyton Manning dataset:

timegpt_cv_df = nixtla_client.cross_validation(
    pm_df,
    h=7,          # forecast horizon: 7 steps ahead in each window
    n_windows=5,  # number of rolling validation windows
    freq='D'      # daily frequency of the series
)
timegpt_cv_df.head()

While the call runs, the client emits log messages confirming the cross-validation requests and the data preprocessing steps.

The cross-validation output contains the forecasted values (TimeGPT) aligned with the actual historical values (y), along with the cutoff date that marks the end of each training window.

unique_id  ds          cutoff      y         TimeGPT
0          2015-12-17  2015-12-16  7.591862  7.939553
0          2015-12-18  2015-12-16  7.528869  7.887512
0          2015-12-19  2015-12-16  7.171657  7.766617
0          2015-12-20  2015-12-16  7.891331  7.931502
0          2015-12-21  2015-12-16  8.360071  8.312632
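
To condense this output into a single accuracy figure per window, you can compute an error metric by cutoff; a minimal sketch using pandas (the mean absolute error here is just one reasonable choice, not something the tutorial prescribes):

# Mean absolute error of TimeGPT vs. the actual values, per validation window.
mae_per_cutoff = (
    (timegpt_cv_df['TimeGPT'] - timegpt_cv_df['y'])
    .abs()
    .groupby(timegpt_cv_df['cutoff'])
    .mean()
)
print(mae_per_cutoff)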

If you are using an Azure AI endpoint, remember to specify model="azureai" in cross_validation. See the Nixtla documentation on model selection to explore other supported models.

4. Plot Cross-Validation Results

Visualize forecast performance for each cutoff period. Here’s an example plotting the last 100 rows of actual data along with cross-validation forecasts for each cutoff.

cutoffs = timegpt_cv_df['cutoff'].unique()

for cutoff in cutoffs:
    fig = nixtla_client.plot(
        pm_df.tail(100),
        timegpt_cv_df.query('cutoff == @cutoff').drop(columns=['cutoff', 'y']),
    )
    display(fig)

An example visualization of predicted vs. actual values in the Peyton Manning dataset.

5. Use Prediction Intervals, Exogenous Variables, and Model Variations

You can customize your cross-validation further:

- Prediction intervals: pass the level argument (for example, level=[80, 90]) to get uncertainty bands alongside the point forecasts.
- Exogenous variables: include additional explanatory columns in your input DataFrame so TimeGPT can use them during each validation window.
- Model variations: choose a different model variant with the model argument, such as timegpt-1-long-horizon for longer forecast horizons.
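
As a sketch of how these options combine in a single call (the interval levels and the long-horizon model below are illustrative choices, not requirements):

# Cross-validation with 80%/90% prediction intervals and an alternative model.
timegpt_cv_levels_df = nixtla_client.cross_validation(
    pm_df,
    h=7,
    n_windows=5,
    freq='D',
    level=[80, 90],                  # adds lower/upper interval columns to the output
    model='timegpt-1-long-horizon'   # model variant suited to longer horizons
)
timegpt_cv_levels_df.head()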

Conclusion

By systematically testing your forecasting models over multiple time windows, cross_validation in Nixtla's TimeGPT helps you confirm that predictions are accurate and reliable before deployment. Incorporating prediction intervals, exogenous variables, and different model variants can further enhance your forecasts, giving you robust insights for real-world applications.

Ready to take the next step? Explore other tutorials in the Nixtla documentation for more details on custom models, hyperparameter tuning, and advanced visualization techniques.