Cross-validation
Learn how to validate time series models with rolling-window cross-validation
Cross-validation is an effective model validation method for time series forecasting. It uses historical data to evaluate the stability and performance of your model before deployment, helping you make confident predictions in real-world scenarios.
Cross-validation can be especially challenging for time series data, due to the inherent uncertainty and variability over time. Unlike general machine learning tasks where data can be shuffled randomly, time series data requires an approach that respects temporal ordering.
Rolling-window cross-validation splits your dataset into multiple training and validation sets over time: each window trains on observations up to a cutoff date and validates on the period that immediately follows.
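To make the windowing idea concrete, here is a minimal sketch in plain Python. It illustrates the general rolling-window technique and is not TimeGPT's internal logic; the function name and the choice to default the step between windows to the horizon are assumptions made for this example.

```python
def rolling_window_splits(n_obs, h, n_windows, step_size=None):
    """Return (train_end, val_end) index pairs, oldest window first.

    Training data is rows [0, train_end); validation data is rows
    [train_end, val_end). Windows are anchored at the end of the series.
    """
    step = h if step_size is None else step_size  # assumed default: step == horizon
    splits = []
    for w in range(n_windows):
        # The last window ends at the final observation; earlier ones shift back.
        val_end = n_obs - (n_windows - 1 - w) * step
        train_end = val_end - h
        if train_end <= 0:
            raise ValueError("not enough observations for the requested windows")
        splits.append((train_end, val_end))
    return splits


# Three windows of horizon 3 over a 20-row series.
splits = rolling_window_splits(n_obs=20, h=3, n_windows=3)
```

Each pair respects temporal order: the model only ever sees rows that precede the validation rows it is scored on.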
Nixtla's NixtlaClient class provides a cross_validation method tailored to time series forecasting. This tutorial shows how to use NixtlaClient to run cross-validation and check the reliability of your forecasting models.
For best results, ensure your data is properly formatted: you must have a time column (e.g., ds), a target column (e.g., y), and, if necessary, an identifier column (e.g., unique_id) for multiple time series.
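As a rough illustration of that layout, the sketch below checks for the required columns, parses the time column, and sorts each series chronologically. The helper name prepare_frame and the toy values are made up for this example; only the column names ds, y, and unique_id come from the text above.

```python
import pandas as pd


def prepare_frame(df, time_col="ds", target_col="y", id_col="unique_id"):
    """Validate and tidy a long-format frame (hypothetical helper)."""
    missing = [c for c in (time_col, target_col) if c not in df.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    out = df.copy()
    out[time_col] = pd.to_datetime(out[time_col])
    # The identifier column is optional, so only sort by it when present.
    sort_keys = [id_col, time_col] if id_col in out.columns else [time_col]
    return out.sort_values(sort_keys).reset_index(drop=True)


# Toy data: two series, rows deliberately out of order.
example = pd.DataFrame(
    {
        "unique_id": ["a", "a", "b", "b"],
        "ds": ["2024-01-02", "2024-01-01", "2024-01-02", "2024-01-01"],
        "y": [2.0, 1.0, 20.0, 10.0],
    }
)
prepared = prepare_frame(example)
```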
Goal: Validate your forecasting model systematically across different time segments.
Key Benefit:
Outcome:
Tutorial Steps
1. Import Packages and Initialize NixtlaClient
Cross-validation starts with installing and importing the required packages, then creating an instance of NixtlaClient.
Use this variant if you’re connecting to Nixtla’s standard API.
If you are using Azure AI, include base_url with your endpoint.
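The two variants can be sketched with one small helper. This assumes the nixtla package is installed (pip install nixtla) and that the key is either passed in or stored in the NIXTLA_API_KEY environment variable; the helper names client_kwargs and make_client are hypothetical, not part of the SDK.

```python
import os


def client_kwargs(api_key=None, base_url=None):
    """Build keyword arguments for NixtlaClient (hypothetical helper)."""
    kwargs = {"api_key": api_key or os.environ.get("NIXTLA_API_KEY")}
    if base_url is not None:
        # Only needed for non-default endpoints such as Azure AI.
        kwargs["base_url"] = base_url
    return kwargs


def make_client(api_key=None, base_url=None):
    """Create a NixtlaClient for the standard API or a custom endpoint."""
    from nixtla import NixtlaClient  # imported lazily so the helper above has no dependencies

    return NixtlaClient(**client_kwargs(api_key, base_url))
```

For the standard API, make_client(api_key="your key") is enough; for Azure AI, also pass base_url with your own endpoint.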
2. Load Example Data
Use the Peyton Manning dataset as an example. The dataset can be loaded directly from Nixtla’s S3 bucket:
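The dataset URL is not reproduced here, so the sketch below takes the source as a parameter instead of guessing it. It assumes the file is a CSV with a ds column; the function name load_series is hypothetical, and the two sample rows are copied from the output table later in this tutorial.

```python
import io

import pandas as pd


def load_series(source):
    """Read a CSV from a path, URL, or file-like object and parse the ds column."""
    return pd.read_csv(source, parse_dates=["ds"])


# In-memory stand-in for the real file, for illustration only.
sample_csv = io.StringIO(
    "unique_id,ds,y\n"
    "0,2015-12-17,7.591862\n"
    "0,2015-12-18,7.528869\n"
)
sample = load_series(sample_csv)
```

With the real dataset, pass the bucket URL as source; pandas can read a CSV from an HTTP(S) URL directly.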
3. Perform Cross-Validation
Use cross_validation on the Peyton Manning dataset:
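A minimal sketch of the call, assuming client is a NixtlaClient instance and df is the loaded dataset. The horizon of 7 and the 5 windows are illustrative choices rather than required settings; the daily frequency matches the dates in the output table. The wrapper name run_cross_validation is hypothetical.

```python
def run_cross_validation(client, df, h=7, n_windows=5, freq="D"):
    """Forecast h steps ahead from each of n_windows cutoffs and return the results."""
    return client.cross_validation(
        df=df,
        h=h,                  # forecast horizon per window
        n_windows=n_windows,  # number of rolling windows (cutoffs)
        freq=freq,            # pandas frequency alias; "D" means daily
        time_col="ds",
        target_col="y",
    )
```

Calling run_cross_validation(client, df) returns a DataFrame with one row per series, timestamp, and cutoff.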
The logs below indicate successful cross-validation calls and data preprocessing.
Cross-validation output includes the forecasted values (TimeGPT) aligned with historical values (y).
unique_id | ds | cutoff | y | TimeGPT |
---|---|---|---|---|
0 | 2015-12-17 | 2015-12-16 | 7.591862 | 7.939553 |
0 | 2015-12-18 | 2015-12-16 | 7.528869 | 7.887512 |
0 | 2015-12-19 | 2015-12-16 | 7.171657 | 7.766617 |
0 | 2015-12-20 | 2015-12-16 | 7.891331 | 7.931502 |
0 | 2015-12-21 | 2015-12-16 | 8.360071 | 8.312632 |
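Because y and TimeGPT sit side by side, you can score each window directly. The sketch below computes the mean absolute error per cutoff for the five rows shown above; MAE is just one possible metric, chosen here for illustration, and the function name mae_by_cutoff is hypothetical.

```python
from collections import defaultdict

# (cutoff, actual y, TimeGPT forecast), copied from the table above.
rows = [
    ("2015-12-16", 7.591862, 7.939553),
    ("2015-12-16", 7.528869, 7.887512),
    ("2015-12-16", 7.171657, 7.766617),
    ("2015-12-16", 7.891331, 7.931502),
    ("2015-12-16", 8.360071, 8.312632),
]


def mae_by_cutoff(rows):
    """Group absolute errors by cutoff and average them."""
    errors = defaultdict(list)
    for cutoff, actual, forecast in rows:
        errors[cutoff].append(abs(actual - forecast))
    return {cutoff: sum(errs) / len(errs) for cutoff, errs in errors.items()}


scores = mae_by_cutoff(rows)
```

Comparing the score across cutoffs shows whether accuracy is stable over time or degrades in particular periods.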
If you are using an Azure AI endpoint, remember to specify model="azureai" in cross_validation. Also refer to this tutorial to explore other supported models.
4. Plot Cross-Validation Results
Visualize forecast performance for each cutoff period. Here’s an example plotting the last 100 rows of actual data along with cross-validation forecasts for each cutoff.
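One way to produce such plots is sketched below. It assumes client is a NixtlaClient (which exposes a plot method taking the history and a forecast frame), df is the input data, and cv_df is the cross-validation output with cutoff and y columns. The wrapper name plot_cv_by_cutoff is hypothetical.

```python
import pandas as pd


def plot_cv_by_cutoff(client, df: pd.DataFrame, cv_df: pd.DataFrame, tail: int = 100) -> list:
    """Return one figure per cutoff, overlaying forecasts on recent history."""
    figures = []
    for cutoff in cv_df["cutoff"].unique():
        # Keep only this window's forecasts; drop columns the plot does not need.
        window = cv_df[cv_df["cutoff"] == cutoff].drop(columns=["cutoff", "y"])
        figures.append(client.plot(df.tail(tail), window))
    return figures
```

In a notebook, display each returned figure in turn to compare windows.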
An example visualization of predicted vs. actual values in the Peyton Manning dataset.
5. Use Prediction Intervals, Exogenous Variables, and Model Variations
You can customize your cross-validation further:
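The sketch below shows how these options might be passed in a single call. It assumes cross_validation accepts level (prediction interval levels, in percent) and model, and that exogenous variables are supplied as extra columns in df; the specific levels and the wrapper name run_cv_with_options are illustrative assumptions.

```python
def run_cv_with_options(client, df, h=7, n_windows=5, freq="D",
                        level=(80, 90), model="timegpt-1"):
    """Cross-validate with prediction intervals and a chosen model variant."""
    return client.cross_validation(
        df=df,               # extra columns here are assumed to act as exogenous features
        h=h,
        n_windows=n_windows,
        freq=freq,
        level=list(level),   # e.g. 80% and 90% prediction intervals
        model=model,         # e.g. "timegpt-1" or "timegpt-1-long-horizon"
    )
```

To try a different variant, pass its name as model, for example model="timegpt-1-long-horizon".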
Conclusion
By systematically testing your forecasting models over multiple time windows, cross_validation in Nixtla's TimeGPT helps you verify that predictions are accurate and reliable. Incorporating prediction intervals, exogenous variables, and different model variants can further enhance your forecasts, giving you robust insights for real-world applications.
Ready to take the next step? Explore other tutorials in the Nixtla documentation for more details on custom models, hyperparameter tuning, and advanced visualization techniques.