Data Requirements

TimeGPT accepts pandas and polars dataframes in long format. The minimum required columns are:

Required Columns

ds (timestamp): String or datetime in YYYY-MM-DD or YYYY-MM-DD HH:MM:SS format.
y (numeric): Numerical target variable to forecast.
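
As a rough local sanity check before sending data to the API, the two requirements above can be tested with plain pandas. This is only a sketch: the helper name check_required_columns is hypothetical and is not part of the nixtla package, and the checks are an interpretation of the column descriptions above rather than the client's own validation.

```python
import pandas as pd

def check_required_columns(df, time_col="ds", target_col="y"):
    """Hypothetical helper: return a list of problems, empty if none are found."""
    problems = [f"missing column: {col}"
                for col in (time_col, target_col) if col not in df.columns]
    if problems:
        return problems
    # Timestamps should parse as dates; unparseable values become NaT.
    parsed = pd.to_datetime(df[time_col], errors="coerce")
    if parsed.isna().any():
        problems.append(f"{time_col} has values that do not parse as timestamps")
    # The target should be stored with a numeric dtype.
    if not pd.api.types.is_numeric_dtype(df[target_col]):
        problems.append(f"{target_col} is not numeric")
    return problems

sample = pd.DataFrame({"ds": ["1949-01-01", "1949-02-01"], "y": [112, 118]})
problems = check_required_columns(sample)
```

An empty list means both columns are present and look usable; otherwise each entry names one issue to fix first.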

Optional Index

If a DataFrame lacks the ds column but uses a DatetimeIndex, that is also supported.
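
If you prefer an explicit ds column over relying on the index, a DatetimeIndex can be moved into a regular column with pandas. The sketch below uses a small hand-built frame with the first three values from the air passengers sample; naming the index ds is a choice made here for illustration.

```python
import pandas as pd

# A frame that keeps its timestamps in a DatetimeIndex instead of a ds column.
idx = pd.date_range("1949-01-01", periods=3, freq="MS", name="ds")
indexed = pd.DataFrame({"y": [112, 118, 132]}, index=idx)

# reset_index turns the index into an ordinary column named after the index.
long_df = indexed.reset_index()
```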

TimeGPT also supports distributed dataframe libraries such as Dask, Spark, and Ray.

You can include additional exogenous features in the same DataFrame. See the Exogenous Variables tutorial for details.

Example DataFrame

Below is a sample of a valid input DataFrame for TimeGPT (with columns named timestamp and value instead of ds and y):

Sample Data Loading
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv')
df.head()

	timestamp	value
0	1949-01-01	112
1	1949-02-01	118
2	1949-03-01	132
3	1949-04-01	129
4	1949-05-01	121

Sample Data Preview

In this example:
• timestamp corresponds to ds.
• value corresponds to y.

Matching Columns to TimeGPT

You can choose how to align your DataFrame columns with TimeGPT’s expected structure:

Rename timestamp to ds and value to y:

Rename Columns Example
df = df.rename(columns={'timestamp': 'ds', 'value': 'y'})

Now your DataFrame has the explicitly required columns:

Show Head of DataFrame
print(df.head())

Alternatively, keep the original column names and pass them to the forecast method through time_col and target_col:

NixtlaClient Forecast Example
from nixtla import NixtlaClient

nixtla_client = NixtlaClient(api_key='my_api_key_provided_by_nixtla')

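fcst = nixtla_client.forecast(
    df=df,
    h=12,                  # forecast 12 periods ahead
    time_col='timestamp',  # column holding the timestamps (instead of ds)
    target_col='value'     # column holding the target (instead of y)
)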

fcst.head()

This way, you don’t need to rename your DataFrame columns, as TimeGPT will know which ones to treat as ds and y.

Example Forecast

When you run the forecast method:

Forecast Example
fcst = nixtla_client.forecast(
    df=df,
    h=12,
    time_col='timestamp',
    target_col='value'
)

fcst.head()

Forecast Logs
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Querying model metadata...
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...

	timestamp	TimeGPT
0	1961-01-01	437.83792
1	1961-02-01	426.06270
2	1961-03-01	463.11655
3	1961-04-01	478.24450
4	1961-05-01	505.64648

Forecast Output Preview

TimeGPT attempts to automatically infer your data’s frequency (freq). You can override this by specifying the freq parameter (e.g., freq='MS').
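
To see locally what frequency a set of timestamps implies, pandas offers infer_freq. Whether TimeGPT's own inference behaves identically is an assumption here; the sketch only illustrates how the MS (month start) value in the logs above relates to the sample timestamps.

```python
import pandas as pd

# First five timestamps from the air passengers sample.
timestamps = pd.to_datetime(
    ["1949-01-01", "1949-02-01", "1949-03-01", "1949-04-01", "1949-05-01"]
)

# infer_freq needs at least three evenly spaced timestamps and
# returns None when no regular frequency can be determined.
inferred = pd.infer_freq(timestamps)
```

If the result is None or not what you expect, passing freq explicitly is the safer choice.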

For more information, see the TimeGPT Quickstart.

Multiple Series

When forecasting multiple time series simultaneously, the DataFrame must include an identifier column called unique_id that marks which series each row belongs to:

Multiple Series Data Loading
df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short.csv')
df.head()

	unique_id	ds	y
0	BE	2016-10-22 00:00:00	70.00
1	BE	2016-10-22 01:00:00	37.10

Multiple-Series Data Preview

Simply call:

Multiple Series Forecast Example
fcst = nixtla_client.forecast(df=df, h=24)
fcst.head()

TimeGPT will produce forecasts for all unique IDs in your DataFrame simultaneously.
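
Data that arrives in wide format (one column per series) has to be reshaped into the long layout shown above first. A minimal pandas sketch follows; the BE values come from the preview above, while the DE column and its values are made up purely for illustration.

```python
import pandas as pd

# Wide layout: one timestamp column plus one column per series.
wide = pd.DataFrame({
    "ds": pd.to_datetime(["2016-10-22 00:00:00", "2016-10-22 01:00:00"]),
    "BE": [70.00, 37.10],
    "DE": [50.00, 45.00],  # hypothetical second series
})

# melt stacks the series columns into unique_id / y pairs.
long_df = wide.melt(id_vars="ds", var_name="unique_id", value_name="y")
long_df = (
    long_df[["unique_id", "ds", "y"]]
    .sort_values(["unique_id", "ds"])
    .reset_index(drop=True)
)
```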

Exogenous Variables

TimeGPT can use exogenous variables in your forecasts. If you have future values for these variables, provide them in a separate DataFrame.

Exogenous Variables Example

Exogenous Variables Forecast
df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short-with-ex-vars.csv') 
future_ex_vars_df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short-future-ex-vars.csv')

fcst = nixtla_client.forecast(
    df=df,
    X_df=future_ex_vars_df,
    h=24
)

fcst.head()

Refer to the Exogenous Variables tutorial for further details.
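
Before calling forecast, it can help to confirm that the future DataFrame covers the whole horizon for every series. The sketch below assumes that the future frame should hold exactly h rows per unique_id; the column name temperature and all values are hypothetical.

```python
import pandas as pd

h = 2  # forecast horizon used for this toy example

# Hypothetical future values of one exogenous variable for two series.
future = pd.DataFrame({
    "unique_id": ["BE", "BE", "DE", "DE"],
    "ds": pd.to_datetime(["2016-10-23 00:00:00", "2016-10-23 01:00:00"] * 2),
    "temperature": [11.5, 11.0, 9.8, 9.5],
})

# Count the rows available per series and compare with the horizon.
rows_per_series = future.groupby("unique_id").size()
covers_horizon = bool((rows_per_series == h).all())
```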

Important Considerations

Warning: Data passed to TimeGPT must not contain missing values or time gaps.

To handle missing data, see Dealing with Missing Values in TimeGPT.
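
A quick way to spot both problems locally is to compare the observed timestamps with a complete date range and to count null targets. This is a sketch for a single series with a known frequency; find_gaps is a hypothetical helper, and for multiple series it would need to be applied per unique_id.

```python
import pandas as pd

def find_gaps(df, freq, time_col="ds"):
    """Hypothetical helper: expected timestamps that are absent from df[time_col]."""
    observed = pd.DatetimeIndex(pd.to_datetime(df[time_col]))
    expected = pd.date_range(observed.min(), observed.max(), freq=freq)
    return expected.difference(observed)

# Toy monthly series with March missing and one null target value.
df_gap = pd.DataFrame({
    "ds": ["1949-01-01", "1949-02-01", "1949-04-01"],
    "y": [112, 118, None],
})

missing_timestamps = find_gaps(df_gap, freq="MS")
missing_values = int(df_gap["y"].isna().sum())
```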

Minimum Data Requirements (Azure AI)

These are the minimum data sizes required for each frequency when using Azure AI:

When preparing your data, also consider:

• Forecast horizon (h): number of future periods you want to predict.
• Number of validation windows (n_windows): how many times to test the model’s performance.
• Gaps (step_size): periodic offset between validation windows during cross-validation.

Accounting for these parameters ensures you have enough data for both training and evaluation.
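
As a back-of-the-envelope illustration of how these three parameters interact, the number of observations set aside for validation can be estimated with simple arithmetic. The formula assumes each window spans h periods and consecutive windows start step_size periods apart, with step_size falling back to h when it is not given. It is not the official minimum, and the training history needed by the model comes on top of it.

```python
def validation_span(h, n_windows, step_size=None):
    """Observations covered by all validation windows (illustrative formula)."""
    if step_size is None:
        step_size = h  # assumption: non-overlapping windows by default
    # The first window covers h periods; each extra window adds step_size more.
    return h + (n_windows - 1) * step_size

non_overlapping = validation_span(h=12, n_windows=3)            # 12 + 2 * 12
overlapping = validation_span(h=12, n_windows=3, step_size=1)   # 12 + 2 * 1
```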
