API Reference

This is the class and function reference for sktime.

sktime.classification: Time series classification

The sktime.classification module contains algorithms and composition tools for time series classification.

Composition

ColumnEnsembleClassifier(estimators[, …])

Applies estimators to columns of an array or pandas DataFrame.

Dictionary-based

IndividualBOSS([window_size, word_length, …])

Single Bag of SFA Symbols (BOSS) classifier

BOSSEnsemble([threshold, max_ensemble_size, …])

Bag of SFA Symbols (BOSS)

ContractableBOSS([n_parameter_samples, …])

Contractable Bag of SFA Symbols (cBOSS) implementation of BOSS from [1] with refinements described in [2]

WEASEL([anova, bigrams, binning_strategy, …])

Word ExtrAction for time SEries cLassification (WEASEL) from [1].

MUSE([anova, bigrams, window_inc, …])

WEASEL+MUSE (MUltivariate Symbolic Extension): multivariate version of WEASEL, referred to simply as MUSE, from [1].

IndividualTDE([window_size, word_length, …])

Single TDE classifier, based on the Bag of SFA Symbols (BOSS) model.

TemporalDictionaryEnsemble([…])

Temporal Dictionary Ensemble (TDE) as described in [1].

Distance-based

KNeighborsTimeSeriesClassifier([…])

An adapted version of the scikit-learn KNeighborsClassifier to work with time series data.

ElasticEnsemble([distance_measures, …])

The Elastic Ensemble (EE) as described in Jason Lines and Anthony Bagnall, “Time Series Classification with Ensembles of Elastic Distance Measures”, Data Mining and Knowledge Discovery, 29(3), 2015.

ProximityForest([random_state, …])

Proximity Forest class to model a decision tree forest which uses distance measures to partition data, see [1].

ProximityTree([random_state, get_exemplars, …])

Proximity Tree class to model a decision tree which uses distance measures to partition data.

ProximityStump([random_state, …])

Proximity Stump class to model a decision stump which uses a distance measure to partition data.

Hybrid

HIVECOTEV1([stc_params, tsf_params, …])

Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE) V1 as described in [1].

Interval-based

RandomIntervalSpectralForest([n_estimators, …])

Random Interval Spectral Forest (RISE) from [1]

SupervisedTimeSeriesForest([n_estimators, …])

Supervised Time Series Forest (STSF) classifier as described in [1].

Shapelet-based

ShapeletTransformClassifier([…])

Shapelet Transform Classifier

MrSEQLClassifier([seql_mode, symrep, …])

Time Series Classification with multiple symbolic representations and SEQL (Mr-SEQL)

ROCKETClassifier([num_kernels, ensemble, …])

Classifier wrapping the ROCKET transformer, using RidgeClassifierCV as the base classifier.
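
For orientation, a minimal usage sketch of the classification API (the import paths follow the module layout above; the dataset and parameter choices are illustrative assumptions, not recommendations):

    from sklearn.metrics import accuracy_score

    from sktime.classification.dictionary_based import BOSSEnsemble
    from sktime.datasets import load_arrow_head

    # load a univariate classification problem as nested DataFrames
    X_train, y_train = load_arrow_head(split="train", return_X_y=True)
    X_test, y_test = load_arrow_head(split="test", return_X_y=True)

    # small ensemble to keep the example quick; defaults are usually better
    clf = BOSSEnsemble(max_ensemble_size=3)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print(accuracy_score(y_test, y_pred))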

sktime.regression: Time series regression

The sktime.regression module contains algorithms and composition tools for time series regression.

Composition

ComposableTimeSeriesForestRegressor([…])

Time-Series Forest Regressor.

Interval-based

TimeSeriesForestRegressor([min_interval, …])

Time series forest regressor.

sktime.series_as_features: Series-as-features tools

The sktime.series_as_features module contains algorithms and composition tools that are shared by the classification and regression modules.

Composition

FeatureUnion(transformer_list[, n_jobs, …])

Concatenates results of multiple transformer objects. This estimator applies a list of transformer objects in parallel to the input data, then concatenates the results, which is useful for combining several feature extraction mechanisms into a single transformer. Parameters of the transformations may be set using the transformer's name and the parameter name separated by '__'. A transformer may be replaced entirely by setting the parameter with its name to another transformer, or removed by setting it to 'drop' or None. Parameters: transformer_list (list of (string, transformer) tuples), n_jobs (int or None, default=None), transformer_weights (dict, optional: multiplicative weights for features per transformer, keyed by transformer name).
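
A minimal sketch of how FeatureUnion can combine feature extractors listed elsewhere in this reference (the exact import paths and the choice of component transformers are assumptions for illustration):

    import numpy as np

    from sktime.datasets import load_gunpoint
    from sktime.series_as_features.compose import FeatureUnion
    from sktime.transformations.panel.summarize import RandomIntervalFeatureExtractor

    X_train, y_train = load_gunpoint(split="train", return_X_y=True)

    # extract interval means and interval standard deviations side by side
    union = FeatureUnion([
        ("means", RandomIntervalFeatureExtractor(features=[np.mean])),
        ("stds", RandomIntervalFeatureExtractor(features=[np.std])),
    ])
    Xt = union.fit_transform(X_train, y_train)
    print(Xt.shape)  # (n_instances, n_extracted_features)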

Model selection

PresplitFilesCV([cv])

Cross-validation iterator over splits predefined in files.

SingleSplit([test_size, train_size, …])

Helper class for orchestration that uses a single split for training and testing.

sktime.forecasting: Time series forecasting

The sktime.forecasting module contains algorithms and composition tools for forecasting.

Base

ForecastingHorizon([values, is_relative])

Forecasting horizon

Naive

NaiveForecaster([strategy, window_length, sp])

NaiveForecaster is a forecaster that makes forecasts using simple strategies.
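
A minimal forecasting sketch using the classes above together with the model selection and dataset utilities documented later in this reference (parameter choices are illustrative):

    from sktime.datasets import load_airline
    from sktime.forecasting.base import ForecastingHorizon
    from sktime.forecasting.model_selection import temporal_train_test_split
    from sktime.forecasting.naive import NaiveForecaster

    y = load_airline()
    y_train, y_test = temporal_train_test_split(y, test_size=36)

    # absolute forecasting horizon covering the test period
    fh = ForecastingHorizon(y_test.index, is_relative=False)

    # seasonal naive forecast: repeat the last observed seasonal cycle
    forecaster = NaiveForecaster(strategy="last", sp=12)
    forecaster.fit(y_train)
    y_pred = forecaster.predict(fh)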

Trend

PolynomialTrendForecaster([regressor, …])

Forecast time series data with a polynomial trend.

Exponential Smoothing

ExponentialSmoothing([trend, damped_trend, …])

Holt-Winters exponential smoothing forecaster.

AutoETS([error, trend, damped_trend, …])

ETS models with both manual and automatic fitting capabilities.

ARIMA

AutoARIMA([start_p, d, start_q, max_p, …])

Automatically discover the optimal order for an ARIMA model.

ARIMA([order, seasonal_order, start_params, …])

An ARIMA estimator.

Theta

ThetaForecaster([initial_level, …])

Theta method of forecasting.

BATS/TBATS

BATS([use_box_cox, box_cox_bounds, …])

BATS estimator used to fit and select the best performing model.

TBATS([use_box_cox, box_cox_bounds, …])

TBATS estimator used to fit and select the best performing model.

Prophet

Prophet([freq, add_seasonality, …])

Prophet forecaster, wrapping fbprophet. Supports a linear or logistic growth trend with automatic or user-specified changepoints, yearly/weekly/daily and custom seasonalities (additive or multiplicative), country and custom holiday effects, and uncertainty intervals via MAP estimation or full Bayesian inference with MCMC.

Composition

EnsembleForecaster(forecasters[, n_jobs, …])

Ensemble of forecasters

TransformedTargetForecaster(steps)

Meta-estimator for forecasting transformed time series.

DirectRegressionForecaster(regressor[, …])

Forecasting based on reduction to tabular regression with a direct reduction strategy.

DirectTimeSeriesRegressionForecaster(regressor)

Forecasting based on reduction to time series regression with a direct reduction strategy.

MultioutputRegressionForecaster(regressor[, …])

Forecasting based on reduction to tabular regression with a multioutput reduction strategy.

RecursiveRegressionForecaster(regressor[, …])

Forecasting based on reduction to tabular regression with a recursive reduction strategy.

RecursiveTimeSeriesRegressionForecaster(…)

Forecasting based on reduction to time series regression with a recursive reduction strategy.

ReducedForecaster(regressor, scitype[, …])

Forecasting based on reduction

StackingForecaster(forecasters, final_regressor)

Ensemble of forecasters stacked with a final regressor.

MultiplexForecaster(forecasters[, …])

Facilitates tuning over a set of candidate forecasters by exposing the choice of forecaster as a hyper-parameter.
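
A sketch of a composite forecasting pipeline built with TransformedTargetForecaster, chaining series transformers (documented under sktime.transformations below) with a final forecaster (import paths and parameters are illustrative assumptions):

    from sktime.datasets import load_airline
    from sktime.forecasting.compose import TransformedTargetForecaster
    from sktime.forecasting.naive import NaiveForecaster
    from sktime.forecasting.trend import PolynomialTrendForecaster
    from sktime.transformations.series.detrend import Deseasonalizer, Detrender

    y = load_airline()

    # each step transforms the target before the final forecaster is fitted;
    # the transformations are inverted when forecasts are returned
    forecaster = TransformedTargetForecaster(steps=[
        ("deseasonalise", Deseasonalizer(sp=12, model="multiplicative")),
        ("detrend", Detrender(forecaster=PolynomialTrendForecaster(degree=1))),
        ("forecast", NaiveForecaster(strategy="last")),
    ])
    forecaster.fit(y)
    y_pred = forecaster.predict(fh=[1, 2, 3])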

Online Forecasting

OnlineEnsembleForecaster(forecasters[, …])

Online-updating ensemble of forecasters.

NormalHedgeEnsemble([n_estimators, a, loss_func])

Hedge-style ensemble implementing the parameter-free hedging algorithm of Kamalika Chaudhuri, Yoav Freund and Daniel Hsu (2009).

NNLSEnsemble([n_estimators, loss_func])

Ensemble that fits weights for its component forecasters using non-negative least squares.

Model Selection

CutoffSplitter(cutoffs[, fh, window_length])

Cutoff window splitter.

SingleWindowSplitter(fh[, window_length])

Single window splitter.

SlidingWindowSplitter([fh, window_length, …])

Sliding window splitter.

ExpandingWindowSplitter([fh, …])

Expanding window splitter.

ForecastingGridSearchCV(forecaster, cv, …)

Performs grid-search cross-validation to find optimal model parameters.

ForecastingRandomizedSearchCV(forecaster, …)

Performs randomized-search cross-validation to find optimal model parameters.

temporal_train_test_split(y[, X, test_size, …])

Split arrays or matrices into sequential train and test subsets. Creates train/test splits over the endogenous array and optional exogenous arrays.
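
A sketch of tuning a forecaster with ForecastingGridSearchCV and a sliding-window splitter (the forecaster, grid and window sizes are arbitrary assumptions):

    from sktime.datasets import load_airline
    from sktime.forecasting.model_selection import (
        ForecastingGridSearchCV,
        SlidingWindowSplitter,
        temporal_train_test_split,
    )
    from sktime.forecasting.naive import NaiveForecaster

    y = load_airline()
    y_train, y_test = temporal_train_test_split(y, test_size=24)

    # evaluate each candidate on rolling windows of the training series
    cv = SlidingWindowSplitter(fh=[1, 2, 3], window_length=60)
    gscv = ForecastingGridSearchCV(
        forecaster=NaiveForecaster(sp=12),
        cv=cv,
        param_grid={"strategy": ["last", "mean"]},
    )
    gscv.fit(y_train)
    print(gscv.best_params_)
    y_pred = gscv.predict(fh=[1, 2, 3])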

Model Evaluation

evaluate(forecaster, cv, y[, X, strategy, …])

Evaluate forecaster using cross-validation
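
A sketch of backtesting a forecaster with evaluate() and an expanding-window splitter (the splitter parameters are illustrative assumptions):

    from sktime.datasets import load_airline
    from sktime.forecasting.model_evaluation import evaluate
    from sktime.forecasting.model_selection import ExpandingWindowSplitter
    from sktime.forecasting.naive import NaiveForecaster

    y = load_airline()
    cv = ExpandingWindowSplitter(initial_window=36, fh=[1, 2, 3])

    # returns one row per split with fit/predict times and the test score
    results = evaluate(forecaster=NaiveForecaster(sp=12), cv=cv, y=y)
    print(results.head())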

sktime.transformations: Time series transformers

The sktime.transformations module contains classes for data transformations.

Panel transformers

Dictionary-based

PAA([num_intervals])

Piecewise Aggregate Approximation (PAA) transformer, as described by Eamonn Keogh, Kaushik Chakrabarti, Michael Pazzani, and Sharad Mehrotra.

SFA([word_length, alphabet_size, …])

SFA (Symbolic Fourier Approximation) transformer.

SAX([word_length, alphabet_size, …])

SAX (Symbolic Aggregate approXimation) transformer, as described in Jessica Lin, Eamonn Keogh, Li Wei and Stefano Lonardi, “Experiencing SAX: a novel symbolic representation of time series”, Data Mining and Knowledge Discovery, 15(2):107-144. Overview: for each series, run a sliding window across the series; for each window, shorten the series with PAA (Piecewise Aggregate Approximation), discretise the shortened series into fixed bins, and form a word from these discrete values. By default SAX produces a single word per series (window_size=0). SAX returns a pandas DataFrame where column 0 is the histogram (sparse pd.Series) of each series.

Summarize

DerivativeSlopeTransformer()

PlateauFinder([value, min_length])

Transformer that finds segments of a given value (plateaus) in the time series and returns their starting indices and lengths.

RandomIntervalFeatureExtractor([…])

Transformer that segments time-series into random intervals and subsequently extracts series-to-primitives features from each interval.

FittedParamExtractor(forecaster, param_names)

Extract parameters of a fitted forecaster as features for a subsequent tabular learning task.

tsfresh

TSFreshRelevantFeatureExtractor([…])

Transformer for extracting and selecting features.

TSFreshFeatureExtractor([…])

Transformer for extracting time series features

Catch22

Catch22()

Canonical Time-series Characteristics (catch22)

Compose

ColumnTransformer(transformers[, remainder, …])

Applies transformations to columns of an array or pandas DataFrame.

ColumnConcatenator()

Transformer that concatenates multivariate time series/panel data into long univariate time series/panel data by simply concatenating the time series in time.

SeriesToSeriesRowTransformer(transformer[, …])

Applies a series-to-series transformer to each row (instance) of the input data.

SeriesToPrimitivesRowTransformer(transformer)

Applies a series-to-primitives transformer to each row (instance) of the input data.

make_row_transformer(transformer[, …])

Factory function for creating a row (instance) transformer based on the transform type.

Matrix profile

MatrixProfile([m])

Takes as input a time series dataset and returns the matrix profile and index profile for each time series of the dataset.

PCA

PCATransformer([n_components])

Transformer that applies Principal Component Analysis to a univariate time series.

Reduce

Tabularizer()

A transformer that turns time series/panel data into tabular data.

Rocket

Rocket([num_kernels, normalise, random_state])

ROCKET

MiniRocket([num_features, …])

MINIROCKET

MiniRocketMultivariate([num_features, …])

MINIROCKET (Multivariate)
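
A sketch of the usual ROCKET workflow: transform each series into random convolutional features, then fit a linear classifier on the transformed data (the dataset and number of kernels are illustrative assumptions):

    import numpy as np
    from sklearn.linear_model import RidgeClassifierCV

    from sktime.datasets import load_arrow_head
    from sktime.transformations.panel.rocket import Rocket

    X_train, y_train = load_arrow_head(split="train", return_X_y=True)
    X_test, y_test = load_arrow_head(split="test", return_X_y=True)

    # each series is convolved with random kernels; two features per kernel
    rocket = Rocket(num_kernels=1000, random_state=0)
    X_train_t = rocket.fit_transform(X_train)
    X_test_t = rocket.transform(X_test)

    clf = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10))
    clf.fit(X_train_t, y_train)
    print(clf.score(X_test_t, y_test))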

Segment

IntervalSegmenter([intervals])

Interval segmentation transformer.

RandomIntervalSegmenter([n_intervals, …])

Transformer that segments time-series into random intervals with random starting points and lengths.

Shapelet

ShapeletTransform([min_shapelet_length, …])

Shapelet Transform.

ContractedShapeletTransform([…])

Contracted Shapelet Transform, as described in Aaron Bostrom and Anthony Bagnall, “Binary shapelet transform for multiclass time series classification”, Transactions on Large-Scale Data- and Knowledge-Centered Systems XXXII, pages 24-46, Springer, 2017.

Series transformers

Detrend

Detrender(forecaster)

Remove a trend from a series.

Deseasonalizer([sp, model])

A transformer that removes seasonal components from time series

ConditionalDeseasonalizer([…])

A transformer that removes seasonal components from time series, conditional on seasonality test.
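
A sketch of applying the detrending transformers to a single series (import paths and parameters are illustrative assumptions):

    from sktime.datasets import load_airline
    from sktime.forecasting.trend import PolynomialTrendForecaster
    from sktime.transformations.series.detrend import (
        ConditionalDeseasonalizer,
        Detrender,
    )

    y = load_airline()

    # remove the seasonal component only if a seasonality test detects one
    deseasonaliser = ConditionalDeseasonalizer(sp=12)
    y_deseasonalised = deseasonaliser.fit_transform(y)

    # remove a linear trend estimated by a polynomial trend forecaster
    detrender = Detrender(forecaster=PolynomialTrendForecaster(degree=1))
    y_residuals = detrender.fit_transform(y_deseasonalised)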

Adapt

TabularToSeriesAdaptor(transformer)

Adaptor for applying scikit-learn-like tabular transformations in the series setting.

Box-cox

BoxCoxTransformer([bounds, method])

Box-Cox power transformation of a time series.

LogTransformer()

Natural logarithm transformation of a time series.

Auto-correlation

AutoCorrelationTransformer([adjusted, …])

Auto-correlation transformer.

PartialAutoCorrelationTransformer([n_lags, …])

Partial auto-correlation transformer.

Cosine

CosineTransformer()

Cosine transformation of a time series.

Matrix Profile

MatrixProfileTransformer([window_length])

Computes the matrix profile of a univariate time series.

Imputer

Imputer(method[, random_state, value, …])

Missing value imputation

HampelFilter

HampelFilter(window_length[, n_sigma, k, …])

HampelFilter to detect outliers based on a sliding window.

OptionalPassthrough

OptionalPassthrough(transformer[, passthrough])

A transformer for tuning, as a hyper-parameter, whether or not a wrapped transformer is applied inside a pipeline.

sktime.datasets: Datasets

load_airline()

Load the airline univariate time series dataset [1].

load_arrow_head([split, return_X_y])

Loads the ArrowHead time series classification problem and returns X and y.

load_gunpoint([split, return_X_y])

Loads the GunPoint time series classification problem and returns X and y. Parameters: split (None or str {“train”, “test”}, default=None): whether to load the train or test partition; by default both are loaded. return_X_y (bool, default=False): if True, returns (features, target) separately instead of a single dataframe with columns for features and the target.

load_osuleaf([split, return_X_y])

Loads the OSULeaf time series classification problem and returns X and y

load_italy_power_demand([split, return_X_y])

Loads the ItalyPowerDemand time series classification problem and returns X and y

load_basic_motions([split, return_X_y])

Loads the BasicMotions time series classification problem and returns X and y.

load_japanese_vowels([split, return_X_y])

Loads the JapaneseVowels time series classification problem and returns X and y.

load_shampoo_sales()

Load the shampoo sales univariate time series dataset for forecasting.

load_longley([y_name])

Load the Longley multivariate time series dataset for forecasting with exogenous variables.

load_lynx()

Load the lynx univariate time series dataset for forecasting.

load_acsf1([split, return_X_y])

Loads the power consumption of typical appliances time series classification problem and returns X and y.

load_uschange([y_name])

Load the multivariate time series dataset of growth rates of personal consumption and personal income, for forecasting.

load_UCR_UEA_dataset(name[, split, …])

Load dataset from UCR UEA time series classification repository.
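
A brief sketch of the two kinds of loaders above: forecasting datasets return a single series, classification datasets return a panel of series and a target vector:

    from sktime.datasets import load_airline, load_basic_motions

    # univariate forecasting dataset: a single pandas Series
    y = load_airline()

    # multivariate classification dataset: nested DataFrame X and target y
    X_train, y_train = load_basic_motions(split="train", return_X_y=True)
    print(y.shape, X_train.shape)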

sktime.utils: Utility function

The sktime.utils module contains utility functions.

Plotting

plot_series(*series[, labels, markers])

Plot one or more time series
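
A minimal plotting sketch (the import path follows the sktime.utils module above; the split is illustrative):

    from sktime.datasets import load_airline
    from sktime.forecasting.model_selection import temporal_train_test_split
    from sktime.utils.plotting import plot_series

    y = load_airline()
    y_train, y_test = temporal_train_test_split(y, test_size=36)

    # plot both series on shared axes with a legend
    fig, ax = plot_series(y_train, y_test, labels=["y_train", "y_test"])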

Data Processing

are_columns_nested(X)

Checks whether any cells have nested structure in each DataFrame column.

is_nested_dataframe(X)

Checks whether the input is a nested DataFrame.

from_nested_to_2d_array(X[, return_numpy])

Convert nested pandas DataFrame or Series with NumPy arrays or pandas Series in cells into tabular pandas DataFrame with primitives in cells, i.e. a data frame with the same number of rows as the input data and as many columns as there are observations in the nested series.

from_2d_array_to_nested(X[, index, columns, …])

Convert tabular pandas DataFrame with only primitives in cells into nested pandas DataFrame with a single column.

from_3d_numpy_to_2d_array(X)

Converts 3d NumPy array (n_instances, n_columns, n_timepoints) to a 2d NumPy array with shape (n_instances, n_columns*n_timepoints)

from_3d_numpy_to_nested(X[, column_names, …])

Convert NumPy ndarray with shape (n_instances, n_columns, n_timepoints) into nested pandas DataFrame (with time series as pandas Series in cells)

from_nested_to_3d_numpy(X)

Convert nested pandas DataFrame (with time series as pandas Series in cells) into NumPy ndarray with shape (n_instances, n_columns, n_timepoints).

from_multi_index_to_3d_numpy(X[, …])

Convert panel data stored as a pandas multi-indexed DataFrame to a 3-dimensional NumPy array (n_instances, n_columns, n_timepoints).

from_3d_numpy_to_multi_index(X[, …])

Convert 3-dimensional NumPy array (n_instances, n_columns, n_timepoints) to panel data stored as pandas multi-indexed DataFrame.

from_multi_index_to_nested(multi_ind_dataframe)

Converts a pandas DataFrame with a multi-index into a nested DataFrame.

from_nested_to_multi_index(X[, …])

Converts nested pandas DataFrame (with time series as pandas Series or NumPy array in cells) into multi-indexed pandas DataFrame.

from_nested_to_long(X[, …])

Convert nested DataFrame to long DataFrame.

from_long_to_nested(X_long[, …])

Convert long DataFrame to a nested DataFrame.
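
A sketch of converting between the tabular and nested data representations handled by the functions above (the module path sktime.utils.data_processing is an assumption based on this reference's layout):

    import numpy as np

    from sktime.utils.data_processing import (
        from_2d_array_to_nested,
        from_nested_to_2d_array,
    )

    # 10 instances, each a univariate series of length 50
    X_2d = np.random.default_rng(0).normal(size=(10, 50))

    # wrap each row as a pandas Series inside a single nested column
    X_nested = from_2d_array_to_nested(X_2d)
    print(X_nested.shape)  # (10, 1)

    # flatten back to a tabular frame with one column per time point
    X_flat = from_nested_to_2d_array(X_nested)
    print(X_flat.shape)  # (10, 50)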