TimeSeriesLloyds

class TimeSeriesLloyds(n_clusters: int = 8, init_algorithm: Union[str, Callable] = 'random', metric: Union[str, Callable] = 'euclidean', n_init: int = 10, max_iter: int = 300, tol: float = 1e-06, verbose: bool = False, random_state: Optional[Union[int, numpy.random.mtrand.RandomState]] = None, distance_params: Optional[dict] = None)

Abstract class that implements Lloyd's algorithm for time series.

Parameters
n_clusters: int, defaults = 8

The number of clusters to form as well as the number of centroids to generate.

init_algorithm: str or Callable, defaults = ‘random’

Method for initializing cluster centers. Any of the following are valid: [‘kmeans++’, ‘random’, ‘forgy’]

metric: str or Callable, defaults = ‘euclidean’

Distance metric to compute similarity between time series. Any of the following are valid: [‘dtw’, ‘euclidean’, ‘erp’, ‘edr’, ‘lcss’, ‘squared’, ‘ddtw’, ‘wdtw’, ‘wddtw’]

n_init: int, defaults = 10

Number of times the k-means algorithm will be run with different centroid seeds. The final result will be the best output of n_init consecutive runs in terms of inertia.

max_iter: int, defaults = 300

Maximum number of iterations of the k-means algorithm for a single run.

tol: float, defaults = 1e-6

Relative tolerance with regard to the Frobenius norm of the difference in the cluster centers of two consecutive iterations, used to declare convergence.

verbose: bool, defaults = False

Verbosity mode.

random_state: int or np.random.RandomState instance or None, defaults = None

Determines random number generation for centroid initialization.

distance_params: dict, defaults = None

Dictionary containing kwargs for the distance metric being used.
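
The n_init description above (several runs, keep the one with the lowest inertia) can be sketched as follows. This is a minimal illustration in plain Python, not the library's implementation: best_of_n_init, run_once and fake_run are hypothetical names, and the single run is replaced by a stand-in that returns a made-up inertia.

```python
import random
from typing import Callable, List, Tuple

Result = Tuple[float, List[int]]  # (inertia, labels)


def best_of_n_init(
    run_once: Callable[[random.Random], Result],
    n_init: int = 10,
    random_state: int = 0,
) -> Result:
    """Call run_once n_init times and keep the lowest-inertia result."""
    if n_init < 1:
        raise ValueError("n_init must be at least 1")
    rng = random.Random(random_state)  # one generator shared by all runs
    best = run_once(rng)
    for _ in range(n_init - 1):
        candidate = run_once(rng)
        if candidate[0] < best[0]:  # compare inertia only
            best = candidate
    return best


def fake_run(rng: random.Random) -> Result:
    """Hypothetical stand-in for one clustering run (assumption)."""
    return rng.uniform(1.0, 10.0), [0, 1, 0]


best_inertia, best_labels = best_of_n_init(fake_run, n_init=5, random_state=42)
```

Passing the same random_state reproduces the same sequence of runs, which is the usual reason for exposing that parameter.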

Attributes
cluster_centers_: np.ndarray (3d array of shape (n_clusters, n_dimensions, series_length))

Time series that represent each of the cluster centers. If the algorithm stops before fully converging these will not be consistent with labels_.

labels_: np.ndarray (1d array of shape (n_instances,))

Index of the cluster each time series belongs to.

inertia_: float

Sum of squared distances of samples to their closest cluster center, weighted by the sample weights if provided.

n_iter_: int

Number of iterations run.
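
To make the parameters and attributes above concrete, here is a minimal sketch of Lloyd's algorithm on univariate series, written in plain Python. It rests on several assumptions that are not taken from the library: squared Euclidean distance, mean averaging for the centers, initial centers drawn at random from the data, and an absolute (not relative) Frobenius-norm test against tol. The function name lloyds is hypothetical.

```python
import math
import random
from typing import List, Tuple

Series = List[float]


def squared_euclidean(a: Series, b: Series) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyds(
    X: List[Series],
    n_clusters: int,
    max_iter: int = 300,
    tol: float = 1e-6,
    random_state: int = 0,
) -> Tuple[List[Series], List[int], float, int]:
    """Return (cluster_centers, labels, inertia, n_iter)."""
    rng = random.Random(random_state)
    # Assumption: initial centers are distinct instances sampled from X.
    centers = [list(s) for s in rng.sample(X, n_clusters)]
    labels = [0] * len(X)
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        # Assignment step: each series goes to its closest center.
        labels = [
            min(range(n_clusters), key=lambda k: squared_euclidean(s, centers[k]))
            for s in X
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for k in range(n_clusters):
            members = [s for s, lab in zip(X, labels) if lab == k]
            if members:
                new_centers.append([sum(v) / len(members) for v in zip(*members)])
            else:
                new_centers.append(centers[k])  # keep the center of an empty cluster
        # Frobenius norm of the change in centers (absolute, a simplification).
        shift = math.sqrt(
            sum(squared_euclidean(c, n) for c, n in zip(centers, new_centers))
        )
        centers = new_centers
        if shift <= tol:
            break
    inertia = sum(squared_euclidean(s, centers[lab]) for s, lab in zip(X, labels))
    return centers, labels, inertia, n_iter


X = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.1], [5.0, 5.0, 5.0], [5.1, 4.9, 5.0]]
centers, labels, inertia, n_iter = lloyds(X, n_clusters=2)
```

In this sketch the labels are computed before the final center update, so if the loop ends at max_iter rather than by convergence, the centers and labels can disagree, which mirrors the caveat stated for cluster_centers_.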

Methods

check_is_fitted()

Check if the estimator has been fitted.

clone()

Obtain a clone of the object with same hyper-parameters.

clone_tags(estimator[, tag_names])

Clone/mirror tags from another estimator as a dynamic override.

create_test_instance([parameter_set])

Construct Estimator instance if possible.

create_test_instances_and_names([parameter_set])

Create list of all test instances and a list of names for them.

fit(X[, y])

Fit time series clusterer to training data.

fit_predict(X[, y])

Compute cluster centers and predict cluster index for each time series.

get_class_tag(tag_name[, tag_value_default])

Get tag value from estimator class (only class tags).

get_class_tags()

Get class tags from estimator class and all its parent classes.

get_fitted_params()

Get fitted parameters.

get_param_defaults()

Get parameter defaults for the object.

get_param_names()

Get parameter names for the object.

get_params([deep])

Get parameters for this estimator.

get_tag(tag_name[, tag_value_default, …])

Get tag value from estimator class and dynamic tag overrides.

get_tags()

Get tags from estimator class and dynamic tag overrides.

get_test_params([parameter_set])

Return testing parameter settings for the estimator.

is_composite()

Check if the object is composite.

predict(X[, y])

Predict the closest cluster each sample in X belongs to.

predict_proba(X)

Predict label probabilities for sequences in X.

reset()

Reset the object to a clean post-init state.

score(X[, y])

Score the quality of the clusterer.

set_params(**params)

Set the parameters of this object.

set_tags(**tag_dict)

Set dynamic tags to given values.

check_is_fitted()

Check if the estimator has been fitted.

Raises
NotFittedError

If the estimator has not been fitted yet.

clone()

Obtain a clone of the object with same hyper-parameters.

A clone is a different object without shared references, in post-init state. This function is equivalent to returning sklearn.clone of self. Equal in value to type(self)(**self.get_params(deep=False)).

Returns
instance of type(self), clone of self (see above)
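
The equivalence stated above, type(self)(**self.get_params(deep=False)), can be illustrated with a small stand-alone class. TinyEstimator is a hypothetical example class, not part of the library, and its get_params is a simplified assumption.

```python
class TinyEstimator:
    """Hypothetical estimator used only to illustrate clone semantics."""

    def __init__(self, n_clusters=8, max_iter=300):
        self.n_clusters = n_clusters
        self.max_iter = max_iter

    def get_params(self, deep=True):
        # Simplified: only the __init__ arguments, no nested components.
        return {"n_clusters": self.n_clusters, "max_iter": self.max_iter}

    def clone(self):
        # A new object in post-init state with the same hyper-parameters.
        return type(self)(**self.get_params(deep=False))


original = TinyEstimator(n_clusters=3)
original.labels_ = [0, 1, 2]  # pretend the estimator has been fitted
cloned = original.clone()
```

The clone carries the hyper-parameters but none of the fitted state, so cloned has no labels_ attribute.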

clone_tags(estimator, tag_names=None)

Clone/mirror tags from another estimator as a dynamic override.

Parameters
estimator: estimator inheriting from BaseEstimator
tag_names: str or list of str, default = None

Names of tags to clone. If None then all tags in estimator are used as tag_names.

Returns
Self

Reference to self.

Notes

Changes object state by setting tag values in tag_names from estimator as dynamic tags in self.

classmethod create_test_instance(parameter_set='default')

Construct Estimator instance if possible.

Parameters
parameter_set: str, default = "default"

Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return “default” set.

Returns
instance: instance of the class with default parameters

Notes

get_test_params can return dict or list of dict. This function takes first or single dict that get_test_params returns, and constructs the object with that.

classmethod create_test_instances_and_names(parameter_set='default')

Create list of all test instances and a list of names for them.

Parameters
parameter_set: str, default = "default"

Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return “default” set.

Returns
objs: list of instances of cls

The i-th instance is cls(**cls.get_test_params()[i]).

names: list of str, same length as objs

The i-th element is the name of the i-th instance of objs in tests. The convention is {cls.__name__}-{i} if there is more than one instance, otherwise {cls.__name__}.
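
The naming convention described above can be sketched as a small helper. instance_names is a hypothetical function written for illustration, and the 0-based index is an assumption inferred from the get_test_params()[i] indexing.

```python
from typing import List


def instance_names(cls_name: str, n_instances: int) -> List[str]:
    """Build test names: plain class name for one instance, name-i otherwise."""
    if n_instances == 1:
        return [cls_name]
    return [f"{cls_name}-{i}" for i in range(n_instances)]


single = instance_names("TimeSeriesLloyds", 1)
several = instance_names("TimeSeriesLloyds", 3)
```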

fit(X: Union[pandas.core.frame.DataFrame, numpy.ndarray], y=None) -> sktime.base._base.BaseEstimator

Fit time series clusterer to training data.

Parameters
X: np.ndarray (2d or 3d array of shape (n_instances, series_length) or shape (n_instances, n_dimensions, series_length)) or pd.DataFrame (where each column is a dimension, each cell is a pd.Series (any number of dimensions, equal or unequal length series))

Training time series instances to cluster. Converted to type _tags[“X_inner_mtype”].

y: ignored, exists for API consistency reasons.
Returns
self:

Fitted estimator.
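
The two accepted array shapes differ only by the dimension axis. As a rough sketch, and assuming that a 2d input of shape (n_instances, series_length) is treated as univariate with n_dimensions = 1 (an assumption, not confirmed by the text above), the reshaping could look like this with nested lists. to_3d is a hypothetical helper.

```python
from typing import List


def to_3d(X: List[List[float]]) -> List[List[List[float]]]:
    """Wrap each univariate series in a dimension axis of length 1."""
    return [[list(series)] for series in X]


X_2d = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # (n_instances=2, series_length=3)
X_3d = to_3d(X_2d)
shape = (len(X_3d), len(X_3d[0]), len(X_3d[0][0]))
```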

fit_predict(X: Union[pandas.core.frame.DataFrame, numpy.ndarray], y=None) -> numpy.ndarray

Compute cluster centers and predict cluster index for each time series.

Convenience method; equivalent of calling fit(X) followed by predict(X)

Parameters
X: np.ndarray (2d or 3d array of shape (n_instances, series_length) or shape (n_instances, n_dimensions, series_length)) or pd.DataFrame (where each column is a dimension, each cell is a pd.Series (any number of dimensions, equal or unequal length series))

Time series instances to train the clusterer on and then assign to clusters.

y: ignored, exists for API consistency reasons.
Returns
np.ndarray (1d array of shape (n_instances,))

Index of the cluster each time series in X belongs to.

classmethod get_class_tag(tag_name, tag_value_default=None)

Get tag value from estimator class (only class tags).

Parameters
tag_name: str

Name of tag value.

tag_value_default: any type

Default/fallback value if tag is not found.

Returns
tag_value

Value of the tag_name tag in self. If not found, returns tag_value_default.

classmethod get_class_tags()

Get class tags from estimator class and all its parent classes.

Returns
collected_tags: dict

Dictionary of tag name : tag value pairs. Collected from _tags class attribute via nested inheritance. NOT overridden by dynamic tags set by set_tags or clone_tags.

get_fitted_params()

Get fitted parameters.

State required:

Requires state to be “fitted”.

Returns
fitted_params: dict of fitted parameters, keys are str names of parameters

Parameters of components are indexed as [componentname]__[paramname].
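
The [componentname]__[paramname] indexing can be illustrated by flattening a nested dictionary. This is only a sketch of the key convention: flatten_fitted_params and the example keys (scaler, mean_) are hypothetical.

```python
from typing import Any, Dict


def flatten_fitted_params(
    own: Dict[str, Any], components: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Merge an estimator's own fitted params with those of its components."""
    flat = dict(own)
    for component_name, component_params in components.items():
        for param_name, value in component_params.items():
            # Double underscore separates component name from parameter name.
            flat[f"{component_name}__{param_name}"] = value
    return flat


fitted = flatten_fitted_params(
    {"n_iter_": 4},
    {"scaler": {"mean_": 0.5}},  # hypothetical component
)
```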

classmethod get_param_defaults()

Get parameter defaults for the object.

Returns
default_dict: dict with str keys

Keys are all parameters of cls that have a default defined in __init__. Values are the defaults, as defined in __init__.

classmethod get_param_names()

Get parameter names for the object.

Returns
param_names: list of str, alphabetically sorted list of parameter names of cls

get_params(deep=True)

Get parameters for this estimator.

Parameters
deep: bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params: dict

Parameter names mapped to their values.

get_tag(tag_name, tag_value_default=None, raise_error=True)

Get tag value from estimator class and dynamic tag overrides.

Parameters
tag_name: str

Name of tag to be retrieved.

tag_value_default: any type, optional; default=None

Default/fallback value if tag is not found.

raise_error: bool

Whether a ValueError is raised when the tag is not found.

Returns
tag_value

Value of the tag_name tag in self. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.

Raises
ValueError if raise_error is True and tag_name is not in self.get_tags().keys()

get_tags()

Get tags from estimator class and dynamic tag overrides.

Returns
collected_tags: dict

Dictionary of tag name : tag value pairs. Collected from _tags class attribute via nested inheritance and then any overrides and new tags from _tags_dynamic object attribute.
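
The lookup order described above (class-level _tags collected through inheritance, then overridden by _tags_dynamic on the instance) can be sketched with two small classes. Base and Child are hypothetical, the tag name "multivariate" is made up for illustration, and the method bodies are assumptions about the behaviour rather than the library's code.

```python
class Base:
    _tags = {"multivariate": False, "X_inner_mtype": "numpy3D"}

    def __init__(self):
        self._tags_dynamic = {}

    @classmethod
    def get_class_tags(cls):
        collected = {}
        # Walk from the most basic class to cls so subclasses override parents.
        for parent in reversed(cls.__mro__):
            collected.update(vars(parent).get("_tags", {}))
        return collected

    def set_tags(self, **tag_dict):
        self._tags_dynamic.update(tag_dict)
        return self

    def get_tags(self):
        tags = self.get_class_tags()
        tags.update(self._tags_dynamic)  # dynamic tags win over class tags
        return tags

    def get_tag(self, tag_name, tag_value_default=None, raise_error=True):
        tags = self.get_tags()
        if tag_name in tags:
            return tags[tag_name]
        if raise_error:
            raise ValueError(f"Tag {tag_name!r} not found")
        return tag_value_default


class Child(Base):
    _tags = {"multivariate": True}  # overrides the parent's class tag


est = Child().set_tags(X_inner_mtype="nested_univ")
class_level = Child.get_class_tags()
instance_level = est.get_tags()
```

Here class_level still reports "numpy3D" for X_inner_mtype, while instance_level reports the dynamic override "nested_univ".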

classmethod get_test_params(parameter_set='default')

Return testing parameter settings for the estimator.

Parameters
parameter_set: str, default = "default"

Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, will return “default” set.

Returns
params: dict or list of dict, default = {}

Parameters to create testing instances of the class. Each dict contains parameters to construct an “interesting” test instance, i.e., MyClass(**params) or MyClass(**params[i]) creates a valid test instance. create_test_instance uses the first (or only) dictionary in params.
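
The relationship between get_test_params and create_test_instance can be sketched as below. TinyClusterer and its parameter sets are hypothetical, and the handling of the dict-or-list return value is a simplified assumption.

```python
class TinyClusterer:
    """Hypothetical class used only to illustrate the test-params convention."""

    def __init__(self, n_clusters=8, max_iter=300):
        self.n_clusters = n_clusters
        self.max_iter = max_iter

    @classmethod
    def get_test_params(cls, parameter_set="default"):
        # May return a single dict or a list of dicts.
        return [{"n_clusters": 2}, {"n_clusters": 3, "max_iter": 5}]

    @classmethod
    def create_test_instance(cls, parameter_set="default"):
        params = cls.get_test_params(parameter_set)
        if isinstance(params, list):
            params = params[0]  # first dict of the list
        return cls(**params)


instance = TinyClusterer.create_test_instance()
```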

is_composite()

Check if the object is composite.

A composite object is an object which contains objects, as parameters. Called on an instance, since this may differ by instance.

Returns
composite: bool, whether self contains a parameter which is BaseObject

property is_fitted

Whether fit has been called.

predict(X: Union[pandas.core.frame.DataFrame, numpy.ndarray], y=None) -> numpy.ndarray

Predict the closest cluster each sample in X belongs to.

Parameters
X: np.ndarray (2d or 3d array of shape (n_instances, series_length) or shape (n_instances, n_dimensions, series_length)) or pd.DataFrame (where each column is a dimension, each cell is a pd.Series (any number of dimensions, equal or unequal length series))

Time series instances whose cluster indexes are to be predicted.

y: ignored, exists for API consistency reasons.
Returns
np.ndarray (1d array of shape (n_instances,))

Index of the cluster each time series in X belongs to.
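
Predicting the closest cluster amounts to a nearest-center lookup. The sketch below assumes univariate series and squared Euclidean distance, neither of which is prescribed by the text above. predict_labels is a hypothetical helper.

```python
from typing import List

Series = List[float]


def predict_labels(X: List[Series], cluster_centers: List[Series]) -> List[int]:
    """Return, for each series, the index of the nearest cluster center."""

    def sq_dist(a: Series, b: Series) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(cluster_centers)), key=lambda k: sq_dist(s, cluster_centers[k]))
        for s in X
    ]


centers = [[0.0, 0.0], [10.0, 10.0]]
predicted = predict_labels([[1.0, 1.0], [9.0, 8.0]], centers)
```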

predict_proba(X)

Predict label probabilities for sequences in X.

Default behaviour is to call _predict and set the predicted class probability to 1, other class probabilities to 0. Override if better estimates are obtainable.

Parameters
X: guaranteed to be of a type in self.get_tag(“X_inner_mtype”)

If self.get_tag(“X_inner_mtype”) = “numpy3D”: 3D np.ndarray of shape = [n_instances, n_dimensions, series_length].

If self.get_tag(“X_inner_mtype”) = “nested_univ”: pd.DataFrame with each column a dimension, each cell a pd.Series.

For a list of other mtypes, see datatypes.SCITYPE_REGISTER. For specifications, see examples/AA_datatypes_and_datasets.ipynb.

Returns
y: 2D array of shape [n_instances, n_classes], predicted class probabilities

The 1st dimension indices correspond to instance indices in X. The 2nd dimension indices correspond to possible labels (integers). The (i, j)-th entry is the predictive probability that the i-th instance is of class j.
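
The default behaviour described above (probability 1 for the predicted cluster, 0 for all others) is a one-hot encoding of the predicted labels. A minimal sketch, with one_hot_proba as a hypothetical helper name:

```python
from typing import List


def one_hot_proba(labels: List[int], n_clusters: int) -> List[List[float]]:
    """Row i has 1.0 in column labels[i] and 0.0 elsewhere."""
    return [
        [1.0 if j == label else 0.0 for j in range(n_clusters)]
        for label in labels
    ]


proba = one_hot_proba([0, 2], n_clusters=3)
```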

reset()

Reset the object to a clean post-init state.

Equivalent to sklearn.clone but overwrites self. After self.reset() call, self is equal in value to type(self)(**self.get_params(deep=False))

Detailed behaviour:

Removes any object attributes, except hyper-parameters (arguments of __init__) and object attributes containing double-underscores, i.e., the string “__”.

Runs __init__ with current values of hyper-parameters (result of get_params).

Not affected by the reset are object attributes containing double-underscores, class and object methods, and class attributes.
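
The reset behaviour can be sketched on a stand-alone class. TinyEstimator is hypothetical and its get_params is simplified; the sketch follows the steps as described above and is not the library's code.

```python
class TinyEstimator:
    """Hypothetical estimator used only to illustrate reset semantics."""

    def __init__(self, n_clusters=8):
        self.n_clusters = n_clusters

    def get_params(self, deep=True):
        return {"n_clusters": self.n_clusters}

    def reset(self):
        params = self.get_params(deep=False)
        for name in list(vars(self)):
            # Keep hyper-parameters and attributes containing "__".
            if name not in params and "__" not in name:
                delattr(self, name)
        # Re-run __init__ with the current hyper-parameter values.
        self.__init__(**params)


est = TinyEstimator(n_clusters=3)
est.labels_ = [0, 1, 2]    # fitted state, removed by reset
est.keep__me = "kept"      # contains "__", so it survives
est.reset()
```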

score(X, y=None) -> float

Score the quality of the clusterer.

Parameters
X: np.ndarray (2d or 3d array of shape (n_instances, series_length) or shape (n_instances, n_dimensions, series_length)) or pd.DataFrame (where each column is a dimension, each cell is a pd.Series (any number of dimensions, equal or unequal length series))

Time series instances to score the clusterer on.

y: ignored, exists for API consistency reasons.
Returns
score: float

Score of the clusterer.

set_params(**params)

Set the parameters of this object.

The method works on simple estimators as well as on nested objects. The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters
**params: dict

BaseObject parameters.

Returns
self: reference to self (after parameters have been set)
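
The <component>__<parameter> form can be sketched with a generic set_params that forwards the part after the first double underscore to the named component. Node is a hypothetical class, and the sketch omits the validation of unknown parameter names that a real implementation would need.

```python
class Node:
    """Hypothetical object with attribute-backed parameters."""

    def __init__(self, **params):
        for name, value in params.items():
            setattr(self, name, value)

    def set_params(self, **params):
        for key, value in params.items():
            if "__" in key:
                # Split once: component name, then the component's own key.
                component, sub_key = key.split("__", 1)
                getattr(self, component).set_params(**{sub_key: value})
            else:
                setattr(self, key, value)
        return self  # allows chaining


inner = Node(window=1)
outer = Node(n_clusters=8, inner=inner)
returned = outer.set_params(n_clusters=3, inner__window=5)
```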

set_tags(**tag_dict)

Set dynamic tags to given values.

Parameters
**tag_dict: dict

Dictionary of tag name : tag value pairs.

Returns
Self

Reference to self.

Notes

Changes object state by setting tag values in tag_dict as dynamic tags in self.