TimeSeriesDataset

This class extends a Python list and represents multiple time series, which can be univariate or multivariate (multivariate support is not implemented yet). Each item in a TimeSeriesDataset has its own time index.

Thanks to its ability to handle multiple indices, this class provides a set of methods to go from raw data with unknown characteristics (frequency, start, end, etc.) to clean data that is easy to process, model, or analyze.
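
As a rough illustration of "a list whose items each carry their own time index", the sketch below uses a plain Python list of pandas Series as a stand-in. It does not call the TimeSeriesDataset API; the names `daily`, `two_daily`, and `dataset` are hypothetical.

```python
import pandas as pd

# Two series with different starts, lengths, and frequencies.
daily = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.date_range("2020-01-01", periods=4, freq="D"),
)
two_daily = pd.Series(
    [10.0, 20.0, 30.0],
    index=pd.date_range("2020-01-03", periods=3, freq="2D"),
)

# A plain list standing in for a dataset (assumption, not the real class):
# each item keeps its own index, so nothing forces them to be aligned.
dataset = [daily, two_daily]

for position, ts in enumerate(dataset):
    print(position, ts.index[0], ts.index[-1], len(ts))
```

Because the indices are independent, any operation that compares or combines items first has to reconcile start, end, and frequency.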

Warning

A TimeSeriesDataset object is intended to be immutable.

Constructor

TimeSeriesDataset(data)

Defines a set of time series

Methods

create(length, start, end[, freq])

Create an empty TimeSeriesDataset object with a defined index and period

append(item)

Append a TimeSeries to a TimeSeriesDataset

plot(*args, **kwargs)

Plot a TimeSeriesDataset

copy([deep])

Copy a TimeSeriesDataset

split_at(timestamp)

Split a TimeSeriesDataset at a defined point and include the splitting point in both parts, as in [start,…,at] and [at,…,end].
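
The inclusive behaviour on both sides of the split can be sketched with pandas label slicing, which includes both endpoints. This is only an illustration on a single pandas Series; `split_inclusive` is a hypothetical helper, not the library's `split_at`.

```python
import pandas as pd

def split_inclusive(ts: pd.Series, at: str):
    """Split so that `at` is the last point of the first part
    and the first point of the second part."""
    at = pd.Timestamp(at)
    # Label-based slicing with .loc includes both endpoints.
    return ts.loc[:at], ts.loc[at:]

ts = pd.Series(
    [0, 1, 2, 3, 4],
    index=pd.date_range("2020-01-01", periods=5, freq="D"),
)
before, after = split_inclusive(ts, "2020-01-03")
print(list(before.index.day), list(after.index.day))
```

For a whole dataset, the same helper would be applied to every item in the list.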

split_in_chunks(n)

Cut each TimeSeries in the TimeSeriesDataset into chunks of length n
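
A minimal sketch of cutting one series into chunks of length n, using a pandas Series as a stand-in. The helper `chunks` is hypothetical, and keeping a shorter final chunk is an assumption; the library may treat the remainder differently.

```python
import pandas as pd

def chunks(ts: pd.Series, n: int):
    """Cut a series into consecutive pieces of at most n points."""
    # Assumption: a shorter remainder is kept as the last chunk.
    return [ts.iloc[i:i + n] for i in range(0, len(ts), n)]

ts = pd.Series(
    [0, 1, 2, 3, 4, 5, 6],
    index=pd.date_range("2020-01-01", periods=7, freq="D"),
)
parts = chunks(ts, 3)
print([len(p) for p in parts])
```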

fill(value)

Fill all values in each TimeSeries from a TimeSeriesDataset.

empty()

Empty the values in each TimeSeries from a TimeSeriesDataset.

pad(limit[, side, value])

Pad a TimeSeriesDataset until a given limit

trim([side])

Remove NaNs from the start, the end, or both sides of each TimeSeries
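
Trimming only removes leading and trailing NaNs; gaps inside the series stay. The sketch below shows that idea on a pandas Series. `trim_nans` and the side values "start", "end", and "both" are assumptions made for illustration, not the documented signature.

```python
import numpy as np
import pandas as pd

def trim_nans(ts: pd.Series, side: str = "both") -> pd.Series:
    """Drop NaNs at the start, the end, or both ends of a series."""
    first = ts.first_valid_index()
    last = ts.last_valid_index()
    if first is None:  # the series holds only NaNs
        return ts.iloc[0:0]
    if side == "start":
        return ts.loc[first:]
    if side == "end":
        return ts.loc[:last]
    return ts.loc[first:last]

ts = pd.Series(
    [np.nan, 1.0, np.nan, 2.0, np.nan],
    index=pd.date_range("2020-01-01", periods=5, freq="D"),
)
trimmed = trim_nans(ts)
print(len(trimmed), int(trimmed.isna().sum()))
```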

merge(tsd)

Merge two TimeSeriesDatasets by the index of the TimeSeries

merge_by_label(tsd)

Merge two TimeSeriesDatasets by the label of the TimeSeries in the TimeSeriesDatasets

select_components_randomly(n[, seed, indices])

Returns a subset of the TimeSeriesDataset with n elements chosen randomly without replacement.
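
Sampling without replacement with a seed can be sketched with the standard library alone. `pick_without_replacement` is a hypothetical helper operating on a plain list; returning the chosen positions alongside the subset is an assumption loosely modelled on the `indices` argument.

```python
import random

def pick_without_replacement(items, n, seed=None):
    """Return n distinct items and the positions they were taken from."""
    rng = random.Random(seed)  # a fixed seed makes the choice reproducible
    positions = sorted(rng.sample(range(len(items)), n))
    return [items[i] for i in positions], positions

items = ["ts0", "ts1", "ts2", "ts3", "ts4"]
subset, positions = pick_without_replacement(items, 3, seed=42)
print(subset, positions)
```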

select_components_by_percentage(percent[, …])

Returns a subset of the TimeSeriesDataset containing a given percentage of its elements, chosen randomly without replacement.

shuffle([inplace])

Randomize the order of the TimeSeries in the TimeSeriesDataset

Processing

apply(func[, tsd])

Apply function specialized for TimeSeriesDataset

resample(freq[, method])

Convert the TimeSeries in a TimeSeriesDataset to a specified frequency.
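
What converting to a specified frequency means can be illustrated with pandas `Series.asfreq`. This is a sketch under the assumption that the conversion behaves similarly; it is not stated here which pandas call, if any, `resample` relies on, or what values `method` accepts.

```python
import pandas as pd

# A series sampled every two days.
sparse = pd.Series(
    [10.0, 20.0, 30.0],
    index=pd.date_range("2020-01-01", periods=3, freq="2D"),
)

# Convert to daily frequency: newly created timestamps hold NaN
# unless a fill method is given.
daily_nan = sparse.asfreq("D")
daily_ffill = sparse.asfreq("D", method="ffill")

print(daily_nan.tolist())
print(daily_ffill.tolist())
```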

group_by(freq[, method])

Groups values by a frequency for each TimeSeries in a TimeSeriesDataset.

interpolate(*args, **kwargs)

Wrapper around the Pandas interpolate() method.
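
Since the method wraps pandas `interpolate()`, the underlying call can be shown directly on a list of pandas Series standing in for a dataset. How arguments are forwarded by the wrapper is not shown here; the per-item loop is an assumption.

```python
import numpy as np
import pandas as pd

dataset = [
    pd.Series(
        [1.0, np.nan, 3.0],
        index=pd.date_range("2020-01-01", periods=3, freq="D"),
    ),
    pd.Series(
        [10.0, np.nan, np.nan, 40.0],
        index=pd.date_range("2020-01-01", periods=4, freq="D"),
    ),
]

# Linear interpolation fills interior NaNs from the neighbouring values.
filled = [ts.interpolate(method="linear") for ts in dataset]
print([ts.tolist() for ts in filled])
```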

normalize(method)

Normalize the TimeSeries in a TimeSeriesDataset with a given method
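
Two common normalization methods, min-max scaling and z-score standardization, are sketched below on pandas Series. The method names "minmax" and "zscore" and the helper `normalize` are assumptions for illustration; the values accepted by the library's `method` argument are not listed here.

```python
import pandas as pd

def normalize(ts: pd.Series, method: str) -> pd.Series:
    """Normalize one series independently of the others."""
    if method == "minmax":
        # Rescale to the [0, 1] range.
        return (ts - ts.min()) / (ts.max() - ts.min())
    if method == "zscore":
        # Zero mean and unit (population) standard deviation.
        return (ts - ts.mean()) / ts.std(ddof=0)
    raise ValueError(f"unknown method: {method}")

dataset = [pd.Series([2.0, 4.0, 6.0]), pd.Series([10.0, 30.0])]
scaled = [normalize(ts, "minmax") for ts in dataset]
standardized = [normalize(ts, "zscore") for ts in dataset]
print([ts.tolist() for ts in scaled])
```

Each series is normalized with its own statistics in this sketch, so series with very different ranges end up on a comparable scale.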

round(decimals)

Round the values of every TimeSeries in the TimeSeriesDataset to a defined number of decimals

sort(*args, **kwargs)

Sort the TimeSeries of a TimeSeriesDataset by time stamps

regularize([side, fill])

Regularize a TimeSeriesDataset so that all starting and ending timestamps are similar.
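
One possible reading of regularization is to extend every series to the earliest start and the latest end found in the dataset, filling the new positions with NaN. The sketch below shows that reading with pandas; the choice of bounds, the NaN fill, and the shared daily frequency are all assumptions, and the roles of `side` and `fill` are not covered.

```python
import pandas as pd

a = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)
b = pd.Series(
    [5.0, 6.0, 7.0, 8.0],
    index=pd.date_range("2020-01-02", periods=4, freq="D"),
)
dataset = [a, b]

# Common bounds: earliest start and latest end across all items.
start = min(ts.index[0] for ts in dataset)
end = max(ts.index[-1] for ts in dataset)
common = pd.date_range(start, end, freq="D")

# Reindexing keeps existing values and inserts NaN where data is missing.
regular = [ts.reindex(common) for ts in dataset]
print([len(ts) for ts in regular])
```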

Analysis

min()

Minimum of all TimeSeries in TimeSeriesDataset

max()

Maximum of all TimeSeries in TimeSeriesDataset

mean()

Mean of all TimeSeries in TimeSeriesDataset

median()

Median of all TimeSeries in TimeSeriesDataset

skewness()

Skewness of all TimeSeries in TimeSeriesDataset

kurtosis()

Kurtosis of all TimeSeries in TimeSeriesDataset

describe()

Describe a TimeSeriesDataset with the describe function from Pandas

start()

Get the first Timestamp of all components of a TimeSeriesDataset

end()

Get the last Timestamp of all components of a TimeSeriesDataset

boundaries()

Get a tuple with the first and last index of each TimeSeries in the TimeSeriesDataset

frequency()

Get the frequency of each TimeSeries in a TimeSeriesDataset

duration()

Get the duration for all TimeSeries in a TimeSeriesDataset
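
The per-item quantities behind start, end, boundaries, and duration can be sketched on a list of pandas Series. The container and return types used here (lists of tuples and Timedelta objects) are assumptions; the library's actual return types are not stated on this page.

```python
import pandas as pd

dataset = [
    pd.Series(
        [1.0, 2.0, 3.0],
        index=pd.date_range("2020-01-01", periods=3, freq="D"),
    ),
    pd.Series(
        [4.0, 5.0],
        index=pd.date_range("2020-02-01", periods=2, freq="D"),
    ),
]

# (first timestamp, last timestamp) for every item.
boundaries = [(ts.index[0], ts.index[-1]) for ts in dataset]

# Duration is the distance between the two boundaries.
durations = [last - first for first, last in boundaries]
print(durations)
```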

I/O

to_text(path)

Export a TimeSeriesDataset to text format

to_pickle(path)

Create a pickle from the TimeSeriesDataset

to_array()

Convert a TimeSeriesDataset to a NumPy array [n x len(tsd)], where n is number of

to_df()

Converts a TimeSeriesDataset to a Pandas DataFrame
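
When items have different indices, a DataFrame view has to align them on a shared index. The sketch below shows one way to do that with `pandas.concat`; the column labels "a" and "b" and the outer alignment are assumptions, not a description of what `to_df` produces.

```python
import pandas as pd

a = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)
b = pd.Series(
    [4.0, 5.0, 6.0],
    index=pd.date_range("2020-01-02", periods=3, freq="D"),
)

# One column per series; rows are the union of both indices,
# with NaN where a series has no value.
df = pd.concat([a, b], axis=1, keys=["a", "b"])
print(df)
```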