IO Operations¶
[ ]:
import timeatlas as ta
The TimeSeries and TimeSeriesDataset objects can be written to files in multiple ways.
With TimeSeries¶
First, we create a few TimeSeries and a TimeSeriesDataset.
[90]:
from timeatlas.read_write import read_text, read_pickle, read_tsd, csv_to_tsd
ts = ta.TimeSeries.create('2019-01-01', '2019-01-04', freq='1D')
ts = ts.fill([i for i in range(len(ts))])
ts.class_label = "test label"
ts.metadata = ta.Metadata({'test': "metadata test"})
ts2 = ta.TimeSeries.create('2019-01-01', '2019-01-04', freq='H')
ts2 = ts2.fill([i for i in range(len(ts2))])
ts3 = ta.TimeSeries.create('2019-01-01', '2019-01-10', freq='1D')
ts3 = ts3.fill([i for i in range(len(ts3))])
tsd = ta.TimeSeriesDataset([ts, ts2, ts3])
In timeatlas we try to keep the TimeSeries and their Metadata close to each other. With TimeSeries.to_text(path)
we write the TimeSeries data into data.csv and the metadata into meta.json.
[91]:
ts.to_text('./data/timeseries/to_text/')
This saves the TimeSeries data in a CSV file (data.csv) and the metadata in a JSON file (meta.json).
[92]:
ts = read_text('./data/timeseries/to_text/')
[ ]:
ts
[94]:
ts.metadata
[94]:
{'label': 'test label', 'test': 'metadata test'}
In addition to to_text(path), TimeAtlas implements the following (see the sketch below):

- to_pickle(path) saving the TimeSeries as a pickle,
- to_df() returning a Pandas DataFrame,
- to_array() returning a np.array,
- to_darts() returning the format of u8darts.
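As a minimal sketch of these alternatives (the output paths are placeholders, and exact signatures may differ between versions):

[ ]:
# Write the TimeSeries as a pickle and read it back
ts.to_pickle('./data/timeseries/ts.pickle')
ts_restored = read_pickle('./data/timeseries/ts.pickle')

# In-memory conversions
df = ts.to_df()            # Pandas DataFrame with a DatetimeIndex
arr = ts.to_array()        # NumPy array of the values
darts_ts = ts.to_darts()   # u8darts TimeSeries, e.g. for forecasting models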
With TimeSeriesDataset¶
TimeSeriesDataset implements most of the same functions as TimeSeries, as both extend the abstract base class AbstractBaseTimeSeries
(excl. to_darts()).
TimeSeriesDataset.to_text(path)
will create a subfolder in path for each TimeSeries in the TimeSeriesDataset to keep the data.csv and the meta.json together.
data
├── time_series_dataset
│   ├── 0
│   │   ├── data.csv
│   │   └── meta.json
│   └── 1
│       ├── data.csv
│       └── meta.json
[95]:
tsd
[95]:
minimum maximum mean median kurtosis skewness
0 0 3 1.5 1.5 -1.2 0.0
1 0 72 36.0 36.0 -1.2 0.0
2 0 9 4.5 4.5 -1.2 0.0
[96]:
tsd.to_text('./data/timeseriesdataset/to_text/')
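As a quick check, a minimal sketch (standard library only, not part of timeatlas) that lists the files to_text just wrote, confirming the subfolder-per-TimeSeries layout shown above:

[ ]:
from pathlib import Path

# List the files created by TimeSeriesDataset.to_text()
root = Path('./data/timeseriesdataset/to_text/')
for f in sorted(root.rglob('*')):
    if f.is_file():
        print(f.relative_to(root))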
To load the TimeSeriesDataset, read_tsd(path) searches the given folder for subfolders and checks whether they contain files in the accepted format.
[97]:
tsd_loaded = read_tsd('./data/timeseriesdataset/to_text/')
[98]:
tsd_loaded
[98]:
minimum maximum mean median kurtosis skewness
0 0 3 1.5 1.5 -1.2 0.0
1 0 72 36.0 36.0 -1.2 0.0
2 0 9 4.5 4.5 -1.2 0.0
Additionally, a CSV file can be loaded as a TimeSeriesDataset. Each column in the CSV is loaded as a TimeSeries and added to the TimeSeriesDataset.
[99]:
tsd_loaded.to_df().to_csv("./data/timeseriesdataset/tsd.csv")
[101]:
tsd_from_csv = csv_to_tsd("./data/timeseriesdataset/tsd.csv")
[105]:
tsd_from_csv.to_df()
[105]:
                     0_values  1_values  2_values
index
2019-01-01 00:00:00       0.0       0.0       0.0
2019-01-01 01:00:00       NaN       1.0       NaN
2019-01-01 02:00:00       NaN       2.0       NaN
2019-01-01 03:00:00       NaN       3.0       NaN
2019-01-01 04:00:00       NaN       4.0       NaN
...                       ...       ...       ...
2019-01-06 00:00:00       NaN       NaN       5.0
2019-01-07 00:00:00       NaN       NaN       6.0
2019-01-08 00:00:00       NaN       NaN       7.0
2019-01-09 00:00:00       NaN       NaN       8.0
2019-01-10 00:00:00       NaN       NaN       9.0

79 rows × 3 columns
In version 0.1.1 there are a few restrictions on the format of the CSV:

- Each column name has to start with an integer and an underscore (e.g. 0_)
- The integer has to be followed by "values"

The reason for these restrictions is the inclusion of timestamp labels in the TimeSeries, which are named "0_labels_XY".
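If a CSV comes from another source, its columns can be renamed to match this pattern before loading. A minimal sketch (the helper function and the input file raw.csv are hypothetical, not part of timeatlas):

[ ]:
import pandas as pd

def rename_columns_for_tsd(df):
    # Hypothetical helper: rename arbitrary columns to the "<index>_values"
    # pattern expected by csv_to_tsd in version 0.1.1.
    df = df.copy()
    df.columns = [f"{i}_values" for i in range(len(df.columns))]
    return df

raw = pd.read_csv("./data/timeseriesdataset/raw.csv", index_col=0)  # hypothetical input file
rename_columns_for_tsd(raw).to_csv("./data/timeseriesdataset/tsd_renamed.csv")
tsd_renamed = csv_to_tsd("./data/timeseriesdataset/tsd_renamed.csv")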