IO Operations¶
[ ]:
import timeatlas as ta
The TimeSeries and TimeSeriesDataset objects can be written to files in multiple ways.
With TimeSeries¶
First, we create a few TimeSeries and a TimeSeriesDataset.
[90]:
from timeatlas.read_write import read_text, read_pickle, read_tsd, csv_to_tsd
ts = ta.TimeSeries.create('2019-01-01', '2019-01-04', freq='1D')
ts = ts.fill([i for i in range(len(ts))])
ts.class_label = "test label"
ts.metadata = ta.Metadata({'test': "metadata test"})
ts2 = ta.TimeSeries.create('2019-01-01', '2019-01-04', freq='H')
ts2 = ts2.fill([i for i in range(len(ts2))])
ts3 = ta.TimeSeries.create('2019-01-01', '2019-01-10', freq='1D')
ts3 = ts3.fill([i for i in range(len(ts3))])
tsd = ta.TimeSeriesDataset([ts, ts2, ts3])
In timeatlas we try to keep the TimeSeries and their Metadata close to each other. With TimeSeries.to_text(path)
we write the TimeSeries data into data.csv and the metadata into meta.json.
[91]:
ts.to_text('./data/timeseries/to_text/')
This saves the TimeSeries data in a CSV file (data.csv) and the metadata in a JSON file (meta.json).
[92]:
ts = read_text('./data/timeseries/to_text/')
[ ]:
ts
[94]:
ts.metadata
[94]:
{'label': 'test label', 'test': 'metadata test'}
In addition to to_text(path), TimeAtlas implements the following (see the sketch below):

- to_pickle(path) saving the TimeSeries as a pickle,
- to_df() returning a Pandas DataFrame,
- to_array() returning a np.array,
- to_darts() returning the format of u8darts.
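As a minimal sketch of these alternatives (the output paths are placeholders, and exact signatures may differ between versions):

[ ]:
# Write the TimeSeries as a pickle and read it back
ts.to_pickle('./data/timeseries/ts.pickle')
ts_restored = read_pickle('./data/timeseries/ts.pickle')

# In-memory conversions
df = ts.to_df()            # Pandas DataFrame with a DatetimeIndex
arr = ts.to_array()        # NumPy array of the values
darts_ts = ts.to_darts()   # u8darts TimeSeries, e.g. for forecasting models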
With TimeSeriesDataset¶
TimeSeriesDataset implements most of the same functions as TimeSeries, as both extend the abstract base class AbstractBaseTimeSeries
(excl. to_darts()).
TimeSeriesDataset.to_text(path)
will create a subfolder in path for each TimeSeries in the TimeSeriesDataset to keep the data.csv and the meta.json together.
data
├── time_series_dataset
│   ├── 0
│   │   ├── data.csv
│   │   └── meta.json
│   └── 1
│       ├── data.csv
│       └── meta.json
[95]:
tsd
[95]:
minimum maximum mean median kurtosis skewness
0 0 3 1.5 1.5 -1.2 0.0
1 0 72 36.0 36.0 -1.2 0.0
2 0 9 4.5 4.5 -1.2 0.0
[96]:
tsd.to_text('./data/timeseriesdataset/to_text/')
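As a quick check, a minimal sketch (standard library only, not part of timeatlas) that lists the files to_text just wrote, confirming the subfolder-per-TimeSeries layout shown above:

[ ]:
from pathlib import Path

# List the files created by TimeSeriesDataset.to_text()
root = Path('./data/timeseriesdataset/to_text/')
for f in sorted(root.rglob('*')):
    if f.is_file():
        print(f.relative_to(root))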
To load the TimeSeriesDataset, read_tsd(path) searches the given folder for subfolders and checks whether they contain files in the accepted format.
[97]:
tsd_loaded = read_tsd('./data/timeseriesdataset/to_text/')
[98]:
tsd_loaded
[98]:
minimum maximum mean median kurtosis skewness
0 0 3 1.5 1.5 -1.2 0.0
1 0 72 36.0 36.0 -1.2 0.0
2 0 9 4.5 4.5 -1.2 0.0
Additionally, a CSV file can be loaded as a TimeSeriesDataset. Each column in the CSV is loaded as a TimeSeries and added to the TimeSeriesDataset.
[99]:
tsd_loaded.to_df().to_csv("./data/timeseriesdataset/tsd.csv")
[101]:
tsd_from_csv = csv_to_tsd("./data/timeseriesdataset/tsd.csv")
[105]:
tsd_from_csv.to_df()
[105]:
                     0_values  1_values  2_values
index
2019-01-01 00:00:00       0.0       0.0       0.0
2019-01-01 01:00:00       NaN       1.0       NaN
2019-01-01 02:00:00       NaN       2.0       NaN
2019-01-01 03:00:00       NaN       3.0       NaN
2019-01-01 04:00:00       NaN       4.0       NaN
...                       ...       ...       ...
2019-01-06 00:00:00       NaN       NaN       5.0
2019-01-07 00:00:00       NaN       NaN       6.0
2019-01-08 00:00:00       NaN       NaN       7.0
2019-01-09 00:00:00       NaN       NaN       8.0
2019-01-10 00:00:00       NaN       NaN       9.0

79 rows × 3 columns
In version 0.1.1 there are a few restrictions on the format of the CSV:

- Each column name has to start with an integer and an underscore (e.g. 0_)
- The integer has to be followed by "values"

The reason for these restrictions is the inclusion of timestamp labels in the TimeSeries, which are named "0_labels_XY".
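If a CSV comes from another source, its columns can be renamed to match this pattern before loading. A minimal sketch (the helper function and the input file raw.csv are hypothetical, not part of timeatlas):

[ ]:
import pandas as pd

def rename_columns_for_tsd(df):
    # Hypothetical helper: rename arbitrary columns to the "<index>_values"
    # pattern expected by csv_to_tsd in version 0.1.1.
    df = df.copy()
    df.columns = [f"{i}_values" for i in range(len(df.columns))]
    return df

raw = pd.read_csv("./data/timeseriesdataset/raw.csv", index_col=0)  # hypothetical input file
rename_columns_for_tsd(raw).to_csv("./data/timeseriesdataset/tsd_renamed.csv")
tsd_renamed = csv_to_tsd("./data/timeseriesdataset/tsd_renamed.csv")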