Tips#
Saving Evaluator objects#
Evaltools includes a convenient way to save Evaluator objects for later use. Let us create such an object, as in the tutorial:
In [1]: import evaltools as evt
In [2]: from datetime import date
# import stations with module utils
In [3]: stations = evt.utils.read_listing("./sample_data/listing")
In [4]: start_date = date(2017, 6, 1)
In [5]: end_date = date(2017, 6, 6)
# create an object of class Observations with module evaluator
In [6]: obs = evt.Observations.from_time_series(
...: generic_file_path="./sample_data/observations/{year}_co_{station}",
...: correc_unit=1e9,
...: species='co',
...: start=start_date,
...: end=end_date,
...: stations=stations,
...: forecast_horizon=2,
...: )
...:
# create an object of class Simulations with module evaluator
In [7]: sim = evt.Simulations.from_time_series(
...: generic_file_path=(
...: "./sample_data/ENSforecast/J{forecastDay}/{year}_co_{station}"
...: ),
...: stations_idx=stations.index,
...: species='co',
...: model='ENS',
...: start=start_date,
...: end=end_date,
...: forecast_horizon=2,
...: )
...:
# create an object of class Evaluator with module evaluator
In [8]: obj = evt.Evaluator(obs, sim)
In [9]: obj.summary()
Model: ENS
Species: co
Time step: 1 hour
Period: 20170601 - 20170606
Forecast horizon: 2
Color: k
Paths :
- Sim : ./sample_data/ENSforecast/J{forecastDay}/{year}_co_{station}
- Obs : ./sample_data/observations/{year}_co_{station}
Once we have an Evaluator object, it can be saved with the evaluator.Evaluator.dump method.
In [10]: obj.dump('./sample_data/evaluatorObj.dump')
Once the file is created, it can be loaded at any time with the evaluator.load function.
In [11]: obj2 = evt.load('./sample_data/evaluatorObj.dump')
In [12]: obj2.summary()
Model: ENS
Species: co
Time step: 1 hour
Period: 20170601 - 20170606
Forecast horizon: 2
Color: k
Paths :
- Sim : ./sample_data/ENSforecast/J{forecastDay}/{year}_co_{station}
- Obs : ./sample_data/observations/{year}_co_{station}
# the objects do not have the same address, so they are considered different
In [13]: obj == obj2
Out[13]: False
# but attributes and data have the same values
In [14]: obj.stations.equals(obj2.stations)
Out[14]: True
In [15]: obj.obs_df.equals(obj2.obs_df)
Out[15]: True
In [16]: obj.sim_df[0].equals(obj2.sim_df[0])
Out[16]: True
In [17]: obj.sim_df[1].equals(obj2.sim_df[1])
Out[17]: True
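The comparison results above can be reproduced without evaltools. The following is a minimal sketch that assumes a pickle-style round trip purely for illustration (how dump and load are implemented is not stated here). ToyEvaluator is a hypothetical stand-in class, not part of evaltools. It shows why a reloaded object compares unequal with == while its content is identical.

```python
import pickle


class ToyEvaluator:
    """Hypothetical stand-in for an Evaluator; not part of evaltools."""

    def __init__(self, model, species, values):
        self.model = model
        self.species = species
        self.values = values


original = ToyEvaluator("ENS", "co", [52.42, 51.94, 52.91])

# Assumption: the saved file holds a serialized copy of the object's state.
# Here the attribute dictionary is serialized and a new object is rebuilt.
payload = pickle.dumps(vars(original))
restored = ToyEvaluator(**pickle.loads(payload))

# The class defines no __eq__, so == falls back to comparing identities.
same_object = original == restored

# Comparing the attributes themselves shows that the content is unchanged.
same_content = vars(original) == vars(restored)

print(same_object, same_content)
```

This is why the session above compares the stations, obs_df and sim_df attributes one by one instead of relying on obj == obj2.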
Objects attributes#
The Evaluator class relies on the Observations and Simulations classes. Both of them use the Dataset class, which mostly relies on pandas DataFrames. Let us have an overview of the attributes of these classes:
Attributes of Observations objects:
In [18]: obs.species
Out[18]: 'co'
In [19]: obs.start_date
Out[19]: datetime.date(2017, 6, 1)
In [20]: obs.end_date
Out[20]: datetime.date(2017, 6, 6)
In [21]: obs.forecast_horizon
Out[21]: 2
In [22]: obs.series_type
Out[22]: 'hourly'
In [23]: obs.stations
Out[23]:
site area lat lon
code
AD0942A bac urb 42.50969 1.53914
AT0VOR1 bac rur 46.67970 12.97190
AT10001 bac sub 47.84000 16.52670
AT31401 bac sub 48.08610 16.30220
AT31402 tra sub 48.12500 16.33170
CH0002R bac rur 46.81310 6.94447
CH0005A bac sub 47.40290 8.61341
CH0005R bac rur 47.06740 8.46334
CH0010A bac urb 47.37760 8.53042
CZ0ALIB bac sub 50.00730 14.44590
CZ0HHKB tra urb 50.19540 15.84640
CZ0JKOS bac rur 49.57340 15.08030
CZ0PPLA tra urb 49.73240 13.40230
CZ0TOPR ind urb 49.85630 18.26970
In [24]: obs.dataset
Out[24]: <evaltools.dataset.Dataset at 0x7fef69a2b320>
Attributes of Simulations objects:
In [25]: sim.species
Out[25]: 'co'
In [26]: sim.start_date
Out[26]: datetime.date(2017, 6, 1)
In [27]: sim.end_date
Out[27]: datetime.date(2017, 6, 6)
In [28]: sim.forecast_horizon
Out[28]: 2
In [29]: sim.series_type
Out[29]: 'hourly'
In [30]: sim.stations
Out[30]:
Index(['AD0942A', 'AT0VOR1', 'AT10001', 'AT31401', 'AT31402', 'CH0002R',
'CH0005A', 'CH0005R', 'CH0010A', 'CZ0ALIB', 'CZ0HHKB', 'CZ0JKOS',
'CZ0PPLA', 'CZ0TOPR'],
dtype='object', name='code')
In [31]: sim.model
Out[31]: 'ENS'
In [32]: sim.datasets
Out[32]:
[<evaltools.dataset.Dataset at 0x7fef69a2b080>,
<evaltools.dataset.Dataset at 0x7fef69b41fd0>]
Note
sim.datasets is a list of Dataset objects, one for each forecast day.
Attributes of Dataset objects:
In [33]: dt = obs.dataset
In [34]: dt.species
Out[34]: 'co'
In [35]: dt.start_date
Out[35]: datetime.date(2017, 6, 1)
In [36]: dt.end_date
Out[36]: datetime.date(2017, 6, 7)
In [37]: dt.nb_days
Out[37]: 7
In [38]: dt.series_type
Out[38]: 'hourly'
In [39]: dt.date_format
Out[39]: '%Y%m%d%H'
In [40]: type(dt.data)
Out[40]: pandas.core.frame.DataFrame
In [41]: dt.data
Out[41]:
code AD0942A AT0VOR1 AT10001 ... CZ0JKOS CZ0PPLA CZ0TOPR
2017-06-01 00:00:00 NaN 52.42 51.67 ... 209.0 260.0 150.0
2017-06-01 01:00:00 NaN 51.94 66.99 ... NaN NaN NaN
2017-06-01 02:00:00 NaN 52.91 48.73 ... 205.0 225.0 137.0
2017-06-01 03:00:00 NaN 51.51 33.35 ... 203.0 221.0 NaN
2017-06-01 04:00:00 NaN 50.15 33.40 ... 200.0 616.0 175.0
... ... ... ... ... ... ... ...
2017-06-07 19:00:00 NaN 63.87 255.80 ... 47.0 226.0 58.0
2017-06-07 20:00:00 NaN 64.79 241.32 ... 47.0 247.0 NaN
2017-06-07 21:00:00 NaN 63.82 229.71 ... 104.0 211.0 58.0
2017-06-07 22:00:00 NaN 60.28 201.00 ... 97.0 234.0 130.0
2017-06-07 23:00:00 NaN 61.20 195.26 ... 97.0 210.0 129.0
[168 rows x 14 columns]
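The Dataset above ends on 2017-06-07 and spans 7 days, whereas obs.end_date is 2017-06-06. One reading of this output, stated here as an assumption and not as documented behaviour, is that the observation data is extended by forecast_horizon - 1 days so that every forecast day can be compared with observations. The arithmetic below is a sketch of that reading and reproduces the figures shown.

```python
from datetime import date, timedelta

start_date = date(2017, 6, 1)
end_date = date(2017, 6, 6)
forecast_horizon = 2

# Assumption (inferred from the output above, not from the evaltools docs):
# the observation dataset is extended by forecast_horizon - 1 days.
dataset_end = end_date + timedelta(days=forecast_horizon - 1)

# Both bounds are inclusive, hence the + 1.
nb_days = (dataset_end - start_date).days + 1

# Hourly series: 24 rows per day.
nb_rows = nb_days * 24

print(dataset_end, nb_days, nb_rows)
```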
Attributes of Evaluator objects:
In [42]: obj.species
Out[42]: 'co'
In [43]: obj.start_date
Out[43]: datetime.date(2017, 6, 1)
In [44]: obj.end_date
Out[44]: datetime.date(2017, 6, 6)
In [45]: obj.forecast_horizon
Out[45]: 2
In [46]: obj.series_type
Out[46]: 'hourly'
In [47]: obj.model
Out[47]: 'ENS'
In [48]: obj.stations
Out[48]:
site area lat lon
code
AD0942A bac urb 42.50969 1.53914
AT0VOR1 bac rur 46.67970 12.97190
AT10001 bac sub 47.84000 16.52670
AT31401 bac sub 48.08610 16.30220
AT31402 tra sub 48.12500 16.33170
CH0002R bac rur 46.81310 6.94447
CH0005A bac sub 47.40290 8.61341
CH0005R bac rur 47.06740 8.46334
CH0010A bac urb 47.37760 8.53042
CZ0ALIB bac sub 50.00730 14.44590
CZ0HHKB tra urb 50.19540 15.84640
CZ0JKOS bac rur 49.57340 15.08030
CZ0PPLA tra urb 49.73240 13.40230
CZ0TOPR ind urb 49.85630 18.26970
In [49]: type(obj.obs_df)
Out[49]: pandas.core.frame.DataFrame
In [50]: obj.obs_df
Out[50]:
code AD0942A AT0VOR1 AT10001 ... CZ0JKOS CZ0PPLA CZ0TOPR
2017-06-01 00:00:00 NaN 52.42 51.67 ... 209.0 260.0 150.0
2017-06-01 01:00:00 NaN 51.94 66.99 ... NaN NaN NaN
2017-06-01 02:00:00 NaN 52.91 48.73 ... 205.0 225.0 137.0
2017-06-01 03:00:00 NaN 51.51 33.35 ... 203.0 221.0 NaN
2017-06-01 04:00:00 NaN 50.15 33.40 ... 200.0 616.0 175.0
... ... ... ... ... ... ... ...
2017-06-07 19:00:00 NaN 63.87 255.80 ... 47.0 226.0 58.0
2017-06-07 20:00:00 NaN 64.79 241.32 ... 47.0 247.0 NaN
2017-06-07 21:00:00 NaN 63.82 229.71 ... 104.0 211.0 58.0
2017-06-07 22:00:00 NaN 60.28 201.00 ... 97.0 234.0 130.0
2017-06-07 23:00:00 NaN 61.20 195.26 ... 97.0 210.0 129.0
[168 rows x 14 columns]
In [51]: type(obj.sim_df)
Out[51]: list
In [52]: type(obj.sim_df[0])
Out[52]: pandas.core.frame.DataFrame
In [53]: obj.sim_df
Out[53]:
[ AD0942A AT0VOR1 AT10001 ... CZ0JKOS CZ0PPLA CZ0TOPR
2017-06-01 00:00:00 NaN 125.615 195.743 ... 118.072 120.903 200.016
2017-06-01 01:00:00 NaN 124.696 187.501 ... 118.872 121.781 188.149
2017-06-01 02:00:00 NaN 124.414 180.656 ... 120.605 122.917 183.144
2017-06-01 03:00:00 NaN 122.835 178.975 ... 123.277 123.259 187.981
2017-06-01 04:00:00 NaN 122.949 165.245 ... 128.243 125.293 185.981
... ... ... ... ... ... ... ...
2017-06-06 19:00:00 NaN 103.669 117.103 ... 122.876 112.820 157.919
2017-06-06 20:00:00 NaN 104.671 122.962 ... 114.701 111.285 161.868
2017-06-06 21:00:00 NaN 106.023 114.108 ... 111.590 111.109 166.062
2017-06-06 22:00:00 NaN 104.786 109.945 ... 109.645 110.133 150.973
2017-06-06 23:00:00 NaN 104.190 111.574 ... 108.634 109.217 143.948
[144 rows x 14 columns],
AD0942A AT0VOR1 AT10001 ... CZ0JKOS CZ0PPLA CZ0TOPR
2017-06-02 00:00:00 NaN 130.632 187.774 ... 115.595 123.772 346.895
2017-06-02 01:00:00 NaN 128.314 191.837 ... 115.705 124.764 307.852
2017-06-02 02:00:00 NaN 128.497 192.607 ... 115.811 124.966 279.956
2017-06-02 03:00:00 NaN 128.094 196.502 ... 114.913 125.033 280.241
2017-06-02 04:00:00 NaN 128.403 191.039 ... 114.898 125.935 303.427
... ... ... ... ... ... ... ...
2017-06-07 19:00:00 NaN 100.061 109.947 ... 103.466 105.743 163.747
2017-06-07 20:00:00 NaN 98.077 109.488 ... 103.569 106.134 188.676
2017-06-07 21:00:00 NaN 98.452 110.656 ... 103.568 108.130 165.174
2017-06-07 22:00:00 NaN 97.151 117.971 ... 103.430 109.123 163.276
2017-06-07 23:00:00 NaN 97.184 118.467 ... 103.774 109.683 171.150
[144 rows x 14 columns]]
Note
obj.obs_df is equivalent to obj.observations.dataset.data, and obj.sim_df[fd] is equivalent to obj.simulations.datasets[fd].data (where fd is one of the forecast days).
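The relationship between these classes can be pictured with a small model. This is only a sketch under assumptions: the Toy* classes are hypothetical, plain dictionaries stand in for DataFrames, and nothing here reflects how evaltools is actually implemented. It only mirrors the attribute paths named in the note above.

```python
class ToyDataset:
    """Hypothetical stand-in for Dataset; holds the data table."""

    def __init__(self, data):
        self.data = data


class ToyObservations:
    """Hypothetical stand-in for Observations; wraps one dataset."""

    def __init__(self, dataset):
        self.dataset = dataset


class ToySimulations:
    """Hypothetical stand-in for Simulations; one dataset per forecast day."""

    def __init__(self, datasets):
        self.datasets = datasets


class ToyEvaluator:
    """Hypothetical stand-in for Evaluator; exposes shortcuts to the data."""

    def __init__(self, observations, simulations):
        self.observations = observations
        self.simulations = simulations

    @property
    def obs_df(self):
        # Shortcut for observations.dataset.data
        return self.observations.dataset.data

    @property
    def sim_df(self):
        # One table per forecast day, in forecast-day order
        return [ds.data for ds in self.simulations.datasets]


toy = ToyEvaluator(
    ToyObservations(ToyDataset({"AT0VOR1": [52.42, 51.94]})),
    ToySimulations([
        ToyDataset({"AT0VOR1": [125.615, 124.696]}),  # forecast day 0
        ToyDataset({"AT0VOR1": [130.632, 128.314]}),  # forecast day 1
    ]),
)

print(toy.obs_df is toy.observations.dataset.data)
print(toy.sim_df[1] is toy.simulations.datasets[1].data)
```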
How to handle data with a time step different from 1h#
Since version 1.0.4, you can work with data at a 1h, 2h, 3h, 4h, 6h or 12h time step. The following methods have a step argument that gives the time step in hours. This argument is ignored when the series_type argument is 'daily'.
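To make the effect of the time step concrete, here is a minimal sketch of the time axis that a given step implies. The helper expected_times is hypothetical and is not an evaltools method. Only the list of accepted steps comes from the text above.

```python
from datetime import datetime, timedelta

# Time steps (in hours) listed in the text above.
ALLOWED_STEPS = (1, 2, 3, 4, 6, 12)


def expected_times(start, nb_days, step=1):
    """Hypothetical helper: timestamps of a series covering nb_days days."""
    if step not in ALLOWED_STEPS:
        raise ValueError(f"unsupported time step: {step}h")
    nb_values = nb_days * 24 // step
    return [start + timedelta(hours=step * i) for i in range(nb_values)]


# Six days (2017-06-01 to 2017-06-06) at a 3h time step.
times = expected_times(datetime(2017, 6, 1), nb_days=6, step=3)

print(len(times), times[1], times[-1])
```

With a 3h step, each day contributes 8 values instead of 24, so the six-day period of the examples above would hold 48 rows per station.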
Plotting with translated annotations#
If you want the annotations on your charts to be translated into French, you can set evaltools.plotting.lang = 'FR' in your script.