Extract and post-processing observations

Python class for accessing BDClim data

A python class is described below for generating sql queries to bdclim. If you are familiar with SQL, it may also be interesting to use directly the psycopg2 python module as Meteo-France database are managed through PostgreSQL.

Please read the documentation of BDClim on http://tipige.meteo.fr/Climsol/

For additional examples of extractions on BDClim, please have a look to Existing scripts for querying on BDClim that may provide useful examples. Note that they may be easily adjusted if they does not immediately match your needs, have a look to their code !

class scripts.extract.obs.bdquery.question(host='bdclimop-usr.meteo.fr', port=5432, user='anonymous', password='anonymous', database='bdclim', listvar=[], listorder=[], listconditions=[], listjoin=[], period=[], table=None, dateformat='iso')[source]

Bases: object

Class for interrogating PostgreSQL database

Useful for accessing BDClim at Meteo-France. Please consider using directly psycopg2 or pandas for new projects and make sure you are familiar with SQL syntax before using as inputs are not always correctly sanitized.

Usecase example: Read all snow depth information for poste nr 38005402 (Nivose Aigleton) and 38567002 (Chamrousse) for the season 2020-2021 and save it to a file:

question1 = question(
        # Variables to extract: here date, num_poste on 8 digits, poste name, snow depth
        listvar=["to_char(dat,'YYYY-MM-DD-HH24-MI')", "to_char(h.num_poste, 'fm00000000')",
                 "poste_nivo.nom_usuel", "neigetot"],
        # The table to read (see BDClim documentation)
        table='H',
        # Join table num_poste to get the poste name
        listjoin=['POSTE_NIVO ON H.NUM_POSTE = POSTE_NIVO.NUM_POSTE'],
        # Set a time period of interest (datetime or str objects)
        period=['2020-08-01', '2021-08-01'],
        # Set the order
        listorder=['dat', 'h.num_poste'],
        # Filtering
        # parenthesis compulsory to group with "OR"
        listconditions=['(H.NUM_POSTE=38005402 OR H.NUM_POSTE=38567002)'],
        )
# Save into a csv file
question1.run(outputfile='output.csv', header=['dat', 'num_poste', 'poste', 'neigetot'])
Parameters:
  • host (str) – Database server hostname

  • port (int) – Database server port

  • user (str) – Database username

  • password (str) – Database password

  • database (str) – Database name to use on the server

  • listvar (list of str) – List of variables to read

  • listorder (list of str) – List of variables to order on

  • listconditions (list of str) – List of conditions to apply for data selection (SQL format)

  • listjoin – List of join to be done (SQL format without ‘JOIN’)

  • period (list of str or datetime) – List [[Min date, [max_date]]] for filtering on dates (field dat, else, put it in conditions please)

  • table (str) – The table to read in the selected database

  • dateformat (str) – Date format (iso, daily or monthly)

get(outputfile=None, sep=';', header=True, mode='w')[source]

Run the query to the database and return an array of results (or write the results in a csv file).

Note that the sql query is written in sql attribute of the object.

Parameters:
  • outputfile (str or path-like) – Output csv file in which to write the results. If set to None, an array of results is returned

  • sep (one-character str) – Separator in case of csv output is specified

  • header (bool or list) – Headers in case of CSV file. Defaults to True (put the headers), can also be False to desactivate headers or a list to override default ones

  • mode (str) – Mode for opening outputfile

Returns:

array of results (or nothing in case outputfile is set)

run(*args, **kwargs)[source]

For backward compatibility only. Please do not use.

meta:private

set_condition(condition)[source]

Set a condition, written as a SQL condition

set_join(join_str)[source]

Set a join of table, written as a SQL JOIN. Example: text ON text.id = id_text

set_order(listvar)[source]

Set the list of sorting variables

set_period(datedeb, datefin)[source]

Define the period with begin and end date

Parameters:
  • datedeb (str or datetime object) – Begin date

  • datefin (str or datetime object) – End date

set_table(table)[source]

Define the table

set_varout(listvar)[source]

Set the list of desired output variables

Existing scripts for querying on BDClim

An overview of existing scripts is proposed below. They all extract data based on parameters specifying time period (at least) and produces CSV file(s) as an output.

Script scripts/extract/obs/extract_obs.py help:

usage: extract_obs.py [-h] [-o OUTPUT] date_min date_max

Read HTN (snow height) observations from BDCLim for all available stations:
both Nivoses and "nivo-meteo" network.

positional arguments:
  date_min              Start date
  date_max              End date

options:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output file. If none selected, produce NIVOMETEO.obs
                        and NIVOSE.obs files

Script scripts/extract/obs/extract_obs_allhtn.py help:

usage: extract_obs_allhtn.py [-h] [-o OUTPUT] [--append] date_min date_max

Read HTN (snow height) observations from BDClim for all available stations on
a daily timestep at 6hUTC

positional arguments:
  date_min              Start date
  date_max              End date

options:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output file. If none selected, produce HTN.csv file
  --append              Append to an existing output file

Script scripts/extract/obs/extract_obs_meteo.py help:

usage: extract_obs_meteo.py [-h] [-o OUTPUT] [--rr] [--tn] [--tx] [--tt]
                            [--f {monthly,daily}]
                            date_min date_max

Read precititation and temperature observations from BDCLim for all availables
stations

positional arguments:
  date_min              Start date
  date_max              End date

options:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output folder
  --rr                  Produce RR output
  --tn                  Produce TN output
  --tx                  Produce TX output
  --tt                  Produce T output at hourly timestep
  --f {monthly,daily}   Observation frequency

Script scripts/extract/obs/extract_obs_meteo_mb.py help:

usage: extract_obs_meteo_mb.py [-h] [-o OUTPUT] [-s STATIONS [STATIONS ...]]
                               date_min date_max

Read obs data (Temperature) from BDCLim from specified observation stations

positional arguments:
  date_min              Start date
  date_max              End date

options:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output filename
  -s STATIONS [STATIONS ...], --stations STATIONS [STATIONS ...]
                        List of stations to be used

Todo

Some scripts related to observation extraction and post-processing are not documented. Have a look in the corresponding folder !