Skip to content

onehealth_db.postgresql_database module⚓︎

onehealth_db.postgresql_database ⚓︎

Classes:

  • Base

    Base class for all models in the database.

  • GridPoint

    Grid point table for storing latitude and longitude coordinates.

  • NutsDef

    NUTS definition table.

  • TimePoint

    Time point table for storing year, month, and day.

  • VarType

    Variable type table for storing variable metadata.

  • VarValue

    Variable value table for storing variable values at specific

Functions:

Attributes:

BATCH_SIZE module-attribute ⚓︎

BATCH_SIZE = 10000

CRS module-attribute ⚓︎

CRS = 4326

MAX_WORKERS module-attribute ⚓︎

MAX_WORKERS = 4

STR_POINT module-attribute ⚓︎

STR_POINT = 'SRID={};POINT({} {})'

Base ⚓︎

Bases: DeclarativeBase

Base class for all models in the database.

GridPoint ⚓︎

GridPoint(latitude, longitude, **kw)

Bases: Base

Grid point table for storing latitude and longitude coordinates.

Attributes:

__table_args__ class-attribute instance-attribute ⚓︎

__table_args__ = (
    Index(
        "idx_point_gridpoint",
        "point",
        postgresql_using="gist",
    ),
    UniqueConstraint(
        "latitude", "longitude", name="uq_lat_lon"
    ),
)

__tablename__ class-attribute instance-attribute ⚓︎

__tablename__ = 'grid_point'

id class-attribute instance-attribute ⚓︎

id = mapped_column(
    Integer(), primary_key=True, autoincrement=True
)

latitude class-attribute instance-attribute ⚓︎

latitude = latitude

longitude class-attribute instance-attribute ⚓︎

longitude = longitude

point class-attribute instance-attribute ⚓︎

point = ST_GeomFromText(
    format(str(CRS), longitude, latitude)
)

NutsDef ⚓︎

Bases: Base

NUTS definition table.

Attributes:

__tablename__ class-attribute instance-attribute ⚓︎

__tablename__ = 'nuts_def'

cntr_code class-attribute instance-attribute ⚓︎

cntr_code = mapped_column(String(), nullable=True)

coast_type class-attribute instance-attribute ⚓︎

coast_type = mapped_column(Float(), nullable=True)

geometry class-attribute instance-attribute ⚓︎

geometry = mapped_column(
    Geometry(geometry_type="POINT", srid=CRS)
)

levl_code class-attribute instance-attribute ⚓︎

levl_code = mapped_column(Integer(), nullable=True)

mount_type class-attribute instance-attribute ⚓︎

mount_type = mapped_column(Float(), nullable=True)

name_latn class-attribute instance-attribute ⚓︎

name_latn = mapped_column(String(), nullable=True)

nuts_id class-attribute instance-attribute ⚓︎

nuts_id = mapped_column(String(), primary_key=True)

nuts_name class-attribute instance-attribute ⚓︎

nuts_name = mapped_column(String(), nullable=True)

urbn_type class-attribute instance-attribute ⚓︎

urbn_type = mapped_column(Float(), nullable=True)

TimePoint ⚓︎

Bases: Base

Time point table for storing year, month, and day.

Attributes:

__table_args__ class-attribute instance-attribute ⚓︎

__table_args__ = (
    UniqueConstraint(
        "year", "month", "day", name="uq_year_month_day"
    ),
)

__tablename__ class-attribute instance-attribute ⚓︎

__tablename__ = 'time_point'

day class-attribute instance-attribute ⚓︎

day = mapped_column(Integer())

id class-attribute instance-attribute ⚓︎

id = mapped_column(
    Integer(), primary_key=True, autoincrement=True
)

month class-attribute instance-attribute ⚓︎

month = mapped_column(Integer())

year class-attribute instance-attribute ⚓︎

year = mapped_column(Integer())

VarType ⚓︎

Bases: Base

Variable type table for storing variable metadata.

Attributes:

__table_args__ class-attribute instance-attribute ⚓︎

__table_args__ = (
    UniqueConstraint("name", name="uq_var_name"),
)

__tablename__ class-attribute instance-attribute ⚓︎

__tablename__ = 'var_type'

description class-attribute instance-attribute ⚓︎

description = mapped_column(String(), nullable=True)

id class-attribute instance-attribute ⚓︎

id = mapped_column(
    Integer(), primary_key=True, autoincrement=True
)

name class-attribute instance-attribute ⚓︎

name = mapped_column(String())

unit class-attribute instance-attribute ⚓︎

unit = mapped_column(String())

VarValue ⚓︎

Bases: Base

Variable value table for storing variable values at specific grid points and time points.

Attributes:

__table_args__ class-attribute instance-attribute ⚓︎

__table_args__ = (
    UniqueConstraint(
        "time_id",
        "grid_id",
        "var_id",
        name="uq_time_grid_var",
    ),
    ForeignKeyConstraint(
        ["grid_id"],
        ["grid_point.id"],
        name="fk_grid_id",
        ondelete="CASCADE",
    ),
    ForeignKeyConstraint(
        ["time_id"],
        ["time_point.id"],
        name="fk_time_id",
        ondelete="CASCADE",
    ),
    ForeignKeyConstraint(
        ["var_id"],
        ["var_type.id"],
        name="fk_var_id",
        ondelete="CASCADE",
    ),
)

__tablename__ class-attribute instance-attribute ⚓︎

__tablename__ = 'var_value'

grid_id class-attribute instance-attribute ⚓︎

grid_id = mapped_column(
    Integer(), ForeignKey("grid_point.id")
)

id class-attribute instance-attribute ⚓︎

id = mapped_column(
    BigInteger(), primary_key=True, autoincrement=True
)

time_id class-attribute instance-attribute ⚓︎

time_id = mapped_column(
    Integer(), ForeignKey("time_point.id")
)

value class-attribute instance-attribute ⚓︎

value = mapped_column(Float())

var_id class-attribute instance-attribute ⚓︎

var_id = mapped_column(Integer(), ForeignKey("var_type.id"))

add_data_list ⚓︎

add_data_list(session, data_list)

Add a list of data instances to the database.

Parameters:

  • session (Session) –

    SQLAlchemy session object.

  • data_list (list) –

    List of data instances to add.

add_data_list_bulk ⚓︎

add_data_list_bulk(session, data_dict_list, class_type)

Add a list of data to the database in bulk.

Parameters:

  • session (Session) –

    SQLAlchemy session object.

  • data_dict_list (list) –

    List of dictionaries containing data to insert.

  • class_type (Type[Base]) –

    SQLAlchemy mapped class to insert data into.

convert_yearly_to_monthly ⚓︎

convert_yearly_to_monthly(ds)

Convert yearly data to monthly data.

Parameters:

  • ds (Dataset) –

    xarray dataset with yearly data.

Returns:

  • Dataset

    xr.Dataset: xarray dataset with monthly data.

create_or_replace_tables ⚓︎

create_or_replace_tables(engine)

Create or replace tables in the database.

Parameters:

  • engine (Engine) –

    SQLAlchemy engine object.

create_session ⚓︎

create_session(engine)

Create a new session for the database.

Parameters:

  • engine (Engine) –

    SQLAlchemy engine object.

Returns:

  • Session ( Session ) –

    SQLAlchemy session object.

create_tables ⚓︎

create_tables(engine)

Create all tables in the database.

Parameters:

  • engine (Engine) –

    SQLAlchemy engine object.

extract_time_point ⚓︎

extract_time_point(time_point)

Extract year, month, and day from a numpy datetime64 object.

Parameters:

  • time_point (datetime64) –

    Numpy datetime64 object representing a time point.

Returns:

  • tuple ( tuple[int, int, int, int, int, int] ) –

    A tuple containing year, month, day, hour, minute, second.

get_grid_ids_in_nuts ⚓︎

get_grid_ids_in_nuts(engine, nuts_regions)

Get grid point IDs that are within the NUTS regions.

Parameters:

  • engine (Engine) –

    SQLAlchemy engine object.

  • nuts_regions (GeoDataFrame) –

    GeoDataFrame with NUTS region geometries.

Returns:

  • List[int]

    List[int]: List of grid point IDs that intersect with the NUTS regions.

get_grid_points ⚓︎

get_grid_points(session, area=None)

Get grid points from the database that fall within a specified area. Args: session (Session): SQLAlchemy session object. area (None | Tuple[float, float, float, float]): Area as (North, West, South, East). If None, all grid points are returned. Returns: List[GridPoint]: List of GridPoint objects within the specified area.

get_id_maps ⚓︎

get_id_maps(session)

Get ID maps for grid points, time points, and variable types.

Parameters:

  • session (Session) –

    SQLAlchemy session object.

Returns:

  • tuple ( tuple[dict, dict, dict] ) –

    A tuple containing three dictionaries:

    • grid_id_map: Mapping of (latitude, longitude) to grid point ID.

    • time_id_map: Mapping of datetime64 to time point ID.

    • var_id_map: Mapping of variable name to variable type ID.

get_nuts_regions ⚓︎

get_nuts_regions(engine, area=None)

Get NUTS regions from the database.

Parameters:

  • engine (Engine) –

    SQLAlchemy engine object.

  • area (None | Tuple[float, float, float, float], default: None ) –

    Area as (North, West, South, East). If None, all NUTS regions are returned.

Returns:

  • GeoDataFrame

    gpd.GeoDataFrame: GeoDataFrame with NUTS region attributes and geometries.

get_time_points ⚓︎

get_time_points(
    session, start_time_point, end_time_point=None
)

Get time points from the database that fall within a specified range.

Parameters:

  • session (Session) –

    SQLAlchemy session object.

  • start_time_point (Tuple[int, int]) –

    Start time point as (year, month).

  • end_time_point (Tuple[int, int] | None, default: None ) –

    End time point as (year, month). If None, only the start time point is used.

Returns:

  • List[TimePoint]

    List[TimePoint]: List of TimePoint objects within the specified range.

get_unique_time_points ⚓︎

get_unique_time_points(time_point_data)

Get the unique of time points.

Parameters:

  • time_point_data (list[Tuple[ndarray, bool]]) –

    List of tuples containing time point data, and the yearly flag. If flag is True, the time point needs to be converted to monthly.

Returns:

  • ndarray

    np.ndarray: Unique of (sorted) time points as a numpy array.

get_var_types ⚓︎

get_var_types(session, var_names=None)

Get variable types from the database with names specified in a list.

Parameters:

  • session (Session) –

    SQLAlchemy session object.

  • var_names (None | List[str], default: None ) –

    List of variable names to filter by. If None, all variable types are returned.

Returns:

  • List[VarType]

    List[VarType]: List of VarType objects with the specified names.

get_var_value ⚓︎

get_var_value(
    session, var_name, lat, lon, year, month, day
)

Get variable value from the database.

Parameters:

  • session (Session) –

    SQLAlchemy session object.

  • var_name (str) –

    Name of the variable to retrieve.

  • lat (float) –

    Latitude of the grid point.

  • lon (float) –

    Longitude of the grid point.

  • year (int) –

    Year of the time point.

  • month (int) –

    Month of the time point.

  • day (int) –

    Day of the time point.

Returns:

  • float | int | str | None

    float | int | str | None: Value of the variable at the specified grid point and time point.

get_var_values_cartesian ⚓︎

get_var_values_cartesian(
    session,
    start_time_point,
    end_time_point=None,
    var_names=None,
)

Get variable values for a cartesian map.

Parameters:

  • session (Session) –

    SQLAlchemy session object.

  • start_time_point (Tuple[int, int]) –

    Start time point as (year, month).

  • end_time_point (Tuple[int, int] | None, default: None ) –

    End time point as (year, month). If None, only the start time point is used.

  • var_names (None | List[str], default: None ) –

    List of variable names to filter by. If None, all variable types are used.

Returns:

  • dict ( dict ) –

    a dict with (time, latitude, longitude, var_value) keys.

get_var_values_cartesian_for_download ⚓︎

get_var_values_cartesian_for_download(
    session,
    start_time_point,
    end_time_point=None,
    area=None,
    var_names=None,
    netcdf_file="cartesian_grid_data_onehealth.nc",
)

Get variable values for a cartesian map.

Parameters:

  • session (Session) –

    SQLAlchemy session object.

  • start_time_point (Tuple[int, int]) –

    Start time point as (year, month).

  • end_time_point (Tuple[int, int] | None, default: None ) –

    End time point as (year, month). If None, only the start time point is used.

  • area (None | Tuple[float, float, float, float], default: None ) –

    Area as (North, West, South, East). If None, all grid points are used.

  • var_names (None | List[str], default: None ) –

    List of variable names to filter by. If None, all variable types are used.

  • netcdf_file (str, default: 'cartesian_grid_data_onehealth.nc' ) –

    Name of the NetCDF file to save the dataset.

Returns:

  • dict ( dict ) –

    a dict with (time, latitude, longitude, var_value) keys. time or var_value is empty if no data is found.

get_var_values_nuts ⚓︎

get_var_values_nuts(
    engine,
    session,
    start_time_point,
    end_time_point=None,
    area=None,
    var_names=None,
    shapefile=None,
)

Get variable values for NUTS regions.

Parameters:

  • engine (Engine) –

    SQLAlchemy engine object.

  • session (Session) –

    SQLAlchemy session object.

  • start_time_point (Tuple[int, int]) –

    Start time point as (year, month).

  • end_time_point (Tuple[int, int] | None, default: None ) –

    End time point as (year, month). If None, only the start time point is used.

  • area (None | Tuple[float, float, float, float], default: None ) –

    Area as (North, West, South, East). If None, all grid points are used.

  • var_names (None | List[str], default: None ) –

    List of variable names to filter by. If None, all variable types are used.

  • netcdf_file (str | None) –

    Path to the NetCDF file to save the dataset. If None, the dataset is not saved to a file.

Returns:

  • GeoDataFrame | None

    gpd.GeoDataFrame | None: GeoDataFrame with NUTS region attributes and variable values for each NUTS region. None if no data is found.

initialize_database ⚓︎

initialize_database(db_url, replace=False)

Initialize the database by creating the engine and tables, and installing PostGIS. If replace is True, it will drop and recreate the tables.

Parameters:

  • db_url (str) –

    Database URL for SQLAlchemy.

  • replace (bool, default: False ) –

    Whether to drop and recreate the tables. Defaults to False.

insert_grid_points ⚓︎

insert_grid_points(session, latitudes, longitudes)

Insert grid points into the database.

Parameters:

  • session (Session) –

    SQLAlchemy session object.

  • latitudes (ndarray) –

    Array of latitudes.

  • longitudes (ndarray) –

    Array of longitudes.

insert_nuts_def ⚓︎

insert_nuts_def(engine, shapefiles_path)

Insert NUTS definition data into the database. The shapefiles are downloaded from the Eurostat website. More details for downloading NUTS shapefiles can be found in our data page

Five shapefiles are involved in the process: - .shp: geometry data (e.g. polygons) - .shx: shape index data - .dbf: attribute data (e.g. names, codes) - .prj: projection data (i.e. CRS) - .cpg: character encoding data

Parameters:

  • engine (Engine) –

    SQLAlchemy engine object.

  • shapefiles_path (Path) –

    Path to the NUTS shapefiles.

insert_time_points ⚓︎

insert_time_points(session, time_point_data)

Insert time points into the database.

Parameters:

  • session (Session) –

    SQLAlchemy session object.

  • time_point_data (list[ndarray, bool]) –

    List of tuples containing time point data, and its flag. If flag is True, the time point needs to be converted to monthly.

insert_var_types ⚓︎

insert_var_types(session, var_types)

Insert variable types into the database.

Parameters:

  • session (Session) –

    SQLAlchemy session object.

  • var_types (list[dict]) –

    List of dictionaries containing variable type data.

insert_var_values ⚓︎

insert_var_values(
    engine,
    ds,
    var_name,
    grid_id_map,
    time_id_map,
    var_id_map,
    to_monthly=False,
)

Insert variable values into the database.

Parameters:

  • engine (Engine) –

    SQLAlchemy engine object.

  • ds (Dataset) –

    xarray dataset with variable data.

  • var_name (str) –

    Name of the variable to insert.

  • grid_id_map (dict) –

    Mapping of grid points to IDs.

  • time_id_map (dict) –

    Mapping of time points to IDs.

  • var_id_map (dict) –

    Mapping of variable types to IDs.

  • to_monthly (bool, default: False ) –

    Whether to convert yearly data to monthly data. Defaults to False.

Returns: tuple: A tuple containing the time taken to convert yearly data to monthly data, and the time taken to insert the variable values.

install_postgis ⚓︎

install_postgis(engine)

Install PostGIS extension on the database.

Parameters:

  • engine (Engine) –

    SQLAlchemy engine object.

sort_grid_points_get_ids ⚓︎

sort_grid_points_get_ids(grid_points)