onehealth_db.postgresql_database module⚓︎
onehealth_db.postgresql_database
⚓︎
Classes:
-
Base
–Base class for all models in the database.
-
GridPoint
–Grid point table for storing latitude and longitude coordinates.
-
NutsDef
–NUTS definition table.
-
TimePoint
–Time point table for storing year, month, and day.
-
VarType
–Variable type table for storing variable metadata.
-
VarValue
–Variable value table for storing variable values at specific
Functions:
-
add_data_list
–Add a list of data instances to the database.
-
add_data_list_bulk
–Add a list of data to the database in bulk.
-
convert_yearly_to_monthly
–Convert yearly data to monthly data.
-
create_or_replace_tables
–Create or replace tables in the database.
-
create_session
–Create a new session for the database.
-
create_tables
–Create all tables in the database.
-
extract_time_point
–Extract year, month, and day from a numpy datetime64 object.
-
get_grid_ids_in_nuts
–Get grid point IDs that are within the NUTS regions.
-
get_grid_points
–Get grid points from the database that fall within a specified area.
-
get_id_maps
–Get ID maps for grid points, time points, and variable types.
-
get_nuts_regions
–Get NUTS regions from the database.
-
get_time_points
–Get time points from the database that fall within a specified range.
-
get_unique_time_points
–Get the unique of time points.
-
get_var_types
–Get variable types from the database with names specified in a list.
-
get_var_value
–Get variable value from the database.
-
get_var_values_cartesian
–Get variable values for a cartesian map.
-
get_var_values_cartesian_for_download
–Get variable values for a cartesian map.
-
get_var_values_nuts
–Get variable values for NUTS regions.
-
initialize_database
–Initialize the database by creating the engine and tables, and installing PostGIS.
-
insert_grid_points
–Insert grid points into the database.
-
insert_nuts_def
–Insert NUTS definition data into the database.
-
insert_time_points
–Insert time points into the database.
-
insert_var_types
–Insert variable types into the database.
-
insert_var_values
–Insert variable values into the database.
-
install_postgis
–Install PostGIS extension on the database.
-
sort_grid_points_get_ids
–
Attributes:
-
BATCH_SIZE
– -
CRS
– -
MAX_WORKERS
– -
STR_POINT
–
Base
⚓︎
Bases: DeclarativeBase
Base class for all models in the database.
GridPoint
⚓︎
Bases: Base
Grid point table for storing latitude and longitude coordinates.
Attributes:
-
__table_args__
– -
__tablename__
– -
id
(Mapped[int]
) – -
latitude
(Mapped[float]
) – -
longitude
(Mapped[float]
) – -
point
(Mapped[Geometry]
) –
NutsDef
⚓︎
Bases: Base
NUTS definition table.
Attributes:
-
__tablename__
– -
cntr_code
(Mapped[String]
) – -
coast_type
(Mapped[Float]
) – -
geometry
(Mapped[WKBElement]
) – -
levl_code
(Mapped[int]
) – -
mount_type
(Mapped[Float]
) – -
name_latn
(Mapped[String]
) – -
nuts_id
(Mapped[String]
) – -
nuts_name
(Mapped[String]
) – -
urbn_type
(Mapped[Float]
) –
TimePoint
⚓︎
Bases: Base
Time point table for storing year, month, and day.
Attributes:
-
__table_args__
– -
__tablename__
– -
day
(Mapped[int]
) – -
id
(Mapped[int]
) – -
month
(Mapped[int]
) – -
year
(Mapped[int]
) –
VarType
⚓︎
Bases: Base
Variable type table for storing variable metadata.
Attributes:
-
__table_args__
– -
__tablename__
– -
description
(Mapped[String]
) – -
id
(Mapped[int]
) – -
name
(Mapped[String]
) – -
unit
(Mapped[String]
) –
VarValue
⚓︎
Bases: Base
Variable value table for storing variable values at specific grid points and time points.
Attributes:
-
__table_args__
– -
__tablename__
– -
grid_id
(Mapped[int]
) – -
id
(Mapped[int]
) – -
time_id
(Mapped[int]
) – -
value
(Mapped[float]
) – -
var_id
(Mapped[int]
) –
__table_args__
class-attribute
instance-attribute
⚓︎
__table_args__ = (
UniqueConstraint(
"time_id",
"grid_id",
"var_id",
name="uq_time_grid_var",
),
ForeignKeyConstraint(
["grid_id"],
["grid_point.id"],
name="fk_grid_id",
ondelete="CASCADE",
),
ForeignKeyConstraint(
["time_id"],
["time_point.id"],
name="fk_time_id",
ondelete="CASCADE",
),
ForeignKeyConstraint(
["var_id"],
["var_type.id"],
name="fk_var_id",
ondelete="CASCADE",
),
)
grid_id
class-attribute
instance-attribute
⚓︎
id
class-attribute
instance-attribute
⚓︎
time_id
class-attribute
instance-attribute
⚓︎
var_id
class-attribute
instance-attribute
⚓︎
add_data_list
⚓︎
Add a list of data instances to the database.
Parameters:
-
session
(Session
) –SQLAlchemy session object.
-
data_list
(list
) –List of data instances to add.
add_data_list_bulk
⚓︎
Add a list of data to the database in bulk.
Parameters:
-
session
(Session
) –SQLAlchemy session object.
-
data_dict_list
(list
) –List of dictionaries containing data to insert.
-
class_type
(Type[Base]
) –SQLAlchemy mapped class to insert data into.
convert_yearly_to_monthly
⚓︎
Convert yearly data to monthly data.
Parameters:
-
ds
(Dataset
) –xarray dataset with yearly data.
Returns:
-
Dataset
–xr.Dataset: xarray dataset with monthly data.
create_or_replace_tables
⚓︎
Create or replace tables in the database.
Parameters:
-
engine
(Engine
) –SQLAlchemy engine object.
create_session
⚓︎
Create a new session for the database.
Parameters:
-
engine
(Engine
) –SQLAlchemy engine object.
Returns:
-
Session
(Session
) –SQLAlchemy session object.
create_tables
⚓︎
Create all tables in the database.
Parameters:
-
engine
(Engine
) –SQLAlchemy engine object.
extract_time_point
⚓︎
Extract year, month, and day from a numpy datetime64 object.
Parameters:
-
time_point
(datetime64
) –Numpy datetime64 object representing a time point.
Returns:
-
tuple
(tuple[int, int, int, int, int, int]
) –A tuple containing year, month, day, hour, minute, second.
get_grid_ids_in_nuts
⚓︎
Get grid point IDs that are within the NUTS regions.
Parameters:
-
engine
(Engine
) –SQLAlchemy engine object.
-
nuts_regions
(GeoDataFrame
) –GeoDataFrame with NUTS region geometries.
Returns:
-
List[int]
–List[int]: List of grid point IDs that intersect with the NUTS regions.
get_grid_points
⚓︎
Get grid points from the database that fall within a specified area. Args: session (Session): SQLAlchemy session object. area (None | Tuple[float, float, float, float]): Area as (North, West, South, East). If None, all grid points are returned. Returns: List[GridPoint]: List of GridPoint objects within the specified area.
get_id_maps
⚓︎
Get ID maps for grid points, time points, and variable types.
Parameters:
-
session
(Session
) –SQLAlchemy session object.
Returns:
-
tuple
(tuple[dict, dict, dict]
) –A tuple containing three dictionaries:
-
grid_id_map: Mapping of (latitude, longitude) to grid point ID.
-
time_id_map: Mapping of datetime64 to time point ID.
-
var_id_map: Mapping of variable name to variable type ID.
-
get_nuts_regions
⚓︎
Get NUTS regions from the database.
Parameters:
-
engine
(Engine
) –SQLAlchemy engine object.
-
area
(None | Tuple[float, float, float, float]
, default:None
) –Area as (North, West, South, East). If None, all NUTS regions are returned.
Returns:
-
GeoDataFrame
–gpd.GeoDataFrame: GeoDataFrame with NUTS region attributes and geometries.
get_time_points
⚓︎
Get time points from the database that fall within a specified range.
Parameters:
-
session
(Session
) –SQLAlchemy session object.
-
start_time_point
(Tuple[int, int]
) –Start time point as (year, month).
-
end_time_point
(Tuple[int, int] | None
, default:None
) –End time point as (year, month). If None, only the start time point is used.
Returns:
-
List[TimePoint]
–List[TimePoint]: List of TimePoint objects within the specified range.
get_unique_time_points
⚓︎
Get the unique of time points.
Parameters:
-
time_point_data
(list[Tuple[ndarray, bool]]
) –List of tuples containing time point data, and the yearly flag. If flag is True, the time point needs to be converted to monthly.
Returns:
-
ndarray
–np.ndarray: Unique of (sorted) time points as a numpy array.
get_var_types
⚓︎
Get variable types from the database with names specified in a list.
Parameters:
-
session
(Session
) –SQLAlchemy session object.
-
var_names
(None | List[str]
, default:None
) –List of variable names to filter by. If None, all variable types are returned.
Returns:
-
List[VarType]
–List[VarType]: List of VarType objects with the specified names.
get_var_value
⚓︎
Get variable value from the database.
Parameters:
-
session
(Session
) –SQLAlchemy session object.
-
var_name
(str
) –Name of the variable to retrieve.
-
lat
(float
) –Latitude of the grid point.
-
lon
(float
) –Longitude of the grid point.
-
year
(int
) –Year of the time point.
-
month
(int
) –Month of the time point.
-
day
(int
) –Day of the time point.
Returns:
-
float | int | str | None
–float | int | str | None: Value of the variable at the specified grid point and time point.
get_var_values_cartesian
⚓︎
Get variable values for a cartesian map.
Parameters:
-
session
(Session
) –SQLAlchemy session object.
-
start_time_point
(Tuple[int, int]
) –Start time point as (year, month).
-
end_time_point
(Tuple[int, int] | None
, default:None
) –End time point as (year, month). If None, only the start time point is used.
-
var_names
(None | List[str]
, default:None
) –List of variable names to filter by. If None, all variable types are used.
Returns:
-
dict
(dict
) –a dict with (time, latitude, longitude, var_value) keys.
get_var_values_cartesian_for_download
⚓︎
get_var_values_cartesian_for_download(
session,
start_time_point,
end_time_point=None,
area=None,
var_names=None,
netcdf_file="cartesian_grid_data_onehealth.nc",
)
Get variable values for a cartesian map.
Parameters:
-
session
(Session
) –SQLAlchemy session object.
-
start_time_point
(Tuple[int, int]
) –Start time point as (year, month).
-
end_time_point
(Tuple[int, int] | None
, default:None
) –End time point as (year, month). If None, only the start time point is used.
-
area
(None | Tuple[float, float, float, float]
, default:None
) –Area as (North, West, South, East). If None, all grid points are used.
-
var_names
(None | List[str]
, default:None
) –List of variable names to filter by. If None, all variable types are used.
-
netcdf_file
(str
, default:'cartesian_grid_data_onehealth.nc'
) –Name of the NetCDF file to save the dataset.
Returns:
-
dict
(dict
) –a dict with (time, latitude, longitude, var_value) keys. time or var_value is empty if no data is found.
get_var_values_nuts
⚓︎
get_var_values_nuts(
engine,
session,
start_time_point,
end_time_point=None,
area=None,
var_names=None,
shapefile=None,
)
Get variable values for NUTS regions.
Parameters:
-
engine
(Engine
) –SQLAlchemy engine object.
-
session
(Session
) –SQLAlchemy session object.
-
start_time_point
(Tuple[int, int]
) –Start time point as (year, month).
-
end_time_point
(Tuple[int, int] | None
, default:None
) –End time point as (year, month). If None, only the start time point is used.
-
area
(None | Tuple[float, float, float, float]
, default:None
) –Area as (North, West, South, East). If None, all grid points are used.
-
var_names
(None | List[str]
, default:None
) –List of variable names to filter by. If None, all variable types are used.
-
netcdf_file
(str | None
) –Path to the NetCDF file to save the dataset. If None, the dataset is not saved to a file.
Returns:
-
GeoDataFrame | None
–gpd.GeoDataFrame | None: GeoDataFrame with NUTS region attributes and variable values for each NUTS region. None if no data is found.
initialize_database
⚓︎
Initialize the database by creating the engine and tables, and installing PostGIS. If replace is True, it will drop and recreate the tables.
Parameters:
-
db_url
(str
) –Database URL for SQLAlchemy.
-
replace
(bool
, default:False
) –Whether to drop and recreate the tables. Defaults to False.
insert_grid_points
⚓︎
Insert grid points into the database.
Parameters:
-
session
(Session
) –SQLAlchemy session object.
-
latitudes
(ndarray
) –Array of latitudes.
-
longitudes
(ndarray
) –Array of longitudes.
insert_nuts_def
⚓︎
Insert NUTS definition data into the database. The shapefiles are downloaded from the Eurostat website. More details for downloading NUTS shapefiles can be found in our data page
Five shapefiles are involved in the process:
- .shp
: geometry data (e.g. polygons)
- .shx
: shape index data
- .dbf
: attribute data (e.g. names, codes)
- .prj
: projection data (i.e. CRS)
- .cpg
: character encoding data
Parameters:
-
engine
(Engine
) –SQLAlchemy engine object.
-
shapefiles_path
(Path
) –Path to the NUTS shapefiles.
insert_time_points
⚓︎
Insert time points into the database.
Parameters:
-
session
(Session
) –SQLAlchemy session object.
-
time_point_data
(list[ndarray, bool]
) –List of tuples containing time point data, and its flag. If flag is True, the time point needs to be converted to monthly.
insert_var_types
⚓︎
Insert variable types into the database.
Parameters:
-
session
(Session
) –SQLAlchemy session object.
-
var_types
(list[dict]
) –List of dictionaries containing variable type data.
insert_var_values
⚓︎
Insert variable values into the database.
Parameters:
-
engine
(Engine
) –SQLAlchemy engine object.
-
ds
(Dataset
) –xarray dataset with variable data.
-
var_name
(str
) –Name of the variable to insert.
-
grid_id_map
(dict
) –Mapping of grid points to IDs.
-
time_id_map
(dict
) –Mapping of time points to IDs.
-
var_id_map
(dict
) –Mapping of variable types to IDs.
-
to_monthly
(bool
, default:False
) –Whether to convert yearly data to monthly data. Defaults to False.
Returns: tuple: A tuple containing the time taken to convert yearly data to monthly data, and the time taken to insert the variable values.
install_postgis
⚓︎
Install PostGIS extension on the database.
Parameters:
-
engine
(Engine
) –SQLAlchemy engine object.