heiplanet_db.postgresql_database module⚓︎
heiplanet_db.postgresql_database
⚓︎
Classes:
-
Base–Base class for all models in the database.
-
GridPoint–Grid point table for storing latitude and longitude coordinates.
-
GridPointResolution–Many-to-many relationship between GridPoint and ResolutionGroup
-
NutsDef–NUTS definition table.
-
ResolutionGroup–Resolution group for the different grid resolutions.
-
TimePoint–Time point table for storing year, month, and day.
-
VarType–Variable type table for storing variable metadata.
-
VarValue–Variable value table for storing variable values at specific
-
VarValueNuts–Variable value table for storing variable values at specific
Functions:
-
add_data_list–Add a list of data instances to the database.
-
add_data_list_bulk–Add a list of data to the database in bulk.
-
assign_grid_resolution_group_to_grid_point–Assign the grid resolution group to each grid point,
-
convert_yearly_to_monthly–Convert yearly data to monthly data.
-
create_or_replace_tables–Create or replace tables in the database.
-
create_session–Create a new session for the database.
-
create_tables–Create all tables in the database.
-
extract_time_point–Extract year, month, and day from a numpy datetime64 object.
-
filter_nuts_ids_for_resolution–Filter NUTS IDs based on the specified resolution.
-
get_grid_ids_by_resolution–Get all grid point IDs for a specific resolution.
-
get_grid_ids_in_nuts–Get grid point IDs that are within the NUTS regions.
-
get_grid_points–Get grid points from the database that fall within a specified area.
-
get_id_maps–Get ID maps for grid points, time points, and variable types.
-
get_nuts_regions–Get NUTS regions from the database.
-
get_resolution_id–Get the resolution ID for a given resolution value.
-
get_time_points–Get time points from the database that fall within a specified range.
-
get_unique_time_points–Get the unique of time points.
-
get_var_types–Get variable types from the database with names specified in a list.
-
get_var_value–Get variable value from the database.
-
get_var_value_nuts–Get variable value from the database.
-
get_var_values_cartesian–Get variable values for a cartesian map.
-
get_var_values_cartesian_for_download–Get variable values for a cartesian map.
-
get_var_values_nuts–Get variable values for all two-digit NUTS regions.
-
initialize_database–Initialize the database by creating the engine and tables, and installing PostGIS.
-
insert_grid_points–Insert grid points into the database.
-
insert_nuts_def–Insert NUTS definition data into the database.
-
insert_resolution_groups–Create the resolution groups.
-
insert_time_points–Insert time points into the database.
-
insert_var_types–Insert variable types into the database.
-
insert_var_value_nuts–Insert variable values for NUTS regions into the database.
-
insert_var_values–Insert variable values into the database.
-
install_postgis–Install PostGIS extension on the database.
-
sort_grid_points_get_ids–
Attributes:
-
BATCH_SIZE– -
CRS– -
MAX_WORKERS– -
STR_POINT–
Base
⚓︎
Bases: DeclarativeBase
Base class for all models in the database.
GridPoint
⚓︎
Bases: Base
Grid point table for storing latitude and longitude coordinates.
Attributes:
-
__table_args__– -
__tablename__– -
id(Mapped[int]) – -
latitude(Mapped[float]) – -
longitude(Mapped[float]) – -
point(Mapped[Geometry]) –
GridPointResolution
⚓︎
Bases: Base
Many-to-many relationship between GridPoint and ResolutionGroup
Attributes:
-
__tablename__– -
grid_id(Mapped[int]) – -
resolution_id(Mapped[int]) –
NutsDef
⚓︎
Bases: Base
NUTS definition table.
Attributes:
-
__tablename__– -
cntr_code(Mapped[String]) – -
coast_type(Mapped[Float]) – -
geometry(Mapped[WKBElement]) – -
levl_code(Mapped[int]) – -
mount_type(Mapped[Float]) – -
name_latn(Mapped[String]) – -
nuts_id(Mapped[String]) – -
nuts_name(Mapped[String]) – -
urbn_type(Mapped[Float]) –
ResolutionGroup
⚓︎
Bases: Base
Resolution group for the different grid resolutions.
Attributes:
-
__tablename__– -
description(Mapped[str]) – -
id(Mapped[int]) – -
resolution(Mapped[float]) –
TimePoint
⚓︎
Bases: Base
Time point table for storing year, month, and day.
Attributes:
-
__table_args__– -
__tablename__– -
day(Mapped[int]) – -
id(Mapped[int]) – -
month(Mapped[int]) – -
year(Mapped[int]) –
VarType
⚓︎
Bases: Base
Variable type table for storing variable metadata.
Attributes:
-
__table_args__– -
__tablename__– -
description(Mapped[String]) – -
id(Mapped[int]) – -
name(Mapped[String]) – -
unit(Mapped[String]) –
VarValue
⚓︎
Bases: Base
Variable value table for storing variable values at specific grid points and time points.
Attributes:
-
__table_args__– -
__tablename__– -
grid_id(Mapped[int]) – -
id(Mapped[int]) – -
time_id(Mapped[int]) – -
value(Mapped[float]) – -
var_id(Mapped[int]) –
__table_args__
class-attribute
instance-attribute
⚓︎
__table_args__ = (
UniqueConstraint(
"time_id",
"grid_id",
"var_id",
name="uq_time_grid_var",
),
ForeignKeyConstraint(
["grid_id"],
["grid_point.id"],
name="fk_grid_id",
ondelete="CASCADE",
),
ForeignKeyConstraint(
["time_id"],
["time_point.id"],
name="fk_time_id",
ondelete="CASCADE",
),
ForeignKeyConstraint(
["var_id"],
["var_type.id"],
name="fk_var_id",
ondelete="CASCADE",
),
)
grid_id
class-attribute
instance-attribute
⚓︎
id
class-attribute
instance-attribute
⚓︎
time_id
class-attribute
instance-attribute
⚓︎
var_id
class-attribute
instance-attribute
⚓︎
VarValueNuts
⚓︎
Bases: Base
Variable value table for storing variable values at specific NUTS regions and time points.
Attributes:
-
__table_args__– -
__tablename__– -
id(Mapped[int]) – -
nuts_id(Mapped[String]) – -
time_id(Mapped[int]) – -
value(Mapped[float]) – -
var_id(Mapped[int]) –
__table_args__
class-attribute
instance-attribute
⚓︎
__table_args__ = (
UniqueConstraint(
"time_id",
"nuts_id",
"var_id",
name="uq_time_nuts_var",
),
ForeignKeyConstraint(
["nuts_id"],
["nuts_def.nuts_id"],
name="fk_nuts_id",
ondelete="CASCADE",
),
ForeignKeyConstraint(
["time_id"],
["time_point.id"],
name="fk_time_id_nuts",
ondelete="CASCADE",
),
ForeignKeyConstraint(
["var_id"],
["var_type.id"],
name="fk_var_id_nuts",
ondelete="CASCADE",
),
)
id
class-attribute
instance-attribute
⚓︎
nuts_id
class-attribute
instance-attribute
⚓︎
time_id
class-attribute
instance-attribute
⚓︎
var_id
class-attribute
instance-attribute
⚓︎
add_data_list
⚓︎
Add a list of data instances to the database.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
data_list(list) –List of data instances to add.
add_data_list_bulk
⚓︎
Add a list of data to the database in bulk.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
data_dict_list(list) –List of dictionaries containing data to insert.
-
class_type(Type[Base]) –SQLAlchemy mapped class to insert data into.
assign_grid_resolution_group_to_grid_point
⚓︎
Assign the grid resolution group to each grid point, creating the many-to-many relationship between resolutions and grid points.
Parameters:
-
session(Session) –SQLAlchemy session object.
convert_yearly_to_monthly
⚓︎
Convert yearly data to monthly data.
Parameters:
-
ds(Dataset) –xarray dataset with yearly data.
Returns:
-
Dataset–xr.Dataset: xarray dataset with monthly data.
create_or_replace_tables
⚓︎
Create or replace tables in the database.
Parameters:
-
engine(Engine) –SQLAlchemy engine object.
create_session
⚓︎
Create a new session for the database.
Parameters:
-
engine(Engine) –SQLAlchemy engine object.
Returns:
-
Session(Session) –SQLAlchemy session object.
create_tables
⚓︎
Create all tables in the database.
Parameters:
-
engine(Engine) –SQLAlchemy engine object.
extract_time_point
⚓︎
Extract year, month, and day from a numpy datetime64 object.
Parameters:
-
time_point(datetime64) –Numpy datetime64 object representing a time point.
Returns:
-
tuple(tuple[int, int, int, int, int, int]) –A tuple containing year, month, day, hour, minute, second.
filter_nuts_ids_for_resolution
⚓︎
Filter NUTS IDs based on the specified resolution.
Parameters:
-
nuts_ids(List[str]) –List of NUTS IDs to filter.
-
resolution(str) –Desired NUTS resolution ("NUTS0", "NUTS1", "NUTS2", "NUTS3").
Returns:
-
List[str]–List[str]: Filtered list of NUTS IDs matching the specified resolution.
get_grid_ids_by_resolution
⚓︎
Get all grid point IDs for a specific resolution.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
resolution_id(int) –Resolution ID to filter by.
Returns:
-
List[int]–List[int]: List of grid point IDs belonging to the specified resolution.
get_grid_ids_in_nuts
⚓︎
Get grid point IDs that are within the NUTS regions.
Parameters:
-
engine(Engine) –SQLAlchemy engine object.
-
nuts_regions(GeoDataFrame) –GeoDataFrame with NUTS region geometries.
Returns:
-
List[int]–List[int]: List of grid point IDs that intersect with the NUTS regions.
get_grid_points
⚓︎
Get grid points from the database that fall within a specified area. Args: session (Session): SQLAlchemy session object. area (None | Tuple[float, float, float, float]): Area as (North, West, South, East). If None, all grid points are returned. resolution_id (int | None): Resolution ID of the grid points. Defaults to None. Returns: List[GridPoint]: List of GridPoint objects within the specified area.
get_id_maps
⚓︎
Get ID maps for grid points, time points, and variable types.
Parameters:
-
session(Session) –SQLAlchemy session object.
Returns:
-
tuple(tuple[dict, dict, dict]) –A tuple containing three dictionaries:
-
grid_id_map: Mapping of (latitude, longitude) to grid point ID.
-
time_id_map: Mapping of datetime64 to time point ID.
-
var_id_map: Mapping of variable name to variable type ID.
-
get_nuts_regions
⚓︎
Get NUTS regions from the database.
Parameters:
-
engine(Engine) –SQLAlchemy engine object.
Returns:
-
GeoDataFrame–gpd.GeoDataFrame: GeoDataFrame with NUTS region attributes and geometries.
get_resolution_id
⚓︎
Get the resolution ID for a given resolution value.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
resolution(float) –Resolution value to look up.
Returns:
-
int | None–int | None: Resolution ID if found, None otherwise.
get_time_points
⚓︎
Get time points from the database that fall within a specified range.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
start_time_point(Tuple[int, int]) –Start time point as (year, month).
-
end_time_point(Tuple[int, int] | None, default:None) –End time point as (year, month). If None, only the start time point is used.
Returns:
-
List[TimePoint]–List[TimePoint]: List of TimePoint objects within the specified range.
get_unique_time_points
⚓︎
Get the unique of time points.
Parameters:
-
time_point_data(list[Tuple[ndarray, bool]]) –List of tuples containing time point data, and the yearly flag. If flag is True, the time point needs to be converted to monthly.
Returns:
-
ndarray–np.ndarray: Unique of (sorted) time points as a numpy array.
get_var_types
⚓︎
Get variable types from the database with names specified in a list.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
var_names(None | List[str], default:None) –List of variable names to filter by. If None, all variable types are returned.
Returns:
-
List[VarType]–List[VarType]: List of VarType objects with the specified names.
get_var_value
⚓︎
Get variable value from the database.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
var_name(str) –Name of the variable to retrieve.
-
lat(float) –Latitude of the grid point.
-
lon(float) –Longitude of the grid point.
-
year(int) –Year of the time point.
-
month(int) –Month of the time point.
-
day(int) –Day of the time point.
Returns:
-
float | int | str | None–float | int | str | None: Value of the variable at the specified grid point and time point.
get_var_value_nuts
⚓︎
Get variable value from the database.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
var_name(str) –Name of the variable to retrieve.
-
nuts_region(str) –NUTS region code.
-
year(int) –Year of the time point.
-
month(int) –Month of the time point.
-
day(int) –Day of the time point.
Returns:
-
float | int | str | None–float | int | str | None: Value of the variable at the specified grid point and time point.
get_var_values_cartesian
⚓︎
Get variable values for a cartesian map.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
time_point(Tuple[int, int]) –Date point as (year, month).
-
grid_resolution(float, default:0.1) –Resolution of the grid points. Defaults to 0.1 degree.
-
area(None | Tuple[float, float, float, float], default:None) –Area as (North, West, South, East). If None, all grid points are used.
-
var_name(None | str, default:None) –Variable name for which values should be returned. If None, the default model values will be returned.
Returns:
-
dict(dict) –a dict with (latitude, longitude, var_value) for the requested date.
get_var_values_cartesian_for_download
⚓︎
get_var_values_cartesian_for_download(
session,
start_time_point,
end_time_point=None,
area=None,
var_names=None,
netcdf_file="cartesian_grid_data_heiplanet.nc",
)
Get variable values for a cartesian map.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
start_time_point(Tuple[int, int]) –Start time point as (year, month).
-
end_time_point(Tuple[int, int] | None, default:None) –End time point as (year, month). If None, only the start time point is used.
-
area(None | Tuple[float, float, float, float], default:None) –Area as (North, West, South, East). If None, all grid points are used.
-
var_names(None | List[str], default:None) –List of variable names to filter by. If None, all variable types are used.
-
netcdf_file(str, default:'cartesian_grid_data_heiplanet.nc') –Name of the NetCDF file to save the dataset.
Returns:
-
dict(dict) –a dict with (time, latitude, longitude, var_value) keys. time or var_value is empty if no data is found.
get_var_values_nuts
⚓︎
Get variable values for all two-digit NUTS regions.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
time_point(Tuple[int, int]) –Date point as (year, month).
-
var_name(None | str, default:None) –Variable name for which values should be returned. If None, the default model values will be returned.
-
grid_resolution(str, default:'NUTS2') –Grid resolution, by default "NUTS2" is returned.
Returns:
-
dict(dict) –A dict with (NUTS_id: var_value) for the requested date and variable type. The NUTS id is the two-digit nuts abbreviation for the regions.
initialize_database
⚓︎
Initialize the database by creating the engine and tables, and installing PostGIS. If replace is True, it will drop and recreate the tables.
Parameters:
-
db_url(str) –Database URL for SQLAlchemy.
-
replace(bool, default:False) –Whether to drop and recreate the tables. Defaults to False.
insert_grid_points
⚓︎
Insert grid points into the database.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
latitudes(ndarray) –Array of latitudes.
-
longitudes(ndarray) –Array of longitudes.
insert_nuts_def
⚓︎
Insert NUTS definition data into the database. The shapefiles are downloaded from the Eurostat website. More details for downloading NUTS shapefiles can be found in our data page
Five shapefiles are involved in the process:
- .shp: geometry data (e.g. polygons)
- .shx: shape index data
- .dbf: attribute data (e.g. names, codes)
- .prj: projection data (i.e. CRS)
- .cpg: character encoding data
Parameters:
-
engine(Engine) –SQLAlchemy engine object.
-
shapefiles_path(Path) –Path to the NUTS shapefiles.
insert_resolution_groups
⚓︎
insert_resolution_groups(
session,
resolutions=array(
[0.1, 0.2, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 5.0]
),
descriptions=None,
)
Create the resolution groups.
There are different degrees resolution that can be requested: 0.1 degree, 0.2/0.5/1.0/1.5/2.0/2.5/3.0/5.0 degrees. We are currently using 0.2 degree resolution and not 0.25 degree resolution here, since this is a subset of 0.1 degree resolution grid points. Otherwise we would need to create additional grid points for 0.25 degree resolution through interpolation, which has not been defined as to which interpolation method can be used for this.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
resolutions(ndarray, default:array([0.1, 0.2, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 5.0])) –Array of resolutions to insert. Defaults to 0.1, 0.2, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 5.0 degree resolution.
-
descriptions(list[str] | None, default:None) –List of descriptions for each resolution. If None, default descriptions will be used.
insert_time_points
⚓︎
Insert time points into the database.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
time_point_data(list[ndarray, bool]) –List of tuples containing time point data, and its flag. If flag is True, the time point needs to be converted to monthly.
insert_var_types
⚓︎
Insert variable types into the database.
Parameters:
-
session(Session) –SQLAlchemy session object.
-
var_types(list[dict]) –List of dictionaries containing variable type data.
insert_var_value_nuts
⚓︎
Insert variable values for NUTS regions into the database.
Parameters:
-
engine(Engine) –SQLAlchemy engine object.
-
ds(Dataset) –xarray dataset with dimensions (time, NUTS_ID).
-
var_name(str) –Name of the variable to insert.
-
time_id_map(dict) –Mapping of time points to IDs.
-
var_id_map(dict) –Mapping of variable names to variable type IDs.
Returns:
-
float(float) –The time taken to insert the variable values.
insert_var_values
⚓︎
Insert variable values into the database.
Parameters:
-
engine(Engine) –SQLAlchemy engine object.
-
ds(Dataset) –xarray dataset with variable data.
-
var_name(str) –Name of the variable to insert.
-
grid_id_map(dict) –Mapping of grid points to IDs.
-
time_id_map(dict) –Mapping of time points to IDs.
-
var_id_map(dict) –Mapping of variable types to IDs.
-
to_monthly(bool, default:False) –Whether to convert yearly data to monthly data. Defaults to False.
Returns: tuple: A tuple containing the time taken to convert yearly data to monthly data, and the time taken to insert the variable values.
install_postgis
⚓︎
Install PostGIS extension on the database.
Parameters:
-
engine(Engine) –SQLAlchemy engine object.