API Reference
Computation graph module
This module defines the ComputationGraph class. The class represents the run of a given heiplanet model as a directed acyclic graph (DAG) of interdependent tasks and manages the setup and execution of that graph.
ComputationGraph
A class representing a computation DAG that executes a series of tasks which together make up the run of a given heiplanet model. Models are defined as combinations of functions known to the class. A module is a loose collection of functions that are registered with the class and combined into a computational graph to form a working system. Functions are therefore registered either as part of a module or as utility functions, e.g., if they are used by multiple modules. The computational graph is built from these functions and executed via Dask tasks, which allows for parallel, lazy execution and efficient resource management. Computations can be combined freely from the functions registered with different modules.
Attributes:
| Name | Type | Description |
|---|---|---|
| modules | dict[str, Any] | A dictionary of modules, where each module is a module object imported from a given path. |
| module_functions | dict[str, dict[str, Callable]] | A dictionary mapping module names to dictionaries of function names and their corresponding callable objects. |
| task_graph | dict[str, Delayed] | A dictionary representing the Dask computational graph, where each node is a dask.delayed object. |
| config | dict[str, Any] | A configuration dictionary that defines the computation and the structure of the computational graph. |
| sink_node | Delayed \| None | The sink node of the computational graph, which is the final node that triggers the execution of the entire computation. |
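How a registry of module functions turns into an ordered computation can be pictured with a small standard-library sketch. It is not the class's implementation: graphlib.TopologicalSorter stands in for Dask here, and the registry contents, the node names, and the run_graph helper are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: module name -> {function name -> callable}.
module_functions = {
    "io": {"load": lambda: [1.0, 2.0, 3.0]},
    "model": {"scale": lambda values: [2.0 * v for v in values]},
    "utils": {"total": lambda values: sum(values)},
}

# Hypothetical graph description: node -> (module, function, upstream nodes).
graph_spec = {
    "load": ("io", "load", []),
    "scale": ("model", "scale", ["load"]),
    "total": ("utils", "total", ["scale"]),
}


def run_graph(spec, registry):
    """Run every node after its dependencies and return all results."""
    sorter = TopologicalSorter({name: deps for name, (_, _, deps) in spec.items()})
    results = {}
    for name in sorter.static_order():
        module, function, deps = spec[name]
        # Results of upstream nodes become positional arguments.
        results[name] = registry[module][function](*(results[d] for d in deps))
    return results


results = run_graph(graph_spec, module_functions)
print(results["total"])  # 12.0
```

In this sketch the node "total" plays the role of the sink node: it depends, directly or indirectly, on every other node. Unlike Dask, the sketch runs eagerly and sequentially.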
Source code in src/heiplanet_models/computation_graph.py
__init__(config)
Initialize the computation graph from the given configuration. This method verifies the configuration, loads the necessary modules, retrieves the functions from the modules, builds the computational graph, and sets the Dask scheduler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | dict[str, Any] | The configuration dictionary. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If the configuration is invalid. |
Source code in src/heiplanet_models/computation_graph.py
execute(client=None)
Executes the computational graph.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| client | Client | The client to use for execution if the computation should be executed on a cluster. If None, will use the local machine. Defaults to None. For more on how to use the client, see https://distributed.dask.org/en/stable/client.html. | None |
Raises:
| Type | Description |
|---|---|
| ValueError | If the sink node is not defined. |
| ValueError | If the scheduler is not defined. |
Returns:
| Type | Description |
|---|---|
| Any | The result of the computation. |
Source code in src/heiplanet_models/computation_graph.py
from_config(path_to_config)
classmethod
Creates a ComputationGraph instance from a configuration dictionary read from a JSON file.
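The file-reading step can be sketched with the standard library alone. This is a minimal sketch under assumptions, not the method's actual code: the read_config helper and the keys in the example dictionary are hypothetical, and the real configuration schema is not described on this page.

```python
import json
import tempfile
from pathlib import Path


def read_config(path_to_config):
    """Read a JSON file and return its content as a dictionary."""
    with open(path_to_config, encoding="utf-8") as f:
        config = json.load(f)
    if not isinstance(config, dict):
        raise ValueError("The configuration file must contain a JSON object.")
    return config


# Hypothetical configuration content, used only to exercise the helper.
example = {"name": "example-run", "nodes": []}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text(json.dumps(example), encoding="utf-8")
    config = read_config(path)

print(config == example)  # True
```

A dictionary obtained this way is the kind of object that the config parameter of __init__ expects.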
Source code in src/heiplanet_models/computation_graph.py
visualize(filename)
Visualizes the computational graph.
Raises:
| Type | Description |
|---|---|
| ValueError | If the sink node is not defined. |
Returns:
| Type | Description |
|---|---|
| Any | The visualization of the sink node as returned by the Delayed.visualize method. |
Source code in src/heiplanet_models/computation_graph.py
Jmodel model implementation
read_default_config()
Reads the default configuration for the JModel from a JSON file.
Returns:
| Type | Description |
|---|---|
| dict[str, str \| int64 \| None] | A dictionary containing the default configuration. |
Source code in src/heiplanet_models/Jmodel.py
read_input_data(model_data)
Reads the input data from the source given in model_data.input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_data | JModelData | Data class containing the model configuration and input data path. | required |
Returns:
| Type | Description |
|---|---|
| Dataset | xarray dataset containing the input data for the model. |
Source code in src/heiplanet_models/Jmodel.py
run_model(model_data, data)
Runs the JModel with the provided input data. Applies the R0 interpolation based on temperature values from the stored R0 data and returns a new dataset or dataframe with the R0 data.
Parameters:
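| Name | Type | Description | Default |
|---|---|---|---|
| model_data | JModelData | Data class containing the model configuration and input data path. | required |
| data | Dataset \| DataFrame | Input data for the model, containing the temperature values used for the interpolation. | required |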
Returns:
| Type | Description |
|---|---|
| Dataset \| DataFrame | A dataset or dataframe with the incoming R0 data interpolated based on the temperature values at each grid point. |
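The idea of looking up R0 from temperature can be illustrated with a plain-Python sketch. Several things here are assumptions: that the interpolation is linear, that values outside the table are clamped to the nearest end, and the interpolate_r0 helper itself. The lookup table contains made-up numbers, not real R0 data.

```python
from bisect import bisect_right


def interpolate_r0(temperature, temps, r0_values):
    """Linearly interpolate R0 at `temperature` from a sorted lookup table."""
    # Assumption: clamp to the table ends outside the tabulated range.
    if temperature <= temps[0]:
        return r0_values[0]
    if temperature >= temps[-1]:
        return r0_values[-1]
    i = bisect_right(temps, temperature)
    t0, t1 = temps[i - 1], temps[i]
    r0, r1 = r0_values[i - 1], r0_values[i]
    return r0 + (r1 - r0) * (temperature - t0) / (t1 - t0)


# Toy lookup table (made-up numbers, not real R0 data).
temps = [10.0, 20.0, 30.0]
r0_values = [0.0, 2.0, 1.0]

# One temperature per grid point.
grid_temperatures = [5.0, 15.0, 25.0, 35.0]
grid_r0 = [interpolate_r0(t, temps, r0_values) for t in grid_temperatures]
print(grid_r0)  # [0.0, 1.0, 1.5, 1.0]
```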
Source code in src/heiplanet_models/Jmodel.py
setup_modeldata(input=None, output=None, r0_path=None, run_mode='forbidden', grid_data_baseurl=None, nuts_level=None, resolution=None, year=None, temp_colname='t2m', out_colname='R0')
Initializes the JModel with the given configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input | str \| None | Path to the input data file. | None |
| output | str \| None | Path to the output data file. | None |
| r0_path | str \| None | Path to the R0 data file. | None |
| run_mode | str | Dask run mode used by xarray, default is "forbidden". | 'forbidden' |
| grid_data_baseurl | str \| None | Base URL for the grid data. | None |
| nuts_level | int \| None | NUTS level for the model, default is None. | None |
| resolution | str \| None | Resolution for the NUTS data, default is None. | None |
| year | int \| None | Year for the model, default is None. | None |
| temp_colname | str | Name of the temperature column in the input data, default is "t2m". | 't2m' |
| out_colname | str | Name of the output column for R0 data, default is "R0". | 'R0' |
Source code in src/heiplanet_models/Jmodel.py
Utilities used throughout the code
detect_csr(data)
Detects and sets the coordinate reference system (CRS) for an xarray dataset. Uses rioxarray to handle the CRS. If the CRS is not defined, the function checks whether the coordinates match the expected ranges for EPSG:4326 (standard lat/lon coordinates).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Dataset | xarray dataset to check and set the CRS for. Typically these are ERA5 data or other climate data, which often do not come with a given CRS. Currently, this only supports the EPSG:4326 standard lat/lon coordinates, which are defined as follows: longitude from -180 to 180 degrees, latitude from -90 to 90 degrees. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | When the CRS is not defined and the coordinates do not match the expected ranges for EPSG:4326. |
Returns:
| Type | Description |
|---|---|
| Dataset | Dataset with the CRS set to EPSG:4326 if it was not already defined and the coordinates match the expected ranges. |
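The range check described above can be sketched without xarray or rioxarray. The helper names below are hypothetical and plain lists stand in for the dataset coordinates, so this only illustrates the decision logic, not the library function.

```python
def looks_like_epsg4326(longitudes, latitudes):
    """Return True if all coordinates fall into the EPSG:4326 ranges."""
    return all(-180.0 <= lon <= 180.0 for lon in longitudes) and all(
        -90.0 <= lat <= 90.0 for lat in latitudes
    )


def check_crs(longitudes, latitudes, crs=None):
    """Keep an existing CRS, otherwise accept EPSG:4326 or raise."""
    if crs is not None:
        return crs  # a CRS is already defined, nothing to detect
    if looks_like_epsg4326(longitudes, latitudes):
        return "EPSG:4326"
    raise ValueError("CRS is undefined and coordinates do not match EPSG:4326.")


detected = check_crs([-10.0, 0.0, 25.5], [35.0, 60.0])
print(detected)  # EPSG:4326

try:
    check_crs([0.0, 359.75], [0.0])  # longitude outside -180 to 180
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```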
Source code in src/heiplanet_models/utils.py
load_module(module_name, file_path)
Loads a Python module from file_path under the alias module_name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| module_name | str | Module alias. | required |
| file_path | str | Path to load the module from. | required |
Returns:
| Type | Description |
|---|---|
| module | Python module that has been loaded. |
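Loading a module from a file path under a chosen alias is commonly done with importlib. The following is a minimal sketch of that general technique, assuming nothing about how load_module is actually written; the helper name load_module_from_path and the temporary example module are hypothetical.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_from_path(module_name, file_path):
    """Load the Python file at `file_path` as a module named `module_name`."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load a module from {file_path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the module's top-level code
    return module


with tempfile.TemporaryDirectory() as tmp:
    # Write a tiny module to disk so the sketch is self-contained.
    path = Path(tmp) / "example_module.py"
    path.write_text("def double(x):\n    return 2 * x\n", encoding="utf-8")
    mod = load_module_from_path("example_alias", str(path))

result = mod.double(21)
print(mod.__name__, result)  # example_alias 42
```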
Source code in src/heiplanet_models/utils.py
load_name_from_module(module_name, file_path, name)
Loads the Python module at file_path under the alias module_name in order to import name from it.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| module_name | str | Module alias. | required |
| file_path | str | Path to load the module from. | required |
| name | str | Name to import. | required |
Returns:
| Type | Description |
|---|---|
| module | Python module that has been loaded. |
Source code in src/heiplanet_models/utils.py
read_geodata(nuts_level=3, year=2024, resolution='10M', base_url='https://gisco-services.ec.europa.eu/distribution/v2/nuts', url=lambda base_url, resolution, year, nuts_level: f'{base_url}/geojson/NUTS_RG_{resolution}_{year}_4326_LEVL_{nuts_level}.geojson')
Loads Eurostat NUTS geospatial data from the Eurostat service.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| nuts_level | int | NUTS administrative region level. Defaults to 3. | 3 |
| year | int | Year to load data for. Defaults to 2024. | 2024 |
| resolution | str | Resolution of the geospatial data. One of "60" (1:60 million), "20" (1:20 million), "10" (1:10 million), "03" (1:3 million) or "01" (1:1 million). | '10M' |
| base_url | str | Base URL for the NUTS data. Defaults to "https://gisco-services.ec.europa.eu/distribution/v2/nuts". | 'https://gisco-services.ec.europa.eu/distribution/v2/nuts' |
| url | callable | Builds the full URL from the arguments passed to the function. Must have the signature url(base_url, resolution, year, nuts_level). | lambda base_url, resolution, year, nuts_level: f'{base_url}/geojson/NUTS_RG_{resolution}_{year}_4326_LEVL_{nuts_level}.geojson' |
Returns:
| Type | Description |
|---|---|
| geopandas.GeoDataFrame | GeoDataFrame containing the NUTS geospatial data. |
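To see what the url argument produces, the default builder from the signature can be evaluated on its own with the default argument values. The second builder, local_url, and its base URL are hypothetical and only show that any callable with the same four parameters fits.

```python
def url(base_url, resolution, year, nuts_level):
    """Same expression as the default lambda in the signature."""
    return f"{base_url}/geojson/NUTS_RG_{resolution}_{year}_4326_LEVL_{nuts_level}.geojson"


base_url = "https://gisco-services.ec.europa.eu/distribution/v2/nuts"
default_url = url(base_url, "10M", 2024, 3)
print(default_url)
# https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_RG_10M_2024_4326_LEVL_3.geojson


def local_url(base_url, resolution, year, nuts_level):
    """Hypothetical builder for a differently organized file store."""
    return f"{base_url}/nuts_{nuts_level}_{year}_{resolution}.geojson"


custom_url = local_url("https://example.org/geodata", "10M", 2024, 3)
print(custom_url)  # https://example.org/geodata/nuts_3_2024_10M.geojson
```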
Source code in src/heiplanet_models/utils.py
validate_spatial_alignment(arr1, arr2)
Validates that two xarray DataArrays have aligned spatial coordinates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| arr1 | DataArray | The first DataArray. | required |
| arr2 | DataArray | The second DataArray. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If the 'latitude' or 'longitude' coordinates do not match or if the coordinates are missing. |
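The two failure cases named above, missing coordinates and mismatching coordinates, can be sketched with plain dictionaries standing in for the coordinates of the two DataArrays. The helper name and the use of exact equality are assumptions; the real function may compare differently, for example with a tolerance.

```python
def validate_alignment(coords1, coords2, names=("latitude", "longitude")):
    """Raise ValueError if a coordinate is missing or differs between inputs."""
    for name in names:
        if name not in coords1 or name not in coords2:
            raise ValueError(f"Missing coordinate: {name!r}")
        # Assumption: coordinates must match exactly, value by value.
        if list(coords1[name]) != list(coords2[name]):
            raise ValueError(f"Coordinate {name!r} does not match.")


grid_a = {"latitude": [50.0, 50.5], "longitude": [8.0, 8.5]}
grid_b = {"latitude": [50.0, 50.5], "longitude": [8.0, 8.5]}
grid_c = {"latitude": [50.0, 51.0], "longitude": [8.0, 8.5]}

validate_alignment(grid_a, grid_b)  # aligned grids pass silently

try:
    validate_alignment(grid_a, grid_c)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)  # True
```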
Source code in src/heiplanet_models/utils.py