onehealth_db.preprocess module
Functions:
- adjust_longitude_360_to_180 – Adjust longitude from the 0 to 360 range to the -180 to 180 range.
- align_lon_lat_with_popu_data – Align longitude and latitude coordinates with population data of the same resolution.
- convert_360_to_180 – Convert longitude from the 0 to 360 range to the -180 to 180 range.
- convert_m_to_mm – Convert precipitation from meters to millimeters.
- convert_m_to_mm_with_attributes – Convert precipitation from meters to millimeters and keep attributes.
- convert_to_celsius – Convert temperature from Kelvin to Celsius.
- convert_to_celsius_with_attributes – Convert temperature from Kelvin to Celsius and keep attributes.
- downsample_resolution – Downsample the resolution of a dataset.
- preprocess_data_file – Preprocess the dataset based on provided settings.
- rename_coords – Rename coordinates in the dataset based on a mapping.
- resample_resolution – Resample the grid of a dataset to a new resolution.
- truncate_data_from_time – Truncate data from a specific start date.
- upsample_resolution – Upsample the resolution of a dataset.
Attributes:
- warn_positive_resolution (module-attribute)
adjust_longitude_360_to_180
Adjust longitude from the 0 to 360 range to the -180 to 180 range.
Parameters:
- dataset (Dataset) – Dataset with longitude in the 0 to 360 range.
- limited_area (bool, default: False) – Flag indicating whether the dataset covers a limited area.
- lon_name (str, default: 'longitude') – Name of the longitude variable in the dataset.
Returns:
- xr.Dataset – Dataset with longitude adjusted to the -180 to 180 range.
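The range change described above is commonly done with a modulo shift. The following is a minimal pure-Python sketch of that arithmetic only; `wrap_longitude` is a hypothetical helper, and the library's actual implementation (including how `limited_area` is handled and whether 180 maps to -180 or stays at 180) is not shown in this reference and may differ.

```python
def wrap_longitude(lon: float) -> float:
    """Map a longitude from [0, 360) onto [-180, 180) (illustrative helper)."""
    return (lon + 180.0) % 360.0 - 180.0


# Longitudes on a 0 to 360 grid.
lons_360 = [0.0, 90.0, 179.75, 180.0, 270.0, 359.75]

# Shift every value into the -180 to 180 range.
lons_180 = [wrap_longitude(lon) for lon in lons_360]

# After wrapping, values are no longer monotonic, so a re-sort along the
# longitude axis is usually needed (data must be reordered the same way).
sorted_lons = sorted(lons_180)
```

With a gridded dataset, the same shift would be applied to the longitude coordinate and the data reordered to match the sorted coordinate.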
align_lon_lat_with_popu_data
align_lon_lat_with_popu_data(dataset, expected_longitude_max=float64(179.75), lat_name='latitude', lon_name='longitude')
Align longitude and latitude coordinates with population data of the same resolution. This function is specifically designed to ensure that the longitude and latitude coordinates in the dataset match the expected values used in population data, which are:
- Longitude: -179.75 to 179.75, 720 points
- Latitude: 89.75 to -89.75, 360 points
Parameters:
- dataset (Dataset) – Dataset with longitude and latitude coordinates.
- expected_longitude_max (float64, default: np.float64(179.75)) – Expected maximum longitude after adjustment.
- lat_name (str, default: 'latitude') – Name of the latitude coordinate.
- lon_name (str, default: 'longitude') – Name of the longitude coordinate.
Returns:
- xr.Dataset – Dataset with adjusted longitude and latitude coordinates.
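The expected population grid above can be reproduced from its end points and point counts, which imply a spacing of 0.5 degrees. The sketch below only builds those coordinate lists in plain Python to make the target grid concrete; the variable names are illustrative and this is not how the function itself constructs or checks coordinates.

```python
# Spacing implied by 720 points spanning -179.75 to 179.75.
resolution = 0.5
n_lon, n_lat = 720, 360

# Longitude runs west to east, latitude runs north to south.
expected_lons = [-179.75 + i * resolution for i in range(n_lon)]
expected_lats = [89.75 - j * resolution for j in range(n_lat)]

# The last longitude corresponds to expected_longitude_max in the signature.
expected_longitude_max = expected_lons[-1]
```

A dataset whose coordinates differ from these lists (for example, cell edges instead of cell centers) would need its coordinates shifted before it can be combined with the population data.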
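convert_360_to_180
Convert longitude from the 0 to 360 range to the -180 to 180 range.
convert_m_to_mm
Convert precipitation from meters to millimeters.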
convert_m_to_mm_with_attributes
Convert precipitation from meters to millimeters and keep attributes.
Parameters:
- dataset (Dataset) – Dataset containing precipitation in meters.
- inplace (bool, default: False) – If True, modify the original dataset. If False, return a new dataset.
- var_name (str, default: 'tp') – Name of the precipitation variable in the dataset.
Returns:
- xr.Dataset – Dataset with precipitation converted to millimeters.
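The `inplace` semantics and attribute preservation described above can be illustrated with a plain dictionary standing in for a dataset variable. This is a sketch under stated assumptions: `m_to_mm` and the record layout are hypothetical, and updating the `units` entry is an assumption, since the reference only says attributes are kept.

```python
import copy


def m_to_mm(record: dict, inplace: bool = False) -> dict:
    """Scale values from meters to millimeters, keeping all attributes."""
    # Work on the original object only when inplace is requested.
    target = record if inplace else copy.deepcopy(record)
    target["values"] = [v * 1000.0 for v in target["values"]]
    # Assumption: the unit label is updated; other attributes are untouched.
    target["attrs"]["units"] = "mm"
    return target


# Stand-in for a precipitation variable such as "tp".
tp = {
    "values": [0.0, 0.001, 0.0125],
    "attrs": {"units": "m", "long_name": "Total precipitation"},
}

converted = m_to_mm(tp)                  # new object, tp is unchanged
same_object = m_to_mm(tp, inplace=True)  # tp itself is modified
```

Returning a new object by default avoids surprising callers that still hold a reference to the original data.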
convert_to_celsius
Convert temperature from Kelvin to Celsius.
convert_to_celsius_with_attributes
Convert temperature from Kelvin to Celsius and keep attributes.
Parameters:
- dataset (Dataset) – Dataset containing temperature in Kelvin.
- inplace (bool, default: False) – If True, modify the original dataset. If False, return a new dataset.
- var_name (str, default: 't2m') – Name of the temperature variable in the dataset.
Returns:
- xr.Dataset – Dataset with temperature converted to Celsius.
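The conversion itself is a constant offset of 273.15. A short sketch of the arithmetic follows; `kelvin_to_celsius` and the sample values are illustrative and not part of the module.

```python
KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in Kelvin


def kelvin_to_celsius(kelvin: float) -> float:
    """Convert a single temperature from Kelvin to degrees Celsius."""
    return kelvin - KELVIN_OFFSET


# Sample 2 m temperatures in Kelvin, as a variable like "t2m" would hold.
t2m_kelvin = [273.15, 293.15, 300.0]
t2m_celsius = [kelvin_to_celsius(k) for k in t2m_kelvin]
```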
downsample_resolution
downsample_resolution(dataset, new_resolution=0.5, lat_name='latitude', lon_name='longitude', agg_funcs=None, agg_map=None)
Downsample the resolution of a dataset.
Parameters:
- dataset (Dataset) – Dataset whose resolution is to be changed.
- new_resolution (float, default: 0.5) – New resolution in degrees.
- lat_name (str, default: 'latitude') – Name of the latitude coordinate.
- lon_name (str, default: 'longitude') – Name of the longitude coordinate.
- agg_funcs (Dict[str, str] | None, default: None) – Aggregation function for each variable. If None, the default aggregation (mean) is used.
- agg_map (Dict[str, Callable[[Any], float]] | None, default: None) – Mapping of aggregation-function names to the functions themselves. If None, the default mapping is used.
Returns:
- xr.Dataset – Dataset with changed resolution.
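Downsampling with a per-variable aggregation can be pictured as grouping fine cells into blocks and reducing each block with a named function. The sketch below shows that idea on a nested list; `AGG_MAP` and `coarsen` are hypothetical stand-ins for the roles of `agg_map` and the function, and the library's real grouping logic is not documented here.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

# Hypothetical name-to-function mapping, playing the role of agg_map.
AGG_MAP: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": mean,
    "sum": sum,
    "max": max,
}


def coarsen(grid: List[List[float]], factor: int, agg: str = "mean") -> List[List[float]]:
    """Reduce each factor x factor block of the grid to one value."""
    func = AGG_MAP[agg]
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(0, n_rows, factor):
        row = []
        for c in range(0, n_cols, factor):
            # Collect all fine cells that fall into this coarse cell.
            block = [
                grid[i][j]
                for i in range(r, min(r + factor, n_rows))
                for j in range(c, min(c + factor, n_cols))
            ]
            row.append(func(block))
        out.append(row)
    return out


fine = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
coarse_mean = coarsen(fine, 2, "mean")  # suits intensive quantities such as temperature
coarse_sum = coarsen(fine, 2, "sum")    # suits counts or totals
```

Choosing the aggregation per variable matters because averaging is appropriate for quantities like temperature, while totals such as counts are usually summed.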
preprocess_data_file
Preprocess the dataset based on provided settings. The processed data is saved to the same directory under an updated filename defined by the settings.
Parameters:
- netcdf_file (Path) – Path to the NetCDF file to preprocess.
- settings (Dict[str, Any]) – Settings for preprocessing.
Returns:
- xr.Dataset – Preprocessed dataset.
rename_coords
Rename coordinates in the dataset based on a mapping.
Parameters:
- dataset (Dataset) – Dataset with coordinates to rename.
- coords_mapping (dict) – Mapping of old coordinate names to new names.
Returns:
- xr.Dataset – A new dataset with renamed coordinates.
resample_resolution
resample_resolution(dataset, new_resolution=0.5, lat_name='latitude', lon_name='longitude', agg_funcs=None, agg_map=None, expected_longitude_max=float64(179.75), method_map=None)
Resample the grid of a dataset to a new resolution.
Parameters:
- dataset (Dataset) – Dataset to resample.
- new_resolution (float, default: 0.5) – New resolution in degrees.
- lat_name (str, default: 'latitude') – Name of the latitude coordinate.
- lon_name (str, default: 'longitude') – Name of the longitude coordinate.
- agg_funcs (Dict[str, str] | None, default: None) – Aggregation function for each variable. If None, the default aggregation (mean) is used.
- agg_map (Dict[str, Callable[[Any], float]] | None, default: None) – Mapping of aggregation-function names to the functions themselves. If None, the default mapping is used.
- expected_longitude_max (float64, default: np.float64(179.75)) – Expected maximum longitude after adjustment.
- method_map (Dict[str, str] | None, default: None) – Mapping of variable names to interpolation methods. If None, linear interpolation is used.
Returns:
- xr.Dataset – Resampled dataset with changed resolution.
truncate_data_from_time
Truncate data from a specific start date.
Parameters:
- dataset (Dataset) – Dataset to truncate.
- start_date (Union[str, datetime64]) – Start date for truncation, given as a "YYYY-MM-DD" string or a numpy datetime64 object.
- var_name (str, default: 'time') – Name of the time variable in the dataset.
Returns:
- xr.Dataset – Dataset truncated from the specified start date.
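As a rough illustration of truncating from a start date, the sketch below keeps only the records dated on or after a "YYYY-MM-DD" string. It assumes the start date is inclusive, which this reference does not state explicitly; `truncate_from` and the sample records are hypothetical.

```python
from datetime import date
from typing import List, Tuple


def truncate_from(records: List[Tuple[date, float]], start_date: str) -> List[Tuple[date, float]]:
    """Keep records dated on or after start_date (assumed inclusive)."""
    start = date.fromisoformat(start_date)  # parses "YYYY-MM-DD"
    return [(day, value) for day, value in records if day >= start]


# Monthly values indexed by date, standing in for a "time" coordinate.
records = [
    (date(2015, 12, 1), 1.2),
    (date(2016, 1, 1), 0.8),
    (date(2016, 2, 1), 2.1),
]
kept = truncate_from(records, "2016-01-01")
```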
upsample_resolution
upsample_resolution(dataset, new_resolution=0.1, lat_name='latitude', lon_name='longitude', method_map=None)
Upsample the resolution of a dataset.
Parameters:
- dataset (Dataset) – Dataset whose resolution is to be changed.
- new_resolution (float, default: 0.1) – New resolution in degrees.
- lat_name (str, default: 'latitude') – Name of the latitude coordinate.
- lon_name (str, default: 'longitude') – Name of the longitude coordinate.
- method_map (Dict[str, str] | None, default: None) – Mapping of variable names to interpolation methods. If None, linear interpolation is used.
Returns:
- xr.Dataset – Dataset with changed resolution.
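Linear interpolation, the default named above, estimates each new grid point from its two nearest coarse neighbors. The following one-dimensional sketch shows that calculation in plain Python; `upsample_linear` is a hypothetical helper, and the module presumably interpolates along both latitude and longitude on full datasets.

```python
from typing import List


def upsample_linear(coords: List[float], values: List[float], new_coords: List[float]) -> List[float]:
    """Linearly interpolate values given on coords onto new_coords."""
    result = []
    for x in new_coords:
        for i in range(len(coords) - 1):
            x0, x1 = coords[i], coords[i + 1]
            if x0 <= x <= x1:
                # Weight of the right-hand neighbor, between 0 and 1.
                w = (x - x0) / (x1 - x0)
                result.append(values[i] + w * (values[i + 1] - values[i]))
                break
        else:
            raise ValueError(f"{x} lies outside the original coordinate range")
    return result


# Coarse grid at 0.5 degree spacing, refined to 0.25 degree spacing.
coarse_lon = [0.0, 0.5, 1.0]
coarse_val = [10.0, 20.0, 40.0]
fine_lon = [0.0, 0.25, 0.5, 0.75, 1.0]
fine_val = upsample_linear(coarse_lon, coarse_val, fine_lon)
```

Interpolation adds grid points but no new information, so the upsampled field is only as detailed as the coarse input.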