River Flood from GloFAS River Discharge Data Module#

Main Module#

class climada_petals.hazard.rf_glofas.river_flood_computation.RiverFloodCachePaths(discharge, return_period, return_period_regrid, return_period_regrid_protect)[source]#

Bases: RiverFloodCachePaths

Container for storing paths to caches for RiverFloodInundation

Depending on the state of the corresponding RiverFloodInundation instance, files might be present or not. Please check this explicitly before accessing them.

discharge#

Path to the discharge data cache.

Type:

pathlib.Path

return_period#

Path to the return period data cache.

Type:

pathlib.Path

return_period_regrid#

Path to the regridded return period data cache.

Type:

pathlib.Path

return_period_regrid_protect#

Path to the regridded return period data cache, where the return period was restricted by the protection limits.

Type:

pathlib.Path

classmethod from_dir(cache_dir: Path)[source]#

Set default paths from a cache directory

class climada_petals.hazard.rf_glofas.river_flood_computation.RiverFloodInundation(store_intermediates: bool = True, data_dir: Path | str = PosixPath('/home/docs/climada/data/river-flood-computation'), cache_dir: Path | str = PosixPath('/home/docs/climada/data/river-flood-computation/.cache'))[source]#

Bases: object

Class for computing river flood inundations

store_intermediates#

If intermediate results are stored in the respective cache_paths

Type:

bool

cache_paths#

Paths pointing to potential intermediate results stored in a cache directory.

Type:

RiverFloodCachePaths

flood_maps#

Flood inundation on lat/lon grid for specific return periods.

Type:

xarray.DataArray

gumbel_fits#

Gumbel parameters resulting from extreme value analysis of historical discharge data.

Type:

xarray.Dataset

flopros#

Spatially explicit information on flood protection levels.

Type:

geopandas.GeoDataFrame

regridder#

Storage for re-using the XESMF regridder in case the computation is repeated on the same grid. This reduces the runtime of subsequent computations.

Type:

xesmf.Regridder

__init__(store_intermediates: bool = True, data_dir: Path | str = PosixPath('/home/docs/climada/data/river-flood-computation'), cache_dir: Path | str = PosixPath('/home/docs/climada/data/river-flood-computation/.cache'))[source]#

Initialize the instance

Parameters:
  • store_intermediates (bool, optional) – Whether the data of each computation step should be stored in the cache directory. This is recommended especially for larger data. Only set this to False if the data operated on is very small (e.g., for a small country or region). Defaults to True.

  • data_dir (Path or str, optional) – The directory where flood maps, Gumbel distribution parameters and the FLOPROS database are located. Defaults to <climada>/data/river-flood-computation, where <climada> is the Climada data directory indicated by local_data : system in the climada.conf. This directory must exist.

  • cache_dir (Path or str, optional) – The top-level cache directory where computation caches of this instance will be placed. Defaults to <climada>/data/river-flood-computation/.cache (see above for configuration). This directory (and all its parents) will be created.

clear_cache()[source]#

Clear the cache of this instance

This will delete the temporary cache directory and create a new one by calling _create_tempdir().

compute(discharge: DataArray | None = None, apply_protection: bool | str = 'both', resample_kws: Mapping[str, Any] | None = None, regrid_kws: Mapping[str, Any] | None = None)[source]#

Compute river flood inundation from downloaded river discharge

After downloading discharge data, this will execute the pipeline for computing river flood inundaton. This pipeline has the following steps:

Resampling, regridding, and the application of protection information are controlled via the parameters of this method.

Parameters:
  • discharge (xr.DataArray or None (optional)) – The discharge data to compute flood depths for. If None, the cached discharge will be used. Defaults to None.

  • apply_protection (bool or “both” (optional)) – If the stored protection layer should be considered when computing the flood depth. If "both", this method will return a dataset with two flood depth arrays. Defaults to both.

  • resample_kws (Mapping (str, Any) or None (optional)) – Keyword arguments for return_period_resample(). If None, this method will call return_period(). Otherwise, it will call return_period_resample() and pass this parameter as keyword arguments. Defaults to None.

  • regrid_kws (Mapping (str, Any) or None (optional)) – Keyword arguments for regrid(). Defaults to None.

Returns:

Dataset containing the flood data with the same dimensions as the input discharge data. Depending on the choice of apply_protection, this will contain one or two DataArrays.

Return type:

xr.Dataset

Raises:

RuntimeError – If discharge is None, but no discharge data is cached.

download_forecast(countries: str | List[str], forecast_date: str | datetime64 | datetime | Timestamp, lead_time_days: int = 10, preprocess: Callable | None = None, **download_glofas_discharge_kwargs) DataArray[source]#

Download GloFAS discharge ensemble forecasts

If store_intermediates is true, the returned data is also stored in cache_paths.

Parameters:
Returns:

forecast – Downloaded forecast as DataArray after preprocessing

Return type:

xr.DataArray

download_reanalysis(countries: str | Iterable[str], year: int, preprocess: Callable | None = None, **download_glofas_discharge_kwargs)[source]#

Download GloFAS discharge historical data

If store_intermediates is true, the returned data is also stored in cache_paths.

Parameters:
Returns:

reanalysis – Downloaded forecast as DataArray after preprocessing

Return type:

xr.DataArray

return_period(discharge: DataArray | None = None) DataArray[source]#

Compute the return period for a given discharge

If no discharge data is given as parameter, the discharge cache will be accessed.

If store_intermediates is true, the returned data is also stored in cache_paths.

Parameters:

discharge (xr.DataArray, optional) – The discharge data to operate on. Defaults to None, which indicates that data should be loaded from the cache

Returns:

r_period – Return period for each location of the input discharge.

Return type:

xr.DataArray

return_period_resample(num_bootstrap_samples: int, discharge: DataArray | None = None, fit_method: str = 'MM', num_workers: int = 1, memory_per_worker: str = '2G')[source]#

Compute the return period for a given discharge using bootstrap sampling.

For each input discharge value, this creates an ensemble of return periods by employing bootstrap sampling. The ensemble size is controlled with num_bootstrap_samples.

If store_intermediates is true, the returned data is also stored in cache_paths.

Parameters:
  • num_bootstrap_samples (int) – Number of bootstrap samples to compute for each discharge value.

  • discharge (xr.DataArray, optional) – The discharge data to operate on. Defaults to None, which indicates that data should be loaded from the cache.

  • fit_method (str, optional) – Method for fitting data to bootstrapped samples.

    • "MM": Method of Moments

    • "MLE": Maximum Likelihood Estimation

  • num_workers (int, optional) – Number of parallel processes to use when computing the samples.

  • memory_per_worker (str, optional) – Memory to allocate for each process.

Returns:

r_period – Return period samples for each location of the input discharge.

Return type:

xr.DataArray

regrid(r_period: DataArray | None = None, method: str = 'bilinear', reuse_regridder: bool = False)[source]#

Regrid the return period data onto the flood hazard map grid.

This computes the regridding matrix for the given coordinates and then performs the actual regridding. The matrix is stored in regridder. If another regridding is performed on the same grid (but possibly different data), the regridder can be reused to save time. To control that, set reuse_regridder=True.

If store_intermediates is true, the returned data is also stored in cache_paths.

Parameters:
  • r_period (xr.DataArray, optional) – The return period data to regrid. Defaults to None, which indicates that data should be loaded from the cache.

  • method (str, optional) – Interpolation method of the return period data. Defaults to "bilinear". See https://xesmf.readthedocs.io/en/stable/notebooks/Compare_algorithms.html

  • reuse_regridder (bool, optional) – Reuse the regridder stored if one is stored. Defaults to False, which means that a new regridder is always built when calling this function. If True, and no regridder is stored, it will be built nonetheless.

Returns:

return_period_regrid – The regridded return period data.

Return type:

xr.DataArray

apply_protection(return_period_regrid: DataArray | None = None)[source]#

Limit the return period data by applying FLOPROS protection levels.

This sets each return period value where the local FLOPROS protection level is not exceeded to NaN and returns the result. Protection levels are read from flopros.

If store_intermediates is true, the returned data is also stored in cache_paths.

Parameters:

return_period_regrid (xr.DataArray, optional) – The return period data to regrid. Defaults to None, which indicates that data should be loaded from the cache.

Returns:

return_period_regrid_protect – The regridded return period where each value that does not reach the protection limit is set to NaN.

Return type:

xr.DataArray

flood_depth(return_period_regrid: DataArray | None = None)[source]#

Compute the flood depth from regridded return period data.

Interpolate the flood hazard maps stored in :py:attr`flood_maps` in the return period dimension at every location to compute the flood footprint.

Note

Even if store_intermediates is true, the returned data is not stored automatically! Use climada_petals.hazard.rf_glofas.transform_ops.save_file() to store the data yourself.

Parameters:

return_period_regrid (xr.DataArray, optional) – The regridded return period data to use for computing the flood footprint. Defaults to None which indicates that data should be loaded from the cache. If RiverFloodCachePaths.return_period_regrid_protect exists, that data is used. Otherwise, the “unprotected” data RiverFloodCachePaths.return_period_regrid is loaded.

Returns:

inundation – The flood inundation at every location of the flood hazard maps grid.

Return type:

xr.DataArray

Transformation Operations#

climada_petals.hazard.rf_glofas.transform_ops.sel_lon_lat_slice(target: DataArray, source: DataArray) DataArray[source]#

Select a lon/lat slice from ‘target’ using coordinates of ‘source’

climada_petals.hazard.rf_glofas.transform_ops.rp_comp(sample: ndarray, loc: ndarray, scale: ndarray, max_rp: float | None = inf)[source]#

Compute the return period from a right-handed Gumbel distribution

All parameters can be arrays, in which case numpy broadcasting rules apply.

The return period of a sample \(x\) from an extreme value distribution is defined as \((1 - \mathrm{cdf}(x))^{-1}\), where \(\mathrm{cdf}\) is the cumulative distribution function of said distribution.

Parameters:
  • sample (array) – Samples for which to compute the return period

  • loc (array) – Loc parameter of the Gumbel distribution

  • scale (array) – Scale parameter of the distribution

  • max_rp (float, optional) – The maximum value of return periods. This avoids returning infinite values. Defaults to np.inf (no maximum).

Returns:

The return period(s) for the input parameters

Return type:

np.ndarray

climada_petals.hazard.rf_glofas.transform_ops.reindex(target: DataArray, source: DataArray, tolerance: float | None = None, fill_value: float = nan, assert_no_fill_value: bool = False) DataArray[source]#

Reindex target to source with nearest neighbor lookup

Parameters:
  • target (xr.DataArray) – Array to be reindexed.

  • source (xr.DataArray) – Array whose coordinates are used for reindexing.

  • tolerance (float (optional)) – Maximum distance between coordinates. If it is superseded, the fill_value is inserted instead of the nearest neighbor value. Defaults to NaN

  • fill_value (float (optional)) – The fill value to use if coordinates are changed by a distance of more than tolerance.

  • assert_no_fill_value (bool (optional)) – Throw an error if fill values are found in the data after reindexing. This will also throw an error if the fill value is present in the target before reindexing (because the check afterwards would else not make sense)

Returns:

target – Target reindexed like ‘source’ with nearest neighbor lookup for the data.

Return type:

xr.DataArray

Raises:
  • ValueError – If tolerance is exceeded when reindexing, in case assert_no_fill_value is True.

  • ValueError – If target already contains the fill_value before reindexing, in case assert_no_fill_value is True.

climada_petals.hazard.rf_glofas.transform_ops.merge_flood_maps(flood_maps: Mapping[str, DataArray]) DataArray[source]#

Merge the flood maps GeoTIFFs into one NetCDF file

Adds a “zero” flood map (all zeros)

Parameters:

flood_maps (dict(str, xarray.DataArray)) – The mapping of GeoTIFF file paths to respective DataArray. Each flood map is identified through the folder containing it. The folders are expected to follow the naming scheme floodMapGL_rpXXXy, where XXX indicates the return period of the respective map.

climada_petals.hazard.rf_glofas.transform_ops.fit_gumbel_r(input_data: DataArray, time_dim: str = 'year', fit_method: str = 'MM', min_samples: int = 2)[source]#

Fit a right-handed Gumbel distribution to the data

Parameters:
  • input_data (xr.DataArray) – The input time series to compute the distributions for. It must contain the dimension specified as time_dim.

  • time_dim (str) – The dimension indicating time. Defaults to year.

  • fit_method (str) – The method used for computing the distribution. Either MLE (Maximum Likelihood Estimation) or MM (Method of Moments).

  • min_samples (int) – The number of finite samples along the time dimension required for a successful fit. If there are fewer samples, the fit result will be NaN.

Returns:

A dataset on the same grid as the input data with variables

  • loc: The loc parameter of the fitted distribution (mode)

  • scale: The scale parameter of the fitted distribution

  • samples: The number of samples used to fit the distribution at this coordinate

Return type:

xr.Dataset

climada_petals.hazard.rf_glofas.transform_ops.download_glofas_discharge(product: str, date_from: str, date_to: str | None, num_proc: int = 1, download_path: str | Path = PosixPath('/home/docs/climada/data/cds-download'), countries: List[str] | str | None = None, preprocess: Callable | None = None, open_mfdataset_kw: Mapping[str, Any] | None = None, **request_kwargs) DataArray[source]#

Download the GloFAS data and return the resulting dataset

Several parameters are passed directly to climada_petals.hazard.rf_glofas.cds_glofas_downloader.glofas_request(). See this functions documentation for further information.

Parameters:
  • product (str) – The string identifier of the product to download. See climada_petals.hazard.rf_glofas.cds_glofas_downloader.glofas_request() for supported products.

  • date_from (str) – Earliest date to download. Specification depends on the product chosen.

  • date_to (str or None) – Latest date to download. If None, only download the date_from. Specification depends on the product chosen.

  • num_proc (int) – Number of parallel processes to use for downloading. Defaults to 1.

  • download_path (str or pathlib.Path) – Directory to store the downloaded data. The directory (and all required parent directories!) will be created if it does not yet exist. Defaults to ~/climada/data/glofas-discharge/.

  • countries (str or list of str, optional) – Countries to download data for. Uses the maximum extension of all countries for selecting the latitude/longitude range of data to download.

  • preprocess (str, optional) – String expression for preprocessing the data before merging it into one dataset. Must be valid Python code. The downloaded data is passed as variable x.

  • open_mfdataset_kw (dict, optional) – Optional keyword arguments for the xarray.open_mfdataset function.

  • request_kwargs – Keyword arguments for the Copernicus data store request.

climada_petals.hazard.rf_glofas.transform_ops.max_from_isel(array: DataArray, dim: str, selections: List[Iterable | slice]) DataArray[source]#

Compute the maximum over several selections of an array dimension

climada_petals.hazard.rf_glofas.transform_ops.return_period(discharge: DataArray, gev_loc: DataArray, gev_scale: DataArray, max_return_period: float = 10000.0) DataArray[source]#

Compute the return period for a discharge from a Gumbel EV distribution fit

Coordinates of the three datasets must match up to a tolerance of 1e-3 degrees. If they do not, an error is thrown.

Parameters:
  • discharge (xr.DataArray) – The discharge values to compute the return period for

  • gev_loc (xr.DataArray) – The loc parameters for the Gumbel EV distribution

  • gev_scale (xr.DataArray) – The scale parameters for the Gumbel EV distribution

Returns:

The equivalent return periods for the input discharge and Gumbel EV istributions

Return type:

xr.DataArray

See also

climada_petals.hazard.rf_glofas.transform_ops.rp(), climada_petals.hazard.rf_glofas.transform_ops.return_period_resample()

climada_petals.hazard.rf_glofas.transform_ops.return_period_resample(discharge: DataArray, gev_loc: DataArray, gev_scale: DataArray, gev_samples: DataArray, bootstrap_samples: int, max_return_period: float = 10000.0, fit_method: str = 'MLE') DataArray[source]#

Compute resampled return periods for a discharge from a Gumbel EV distribution fit

This function uses bootstrap resampling to incorporate the uncertainty in the EV distribution fit. Bootstrap resampling takes the fitted distribution, draws N samples from it (where N is the number of samples originally used to fit the distribution), and fits a new distribution onto these samples. This “bootstrapped” distribution is then used to compute the return period. Repeating this process yields an ensemble of distributions that captures the uncertainty in the original distribution fit.

Coordinates of the three datasets must match up to a tolerance of 1e-3 degrees. If they do not, an error is thrown.

Parameters:
  • discharge (xr.DataArray) – The discharge values to compute the return period for

  • gev_loc (xr.DataArray) – The loc parameters for the Gumbel EV distribution

  • gev_scale (xr.DataArray) – The scale parameters for the Gumbel EV distribution

  • gev_samples (xr.DataArray) – The samples used to fit the Gumbel EV distribution at every point

  • bootstrap_samples (int) – The number of bootstrap samples to compute. Increasing this will improve the representation of uncertainty, but strongly increase computational costs later on.

  • fit_method (str) – Method for fitting the Gumbel EV during resampling. Available methods are the Method of Moments MM or the Maximum Likelihood Estimation MLE (default).

Returns:

The equivalent return periods for the input discharge and Gumbel EV distributions. The data array will have an additional dimension sample, representing the bootstrap samples for every point.

Return type:

xr.DataArray

See also

climada_petals.hazard.rf_glofas.transform_ops.rp(), climada_petals.hazard.rf_glofas.transform_ops.return_period()

climada_petals.hazard.rf_glofas.transform_ops.interpolate_space(return_period: DataArray, flood_maps: DataArray, method: str = 'linear') DataArray[source]#

Interpolate the return period in space onto the flood maps grid

climada_petals.hazard.rf_glofas.transform_ops.regrid(return_period: DataArray, flood_maps: DataArray, method: str = 'bilinear', regridder: xesmf.Regridder | None = None, return_regridder: bool = False) DataArray | Tuple[DataArray, xesmf.Regridder][source]#

Regrid the return period onto the flood maps grid

climada_petals.hazard.rf_glofas.transform_ops.apply_flopros(flopros_data: GeoDataFrame, return_period: DataArray | Dataset, layer: str = 'MerL_Riv') DataArray | Dataset[source]#

Restrict the given return periods using FLOPROS data

The FLOPROS database describes the regional protection to river flooding based on a return period. Users can choose from different database layers. For each coordinate in return_period, the value from the respective FLOPROS database layer is selected. Any return_period lower than or equal to the FLOPROS protection value is discarded and set to NaN.

Parameters:
  • flopros_data (PassthroughContainer) – The FLOPROS data bundled into a dantro.containers.PassthroughContainer

  • return_period (xr.DataArray or xr.Dataset) – The return periods to be restricted by the FLOPROS data

  • layer (str) – The FLOPROS database layer to evaluate

Returns:

The return_period data with all values below the local FLOPROS protection threshold set to NaN.

Return type:

xr.DataArray or xr.Dataset

climada_petals.hazard.rf_glofas.transform_ops.flood_depth(return_period: Dataset | DataArray, flood_maps: DataArray) Dataset | DataArray[source]#

Compute the flood depth from a return period and flood maps.

At each lat/lon coordinate, take the return period(s) and use them to interpolate the flood maps at this position in the return period dimension. Take the interpolated values and return them as flood hazard footprints. Works with arbitrarily high dimensions in the return_period array. Interpolation is linear.

Parameters:
  • return_period (xr.DataArray or xr.Dataset) – The return periods for which to compute the flood depth. If this is a dataset, the function will compute the flood depth for each variable.

  • flood_maps (xr.DataArray) – Maps that indicate flood depth at latitude/longitude/return period coordinates.

Returns:

The flood depths for the given return periods.

Return type:

xr.DataArray or xr.Dataset

climada_petals.hazard.rf_glofas.transform_ops.save_file(data: Dataset | DataArray, output_path: Path | str, encoding: Mapping[str, Any] | None = None, engine: str | None = 'netcdf4', **encoding_defaults)[source]#

Save xarray data as a file with default compression

Calls data.to_netcdf. See https://docs.xarray.dev/en/stable/generated/xarray.Dataset.to_netcdf.html for the documentation of the underlying method.

Parameters:
  • data (xr.Dataset or xr.Dataarray) – The data to be stored in the file

  • output_path (pathlib.Path or str) – The file path to store the data into. If it does not contain a suffix, .nc is automatically appended. The enclosing folder must already exist.

  • encoding (dict (optional)) – Encoding settings for every data variable. Will update the default settings.

  • engine (str (optional)) – The engine used for writing the file. Defaults to "netcdf4".

  • encoding_defaults – Encoding settings shared by all data variables. This will update the default encoding settings, which are dict(dtype="float32", zlib=False, complevel=4).

Helper Functions#

These are the functions exposed by the module.

climada_petals.hazard.rf_glofas.rf_glofas.dask_client(n_workers, threads_per_worker, memory_limit, *args, **kwargs)[source]#

Create a context with a dask.distributed.Client.

This is a lightweight wrapper and intended to expose only the most important parameters to end users.

Parameters:
  • n_workers (int) – Number of parallel processes to launch.

  • threads_per_worker (int) – Compute threads launched by each worker.

  • memory_limit (str) – Memory limit for each process. Example: 4 GB can be expressed as 4000M or 4G.

  • args, kwargs – Additional (keyword) arguments passed to the dask.distributed.Client constructor.

Example

>>> with dask_client(n_workers=2, threads_per_worker=2, memory_limit="4G"):
...     xr.open_dataset("data.nc", chunks="auto").median()
climada_petals.hazard.rf_glofas.rf_glofas.hazard_series_from_dataset(data: Dataset, intensity: str, event_dim: str) Series | Hazard[source]#

Create a series of Hazard objects from a multi-dimensional dataset

The input flood data is usually multi-dimensional. For example, you might have downloaded ensemble data over an extended period of time. Therefore, this function returns a pandas.Series. Each entry of the series is a Hazard object whose events have the same coordinates in this multi-dimensional space except the one given by event_dim. For example, if your data space has the dimensions time, lead_time and number, and you choose event_dim="number", then the index of the series will be a MultiIndex from time and lead_time, and a single hazard object will contain all events along the number axis for a given MultiIndex.

Parameters:
  • data (xarray.Dataset) – Data to load a hazard series from.

  • intensity (str) – Name of the dataset variable to read as hazard intensity.

  • event_dim (str) – Name of the dimension to be used as event dimension in the hazards. All other dimension names except the dimensions for longitude and latitude will make up the hierarchy of the MultiIndex of the resulting series.

Returns:

Series of Hazard objects with events along event_dim and with a MultiIndex of the remaining dimensions.

Return type:

pandas.Series

Tip

This function must transpose the underlying data in the dataset to convenietly build Hazard objects. To ensure that this is an efficient operation, avoid plugging the return value of compute() directly into this function, especially for large data. Instead, save the data first using save_file(), then re-open the data with xarray and call this function on it.

Examples

Execute the default pipeline and retrieve the Hazard series

>>> import xarray as xr
>>> dset = xr.open_dataset("flood.nc")
>>> sorted(list(dset.dims.keys()))
["date", "latitude", "longitude", "number", "sample"]
>>> from climada_petals.hazard.rf_glofas import hazard_series_from_dataset
>>> with xr.open_dataset("flood.nc") as dset:
>>>     hazard_series_from_dataset(dset, "flood_depth_flopros", "number")
date        sample
2022-08-10  0       <climada.hazard.base.Hazard ...
            1       <climada.hazard.base.Hazard ...
2022-08-11  0       <climada.hazard.base.Hazard ...
            1       <climada.hazard.base.Hazard ...
Length: 4, dtype: object
climada_petals.hazard.rf_glofas.setup.download_flopros_database(output_dir: str | Path = PosixPath('/home/docs/climada/data/river-flood-computation'))[source]#

Download the FLOPROS database and place it into the output directory.

Download the supplementary material of P. Scussolini et al.: “FLOPROS: an evolving global database of flood protection standards”, extract the zipfile, and retrieve the shapefile within. Discard the temporary data afterwards.

climada_petals.hazard.rf_glofas.setup.download_flood_hazard_maps(output_dir: str | Path)[source]#

Download the JRC flood hazard maps and unzip them

This stores the downloaded zip files as temporary files which are discarded after unzipping.

climada_petals.hazard.rf_glofas.setup.setup_flood_hazard_maps(flood_maps_dir: Path, output_dir=PosixPath('/home/docs/climada/data/river-flood-computation'))[source]#

Download the flood hazard maps and merge them into a single NetCDF file

Maps will be downloaded into flood_maps_dir if it does not exist. Then, the single maps are re-written as NetCDF files, if these do not exist. Finally, all maps are merged into a single dataset and written to the output_dir. Because NetCDF files are more flexibly read and written, this procedure is more efficient that directly merging the GeoTIFF files into a single dataset.

Parameters:
  • flood_maps_dir (Path) – Storage directory of the flood maps as GeoTIFF files. Will be created if it does not exist, in which case the files are automatically downloaded.

  • output_dir (Path) – Directory to store the flood maps dataset.

climada_petals.hazard.rf_glofas.setup.setup_gumbel_fit(output_dir=PosixPath('/home/docs/climada/data/river-flood-computation'), num_downloads: int = 1, parallel: bool = False)[source]#

Download historical discharge data and compute the Gumbel distribution fits.

Data is downloaded from the Copernicus Climate Data Store (CDS).

Parameters:
  • output_dir – The directory to place the resulting file

  • num_downloads (int) – Number of parallel downloads from the CDS. Defaults to 1.

  • parallel (bool) – Whether to preprocess data in parallel. Defaults to False.

climada_petals.hazard.rf_glofas.setup.download_gumbel_fit(output_dir=PosixPath('/home/docs/climada/data/river-flood-computation'))[source]#

Download the pre-computed Gumbel parameters from the ETH research collection.

Download dataset of https://doi.org/10.3929/ethz-b-000641667

climada_petals.hazard.rf_glofas.setup.setup_all(output_dir: str | Path = PosixPath('/home/docs/climada/data/river-flood-computation'))[source]#

Set up the data for river flood computations.

This performs two tasks:

  1. Downloading the JRC river flood hazard maps and merging them into a single NetCDF dataset.

  2. Downloading the FLOPROS flood protection database.

  3. Downloading the Gumbel distribution parameters fitted to GloFAS river discharge reanalysis data from 1979 to 2015.

Parameters:

output_dir (Path or str, optional) – The directory to store the datasets into.

CDS Glofas Downloader#

climada_petals.hazard.rf_glofas.cds_glofas_downloader.glofas_request(product: str, date_from: str, date_to: str | None, output_dir: Path | str, num_proc: int = 1, use_cache: bool = True, request_kw: Mapping[str, str] | None = None, client_kw: Mapping[str, Any] | None = None) List[Path][source]#

Request download of GloFAS data products from the Copernicus Data Store (CDS)

Uses the Copernicus Data Store API (cdsapi) Python module. The interpretation of the date parameters and the grouping of the downloaded data depends on the type of product requested.

Available products:

  • historical: Historical reanalysis discharge data. date_from and date_to are interpreted as integer years. Data for each year is placed into a single file.

  • forecast: Forecast discharge data. date_from and date_to are interpreted as ISO date format strings. Data for each day is placed into a single file.

Notes

Downloading data from the CDS requires authentication via a user key which is granted to each user upon registration. Do the following before calling this function:

Parameters:
  • product (str) – The indentifier for the CMS product to download. See below for available options.

  • date_from (str) – First date to download data for. Interpretation varies based on product.

  • date_to (str or None) – Last date to download data for. Interpretation varies based on product. If None, or the same date as date_from, only download data for date_from

  • output_dir (Path) – Output directory for the downloaded data

  • num_proc (int) – Number of processes used for parallel requests

  • use_cache (bool (optional)) – Skip downloading if the target file exists and the accompanying request file contains the same request

  • request_kw (dict(str: str)) – Dictionary to update the default request for the given product

  • client_kw (dict (optional)) – Dictionary with keyword arguments for the cdsapi.Client used for downloading

Returns:

Paths of the downloaded files

Return type:

list of Path

The default configuration for each product will be updated with the request_kw from climada_petals.hazard.rf_glofas.cds_glofas_downloader.glofas_request():

climada_petals.hazard.rf_glofas.cds_glofas_downloader.DEFAULT_REQUESTS = {'forecast': {'day': '01', 'format': 'grib', 'hydrological_model': 'lisflood', 'leadtime_hour': ['24', '48', '72', '96', '120', '144', '168', '192', '216', '240', '264', '288', '312', '336', '360', '384', '408', '432', '456', '480', '504', '528', '552', '576', '600', '624', '648', '672', '696', '720'], 'month': '08', 'product_type': 'ensemble_perturbed_forecasts', 'system_version': 'version_3_1', 'variable': 'river_discharge_in_the_last_24_hours', 'year': '2022'}, 'historical': {'format': 'grib', 'hday': ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31'], 'hmonth': ['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december'], 'hydrological_model': 'lisflood', 'hyear': '1979', 'product_type': 'consolidated', 'system_version': 'version_3_1', 'variable': 'river_discharge_in_the_last_24_hours'}}#

Default request keyword arguments to be updated by the user requests