climada_petals.hazard.emulator package

climada_petals.hazard.emulator.const module

climada_petals.hazard.emulator.const.TC_BASIN_GEOM = {'EP': [[-180.0, -75.0, 0.0, 9.0], [-180.0, -83.5, 9.0, 15.0], [-180.0, -92.0, 15.0, 18.0], [-180.0, -99.9, 18.0, 60.0]], 'EPE': [[-135.0, -75.0, 0.0, 9.0], [-135.0, -83.5, 9.0, 15.0], [-135.0, -92.0, 15.0, 18.0], [-135.0, -99.9, 18.0, 60.0]], 'EPW': [[-180.0, -135.0, 0.0, 60.0]], 'GB': [[-179.9, 180.0, -50.0, 60.0]], 'NA': [[-99.0, 13.0, 18.0, 60.0], [-91.0, 13.0, 15.0, 18.0], [-83.5, 13.0, 9.0, 15.0], [-78.0, 13.0, 0.0, 9.0]], 'NAN': [[-99.0, 13.0, 31.0, 60.0]], 'NAS': [[-99.0, 13.0, 18.0, 31.0], [-91.5, 13.0, 15.0, 18.0], [-83.5, 13.0, 9.0, 15.0], [-78.0, 13.0, 0.0, 9.0]], 'NI': [[37.0, 99.0, 0.0, 30.0]], 'NIE': [[78.0, 99.0, 0.0, 30.0]], 'NIW': [[37.0, 78.0, 0.0, 30.0]], 'SA': [[-65.0, 20.0, -60.0, 0.0]], 'SI': [[20.0, 135.0, -50.0, 0.0]], 'SIE': [[75.0, 135.0, -50.0, 0.0]], 'SIW': [[20.0, 75.0, -50.0, 0.0]], 'SP': [[135.0, 180.01, -50.0, 0.0], [-180.0, -68.0, -50.0, 0.0]], 'SPE': [[172.0, 180.01, -50.0, 0.0], [-180.0, -68.0, -50.0, 0.0]], 'SPW': [[135.0, 172.0, -50.0, 0.0]], 'WP': [[99.0, 180.0, 0.0, 60.0]], 'WPN': [[99.0, 180.0, 20.0, 60.0]], 'WPS': [[99.0, 180.0, 0.0, 20.0]]}#

Boundaries of TC (sub-)basins (lon_min, lon_max, lat_min, lat_max)
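
As a rough illustration of how these box lists can be read, the sketch below tests whether a point falls inside any box of a basin. It is not part of the package: the helper name `in_basin` and the dictionary `BASIN_BOXES` are hypothetical, the two entries are copied from the constant above, and treating the bounds as inclusive is an assumption.

```python
# Excerpt copied from TC_BASIN_GEOM above; each box is
# (lon_min, lon_max, lat_min, lat_max) in degrees.
BASIN_BOXES = {
    "NI": [[37.0, 99.0, 0.0, 30.0]],
    "SP": [[135.0, 180.01, -50.0, 0.0], [-180.0, -68.0, -50.0, 0.0]],
}


def in_basin(lon, lat, basin, boxes=BASIN_BOXES):
    """Return True if (lon, lat) lies in any box of the basin.

    Hypothetical helper; inclusive bounds are an assumption of this sketch.
    """
    return any(
        lon_min <= lon <= lon_max and lat_min <= lat <= lat_max
        for lon_min, lon_max, lat_min, lat_max in boxes[basin]
    )
```

Note that 'SP' is listed as two boxes, one on each side of the antimeridian, so a basin lookup has to check every box rather than only the first.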

climada_petals.hazard.emulator.const.TC_BASIN_GEOM_SIMPL = {'EP': [[-180.0, -75.0, 0.0, 60.0]], 'EPE': [[-135.0, -75.0, 0.0, 60.0]], 'EPW': [[-180.0, -135.0, 0.0, 60.0]], 'NA': [[-105.0, -30.0, 0.0, 60.0]], 'NAN': [[-105.0, -30.0, 31.0, 60.0]], 'NAS': [[-105.0, -30.0, 0.0, 31.0]], 'NI': [[37.0, 99.0, 0.0, 35.0]], 'NIE': [[78.0, 99.0, 0.0, 35.0]], 'NIW': [[37.0, 78.0, 0.0, 35.0]], 'SI': [[20.0, 135.0, -50.0, 0.0]], 'SIE': [[75.0, 135.0, -50.0, 0.0]], 'SIW': [[20.0, 75.0, -50.0, 0.0]], 'SP': [[135.0, -60.0, -50.0, 0.0]], 'SPE': [[172.0, -60.0, -50.0, 0.0]], 'SPW': [[135.0, 172.0, -50.0, 0.0]], 'WP': [[99.0, 180.0, 0.0, 60.0]], 'WPN': [[99.0, 180.0, 20.0, 60.0]], 'WPS': [[99.0, 180.0, 0.0, 20.0]]}#

Simplified boundaries of TC (sub-)basins (lon_min, lon_max, lat_min, lat_max)

climada_petals.hazard.emulator.const.TC_SUBBASINS = {'EP': ['EPW', 'EPE'], 'NA': ['NAN', 'NAS'], 'NI': ['NIW', 'NIE'], 'SA': ['SA'], 'SI': ['SIW', 'SIE'], 'SP': ['SPW', 'SPE'], 'WP': ['WPN', 'WPS']}#

Abbreviated names of TC subbasins for each basin

climada_petals.hazard.emulator.const.TC_BASIN_SEASONS = {'EP': [7, 12], 'NA': [6, 11], 'NI': [5, 12], 'SA': [1, 4], 'SI': [11, 4], 'SP': [11, 5], 'WP': [5, 12]}#

Start/end months of hazard seasons in different basins
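
Several of these seasons wrap across the new year (for example 'SI': [11, 4]). A minimal sketch of a membership test that handles both cases is given below; the function name `month_in_season` is hypothetical and not part of the package, and both the start and the end month are treated as inclusive.

```python
def month_in_season(month, season):
    """Return True if month (1-12) lies within [start, end], inclusive.

    Hypothetical helper; handles seasons that wrap across the new year.
    """
    start, end = season
    if start <= end:
        return start <= month <= end
    # Wrapping season such as [11, 4]: November to April
    return month >= start or month <= end
```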

climada_petals.hazard.emulator.const.TC_BASIN_NORM_PERIOD = {'EP': (1950, 2015), 'NA': (1950, 2015), 'NI': (1980, 2015), 'SA': (1980, 2015), 'SI': (1980, 2015), 'SP': (1980, 2015), 'WP': (1950, 2015)}#

TC basin-specific start/end year of norm period (according to IBTrACS data availability)

climada_petals.hazard.emulator.const.PDO_SEASON = [11, 3]#

Start/end months of PDO activity

climada_petals.hazard.emulator.emulator module

class climada_petals.hazard.emulator.emulator.HazardEmulator(haz_events, haz_events_obs, region, freq_norm, pool=None)[source]#

Bases: object

Draw samples for a time period driven by climate forcing

Draw samples from the given pool of hazard events while making sure that the frequency and intensity are as predicted according to given climate indices.

explaineds = ['intensity_mean', 'eventcount']#
__init__(haz_events, haz_events_obs, region, freq_norm, pool=None)[source]#

Initialize HazardEmulator

Parameters:
  • haz_events (DataFrame) – Output of stats.haz_max_events.

  • haz_events_obs (DataFrame) – Observed events for normalization. Output of stats.haz_max_events.

  • region (HazRegion object) – The geographical region for which to run emulations.

  • freq_norm (DataFrame { year, freq }) – Information about the relative surplus of events in tracks, i.e., if freq_norm specifies the value 0.2 in some year, then it is assumed that the number of events given for that year is 5 times as large as it is predicted to be. Usually, the value will be smaller than 1 because the event set should be a good representation of TC distribution, but this is not necessary.

  • pool (EventPool object, optional) – If omitted, draws are made from haz_events.
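
The arithmetic behind freq_norm can be spelled out with a small worked example. This is only a sketch of the relationship described above, not the package's implementation; the function name `predicted_event_count` and the event count of 50 are made up for illustration.

```python
def predicted_event_count(n_events_given, freq):
    """Number of events predicted for a year (hypothetical helper).

    freq is the freq_norm value for that year, i.e. the ratio of
    predicted events to events present in the event set.
    """
    return n_events_given * freq


# If 50 events are given for a year and freq_norm is 0.2, the event set is
# assumed to be 5 times as large as predicted, so 10 events are predicted.
n_predicted = predicted_event_count(50, 0.2)
```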

calibrate_statistics(climate_indices)[source]#

Statistically fit hazard data to given climate indices

The internal statistics are truncated to fit the temporal range of the climate indices.

Parameters:

climate_indices (list of DataFrames { year, month, … }) – Yearly or monthly time series of GMT, ESOI etc.

predict_statistics(climate_indices=None)[source]#

Predict hypothetical hazard statistics according to climate indices

The statistical fit from calibrate_statistics is used to predict the frequency and intensity of hazard events. The standard deviation of yearly residuals is used to define the yearly acceptable deviation of sample intensity.

Without calibration, the prediction is done according to the (bias-corrected) within-year statistics of the event pool. In this case, the within-year standard deviation of intensity is taken as the acceptable deviation of samples for that year.

Parameters:

climate_indices (list of DataFrames { year, month, … }) – Yearly or monthly time series of GMT, ESOI etc. including at least those passed to calibrate_statistics. If omitted, and if calibrate_statistics has been called before, the climate indices from calibration are reused for prediction. Otherwise, the internal (within-year) statistics of the data set are used to predict frequency and intensity.

draw_realizations(nrealizations, period)[source]#

Draw samples for given time period according to calibration

Draws for a specific year in the given period are not necessarily restricted to events in the pool that are explicitly assigned to that year because the pool might be too small to allow for draws of the expected sample size and mean intensity.

Parameters:
  • nrealizations (int) – Number of samples to draw.

  • period (pair of ints [minyear, maxyear]) – Period for which to make draws.

Returns:

draws – Each entry is a sample for the whole period, given as a DataFrame with columns as in self.pool.events. The year column is set to the respective year and columns for the driving climate indices are added for reference.

Return type:

list of DataFrames, length nrealizations

class climada_petals.hazard.emulator.emulator.EventPool(haz_events)[source]#

Bases: object

Make draws from a hazard event pool according to given statistics

The event pool might cover an arbitrary number of years and an arbitrary geographical region since the time and geo information fields are ignored when making draws.

No assumptions are made about where the statistics come from that are used in making the draw.

Example

Let haz_events be a given dataset of all TC events making landfall in Belize between 1980 and 2050, together with their respective maximum wind speeds on land. Assume that we expect (from some other statistical model) 5 events of annual mean maximum wind speed 30 ± 10 m/s in the year 2025. Then, we can draw 100 realizations of hypothetical 2025 TC event sets hitting Belize with the following commands:

>>> pool = EventPool(haz_events)
>>> draws = pool.draw_realizations(100, 5, 30, 10)

The realization draws[i] might contain events from any year between 1980 and 2050, but the size of the realization and the mean maximum wind speed will be according to the given statistics.

__init__(haz_events)[source]#

Initialize instance of EventPool

Parameters:

haz_events (DataFrame) – Output of stats.haz_max_events.

init_drop(norm_period, norm_mean)[source]#

Use a drop rule when making draws

With the drop rule, a random choice of entries is dropped from events before the actual drawing is done in order to speed up the process in case of data sets where the acceptable mean is far from the input data mean.

Parameters:
  • norm_period (pair of ints [minyear, maxyear]) – Normalization period for which a specific mean intensity is expected.

  • norm_mean (float) – Desired mean intensity of events in the given time period.

draw_realizations(nrealizations, freq_poisson, intensity_mean, intensity_std)[source]#

Draw samples from the event pool according to given statistics

If EventPool.init_drop has been called before, the drop rule is applied.

Parameters:
  • nrealizations (int) – Number of samples to draw.

  • freq_poisson (float) – Expected sample size (“frequency”, Poisson distributed).

  • intensity_mean (float) – Expected sample mean intensity.

  • intensity_std (float) – Acceptable deviation from intensity_mean.

Returns:

draws – Each entry is a sample, given as a DataFrame with columns as in self.events.

Return type:

list of DataFrames, length nrealizations

climada_petals.hazard.emulator.geo module

class climada_petals.hazard.emulator.geo.HazRegion(extent=None, geometry=None, country=None, season=(1, 12))[source]#

Bases: object

Hazard region for given geo information

__init__(extent=None, geometry=None, country=None, season=(1, 12))[source]#

Initialize HazRegion

If several arguments are passed, the spatial intersection is taken.

Parameters:
  • extent (tuple (lon_min, lon_max, lat_min, lat_max), optional)

  • geometry (GeoPandas DataFrame, optional)

  • country (str or list of str, optional) – Countries are represented by their ISO 3166-1 alpha-3 identifiers. The keyword “all” chooses all countries (i.e., global land areas).

  • season (pair of int, optional) – First and last month of hazard-specific season within this region

centroids(latlon=None, res_as=360)[source]#

Return centroids in this region

Parameters:
  • latlon (pair (lat, lon), optional) – Latitude and longitude of centroids. If not given, values are taken from CLIMADA’s base grid (see res_as).

  • res_as (int, optional) – One of 150 or 360. When latlon is not given, choose coordinates from centroids according to CLIMADA’s base grid of given resolution in arc-seconds. Default: 360.

Returns:

centroids

Return type:

climada.hazard.Centroids object

class climada_petals.hazard.emulator.geo.TCRegion(tc_basin=None, season=None, **kwargs)[source]#

Bases: HazRegion

Hazard region with support for TC ocean basins

__init__(tc_basin=None, season=None, **kwargs)[source]#

Initialize TCRegion

The given geo information must be such that everything is contained in a single TC ocean basin.

Parameters:
  • tc_basin (str, optional) – TC (sub-)basin abbreviated name, such as “SIW”. If not given, it is determined automatically from the geometry and the basin bounds.

  • **kwargs (see HazRegion.__init__)

climada_petals.hazard.emulator.geo.get_tc_basin_geometry(tc_basin)[source]#

Get TC (sub-)basin geometry

Parameters:

tc_basin (str) – TC (sub-)basin abbreviated name, such as “SIW” or “NA”.

Returns:

df

Return type:

GeoPandas DataFrame

climada_petals.hazard.emulator.random module

climada_petals.hazard.emulator.random.estimate_drop(events, time_col, val_col, norm_period, norm_fact=None, norm_mean=None)[source]#

Determine fraction of outlying events to be dropped

If the mean intensity of events in the given time period norm_period is far from the desired mean norm_mean, sampling from events will usually yield draws whose mean is far from the desired mean, so that many resamplings will be necessary in order to get an acceptable draw.

Dropping events that lie far from the desired mean before sampling can reduce the number of resamplings needed.

This function estimates which portion of the events should be dropped.

Parameters:
  • events (DataFrame) – Each row describes one event. The dataset should contain at least the columns time_col and val_col.

  • time_col (str) – Name of time column in events.

  • val_col (str) – Name of value column in events.

  • norm_period (pair of timestamps (e.g. floats or ints)) – Normalization period for which a specific mean intensity is expected.

  • norm_mean (float, optional) – Desired mean intensity of events in the given time period.

  • norm_fact (float, optional) – Instead of norm_mean, the ratio between desired and observed intensity in the given time period can be given.

Returns:

drop – Only events satisfying the pandas query expression expr should be eligible for dropping. frac specifies the fraction of these events that are to be dropped.

Return type:

pair [expr, frac]
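
To illustrate what applying such a drop rule means, the following sketch removes a random fraction of the eligible events from a plain Python list. It is not the package's code: the helper `apply_drop` is hypothetical, a Python predicate stands in for the pandas query expression expr, and rounding the number of dropped events is an assumption.

```python
import random


def apply_drop(events, is_droppable, frac, rng):
    """Randomly drop a fraction `frac` of the events matching `is_droppable`.

    Hypothetical helper: `is_droppable` plays the role of the query
    expression `expr`; events that do not match are always kept.
    """
    candidates = [i for i, ev in enumerate(events) if is_droppable(ev)]
    n_drop = round(frac * len(candidates))
    dropped = set(rng.sample(candidates, n_drop))
    return [ev for i, ev in enumerate(events) if i not in dropped]


# Made-up intensities: dropping half of the weak events raises the pool mean.
events = [{"intensity": v} for v in [10, 12, 14, 16, 40, 45, 50, 55]]
kept = apply_drop(events, lambda ev: ev["intensity"] < 30, 0.5, random.Random(1))
```

Because only events on the wrong side of the desired mean are eligible, the mean of the remaining pool moves toward the target, which is what makes later draws more likely to be accepted.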

climada_petals.hazard.emulator.random.draw_poisson_events(poisson, events, val_col, val_accept, drop=None)[source]#

Draw Poisson-distributed events with acceptable value statistics

The size of the draw is Poisson distributed. Redraws are made until the draw mean is within the range specified by val_accept.

If drop is specified, a random choice of entries is dropped from events before the actual drawing is done in order to speed up the process in case of data sets where the acceptable mean is far from the input data mean.

Parameters:
  • poisson (float) – Poisson parameter.

  • events (DataFrame) – Each row describes one event. The dataset should contain at least the column val_col.

  • val_col (str) – Name of value column in events.

  • val_accept (pair of floats) – Acceptable range of draw means.

  • drop (pair [expr, frac] or None) – If given, only events satisfying the pandas query expression expr are dropped. frac specifies the fraction of these events that is dropped.

Returns:

draw_idx – Indices into events. If no acceptable draw was among the first 10,000 attempts, the return value is None.

Return type:

Series or None
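
The redraw loop described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's implementation: the names `poisson_sample` and `draw_with_acceptable_mean` are hypothetical, events are sampled with replacement, and empty draws are simply redrawn. Only the cap of 10,000 attempts and the None return value are taken from the description above.

```python
import math
import random


def poisson_sample(lam, rng):
    """Sample from a Poisson distribution (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def draw_with_acceptable_mean(lam, values, val_accept, rng, max_attempts=10_000):
    """Redraw until the sample mean lies within val_accept; None on failure."""
    lo, hi = val_accept
    for _ in range(max_attempts):
        size = poisson_sample(lam, rng)
        if size == 0:
            continue  # an empty draw has no mean; this sketch just redraws
        # Sampling with replacement is an assumption of this sketch
        idx = [rng.randrange(len(values)) for _ in range(size)]
        mean = sum(values[i] for i in idx) / size
        if lo <= mean <= hi:
            return idx
    return None
```

The returned list holds indices into `values`, mirroring how draw_idx indexes into events.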

climada_petals.hazard.emulator.stats module

climada_petals.hazard.emulator.stats.seasonal_average(data, season)[source]#

Compute seasonal averages from a monthly time series.

For seasons that span the new year, the months after June are attributed to the following year’s season. For example: The 6-month season from November 1980 till April 1981 is attributed to the year 1981.

The two seasons that are truncated at the beginning/end of the dataset’s time period are discarded. When the input data is 1980-2010, the output data will be 1981-2010, where 2010 corresponds to the 2009/2010 season and 1981 corresponds to the 1980/1981 season.
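
The attribution rule and the discarding of truncated seasons can be sketched as follows. This is an illustrative pure-Python version, not the package's code: `season_year` and `seasonal_average_sketch` are hypothetical names, the input is a list of (year, month, value) tuples instead of a DataFrame, and a season counts as truncated when it has fewer months than the full season length.

```python
from collections import defaultdict


def season_year(year, month, season):
    """Year a (year, month) pair is attributed to, or None if out of season."""
    start, end = season
    if start <= end:
        return year if start <= month <= end else None
    if month >= start:  # e.g. Nov/Dec of a season [11, 4]
        return year + 1
    if month <= end:  # e.g. Jan-Apr of a season [11, 4]
        return year
    return None


def seasonal_average_sketch(records, season):
    """Average values per season, discarding incomplete seasons."""
    start, end = season
    n_months = (end - start) % 12 + 1  # full season length in months
    sums, counts = defaultdict(float), defaultdict(int)
    for year, month, value in records:
        sy = season_year(year, month, season)
        if sy is not None:
            sums[sy] += value
            counts[sy] += 1
    return {y: sums[y] / counts[y] for y in sorted(sums) if counts[y] == n_months}
```

With monthly input covering 1980 to 2010 and the season (11, 4), the result has one entry per year from 1981 to 2010, in line with the example above.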

Parameters:
  • data (DataFrame { year, month, … }) – All further columns will be averaged over.

  • season (pair of ints) – Start/end month of season.

Returns:

averaged_data – Same format as input, but with month column removed.

Return type:

DataFrame { year, … }

climada_petals.hazard.emulator.stats.seasonal_statistics(events, season)[source]#

Compute seasonal statistics from given hazard event data

Parameters:
  • events (DataFrame { year, month, intensity, … }) – Events outside of the given season are ignored.

  • season (pair of ints) – Start/end month of season.

Returns:

haz_stats – For seasons that span the new year, this might cover one year less than the input data since truncated seasons are discarded.

Return type:

DataFrame { year, events, intensity_mean, intensity_std, intensity_max }

climada_petals.hazard.emulator.stats.haz_max_events(hazard, min_thresh=0)[source]#

Table of max intensity events for given hazard

Parameters:
  • hazard (climada.hazard.Hazard object)

  • min_thresh (float) – Minimum intensity for event to be registered.

Returns:

events – The integer value in column id refers to the internal order of events in the given hazard object. lat, lon and intensity specify location and intensity of the maximum intensity registered.

Return type:

DataFrame { id, name, year, month, day, lat, lon, intensity }

climada_petals.hazard.emulator.stats.normalize_seasonal_statistics(haz_stats, haz_stats_obs, freq_norm)[source]#

Bias-corrected annual hazard statistics

Parameters:
  • haz_stats (DataFrame { … }) – Output of seasonal_statistics.

  • haz_stats_obs (DataFrame { … }) – Output of seasonal_statistics.

  • freq_norm (DataFrame { year, freq }) – Information about the relative surplus of hazard events per year, i.e., if freq_norm specifies the value 0.2 in some year, then it is assumed that the number of events given for that year is 5 times as large as it is predicted to be.

Returns:

statistics – Normalized and observed hazard statistics.

Return type:

DataFrame { year, intensity_max, intensity_mean, eventcount, intensity_max_obs, intensity_mean_obs, eventcount_obs }

climada_petals.hazard.emulator.stats.fit_data(data, explained, explanatory, poisson=False)[source]#

Fit a response variable (e.g. intensity) to a list of explanatory variables

The fitting is run twice, restricting to the significant explanatory variables in the second run.

Parameters:
  • data (DataFrame { year, explained, explanatory, … }) – An intercept column is added automatically.

  • explained (str) – Name of explained variable, e.g. ‘intensity’.

  • explanatory (list of str) – Names of explanatory variables, e.g. [‘gmt’,’esoi’].

  • poisson (boolean) – Optionally, use Poisson regression for fitting. If False (default), uses ordinary least squares (OLS) regression.

Returns:

sm_results – Results for first and second run.

Return type:

pair of statsmodels Results object

climada_petals.hazard.emulator.stats.fit_significant(sm_results)[source]#

List significant variables in sm_results

Note: The last variable (usually intercept) is omitted!

climada_petals.hazard.emulator.stats.fit_significance(sm_results)[source]#

Extract and visualize significance of model parameters