FLAREr configurations
flare-config-vignette.Rmd
A guide to the variables in configure_flare.yml and in observations_config.csv, parameter_calibration_config.csv, and states_config.csv.
configure_flare.yml
configure_flare.yml is required to be located in your {lake_directory}/configurations/{config_set_name} directory.
location:
site_id: Four letter code for lake
name: name of lake
latitude: latitude in degrees north
longitude: longitude in degrees east
da_setup:
- da_method: code for data assimilation method (enkf or pf)
  - enkf: ensemble Kalman filter
  - pf: bootstrap resampling particle filter
- par_fit_method: method for parameter fitting
  - inflate: uses the parameter inflat_pars in the par configuration to increase the variance of the parameters when data are assimilated. This is the method used in Thomas et al. 2023.
  - perturb: adds normal random noise to each parameter based on the parameter perturb_par in the par configuration.
  - perturb_const: data assimilation fits the mean of the parameter distribution and uses a specified variance for the parameters, defined by the parameter perturb_par in the par configuration.
- ensemble_size: number of ensemble members
- localization_distance: distance in meters over which covariances in the enkf covariance matrix are diminished. The distance governs the exponential decay of the covariance strength.
- no_negative_states: force non-temperature states to be positive (TRUE or FALSE)
- assimilate_first_step: assimilate data provided by the initial conditions. Set to FALSE if the initial conditions already have data assimilated. (TRUE or FALSE)
- use_obs_constraint: assimilate observations (TRUE or FALSE)
- obs_filename: file name of the targets file. It is required to be located in the lake_directory/targets/{site_id} directory.
- pf_always_resample: force the particle filter to resample at each timestep (only when da_method = pf) (TRUE or FALSE)
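As an illustration of how localization_distance can control the decay of covariance with distance, here is a minimal Python sketch. The weight exp(-separation / localization_distance) is an assumption chosen to match the "exponential decay" wording above; the exact form used inside FLAREr may differ, and the function names and example numbers are hypothetical.

```python
import math


def localization_weight(depth_a, depth_b, localization_distance):
    """Weight in (0, 1] that shrinks as two depths move apart.

    Assumed form: exp(-separation / localization_distance).
    """
    separation = abs(depth_a - depth_b)
    return math.exp(-separation / localization_distance)


def localize(covariance, depths, localization_distance):
    """Scale every covariance entry by the weight for its pair of depths."""
    n = len(depths)
    return [
        [
            covariance[i][j]
            * localization_weight(depths[i], depths[j], localization_distance)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical 3-depth example (depths in meters).
depths = [0.0, 1.0, 5.0]
raw = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.7],
    [0.6, 0.7, 1.0],
]
localized = localize(raw, depths, localization_distance=2.0)
for row in localized:
    print([round(value, 3) for value in row])
```

In this sketch the variances on the diagonal are unchanged, and covariances between distant depths shrink faster when localization_distance is small.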
model_settings:
- ncore: number of processor cores to use
- model_name: name of process model (glm or glm_aed)
- base_GLM_nml: name of the base GLM namelist. It is required to be in the {lake_directory}/configurations/{config_set_name} directory.
- max_model_layers: maximum number of layers allowed in the GLM simulations
- modeled_depths: vector of depths (m) with observations or that the user desires output for. The value is for the top of the layer.
- par_config_file: name of the parameter configuration csv (for example, parameter_calibration_config.csv). It is required to be in the {lake_directory}/configurations/{config_set_name} directory.
- obs_config_file: name of the observation configuration csv. It is required to be in the {lake_directory}/configurations/{config_set_name} directory.
- states_config_file: name of the state configuration csv. It is required to be in the {lake_directory}/configurations/{config_set_name} directory.
- depth_model_sd_config_file: optional state configuration file that specifies how process uncertainty depends on depth. If used, it is required to be in the {lake_directory}/configurations/{config_set_name} directory.
default_init:
lake_depth: initial lake depth (meters)
temp: vector of initial temperature profile
temp_depths: vector of depths in initial temperature profile
salinity: initial salinity value (g/kg)
snow_thickness: initial snow thickness (m)
white_ice_thickness: initial white ice thickness (m)
blue_ice_thickness: initial blue ice thickness (m)
met:
- future_met_model: path to the met model used for forecast days (relative to the s3$drivers$bucket path or local_met_directory). It defines the form of the path partitioning. For example, met/gefs-v12/stage2/reference_datetime={reference_date}/site_id={site_id} provides the path, with the parts in braces being updated within the FLARE run.
- historical_met_model: path to the met model used for historical days (relative to the s3$drivers$bucket path or local_met_directory). It defines the form of the path partitioning. For example, met/gefs-v12/stage3/site_id={site_id} provides the path, with the part in braces being updated within the FLARE run.
- forecast_lag_days: number of days to look backward for a forecast
- use_ler_vars: use LER standardized met names (TRUE or FALSE)
- historical_met_use_s3: access historical met data on an s3 bucket (TRUE or FALSE)
- future_met_use_s3: access future met data on an s3 bucket (TRUE or FALSE)
- use_openmeteo: use openmeteo for meteorology inputs (TRUE or FALSE)
- openmeteo_api: the name of the openmeteo api to use (only used if use_openmeteo = TRUE): seasonal, ensemble_forecast, historical, or climate
- openmeteo_model: name of the openmeteo model to use (only used if use_openmeteo = TRUE); see https://open-meteo.com/en/docs for the list of models
- use_openmeteo_archive: use an archived version of openmeteo on s3 rather than directly using the api (only used if use_openmeteo = TRUE)
- local_met_directory: directory where meteorology forecasts are saved if not using s3 access. Relative to the lake_directory.
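The path templates above contain placeholders in braces that are filled in during a run. The following Python sketch shows the substitution idea only; it is not FLAREr code, and the helper name, the site code "fcre", and the date are hypothetical values chosen for illustration.

```python
def fill_path(template, **values):
    """Replace each {placeholder} in the template with the supplied value."""
    return template.format(**values)


future_template = (
    "met/gefs-v12/stage2/reference_datetime={reference_date}/site_id={site_id}"
)
historical_template = "met/gefs-v12/stage3/site_id={site_id}"

# Hypothetical site code and forecast start date.
future_path = fill_path(
    future_template, reference_date="2023-06-01", site_id="fcre"
)
historical_path = fill_path(historical_template, site_id="fcre")

print(future_path)
print(historical_path)
```

The same pattern applies to the inflow and outflow path templates, which use the same {reference_date} and {site_id} placeholders.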
inflow:
- include_inflow: include inflows in simulations (TRUE or FALSE)
- include_outflow: include outflows in simulations (TRUE or FALSE)
- future_inflow_model: path to the inflow model used for forecast days (relative to the s3$inflow$bucket path or local_inflow_directory). It defines the form of the path partitioning. For example, inflow/model_id=historical/reference_datetime={reference_date}/site_id={site_id} provides the path, with the parts in braces being updated within the FLARE run.
- historical_inflow_model: path to the inflow model used for historical days (relative to the s3$inflow$bucket path). It defines the form of the path partitioning. For example, inflow/model_id=historical/site_id={site_id} provides the path, with the part in braces being updated within the FLARE run.
- local_inflow_directory: directory where inflow forecasts are saved if not using s3 access. Relative to the lake_directory.
- future_outflow_model: path to the outflow model used for forecast days (relative to the s3$outflow$bucket path or local_outflow_directory)
- historical_outflow_model: path to the outflow model used for historical days (relative to the s3$outflow$bucket path or local_outflow_directory)
- local_outflow_directory: directory where outflow forecasts are saved if not using s3 access. Relative to the lake_directory.
- use_flows_s3: access flow models on an s3 bucket (TRUE or FALSE)
uncertainty:
- observation: include uncertainty in observations (TRUE or FALSE)
- process: include normal random noise added to states during forecast (TRUE or FALSE)
- weather: include multiple weather forecast ensemble members (TRUE or FALSE)
- initial_condition: include uncertainty in states at initiation of forecast (TRUE or FALSE)
- parameter: include uncertainty in parameters during forecast (TRUE or FALSE)
- inflow: include uncertainty in inflow during forecast (TRUE or FALSE)
output_settings:
- diagnostics_names: names of non-state GLM variables to save
- generate_plots: generate diagnostic plots (TRUE or FALSE)
- diagnostics_daily:
  - names: the variable names in the csv file produced by GLM. Options are at https://github.com/AquaticEcoDynamics/glm-aed/wiki/Navigating-GLM-outputs
  - save_names: the name of the variable in the FLARE forecast output. This may differ from csv_name when you want to add more information to the variable name. For example, the csv may have temp but you want to save it as outflow_temp so you know it is from the outflow. If the output file is output.nc, then the save name needs to include the aggregation function (mean, min, and max are supported). For example, a value of temp_mean would calculate the daily mean from the sub-daily output.nc file. nsave in the glm3.nml file needs to be adjusted to output at sub-daily time steps (nsave = 1 would output hourly if dt = 3600).
  - file: the name of the GLM output file that has the variable. This is defined in the GLM nml. Examples are lake.csv, outlet_00.csv, and output.nc.
  - depth: depth for the daily diagnostic. Use NA for variables that are not associated with a depth (like co2 flux).
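To make the save_names convention for output.nc concrete, the Python sketch below aggregates one day of hourly values according to a suffix such as temp_mean. Reading the aggregation function from the text after the last underscore is an assumption based on the temp_mean example above, not a description of FLAREr internals; the helper names and the hourly values are hypothetical.

```python
from statistics import mean

# The three aggregation functions named in the text above.
AGGREGATORS = {"mean": mean, "min": min, "max": max}


def split_save_name(save_name):
    """Split 'temp_mean' into ('temp', 'mean'); reject unsupported suffixes."""
    variable, _, suffix = save_name.rpartition("_")
    if not variable or suffix not in AGGREGATORS:
        raise ValueError(f"no supported aggregation suffix in {save_name!r}")
    return variable, suffix


def daily_value(save_name, subdaily_values):
    """Aggregate one day of sub-daily values using the suffix of save_name."""
    _, suffix = split_save_name(save_name)
    return AGGREGATORS[suffix](subdaily_values)


# Hypothetical hourly temperatures for one day (24 values, as produced
# when nsave = 1 and dt = 3600).
hourly_temp = [10.0 + 0.5 * hour for hour in range(24)]

print(daily_value("temp_mean", hourly_temp))
print(daily_value("temp_min", hourly_temp))
print(daily_value("temp_max", hourly_temp))
```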
s3:
- drivers:
  - endpoint: s3 endpoint of met drivers
  - bucket: s3 bucket of met drivers
- inflow_drivers:
  - endpoint: s3 endpoint of inflow drivers
  - bucket: s3 bucket of inflow drivers
- outflow_drivers:
  - endpoint: s3 endpoint of outflow drivers
  - bucket: s3 bucket of outflow drivers
- targets:
  - endpoint: s3 endpoint of target files
  - bucket: s3 bucket of target files
- forecasts_parquet:
  - endpoint: s3 endpoint of forecast parquet files
  - bucket: s3 bucket of forecast parquet files
- restart:
  - endpoint: s3 endpoint of restart yaml file
  - bucket: s3 bucket of restart yaml file
- scores:
  - endpoint: s3 endpoint of scores
  - bucket: s3 bucket of scores
parameter_calibration_config.csv
parameter_calibration_config.csv is required to be located in your {lake_directory}/configurations/{config_set_name} directory.
- par_names: vector of GLM names of parameter values estimated
- par_names_save: vector of names of parameter values estimated that are desired in output and plots
- par_file: vector of nml or csv file names that contains the parameter that is being estimated
- par_init: vector of initial mean value for parameters
- par_init_lowerbound: vector of lower bound for the initial uniform distribution of the parameters
- par_init_upperbound: vector of upper bound for the initial uniform distribution of the parameters
- par_lowerbound: vector of lower bounds that a parameter can have
- par_upperbound: vector of upper bounds that a parameter can have
- perturb_par: the parameter controlling the noise or spread in the parameters. If the parameter fitting method is perturb, then it is the standard deviation of the normally distributed random noise that is added to the parameters. If the parameter fitting method is perturb_const, then it is the standard deviation of the parameter distribution. If the parameter fitting method is inflate, it is the variance inflation factor applied to the parameter component of the ensemble (a value greater than 1).
- par_units: units of parameter for plotting
- fix_par: 0 = fit parameter, 1 = hold parameter at par_init
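The roles of the bounds and of perturb_par can be sketched in Python as follows. This is an illustration under stated assumptions, not FLAREr's implementation: clamping perturbed values to par_lowerbound and par_upperbound, and applying the inflation factor by scaling deviations from the ensemble mean by its square root, are both assumptions, and all function names and numbers are hypothetical.

```python
import math
import random
from statistics import pvariance


def initial_parameter_ensemble(par_init_lowerbound, par_init_upperbound,
                               ensemble_size, rng):
    """Draw the initial ensemble from a uniform distribution."""
    return [
        rng.uniform(par_init_lowerbound, par_init_upperbound)
        for _ in range(ensemble_size)
    ]


def perturb(values, perturb_par, par_lowerbound, par_upperbound, rng):
    """'perturb': add normal noise with standard deviation perturb_par.

    Clamping to the parameter bounds is an assumption of this sketch.
    """
    perturbed = []
    for value in values:
        noisy = value + rng.gauss(0.0, perturb_par)
        perturbed.append(min(max(noisy, par_lowerbound), par_upperbound))
    return perturbed


def inflate(values, inflation_factor):
    """'inflate': multiply the ensemble variance by inflation_factor.

    Assumed form: deviations from the mean scaled by sqrt(inflation_factor).
    """
    center = sum(values) / len(values)
    scale = math.sqrt(inflation_factor)
    return [center + scale * (value - center) for value in values]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
ensemble = initial_parameter_ensemble(0.5, 1.5, 100, rng)
perturbed = perturb(ensemble, 0.1, 0.0, 2.0, rng)
inflated = inflate(ensemble, 1.04)

print(pvariance(inflated) / pvariance(ensemble))  # close to 1.04
```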
states_config.csv
states_config.csv is required to be located in your {lake_directory}/configurations/{config_set_name} directory.
- state_names: name of states.
- initial_conditions: the initial conditions for the state if observations are not available to initialize. Assumes the initial conditions are constant over all depths, except for temperature, which uses the default_temp_init variable in configure_flare.R to set the depth profile when observations are lacking.
- model_sd: the standard deviation of the process error for the state
- vert_decorr_length:
- initial_model_sd: the standard deviation of the initial distribution of the state
- states_to_obs_mapping: a multiplier on the state to convert it to the observation. In most cases this is 1. However, in the case of phytoplankton, the model predicts biomass in mmol/m3 but the observations are chla in ug/L. Therefore the multiplier is the biomass to chla conversion.
- states_to_obs_1: the observation that the state contributes to
  - NA is required if there are no matching observations
  - The name in this column must match an observation name
- states_to_obs_2: a second observation that the state contributes to
  - NA is required if there are no matching observations
  - The name in this column must match an observation name
- init_obs_name: the name of observation that is used to initialize the state if there is an observation
- init_obs_mapping: a multiplier on the observation when used to initialize. For example, if using a combined DOC measurement to initialize two DOC states, you need to provide the proportion of the observation that is assigned to each state.
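The two multipliers work in opposite directions: states_to_obs_mapping converts a model state into observation units, and init_obs_mapping converts an observation into an initial state value. A minimal Python sketch of the arithmetic, with hypothetical state names, conversion factor, and values:

```python
def state_to_observation(state_value, states_to_obs_mapping):
    """Convert a model state to the units of the observation it maps to."""
    return state_value * states_to_obs_mapping


def initial_state_from_observation(observed_value, init_obs_mapping):
    """Share of an observation assigned to one state at initialization."""
    return observed_value * init_obs_mapping


# Hypothetical phytoplankton biomass (mmol/m3) and biomass-to-chla factor.
chla = state_to_observation(state_value=2.0, states_to_obs_mapping=0.5)

# Hypothetical combined DOC observation split across two DOC states.
# The proportions are one init_obs_mapping value per state row.
doc_observed = 300.0
proportions = {"doc_labile": 0.25, "doc_refractory": 0.75}
initial_states = {
    state: initial_state_from_observation(doc_observed, proportion)
    for state, proportion in proportions.items()
}

print(chla)
print(initial_states)
```

In this example the proportions sum to 1, so the initialized states add back up to the combined observation.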
depth_model_sd.csv:
depth_model_sd.csv is optional and should be located in your {lake_directory}/configurations/{config_set_name} directory.
- depth: depth (m)
- column names: variable name for states that have depth varying process uncertainty. Values are the sd for each depth. sd will be interpolated between and extrapolated beyond the depths provided.
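The interpolation and extrapolation of sd across depth can be pictured with the Python sketch below. Linear interpolation between the listed depths and holding the nearest value constant beyond them are assumptions of this sketch; the text above does not state which scheme FLAREr uses. The function name and table values are hypothetical.

```python
def sd_at_depth(depth, table_depths, table_sd):
    """Process-uncertainty sd at an arbitrary depth.

    Assumes table_depths is sorted in increasing order. Interpolates
    linearly between rows and holds the end values constant outside them.
    """
    if depth <= table_depths[0]:
        return table_sd[0]
    if depth >= table_depths[-1]:
        return table_sd[-1]
    for i in range(len(table_depths) - 1):
        upper, lower = table_depths[i], table_depths[i + 1]
        if upper <= depth <= lower:
            fraction = (depth - upper) / (lower - upper)
            return table_sd[i] + fraction * (table_sd[i + 1] - table_sd[i])


# Hypothetical depth_model_sd.csv contents for one state column.
table_depths = [0.0, 4.0, 8.0]  # depth (m)
table_sd = [1.0, 0.5, 0.25]     # sd at each depth

for depth in [0.0, 2.0, 6.0, 10.0]:
    print(depth, sd_at_depth(depth, table_depths, table_sd))
```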
observations_config.csv
observations_config.csv is required to be located in your {lake_directory}/configurations/{config_set_name} directory.
- state_names_obs: names of states with observations
- obs_sd: the standard deviation of the observation uncertainty
- target_variable: the name of variable in the data file that is used for the observed state.
- multi_depth: 1 = observation has multiple depths, 0 = observation does not have a depth associated with it.