A guide to the variables in configure_flare.yml and in the observations_config.csv, parameter_calibration_config.csv, and states_config.csv.

configure_flare.yml

location:

  • site_id: Four letter code for lake

  • name: name of lake

  • latitude: latitude in degrees north

  • longitude: longitude in degrees east

da_setup:

  • da_method: code for data assimilation method (enkf)

  • par_fit_method: method for parameter fitting

    • inflate uses the parameter inflat_pars in the par configuration to increase the variance of the parameters when data is assimilated. This is the method used in Thomas et al. 2023

    • perturb adds normal random noise to each parameter based on the parameter perturb_par in the par configuration.

    • perturb_const fits only the mean of the parameter distribution through data assimilation and uses a specified variance for the parameters, defined by the parameter perturb_par in the par configuration.

  • ensemble_size: number of ensemble members

  • localization_distance: distance in meters over which covariances in the enkf covariance matrix are diminished. The distance governs the exponential decay of the covariance strength.

  • no_negative_states: Force non-temperature states to be positive (TRUE or FALSE)

  • assimilate_first_step: Assimilate data provided by the initial conditions. Set to FALSE if the initial conditions already have data assimilated. (TRUE or FALSE)

  • use_obs_constraint: Assimilate observations (TRUE or FALSE)

  • obs_filename: file name of targets file. It is required to be located in the lake_directory/targets/{site_id} directory.
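
The role of localization_distance can be illustrated with a small sketch. It assumes a plain exponential decay, exp(-distance / localization_distance), applied to the separation between modeled depths; the exact functional form used by the software is not stated in this guide, and the function names and values below are hypothetical.

```python
import math

def localization_weights(depths, localization_distance):
    """Weights in (0, 1] that shrink with the separation between depths.

    Assumption: simple exponential decay; the software may use another form.
    """
    return [
        [math.exp(-abs(di - dj) / localization_distance) for dj in depths]
        for di in depths
    ]

def localize(cov, depths, localization_distance):
    """Multiply each covariance entry by its localization weight."""
    weights = localization_weights(depths, localization_distance)
    n = len(depths)
    return [[cov[i][j] * weights[i][j] for j in range(n)] for i in range(n)]

# Illustrative values only: four depths (m) and a covariance matrix of ones,
# so the localized matrix equals the weights themselves.
depths = [0.0, 1.0, 2.0, 5.0]
cov = [[1.0] * len(depths) for _ in depths]
localized = localize(cov, depths, localization_distance=2.0)
```

Under this assumption, a smaller localization_distance makes an observation at one depth influence states at distant depths less strongly, while the variances on the diagonal are left unchanged.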

model_settings:

  • ncore: number of process cores to use

  • model_name: name of process model (glm or glm_aed)

  • base_GLM_nml: name of base GLM namelist. It is required to be in the lake_directory/configuration/{config_set} directory.

  • modeled_depths: vector of depths (m) that are simulated. Value is for the top of the layer.

  • par_config_file: name of parameter configuration csv (for example, parameter_calibration_config.csv). It is required to be in the lake_directory/configuration/{config_set} directory.

  • obs_config_file: name of observation configuration csv. It is required to be in the lake_directory/configuration/{config_set} directory.

  • states_config_file: name of state configuration csv. It is required to be in the lake_directory/configuration/{config_set} directory.

  • depth_model_sd_config_file: Optional state configuration file that specifies how process uncertainty depends on depth. If used, it is required to be in the lake_directory/configuration/{config_set} directory.

default_init:

  • lake_depth_init: initial lake depth (meters)

  • default_temp_init: vector of initial temperature profile

  • default_temp_init_depths: vector of depths in initial temperature profile

  • the_sals_init: vector of initial salinity values

  • default_snow_thickness_init: initial snow thickness (cm)

  • default_white_ice_thickness_init: initial white ice thickness

  • default_blue_ice_thickness_init: initial blue ice thickness (cm)

  • lake_depth: initial lake depth (meters)

  • temp: vector of initial temperature profile

  • temp_depths: vector of depths in initial temperature profile

  • salinity: initial salinity value (g/kg)

  • snow_thickness: initial snow thickness (m)

  • white_ice_thickness: initial white ice thickness (m)

  • blue_ice_thickness: initial blue ice thickness (m)

met:

  • use_forecasted_met: Use forecast met during forecasting (TRUE or FALSE)

  • use_met_s3: access met data on s3 bucket (TRUE or FALSE)

  • use_observed_met: use observed met for non-forecast time steps (TRUE or FALSE)

  • observed_filename: name of meteorology targets file

  • use_ler_vars: use LER standardized met names (TRUE or FALSE)

  • local_directory: directory where meteorology forecasts are saved if not using s3 access. Relative to the lake_directory.

inflow:

  • include_inflow: Include inflows in simulations (TRUE or FALSE)

  • use_forecasted_inflow: Use forecasted inflow during forecasting (TRUE or FALSE)

  • forecast_inflow_model: name of inflow model

  • observed_filename: name of inflow targets file

  • use_ler_vars: use LER standardized inflow names (TRUE or FALSE)

  • local_directory: directory where inflow forecasts are saved if not using s3 access. Relative to the lake_directory.

uncertainty:

  • observation: Include uncertainty in observations (TRUE or FALSE)

  • process: Include normal random noise added to states during forecast (TRUE or FALSE)

  • weather: Include multiple weather forecast ensemble members (TRUE or FALSE)

  • initial_condition: Include uncertainty in states at initiation of forecast (TRUE or FALSE)

  • parameter: Include uncertainty in parameters during forecast (TRUE or FALSE)

  • inflow_process: Include uncertainty in inflow during forecast (TRUE or FALSE)

output_settings:

  • diagnostics_names: names of non-state GLM variables to save

  • evaluate_past: score past forecasts with reference_datetimes within the forecast horizon (TRUE or FALSE)

  • variables_in_scores: Variable types to include in scores. Options are state, parameter, and/or diagnostic

s3:

  • drivers:
    • endpoint:
    • bucket:
  • inflow_drivers:
    • endpoint:
    • bucket:
  • targets:
    • endpoint:
    • bucket:
  • forecasts:
    • endpoint:
    • bucket:
  • forecasts_parquet:
    • endpoint:
    • bucket:
  • warm_start:
    • endpoint:
    • bucket:
  • scores:
    • endpoint:
    • bucket:

parameter_calibration_config.csv

  • par_names: vector of GLM names of parameter values estimated
  • par_names_save: vector of names of parameter values estimated that are desired in output and plots
  • par_file: vector of nml or csv file names that contains the parameter that is being estimated
  • par_init_mean: vector of initial mean value for parameters
  • par_init_lowerbound: vector of lower bound for the initial uniform distribution of the parameters
  • par_init_upperbound: vector of upper bound for the initial uniform distribution of the parameters
  • par_lowerbound: vector of lower bounds that a parameter can have
  • par_upperbound: vector of upper bounds that a parameter can have
  • inflat_pars: The variance inflation factor applied to the parameter component of the ensemble. Value greater than 1.
  • perturb_par: The standard deviation of the normally distributed random noise that is added to parameters
  • par_units: Units of parameter for plotting
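
How the columns above might act on a parameter ensemble can be sketched as follows. This is a minimal illustration, not the package's implementation: it assumes inflat_pars multiplies the ensemble variance (so deviations from the mean are scaled by its square root), that perturb_par is the standard deviation of additive normal noise, and that par_lowerbound and par_upperbound are enforced by clamping. The function names and numeric values are hypothetical.

```python
import random
import statistics

def draw_initial(n, lower, upper, rng):
    """Initial ensemble drawn uniformly between the par_init bounds."""
    return [rng.uniform(lower, upper) for _ in range(n)]

def inflate(values, inflat_pars):
    """Scale deviations from the mean so the variance grows by inflat_pars.

    Assumption: inflat_pars is treated as a multiplier on the variance.
    """
    mean = statistics.fmean(values)
    scale = inflat_pars ** 0.5
    return [mean + scale * (v - mean) for v in values]

def perturb(values, perturb_par, rng):
    """Add normal noise with standard deviation perturb_par to each member."""
    return [v + rng.gauss(0.0, perturb_par) for v in values]

def clamp(values, lower, upper):
    """Keep each member inside [lower, upper] (assumed enforcement of bounds)."""
    return [min(max(v, lower), upper) for v in values]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
ensemble = draw_initial(100, lower=0.5, upper=1.5, rng=rng)
inflated = clamp(inflate(ensemble, inflat_pars=1.04), lower=0.0, upper=2.0)
perturbed = clamp(perturb(ensemble, perturb_par=0.01, rng=rng), lower=0.0, upper=2.0)
```

In this sketch, inflation preserves the ensemble mean and widens the spread, whereas perturbation adds independent noise to every member.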

states_config.csv

  • state_names: name of states.
  • initial_conditions: The initial conditions for the state if observations are not available to initialize. Assumes the initial conditions are constant over all depths, except for temperature, which uses the default_temp_init variable in configure_flare.yml to set the depth profile when observations are lacking.
  • model_sd: the standard deviation of the process error for the state
  • initial_model_sd: the standard deviation on the initial distribution of the state
  • states_to_obs_mapping: a multiplier on the state to convert to the observation. In most cases this is 1. However, in the case of phytoplankton, the model predicts mmol/m3 biomass but the observations are ug/L chla. Therefore the multiplier is the biomass to chla conversion
  • states_to_obs_1: The observation that the state contributes to
    • NA is required if no matching observations
    • Name in this column must match an observation name
  • states_to_obs_2: A second observation that the state contributes to
    • NA is required if no matching observations
    • Name in this column must match an observation name
  • init_obs_name: the name of observation that is used to initialize the state if there is an observation
  • init_obs_mapping: a multiplier on the observation when used to initialize. For example, if using a combined DOC measurement to initialize two DOC states, you need to provide the proportion of the observation that is assigned to each state.
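
The mapping columns can be illustrated with a small sketch that reads a states_config.csv-style table. The state names, observation names, and numbers are hypothetical, and the sketch assumes that states sharing an observation name are summed after applying states_to_obs_mapping; it is not the package's own code.

```python
import csv
import io

# Hypothetical file content: two DOC states that share one DOC observation.
STATES_CONFIG = """state_names,initial_conditions,model_sd,initial_model_sd,states_to_obs_mapping,states_to_obs_1,states_to_obs_2,init_obs_name,init_obs_mapping
temp,NA,0.5,0.5,1,temperature,NA,temperature,1
doc_a,10,1.0,1.0,1,doc,NA,doc,0.2
doc_b,40,1.0,1.0,1,doc,NA,doc,0.8
"""

def read_rows(text):
    """Parse the csv text into a list of dictionaries, one per state."""
    return list(csv.DictReader(io.StringIO(text)))

def predicted_observations(rows, state_values):
    """Convert states to observation space and sum states that share a name."""
    totals = {}
    for row in rows:
        value = float(row["states_to_obs_mapping"]) * state_values[row["state_names"]]
        for column in ("states_to_obs_1", "states_to_obs_2"):
            obs_name = row[column]
            if obs_name != "NA":  # NA marks "no matching observation"
                totals[obs_name] = totals.get(obs_name, 0.0) + value
    return totals

def initial_states_from_obs(rows, observations):
    """Initialize each state as init_obs_mapping times its observation."""
    initial = {}
    for row in rows:
        obs_name = row["init_obs_name"]
        if obs_name != "NA" and obs_name in observations:
            initial[row["state_names"]] = float(row["init_obs_mapping"]) * observations[obs_name]
    return initial

rows = read_rows(STATES_CONFIG)
predicted = predicted_observations(rows, {"temp": 12.0, "doc_a": 10.0, "doc_b": 40.0})
initial = initial_states_from_obs(rows, {"temperature": 11.0, "doc": 50.0})
```

Here the two init_obs_mapping values for the DOC states (0.2 and 0.8) sum to 1, so the combined observation is split between the states without gaining or losing mass.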

depth_model_sd.csv

  • depth: depth (m)
  • one column for each state that has depth-varying process uncertainty. The column name is the variable name of the state and the values are the sd for each depth. The sd will be interpolated between and extrapolated beyond the depths provided.
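
One way to read the interpolation and extrapolation described above is sketched below. It assumes linear interpolation between the listed depths and holds the nearest value constant beyond them; the scheme actually used by the software is not specified in this guide, and the depths and sd values are hypothetical.

```python
def sd_at_depth(depth, known_depths, known_sds):
    """Process-error sd at an arbitrary depth.

    Assumptions: known_depths is sorted in increasing order, interpolation is
    linear, and values beyond the listed depths repeat the nearest entry.
    """
    if depth <= known_depths[0]:
        return known_sds[0]
    if depth >= known_depths[-1]:
        return known_sds[-1]
    for i in range(len(known_depths) - 1):
        d0, d1 = known_depths[i], known_depths[i + 1]
        if d0 <= depth <= d1:
            s0, s1 = known_sds[i], known_sds[i + 1]
            return s0 + (s1 - s0) * (depth - d0) / (d1 - d0)

# Hypothetical depth_model_sd.csv content for a single state column.
known_depths = [0.0, 2.0, 8.0]
known_sds = [0.5, 0.7, 0.3]
modeled_depths = [0.0, 1.0, 5.0, 10.0]
sds = [sd_at_depth(d, known_depths, known_sds) for d in modeled_depths]
```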

observations_config.csv

  • state_names_obs: names of states with observations
  • obs_sd: the standard deviation of the observation uncertainty
  • target_variable: the name of variable in the data file that is used for the observed state.