# FLAREr upgrade
A guide to upgrading to FLAREr 3.0.0.
## configure_flare.yml

`configure_flare.yml` is required to be located in your `{lake_directory}/configurations/{config_set_name}` directory.
### model_settings

- `max_model_layers`: set to 75.
- `modeled_depths`: now the depths with observations or the depths that you want in your output (e.g., the depths used in data assimilation). GLM internally sets the depths (actually now heights) that are modeled.
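For reference, here is a minimal sketch of the updated `model_settings` block. The depths are illustrative (use the depths with observations at your site), and any other keys you already have in this block are unchanged.

```yaml
model_settings:
  max_model_layers: 75
  modeled_depths: [0.1, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
```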
### met

Your historical met needs to be a model. If you are using stage 3 NOAA, you just need to point `historical_met_model` and `local_met_directory` to it. If you were using observations as the historical met, you need to create a "model" from the observations. The model needs a `parameter` and a `prediction` column, and the number of ensemble members in the historical model needs to match the number in the future model. Here is some example code for converting a gap-filled meteorology file to a "model":
```r
library(dplyr)

met_ensemble <- 31

hist_interp_met <- readr::read_csv(cleaned_met_file) |>
  # Replicate each observation across the ensemble: rnorm() with sd = 0
  # returns the observed value for every member, indexed 0 to 30
  reframe(prediction = rnorm(met_ensemble, mean = observation, sd = 0),
          parameter = 0:(met_ensemble - 1),
          .by = c(site_id, datetime, variable)) |>
  arrow::write_dataset(path = file.path(lake_directory, "drivers/met/historical/model_id=obs_interp/site_id=fcre"))
```
- `future_met_model`: path to the met model used for forecast days (relative to the `s3$drivers$bucket` path or `local_met_directory`). It defines the form of the path partitioning. For example, `met/gefs-v12/stage2/reference_datetime={reference_date}/site_id={site_id}` provides the path, with the parts in braces being updated within the FLARE run.
- `historical_met_model`: path to the met model used for historical days (relative to the `s3$drivers$bucket` path or `local_met_directory`). It defines the form of the path partitioning. For example, `met/gefs-v12/stage3/site_id={site_id}` provides the path, with the part in braces being updated within the FLARE run.
- `forecast_lag_days`: number of days to look backward for a forecast.
- `use_ler_vars`: use LER standardized met names (`TRUE` or `FALSE`).
- `historical_met_use_s3`: access historical met data on an s3 bucket (`TRUE` or `FALSE`).
- `future_met_use_s3`: access future met data on an s3 bucket (`TRUE` or `FALSE`).
- `use_openmeteo`: use openmeteo for meteorology inputs (`TRUE` or `FALSE`).
- `openmeteo_api`: the name of the openmeteo API to use (only used if `use_openmeteo = TRUE`): `seasonal`, `ensemble_forecast`, `historical`, or `climate`.
- `openmeteo_model`: name of the openmeteo model to use (only used if `use_openmeteo = TRUE`); see https://open-meteo.com/en/docs for the list of models.
- `use_openmeteo_archive`: use an archived version of openmeteo on s3 rather than directly using the API (only used if `use_openmeteo = TRUE`).
- `local_met_directory`: directory where meteorology forecasts are saved if not using s3 access; relative to the `lake_directory`.
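Putting these options together, a met block using the stage 2/stage 3 NOAA GEFS products might look like the following sketch. The values here are illustrative, not prescriptive; adjust the paths and flags to your own setup.

```yaml
met:
  future_met_model: 'met/gefs-v12/stage2/reference_datetime={reference_date}/site_id={site_id}'
  historical_met_model: 'met/gefs-v12/stage3/site_id={site_id}'
  forecast_lag_days: 1
  use_ler_vars: FALSE
  historical_met_use_s3: TRUE
  future_met_use_s3: TRUE
  use_openmeteo: FALSE
  local_met_directory: 'drivers'
```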
### inflow

Your inflows and outflows need to be separate models.
- `include_inflow`: include inflows in simulations (`TRUE` or `FALSE`).
- `include_outflow`: include outflows in simulations (`TRUE` or `FALSE`).
- `future_inflow_model`: path to the inflow model used for forecast days (relative to the `s3$inflow$bucket` path or `local_inflow_directory`). It defines the form of the path partitioning. For example, `inflow/model_id=historical/reference_datetime={reference_date}/site_id={site_id}` provides the path, with the parts in braces being updated within the FLARE run.
- `historical_inflow_model`: path to the inflow model used for historical days (relative to the `s3$inflow$bucket` path or `local_inflow_directory`). It defines the form of the path partitioning. For example, `inflow/model_id=historical/site_id={site_id}` provides the path, with the part in braces being updated within the FLARE run.
- `local_inflow_directory`: directory where inflow forecasts are saved if not using s3 access; relative to the `lake_directory`.
- `future_outflow_model`: path to the outflow model used for forecast days (relative to the `s3$outflow$bucket` path or `local_outflow_directory`).
- `historical_outflow_model`: path to the outflow model used for historical days (relative to the `s3$outflow$bucket` path or `local_outflow_directory`).
- `local_outflow_directory`: directory where outflow forecasts are saved if not using s3 access; relative to the `lake_directory`.
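A sketch of the corresponding inflow block, using the example inflow paths above. The outflow paths and local directories are assumptions that simply mirror the inflow ones; substitute your own model IDs and directories.

```yaml
inflow:
  include_inflow: TRUE
  include_outflow: TRUE
  future_inflow_model: 'inflow/model_id=historical/reference_datetime={reference_date}/site_id={site_id}'
  historical_inflow_model: 'inflow/model_id=historical/site_id={site_id}'
  local_inflow_directory: 'drivers/inflow'
  future_outflow_model: 'outflow/model_id=historical/reference_datetime={reference_date}/site_id={site_id}'
  historical_outflow_model: 'outflow/model_id=historical/site_id={site_id}'
  local_outflow_directory: 'drivers/outflow'
```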
### s3

This section now has `inflow_drivers` and `outflow_drivers`.

- `drivers`:
  - `endpoint`: s3 endpoint of met drivers
  - `bucket`: s3 bucket of met drivers
- `inflow_drivers`:
  - `endpoint`: s3 endpoint of inflow drivers
  - `bucket`: s3 bucket of inflow drivers
- `outflow_drivers`:
  - `endpoint`: s3 endpoint of outflow drivers
  - `bucket`: s3 bucket of outflow drivers
Change

- `warm_start`:
  - `endpoint`: s3 endpoint of restart files
  - `bucket`: s3 bucket of restart files

to

- `restart`:
  - `endpoint`: s3 endpoint of restart files
  - `bucket`: s3 bucket of restart files
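A sketch of the updated s3 section with the `restart` key replacing `warm_start`. The endpoints and bucket names are placeholders; use your own.

```yaml
s3:
  drivers:
    endpoint: s3.example.org
    bucket: drivers/met
  inflow_drivers:
    endpoint: s3.example.org
    bucket: drivers/inflow
  outflow_drivers:
    endpoint: s3.example.org
    bucket: drivers/outflow
  restart:
    endpoint: s3.example.org
    bucket: restart
```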
## parameter_calibration_config.csv

Remove the column `inflat_par` from `parameter_calibration_config.csv`.

Add

- `fix_par`: 0 = fit parameter, 1 = hold parameter at `par_init`
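For example, a row in the updated file might look like the following. Only the columns relevant to this change are shown, and `lw_factor` is just an illustrative parameter name; your other columns stay as they are.

```
par_names,par_init,fix_par
lw_factor,1.0,0
```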
## observations_config.csv

Add the following column to `observations_config.csv`:

- `multi_depth`: 1 = observation has multiple depths, 0 = observation does not have a depth associated with it.

Remove the column `distance_threshold`.
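An illustrative pair of rows showing the new `multi_depth` column; the variable names here are examples, and the other columns in your file are unchanged.

```
target_variable,multi_depth
temperature,1
secchi,0
```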
## GLM3.nml

Add the following variable to the `&init_profiles` section:

```
restart_mixer_count = 0
```

Change the following variables in the `&init_profiles` section:

- `the_depths` to `the_heights`
- `num_depths` to `num_heights`

Change the following variables in the `&glm_setup` section to:

```
min_layer_vol = 0.025
min_layer_thick = 0.2
max_layer_thick = 0.8
```
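Taken together, the affected sections of an upgraded `GLM3.nml` might look like this sketch. The number of heights and their values are illustrative, and the other variables in these sections are unchanged.

```
&glm_setup
   min_layer_vol = 0.025
   min_layer_thick = 0.2
   max_layer_thick = 0.8
/
&init_profiles
   num_heights = 3
   the_heights = 1.0, 5.0, 9.0
   restart_mixer_count = 0
/
```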
## Other changes

The scoring part of FLAREr has been removed. You will need to add it to your workflow script and install score4cast:

```r
remotes::install_github("eco4cast/score4cast")
```

The code was:
```r
generate_forecast_score_arrow <- function(targets_file,
                                          forecast_df,
                                          use_s3 = FALSE,
                                          bucket = NULL,
                                          endpoint = NULL,
                                          local_directory = NULL,
                                          variable_types = "state"){

  # Write scores either to an s3 bucket or to a local directory
  if(use_s3){
    output_directory <- arrow::s3_bucket(bucket = bucket,
                                         endpoint_override = endpoint)
  }else{
    output_directory <- arrow::SubTreeFileSystem$create(local_directory)
  }

  target <- readr::read_csv(targets_file, show_col_types = FALSE)

  # Score the forecast against the targets and convert the horizon to days
  df <- forecast_df |>
    dplyr::filter(variable_type %in% variable_types) |>
    dplyr::mutate(family = as.character(family)) |>
    score4cast::crps_logs_score(target, extra_groups = c('depth')) |>
    dplyr::mutate(horizon = datetime - lubridate::as_datetime(reference_datetime)) |>
    dplyr::mutate(horizon = as.numeric(lubridate::as.duration(horizon),
                                       units = "seconds"),
                  horizon = horizon / 86400)

  df <- df |>
    dplyr::mutate(reference_date = lubridate::as_date(reference_datetime))

  arrow::write_dataset(df, path = output_directory,
                       partitioning = c("site_id", "model_id", "reference_date"))
}
```
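A hypothetical call from a workflow script, after defining the function above. The paths and the `forecast_df` object are assumptions standing in for your own targets file and FLAREr forecast output.

```r
generate_forecast_score_arrow(
  targets_file = file.path(lake_directory, "targets", "targets.csv"),
  forecast_df = forecast_df,
  use_s3 = FALSE,
  local_directory = file.path(lake_directory, "scores")
)
```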
The package no longer depends on the GLM3r package. There are multiple options for getting a GLM binary. The easiest is to install GLM3r from GitHub (the README has more information about getting the GLM binary):

```r
remotes::install_github("rqthomas/GLM3r")
```

You are required to set an environment variable in your workflow script that calls `run_flare`:

```r
Sys.setenv('GLM_PATH' = 'GLM3r')
```
The `update_run_config2` function has been replaced by the `update_run_config` function.

The `check_noaa_present_arrow` function has been replaced by the `check_noaa_present` function.

The `set_configuration` function has been replaced by the `set_up_simulation` function.
Other FLAREr functions now require `:::` because they are not exported by the package.
The forecast parquet output has `pub_datetime` rather than `pub_date`, and it has a new column, `log_weight`.
Plots are saved in the `plots` subdirectory.

NetCDF restart files are saved in the `restart` directory.