This chapter describes packages that have been introduced for ocean
state estimation purposes and in relation with automatic differentiation
(see Automatic Differentiation). Various examples in this chapter rely on two
model configurations that can be setup as explained in Test Cases For Estimation Package Capabilities
10.1. ECCO: model-data comparisons using gridded data sets¶
Author: Gael Forget
The functionalities implemented in pkg/ecco are: (1) output
time-averaged model fields to compare with gridded data sets; (2)
compute normalized model-data distances (i.e., cost functions); (3)
compute averages and transports (i.e., integrals). The former is
achieved as the model runs forwards in time whereas the others occur
after time-integration has completed. Following
[FCH+15] the total cost function is formulated
generically as
using symbols defined in Table 10.1. Per
Equation (10.3) model counterparts
(\(\vec{m}_i\)) to observational data (\(\vec{o}_i\)) derive
from adjustable model parameters (\(\vec{v}\)) through model
dynamics integration (\(\mathcal{M}\)), diagnostic calculations
(\(\mathcal{D}\)), and averaging in space and time
(\(\mathcal{S}\)). Alternatively \(\mathcal{S}\) stands for
subsampling in space and time in the context of
Section 10.2 (PROFILES: model-data comparisons at observed locations). Plain
model-data misfits (\(\vec{m}_i-\vec{o}_i\)) can be penalized
directly in Eq. (10.1) but penalized misfits
(\(\vec{d}_i\)) more generally derive from
\(\vec{m}_i-\vec{o}_i\) through the generic \(\mathcal{P}\)
post-processor (Eq. (10.2)). Eqs. (10.4)-(10.5)
pertain to model control parameter adjustment capabilities described in
Section 10.3 (CTRL: Model Parameter Adjustment Capability).
Table 10.1 Symbol used in formulating generic cost functions.¶
symbol
definition
\(\vec{u}\)
vector of nondimensional control
variables
\(\vec{v}\)
vector of dimensional control
variables
\(\alpha_i, \beta_j\)
misfit and control cost function
multipliers (1 by default)
\(R_i\)
data error covariance matrix
(\(R_i^{-1}\) are weights)
The parameters available for configuring generic cost function terms in
data.ecco are given in Table 10.2 and
examples of possible specifications are available in:
The gridded observation file name is specified by gencost_datafile.
Observational time series may be provided as on big file or split into yearly
files finishing in ‘_1992’, ‘_1993’, etc. The corresponding \(\vec{m}_i\)
physical variable is specified via the gencost_barfile root (see
Table 10.3). A file named as specified by
gencost_barfile gets created where averaged fields are written
progressively as the model steps forward in time. After the final time step
this file is re-read by cost_generic.F to
compute the corresponding cost function term. If gencost_outputlevel
= 1 and gencost_name=‘foo’ then cost_generic.F outputs model-data misfit fields (i.e.,
\(\vec{d}_i\)) to a file named ‘misfit_foo.data’ for offline analysis and
visualization.
In the current implementation, model-data error covariance matrices
\(R_i\) omit non-diagonal terms. Specifying \(R_i\) thus boils
down to providing uncertainty fields (\(\sigma_i\) such that
\(R_i=\sigma_i^2\)) in a file specified via gencost_errfile. By
default \(\sigma_i\) is assumed to be time-invariant but a
\(\sigma_i\) time series of the same length as the \(\vec{o}_i\)
time series can be provided using the variaweight option
(Table 10.4). By
default cost functions are quadratic but
\(\vec{d}_i^T R_i^{-1} \vec{d}_i\) can be replaced with
\(R_i^{-1/2} \vec{d}_i\) using the nosumsq option
(Table 10.4).
In principle, any averaging frequency should be possible, but only
‘day’, ‘month’, ‘step’, and ‘const’ are implemented for
gencost_avgperiod. If two different averaging frequencies are needed
for a variable used in multiple cost function terms (e.g., daily and
monthly) then an extension starting with ‘_’ should be added to
gencost_barfile (such as ‘_day’ and ‘_mon’). 1 If two cost
function terms use the same variable and frequency, however, then using
a common gencost_barfile saves disk space.
Climatologies of \(\vec{m}_i\) can be formed from the time series of model
averages in order to compare with climatologies of \(\vec{o}_i\) by
activating the ‘clim’ option via gencost_preproc and setting the
corresponding gencost_preproc_i integer parameter to the number of
records (i.e., a # of months, days, or time steps) per climatological
cycle. The generic post-processor (\(\mathcal{P}\) in Eq. (10.2))
also allows model-data misfits to be, for example, smoothed in space by setting
gencost_posproc to ‘smooth’ and specifying the smoother parameters
via gencost_posproc_c (name of a smoothing scale file) and
gencost_posproc_i (an integer specifying the smoother number of time
steps, see Table 10.4). The smoothing scale file can be
be based on the large-scale parameter specified in data.smooth or prepared as
a factor of the model resolution dxC and dyC. As an example, one can read in
offline the model dxC and dyC and create a characteristic length-scale as
sqrt(dxC^2 + dyC^2), then multiply by a factor of 3 if one wants the smoothed
(large scale) field to be of length-scale 3x that of the model grid spacing.
The smoother number of time steps gencost_posproc_i can be the same as that
used in data.smooth. Other options associated with the computation
of Eq. (10.1) are summarized in Table 10.4 and
further discussed below. Multiple gencost_preproc /
gencost_posproc options may be specified per cost term.
In general the specification of gencost_name is optional, has no
impact on the end-result, and only serves to distinguish between cost function
terms amongst the model output (STDOUT.0000, STDERR.0000, costfunction000,
misfit*.data). Exceptions listed in Table 10.6 however
activate alternative cost function codes (in place of cost_generic.F) described in Section 10.1.3. In this section
and in Table 10.3 (unlike in other parts of the manual)
‘zonal’ / ‘meridional’ are to be taken literally and these components are
centered (i.e., not at the staggered model velocity points). Preparing gridded
velocity data sets for use in cost functions thus boils down to interpolating
them to XC / YC.
The gencost_kLev_select option allows the user to select the
vertical level of a 3D model field, thereby creating a 2D field out of that
slice which is used for the cost computation. For example, drifter velocities
correspond to the second depth level of the grid used in ECCOv4, so model
velocities are selected from this depth level to compare to the drifter
observations. The user can specify this in data.ecco with:
gencost_kLev_select(i)=2, where i is replaced with the
index for that cost function term.
Table 10.2 Run-time parameters used in formulating generic cost functions and
defined via ecco_gencost_nml` namelist in data.ecco. All
parameters are vectors of length NGENCOST (the # of available
cost terms) except for gencost_proc* are arrays of size
NGENPPROC\(\times\)NGENCOST (10 \(\times\)20
by default; can be changed in ECCO_SIZE.h at compile time). In
addition, the gencost_is3d internal parameter is reset to true
on the fly in all 3D cases in Table 10.3.¶
parameter
type
function
gencost_name
character(*)
Name of cost term
gencost_barfile
character(*)
File to receive model counterpart
\(\vec{m}_i\) (See
Table 10.3)
gencost_datafile
character(*)
File containing
observational data
\(\vec{o}_i\)
gencost_avgperiod
character(5)
Averaging period for
\(\vec{o}_i\) and
\(\vec{m}_i\)
(see text)
Vertical level of a 3D field to
create a 2D field for cost
computation
gencost_useDensityMask
logical
Needs to be true if density
following feature is used
gencost_sigmaLow
real
Use to define minimum density
surface chosen
gencost_sigmaHigh
real
Used to define maximum density
surface chosen
gencost_refPressure
real
Defines reference pressure used
in density following feature
gencost_tanhScale
real
Used in defining density levels
in density following feature
Table 10.3 Implemented gencost_barfile options (as of checkpoint 65z) that
can be used via cost_generic.F (Section 10.1.1). An extension
starting with ‘_’ can be appended at the end of the variable name
to distinguish between separate cost function terms. Note: the
‘m_eta’ formula depends on the ATMOSPHERIC_LOADING and
ALLOW_PSBAR_STERIC compile-time options and
‘useRealFreshWaterFlux’ run-time parameter.¶
variable name
description
remarks
m_eta
sea surface height
free surface + ice +
global steric
correction
m_sst
sea surface
temperature
first level potential
temperature
m_sss
sea surface salinity
first level salinity
m_bp
bottom pressure
phiHydLow
m_siarea
sea-ice area
from pkg/seaice
m_siheff
sea-ice effective
thickness
from pkg/seaice
m_sihsnow
snow effective
thickness
from pkg/seaice
m_theta
potential temperature
three-dimensional
m_salt
salinity
three-dimensional
m_UE
zonal velocity
three-dimensional
m_VN
meridional velocity
three-dimensional
m_ustress
zonal wind stress
from pkg/exf
m_vstress
meridional wind
stress
from pkg/exf
m_uwind
zonal wind
from pkg/exf
m_vwind
meridional wind
from pkg/exf
m_atemp
atmospheric
temperature
from pkg/exf
m_aqh
atmospheric specific
humidity
from pkg/exf
m_precip
precipitation
from pkg/exf
m_swdown
downward shortwave
from pkg/exf
m_lwdown
downward longwave
from pkg/exf
m_wspeed
wind speed
from pkg/exf
m_diffkr
vertical/diapycnal
diffusivity
three-dimensional,
constant
m_kapgm
GM diffusivity
three-dimensional,
constant
m_kapredi
isopycnal diffusivity
three-dimensional,
constant
m_geothermalflux
geothermal heat flux
constant
m_bottomdrag
bottom drag
constant
Table 10.4 gencost_preproc and gencost_posproc options
implemented as of checkpoint 65z. Note: the distinction between
gencost_preproc and gencost_posproc seems unclear and may be
revisited in the future.¶
The functionality described in this section is operated by
cost_gencost_boxmean.F. It is
primarily aimed at obtaining a mechanistic understanding of a chosen physical
variable via adjoint sensitivity computations (see Automatic Differentiation) as
done for example in [FWL+15, HWP+11, MGZ+99]. Thus the
quadratic term in Eq. (10.1) (\(\vec{d}_i^T R_i^{-1} \vec{d}_i\)) is
by default replaced with a \(d_i\) scalar 2 that derives from model
fields through a generic integral formula (Eq. (10.3)). The
specification of gencost_barfile again selects the physical variable
type. Current valid options to use cost_gencost_boxmean.F are reported in
Table 10.5. A suffix starting with ‘_’ can again be
appended to gencost_barfile.
The integral formula is defined by masks provided via binary files which
names are specified via gencost_mask. There are two cases: (1) if
gencost_mask=‘foo_mask’ and gencost_barfile is of the
‘m_boxmean*’ type then the model will search for horizontal, vertical,
and temporal mask files named foo_maskC, foo_maskK, and
foo_maskT; (2) if instead gencost_barfile is of the
‘m_horflux_’ type then the model will search for foo_maskW,
foo_maskS, foo_maskK, and foo_maskT.
The ‘C’ mask or the ‘W’ / ‘S’ masks are expected to be two-dimensional
fields. The ‘K’ and ‘T’ masks (both optional; all 1 by default) are
expected to be one-dimensional vectors. The ‘K’ vector length should
match Nr. The ‘T’ vector length should match the # of records that the
specification of gencost_avgperiod implies but there is no
restriction on its values. In case #1 (‘m_boxmean*’) the ‘C’ and ‘K’
masks should consists of +1 and 0 values and a volume average will be
computed accordingly. In case #2 (‘m_horflux*’) the ‘W’, ‘S’, and ‘K’
masks should consists of +1, -1, and 0 values and an integrated
horizontal transport (or overturn) will be computed accordingly.
In order to define a control volume using both a depth range and a
density range, use a ‘K’ mask and also set
gencost_useDensityMask=.TRUE.. When the density range
feature is active, the control volume is defined at each timestep by
the bounds set in the ‘K’ mask and also by the density range specified
by the parameters gencost_sigmaLow (the minimum density to
be included in the control volume) and gencost_sigmaHigh
(the maximum density to be included in the control volume). As a default
gencost_refPressure should be set to 0, but other values can
be used (e.g. 1000 dbar, 2000 dbar).
The cost_gencost_moc.F function is
similar to transport function, but is intended to compute the meridional
overturning streamfunction maximum based on the volumetric transport integrated
from the floor to surface, as in Smith and Heimbach (2019) [SH19].
Therefore, this function is intended to work with gencost_barfile=m_trVol, and note that the first 3 characters of gencost_name
must be moc, as depicted in Table 10.6. Users can specify
a latitude band to compute the MOC with appropriately defined West (‘W’) and
South (‘S’) masks as described in Section 10.1.2. As an example see
parameter group (3) in this data.ecco file
.
Note: the functionality in cost_gencost_transp.F is not regularly tested. Users interested in
computing volumetric transports through a section are recommended to use the
m_horflux_vol capabilities described above as it is regularly tested. Users
interested in computing heat and salt transport should note the following about
m_trHeat and m_trSalt:
The associated advection scheme with transports may be inconsistent with
the model unless ENUM_CENTERED_2ND is implemented
10.2. PROFILES: model-data comparisons at observed locations¶
Author: Gael Forget
The purpose of pkg/profiles is to allow sampling of MITgcm runs
according to a chosen pathway (after a ship or a drifter, along
altimeter tracks, etc.), typically leading to easy model-data
comparisons. Given input files that contain positions and dates,
pkg/profiles will interpolate the model trajectory at the observed
location. In particular, pkg/profiles can be used to do model-data
comparison online and formulate a least-squares problem (ECCO
application).
The pkg/profiles namelist is called data.profiles. In the example below,
it includes two input netcdf file names (ARGOifremer_r8.nc
and XBT_v5.nc) that should be linked to the run directory
and cost function multipliers that only matter in the
context of automatic differentiation (see Automatic Differentiation). The
first index is a file number and the second index (in mult* only) is a
variable number. By convention, the variable number is an integer
ranging 1 to 6: temperature, salinity, zonal velocity, meridional
velocity, sea surface height anomaly, and passive tracer.
The netcdf input file structure is illustrated in the case of XBT_v5.nc
To create such files, one can use the MITprof matlab toolbox obtained
from https://github.com/gaelforget/MITprof .
At run time, each file is scanned to determine which
variables are included; these will be interpolated. The (final) output
file structure is similar but with interpolated model values in prof_T
etc., and it contains model mask variables (e.g. prof_Tmask). The very
model output consists of one binary (or netcdf) file per processor.
The final netcdf output is to be built from those using
netcdf_ecco_recompose.m (offline).
When the k2 option is used (e.g. for cubed sphere runs), the input file
is to be completed with interpolation grid points and coefficients
computed offline using netcdf_ecco_GenericgridMain.m. Typically, you
would first provide the standard namelist and files. After detecting
that interpolation information is missing, the model will generate
special grid files (profilesXCincl1PointOverlap* etc.) and then stop.
You then want to run netcdf_ecco_GenericgridMain.m using the special
grid files. This operation could eventually be inlined.
Package ctrl provides an interface to defining the
control variables for an optimization. After defining CPP-flags
ALLOW_GENTIM2D_CONTROL, ALLOW_GENARR2D_CONTROL,
ALLOW_GENARR3D_CONTROL in CTRL_OPTIONS.h, the parameters available for configuring generic
cost terms in data.ctrl are given in Table 10.7. The
control variables are stored as fields on the model grid in files
$ctrlvar.$iternumber.data/meta, and corresponding gradients in
ad$ctrlvar.$iternumber.data/meta, where $ctrl is defined in
data.ctrl (see Table 10.8 for possible options) and
$iternumber is the 10-digit iteration number of the optimization. Further,
ctrl maps the gradient fields to a vector that can be
handed over to an optimization routine (see Section 10.5) and maps
the resulting new control vector to the model grid unless CPP-flag
EXCLUDE_CTRL_PACK is defined in CTRL_OPTIONS.h.
Table 10.7 Parameters in namelist group ctrl_nml_genarr in data.ctrl. The
* can be replaced by arr2d, arr3d, or tim2d for
time-invariant two and three dimensional controls and time-varying
2D controls, respectively. Parameters for genarr2d,
genarr3d, and gentime2d are arrays of length
maxCtrlArr2D, maxCtrlArr3D, and
maxCtrlTim2D, respectively, with one entry per term in
the cost function.¶
Table 10.9 xx_gen????d_preproc options implemented as of checkpoint
67x. Notes: \(^a\): If noscaling is false, the control
adjustment is scaled by one on the square root of the weight before
being added to the base control variable; if noscaling is true,
the control is multiplied by the weight in the cost function itself.¶
name
description
arguments
WC01
correlation modeling
integer: operator
type (default: 1)
smooth
smoothing without
normalization
integer: operator
type (default: 1)
docycle
average period
replication
integer: cycle length
replicate
alias for docycle
(units of
xx_gentim2d_period)
rmcycle
periodic average
subtraction
integer: cycle length
variaweight
use time-varying
weight
—
noscaling\(^{a}\)
do not scale with
xx_gen*_weight
—
documul
sets
xx_gentim2d_cumsum
—
doglomean
sets
xx_gentim2d_glosum
—
Table 10.10 xx_gen????d_preproc_c options implemented as of checkpoint
67x.¶
name
description
arguments
log10ctrl
Control adjustments to
log10 of
2D or 3D array
(not available for
xx_gentim2d).
The control problem is non-dimensional by default, as reflected in the
omission of weights in control penalties [(\(\vec{u}_j^T\vec{u}_j\)
in (10.1)]. Non-dimensional controls
(\(\vec{u}_j\)) are scaled to physical units (\(\vec{v}_j\))
through multiplication by the respective uncertainty fields
(\(\sigma_{\vec{u}_j}\)), as part of the generic preprocessor
\(\mathcal{Q}\) in (10.4). Besides the
scaling of \(\vec{u}_j\) to physical units, the preprocessor
\(\mathcal{Q}\) can include, for example, spatial correlation
modeling (using an implementation of Weaver and Coutier, 2001
[WC01]) by
setting xx_gen*_preproc=’WC01’. Alternatively, setting
xx_gen*_preproc=’smooth’ activates the smoothing part of WC01,
but omits the normalization. Additionally, bounds for the controls can
be specified by setting xx_gen*_bounds. In forward mode, adjustments
to the \(i^\text{th}\) control are clipped so that they remain
between xx_gen*_bounds(i,1) and xx_gen*_bounds(i,4). The bounds
have no effect in adjoint mode unless xx_gen*_bounds(i,j) <
xx_gen*_bounds(i,j+1) for \(j = 1, 3\). When this is the case,
the bounds will “emulate a local minimum” as follows in
pkg/ctrl/adctrl_bound.F. On the minimum end,
when xx_gen*(i) < xx_gen*_bounds(i,2) and the gradient
adxx_gen*(i) > 0.0, i.e., the derivative suggests that a
further decrease of xx_gen*(i) will decrease the cost, an adjustment
is enforced to reverse the sign of the gradient adxx_gen*(i) to be
negative such that any further decrease in xx_gen*(i) toward its minimum
bound xx_gen*_bounds(i,1) will be penalized. The opposite is enforced
at the maximum end when xx_gen*(i) > xx_gen*_bounds(i,3)
and adxx_gen*(i) < 0.0 such that the sign of the gradient
adxx_gen*(i) will be reversed to positive to penalize any further
increase in xx_gen*(i) toward its maximum bound xx_gen*_bounds(i,4).
For the case of time-varying controls, the frequency is specified by
xx_gentim2d_period. The generic control package interprets special
values of xx_gentim2d_period in the same way as the exf package:
a value of \(-12\) implies cycling monthly fields while a value of
\(0\) means that the field is steady. Time varying weights can be
provided by specifying the preprocessor variaweight, in which case
the xx_gentim2d_weight file must contain as many records as the
control parameter time series itself (approximately the run length
divided by xx_gentim2d_period).
The parameter mult_gen* sets the multiplier for the corresponding
cost function penalty [\(\beta_j\) in (10.1);
\(\beta_j = 1\) by default). The preconditioner, \(\cal{R}\),
does not directly appear in the estimation problem, but only serves to
push the optimization process in a certain direction in control space;
this operator is specified by gen*Precond (\(=1\) by default).
Note that control parameters exist for each individual near surface atmospheric
state variable, as well as the net heat and salt (EmPmR) fluxes. The user must
be mindful of control parameter combinations that make sense according to their
specific setup, e.g., with the EXF package.
Pair 1a,b are the physical fields with physical units. Pair 2a,b are temporary
files storing a repeat cycle for use during calculations when
docycle and rmcycle are active. Pair 3a,b have units or
no units depending on the setting of noscaling, which controls
scaling/unscaling by the corresponding xx_gen*_weight (see
Table 10.9).
Second, within initiase_variamd.f (see below), records
startrec to endrec of file 3a
$ctrvar.$iternumber.data are read in ctrl_map_ini_gentim2d.F, processed if scaling or smoothing, etc.,
need to be applied, and then written to (1a,2a)
$ctrlvar.{effective,tmp}.data of size diffrec. Note these
routines contain a md or ad suffix and are produced by TAF, e.g.,
s/rctrl_map_ini_gentim2dmd (found in TAF-generated file
ctrl_map_ini_gentim2d_ad.f) called from s/rinitialize_variamd (found
in TAF-generated file initialize_varia_ad.f), which in turn is called
from s/radthe_main_loop (found in TAF-generated file
the_main_loop_ad.f); alternatively, all of these routines are found the
concatenated file ad_taf_output.f.
|-adthe_main_loop only available in the_main_loop_ad.f, called from the_model_main.f
|-adopen (many tapes, ocean variables, atmos, obcs, etc) initialize tapelev grid, etc.
|-initialise_variamd
|-packages_init_variablesmd
|-diagnostics_init_varia, kpp_init_varia, exf_init_variastore salt,theta
|-profiles_init_varia, ecco_init_varia, obcs_init_variablessome done after ctrl
|-ctrl_init_variablesmd
|-ctrl_map_ini_genarr
|-ctrl_map_genarr2de.g., set etan,siheff ctrl
|-ctrl_map_genarr3de.g., set logdiffkr ctrl
|-ctrl_map_ini_gentim2dmd
|-ctrl_init_rec(xx_atemp)
example here for atemp: [startrec,endrec,diffrec]=[24,37,14]
|-active_read_xy(fnamegenIn,lrec)
read in xx_atemp.0000000001.data from 24->37
|-active_write_xy(fnamegenOut,irec)
write out to xx_atemp.effective.0000000001.data from 1->14
|-active_read_xy(fnamegenOut,irec)
read in xx_atemp.effective.0000000001.data 1->14, do some math
|-active_write_xy(fnamegenTmp,irec)
write out to xx_atemp.tmp.0000000001.data 1->14
do irec=1,diffrec
|-active_read_xy(fnamegenOut,irec)
|-mds_read_field(xx_gentim2d_weight,jrec)
if variaweight, jrec=lrec, else jrec=1
|-smooth_correl2dor smooth2d
|-xx_gen/sqrt(wgentim2d) if doscaling
|-exch_xy_rl
|-active_write_xy(fnamegenOut,irec)
write out to xx_atemp.effective.0000000001.data (smooth/scaled)
enddo
The difference in length of records for 3[a,b] compared to 1[a,b] and 2[a,b] is
due to the fact that we need to access records startrec thru
endrec in 3a, i.e., file 3a needs a total of at least
endrec records; file 3b is automatically generated to provide access
to endrec thru startrec (i.e., in reverse order). File
3b, in particular, is where adjoint sensitivity will be accumulated backward
and written; note the model would thus crash if its last record were
diffrec rather than endrec. For pairs 1[a,b] and 2[a,b],
because they are generated after we have already accessed the correct records
startrec to endrec in 3a, we simply create and write out
these records in the shorter file size diffrec. After their file
size initializations, the control adjustment field with physical unit from file
1a is passed on to pkg/exf for surface forcing application.
The adjustment xx_shicdrag is available in the velocity dependent form
of the ice-ocean transfer coefficients, which is specified by #defineSHI_ALLOW_GAMMAFRICT and SHELFICEuseGammaFrict=.TRUE. at compile time and run time respectively. This parameter provides
adjustments to the drag coefficient at the ice ocean boundary, but by default
only adjusts the drag coefficient used to compute the thermal and freshwater
fluxes, neglecting the momentum contributions. To allow the contribution
directly to momentum fluxes, specify xx_genarr2d_preproc_c(*,iarr)='mom'
in data.ctrl.
As indicated in Table 10.10, the base-10 logarithm of a
control field can be adjusted by specifying the character option
genarr*d_preproc_c(k2,iarr)='log10ctrl', with k2 and iarr
as appropriate, and *d denoting that 2d or 3d are available.
As a concrete example, if the control parameter is updating fld2d,
then the field will be set as follows:
where log10InitVal is a scalar with a default value of 0, but can be changed
by setting gencost_preproc_r(k2,iarr). This is useful in the case where
doInitXX=.TRUE..
Concretely, if we had an initial guess for fld2d=10^-4 then one could set
the following in data.ctrl:
Note that the log10ctrl option can only be used when a weight file
is provided, and finally that this log-option cannot be used with
xx_gen*_preproc(k2,iarr)='noscaling',.
Every call to simul refers to an execution of the forward and
adjoint model. Several iterations of optimization may thus be
performed within a single run of the main program (lsopt_top). The
following cases may occur:
cold start only (no optimization)
cold start, followed by one or several iterations of optimization
warm start from previous cold start with one or several iterations
warm start from previous warm start with one or several iterations
Offline version
Every call to simul refers to a read procedure which reads the
result of a forward and adjoint run Therefore, only one call to
simul is allowed, itmax = 0, for cold start itmax = 1, for warm
start Also, at the end, x(i+1) needs to be computed and saved
to be available for the offline model and adjoint run
In order to achieve minimum difference between the online and offline
code xdiff(i+1) is stored to file at the end of an (offline)
iteration, but recomputed identically at the beginning of the next
iteration.
10.5.3. Number of iterations vs. number of simulations¶
- itmax: controls the max. number of iterations
- nfunc: controls the max. number of simulations within one iteration
From one iteration to the next the descent direction changes. Within
one iteration more than one forward and adjoint run may be performed.
The updated control used as input for these simulations uses the same
descent direction, but different step sizes.
From one iteration to the next the descent direction dd changes using
the result for the adjoint vector gg of the previous iteration. In
lsline the updated control
serves as input for a forward and adjoint model run yielding a new
gg(i,1). In general, the new solution passes the 1st and 2nd Wolfe
tests so xdiff(i,1) represents the solution sought:
\[{\tt xx(i) = xdiff(i,1)}\]
If one of the two tests fails, an inter- or extrapolation is invoked
to determine a new step size tact(i-1,2). If more than one function
call is permitted, the new step size is used together with the “old”
descent direction dd(i-1) (i.e. dd is not updated using the new
gg(i)), to compute a new
NUPDATE max. no. of update pairs (gg(i)-gg(i-1), xx(i)-xx(i-1))
to be stored in OPWARMD to estimate Hessian [pair of current iter. is
stored in (2*jmax+2, 2*jmax+3) jmax must be > 0 to access these
entries] Presently NUPDATE must be > 0 (i.e. iteration without
reference to previous iterations through OPWARMD has not been tested)
EPSX relative precision on xx bellow which xx should not be
improved
EPSG relative precision on gg below which optimization is
considered successful
IPRINT controls verbose (>=1) or non-verbose output
NUMITER max. number of iterations of optimisation; NUMTER = 0:
cold start only, no optimization
ITER_NUM index of new restart file to be created (not necessarily
= NUMITER!)
NFUNC max. no. of simulations per iteration (must be > 0); is
used if step size tact is inter-/extrapolated; in this case, if NFUNC
> 1, a new simulation is performed with same gradient but “improved”
step size
FMIN first guess cost function value (only used as long as first
iteration not completed, i.e. for jmax <= 0)
The non-MITgcm package optim_m1qn3 is based on the same
quasi-Newton variable storage method (BFGS) [GLemarechal89] as the
package in subdirectory lsopt, but it uses a reverse communication
version of the latest (and probably last) release of the subroutine
m1qn3.
This avoids having to define a dummy subroutine simul and also
simplifies the code structure. As a consequence this package is
simple(r) to compile and use, because m1qn3.f contains all necessary
subroutines and only one extra routine (ddot, which was copied
from BLAS) is required.
The principle of reverse communication is outlined in this example:
external simul_rc
...
reverse = .true.
do while (.true.)
call m1qn3 (simul_rc,...,x,f,g,...,reverse,indic,...)
if (reverse) break
call simul (indic,n,x,f,g)
end while
simul_rc is an empty ‘’model simulator’’, and simul generates a
new state based on the value of indic.
The original m1qn3 has been modified to work “offline”, i.e. the
simulator and the driver of m1qn3_offline are separate programs
that are called alternatingly from a (shell-)script. This requires
that the “state” of m1qn3 is saved before this program
terminates. This state is saved in a single file OPWARM.optXXX per
simulation, where XXX is the simulation number. Communication with
the routine, writing and restoring the state of m1qn3 is achieved
via three new common-blocks that are contained in three header
files. simul is replaced by reading and storing the model state
and gradient vectors. Schematically the driver routine optim_sub
does the following:
external simul_rc
...
call optim_readdata( nn, ctrlname, ..., xx ) ! read control vector
call optim_readdata( nn, costname, ..., adxx ) ! read gradient vector
call optim_store_m1qn3( ..., .false. ) ! read state of m1qn3
reverse = .true.
call m1qn3 (simul_rc,...,xx,objf,adxx,...,reverse,indic,...)
call optim_store_m1qn3( ..., .true. ) ! write state of m1qn3
call optim_writedata( nn, ctrlname, ..., xx ) ! write control vector
The optimization loop is executed outside of this program within a script.
% cd MITgcm
% git clone https://github.com/MITgcm/verification_other
and follow the additional directions provided for global_oce_cs32
or for global_oce_llc90.
These model configurations are used for daily regression tests to ensure
continued availability of the tested estimation package features discussed in
Ocean State Estimation Packages. Daily results of these tests, which currently
run on the glacier cluster, are reported on https://mitgcm.org/testing-summary.
To this end, one sets a crontab
job that typically executes
the script reported below. The various commands can also be used to run these
examples outside of crontab, directly at the command line via the testreport
capability.
Note
Users are advised against running global_oce_llc90
tests with fewer than 12 cores (96 for adjoint tests) to avoid potential memory
overloads. global_oce_llc90
(595M) uses the same LLC90 grid as the production ECCO version 4 setup
[FCH+15]. The much coarser resolution global_oce_cs32
(614M) uses the CS32 grid and can run on any modern laptop.