PhenoFlex
ModelThe PhenoFlex
Model constitutes a framework for modeling
the spring phenology of deciduous trees, with a particular focus on
fruit and nut trees. It uses the structure of the Dynamic Model for
chill accumulation and the Growing Degree Hours model for heat
accumulation. In order to relate spring phenology to chill and heat, and
to account for varying theories about the relationship between these two
agroclimatic phenomena, PhenoFlex
includes a flexible
transition function that defines responsiveness to heat (during
ecodormancy) as a function of accumulated chill (during endodormancy).
Unlike most previous applications of these models,
PhenoFlex
can flexibly fit the parameters of both models to
observed spring phenology dates. The model can thus accommodate
variation among the temperature responses of different species and
cultivars.
The PhenoFlex
Model is implemented in the
chillR
package, which also contains functions to fit the
model parameters to observed spring phenology data. This vignette
demonstrates the use of the PhenoFlex Model to predict spring phenology
dates, as well as the procedure for fitting model parameters to
data.
We demonstrate the PhenoFlex
functions using the example
of cherry bloom data recorded at Campus Klein-Altendorf, the
experimental station of the University of Bonn. This dataset, along with
long-term records of daily minimum and maximum temperatures, is included
in the chillR
package (KA_bloom
and
KA_weather
, respectively). Since the PhenoFlex model
requires hourly temperatures, we use the stack_hourly_temps
function from chillR
to interpolate the daily data.
library(chillR)
library(ggplot2)
data(KA_weather)
data(KA_bloom)
hourtemps <- stack_hourly_temps(KA_weather, latitude=50.4)
In PhenoFlex
, the chilling requirement is denoted yc and the heat
requirement zc. We first
illustrate how the model can be used to predict spring phenology events,
based on user-specified yc and zc values and
using the usual parameters for the Dynamic Model and the Growing Degree
Hours model (the default values in PhenoFlex
). To run the
analysis for one year only, we select all data for the 2009 season. The
2009 season is the dormancy season that ends in 2009. The number of
months to consider for the season is specified in the
genSeason
function by the mrange
parameter.
Since the default (August to June) is appropriate here, this is not
specified below.
yc <- 40
zc <- 190
iSeason <- genSeason(hourtemps, years=c(2009))
res <- PhenoFlex(temp=hourtemps$hourtemps$Temp[iSeason[[1]]],
times=c(1: length(hourtemps$hourtemps$Temp[iSeason[[1]]])),
zc=zc, stopatzc=TRUE, yc=yc, basic_output=FALSE)
The PhenoFlex function generates a list that tracks the accumulation of x (precursor to the dormancy-breaking factor), y (the dormancy-breaking factor; or Chill Portion), z (heat accumulation) and xs (the ratio of the formation to the destruction rate of x) over time. It also returns a bloomindex element, which points to the row in the temps input table that corresponds to estimated budbreak (or whatever the phenological stage of interest is).
DBreakDay <- res$bloomindex
seasontemps<-hourtemps$hourtemps[iSeason[[1]],]
seasontemps[,"x"]<-res$x
seasontemps[,"y"]<-res$y
seasontemps[,"z"]<-res$z
seasontemps<-add_date(seasontemps)
ggplot(data=seasontemps[1:DBreakDay,],aes(x=Date,y=y)) +
geom_line(col="blue",lwd=1.5) +
theme_bw(base_size=20) +
geom_hline(yintercept=yc,lty=2) +
labs(title="Chill (y) accumulation")
ggplot(data=seasontemps[1:DBreakDay,],aes(x=Date,y=z)) +
geom_line(col="red",lwd=1.5) +
theme_bw(base_size=20) +
geom_hline(yintercept=zc,lty=2) +
labs(title="Heat (z) accumulation")
The PhenoFlex
model can be fitted to phenological data,
provided that sufficient observations are available for the cultivar of
interest. For this kind of model, parameters are usually determined
using an empirical solver. Solvers identify the best combination of
model parameters by trying out different values and gradually adjusting
these parameters, until the objective function - a measure of how far
predictions are from the observed data - does not decrease further. In
this framework, we fit the model using a generalized simulated annealing
algorithm. In contrast to other algorithms, simulated annealing can deal
with discrete data (we calculate the residual sum of squares between
observed and predicted bloom days as objective function). Since
simulated annealing is a stochastic solver, there is a risk of not
finding the overall best parameter combination (global minimum of
residual sum of squares). The generalisation of the basic solver reduces
this risk. Still, the search should be repeated several times with
different initial parameter values and random seeds. This iterative
procedure requires lots of model runs, which can take substantial time
and/or computing power. Here, we only demonstrate the functionality
using a maximum of 10 iterations of the simulated annealing procedure.
In order to achieve reliable parameters, we recommend at least 1000
iterations. We also recommend using many observations of a cultivar’s
spring phenology (many more than in this example), and ideally to
include winter seasons spanning a wide range of temperature
conditions.
For fitting the PhenoFlex
model to phenological data, we
have to generate a list of seasons as follows. We are only using 6
seasons in this example, but a real-life application should be based on
at least 15 to 20 seasons or even more, comprising variable temperature
profiles across years.
Parameters are then fitted using the generalized simulated annealing
algorithm, which is called by the phenologyFitter
function
and requires several inputs:
PhenoFlex
model has more than one parameter, we have to
wrap it into another function (PhenoFlex_GDHwrapper
), which
requires a temperature dataset x
and a vector containing
all the parameters of the PhenoFlex
model. Within the
wrapper function, this parameter vector is unpacked, with each value
assigned to the respective parameter.The order, in which the PhenoFlex
parameters have to be
provided, is given in the description of the
PhenoFlex_GDHwrapper
function as follows:
PhenoFlex
- default value: 0.5Note that the PhenoFlex_GDHwrapper
can only fit the
version of the PhenoFlex
model that uses a heat model based
on the GDH concept. For use of the Gaussian heat model that can also be
considered by PhenoFlex
, use the
PhenoFlex_GAUSSwrapper
function.
For this example, we initiate the procedure with the default values, and we set upper and lower bounds as follows:
par <- c(40, 190, 0.5, 25, 3372.8, 9900.3, 6319.5, 5.939917e13, 4, 36, 4, 1.60)
upper <- c(41, 200, 1.0, 30, 4000.0, 10000.0, 7000.0, 6.e13, 10, 40, 10, 50.00)
lower <- c(38, 180, 0.1, 0 , 3000.0, 9000.0, 6000.0, 5.e13, 0, 0, 0, 0.05)
Now, we can run the fitter:
Fit_res <- phenologyFitter(par.guess=par,
modelfn = PhenoFlex_GDHwrapper,
bloomJDays=KA_bloom$pheno[which(KA_bloom$Year %in% c(2003:2008))],
SeasonList=SeasonList, lower=lower, upper=upper,
control=list(smooth=FALSE, verbose=FALSE, maxit=10,
nb.stop.improvement=5))
## Loading required namespace: parallel
Note that the list control
regulates the behavior of the
GenSA
algorithm, the solving engine we are using here. Note
further that smooth=FALSE
must be set for the
PhenoFlex
model, since the objective function, the residual
sum of squares between predicted and observed bloom days is calculated
based on discrete data (days) and is thus not smooth.
verbose
controls whether messages from the algorithm are
shown. maxit
defines the maximum number of iterations of
the algorithm and should have a value of at least 1000.
nb.stop.improvement
is the number of search steps of the
algorithm, after which the solver is stopped when there is no further
improvement of the fit.
In the example above, the values are chosen in a way that allows the
fitter to finish quickly. Thus, the result may not be particularly
meaningful. More reasonable parameters are, for instance, the default
values of phenologyFitter
:
phenologyFitter
can also be used with other models. For
instance, in order to fit the StepChill
model, we could
add
to the argument list of phenologyFitter
and adapt the
parameter vectors par
, upper
and
lower
accordingly.
The result of a fitting procedure can be summarized as follows
Phenology Fitter
Final RSS: 33.94618
RMSE : 2.378591
R^2 : 1.977668
data versus predicted:
data predicted delta
1 107 105.4167 1.5833333
2 110 106.3333 3.6666667
3 105 103.8333 1.1666667
4 116 119.2500 -3.2500000
5 105 105.1667 -0.1666667
6 115 117.4583 -2.4583333
with data
being the days where bloom or any other
phenological event was observed, predicted
the day that was
predicted by the model and delta
the difference between
data
and predicted
, i.e. the residual error.
These results can also be visualized using the plot
function
Model prediction are usually uncertain, and the possible errors this may produce should be expressed. To estimate these errors, we use a bootstrapping technique, which involves the following steps:
PhenoFlex
framework, make predictions for the
validation dataset and record the results for each yearPhenoFlex
concept will
eventually be provided in a peer-reviewed paperIn short, the residuals are bootstrapped and then the fit is
repeated. This procedure is performed boot.R
times.
Fit_res.boot <- bootstrap.phenologyFit(Fit_res, boot.R=10,
control=list(smooth=FALSE, verbose=TRUE, maxit=10,
nb.stop.improvement=5),
lower=lower, upper=upper, seed=1726354)
## Emini is: 12.47395833
## xmini are:
## 39.11676954 195.5517993 0.3217588819 27.59297429 3372.8 9900.3 6278.621388 5.9399184e+13 7.096261393 31.92723795 2.3951803 1.6
## Totally it used 0.64303 secs
## No. of function call is: 258
## Algorithm reached max number of iterations.
## ========Emini is: 28.66840278
## xmini are:
## 38.23353909 195.5517993 0.3217588819 27.59297429 3372.8 9900.3 6278.621388 5.9399184e+13 7.096261393 31.92723795 2.3951803 1.6
## Totally it used 0.544482 secs
## No. of function call is: 258
## Algorithm reached max number of iterations.
## ========Emini is: 41.82465278
## xmini are:
## 38.23353909 195.5517993 0.3217588819 27.59297429 3372.8 9900.3 6278.621388 5.9399184e+13 7.096261393 31.92723795 2.3951803 1.6
## Totally it used 0.652393 secs
## No. of function call is: 258
## Algorithm reached max number of iterations.
## ========Emini is: 9.901041667
## xmini are:
## 38.23353909 190.8940883 0.3217588819 27.59297429 3372.8 9900.3 6278.621388 5.9399184e+13 7.096261393 31.92723795 2.3951803 1.6
## Totally it used 0.551667 secs
## No. of function call is: 258
## Algorithm reached max number of iterations.
## ========It: 5, obj value (lsEnd): 17.31944444 indTrace: 5
## Emini is: 16.60069444
## xmini are:
## 39.94078897 195.62368 0.4235652424 29.11345689 3372.8 9900.3 6212.604636 5.939931303e+13 5.95083233 31.92723795 2.3951803 34.66770264
## Totally it used 0.889221 secs
## No. of function call is: 340
## Algorithm reached max number of iterations.
## ========Emini is: 18.10416667
## xmini are:
## 38.23353909 181.1035987 0.1155335754 27.59297429 3372.8 9900.3 6278.621388 5.9399184e+13 7.096261393 31.92723795 2.3951803 1.6
## Totally it used 0.522999 secs
## No. of function call is: 258
## Algorithm reached max number of iterations.
## ========Emini is: 29.26388889
## xmini are:
## 39.11676954 195.5517993 0.3217588819 27.59297429 3372.8 9900.3 6278.621388 5.9399184e+13 7.096261393 31.92723795 2.3951803 1.6
## Totally it used 0.630003 secs
## No. of function call is: 258
## Algorithm reached max number of iterations.
## ========Emini is: 17.18055556
## xmini are:
## 39.11676954 195.5517993 0.3217588819 27.59297429 3372.8 9900.3 6278.621388 5.9399184e+13 7.096261393 31.92723795 2.3951803 1.6
## Totally it used 0.529752 secs
## No. of function call is: 258
## Algorithm reached max number of iterations.
## ========Emini is: 19.26909722
## xmini are:
## 39.23709156 181.6020895 0.1178039946 27.59297429 3372.8 9872.011555 6390.26006 5.939919565e+13 5.369140297 27.8544759 2.3951803 17.00992482
## Totally it used 0.749689 secs
## No. of function call is: 338
## Algorithm reached max number of iterations.
## ========Emini is: 36.94270833
## xmini are:
## 38.23353909 181.1035987 0.1155335754 27.59297429 3372.8 9900.3 6278.621388 5.9399184e+13 7.096261393 31.92723795 2.3951803 1.6
## Totally it used 0.678893 secs
## No. of function call is: 258
## Algorithm reached max number of iterations.
## ========
Like for the result of phenologyFitter
there is also a
summary
and a plot
function for
bootstrap.phenologyFit
.
par Err q16 q84
1 3.911677e+01 6.136067e-01 3.823354e+01 3.918415e+01
2 1.955518e+02 6.739732e+00 1.813229e+02 1.955518e+02
3 3.217589e-01 1.108245e-01 1.165326e-01 3.217589e-01
4 2.759297e+01 4.808188e-01 2.759297e+01 2.759297e+01
5 3.372800e+03 0.000000e+00 3.372800e+03 3.372800e+03
6 9.900300e+03 8.945592e+00 9.900300e+03 9.900300e+03
7 6.278621e+03 4.296416e+01 6.278621e+03 6.278621e+03
8 5.939918e+13 4.056009e+07 5.939918e+13 5.939919e+13
9 7.096261e+00 6.209134e-01 6.454821e+00 7.096261e+00
10 3.192724e+01 1.287920e+00 3.192724e+01 3.192724e+01
11 2.395180e+00 0.000000e+00 2.395180e+00 2.395180e+00
12 1.600000e+00 1.103495e+01 1.600000e+00 1.022956e+01
with par
being the estimated parameter values,
Err
the standard deviation of the bootstrap distribution
and q16
and q84
the 16. and 84. percentiles,
respectively.
This plot now shows observed bloom dates, as well as bloom dates
predicted with the PhenoFlex
model, including an estimate
of prediction uncertainty (shown as error bars).