Incorporating competition factors in a mixed-effect model with random effects of site quality for individual tree above-ground biomass growth of Pinus kesiya var. langbianensis

Background: Accurate biomass estimation has critical effects on quantifying carbon stocks and sequestration rates, and aboveground biomass (AGB) growth models are a key component of tree biomass estimation. The study objective was to develop a growth model for AGB of an individual tree by combining competition factors and site quality using a mixed-effect model. Methods: The AGB of 128 sampling trees was investigated for Simao pine (Pinus kesiya var. langbianensis) at three typical sites near Pu’er City of Yunnan Province, China. Richards’ Equation was used for the basic growth model (BM) of the AGB, and a mixed-effect model with random effect of site quality (MEM) based on BM and a mixed-effect model with fixed effect of competition factors (MEMC) based on MEM were built using S-plus. Results: Both mixed-effect models are significantly better than the basic model in fitting and predicting the individual tree AGB growth for Simao pine, but the MEM is better than the MEMC. Moreover, the mixed-effect model with competition factors and site quality is the optimal estimation model due to its highest prediction precision (P=86.08%) as well as the lowest absolute average relative error (RMA=54.34%) and average relative error (EE =6.45%). Conclusion: A model including site quality and competition factors can be used to improve the tree AGB growth estimation for the individual tree AGB growth of Simao pine. New Zealand Journal of Forestry Science Nong et al. New Zealand Journal of Forestry Science (2019) 49:11 https://doi.org/10.33494/nzjfs492019x27x


Introduction
Forests are important components of terrestrial ecosystems, and they play vital roles in the global carbon balance (Woodwell et al. 1978;Talhelm et al. 2014;Sleeter et al. 2018). Trees are an important element of forest ecosystems' carbon storage (Goodale et al. 2002;Houghton 2005;Houghton et al. 2009;Luo et al. 2014). Accurate estimates of tree biomass are critical for quantifying carbon stocks and fluxes as well Keywords: Above-ground biomass, competition factors, growth model, mixed-effect model, Pinus kesiya var. langbianensis, Richards' Equation, site quality Andre et al. 2005;Somogyi et al. 2007;Sileshi 2014;Chave et al. 2014;Temesgen et al. 2015). Equations for tree biomass estimation can be categorized into regional biomass conversion factors, stand-level and tree-level biomass Equations (Di Cosmo et al. 2016;Jagodziński et al. 2018). Tree-level biomass models are used to estimate the total and the components of individual trees with easily measurable tree inventory attributes (e.g. DBH, H). The tree-level biomass model became a research priority because of its better performance and higher predicting precision compared with regional and stand-level biomass models (Temesgen et al. 2015). More than 2600 biomass models related to more than 100 tree species have been constructed all over the world (e.g. Ter-Mikaelian & Korzukhin 1997;Jenkins et al., 2003Jenkins et al., , 2004Zianis 2005;Muukkonen 2007). Although many studies on tree-level biomass models have already been developed, the existing Equations are generally not good enough and need improvement (Temesgen et al. 2015). Moreover, almost all of these studies were static models and focused on the relation among the individual tree biomass and predictors, whereas few studies have considered the growth of individual tree biomass using growth Equations. Furthermore, spatial and temporal autocorrelation between individuals and groups may be neglected in modelling tree biomass growth (Li & Zhang 2010).
It is necessary to develop comprehensive biomass estimation models with better prediction performance that consider differences in stand density and among sites (Temesgen et al. 2015). Forest biomass estimation variation can be observed between different ecological zones and sites (Henry et al. 2011, De-Miguel et al. 2014. Trees have different growth rates due to differences between sites, and models perform better when site quality is considered in the model (Vanninen et al. 1996;Rohner et al. 2013). Furthermore, site quality also has an important effect on tree growth, and it can be quantified as a site index (SI) which represents the potential productivity of the forestland by using the dominant height of stands at a particular standard age (Running & Mcleod 1988;Carmean et al. 1989;Sturtevant & Seagle 2004;Waring et al. 2006). Stand density can be used to reflect the effects of crowding and competition among trees on tree growth (Curtis 1985;Zeide 2005). Competition results from interactions between many biotic and abiotic factors, and this affects the forest structure (Sahney et al. 2010). With increasing stand density, individual trees can be restricted in their growth and even die (Bragg 2001). Many studies have shown that tree size is strongly affected by competition (Coomes & Allen 2007;Coates et al. 2004Coates et al. , 2009. By incorporating the size and distance of neighboring trees into the model, predictions of growth and mortality can be improved (Mctague & Weiskittel 2016). Biomass growth Equations are generally constructed by using forestry investigation data and contain multiple repeated observations of individual trees. The characteristics of these data led to spatial and temporal correlations among observations in the same sampling unit (Lhotka & Loewenstein 2011;Timilsina & Staudhammer 2013). A previous parameter estimation of the individual tree growth Equation was estimated by ordinary least squares (OLS). This would violate the regression assumptions of homoscedasticity of variances and independence of residuals and lead to inaccurate estimates if the growth model was developed based on non-independent data by OLS (Budhathoki et al. 2010;Njana et al. 2016). To address these problems, many researchers have tried to develop new growth models by incorporating mixedeffect modelling techniques (Budhathoki et al. 2010;Lhotka & Loewenstein 2011). A mixed-effect model, which consists of fixed effects and random effects, provides a flexible and powerful tool for the analysis of grouped data (Pinheiro & Bates 2000). Fixed effects can indicate the average relationships of a dependent variable with independent variable(s), and random effects can reflect the difference among groups (Razali et al. 2015). The advantage of mixed-effect models is that they can fit growth and yield data in forestry fairly well via multilevel random effects (Gregoire et al. 1995), and the prediction accuracy of such models can be improved through modifications of random effects (Calama & Montero 2004). In recent years, mixed-effect models have been widely applied in forestry due to their better fitting performance and prediction precision (Ramon et al. 2006), and they have been used to estimate the dominant height growth (Fang & Bailey 2001), diameter increment (Lhotka & Loewenstein 2011), tree stand basal area (Gregoire et al. 1995), basal area growth of individual trees (Budhathoki et al. 2008), wood density (Li & Jiang 2013), stand volume (Li 2012), and biomass (Zhang & Borders 2004;Fehrmann et al. 2008;Fu et al. 2012;Njana et al. 2016) by incorporating different random effects (e.g. tree-level, plot-level). Therefore, it is crucial to accurately predict biomass growth by constructing a comprehensive model with good prediction performance using a mixed-effect model that considers competition factors and site quality.
Simao pine (Pinus kesiya var. langbianensis), a geographic variation of P. kesiya, is distributed in mountain areas from the northern tropical zone to the southern subtropical zone of Yunnan Province, China. Its distribution area and stocking volume account for 11% of the forestland of Yunnan Province (Compilation Committee of Yunnan Forest 1986). It has been an important tree species for afforestation in Yunnan due to its rapid growth (Southwest Forestry College & Forestry Department of Yunnan Province 1988). Moreover, Simao pine forests provide a range of goods and services and have important economic and ecological values (Wu & Dang 1992;Wen et al. 2010;Yue & Yang 2011;Li 2011). Therefore, it is important to be able to estimate biomass growth comprehensively and with high prediction accuracy by incorporating site index and competition factors to assess the potential value of these forests and to guide forest management.
In this study, natural forests of Simao pine were studied by sampling the above-ground biomass (AGB) of 128 trees at three sites. Mixed-effect models incorporating site quality and a competition factor were used to construct the AGB growth model based on a transformation of Richards' Equation. This study aimed to: (1) explore a comprehensive biomass growth model with high prediction accuracy for estimating the AGB growth of Simao pine; and (2) explain the impacts on improving AGB estimation from site quality and competition factors.

Study Region
The study region is located in Mojiang county, Simao district and Lancang county which belong to Pu'er City, southwest of Yunnan Province, P. R. China. In this city, mountain areas comprise 98.3% of the overall region, and the study region is located between 22°02′N to 24°50′N and 99°09′E to 102°19′E. Three typical geographic areas with different climates, Tongguan Town of Mojiang County (Site I), Yunxian Town of Simao District (Site II) and Nuofu Town of Lancang County (Site III), were selected as study sites (Fig. 1), and the elevation of the sites are from 1400 m to 1600 m above sea level. The mean annual temperatures and annual precipitation both decrease from south to north (Fig. 2). Lancang county has the highest annual mean temperature (19.7 ℃) and annual precipitation (1586.5 mm), and Mojiang county has the lowest values (the annual mean temperature is 18.4 ℃ and the annual precipitation is 1306 mm) (Ou et al. 2016).

Data investigation
A total sample of 128 pines in 45 plots with an area of 900 m 2 were selected and investigated in the study areas ( Fig. 1). Plot information including latitude, longitude, degree of slope, and aspect of slope was recorded. Tree age (A), diameter at breast height (DBH) and tree height (H) of the 128 sampled trees were recorded (Fig. 3). Tree age was determined by counting the number of annual growth rings of the stump of each felled sample trees. Then, we also recorded the basic characteristics of the surrounding trees within 5 metres of the sample trees, including tree species, DBH, H and the distance to calculate the competition index (CI). We calculated the average height of dominant trees (Ht) and the average age of stands (A) to calculate the site index (SI). The average height of the dominant trees for each plot is the mean of the three highest trees, and the average age of the stand of each plot is the mean age of the standard trees with a DBH similar to the average DBH of the plot.

Biomass measurement
According to the sample collection method for tree biomass modelling in China, the biomass of each tree component was measured one at a time (Zeng et al. 2011). The biomass of the stem was measured by the method of volume density. Firstly, felled trees were segmented and weighed, and the fresh weight was measured. Secondly, the segment length and diameter were measured, and the volume of the trunk was calculated. Branches and leaves were measured by the method of the graded sample branch. Dead branches and fruits were measured by the method of total weight. Thirdly, the samples from the different components were dried to a constant weight at 105 ℃ using an oven, and the sample density of the wood and bark was also measured using the drainage method. Finally, the biomass of wood and bark of the sampled trees was calculated using the volume and the corresponding sample density, and the branch biomass and needle biomass of sample trees were calculated using fresh weight and the corresponding dry matter ratio.
The sample trees were divided into two sets by random selection; one set with 96 sample trees was used to fit the models, and the other one was used for the test. The basic characteristics are listed in Table 1.  (1) where CI i is a competition index, D i is the diameter of sample tree i, D j is the diameter of competition tree j around the sample tree, and DIST ij is the distance from sample tree i to competition tree j.

Site index calculation
To derive the site index (SI), the dominant height and age of each plot were measured. SI was calculated for Simao pine using the Equation by Wang (2003), Equation 2.
( 2) where SI is the site index, H t is the average height of dominant trees of each plot, A is the average age of each plot, and the base age is 20 years according to Wang (2003). The site index of the plots includes five types from 12 m to 20 m according to an interval of 2 m in this study (Table 1).

Model fitting Basic model (BM)
Richards' growth Equation developed from Bertalanffy's growth theory is used to describe biological growth changes over time. Growth Equations such as Monomolecular, Gompertz, and Logistic Equations are Richards' growth Equations of the special form.
Richards' Equation is widespread in forestry because of its flexibility and excellent fitting performance (Richards 1959, Liu & Li 2003, Rohner et al. 2013 where parameters A, b and c are given in Equations 4-6: (4) where is the response variable describing the change in biomass with tree age (t), A is the asymptote of the maximum parameter for tree growth, b is the growth rate parameter that indicates the rate a tree approaches its asymptotic biomass, c is the parameter related to m, m is the power exponent of anabolism, η is the anabolism constant, and β is the catabolism constant. Parameter A in Equation 3 is the most unstable parameter; thus, a transformation of the Equation is constructed to solve this problem by using the parameter a, which is an expected-value parameter when t = t 0 to replace parameter A (Fang & Bailey 2001). Therefore, we finally selected the transformation of Richards' Equation to construct a basic biomass growth model. The Equation is shown in Equation 7.
where y is the organ biomass of an individual tree, a is the progressive parameter of organ biomass growth, b is the rate of growth, c is the curve shape parameter, t is the tree age, and t 0 is a fixed reference age that may be fixed at any positive value according to the specific research situation (Fang & Bailey 2001). Its value was set at 20 years in this study.

Mixed-effect model without fixed effects from competition factors (MEM)
Forestry growth and yield data are affected by the sampling region (e.g. different regions may have different site conditions) and the correlation among the trees in the same sampling position. The data from within a sampling unit is dependent; thus, autocorrelation and heterogeneity are common in these data (Gregorie 1987). The mixed-effect modelling technique can partly remove the negative impact of heterogeneity and autocorrelation within-plots by using variance (e.g. power, exponential function) and covariance (e.g. timeautocorrelation function) structure. This technique can also explain the plot parameter variability by selecting appropriate covariates (Fang & Bailey 2001). In our study, site quality was taken as a random effect to select the mixed parameters, and power and exponential functions were used for describing the variance structures. Three time-autocorrelation functions, including the autoregressive time correlation structure with order 1 (AR(1)), continuous time AR(1) structure (CAR(1)) and autoregressive moving-average structure with both order 1 (ARMA(1,1)) function, were used to describe covariance structures. The mixed parameter selection to determinate the model forms, variance and covariance structure were all given by the research of Pinheiro & Bates (2000).

Mixed-effect model with fixed effects from competition factors (MEMC)
Based on the MEM, competition factors were considered fixed effects to input the Equation parameters of the MEM, and both CI and the quadratic effect (CI 2 ) are incorporated into the models.

Model evaluation
In this study, Log likelihood (logLik), Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) were used to evaluate the model fitting results (Equations 8-10). (8) where is the maximum likelihood estimation of θ for the Likelihood function of model , x is the random sample, q is the number of unknown parameters, and n is the sample capacity.
Moreover, the sum relative error (RS), mean relative error (EE), absolute mean relative error (RMA) and prediction precision (P) were used to test the model prediction performance (Equations 11-14).
(11) (12) (13)   (14) where y i is the measured value, is the estimated value, is the mean value of the estimated value, t a is the distribution value of t (when confidence level is a = 0.05), N is the sample capacity, and T is the parameter number of the regression curve Equation.

Mixed parameter selection
The fitting results of different parameter combinations are listed in Table 2. Models with mixed parameters have lower AIC and BIC and higher logLik values than the basic model, referred to as the transformation of Richards' Equation (BM). The optimal fitting result emerges when the parameter a is regarded as the mixed parameter (AIC=1305.367, BIC=1318.189, logLik= -647.684). The mixed-effect model is listed in Equation 15. (15) where y is the above-ground biomass, a is the progressive parameter of organ biomass growth, ua is the parameter for the random effect from the site index, b is the growth rate of AGB, t is the tree age, t 0 is the standard age with a value of 20 years, and c is the shape parameter.

Variance and covariance structure selection
A comparison of the variance and covariance structure fitting results of the MEM are listed in Table 3. The optimal fitting result emerged when the power function is regarded as a variance structure but none are regarded as covariance structures (AIC=1224.4, BIC=1239.8, logLik=-606.2).

Basic model construction
Based on the MEM, when parameter a is regarded as a random effect and the basic individual tree competition index is regarded as a fixed effect, the optimal model form is listed in Equation 16. Comparisons of different effects in the AGB growth mixed-effect model are listed in Table 4. The optimal fitting results emerge in the mixed-effect model with competition factors and site index (MEMC) (AIC=1298.1, logLik=-640.1). The fitting results of parameters are listed in Table 5. Parameter b is extremely significant at the p=0.01 level.
where y is the above-ground biomass, CI is the competition index, CI 2 is the quadratic effect of CI, a is the progressive parameter of organ biomass growth, ua is the parameter for random effects from the site index, b is the growth rate of AGB, b1 and b2 are the estimated parameters of the fixed effect from and to parameter b, respectively, t is the tree age, t 0 is the standard age with a value of 20 years, c is the shape parameter, and c1 and c2 are the estimated parameters of the fixed effect from CI and CI 2 to parameter c, respectively.

Variance and covariance structure selection
The fitting results of the variance and covariance structures are listed in Table 6. The optimal fitting result emerged when the power function is regarded as a variance structure but none are regarded as covariance structures (AIC=1230.2, BIC=1250.7, logLik=-607.1). The optimal fitting results of the mixed-effect model taking the competition factor as a fixed effect are listed in Table 7, and the optimal model form is shown in Equation 17. where y is the above-ground biomass, a is the progressive parameter of organ biomass growth, ua is the parameter for random effects, is the growth rate of AGB, t is the tree age, t 0 is the standard age with a value of 20 years, c is the shape parameter, c1 is a mixed parameter of parameter c, c2 is a mixed parameter of parameter c, CI is the basic competition index, and CI 2 is the quadratic effect of CI.

Model evaluation
The final fitting results of BM, MEM and MEMC are listed in Table 8. The parameter a did not differ significantly between the three models due to overlapping intervals of the estimated value, although BM has the lowest value with 61.799, and the parameters b and c have significant differences among the three models. MEM has the optimal fitting performance because of the lower values of AIC and BIC and the highest value of logLik (Table  8), but the difference between the MEM and MEMC is not significant because of the higher p value (p value = 0.113) according to the likelihood ratio test (LRT). While MEMC has the minimum value of EE (6.45%) and RMA (54.34%), and the highest value of prediction precision P (86.08%), BM has the minimum value of RS (7.71%) (Table 8). Moreover, the heteroscedasticity of the residual is not found in either mixed-effect models, but it is obvious in the BM, and the MEMC has the narrowest interval of standardised residual (Fig. 4). Therefore, the MEMC is the optimal model for the individual tree AGB growth of Simao pine.

Discussion
The different combinations of mixed parameters were selected, and the mixed model with mixed parameter a was considered as the optimal basic model because of the low AIC (Table 2). Parameter a represents the estimated asymptotic biomass at a standard age; however, trees have different growth potentials at the same standard age depending on the site quality. Many factors would influence tree maximum diameter at an equal basal age, such as water-holding capacity, elevation, and slope (Rohner et al. 2013). Thus, selecting parameter a as a mixed parameter in the model can improve the estimation and indicates that the site quality has a significant effect on the age-biomass relationship. Site quality had important effects on tree growth (Robichaud & Methven 1993). Past studies generally used site characteristics (a plot-level variable) as variables for biomass growth models and showed that site characteristics had a significant effect on tree growth (Lee et al. 2004). Westfall (2016) reported that mixed models, including plot random effects, can reduce prediction bias and variance for populations to a great extent compared to fixed effect models; Lhotka & Loewenstein (2011) reported analogous results. Similarly, our study found that mixed-effect models, including the random effects of site quality, are better than basic models (i.e. higher logLik and lower AIC and BIC), and incorporating random effects can improve the biomass growth model for Simao pine. Our results are consistent with previous studies for both static models (Subedi & Sharma 2011;Rohner et al. 2013;Ou et al. 2016;Chen et al. 2017;Huff et al. 2018), and growth models (Budhathoki et al. 2008;Timilsina & Staudhammer 2013;Westfall 2016). Moreover, the competition factor is an important predictor variable for the individual tree model because it intensively affects tree growth (Lhotka & Loewenstein 2011). In our study, MEMC had a better estimation due to incorporating the competition factor as a fixed effect compared with MEM, and the predictive ability was significantly improved (i.e. largest P and smallest EE and RMA). Specifically, the predictive accuracy increased to 86%. Thus, it can obviously improve the mixed model predictive accuracy if competition is regarded as an independent variable, and it also indicates that tree competition is the critical element for predicting individual tree biomass growth. Therefore, mixed-effect models have been used in forestry because of their superior fitting and prediction accuracy compared with traditional models (Huff et al. 2018). The mixed-effect model, only including the random effects of site quality, can greatly improve the fit performance of the model, and the mixed-effect model incorporating competition factors can further improve the prediction ability. In addition, autocorrelation among measurement data may result in biased estimates of the model parameter for biological data (Budhathoki et al. 2010). The mixed-effect model was then introduced to address this challenge by defining the variance and covariance structures of random effects in parameter estimations (West et al. 2007;Smith et al. 2014;De-Miguel et al. 2014;Njana et al. 2016). We also found that our prediction model had issues with heteroscedasticity which was corrected by using the power variance function (the optimal variance function) in the estimation process (Budhathoki et al. 2008). Therefore, the mixed-effect model had a good fitting performance which is consistent with previous studies (Subedi & Sharma 2011;De-Miguel et al. 2014). In contrast, all of the time-autocorrelation covariance structures in our study did not improve the model performance which indicates that the estimation models considering time autocorrelation cannot improve fitting. This might be attributable to the biomass sample data without a time series (Eisfelder et al. 2017). However, it is unrealistic and impossible to obtain the time series data of the biomass because destructive sampling was performed to collect biomass data (Temesgen et al. 2015). Therefore, we investigated sample trees with different ages (from 8 years to 80 years) in different locations to replace the time series data. The AGB change rule along with the tree ages can be explained by using the mixed-effect model considering site quality and competition factors. Thus, the investigated methods for estimating the AGB growth would be reasonably practical. Additionally, we did not consider spatial autocorrelation among trees in the same plot because Simao pine is an intolerant tree species and because sampling trees of different ages occurs at distant locations.
Furthermore, the data characteristics of the independent variables are important for selecting the form of mixed effects. A random effect is applicable if the variable is a grouped one, but a fixed effect is appropriate if the data are a continuous variable in a mixed-effect model (Pinheiro & Bates 2000). In this study, the site index and competition index were incorporated into the mixed-effect models as random and fixed effects, respectively. The site index (SI) is often used as grouped data in forestry. SI tables of the dominant tree species are established to predict potential productivity of forestland; SI is based on the dominant height classes with a 2-m interval at a standard stand age (Meng 2006;Duan et al. 2009 ). Thus, it was considered as a random effect into the mixed-effect model in this study. The competition index is a continuous variable, and it was calculated using the distances between each sample tree and its neighbouring competing trees and their DBH in this study (Hegyi 1974). Thus it was incorporated into the mixed-effect model as a fixed effect.
In addition, the national continuous forest inventory of China (NCFI) is being conducted using permanent plots and carried out every five years to reflect the dynamic change of forest resources at the national scale since 1987 (Kang 2011 calculated forest biomass carbon using the NCFI database by the variable biomass expansion factor method, with estimation error of less than 3% at the regional and national scale. At the stand level, the location of each tree in each permanent plot and the height of the stands had been recorded (Kang 2011). Therefore, the CI of each tree can be calculated and the SI of the plots can be found by looking up the SI tables of the different dominant tree species. So the comprehensive estimation method has the potential to improve biomass growth estimation and reflect the dynamical change of stand biomass.

Conclusions
To improve estimation of individual tree AGB growth, a nonlinear mixed-effect model was developed by incorporating random effects of site quality and fixed effects of competition factors based on a transformation of Richards' function. We found that the mixed-effect model was significantly better than the BM because of its better fitting and prediction performance, and the MEMC is the optimal estimation model due to its highest prediction precision. Therefore, comprehensive biomass growth estimation considering the site index and competition index can be used to predict the AGB growth of Simao pine trees, and it is a potential method for AGB growth estimation of other tree species.