
Chapter 5

Nonlinear multilevel models

5.1 Nonlinear models

The models of Chapters 1-4 are linear in the sense that the response is a linear function of the parameters in the fixed part and the elements of the covariance matrix of the responses are linear functions of the parameters in the random part. In many applications, however, it is appropriate to consider models where the fixed or random parts of the model, or both, contain nonlinear functions. For example, in the study of growth, Jenss and Bayley (1937) proposed the following function to describe the growth in height of young children

$y_{ij} = \alpha_0 + \alpha_1 t_{ij} - \exp(\alpha_2 + \alpha_3 t_{ij}) + e_{ij}$    (5.1)

where $t_{ij}$ is the age of the j-th child at the i-th measurement occasion. Generalised linear models (McCullagh and Nelder, 1989) are a special case of nonlinear models where the response is a nonlinear function of a fixed part linear predictor. Models for discrete data, such as counts or proportions, fall into this category and we shall devote chapter 7 to studying these. For example, a 2-level log linear model can be written

$E(m_{ij}) = \pi_{ij}, \quad \pi_{ij} = \exp((X\beta)_{ij})$    (5.2)

where $m_{ij}$ is typically assumed to have a Poisson distribution, in this case across level 1 units. Note here that, in the multilevel extension of the standard single level model, the linear predictor contains random variables defined at level 2 or above.
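A minimal simulation sketch of such a 2-level Poisson model, using purely hypothetical parameter values and unit sizes, might look as follows in Python:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical sizes and parameters, for illustration only.
n_level2, n_level1 = 50, 20              # j = 1..50 level 2 units, i = 1..20 level 1 units each
beta0, beta1, sigma_u = 0.3, 0.5, 0.6

u = rng.normal(0.0, sigma_u, size=n_level2)             # level 2 random effects u_j
x = rng.uniform(-1.0, 1.0, size=(n_level2, n_level1))   # an explanatory variable x_ij

pi = np.exp(beta0 + beta1 * x + u[:, None])   # pi_ij = exp((X beta)_ij + u_j), as in (5.2)
m = rng.poisson(pi)                           # m_ij has a Poisson distribution across level 1 units

print(m.shape, m.mean())
```

Note that the level 2 random variable $u_j$ enters the linear predictor inside the exponential, which is the feature that distinguishes the multilevel model from its single level counterpart.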

In this chapter we consider a general nonlinear model. Later chapters will use the results for particular applications.

5.2 Nonlinear functions of linear components

The following results are an extension of those presented by Goldstein (1991) and appendix 5.1 gives details. Where the random variables are not part of the nonlinear function, the procedure gives maximum likelihood estimates (see appendix 5.1). In the case where the level 1 variation is non-Normal the procedure can be regarded as a generalisation of quasilikelihood estimation (McCullagh and Nelder, 1989) and such models are discussed in chapter 7.

Restricting attention to a 2-level structure we can write a fairly general model as follows

$y_{ij} = X_{1ij}\beta_1 + Z^{(2)}_{1ij}u_{1j} + Z^{(1)}_{1ij}e_{1ij} + f(X_{2ij}\beta_2 + Z^{(2)}_{2ij}u_{2j} + Z^{(1)}_{2ij}e_{2ij}) + \ldots$    (5.3)

where the function $f$ is nonlinear and where the $\ldots$ indicates that additional nonlinear functions can be included, involving further fixed part explanatory variables or random part explanatory variables $Z^{(1)}_{2}, Z^{(2)}_{2}$ at levels 1 and 2 respectively. The model is first linearised by a suitable Taylor series expansion and this leads to consideration of a linear model where the explanatory variables in the nonlinear component are transformed using first and second derivatives of the nonlinear function. Note that the linear component of (5.3) is treated in the standard way, and that the random variables at a given level in the linear and nonlinear components may be correlated.

Consider the nonlinear function $f(X_{2ij}\beta_2 + Z^{(2)}_{2ij}u_{2j} + Z^{(1)}_{2ij}e_{2ij})$. Appendix 5.1 shows that we can write this as the sum of a fixed part component and a random part. The Taylor expansion for the random part up to a second order approximation for the ij-th unit is as follows

$f_{ij} \approx f(H_{ij,t+1}) + (Z^{(2)}_{2ij}u_{2j} + Z^{(1)}_{2ij}e_{2ij})f'(H_{ij,t}) + (Z^{(2)}_{2ij}u_{2j} + Z^{(1)}_{2ij}e_{2ij})^2 f''(H_{ij,t})/2$    (5.4)

The first term on the right hand side is the fixed part value of $f_{ij}$ at the current ((t+1)-th) iteration of the IGLS or RIGLS algorithm, that is ignoring the random part. The other two terms involve the first and second differentials of the nonlinear function evaluated at the current values from the previous iteration. We have

$E(Z^{(2)}_{2ij}u_{2j} + Z^{(1)}_{2ij}e_{2ij}) = 0, \quad \mathrm{var}(Z^{(2)}_{2ij}u_{2j} + Z^{(1)}_{2ij}e_{2ij}) = \sigma^2_{z2u_j} + \sigma^2_{z2e_j}$    (5.5)

$\sigma^2_{z2u_j} = Z^{(2)}_{2ij}\Omega_{u2}Z^{(2)T}_{2ij}, \quad \sigma^2_{z2e_j} = Z^{(1)}_{2ij}\Omega_{e2}Z^{(1)T}_{2ij}$    (5.6)

We write the expansion for the fixed part value as

$f(H_{ij,t+1}) \approx f(H_{ij,t}) + X_{2ij}(\beta_{2,t+1} - \beta_{2,t})f'(H_{ij,t})$

where $\beta_{2,t+1}, \beta_{2,t}$ are the current and previous iteration values of the fixed part coefficients. We can choose $H_{ij,t}$ to be either the current value of the fixed part predictor, that is $X_{2ij}\beta_{2,t}$, or we can add the current estimated residuals to obtain an improved approximation to the nonlinear component for each unit. The former is referred to as a 'marginal' (quasilikelihood) model and the latter as a 'penalised' or 'predictive' (quasilikelihood) model (see Breslow and Clayton, 1993, for a further discussion). We can also choose whether or not to include the term in (5.4) involving the second derivative and we would expect its inclusion in general to improve the estimates. Its inclusion defines a further offset for the fixed part and one for the random part (see appendix 5.1). We shall illustrate the effect of these choices in the examples given in chapter 7. Further details of the estimation procedure are given in Appendix 5.1. In practice general models such as (5.1) may pose considerable estimation problems. We notice that the same explanatory variables occur in the linear and nonlinear components and this can lead to instability and failure to converge. Further work in this area is required.
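The following Python fragment sketches, for a single ij-th unit and the loglinear case $f = \exp$ (so that $f' = f'' = \exp$, see Table 5.1), how the explanatory variables of the nonlinear component are transformed at one iteration. All numerical values are hypothetical and the fragment is only an outline of the linearisation, not the full IGLS/RIGLS algorithm:

```python
import numpy as np

# Sketch of the linearisation for one unit; values are hypothetical.
f = fprime = fsecond = np.exp      # loglinear case: f' = f'' = exp

X2 = np.array([1.0, 0.4])          # fixed part explanatory variables of the nonlinear component
beta2_t = np.array([0.2, 0.5])     # previous-iteration fixed part coefficients
Z2 = np.array([1.0])               # level 2 random part explanatory variable
u_hat_t = np.array([0.3])          # current estimated level 2 residual for this unit

# Choice of working value H_t: fixed part only ('marginal'/MQL) or with
# estimated residuals added ('penalised'/PQL).
H_mql = X2 @ beta2_t
H_pql = X2 @ beta2_t + Z2 @ u_hat_t

for label, H in [("MQL", H_mql), ("PQL", H_pql)]:
    X2_star = X2 * fprime(H)       # transformed fixed part explanatory variables
    Z2_star = Z2 * fprime(H)       # transformed random part explanatory variables
    fixed_offset = f(H)            # fixed part value at the working point
    # The second derivative term of (5.4) would contribute further offsets
    # proportional to f''(H) and the current variance estimates.
    print(label, round(fixed_offset, 3), X2_star, Z2_star, round(fsecond(H), 3))
```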

Table 5.1 gives expressions for the first and second differentials for some commonly used nonlinear models.

Table 5.1 Differentials for some common nonlinear models.

Model        Function f(x)           First differential f'(x)         Second differential f''(x)
loglinear    $e^x$                   $e^x$                            $e^x$
logit        $(1+e^{-x})^{-1}$       $e^{-x}(1+e^{-x})^{-2}$          $e^{-x}(e^{-x}-1)(1+e^{-x})^{-3}$
log-log      $\exp(-e^x)$            $-e^x\exp(-e^x)$                 $e^x(e^x-1)\exp(-e^x)$
inverse      $x^{-1}$                $-x^{-2}$                        $2x^{-3}$

5.3 Estimating population means

Consider the expected value of the response for a given set of covariate values. Because of the nonlinearity this is not in general equal to the predicted value when the random variables in the nonlinear function are zero. For example, if we write the variance components model (5.2) as

$\pi_{ij} = \exp(\beta_0 + \beta_1 x_{ij} + u_j)$

and assume Normality for $u_j$ we obtain

$E(\pi_{ij} \mid x_{ij}) = \int \exp(\beta_0 + \beta_1 x_{ij} + u_j)\,\phi(u_j)\,du_j = \exp(\beta_0 + \beta_1 x_{ij} + \sigma_u^2/2)$

where $\phi$ is the density function of the Normal distribution. Zeger et al (1988) consider this issue and propose a

‘population average’ model for directly obtaining population predicted values by eliminating random variables from the nonlinear component. In general, however, this approach is less efficient when the full model with random variables within the nonlinear function is the correct model. The population predicted values, conditional on covariates, can be obtained if required, as above, by taking expectations over the population. An approximation to this can be obtained from the second order terms in (5.1.4) with higher order terms introduced if necessary to obtain a better approximation. Alternatively we may generate a large number of simulated sets of values for the random variables and for each set evaluate the response function to obtain an estimate of the full population distribution.
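A short Python sketch, with purely hypothetical parameter values, illustrates the three quantities discussed here: the prediction with the random variable set to zero, the analytic population mean $\exp(\beta_0 + \beta_1 x + \sigma_u^2/2)$, and the simulation alternative of averaging the response function over a large number of simulated values of $u_j$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values, for illustration only.
beta0, beta1, sigma_u = 0.5, 0.2, 0.8
x = 1.0

naive = np.exp(beta0 + beta1 * x)                      # random variable u_j set to zero
analytic = np.exp(beta0 + beta1 * x + sigma_u**2 / 2)  # expectation over u_j ~ N(0, sigma_u^2)

u = rng.normal(0.0, sigma_u, size=200_000)             # simulated sets of values for u_j
simulated = np.exp(beta0 + beta1 * x + u).mean()       # average of the evaluated response function

print(round(naive, 3), round(analytic, 3), round(simulated, 3))
# The simulated mean is close to the analytic value; both exceed the naive prediction.
```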

5.4 Nonlinear functions for variances and covariances

We saw in chapter 3 how we could model complex functions of the level 1 variance. As with the linear component of the model, there are cases where we may wish to model variances or covariances as nonlinear functions. In principle we can do this at any level but we restrict our attention to level 1 and to the variance only. In chapter 6 we give an example where the covariances are modelled in this way.

Suppose that the level 1 variance decreases with increasing values of an explanatory variable such that it approaches a fixed value asymptotically. We could then model this for a 2-level model, say, as follows

$\mathrm{var}(e_{ij}) = \exp(\alpha_0 + \alpha_1 x_{ij})$

where $\alpha_0, \alpha_1$ are parameters to be estimated. Such a model also guarantees that the level 1 variance is positive, which is not the case with linear models, such as those based on polynomials. The estimation procedure is analogous to that described above and details are given in Appendix 5.1.
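A small numerical illustration, with hypothetical parameter values, of why the exponential form is attractive: it remains positive for every value of the explanatory variable, whereas a quadratic variance function can become negative over part of the range:

```python
import numpy as np

# Hypothetical parameter values, for illustration only.
alpha0, alpha1 = 1.0, -0.5          # var(e_ij) = exp(alpha0 + alpha1 * x_ij)
a, b, c = 2.5, -1.2, 0.1            # a quadratic alternative: a + b*x + c*x^2

x = np.linspace(0.0, 6.0, 7)
var_exp = np.exp(alpha0 + alpha1 * x)   # always positive, decreasing towards zero here
var_quad = a + b * x + c * x**2         # dips below zero for part of the range

print(np.round(var_exp, 3))
print(np.round(var_quad, 3))
```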

5.5 Examples of nonlinear growth and nonlinear level 1 variance

We give first an example of a model with a nonlinear function for the linear component and we then consider the case of a nonlinear level 1 variance function.

We use an example from child growth, consisting of 577 repeated measurements of height on 197 French Canadian boys aged from 5 to 10 years (Demirjian et al, 1982) with between 3 and 7 measurements each. This is a 2-level structure with measurement occasions nested within children. We fit the following version of the Jenss-Bayley curve to illustrate the procedure

$y_{ij} = \alpha + \exp(\beta_0 + \beta_1 t_{ij} + \beta_2 t_{ij}^2 + \beta_3 t_{ij}^3 + u_{0j} + u_{1j}t_{ij}) + e_{0ij}$    (5.7)

so that the fixed part is an intercept plus a nonlinear component and the random part variance at level 2 is part of the nonlinear component. The results are given in table 5.2, using the first order approximation with prediction based upon the fixed part only. We shall compare the performance of the different approximations in chapter 7. The level 1 variance is small and of the order of the measurement error of height measurements. The starting values for this model need to be chosen with care, and in the present case the model was run to convergence without the linear intercept which was then added with a starting value of 100. Bock (1992) uses an EM algorithm to fit a nonlinear 2-level model to growth data from age 2 years to adulthood using a mixture of three logistic curves.
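To make the structure of (5.7) concrete, the following Python fragment simulates a single child's measurements from the model; the parameter values are purely illustrative, chosen only to give heights of a plausible magnitude, and are not the estimates of table 5.2:

```python
import numpy as np

rng = np.random.default_rng(1)

# Purely illustrative parameter values (NOT the table 5.2 estimates).
alpha = 100.0                            # linear intercept outside the exponential
beta = np.array([0.9, 0.28, 0.0, 0.0])   # beta_0..beta_3 inside the exponential
sigma_u0, sigma_u1 = 0.05, 0.01          # standard deviations of u_0j and u_1j (level 2)
sigma_e = 0.5                            # standard deviation of e_0ij (level 1)

t = np.linspace(5.0, 10.0, 6)            # ages at the measurement occasions for one child
u0 = rng.normal(0.0, sigma_u0)           # level 2 random intercept, inside the exponential
u1 = rng.normal(0.0, sigma_u1)           # level 2 random slope on age, inside the exponential

linpred = beta[0] + beta[1]*t + beta[2]*t**2 + beta[3]*t**3 + u0 + u1*t
y = alpha + np.exp(linpred) + rng.normal(0.0, sigma_e, size=t.size)
print(np.round(y, 1))                    # heights in roughly the 110-140 cm range
```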

The second example uses the JSP dataset where we studied the level 1 variance in chapter 3. We will fit model B of Table 3.1, but with the level 1 variance modelled as a nonlinear function of the 8-year score instead of as a quadratic function of it. This level 1 variance for the ij-th level 1 unit is $\exp(\alpha_0 + \alpha_1 x_{1ij})$ and table 5.3 shows the model estimates.

The estimates are almost identical to those of model B of table 3.1 as is the likelihood value. Figure 5.1 shows the predicted level 1 variance for this model and model B of Table 3.1.

