Multilevel Modeling

Characterizing Change

Spring 2026 | CLAS | PSYC 894
Jeffrey M. Girard | Lecture 12a

Roadmap

  1. Linear Trajectories
    • Structuring Time
    • Growth Modeling
  2. Nonlinear Trajectories
    • Transforming Time
    • Instantaneous Change

Linear Trajectories

Fluctuation vs. Growth

Modeling the Trajectory

  • In Week 11, we treated the baseline as stable. We were most interested in the observation-specific fluctuations around that baseline.
  • Moving forward with longitudinal change models, we are now most interested in the shape of change over the entire assessment period.
  • We want to know the starting point, the overall trajectory, and the overall shape (e.g., linear, quadratic, or logarithmic).
  • To capture this shape, we simply add a time variable (or variables) to the model as a Level 1 predictor.
  • Random intercepts will allow each participant to have their own starting point and random time slopes will allow each to have their own trajectory.

Centering Time

  • The intercept represents the expected outcome when all predictors equal zero. Therefore, we need to be mindful about what zero represents for time.
  • If we study grades 9 through 12, then using the numbers 9 to 12 will result in an intercept corresponding to grade 0, which isn’t very meaningful here.
  • We usually center time so that 0 and the intercept represent a meaningful starting point such as the first or baseline assessment in the study.
  • We can re-code time (by subtracting 9) so that 0 and the intercept represent grade 9, and 1, 2, and 3 represent grades 10, 11, and 12, respectively.
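The grade re-coding above can be sketched in R (using a small hypothetical grade variable, not our example data):

```r
library(tidyverse)

# Hypothetical data: four annual assessments in grades 9 through 12
grades <- tibble(grade = c(9, 10, 11, 12))

# Center so that time = 0 represents grade 9 (the baseline assessment)
grades <- grades |>
  mutate(time = grade - 9)

grades$time
#> [1] 0 1 2 3
```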

Scaling Time

  • The slope of a predictor represents the expected change in the outcome for a 1-unit increase in that predictor. Therefore, we need to be mindful about what a 1-unit increase actually represents for time.
  • If a 10-year study records time in days (from 0 to 3650), the slope will represent daily growth. The resulting coefficient can be confusingly small and hard to interpret (e.g., a gain of 0.002 per day).
  • We usually scale time so that 1 represents a meaningful, intuitive interval like “one year” or “one academic semester”.
  • We can re-code time (by dividing days by 365) so that 1 represents a full year. The slope now reflects a much more intuitive gain of 0.73 per year.
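As a quick sketch of this rescaling (with hypothetical day values, not our example data):

```r
library(tidyverse)

# Hypothetical 10-year study with time recorded in days
study <- tibble(day = c(0, 365, 730, 3650))

# Rescale so that a 1-unit increase in time represents one year
study <- study |>
  mutate(time = day / 365)

study$time
#> [1]  0  1  2 10
```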

Example Data

library(tidyverse)
vocab_raw <- read_csv("vocab_sim.csv")
glimpse(vocab_raw)
Rows: 900
Columns: 5
$ subject    <dbl> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, …
$ program    <chr> "B", "B", "B", "B", "B", "B", "A", "A", "A", "A", "A",…
$ age_months <dbl> 120, 132, 144, 156, 168, 180, 120, 132, 144, 156, 168,…
$ vocab      <dbl> 45.38913, 59.32121, 72.81737, 82.83264, 88.51938, 93.2…
$ hours_read <dbl> 3.399861, 2.572877, 1.571513, 2.955250, 3.413900, 3.60…

Each of 150 subjects was assessed for vocabulary size annually. Notice that our time variable is age_months, starting at 120 months (10 years old).

Preparing the Time Variable

If we use age_months as our predictor, our intercept will represent vocabulary at an age of 0 months, and our slope will represent month-to-month growth. We need to fix this!

vocab <- vocab_raw |> 
  mutate(
    # Center at 120 months (Baseline), then scale by 12 (Years)
    time = (age_months - 120) / 12
  ) |> 
  print(n = 3)
# A tibble: 900 × 6
  subject program age_months vocab hours_read  time
    <dbl> <chr>        <dbl> <dbl>      <dbl> <dbl>
1       1 B              120  45.4       3.40     0
2       1 B              132  59.3       2.57     1
3       1 B              144  72.8       1.57     2
# ℹ 897 more rows

Multilevel Equation (Linear)

Level One (Observation)

\[y_{ti} = \beta_{0i} + \beta_{1i} T_{ti} + \color{#4daf4a}{e_{ti}}\]

Level Two (Cluster)

\[\begin{align} \beta_{0i} &= \color{#377eb8}{\gamma_{00}} + \color{#e41a1c}{u_{0i}} \\ \beta_{1i} &= \color{#377eb8}{\gamma_{10}} + \color{#e41a1c}{u_{1i}} \end{align}\]

Notice that time is the only predictor in the model. We are just defining the average starting point (\(\color{#377eb8}{\gamma_{00}}\)) and the average rate of linear change (\(\color{#377eb8}{\gamma_{10}}\)), along with how much different people vary around those averages.

Multilevel Equation Part 2

Level One (Observation)

\[ \color{#4daf4a}{e_{ti}} \sim \text{Normal}(0, \color{#4daf4a}{\sigma}) \]

Level Two (Cluster)

\[ \begin{bmatrix}\color{#e41a1c}{u_{0i}}\\ \color{#e41a1c}{u_{1i}}\end{bmatrix} \sim \text{MVN}\left(\begin{bmatrix}0\\0\end{bmatrix}, \begin{bmatrix}\color{#e41a1c}{\tau_{00}} & \\ \color{#e41a1c}{\tau_{10}} & \color{#e41a1c}{\tau_{11}}\end{bmatrix}\right) \]

  • \(\color{#4daf4a}{\sigma}\): The within-person spread around their own trajectory.
  • \(\color{#e41a1c}{\tau_{00}}\): The between-person spread in starting points at baseline.
  • \(\color{#e41a1c}{\tau_{11}}\): The between-person spread in the rate of growth over time.
  • \(\color{#e41a1c}{\tau_{10}}\): The covariance between a person’s starting point and rate of growth (usually reported by software as a correlation).
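To convert the off-diagonal element into the correlation that most software reports (assuming \(\tau_{00}\) and \(\tau_{11}\) denote variances):

\[ \text{Cor}(u_{0i}, u_{1i}) = \frac{\tau_{10}}{\sqrt{\tau_{00}\,\tau_{11}}} \]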

Path Diagram

Formula

\[y_{ti} = \underbrace{\color{#377eb8}{\gamma_{00}} + \color{#377eb8}{\gamma_{10}} T_{ti}}_{\text{Fixed}} + \underbrace{\color{#e41a1c}{u_{0i}} + \color{#e41a1c}{u_{1i}} T_{ti}}_{\text{Random}} + \color{#4daf4a}{e_{ti}}\]

Generic

y ~ 1 + T + (1 + T | person)

where the 1 + T part is fixed
and the (1 + T | person) part is random

Example

vocab ~ 1 + time + (1 + time | subject)

Fitting the Linear Model

library(glmmTMB)   # for glmmTMB()
library(easystats) # for model_parameters(), estimate_relation(), etc.

fit_linear <- glmmTMB(
  formula = vocab ~ 1 + time + (1 + time | subject),
  data = vocab
)
model_parameters(fit_linear, effects = "fixed")
# Fixed Effects

Parameter   | Coefficient |   SE |         95% CI |     z |      p
------------------------------------------------------------------
(Intercept) |       53.33 | 0.56 | [52.23, 54.42] | 95.54 | < .001
time        |        8.06 | 0.24 | [ 7.59,  8.54] | 33.27 | < .001
  • (Intercept) (\(\gamma_{00}\)): The average expected vocabulary score at Baseline.
  • time (\(\gamma_{10}\)): The average expected annual change in vocabulary score.
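For example, plugging the fixed effects into the combined equation gives the expected score for the average subject at Year 3:

\[\hat{y} = \gamma_{00} + \gamma_{10}\,T = 53.33 + 8.06 \times 3 \approx 77.51\]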

Interpreting the Linear Model

model_parameters(fit_linear, effects = "random")
# Random Effects

Parameter                     | Coefficient | 95% CI
----------------------------------------------------
SD (Intercept: subject)       |        5.24 |       
SD (time: subject)            |        2.59 |       
Cor (Intercept~time: subject) |        0.68 |       
SD (Residual)                 |        6.06 |       
  • SD (Intercept) (\(\tau_{00}\)): The variation in starting scores among subjects.
  • SD (time) (\(\tau_{11}\)): The variation in the rate of linear growth among subjects.
  • Cor (Intercept~time) (\(\tau_{10}\)): The correlation between starting points and growth rates. Here it is positive (0.68), suggesting that those who start higher also tend to grow faster.

Visualizing Fixed Effects

# Plot the growth curve for the average subject
estimate_relation(fit_linear, by = "time") |> plot()

Visualizing Trajectories

# Plot the growth curves for six randomly selected subjects
estimate_relation(fit_linear, by = c("time", "subject=[sample 6]")) |> plot()

Linear Slopes

In a linear model, the growth rate stays constant over time. Here, vocab growth is always 8.06 per year with no acceleration or deceleration possible.

estimate_slopes(fit_linear, trend = "time", by = "time")
Estimated Marginal Effects

time | Slope |   SE |       95% CI |     z |      p
---------------------------------------------------
0    |  8.06 | 0.24 | [7.59, 8.54] | 33.27 | < .001
1    |  8.06 | 0.24 | [7.59, 8.54] | 33.27 | < .001
2    |  8.06 | 0.24 | [7.59, 8.54] | 33.27 | < .001
3    |  8.06 | 0.24 | [7.59, 8.54] | 33.27 | < .001
4    |  8.06 | 0.24 | [7.59, 8.54] | 33.27 | < .001
5    |  8.06 | 0.24 | [7.59, 8.54] | 33.27 | < .001

Marginal effects estimated for time
Type of slope was dY/dX

Nonlinear Trajectories

Beyond Linear Growth

  • Linear models assume a constant rate of change. They force a straight line through the data.
  • However, human behavior, learning, and biological processes rarely grow at exactly the same speed forever.
  • To allow our trajectories to bend and take on nonlinear shapes, we simply apply mathematical transformations to our time variable.
  • Instead of predicting the outcome from raw Time alone, we can predict it from the Square of Time, the Logarithm of Time, the Exponential of Time, etc.

Visualizing the Shapes

  • Quadratic: acceleration, deceleration, or reversal (e.g., marital satisfaction)
  • Logarithmic: rapid early growth that plateaus (e.g., therapy response)
  • Exponential: slow early growth that compounds (e.g., viral outbreak)
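Each shape corresponds to a simple transformation of the time variable. A minimal sketch on a 0-to-5 year grid (the +1 inside log() avoids taking the log of zero at baseline):

```r
library(tidyverse)

# Transformations that produce the three nonlinear shapes
shapes <- tibble(time = 0:5) |>
  mutate(
    quadratic   = time^2,         # acceleration, deceleration, or reversal
    logarithmic = log(time + 1),  # rapid early growth that plateaus
    exponential = exp(time)       # slow early growth that compounds
  )
```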

Returning to Our Data

  • Let’s apply this theory to our vocab dataset. We are modeling vocabulary growth in children from Baseline (Age 10) to Year 5 (Age 15).
  • Before we just test every math function in R, we should ask:
    Which shapes make theoretical sense for this process?
  • Exponential: Unlikely. Human vocabulary does not compound toward infinity; there are natural cognitive and linguistic ceilings.
  • Quadratic: Possible. Growth might be fast early on, then slow down significantly as children get older, and possibly even reverse in old age.
  • Logarithmic: Highly likely. Language acquisition often follows a rapid early learning phase that gradually plateaus as ceilings are approached.

Our Testing Plan

  • Based on our developmental theory, we will drop the Exponential model.
  • We will formally test a Quadratic model and a Logarithmic model against our standard Linear baseline to see which shape best fits the actual data.
  • Let’s start by building the Quadratic model. We need to add a new predictor to our equation: Time-Squared.

Quadratic Equations

Level One (Observation)

\[y_{ti} = \beta_{0i} + \beta_{1i} T_{ti} + \beta_{2i} T_{ti}^2 + \color{#4daf4a}{e_{ti}}\]

Level Two (Cluster)

\[\begin{align} \beta_{0i} &= \color{#377eb8}{\gamma_{00}} + \color{#e41a1c}{u_{0i}} \\ \beta_{1i} &= \color{#377eb8}{\gamma_{10}} + \color{#e41a1c}{u_{1i}} \\ \beta_{2i} &= \color{#377eb8}{\gamma_{20}} + \color{#e41a1c}{u_{2i}} \end{align}\]

Orthogonal Polynomials

  • Using raw polynomials can introduce multicollinearity problems when \(T\) and \(T^2\) are highly correlated.
  • The Solution: We use Orthogonal Polynomials via the poly() function to transform our time variable into fully uncorrelated linear and quadratic components.
  • The Catch: The components are centered and re-scaled, so the coefficient values cannot be interpreted normally.
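A quick check illustrates both the problem and the fix (using a 0-to-5 time grid):

```r
# Raw time and time-squared are nearly collinear
t <- 0:5
round(cor(t, t^2), 2)
#> [1] 0.96

# The components returned by poly() are uncorrelated by construction
op <- poly(t, 2)
round(cor(op[, 1], op[, 2]), 2)
#> [1] 0
```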

Quadratic Modeling

fit_quad <- glmmTMB(
  formula = vocab ~ 1 + poly(time, 2) + (1 + poly(time, 2) | subject),
  data = vocab
)
model_parameters(fit_quad, effects = "fixed")
# Fixed Effects

Parameter         | Coefficient |    SE |            95% CI |      z |      p
-----------------------------------------------------------------------------
(Intercept)       |       73.48 |  0.90 | [  71.71,  75.25] |  81.49 | < .001
time [1st degree] |      413.03 | 12.42 | [ 388.70, 437.37] |  33.27 | < .001
time [2nd degree] |     -107.38 |  4.65 | [-116.50, -98.26] | -23.08 | < .001

The intercept is the overall average vocabulary score across all time points (due to centering). The significant and positive 1st degree term indicates the overall average growth trend is increasing. The significant and negative 2nd degree term indicates the trajectory is curved downwards (growth is slowing).

Quadratic Visualization

# Plot the growth curve for the average subject
estimate_relation(fit_quad, by = "time") |> plot()

Quadratic Visualization

# Plot the growth curve for six randomly selected subjects
estimate_relation(fit_quad, by = c("time", "subject=[sample 6]")) |> plot()

Quadratic Slopes

In a quadratic model, the growth rate accelerates or decelerates by a constant amount per one-unit increase in time. Here, vocab growth decelerates by 2.87 per year (see the diff column).

estimate_slopes(fit_quad, trend = "time", by = "time") |>
  mutate(diff = diff(c(NA, Slope)))
time | Slope |   SE |             CI |     z |      p |  diff
-------------------------------------------------------------
0    | 15.24 | 0.46 | [14.34, 16.13] | 33.42 | < .001 |      
1    | 12.37 | 0.35 | [11.67, 13.06] | 34.97 | < .001 | -2.87
2    |  9.50 | 0.27 | [ 8.97, 10.03] | 35.13 | < .001 | -2.87
3    |  6.63 | 0.23 | [ 6.18,  7.07] | 29.04 | < .001 | -2.87
4    |  3.76 | 0.25 | [ 3.27,  4.24] | 15.09 | < .001 | -2.87
5    |  0.89 | 0.32 | [ 0.26,  1.52] |  2.77 |  0.006 | -2.87
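This constant deceleration follows from the raw-time parameterization of the Level 1 equation: differentiating with respect to time shows that the slope changes by \(2\beta_{2}\) for every 1-unit increase in \(T\) (here \(2\beta_2 = -2.87\), so \(\beta_2 \approx -1.44\) on the raw scale; the orthogonal coefficients reported above are on a different scale):

\[\frac{\partial y_{ti}}{\partial T_{ti}} = \beta_{1i} + 2\beta_{2i} T_{ti}\]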

Logarithmic Equations

Level One (Observation)

\[y_{ti} = \beta_{0i} + \beta_{1i} \log(T_{ti}+1) + \color{#4daf4a}{e_{ti}}\]

Level Two (Cluster)

\[\begin{align} \beta_{0i} &= \color{#377eb8}{\gamma_{00}} + \color{#e41a1c}{u_{0i}} \\ \beta_{1i} &= \color{#377eb8}{\gamma_{10}} + \color{#e41a1c}{u_{1i}} \end{align}\]

Logarithmic Modeling

fit_log <- glmmTMB(
  formula = vocab ~ 1 + log(time + 1) + (1 + log(time + 1) | subject),
  data = vocab
)
model_parameters(fit_log, effects = "fixed")
# Fixed Effects

Parameter      | Coefficient |   SE |         95% CI |     z |      p
---------------------------------------------------------------------
(Intercept)    |       47.62 | 0.54 | [46.55, 48.68] | 87.59 | < .001
time + 1 [log] |       23.59 | 0.69 | [22.24, 24.94] | 34.27 | < .001

The intercept represents the expected vocabulary score exactly at Baseline (Age 10). The significant and positive logarithmic term indicates that vocabulary size increases over time. This growth is modeled as fastest at the beginning and gradually slowing down over time.

Logarithmic Visualization

# Plot the growth curve for the average subject
estimate_relation(fit_log, by = "time") |> plot()

Logarithmic Visualization

# Plot the growth curve for six randomly selected subjects
estimate_relation(fit_log, by = c("time", "subject=[sample 6]")) |> plot()

Logarithmic Slopes

In a logarithmic model, the growth rate changes non-uniformly over time, following a pattern of diminishing returns. Here, vocab growth decelerates rapidly at first and then more gradually over time (see the diff column).

estimate_slopes(fit_log, trend = "time", by = "time") |>
  mutate(diff = diff(c(NA, Slope)))
time | Slope |   SE |             CI |     z |      p |   diff
--------------------------------------------------------------
0    | 23.59 | 0.69 | [22.24, 24.94] | 34.27 | < .001 |       
1    | 11.79 | 0.34 | [11.12, 12.47] | 34.27 | < .001 | -11.79
2    |  7.86 | 0.23 | [ 7.41,  8.31] | 34.27 | < .001 |  -3.93
3    |  5.90 | 0.17 | [ 5.56,  6.23] | 34.27 | < .001 |  -1.97
4    |  4.72 | 0.14 | [ 4.45,  4.99] | 34.25 | < .001 |  -1.18
5    |  3.93 | 0.11 | [ 3.71,  4.16] | 34.27 | < .001 |  -0.79
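The pattern in the diff column follows directly from differentiating the logarithmic Level 1 equation:

\[\frac{\partial y_{ti}}{\partial T_{ti}} = \frac{\beta_{1i}}{T_{ti}+1}\]

so the average slope at time 0 is \(23.59/1 = 23.59\), at time 1 it is \(23.59/2 \approx 11.79\), at time 2 it is \(23.59/3 \approx 7.86\), and so on.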

Model Comparison

compare_performance(fit_linear, fit_quad, fit_log, metrics = c("AIC", "BIC"))
# Comparison of Model Performance Indices

Name       |   Model |  AIC (weights) |  BIC (weights)
------------------------------------------------------
fit_linear | glmmTMB | 6323.5 (<.001) | 6352.3 (<.001)
fit_quad   | glmmTMB | 5860.0 (<.001) | 5908.0 (<.001)
fit_log    | glmmTMB | 5793.0 (>.999) | 5821.8 (>.999)

In our simulated data, the Logarithmic change model has the lowest AIC and BIC, suggesting it provides the best fit for this specific process. Note as well that the weights for both metrics are overwhelmingly in favor of this model.

What about Autocorrelation?

  • In Fluctuation models, it was important to use the AR1 or EXP covariance structure because we had no other way to explain why “today” looked like “yesterday”.
  • In Growth models, we have a systematic trend. We expect Year 2 to be related to Year 1 because they both sit on the same developmental path.
  • Usually, once the shape of the growth is correctly specified, the remaining residuals are independent enough that we don’t need the extra complexity of AR1 or EXP.
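If diagnostics did suggest leftover autocorrelation, glmmTMB can layer an AR1 structure onto a growth model. A sketch (not fit here), assuming we first create a factor-coded wave variable, since glmmTMB's ar1() requires a factor time variable:

```r
library(tidyverse)
library(glmmTMB)

# ar1() in glmmTMB requires time as a factor
vocab <- vocab |>
  mutate(wave = factor(time))

# Logarithmic growth model plus an AR1 residual structure within subject
fit_log_ar1 <- glmmTMB(
  formula = vocab ~ 1 + log(time + 1) + (1 + log(time + 1) | subject) +
    ar1(wave + 0 | subject),
  data = vocab
)
```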