Fork me on GitHub

Bayesian Kernel Machine Regression Distributed Lag Models

Ander Wilson

First, install and load the regimes package.

library(devtools)
devtools::install_github(repo="AnderWilson/regimes")
library(regimes)

Data setup

We simulate data to illustrate the required data structures and the method.

#simulate data from scenario A
dat <- simBKMRDLM(n = 200, scenario="A", sd=1, seed=1234)

The exposure data is a list. Each element of the list is a matrix of values for a single exposure. The columns of each matrix are are the ordered measurements at each time point starting with the first time point in the leftmost column. The rows are subjects. Each element of the list must have the same dimensions (same number of individuals and time points).

class(dat$x)
## [1] "list"
length(dat$x)
## [1] 2
lapply(dat$x, dim)
## $x1
## [1] 200  37
## 
## $x2
## [1] 200  37

The response variable is a vector of length n. This should be continuous. There is also a matrix of covariates that should not include an intercept.

dim(dat$z) # covariates
## [1] 200   5
length(dat$y) # response vector
## [1] 200

Estimation

The main function for estimation is bkmrdlm. We fit BKMR-DLM with a quadratic kernel. Use gaussian=TRUE for a Gaussian kernel instead of a polynomial kernel. When using BKMR-DLM for real data analysis increase the number of iterations. This may take a few minutes or longer depending the the data size and number of iterations.

fit <- bkmrdlm(y=dat$y,
                    x=dat$x,
                    z=dat$z,
                    niter=2000,
                    gaussian=FALSE,
                    polydegree=2)

Summarize results

Now display numerical results using the summary method. This reports any critical windows identified and reports estimates of the regression coefficients for the covariates.

sfit <- summary(fit)
names(sfit)
## [1] "call"         "windows"      "weights"      "sdres"        "coefficients"
sfit
## 
## Call:
## 
## bkmrdlm(y = dat$y, x = dat$x, z = dat$z, niter = 2000, gaussian = FALSE, 
##     polydegree = 2)
## 
## 
## Significant windows identified:
##                
## x1 11-28, 34-37
## x2        11-37
## 
## 
## Coefficients for covariates:
## 
##    Post. Mean   Post. SD       q2.5      q97.5 Pr>0
## z1  0.3082875 0.07793993  0.1565000  0.4601251    1
## z2  0.8848398 0.06504687  0.7520208  1.0050850    1
## z3  1.2097626 0.08163969  1.0600426  1.3738288    1
## z4 -2.7809500 0.07964868 -2.9421545 -2.6229652    0
## z5 -0.3588762 0.08295008 -0.5348712 -0.2036607    0
## 
## 
## Posterior mean residuals standard deviation: 1.022655

View weights and windows

The plot method displays the estimated weight function for each exposure. Times where the displayed 95 percent intervals do not contain zero are considered to be critical windows. The sign of the weight function does not indicate a direction of the association with the outcome. The direction of association is determined by the exposure-response function.

plot(fit)

Exposure response functions

We first illustrate the exposure-response function at the median value of all other exposures. This is one form of the main effect. The first step is to predict the exposure response function at a set of values. This may take some time. The number of points is the number of points that the function is predicted at for each exposure.

univarpred_quad <- predictUnivariate(fit, points=30) 

After prediction we can visualise the exposure-response function.

library(ggplot2)
p <- ggplot(univarpred_quad, aes(x=E, y=mean, ymin=lower, ymax=upper))
p <- p + geom_ribbon(alpha=.5) 
p <- p + geom_line(size=2) + facet_wrap(~reorder(name,m), scales="free_x")
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
p <- p + ggtitle("Main effect estimates BKMR-DLM")
plot(p)

Visualizing interaction

To visualize interactions we can plot the exposure-response function for each exposure at different quantiles of the other exposures. First we estimate the exposure-response function at five quantiles of the other exposures.

# estimate exposure response
combodat <- NULL
for(q in c(0.10,0.25,0.75,0.90)){
  for(m in 1:length(dat$x)){
    combodat <- rbind(combodat,predictUnivariate(fit, points=30,crossM=m, qtl=q))
  }
}

# some formatting
combodat <- combodat[-which(combodat$m==combodat$cross_m),]
univarpred_quad2 <- univarpred_quad
univarpred_quad2$cross <- univarpred_quad2$name
univarpred_quad2$cross_m <- univarpred_quad2$m

Next we can plot the results.

p <- ggplot(combodat)  +
  geom_line(size=1, aes(x=E, y=mean, color=as.factor(qtl))) +
  facet_grid(reorder(cross,cross_m)~reorder(name,m), scales="free_x")
plot(p)