Functional brain networks I: Simulating some data

In preparation for a new fMRI dataset, I’ve been trying to set up some kind of sensible pipeline for doing general statistical modeling / machine learning on functional brain network data. This is mostly a software exercise, since I want a code base that will let me extract putative networks from BOLD data and store them in some sensible format, all more or less automatically; but it also means settling on a fairly specific processing pipeline, which means choosing the right techniques, which means doing simulation studies. So I thought I’d start by simulating a toy dataset. Since estimating the structure of the network from the BOLD signal is a major part of network analysis in general, that means simulating raw BOLD data from a known functional network.

The basic strategy is like this: First, we construct a simple network model of the dorsal and ventral visual streams in the form of a directed graph.

The solid arrows denote relationships which are always present (the basic feedforward structure), while the dashed arrows (crosstalk between the two streams) are varied parametrically to see if we can recover the true network structure. The resulting adjacency matrix \mathbf{C} looks like this

         Thalamus V1 V2 MT MST V4 IT
Thalamus        0  1  0  0   0  0  0
V1              0  0  1  0   0  0  0
V2              0  0  0  0   0  1  0
MT              0  0  0  0   1  A  0
MST             0  0  0  0   0  0  0
V4              0  0  0  0   B  0  1
IT              0  0  0  0   0  0  0

where the static connections have a constant weight of 1, and the cross-talk connections have weights \text{A} and \text{B}, which can be varied.

We simulate neural data for each vertex using a set of coupled oscillators, where the phase \theta_i of the i‘th vertex is given by the differential equation

    \[      \frac{\mathrm{d}}{\mathrm{dt}}\theta_i = \omega_i + g\sum_{j=1}^N \mathbf{C}_{ji} \sin{(\theta_j - \theta_i)} + \eta_i(t), \]

where \omega_i is the intrinsic frequency of the vertex, g controls the global connection strength, and \eta is a white noise term. We initialize the phases randomly and solve the system by Euler integration using a time step of 100 ms. For this simulation, we initialize the phases randomly and set

w_i ~ N(60, 7)
g   = 15
n_i = N(0,3)

The result, for a six second interval, looks like this

We then convolve the neural signal with a canonical haemodynamic response function

    \[     f(x) = a \left ( \frac{x^{\alpha_1 - 1}\beta_1^{\alpha_1}e^{-\beta_1 x}  }{\Gamma(\alpha_1)} -                         c\frac{x^{\alpha_2 - 1}\beta_2^{\alpha_2}e^{-\beta_2 x}  }{\Gamma(\alpha_2)}  \right ) \]

with parameters

a       = 1
alpha_1 = 16 
alpha_2 = 5 
beta_1  = 3
beta_2  = 3 
c       = 1/6

and subsample at a TR of 2 seconds to get a full 20 minutes of simulated BOLD signal (that is, 600 TRs)

We simulate 20 minutes of BOLD data for 40 subjects. Half of these subjects have zero cross-connection weights between the two streams (that is, A = B = 0 in the adjacency matrix above), while the other half have connection weights distributed as \mathrm{N}(1, .1). Additionally, these cross-stream connection weights begin at zero and only “switch-on” midway through the task, at a TR distributed as \mathrm{U}(150,350). Intuitively, perhaps these subjects adopt a strategy midway through the task that results in cross-talk between the ventral and dorsal visual streams. The goal of the exercise is, using only the BOLD signal, to detect this change in network architecture, and describe it as precisely as possible, which I plan to do over a series of posts.