Skip to content →

Towards efficient IGT model simulation/estimation in R/Stan

As part of my playing around with alternative objective functions for estimating reinforcement learning models of the Iowa Gambling Task (IGT), I needed a way to quickly simulate large numbers of participants. Since base R is slow, I implemented a few models in C++, but then I realized that it was pretty easy to set up an R package with Stan as the backend, so I did, or will have done in the future, but have also partly done already, now, in the present. What I wanted was a way to build models with arbitrary combinations of utility/updating/temperature functions without having to write entirely new model files.

The R package (in progress and largely undocumented/untested, caveat emptor) allows for the efficient simulation/estimation of reinforcement learning models of the Iowa Gambling Task (IGT), or n-armed bandit tasks more generally. The package implements several utility, updating, and temperature functions, and allows them to be combined arbitrarily to create new models. Both simulation and estimation use Stan, making the package very quick. The package can generate either full posterior distributions for parameters, or penalized ML estimates and CIs.

The package can be installed and loaded using the devtools package with

devtools::install_github(“areshenk/igt”, args = "--preclean”)
library(igt)

Documentation and testing aren’t done yet, but the general operation is described below.

Model Components

The package currently includes the following functions, which can be mixed and matched however the user likes. Additional functions can be added relatively easily (see below).

Utility functions

Expected Utility (EU)
u = (1 - w)\cdot \text{win} - w \cdot \text{loss}
Prospect Utility (PU)
u =  \begin{cases}        (\text{win} - \text{loss})^A & \text{win} - \text{loss} \geq 0 \\       -l \cdot |\text{win} - \text{loss}|^A & \text{win} - \text{loss} < 0    \end{cases}
Prospect Utility 2 (PU2)
u = \text{win}^A - l \cdot |\text{loss}|^A

Updating functions

Letting V_{i,n} be the value of deck i on trial n:

Decay Reinforcement Learning (DRL)
V_{i,n+1} = rV_{i,n} + \delta_{\text{choice},i}\cdot u
Delta Learning Rule (DEL)
V_{i,n+1} = V_{i,n} + \delta_{\text{choice},i} \cdot (u - V_{i,n})
Mixed Learning Rule (ML)
V_{i,n+1} = (1-d)V_{i,n} + \delta_{\text{choice},i}r[u - (1-d)V_{i,n}]

Temperature functions

Trial Dependent Choice (TDC)
\theta = \left (\frac{t}{10} \riht )^c
Trial Independent Choice (TIC)
\theta = 3^c - 1

Simulation

Decks are specified using the s4 class igt, which can be initialized using

x <- setupIgt()

by default, the function initializes a standard set of 4 decks, although custom configurations can be specified. We can get basic information about the task using

summary(x)
>                 A        B        C        D
> EV       -25.0000 -25.0000 25.00000 25.00000
> SD       135.3074 376.8892 27.52409 75.37784
> Win.Freq   0.5000   0.9000  0.60000  0.90000
numDecks(x)
> 4
numTrials(x)
> 100

Suppose that we want to simulate performance on the deck configuration using a model with a prospect utility function, and delta learning rule, and a trial independent temperature function. Then we can specify

utility     <- 'PU'
updating    <- 'DEL'
temperature <- 'TIC'
pars <- list(utility = list(A = .25, l = .25),
             updating = list(r = .75),
             temperature = list(c = 1.2))

and simulate 100 players using

sim <- simulateIGT(x, n = 100, 
                   utility, updating, temperature, 
                   pars, scale = .01, full.output = FALSE)

where scale = .01 indicates that wins and losses should be divided by 100 (as is standard), and full.output = FALSE returns only the trial choices, wins, and losses (otherwise, the function will return the deck valuations, utilities, and choice probabilities for every trial).

The object sim is a list containing arrays choice, wins, and losses.

Estimation

To fit a model to observed data, we first specify the utility, updating, and temperature functions (already done above), and create a list specifying the desired parameter bounds.

pars <- list(utility = list(A = c(0,1), l = c(0,5)),
             updating = list(r = c(0,1)),
             temperature = list(c = c(0,2)))
fit <- estimateIGT(choice = sim$choice[1,], 
                   win = sim$wins[1,], 
                   loss = sim$losses[1,], n_decks = 4, 
                   utility, updating, temperature, 
                   pars, scale = .01, reg = 1, 
                   method = 'bayes')

where reg is a regularization term. All parameters are initialized with support on the unit interval and a beta(reg,reg) prior. These raw parameters are then scaled to the width specified by the user. A value of reg = 1 is equivalent to a uniform prior, with larger values concentrating more tightly in the center of the parameter space. IGT model estimates have an annoying tendency to cluster towards the boundaries of the parameter space, owing largely to most models of the IGT being crappy models. I would generally recommend using something like reg = 2 (at least) in practice, especially for ML estimation.

The argument method = "bayes" causes the function to return a stanfit object with the full posterior samples for all parameters. If point estimates are needed, then ideally the user would still just set method = "bayes" and take the posterior means, but if speed is a concern, then method = "map" returns maximum a posteriori estimates along with (in the near future) confidence intervals. With the reg term, these can be thought of almost as l2 regularized ML estimates, with shrinkage towards the center of the parameter space.

Adding new functions

The package is designed to make it relatively simple to add new utility/updating/temperature functions. I plan to make a more complete guide in the future, but the basic steps for adding a new utility function are as follows (adding new updating/temperature functions works the same way, mutatis mutandis):

  1. Add the new utility function to inst/chunks/utilityFunctions.stan. Note that all utility functions must have the same signature! Look at the other functions to see how this works
  2. Add a new conditional statement to the end of inst/chunks/utilityWrapper.stan. Again, the function is simple, so you’ll see how it works.
  3. The package contains an internal lookup table in the form of a list called modelComponentDetails, which contains information about the number of parameters in each utility function, bounds on those parameters (if any), and the parameter names. You’ll need to add the necessary information to the list and save it to R/sysdata.rda. Note that order is important: The order of functions in modelComponentDetails must be identical to the order in inst/chunks/utilityFunctions.stan.

Then recompile/reinstall. Everything else should work more or less automatically.

Published in Blog

Comments are closed.