This tutorial will take you through fitting a pRF encoding model (see Dumoulin and Wandell (2008) for details) to actual data. You will write all the code to do this - either in Matlab or Python (there are worked examples of code for both Matlab and Python). You can check here for some information on how pRF fitting works and introduction to a GUI implementation. Here there will be no GUI, you will code the analysis from first principles using nothing but simple matrix computations. The full tutorial takes you through fitting a pRF model to data using a grid search, plotting the data on a triangulated cortical surface, fitting with a non-linear search algorithm (nelder-mead) and then using your fit model to decode a stimulus that was never used in training. A few snapshots from the process to give you an idea of what you will get by going through the tutorial below:
This tutorial works in either Matlab or Python (but note that 1 section near the end on displaying results on surface is not yet implemented in Python - so will require more work on your part to write the python implementation!).
To setup for matlab, you just need to do the following:
git clone https://github.com/justingardner/pRFtutorial.git pRFtutorial cd pRFtutorial
To setup for Python:
git clone https://github.com/justingardner/pRFtutorial.git pRFtutorial cd pRFtutorial
# import numpy for matrix operations import numpy # import scipy for optimization routines import scipy # import scipy for the matlab reading code import scipy.io as sio # setup matplotlib import matplotlib.pyplot as plt # Load the data pRFLoad = sio.loadmat('pRF.mat') # get the stored matrices stimImage = pRFLoad['stimImage'] stimX = pRFLoad['stimX'] stimY = pRFLoad['stimY'] canonical = numpy.transpose(pRFLoad['canonical']) canonical = canonical[:,0] c = pRFLoad['c'] v = pRFLoad['v'] t = pRFLoad['t'] tSeries = pRFLoad['tSeries']; tSeries1 = pRFLoad['tSeries1']; tSeries2 = pRFLoad['tSeries2']; tSeries1 = tSeries1[:,0]; tSeries2 = tSeries2[:,0]; unknownTSeries = pRFLoad['unknownTSeries'] possibleStimImages = pRFLoad['possibleStimImages']; base2scan = pRFLoad['base2scan'] roiScanCoords = pRFLoad['roiScanCoords'];
Let's start by just taking a look at the stimulus that was presented to the subject. Remember the subject was looking at the center of the screen where there was a fixation cross on which they had to perform a task (just to keep their attentional state and fixation under control) and bars moving in different directions were shown. Each frame of the image is stored as a 2D matrix that has the pixel values for each position on the screen. For this purpose, the matrix just contains 0's and 1's. Each of these 2D frames are stacked into a 3D matrix where the third dimension is time. So, good. Let's see if we can figure out what the stimulus looked like on some arbitrary frame of the stimulus, say frame 36. You should be able to do that just by looking at the values. The variable that holds the stimulus images is called: stimImage
Probably there were too many values to look at all at once, so let's look at a subset of values between 70 and 85 and see if you can tell what the stimulus was.
Got the answer?
Ok, it might be easier to look at this as an image, so let's go ahead and display the whole thing and see what you get.
Now that you can display the image, you might want to go ahead and look at the whole sequence. You can write a for loop and display each image at a time and see what it looks like.
Good, you should have seen a bar move in various directions across the screen. That was what was shown to the subject. Run again, and count how many times a bar crossed the screen - we'll need that number as a consistency check in a following section!
Now we need to make our encoding model. In this case it is a simple gaussian receptive field. Let's start by making an arbitrary one at the location (-12, -3) degrees (i.e. 12 degrees to the left and 3 degrees below center) We also need to set the size of the receptive field, let's do one that has a standard deviation of 3 degrees. So, first, your going need to know the equation of a gaussian, right?
The x and y coordinates of every point of the stimulus image is in stimX and stimY, so using the 2D equation with those input points, you should be able to compute the gaussian receptive field.
Now that we have the receptive field and the stimulus, it's time to compute the expected “neural response”. The amount of overlap between the stimulus and the receptive will govern the neural response. (For our purposes we won't consider any of the dynamics of neural responses themselves - e.g. transients, adaptation, offset responses, etc etc) To compute the overlap, we just take a point by point multiplication of the stimulus image with the receptive fields and sum across all points. You should now try to compute that for every time point in the stimulus image.
How many peaks in this simulated neural response do you see? Does it match the number of times there was a bar moving across the visual field that you counted above?
Now we have to convert this model neural response into a model BOLD response. The model that we use for this is that the BOLD response is a convolution of a canonical hemodynamic response function with the neural response. So, go ahead and compute the convolution of the model neural response with the canonical hemodynamic response function (we have provided one in the variable called canonical
Ok. Sanity check (always good to check that things are making sense as you go along as much as possible). Plot the neural response on the same graph as the bold response and see if you they make sense. In particular, the bold response should start after the neural response and last longer, right?
Ok. Looks good? What about at the very end. Do you notice that the BOLD response lasts longer than the neural response? That's because the conv command returns a vector longer than the original response. We don't want that last piece, so remove it from the bold model response (otherwise, dimensions won't match for the calculations we do later). Also, let's take the transpose of this modelBoldResponse (also important for matching dimensions up later)
Ok. So, we can compute the model BOLD response for a specific receptive field - now we want to know how to compare that to measured responses. We're going to use the time-point by time-point correlation between the model and the measured responses. Correlation is a nice measure to work with, since it isn't affected by the magnitude of responses. Remember that we haven't really done anything to make the units of the model match the units of response (look back at the units you have for the model BOLD response - typical BOLD responses are about 1 or 2% signal change - is that the size of the fluctuations you have in the model? (no. right!). We could fit a parameter there so that we get the model output to match in magnitude to the BOLD response (how would you do that?), but here we'll just ignore the whole issue by using correlation. Ok, we have two actual time series for you to work with tSeries1 and tSeries2 (they were measured from a subject viewing the stimuli). Go ahead and compute the correlation between these time series and the measured bold response from above. What do you get? For which tSeries is this model rf a good model for?
Ok. Let's confirm that visually, by plotting the model and the time series together and see that they match. Now, we have to worry a little bit about the magnitudes which are going to be different between the model and the actual bold responses - let's handle this by normalizing the model and the bold responses to have zero mean and vector length of 1 (i.e. subtract the mean and divide by the square root of the sum of squares). This normalization is useful since later it will allow us to compute the correlation efficiently. Probably worth noting here an important maxim. Anything you do to your data - you are going to need to do to the model as well. So if we divide by the mean of the bold responses, then divide by the mean in the model as well. If you high pass filter your data, then you need to high pass filter your data. Ok. Go ahead and do that and then plot the model and the time series.
So, should be a great match for tSeries1 and a pretty bad match for tSeries2.
Ok, so of course the example above for tSeries1 was totally cooked - I gave you the parameters that would pretty closely match the response. How can we find the best parameters for tSeries2 (and in general any time series for which we don't know what the receptive field parameters are). After all, that's the whole goal of what we are doing right? To figure out what the receptive field model is that best explains the responses of each voxel. We'll first do a quick and dirty way to get close to the right parameters by doing a grid search. What's a grid search? It's just forming a grid of parameters and testing to see which model parameters on this grid give a model response that best fits the voxel. In this case, our grid will be over x and y position of the receptive field and the width (standard deviation). Let's try x and y from -15 to 15 degrees in steps of 1 and standard deviations from 1 to 5 in steps of 1. We'll start by computing receptive fields for all combinations of these parameters and putting them all into a single giant matrix of model responses. Each row of this matrix will contain a model response with one parameter combination. So, if the length of a model response in n and there are k different combinations, you should create a k x n matrix. Also, remember to keep which model parameters made which model response in the k x n matrix, so store those in another k x 3 matrix (the three is for the three different parameters). Got it?
Ok, now that we have these matrix of modelResponses for a grid of different parameter settings, let's do some sanity checks to make sure we have what we think we have (As you see, I'm a big believe in sanity checks!). First, plot a few model responses. Do they have 8 peaks as you expect? Second, remember that tSeries1 was best correlated with the rf we calculated above for [-12, -3] and a standard deviation of 3? That should be true here to. First, go find which modelNum has those parameters.
Now compute the correlation between tSeries1 and tSeries2 for that modelRow number, you should be able to replicate the correlation numbers from above. Note that you will need to take the transpose of the modelResponses to make the dimensions work out correctly.
Everything check out? Good. Now let's find a better set of parameters for tSeries2. We can do this by computing the correlation of tSeries2 against every model in the matrix. We could build a for loop to do this, but it's actually much simper then that. Correlation is computed as the dot product of two unit length vectors (remember I told you that normalization was going to be useful?) So, all we have to do is compute the matrix product of the modelResponses matrix with the tSeries and we will get all the correlation values we need. So, let's try first with tSeries1 and confirm we get back the model that we already know is best. Find the correlation that is maximum - this should be the one with parameters [-12, -3] and standard deviation of 3.
If that worked, then try it with the other time series, tSeries2, and you should get a model with good fit. Go ahead and plot that best model against tSeries2 to confirm that is the case.
Ok, so that's a pretty rough and ready way to fit (using a grid search - we'll get more precise values in a later section), but it's good enough that we can go ahead and fit a whole bunch of voxels and plot them to see if we can see maps of visual cortex. To start with let's get the best model fit for all the tSeries that we have. It's real simple, just do the matrix multiplication against the modelResponses matrix to get the correlation, and then use max to figure out which modelResponse each tSeries gives the best fit. The hardest part is simply getting the matrix dimensions correct. modelResponse is a m x n matrix (with m models and n volumes), and tSeries is v x n (v voxels by n volumes), so to take the matrix multiplication you need to take the transpose of tSeries. This will create a k x v matrix where k is the number of models and v is the number of voxels. You want to take the max of this matrix along the model responses dimension - i.e. the largest correlation for each voxel, which is the first dimension. If you accept two arguments from the max operator it will give you both the maximum correlation (a number that is useful for evaluating model fit) and the index of which model produced that maximum value. Try it out.
Ok. You guessed it - now we need to do a sanity check to make sure that we got the right thing. We know that the best model for tSeries1 was 528, so let's make sure that's true for this computation to. In the tSeries matrix, the tSeries1 is the 134 response. So check maxR and bestModel and make sure that you get the expected values.
And another sanity check. Receptive fields in primary visual cortex should all be contralateral. The set of voxels you just fit were all in the right visual cortex - so let's make sure that the receptive fields found were all in the left visual field. To do that, let's figure out how to plot the contour of a receptive field. Remember the equation we used for the gaussian? Let's plot a set of x,y points for the one standard deviation contour. How do we do that? We can do that by finding all the y vales for a set of x values using the equation. So, take all x values from -std to +std and find the appropriate y-values. You will need to rearrange the gaussian equation to get y as a function of x.
Got it? Then start by plotting a gaussian contour with center at, say, [-5, +3] and std = 3.
You should see something that looks like a circle. If you got that, then now go ahead and use that code to plot all the contours for all the fit receptive fields.
Does it look like most of the receptive fields are in the left visual field, as expected?
It's close, but not perfect - there are a few weird ones in the right. That could be because these are not fit well and they are just due to noise. So, check that (again, a good sanity check!). Modify your code for plotting the pRF contours so that it only plots pRFs that have a correlation of greater than, say, 0.4 (remember that we calculated the best correlation above and left it in the variable maxR). Are the receptive fields now in the correct visual field?
Cool. One last thing to check here, if it really is the case that those pRFs with low correlation are bad fits, then you should be able to go find one of those voxels with a low r and plot it against the model time course and see that it doesn't look good.
And, there's a pretty important lesson here (which is sometimes forgotten). If your fit is bad (in this case low correlation) then the parameters you fit cannot be interpreted! So, always (whether you are doing an encoding model or a GLM or whatever) - you have to check goodness of fit before trying to extrapolate anything from what parameters you got. To use technical terms, if goodness-of-fit is crap then your parameter estimates will be crap too.
Now, let's see what this analysis looks like on the cortex. Let's project these values on to a cortical surface and take a look. First, we need to know how to draw a cortical surface. Take a quick look at surfaces and flat maps and make sure you understand the idea of triangulated meshes. Ok. Are you an expert now? Let's try and display a surface. We have the surface vertices in the variable v, the triangles (or faces) in t, and the colors for each vertex in c. You should be able to use the patch command, as follows:
clf % draw the surface patch('vertices',v,'faces',t,'FaceVertexCData',c,'facecolor','interp','edgecolor','none') % set the axis to make it display nicely axis equal axis off
If you click the circular arrow tool in the figure, you should be able to use the mouse to turn the surface around to view it from any angle you like.
Ok, but how do we plot the values from the pRF fit on to this surface. Let's first do this for the correlation values (that way we can see what part of cortex is the best fit. We are going to need to do a coordinate transform to know how to put the values from voxels take on the PRF scan into the coordinates of the surface. Take a quick look at this page on coordinate transforms to get oriented to how this works. Ok, you back? Then your should have learned that you can apply a 4×4 matrix transformation to coordinates that have a one added to them to go from one voxel coordinates to another. In this case, we will go from the coordinates of the surface (which comes from the voxels of the original 3D canonical anatomy) to coordinates of the pRF scan (note that when you want to color the vertices of the surface, you want to go in this direction and not from the scan to the base coordinates - since multiple base coordinates may correspond to a single scan coordinate - since the voxels are bigger in the scan then in the 3D canonical). Ok, so let's try that once and see how it goes. Pick any vertex (remember that the vertices of the surface are in the coordinate frame of the canonical scan) and convert it into coordinates of the scan. The transform matrix is in the variable base2scan. Remember that order matters here (the xform matrix comes first before the coordinates). And make sure you get the dimensions right (i.e. the xform is a 4×4, so that coordinate has to be a 4×1)!
Now that we know where it lives in the scan, we need to figure out if it is close to any of the voxels that we have computed the pRF for. To do that, compute the euclidean distance from each one of the roiScanCoords and see which one it is closest to. Note that roiScanCoords is a 3xr matrix where r is the number of voxels in the roi.
Ok. Good, now we know how to find which voxel is closest, we should be able to set the color of that vertex accordingly . Let's choose a value from the hot colormap according to the r value. To do this, we get the hot colormap for 256 colors - and then map the r value (0-1) into the range `1:256. Try that out.
Got it? Then let's put this all together. Compute a new colormap for the surface in which you put the appropriate hotmap colors in according to the max correlation value for the corresponding voxel from the pRF. Also, if the distance between the vertex and the pRF scan coordinate is too far away, just use the original grayscale curvature value.
Often people like to look at the polar angle of the rf fits - that is the angle from the fixation that goes through the center of each receptive fields. To get this value, we need to compute it from the parameter fits. Just requires a little trigonometry (you remember trigonometry right? soh-cah-toa) Now get the parameter estimates for each voxel and compute the polar angle for each voxel. (Note that to make this work you may need to special case a few different angles)…
Now that we have the polar angle for each voxel, put them on the surface just like you do for the r values - this time scaling the range -pi to pi between 1 and 256. You may want to use a different colormap - like hsv so that the polar angles are easy to distinguish
Sanity check! Does it look like the upper visual field is on the lower bank of the Calcarine and the lower visual field is on the upper bank?
Ok, so now that we have gone through the whole pRF encoding model fit, let's go back to our quick and dirty grid search fit and make that better. Of course, you could make a finer and finer grid search until the grid get's so fine that it's close enough to the optimal parameters for any given voxel, but that would take a lot of memory to hold (though sometimes this approach may be the fastest and easiest way to go - and could be done progressively - each time getting close to the real parameters and then refining the grid around the current best parameters. Here we will use a different approach in which we use a non-linear fitting algorithm. In general there is no non-linear algorithm that will always work optimally, it really depends on your problem and the landscape of the error space. Are there lots of local minima? If not, a simple gradient descent might work. A slightly smarter variant is the Levenberg-Marquardt algorithm which iterates between gradient-descent and approximation of your error space as a quadratic function. We've found Levenberg-Marquardt gets stuck in local minima too much for the pRF fitting, so use Nelder-mead “the amoeba” (check the wikipedia page for animations of what it does and you will see why it's sometimes called this). Matlab has this as a built in function, so all you have to do is supply the function the error function - that is, you need a function that returns a scalar value and the Nelder-Mead algorithm will continue to try parameters to find the parameters that makes your function return the smallest value. So, in our case, we write a function that takes pRF parameters and a time series and returns the correlation of the model time series with the actual time series (you will also need arguments for the stimImage, stimX, stimY and canonical). Actually, we need one minus the correlation value so that Nelder-Mead will search for the pRF parameters that make for the largest correlation of the model and the time series. But, we already know how to do that, right? So, let's try it out.
Ok, what next? A sanity check, of course. Use the parameter values from the beginning of this tutorial and see whether you get back the correct correlation values for tSeries1 and tSeries2.
Now, we need to figure out how to call the non-linear fitting function which in matlab is called fminsearch. It takes a function pointer (that's just the function name with an @ sign before it: @pRFCorr) and the starting parameters. What starting parameters should we use? Well, we already have a good starting place from the grid search! So use those as the starting parameters. It then takes some options, we can be set by optionset, and then the parameters for the function pRFCorr. So, it looks like the following:
But, are these really right? When debugging a non-linear fit, it is a really, really good idea to check out what it is doing by looking at how the fit changes as it searches the parameter space - at least for a voxels - once you are sure that it is doing what you hope to do (sometimes non-linear fits can find really weird parts of the parameter space that can get a fit by optimizing for a noisy feature of the data - for example a big spike in the data can be produced by a model that gets a really tiny receptive field that only gets excited at that one time point. So, let's put some drawing code into the pRFCorr function so that we can see what is happening as we fit.
So, add the following code to the end of the pRFCorr routine:
And run fminsearch again. See it fitting?
Try it with some initial parameters that are not perfect.
Did it find the same set of parameters?
What about with these?
Now that we have a fit an encoding model for each voxel, what is this good for. Decoding is one thing that you can do. And now here's the power, since we have a model (assuming its right) for how the voxel will respond to visual stimuli, we can test it on stimuli we have never encountered before in fitting the encoding model. In this case, we could use it to decode an image that is not even a bar. Let's try this with our model for a letter of the alphabet. Remember we trained our model on moving bars - never ever showed a letter of the alphabet. Here we will give you some response to one letter of the alphabet and ask which letter of the alphabet was presented (full disclosure - these responses were not actually measured like the bar stimuli responses you have been playing with - it was simulated, but in principle if we had a dataset like this we could do the following analysis - and it has been shown to work!)
Ok, first a little detail we have been sweeping under the carpet up until now. The amplitude of response. We've been normalizing all of our responses, so were just throwing out how much percent signal change there was. This becomes important here because for a static letter, the amount of response matters - it will depend on how much overlap there is with the pRF of the voxel. If we normalize the response magnitude, then all voxels will then respond with the same amplitude no matter how much the stimulus overlaps with the pRF. So, let's first start by modifying our code to compute the modelBoldResponse to be in units of percent signal change. For this simulation, let's just assume that if a receptive field totally overlapped with the stimulus that this would generate a maximal response of say, 3% signal change (for a real experiment the maximal response may vary voxel to voxel so would need to be estimated (how would you do that?). So, let's start by looking at the code we had for generating a modelBoldResponse given receptive field parameters.
So, three places we need to fix. One, get rid of normalization, of course (we'll assume the measured responses have had nothing done to them - not even removing the mean). Two, the rf area should be normalized so that if the stimulus completely covers it, it will generate the maximum response - in this case we want that to be 3% signal change. Third, we may want to normalize the canonical response function. There are various ways to do this depending on the need (e.g. you could make the maximum of the canonical to be 1) - we will make the area under the canonical to be 1. This is considered the right thing to do with convolution since it will give interpretable units in terms of percent signal change per unit time. So, go ahead and try to make those changes.
Ok. Now that we have that taken care, let's do some decoding! In the array unknownTSeries we have responses of our voxels to a, well, unknown stimulus. In this case, one of the letters of the alphabet. Can you guess which letter by looking at the responses?
To figure this out, we can try to predict what the response should be to any letter of the alphabet and then find the response that best correlates with the unkownTSeries. The letters of the alphabet are in a variable called possibleStimImages which is indexed by letter of the alphabet, x, y and t [26 x 108 x 192 x 60] (i.e the first dimension is for each of the letters of the alphabet and the last 3 are like the stimImage we have been working for - except a bit shorter - only 60 volumes). You should be able to cycle through all letters and then compute the response of all voxels. Then correlate that predicted response with the measured response and then choose which image was presented.