Note: this introduction is not intended as a guide to the AFNI GUI, for which Hillary Schaefer has created a good introduction:

http://tezpur.keck.waisman.wisc.edu/~oakes/afni/afni_intro.pdf

Also note that for all the AFNI command line programs, a list and description of all the command line options can be viewed by typing the name of the command by itself or followed by "-help",

e.g.:

3dvolreg

to3d -help

The help contents are also available on the web here, with more explanatory, though somewhat dated, manuals here.

NOTE: I am assuming that a recent (i.e. 2005 or later) version of AFNI is being used in these instructions. If in doubt about which version of AFNI is set up as your default, type the following command on the Linux command line:

which afni

This should tell you the path to the version that is set up as your default. To change the default version, it is necessary to edit your login startup files. If you use the BASH shell (the default in our lab), you can edit the .bash_profile file in your home directory with a text editor, and add the following lines to the end:

export AFNI_PATH=/apps/linux/afni-latest/

export AFNI_PLUGINPATH=$AFNI_PATH

export PATH=$AFNI_PATH:$PATH

Alternatively, if you want to use the most recent version of AFNI commands, you can precede the commands listed below with /apps/linux/afni-latest/ (no trailing space). E.g.:

/apps/linux/afni-latest/3dvolreg

This will work on any of the main Linux application servers.

The purpose of this document is to provide an overview of a typical processing pathway for the analysis of fMRI data with AFNI software. Throughout, I emphasize the commands that are used to process data. A virtually complete analysis could, at least in principle, be done with command line programs, without ever looking at the data itself. This would be a VERY BAD IDEA. I cannot emphasize enough how important it is to examine your data (using AFNI's excellent data visualisation capabilities) at all stages of data processing. The old adage "garbage in…garbage out" is very well suited to fMRI data analysis – if you don't look at your data and consider what it means, whether the results are reasonable, and make an effort to visually identify potential problems, then you are likely to end up with garbage.

Enough of the lecture – on with the overview.

- reconstruction of raw functional images (P-files) – specific to The Waisman Laboratory
- conversion of anatomical images to AFNI format – specific to The Waisman Laboratory
- slice timing correction and motion correction of functional images
- temporal filtering of functional images
- spatial blurring of functional images (optional)
- single-subject analysis of functional images with General Linear Modeling (GLM)
- Scaling parameter estimates to percent signal change and calculating SNR
- Talairach transforming of statistical image
- spatial blurring of statistical map (optional)
- Group analysis of individual statistical maps using T-tests
- Group analysis of individual statistical maps using ANOVA
- Correction for multiple comparisons: Monte Carlo simulation
- Correction for multiple comparisons: clustering
- Extraction of cluster means

Below I detail each of these steps, with some example data.

**NOTE: These instructions are for P-files collected since the scanner upgrade in April 2005. For P-files collected before that time, you should use the instructions here for reconstruction.**

The functional data is saved from the scanner in K-space form. That is, the data is actually the Fourier transform of the image data itself. We use a program called epirecon to Fourier transform the data back to image space. Epirecon also applies a number of corrections for various types of image distortion. Rather than directly call epirecon, however, we use another program called sip, which in turn calls epirecon. Sip also creates some standard directories and log files which aid in later analysis and debugging.

To use sip, we first create a *.SIF file for the files we wish to reconstruct. A *.SIF file is nothing more than a text file that contains information about the files to be reconstructed, and how we want them to be reconstructed. To reconstruct the P-files, we simply invoke sip with the name of the *.SIF file as an argument:

sip trainingBlockS001.SIF

A template for a *.SIF file is shown below – you can copy this text into a new text file and use it as a template.

# file trainingBlockS001.SIF

# specifies the template from which this *.SIF file was created
SIF template = template.SIT

# specifies the study name
Study identifier = trainingBlock

# specifies the subject number
Subject identifier = 001

# specifies the base directory in which sub-directories corresponding to each scan run will be created
Working directory = /study/training/afni/block/s001

# specifies the name of the log file to be created
SIF log filename = /study/training/afni/block/s001/trainingBlockS001.log

# specifies the P-files for each scan run. The items "run_1" and "run_2" here are the names to be given to the sub-directories for each scan run
run_1 Pfile = /study/training/afni/block/s001/raw/P05120.7
run_2 Pfile = /study/training/afni/block/s001/raw/P06656.7

# specifies the ref.dat file that was created during the scan session and that is used for calibration during reconstruction
Reference filename = /study/training/afni/block/s001/raw/ref.dat.s001

# Spatial reconstruction filter = Fermi, Hamming, or None
Reconstruction filter = Fermi

# specifies the type of file to create - usually you will want to specify BRIK (for AFNI) or ANA (for ANALYZE) here. Other options are SDT, FNI, IFH
Reconstructed files to create = BRIK

# specifies the scaling factor to be applied during reconstruction - specify 0 to use whole-brain autoscaling
Scaling factor = 0

# specifies how many images at the start of each run to discard. These images correspond to the time that it takes for the scanner to stabilize
Pre-images to discard = 5

# other reconstruction parameters that generally should be left as shown here
ZX = NA
ZY = NA
Apply BP asymm correction = Y

Apart from the file names specific to the study and subject being analysed, a few other options will vary according to your study’s needs:

- The choice of reconstruction filter depends on when you want to filter your data: use "Hamming" to filter the data now and not later; use "None" to apply no filtering now and wait until just before Talairach transforming your data; or use "Fermi" (as done by GE) to apply a small amount of filtering now, removing ghosting and other artifacts, while still filtering the data further before Talairaching.
- The choice of output file format might depend to some extent on which software is going to be used for analysis. In general, we suggest the AFNI format, since it has some features not available in other formats, but can easily be converted to other formats at a later stage if desired. The AFNI format consists of a binary BRIK file that contains the data, and a text HEAD file that contains useful information about the data. One of the AFNI-specific features is that information about the position of the imaged brain volume in relation to the scanner coordinates can be stored in the header. This makes it possible for AFNI to align or overlay fMRI data with completely different acquisition parameters. For example, a high resolution anatomical image acquired axially can be aligned with a low resolution functional image acquired sagittally. In addition to information about the format and position of data in the BRIK file, the HEAD file also contains a command history of the AFNI commands used to produce the file. More details about AFNI dataset formats are available here.

The reconstruction will produce a subdirectory for each scan run. Inside the subdirectory will be a reconstruction log file, called recon.log, which is a plain text file containing a record of the slice by slice reconstruction. You should open this file in a text editor and look at it for any error messages. Also in the directory will be the reconstructed data (i.e. a *.HEAD and a *.BRIK file if you chose AFNI format).
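Since recon.log is plain text, a quick command-line scan can complement opening it in an editor. A minimal sketch, assuming the log flags problems with words like "error" or "warning" (the exact wording epirecon uses may differ):

```shell
# Scan every run's reconstruction log for suspicious lines.
# The search terms are an assumption about the log's wording.
for log in run_*/recon.log; do
    [ -f "$log" ] || continue
    echo "== $log =="
    grep -inE 'error|warn' "$log" || echo "no error/warning lines found"
done
```

This is only a first pass – still open the log and look at the slice-by-slice record yourself.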

**NOTE: These instructions are for scans collected since the scanner upgrade in April 2005. For scans collected before that time, you should use the instructions here.**

To convert anatomical files to AFNI format, we use the AFNI command to3d. In your data directory, you'll have a bunch of subdirectories such as:

S1_3-Plane

S2_EFGRE3D

etc. Each of these contains a different type of anatomical scan, but the overall conversion process should be the same. To take the example of the S2_EFGRE3D directory, which contains the high resolution, T1-weighted anatomical scan, this directory contains a numerically ordered set of files:

S2_EFGRE3D.0001

S2_EFGRE3D.0002

...

S2_EFGRE3D.0124

To convert these to AFNI format, cd into the directory where you want the AFNI format file stored, and then execute the following command:

to3d -session . -prefix T1High /study/<path to files>/S2_EFGRE3D.*

where **<path to files>** completes the full Linux path to the directory containing the files that you want converted.

Make sure you visually inspect the created file in AFNI. Also, overlay your reconstructed functional files on the anatomical file and visually inspect to make sure they are properly aligned. This can be done by choosing the anatomical file as the underlay, choosing the functional file as the overlay, and setting the overlay threshold to a value such that only voxels in and immediately around the brain are shown.


For a given brain volume, each slice image is acquired at a slightly different time. The exact timing of the slices depends on the acquisition setup. Often slices are either acquired in a straightforward sequence, or in an interleaved sequence in which the even numbered slices are acquired first, followed by the odd numbered slices. Particularly in the latter case, the time offset between adjacent slices will lead to differences in signal that can be substantial, particularly in event related studies. Although such differences will not affect individual voxel analyses, any process that involves averaging or interpolating the functional time series across slices will be affected by slice timing differences. Such processes include three-dimensional spatial smoothing, cluster averaging, motion correction, and spatial normalization (e.g. to Talairach space). When any of these processes are to be applied to the functional data, it might be desirable to correct for slice timing differences. In AFNI this can be done using the command 3dTshift, but a simpler way is to use the -tshift option during motion correction (see example below). The correction is applied prior to motion correction, since motion correction involves interpolating between adjacent slices – such interpolation will be inaccurate if adjacent slices represent different time points.
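To see why adjacent slices can represent quite different time points, consider a toy interleaved acquisition. The numbers below (TR = 2 s, 10 slices, even-numbered slices acquired first, as described above) are made up for illustration:

```shell
# Print each slice's acquisition offset within one TR for an
# interleaved (even-first) acquisition: TR = 2 s, 10 slices.
awk 'BEGIN {
    tr = 2.0; n = 10; dt = tr / n; order = 0
    for (s = 2; s <= n; s += 2) t[s] = order++ * dt   # even slices first
    for (s = 1; s <= n; s += 2) t[s] = order++ * dt   # then odd slices
    for (s = 1; s <= n; s++) printf "slice %2d acquired at %.1f s\n", s, t[s]
}'
```

Slice 2 is acquired at 0.0 s but its neighbour slice 1 at 1.0 s – half a TR apart – which is exactly the offset that interpolation across slices would otherwise ignore.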

Motion correction seeks to correct for small subject head motion during the scan. The command 3dvolreg is used as shown below.

3dvolreg -prefix run_1_vr -tshift -Fourier -verbose -base 'run_1+orig[0]' -dfile run_1_motion.txt run_1+orig

You can use different images as the base image – in most cases you will want to use an image taken with the same scan parameters that was acquired closest in time to your anatomical images. The example here uses the first image of the first scan run, which would be appropriate if the anatomical scans were acquired before the functional scans. As another example, if an experiment acquired 5 functional scans, each of length 150 TRs, *before* the anatomical scans, the following command might be appropriate:

3dvolreg -prefix run_1_vr -tshift -Fourier -verbose -base 'run_5+orig[149]' -dfile run_1_motion.txt run_1+orig

This command specifies the final time point (in AFNI, time indices start at 0, so a 150 TR scan run has time points 0 to 149) of the last functional run as the reference, or base image.

The 3dvolreg command creates a new AFNI dataset, consisting of a HEAD file and a BRIK file with the prefix given in the command line.

To see the extent to which there were any large translations or rotations, you can look at the motion text file (use 1dplot for a graphical output).

1dplot -volreg 'run_1_motion.txt[1..6]'


(The [1..6] designates the 6 columns of the motion file containing translation and rotation estimates).

3dvolreg can handle small motions fairly well, but larger motion (> 1mm) might not be properly corrected. To see whether this is the case, check the motion-corrected dataset using AFNI. How do the time series look (i.e. are there obvious discontinuities in the data that could be motion-related)? Is there still apparent motion when scrolling through the time index? Make notes!
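Alongside the visual check, the motion file can be screened numerically. A sketch, assuming the usual 3dvolreg -dfile layout (column 1 = TR index, columns 2-4 = rotations in degrees, columns 5-7 = translations in mm):

```shell
# Flag any TR whose estimated translation exceeds 1 mm.
awk 'function abs(x) { return x < 0 ? -x : x }
     abs($5) > 1 || abs($6) > 1 || abs($7) > 1 {
         printf "TR %d: translation > 1 mm\n", $1
     }' run_1_motion.txt
```

Any TRs flagged here are the ones to scrutinise in the AFNI time series viewer.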

NOTE: For earlier versions of AFNI there was a bug in the 3dTshift routine, such that samples are “shifted” in the wrong temporal direction. It is important that you use a recent version of AFNI for motion correction and slice-time correction – if in doubt, use the latest version.


The data can be filtered prior to linear model fitting, using 3dFourier, which applies a rather brutal frequency domain filter to the time-series. For event-related designs, this is not really necessary now that the AFNI GLM program, 3dDeconvolve, includes options for more precise baseline drift estimation. For block designs it might still be desirable to eliminate high frequency noise terms:

3dFourier -prefix run_1_vr_lp -lowpass 0.05 -retrend run_1_vr+orig


Here the low pass cutoff value was based upon the spectrum of the input stimulus model. This will tend to smooth the data before linear fitting, thus providing a better model fit. You should always be conservative when choosing frequency cutoffs, and visually inspect the time series in AFNI after temporal filtering.

At some point you will probably want to spatially smooth or blur your data. This is done for two main reasons. For one, as with any smoothing operation, spatial smoothing will tend to average out high spatial frequency noise in your image. Thus areas of activation with a greater spatial extent will remain, but those with a small spatial extent will be eliminated. Second, before aggregating data across subjects, spatial blurring needs to be performed because subjects’ brain anatomies vary, even after normalization to a standard brain space (e.g. Talairach space), and thus their areas of functional activation are likely not to exactly overlap.

There are basically three points in the data analysis process at which spatial smoothing can be performed. Data can be smoothed during reconstruction (i.e. using the Hamming or Fermi filter option). This has the advantage of eliminating unwanted noise and artifacts from the images before they can be compounded by further processing. At least some spatial smoothing at this stage is thus probably warranted. However, spatial smoothing during reconstruction is only performed in-plane. That is, it is applied separately for each brain slice, and thus does not account for unwanted high spatial frequency variation between slices, nor does it help align functional activation across subjects in the slice direction.

A second time to spatially smooth is just before first level statistical analysis of individual subject data. This is the last stage at which the (essentially) raw data time series will be used. After individual subject statistical analysis, further processing is applied to statistical meta-data, to which it is arguably less valid to apply spatial smoothing.

The third possibility is to apply spatial smoothing to the statistical maps produced by individual subject statistical analysis, questions of validity notwithstanding (this is discussed briefly below).

To opt for the second approach, we can spatially smooth the functional dataset using the command 3dmerge:

3dmerge -1blur_fwhm 5 -doall -session . -prefix run_1_vr_lp_5mm run_1_vr_lp+orig

This will apply a Gaussian blur with a full width at half maximum (FWHM) of 5mm to the dataset.
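Note that 5mm here is the full width at half maximum, not the standard deviation of the Gaussian kernel; the two are related by sigma = FWHM / (2 * sqrt(2 * ln 2)) ≈ FWHM / 2.355:

```shell
# Convert a FWHM of 5 mm to the equivalent Gaussian sigma.
awk 'BEGIN { fwhm = 5.0
             sigma = fwhm / (2 * sqrt(2 * log(2)))
             printf "FWHM %.1f mm -> sigma %.2f mm\n", fwhm, sigma }'
```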

For an introduction to the principles behind such analyses, see this brief introduction to GLM in functional imaging.

To go beyond the simple example given here, see these explanations of including motion estimates as covariates and estimating the hemodynamic response function (“deconvolution”).

In AFNI, the command that implements single-subject GLM is called 3dDeconvolve. This is not, however, a completely appropriate name, because 3dDeconvolve doesn't necessarily perform a deconvolution analysis. It *can* do deconvolution, but it more generally does GLM – please use the term GLM rather than deconvolution (it is correct for all types of analysis that can be performed using 3dDeconvolve).

Note that there are a number of new features that make the 3dDeconvolve command simpler, yet more powerful. These are explained in detail at:

http://afni.nimh.nih.gov/afni/doc/misc/3dDeconvolveSummer2004

Before performing the GLM analysis, we need to make one or more stimulus timing files, which are text files containing the time onsets (in seconds) for each type of experimental stimulus for which we want to separately model the BOLD response. These timing files have the file extension .1D by convention. There should be a separate timing file for each stimulus condition to be modeled, containing the stimulus times in units of seconds (the same units as the TR of the -input file). There are two formats for timing files:

1. A single column of numbers, in which case each time is relative to the start of the first imaging run ("global times").

2. If there are 'R' runs catenated together (either directly on the command line, or as represented in the -concat option), the second format is to give the times within each run separately. In this format, the timing file would have R rows, one per run; the times for each run taking up one row. For example, with two runs, R=2:

12.3 19.8 23.7 29.2 39.8 52.7 66.6

16.8 21.8 32.7 38.9 41.9 55.5 71.2

These times will be converted to global times by the program, by adding the time offset for each imaging run. When using the multi-row input style, you may have the situation where a particular class of stimulus does not occur at all in a given imaging run. To encode this, the corresponding row of the timing file should consist of a single '*' character.
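The per-run to global conversion that 3dDeconvolve performs can be sketched with a few lines of awk. Here the 170 s run length is an assumption for illustration (matching the example runs later in this document): row r's times get an offset of (r-1) × 170 s added, and '*' rows contribute no events:

```shell
# Per-run times (one row per run) -> global times,
# assuming two runs of 170 s each.
printf '12.3 19.8 23.7\n16.8 21.8 32.7\n' |
awk '{ off = (NR - 1) * 170
       for (i = 1; i <= NF; i++)
           if ($i != "*") printf "%.1f\n", $i + off }'
```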

**NOTE**: The times are relative to the start of the data time series as input to 3dDeconvolve. If the first few points of each imaging run have been cut off (e.g. during reconstruction), then the actual stimulus times must be adjusted correspondingly (e.g., if 5 time points were dropped with TR=2, then the actual stimulus times should be reduced by 10.0 in the stimulus timing files).
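For example, with 5 pre-images dropped at TR = 2 s, each onset shifts earlier by 10.0 s. A sketch of making this adjustment to a timing file with awk ('*' rows for empty runs are passed through unchanged):

```shell
# Subtract the 10 s of dropped pre-images from every onset time.
printf '20.0 29.0 40.9\n*\n' |
awk '{ for (i = 1; i <= NF; i++)
           $i = ($i == "*") ? $i : sprintf("%.1f", $i - 10.0)
       print }'
```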

Now for fitting the linear model using 3dDeconvolve. The best way to explain this is with a simple example. Say we have an event-related experiment in which subjects are shown a sequence of positive and negative images, presented in a pseudorandom order. They undergo two scan runs, each one 180 seconds long. During reconstruction we drop 5 TRs off each scan run, meaning that we have 170 seconds of fMRI data for each scan run left. Let’s imagine that the time onsets (corrected for the 5 dropped TRs) for the two types of stimuli for the two runs are given in 2 files:

pos_onsets.1D:

10.0 19.0 30.9 40.8 53.1 65.0 74.7 84.6 94.5 103.3 115.7 124.7 134.8 144.2 153.7 165.7

16.0 25.4 34.8 43.0 53.5 60.6 73.1 85.5 97.3 107.0 118.3 128.1 140.4 151.9 160.9 170.0

neg_onsets.1D:

15.0 23.7 34.6 46.8 58.6 68.6 80.4 92.9 104.1 115.4 128.2 139.8 148.8 161.0 173.6 183.9

10.0 19.4 31.5 41.6 49.0 57.2 66.6 74.4 85.4 93.0 103.6 115.3 127.8 135.1 144.4 151.4

Let’s also assume that the two scan runs are stored in the files run_1+orig and run_2+orig (remember that the extension .HEAD/.BRIK are not required when referring to AFNI format files in AFNI commands). The start of the command is simple enough:

3dDeconvolve -input run_1_vr+orig run_2_vr+orig -nfirst 0 -polort 2 \

(the backslash at the end of the line here is a bash convention used to denote to the operating system that the command continues on the following line)

Here we specify the files to be analysed. AFNI automatically takes into account that there are 2 scan runs (because there are 2 files), and will calculate a separate baseline for each scan run. Analysis should start from the first time point (-nfirst 0) and we want to model the baseline with a polynomial of degree 2. Roughly speaking, a 2nd-degree polynomial will model the baseline with a constant value, a linear drift, and a quadratic-shaped curve, which should be sufficient to capture most signal drift in each of these fairly short scan runs. For longer runs, it might be desirable to use a polort of 3 or 4.
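As a rough guide (this is the heuristic later adopted as the default in AFNI's afni_proc.py script, so treat it as a rule of thumb rather than lab policy), the polynomial degree can be chosen from the run length in seconds:

```shell
# polort rule of thumb: 1 + floor(run_length_seconds / 150).
# For the 170 s runs in this example, that gives -polort 2.
awk 'BEGIN { len = 170; printf "polort = %d\n", 1 + int(len / 150) }'
```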

**NOTE**: It is no longer necessary to concatenate the runs before analysis, nor is it necessary to make their baselines equal. Each scan run will have its baseline individually modeled with a polynomial of the specified degree.

Now to specify the experimental conditions and timing:

-num_stimts 2 -basis_normall 1 \

-stim_times 1 pos_onsets.1D 'GAM' \

-stim_label 1 positive \

-stim_times 2 neg_onsets.1D 'GAM' \

-stim_label 2 negative \

The first line here specifies that we have two stimulus conditions, and that the model time series for each should be normalised to have a peak amplitude of 1. The latter allows us to convert estimated b-weights to estimates of % signal change later in the analysis chain.

Then we specify the timing file for each condition, and specify what model we want to use for the hemodynamic response. Here we have chosen the simplest option, a simple Gamma variate function (‘GAM’). Other options are:

'GAM' = 0 parameter gamma variate

'GAM(p,q)' = 2 parameter gamma variate

'SPMG' = 2 parameter SPM gamma variate + derivative

'POLY(b,c,n)' = n parameter polynomial expansion

'SIN(b,c,n)' = n parameter sine series expansion

'TENT(b,c,n)' = n parameter tent function expansion

'BLOCK(d,p)' = 1 parameter block stimulus of duration 'd'

'EXPR(b,c) exp1 ... expn' = n parameter; arbitrary expressions

Each of these models has advantages and disadvantages, that we won’t go into here, but basically they represent different degrees of tradeoff between a well-specified, but constrained model with few estimated parameters versus a poorly specified, but flexible model with many estimated parameters. The former (exemplified by the simple ‘GAM’ model) affords more statistical power, but cannot properly model different shapes or latencies of hemodynamic response. More details of each model type are given at:

http://afni.nimh.nih.gov/afni/doc/misc/3dDeconvolveSummer2004

Now we can specify contrasts between different experimental conditions. In this instance, there are only a few possibilities:

-num_glt 2 \

-glt_label 1 neg_pos_diff \

-gltsym 'SYM: +negative -positive' \

This will calculate the linear contrast negative-positive, giving us the difference in estimated response to the two conditions. It is not necessary to specify the reverse contrast, since all contrasts in AFNI are purely mathematical, i.e. they do not mask out negative values, unlike other software.

-glt_label 2 neg_pos_ave \

-gltsym 'SYM: +0.5*negative +0.5*positive' \

This will calculate the mean estimated response to the two conditions.

Finally, we need to specify the type of output we want:

-fitts fit_ts -errts error_ts \

-xjpeg glm_matrix.jpg -tout -fout -bucket glm_out

(note the absence of a backslash on the last line, since it’s the last line of the command)

The first of these lines specifies that we want the fitted time series and error time series to be output for each voxel, in the form of two 3D+time AFNI datasets. These can be useful for diagnostic purposes and for calculating signal to noise ratios.

The second line specifies that we want a graphical representation of the GLM design matrix output to a JPEG file called glm_matrix.jpg, and that we want all the estimated b-coefficients, fit statistics, and contrast estimates output to an AFNI bucket file called glm_out+orig.

Putting this all together, we have:

3dDeconvolve -input run_1_vr+orig run_2_vr+orig -nfirst 0 -polort 2 \

-num_stimts 2 –basis_normall 1 \

-stim_times 1 pos_onsets.1D 'GAM' \

-stim_label 1 positive \

-stim_times 2 neg_onsets.1D 'GAM' \

-stim_label 2 negative \

-num_glt 2 \

-glt_label 1 neg_pos_diff \

-gltsym 'SYM: +negative -positive' \

-glt_label 2 neg_pos_ave \

-gltsym 'SYM: +0.5*negative +0.5*positive' \

-fitts fit_ts -errts error_ts \

-xjpeg glm_matrix.jpg -tout -fout -bucket glm_out

When we run this, AFNI will churn away for some minutes, reporting progress as it goes. When it is finished, you should use AFNI to view the statistical maps overlaid on the original EPI data or on the anatomical images. You should assess the goodness of fit of this model just as you would for a linear regression or GLM analysis with any other type of data. The beta weights and associated t-tests give you an indication of the functional activation in response to each experimental condition for this subject.

The process of applying GLM to individual subject data is far more flexible, and potentially far more complex than can be presented here. For a more complete treatment, see:

http://afni.nimh.nih.gov/pub/dist/edu/latest/afni_3dDeconvolve/regression.htm

Before combining the results from each subject into a group-level analysis, it is useful to convert each subject’s parameter estimates to a common scale. This way, differences between subjects in the overall scaling of fMRI data (which are meaningless because fMRI is an arbitrarily scaled, purely relative measure) are removed. There is no one correct way to scale each subject’s data, but the method most commonly used (and which seems to give the best combination of sensitivity and reliability) is percent signal change. This is simply calculated as the estimated signal change for a given experimental condition (or contrast between conditions) expressed as a percentage of the baseline signal.

**NOTE**: the following discussion and method is only appropriate when using the 'GAM' model, or similar *single-parameter* models. For other, multi-parameter models, percent signal change should be calculated on the basis of the estimated response functions, which are output from 3dDeconvolve using the -iresp option.

Now it becomes apparent why we specified the -basis_normall option in the 3dDeconvolve command, above. By doing this, our condition regressors were normalised to a peak amplitude of 1. This means that the parameter estimates for a given condition will be estimates of the amplitude of the BOLD response for that condition. Thus we can simply use 3dcalc to perform the arithmetic on the appropriate sub-bricks of the output GLM bucket dataset. First we need to determine which sub-bricks of the statistical bucket dataset contain the beta coefficients and contrasts that we want to scale. We use 3dinfo for this:

3dinfo glm_out+orig

This command will give us information about the contents of the statistical bucket file, including lines like the following:

-- At sub-brick #0 'Run #1 P_0 Coef' datum type is short: 0 to 32767 [internal]
                                     [* 0.712808] 0 to 23356.6 [scaled]
-- At sub-brick #6 'Run #2 P_0 Coef' datum type is short: 0 to 32767 [internal]
                                     [* 0.722834] 0 to 24671.6 [scaled]
-- At sub-brick #12 'positive[0] Coef' datum type is short: -26364 to 21149 [internal]
                                     [* 0.000433336] -11.4245 to 9.16462 [scaled]
-- At sub-brick #16 'negative[0] Coef' datum type is short: -26364 to 21149 [internal]
                                     [* 0.000433336] -11.4245 to 9.16462 [scaled]
-- At sub-brick #20 'neg_pos_diff LC[0] Coef' datum type is short: -18586 to 29350 [internal]
                                     [* 0.00044168] -8.20907 to 12.9633 [scaled]
-- At sub-brick #24 'neg_pos_ave LC[0] Coef' datum type is short: -26364 to 21149 [internal]
                                     [* 0.000433336] -11.4245 to 9.16462 [scaled]

This tells us that sub-bricks 0 and 6 contain the baseline estimates (P_0 denotes the parameter for the 0-order baseline regressor, which is a constant equal to the estimated baseline) for each run, and sub-bricks 12, 16, 20, and 24 contain the beta coefficients and estimated linear contrasts. Thus:

3dcalc -prefix glm_estimates_perc -fscale \

-a 'glm_out+orig[0]' -b 'glm_out+orig[6]' \

-c 'glm_out+orig[12,16,20,24]' \

-expr "step(mean(a,b)-500)*100*c/mean(a,b)"

The -fscale option forces 3dcalc to internally use the full range of possible integer values to store the results – this avoids rounding and truncation errors, but is otherwise invisible to the user.

The step(mean(a,b)-500) term here returns 1 if the mean of the two baselines is greater than 500, and 0 otherwise. This is used to mask out all voxels with a mean baseline signal below a certain level (500 is usually appropriate for our lab's internal files when reconstructed with autoscaling as described above, but you should check by overlaying the estimated baseline sub-bricks in the glm_out file on the subject's anatomical image and setting the threshold to 500). It also ensures that we don't get ridiculously high percent signal change values for voxels outside the brain with very low baseline signal, but possibly high parameter estimates due to noise.
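The arithmetic 3dcalc applies at each voxel can be checked by hand. With made-up values – baselines a = 1200 and b = 1180, coefficient c = 3.57:

```shell
# One voxel's percent signal change, using the same expression
# as the 3dcalc command above (illustrative numbers only).
awk 'BEGIN { a = 1200; b = 1180; c = 3.57
             m = (a + b) / 2
             mask = (m - 500 > 0) ? 1 : 0
             printf "percent signal change = %.2f\n", mask * 100 * c / m }'
```

A voxel with mean baseline below 500 would get mask = 0 and hence a value of exactly zero.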

This technique will produce a new bucket dataset with only the percent signal change parameter and contrast estimates – we have dropped all the t-statistics, F-statistics, R^2 estimates etc., which is fine because none of these are used in the higher level (group) analysis. However, two things that are worth including at this stage are the mean baseline estimate, and an estimate of the signal to noise ratio (SNR). The former we can easily get by averaging the baseline estimates from the different runs using 3dcalc:

3dcalc -prefix glm_baseline -fscale \

-a 'glm_out+orig[0]' -b 'glm_out+orig[6]' \

-expr "step(mean(a,b)-500)*mean(a,b)"

To estimate the SNR, we need to use both this mean estimate, as well as the residual time series that was saved with the -errts option in the 3dDeconvolve command. The residual time series is our best estimate of the noise in the data. To calculate the SNR, we need to divide the mean baseline estimate (the signal) by the standard deviation of the residual time series (the noise). To calculate the latter:

3dTstat -prefix err_ts_stdev -stdev -datum short error_ts+orig

Hence to calculate the SNR:

3dcalc -prefix snr -fscale \

-a glm_baseline+orig -b err_ts_stdev+orig \

-expr "a/b"
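The division 3dcalc performs here is easy to sanity-check at a single voxel. With made-up numbers – a baseline of 1190 and a short residual series standing in for the error_ts time series:

```shell
# Standard deviation of a toy residual series, then SNR = baseline / stdev.
printf '5 -3 2 -4 0\n' |
awk '{ n = NF
       for (i = 1; i <= n; i++) { s += $i; ss += $i * $i }
       mean = s / n
       sd = sqrt((ss - n * mean * mean) / (n - 1))   # sample std dev
       printf "stdev = %.2f  SNR = %.1f\n", sd, 1190 / sd }'
```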

Now that we have both the mean baseline and the SNR, we want to put these into the same bucket dataset that contains the percent signal change parameter and contrast estimates. To do this, we can use the 3dbucket command:

3dbucket -session . -prefix glm_out_perc glm_baseline+orig snr+orig glm_estimates_perc+orig

We now have the basic dataset that we will want to use for second level, group analysis.

At this stage, if satisfied with the individual subject results, we would normalise the statistical images to Talairach space and perform a second level (i.e. group) analysis. This involves the following steps:

First we need to landmark and transform a high resolution anatomical image of each subject's brain, using interactive landmarking in AFNI, which will produce a new file with +tlrc in place of +orig in the filename. This step is discussed here:

http://afni.nimh.nih.gov/pub/dist/edu/latest/afni08_talairach/afni08_talairach.htm

We then normalise the statistical map using the adwarp program.

adwarp -apar T1High+tlrc -dpar glm_out_perc+orig -prefix glm_out_perc -dxyz 2 -verbose


In this example we use the (already marked and Talairach transformed) high resolution T1-weighted anatomical image as the basis for normalizing the statistical data into Talairach space. After normalization we interpolate and resample the data to a 2mm cubic voxel size. The default interpolation method, linear interpolation, was used in this example. This interpolation method is appropriate for beta coefficients and contrast estimates, but possibly not for parameters such as t-values and F-values. Since typically beta coefficients or contrast estimates will be the only parameters carried into the next analysis step, this is a reasonable approach.

As mentioned above, spatial smoothing can be applied to the statistical maps rather than to the raw data. The main reason for smoothing at this stage rather than earlier is that it is typically far less computationally expensive, since the statistical map does not contain multiple time points. In addition, because the voxels have been interpolated and up-sampled (e.g. to 2mm cubic voxels), the result of smoothing will itself be smoother, and the resulting statistical map will have a more Gaussian spatial distribution (which can have advantages when correcting for multiple comparisons at a later stage).

However, it must be noted that in smoothing beta coefficients and contrast estimates we are smoothing meta-data; i.e. we are smoothing estimates of effects rather than actual fMRI data. This should be a reasonable approach, since such effect estimates are clearly and directly interpretable as estimates of fMRI signal values.

We blur the individual normalized statistical maps using 3dmerge:

3dmerge -1blur_fwhm 5 -doall -session . -prefix glm_out_5mm glm_out_perc+tlrc

This will apply a Gaussian blur with a full width at half maximum (FWHM) of 5mm to the statistical map.
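If each subject’s normalised bucket lives in its own directory, the blurring step can be scripted rather than typed once per subject. A minimal sketch (the subj1..subj15 directory layout is an assumption for illustration; the commands are printed into a file so they can be checked before running):

```shell
#!/bin/bash
# Sketch: write a 3dmerge blurring command for each subject into a script
# that can be inspected and then run with: sh blur_all.sh
# The subj1..subj15 directory layout is an assumption for illustration.
for i in $(seq 1 15); do
    echo "3dmerge -1blur_fwhm 5 -doall -session subj${i} -prefix glm_out_5mm subj${i}/glm_out_perc+tlrc"
done > blur_all.sh
```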

The final stage of analysis is to perform a group level analysis. In many cases you might have between-subjects factors to include in your analysis. In a first example here, we will deal with a simple, completely within-subjects design, in which the question is which areas of the brain become activated in response to the negative stimuli. First we need to determine which sub-brick of the statistical bucket dataset contains the percent signal change estimates for the negative condition. We can use 3dinfo for this, or we can simply recall that our dataset contains the mean baseline in the first sub-brick (sub-brick 0), the SNR in the second sub-brick (sub-brick 1), and then the percent signal change versions of the 2 parameter estimates and 2 contrasts of interest in sub-bricks 2 to 5. Thus sub-brick 3 should correspond to the percent signal change estimate of the response to the negative stimuli.

To perform the analysis we would use a 1-sample t-test as follows (note that, depending on your shell, sub-brick selectors such as [3] may need to be quoted to stop the shell treating them as filename wildcards):

3dttest -session . -prefix ttest -base1 0 -set2 \

subj1/glm_out_5mm+tlrc[3] \

subj2/glm_out_5mm+tlrc[3] \

subj3/glm_out_5mm+tlrc[3] \

subj4/glm_out_5mm+tlrc[3] \

subj5/glm_out_5mm+tlrc[3] \

subj6/glm_out_5mm+tlrc[3] \

subj7/glm_out_5mm+tlrc[3] \

subj8/glm_out_5mm+tlrc[3] \

subj9/glm_out_5mm+tlrc[3] \

subj10/glm_out_5mm+tlrc[3] \

subj11/glm_out_5mm+tlrc[3] \

subj12/glm_out_5mm+tlrc[3] \

subj13/glm_out_5mm+tlrc[3] \

subj14/glm_out_5mm+tlrc[3] \

subj15/glm_out_5mm+tlrc[3]

This tests whether the percent signal change estimated response for the negative condition (sub-brick 3) is significantly different from 0 (the -base1 0 term).
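Typing fifteen dataset entries by hand invites mistakes, so the argument list can be generated with a small shell loop instead. A minimal sketch (subject directories subj1..subj15 and the sub-brick index are taken from the command above; the assembled command is written to a file so it can be checked before running):

```shell
#!/bin/bash
# Sketch: build the -set2 argument list for the one-sample 3dttest call
# above, rather than typing subj1..subj15 by hand.
setlist=""
for i in $(seq 1 15); do
    setlist="$setlist subj${i}/glm_out_5mm+tlrc[3]"
done
echo "3dttest -session . -prefix ttest -base1 0 -set2$setlist" > ttest_cmd.sh
```

Inspect ttest_cmd.sh and then run it with sh ttest_cmd.sh.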

A slightly different example concerns when there are two groups for which we want to compare functional activation in response to the task. In this case we would perform a two-group t-test. Let’s say we now wanted to compare the contrast between positive and negative images, which would be sub-brick 4:

3dttest -session . -prefix ttest \

-set1 \

subj1/glm_out_5mm+tlrc[4] \

subj2/glm_out_5mm+tlrc[4] \

subj3/glm_out_5mm+tlrc[4] \

subj4/glm_out_5mm+tlrc[4] \

subj5/glm_out_5mm+tlrc[4] \

subj6/glm_out_5mm+tlrc[4] \

subj7/glm_out_5mm+tlrc[4] \

subj8/glm_out_5mm+tlrc[4] \

-set2 \

subj9/glm_out_5mm+tlrc[4] \

subj10/glm_out_5mm+tlrc[4] \

subj11/glm_out_5mm+tlrc[4] \

subj12/glm_out_5mm+tlrc[4] \

subj13/glm_out_5mm+tlrc[4] \

subj14/glm_out_5mm+tlrc[4] \

subj15/glm_out_5mm+tlrc[4] \

subj16/glm_out_5mm+tlrc[4]


In the case that we had a single group that was measured twice (say, before and after a treatment), we would perform a paired t-test by adding the -paired option:

3dttest -session . -paired -prefix ttest \

-set1 \

subj1/glm_out_5mm+tlrc[4] \

subj2/glm_out_5mm+tlrc[4] \

subj3/glm_out_5mm+tlrc[4] \

subj4/glm_out_5mm+tlrc[4] \

subj5/glm_out_5mm+tlrc[4] \

subj6/glm_out_5mm+tlrc[4] \

subj7/glm_out_5mm+tlrc[4] \

subj8/glm_out_5mm+tlrc[4] \

-set2 \

subj9/glm_out_5mm+tlrc[4] \

subj10/glm_out_5mm+tlrc[4] \

subj11/glm_out_5mm+tlrc[4] \

subj12/glm_out_5mm+tlrc[4] \

subj13/glm_out_5mm+tlrc[4] \

subj14/glm_out_5mm+tlrc[4] \

subj15/glm_out_5mm+tlrc[4] \

subj16/glm_out_5mm+tlrc[4]

At this stage you have completed a basic voxelwise statistical analysis of fMRI data.

Many experimental designs will incorporate either more than two experimental conditions or more than two experimental groups, or both. In this case, analysis of variance (ANOVA) is appropriate. Consider an example where we performed an experiment in which subjects saw pictures of either happy faces, fearful faces, or neutral faces. Consider that we want to measure which areas of the brain show differential activation to the different types of faces. In addition, let us assume that there were two experimental groups – say a patient group and a control group. In this experiment we could use the 3dANOVA3 command to perform a 3 factor ANOVA.

For this design, the within-subject condition (face expression) is a fixed factor, group is a fixed factor, and subjects should be considered a random factor nested within group (since we want to be able to generalise beyond this particular set of subjects to the general population from which they are “randomly” sampled). So we need to perform a 3-way mixed effects ANOVA. The command for this is 3dANOVA3.

Let us assume that the beta coefficients from a 3 condition (happy, fearful, neutral) GLM performed on individual subject data are contained in sub-bricks 4, 5 and 6 respectively of the GLM output bucket files. Then the command below could be used:

3dANOVA3 -type 5 -alevels 2 -blevels 3 -clevels 8 \

Here we specify some general aspects of the design. The -type option specifies which type of design is being analysed. Options are:

1 A,B,C fixed; AxBxC

2 A,B,C random; AxBxC

3 A fixed; B,C random; AxBxC

4 A,B fixed; C random; AxBxC

5 A,B fixed; C random; AxB,BxC,C(A)

We also specify that the first factor (group) has 2 levels, the second factor (face) has 3 levels, and the final factor (subject) has 8 levels.

Next, we specify the datasets to analyse, including their respective factor levels:

-dset 1 1 1 control1/glm_out_5mm+tlrc[4] \

-dset 1 1 2 control2/glm_out_5mm+tlrc[4] \

-dset 1 1 3 control3/glm_out_5mm+tlrc[4] \

-dset 1 1 4 control4/glm_out_5mm+tlrc[4] \

-dset 1 1 5 control5/glm_out_5mm+tlrc[4] \

-dset 1 1 6 control6/glm_out_5mm+tlrc[4] \

-dset 1 1 7 control7/glm_out_5mm+tlrc[4] \

-dset 1 1 8 control8/glm_out_5mm+tlrc[4] \

-dset 1 2 1 control1/glm_out_5mm+tlrc[5] \

-dset 1 2 2 control2/glm_out_5mm+tlrc[5] \

-dset 1 2 3 control3/glm_out_5mm+tlrc[5] \

-dset 1 2 4 control4/glm_out_5mm+tlrc[5] \

-dset 1 2 5 control5/glm_out_5mm+tlrc[5] \

-dset 1 2 6 control6/glm_out_5mm+tlrc[5] \

-dset 1 2 7 control7/glm_out_5mm+tlrc[5] \

-dset 1 2 8 control8/glm_out_5mm+tlrc[5] \

-dset 1 3 1 control1/glm_out_5mm+tlrc[6] \

-dset 1 3 2 control2/glm_out_5mm+tlrc[6] \

-dset 1 3 3 control3/glm_out_5mm+tlrc[6] \

-dset 1 3 4 control4/glm_out_5mm+tlrc[6] \

-dset 1 3 5 control5/glm_out_5mm+tlrc[6] \

-dset 1 3 6 control6/glm_out_5mm+tlrc[6] \

-dset 1 3 7 control7/glm_out_5mm+tlrc[6] \

-dset 1 3 8 control8/glm_out_5mm+tlrc[6] \

-dset 2 1 1 patient1/glm_out_5mm+tlrc[4] \

-dset 2 1 2 patient2/glm_out_5mm+tlrc[4] \

-dset 2 1 3 patient3/glm_out_5mm+tlrc[4] \

-dset 2 1 4 patient4/glm_out_5mm+tlrc[4] \

-dset 2 1 5 patient5/glm_out_5mm+tlrc[4] \

-dset 2 1 6 patient6/glm_out_5mm+tlrc[4] \

-dset 2 1 7 patient7/glm_out_5mm+tlrc[4] \

-dset 2 1 8 patient8/glm_out_5mm+tlrc[4] \

-dset 2 2 1 patient1/glm_out_5mm+tlrc[5] \

-dset 2 2 2 patient2/glm_out_5mm+tlrc[5] \

-dset 2 2 3 patient3/glm_out_5mm+tlrc[5] \

-dset 2 2 4 patient4/glm_out_5mm+tlrc[5] \

-dset 2 2 5 patient5/glm_out_5mm+tlrc[5] \

-dset 2 2 6 patient6/glm_out_5mm+tlrc[5] \

-dset 2 2 7 patient7/glm_out_5mm+tlrc[5] \

-dset 2 2 8 patient8/glm_out_5mm+tlrc[5] \

-dset 2 3 1 patient1/glm_out_5mm+tlrc[6] \

-dset 2 3 2 patient2/glm_out_5mm+tlrc[6] \

-dset 2 3 3 patient3/glm_out_5mm+tlrc[6] \

-dset 2 3 4 patient4/glm_out_5mm+tlrc[6] \

-dset 2 3 5 patient5/glm_out_5mm+tlrc[6] \

-dset 2 3 6 patient6/glm_out_5mm+tlrc[6] \

-dset 2 3 7 patient7/glm_out_5mm+tlrc[6] \

-dset 2 3 8 patient8/glm_out_5mm+tlrc[6] \

We now specify the output that we want:

-fa group -fb face -fab groupByFace \

This line specifies that we want the F-test for the group main effect, the F-test for the face main effect, and the F-test for the group by face interaction.

Next we can look at specific contrasts:

-acontr 1 0 controls -acontr 0 1 patients \

-bcontr 1 0 0 happy -bcontr 0 1 0 fearful -bcontr 0 0 1 neutral \

These two lines specify that we want the factor level means and statistical tests of whether those means are significantly different from 0.

-acontr 1 -1 contr_pat_diff \

-bcontr 1 -1 0 happ_fear_diff -bcontr 1 0 -1 happ_neut_diff -bcontr 0 1 -1 fear_neut_diff -bcontr 0.5 0.5 -1 emo_neut_diff \

These lines test specific contrasts for each factor, averaged across the levels of the other factors.

-aBcontr 1 -1 : 1 cont_pat_diff_happ -aBcontr 1 -1 : 2 cont_pat_diff_fear -aBcontr 1 -1 : 3 cont_pat_diff_neut \

-Abcontr 1 : 1 -1 0 happ_fear_diff_cont -Abcontr 2 : 1 -1 0 happ_fear_diff_pat \

These two lines test contrasts for one factor calculated within a specific level of the other factor.

There are obviously many more contrasts that could be specified than the ones we have here. Bear in mind that you should really only be looking at these contrasts if i) you have an a priori hypothesis about a specific contrast, ii) the main effect F-test for a given factor is significant and you want to know which factor level differences are driving the main effect, or iii) the interaction of two factors is significant and you need to know which differences are driving the interaction. Don’t fall victim to a fishing expedition in which you test every single possible contrast, and possibly wind up with a catch of junk. If you must do exploratory analyses, then you should guard against Type I error by adopting a suitably more stringent threshold.

-bucket anova

Finally, we specify that all the results should be saved in a statistical bucket dataset called anova+tlrc.
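The 48 -dset lines in the command above follow a completely regular pattern, so they can be generated rather than typed. A sketch in bash (the group and subject numbering, the control/patient directory names, and the mapping of face levels 1–3 to sub-bricks 4–6 are all taken from the example above):

```shell
#!/bin/bash
# Sketch: emit the -dset lines for the 3dANOVA3 call above.
# Group 1 = control, group 2 = patient; face levels 1-3 live in
# sub-bricks 4-6; 8 subjects per group.
groups=(control patient)
for g in 1 2; do
  for f in 1 2 3; do
    for s in $(seq 1 8); do
      brick=$((f + 3))
      echo "-dset $g $f $s ${groups[$((g - 1))]}${s}/glm_out_5mm+tlrc[$brick] \\"
    done
  done
done > dset_args.txt
```

The resulting dset_args.txt can be pasted into (or sourced by) the full 3dANOVA3 command after checking that the factor levels line up with the intended sub-bricks.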

At this stage you have completed a basic statistical analysis of fMRI data.

One major consideration when conducting voxelwise statistical tests is the sheer number of tests being performed, which inflates Type I error. A common approach to correct for this is to use a clustering method that determines how big a contiguous cluster of voxels, each significant at an uncorrected threshold Pu, has to be in order to be significant at a threshold Pc that is corrected for multiple comparisons. To determine how big such a cluster needs to be, there are 3 common methods:

1. Gaussian Random Field Theory: uses the mathematical properties of random fields to derive a corrected cluster size.

2. Permutation tests: uses the data itself to create a null distribution of cluster sizes, from which the cluster size corresponding to a desired corrected P can be read off.

3. Monte Carlo simulation: creates multiple simulated null datasets, and from them creates a distribution of cluster sizes, from which the cluster size corresponding to a desired corrected P can be read off.

We could go into a lot of detail about the relative merits of each method here, but that would be somewhat irrelevant for AFNI analysis, since the Monte Carlo method is the only one currently implemented in AFNI (other techniques might become available in the near future).

In order to carry out the Monte Carlo simulation, we need some basic information about the data itself, as well as the uncorrected Pu that we plan on using for creating clusters, and a few other parameters that control the clustering operation.

First we need an estimate of how spatially smooth the noise in the dataset is. To estimate this, we use the program 3dFWHM:

3dFWHM -dset err_ts+orig[0] -mask '3dcalc( -a glm_out+orig[0] -expr step(a-500) )'

Here we have used the residual time series from an individual subject GLM to estimate the spatial smoothness. We have masked the time series with the thresholded baseline estimate so as to exclude voxels from outside the brain. This command will return output like the following:

Gaussian filter widths:

sigmax = 1.72   FWHMx = 4.05
sigmay = 1.56   FWHMy = 3.67
sigmaz = 1.70   FWHMz = 4.00

So for this dataset, our estimate of the full-width-at-half-maximum (FWHM) spatial smoothness is approximately 4mm. We could then do this for all our subjects and average the results. However, if we applied spatial blurring of more than 4mm to our statistical percent signal change estimates, we should instead use the FWHM of the blur that we applied. In the example above we used a spatial blur of 5mm, so the estimate of spatial smoothness used in the Monte Carlo simulations should be 5mm.

The next step is to decide upon the uncorrected Pu that we plan on using for creating clusters. This choice is fairly arbitrary. If we choose a fairly liberal voxelwise Pu, say Pu = 0.01, we will be able to detect large clusters of voxels that are activated at a fairly liberal threshold, but we will miss small clusters made up of highly significantly activated voxels. If, on the other hand, we use a conservative Pu, say Pu = 0.0001, we will be able to detect small clusters of highly activated voxels, but not larger clusters of less activated voxels. The final choice of Pu depends on the size of the clusters we are looking for, as well as considerations of what our overall statistical power is likely to be.

Let’s say for the current example that we choose Pu = 0.01.

The other parameter that we need to determine is what we want our cluster connection radius to be. This specifies how close two voxels need to be to one another in order to be considered contiguous. Usually we would use the voxel size of the datasets being analysed, in this case 2mm.

So now we can run the simulation:

AlphaSim -mask glm_out_perc+tlrc[0] -fwhm 5 -rmm 2 -pthr 0.01 -iter 1000

The -iter 1000 option specifies that we want to run 1000 simulations, which is usually enough. The program will take quite a few minutes to run, and then produce output like the following:

Cl Size   Frequency   Cum Prop    p/Voxel    Max Freq   Alpha

      1      229840   0.328718   0.01012002      0     1.000000
      2      108491   0.483882   0.00947785      0     1.000000
      3       66742   0.579336   0.00887161      0     1.000000
      4       51187   0.652544   0.00831218      0     1.000000
      5       39186   0.708588   0.00774011      0     1.000000

and so on. What you now need to do is read down the right-most column, which gives the corrected Pc for each minimum cluster size (left-hand column). Further down the table we find:

     84           4   0.999920   0.00001630      4     0.058000
     85           5   0.999927   0.00001536      5     0.054000
     86           1   0.999929   0.00001417      1     0.049000
     87           5   0.999936   0.00001393      4     0.048000

So from this we can see that, according to the Monte Carlo simulations, when thresholding the statistical images at an uncorrected Pu = 0.01 and clustering with a cluster connection radius of 2mm, the resulting clusters need to be at least 86 voxels in size in order to achieve a corrected Pc = 0.05. We can now use this information to threshold and cluster our group analysis statistical maps, which is explained in the next section.
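Rather than scanning the table by eye, the cluster-size threshold can be pulled out with a one-line awk filter. A sketch, assuming the AlphaSim table rows have the six columns shown above with Alpha last (in practice you would redirect the AlphaSim output to a file and feed that in):

```shell
#!/bin/bash
# Sketch: given AlphaSim table rows on stdin, print the smallest cluster
# size whose corrected Alpha is <= 0.05 (data rows have 6 fields; the
# Alpha column is last, so header and blank lines are skipped).
min_cluster() {
    awk 'NF == 6 && $6 + 0 <= 0.05 { print $1; exit }'
}

# Demonstration on two of the rows shown above:
printf '84 4 0.999920 0.00001630 4 0.058000\n86 1 0.999929 0.00001417 1 0.049000\n' | min_cluster
# prints 86
# In practice: min_cluster < alphasim_output.txt
```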

We can now use the information from Monte Carlo simulations to threshold and cluster our group analysis statistical maps, so as to get thresholded statistical images corrected for multiple comparisons. This we must do separately for each contrast or F-test that we are interested in. For the preceding example, imagine that we are interested in the F-test for the interaction of group by face. First we need to find out where this information is stored in the anova+tlrc bucket file:

3dinfo anova+tlrc

This produces output such as:

-- At sub-brick #0 'group:Inten' datum type is short: 0 to 32767 [internal]
                        [* 0.000644032] 0 to 21.103 [scaled]
-- At sub-brick #1 'group:F-stat' datum type is short: 0 to 2562 [internal]
                        [* 0.01] 0 to 25.62 [scaled]
     statcode = fift;  statpar = 1 14
-- At sub-brick #2 'face:Inten' datum type is short: 0 to 32767 [internal]
                        [* 0.000571107] 0 to 18.7135 [scaled]
-- At sub-brick #3 'face:F-stat' datum type is short: 0 to 3689 [internal]
     statcode = fift;  statpar = 2 28
-- At sub-brick #4 'groupByFace:Inten' datum type is short: 0 to 32767 [internal]
                        [* 0.000571007] 0 to 19.7125 [scaled]
-- At sub-brick #5 'groupByFace:F-stat' datum type is short: 0 to 3489 [internal]
     statcode = fift;  statpar = 2 28

This tells us that the sub-brick we want to threshold with is sub-brick 5. We can also keep the intensity sub-brick for the interaction, although this would make more sense for a t-test of a specific contrast, since the intensity sub-brick in that case contains the value of the contrast itself (e.g. percent signal change contrast between two conditions).

Before we run the cluster command, we need to determine the F-statistic that corresponds to a Pu of 0.01 for the degrees of freedom of the interaction F-test (2 and 28 here, from the statpar line above). This can be read off from tables, calculated using Excel or an online calculator, or read off the threshold slider in AFNI when viewing the interaction statistical map. In this case, the correct F-statistic is 5.45.

Now we can use the 3dmerge command to perform the clustering:

3dmerge -session . -prefix group_by_face_clust -1thresh 5.45 -1clust 2 -86 anova+tlrc[4,5]

3dmerge -session . -prefix group_by_face_clust_order -1thresh 5.45 -1clust_order 2 -86 anova+tlrc[4,5]

The first command will identify clusters of voxels that exceed an F-value of 5.45, using a cluster connectivity radius of 2mm and a minimum cluster size of 86 voxels, with sub-brick 5 as the thresholding brick, and will save the thresholded, clustered data in a new bucket dataset called group_by_face_clust+tlrc. The second command will do the same thing, except that all voxel intensities within a cluster will be replaced by the cluster size index (largest cluster=1, next=2, ...). This will be useful for extracting data from specific clusters later on. The reason why we specify the minimum cluster size as a negative number is that a negative value denotes cluster size in voxels. If we were to put a positive 86 there, it would denote a minimum cluster volume of 86 mm^3 (i.e. microliters).

Some of the other options of the 3dmerge command might be useful. In particular, you can experiment with the -1erode and -1dilate options, which are useful for separating clusters that are joined by a narrow “neck”.

We would typically want to carry out the same basic operation on all the other F-tests and contrasts of interest. When this is done, it is possible to combine all the resulting clustered datasets into one big bucket file using the command 3dbucket.

This basically completes the voxelwise group analysis of data.

One remaining step that you might want to take is to extract each subject’s individual percent signal change parameter estimates from the cluster regions identified in the preceding step, for entry into a statistics package or for graphing results. To do this, you can use the 3dmaskave command. Say, for example, that when looking at the group by face clusters identified in the preceding step, we were interested in a cluster in prefrontal cortex, which happens to be the 5th largest cluster in the group by face interaction dataset. Voxels in this cluster would therefore have a value of 5 in the group_by_face_clust_order+tlrc dataset. We can thus use this to extract the mean voxel percent signal change for individual subjects as follows:

3dmaskave -mask group_by_face_clust_order+tlrc -mrange 5 5 control1/glm_out_5mm+tlrc[4]

This command would print on the screen the mean percent signal change for this cluster for control subject 1 for the happy face condition. We could repeat this command for the other face conditions, as well as for all the other subjects. The mean percent signal change values could then be used to create graphs with error bars etc.

Alternatively, if we want to output the mean percent signal change for every cluster, we could use the following command:

3dROIstats -mask group_by_face_clust_order+tlrc control1/glm_out_5mm+tlrc[4]

This would cycle through each cluster in group_by_face_clust_order+tlrc and output the mean for each one for control subject 1, face condition happy.

Because these commands simply print their output to the screen, you’ll probably want to use them with the Linux redirection operators > and >> to save the output to a text file. Obviously these commands are best used in an iterative script that cycles through subjects and conditions.
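As a concrete sketch of such a script, the following bash loop writes out one 3dROIstats call per subject and condition (the control1..control8 directory names and the sub-brick mapping 4=happy, 5=fearful, 6=neutral are carried over from the ANOVA example; the commands go into a file so they can be inspected before being run):

```shell
#!/bin/bash
# Sketch: generate 3dROIstats calls for every control subject and face
# condition; when the generated script is run, all output is appended to
# cluster_means.txt. Sub-brick 4 = happy, 5 = fearful, 6 = neutral.
for subj in control1 control2 control3 control4 control5 control6 control7 control8; do
    for brick in 4 5 6; do
        echo "3dROIstats -mask group_by_face_clust_order+tlrc '${subj}/glm_out_5mm+tlrc[${brick}]' >> cluster_means.txt"
    done
done > extract_means.sh
# Inspect extract_means.sh, then run it with: sh extract_means.sh
```

The same pattern extends straightforwardly to the patient group, or to 3dmaskave with -mrange for a single cluster.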