preprocess

Purpose:

Scan and identify input data, assign options and out paths based on a template file, and then reconstruct, convert, motion-correct, and fieldmap-correct as necessary.

Usage:

Usage: preprocess [-hvRsbademfcADEMFC] <path-to-data>

Options:


  -h, --help            Show this help message and exit
  -v, --verbose         Print useless stuff to screen.
  -R, --recompute       Recompute all files.
  -s SKIP, --skip=SKIP  Number of frames to skip at the beginning of
                        each run. Can be overridden from command line.
  -b, --bashonly        Create bash file but do not execute.
  -a, --anat            Convert structural images.
  -d, --dti             Process dti images.
  -e, --epi             Reconstruct epi images.
  -m, --motion          Motion-correct epi images.
  -f, --fmap            Fieldmap-correct epi images.
  -c, --compute_fmap    Compute fieldmap-correction.
  -A, --Anat            Convert structural images, (recompute).
  -D, --Dti             Process dti images, (recompute).
  -E, --Epi             Reconstruct epi images, (recompute).
  -M, --Motion          Motion-correct epi images, (recompute).
  -F, --Fmap            Fieldmap-correct epi images, (recompute).
  -C, --Compute_fmap    Compute fieldmap-correction, (recompute).

Operation

The program begins by looking for a template file as described below. The template file defines options and the output directory structure. The script then examines every file below the directory specified on the command line to determine if it contains data the script knows how to process. A data structure is built during this process that contains the input data, parameters, and output filename for each action to be taken. This is stored in the "log" directory in the file "log/preprocess_info.yaml". This file is human-readable and can be used to see exactly what the script did. The program is designed to work with any directory structure, but it assumes that all data below the path you specify belongs to the same subject.

After all files have been scanned, the program sequentially processes each data item. Three log files are created: one contains all commands that are executed (preprocess.bsh), another contains all commands that aborted (preprocess_failed.log), and a third contains the output of every command (preprocess.log).

The default behaviour is to process everything unless the output file already exists. Everything will be recomputed if the -R option is given. Specific types of processing can be selected by using the options listed above. In all cases, the lower-case option will not overwrite existing data; the upper-case option will. The "skip" option on the command line will override the value in a template file.
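
The overwrite rule and the skip precedence described above can be sketched in a few lines of Python. This is an illustration only, not the script's actual code: the function names should_process and resolve_skip and the output file name are hypothetical.

```python
import os
import tempfile


def should_process(output_path, recompute=False):
    """Decide whether a processing step should run.

    A step is skipped when its output already exists, unless
    recomputation was requested (-R or an upper-case option).
    """
    return recompute or not os.path.exists(output_path)


def resolve_skip(cli_skip, template_skip):
    """The command-line value, when given, wins over the template value."""
    return template_skip if cli_skip is None else cli_skip


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        existing = os.path.join(tmp, "run_1+orig.HEAD")  # hypothetical output name
        open(existing, "w").close()
        print(should_process(existing))                  # output exists, no recompute
        print(should_process(existing, recompute=True))  # recompute requested
    print(resolve_skip(None, 5))  # no -s given: template value is used
    print(resolve_skip(3, 5))     # -s 3 given: command line wins
```

Comparing against None rather than testing truthiness means a command-line value of 0 still overrides the template.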

Templates:

Where to put them

The program will begin looking for a template file in the directory specified on the command line, i.e., the directory containing the "anatomicals" and "raw" subdirectories. If none is found it will look in the next directory above it.

Example:

If your raw data are stored in /study/mystudy with directories for each subject named "sub001, sub002, ...", you should put a global template file in /study/mystudy. The preprocess script is run when the data are uploaded from the scanner, so it will find the template in this directory and, for a typical scan, will correctly process the data. For special cases, say where a session has to be stopped early or when the EPIs are run out of order, a subject-specific template file can be put in that subject's directory and then it will be the one found. This makes it possible to use one template for all subjects or to use individual templates for specific subjects.
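
The lookup order described above (subject directory first, then the directory above it) can be sketched as follows. This is a minimal illustration, not the script's own code: the names is_template and find_template and the levels_up parameter are hypothetical, and it assumes any file whose first line begins with "#!fmri_file_template" counts as a template, whatever the file is called.

```python
import os
import tempfile

MARKER = "#!fmri_file_template"  # first-line marker named in this document


def is_template(path):
    """True if the file's first line begins with the template marker."""
    try:
        with open(path, "r", errors="replace") as fh:
            return fh.readline().startswith(MARKER)
    except OSError:
        return False


def find_template(data_dir, levels_up=1):
    """Search data_dir, then up to `levels_up` parent directories.

    Returns the path of the first template found, or None.
    """
    current = os.path.abspath(data_dir)
    for _ in range(levels_up + 1):
        for name in sorted(os.listdir(current)):
            candidate = os.path.join(current, name)
            if os.path.isfile(candidate) and is_template(candidate):
                return candidate
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as study:
        subject_dir = os.path.join(study, "sub001")
        os.mkdir(subject_dir)
        with open(os.path.join(study, "template.yaml"), "w") as fh:
            fh.write(MARKER + "\n---\n")
        # No template in sub001, so the study-level one is returned.
        print(find_template(subject_dir))
```

Because the subject directory is searched first, a subject-specific template shadows the study-wide one without any extra configuration.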

Syntax

This template defines the naming convention to be used by the preprocess script. Edit the fields in the "Value" column to change the file structure. This file follows the "yaml" (yet-another-markup-language) syntax. We chose this method because it is the geekiest sounding format name we could find. Secondary considerations were that it is readable and editable by humans but can be easily read into a Python data structure.

For this format, indentation is very important. That is how it differentiates between attributes and sub-attributes. There are only a few features of the syntax that we use. They are:

  • The file will only be recognized as a template file if the first line begins with the string "#!fmri_file_template". This isn't yaml syntax - it is detected by the preprocess script.
  • The "---" characters are yaml delimiters and should be left alone.
  • The colon (:) after each keyword is a yaml delimiter and is equivalent to an equal sign.
  • The brackets delimit software list structures. They are used to hold lists of filenames. There must be commas between each name in the list.
  • The items such as "faces: &id001" mark the beginnings of a single substructure. It is important to maintain the same indentation for each element of the structure. The name of these substructures ("faces" here) is not used directly as a file name but is used internally by the preprocess script.
  • Comments are the same as in bash scripts - they start with "#" and can be put anywhere.
  • Subject number must be in double quotes. If quotes are not used, the software will interpret any subject id starting with zero in the base 8 (octal) numbering system and yield weird results. The preprocess script will catch this error.
  • Setting the subject tag to "same" will create the EPI subdirectories in the directory specified by "proc". See the example below.
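
The octal problem mentioned in the list above is easy to reproduce: read as base 8, the digits 064 mean 52, so an unquoted subject "064" would silently become a different number. The sketch below shows the misreading and a simple quoting check. It is illustrative only: octal_misread and subject_is_quoted are hypothetical helpers, not the check the preprocess script performs.

```python
import re


def octal_misread(subject_text):
    """Value obtained when a zero-led digit string is read as base 8."""
    return int(subject_text, 8)


# Matches a top-level "subject:" key and captures its value, which is
# either a double-quoted string or a bare token.
SUBJECT_LINE = re.compile(r'^subject:\s*(?P<value>"[^"]*"|\S+)')


def subject_is_quoted(template_text):
    """True if the top-level `subject:` value is wrapped in double quotes."""
    for line in template_text.splitlines():
        match = SUBJECT_LINE.match(line)
        if match:
            value = match.group("value")
            return len(value) >= 2 and value.startswith('"') and value.endswith('"')
    raise ValueError("no top-level 'subject:' key found")


if __name__ == "__main__":
    print(octal_misread("064"))                      # 6*8 + 4 = 52
    print(subject_is_quoted('subject: "064"  # ok'))
    print(subject_is_quoted("subject: 064"))
```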

Default template

Download this template and edit it to create your own.


#!fmri_file_template    # The first part of this line MUST be present or
#                         this file won't be recognized as a template.

# Template defining the naming convention to be used by the 
# preprocess script. Edit the fields in the "Value" column to change the file structure.
# This file follows the "yaml" (yet-another-markup-language) syntax. For this format,
# indentation is very important.  That is how it differentiates between attributes and
# sub-attributes.  The only other special syntax in the file below is "---", the colon 
# after each tag-name, the brackets around lists of items separated by commas, and the 
# "&id00n" variables.  These latter variables should be numbered sequentially as below.
#
# Here are some basic syntax rules:
# 1. The code "#!fmri_file_template" must appear at the beginning of the first line of the file.
#
# 2. The code "---" is a delimiter used by yaml. Keep them where they are.
#
# 3. Indentation is important.  Always use the same indentation for a given block.  For
#                               example, the "anat" block has two items, "outdir" and
#                               "format".  The parser figures this out because both
#                               are indented 4 spaces from the left margin.
#
# 4. The colon is used instead of an equal. 
#
# 5. A space must follow the colon.
#
# 6. Brackets denote a list.  Each member of the list must be followed by a comma.
#
# 7. The &id_*** variables are required by the parser. Each of the lines where it appears
#    (e.g. anat:) defines the beginning of a Python dictionary, and the indented elements
#    after it are the dictionary's members.  The name of the dictionary and its "id" must 
#    be unique. For example, epis could have names of epi1, epi2 ... and id's of &id001, &id002, ...
#
# File type codes: brik=BRIK, nii=one-file nifti, ni1 = two_file nifti

#keyword   Value                  Meaning
#-------   -------                -------
---
# Global variables.
top_outdir: ""         # Directory for output data (defaults to raw data directory)
subject: "same"        # Subdirectory for processed data. MUST be in quotes.
                       # If set to "same" use the same name as the data, e.g.,
                       #     if the data are in /study/mystudy/sub001, and proc=/study/mystudy/processed,
                       #     the data will be stored in /study/mystudy/processed/sub001.
fsl_flip: False        # If true, all output images will be flipped physically such that they are 
                       # in LPI, PSL, or LSP orientation.  This is a workaround for a bug in fslview that
                       # requires this orientation. The header will correctly represent the orientation,
                       # so files can still be viewed in AFNI, SPM, VoxBo, or mricron.

# Structural images:
anat: &id_anat         # Structural image info.
    outdir: anat       # Directory where anatomical images should be stored.
    format: brik       # File format for structural images. 'brik', 'nii', or 'n+1'

# DTI processing
dti: &id_dti           # DTI info
    outdir: dti        # Directory where DTI images should be stored.
    format: nii        # Default type is nifti one-file.
    pepolar: 0         # Default phase encode direction. (pe axis read from header.)

# Log file location.
logdir: log            # Directory for log files

# Field maps Processing
fmap: &id_fmap         # Fieldmap info
    outdir: fieldmap   # Directory where fieldmaps should be stored.
    echo_spacing: .688

# EPI processing.
first_epi: epi_setup   # Directory for first two EPI images.
epi_type: brik         # Output epi file type.
skip: 5                # Number of frames to skip.
epi_motion_interp: -Fourier # Interpolation method argument for 3dvolreg.
epi_file_format: brik  # Format of final epi files.
email: noname@wisc.edu # Email address where completion status should be sent. Set
                       #       to "noname" or "noname@whatever.whatever" for no email.
epidir_dflt: &id001       # First set of epis.
    type: epi          # Type of data. The only value currently used is "epi"
    acq_order: 0       # Acquisition order. 
    outdir: "run_1"    # Directory for first set of epis.
    names: [epi_run1, epi_run2, epi_run3, epi_run4, epi_run5, epi_run6, ] # EPI  names.
    pepolar: 0         # 0 = default phase-encode direction, 1 = reversed.
---

Example of study containing a faces and a go-nogo task.

#!fmri_file_template   # The first part of this line MUST be present.
---
top_outdir: /study/jjo/tmp/BRDEVEL/processed
subject: "064"         # Subdirectory for processed data. MUST be in quotes.
                       # If set to "same" use the same name as the data, e.g.,
                       #     if the data are in /study/mystudy/sub001, and proc=/study/mystudy/processed,
                       #     the data will be stored in /study/mystudy/processed/sub001.
fsl_flip: False

# Structural images:
anat: &id_anat
    outdir: anat
    format: brik

# DTI processing
dti: &id_dti           # Directory for all dti data.
    outdir: dti
    format: nii        # Default type is nifti one-file.
    pepolar: 0         # Default phase encode direction. (pe axis read from header.)

# Log file location.
logdir: log            # Directory for log files

# Field maps Processing
fmap: &id_fmap         # Directory for all fieldmaps.
    outdir: fieldmap
    echo_spacing: .688

# EPI processing.
first_epi: epi_setup   # Directory for first two EPI images.
epi_type: brik         # Output epi file type.
skip: 5                # Number of frames to skip.
epi_motion_interp: -Fourier
epi_file_format: brik
email: ollinger@wisc.edu
faces: &id001          # First set of epis.
    type: epi          # Type of data. The only value currently used is "epi"
    acq_order: 1       # Acquisition order. Is it the first, second, third etc set of epi runs.
    outdir: fMRI/faces # Directory
    names: [faces_run1, faces_run2] # List of names.
gonogo: &id002         # Second set of epis.
    type: epi          # Type of data. The only value currently used is "epi"
    acq_order: 2       # Acquisition order. Is it the first, second, third etc set of epi runs.
    outdir: fMRI/gonogo # Directory for second set of epis.
    names: [gonogo_run1, gonogo_run2] # List of names.
    pepolar: 0
---

This file would yield the directory structure:

                             /study/jjo/tmp/BRDEVEL/processed/064
                                           |
                                           |
   ---------------------------------------------------------------------------------
   |               |             |        |         |                |             | 
 anat          fieldmap         dti      log       fMRI           epi_setup       log
                                                    |
                                                    |
                                         -----------------------
                                         |                     |
                                       gonogo                faces
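
The way the template fields combine into the tree above can be sketched in Python. This is a rough illustration under stated assumptions, not the script's implementation: output_dirs is a hypothetical helper, the dictionary is hand-copied from the example template rather than parsed from yaml, and the special subject value "same" is not handled.

```python
import posixpath

# Values hand-copied from the faces/go-nogo example template.
template = {
    "top_outdir": "/study/jjo/tmp/BRDEVEL/processed",
    "subject": "064",
    "anat": {"outdir": "anat"},
    "dti": {"outdir": "dti"},
    "fmap": {"outdir": "fieldmap"},
    "logdir": "log",
    "first_epi": "epi_setup",
    "faces": {"type": "epi", "acq_order": 1, "outdir": "fMRI/faces"},
    "gonogo": {"type": "epi", "acq_order": 2, "outdir": "fMRI/gonogo"},
}


def output_dirs(tmpl):
    """List the output directories implied by a template dictionary.

    Every path is <top_outdir>/<subject>/<outdir>; logdir and first_epi
    are top-level keys, the rest come from blocks that carry "outdir".
    """
    root = posixpath.join(tmpl["top_outdir"], tmpl["subject"])
    relative = [tmpl["logdir"], tmpl["first_epi"]]
    for value in tmpl.values():
        if isinstance(value, dict) and "outdir" in value:
            relative.append(value["outdir"])
    return sorted(posixpath.join(root, rel) for rel in relative)


if __name__ == "__main__":
    for path in output_dirs(template):
        print(path)
```

Each EPI block contributes its own "outdir", which is why "faces" and "gonogo" both land under the shared fMRI directory.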

A template file for a study with 6 epi runs where each is stored in its own directory would be:

#!fmri_file_template
#keyword   Value                  Meaning
---
top_outdir: /study/jjo/tmp/pain_regulation/tmp_outdir
subject: "same"        # Subdirectory for processed data. MUST be in quotes.
                       # If set to "same" use the same name as the data, e.g.,
                       #     if the data are in /study/mystudy/sub001, and proc=/study/mystudy/processed,
                       #     the data will be stored in /study/mystudy/processed/sub001.
fsl_flip: False

# Structural images:
anat: &id_anat
    outdir: anat
    format: brik

# DTI processing
dti: &id_dti           # Directory for all dti data.
    outdir: dti
    format: nii        # Default type is nifti one-file.
    pepolar: 0         # Default phase encode direction. (pe axis read from header.)

# Log file location.
logdir: log            # Directory for log files

# Field maps Processing
fmap: &id_fmap         # Directory for all fieldmaps.
    outdir: fieldmap
    echo_spacing: .688

# EPI processing.
first_epi: epi_setup   # Directory for first two EPI images.
epi_type: brik         # Output epi file type.
skip: 5                # Number of frames to skip.
epi_motion_interp: -Fourier
epi_file_format: brik
email: ollinger@wisc.edu
outdir_1: &id001       # First set of epis.
    type: epi          # Type of data. The only value currently used is "epi"
    acq_order: 1       # Acquisition order.
    outdir: "run_1"    # Directory for this set of epis.
    names: [run_1]     # List of names.
outdir_2: &id002       # Second set of epis.
    type: epi          # Type of data. The only value currently used is "epi"
    acq_order: 2       # Acquisition order.
    outdir: "run_2"    # Directory for this set of epis.
    names: [run_2]     # List of names.
outdir_3: &id003       # Third set of epis.
    type: epi          # Type of data. The only value currently used is "epi"
    acq_order: 3       # Acquisition order.
    outdir: "run_3"    # Directory for this set of epis.
    names: [run_3]     # List of names.
outdir_4: &id004       # Fourth set of epis.
    type: epi          # Type of data. The only value currently used is "epi"
    acq_order: 4       # Acquisition order.
    outdir: "run_4"    # Directory for this set of epis.
    names: [run_4]     # List of names.
outdir_5: &id005       # Fifth set of epis.
    type: epi          # Type of data. The only value currently used is "epi"
    acq_order: 5       # Acquisition order.
    outdir: "run_5"    # Directory for this set of epis.
    names: [run_5]     # List of names.
outdir_6: &id006       # Sixth set of epis.
    type: epi          # Type of data. The only value currently used is "epi"
    acq_order: 6       # Acquisition order.
    outdir: "run_6"    # Directory for this set of epis.
    names: [run_6]     # List of names.
---

With a directory structure:

                              /study/jjo/tmp/pain_regulation_test/processed
                                                    |
                                                    |
            ---------------------------------------------------------------------------------
            |               |             |        |         |                |             |
          anat          fieldmap         dti      log        |             epi_setup       log
            |               |                                |
    -------------- fieldmap_sagittal.nii                     |
    |            |                                           |
T1High+orig   T2+orig                                        |
                                                             |
                                 --------------------------------------------------------
                                 |          |          |          |          |          |
                               run_1      run_2      run_3      run_4      run_5      run_6
                                 |          |          |          |          |          |
                                                       |
                                        -------------------------------
                                        |              |              |
                                   run_3+orig    run_3_m+orig   run_3_mf+orig

Last modified July 15, 2008