mbfmri.data

Submodules

mbfmri.data.loader module

class mbfmri.data.loader.Normalizer(normalizer_name='none', std_threshold=2.58, clip=True, scale=[0, 1], use_absolute_value=False)

Bases: object

Callable class for normalizing BOLD signals or latent process signals. Standardization, rescaling, or min-max scaling can be applied here to improve the stability of MVPA model optimization.

Parameters
  • normalizer_name (str, default=“none”) – Name of the normalization method.

    • “none”: do nothing.

    • “standard”: standardize the data.

    • “rescale”: rescale values into the given scale (scale). Data in the range [MEAN - std_threshold*STD, MEAN + std_threshold*STD] will be rescaled to scale.

    • “minmax”: rescale values into the given scale (scale), so that the entire value range fits in the scale.

  • std_threshold (float, default=2.58) – Z-score value for thresholding the valid range of data. Used in “rescale” normalization.

  • clip (bool, default=True) – Whether to clip the value range. Used in “rescale” normalization.

  • scale ([float,float], default=[0,1]) – Tuple or list indicating the range, [lower bound, upper bound], for the data to be rescaled or fit in.

  • use_absolute_value (bool, default=False) – Whether to use the absolute value, i.e. abs(x) instead of the input x.
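The difference between “rescale” and “minmax” can be sketched in plain NumPy. This is an illustration of the behavior described above, not the library's implementation; the function names `rescale` and `minmax` and the handling of constant signals are assumptions made for the example.

```python
import numpy as np

def rescale(x, std_threshold=2.58, clip=True, scale=(0.0, 1.0)):
    """Illustrative 'rescale' normalization (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    mean, std = x.mean(), x.std()
    # Valid range: [MEAN - std_threshold*STD, MEAN + std_threshold*STD]
    lower = mean - std_threshold * std
    upper = mean + std_threshold * std
    if upper == lower:
        # Assumption: a constant signal maps to the lower bound of the scale.
        return np.full_like(x, scale[0])
    # Map [lower, upper] linearly onto [scale[0], scale[1]].
    out = (x - lower) / (upper - lower) * (scale[1] - scale[0]) + scale[0]
    if clip:
        # Values outside the valid range are clipped to the scale bounds.
        out = np.clip(out, scale[0], scale[1])
    return out

def minmax(x, scale=(0.0, 1.0)):
    """Illustrative 'minmax' normalization (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        return np.full_like(x, scale[0])
    # The observed minimum and maximum land exactly on the scale bounds.
    return (x - lo) / (hi - lo) * (scale[1] - scale[0]) + scale[0]

signal = np.array([0.0, 1.0, 2.0, 3.0, 100.0])
rescaled = rescale(signal)   # outlier-tolerant mapping based on mean and std
fitted = minmax(signal)      # whole observed range fits the scale
```

With “rescale”, the mean of the signal maps to the midpoint of the scale, and clipping only matters for values beyond the z-score threshold.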

class mbfmri.data.loader.Binarizer(positive_range, negative_range, use_ratio=True)

Bases: object

Callable class for binarizing latent process signals. Users can specify a range for the positive class (1) and a range for the negative class (0). Data points in the positive range are labeled “1”, and those in the negative range are labeled “0”. This class is applied when processing target data for logistic models. Any signals with values outside both ranges are marked as “-1” and are ignored when training the models.

Parameters
  • positive_range ([float,float] or float) – Range, [lower bound, upper bound], for the positive logistic value (“1”). If use_ratio is True, the bounds in the given range are rates, which should fall into [0,1]. If only one value is given, [positive_range, 1 or Inf] will be used.

  • negative_range ([float,float] or float) – Range, [lower bound, upper bound], for the negative logistic value (“0”). If use_ratio is True, the bounds in the given range are rates, which should fall into [0,1]. If only one value is given, [0 or -Inf, negative_range] will be used.

  • use_ratio (bool, default=True) – Whether the boundary values in the ranges are given as ratios in [0,1].
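A minimal sketch of the labeling rule described above, written in plain NumPy rather than taken from the library. The function name `binarize` is hypothetical, and reading the ratio bounds as quantile positions of the signal is an assumption; the source only states that the bounds are rates in [0,1].

```python
import numpy as np

def binarize(y, positive_range, negative_range, use_ratio=True):
    """Illustrative binarization: 1 = positive, 0 = negative, -1 = ignored."""
    y = np.asarray(y, dtype=float)
    # A single bound expands to [positive_range, 1 or Inf] / [0 or -Inf, negative_range].
    if np.isscalar(positive_range):
        positive_range = (positive_range, 1.0 if use_ratio else np.inf)
    if np.isscalar(negative_range):
        negative_range = (0.0 if use_ratio else -np.inf, negative_range)
    if use_ratio:
        # ASSUMPTION: ratio bounds are interpreted as quantiles of the signal.
        positive_range = np.quantile(y, positive_range)
        negative_range = np.quantile(y, negative_range)
    labels = np.full(y.shape, -1, dtype=int)  # default: outside both ranges
    labels[(y >= negative_range[0]) & (y <= negative_range[1])] = 0
    labels[(y >= positive_range[0]) & (y <= positive_range[1])] = 1
    return labels

y = np.arange(10.0)
labels = binarize(y, positive_range=0.8, negative_range=0.2)
valid = labels != -1  # samples marked -1 would be dropped before training
```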

class mbfmri.data.loader.BIDSDataLoader(layout, subjects='all', sessions='all', task_name=None, process_name='unnamed', feature_name='unnamed', voxel_mask_path=None, reconstruct=False, y_normalizer='none', y_scale=[0, 1], y_std_threshold=2.58, y_clip=False, y_use_absolute_value=False, X_normalizer='none', X_use_absolute_value=False, X_scale=[0, 1], X_std_threshold=2.58, X_clip=False, logistic=False, binarizer_positive_range=None, binarizer_negative_range=None, binarizer_use_ratio=True, verbose=1)

Bases: object

BIDSDataLoader loads preprocessed fMRI and behavioral data. The files for voxel features, BOLD-like signals of the latent process, time masks, and a voxel mask are aggregated subject-wise. A tensor X with shape (time, voxel_feature_num) and a tensor y with shape (time,) are prepared for each subject. Users can get dictionaries for X and y indexed by the corresponding subject IDs. Temporal masking is also applied to both X and y along the time dimension, using the time mask file. Check the code below for details.
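As a rough illustration of the temporal masking step, the sketch below applies one time mask to both X and y using NumPy boolean indexing. The array contents and the convention that 1 means "keep this time point" are assumptions for the example; the loader's actual file handling is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n_time, n_voxel = 6, 4
X = rng.normal(size=(n_time, n_voxel))     # voxel features, shape (time, voxel_feature_num)
y = rng.normal(size=n_time)                # latent process signal, shape (time,)
time_mask = np.array([1, 1, 0, 1, 0, 1])   # assumed convention: 1 = keep

keep = time_mask.astype(bool)
# The same mask is applied along the time dimension of both arrays,
# so rows of X stay aligned with entries of y.
X_masked, y_masked = X[keep], y[keep]
```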

Parameters
  • layout (str or pathlib.PosixPath or bids.layout.layout.BIDSLayout) – BIDS layout for retrieving MB-MVPA preprocessed data files. Users can input the root for entire BIDS layout with original data, or the root for MB-MVPA derivative layout.

  • subjects (list of str or “all”,default=”all”) – List of subject IDs to load. If “all”, all the subjects found in the layout will be loaded.

  • task_name (str, default=None) – Name of the task. If not given, ignored in searching through BIDS layout.

  • process_name (str, default=”unnamed”) – Name of the target latent process. If not given, ignored in searching through BIDS layout.

  • feature_name (str, default=”unnamed”) – Name for indicating preprocessed feature. If not given, ignored in searching through BIDS layout.

  • voxel_mask_path (str or pathlib.PosixPath, default=None) – Path for voxel mask file. If None, then find it from default path, “MB-MVPA_ROOT/voxelmask_{feature_name}.nii.gz”

  • reconstruct (boolean, default=False) – Flag indicating whether to reshape flattened voxel features into 4D images. This is normally required for running CNNs.

  • y_normalizer (str, default=“none”) – Name of the normalization method for latent process signals (y).

    • “none”: do nothing.

    • “standard”: standardize the data.

    • “rescale”: rescale values into the given scale (scale). Data in the range [MEAN - std_threshold*STD, MEAN + std_threshold*STD] will be rescaled to scale.

    • “minmax”: rescale values into the given scale (scale), so that the entire value range fits in the scale.

  • y_std_threshold (float, default=2.58) – Z-score value for thresholding the valid range of latent process signals (y). Used in “rescale” normalization.

  • y_clip (bool, default=False) – Whether to clip the value range. Used in “rescale” normalization.

  • y_scale ([float,float], default=[0,1]) – Tuple or list indicating the range, [lower bound, upper bound], for the data to be rescaled or fit in.

  • y_use_absolute_value (bool, default=False) – Whether to use the absolute value.

  • X_normalizer (str, default=”none”) – normalizer for voxel signals (X).

  • X_std_threshold (float, default=2.58) – std_threshold for voxel signals (X).

  • X_clip (bool, default=False) – clip for voxel signals (X).

  • X_scale ([float,float], default=[0,1]) – scale for voxel signals (X).

  • X_use_absolute_value (bool, default=False) – use_absolute_value for voxel signals (X).

  • logistic (bool, default=False) – Whether the data (X,y) is required to take logistic values. Set to True when a logistic model is used.

  • binarizer_positive_range ([float,float] or float) – Range, [lower bound, upper bound], for the positive logistic value (“1”). If use_ratio is True, the bounds in the given range are rates, which should fall into [0,1]. If only one value is given, [positive_range, 1 or Inf] will be used. Only valid when logistic is True.

  • binarizer_negative_range ([float,float] or float) – Range, [lower bound, upper bound], for the negative logistic value (“0”). If use_ratio is True, the bounds in the given range are rates, which should fall into [0,1]. If only one value is given, [0 or -Inf, negative_range] will be used. Only valid when logistic is True.

  • binarizer_use_ratio (bool, default=True) – Whether the boundary values in the ranges are given as ratios in [0,1]. Only valid when logistic is True.

  • verbose (int, default=1) – Level of verbosity. Currently, if verbose > 0, all information is printed while loading.
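To make the reconstruct option more concrete, the following sketch shows one way flattened voxel features could be placed back into image volumes with a boolean voxel mask. It is an assumption-laden illustration in NumPy: the mask shape, the voxel ordering, and the (time, x, y, z) output layout are chosen for the example and may differ from what the loader does.

```python
import numpy as np

# Hypothetical 2x2x2 voxel mask with three voxels inside the mask.
mask = np.zeros((2, 2, 2), dtype=bool)
mask[0, 0, 0] = mask[0, 1, 0] = mask[1, 1, 1] = True

# Flattened features: (time, voxel_feature_num), one column per masked voxel.
X = np.arange(6.0).reshape(2, 3)

# Allocate one empty volume per time point and scatter the features
# into the masked positions; voxels outside the mask stay at zero.
volumes = np.zeros((X.shape[0],) + mask.shape)
volumes[:, mask] = X
```

Indexing `volumes[:, mask]` again recovers the flattened array, so the two representations carry the same information.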

get_data(subject_wise=True)

Get loaded data.

Parameters

subject_wise (bool, default=True) – Whether the data dictionaries indexed by subject ID are required. If not, a single concatenated array will be returned for each of X and y.
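The relationship between the two return forms can be sketched as follows. The dictionaries, subject IDs, and the helper `concatenate_subjects` are hypothetical; the sketch only shows how subject-wise arrays might be stacked along the time dimension, and the ordering of subjects is an assumption.

```python
import numpy as np

# Hypothetical subject-wise data: each subject has its own number of time points.
X_by_subject = {"01": np.zeros((5, 3)), "02": np.ones((7, 3))}
y_by_subject = {"01": np.zeros(5), "02": np.ones(7)}

def concatenate_subjects(X_dict, y_dict):
    """Stack all subjects along the time axis (illustrative helper)."""
    subjects = sorted(X_dict)  # assumed ordering: sorted subject IDs
    X = np.concatenate([X_dict[s] for s in subjects], axis=0)
    y = np.concatenate([y_dict[s] for s in subjects], axis=0)
    return X, y

X_all, y_all = concatenate_subjects(X_by_subject, y_by_subject)
```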

get_voxel_mask()

Get loaded voxel mask

mbfmri.data.tf_generator module

class mbfmri.data.tf_generator.DataGenerator(X, y, batch_size, shuffle=True)

Bases: tensorflow.python.keras.utils.data_utils.Sequence

Data generator required for fitting Keras model. This is just a simple wrapper of generating preprocessed fMRI data (\(X\)) and BOLD-like target data (\(y\)).

Please refer to the below links for examples of using DataGenerator for Keras deep learning framework.

This class also generates chunks of data called batches, each of which aggregates the specified number (batch_size) of data samples (X,y). Partitioning the data into small chunks is intended for mini-batch gradient descent (or stochastic gradient descent). Please refer to the below link for the framework.


on_epoch_end()

Updates indexes after each epoch

__len__()

Denotes the number of batches per epoch

__getitem__(index)

Get a batch of data X, y
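The three methods above follow the usual batch-generator protocol. The sketch below mimics that protocol in plain NumPy, without subclassing the Keras Sequence class, so it runs without TensorFlow. The class name `SimpleBatchGenerator`, the `seed` argument, and keeping the last partial batch are assumptions for illustration, not the behavior of DataGenerator itself.

```python
import numpy as np

class SimpleBatchGenerator:
    """Illustrative stand-in for a Sequence-style batch generator."""

    def __init__(self, X, y, batch_size, shuffle=True, seed=0):
        self.X, self.y = np.asarray(X), np.asarray(y)
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.rng = np.random.default_rng(seed)  # hypothetical: fixed seed for repeatability
        self.indexes = np.arange(len(self.X))
        self.on_epoch_end()

    def on_epoch_end(self):
        # Reshuffle the sample order so each epoch sees different batches.
        if self.shuffle:
            self.rng.shuffle(self.indexes)

    def __len__(self):
        # Number of batches per epoch (assumption: the last partial batch is kept).
        return int(np.ceil(len(self.X) / self.batch_size))

    def __getitem__(self, index):
        # Select the sample indices belonging to the requested batch.
        start = index * self.batch_size
        batch = self.indexes[start:start + self.batch_size]
        return self.X[batch], self.y[batch]

X = np.arange(20.0).reshape(10, 2)
y = np.arange(10.0)
gen = SimpleBatchGenerator(X, y, batch_size=4, shuffle=False)
first_X, first_y = gen[0]
```

A training loop would iterate over `range(len(gen))`, fetch `gen[i]` for each step, and call `on_epoch_end()` once per epoch.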