DeepMIB - Directories and Preprocessing tab
This tab allows choosing directories with images for training and prediction, as well as various parameters used during image loading and preprocessing.
Widgets and settings
Directory with images and models for training
[ used only for training ]
Use these widgets to select the directory that contains images and models to be used for training. For the organization of directories, see the schemes below.
For 2D networks the files should contain individual 2D images, while for 3D networks they should contain individual 3D datasets.
The extension ▼ dropdown menu on the right-hand side can be used to specify the extension of the image files.
The [✓] Bio checkbox toggles between the standard and Bio-Formats readers for loading images. If the Bio-Formats file is a collection of images, the Index... edit box can be used to specify the index of the file within the container.
For better performance, it is recommended to convert Bio-Formats compatible images to standard formats or to use the Preprocessing option.
Important notes considering training files
- The number of model or mask files should match the number of image files (one exception is 2D networks, where it is allowed to have a single model file in MIB *.model format when Single MIB model file is ticked). This option requires data preprocessing
- For models in standard image formats it is important to specify the number of classes, including the Exterior, in the Number of classes edit box
- Important! It is not possible to use numbers as names of materials; please name materials in a sensible way when using the *.model format!
Tip! If you have only one segmented dataset, you can split it into several datasets using the Menu->File->Chopped images->Export operation.
Directory with images for prediction
[ used only for prediction ]
Use these widgets to specify the directory with images for prediction (named 2_Prediction in the file organization schemes below).
The image files should be placed under the Images subfolder. Optionally, when ground truth models for the prediction images are available, they can be placed under the Labels subfolder.
When the preprocessing mode is used, the images from this folder are converted and saved to the 3_Results\PredictionImages directory.
For 2D networks the files should contain individual 2D images or 3D stacks, while for 3D networks individual 3D datasets.
When the ground truth labels are present, they are also processed and copied to 3_Results\PredictionImages\GroundTruthLabels. These models can be used for evaluation of results (see the Predict tab for details).
The extension ▼ dropdown menu on the right-hand side can be used to specify the extension of the image files. The [✓] Bio checkbox toggles between the standard and Bio-Formats readers for loading the images. If the Bio-Formats file is a collection of images, the Index edit box can be used to specify the index of the file within the container.
Description of additional settings
- The [✓] Single MIB model file checkbox (2D networks only): tick it when using a single model file with segmentations
- The Model extension ▼ dropdown (2D networks only) is used to select the extension of files containing models. For 3D networks the MIB *.model format is used
- The Number of classes edit box (TIF or PNG formats only) is used to define the number of classes (including Exterior) in models. For model files in MIB *.model format this field is not used
- The [✓] Use masking checkbox is used when some parts of the training data should be excluded from training. The masks may be provided in various formats, and the number of mask files should match the number of image files. *Note!* Masking may reduce training precision due to inconsistency within the image patches
- The Mask extension ▼ dropdown is used to select the extension of files that contain masks. For 3D networks only the MIB *.mask format is supported
Directory with resulting images
Use these widgets to specify the main output directory; results and all preprocessed images are stored there.
All subfolders inside this directory are automatically created by Deep MIB:
Description of directories created by DeepMIB
- PredictionImages, place for the preprocessed images for prediction
- PredictionImages\GroundTruthLabels, place for ground truth models for prediction images, when available
- PredictionImages\ResultsModels, the main output directory with generated models after prediction. The 2D models can be combined in MIB by selecting the files using ⇧ Shift+left mouse click during loading
- PredictionImages\ResultsScores, folder for generated prediction scores (probability) for each material. The score values are scaled between 0 and 255
- ScoreNetwork, for accuracy and loss score plots, when the Export training plots option of the Train tab is ticked and for storing checkpoints of the network after each epoch (or specified frequency), when the [✓] Save progress after each epoch checkbox is ticked
- TrainImages, images to be used for training (only for preprocessing mode)
- TrainLabels, models accompanying images to be used for training (only for preprocessing mode)
- ValidationImages, images to be used for validation during training (only for preprocessing mode)
- ValidationLabels, models accompanying images for validation (only for preprocessing mode)
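The score scaling mentioned for the ResultsScores folder is a simple linear mapping. As a sketch (the helper function is hypothetical and not part of DeepMIB), a stored score can be converted back to a probability like this:

```python
def score_to_probability(score):
    """Map a prediction score stored in the 0-255 range back to a
    probability in [0, 1] (a sketch; this helper is not part of DeepMIB)."""
    return score / 255

print(score_to_probability(255))  # 1.0
print(score_to_probability(0))    # 0.0
```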
Details of additional widgets
- [✓] Compress processed images, tick to compress the processed images. The processed images are stored in *.mibImg format that can be loaded in MIB. *.mibImg is a variation of the standard MATLAB format and can also be loaded directly into MATLAB using a command similar to this: res = load('img01.mibImg', '-mat');. Compression of images slows down performance!
- [✓] Compress processed models, tick to compress models during preprocessing. The processed models are stored in *.mibCat format that can be loaded in MIB (Menu->Models->Load model). It is a variation of the standard MATLAB format, where the model is encoded using the categorical class of MATLAB. Compression of models slows down performance but brings the significant benefit of small file sizes
- [✓] Use parallel processing, when ticked DeepMIB uses multiple cores to process images. The number of cores can be specified using the Workers edit box. Parallel processing brings a significant decrease in the time required for preprocessing
- Fraction of images for validation, defines the fraction of images that will be randomly (see Random generator seed) assigned to the validation set. When set to 0, validation will not be used during training
- Random generator seed, a number used to initialize the random seed generator, which defines how the images for training and validation are split. For reproducibility of tests keep the value fixed. When the random seed is initialized with 0, the generator is shuffled based on the current system time
- Preprocess for ▼, selects the mode of operation upon press of the Preprocess button. Results of the preprocessing operation for each mode are presented in the schemes below
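The interplay of the validation fraction and the random generator seed can be sketched as follows. This is only an illustration of the behavior described above, not DeepMIB's actual code; the file names and the split_files helper are hypothetical:

```python
import random

def split_files(filenames, validation_fraction, seed):
    """Reproducibly split file names into training and validation sets,
    mimicking the 'Fraction of images for validation' and
    'Random generator seed' settings (a sketch, not DeepMIB's own code)."""
    files = sorted(filenames)      # fixed starting order
    rng = random.Random(seed)      # same seed -> same split
    rng.shuffle(files)
    n_val = round(len(files) * validation_fraction)
    return files[n_val:], files[:n_val]   # train, validation

images = [f"img{i:02d}.tif" for i in range(10)]  # hypothetical file names
train, val = split_files(images, 0.2, seed=42)
print(len(train), len(val))  # 8 2
```

Running the split again with the same seed returns exactly the same partition, which is why keeping the seed fixed makes tests reproducible.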
Preprocessing of files
Originally, preprocessing of files in DeepMIB was required for most workflows. Currently, however, DeepMIB can work with unprocessed images in most cases: use the Preprocessing is not required options.
When the preprocessing step is required or recommended
The preprocessing is recommended/required in the following situations:
- when labels are stored in a single *.MODEL file
- when masking is used during training
- when the training set comes in proprietary formats that can only be read using the Bio-Formats reader
During preprocessing the image and model files are converted to the mibImg and mibCat formats (variations of the standard MATLAB data format) that are adapted for training and prediction.
Organization of directories without preprocessing for semantic segmentation
Without preprocessing, when datasets are manually split into training and validation sets
In this mode, the training image files are not preprocessed and are loaded on demand during network training. The image files should be split into the subfolders TrainImages, TrainLabels and, optionally, ValidationImages, ValidationLabels (for details see the snapshot with the legend below). The images may also be split automatically; see the following section for details.
When the Bio-Formats library is used, it is recommended to preprocess the images for training to speed up file reading.
Snapshot with the directory tree
Snapshot with the legend
Without preprocessing, with automatic splitting of datasets into training and validation sets
In this mode, the image and label files are randomly split into the training and validation sets. The split is done upon press of the Preprocess button, when Preprocess for: Split for training and validation is selected.
Splitting of the files depends on the seed value provided in the Random generator seed field; when the seed is 0, a new random seed value is used each time the splitting is done.
Snapshot with the directory tree
Snapshot with the legend
Organization of directories with preprocessing for semantic segmentation
This mode is enabled when the Preprocess for has one of the following selections:
- Training and prediction, to preprocess images for both training and prediction
- Training, to preprocess images only for training
- Prediction, to preprocess images only for prediction
The preprocessing starts upon pressing the Preprocess button.
The scheme below demonstrates the organization of directories when the preprocessing mode is used.
Snapshot with the directory tree
Snapshot with the legend
Organization of directories for 2D patch-wise workflow
The 2D patch-wise workflow requires a slightly different organization of images in folders. In brief, instead of having Images and Labels training directories, all images are organized in Images\[ClassnameN] subfolders, where ClassnameN is a directory name with image patches that belong to the class ClassnameN. The number of these subfolders should match the number of classes to be used for training.
In contrast to semantic segmentation, preprocessing is not used in the patch-wise mode.
Without preprocessing, when datasets are manually split into training and validation sets
The images for training should be organized in their own subfolders, named after the corresponding class names, and placed under:
- 1_Training\TrainImages, images to be used for training
- 1_Training\ValidationImages, images to be used for validation (optional)
The images may also be split automatically into subfolders for training and validation.
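As a sketch, the per-class folder skeleton described above could be prepared like this. The class names bg and spots are taken from the snapshots in this section; POSIX-style paths are built with os.path.join, so adjust the root to your own project location:

```python
import os

# Hypothetical class names from the snapshots below: bg and spots.
classes = ["bg", "spots"]
for subset in ("TrainImages", "ValidationImages"):
    for name in classes:
        # e.g. 1_Training/TrainImages/bg, 1_Training/ValidationImages/spots
        os.makedirs(os.path.join("1_Training", subset, name), exist_ok=True)

print(sorted(os.listdir(os.path.join("1_Training", "TrainImages"))))
```

Each class subfolder then receives the image patches belonging to that class.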
Snapshot with the directory tree
bg and spots are examples of two class names
When the ground-truth data for prediction is present, it can be arranged in a similar way to the semantic segmentation under
2_Prediction\Images and
2_Prediction\Labels directories
or in subfolders named by class names as 2_Prediction\bg and 2_Prediction\spots, where bg and spots subfolders contain patches that belong to these classes.
Snapshot with the legend
Without preprocessing, with automatic splitting of datasets into training and validation sets
In this mode, the files are randomly split (depending on the Random generator seed) into the training and validation sets. The split is done upon press of the Preprocess button, when Preprocess for: Split for training and validation is selected.
Initially, all images for training should be organized in their own subfolders, named after the corresponding class names, and placed under 1_Training\Images
Snapshot with the directory tree
bg and spots are examples of two class names
When the ground-truth data for prediction is present, it can be arranged in a similar way to the semantic segmentation under
2_Prediction\Images and
2_Prediction\Labels directories
or in subfolders named by class names as 2_Prediction\bg and 2_Prediction\spots, where bg and spots subfolders contain patches that belong to these classes.
Snapshot with the legend