medicalai.chief.dataset_prepare

datasetFromFolder

datasetFromFolder(self,
                  folder,
                  targetDim=(31, 31),
                  normalize=False,
                  name=None,
                  useCache=True,
                  forceCleanCache=False)

TODO: Fix samplingMethodName assignment

datasetGenFromFolder

datasetGenFromFolder(self,
                     folder,
                     targetDim=(224, 224),
                     normalize=False,
                     batch_size=16,
                     augmentation=True,
                     color_mode='rgb',
                     class_mode='sparse',
                     shuffle=True,
                     seed=23)

Creates a dataset generator from a dataset stored in a folder. The folder should contain test and train subfolders, and each of these should contain one subfolder per class (n class folders).
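The folder layout described above can be sketched with the standard library alone. This is a minimal illustration, not part of medicalai: the helper name make_dataset_skeleton and the class names normal and pneumonia are hypothetical, and the subfolder names train and test are taken from the description.

```python
import tempfile
from pathlib import Path


def make_dataset_skeleton(root, class_names, splits=("train", "test")):
    """Create root/<split>/<class>/ directories (hypothetical helper).

    Returns the created directories as sorted paths relative to root.
    """
    root = Path(root)
    for split in splits:
        for cls in class_names:
            (root / split / cls).mkdir(parents=True, exist_ok=True)
    # Collect every second-level directory, e.g. "train/normal".
    return sorted(
        p.relative_to(root).as_posix() for p in root.glob("*/*") if p.is_dir()
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Class names here are placeholders for illustration only.
        for rel in make_dataset_skeleton(tmp, ["normal", "pneumonia"]):
            print(rel)
```

Image files would then be placed inside each class folder, and the root directory would be passed as the folder argument.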

Arguments

  • folder: Path to the directory in which your n class folders are present.
  • targetDim: Size of the input images fed to the neural network (the Keras target_size). Default = (224, 224).
  • class_mode: Set to binary when classifying only two classes; otherwise set to categorical. For an autoencoder, where the input and the output would typically be the same image, set to input. Default = sparse.
  • color_mode: grayscale for black-and-white or grayscale images, rgb for three color channels. Default = rgb.
  • batch_size: Number of images yielded by the generator per batch. If training fails, lower this number. Default = 16.
  • augmentation: [Optional] Default = True. Perform augmentation on the dataset.
  • shuffle: [Optional] Default = True. Shuffle the dataset.
  • seed: [Optional] Default = 23. Initialize the random seed.

Returns

None: Initializes Test and Train Data Generators
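
To make the class_mode options more concrete, the sketch below shows the label layout that each mode conventionally implies in Keras-style generators. It is a pure-Python illustration under the assumption that class_mode is forwarded to Keras unchanged; encode_labels is a hypothetical helper and does not call Keras or medicalai.

```python
def encode_labels(class_indices, num_classes, class_mode):
    """Return labels laid out the way each class_mode conventionally implies.

    Hypothetical helper for illustration only.
    """
    if class_mode == "sparse":
        # One integer class index per image.
        return [int(i) for i in class_indices]
    if class_mode == "binary":
        # One 0.0/1.0 value per image; only meaningful for two classes.
        if num_classes != 2:
            raise ValueError("binary mode expects exactly two classes")
        return [float(i) for i in class_indices]
    if class_mode == "categorical":
        # One one-hot vector of length num_classes per image.
        return [
            [1.0 if j == i else 0.0 for j in range(num_classes)]
            for i in class_indices
        ]
    # "input" is omitted: there the target is the image itself, not a label.
    raise ValueError(f"unsupported class_mode: {class_mode!r}")


print(encode_labels([0, 2, 1], 3, "sparse"))       # [0, 2, 1]
print(encode_labels([0, 2, 1], 3, "categorical"))  # [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
print(encode_labels([0, 1, 1], 2, "binary"))       # [0.0, 1.0, 1.0]
```

The practical consequence is that the loss function of the model has to match the chosen layout, for example a sparse categorical loss for integer labels and a categorical loss for one-hot vectors.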

datasetGenFromDataframe

datasetGenFromDataframe(self,
                        folder,
                        csv_path='.',
                        x_col='name',
                        y_col='labels',
                        targetDim=(224, 224),
                        normalize=False,
                        batch_size=16,
                        augmentation=True,
                        color_mode='rgb',
                        class_mode='sparse',
                        shuffle=True,
                        seed=17)

Creates a Keras dataset generator from a DataFrame, for handling large datasets.

Arguments

  • csv_path: Folder containing train.csv and test.csv. Default = '.'.
  • folder: Path to the directory in which your training images are present.
  • x_col: Name of the column containing the image names. Default = name.
  • y_col: Name of the column containing the labels. Default = labels.
  • targetDim: Size of the input images fed to the neural network (the Keras target_size). Default = (224, 224).
  • class_mode: Set to binary when classifying only two classes; otherwise set to categorical. For an autoencoder, where the input and the output would typically be the same image, set to input. Default = sparse.
  • color_mode: grayscale for black-and-white or grayscale images, rgb for three color channels. Default = rgb.
  • batch_size: Number of images yielded by the generator per batch. If training fails, lower this number. Default = 16.
  • augmentation: [Optional] Default = True. Perform augmentation on the dataset.
  • shuffle: [Optional] Default = True. Shuffle the dataset.
  • seed: [Optional] Default = 17. Initialize the random seed.

Returns

None: Initializes Test and Train Data Generators
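
The CSV inputs that datasetGenFromDataframe expects can be sketched as follows. The file names train.csv and test.csv and the default column names name and labels come from the argument list; the helper functions, image file names, and label values are hypothetical and serve only to show the shape of the files.

```python
import csv
import tempfile
from pathlib import Path


def write_label_csv(path, rows, x_col="name", y_col="labels"):
    """Write (image name, label) pairs under the given column headers."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([x_col, y_col])
        writer.writerows(rows)


def read_label_csv(path):
    """Read the CSV back as a list of dicts keyed by column name."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        csv_dir = Path(tmp)  # this directory would be passed as csv_path
        # Image names and labels below are placeholders.
        write_label_csv(
            csv_dir / "train.csv",
            [("img_001.png", "normal"), ("img_002.png", "pneumonia")],
        )
        write_label_csv(csv_dir / "test.csv", [("img_101.png", "normal")])
        for row in read_label_csv(csv_dir / "train.csv"):
            print(row["name"], row["labels"])
```

If the CSV files use different headers, the x_col and y_col arguments would need to be set to match them.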