aidsorb.datamodules

This module provides LightningDataModule classes for use with PyTorch Lightning.

class aidsorb.datamodules.PCDDataModule(path_to_X, path_to_Y, index_col, labels, train_size=None, train_transform_x=None, eval_transform_x=None, transform_y=None, shuffle=False, train_batch_size=32, eval_batch_size=32, config_dataloaders=None)

Bases: LightningDataModule

LightningDataModule for point clouds.

Note

The following directory structure is assumed:

pcd_data
├──pcds.npz        <-- path_to_X
├──train.json
├──validation.json
└──test.json

Tip

Assuming pcd_data/pcds.npz already exists, you can create the above directory structure with prepare_data().
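
To make the assumed layout concrete, the following minimal sketch checks whether the three split files sit next to the .npz file. The helper name missing_split_files is hypothetical and not part of aidsorb; it relies only on the standard library and on the file names shown in the note above.

```python
from pathlib import Path

# File names taken from the directory structure shown above.
SPLIT_FILES = ("train.json", "validation.json", "test.json")


def missing_split_files(path_to_X):
    """Return the split files absent from the directory holding path_to_X.

    Hypothetical helper for illustration only; it is not part of aidsorb.
    """
    data_dir = Path(path_to_X).parent
    return [name for name in SPLIT_FILES if not (data_dir / name).is_file()]
```

An empty list means the layout matches what the datamodule expects; a non-empty list names the files that still need to be created.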

Todo

Add support for predict_dataloader.
Add option drop_last for train_dataloader.

Parameters:

path_to_X (str) – Absolute or relative path to the .npz file holding the point clouds.
path_to_Y (str) – Absolute or relative path to the .csv file holding the labels of the point clouds.

Warning

The comma (,) is assumed to be the field separator.
index_col (str) – Column name of the .csv file to be used as row labels. The names (values) under this column must follow the same naming scheme as in pcds.npz.
labels (list) – List containing the names of the properties to be predicted. No effect if path_to_Y=None.
train_size (int, optional) – The number of training samples. By default, all training samples are used.
train_transform_x (callable, optional) – Transforms applied to input during training.
eval_transform_x (callable, optional) – Transforms applied to input during validation and testing.
transform_y (callable, optional) – Transforms applied to output.
shuffle (bool, default=False) – Whether to shuffle the training data. Applies only to train_dataloader.
train_batch_size (int, default=32) – Batch size for the training dataloader.
eval_batch_size (int, default=32) – Batch size for the validation and test dataloaders.
config_dataloaders (dict, optional) – Dictionary for configuring the DataLoaders. It is applied to all dataloaders, i.e. {train,validation,test}_dataloader. For example:
```
config_dataloaders = {
    'pin_memory': True,
    'num_workers': 2,
    }
```
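
Putting the parameters together, the sketch below collects keyword arguments that match the documented signature and passes them to PCDDataModule. The CSV file name, the index column name, and the label name are placeholders chosen for illustration, and the two helper functions are hypothetical. The import is deferred so that the arguments can be inspected without aidsorb installed.

```python
def make_datamodule_kwargs(data_dir="pcd_data", labels_csv="labels.csv"):
    """Collect keyword arguments matching the documented signature.

    The default file names, column name, and label name are placeholders.
    """
    return {
        "path_to_X": f"{data_dir}/pcds.npz",
        "path_to_Y": labels_csv,        # placeholder .csv file (comma-separated)
        "index_col": "id",              # placeholder column name
        "labels": ["property_a"],       # placeholder property name
        "train_size": None,             # use all training samples
        "shuffle": True,                # affects train_dataloader only
        "train_batch_size": 32,
        "eval_batch_size": 64,
        "config_dataloaders": {"pin_memory": True, "num_workers": 2},
    }


def build_datamodule(**kwargs):
    """Instantiate PCDDataModule; requires aidsorb to be installed."""
    # Deferred import keeps the rest of this sketch usable without aidsorb.
    from aidsorb.datamodules import PCDDataModule

    return PCDDataModule(**kwargs)


kwargs = make_datamodule_kwargs()
```

With aidsorb installed, build_datamodule(**kwargs) should return a datamodule that can be handed to a Lightning Trainer in the usual way, for example trainer.fit(model, datamodule=dm).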