aidsorb.datamodules
LightningDataModules for use with PyTorch Lightning.
- class aidsorb.datamodules.PCDDataModule(path_to_X, *, path_to_Y=None, index_col=None, labels=None, train_size=None, train_transform_x=None, eval_transform_x=None, transform_y=None, shuffle=False, drop_last=False, train_batch_size=32, eval_batch_size=32, config_dataloaders=None)
Bases: LightningDataModule

LightningDataModule for supervised/unsupervised learning on point clouds.
Given the following directory structure:
    project_root
    ├── source <-- path_to_X
    │   ├── foo.npy
    │   ├── ...
    │   └── bar.npy
    ├── test.json
    ├── train.json
    └── validation.json

train, validation, and test datasets are set up, all of which are instances of PCDDataset.

Note
A comma (,) is assumed as the field separator in the .csv file.

Warning
For validation and test dataloaders, shuffle=False and drop_last=False.

If train_size is specified, the first train_size point clouds from train.json will be used. If the data were not split with prepare_data(), ensure that names in train.json don’t follow a particular order.
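
The warning about train_size can be illustrated with a minimal sketch that uses only the standard library. It assumes that train.json holds a JSON list of point cloud names (the exact file schema is not stated here), and the helpers `write_shuffled_split` and `first_n` are hypothetical names, not part of aidsorb. The names are shuffled with a fixed seed before being written, so that taking the first entries does not inherit the ordering of the original names.

```python
import json
import random
import tempfile
from pathlib import Path


def write_shuffled_split(names, path, seed=0):
    """Write `names` to a JSON file in a seeded random order (hypothetical helper)."""
    shuffled = list(names)                 # work on a copy, leave the input untouched
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    Path(path).write_text(json.dumps(shuffled))
    return shuffled


def first_n(path, n):
    """Return the first `n` names stored in a JSON list (hypothetical helper)."""
    return json.loads(Path(path).read_text())[:n]


# Names in alphabetical order, i.e. following a particular order.
names = [f"pcd_{i:03d}" for i in range(100)]

with tempfile.TemporaryDirectory() as root:
    train_json = Path(root) / "train.json"  # assumed schema: a JSON list of names
    written = write_shuffled_split(names, train_json, seed=42)
    subset = first_n(train_json, 10)        # mimics selecting train_size=10
```

With the names shuffled once on disk, any value of train_size selects a subset that does not depend on how the files were originally named.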
Todo
Add support for predict_dataloader.

- Parameters:
path_to_X (str) – Absolute or relative path to the directory holding the point clouds.
path_to_Y (str, optional) – Absolute or relative path to the .csv file holding the labels of the point clouds.
index_col (str, optional) – Column name of the .csv file to be used for indexing.
labels (list, optional) – Column names of the .csv file containing the properties to be predicted.
train_size (int, default=None) – Number of training samples. If None, all training samples are used.
train_transform_x (callable, optional) – Transformation to apply to point cloud during training.
eval_transform_x (callable, optional) – Transformation to apply to point cloud during validation and testing.
transform_y (callable, optional) – Transformation to apply to label.
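shuffle (bool, default=False) – Whether to reshuffle the data at every epoch. Applies only to the train dataloader.
drop_last (bool, default=False) – Whether to drop the last incomplete batch. Applies only to the train dataloader.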
train_batch_size (int, default=32) – Batch size for train dataloader.
eval_batch_size (int, default=32) – Batch size for validation and test dataloaders.
config_dataloaders (dict, optional) – Dictionary for configuring all dataloaders. For example:

    config_dataloaders = {
        'pin_memory': True,
        'num_workers': 2,
    }

Note
The dictionary is not copied. To avoid side effects, consider passing a copy.
See also
DataLoader: For a description of shuffle, drop_last and valid options for config_dataloaders.
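
As a sketch of the note on config_dataloaders, the snippet below keeps one shared dictionary and passes a copy of it in the keyword arguments. The paths and column names are placeholders chosen for illustration, and the PCDDataModule call is left as a comment because it assumes that aidsorb is installed and that the files exist.

```python
import copy

# Settings intended to be reused elsewhere in a script.
shared_config = {'pin_memory': True, 'num_workers': 2}

datamodule_kwargs = dict(
    path_to_X='project_root/source',      # placeholder path
    path_to_Y='project_root/labels.csv',  # placeholder path
    index_col='id',                       # placeholder column name
    labels=['y'],                         # placeholder column name
    shuffle=True,                         # affects only the train dataloader
    train_batch_size=32,
    eval_batch_size=64,
    config_dataloaders=copy.deepcopy(shared_config),  # pass a copy, not the original
)

# Assuming aidsorb is installed:
# from aidsorb.datamodules import PCDDataModule
# datamodule = PCDDataModule(**datamodule_kwargs)

# Changing the copy leaves the shared dictionary untouched.
datamodule_kwargs['config_dataloaders']['num_workers'] = 0
```

Passing a copy means later changes to either dictionary cannot leak into the other.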
- setup(stage=None)
Set up train, validation and test datasets.
Tip
Datasets are accessible via self.{train,validation,test}_dataset.

- Parameters:
stage ({None, 'fit', 'validate', 'test'}, default=None) – Which datasets to set up.
If 'fit', only the train and validation datasets are set up.
If 'validate' or 'test', only the corresponding dataset is set up.
If None, all datasets are set up.
- Return type:
None
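
The stage options for setup() can be summarized as a small lookup. This only illustrates the documented behaviour and is not aidsorb's implementation; the function name `datasets_for_stage` is hypothetical, and raising ValueError for other values is a choice made for this sketch.

```python
def datasets_for_stage(stage=None):
    """Return the names of the datasets that setup(stage) is documented to set up."""
    mapping = {
        None: ('train', 'validation', 'test'),  # all datasets
        'fit': ('train', 'validation'),         # train and validation only
        'validate': ('validation',),            # only the corresponding dataset
        'test': ('test',),                      # only the corresponding dataset
    }
    if stage not in mapping:
        # Assumption of this sketch: unknown stages are rejected.
        raise ValueError(f"unsupported stage: {stage!r}")
    return mapping[stage]
```

For example, `datasets_for_stage('fit')` names the train and validation datasets, which matches the attributes self.train_dataset and self.validation_dataset mentioned in the tip.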
- test_dataloader()
Return the test dataloader.
Can be called only after setup() has been called with stage set to None or 'test'.
- Return type:
- train_dataloader()
Return the train dataloader.
Can be called only after setup() has been called with stage set to None or 'fit'.
- Return type: