aidsorb.utils
Helper functions for creating molecular point clouds.
Todo
Add support for optional transform before storing the point cloud.
- aidsorb.utils.pcd_from_dir(dirname, outname, features=None)
Create molecular point clouds from a directory of structure files and store them.
Point clouds are stored under outname as .npy files.
Tip
To get a list of the supported chemical file formats see ase.io.read(). Alternatively, you can list them from the command line with: ase info --formats
- Parameters:
dirname (str) – Absolute or relative path to the directory.
outname (str) – Directory name where the point clouds will be stored. The directory will be created if it does not exist.
features (list of str, optional) – Elemental properties from the periodic table.
- Return type:
None
Notes
Molecules that can’t be processed are omitted.
Examples
>>> dirname = 'path/to/structures'
>>> outname = 'path/to/pcd_data'
>>> # xyz coordinates + atomic number + electronegativity
>>> pcd_from_dir(dirname, outname, features=['en_pauling'])
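The behaviour described above (create outname if it is missing, store one .npy file per structure, omit structures that cannot be processed) can be sketched as follows. This is a minimal illustration and not the aidsorb implementation: convert_dir and toy_xyz_to_pcd are hypothetical names, and the toy "Z x y z" text format stands in for the real formats that ase.io.read() handles.

```python
import tempfile
from pathlib import Path

import numpy as np


def toy_xyz_to_pcd(path):
    """Hypothetical converter: parse 'Z x y z' lines into an (N, 4) array."""
    rows = []
    for line in Path(path).read_text().splitlines():
        z, x, y, zc = line.split()  # raises ValueError on malformed lines
        rows.append([float(x), float(y), float(zc), float(z)])
    if not rows:
        raise ValueError(f"no atoms found in {path}")
    return np.array(rows)


def convert_dir(dirname, outname, converter):
    """Convert every file in dirname and return the names that were skipped."""
    out = Path(outname)
    out.mkdir(parents=True, exist_ok=True)  # create outname if it is missing
    skipped = []
    for path in sorted(Path(dirname).iterdir()):
        try:
            pcd = converter(path)
        except Exception:
            skipped.append(path.name)  # unprocessable structures are omitted
            continue
        np.save(out / f"{path.stem}.npy", pcd)
    return skipped


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "structures"
    src.mkdir()
    (src / "water.txt").write_text("8 0.0 0.0 0.0\n1 0.96 0.0 0.0\n1 -0.24 0.93 0.0\n")
    (src / "broken.txt").write_text("not a structure\n")

    skipped = convert_dir(src, Path(tmp) / "pcd_data", toy_xyz_to_pcd)
    stored = sorted(p.name for p in (Path(tmp) / "pcd_data").iterdir())
```

Returning the skipped names is a choice made for this sketch only; the documented pcd_from_dir() returns None.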
- aidsorb.utils.pcd_from_file(filename, features=None)
Create molecular point cloud from a structure file.
The molecular pcd has shape (N, 4+C), where N is the number of atoms and C is the number of additional features. pcd[:, :3] are the atomic coordinates, pcd[:, 3] are the atomic numbers and pcd[:, 4:] are any additional features. If features=None, then the only features are the atomic numbers.
- Parameters:
filename (str) – Absolute or relative path to the file.
features (list of str, optional) – See pcd_from_dir().
- Returns:
data – Molecular point cloud and its name as (name, pcd).
- Return type:
Notes
The name of the molecule is the basename of filename with its suffix removed.
Examples
>>> # xyz coordinates + atomic number + electronegativity + radius
>>> name, pcd = pcd_from_file('path/to/file', features=['en_pauling', 'atomic_radius'])
...
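To make the (N, 4+C) layout concrete, the sketch below builds a point cloud by hand with NumPy instead of calling pcd_from_file(). The water geometry is illustrative only, and using Path.stem to derive the name is an assumption about one way to get "the basename with its suffix removed", not a statement about how aidsorb does it.

```python
from pathlib import Path

import numpy as np

# Hand-written water molecule; coordinates (in angstrom) are illustrative only.
coords = np.array([
    [0.000, 0.000, 0.000],   # O
    [0.957, 0.000, 0.000],   # H
    [-0.240, 0.927, 0.000],  # H
])
atomic_numbers = np.array([8, 1, 1])
# One additional feature (C = 1): Pauling electronegativity of O and H.
en_pauling = np.array([3.44, 2.20, 2.20])

# Stack into the (N, 4 + C) layout: xyz | atomic number | extra features.
pcd = np.column_stack([coords, atomic_numbers, en_pauling])

xyz = pcd[:, :3]    # atomic coordinates, shape (N, 3)
z = pcd[:, 3]       # atomic numbers, shape (N,)
extra = pcd[:, 4:]  # additional features, shape (N, C)

# Assumed name derivation: basename of the path without its suffix.
name = Path("path/to/foo.xyz").stem
```

With features=None there are no additional columns, so the same slicing gives an array of shape (N, 4) and pcd[:, 4:] is empty.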
- aidsorb.utils.pcd_from_files(filenames, outname, features=None)
Create molecular point clouds from a list of structure files and store them.
Point clouds are stored under outname as .npy files.
- Parameters:
filenames (iterable) – An iterable providing the filenames. Absolute or relative paths can be used.
outname (str) – Directory name where the point clouds will be stored.
features (list of str, optional) – See pcd_from_dir().
- Return type:
None
Notes
Molecules that can’t be processed are omitted.
Examples
>>> import numpy as np
>>> # Create and store the point clouds.
>>> outname = 'path/to/pcd_data'
>>> pcd_from_files(['path/to/foo.xyz', 'path/to/bar.cif'], outname)
>>> # Load back a point cloud.
>>> pcd = np.load(f'{outname}/foo.npy')
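Once the point clouds are stored, it is often convenient to load all of them at once. The sketch below assumes only what is stated above, namely that outname contains one .npy file per molecule; load_pcds is a hypothetical helper and not part of aidsorb. Arrays written with np.save stand in for the files that pcd_from_files() would produce.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_pcds(outname):
    """Return {name: pcd} for every .npy file directly under outname."""
    return {p.stem: np.load(p) for p in sorted(Path(outname).glob("*.npy"))}


with tempfile.TemporaryDirectory() as outname:
    # Stand-ins for the files that pcd_from_files would have written.
    np.save(Path(outname) / "foo.npy", np.zeros((3, 4)))
    np.save(Path(outname) / "bar.npy", np.ones((5, 4)))

    pcds = load_pcds(outname)
    shapes = {name: pcd.shape for name, pcd in pcds.items()}
```

A dictionary is used rather than a single stacked array because molecules generally have different numbers of atoms N, so their point clouds do not share a common shape.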