aidsorb.utils

Helper functions for generating input representations.

Currently supported representations:

  • Molecular point clouds

    Fast and flexible representation suitable for any molecular system.

  • 3D energy images

    Physics-informed representation tailored for adsorption in porous materials.

aidsorb.utils.pcd_from_dir(dirname, outname, features=None)

Create molecular point clouds from a directory of structure files and store them.

Point clouds are stored under outname as .npy files.

Tip

For a list of the supported chemical file formats, see ase.io.read(). Alternatively, list them from the command line with: ase info --formats.

Parameters:
  • dirname (str) – Absolute or relative path to the directory.

  • outname (str) – Directory where the point clouds will be stored. It is created if it does not exist.

  • features (list of str, optional) – Names of elemental properties from the periodic table to append as additional features, e.g. 'en_pauling'.

Return type:

None

Notes

Structures that can’t be processed are omitted.
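The omission described above can be pictured as a skip-on-failure loop. The following is a minimal sketch, not the library's actual implementation: convert_all and fake_convert are hypothetical names, and catching a broad Exception is an assumption about how failures are detected.

```python
def convert_all(filenames, convert):
    """Apply ``convert`` to each file, skipping the ones that fail.

    ``convert`` is a hypothetical callable standing in for the real
    per-file conversion. Which exceptions the library catches is an
    assumption made for this sketch.
    """
    converted, skipped = {}, []
    for filename in filenames:
        try:
            converted[filename] = convert(filename)
        except Exception:
            # The file could not be processed: record it and move on.
            skipped.append(filename)
    return converted, skipped


def fake_convert(filename):
    # Toy converter: pretend that only .xyz files can be parsed.
    if not filename.endswith('.xyz'):
        raise ValueError(f'cannot parse {filename}')
    return filename.upper()


converted, skipped = convert_all(['a.xyz', 'b.bad', 'c.xyz'], fake_convert)
```

In practice this means the output directory may contain fewer .npy files than there were input structures, so it can be worth comparing the two counts after a run.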

Examples

>>> from aidsorb.utils import pcd_from_dir
>>> dirname = 'path/to/structures'
>>> outname = 'path/to/pcd_data'
>>> # xyz coordinates + atomic number + electronegativity
>>> pcd_from_dir(dirname, outname, features=['en_pauling'])

aidsorb.utils.pcd_from_file(filename, features=None)

Create molecular point cloud from a structure file.

The molecular point cloud pcd has shape (N, 4+C), where N is the number of atoms and C is the number of additional features: pcd[:, :3] are the atomic coordinates, pcd[:, 3] are the atomic numbers, and pcd[:, 4:] are any additional features. If features=None, then C = 0 and the only features are the atomic numbers.
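The column layout can be illustrated with a synthetic array. This is a sketch using hand-written, illustrative values rather than output from pcd_from_file(); here the number of additional feature columns is 1 (an electronegativity column).

```python
import numpy as np

# Synthetic water-like point cloud with one additional feature column.
# Columns: x, y, z, atomic number, electronegativity (illustrative values).
pcd = np.array([
    [0.000, 0.000, 0.000, 8.0, 3.44],
    [0.757, 0.586, 0.000, 1.0, 2.20],
    [-0.757, 0.586, 0.000, 1.0, 2.20],
])

coords = pcd[:, :3]          # atomic coordinates, shape (N, 3)
atomic_numbers = pcd[:, 3]   # atomic numbers, shape (N,)
extra = pcd[:, 4:]           # additional features, shape (N, C)

# Recover N and C from the array shape.
n_atoms, n_extra = pcd.shape[0], pcd.shape[1] - 4
```

With features=None the slice pcd[:, 4:] would simply be empty, with shape (N, 0).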

Parameters:
  • filename (str) – Absolute or relative path to the file.

  • features (list of str, optional) – See pcd_from_dir().

Returns:

data – Molecular point cloud and its name as (name, pcd).

Return type:

tuple

Notes

The name of the molecule is the basename of filename with its suffix removed.
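One way to reproduce this naming rule is with pathlib. The helper molecule_name below is hypothetical and only a sketch: it assumes that just the final suffix is stripped, which may differ from the library's behavior for names with several suffixes.

```python
from pathlib import Path


def molecule_name(filename):
    """Return the basename of ``filename`` without its final suffix."""
    return Path(filename).stem


names = [
    molecule_name(f)
    for f in ('path/to/foo.xyz', 'bar.cif', 'dir/baz.tar.gz')
]
```

Two files with the same basename in different directories map to the same name under this rule, so they would collide if stored in the same output directory.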

Examples

>>> from aidsorb.utils import pcd_from_file
>>> # xyz coordinates + atomic number + electronegativity + radius
>>> name, pcd = pcd_from_file('path/to/file', features=['en_pauling', 'atomic_radius'])
...

aidsorb.utils.pcd_from_files(filenames, outname, features=None)

Create molecular point clouds from a list of structure files and store them.

Point clouds are stored under outname as .npy files.

Parameters:
  • filenames (iterable) – An iterable providing the filenames. Absolute or relative paths can be used.

  • outname (str) – Directory where the point clouds will be stored. It is created if it does not exist.

  • features (list of str, optional) – See pcd_from_dir().

Return type:

None

Notes

Structures that can’t be processed are omitted.

Examples

>>> import numpy as np
>>> from aidsorb.utils import pcd_from_files
>>> # Create and store the point clouds.
>>> outname = 'path/to/pcd_data'
>>> pcd_from_files(['path/to/foo.xyz', 'path/to/bar.cif'], outname)
>>> # Load back a point cloud.
>>> pcd = np.load(f'{outname}/foo.npy')
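To load every stored point cloud at once, the output directory can be scanned for .npy files. The sketch below uses only NumPy and the standard library; load_pcds is a hypothetical helper, and the data are synthetic arrays written to a temporary directory rather than real output from pcd_from_files().

```python
import tempfile
from pathlib import Path

import numpy as np


def load_pcds(outname):
    """Load every .npy file under ``outname`` into a dict keyed by file stem."""
    return {
        path.stem: np.load(path)
        for path in sorted(Path(outname).glob('*.npy'))
    }


# Demonstration with synthetic data written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    np.save(Path(tmp) / 'foo.npy', np.zeros((5, 4)))
    np.save(Path(tmp) / 'bar.npy', np.ones((3, 4)))
    pcds = load_pcds(tmp)

shapes = {name: pcd.shape for name, pcd in pcds.items()}
```

A dict is used instead of a single stacked array because point clouds generally have different numbers of atoms and therefore different shapes.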

aidsorb.utils.voxels_from_dir(cif_dirname, out_pathname, grid_size=32, *, cutoff=10.0, epsilon=50.0, sigma=2.5, cubic_box=30.0, n_jobs=None)

Calculate voxels from a directory of .cif files and store them.

Voxels are stored under out_pathname as .npy files.

Parameters:
  • cif_dirname (str) – Pathname to the directory containing the .cif files.

  • out_pathname (str) – Pathname of an existing directory under which voxels are stored.

  • grid_size (int, default=32) – Number of grid points along each dimension.

  • cutoff (float, default=10.0) – Cutoff radius (Å) for the LJ potential.

  • epsilon (float, default=50.0) – Epsilon value (ε/K) of the probe atom.

  • sigma (float, default=2.5) – Sigma value (σ/Å) of the probe atom.

  • cubic_box (float or None, default=30.0) – If None, the simulation box is a supercell scaled according to the minimum image convention (MIC). Otherwise, the simulation box is a cubic box of size cubic_box.

  • n_jobs (int, optional) – Number of jobs to run in parallel. If None, then the number returned by os.cpu_count() is used.
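To give a feel for how cutoff, epsilon and sigma enter the calculation, here is a sketch of the textbook 12-6 Lennard-Jones pair energy with a hard cutoff. It is not the library's implementation: lj_energy is a hypothetical function, and treating the probe's epsilon and sigma directly as pair parameters is a simplifying assumption (how they are combined with the framework atoms' parameters is not covered here).

```python
import math


def lj_energy(r, epsilon=50.0, sigma=2.5, cutoff=10.0):
    """Textbook 12-6 Lennard-Jones energy at distance ``r`` (r > 0).

    The result is in the units of ``epsilon``. Pairs separated by more
    than ``cutoff`` contribute nothing. Using the probe parameters as
    pair parameters is an assumption made for this sketch.
    """
    if r > cutoff:
        return 0.0
    x = (sigma / r) ** 6
    return 4.0 * epsilon * (x * x - x)


# The potential crosses zero at r = sigma and has its minimum,
# of depth -epsilon, at r = 2**(1/6) * sigma.
r_min = 2 ** (1 / 6) * 2.5
e_zero = lj_energy(2.5)
e_min = lj_energy(r_min)
e_far = lj_energy(12.0)
```

In a voxelized representation, a sum of such pair energies over the framework atoms within the cutoff would be evaluated at each of the grid_size³ grid points.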

Notes

Structures that can’t be processed are omitted.

aidsorb.utils.voxels_from_file(cif_pathname, grid_size=32, *, cutoff=10.0, epsilon=50.0, sigma=2.5, cubic_box=30.0, n_jobs=None, only_voxels=True)

Calculate and return voxels from a .cif file.

Parameters:
  • cif_pathname (str) – Pathname to the .cif file.

  • only_voxels (bool, default=True) – Determines the type of out.

Returns:

out – An array if only_voxels=True, otherwise a Grid.

Return type:

array or Grid

See also

voxels_from_dir()

For a description of the parameters.

aidsorb.utils.voxels_from_files(cif_pathnames, out_pathname, grid_size=32, *, cutoff=10.0, epsilon=50.0, sigma=2.5, cubic_box=30.0, n_jobs=None)

Calculate voxels from a list of .cif files and store them.

Voxels are stored under out_pathname as .npy files.

Parameters:
  • cif_pathnames (list) – List of pathnames to the .cif files.

  • out_pathname (str) – Pathname to the directory under which voxels are stored.

See also

voxels_from_dir()

For a description of the parameters.

Notes

Structures that can’t be processed are omitted.