Mapped Gaussian Process

MappedGaussianProcess uses splines to build an interpolation function for the low-dimensional decomposition of a Gaussian process, with little loss of accuracy. See Xie et al., Vandermause et al., and Glielmo et al.
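The idea of mapping can be illustrated with a deliberately simplified sketch. It tabulates a smooth function once on a uniform grid and then answers queries by interpolation. Note the assumptions: piecewise-linear interpolation stands in for the splines that MGP uses, and `expensive_pair_energy`, `build_map`, `evaluate_map` and `max_error` are hypothetical names that are not part of flare.

```python
import math

def expensive_pair_energy(r):
    # Stand-in for a costly GP prediction (hypothetical smooth pair function).
    return math.exp(-r) * math.cos(3.0 * r)

def build_map(func, lower, upper, grid_num):
    # Tabulate func once on a uniform grid of grid_num points.
    step = (upper - lower) / (grid_num - 1)
    values = [func(lower + i * step) for i in range(grid_num)]
    return lower, step, values

def evaluate_map(mapped, r):
    # Piecewise-linear lookup between the two neighbouring grid points.
    lower, step, values = mapped
    i = min(int((r - lower) / step), len(values) - 2)
    t = (r - lower) / step - i
    return (1.0 - t) * values[i] + t * values[i + 1]

def max_error(grid_num, lower=1.0, upper=4.0, n_test=500):
    # Largest deviation between the map and the original function.
    mapped = build_map(expensive_pair_energy, lower, upper, grid_num)
    points = [lower + (upper - lower) * k / (n_test - 1) for k in range(n_test)]
    return max(abs(evaluate_map(mapped, r) - expensive_pair_energy(r))
               for r in points)

for n in (8, 64):
    print(n, max_error(n))
```

In this toy setting the maximum error shrinks as the number of grid points grows, which is the trade-off that the grid size controls.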

class flare.mgp.mgp.MappedGaussianProcess(grid_params: dict, unique_species: list = [], GP: flare.gp.GaussianProcess = None, var_map: str = None, container_only: bool = True, lmp_file_name: str = 'lmp', n_cpus: int = None, n_sample: int = 10)

Build a Mapped Gaussian Process (MGP) and automatically save the coefficients for the LAMMPS pair style.

Parameters:
  • grid_params (dict) – Parameters for the mapping itself, such as the grid size of the spline fit, as described below.
  • unique_species (list) – List of all the (unique) species included during training that need to be mapped.
  • GP (GaussianProcess) – None or a GaussianProcess object. If a GP is input, and container_only is False, automatically build a mapping corresponding to the GaussianProcess.
  • var_map (str) – If None, only build the mapping for the mean (force). If ‘pca’, use PCA to map the variance, based on grid_params[‘xxbody’][‘svd_rank’]. If ‘simple’, only map the diagonal of the covariance and predict an upper bound of the variance. The ‘pca’ mode uses much more memory, but its prediction is much closer to the GP variance.
  • container_only (bool) – If True, only build the spline container (with no coefficients); if False, attempt to build the map immediately.
  • lmp_file_name (str) – LAMMPS coefficient file name
  • n_cpus (int) – Default None. Set to the number of cores needed for parallelization. Used in the construction of the map.
  • n_sample (int) – Default 10. The batch size for building map. Not used now.

Examples:

>>> # build 2 + 3 body map
>>> grid_params = {'twobody': {'grid_num': [64]},
...                'threebody': {'grid_num': [64, 64, 64]}}

For grid_params, the following keys and values are allowed:

Parameters:
  • ‘twobody’ (dict, optional) – If 2-body is present, set as a dictionary of parameters for the 2-body mapping. See below for the parameters.
  • ‘threebody’ (dict, optional) – If 3-body is present, set as a dictionary of parameters for the 3-body mapping. See below for the parameters.
  • ‘load_grid’ (str, optional) – Default None. The path to the directory where previously generated grids (grid_*.npy) are stored. If no path is specified, MGP will construct the grids from scratch.
  • ‘lower_bound_relax’ (float, optional) – Default 0.1. If ‘lower_bound’ is set to ‘auto’, this value is used as a relaxation of the lower bound (see the description of ‘lower_bound’ below).

For the two-body and three-body parameter dictionaries, the following keys and values are allowed:

Parameters:
  • ‘grid_num’ (list) – A list of integers giving the number of grid points for interpolation. The larger the numbers, the more closely the MGP approximates the GP.
  • ‘lower_bound’ (str or list, optional) – Default ‘auto’, in which case the lower bound of the spline interpolation is found by searching the training set of the GP for the minimal interatomic distance r_min and setting lower_bound = r_min - lower_bound_relax. The user can instead set a custom lower_bound of the same shape as ‘grid_num’, e.g. [1.2, 1.2, 1.2] for threebody.
  • ‘upper_bound’ (str or list, optional) – Default ‘auto’, in which case the upper bound of the spline interpolation is the cutoffs of the GP. The user can instead set a custom upper_bound of the same shape as ‘grid_num’, e.g. [3.5, 3.5, 3.5] for threebody.
  • ‘svd_rank’ (int or str, optional) – Default ‘auto’. If variance mapping is needed, this sets the rank of the mapping. ‘auto’ uses the full rank, which is the smaller of the total number of grid points and the training set size, i.e. full_rank = min(np.prod(grid_num), 3 * N_train).
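The ‘auto’ rules above can be sketched in plain Python. This is an illustration only, not flare's implementation: `resolve_bounds` and `full_rank` are hypothetical helpers, and it is an assumption that the ‘auto’ value is repeated once per grid dimension.

```python
from math import prod

def resolve_bounds(body_params, r_min, cutoff, lower_bound_relax=0.1):
    """Hypothetical helper: turn 'auto' or user-supplied bounds into lists."""
    n_dim = len(body_params["grid_num"])
    lower = body_params.get("lower_bound", "auto")
    upper = body_params.get("upper_bound", "auto")
    if lower == "auto":
        # Assumption: the same relaxed minimum distance is used in every dimension.
        lower = [r_min - lower_bound_relax] * n_dim
    if upper == "auto":
        # Assumption: the GP cutoff is reused in every dimension.
        upper = [cutoff] * n_dim
    if len(lower) != n_dim or len(upper) != n_dim:
        raise ValueError("bounds must have the same shape as grid_num")
    return list(lower), list(upper)

def full_rank(grid_num, n_train):
    # min(total number of grid points, 3 * N_train), following the formula above.
    return min(prod(grid_num), 3 * n_train)

threebody = {"grid_num": [64, 64, 64], "upper_bound": [3.5, 3.5, 3.5]}
lower, upper = resolve_bounds(threebody, r_min=1.3, cutoff=4.0)
print(lower, upper, full_rank(threebody["grid_num"], n_train=200))
```

Here the lower bound falls back to ‘auto’ while the upper bound is user-supplied; with 64 × 64 × 64 grid points and 200 training structures, the training set size is the limiting term in the full rank.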

as_dict() → dict

Dictionary representation of the MGP model.

static from_dict(dictionary: dict) → flare.mgp.mgp.MappedGaussianProcess

Create MGP object from dictionary representation.
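The as_dict / from_dict pair follows a common round-trip pattern, sketched below with a toy class. `ToySplineContainer` is hypothetical and is not part of flare; the real MGP dictionary holds more fields than shown here, and the species values are made up for illustration.

```python
import json

class ToySplineContainer:
    """Hypothetical stand-in for a mapped model; not part of flare."""

    def __init__(self, grid_params, unique_species):
        self.grid_params = grid_params
        self.unique_species = unique_species

    def as_dict(self):
        # Only plain, JSON-serializable types go into the dictionary.
        return {"grid_params": self.grid_params,
                "unique_species": self.unique_species}

    @staticmethod
    def from_dict(dictionary):
        # Rebuild the object from the dictionary produced by as_dict().
        return ToySplineContainer(dictionary["grid_params"],
                                  dictionary["unique_species"])

# Species values are illustrative placeholders.
original = ToySplineContainer({"twobody": {"grid_num": [64]}}, [14, 6])
text = json.dumps(original.as_dict())
restored = ToySplineContainer.from_dict(json.loads(text))
print(restored.as_dict() == original.as_dict())
```

Keeping the dictionary limited to plain types is what lets it pass through JSON unchanged.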

predict(atom_env: flare.env.AtomicEnvironment) → (ndarray, ndarray, ndarray, float)

Predict the force, variance, stress and local energy for a given atomic environment.

Parameters:
  • atom_env – atomic environment (with a center atom and its neighbors)

Returns:
  • force – 3-component array of the atomic force
  • variance – 3-component array of the predictive variance
  • stress – 6-component array of the virial stress
  • energy – the local energy (atomic energy)

write_lmp_file(lammps_name: str, write_var: bool = False)

Write the coefficients to a file that can be used by the LAMMPS pair style.

write_model(name: str, format: str = 'json')

Write everything necessary to reload and reuse the model.