Conditions to add training data

Utility functions for various tasks.

flare.utils.learner.get_max_cutoff(cell: np.ndarray) → float
Compute the maximum cutoff compatible with a 3x3x3 supercell of a structure. Called in the Structure constructor when setting the max_cutoff attribute, which is used to create local environments with arbitrarily large cutoff radii.
Parameters: cell (np.ndarray) – Bravais lattice vectors of the structure stored as rows of a 3x3 NumPy array.
Returns: Maximum cutoff compatible with a 3x3x3 supercell of the structure.
Return type: float
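
As a rough geometric illustration of what "compatible with a 3x3x3 supercell" can mean, the sketch below computes the smallest distance between opposite faces of the cell. A sphere of that radius centered on any atom in the central cell stays inside the 3x3x3 supercell. The function name min_cell_height is hypothetical, and this is one reading of the description, not necessarily how get_max_cutoff computes its result.

```python
import numpy as np


def min_cell_height(cell: np.ndarray) -> float:
    """Smallest distance between opposite faces of a cell.

    Hypothetical helper: `cell` holds the lattice vectors as rows,
    matching the convention described for get_max_cutoff.
    """
    a, b, c = cell
    volume = abs(np.dot(a, np.cross(b, c)))
    # Height along each lattice direction = volume / area of the opposite face.
    heights = [
        volume / np.linalg.norm(np.cross(b, c)),
        volume / np.linalg.norm(np.cross(c, a)),
        volume / np.linalg.norm(np.cross(a, b)),
    ]
    return float(min(heights))


cubic = 2.0 * np.eye(3)
sheared = np.array([[2.0, 0.0, 0.0],
                    [1.0, 2.0, 0.0],
                    [0.0, 0.0, 3.0]])

print(min_cell_height(cubic))
print(min_cell_height(sheared))
```

For the cubic cell the height equals the lattice constant; shearing the cell reduces it, which is why the cell vectors, and not just their lengths, matter.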
flare.utils.learner.is_force_in_bound_per_species(abs_force_tolerance: float, predicted_forces: ndarray, label_forces: ndarray, structure, max_atoms_added: int = inf, max_by_species: dict = {}, max_force_error: float = inf) -> (<class 'bool'>, typing.List[int])

Checks the GP force predictions assigned to the structure against a DFT calculation, and returns a list of atoms whose force error exceeds the absolute threshold abs_force_tolerance.

Can limit the total number of target atoms via max_atoms_added, and limit per species by max_by_species.

The max_atoms_added argument will ‘overrule’ the max by species; e.g. if max_atoms_added is 2 and max_by_species is {“H”:3}, then at most two atoms total will be added.

Because adding atoms in configurations which are far outside of the potential energy surface may not always be desirable, a maximum force error can be passed in; atoms with an error above max_force_error are not added.

Parameters:
  • abs_force_tolerance – If error exceeds this value, then return atom index
  • predicted_forces – Force predictions made by GP model
  • label_forces – “True” forces computed by DFT
  • structure – FLARE Structure
  • max_atoms_added – Maximum atoms to return
  • max_by_species – Limit to a maximum number of atoms by species
  • max_force_error – To avoid counting highly unlikely configurations, do not add an atom whose error exceeds this value
Returns:

Bool indicating if any atoms exceeded the error threshold, and a list of indices of the atoms which did, sorted by their error.
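
The interaction between the tolerance, the two caps, and the maximum force error can be sketched in plain NumPy. Everything here is an assumption made for illustration: select_by_force_error is a hypothetical name, the per-atom error is taken to be the largest absolute force-component difference, species are passed as a plain list of symbols instead of a FLARE Structure, and only the index list is returned.

```python
from collections import Counter
from math import inf

import numpy as np


def select_by_force_error(abs_force_tolerance, predicted_forces, label_forces,
                          species, max_atoms_added=inf, max_by_species=None,
                          max_force_error=inf):
    """Hypothetical sketch: indices of atoms with large force error."""
    max_by_species = max_by_species or {}
    # Assumed error metric: largest absolute component difference per atom.
    errors = np.max(np.abs(predicted_forces - label_forces), axis=1)
    picked, per_species = [], Counter()
    # Visit atoms from largest to smallest error.
    for i in np.argsort(-errors, kind="stable"):
        i = int(i)
        if len(picked) >= max_atoms_added:
            break  # the total cap overrules the per-species caps
        if errors[i] <= abs_force_tolerance:
            break  # all remaining atoms have smaller errors
        if errors[i] > max_force_error:
            continue  # error too large: skip this atom
        if per_species[species[i]] >= max_by_species.get(species[i], inf):
            continue  # per-species cap reached
        picked.append(i)
        per_species[species[i]] += 1
    return picked


species = ["H", "H", "H", "O"]
predicted = np.zeros((4, 3))
labels = np.array([[0.5, 0.0, 0.0],
                   [0.3, 0.0, 0.0],
                   [0.05, 0.0, 0.0],
                   [0.0, 9.0, 0.0]])

capped = select_by_force_error(0.1, predicted, labels, species,
                               max_by_species={"H": 1}, max_force_error=5.0)
overruled = select_by_force_error(0.1, predicted, labels, species,
                                  max_atoms_added=2, max_by_species={"H": 3})
print(capped, overruled)
```

In the second call the per-species limit would allow three H atoms, but the total cap of two is reached first.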

flare.utils.learner.is_std_in_bound(std_tolerance: float, noise: float, structure: flare.struc.Structure, max_atoms_added: int = inf, update_style: str = 'add_n', update_threshold: float = None) -> (<class 'bool'>, typing.List[int])

Given an uncertainty tolerance and a structure decorated with atoms, species, and associated uncertainties, return the atoms whose uncertainty is above a given threshold, agnostic to species.

If std_tolerance is negative, then the threshold used is the absolute value of std_tolerance.

If std_tolerance is positive, then the threshold used is std_tolerance * noise.

If std_tolerance is 0, then do not check.

Parameters:
  • std_tolerance – If positive, multiply by noise to get cutoff. If negative, use absolute value of std_tolerance as cutoff.
  • noise – Noise variance parameter
  • structure (FLARE Structure) – Input structure
  • max_atoms_added – Maximum number of atoms to add
  • update_style – A string specifying the desired strategy for adding atoms to the training set. Current options are 'add_n', which adds the n = max_atoms_added highest-uncertainty atoms, and 'threshold', which adds all atoms with uncertainty greater than update_threshold.
  • update_threshold – A float specifying the update threshold. Ignored if update_style is not set to 'threshold'.
Returns:

(True, [-1]) if no atoms are above the cutoff; (False, […]) if at least one atom is above the cutoff, with the list indicating which atoms have been selected for the training set.
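
The sign convention for std_tolerance and the two update styles can be illustrated with a small sketch. The names resolve_threshold and select_by_std are hypothetical, and per-atom uncertainties are passed as a 1-D array instead of a FLARE Structure. Returning (True, [-1]) when std_tolerance is 0 is also an assumption about what "do not check" means.

```python
from math import inf

import numpy as np


def resolve_threshold(std_tolerance, noise):
    """Hypothetical helper: turn std_tolerance into an uncertainty cutoff."""
    if std_tolerance == 0:
        return None  # no check requested
    if std_tolerance > 0:
        return std_tolerance * noise  # cutoff relative to the noise
    return abs(std_tolerance)  # absolute cutoff


def select_by_std(std_tolerance, noise, atom_stds, max_atoms_added=inf,
                  update_style="add_n", update_threshold=None):
    """Hypothetical sketch of the documented return convention."""
    atom_stds = np.asarray(atom_stds, dtype=float)
    threshold = resolve_threshold(std_tolerance, noise)
    if threshold is None or not np.any(atom_stds > threshold):
        return True, [-1]
    # Atom indices ordered from highest to lowest uncertainty.
    order = [int(i) for i in np.argsort(-atom_stds, kind="stable")]
    if update_style == "add_n":
        n = len(order) if max_atoms_added == inf else int(max_atoms_added)
        return False, order[:n]
    if update_style == "threshold":
        if update_threshold is None:
            raise ValueError("update_threshold is required for 'threshold'")
        return False, [i for i in order if atom_stds[i] > update_threshold]
    raise ValueError(f"unknown update_style: {update_style!r}")


stds = [0.02, 0.3, 0.15]
print(select_by_std(2.0, 0.05, stds, max_atoms_added=2))
print(select_by_std(-0.5, 0.05, stds))
print(select_by_std(2.0, 0.05, stds, update_style="threshold",
                    update_threshold=0.2))
```

With std_tolerance = 2.0 and noise = 0.05 the cutoff is 0.1; with std_tolerance = -0.5 the cutoff is 0.5 regardless of the noise.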

flare.utils.learner.is_std_in_bound_per_species(rel_std_tolerance: float, abs_std_tolerance: float, noise: float, structure: flare.struc.Structure, max_atoms_added: int = inf, max_by_species: dict = {}) -> (<class 'bool'>, typing.List[int])

Checks the GP prediction uncertainties (stds) assigned to the structure and returns a list of atoms which exceed either an absolute threshold or a relative threshold defined by rel_std_tolerance * noise. Can limit the total number of target atoms via max_atoms_added, and limit per species by max_by_species.

The max_atoms_added argument will ‘overrule’ the max by species; e.g. if max_atoms_added is 2 and max_by_species is {“H”:3}, then at most two atoms will be added.

Parameters:
  • rel_std_tolerance – Multiplied by noise to get a lower bound for the uncertainty threshold defined relative to the model.
  • abs_std_tolerance – Used as an absolute lower bound for the uncertainty threshold.
  • noise – Noise hyperparameter for model, used to define relative uncertainty cutoff.
  • structure – FLARE structure decorated with uncertainties in structure.stds.
  • max_atoms_added – Maximum number of atoms to return from structure.
  • max_by_species – Dictionary describing maximum number of atoms to return by species (e.g. {'H': 1, 'He': 2} will return at most 1 H and 2 He atoms).
Returns:

Bool indicating if any atoms exceeded the uncertainty threshold, and a list of indices of atoms which did, sorted by their uncertainty.

flare.utils.learner.subset_of_frame_by_element(frame: flare.Structure, predict_atoms_per_element: dict) → List[int]

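Given a structure and a dictionary formatted as {“Symbol”:int, ..} describing a number of atoms per element, return a sorted list of indices corresponding to a random subset of atoms by species.

Parameters:
  • frame – FLARE Structure
  • predict_atoms_per_element – Dictionary formatted as {“Symbol”:int, ..} giving the number of atoms to select per element
Returns:

Sorted list of indices of the randomly selected atoms.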
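
A minimal sketch of this kind of per-element random selection, using only the standard library. The function name random_subset_by_element is hypothetical, species are passed as a list of symbols instead of a FLARE Structure, and elements missing from the dictionary are simply skipped here, which may differ from what subset_of_frame_by_element does.

```python
import random
from typing import Dict, List, Optional, Sequence


def random_subset_by_element(species: Sequence[str],
                             atoms_per_element: Dict[str, int],
                             rng: Optional[random.Random] = None) -> List[int]:
    """Hypothetical sketch: pick up to N random atoms of each listed element."""
    rng = rng or random.Random()
    chosen: List[int] = []
    for symbol, count in atoms_per_element.items():
        candidates = [i for i, s in enumerate(species) if s == symbol]
        # Never request more atoms than exist for this element.
        chosen.extend(rng.sample(candidates, min(count, len(candidates))))
    return sorted(chosen)


species = ["H", "H", "O", "H", "O"]
subset = random_subset_by_element(species, {"H": 2, "O": 1}, random.Random(0))
print(subset)
```

Passing a seeded random.Random instance makes the selection reproducible, which is useful when comparing training runs.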