Conditions to add training data
Utility functions for various tasks.
flare.utils.learner.get_max_cutoff(cell: np.ndarray) → float
Compute the maximum cutoff compatible with a 3x3x3 supercell of a structure. Called in the Structure constructor when setting the max_cutoff attribute, which is used to create local environments with arbitrarily large cutoff radii.
Parameters: cell (np.ndarray) – Bravais lattice vectors of the structure stored as rows of a 3x3 NumPy array.
Returns: Maximum cutoff compatible with a 3x3x3 supercell of the structure.
Return type: float
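The geometry behind such a cutoff can be illustrated with a minimal sketch. It rests on an assumption about what "compatible with a 3x3x3 supercell" means and is not FLARE's implementation: every atom in the central cell has one full periodic image of the cell on each side, so a sphere whose radius does not exceed the smallest spacing between opposite cell faces stays inside the supercell. The helper name plane_spacing_cutoff is hypothetical.

```python
import numpy as np


def plane_spacing_cutoff(cell: np.ndarray) -> float:
    """Hypothetical helper: smallest distance between opposite cell faces."""
    a, b, c = cell  # lattice vectors are the rows of the 3x3 array
    volume = abs(np.dot(a, np.cross(b, c)))
    # Distance between the two faces spanned by each pair of lattice vectors.
    spacings = [
        volume / np.linalg.norm(np.cross(b, c)),
        volume / np.linalg.norm(np.cross(c, a)),
        volume / np.linalg.norm(np.cross(a, b)),
    ]
    return float(min(spacings))


cubic = 2.0 * np.eye(3)
orthorhombic = np.diag([2.0, 3.0, 4.0])
print(plane_spacing_cutoff(cubic))
print(plane_spacing_cutoff(orthorhombic))
```

For orthogonal cells this reduces to the shortest lattice vector; for skewed cells the face spacing is shorter than the vector length, which is why the cross products are needed.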
flare.utils.learner.is_force_in_bound_per_species(abs_force_tolerance: float, predicted_forces: ndarray, label_forces: ndarray, structure, max_atoms_added: int = inf, max_by_species: dict = {}, max_force_error: float = inf) → (bool, List[int])
Checks the forces predicted by the GP for the structure against a DFT calculation, and returns a list of atoms whose force error exceeds the absolute threshold abs_force_tolerance.
Can limit the total number of target atoms via max_atoms_added, and limit per species by max_by_species.
The max_atoms_added argument will ‘overrule’ the max by species; e.g. if max_atoms_added is 2 and max_by_species is {“H”:3}, then at most two atoms total will be added.
Because it may not always be desirable to add atoms in configurations that lie far outside the potential energy surface, a maximum force error can be passed in; atoms whose error exceeds max_force_error are not added.
Parameters: - abs_force_tolerance – If error exceeds this value, then return atom index
- predicted_forces – Force predictions made by GP model
- label_forces – “True” forces computed by DFT
- structure – FLARE Structure
- max_atoms_added – Maximum atoms to return
- max_by_species – Limit to a maximum number of atoms by species
- max_force_error – In order to avoid counting in highly unlikely configurations, if the error exceeds this, do not add atom
Returns: Bool indicating if any atoms exceeded the error threshold, and a list of indices of atoms which did sorted by their error.
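The selection rules described above can be sketched on plain arrays. This is an illustrative stand-in, not the library's code: the function name atoms_outside_force_tolerance is hypothetical, it takes a list of species symbols instead of a FLARE Structure, it returns only the index list (the boolean flag is omitted), and it assumes the per-atom error is the largest absolute difference over the three force components.

```python
import math

import numpy as np


def atoms_outside_force_tolerance(
    abs_force_tolerance,
    predicted_forces,
    label_forces,
    species,
    max_atoms_added=math.inf,
    max_by_species=None,
    max_force_error=math.inf,
):
    """Hypothetical stand-in; returns only the selected atom indices."""
    max_by_species = max_by_species or {}
    # Assumption: per-atom error is the largest absolute component difference.
    errors = np.max(np.abs(predicted_forces - label_forces), axis=1)
    selected, per_species = [], {}
    # Visit atoms from largest to smallest error.
    for i in np.argsort(-errors):
        if len(selected) >= max_atoms_added:
            break  # the total cap overrules the per-species caps
        if not (abs_force_tolerance < errors[i] < max_force_error):
            continue  # within tolerance, or too far off to be trusted
        symbol = species[i]
        if per_species.get(symbol, 0) >= max_by_species.get(symbol, math.inf):
            continue  # this species has reached its own cap
        per_species[symbol] = per_species.get(symbol, 0) + 1
        selected.append(int(i))
    return selected


predicted = np.zeros((4, 3))
label = np.array([[0.5, 0, 0], [0.2, 0, 0], [0.3, 0, 0], [10.0, 0, 0]])
species = ["H", "H", "H", "O"]
print(atoms_outside_force_tolerance(
    0.1, predicted, label, species,
    max_atoms_added=2, max_by_species={"H": 3}, max_force_error=5.0,
))
```

In the usage example the oxygen atom is skipped because its error exceeds max_force_error, and only two of the three hydrogen atoms are returned because max_atoms_added is 2, even though max_by_species would allow three.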
flare.utils.learner.is_std_in_bound(std_tolerance: float, noise: float, structure: flare.struc.Structure, max_atoms_added: int = inf, update_style: str = 'add_n', update_threshold: float = None) → (bool, List[int])
Given an uncertainty tolerance and a structure decorated with atoms, species, and associated uncertainties, return the atoms whose uncertainty is above a given threshold, agnostic to species.
If std_tolerance is negative, then the threshold used is the absolute value of std_tolerance.
If std_tolerance is positive, then the threshold used is std_tolerance * noise.
If std_tolerance is 0, then do not check.
Parameters: - std_tolerance – If positive, multiply by noise to get cutoff. If negative, use absolute value of std_tolerance as cutoff.
- noise – Noise variance parameter
- structure (FLARE Structure) – Input structure
- max_atoms_added – Maximum number of atoms to add
- update_style – A string specifying the desired strategy for adding atoms to the training set. Current options are 'add_n', which adds the n = max_atoms_added highest-uncertainty atoms, and 'threshold', which adds all atoms with uncertainty greater than update_threshold.
- update_threshold – A float specifying the update threshold. Ignored if update_style is not set to 'threshold'.
Returns: (True,[-1]) if no atoms are above the cutoff, (False,[…]) if at least one atom is above the cutoff, with the list indicating which atoms have been selected for the training set.
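The sign convention for std_tolerance and the two update styles can be sketched as follows. This is a sketch under stated assumptions, not FLARE's implementation: the names resolve_threshold and select_uncertain_atoms are hypothetical, and atom_stds is assumed to hold one scalar uncertainty per atom rather than a FLARE Structure.

```python
import math
from typing import List, Optional, Sequence, Tuple


def resolve_threshold(std_tolerance: float, noise: float) -> Optional[float]:
    """Hypothetical helper: turn std_tolerance into a cutoff, or None to skip."""
    if std_tolerance == 0:
        return None  # zero disables the check
    if std_tolerance < 0:
        return abs(std_tolerance)  # negative: absolute cutoff
    return std_tolerance * noise  # positive: cutoff relative to the noise


def select_uncertain_atoms(
    std_tolerance: float,
    noise: float,
    atom_stds: Sequence[float],
    max_atoms_added: float = math.inf,
    update_style: str = "add_n",
    update_threshold: Optional[float] = None,
) -> Tuple[bool, List[int]]:
    threshold = resolve_threshold(std_tolerance, noise)
    if threshold is None or max(atom_stds) <= threshold:
        return True, [-1]  # nothing above the cutoff
    # Atom indices ordered from highest to lowest uncertainty.
    order = sorted(range(len(atom_stds)), key=lambda i: atom_stds[i], reverse=True)
    if update_style == "add_n":
        n = len(order) if math.isinf(max_atoms_added) else int(max_atoms_added)
        return False, order[:n]
    if update_style == "threshold":
        return False, [i for i in order if atom_stds[i] > update_threshold]
    raise ValueError(f"unknown update_style: {update_style!r}")


stds = [0.01, 0.30, 0.25]
print(select_uncertain_atoms(2.0, 0.1, stds, max_atoms_added=1))
print(select_uncertain_atoms(-0.2, 0.1, stds, update_style="threshold",
                             update_threshold=0.2))
```

Both calls use a cutoff of 0.2, reached once as 2.0 * noise and once as the absolute value of -0.2; they differ only in how atoms are picked once the cutoff is exceeded.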
flare.utils.learner.is_std_in_bound_per_species(rel_std_tolerance: float, abs_std_tolerance: float, noise: float, structure: flare.struc.Structure, max_atoms_added: int = inf, max_by_species: dict = {}) → (bool, List[int])
Checks the stds of the GP prediction assigned to the structure and returns a list of atoms that exceed either an absolute threshold or a relative threshold defined by rel_std_tolerance * noise. Can limit the total number of target atoms via max_atoms_added, and limit per species by max_by_species.
The max_atoms_added argument will ‘overrule’ the max by species; e.g. if max_atoms_added is 2 and max_by_species is {“H”:3}, then at most two atoms will be added.
Parameters: - rel_std_tolerance – Multiplied by noise to get a lower bound for the uncertainty threshold defined relative to the model.
- abs_std_tolerance – Used as an absolute lower bound for the uncertainty threshold.
- noise – Noise hyperparameter for model, used to define relative uncertainty cutoff.
- structure – FLARE structure decorated with uncertainties in structure.stds.
- max_atoms_added – Maximum number of atoms to return from structure.
- max_by_species – Dictionary describing maximum number of atoms to return by species (e.g. {'H': 1, 'He': 2} will return at most 1 H and 2 He atoms).
Returns: Bool indicating if any atoms exceeded the uncertainty threshold, and a list of indices of atoms which did, sorted by their uncertainty.
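How the absolute and relative thresholds might combine is sketched below. The sketch takes the description literally (an atom qualifies if it exceeds either threshold) and assumes both tolerances are positive; how the library treats zero or missing tolerances is not covered here. The function name exceeds_either_threshold is hypothetical, atom_stds is assumed to hold one scalar per atom, and the caps from max_atoms_added and max_by_species are left out to keep the threshold logic in focus.

```python
from typing import List, Sequence


def exceeds_either_threshold(
    rel_std_tolerance: float,
    abs_std_tolerance: float,
    noise: float,
    atom_stds: Sequence[float],
) -> List[int]:
    """Hypothetical helper: indices above either cutoff, most uncertain first."""
    relative_threshold = rel_std_tolerance * noise
    flagged = [
        i
        for i, std in enumerate(atom_stds)
        if std > abs_std_tolerance or std > relative_threshold
    ]
    return sorted(flagged, key=lambda i: atom_stds[i], reverse=True)


stds = [0.05, 0.30, 0.22]
# Relative cutoff 1.0 * 0.2 = 0.2, absolute cutoff 0.25.
print(exceeds_either_threshold(1.0, 0.25, 0.2, stds))
# Relative cutoff 2.0 * 0.2 = 0.4, absolute cutoff 0.25.
print(exceeds_either_threshold(2.0, 0.25, 0.2, stds))
```

Under this reading, exceeding either cutoff is equivalent to exceeding the smaller of the two, so loosening one tolerance has no effect while the other remains tighter.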
flare.utils.learner.subset_of_frame_by_element(frame: flare.Structure, predict_atoms_per_element: dict) → List[int]
Given a structure and a dictionary formatted as {"Symbol": int, ...} describing a number of atoms per element, return a sorted list of indices corresponding to a random subset of atoms by species.
Parameters: - frame – FLARE Structure
- predict_atoms_per_element – Dictionary mapping element symbols to the number of atoms to select for each element
Returns: Sorted list of indices of the randomly selected atoms.
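A random per-element subset of this kind can be sketched with the standard library. The function name random_subset_by_element and the seed argument are hypothetical, a list of species symbols stands in for the FLARE Structure, and two behaviours are assumptions rather than documented facts: elements absent from the dictionary contribute no atoms, and a request larger than the number of available atoms returns all of them.

```python
import random
from typing import Dict, List, Optional, Sequence


def random_subset_by_element(
    species: Sequence[str],
    atoms_per_element: Dict[str, int],
    seed: Optional[int] = None,
) -> List[int]:
    """Hypothetical helper: sorted indices of a random subset per element."""
    rng = random.Random(seed)
    chosen: List[int] = []
    for symbol, count in atoms_per_element.items():
        candidates = [i for i, s in enumerate(species) if s == symbol]
        # Assumption: if fewer atoms exist than requested, take them all.
        chosen.extend(rng.sample(candidates, min(count, len(candidates))))
    return sorted(chosen)


species = ["H", "H", "O", "H", "O"]
subset = random_subset_by_element(species, {"H": 2, "O": 1}, seed=0)
print(subset, [species[i] for i in subset])
```

Passing a seed makes the draw reproducible, which is convenient in tests; omitting it gives a different subset on each call.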