Advanced Hyperparameters Set Up
For multi-component systems, the configurational space can be highly complicated. One may want to use different hyper-parameters and cutoffs for different interactions, or perform constrained optimisation of the hyper-parameters.
To use more hyper-parameters, we need a special kernel function that can differentiate between pairs, triplets and other descriptors, and determine which value to use for which interaction.
This kernel can be enabled by using the hyps_mask argument of the GaussianProcess class.
It contains multiple arrays that describe how to break down the array of hyper-parameters and apply them when computing the kernel. Detailed descriptions of this argument can be found in kernel/mc_sephyps.py.
The ParameterHelper class generates the hyps_mask through a more human-readable interface.
Example:
>>> from flare.gp import GaussianProcess
>>> from flare.utils.parameter_helper import ParameterHelper
>>> pm = ParameterHelper(species=['C', 'H', 'O'],
...                      kernels={'twobody': [['*', '*'], ['O', 'O']],
...                               'threebody': [['*', '*', '*'],
...                                             ['O', 'O', 'O']]},
...                      parameters={'twobody0': [1, 0.5, 1], 'twobody1': [2, 0.2, 2],
...                                  'threebody0': [1, 0.5], 'threebody1': [2, 0.2],
...                                  'cutoff_threebody': 1},
...                      constraints={'twobody0': [False, True]})
>>> hm = pm.as_dict()
>>> hyps = hm['hyps']  # hyper-parameter array generated by the helper
>>> kernels = hm['kernels']
>>> gp_model = GaussianProcess(kernels=kernels,
...                            hyps=hyps, hyps_mask=hm)
In this example, three atomic species are involved. There are many kinds of twobodys and threebodys, but we only want to use eight different signal variances and length scales.
In order to do so, we first define all the twobodys to be group "twobody0", by listing "*-*" as the first element of the twobody entry. The second element, O-O, is then defined to be group "twobody1". Note that the order matters here: a later element overrides an earlier one. If the twobody entry were [['O', 'O'], ['*', '*']], then all twobodys would belong to group "twobody1".
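The override rule can be illustrated with a small standalone sketch. It does not use FLARE; the function assign_pair_groups and its matching logic are hypothetical and only mimic the "later definition wins" behaviour described above, assuming that pairs are unordered.

```python
from itertools import combinations_with_replacement


def assign_pair_groups(species, definitions):
    """Map each unordered species pair to a group name.

    `definitions` is an ordered list of (group_name, [a, b]) entries,
    where '*' matches any species. Later entries overwrite earlier ones.
    This is an illustration, not FLARE's implementation.
    """
    groups = {}
    for name, (a, b) in definitions:
        for x, y in combinations_with_replacement(sorted(species), 2):
            # A pair matches if the pattern fits in either order.
            direct = a in ('*', x) and b in ('*', y)
            swapped = a in ('*', y) and b in ('*', x)
            if direct or swapped:
                groups[(x, y)] = name
    return groups


species = ['C', 'H', 'O']
# Wildcard first: O-O is moved to twobody1, everything else stays in twobody0.
wildcard_first = assign_pair_groups(
    species, [('twobody0', ['*', '*']), ('twobody1', ['O', 'O'])])
# Wildcard last: the wildcard overrides the earlier O-O definition.
wildcard_last = assign_pair_groups(
    species, [('twobody0', ['O', 'O']), ('twobody1', ['*', '*'])])
```

With the wildcard listed first, only O-O ends up in twobody1; with the wildcard listed last, every pair ends up in twobody1.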
Similarly, O-O-O is defined as threebody1, while all remaining ones are left as threebody0.
The hyperparameters for each group are listed in the order [sig, ls, cutoff] in the parameters argument. So in this example, the O-O interaction will use [2, 0.2, 2] as its sigma, length scale, and cutoff.
For threebody, the parameter arrays only come with two elements. So there is no cutoff associated with threebody0 or threebody1; instead, a universal cutoff is used, which is defined as ‘cutoff_threebody’.
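The fallback from a per-group cutoff to the universal cutoff can be sketched as follows. The helper resolve_cutoff is hypothetical and written only for this illustration; it operates on a plain dictionary shaped like the parameters argument above and is not part of FLARE.

```python
def resolve_cutoff(parameters, group_name, kernel_name):
    """Return the cutoff used by a group (illustrative only).

    If the group's parameter list has a third element, that is its own
    cutoff; otherwise fall back to the universal 'cutoff_<kernel>' key.
    """
    values = parameters[group_name]
    if len(values) == 3:
        return values[2]
    return parameters['cutoff_' + kernel_name]


parameters = {'twobody0': [1, 0.5, 1], 'twobody1': [2, 0.2, 2],
              'threebody0': [1, 0.5], 'threebody1': [2, 0.2],
              'cutoff_threebody': 1}

oo_cutoff = resolve_cutoff(parameters, 'twobody1', 'twobody')
three_cutoff = resolve_cutoff(parameters, 'threebody0', 'threebody')
```

Here the O-O group uses its own cutoff of 2, while both threebody groups share the universal cutoff of 1.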
The constraints argument defines which hyper-parameters will be optimized: True means optimized and False means fixed. In this example, the sigma of twobody0 is held fixed while its length scale is optimized.
Here are a couple more simple examples.
Define a 5-parameter 2+3 kernel (1, 0.5, 1, 0.5, 0.05)
>>> pm = ParameterHelper(kernels=['twobody', 'threebody'],
...                      parameters={'sigma': 1,
...                                  'lengthscale': 0.5,
...                                  'cutoff_twobody': 2,
...                                  'cutoff_threebody': 1,
...                                  'noise': 0.05})
Define a 5-parameter 2+3 kernel (1, 1, 1, 1, 0.05)
>>> pm = ParameterHelper(kernels=['twobody', 'threebody'],
...                      parameters={'cutoff_twobody': 2,
...                                  'cutoff_threebody': 1,
...                                  'noise': 0.05},
...                      ones=True,      # set all sigmas and lengthscales to 1
...                      random=False)
Define a 9-parameter 2+3 kernel
>>> pm = ParameterHelper()
>>> pm.define_group('specie', 'O', ['O'])
>>> pm.define_group('specie', 'rest', ['C', 'H'])
>>> pm.define_group('twobody', '**', ['*', '*'])
>>> pm.define_group('twobody', 'OO', ['O', 'O'])
>>> pm.define_group('threebody', '***', ['*', '*', '*'])
>>> pm.define_group('threebody', 'Oall', ['O', 'O', 'O'])
>>> pm.set_parameters('**', [1, 0.5])
>>> pm.set_parameters('OO', [1, 0.5])
>>> pm.set_parameters('Oall', [1, 0.5])
>>> pm.set_parameters('***', [1, 0.5])
>>> pm.set_parameters('cutoff_twobody', 5)
>>> pm.set_parameters('cutoff_threebody', 4)
See more examples in the functions ParameterHelper.define_group and ParameterHelper.set_parameters, and in the tests in tests/test_parameters.py.
If you want to add a new hyperparameter set to an existing GP, you can perform the following steps:
>>> hyps_mask = pm.as_dict()
>>> hyps = hyps_mask['hyps']
>>> kernels = hyps_mask['kernels']
>>> gp_model.update_kernel(kernels, 'mc', hyps_mask)
>>> gp_model.hyps = hyps
class flare.utils.parameter_helper.ParameterHelper(hyps_mask=None, species=None, kernels={}, cutoff_groups={}, parameters=None, constraints={}, allseparate=False, random=False, ones=False, verbose='WARNING')
A helper class to construct the hyps_mask dictionary for AtomicEnvironment, GaussianProcess and MappedGaussianProcess.
Parameters:
- hyps_mask (dict) – Not implemented yet
- species (dict, list) – Define specie groups
- kernels (dict, list) – Define kernels and groups for the kernels
- cutoff_groups (dict) – Define different cutoffs for different species
- parameters (dict) – Define signal variance, length scales, and cutoffs
- constraints (dict) – If listed as False, the corresponding hyperparameters will not be trained
- allseparate (bool) – If True, define each type of pair/triplet as a separate group
- random (bool) – If True, randomize all signal variances and lengthscales
- ones (bool) – If True, set all signal variances and lengthscales to one
- verbose (str) – Level to print with: "ERROR", "WARNING", "INFO", "DEBUG"
species is an optional input. It can be left as None if the user only wants to set up one group of hyper-parameters for each kernel.
kernels can be defined with or without groups, but the latter mode is not compatible with the allseparate flag.
>>> kernels=['twobody', 'threebody'],
or
>>> kernels={'twobody': [['*', '*'], ['O', 'O']],
...          'threebody': [['*', '*', '*'],
...                        ['O', 'O', 'O']]},
Current options for the kernels are twobody, threebody and manybody (based on coordination number).
See the format of species, kernels (dict), and cutoff_groups in the list_groups() function.
See the format of parameters and constraints in the list_parameters() function.
all_separate_groups(group_type)
Separate all possible types of twobodys, threebodys, manybody. One type per group.
Parameters: group_type (str) – “specie”, “twobody”, “threebody”, “cut3b”, “manybody”
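To give a feel for how many groups "one type per group" produces, the following standalone sketch enumerates every species pair or triplet. It assumes that pairs and triplets are unordered; the function separate_groups and the "A-B" naming are hypothetical, and FLARE's own naming and ordering may differ.

```python
from itertools import combinations_with_replacement


def separate_groups(species, n_body):
    """Enumerate one group per unordered n-body combination of species.

    Illustrative only: returns {group_name: element_list}.
    """
    combos = combinations_with_replacement(sorted(species), n_body)
    return {'-'.join(combo): list(combo) for combo in combos}


pairs = separate_groups(['C', 'H', 'O'], 2)      # twobody-like groups
triplets = separate_groups(['C', 'H', 'O'], 3)   # threebody-like groups
```

For three species this gives 6 pair groups and 10 triplet groups, which shows how quickly the number of hyper-parameters grows when every type is separated.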
as_dict()
Dictionary representation of the mask. The output can be used for AtomicEnvironment or the GaussianProcess.
define_group(group_type, name, element_list, parameters=None, atomic_str=False)
Define specie/twobody/threebody/3b cutoff/manybody group
Parameters:
- group_type (str) – "specie", "twobody", "threebody", "cut3b", "manybody"
- name (str) – the name used for indexing. It can be anything but "*"
- element_list (list) – list of elements
- parameters (list) – corresponding parameters for this group
- atomic_str (bool) – whether the elements in element_list are specified by group names or periodic table element names.
This function helps define different groups for specie/twobody/threebody/3b cutoff/manybody terms. It can be called many times. A later call always overrides an earlier one.
The name of the group has to be a unique string (but not "*") that defines a group of species, twobodys, etc. If the same name is used in two function calls, the definitions of the group will be merged, and both calls will be effective.
element_list has to be a list of atomic elements, or a list of specie group names (which should be defined in previous calls), or "*". "*" will loop the function over all previously defined species. It has to contain two elements for a twobody/3b cutoff/manybody term, or three elements for threebody. For a specie group definition, it can contain as many elements as you want.
If multiple define_group calls conflict on an element, the later one has higher priority. For example, if twobody 1-2 is defined as group1 in the first call and as group2 in the second call, the twobody will end up in group2.
Example 1:
>>> define_group('specie', 'water', ['H', 'O'])
>>> define_group('specie', 'salt', ['Cl', 'Na'])
They define H and O to be group water, and Na and Cl to be group salt.
Example 2.1:
>>> define_group('twobody', 'in-water', ['H', 'H'], atomic_str=True)
>>> define_group('twobody', 'in-water', ['H', 'O'], atomic_str=True)
>>> define_group('twobody', 'in-water', ['O', 'O'], atomic_str=True)
Example 2.2:
>>> define_group('twobody', 'in-water', ['water', 'water'])
Example 2.1 is equivalent to Example 2.2.
Example 3.1:
>>> define_group('specie', '1', ['H'])
>>> define_group('specie', '2', ['O'])
>>> define_group('twobody', 'Hgroup', ['H', 'H'], atomic_str=True)
>>> define_group('twobody', 'Hgroup', ['H', 'O'], atomic_str=True)
>>> define_group('twobody', 'OO', ['O', 'O'], atomic_str=True)
Example 3.2:
>>> define_group('specie', '1', ['H'])
>>> define_group('specie', '2', ['O'])
>>> define_group('twobody', 'Hgroup', ['H', '*'], atomic_str=True)
>>> define_group('twobody', 'OO', ['O', 'O'], atomic_str=True)
Example 3.3:
>>> list_groups('specie', ['H', 'O'])
>>> define_group('twobody', 'Hgroup', ['H', '*'])
>>> define_group('twobody', 'OO', ['O', 'O'])
Example 3.4:
>>> list_groups('specie', ['H', 'O'])
>>> define_group('twobody', 'OO', ['*', '*'])
>>> define_group('twobody', 'Hgroup', ['H', '*'])
Examples 3.1 to 3.4 are all equivalent.
fill_in_parameters(group_type, random=False, ones=False, universal=False)
Separate all possible types of twobodys, threebodys, manybody, one type per group, and fill in either the universal ls and sigma pre-defined through set_parameters("sigma", ..) and set_parameters("ls", ..), or random parameters if random is True.
Parameters:
- group_type (str) – "specie", "twobody", "threebody", "cut3b", "manybody"
- definition_list (list, dict) – list of elements
find_group(group_type, element_list, atomic_str=False)
Find the group that contains the input pair.
Parameters:
- group_type (str) – species, twobody, threebody, cut3b, manybody
- element_list (list) – list of elements for a pair/triplet/coordination-pair
- atomic_str (bool) – whether the elements in element_list are specified by group names or periodic table element names.
Returns: name (str)
static from_dict(hyps_mask, verbose=False, init_spec=[])
Convert a dictionary mask to an HM instance. This function is not tested yet.
list_groups(group_type, definition_list)
Define groups in batches.
Parameters:
- group_type (str) – "specie", "twobody", "threebody", "cut3b", "manybody"
- definition_list (list, dict) – list of elements
This function runs define_group in batch. Please first read the manual of define_group.
If the definition_list is a list, it is equivalent to executing define_group for each term in the definition_list:
>>> for n, elements in enumerate(definition_list):
...     define_group(group_type, group_type + str(n), elements)
So the first twobody defined will be group twobody0, and the second one will be group twobody1. For specie, it will define each listed element as a single-element group named after the element.
If the definition_list is a dictionary, it is equivalent to
>>> for k, v in definition_list.items():
...     define_group(group_type, k, v)
It is not recommended to use the dictionary mode, especially when the group definitions conflict with each other. There is no guarantee that the priority order will be the one you want.
Unlike ParameterHelper.define_group(), it can only be called once for each group_type, and not after any ParameterHelper.define_group() calls.
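The naming rule for the two modes can be sketched without FLARE as follows. The function expand_definitions is hypothetical; it only returns the (name, element_list) pairs that the batch call would pass on to define_group according to the description above.

```python
def expand_definitions(group_type, definition_list):
    """List the (name, elements) pairs implied by a batch definition.

    Illustrative only; mirrors the naming rules described in the text.
    """
    if isinstance(definition_list, dict):
        # Dictionary mode: keys are used directly as group names.
        return list(definition_list.items())
    if group_type == 'specie':
        # Each listed element becomes its own group, named after itself.
        return [(element, [element]) for element in definition_list]
    # List mode: groups are numbered in the order they are listed.
    return [(f'{group_type}{n}', elements)
            for n, elements in enumerate(definition_list)]


twobody_groups = expand_definitions('twobody', [['*', '*'], ['O', 'O']])
specie_groups = expand_definitions('specie', ['H', 'O'])
```

In list mode the position in the list fixes both the group name and the priority, which is why it is the safer choice when definitions overlap.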
list_parameters(parameter_dict: dict, constraints: dict = {})
Define many groups of parameters
Parameters:
- parameter_dict (dict) – dictionary of all parameters
- constraints (dict) – dictionary of all constraints
Example:
>>> parameter_dict={"group_name":[sig, ls, cutoffs], ...}
>>> constraints={"group_name":[True, False, False], ...}
The name of parameters can be the group name previously defined in the define_group or list_groups function. Aside from the group name, noise, cutoff_twobody, cutoff_threebody, and cutoff_manybody are reserved for the noise parameter and universal cutoffs, while sigma and lengthscale are reserved for universal signal variances and length scales.
For non-reserved keys, the value should be a list of 2 to 3 elements, corresponding to the sigma, lengthscale and cutoff (if the third one is defined). For reserved keys, the value should be a float number.
The parameter_dict and constraints should use the same set of keys. If a key in constraints is not used in parameter_dict, it will be ignored.
The value in the constraints can be either a single bool, which applies to all parameters, or a list of bools that apply to each parameter.
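The two accepted forms of a constraint value can be illustrated with a short standalone sketch. The helpers expand_flags and trainable_values are hypothetical and do not reproduce FLARE's internals; they only show how a single bool is broadcast to every parameter and how the flags select the values to optimize.

```python
def expand_flags(opt, n_parameters):
    """Broadcast a single bool to every parameter; pass a list through."""
    if isinstance(opt, bool):
        return [opt] * n_parameters
    return [bool(flag) for flag in opt]


def trainable_values(values, opt):
    """Keep only the values whose optimization flag is True."""
    flags = expand_flags(opt, len(values))
    return [value for value, flag in zip(values, flags) if flag]


# [sigma, lengthscale] with sigma fixed and lengthscale optimized.
per_parameter = trainable_values([1, 0.5], [False, True])
# A single bool applies to every parameter of the group.
all_trained = trainable_values([2, 0.2], True)
all_fixed = trainable_values([2, 0.2], False)
```

Only the length scale survives in the first case, both values in the second, and none in the third.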
set_constraints(name, opt)
Set the constraints for a certain group
Parameters:
- name (str) – name of the parameters
- opt (bool, list) – whether to optimize the parameter or not
The name of parameters can be the group name previously defined in the define_group or list_groups function. Aside from the group name, noise, cutoff_twobody, cutoff_threebody, and cutoff_manybody are reserved for the noise parameter and universal cutoffs, while sigma and lengthscale are reserved for universal signal variances and length scales.
The optimization flag can be a single bool, which applies to all parameters under that name, or a list of bools that apply to each parameter.
set_parameters(name, parameters, opt=True)
Set the parameters for a certain group
Parameters:
- name (str) – name of the parameters
- parameters (list) – the sigma, lengthscale, and cutoff of each group.
- opt (bool, list) – whether to optimize the parameter or not
The name of parameters can be the group name previously defined in the define_group or list_groups function. Aside from the group name, noise, cutoff_twobody, cutoff_threebody, and cutoff_manybody are reserved for the noise parameter and universal cutoffs, while sigma and lengthscale are reserved for universal signal variances and length scales.
The parameter should be a list of 2-3 elements, for sigma, lengthscale (and cutoff if the third one is defined).
The optimization flag can be a single bool, which applies to all parameters, or a list of bools that apply to each parameter.
summarize_group(group_type)
Sort and combine all the previous definitions into internal variables
Parameters: group_type (str) – species, twobody, threebody, cut3b, manybody