Prepare your data¶
If you have collected data for training, including atomic positions, chemical
species, cell etc., you need to convert it into a list of Structure
objects.
Below we provide a few examples.
VASP data¶
If you have AIMD data from VASP, you can follow
the step 2 of this instruction
to generate Structure
with the vasprun.xml
file.
Data from Quantum Espresso, LAMMPS, etc.¶
If you have collected data from any
calculator that ASE supports,
or have dumped data file of format that ASE supports,
you can convert your data into ASE Atoms
, then from Atoms
to
Structure
via Structure.from_ase_atoms
.
For example, if you have collected data from QE, and obtained the QE output file .pwo
,
you can parse it with ASE, and convert ASE Atoms
into Structure
.
from ase.io import read
from flare.struc import Structure
frames = read('data.pwo', index=':', format='espresso-out') # read the whole traj
trajectory = []
for atoms in frames:
trajectory.append(Structure.from_ase_atoms(atoms))
If the data is from the LAMMPS dump file, use
# if it's text file
frames = read('data.dump', index=':', format='lammps-dump-text')
# if it's binary file
frames = read('data.dump', index=':', format='lammps-dump-binary')
Then the trajectory
can be used to
train GP from AIMD data.
Try building GP from data¶
To have a more complete and better monitored training process, please use our GPFA module.
Here we are not going to use this module, but only provide a simple example on how the GP is constructed from the data.
from flare.gp import GaussianProcess
from flare.utils.parameter_helper import ParameterHelper
# set up hyperparameters, cutoffs
kernels = ['twobody', 'threebody']
parameters = {'cutoff_twobody': 4.0, 'cutoff_threebody': 3.0}
pm = ParameterHelper(kernels=kernels,
random=True,
parameters=parameters)
hm = pm.as_dict()
hyps = hm['hyps']
cutoffs = hm['cutoffs']
hl = hm['hyp_labels']
kernel_type = 'mc' # multi-component. use 'sc' for single component system
# build up GP model
gp_model = \
GaussianProcess(kernels=kernels,
component=kernel_type,
hyps=hyps,
hyp_labels=hl,
cutoffs=cutoffs,
hyps_mask=hm,
parallel=False,
n_cpus=1)
# feed training data into GP
# use the "trajectory" as from above, a list of Structure objects
for train_struc in trajectory:
gp_model.update_db(train_struc, forces)
gp_model.check_L_alpha() # build kernel matrix from training data
# make a prediction with gp, test on a training data
test_env = gp_model.training_data[0]
gp_pred = gp_model.predict(test_env, 1) # obtain the x-component
# (force_x, var_x)
# x: 1, y: 2, z: 3
print(gp_pred)