AGOX

Outline

AGOX's modules are designed so that they can be combined into various structure search methods. Some basic modules are used everywhere but do not need to be defined in the input script:

Candidate: an extended ASE atoms object

Mandatory modules for an input script are:

Environment: defines the chemical information of the search system
- This can include the number of atoms, the element types, and the confinement_cell in which atoms are allowed to be placed
- A template (seed) can be defined here. For example, to search for Au20 you could provide an Au13 structure as the template and search for the positions of the remaining 7 Au atoms.
Database: stores candidates/structures that have been evaluated during a search
Generator (or Collector(Generator)): defines how a new structure is generated/modified
Evaluator: defines how good/bad the modified structure is
- Normally a lower total energy means a better structure

Mandatory submodules (not passed directly to the agox.run() function) for an input script are:

ASE Calculator: an ASE calculator used by the Evaluator

Optional modules for an input script are:

Sampler: defines which structures are passed to the Generator
Collector: creates a pool of structures using the Generator
Postprocessor: postprocesses structures produced by the Generator before they are evaluated by the Evaluator
Acquisitor: selects which structures should be evaluated

Optional submodules for an input script are:

Model: a machine learning model such as a neural network or a Gaussian process model.
- It can be used as input for the Acquisitor and the Postprocessor
- It is normally attached to a Database to get/update training data
- Training is triggered after the Evaluator and before the Database update, via the model.training_observer of the model.
- By default, the model uses all structures in the database for training.

Command line tool

agox-analysis -d FOLDER_NAME -fr -p -c -s -hg: generate batch analysis data without plotting

Generator

c1 = 0.75: lower limit on bond lengths
c2 = 1.25: upper limit on bond lengths
If a generator fails to produce a valid candidate within the allowed number of trials, it returns an empty list
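
To make the role of the two limits concrete, here is a minimal sketch of a bond-length check. It is not AGOX's implementation: the function name is hypothetical, and it assumes that c1 and c2 scale the sum of the covalent radii of each atom pair and that a new atom must be within the upper limit of at least one existing atom.

```python
import numpy as np

def is_valid_position(new_pos, new_radius, positions, radii, c1=0.75, c2=1.25):
    """Hypothetical check: accept new_pos if it is not too close to any atom
    and is within bonding distance of at least one atom."""
    positions = np.asarray(positions, dtype=float)
    radii = np.asarray(radii, dtype=float)
    if len(positions) == 0:
        return True  # first atom: nothing to compare against
    distances = np.linalg.norm(positions - np.asarray(new_pos, dtype=float), axis=1)
    reference = radii + new_radius  # assumed reference bond length per pair
    too_close = np.any(distances < c1 * reference)
    bonded = np.any(distances < c2 * reference)
    return bool(bonded and not too_close)

# Illustrative value: a covalent radius of 1.36 Å for Au.
existing = [[0.0, 0.0, 0.0]]
print(is_valid_position([2.7, 0.0, 0.0], 1.36, existing, [1.36]))  # within both limits
print(is_valid_position([1.5, 0.0, 0.0], 1.36, existing, [1.36]))  # shorter than c1 limit
print(is_valid_position([5.0, 0.0, 0.0], 1.36, existing, [1.36]))  # longer than c2 limit
```

A generator would typically repeat such a check for a limited number of trial positions and give up (returning an empty list) if none passes.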

RandomGenerator

may_nucleate_at_several_places = False: atoms are not placed at arbitrary positions all over the search box

RattleGenerator

n_rattle = 3: each atom is rattled with probability n_rattle / n_non_template_all_atoms
At least one atom is always rattled
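
The selection rule above can be sketched as follows. This is an illustration rather than the AGOX code: the function name is hypothetical, and n_atoms stands for the number of non-template atoms.

```python
import numpy as np

def select_atoms_to_rattle(n_atoms, n_rattle=3, rng=None):
    """Hypothetical helper: pick each atom with probability n_rattle / n_atoms,
    and force one pick if the random draw selected none."""
    rng = np.random.default_rng() if rng is None else rng
    probability = min(1.0, n_rattle / n_atoms)
    mask = rng.random(n_atoms) < probability
    if not mask.any():
        mask[rng.integers(n_atoms)] = True  # guarantee at least one rattled atom
    return np.flatnonzero(mask)

indices = select_atoms_to_rattle(n_atoms=20, n_rattle=3, rng=np.random.default_rng(0))
print(indices)  # indices of the atoms that would be displaced
```

With this rule, n_rattle is the expected number of rattled atoms rather than a fixed count.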

SymmetryRattleGenerator

Point-Group Theory Tables
n_rattle = 3: rattle probability is n_rattle / n_non_template_equivalent_atoms
Once most structures in the sampler have reached C1, the search easily stagnates. Possible solutions are:
- Running more independent runs with fewer iterations each
- Decreasing the lose_symmetry probability
- Forcing some structures in the sampler to be replaced with structures from SymmetryGenerator

ComplementaryEnergyGenerator

The feature vector is based on the local density of element Z around the central atom i

$$ \begin{array}{rl} \rho _ i^Z(\lambda) = & \sum_{j \neq i, Z_j=Z} \dfrac{1}{\lambda} \exp(-r_{ij}/\lambda) f_c(r_{ij}) \\ \\ f_c(r) = & \begin{cases} \dfrac{1}{2} \cos \left( \pi \dfrac{r}{r_c} \right) + \dfrac{1}{2}, & r \le r_c \\ 0, & r > r_c \end{cases} \end{array} $$

where $r_{ij}$ is the distance between atoms i and j, $r_c$ is the cutoff radius, and $\lambda$ is a hyperparameter that can take multiple values such as 0.5 Å, 1 Å, 1.5 Å, ...
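
The two formulas above can be evaluated directly with NumPy. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, distances are computed without periodic boundary conditions, and the example coordinates are made up.

```python
import numpy as np

def cutoff_function(r, r_c):
    """f_c(r): cosine cutoff that decays from 1 at r = 0 to 0 at r = r_c."""
    r = np.asarray(r, dtype=float)
    return np.where(r <= r_c, 0.5 * np.cos(np.pi * r / r_c) + 0.5, 0.0)

def local_density(positions, numbers, i, Z, lambs, r_c):
    """rho_i^Z(lambda) for every lambda in lambs (no periodic images)."""
    positions = np.asarray(positions, dtype=float)
    lambs = np.asarray(lambs, dtype=float)
    r = np.linalg.norm(positions - positions[i], axis=1)
    neighbors = np.asarray(numbers) == Z  # only atoms of element Z contribute
    neighbors[i] = False                  # exclude the central atom (j != i)
    r_ij = r[neighbors]
    # terms has shape (n_lambs, n_neighbors)
    terms = np.exp(-r_ij[None, :] / lambs[:, None]) / lambs[:, None]
    terms = terms * cutoff_function(r_ij, r_c)[None, :]
    return terms.sum(axis=1)

positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
numbers = [8, 8, 8]
rho_O = local_density(positions, numbers, i=0, Z=8, lambs=[0.5, 1.0, 1.5], r_c=2.0)
rho_Mg = local_density(positions, numbers, i=0, Z=12, lambs=[0.5, 1.0, 1.5], r_c=2.0)
print(rho_O, rho_Mg)
```

Here the atom at 3 Å lies beyond the cutoff and contributes nothing, and the Mg densities are all zero because the toy structure contains no Mg.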

Taking a silicate (Mg$_2$SiO$_4$)$_x$ system as an example, the whole feature vector is sorted by atomic numbers and is constructed as $$ \textbf{f}_i = [ \underbrace{\rho _ i^O(\lambda_1), \rho _ i^O(\lambda_2), \dots, \rho _ i^O(\lambda_n)} _ \text{Oxygen neighbors}, \underbrace{\rho _ i^{Mg}(\lambda_1), \rho _ i^{Mg}(\lambda_2), \dots, \rho _ i^{Mg}(\lambda_n)} _ \text{Magnesium neighbors}, \underbrace{\rho _ i^{Si}(\lambda_1), \rho _ i^{Si}(\lambda_2), \dots, \rho _ i^{Si}(\lambda_n)} _ \text{Silicon neighbors}, \underbrace{Z_i} _ \text{Atomic number}] $$

In practice, if the feature_matrix has a shape of (n_atoms, n_species, n_lambs), it can be flattened with feature_matrix.reshape(n_atoms, -1). For example, with 3 species and 5 lambdas, the feature matrix of one atom [O(1) means $\rho^O(\lambda_1)$]

O(1)    O(2)    O(3)   O(4)   O(5)
Mg(1)   Mg(2)   Mg(3)  Mg(4)  Mg(5)
Si(1)   Si(2)   Si(3)  Si(4)  Si(5)

will be reshaped into

O(1)    O(2)    O(3)   O(4)   O(5)  Mg(1)   Mg(2)   Mg(3)  Mg(4)  Mg(5)   Si(1)   Si(2)   Si(3)  Si(4)  Si(5)

Finally, the atomic numbers can be appended as the last column.
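
A small NumPy check of the flattening and of appending the atomic numbers. The array contents and the four-atom example are placeholders, not real features.

```python
import numpy as np

n_atoms, n_species, n_lambs = 4, 3, 5

# Placeholder values; axis 1 runs over species (O, Mg, Si), axis 2 over lambdas.
feature_matrix = np.arange(n_atoms * n_species * n_lambs, dtype=float)
feature_matrix = feature_matrix.reshape(n_atoms, n_species, n_lambs)
atomic_numbers = np.array([8, 8, 12, 14])

# Row-major flattening keeps all lambdas of one species next to each other.
flat = feature_matrix.reshape(n_atoms, -1)

# Append the atomic number of each atom as the last column.
features = np.hstack([flat, atomic_numbers[:, None]])
print(features.shape)
```

Because reshape is row-major by default, column 5 of the flattened row is Mg(1), directly after O(5), which matches the layout shown above.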

Collector

Ray ParallelCollector

The num_candidates parameter defines how many candidates are generated by each generator at each search iteration
- num_candidates = [20, 30, 10]: generate 20 candidates with the 1st generator, 30 candidates with the 2nd, and 10 candidates with the 3rd
- num_candidates = {0:[10, 0, 0], 20:[20, 30, 10]}: generate 10 candidates with the 1st generator only from iteration 0 until iteration 20, then use the [20, 30, 10] scheme from iteration 20 onward.
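
One way to read the dictionary form is as a schedule keyed by the first iteration at which each scheme applies. The helper below is a hypothetical sketch of that lookup, not the collector's actual code; in particular, it assumes the new scheme takes effect exactly at the key iteration.

```python
def candidates_at_iteration(num_candidates, iteration):
    """Hypothetical lookup: return the per-generator counts for an iteration."""
    if not isinstance(num_candidates, dict):
        return list(num_candidates)  # a plain list applies to every iteration
    # Use the scheme with the largest starting iteration not exceeding `iteration`.
    starts = [start for start in sorted(num_candidates) if start <= iteration]
    if not starts:
        raise ValueError(f"no scheme defined for iteration {iteration}")
    return list(num_candidates[starts[-1]])

schedule = {0: [10, 0, 0], 20: [20, 30, 10]}
for iteration in (0, 19, 20, 50):
    print(iteration, candidates_at_iteration(schedule, iteration))
```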
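Be careful when a frozen and jitted PyTorch model is used together with Ray: Ray has to serialize the model to send it to its workers, and the jitted object cannot be pickled, which gives the error below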

Traceback (most recent call last):
  File "/scratch/10071999/run0/script_sym_painn_freeze.py", line 175, in <module>
    post = ParallelRelaxPostprocess(model=model_calculator, 
  File "/home/tang/opt/agox/agox_gitlab/agox/postprocessors/ray_relax.py", line 44, in __init__
    self.model_key = self.pool_add_module(model)
  File "/home/tang/opt/agox/agox_gitlab/agox/utils/ray_utils.py", line 566, in pool_add_module
    return self.pool.add_module(module)
  File "/home/tang/opt/agox/agox_gitlab/agox/utils/ray_utils.py", line 287, in add_module
    futures = [actor.add_module.remote(module, key) for actor in self.idle_actors]
  File "/home/tang/opt/agox/agox_gitlab/agox/utils/ray_utils.py", line 287, in <listcomp>
    futures = [actor.add_module.remote(module, key) for actor in self.idle_actors]
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/actor.py", line 144, in remote
    return self._remote(args, kwargs)
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/_private/auto_init_hook.py", line 24, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/util/tracing/tracing_helper.py", line 426, in _start_span
    return method(self, args, kwargs, *_args, **_kwargs)
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/actor.py", line 191, in _remote
    return invocation(args, kwargs)
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/actor.py", line 178, in invocation
    return actor._actor_method_call(
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/actor.py", line 1176, in _actor_method_call
    object_refs = worker.core_worker.submit_actor_task(
  File "python/ray/_raylet.pyx", line 3728, in ray._raylet.CoreWorker.submit_actor_task
  File "python/ray/_raylet.pyx", line 3733, in ray._raylet.CoreWorker.submit_actor_task
  File "python/ray/_raylet.pyx", line 694, in ray._raylet.prepare_args_and_increment_put_refs
  File "python/ray/_raylet.pyx", line 685, in ray._raylet.prepare_args_and_increment_put_refs
  File "python/ray/_raylet.pyx", line 732, in ray._raylet.prepare_args_internal
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/_private/serialization.py", line 494, in serialize
    return self._serialize_to_msgpack(value)
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/_private/serialization.py", line 472, in _serialize_to_msgpack
    pickle5_serialized_object = self._serialize_to_pickle5(
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/_private/serialization.py", line 425, in _serialize_to_pickle5
    raise e
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/_private/serialization.py", line 420, in _serialize_to_pickle5
    inband = pickle.dumps(
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/cloudpickle/cloudpickle_fast.py", line 88, in dumps
    cp.dump(obj)
  File "/home/tang/python-virtualenv/spk2/lib/python3.9/site-packages/ray/cloudpickle/cloudpickle_fast.py", line 733, in dump
    return Pickler.dump(self, obj)
RuntimeError: Tried to serialize object __torch__.schnetpack.model.base.___torch_mangle_15.NeuralNetworkPotential which does not have a __getstate__ method defined!
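
The traceback above ends in a pickling failure, so a rough pre-check is to try pickling the object before handing it to a Ray-based module. The helper below is a hypothetical sketch using only the standard library. Note that Ray uses its own cloudpickle, which supports more object types than pickle, so this check can be stricter than Ray itself.

```python
import pickle
import threading

def can_be_pickled(obj):
    """Hypothetical pre-check: return (True, "") if obj pickles, else (False, reason)."""
    try:
        pickle.dumps(obj)
    except Exception as error:  # pickling can fail with several exception types
        return False, f"{type(error).__name__}: {error}"
    return True, ""

ok, _ = can_be_pickled({"energies": [1.0, 2.0]})
locked, reason = can_be_pickled(threading.Lock())  # a lock is a known unpicklable object
print(ok, locked, reason)
```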

Acquisitor

LowerConfidenceBoundAcquisitor (sym_dev)

novelty: selects candidates depending on which space groups have already been visited
cutoff: cutoff value for energy uncertainty
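
For orientation, a generic lower-confidence-bound ranking looks like the sketch below. It is not the sym_dev implementation: the function name and kappa are hypothetical, the novelty term is left out, and treating cutoff as a cap on the predicted uncertainty is only one possible reading of the parameter.

```python
import numpy as np

def lower_confidence_bound(energies, sigmas, kappa=2.0, cutoff=None):
    """Generic LCB: predicted energy minus kappa times predicted uncertainty."""
    energies = np.asarray(energies, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    if cutoff is not None:
        sigmas = np.minimum(sigmas, cutoff)  # assumed meaning: cap the uncertainty
    return energies - kappa * sigmas

energies = [-1.0, -0.5, -0.8]  # made-up model predictions
sigmas = [0.0, 0.5, 0.05]      # made-up uncertainties

order = np.argsort(lower_confidence_bound(energies, sigmas))
order_capped = np.argsort(lower_confidence_bound(energies, sigmas, cutoff=0.1))
print(order, order_capped)  # lowest LCB first
```

Without a cap, the very uncertain second candidate ranks first. With the cap, the candidate with the lowest predicted energy wins.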