Geometry optimization

General optimization:

For most electronic structure methods, the potential energy as a function of nuclear coordinates is not analytically available. To optimise the structure, the only feasible way is to keep trying different geometries until we are quite certain that the structure with the lowest energy has been found. Different geometry optimization methods have different ways to change geometries. Since the potential energy is a multi variable function, the most wanted optimization strategy aims at moving structure in the direction lowering the energy with respect to all coordinates. This direction is the opposite of the gradient vector, $\bold{g}$, which is defined as: $$ \bold{g} = \begin{bmatrix} \frac{\partial E}{\partial x _ 1} \\ \frac{\partial E}{\partial x _ 2} \\ \dots \\ \frac{\partial E}{\partial x _ n} \end{bmatrix} $$


Suppose the structure to be optimized has current position $\bold{x^k}$, which is a function has $n$ variables ($n=3N_{atoms}-6(5)$). The structure optimization method tries to place the structure in a new position $\bold{x^{k+1}}$ using the displacement $\Delta\bold{x}=\bold{x^{k+1}}-\bold{x^k}$. The energy at new position can be written in Taylor expansions truncating at second order: $$ E(\bold{x^{k+1}}) = E(\bold{x^k}) + (\frac{\partial E(\bold{x^k})}{\partial x^k_1} \Delta x_1 + \frac{\partial E(\bold{x^k}) }{\partial x^k_2} \Delta x_2 + \dots) + \frac{1}{2} (\frac{\partial^2 E(\bold{x^k})}{\partial {x^k_1}^2} \Delta x^2_1 + \frac{\partial^2 E(\bold{x^k})}{\partial {x^k_2}^2} \Delta x^2_2 + \dots) $$ In the matrix form, $E(\bold{x^{k+1}})$ is written as: $$ E(\bold{x^{k+1}}) = E(\bold{x^k}) + (\bold{g^k})^T \Delta\bold{x} + \frac{1}{2} \Delta\bold{x}^T \bold{H}^k \Delta\bold{x} $$ The optimization goal is to make the new structure a local minimum on PES of which the first derivative is zero: $\frac{\partial E(\bold{x^{k+1}})}{\partial {x^{k+1}}_i} = 0$. Applying this stationary condition to the Taylor expansions of $E(\bold{x^{k+1}})$: $$ \begin{aligned} \frac{\partial E(\bold{x^{k+1}})}{\partial {x^{k+1}}_1} = 0 + ( \frac{\partial E(\bold{x^k})}{\partial x^k_1} * 1 + 0 + \dots) + \frac{1}{2} (\frac{\partial^2 E(\bold{x^k})}{\partial {x^k_1}^2} * 2 \Delta x_1 + 0 + \dots) &= 0 \\ \frac{\partial E(\bold{x^{k+1}})}{\partial {x^{k+1}}_2} = 0 + (0 + \frac{\partial E(\bold{x^k})}{\partial x^k_2} * 1 + \dots) + \frac{1}{2} (0 + \frac{\partial^2 E(\bold{x^k})}{\partial {x^k_2}^2} * 2 \Delta x_2 + \dots) &= 0 \\ &\dots \\ \end{aligned} $$ To clean this up: $\frac{\partial E(\bold{x^k})}{\partial x^k_i} + \frac{\partial^2 E(\bold{x^k})}{\partial {x^k_i}^2} \Delta x_i = 0 \text{ (for i in 1, 2, \dots, n)}$. In the martrix form, it is written as: $$ \begin{aligned} \bold{g}^k + \bold{H}^k \Delta{\bold x} &= 0 \\ \Delta{\bold x} &= - {(\bold{H}^k)}^{-1} \bold{g}^k \end{aligned} $$

Quality of Newton-Raphson-based optimization:

  1. Hessian quality (exact or approximated, how to update)
  2. Step control (augmented Hessian, choice of shift parameters)
  3. Coordinates (Cartesian, internal)
  4. Trust radius update (maximum step size allowed)