Calculate the gradient of a prediction w.r.t. the inputs

# Get into evaluation (predictive posterior) mode
model.eval()
likelihood.eval()

# one point at a time
test_x = torch.tensor([1.8], requires_grad=True)

with gpytorch.settings.fast_pred_var():
    # Make predictions
    observed_pred = likelihood(model(test_x))
    mean = observed_pred.mean
    lower, upper = observed_pred.confidence_region()

# Backpropagate from the (single-element) predictive mean;
# test_x.grad is None until backward() has been called
mean.backward()
gradient = test_x.grad
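The same autograd pattern can be checked without a trained GPyTorch model. Below is a minimal self-contained sketch in plain PyTorch: a hand-rolled RBF-kernel GP posterior mean on hypothetical toy data stands in for `likelihood(model(test_x))`, and the gradient of the mean w.r.t. the test input is obtained with `backward()` exactly as above. The kernel, lengthscale, and data are illustrative assumptions, not GPyTorch internals.

```python
import torch

# Toy stand-in for a GP posterior mean, hand-rolled so the example is
# self-contained. rbf() is an assumed squared-exponential kernel.
def rbf(a, b, lengthscale=0.3):
    d2 = (a.unsqueeze(-1) - b.unsqueeze(-2)) ** 2
    return torch.exp(-0.5 * d2 / lengthscale**2)

# Hypothetical training data: y = sin(3x) on [0, 2]
train_x = torch.linspace(0, 2, 20, dtype=torch.float64)
train_y = torch.sin(train_x * 3.0)
noise = 1e-4

# Precompute alpha = (K + noise*I)^{-1} y
K = rbf(train_x, train_x) + noise * torch.eye(len(train_x), dtype=torch.float64)
alpha = torch.linalg.solve(K, train_y)

# One test point, tracked by autograd
test_x = torch.tensor([1.8], dtype=torch.float64, requires_grad=True)
mean = rbf(test_x, train_x) @ alpha   # posterior mean at test_x

# Backpropagate from the single-element mean to the input
mean.backward()
gradient = test_x.grad
print(gradient)
```

With a real GPyTorch model the only change is that `mean` comes from `likelihood(model(test_x)).mean` instead of the hand-rolled posterior.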

GPU memory

I tried using Oganov global fingerprints as inputs (shape (100, 270)) and the total energy plus its derivatives w.r.t. the feature vector as outputs (shape (100, 271)) in GPModelWithDerivatives. Training then ran out of CUDA memory on a Tesla P100-PCIE-12GB. Reducing the number of training points from 100 to 50 let training run normally, but prediction still ran out of CUDA memory. With 30 training points, both training and prediction worked fine. This is expected: with derivative observations over N points in D dimensions, the covariance matrix has shape (N(D+1), N(D+1)), so memory grows quadratically in both N and D.
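A back-of-the-envelope calculation makes the numbers above plausible. The sketch below only sizes a single dense float32 covariance matrix of shape (N(D+1), N(D+1)); the actual peak usage depends on which intermediates GPyTorch's solver allocates, so treat this as a lower bound, not an exact accounting.

```python
# Rough memory estimate for the derivative-enabled kernel matrix used by
# GPModelWithDerivatives: shape (N*(D+1), N*(D+1)) for N points in D dims.
def kernel_gib(n_points, n_dims, bytes_per_elem=4):
    """GiB for one dense float32 covariance matrix."""
    side = n_points * (n_dims + 1)
    return side * side * bytes_per_elem / 2**30

for n in (100, 50, 30):
    print(n, round(kernel_gib(n, 270), 2))
```

For D = 270, a single matrix is already about 2.74 GiB at N = 100, 0.68 GiB at N = 50, and 0.25 GiB at N = 30; training and prediction need several such intermediates at once, which is consistent with a 12 GB card failing at N = 100.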

scaling train_y

When training on derivative observations such as atomic forces, the amplitude of the target values (energies) is typically much larger than that of their derivatives. It is therefore important to scale both the target values and the derivatives before training, and to inverse-transform the predictions back to the original amplitudes. Note that standardizing y as (y - mu) / sigma scales the derivatives by 1/sigma only, since the shift mu drops out under differentiation.
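A minimal sketch of such a transform, assuming the GPModelWithDerivatives target layout of shape (N, D+1) where column 0 is the energy and columns 1..D are its derivatives. The function names `scale_targets` and `unscale_predictions` are hypothetical helpers, not part of GPyTorch.

```python
import torch

def scale_targets(train_y):
    """Standardize energies; derivatives are scaled by 1/sigma only,
    because the shift mu vanishes under differentiation."""
    mu = train_y[:, 0].mean()
    sigma = train_y[:, 0].std()
    scaled = train_y.clone()
    scaled[:, 0] = (train_y[:, 0] - mu) / sigma
    scaled[:, 1:] = train_y[:, 1:] / sigma
    return scaled, mu, sigma

def unscale_predictions(pred_y, mu, sigma):
    """Inverse transform model outputs back to original amplitudes."""
    out = pred_y.clone()
    out[:, 0] = pred_y[:, 0] * sigma + mu
    out[:, 1:] = pred_y[:, 1:] * sigma
    return out
```

A round trip `unscale_predictions(*scale_targets(y)[:1], mu, sigma)` recovers the original targets, so the same pair can be applied to the predictive mean (and, with `sigma**2`, to the variance).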