# GPs as regressors for biophyiscal parameter inversion¶

GPs are a general regression technique, and can be used to regress some wanted magnitude from a set of inputs. This isn’t as cool as other things you can do with them, but it’s feasible to do… GPs are flexible for regression and interpolation, but given that this library has a strong remote sensing orientation, we’ll consider their use for bionpysical parameter extraction from Sentinel-2 data (for example).

## Retrieving bionpysical parameters for Sentinel-2¶

Let’s assume that we want to retrieve leaf area index (LAI) from Sentinel-2 surface reflectance data. The regression problem can be stated as one where the inputs to the regressor are the spectral measurements of a pixel, and the output is the retrieved LAI. We can do this mapping by pairing in situ measurements, or we can just use a standard RT model to provide the direct mapping, and then learn the inverse mapping using the GP.

Although the problem is easy, we know that other parameters will have an effect in the measuremed reflectance, so we can only expect this to work over a limited spread of parameters other than LAI. Here, we show how to use the gp_emulator helper functions to create a suitable training set, and perform this.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 import numpy as np import scipy.stats import gp_emulator import prosail import matplotlib.pyplot as plt np.random.seed(42) # Define number of training and validation samples n_train = 200 n_validate = 500 # Define the parameters and their spread parameters = ["n", "cab", "car", "cbrown", "cw", "cm", "lai", "ala"] p_mins = [1.6, 25, 5, 0.0, 0.01, 0.01, 0., 32.] p_maxs = [2.1, 90, 20, 0.4, 0.014, 0.016, 7., 57.] # Create the training samples training_samples, distributions = gp_emulator.create_training_set(parameters, p_mins, p_maxs, n_train=n_train) # Create the validation samples validation_samples = gp_emulator.create_validation_set(distributions, n_validate=n_validate) # Load up the spectral response functions for S2 srf = np.loadtxt("S2A_SRS.csv", skiprows=1, delimiter=",")[100:, :] srf[:, 1:] = srf[:, 1:]/np.sum(srf[:, 1:], axis=0) srf_land = srf[:, [ 2, 3, 4, 5, 6, 7, 8, 9, 12, 13]].T # Generate the reflectance training set by running the RT model # for each entry in the training set, and then applying the # spectral basis functions. training_s2 = np.zeros((n_train, 10)) for i, p in enumerate(training_samples): refl = prosail.run_prosail (p, p, p, p, p, p, p, p, 0.001, 30., 0, 0, prospect_version="D", rsoil=0., psoil=0, rsoil0=np.zeros(2101)) training_s2[i, :] = np.sum(refl*srf_land, axis=-1) # Generate the reflectance validation set by running the RT model # for each entry in the validation set, and then applying the # spectral basis functions. validation_s2 = np.zeros((n_validate, 10)) for i, p in enumerate(validation_samples): refl = prosail.run_prosail (p, p, p, p, p, p, p, p, 0.001, 30., 0, 0, prospect_version="D", rsoil=0., psoil=0, rsoil0=np.zeros(2101)) validation_s2[i, :] = np.sum(refl*srf_land, axis=-1) # Define and train the emulator from reflectance to LAI gp = gp_emulator.GaussianProcess(inputs=training_s2, targets=training_samples[:, 6]) gp.learn_hyperparameters(n_tries=15, verbose=False) # Predict the LAI from the reflectance ypred, _, _ = gp.predict(validation_s2) # Plot fig = plt.figure(figsize=(7,7)) plt.plot(validation_samples[:, 6], ypred, 'o', mfc="none") plt.plot([p_mins, p_maxs], [p_mins, p_maxs], '--', lw=3) x = np.linspace(p_mins, p_maxs, 100) regress = scipy.stats.linregress(validation_samples[:, 6], ypred) plt.plot(x, regress.slope*x + regress.intercept, '-') plt.xlabel(r"Validation LAI $[m^{2}m^{-2}]$") plt.ylabel(r"Retrieved LAI $[m^{2}m^{-2}]$") plt.title("Slope=%8.4f, "%(regress.slope) + "Intercept=%8.4f, "%(regress.intercept) + "$R^2$=%8.3f" % (regress.rvalue**2)) Using Gaussian Processes to regress leaf area index (LAI) from Sentinel-2 data using the PROSAIL RT model. Comparison between the true LAI and retrieved LAI using the GPs.

The results are quite satisfactory. Another issue is whether these results will work as well on real Sentinel-2 data of random vegetation classes!!! One reason why they won’t is because above I have assumed the soil to be black. While this won’t matter for situations with large canopy cover, it will for low LAI.