.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/gaussian_processes/plot_gp_regression_ard.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_examples_gaussian_processes_plot_gp_regression_ard.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_gaussian_processes_plot_gp_regression_ard.py:


Gaussian Process Regression with Automatic Relevance Determination (ARD)
=========================================================================

This example fits a Gaussian process (GP) regression model to the Boston
housing dataset, learning a separate lengthscale for each input feature.
The inverse of each learned lengthscale indicates how relevant the
corresponding feature is for predicting the target.

.. GENERATED FROM PYTHON SOURCE LINES 8-26

.. code-block:: default


    import numpy as np
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    import tensorflow_probability as tfp

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd

    from sklearn.datasets import load_boston
    from sklearn.preprocessing import normalize
    from sklearn.utils import shuffle

    from collections import defaultdict

.. GENERATED FROM PYTHON SOURCE LINES 28-33

.. code-block:: default


    # shortcuts
    tfd = tfp.distributions
    kernels = tfp.math.psd_kernels

.. GENERATED FROM PYTHON SOURCE LINES 34-57

.. code-block:: default


    # environment
    golden_ratio = 0.5 * (1 + np.sqrt(5))


    def golden_size(width):
        return (width, width / golden_ratio)


    width = 10.0

    rc = {
        "figure.figsize": golden_size(width),
        "font.serif": ['Times New Roman'],
        "text.usetex": False,
    }

    sns.set(context="notebook",
            style="ticks",
            palette="colorblind",
            font="serif",
            rc=rc)

.. GENERATED FROM PYTHON SOURCE LINES 58-71

.. code-block:: default


    # constants
    boston = load_boston()
    num_train, num_features = boston.data.shape
    num_epochs = 500

    kernel_cls = kernels.MaternFiveHalves

    jitter = 1e-6

    seed = 42  # set random seed for reproducibility
    random_state = np.random.RandomState(seed)

.. GENERATED FROM PYTHON SOURCE LINES 72-74

The dataset contains 506 data points with 13 continuous/categorical features
and a single target, the median property value (MEDV).

.. GENERATED FROM PYTHON SOURCE LINES 74-81

.. code-block:: default


    X, Y = shuffle(normalize(boston.data), boston.target,
                   random_state=random_state)
    print(X.shape, Y.shape)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    (506, 13) (506,)

.. GENERATED FROM PYTHON SOURCE LINES 82-83

We can load this dataset into a Pandas DataFrame for ease of visualization.

.. GENERATED FROM PYTHON SOURCE LINES 83-87

.. code-block:: default


    boston_df = pd.DataFrame(boston.data, columns=boston.feature_names) \
                  .assign(MEDV=boston.target)

.. GENERATED FROM PYTHON SOURCE LINES 88-90

For example, we can see how the median property value (MEDV) varies with the
average number of rooms per dwelling (RM).

.. GENERATED FROM PYTHON SOURCE LINES 90-97

.. code-block:: default


    fig, ax = plt.subplots()

    sns.scatterplot(x="RM", y="MEDV", data=boston_df, alpha=.8, ax=ax)

    plt.show()

.. image:: /auto_examples/gaussian_processes/images/sphx_glr_plot_gp_regression_ard_001.png
    :alt: plot gp regression ard
    :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 98-113

Empirical Bayes (maximizing the log marginal likelihood)
---------------------------------------------------------

Now let us fit a GP regression model to this dataset. We use the Matérn 5/2
covariance function as before, except now we use the anisotropic variant of
the kernel, incorporating a lengthscale vector of 13 positive scalars, one
per input feature.

These hyperparameters determine how far you need to move along a particular
axis in the input space for the function values to become uncorrelated. By
estimating them we effectively implement automatic relevance determination,
as the inverse of a lengthscale measures the relevance of the corresponding
dimension: if a lengthscale is very large, the covariance becomes practically
independent of that input, effectively removing it from the inference
(Rasmussen & Williams, 2006, GPML Section 5.1).

.. GENERATED FROM PYTHON SOURCE LINES 113-141

.. code-block:: default


    # Base (isotropic) kernel with a scalar base lengthscale
    amplitude = tf.exp(tf.Variable(np.float64(0), name='amplitude'))
    length_scale = tf.exp(tf.Variable(np.float64(-1), name='length_scale'))
    base_kernel = kernel_cls(amplitude=amplitude, length_scale=length_scale)

    # ARD (anisotropic) kernel with a vector of varying _effective_
    # lengthscales. FeatureScaled rescales each input dimension by the
    # corresponding entry of scale_diag, so the effective lengthscale along
    # dimension d is length_scale * scale_diag[d].
    scale_diag = tf.exp(tf.Variable(np.zeros(num_features), name='scale_diag'))
    kernel = kernels.FeatureScaled(base_kernel, scale_diag=scale_diag)

    # Finalize the model
    observation_noise_variance = tf.exp(
        tf.Variable(np.float64(-5), name='observation_noise_variance'))

    gp = tfd.GaussianProcess(
        kernel=kernel,
        index_points=X,
        observation_noise_variance=observation_noise_variance
    )

    # negative log marginal likelihood, minimized with respect to the kernel
    # hyperparameters and the observation noise variance
    nll = - gp.log_prob(Y)

    optimizer = tf.train.AdamOptimizer(learning_rate=0.05, beta1=0.5, beta2=.99)
    optimize = optimizer.minimize(nll)
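As a quick sanity check of the semantics relied on later, ``FeatureScaled``
applies the base kernel to inputs divided elementwise by ``scale_diag``, which
is exactly what makes ``length_scale * scale_diag[d]`` the effective
lengthscale along dimension ``d``. The following minimal sketch (not part of
the original example; the names ``x1``, ``x2``, ``k_ard`` and ``k_rescaled``
are our own) verifies this equivalence on a few random points:

.. code-block:: default


    # Sanity check (not in the original script): the ARD kernel should agree
    # with the base kernel applied to inputs rescaled by 1 / scale_diag.
    x1 = random_state.rand(5, num_features)
    x2 = random_state.rand(5, num_features)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        k_ard, k_rescaled = sess.run([
            kernel.apply(x1, x2),
            base_kernel.apply(x1 / scale_diag, x2 / scale_diag),
        ])

    np.testing.assert_allclose(k_ard, k_rescaled)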
.. GENERATED FROM PYTHON SOURCE LINES 142-144

Training loop
-------------

.. GENERATED FROM PYTHON SOURCE LINES 144-162

.. code-block:: default


    history = defaultdict(list)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for i in range(num_epochs):
            (_, nll_value, amplitude_value, length_scale_value,
             observation_noise_variance_value, scale_diag_value) = sess.run([
                optimize, nll, amplitude, length_scale,
                observation_noise_variance, scale_diag])
            history["nll"].append(nll_value)
            history["amplitude"].append(amplitude_value)
            history["length_scale"].append(length_scale_value)
            history["observation_noise_variance"].append(
                observation_noise_variance_value)
            history["scale_diag"].append(scale_diag_value)

    history_df = pd.DataFrame(history)
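Although this example stops at hyperparameter estimation, the point estimates
from the final epoch are all we need to form the exact GP posterior
predictive. A minimal sketch, reusing the last recorded values from
``history`` (the names ``kernel_map``, ``X_new`` and ``gprm`` are our own,
and for illustration we simply treat the first five training inputs as the
prediction points):

.. code-block:: default


    # Hypothetical follow-up (not in the original script): condition the GP
    # on the training data using the final point estimates.
    kernel_map = kernels.FeatureScaled(
        kernel_cls(amplitude=history["amplitude"][-1],
                   length_scale=history["length_scale"][-1]),
        scale_diag=history["scale_diag"][-1])

    X_new = X[:5]  # stand-in for genuinely new index points

    gprm = tfd.GaussianProcessRegressionModel(
        kernel=kernel_map,
        index_points=X_new,
        observation_index_points=X,
        observations=Y,
        observation_noise_variance=history["observation_noise_variance"][-1],
        jitter=jitter)

    with tf.Session() as sess:
        y_mean, y_stddev = sess.run([gprm.mean(), gprm.stddev()])

Here ``y_mean`` and ``y_stddev`` would give the posterior predictive mean and
uncertainty at the chosen index points.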
.. GENERATED FROM PYTHON SOURCE LINES 163-165

Learning curve
^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 165-176

.. code-block:: default


    fig, ax = plt.subplots()

    sns.lineplot(x='index', y='nll', data=history_df.reset_index(),
                 alpha=0.8, ax=ax)
    ax.set_xlabel("epoch")
    ax.set_yscale("log")

    plt.show()

.. image:: /auto_examples/gaussian_processes/images/sphx_glr_plot_gp_regression_ard_002.png
    :alt: plot gp regression ard
    :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 177-179

Visualize scalar hyperparameters over epochs
--------------------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 179-186

.. code-block:: default


    scalars = ["nll", "amplitude", "length_scale", "observation_noise_variance"]
    scalars_history_df = history_df[scalars]

    g = sns.PairGrid(scalars_history_df, hue="nll", palette="viridis")
    g = g.map_lower(plt.scatter)

.. image:: /auto_examples/gaussian_processes/images/sphx_glr_plot_gp_regression_ard_003.png
    :alt: plot gp regression ard
    :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 187-189

Visualize input feature scales over epochs
------------------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 189-205

.. code-block:: default


    d = pd.DataFrame(history["scale_diag"], columns=boston.feature_names)
    d.index.name = "epoch"
    d.columns.name = "feature"

    s = d.stack()
    s.name = "scale_diag"

    data = s.reset_index()

    fig, ax = plt.subplots()

    sns.lineplot(x='epoch', y="scale_diag", hue="feature", palette="tab20",
                 sort=False, data=data, alpha=0.8, ax=ax)
    ax.set_yscale("log")

    plt.show()

.. image:: /auto_examples/gaussian_processes/images/sphx_glr_plot_gp_regression_ard_004.png
    :alt: plot gp regression ard
    :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 206-211

Visualize effective lengthscales
--------------------------------

We display a bar chart of the *effective* lengthscale for each input
dimension, i.e. the base lengthscale multiplied by the learned per-feature
scale.

.. GENERATED FROM PYTHON SOURCE LINES 211-226

.. code-block:: default


    base_length_scale_final = scalars_history_df.length_scale.iloc[-1]
    scale_diag_final = d.iloc[-1]

    effective_length_scale_final = base_length_scale_final * scale_diag_final
    effective_length_scale_final.name = "effective_length_scale"

    data = effective_length_scale_final.reset_index()

    fig, ax = plt.subplots()

    sns.barplot(x='feature', y="effective_length_scale", palette="tab20",
                data=data, alpha=0.8, ax=ax)

    plt.show()

.. image:: /auto_examples/gaussian_processes/images/sphx_glr_plot_gp_regression_ard_005.png
    :alt: plot gp regression ard
    :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 227-230

Plotting MEDV against the feature with the smallest effective lengthscale,
i.e. the most relevant feature, we find that the two are indeed strongly
related.

.. GENERATED FROM PYTHON SOURCE LINES 230-238

.. code-block:: default


    fig, ax = plt.subplots()

    sns.scatterplot(x=effective_length_scale_final.idxmin(), y='MEDV',
                    data=boston_df, alpha=.8, ax=ax)

    plt.show()

.. image:: /auto_examples/gaussian_processes/images/sphx_glr_plot_gp_regression_ard_006.png
    :alt: plot gp regression ard
    :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 239-240

Conversely, the feature with the largest effective lengthscale, i.e. the
least relevant one, shows little relationship with MEDV.

.. GENERATED FROM PYTHON SOURCE LINES 240-247

.. code-block:: default


    fig, ax = plt.subplots()

    sns.scatterplot(x=effective_length_scale_final.idxmax(), y='MEDV',
                    data=boston_df, alpha=.8, ax=ax)

    plt.show()

.. image:: /auto_examples/gaussian_processes/images/sphx_glr_plot_gp_regression_ard_007.png
    :alt: plot gp regression ard
    :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 23.350 seconds)


.. _sphx_glr_download_auto_examples_gaussian_processes_plot_gp_regression_ard.py:

.. only:: html

  .. container:: sphx-glr-footer
     :class: sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

       :download:`Download Python source code: plot_gp_regression_ard.py <plot_gp_regression_ard.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

       :download:`Download Jupyter notebook: plot_gp_regression_ard.ipynb <plot_gp_regression_ard.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_