[1]:
# To run this in Google Colab, uncomment the following line
# !pip install geometric_kernels

# If you want to use a version of the library from a specific branch on GitHub,
# say, from the "devel" branch, uncomment the line below instead
# !pip install "git+https://github.com/geometric-kernels/GeometricKernels@devel"

Different Approximations of Matérn Kernels on \(\mathbb{H}_2\)

This notebook compares two different approximations of the heat kernel (Matérn-\(\infty\)) on the hyperbolic space \(\mathbb{H}_2\).

We use numpy for the computations and matplotlib for visualization.

Contents

- Introduction
- Instantiate space and kernel
- Plot kernel values on a “grid”
- Plot kernel values along a geodesic
- Rejection sampling feature map/kernel
- Citation

Introduction

We employ the kernel defined by random phase features. A complete mathematical treatment can be found in the references listed in the Citation section at the end of this notebook. For the purposes of this notebook, it suffices to say that on noncompact symmetric spaces (of which the hyperbolic space is a fine specimen), the kernel is defined in terms of an intractable integral over a probability density (called the spectral density). To approximate the integral, we use Monte Carlo sampling from the spectral density.
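Schematically, with normalization constants suppressed and the exact integrand left unspecified (the precise expressions can be found in the references below), the kernel and its Monte Carlo approximation read \(k_\nu(x, x') = \int \pi^{(\lambda)}(x, x') \rho_\nu(\lambda) \, d\lambda \approx \frac{1}{L} \sum_{l=1}^{L} \pi^{(\lambda_l)}(x, x')\) with \(\lambda_l \sim \rho_\nu\), where \(\rho_\nu\) is the spectral density, \(\pi^{(\lambda)}\) denotes certain \(\lambda\)-dependent positive definite functions whose exact form we do not need here, and \(L\) is the number of Monte Carlo samples, i.e. of random phases.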

This is in contrast to compact spaces (which we call DiscreteSpectrumSpaces), where the kernel is the sum of a series, which can be approximated by truncation.
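Schematically, on a compact space the kernel is given by a series over the eigenvalues \(\lambda_n\) and eigenfunctions \(f_n\) of the Laplacian, \(k_\nu(x, x') = \sum_{n=0}^{\infty} \Phi_\nu(\lambda_n) f_n(x) f_n(x') \approx \sum_{n=0}^{N} \Phi_\nu(\lambda_n) f_n(x) f_n(x')\), where \(\Phi_\nu\) is a decaying spectral function determined by the smoothness \(\nu\) and the lengthscale, and the approximation consists in truncating the series after \(N\) terms (normalization is again suppressed).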

In the Hyperbolic space, there are several methods for sampling from the spectral density. We will investigate them in this notebook.

For practical purposes, we just have to choose the number of random phases (the more we use, the better the approximation, but the more computation is required), specify a feature map, and pass it to the MaternFeatureMapKernel. The kernel requires a key to instantiate the randomness in the random phases; once instantiated, the kernel is a deterministic function.

[1]:
from geometric_kernels.kernels import MaternFeatureMapKernel
from geometric_kernels.feature_maps import RandomPhaseFeatureMapNoncompact, RejectionSamplingFeatureMapHyperbolic
from geometric_kernels.spaces.hyperbolic import Hyperbolic

import geomstats.visualization as visualization
import matplotlib.pyplot as plt
import numpy as np
INFO (geometric_kernels): Numpy backend is enabled. To enable other backends, don't forget to `import geometric_kernels.*backend name*`.
INFO (geometric_kernels): We may be suppressing some logging of external libraries. To override the logging policy, call `logging.basicConfig`.

Instantiate space and kernel

In the numpy world, the key is an np.random.RandomState:

[2]:
_num_random_phases = 3_000  # This is the default value, we just state it for completeness
[3]:
key = np.random.RandomState(seed=1234)
[4]:
hyperboloid = Hyperbolic(dim=2)
feature_map = RandomPhaseFeatureMapNoncompact(hyperboloid, num_random_phases=_num_random_phases)
kernel = MaternFeatureMapKernel(hyperboloid, feature_map, key)

params = kernel.init_params()
params["nu"] = np.array([np.inf])
params["lengthscale"] = np.array([1.0])

Plot kernel values on a “grid”

We construct a quasi-grid in the hyperbolic space. Since Hyperbolic inherits from geomstats' Hyperbolic, we can exploit this:

[5]:
s = np.linspace(-5, 5, 100)
xx, yy = np.meshgrid(s, s)
points = np.c_[xx.ravel(), yy.ravel()]
points = hyperboloid.from_coordinates(points, "intrinsic")
[6]:
base_point = hyperboloid.from_coordinates(np.r_[0, 0], "intrinsic").reshape(1, 3)

Again, we use the geomstats visualization module to plot data on the hyperbolic (Poincaré) disk.

[7]:
kernel_vals = kernel.K(params, base_point, points)
[8]:
plt.figure(figsize=(5,5))
visualization.plot(points, space="H2_poincare_disk", c=kernel_vals, cmap="plasma")
plt.show()
../../_images/examples_other_Hyperbolic_Approximations_17_0.png

We can also visualize the kernel values on the hyperboloid itself. We reorder the coordinates to get a better view:

[9]:
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
colors = kernel_vals[0]
ax.scatter(points[:, 2], points[:, 1], points[:, 0], c=colors, cmap='plasma')
plt.show()
../../_images/examples_other_Hyperbolic_Approximations_19_0.png

Plot kernel values along a geodesic

Let’s plot the kernel values K(x, y), where y runs along the geodesic between base and end_point.

[10]:
base = np.r_[7.14142843, -5.0, -5.0]
end_point = np.r_[14.17744688, 10.0, 10.0]

geodesic = hyperboloid.metric.geodesic(initial_point=base, end_point=end_point)
x1 = geodesic(np.linspace(0.0, 1.0, 30))
x2 = x1[0, None]

distances = hyperboloid.distance(x1, x2)
[11]:
kernel_vals = kernel.K(params, x1, x2)
[12]:
plt.plot(distances, kernel_vals, color="red", linewidth=1)
plt.xlabel('distance')
plt.ylabel('K')
plt.show()
../../_images/examples_other_Hyperbolic_Approximations_23_0.png

Rejection sampling feature map/kernel

Let’s investigate the behaviour of another feature map defined on the hyperbolic space. This feature map exploits the specific form that the spectral density takes on the hyperbolic space and samples from it using rejection sampling.

In the limit, as _num_random_phases tends to infinity, both feature maps give the same kernel. To contrast the two, we call the previous one “naive sampling based”.

The math can be found in [1].
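As a refresher, below is a generic, self-contained sketch of rejection sampling in numpy. It is not the implementation used inside RejectionSamplingFeatureMapHyperbolic, and all names in it are made up for illustration: to sample from a target density p, we draw candidates from a proposal q and accept a candidate x with probability p(x) / (M q(x)), where M is an upper bound on the ratio p / q.

import numpy as np

def rejection_sample(p, q_sample, q_pdf, M, n_samples, rng):
    # Draw candidates from the proposal until n_samples of them are accepted.
    samples = []
    while len(samples) < n_samples:
        x = q_sample(rng)              # candidate from the proposal distribution
        u = rng.uniform()              # uniform variate for the accept/reject test
        if u < p(x) / (M * q_pdf(x)):  # accept with probability p(x) / (M q(x))
            samples.append(x)
    return np.array(samples)

# Toy example: sample a half-normal target using an Exp(1) proposal.
rng = np.random.RandomState(0)
p = lambda x: np.sqrt(2 / np.pi) * np.exp(-x**2 / 2)  # half-normal density on [0, inf)
q_pdf = lambda x: np.exp(-x)                          # Exp(1) density
q_sample = lambda rng: rng.exponential(1.0)
M = np.sqrt(2 * np.e / np.pi)                         # upper bound on p(x) / q_pdf(x)
xs = rejection_sample(p, q_sample, q_pdf, M, n_samples=1000, rng=rng)

The feature map applies the same accept/reject idea to the spectral density of \(\mathbb{H}_2\).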

[13]:
feature_map_rs = RejectionSamplingFeatureMapHyperbolic(hyperboloid, num_random_phases=_num_random_phases)
[14]:
kernel_rs = MaternFeatureMapKernel(hyperboloid, feature_map_rs, key)
[15]:
kernel_vals_rs = kernel_rs.K(params, x1, x2)

Let’s compare the kernels given by the two feature maps. They should be pretty close.

[16]:
plt.plot(distances, kernel_vals, color="red", linewidth=1, label='Naive sampling FM')
plt.plot(distances, kernel_vals_rs, color="green", linewidth=1, label='Rejection Sampling FM')
plt.xlabel('distance')
plt.ylabel('K')
plt.legend()
plt.show()
../../_images/examples_other_Hyperbolic_Approximations_30_0.png

Let’s examine the convergence of the naive and the rejection sampling based approaches as we increase the number of random phases.

[17]:
_num_phases = np.linspace(10, 3000, 20).astype(int)
[18]:
naive_feature_maps = [RandomPhaseFeatureMapNoncompact(hyperboloid, num_random_phases=n) for n in _num_phases]
rs_feature_maps = [RejectionSamplingFeatureMapHyperbolic(hyperboloid, num_random_phases=n) for n in _num_phases]
[19]:
naive_kernels = [MaternFeatureMapKernel(hyperboloid, fm, key) for fm in naive_feature_maps]
rs_kernels = [MaternFeatureMapKernel(hyperboloid, fm, key) for fm in rs_feature_maps]

Let’s evaluate the kernels on the same set of points, effectively getting two sequences of vectors of kernel values.

To evaluate the convergence, we compute the \(L_2\)-distances between subsequent vectors in each sequence.
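In symbols, if \(\mathbf{k}^{(i)}\) denotes the vector of kernel values obtained with the \(i\)-th number of random phases, the quantity plotted below is \(\delta_i = \lVert \mathbf{k}^{(i+1)} - \mathbf{k}^{(i)} \rVert_2\).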

[20]:
naive_kernels_vals = [k.K(params, x1, x2) for k in naive_kernels]
rs_kernels_vals = [k.K(params, x1, x2) for k in rs_kernels]
[21]:
naive_diffs = np.diff(naive_kernels_vals, axis=0)
rs_diffs = np.diff(rs_kernels_vals, axis=0)

naive_l2s = np.sqrt(np.sum(naive_diffs**2, axis=1))
rs_l2s = np.sqrt(np.sum(rs_diffs**2, axis=1))
[22]:
plt.plot(_num_phases[:-1], naive_l2s.ravel(), label='Naive sampling')
plt.plot(_num_phases[:-1], rs_l2s.ravel(), label='Rejection sampling')
plt.legend()
plt.show()
../../_images/examples_other_Hyperbolic_Approximations_38_0.png

The rejection sampling based kernels show a better rate of convergence.

To illustrate this further, let’s plot the kernel values for all the kernels we have computed.

The rejection sampling based kernel values are more concentrated around the true kernel, whereas the naive sampling based kernel values show higher distortion.

[23]:
plt.figure(figsize=(10, 5))
for naive_v, rs_v in zip(naive_kernels_vals, rs_kernels_vals):
    plt.subplot(121)
    plt.plot(distances, naive_v, linewidth=1, c='k', alpha=0.3)
    plt.subplot(122)
    plt.plot(distances, rs_v, linewidth=1, c='k', alpha=0.3)

plt.subplot(121)
plt.title('Naive sampling')
plt.ylim(-0.1, 1.1)
plt.subplot(122)
plt.title('Rejection sampling')
plt.ylim(-0.1, 1.1)
plt.xlabel('distance')
plt.ylabel('K')
plt.show()
../../_images/examples_other_Hyperbolic_Approximations_40_0.png

Citation

If you are using GeometricKernels and hyperbolic spaces, please consider citing

@article{mostowsky2024,
  title = {The GeometricKernels Package: Heat and Matérn Kernels for Geometric Learning on Manifolds, Meshes, and Graphs},
  author = {Peter Mostowsky and Vincent Dutordoir and Iskander Azangulov and Noémie Jaquier and Michael John Hutchinson and Aditya Ravuri and Leonel Rozo and Alexander Terenin and Viacheslav Borovitskiy},
  year = {2024},
  journal = {arXiv:2407.08086},
}
@article{azangulov2024b,
  title = {Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces II: non-compact symmetric spaces},
  author = {Azangulov, Iskander and Smolensky, Andrei and Terenin, Alexander and Borovitskiy, Viacheslav},
  journal = {Journal of Machine Learning Research},
  year = {2024},
  volume = {25},
  number = {281},
  pages = {1--51},
}