Escaping GKE gVisor sandboxing using metadata

Note: This bug has since been patched by the Google Cloud team.

Introduction

GKE is a Google Cloud service that offers managed Kubernetes clusters. The cluster nodes run on Google Cloud VM instances, while the control plane and network are fully managed by GKE.

GKE offers a sandboxing feature (https://cloud.google.com/kubernetes-engine/docs/concepts/sandbox-pods) based on gVisor (https://gvisor.dev/docs/) that protects the host kernel from untrusted code. This sandboxing provides strong isolation and allows SaaS businesses to execute unknown code submitted by their users.

I tried to use this feature to run isolated workloads and found that the isolation was not entirely effective: under certain conditions, the metadata API was accessible from a sandboxed pod.

Network isolation using network policy

By default, all pods in a Kubernetes cluster are able to communicate with each other. GKE recommends using Network Policy to restrict network traffic between pods (https://cloud.google.com/kubernetes-engine/docs/how-to/hardening-your-cluster#restrict_with_network_policy).

When running untrusted code, it is a good practice to isolate your clients from each other and from your own services.

With this feature, it is easy to define a policy, attach it to a group of pods, and restrict network access for these pods.
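As an illustration of what such a policy can look like, here is a minimal sketch that builds a NetworkPolicy manifest in Python and prints it as JSON (which kubectl accepts alongside YAML). The policy name, the namespace and the "tenant" label are hypothetical and not taken from my actual setup; the idea is simply that pods of one tenant only accept traffic from pods carrying the same label.

```python
import json


def isolation_policy(name, namespace, tenant):
    """Build a NetworkPolicy that only admits ingress from pods of the same tenant.

    The "tenant" label key is an assumption made for this example.
    """
    selector = {"matchLabels": {"tenant": tenant}}
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods the policy is attached to.
            "podSelector": selector,
            "policyTypes": ["Ingress"],
            # Only pods with the same tenant label may connect.
            "ingress": [{"from": [{"podSelector": selector}]}],
        },
    }


# Hypothetical names, for illustration only.
manifest = isolation_policy("isolate-tenant-a", "sandbox", "tenant-a")
print(json.dumps(manifest, indent=2))
```

The printed JSON could then be saved to a file and applied with kubectl apply -f, assuming network policy enforcement is enabled on the cluster.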

Sandbox metadata protection

The Google Cloud team documents how to harden workload isolation using GKE Sandbox (https://cloud.google.com/kubernetes-engine/docs/how-to/sandbox-pods#sandboxed-application) and gives some hints on how to configure and test access to the metadata.

To validate that the filtering is properly enabled, you can launch a new pod and run the following command:

curl -s "http://metadata.google.internal/computeMetadata/v1/instance/attributes/kube-env" -H "Metadata-Flavor: Google"

This command fails, as described in the documentation, because a filter denies access to the metadata API.

By default the instance metadata API server is not supposed to be accessible from any sandboxed pod.
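The same check can be scripted, which is handy if you want to run it repeatedly from inside a sandboxed pod. The following is a minimal sketch using only the Python standard library; the function name and the injectable opener parameter are my own additions for this example, and any network error or non-200 answer is treated as "not reachable".

```python
import urllib.request

# Same endpoint and header as the curl command above.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/attributes/kube-env"
)


def metadata_reachable(url=METADATA_URL, timeout=3.0, opener=urllib.request.urlopen):
    """Return True if the metadata API answers with HTTP 200, False otherwise.

    `opener` is a hypothetical hook that lets the network call be replaced,
    for instance in tests; it defaults to urllib.request.urlopen.
    """
    request = urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses:
        # the request was filtered, refused, or could not be resolved.
        return False
```

From inside a correctly filtered sandboxed pod, calling metadata_reachable() is expected to return False. A True result means the metadata API is exposed to the workload.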

Bug found

While testing network isolation for untrusted pods, I configured network policy on the cluster and applied some network filtering rules to the pods that I wanted to isolate.

After more testing, I found out that I was able to query the metadata API. It appears that the network filtering applied by the GKE team to gVisor sandboxed pods was entirely disabled when network policy was activated.

Since this sandboxing feature is meant to run untrusted code, this would give an attacker access to sensitive information about the node, the project and the Kubernetes cluster.

The bug was reported to the VRP team and quickly fixed. On my side, I was able to mitigate it by manually filtering the 169.254.169.254 IP in the network policy applied to these pods.
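For reference, a mitigation along these lines can be sketched as an egress rule that excludes the metadata IP through the ipBlock "except" field of the NetworkPolicy API. This is not the exact policy I applied: the policy name, namespace and pod labels are hypothetical, and the wide 0.0.0.0/0 range is only there to keep the example short. A real policy would most likely allow a narrower set of destinations.

```python
import ipaddress
import json

# Link-local address of the metadata server.
METADATA_CIDR = "169.254.169.254/32"


def egress_policy_blocking_metadata(name, namespace, pod_labels, allowed_cidr="0.0.0.0/0"):
    """Build a NetworkPolicy allowing egress to allowed_cidr minus the metadata IP."""
    # The API requires "except" entries to fall inside the allowed range.
    if not ipaddress.ip_network(METADATA_CIDR).subnet_of(ipaddress.ip_network(allowed_cidr)):
        raise ValueError("metadata IP is outside allowed_cidr; nothing to exclude")
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": pod_labels},
            "policyTypes": ["Egress"],
            "egress": [
                {"to": [{"ipBlock": {"cidr": allowed_cidr, "except": [METADATA_CIDR]}}]}
            ],
        },
    }


# Hypothetical names and labels, for illustration only.
policy = egress_policy_blocking_metadata("deny-metadata", "sandbox", {"tenant": "tenant-a"})
print(json.dumps(policy, indent=2))
```

Network policies are additive, so this rule only helps if no other policy selecting the same pods allows egress to 169.254.169.254.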