
GKE Workload Identity is now named Workload Identity Federation — what else has changed?

Workload Identity Federation for GKE is the recommended way for your workloads running on Google Kubernetes Engine (GKE) to access Google Cloud services in a secure and manageable way.

Previously this feature was called Workload Identity, whereas Workload Identity Federation (WIF) was the name for a similar feature that allowed external identities to access resources within GCP.

Recently, GCP decided to merge Workload Identity into the Workload Identity Federation umbrella. Besides this change in naming, it is now also possible to address Kubernetes entities (clusters, service accounts) directly as IAM principals. This simplifies the process of setting up WIF for GKE by eliminating the need for a dedicated Google Service Account (GSA) and additional bindings, but more on that later.

In this post we’ll look into the newly named Workload Identity Federation for GKE and how it affects running workloads that are already utilizing Workload Identity bindings.


What is Workload Identity Federation?

WIF is a feature designed to bridge the gap between external identity systems and Google Cloud IAM. It enables organizations to extend their existing identity infrastructure, such as Active Directory, AWS, Okta and more, to Google Cloud resources seamlessly.

By establishing trust relationships between external identity providers (IdP) and Google Cloud IAM, WIF allows users and applications authenticated through systems other than GCP to access GCP resources using their existing identities.

To establish this trust relationship, you first create a Workload Identity Pool to manage the external identities, and then you add Workload Identity Providers to the pool, each describing the relationship between your IdP and GCP.

Once established, a credential token from your IdP can be verified by Google’s Security Token Service and exchanged for a federated token used to authenticate to GCP.

In Workload Identity Federation for GKE (previously named simply Workload Identity) Google manages the pool and provider for you.
Each GCP project has a single, fixed pool (PROJECT_ID.svc.id.goog), and once the feature is enabled on a GKE cluster, that cluster is registered as an identity provider on that pool.

The GKE metadata server is then automatically deployed on all cluster nodes as a DaemonSet. Among other things, it’s used to intercept requests coming from your workloads and respond with the correct token for the service account your workload is configured to impersonate.


Grant IAM permissions to your GKE workloads

This is essentially the entire point of this feature — to allow your GKE workloads access to resources on GCP without having to pass explicit credentials, or even utilize the underlying Node’s IAM Service Account.

The first thing to do is to enable WIF on your GKE cluster and node pools if you haven’t done so already, but this is well documented and out of scope for this post.
Once this is done, you’ll need to create a Kubernetes Service Account (KSA) for your workload to use (by configuring it in the Pod spec).

Then, with Workload Identity (WI) you can grant GCP IAM permissions to your workloads in one of two ways:

Using Google IAM Service Accounts impersonation

Until recently this was the standard and only way to configure WI. It is now documented as the alternative way to do so.

In short, a Google Service Account (GSA) needs to be created and granted any IAM permissions necessary for your workload. Then a binding needs to be created between the GSA and the KSA by making the KSA a roles/iam.workloadIdentityUser of the GSA, as well as annotating the KSA accordingly.
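To make those moving parts concrete, here is a rough sketch of the impersonation setup as a shell function. The function names and all arguments (project, namespace, KSA, GSA, role) are placeholders I made up for illustration. By default the helper only prints the commands; set DRY_RUN=0 to actually run them.

```shell
# Sketch only: every name passed in is a placeholder, not a value from a real setup.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Args: PROJECT_ID NAMESPACE KSA_NAME GSA_NAME ROLE
# The body runs in a subshell so the variables stay local to the function.
setup_gsa_impersonation() (
  project_id="$1"; namespace="$2"; ksa="$3"; gsa="$4"; role="$5"
  gsa_email="${gsa}@${project_id}.iam.gserviceaccount.com"

  # 1. Create the GSA and grant it the role the workload needs.
  run gcloud iam service-accounts create "$gsa" --project "$project_id"
  run gcloud projects add-iam-policy-binding "$project_id" \
    --member "serviceAccount:${gsa_email}" --role "$role" --condition None

  # 2. Allow the KSA to impersonate the GSA.
  run gcloud iam service-accounts add-iam-policy-binding "$gsa_email" \
    --project "$project_id" \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:${project_id}.svc.id.goog[${namespace}/${ksa}]"

  # 3. Annotate the KSA so the metadata server knows which GSA to hand out.
  run kubectl annotate serviceaccount "$ksa" --namespace "$namespace" \
    "iam.gke.io/gcp-service-account=${gsa_email}"
)

# Example (prints four commands, runs nothing):
#   setup_gsa_impersonation my-project default my-ksa my-gsa roles/storage.objectViewer
```

Four separate commands across two tools is a fair picture of why steps 2 and 3 are the ones that tend to get forgotten.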

Using IAM principal identifiers

This is a major feature and a recent change, as mentioned in the introduction. Kubernetes resources can be referenced directly in IAM policies as principals.

You can refer to your KSA by namespace and name (for identity sameness) or by UID (available in the KSA’s metadata.uid field after creation).
You can even target all pods in a cluster if there’s a common IAM permission required by all of your workloads.
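As an illustration, the small helpers below assemble the three kinds of identifiers mentioned here. The function names are my own, and the identifier formats reflect my reading of the GKE documentation at the time of writing, so double-check them against the current docs before relying on them. Note that the identifiers need the project number as well as the project ID.

```shell
# Sketch only: helper names are hypothetical; formats should be verified against the GKE docs.

# Common prefix. Args: PROJECT_NUMBER PROJECT_ID
wi_pool_path() {
  printf 'iam.googleapis.com/projects/%s/locations/global/workloadIdentityPools/%s.svc.id.goog' "$1" "$2"
}

# A KSA referenced by namespace and name (identity sameness applies).
# Args: PROJECT_NUMBER PROJECT_ID NAMESPACE KSA_NAME
ksa_principal_by_name() {
  printf 'principal://%s/subject/ns/%s/sa/%s\n' "$(wi_pool_path "$1" "$2")" "$3" "$4"
}

# A KSA referenced by its Kubernetes UID.
# Args: PROJECT_NUMBER PROJECT_ID KSA_UID
ksa_principal_by_uid() {
  printf 'principal://%s/kubernetes.serviceaccount.uid/%s\n' "$(wi_pool_path "$1" "$2")" "$3"
}

# All pods in one cluster.
# Args: PROJECT_NUMBER PROJECT_ID LOCATION CLUSTER_NAME
cluster_principal_set() {
  printf 'principalSet://%s/kubernetes.cluster/https://container.googleapis.com/v1/projects/%s/locations/%s/clusters/%s\n' \
    "$(wi_pool_path "$1" "$2")" "$2" "$3" "$4"
}

# Example usage (placeholders throughout):
#   uid="$(kubectl get serviceaccount my-ksa -n default -o jsonpath='{.metadata.uid}')"
#   gcloud projects add-iam-policy-binding my-project \
#     --role roles/storage.objectViewer \
#     --member "$(ksa_principal_by_uid 123456789012 my-project "$uid")" \
#     --condition None
```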

Configuring Workload Identity using GSA Impersonation vs IAM principal identifiers

 

Source: https://twitter.com/_techcet_/status/1773865010651173293

As the table above shows, the configuration was simplified quite a bit by introducing the ability to target Kubernetes resources as IAM principals.

Rather than having to manage an additional GSA and remembering to give your KSA permission to impersonate it (a step often forgotten), one can grant the necessary IAM roles to the KSA directly. There’s also no more need to annotate your KSA with the GSA name, which was another step that was easily misconfigured or forgotten.

The ability to refer to a KSA by UID rather than by name is a great addition and reduces the risk of accidentally granting access to undesired workloads via identity sameness. On the other hand, using identity sameness can be a feature if used correctly, and it is currently feasible in both versions of the configuration.

Moreover, the ability to refer to all pods in a cluster as an IAM principal set is a useful feature to have and allows for more flexibility. I’m sure that in the coming months, we’ll see support added for more types of Kubernetes resources as IAM principals (namespaces would be a great addition).

A limitation of using IAM principals is that there are still a handful of services that are either unsupported or are in preview. In case you’re currently using WI to access any of these services, you’ll need to hold off on upgrading your configuration.

 


Migrating your existing configuration to use IAM principals

The good news is that you don’t have to: the GSA impersonation way of doing things is still supported and I don’t believe it’ll go away any time soon.
However, using IAM principals simplifies the configuration and has some added benefits, as described in the previous section, so I would recommend adopting it if possible.

What is the most frictionless way to switch your workloads from their current configuration to using IAM principals?

Assuming you are currently using Workload Identity to access GCP resources, you will have a GSA with the necessary IAM permissions and a KSA that has the Workload Identity User IAM role on that GSA.

Finally, your KSA would be annotated to point at the GSA, e.g.:

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: <GSA_NAME>@<PROJECT_ID>.iam.gserviceaccount.com
  ...

Let’s see if we can perform a “zero-downtime” update of the WI configuration.

First, let’s spin up a pod that uses that KSA to run some tests:

$ echo 'apiVersion: v1
kind: Pod
metadata:
  name: token-test
spec:
  serviceAccountName: <KSA_NAME>
  containers:
  - image: google/cloud-sdk:slim
    command: ["sleep","infinity"]
    name: workload-identity-test' | kubectl apply -f -

Once the pod is ready and running, we can exec into it:

$ kubectl exec -ti token-test -- bash

From inside the pod, we can make a request to the metadata server and see what identity our pod is using:

root@token-test:/# curl -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
{"aliases":["default"],"email":"<GSA_NAME>@<PROJECT_ID>.iam.gserviceaccount.com","scopes":["https://www.googleapis.com/auth/cloud-platform","https://www.googleapis.com/auth/userinfo.email"]} 

From the email field in the response, we can see that our identity is the GSA that our KSA is configured to use (via the annotation on the KSA).

What would happen if we remove that annotation from the KSA?

# this is run outside of the pod
$ kubectl annotate serviceaccount <KSA_NAME> iam.gke.io/gcp-service-account-

Let’s try that request again:

root@token-test:/# curl -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
{"aliases":["default"],"email":"<PROJECT_ID>.svc.id.goog","scopes":["https://www.googleapis.com/auth/cloud-platform","https://www.googleapis.com/auth/userinfo.email"]}

As you can see, our identity immediately changes to the project’s Workload Identity Pool. This suggests that switching over from GSA impersonation to IAM principal binding works without even restarting our pods (note that certain clients may need a restart before they start using the new identity).
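If you want to script this check rather than eyeball the JSON, something along these lines could work from inside the pod. The function names and the mode labels are made up for this post, and the sed expression assumes the compact JSON shape shown in the responses above.

```shell
# Sketch only: helper names and mode labels are hypothetical.

# Read the metadata server's JSON response on stdin and print the "email" field.
identity_email() {
  sed -n 's/.*"email":"\([^"]*\)".*/\1/p'
}

# Classify an identity by its suffix.
identity_mode() {
  case "$1" in
    *.iam.gserviceaccount.com) echo "gsa-impersonation" ;;
    *.svc.id.goog)             echo "direct-principal" ;;
    *)                         echo "unknown" ;;
  esac
}

# Example usage from inside the pod:
#   email="$(curl -s -H 'Metadata-Flavor: Google' \
#     'http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true' \
#     | identity_email)"
#   identity_mode "$email"
```

Running it before and after removing the annotation should show the mode flip from one label to the other.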

Assuming your KSA is still annotated with the GSA and the workload is functional, the migration looks like this:

  1. Create the IAM policy to grant the necessary IAM roles directly to the KSA (the same roles granted to our GSA)
  2. Remove the WI annotation from the KSA. At this point, the pod should immediately start using the permissions granted to the KSA principal (if you see errors, a pod restart should resolve them)
  3. Once everything is confirmed running, you can clean up the now redundant GSA and make sure that your IaC and manifests are all up to date
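The steps above can be sketched as a small script. This is an outline under assumptions rather than a tested migration tool: the function names and arguments are placeholders, the principal format follows my reading of the GKE docs, and it binds a single project-level role by KSA name. As before, commands are only printed unless DRY_RUN=0 is set.

```shell
# Sketch only: names are hypothetical and a single project-level role is assumed.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Steps 1 and 2. Args: PROJECT_NUMBER PROJECT_ID NAMESPACE KSA_NAME ROLE
migrate_ksa_to_principal() (
  project_number="$1"; project_id="$2"; namespace="$3"; ksa="$4"; role="$5"
  member="principal://iam.googleapis.com/projects/${project_number}/locations/global/workloadIdentityPools/${project_id}.svc.id.goog/subject/ns/${namespace}/sa/${ksa}"

  # Step 1: grant the role directly to the KSA principal.
  run gcloud projects add-iam-policy-binding "$project_id" \
    --role "$role" --member "$member" --condition None

  # Step 2: remove the annotation (the trailing "-" deletes it).
  run kubectl annotate serviceaccount "$ksa" --namespace "$namespace" \
    iam.gke.io/gcp-service-account-
)

# Step 3, only after the workload is confirmed healthy. Args: GSA_EMAIL PROJECT_ID
cleanup_gsa() (
  run gcloud iam service-accounts delete "$1" --project "$2"
)

# Example (dry run):
#   migrate_ksa_to_principal 123456789012 my-project default my-ksa roles/storage.objectViewer
```

Keeping the cleanup in its own function is deliberate: the first two steps are easy to roll back by re-adding the annotation, while deleting the GSA is not.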

Conclusion

The renaming of Workload Identity to Workload Identity Federation for GKE was slightly confusing at the beginning, as there was always a distinction between the two features. However, the added support for IAM principals brings WIF for GKE closer to the rest of WIF and illuminates the path forward for this feature.

In case you aren’t using WIF already, enabling and using it in your GKE clusters has gotten much simpler. In case you already are — updating your existing configuration to the new way is fairly effortless and will allow you to clean up some extra resources and save on lines of configuration in your IaC.

If you’re struggling with configuring WI, take a look at our workload-identity-analyzer project on GitHub (pending support for IAM principals).
