Most of the time, we scale our Kubernetes deployments based on metrics such as CPU or memory consumption, but sometimes we need to scale based on external metrics. In this post, I’ll guide you through setting up Horizontal Pod Autoscaler (HPA) autoscaling using any Stackdriver metric; specifically, we’ll use the Requests Per Second (RPS) metric from a Google Cloud HTTP/S Load Balancer.

Let’s Go!
First let’s create a new Google Kubernetes Engine (GKE) cluster:
gcloud beta container clusters create "hpa-with-stackdriver-metrics" \
  --zone "us-central1-a" \
  --username "admin" \
  --cluster-version "1.10.7-gke.6" \
  --machine-type "n1-standard-1" \
  --image-type "COS" \
  --disk-type "pd-standard" \
  --disk-size "100" \
  --scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" \
  --num-nodes "3" \
  --enable-cloud-logging \
  --enable-cloud-monitoring \
  --addons HorizontalPodAutoscaling,HttpLoadBalancing \
  --enable-autoupgrade \
  --enable-autorepair
Note the `--enable-cloud-monitoring` flag, which will allow us to read Stackdriver Monitoring metrics.
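The cluster creation command usually points your kubectl context at the new cluster automatically, but if it doesn’t, you can fetch the credentials explicitly:
gcloud container clusters get-credentials "hpa-with-stackdriver-metrics" --zone "us-central1-a"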
Deploy Custom Metrics Stackdriver Adapter
The custom metrics adapter is responsible for importing Stackdriver metrics into the Kubernetes API; this enables the HPA to consume these metrics and act upon them. You can see more details about that in the troubleshooting section below.
To grant GKE objects access to metrics stored in Stackdriver, you need to deploy the Custom Metrics Stackdriver Adapter in your cluster.
In order to run the Custom Metrics Adapter, you must grant your user the ability to create the required authorization roles by running the following command:
kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole cluster-admin \
  --user "$(gcloud config get-value account)"
And now let’s deploy the actual adapter that will enable us to read metrics from Stackdriver:
kubectl create -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter.yaml
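Before moving on, a quick optional sanity check: make sure the adapter pod is running and that the external metrics API group is registered with the API server (the namespace and API group names below are what the adapter manifest creates at the time of writing):
kubectl get pods -n custom-metrics
kubectl get apiservices | grep external.metrics.k8s.io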
Create a Deployment
Now, let’s deploy a simple nginx application that we will later scale based on the RPS measured by the HTTP/S Load Balancer.
Create this file: deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.8
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
  - port: 80
    protocol: TCP
  selector:
    app: nginx
Now let’s deploy it:
kubectl apply -f deployment.yaml
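A quick optional check that the Deployment and Service came up as expected:
kubectl get deployment nginx
kubectl get service nginx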
Create LoadBalancer Ingress
Create the ingress file: ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: basic-ingress
spec:
  backend:
    serviceName: nginx
    servicePort: 80
And apply the ingress:
kubectl apply -f ingress.yaml
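Provisioning the Google Cloud load balancer can take a few minutes; you can wait for the Ingress to get a public IP address with:
kubectl get ingress basic-ingress --watch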
Create HorizontalPodAutoscaler object
This is where the magic happens:
we use an external metric, with metricName:
loadbalancing.googleapis.com|https|request_count
Note: you can find the list of all Stackdriver metrics here or you can use the Metrics Explorer.
We should also use a metricSelector, to make sure we only match the metrics from our specific load balancer.
Let’s find our load balancer’s forwarding rule:
$ kubectl describe ingress basic-ingress
Name:             basic-ingress
Namespace:        default
Address:          35.190.3.165
Default backend:  nginx:80 (10.48.2.11:80)
Rules:
  Host  Path  Backends
  ----  ----  --------
  *     *     nginx:80 (10.48.2.11:80)
Annotations:
  backends:         {"k8s-be-32432--ffd629d77b6630de":"HEALTHY"}
  forwarding-rule:  k8s-fw-default-basic-ingress--ffd629d77b6630de
  target-proxy:     k8s-tp-default-basic-ingress--ffd629d77b6630de
  url-map:          k8s-um-default-basic-ingress--ffd629d77b6630de
Now we can add the label match to our config (notice the label: “forwarding_rule_name”):
metricSelector:
  matchLabels:
    resource.labels.forwarding_rule_name: k8s-fw-default-basic-ingress--ffd629d77b6630de
The final file will look like this: hpa.yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: External
    external:
      metricName: loadbalancing.googleapis.com|https|request_count
      metricSelector:
        matchLabels:
          resource.labels.forwarding_rule_name: k8s-fw-default-basic-ingress--ffd629d77b6630de
      targetAverageValue: "1"
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
Notice that we have used targetAverageValue; this specifies how much of the total metric value each replica can handle. This is useful when using metrics that describe some work or resource that can be divided between replicas; in our case each replica can handle a single (i.e. 1) RPS, so at roughly 4 RPS the HPA will aim for about 4 replicas. You should, of course, change this according to your needs.
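Now let’s apply it:
kubectl apply -f hpa.yaml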
Let’s test everything
Let’s start by driving traffic to our load balancer.
As you can see in the output of the command above:
kubectl describe ingress basic-ingress
our Ingress public IP address is 35.190.3.165
Now let’s start hitting that endpoint 🥊 with some requests:
while true ; do curl -Ss -k --write-out '%{http_code}\n' --output /dev/null http://35.190.3.165/ ; done
Now let’s see how our HorizontalPodAutoscaler reacts:
kubectl describe hpa nginx-hpa
At this point you might see some warnings since the metric is not populated yet, but after a few minutes the Metrics section is populated:
Name:         nginx-hpa
Namespace:    default
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx-hpa","namespace":"default"},"spec":{"ma...
CreationTimestamp:  Wed, 31 Oct 2018 18:18:28 +0200
Reference:          Deployment/nginx
Metrics:            ( current / target )
  "loadbalancing.googleapis.com|https|request_count" (target average value):  1034m / 1
Min replicas:  1
Max replicas:  5
And in the “Events” section we can see:
Events:
  Type    Reason             Age  From                       Message
  ...
  Normal  SuccessfulRescale  2m   horizontal-pod-autoscaler  New size: 2; reason: external metric loadbalancing.googleapis.com|https|request_count(&LabelSelector{MatchLabels:map[string]string{resource.labels.forwarding_rule_name: k8s-fw-default-basic-ingress--ffd629d77b6630de,},MatchExpressions:[],}) above target
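You can also watch the new replicas come up with:
kubectl get pods -l app=nginx --watch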
We have a liftoff! 🚀
Troubleshooting:
1. An easy way to see if the metric is being imported into the Kubernetes external metrics API is to browse the API manually. It will also help you check whether you have used the metricSelector correctly.
First, we run the Kubernetes proxy:
kubectl proxy --port=8080
And then we can access the API from localhost:
http://localhost:8080/apis/external.metrics.k8s.io/v1beta1/namespaces/default/loadbalancing.googleapis.com%7Chttps%7Crequest_count
And this is an excerpt of the result:
{ "kind": "ExternalMetricValueList", "apiVersion": "external.metrics.k8s.io/v1beta1", "metadata": { "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/loadbalancing.googleapis.com%7Chttps%7Crequest_count" }, "items": [ { "metricName": "loadbalancing.googleapis.com|https|request_count", "metricLabels": { "metric.labels.cache_result": "DISABLED", "resource.labels.backend_target_type": "BACKEND_SERVICE", "resource.labels.backend_name": "k8s-ig--ffd629d77b6630de", ... "resource.labels.forwarding_rule_name": "k8s-fw-default-basic-ingress--ffd629d77b6630de", ... }, "timestamp": "2018-11-01T08:41:30Z", "value": "2433m" } ] }
Voilà!
2. Another way to see how the adapter behaves is to watch its logs. First, let’s list the custom-metrics pods:
$ kubectl get pods -n custom-metrics
NAME                                                 READY
custom-metrics-stackdriver-adapter-c4d98dc54-2n4jz   1/1
Finally, let’s watch the logs:
$ kubectl logs custom-metrics-stackdriver-adapter-c4d98dc54-2n4jz -n custom-metrics
...
I1104 06:42:11.125627       1 trace.go:76] Trace[1192308782]: "List /apis/external.metrics.k8s.io/v1beta1/namespaces/default/loadbalancing.googleapis.com|https|request_count" (started: 2018-11-04 06:42:08.155209905 +0000 UTC m=+311951.293335726) (total time: 2.970372027s):
Trace[1192308782]: [2.97027864s] [2.970185564s] Listing from storage done
...
You can get a lot of valuable information from those logs, especially if there are any error messages.
Conclusion
Using external metrics such as those collected and stored by Stackdriver is pretty straightforward and fairly easy to set up. In a similar way, you can also use your own custom metrics published to Stackdriver via the Monitoring API.
I have created a GitHub repo with the resources I have used for this post here.
Want more stories? Check our blog, or follow Eran on Twitter.