Application Scalability, Part 3: Knative and KEDA

Feb 7, 2021

In the first part and the second part of the series, we learned about types of scaling and what we get out of the box if we use Kubernetes to deploy our apps. When discussing Kubernetes horizontal autoscaling capabilities, I mentioned that HPA (Horizontal Pod Autoscaler) has two drawbacks: it supports only a limited number of metrics that users can use to perform autoscaling, and it cannot scale a deployment to 0 pods.

In this part, I want to take a closer look at two interesting projects that may help you tackle those issues: Knative and KEDA.

Knative

Knative is a platform to deploy and manage serverless workloads on Kubernetes. As such, Knative is not a universal solution for the autoscaling problem. However, it is worth mentioning here because Knative ships with the interesting Knative Pod Autoscaler (KPA).

KPA provides request-based autoscaling capabilities. Using it, we can scale our deployment by defining how many concurrent requests each instance of our app should handle at a given time. Additionally, KPA has built-in functionality to scale to 0. Once there are no requests to our app, Knative will scale our deployment to zero pods (this functionality can be turned on and off). However, KPA is not the only autoscaler that we can use to autoscale Knative workloads. We can use the standard Kubernetes HPA or implement our own pod autoscaler with a specialized autoscaling algorithm.
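To make this more concrete, here is a minimal sketch of a Knative Service that asks KPA to target 10 concurrent requests per pod. The service name and the container image are placeholders I made up for illustration; the annotations are the ones Knative Serving uses to pick the autoscaler class, the metric, and the target.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-app                # hypothetical service name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Use the Knative Pod Autoscaler (use hpa.autoscaling.knative.dev for HPA instead)
        autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"
        # Scale on the number of in-flight requests per pod
        autoscaling.knative.dev/metric: "concurrency"
        # Aim for roughly 10 concurrent requests per pod
        autoscaling.knative.dev/target: "10"
    spec:
      containers:
        - image: example.com/my-app:latest   # placeholder image
```

Scale-to-zero itself is switched on and off cluster-wide with the enable-scale-to-zero key in the config-autoscaler ConfigMap of the knative-serving namespace, so it does not appear in the Service above.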

So, if you need to deploy serverless workloads and want to take advantage of custom autoscaling possibilities, it’s worth keeping Knative in mind. However, there are two things to consider before using Knative:

The workloads served by Knative should not last longer than 30 minutes (at least that’s not recommended) because this may impact the autoscaling behavior.
Setting up Knative on your Kubernetes cluster requires some knowledge of networking (for example, Gloo or Istio); thus, it may not be simple for everyone.

On the other hand, there are cloud offerings of managed Knative like Google Cloud Run, but those do not allow custom autoscalers.

KEDA

Another solution, one that is purely focused on horizontal autoscaling, is KEDA. KEDA is an open-source project initially developed by Red Hat and Microsoft. Currently, it is a sandbox project of the CNCF.

KEDA is a Kubernetes-based event-driven autoscaler. KEDA determines how any container in Kubernetes should be scaled based on the number of events that need to be processed. KEDA is a single-purpose and lightweight component that can be added to any Kubernetes cluster. It works alongside standard Kubernetes components like the Horizontal Pod Autoscaler and can extend functionality without overwriting or duplication. (source)

The name KEDA stands for Kubernetes Event-driven Autoscaling. It extends the HPA mechanism with scale-to-zero and plenty of “metric adapters” called scalers. The zero-to-one scenario is handled by the KEDA operator. Once there is a single replica of a pod, further scaling is performed by the Horizontal Pod Autoscaler. The same goes for scaling down: the HPA handles everything down to one replica, and the KEDA operator takes care of the last step from one to zero.

KEDA leverages the HPA under the hood. However, to define the autoscaling behavior of a deployment, we have to define a ScaledObject (a KEDA custom resource). It resembles the HPA definition in many ways, but the specification part of this resource is where KEDA offers a lot of customization. For example:

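# KEDA 1.x API group; KEDA 2.x uses keda.sh/v1alpha1
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: redis-scaledobject
  namespace: default
  labels:
    deploymentName: my-app
spec:
  maxReplicaCount: 4  # upper bound for the number of replicas
  scaleTargetRef:
    deploymentName: my-app  # Deployment to scale (KEDA 2.x uses `name` here)
  triggers:
    - type: redis  # scaler used as the metric source
      metadata:
        address: REDIS_ADDRESS
        listName: default  # Redis list whose length is measured
        listLength: "10"  # target average list length per replica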

In the spec part of the ScaledObject, you have to define what type of scaler (the triggers field) you want to use as a metric provider. In this example, I am using the redis scaler, which uses the length of the Redis list named default (our message queue) as a metric for the HPA. In general, there are many similarities between a ScaledObject and a generic HorizontalPodAutoscaler.
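For comparison, here is a rough sketch of how a similar definition could look with the KEDA 2.x API, this time with the scale-to-zero knobs written out explicitly. The numeric values shown are the documented defaults rather than tuned recommendations, and I am assuming that the target container exposes the Redis host:port in an environment variable called REDIS_ADDRESS.

```yaml
apiVersion: keda.sh/v1alpha1          # KEDA 2.x API group
kind: ScaledObject
metadata:
  name: redis-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: my-app                      # Deployment to scale
  pollingInterval: 30                 # seconds between checks of the trigger
  cooldownPeriod: 300                 # seconds to wait after the last activity before scaling to 0
  minReplicaCount: 0                  # allow the KEDA operator to scale to zero
  maxReplicaCount: 4
  triggers:
    - type: redis
      metadata:
        addressFromEnv: REDIS_ADDRESS # assumed env var holding host:port
        listName: default
        listLength: "10"              # target average list length per replica
```

With listLength set to "10", the HPA aims for roughly one replica per 10 queued items, so around 35 items in the list would push the deployment towards 4 replicas, which is also the configured maximum.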

As I mentioned earlier, KEDA supports many scalers, including a few interesting ones:

Redis / RabbitMQ queue length
PubSub / SQS number of messages
Postgres / MySQL value of an arbitrary query

You can see that the metrics used by KEDA are more app-centric than cluster-centric. In particular, the Postgres and MySQL scalers allow you to run a query on a selected database and drive the HPA with the returned value. Those are truly powerful metric sources that helped us improve Airflow autoscaling.

Airflow + KEDA

At the end of last year, our Airflow community was researching what could be done to improve Airflow autoscaling capabilities. At first, we thought about using Knative; however, the limitations mentioned earlier in this article were blockers for us because:

Some Airflow tasks may take much longer than 30 minutes.
Not all Airflow users are Kubernetes experts.

In the meantime, I stumbled upon the KEDA project (thanks to a Cloud Native Warsaw talk!), which sounded like a possible solution to our problem. To check how easy it is to combine KEDA with Airflow, I decided to add it to a Google Cloud Composer (hosted Airflow) instance, or more precisely to the underlying Kubernetes cluster. The idea was simple: if I could add KEDA to a managed solution, there should be no problem deploying it anywhere else. Amazingly, it worked with just a few kubectl apply commands!

In the beginning, the metric I used for autoscaling was the length of the Redis queue. This value represents the number of tasks queued to be executed, and it makes a lot of sense to scale up when there are many tasks waiting and to scale down when there is nothing to do. Once we successfully deployed KEDA to scale Airflow, we shared this information with our friends from Astronomer.

That’s when the idea to use SQL queries was born, and the MySQL and Postgres scalers were contributed by me and Daniel Imberman from Astronomer. You may ask: why would someone want to use SQL queries to scale an application deployment? In many cases this may sound weird, but in Airflow the underlying database, or more precisely the metadata database, contains information about the Airflow state. This includes the number of PENDING, QUEUED, and RUNNING tasks. Being able to query these values allows us to create a metric that incorporates all this knowledge!
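As an illustration, a ScaledObject for Airflow workers could look roughly like the sketch below, written against the KEDA 2.x postgresql scaler. The resource names and the environment variable holding the connection string are assumptions on my side, and the divisor 16 stands for the number of tasks a single worker is expected to run concurrently, so adjust it to your own worker concurrency.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: airflow-worker                # hypothetical name
spec:
  scaleTargetRef:
    name: airflow-worker              # hypothetical worker Deployment
  minReplicaCount: 0                  # no workers when nothing is queued or running
  maxReplicaCount: 10
  triggers:
    - type: postgresql
      metadata:
        # Assumed env var on the worker container with the metadata DB connection string
        connectionFromEnv: AIRFLOW_CONN_AIRFLOW_DB
        # The query already returns the wanted number of workers, so the target is 1
        targetQueryValue: "1"
        query: >-
          SELECT ceil(COUNT(*)::decimal / 16)
          FROM task_instance
          WHERE state='running' OR state='queued'
```

The query counts running and queued task instances and divides the result by the per-worker capacity, so the returned value is effectively the number of workers needed at that moment.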

If you are interested in how exactly KEDA works with Airflow, I highly recommend Daniel’s blog here. Additionally, we think that KEDA brings so much value to Airflow autoscaling capabilities that it will be part of the official Airflow Helm chart.

This use case is also a good answer to a question you may ask: when is KEDA a good choice? You should use KEDA when you want to perform scaling based on application-related information, which is exactly what we needed in Airflow. Also, KEDA is great when you want to use a custom scaler. You can write one in an hour and deploy it using a custom KEDA image, which is really easy!

This blog post was originally published at https://www.polidea.com/blog/

Written by Tomasz Urbaszek