Canary deployments: Go easy with Ketch 0.2!

Kumar Saurabh
Feb 17, 2021

Ketch 0.2 was recently released with a number of exciting features. As a maintainer of the project, I am delighted to share that I have added a new feature called Automated Canary Deployments. 🚀

In this blog post, we will discuss how to use it. 🤠

If you don't know already, Ketch is an application delivery framework that fixes the last mile of deployment, implements an application context, and removes the complexities of cloud-native application deployment with a simple and easy command-line interface.

We will dive in to understand how it works, but let's review some of the basics first. 💡

What is a Canary Deployment? 🤔

The term canary refers to the practice of bringing canaries into coal mines to determine if the mine is safe for humans. Miners used to carry caged canaries (birds) down into the mine tunnels with them. If dangerous gases such as carbon monoxide collected in the mine, the gases would kill the canary before killing the miners, thus providing a warning to exit the tunnels immediately.

In the same way, Canary deployment is used as a deployment strategy in which a new version of an application gets deployed next to a stable production version, then evaluated periodically against the current stable deployment (baseline) to ensure that the new deployment is operating at least as well as the old one.

In short, with this technique, we reduce the risk of introducing a new software version in production by slowly rolling out the change to a small percentage of users, then gradually increasing the traffic based on the evaluation of metrics and performance, before making it available to everybody.

Generally, a monitoring agent is used to collect metrics from the deployments, and these metrics are evaluated periodically to see how the new version is performing.

What makes Ketch special for Canary deployments?

In general, Canarying involves manual work, such as:

  • Introducing a method to deploy the canary change to some percentage of users so the new version can be tested against the baseline. This becomes harder and trickier when third-party routers, service meshes, or ingress controllers, such as Traefik or Istio, are integrated into the Kubernetes cluster.
  • Evaluating the new version regularly and checking that it is not in a broken or failed state.
  • Increasing the user traffic to the new version manually over time.
  • Rolling back if the new version fails at some point.
  • Of course, writing the boring K8s YAML. 😩

Felt that pain? Ah, no worries. Ketch is there to handle all of the above tasks for you automatically. 🎉

Let's automate Canary rollouts with Ketch! ⛵️

I am assuming that you have already installed the Ketch controller into your cluster. If not, you can get it done in a few minutes by following the steps described here.

In this demonstration, I am using Traefik 2.4.2 as the default ingress controller and running Ketch locally with Minikube.

Ketch implements the concept of pools, which Platform Engineers and DevOps can use to isolate workloads from different teams, resources assigned to applications, and more.

Make sure you have installed Traefik before running this command.

Let's add a pool named dev with the Traefik ingress controller type. You will get an output like the following:

Using the command below, you can create an application named a1 on the pool dev; this application will be used to deploy your code. In this example, we have deployed an image from a sample Docker registry into that application. We can see the information about the application with the ketch app info command.

Notice the WEIGHT here, which is 100%. It means that all the traffic is going to deployment version 1.

Since I am running it locally with Minikube, I need to add a route to the Traefik ingress to access the application. (You might not need to run the following command on a real cloud.)

$ sudo route -n add -host -net $(kubectl get svc traefik -o jsonpath='{.spec.clusterIP}') $(minikube ip -p ketch)

Hurray! The application is running as expected! 🎉

Now we have a fully working stable deployment version running.

Let's try rolling out a canary deployment version against the baseline.

We see two new flags introduced with Ketch 0.2 that are specific to Canary rollout.

--steps: The number of steps used to roll the canary deployment's traffic out to 100%. The step count also determines how much traffic is added in each step. For example, if the number of steps is set to 10, then 100% / 10 = 10% of traffic is added to the canary on each step, and the same percentage of user traffic is removed from the primary deployment.

--step-interval: This flag defines the time interval between each step.

Users can increase the traffic to the canary release in steps. For example, suppose I want to go from 10% to 100% traffic in 10 steps, increasing traffic by 10% every hour. Ketch can automate this process.

To achieve the above goal, we just need to run a command like the following:

$ ketch app deploy a1 -i docker.io/dockercloud/hello-world --steps 10 --step-interval 1h

The canary is successfully deployed! 🤠
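The arithmetic behind --steps can be sketched in a few lines of Go. This is only an illustration of the 100% / steps rule described above, not Ketch's actual code: the function name `canaryWeights`, the accepted range of 1 to 100, and the choice to pin the last step to 100% when 100 is not evenly divisible are all my assumptions.

```go
package main

import "fmt"

// canaryWeights is a hypothetical helper that returns the canary's traffic
// weight (in percent) after each step, adding 100/steps percent per step.
func canaryWeights(steps int) ([]int, error) {
	if steps < 1 || steps > 100 {
		return nil, fmt.Errorf("steps must be between 1 and 100, got %d", steps)
	}
	increment := 100 / steps // integer division; an assumption for uneven splits
	weights := make([]int, 0, steps)
	for i := 1; i <= steps; i++ {
		w := i * increment
		if i == steps {
			w = 100 // assumption: the final step always gives the canary all traffic
		}
		weights = append(weights, w)
	}
	return weights, nil
}

func main() {
	weights, err := canaryWeights(10)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, w := range weights {
		// The primary deployment receives whatever the canary does not.
		fmt.Printf("step %2d: canary %3d%%, primary %3d%%\n", i+1, w, 100-w)
	}
}
```

For 10 steps, the canary goes 10%, 20%, and so on up to 100%, while the primary weight shrinks by the same amount at every step.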

The following output shows the weight distribution of user traffic to different versions of the deployments. In this case, the Canary is receiving 30% of traffic and primary deployment is now receiving 70% of it.

Now Ketch will do all the magic. 🪄

Every hour, it will increase traffic to the canary release by 10% and decrease primary traffic by the same percentage.

Note: Once the canary traffic reaches 100%, the primary deployment will be automatically removed.

Automatic Rollback

Ketch will keep an eye on the deployment and keep checking the health of its pods. If the canary pods stay in an unhealthy state beyond a certain timeout period during the rollout, Ketch will roll back all the traffic to the primary deployment.

Other advantages:

  • Ketch now supports Traefik 2.x and works fully with secure TLS support.
  • Ketch now supports Istio 1.8.x and above and works fully with secure TLS support.

Conclusion

Ketch is a great framework and it makes it extremely easy to deploy and manage applications on Kubernetes using a simple command-line interface. No Kubernetes object YAML is required!

In this way, Ketch will do its job and I can enjoy my favorite movie on Netflix without worrying about the rollouts. 🍿🤓

Resources:

Learn more about Ketch at https://www.theketch.io/ ✨

Are you an Open-Source enthusiast?

Feel free to contribute to Ketch at https://github.com/shipa-corp/ketch 💻

I hope you find this blog useful and informative. Thanks!

About the Author:

I am Saurabh, a passionate and creative developer from 🇮🇳 with a strong interest in Open-Source. 🎯 I have contributed to some of the big projects by giant firms like @Google, @Github, @DigitalOcean, @OpenFaas, @hellofresh, and many more.

I mostly work on backend development with Golang (Go) and cloud-native technologies. 🚀 I build robust, secure, and scalable infrastructures using cloud-native technologies such as Kubernetes, Docker, Helm, Terraform, DigitalOcean, AWS, GCP, gRPC, CI/CD, etc., and I am proficient with metric collection and monitoring tools such as Grafana, Telegraf, InfluxDB, etc.

I am fully committed to designing and developing innovative materials. I am highly self-motivated, enthusiastic, and always willing to learn and share more. 🕺

📫 How to reach me?

⦿ Visit my Website 🌐
⦿ Connect with me on LinkedIn 👨🏻‍💻
⦿ Follow me on Twitter 🐦
⦿ Shoot Me an Email 💌
