A Kubernetes migration story
According to Øvergaard, Finn is the second-largest site in Norway in terms of traffic; only the principal national newspaper's site carries more. Finn currently has 120 developers maintaining 350 microservices, with about a thousand deployments into production every week; the median time from a developer pushing code to the repository to that code running in production is six minutes.
In case you're worried that this is already a paean to containerization, they had all that functionality before they moved away from their old monolithic application. But that monolithic architecture had been in place during a period of growth that went from one weekly deployment to a thousand, and from three to 350 services. There had been attempts to fix things using virtualization via OpenStack, but their VM count had gone from ten to hundreds; as Strand said, "instead of having 200 applications running on one server, we had two to three hundred virtual machines running on it, which was not easy to operate".
The beginning of the move to Kubernetes was the introduction of Docker into their development processes. It worked brilliantly as a development tool, which led people to introduce it into production, where it worked very badly indeed. It quickly became clear that something would be needed to manage container lifecycle and monitoring; nobody wanted to get paged at 3am because a container had died, when it is in the nature of containers to die and, had it been restarted, all would have been well.
So the infrastructure people at Finn eventually settled on Kubernetes as the tool of choice to do this. They were constrained by the use of a back-end database that was not cloud-friendly and which ran on Solaris, preventing them from hosting out-of-house (the knowing laugh from the audience that accompanied this suggested that they aren't the only people so constrained). Instead, they did Kubernetes the hard way, and set it up themselves in-house. Originally, they provisioned containers on top of their existing VMs, but after a lot of time spent debugging hypervisor performance issues, they moved to running Kubernetes on bare metal.
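That restart behavior is exactly what a Kubernetes Deployment with a liveness probe provides out of the box, and is what makes the 3am page unnecessary. The sketch below, using the Python Kubernetes client, is an illustration only, not Finn's configuration; the image name, port, and probe path are invented.

```python
# Minimal sketch (not Finn's setup): a Deployment with a liveness probe.
# Kubernetes restarts any container whose probe fails and keeps the
# requested number of replicas running, so a dead container no longer
# requires a human to notice it.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster

container = client.V1Container(
    name="example-service",
    image="registry.example.internal/example-service:1.0",  # hypothetical
    ports=[client.V1ContainerPort(container_port=8080)],
    liveness_probe=client.V1Probe(
        http_get=client.V1HTTPGetAction(path="/healthz", port=8080),
        initial_delay_seconds=10,
        period_seconds=5,
    ),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="example-service"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "example-service"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "example-service"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default",
                                                body=deployment)
```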
FIAAS
Having gotten that working, they realized that they needed to build something on top of Kubernetes to simplify the development of in-house applications. This they called "Finn Infrastructure As A Service", or FIAAS; Norwegians apparently pronounce this identically to the word fjas, which means silliness. This has led to a pleasingly large number of high-level internal memos about "needing more fjas" and "migrating to fjas". Strand recommends that, if you're trying this sort of migration at your workplace, you try to get a silly name for your platform; apparently it makes all sorts of meetings go more smoothly and eases adoption.
More seriously, the infrastructure group created a set of contracts that describe how infrastructure and applications should interact. These deliberately restrict developer choice but, in return, promise that, as long as developers adhere to the rules, they can otherwise do what they want and things will work. The contracts are not merely social constructs; some of them have been implemented as shared libraries, and a developer who uses those libraries is guaranteed to be contract-compliant. The infrastructure group accepts that the contracts will only cover 80% of the use cases, and that exceptions will have to be handled; but if a developer can design their application in a contract-compliant way, they can take it through to production with no further reference to the infrastructure group, and that makes everyone happier.
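The talk did not show what those shared libraries look like, but a minimal sketch of the idea, with entirely hypothetical endpoint names and contract details, might be an application factory that wires in the platform's expectations so that a developer cannot forget them:

```python
# Hypothetical sketch of a contract-compliance library (not FIAAS code):
# an application built through create_app() automatically logs to stdout
# and exposes the health endpoint the platform expects.
import logging
import sys

from flask import Flask, jsonify


def create_app(name):
    """Return a Flask application pre-wired with the infrastructure contract."""
    app = Flask(name)

    # Contract: log to stdout, never to local files, so the cluster's
    # logging pipeline can collect the messages.
    logging.basicConfig(stream=sys.stdout, level=logging.INFO)

    # Contract: expose a health-check endpoint for the platform's probes.
    @app.route("/healthz")
    def healthz():
        return jsonify(status="ok")

    return app


# A developer adds their own routes to the returned app and is,
# by construction, contract-compliant.
app = create_app("example-service")
```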
An important transfer of responsibility came around the same time, and the business now takes a "you wrote it, you run it" approach to the developers. It is the developers, not the infrastructure team, who get service outage notifications, and it's the developers who are expected to get up at 3am and fix their applications. This move was a decision taken early on, at a high level, by the business, and it was the driver for much of what came next.
For this, observability is key. Logging is handled by a combination of Fluentd, Elasticsearch, and Kibana; a FIAAS-compliant application is expected to use this pipeline to report to the world. Strand reports that some developers still want to debug their applications by using ssh or docker exec to connect to the instance, then running grep or tail on local log files. After a series of ominous-sounding "constructive conversations", most developers have been persuaded not to do this.
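A plausible sketch of the application side of that pipeline, assuming the common pattern of writing one JSON object per line to stdout for a node-level Fluentd agent to ship to Elasticsearch (the field names are illustrative, not Finn's schema):

```python
# Minimal structured-logging sketch: JSON lines on stdout, nothing on disk,
# so the log outlives the container that produced it.
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


handler = logging.StreamHandler(sys.stdout)   # never a local log file
handler.setFormatter(JsonFormatter())
logging.getLogger().addHandler(handler)
logging.getLogger().setLevel(logging.INFO)

logging.getLogger("orders").info("order placed")
```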
This was a thread that cropped up more than once at KubeCon: containers are essentially ephemeral, like processes. They are born, acquire resources, use them to provide a service, and terminate, releasing those resources for reuse. To provide containerized microservices effectively, one must treat the container as we have for years treated processes. Any developer whose sole method of diagnosing a process was to strace it, or to examine its running memory footprint, would be thought baroque; a well-behaved Unix process is expected to log while it runs and to leave that log behind after its termination. The well-behaved modern container should do likewise, logging outside of itself while it lives so that the log survives its demise.
Metrics are also important to managing the container environment. Finn uses Prometheus as its preferred monitoring platform, and part of the FIAAS contract is that well-behaved containers should include a Prometheus client. This permits business-level questions, such as "which team uses the most CPU resources?", to be answered through Prometheus queries.
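The talk did not show Finn's instrumentation; the standard Python Prometheus client, however, makes that part of the contract easy to satisfy. The metric and label names below are invented for illustration. Once every service labels its metrics consistently, per-team questions become simple aggregation queries against Prometheus.

```python
# Illustrative sketch: expose Prometheus metrics, labeled by team and
# service, from a scrape endpoint that Prometheus polls.
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("http_requests_total", "Requests served",
                   ["team", "service"])
LATENCY = Histogram("http_request_seconds", "Request latency",
                    ["team", "service"])


def handle_request():
    start = time.time()
    # ... do the real work here ...
    REQUESTS.labels(team="travel", service="search").inc()
    LATENCY.labels(team="travel", service="search").observe(time.time() - start)


if __name__ == "__main__":
    start_http_server(8000)   # serves /metrics for Prometheus to scrape
    while True:
        handle_request()
        time.sleep(1)
```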
Finn also deploys a canary application; the use of canaries was something of a conference theme. It tests all the functionality of the application, reporting back through Prometheus and Kibana, and is used as the final gatekeeper in the move from staging to production. If you want to deploy a thousand times a week, which works out to once every ten minutes even at its least frequent, the job of deciding whether a particular version of the application works cannot involve a human being at any stage.
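The talk did not describe the gatekeeper's internals, but a check of that kind has roughly the shape sketched below: query Prometheus's HTTP API for a canary metric and fail the pipeline if it looks unhealthy. The Prometheus address, the query, and the threshold are all assumptions.

```python
# Speculative canary-gatekeeper sketch: exit non-zero (blocking promotion)
# if the canary's error rate over the last five minutes is too high.
import sys

import requests

PROMETHEUS = "http://prometheus.example.internal:9090"   # hypothetical
QUERY = "sum(rate(canary_errors_total[5m]))"              # hypothetical metric

resp = requests.get(f"{PROMETHEUS}/api/v1/query", params={"query": QUERY})
resp.raise_for_status()
results = resp.json()["data"]["result"]
error_rate = float(results[0]["value"][1]) if results else 0.0

if error_rate > 0.01:
    print(f"canary failing (error rate {error_rate:.3f}/s); blocking promotion")
    sys.exit(1)
print("canary healthy; promoting to production")
```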
The hard part
Even with all this infrastructural effort, Finn accepts that migrating the actual applications has been the most difficult part of the work, requiring that each application become "cloud native", that is, designed to work in an environment of ephemeral containers and frequent redeployment. Strand said, "of course, the easiest way of doing that, you need to be 12-factor", but accepted that, while these are necessary conditions for successful containerization, they are not sufficient, and that most applications will have little wrinkles on top of that. Some applications were migrated in an hour of work; others took weeks. Even for the weeks-long migrations, though, Strand said the developers came back afterward and reported that the process had improved their application, making it less coupled to the infrastructure and easier to scale.
Øvergaard went on to describe Finn's traffic-ingress setup, which was crucial to the migrations. The whole site is fronted by an active-passive pair of proxies that live outside the cluster and integrate with "unleash", their feature-release tool. Unleash knows whether a particular site feature is provided by the legacy infrastructure, by Kubernetes, or by both, and can have the proxies direct a specified percentage of a given traffic stream one way or the other. Migrating a feature therefore involves directing a small percentage of its traffic into Kubernetes, then watching the system-response metrics and comparing the old and new systems. The fraction going to Kubernetes is then increased until either a bottleneck shows up as a jump in latency, at which point they back off, fix the bottleneck, and continue, or 100% of the traffic is going to Kubernetes, at which point the old system can be retired. He noted that the migration to FIAAS is about 25% done.
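Finn's proxy integration was not shown in detail (see the comments below for pointers to the code), but the rollout loop Øvergaard described has roughly the following shape. Here set_weight() and get_p99_latency() are hypothetical hooks standing in for the HAProxy and Prometheus plumbing.

```python
# Simplified gradual-rollout sketch: shift traffic to the Kubernetes
# backend in small increments, watch a latency metric, and back off if it
# jumps. Not Finn's code; the step size, wait, and threshold are arbitrary.
import time


def migrate_feature(set_weight, get_p99_latency, baseline_latency, step=5):
    """Move a feature's traffic to Kubernetes in `step`-percent increments."""
    weight = 0
    while weight < 100:
        weight = min(weight + step, 100)
        set_weight(weight)            # e.g. push new weights to the proxies
        time.sleep(300)               # let the metrics settle
        if get_p99_latency() > 1.5 * baseline_latency:
            # Latency jumped: a bottleneck was found. Back off and let a
            # human fix it before the rollout continues.
            weight = max(weight - step, 0)
            set_weight(weight)
            return ("paused", weight)
    return ("done", 100)              # old system can now be retired
```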
You shouldn't hesitate to build something on top of Kubernetes if it solves a problem for you, said Strand, and you shouldn't hesitate to constrain developers where it makes business sense. "If every developer who wanted to create a new service sat down and thought about which ingress controller they would choose, we wouldn't be able to do anything." Extensive metrics are the lifeblood of the process: "If you want to migrate in a safe way, you need to know what happens in production."
In response to questions, the speakers repeated their contention that application migration was the difficult part; each one threw up new problems. One application was designed to pre-warm its caches after activation but before going live, with the pre-warming done by a Puppet call to the application after it had started. They found no way to do that in Kubernetes; the application had to be rewritten, which took weeks. Another application set no timeouts on its connections to the back-end database; when a NAT state hiccup blocked the existing connections, they all hung and, even though new connections could be made successfully, the application died.
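The lesson from that last story generalizes: bound every blocking call. As an illustration only (the talk did not say which database or driver was involved), here is how timeouts and keepalives might be set with PostgreSQL and psycopg2 so that a dropped NAT mapping surfaces as an error the application can handle rather than a permanent hang:

```python
# Illustration only, not Finn's stack: connection timeouts, TCP keepalives,
# and a server-side statement timeout together keep a silently broken
# connection from hanging the application forever.
import psycopg2

conn = psycopg2.connect(
    host="db.example.internal",      # hypothetical host
    dbname="app",
    user="app",
    password="secret",
    connect_timeout=5,               # don't block forever while connecting
    keepalives=1,                    # detect dead peers on idle connections
    keepalives_idle=60,
    keepalives_interval=10,
    keepalives_count=3,
)

# Bound how long any single statement may run (a PostgreSQL server setting).
with conn.cursor() as cur:
    cur.execute("SET statement_timeout = '10s'")
conn.commit()
```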
In conclusion, Strand said that Kubernetes, and by extension the use of containers in production, isn't for everyone. He feels that, had they by magic instantaneously moved everything from their old infrastructure to Kubernetes, the only real difference would have been that everything got a bit slower. It is the accompanying move to microservices and small, containerizable components that has given the added business value. Strand and Øvergaard were not the only people at KubeCon to say this, and it's an important caveat to hear while the world charges madly toward containerizing everything.
[Thanks to the Linux Foundation, LWN's travel sponsor, for assistance
in getting to Berlin for CNC and KubeCon.]
Index entries for this article
GuestArticles: Yates, Tom
Conference: CloudNativeCon+KubeCon/2017
Posted Apr 6, 2017 19:02 UTC (Thu) by jnareb (subscriber, #46500):
I guess this "Unleash" tool is too specific to be open-sourced, isn't it? I wonder how it compares with GitHub's Scientist tool (introduced by "Move Fast and Fix Things" article on GitHub Engineering blog: https://githubengineering.com/move-fast/ ; I don't remember if there was LWN.net article about it)...
Posted Apr 15, 2017 18:27 UTC (Sat) by audunstrand (guest, #115129):
Audun Strand here.
The Unleash tool is actually open source, and not specific to FINN itself. However, the HAProxy ingress integration is not yet open, but that is probably fixable.
Here is the link to Unleash: https://github.com/Unleash/unleash
Posted Apr 16, 2017 9:14 UTC (Sun) by tfheen (subscriber, #17598):
I'd be happy to provide the source to our "unleash-weighter" tool. It has some hard-coded knowledge about how we name our backends and what the toggle names in unleash are, but that is easy enough to change, or make configurable. It basically just consumes the unleash API and pushes the corresponding values into a set of HAProxy stats sockets.