Kubernetes operator best practices: Implementing observedGeneration

There’s a lot of hidden knowledge in the core controllers and the API conventions document that many controllers in the wild do not follow. One of these patterns is observedGeneration. Let’s take a closer look at the problems it can help solve.

Kubernetes resources typically enforce a separation of spec and status. That means that the desired state (the ‘spec’ part of the object) and the observed state (the ‘status’ part of the object) are updated via different API calls and typically also by different personas. While spec is something that the user maintains, e.g. via kubectl, status is what your controller owns. More information about this can be found in the API conventions document.

Because of this and the nature of how controllers work, even though both spec and status are part of the same resource, they don’t always describe the resource at the same point in time. Spec always points to the latest desired state, while status captures the state last observed by the controller, and those can be two different things.

Let’s take a look at an example with a simple CRD that has a replicas field. After the controller finishes its job, the spec and status will look as follows.

apiVersion: yourgroup.com/v1
kind: YourKindCluster
metadata:
  generation: 1
  name: cluster-sample
spec:
  replicas: 1
status:
  state: Deployed

Next, someone updates replicas to 2. If you fetch the resource, you might see a state that looks like the following example.

apiVersion: yourgroup.com/v1
kind: YourKindCluster
metadata:
  generation: 2
  name: cluster-sample
spec:
  replicas: 2
status:
  state: Deployed

Seeing this, one of two things could have happened. Either the controller was very fast, and your resource already went through the Deploying state and then stabilized in Deployed: your cluster runs with two replicas and everything is good.

Or the controller might not have seen the latest spec update: it might have been down, restarting, or it simply has not picked up the change yet. In that case, the state ‘Deployed’ actually refers to an older version of the spec.

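What’s also important to understand here is the generation field. Generation (metadata.generation) is a monotonically increasing number that the API server bumps every time the desired state of your resource changes; status updates should not bump it, which for custom resources means enabling the status subresource. More details (again) can be found in the API conventions document. So the first resource with 1 replica was generation 1, while the update of replicas bumped this to generation 2.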
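The bumping rule can be sketched as a tiny in-memory model. This is only an illustration of the behavior described above, not API server code: the Resource type and the UpdateSpec and UpdateStatus helpers are hypothetical names made up for this example.

```go
package main

import "fmt"

// Resource is a hypothetical, stripped-down stand-in for a custom resource.
type Resource struct {
	Generation int64  // metadata.generation
	Replicas   int32  // spec.replicas
	State      string // status.state
}

// UpdateSpec models a write to the desired state: the generation is bumped
// only when the spec actually changes.
func UpdateSpec(r *Resource, replicas int32) {
	if r.Replicas == replicas {
		return // no-op update, generation stays the same
	}
	r.Replicas = replicas
	r.Generation++
}

// UpdateStatus models a write to the status: the generation is left untouched.
func UpdateStatus(r *Resource, state string) {
	r.State = state
}

func main() {
	r := &Resource{Generation: 1, Replicas: 1, State: "Deployed"}

	UpdateStatus(r, "Deployed")
	fmt.Println(r.Generation) // 1

	UpdateSpec(r, 2)
	fmt.Println(r.Generation) // 2
}
```

The point of the model is that only the user’s change to replicas moves the number, so the generation identifies a version of the spec and nothing else.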

This is where observedGeneration comes in: it helps you resolve the ambiguity described in the example above. Many people know that, which is why observedGeneration is being added to plenty of open source projects like Cluster-API or cert-manager.

Going back to the previous example, if your controller sets an observedGeneration field in the status of your resource, you end up in one of the following states. In the first one, the controller has not seen the new spec update yet, so the status is obsolete.

apiVersion: yourgroup.com/v1
kind: YourKindCluster
metadata:
  generation: 2
  name: cluster-sample
spec:
  replicas: 2
status:
  observedGeneration: 1
  state: Deployed

If the controller has already picked up the spec update, it should update observedGeneration as well as state (e.g. to Deploying), eventually stabilizing in the following state.

apiVersion: yourgroup.com/v1
kind: YourKindCluster
metadata:
  generation: 2
  name: cluster-sample
spec:
  replicas: 2
status:
  observedGeneration: 2
  state: Deployed
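On the controller side, this comes down to copying metadata.generation into status.observedGeneration whenever the status is written. Here is a minimal sketch in plain Go, without client-go or controller-runtime. The Cluster types, the Reconcile function and the runningReplicas parameter (standing in for whatever the controller observes in the real world) are assumptions made for illustration.

```go
package main

import "fmt"

// ClusterSpec and ClusterStatus are hypothetical types mirroring the example CRD.
type ClusterSpec struct {
	Replicas int32
}

type ClusterStatus struct {
	ObservedGeneration int64
	State              string
}

type Cluster struct {
	Generation int64 // metadata.generation
	Spec       ClusterSpec
	Status     ClusterStatus
}

// Reconcile handles one pass over the resource. runningReplicas is a
// stand-in for the state the controller observes outside of the resource.
func Reconcile(c *Cluster, runningReplicas int32) {
	// Record which generation of the spec this status is based on.
	c.Status.ObservedGeneration = c.Generation

	if runningReplicas == c.Spec.Replicas {
		c.Status.State = "Deployed"
	} else {
		c.Status.State = "Deploying"
	}
}

func main() {
	// The user bumped replicas to 2; the status still describes generation 1.
	c := &Cluster{
		Generation: 2,
		Spec:       ClusterSpec{Replicas: 2},
		Status:     ClusterStatus{ObservedGeneration: 1, State: "Deployed"},
	}

	Reconcile(c, 1)
	fmt.Println(c.Status.ObservedGeneration, c.Status.State) // 2 Deploying

	Reconcile(c, 2)
	fmt.Println(c.Status.ObservedGeneration, c.Status.State) // 2 Deployed
}
```

In a real controller the generation should be taken from the same object the reconcile loop read the spec from, so that the recorded number matches the spec the status was computed against.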

observedGeneration is one of the many lessons one can learn by looking closely at the core Kubernetes controllers; the pattern is used by StatefulSet, for example. It’s also what rollout status in kubectl uses to tell whether your Deployment, StatefulSet, etc. has finished deploying.
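A consumer of the resource can apply the same idea: trust the status only once the controller has caught up with the latest generation. The sketch below is a simplified check in that spirit, not the actual kubectl implementation. The IsRolledOut function and the ‘Deployed’ state value are assumptions carried over from the example CRD.

```go
package main

import "fmt"

// IsRolledOut reports whether the status can be trusted and says the
// resource is deployed. It is a hypothetical helper for the example CRD.
func IsRolledOut(generation, observedGeneration int64, state string) bool {
	if observedGeneration < generation {
		// The controller has not processed the latest spec yet,
		// so whatever the state says refers to an older spec.
		return false
	}
	return state == "Deployed"
}

func main() {
	fmt.Println(IsRolledOut(2, 1, "Deployed"))  // false: status is stale
	fmt.Println(IsRolledOut(2, 2, "Deploying")) // false: still rolling out
	fmt.Println(IsRolledOut(2, 2, "Deployed"))  // true
}
```

The comparison uses "less than" rather than "not equal" so that a status written for a newer generation than the one the caller holds is still accepted.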
