kubernetes, elasticsearch, operators

Kubernetes Elasticsearch Operator

Right after I wrote the Stateful applications on Kubernetes article, which focuses on StatefulSet in general, I started to look into Kubernetes operators. An operator is basically a controller paired with a custom API object, registered as a CustomResourceDefinition, which lets you encode custom operational logic for a particular service, in this case Elasticsearch. This post goes through the Elasticsearch operator in more detail to show you why using an operator is probably a better idea than using StatefulSet, Deployment and other resources directly to create a production-ready Elasticsearch cluster on top of Kubernetes.
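To make the idea a bit more concrete, here is a minimal sketch of the kind of logic an operator runs: compare the state declared in a custom resource with what is actually running, and work out the steps needed to close the gap. This is not code from the Elasticsearch operator; `ClusterSpec`, `reconcile` and the cluster names are hypothetical and only illustrate the pattern.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Desired state, as a user would declare it in a custom resource (hypothetical)."""
    name: str
    data_nodes: int


def reconcile(desired: ClusterSpec, running_data_nodes: int) -> list[str]:
    """Return the actions needed to move the observed state to the desired one."""
    diff = desired.data_nodes - running_data_nodes
    if diff > 0:
        return [f"create data node for {desired.name}"] * diff
    if diff < 0:
        return [f"remove data node from {desired.name}"] * -diff
    return []  # observed state already matches the spec


# One operator can watch several clusters; each one is reconciled independently.
desired_clusters = [ClusterSpec("logs", 3), ClusterSpec("search", 2)]
observed = {"logs": 1, "search": 2}  # made-up observed node counts

for spec in desired_clusters:
    print(spec.name, reconcile(spec, observed.get(spec.name, 0)))
```

A real operator would run this comparison continuously against the Kubernetes API and apply the resulting changes instead of printing them.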

StatefulSet/Deployment or Elasticsearch operator?

When I started to look into operators, I asked the above question on Twitter and referenced the author of the most used Elasticsearch Kubernetes deployment resource. The repo is here and I highly recommend it for learning more about deploying Elasticsearch on top of Kubernetes.

Check this tweet to see the full conversation history!

At first, I just couldn't decide which approach is right or better. I had a feeling that using an operator is not beneficial at all, and that it would introduce more issues in handling an Elasticsearch cluster in production. Also, if you compare the GitHub stars of the above repo with those of the Elasticsearch operator, you will see that the StatefulSet approach is still more popular than this operator. Or people just don't like the idea.

Elasticsearch operator

"An Operator represents human operational knowledge in software to reliably manage an application."

In the end, I decided to go with the operator. I also made a first, really small contribution and, just a few days ago, another one: a Helm chart. The latter is still in progress, so stay tuned.

The operator can do things that are not available with a plain StatefulSet. It uses different Kubernetes resources in the background to do things in a more automated fashion and adds some additional features:

  • S3 snapshots of indexes
  • Automatic TLS - secrets are automatically generated by the operator
  • Spread loads across zones
  • Support for Kibana and Cerebro
  • Instrumentation with statsd
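As an illustration of what spreading load across zones can mean in practice, here is a minimal sketch that splits a number of node replicas over a list of zones as evenly as possible. It is an assumption about the general technique, not the operator's actual implementation; `spread_across_zones` is a hypothetical helper and the zone names are placeholders.

```python
from __future__ import annotations


def spread_across_zones(replicas: int, zones: list[str]) -> dict[str, int]:
    """Split replicas over zones as evenly as possible."""
    if not zones:
        raise ValueError("at least one zone is required")
    base, extra = divmod(replicas, len(zones))
    # The first `extra` zones take one additional replica each.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


zones = ["us-east-1c", "us-east-1d", "us-east-1e"]  # placeholder zone names
print(spread_across_zones(3, zones))
print(spread_across_zones(5, zones))
```

With three zones, three replicas land one per zone, while five replicas leave two zones with two nodes and one zone with a single node.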

Basically, you could put whatever you want into an operator. For instructions on how to deploy it, please check the official GitHub repo. I will not repeat the commands here because they can change. As I mentioned, the operator registers a custom Kubernetes resource through a CRD. After you deploy the Elasticsearch operator, you can check that a new CustomResourceDefinition has been created:

⚡  kubectl get CustomResourceDefinition
NAME                                         AGE
elasticsearchclusters.enterprises.upmc.com   3d

And you can check the details of this CRD with:

⚡  kubectl describe CustomResourceDefinition elasticsearchclusters.enterprises.upmc.com
...
Spec:
  Group:  enterprises.upmc.com
  Names:
    Kind:       ElasticsearchCluster
    List Kind:  ElasticsearchClusterList
    Plural:     elasticsearchclusters
    Singular:   elasticsearchcluster
  Scope:        Namespaced
  Version:      v1
...

As you can see, we have a new kind of resource, ElasticsearchCluster. So now you can create an Elasticsearch cluster using only one YAML file which describes it. Here is an example:

apiVersion: enterprises.upmc.com/v1
kind: ElasticsearchCluster
metadata:
  name: example-es-cluster
spec:
  kibana:
    image: upmcenterprises/kibana:5.3.1
  cerebro:
    image: upmcenterprises/cerebro:0.6.8
  client-node-replicas: 3
  master-node-replicas: 2
  data-node-replicas: 3
  network-host: 0.0.0.0
  zones:
  - us-east-1c
  - us-east-1d
  - us-east-1e
  data-volume-size: 10Gi
  java-options: "-Xms256m -Xmx256m"
  snapshot:
    scheduler-enabled: false
    bucket-name: elasticsnapshots99
    cron-schedule: "@every 2m"
  storage:
    type: gp2
    storage-class-provisioner: kubernetes.io/aws-ebs
  resources:
    requests:
      memory: 512Mi
      cpu: 500m
    limits:
      memory: 1024Mi
      cpu: '1'

That's it: a complete Elasticsearch cluster. If you want to add another one, you just create a new YAML file. One operator can manage multiple Elasticsearch clusters. Below I try to answer some questions you may have.
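One thing worth checking in a manifest like the one above is that the JVM heap set in java-options fits comfortably inside the container memory limit. The sketch below does that arithmetic. The helper names are hypothetical, the parsing only covers the simple suffixes used in the example, and the 50% threshold is a common rule of thumb that I assume here, not something the operator enforces.

```python
import re

_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def max_heap_bytes(java_options: str) -> int:
    """Extract the -Xmx value from a JVM options string, in bytes."""
    match = re.search(r"-Xmx(\d+)([kKmMgG])", java_options)
    if match is None:
        raise ValueError("no -Xmx setting with a k/m/g suffix found")
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def k8s_memory_bytes(quantity: str) -> int:
    """Convert a Kubernetes memory quantity with a Ki/Mi/Gi suffix to bytes."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi)", quantity)
    if match is None:
        raise ValueError(f"unsupported quantity: {quantity}")
    return int(match.group(1)) * _UNITS[match.group(2)[0].lower()]


def heap_fits(java_options: str, memory_limit: str, max_ratio: float = 0.5) -> bool:
    """True if the max heap is at most max_ratio of the memory limit (assumed rule of thumb)."""
    return max_heap_bytes(java_options) <= max_ratio * k8s_memory_bytes(memory_limit)


# Values taken from the example manifest: 256 MiB heap, 1024Mi limit.
print(heap_fits("-Xms256m -Xmx256m", "1024Mi"))
# A heap as large as the whole limit would leave no room for off-heap memory.
print(heap_fits("-Xms1g -Xmx1g", "1024Mi"))
```

The example manifest passes this check, since a 256 MiB heap is well below half of the 1024Mi limit.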

What should a well-written operator look like?

A really well-written operator is the Prometheus operator from CoreOS. Check it here https://coreos.com/operators/prometheus/docs/latest/user-guides/getting-started.html. CoreOS introduced the concept of operators, as a way to package operational logic, in the first place.

Can I use a custom Elasticsearch Docker image?

The Elasticsearch Docker image used by this operator is built in several layers. You can check these repos in this exact order:

  1. Base image
  2. Kubernetes ready image
  3. Operator ready image

In the end, you get one image, upmcenterprises/docker-elasticsearch-kubernetes, which is the default one. You can use the official image, but you can also build your own. From the above links you can check all the environment variables and other things that you need to incorporate into your image for it to work correctly. It is pretty easy.

OK, so it all looks good, but are there any cons?

Well, yeah, some of them:

  • It is an open source project, so you should probably be able to fix something on your own
  • You need to learn how it works (it is an additional tool)
  • I would like to see support for different Java options for master, data and client nodes
  • I would also like to see zone awareness, so that primary and replica shards are not all scheduled into the same zone

Summary

I referenced only one operator, which I found really good for Elasticsearch. There are probably many others. What I would really like to see is companies like Elastic starting to embrace Kubernetes and eventually developing operators and Helm charts themselves. I'm afraid that we will end up with multiple operators for the same software rather than improving the existing ones. What can you do about it? Well, start using existing operators and contributing to them. It is like with any other open source project. Operators are here to stay.

Alen Komljen

Building and automating infrastructure with Docker, Kubernetes, kops, Helm, Rancher, Terraform, Ansible, SaltStack, Jenkins, AWS, GKE and many others.
  • Sarajevo