Introduction to the Humio Operator for Kubernetes

Part 1 of a series on the new Humio Operator

This blog was originally published Nov. 10, 2020 on humio.com. Humio is a CrowdStrike Company.

As Kubernetes has grown in popularity, so have the variety and scale of the applications enterprises run on the container-orchestration system. One of the most exciting developments in the cloud-native landscape has been the Operator pattern. In this first of two blog posts, we introduce the Humio Operator for Kubernetes. Later this week, we’ll walk you through how to use the Humio Operator to deploy Humio on Kubernetes.

Humio began adopting Kubernetes for its software-as-a-service offering in early 2020, using our previously developed Helm chart. The chart made deploying Humio in a Kubernetes environment easier, but it still lacked the features and intuitive management that our operations team envisioned when we began proof-of-concept work on a Kubernetes Operator. Over the last year, our infrastructure engineering team, composed of ourselves and Mike Rostermund, developed, iterated on, and deployed the Humio Operator to run our production cloud environments. Humio’s Operator is available as an open-source project, so any Humio customer can run Humio in a self-hosted environment using the same best practices we use to manage our own cloud environments.

Watch our on-demand webinar Running Humio on Kubernetes with the Humio Operator to learn best practices for leveraging the Humio Operator to maintain Humio resources, how to deploy the Humio Operator on an AWS EKS environment, and more!

Features of the Humio Operator

The strength of Kubernetes as a platform lies in its integration between the critical components of a reliable service. When an application is deployed on Kubernetes, the containers running the application are only one part of delivering a complete service. Other systems, both inside and outside the Kubernetes cluster, must also be automated and controlled to allow flexibility in architectural choices and to support security best practices.

The Humio Operator has many features that let administrators deploy and manage a Humio cluster. It also builds on commonly used components from the cloud-native ecosystem, such as cert-manager and the nginx ingress controller, to automate secure communication between Humio pods. The Humio Operator for Kubernetes:

  • automates the installation of a Humio cluster on Kubernetes.
  • automates the management of Humio Repositories, Parsers, and Ingest Tokens.
  • automates management tasks in Humio, such as partition balancing.
  • automates version upgrades of Humio.
  • automates configuration changes of Humio.
  • allows the use of various storage mediums, including hostPath or storage class PVCs.
  • automates cluster authentication and security, such as pod-to-pod TLS, SAML, and OAuth.

The Humio Operator manages Humio clusters that are defined as Humio cluster resources. The Humio cluster resource specification is wide-ranging: it combines pod specifications for resources and affinity, Humio configuration, and the configuration for ingress and cert-manager.
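To make the shape of such a resource concrete, here is a minimal sketch that builds a Humio cluster resource as a plain Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The API group and version, the field names under "spec", the label used in the affinity rule, and the environment variable are assumptions for illustration and should be checked against the Operator's published resource definitions; the image tag is a placeholder.

```python
import json


def humio_cluster_manifest(name, image, node_count, hostname):
    """Return a HumioCluster-style custom resource as a plain dict.

    Field names under "spec" are assumptions for illustration only.
    """
    return {
        "apiVersion": "core.humio.com/v1alpha1",  # assumed API group/version
        "kind": "HumioCluster",
        "metadata": {"name": name},
        "spec": {
            "image": image,
            "nodeCount": node_count,
            # Pod specification: resources and affinity.
            "resources": {"requests": {"cpu": "2", "memory": "8Gi"}},
            "affinity": {
                "podAntiAffinity": {
                    "requiredDuringSchedulingIgnoredDuringExecution": [
                        {
                            # Hypothetical label; use whatever labels your pods carry.
                            "labelSelector": {
                                "matchLabels": {"app.kubernetes.io/name": "humio"}
                            },
                            "topologyKey": "kubernetes.io/hostname",
                        }
                    ]
                }
            },
            # Humio configuration, passed as environment variables (illustrative).
            "environmentVariables": [
                {"name": "AUTHENTICATION_METHOD", "value": "single-user"},
            ],
            # Ingress and cert-manager (TLS) configuration.
            "hostname": hostname,
            "ingress": {"enabled": True, "controller": "nginx"},
            "tls": {"enabled": True},
        },
    }


if __name__ == "__main__":
    manifest = humio_cluster_manifest(
        name="example-humiocluster",
        image="humio/humio-core:VERSION",  # placeholder; pin a real tag
        node_count=3,
        hostname="humio.example.com",
    )
    # JSON output can be piped to `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

Keeping pod settings, Humio settings, and ingress/TLS settings in one document is what lets a single resource describe the whole cluster.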

This visualization illustrates the many moving parts that the Humio Operator coordinates.

You can use the cluster resource definition to configure any of Humio’s available configuration settings. Sometimes, however, you need to supply the container with a file, for example a certificate when enabling features such as SAML, or the permissions files if you previously used Humio’s permissions files. This is accomplished by loading the files into Kubernetes secrets and referencing them in the Humio cluster resource definition.
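As a rough sketch of that pattern, the Python below wraps a file in a Kubernetes Secret (Secret "data" values are base64-encoded) and adds a matching volume and volume mount to a cluster spec. The spec keys "extraVolumes" and "extraHumioVolumeMounts", the secret name, and the mount path are assumptions for illustration; check the Operator's resource definitions for the exact fields it supports.

```python
import base64
import copy


def file_secret(name, filename, content):
    """Wrap a file's bytes in a Kubernetes Secret manifest."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        # Kubernetes expects base64-encoded values under "data".
        "data": {filename: base64.b64encode(content).decode("ascii")},
    }


def mount_secret(cluster_spec, secret_name, mount_path):
    """Return a copy of cluster_spec that mounts the secret read-only.

    "extraVolumes" and "extraHumioVolumeMounts" are assumed field names.
    """
    spec = copy.deepcopy(cluster_spec)
    spec.setdefault("extraVolumes", []).append(
        {"name": secret_name, "secret": {"secretName": secret_name}}
    )
    spec.setdefault("extraHumioVolumeMounts", []).append(
        {"name": secret_name, "mountPath": mount_path, "readOnly": True}
    )
    return spec


if __name__ == "__main__":
    # Hypothetical names and placeholder certificate content.
    secret = file_secret("idp-certificate", "idp-certificate.pem", b"PEM DATA HERE")
    spec = mount_secret({"nodeCount": 3}, "idp-certificate", "/etc/humio/idp")
    print(secret["metadata"]["name"], spec["extraHumioVolumeMounts"][0]["mountPath"])
```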

One of the most exciting features Humio has added in the last year is bucket storage. Bucket storage allows Humio to run on the ephemeral NVMe disks provided in most cloud environments while persisting the data in the cloud provider’s object storage. The Operator and its tight integration with Humio’s features allow us to make managing these complex deployments much simpler. When an upgrade or configuration change is deployed, the Operator manages the pods in such a way that instances reuse the data already on their local disks, and data is only loaded from object storage when a Kubernetes worker is replaced. We recommend running Humio in this manner for the best performance, but it may not be feasible if the environment doesn’t have ephemeral local storage available. For those cases, the Operator also supports deploying clusters using network block storage.
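The choice between the two storage styles can be sketched as a small Python helper that returns the storage portion of a cluster spec. The field names "dataVolumeSource" and "dataVolumePersistentVolumeClaimSpecTemplate", the host path, the storage class, and the size are assumptions for illustration; the nested hostPath and PVC structures follow the standard Kubernetes volume and claim shapes.

```python
def storage_spec(
    use_local_disk,
    host_path="/mnt/disks/vol1",  # hypothetical mount point of the local NVMe disk
    storage_class="standard",     # hypothetical storage class name
    size="500Gi",                 # illustrative size
):
    """Return the storage portion of a cluster spec (assumed field names)."""
    if use_local_disk:
        # Ephemeral local disk on the worker, paired with bucket storage.
        return {
            "dataVolumeSource": {
                "hostPath": {"path": host_path, "type": "Directory"}
            }
        }
    # Network block storage provisioned through a storage class PVC.
    return {
        "dataVolumePersistentVolumeClaimSpecTemplate": {
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        }
    }


if __name__ == "__main__":
    spec = {"nodeCount": 3}
    spec.update(storage_spec(use_local_disk=False))
    print(sorted(spec))
```

Returning exactly one of the two keys keeps the sketch from describing both storage styles for the same cluster at once.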

We created additional custom resources that use the Humio cluster’s API to make getting up and running simple. After the cluster has started, the Humio Operator allows administrators to use these custom resource definitions to create repositories, parsers, and ingest tokens, with the tokens stored as Kubernetes secrets. Once your cluster is deployed, you can configure your chosen log shipper’s DaemonSet to send application logs to the Humio repository using the credentials stored in the Kubernetes secret.
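A minimal sketch of that workflow in Python follows: it builds a repository, a parser, and an ingest token resource for one cluster, then decodes a token from a Secret the way a log shipper's configuration step might. The kinds, the API version, the spec field names, the "token" key inside the Secret, the retention value, and the parser script are assumptions for illustration, not a confirmed description of the Operator's API.

```python
import base64

API_VERSION = "core.humio.com/v1alpha1"  # assumed API group/version


def humio_resource(kind, name, cluster, extra_spec):
    """Build one Humio custom resource tied to a managed cluster."""
    spec = {"managedClusterName": cluster, "name": name}
    spec.update(extra_spec)
    return {
        "apiVersion": API_VERSION,
        "kind": kind,
        "metadata": {"name": name},
        "spec": spec,
    }


def onboarding_resources(cluster, repo):
    """Repository, parser, and ingest token for one application's logs."""
    parser_name = f"{repo}-json"
    token_name = f"{repo}-token"
    return [
        humio_resource("HumioRepository", repo, cluster,
                       {"retention": {"timeInDays": 30}}),  # illustrative value
        humio_resource("HumioParser", parser_name, cluster,
                       {"repositoryName": repo, "parserScript": "parseJson()"}),
        humio_resource("HumioIngestToken", token_name, cluster,
                       {"repositoryName": repo,
                        "parserName": parser_name,
                        # Name of the Secret that should receive the token.
                        "tokenSecretName": token_name}),
    ]


def read_token(secret, key="token"):
    """Decode a token from a Secret manifest (the key name is an assumption)."""
    return base64.b64decode(secret["data"][key]).decode("utf-8")


if __name__ == "__main__":
    for resource in onboarding_resources("example-humiocluster", "app-logs"):
        print(resource["kind"], resource["metadata"]["name"])
```

The log shipper then only needs the name of the Secret, so the token itself never has to appear in its configuration.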

Humio on Kubernetes Architecture

If you’re interested in learning more about the Humio Operator for Kubernetes, watch our on-demand webinar, Running Humio on Kubernetes with the Humio Operator. You’ll learn best practices for leveraging the Humio Operator to maintain Humio resources, tips for making the best use of your cloud environment’s storage options, and more. And be sure to check back here on Thursday morning for the follow-up to this blog post, How to use the Humio Operator to run Humio on Kubernetes.
