Malay Hazarika
May 30, 2025
| 10 minute read
As your applications grow, so does the flood of log data. Managing and scaling your logging infrastructure can quickly become a significant challenge. For years, the ELK stack (Elasticsearch, Logstash, Kibana) has been a go-to solution. However, ELK is showing its age: Logstash in particular is resource-intensive and inflexible compared to modern alternatives. On top of that, Elasticsearch became a commercial product after its license change, and many companies refuse to use it due to its licensing terms.
This article introduces a powerful, truly open-source, and resource-efficient alternative to ELK: OpenTelemetry and OpenSearch. You'll gain a clear understanding of this modern approach, how to set it up, and the benefits it brings to your observability strategy.
OpenTelemetry is more than just a logging tool; it's a Cloud Native Computing Foundation (CNCF) project that aims to standardize the generation and collection of telemetry data: logs, metrics, and traces. While this article focuses on its logging capabilities, adopting OTel for logs paves the way for a unified observability strategy across all three pillars.
The OTel Collector is a key component, offering incredible flexibility. It has a wide range of receivers, such as filelog (for tailing files) or dockerstats, so it can ingest telemetry from files, containers, and applications alike.

Elasticsearch, the heart of the ELK stack, also underwent significant licensing changes when Elastic moved core components to a source-available license. This shift raised concerns within the open-source community regarding vendor lock-in and the long-term openness of the project. The community really didn't like this change.
As a direct response, OpenSearch, a community-driven, Apache 2.0 licensed fork, was created to ensure a truly open-source path forward. It has an active community and is backed by Amazon Web Services (AWS), which provides a stable foundation for future development.
Beyond its open-source nature, OpenSearch comes packed with features critical for robust log management: Index State Management for retention, a fine-grained security plugin, and OpenSearch Dashboards for visualization and exploration.
Here’s how you can start building your logging pipeline with OpenTelemetry and OpenSearch.
A Kubernetes Cluster: While you can run these components in other environments, Kubernetes is highly recommended, especially for observability systems. These systems often grow faster than anticipated, and Kubernetes provides the scalability and manageability you'll need.
The recommended way to deploy and manage OpenSearch on Kubernetes is by using the OpenSearch Operator.
Why the Operator? It significantly simplifies deployment, ongoing management (like version upgrades, config changes, and scaling), and day-to-day operations. In our experience managing multiple large OpenSearch clusters that ingest billions of spans and logs daily, the operator has substantially reduced operational overhead.
Follow these instructions to set up the OpenSearch Operator using Helm: OpenSearch-k8s-operator. A minimal install is sketched below.
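For reference, a minimal Helm install looks roughly like this (repo URL and chart name as published in the project's README; adjust the namespace to your setup):

helm repo add opensearch-operator https://opensearch-project.github.io/opensearch-k8s-operator/
helm repo update
helm install opensearch-operator opensearch-operator/opensearch-operator \
  --namespace observability --create-namespace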
Once you have the operator running, you can deploy an OpenSearch cluster. The following is an example of a 3-node OpenSearch cluster; read this user guide for more details: OpenSearch Operator User Guide
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
name: observability-opensearch
namespace: observability
spec:
general:
serviceName: observability-opensearch
version: 2.17.0
additionalConfig:
plugins.query.datasources.encryption.masterkey: "cbdda1e0ab9e45c44f9b56a3" # Change this
security:
config:
adminSecret:
# This secret contains the admin certificate using common name "admin". Use cert-manager to generate it.
name: observability-opensearch-admin-cert
adminCredentialsSecret:
        # This secret contains the admin credentials. The keys are "username" and "password".
name: observability-opensearch-admin-credentials
securityConfigSecret:
        # This secret contains the security config files.
        # Each key is a filename and each value is the file content, e.g. keys "config.yml" and "internal_users.yml".
name: observability-opensearch-security-config-files
tls:
transport:
generate: false
perNode: false
secret:
          # Generate this similarly to the admin certificate, but with common name "opensearch".
name: observability-opensearch-node-cert
caSecret:
# This is the secret that contains the CA certificate used to create the admin and node certificates.
name: observability-ca-secret
nodesDn: ["CN=opensearch"]
adminDn: ["CN=admin"]
http:
generate: false
secret:
name: observability-opensearch-node-cert
caSecret:
name: observability-ca-secret
dashboards:
enable: true
version: 2.17.0
replicas: 1
resources:
requests:
memory: "512Mi"
cpu: "200m"
limits:
memory: "512Mi"
cpu: "200m"
opensearchCredentialsSecret:
name: observability-opensearch-admin-credentials
nodePools:
- component: master
replicas: 3
diskSize: "30Gi"
nodeSelector:
resources:
requests:
memory: "1.5Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "1500m"
roles:
- "cluster_manager"
- "data"
persistence:
pvc:
storageClass: "storage-class-name" # Change this to your storage class
accessModes:
- "ReadWriteOnce"
env:
        - name: DISABLE_INSTALL_DEMO_CONFIG
          value: "true" # Disable the bundled demo security config; we supply our own
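The admin and node certificate secrets referenced above can be issued with cert-manager. Here is a minimal sketch for the admin certificate, assuming you already have a cert-manager CA Issuer (here called observability-ca-issuer, a name of our choosing) backed by observability-ca-secret. The node certificate is identical except for commonName: opensearch and its secret name:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: observability-opensearch-admin-cert
  namespace: observability
spec:
  secretName: observability-opensearch-admin-cert
  commonName: admin
  duration: 8760h # 1 year; rotate before expiry
  privateKey:
    algorithm: RSA
    size: 2048
  usages:
    - digital signature
    - key encipherment
    - client auth
  issuerRef:
    name: observability-ca-issuer # assumed CA Issuer created beforehand
    kind: Issuer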
Following are the essential configuration files you need to put in the observability-opensearch-security-config-files secret.

internal_users.yml:
_meta:
type: "internalusers"
config_version: 2
admin:
# Change this to your hashed admin password: Use https://bcrypt-generator.com/
  hash: "$2y$12$eW5Z1z3a8b7c9d8e7f8g9u0h1i2j3k4l5m6n7o8p9q0r1s2t3u4v5w6x7y8z"
reserved: true
description: "Cluster super user"
backend_roles:
- "admin"
---
config.yml:
_meta:
type: "config"
config_version: 2
config:
dynamic:
http:
anonymous_auth_enabled: false
authc:
basic_internal_auth_domain:
description: "Authenticate via HTTP Basic against internal users database"
http_enabled: true
transport_enabled: true
order: 4
http_authenticator:
type: basic
challenge: true
authentication_backend:
type: intern
clientcert_auth_domain:
description: "Authenticate via SSL client certificates"
http_enabled: false
transport_enabled: false
order: 2
http_authenticator:
type: clientcert
config:
username_attribute: cn #optional, if omitted DN becomes username
challenge: false
authentication_backend:
type: noop
authz: {}
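Assuming you saved the two files above locally as internal_users.yml and config.yml, you can create the secret the cluster spec references like so:

kubectl create secret generic observability-opensearch-security-config-files \
  --namespace observability \
  --from-file=internal_users.yml \
  --from-file=config.yml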
Data streams are designed to handle continuously generated, append-only time-series data such as logs. We will set up a data stream named logs-stream and make it write to a new index every day. The indices will expire after 30 days. This is a common pattern for log data, allowing you to efficiently manage storage and retention.
Note: You can run the following commands in Dev Tools in OpenSearch Dashboards.
PUT _index_template/logs-stream-template
{
"index_patterns" : "logs-stream",
"data_stream": {},
"priority": 100
}
PUT _data_stream/logs-stream
PUT _plugins/_ism/policies/logs-lifecycle-policy
{
"policy": {
"description": "Rollover indices daily and delete after 30 days",
"default_state": "hot",
"states": [
{
"name": "hot",
"actions": [
{
"rollover": {
"min_index_age": "1d"
}
}
],
"transitions": [
{
"state_name": "delete",
"conditions": {
"min_index_age": "30d"
}
}
]
},
{
"name": "delete",
"actions": [
{
"delete": {}
}
],
"transitions": []
}
]
}
}
POST _plugins/_ism/add/logs-stream
{
"policy_id": "logs-lifecycle-policy"
}
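To sanity-check the setup, you can inspect the data stream and confirm the lifecycle policy is attached to its backing indices (which are typically named .ds-logs-stream-*):

GET _data_stream/logs-stream

GET _plugins/_ism/explain/.ds-logs-stream-*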
Similarly, for the OpenTelemetry Collector, using the OTel Operator is recommended. Follow these instructions to set it up: OpenTelemetry Operator
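Note that the OTel Operator depends on cert-manager, which you will already have if you generated the OpenSearch certificates with it. A minimal Helm install, using the charts published by the OpenTelemetry project, looks roughly like this:

helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm install opentelemetry-operator open-telemetry/opentelemetry-operator \
  --namespace observability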
Once you have the operator running, you can deploy an OpenTelemetry Collector. The Collector is the gateway for your observability data: your applications send their logs to the Collector, which processes and exports them to OpenSearch.
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
name: observability-otel-workers
namespace: observability
spec:
mode: deployment
image: otel/opentelemetry-collector-contrib:0.118.0
  resources:
requests:
memory: "100Mi"
cpu: "500m"
limits:
memory: "1Gi"
cpu: "1000m"
autoscaler:
maxReplicas: 2
minReplicas: 1
targetCPUUtilization: 90
targetMemoryUtilization: 90
volumeMounts:
- name: ca-cert
mountPath: /tls/ca.crt
subPath: ca.crt
volumes:
- name: ca-cert
secret:
# This is the secret that contains the CA certificate used to create the admin and node certificates.
secretName: observability-ca-secret
items:
- key: ca.crt
path: ca.crt
config:
extensions:
basicauth/os:
client_auth:
username: admin
password: opensearch-admin-password # Change this to your OpenSearch admin password
receivers:
otlp:
protocols:
grpc:
endpoint: "0.0.0.0:4317"
http:
endpoint: ":4318"
cors:
allowed_origins:
- "http://*"
- "https://*"
processors:
batch:
send_batch_max_size: 3000
send_batch_size: 1000
timeout: 5s
exporters:
opensearch/logs:
# This is the OpenSearch data stream we created earlier.
logs_index: "logs-stream"
http:
endpoint: "https://observability-opensearch.observability.svc.cluster.local:9200"
auth:
authenticator: basicauth/os
tls:
insecure: false
ca_file: /tls/ca.crt
service:
extensions: [basicauth/os]
pipelines:
logs:
receivers: [otlp]
processors: [batch]
exporters: [opensearch/logs]
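Once the Collector pods are up, you can smoke-test the pipeline by sending a single log record over OTLP/HTTP. The sketch below assumes the operator exposed the Collector through a service named observability-otel-workers-collector (the operator appends -collector to the resource name):

kubectl -n observability port-forward svc/observability-otel-workers-collector 4318:4318

curl -X POST http://localhost:4318/v1/logs \
  -H "Content-Type: application/json" \
  -d '{
    "resourceLogs": [{
      "resource": {
        "attributes": [{ "key": "service.name", "value": { "stringValue": "smoke-test" } }]
      },
      "scopeLogs": [{
        "logRecords": [{
          "severityText": "INFO",
          "body": { "stringValue": "hello from curl" }
        }]
      }]
    }]
  }'

The record should then show up in the logs-stream data stream within a few seconds.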
If you are running your application on the same Kubernetes cluster, you can send logs directly to the collector service, which is observability-otel-workers-collector.observability.svc.cluster.local:4317. If you are running your application outside the cluster, you can expose the collector service using a LoadBalancer or NodePort service type. We recommend using an internal load balancer if your application runs in the same cloud provider as your Kubernetes cluster.
Here is an example of how to create such a load balancer service:
apiVersion: v1
kind: Service
metadata:
name: observability-otel-collector-alb
annotations:
service.beta.kubernetes.io/aws-load-balancer-internal: "true" # Annotation for internal load balancer
spec:
type: LoadBalancer
ports:
- name: otlp-grpc
port: 4317
nodePort: 32007
protocol: TCP
- name: otlp-http
port: 4318
nodePort: 32008
protocol: TCP
selector:
app.kubernetes.io/component: opentelemetry-collector
app.kubernetes.io/instance: observability.observability-otel-workers
app.kubernetes.io/part-of: opentelemetry
You can obtain the load balancer endpoint using kubectl get svc.
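For example, to pull just the hostname the cloud provider assigned (AWS load balancers report a hostname rather than an IP):

kubectl get svc observability-otel-collector-alb \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'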
To send logs to the OpenTelemetry Collector, you need to instrument your application using OpenTelemetry libraries. Most languages have zero-code instrumentation support, which means you can automatically collect logs with minimal code changes.
Check the Zero code instrumentation documentation for your language: https://opentelemetry.io/docs/concepts/instrumentation/zero-code/
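As an illustration, here is what zero-code instrumentation looks like for a Python application, using the standard OpenTelemetry Python packages; my-app and app.py are placeholders, and the endpoint is whatever Collector address you exposed above:

pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap -a install

OTEL_SERVICE_NAME=my-app \
OTEL_EXPORTER_OTLP_ENDPOINT=http://<your-otel-collector-endpoint>:4317 \
OTEL_LOGS_EXPORTER=otlp \
OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true \
opentelemetry-instrument python app.py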
It is recommended to use the OTel libraries to instrument your application. But if you cannot modify the application code, you can still forward its logs to the OTel Collector. To do that, run an OTel Collector in agent mode on the same host as your application. The agent will tail the log files and send them to the gateway Collector, which will write them to OpenSearch.
Here is an example of how to configure the agent to tail log files (config.yaml):
receivers:
filelog:
include:
- /var/log/myapp/*.log
operators:
- type: regex_parser
regex: '^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<sev>[A-Z]*) (?P<msg>.*)$'
timestamp:
parse_from: attributes.time
layout: '%Y-%m-%d %H:%M:%S'
severity:
parse_from: attributes.sev
exporters:
otlp:
endpoint: "<your-otel-collector-endpoint>:4317"
tls:
insecure: true
sending_queue:
num_consumers: 4
queue_size: 100
retry_on_failure:
enabled: true
processors:
batch:
service:
pipelines:
    logs:
receivers: [filelog]
processors: [batch]
exporters: [otlp]
Run the collector as /path/to/otelcol --config /path/to/config.yaml (use the otelcol-contrib distribution, which includes the filelog receiver). This will start the collector in agent mode, tailing the log files and sending them to the gateway OTel Collector.
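For example, with the regex above, a log line such as:

2025-05-30 12:00:01 INFO Application started

would be parsed into time 2025-05-30 12:00:01, severity INFO, and message "Application started" before being exported.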
Note: Read more about the filelog
receiver here: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/filelogreceiver
Moving away from traditional ELK to a stack powered by OpenTelemetry and OpenSearch gives you a more flexible, efficient, and truly open-source solution for your logging needs. You gain the advantage of standardized telemetry collection with OTel, starting with logs and adding traces and metrics later. This is step one of building a full-stack observability system in house.
Cheers!