Malay Hazarika
Jul 9, 2025
10 minute read
As your applications grow, so does the flood of logs and telemetry data. Managing and scaling your telemetry infrastructure can quickly become a significant challenge, because it has to stay ahead of your application's growth. For years, the ELK stack (Elasticsearch, Logstash, Kibana) has been a go-to solution.
However, ELK is showing its age. Logstash in particular is a resource hog and inflexible compared to modern alternatives. On top of that, Elasticsearch became a source-available project after its license change, and many companies refuse to use it due to its licensing terms.
This article is the second part of our series on building a powerful, truly open-source observability stack. In Part 1, we replaced the "E" and "L" of the ELK stack with OpenSearch and a lightweight OpenTelemetry Collector for efficient log management. Now, we'll extend that foundation by integrating Jaeger for distributed tracing.
By the end, you'll understand how to build a robust, scalable, and vendor-neutral observability setup that puts you in control of your data and your budget.
OpenTelemetry is a CNCF-graduated project that standardizes the collection of logs, metrics, and traces.
The OTel Collector is a key component, offering incredible flexibility. It has a modular pipeline of receivers, processors, and exporters that lets you collect, transform, and route telemetry data however you need.
Elasticsearch, the heart of the ELK stack, also underwent significant licensing changes when Elastic moved core components to a source-available license. This shift was widely unpopular, raising concerns within the open-source community about vendor lock-in and the long-term openness of the project.
As a direct response, OpenSearch, a community-driven, Apache 2.0 licensed fork, was created to ensure a truly open-source path forward. It has an active community and is backed by AWS, which provides a stable foundation for future development.
Note: Elastic often claims that OpenSearch is less efficient and less performant than Elasticsearch. However, we operate clusters that ingest billions of spans and logs daily at nominal cost with no performance issues, so we can attest that OpenSearch is a solid choice for modern observability needs.
Both OpenSearch Dashboards and Kibana offer a poor UI for tracing, so we use Jaeger instead. Jaeger, another CNCF-graduated project, is purpose-built for trace visualization. Its intuitive UI and powerful analysis features make it one of the top open-source tools for debugging and understanding complex service interactions. We use Jaeger as the frontend for tracing and OpenSearch as the datastore.
A Kubernetes Cluster: While you can run these components in other environments, Kubernetes is highly recommended, especially for observability systems. These systems often grow faster than anticipated, and Kubernetes provides the scalability and manageability you'll need.
The recommended way to deploy and manage OpenSearch on Kubernetes is by using the OpenSearch Operator.
Why the Operator? It significantly simplifies deployment, ongoing management (like version upgrades, config changes, and scaling), and day-to-day operations. In our experience managing multiple large OpenSearch clusters that ingest billions of spans and logs daily, the operator has substantially reduced operational overhead.
Follow these instructions to set up the OpenSearch Operator using Helm: OpenSearch-k8s-operator
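In practice the install boils down to a couple of Helm commands. This is a sketch based on the operator's documented chart repository; verify the repo URL and chart name against the linked instructions:

```shell
# Add the OpenSearch operator chart repo and install it into the observability namespace
helm repo add opensearch-operator https://opensearch-project.github.io/opensearch-k8s-operator/
helm repo update
helm install opensearch-operator opensearch-operator/opensearch-operator \
  --namespace observability --create-namespace
```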
Once you have the operator running, you can deploy an OpenSearch cluster. Following is an example of a 3-node OpenSearch cluster. Read this user guide for more details: Opensearch Operator User Guide.
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: observability-opensearch
  namespace: observability
spec:
  general:
    serviceName: observability-opensearch
    version: 2.17.0
    additionalConfig:
      plugins.query.datasources.encryption.masterkey: "cbdda1e0ab9e45c44f9b56a3" # Change this
  security:
    config:
      adminSecret:
        # This secret contains the admin certificate using common name "admin". Use cert-manager to generate it.
        name: observability-opensearch-admin-cert
      adminCredentialsSecret:
        # This secret contains admin credentials. The keys are "username" and "password".
        name: observability-opensearch-admin-credentials
      securityConfigSecret:
        # This secret contains the security config files.
        # Each key is a filename (e.g. "internal_users.yml") and the value is the file content.
        name: observability-opensearch-security-config-files
    tls:
      transport:
        generate: false
        perNode: false
        secret:
          # Generate this similar to the admin certificate, but with common name "opensearch".
          name: observability-opensearch-node-cert
        caSecret:
          # This is the secret that contains the CA certificate used to create the admin and node certificates.
          name: observability-ca-secret
        nodesDn: ["CN=opensearch"]
        adminDn: ["CN=admin"]
      http:
        generate: false
        secret:
          name: observability-opensearch-node-cert
        caSecret:
          name: observability-ca-secret
  dashboards:
    enable: true
    version: 2.17.0
    replicas: 1
    resources:
      requests:
        memory: "512Mi"
        cpu: "200m"
      limits:
        memory: "512Mi"
        cpu: "200m"
    opensearchCredentialsSecret:
      name: observability-opensearch-admin-credentials
  nodePools:
    - component: master
      replicas: 3
      diskSize: "30Gi"
      resources:
        requests:
          memory: "1.5Gi"
          cpu: "500m"
        limits:
          memory: "2Gi"
          cpu: "1500m"
      roles:
        - "cluster_manager"
        - "data"
      persistence:
        pvc:
          storageClass: "storage-class-name" # Change this to your storage class
          accessModes:
            - "ReadWriteOnce"
      env:
        # We supply our own security config, so the demo config must not be installed
        - name: DISABLE_INSTALL_DEMO_CONFIG
          value: "true"
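The admin and node certificate secrets referenced above can be generated with cert-manager. A minimal sketch, assuming you already have a cert-manager Issuer (here hypothetically named observability-ca-issuer) backed by the CA in observability-ca-secret:

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: observability-opensearch-admin-cert
  namespace: observability
spec:
  secretName: observability-opensearch-admin-cert
  commonName: admin # Must match adminDn in the cluster spec
  duration: 8760h # 1 year
  privateKey:
    encoding: PKCS8 # OpenSearch expects PKCS#8-encoded keys
  issuerRef:
    name: observability-ca-issuer
    kind: Issuer
```

Create a second Certificate with commonName "opensearch" and secretName observability-opensearch-node-cert for the node certificate.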
Following are the essential configuration files you need to put in the observability-opensearch-security-config-files secret:
internal_users.yml:
_meta:
  type: "internalusers"
  config_version: 2
admin:
  # Change this to your hashed admin password: Use https://bcrypt-generator.com/
  hash: "$2y$12$eW5Z1z3a8b7c9d8e7f8g9u0h1i2j3k4l5m6n7o8p9q0r1s2t3u4v5w6x7y8z"
  reserved: true
  description: "Cluster super user"
  backend_roles:
    - "admin"
---
config.yml:
_meta:
  type: "config"
  config_version: 2
config:
  dynamic:
    http:
      anonymous_auth_enabled: false
    authc:
      basic_internal_auth_domain:
        description: "Authenticate via HTTP Basic against internal users database"
        http_enabled: true
        transport_enabled: true
        order: 4
        http_authenticator:
          type: basic
          challenge: true
        authentication_backend:
          type: intern
      clientcert_auth_domain:
        description: "Authenticate via SSL client certificates"
        http_enabled: false
        transport_enabled: false
        order: 2
        http_authenticator:
          type: clientcert
          config:
            username_attribute: cn # optional; if omitted, the DN becomes the username
          challenge: false
        authentication_backend:
          type: noop
    authz: {}
Data streams are designed to handle continuously generated, append-only time-series data such as logs. We will set up a data stream named logs-stream and make it write to a new index every day. The indices will expire after 30 days. This is a common pattern for log data, allowing you to efficiently manage storage and retention.
Note: You can run the following commands in Dev Tools in OpenSearch Dashboards.
PUT _index_template/logs-stream-template
{
  "index_patterns": ["logs-stream"],
  "data_stream": {},
  "priority": 100
}
PUT _data_stream/logs-stream
PUT _plugins/_ism/policies/logs-lifecycle-policy
{
  "policy": {
    "description": "Rollover indices daily and delete after 30 days",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [
          {
            "rollover": {
              "min_index_age": "1d"
            }
          }
        ],
        "transitions": [
          {
            "state_name": "delete",
            "conditions": {
              "min_index_age": "30d"
            }
          }
        ]
      },
      {
        "name": "delete",
        "actions": [
          {
            "delete": {}
          }
        ],
        "transitions": []
      }
    ]
  }
}
POST _plugins/_ism/add/logs-stream
{
  "policy_id": "logs-lifecycle-policy"
}
We will use the Jaeger Operator to deploy Jaeger. Run the following to install it:
kubectl apply -f https://github.com/jaegertracing/jaeger-operator/releases/download/v1.65.0/jaeger-operator.yaml
Note that you need to install the operator in the same namespace as your observability stack; in our case that is observability. Once it is installed, you can deploy Jaeger.
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: observability-jaeger
  namespace: observability
spec:
  strategy: production
  collector:
    maxReplicas: 2
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 512Mi
  ui:
    options:
      dependencies:
        menuEnabled: false
      menu:
        - label: "About Jaeger"
          items:
            - label: "Documentation"
              url: "https://www.jaegertracing.io/docs/latest"
  storage:
    type: elasticsearch
    # Create a secret named `observability-jaeger-es-credentials` with the following keys:
    # ES_USERNAME: your admin username, in this case `admin`
    # ES_PASSWORD: <admin password>
    secretName: observability-jaeger-es-credentials
    options:
      es:
        server-urls: https://observability-opensearch.observability.svc.cluster.local:9200
        index-prefix: jaeger
        tls:
          enabled: true
          ca: /tls/ca.crt
    esIndexCleaner:
      enabled: true
      numberOfDays: 30 # Retain traces for 30 days
      schedule: "55 23 * * *" # Run daily at 23:55
  query:
    options:
      prometheus:
        query:
          normalize-calls: true
          normalize-duration: true
  volumeMounts:
    - name: ca-cert
      mountPath: /tls/ca.crt
      subPath: ca.crt
  volumes:
    - name: ca-cert
      secret:
        # This is the CA certificate that signed the OpenSearch TLS certificates
        secretName: observability-ca-secret
        items:
          - key: ca.crt
            path: ca.crt
Read more about Jaeger Operator and its configuration options here: https://github.com/jaegertracing/jaeger-operator/blob/main/docs/api.md
Similarly, for the OpenTelemetry Collector, using the OTel Operator is recommended. Follow these instructions to set it up: OpenTelemetry Operator
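As with the other operators, the setup is a short Helm install. This sketch follows the OpenTelemetry Helm charts repo; note that the operator depends on cert-manager for its webhook certificates:

```shell
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm install opentelemetry-operator open-telemetry/opentelemetry-operator \
  --namespace observability
```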
Once you have the operator running, you can deploy an OpenTelemetry Collector. The collector is the gateway for your observability data: your applications send it logs and traces, which it processes and exports, traces to Jaeger (which stores them in OpenSearch) and logs directly to OpenSearch.
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: observability-otel-workers
  namespace: observability
spec:
  mode: deployment
  image: otel/opentelemetry-collector-contrib:0.118.0
  resources:
    requests:
      memory: "100Mi"
      cpu: "500m"
    limits:
      memory: "1Gi"
      cpu: "1000m"
  autoscaler:
    maxReplicas: 2
    minReplicas: 1
    targetCPUUtilization: 90
    targetMemoryUtilization: 90
  volumeMounts:
    - name: ca-cert
      mountPath: /tls/ca.crt
      subPath: ca.crt
  volumes:
    - name: ca-cert
      secret:
        # This is the secret that contains the CA certificate used to create the admin and node certificates.
        secretName: observability-ca-secret
        items:
          - key: ca.crt
            path: ca.crt
  config:
    extensions:
      # Supplies the OpenSearch credentials referenced by the exporter below
      basicauth/os:
        client_auth:
          username: admin
          password: "<admin password>" # Change this
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: "0.0.0.0:4317"
          http:
            endpoint: ":4318"
            cors:
              allowed_origins:
                - "http://*"
                - "https://*"
    processors:
      batch:
        send_batch_max_size: 3000
        send_batch_size: 1000
        timeout: 5s
    exporters:
      otlp/jaeger:
        endpoint: "observability-jaeger-collector.observability.svc.cluster.local:4317"
        tls:
          insecure: true
      opensearch/logs:
        # This is the OpenSearch data stream we created earlier.
        logs_index: "logs-stream"
        http:
          endpoint: "https://observability-opensearch.observability.svc.cluster.local:9200"
          auth:
            authenticator: basicauth/os
          tls:
            insecure: false
            ca_file: /tls/ca.crt
    service:
      extensions: [basicauth/os]
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp/jaeger]
        logs:
          receivers: [otlp]
          processors: [batch]
          exporters: [opensearch/logs]
If your application runs on the same Kubernetes cluster, you can send telemetry directly to the collector service at observability-otel-workers-collector.observability.svc.cluster.local:4317. If your application runs outside the cluster, you can expose the collector using a LoadBalancer or NodePort service. We recommend an internal load balancer if your application runs in the same cloud provider as your Kubernetes cluster.
Here is an example of such a load balancer service:
apiVersion: v1
kind: Service
metadata:
  name: observability-otel-collector-alb
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "true" # Annotation for an internal load balancer
spec:
  type: LoadBalancer
  ports:
    - name: otlp-grpc
      port: 4317
      nodePort: 32007
      protocol: TCP
    - name: otlp-http
      port: 4318
      nodePort: 32008
      protocol: TCP
  selector:
    app.kubernetes.io/component: opentelemetry-collector
    app.kubernetes.io/instance: observability.observability-otel-workers
    app.kubernetes.io/part-of: opentelemetry
You can obtain the load balancer endpoint using kubectl get svc.
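For example, to pull just the hostname of the provisioned load balancer (using the service name defined above):

```shell
kubectl -n observability get svc observability-otel-collector-alb \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
```

On clouds that expose an IP address instead of a hostname, read `.ip` instead of `.hostname`.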
To send traces to the OpenTelemetry Collector, you need to instrument your application using OpenTelemetry libraries. Most languages have zero-code instrumentation support, which means you can automatically collect telemetry with minimal code changes.
Check the Zero code instrumentation documentation for your language: https://opentelemetry.io/docs/concepts/instrumentation/zero-code/
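As an illustration for a Python service (package and command names per the OpenTelemetry Python docs; the endpoint assumes the in-cluster collector service from this setup):

```shell
# Install the distro and OTLP exporter, then pull in instrumentations
# matching the libraries your app already uses
pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap -a install

# Point the SDK at the collector and name the service
export OTEL_SERVICE_NAME="my-service"
export OTEL_EXPORTER_OTLP_ENDPOINT="http://observability-otel-workers-collector.observability.svc.cluster.local:4317"

# Run the app under automatic instrumentation
opentelemetry-instrument python app.py
```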
Out of the box, the Jaeger UI lacks built-in authentication, which is a non-starter for production. You must place it behind an authentication proxy. We recommend using oauth2-proxy integrated with an OIDC provider like Keycloak. This creates a seamless Single Sign-On (SSO) experience for both OpenSearch Dashboards and Jaeger. Other options include placing Jaeger behind a VPN or a reverse proxy with basic authentication.
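A minimal sketch of fronting the Jaeger query service with oauth2-proxy, assuming a Keycloak realm at the issuer URL shown (all names and URLs here are hypothetical placeholders):

```shell
oauth2-proxy \
  --provider=oidc \
  --oidc-issuer-url=https://keycloak.example.com/realms/observability \
  --client-id=jaeger \
  --client-secret=<client secret> \
  --cookie-secret=<random 32-byte secret> \
  --email-domain="*" \
  --upstream=http://observability-jaeger-query.observability.svc.cluster.local:16686 \
  --http-address=0.0.0.0:4180
```

Route external traffic to port 4180; only authenticated requests are proxied through to the Jaeger UI on 16686.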
Moving away from the traditional ELK stack to one powered by OpenTelemetry and OpenSearch gives you a more flexible, efficient, and truly open-source observability solution, and adding Jaeger brings distributed tracing into the mix. This modern stack frees you from vendor lock-in and provides a great developer experience without the runaway costs of proprietary SaaS solutions.
Cheers!
ELK alternative: Modern log management setup with OpenTelemetry and OpenSearch
Part 1 of this series, where we set up log management
Securing OpenSearch with OIDC
This article details how to use Keycloak to secure OpenSearch