Malay Hazarika
May 13, 2025
14 minute read
The convenience of SaaS is undeniable: quick setup and managed infrastructure. However, as your company grows, you might find your SaaS bills skyrocketing, consuming a significant portion of your operational budget. You know there are powerful open source alternatives for many of the tools you pay dearly for, but the path to adopting them seems too hard. This is a common dilemma for many tech companies, especially those aiming for frugal yet robust operations.
The primary hurdles? The perceived complexity of running these tools yourself, the fear of unreliable infrastructure, and the ongoing operational burden of keeping everything running.
If these challenges resonate with you, you're in the right place. This guide is designed to show you how to leverage the power of Kubernetes to build a reliable and cost-effective self-hosting platform for your essential open source applications in 2025. We'll walk through the Kubernetes architecture choices that deliver maximum reliability.
While Docker Compose is great for simple setups, Kubernetes is the clear choice for robust self-hosting when reliability, scalability, and manageable complexity are key. It offers built-in resilience, automatically handling failures and ensuring applications stay available. Kubernetes scales applications efficiently, optimizing resource use and costs. Its declarative configuration and automation reduce manual effort, making your setup reproducible. With a vibrant open source ecosystem and portability across environments, Kubernetes provides flexibility and avoids vendor lock-in. It also enhances security with features like network policies and RBAC. Although the learning curve exists, the long-term benefits for reliability, scalability, and operational efficiency make Kubernetes a powerful platform for self-hosting diverse open source tools.
Achieving maximum reliability in your self-hosted Kubernetes environment is not a single decision but a series of deliberate choices. Our goal is to build a system that not only runs your open source applications effectively but also withstands failures, scales gracefully, and remains manageable in the long run.
This involves attention to detail in several key areas:
- Infrastructure as code for a reproducible setup
- Network layout
- Storage
- Ingress and DNS
- Automated SSL/TLS certificates
- Operators for databases and other stateful services
- Sensible Helm chart choices
- Node autoscaling
Let's dive into each of these areas.
While you could manually set up each component, this approach quickly becomes unmanageable, error-prone, and difficult to replicate or recover.
We recommend using OpenTofu or Terraform from day one. Even though it is one more thing to learn, it pays off in the long run: your entire infrastructure is defined in code that can be version-controlled, reviewed, and reproduced, which makes it far easier to understand, manage, and recover.
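As a minimal sketch of where to start (the region and version constraints are illustrative assumptions, not prescriptions), pinning your tool and provider versions is the first thing worth committing; the same block works for both Terraform and OpenTofu:
terraform {
  required_version = ">= 1.6"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # pin so upgrades are deliberate, not accidental
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumption: replace with your region
}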
For most teams, especially those prioritizing reliability and reduced operational overhead, a managed Kubernetes service from a cloud provider is the recommended starting point. We will talk about Elastic Kubernetes Service (EKS), but the principles apply to other cloud providers (and bare-metal clusters) as well.
For AWS EKS, here are our key recommendations for a reliable network setup:
- Choose a /16 CIDR block for your Virtual Private Cloud (VPC) that doesn't overlap with your application network or any other connected networks. A /16 provides a large enough address space for your pods.
- Use /19 subnet masks, which gives each subnet plenty of addresses.
- NAT Gateway is AWS's managed NAT service. It's highly available and scalable but can be relatively expensive (around $40 per month per NAT Gateway, plus data processing charges).
For a cluster that needs to be cost-effective, you can consider using a NAT instance instead. We found that a tiny NAT instance is good enough for self-hosting purposes, because most of your traffic flows through the ingress controller and NAT is only used for pulling images and other downloads from the internet.
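To make the CIDR advice concrete, here is a minimal Terraform/OpenTofu sketch. The resource names, region, and exact ranges are illustrative assumptions, and this is not a complete EKS network (you would still need more subnets, route tables, NAT, and EKS tagging):
resource "aws_vpc" "cluster" {
  cidr_block           = "10.0.0.0/16" # /16 gives ample pod address space
  enable_dns_support   = true
  enable_dns_hostnames = true
}

resource "aws_subnet" "private_a" {
  vpc_id            = aws_vpc.cluster.id
  cidr_block        = "10.0.0.0/19" # one /19 per availability zone
  availability_zone = "us-east-1a"
}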
AWS offers mature storage services that integrate seamlessly with EKS via the EBS CSI (Container Storage Interface) driver. Here are our recommendations for a performant and reliable storage setup:
- gp2 is the default for new EKS clusters. We recommend creating a new storage class backed by gp3, as gp3 is roughly 20% cheaper and significantly faster than gp2.
- Prefer xfs over ext4: xfs can handle bigger files and sustain more IOPS.
Here is an example of a storage class you can use:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-gp3
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  type: gp3
  csi.storage.k8s.io/fstype: xfs
  iops: "3000"
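To use it, reference the class from a PersistentVolumeClaim. A minimal example (the claim name and size are placeholders):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data # placeholder name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-gp3 # the class defined above
  resources:
    requests:
      storage: 20Gi
Because the class uses WaitForFirstConsumer, the volume is only provisioned once a pod using the claim is scheduled, which keeps the volume in the same availability zone as the pod.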
If you're managing your own hardware for self-hosting, you need a robust distributed storage solution.
Here are our recommendations:
- Avoid hostPath volumes: a hostPath volume ties your data directly to a specific node. If that node fails, your data is at high risk unless you have robust external backup and recovery processes, and it severely limits Kubernetes' ability to reschedule your pods reliably.

Once your open source applications are running in Kubernetes, you need a way to expose them to the outside world (or your internal network) securely and efficiently. This is where Ingress controllers come into play. An Ingress controller is the gatekeeper for all incoming traffic to the cluster, providing routing, SSL/TLS termination, load balancing, and more.
We recommend using the ingress-nginx controller for this. We chose it because it is popular, easily scalable, and has been stable for a long time.
You can install it using Helm from the official ingress-nginx chart; read the official documentation for more details on how to install and configure it.
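A typical installation looks like this (the release name and namespace are conventions you can change):
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace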
Once the Nginx Ingress controller is running, it will typically provision an external LoadBalancer Service. You need to point your domain names to this load balancer. Here is how to do it:
- Run kubectl get svc -n ingress-nginx (or the namespace where you installed it) and look for the EXTERNAL-IP or HOSTNAME of the ingress-nginx-controller service.
- Create a wildcard DNS record such as *.internal.yourdomain.com and map it to the load balancer's address. If the load balancer exposes a hostname, use a CNAME record; if it exposes an IP, use an A record.
- The *.internal wildcard allows you to access your apps as grafana.internal.yourdomain.com, metabase.internal.yourdomain.com, etc. This is a great way to avoid having to create individual DNS entries for each app.

Below is an example of an Ingress resource that exposes a Metabase instance as metabase.internal.yourdomain.com. It also specifies TLS settings, indicating that cert-manager should procure a certificate for this host and store it in metabase-tls-cert.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: "metabase"
  namespace: "metabase"
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod" # For cert-manager (discussed next)
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - "metabase.internal.yourdomain.com"
      secretName: "metabase-tls-cert"
  rules:
    - host: "metabase.internal.yourdomain.com"
      http:
        paths:
          - pathType: Prefix
            path: /
            backend:
              service:
                name: metabase
                port:
                  number: 3000
In today's web, HTTPS is non-negotiable. It encrypts traffic between your users and your applications, ensuring privacy and data integrity. For your self-hosted services exposed via Ingress, you need SSL/TLS certificates. Manually issuing, configuring, and renewing these certificates is a tedious and error-prone process, especially when managing multiple services.
cert-manager is a native Kubernetes certificate management controller. It automates the issuing and renewal of SSL/TLS certificates. You can integrate it with Let's Encrypt, a free and widely trusted certificate authority. This means you can get SSL/TLS certificates for your services without any cost, and cert-manager will handle the renewal process automatically.
Follow these instructions to install cert-manager on your cluster: https://cert-manager.io/docs/installation/helm/
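In practice the Helm install looks roughly like this; note that the flag for installing CRDs has changed across cert-manager releases (older versions use --set installCRDs=true), so check the linked docs for your version:
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --set crds.enabled=true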
Next, create a ClusterIssuer to issue certificates from Let's Encrypt. This is a prerequisite for using cert-manager to manage your SSL/TLS certificates:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod # this name is referenced by the Ingress annotation above
spec:
  acme:
    server: "https://acme-v02.api.letsencrypt.org/directory"
    email: "yourname@yourcompany.com"
    privateKeySecretRef:
      name: letsencrypt-keys
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
Think of an Operator as a "DevOps engineer in a box" – a piece of software running in your Kubernetes cluster that encodes the operational knowledge and domain-specific expertise required to manage a particular application. Operators extend the Kubernetes API by creating Custom Resource Definitions (CRDs) that represent the application they manage. You then interact with these custom resources just like standard Kubernetes objects (kubectl get postgresclusters
). The Operator continuously watches these resources and takes action to ensure the application's actual state matches the desired state you've defined in the CRD.
This is a golden rule for reliable self-hosting. Databases are critical, complex systems. Managing them manually in a dynamic environment like Kubernetes is a recipe for disaster (data loss, extended downtime). Always use a mature, well-supported Operator for any database you plan to self-host. At Osuite we regularly use CloudNativePG for PostgreSQL, Percona XtraDB for MySQL, the MongoDB Community Operator for MongoDB, the community OpenSearch operator, and many more to run applications reliably.
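To show how little YAML an Operator needs once it's installed, here is a minimal sketch of a CloudNativePG cluster; the name, namespace, size, and storage class are assumptions for illustration:
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: app-db # placeholder name
  namespace: databases # placeholder namespace
spec:
  instances: 3 # one primary plus two replicas, with automated failover
  storage:
    size: 20Gi
    storageClass: ebs-gp3 # the gp3 class defined earlier
From this one resource, the Operator handles initialization, replication, failover, and rolling updates.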
Throughout this guide, we've mentioned Helm for installing components like the Nginx Ingress Controller and Cert-Manager. Helm is the de facto package manager for Kubernetes. It allows you to define, install, and upgrade even the most complex Kubernetes applications as "charts." Charts are collections of files that describe a related set of Kubernetes resources.
But not all Helm charts are created equal. Setting up essential tools with unnecessarily complicated charts is a recipe for trouble: it limits your ability to understand the components you are installing and makes it harder to customize them and debug issues.
- Read the chart's values.yaml: this file contains the default configuration options. Understand what each option does and customize it for your environment and security policies. Pay close attention to resource requests/limits, persistence settings, image versions, and any security-related configuration.

Controversial opinion: at Osuite, we don't recommend Bitnami charts. We do use them, but with caution, as they are often over-engineered even for simple use cases. If you are absolutely sure you won't need to modify the chart later, you can use them, but we prefer official charts or community charts that are simpler and easier to understand.
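A simple workflow for reviewing defaults before you install, using the ingress-nginx chart as a stand-in example:
# Inspect the chart's default configuration before installing
helm show values ingress-nginx/ingress-nginx > values.yaml
# Edit values.yaml, then install or upgrade with your overrides
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx -f values.yaml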
Karpenter is an open source, flexible, high-performance Kubernetes cluster autoscaler built by AWS. Unlike the traditional Cluster Autoscaler that manages EC2 Auto Scaling Groups (ASGs), Karpenter works directly with the EC2 Fleet API to provision new nodes "just-in-time" based on the aggregate resource requests of unschedulable pods.
While powerful, Karpenter introduces another layer of complexity to your Kubernetes architecture. You configure it through its Provisioner CRDs. The documentation is okay, but lacks some clarity around how to set up the required AWS resources with Terraform. Unless you have a clear need for it, such as spiky workloads, heavy Spot usage, or aggressive bin-packing for cost savings, we recommend starting with standard node groups.
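If you do adopt Karpenter, a minimal Provisioner looks roughly like this. This sketch uses the older karpenter.sh/v1alpha5 schema matching the Provisioner CRDs mentioned above; newer Karpenter releases replaced Provisioner with NodePool and a different layout, so treat it as illustrative:
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # on AWS you would also set providerRef to an AWSNodeTemplate
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
  limits:
    resources:
      cpu: "100" # cap the total CPU Karpenter may provision
  ttlSecondsAfterEmpty: 30 # tear down empty nodes quickly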
This guide should give you a solid blueprint to start your self-hosting journey. You don't need to do all of it on day one. Start with the basics, get a few applications running, and evolve your setup over time.
Happy self-hosting!