Architecture#
High-level#
From a high-level perspective, KOF consists of three layers:
- the Collection layer, where the statistics and events are gathered,
- the Regional layer, which includes storage to keep track of those statistics and events,
- and the Management layer, where you interact through the UI.
flowchart TD
A[Management UI, promxy]
A --> C[Storage Region 1]
A --> D[Storage Region 2]
C --> E[Collect Child 1]
C --> F[Collect Child 2]
D ==> G[...]
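As a rough illustration of the flowchart, the three layers can be modeled as a small tree in which every node names its parent. The node names are copied from the diagram; the `PARENT` dictionary and the `upward_path` helper are hypothetical and exist only for this sketch, they are not part of KOF.

```python
# Hypothetical model of the flowchart: each node maps to the node above it.
# Names mirror the diagram; this dictionary is illustrative, not a KOF API.
PARENT = {
    "Collect Child 1": "Storage Region 1",
    "Collect Child 2": "Storage Region 1",
    "Storage Region 1": "Management UI, promxy",
    "Storage Region 2": "Management UI, promxy",
}


def upward_path(node: str) -> list[str]:
    """Return the chain of layers from a node up to the management layer."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


print(" -> ".join(upward_path("Collect Child 1")))
# Collect Child 1 -> Storage Region 1 -> Management UI, promxy
```

Reading the path left to right follows telemetry from the collection layer, through regional storage, to the management layer where it is viewed.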
Mid-level#
In more detail, it's important to understand that data flows upward, from observed objects to the centralized Grafana on the Management layer:
- Management Cluster
  - kof-operators chart
    - grafana-operator
    - opentelemetry-operator
    - prometheus-operator-crds
  - kof-mothership chart
    - victoria-metrics-operator
    - cluster-api-visualizer
    - sveltos-dashboard
    - dex
    - k0rdent service templates
    - kof-dashboards
    - kof-operator
    - promxy
  - kof-collectors chart
    - opencost
    - opentelemetry-kube-stack
  - Either kof-istio
    - Certificates
    - ClusterProfiles
  - Or kof-regional and kof-child
    - MultiClusterServices
- Cloud 1..N
  - Region 1..M
    - Regional Cluster
      - kof-operators chart
        - grafana-operator
        - opentelemetry-operator
        - prometheus-operator-crds
      - kof-storage chart
        - victoria-metrics-operator
        - victoria-logs-cluster
        - external-dns
        - jaeger-operator
        - dex
        - kof-dashboards
      - kof-collectors chart
        - opencost
        - opentelemetry-kube-stack
      - cert-manager
      - ingress-nginx
      - istio/gateway
      - kof-istio chart
        - cert-manager-istio-csr
        - istio/base
        - istiod
    - Child Cluster 1
      - cert-manager
      - Optional kof-istio
      - kof-operators chart
        - Disabled grafana-operator
        - opentelemetry-operator
        - prometheus-operator-crds
      - kof-collectors chart
        - opencost
        - opentelemetry-kube-stack
      - observed objects
Helm Charts#
KOF is deployed as a series of Helm charts at various levels.
kof-operators#
- Grafana dashboards platform, managed by grafana-operator
- OpenTelemetry collectors below, managed by opentelemetry-operator
- prometheus-operator-crds required to create OpenTelemetry collectors, also required to monitor kof-mothership itself
kof-mothership#
- Local VictoriaMetrics storage for alerting rules only, managed by victoria-metrics-operator
- cluster-api-visualizer for insight into multicluster configuration
- Sveltos dashboard, automatic secret distribution
- Dex SSO chart
- k0rdent service templates used by the kof-regional and kof-child charts
- kof-dashboards for Grafana
- kof-operator (don't confuse it with the kof-operators chart) for auto-configuration
- Promxy for aggregating Prometheus metrics from regional clusters
kof-regional#
- MultiClusterService which configures and installs kof-storage and other charts to regional clusters
kof-child#
- MultiClusterService which configures and installs kof-collectors and other charts to child clusters
kof-istio#
- Optional Istio support for secure connectivity between clusters without external DNS
kof-storage#
- Regional VictoriaMetrics storage with main data, managed by victoria-metrics-operator
- vmauth entrypoint proxy for VictoriaMetrics components
- vmcluster for a highly available, fault-tolerant version of the VictoriaMetrics database
- victoria-logs-cluster for high-performance, cost-effective, scalable logs storage
- external-dns to communicate with other clusters
- Jaeger tracing platform, managed by jaeger-operator
- Dex SSO chart
- kof-dashboards for Grafana
kof-collectors#
- opentelemetry-kube-stack for hardware, OS, and Kubernetes metrics
- OpenCost "shines a light into the black box of Kubernetes spend"
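To summarize which charts end up on which kind of cluster, here is a minimal lookup sketch. The chart names come from the descriptions above; the role labels (`"management"`, `"regional"`, `"child"`) and the `roles_using` helper are assumptions made for this example, not identifiers used by KOF. The kof-regional, kof-child, and optional kof-istio charts are left out to keep the sketch short.

```python
# Charts per cluster role, transcribed from the chart descriptions.
# The role keys are labels chosen for this sketch, not KOF identifiers.
CHARTS_BY_ROLE = {
    "management": ["kof-operators", "kof-mothership", "kof-collectors"],
    "regional": ["kof-operators", "kof-storage", "kof-collectors"],
    "child": ["kof-operators", "kof-collectors"],
}


def roles_using(chart: str) -> list[str]:
    """List the cluster roles on which a given chart is installed."""
    return [role for role, charts in CHARTS_BY_ROLE.items() if chart in charts]


print(roles_using("kof-storage"))     # ['regional']
print(roles_using("kof-collectors"))  # ['management', 'regional', 'child']
```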
Deployment Scenarios#
KOF supports two topologies:
Production (Regional Clusters)#
- Management-cluster telemetry is stored locally on the k0rdent management cluster.
- Child workloads send telemetry to their regional cluster, supporting data sovereignty and isolation.
Development / QA (Regionless)#
- No regions are defined. All telemetry (management and child) is stored on the k0rdent management cluster.
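The storage rule behind the two topologies can be sketched as a single function. This is only an illustration of the rule as described here: the `storage_cluster` name, the `"management"` and `"child"` labels, and the region name in the usage lines are hypothetical, and KOF itself expresses this through chart configuration rather than code like this.

```python
from typing import Optional


def storage_cluster(source: str, region: Optional[str]) -> str:
    """Pick the cluster that stores telemetry coming from `source`.

    source: "management" or "child" (labels chosen for this sketch).
    region: name of the child's regional cluster, or None when no
            regions are defined (the regionless topology).
    """
    if source not in ("management", "child"):
        raise ValueError(f"unknown source: {source!r}")
    # Management telemetry always stays local; so does everything
    # else when no regional cluster exists.
    if source == "management" or region is None:
        return "management"
    return region


print(storage_cluster("child", "region-1"))       # region-1
print(storage_cluster("child", None))             # management
print(storage_cluster("management", "region-1"))  # management
```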
Component Roles & Rationale#
Component | Role | Notes |
---|---|---|
k0rdent | Orchestration | Multi-cluster lifecycle and service templates |
OpenTelemetry | Collection | Metrics, logs, traces; auto-instrumentation options |
Promxy | Query Federation | Cross-cluster PromQL and alert rule evaluation at the management cluster |
VictoriaMetrics | Metrics Storage | Scalable TSDB; selected over Prometheus for clustering efficiency |
VictoriaLogs | Log Storage | Scalable log TSDB with retention controls |
Jaeger | Tracing | Trace store/visualization; regional awareness |
Grafana | Visualization | Unified dashboards; SSO/RBAC |
Dex | SSO | OIDC provider for Grafana |
OpenCost | FinOps | Cost allocation and efficiency ratios |
Dex Integration#
KOF uses Dex as an identity provider to enable Single Sign‑On (SSO) with OAuth2 and OIDC.
- Authentication flow: Dex issues ID tokens to Grafana and other clients after authenticating against an upstream identity provider (IdP).
- External IdP integration: Dex can delegate to providers such as Okta, Entra ID, GitHub, or LDAP.
- Group membership mapping: Dex propagates group membership claims, which KOF uses to enforce RBAC. Grafana dashboards and KOF namespaces can be restricted based on these groups.
This model centralizes authentication, while authorization remains controlled via Kubernetes RBAC and Grafana roles.
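The group-to-role idea can be illustrated with a short sketch. It assumes the ID token carries a `groups` claim; the group names and the `grafana_role` helper are hypothetical, and the real mapping lives in Grafana and Kubernetes RBAC configuration rather than in code like this. Viewer, Editor, and Admin are Grafana's basic organization roles.

```python
from typing import Optional

# Hypothetical IdP group names mapped to Grafana organization roles.
GROUP_TO_ROLE = {
    "kof-admins": "Admin",
    "kof-editors": "Editor",
    "kof-viewers": "Viewer",
}
# Higher number means more privilege; used to pick the strongest role.
ROLE_RANK = {"Viewer": 0, "Editor": 1, "Admin": 2}


def grafana_role(claims: dict) -> Optional[str]:
    """Return the most privileged role granted by the token's groups.

    Returns None when no group in the token maps to a role.
    """
    groups = claims.get("groups", [])
    roles = [GROUP_TO_ROLE[g] for g in groups if g in GROUP_TO_ROLE]
    return max(roles, key=ROLE_RANK.__getitem__, default=None)


print(grafana_role({"groups": ["kof-viewers", "kof-editors"]}))  # Editor
print(grafana_role({"groups": ["unrelated-team"]}))              # None
```

Returning `None` for unmapped users keeps the default restrictive: a user with no recognized group gets no role instead of falling back to a permissive one.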