Architecture#
High-level#
From a high-level perspective, KOF consists of three layers:
- the Collection layer, where the statistics and events are gathered,
- the Regional layer, which includes storage to keep track of those statistics and events,
- and the Management layer, where you interact through the UI.
flowchart TD
    A[Management UI, promxy] 
    A --> C[Storage Region 1]
    A --> D[Storage Region 2]
    C --> E[Collect Child 1]
    C --> F[Collect Child 2]
    D ==> G[...]
Mid-level#
In more detail, data flows upward, from observed objects to the centralized Grafana on the Management layer:
- Management Cluster
  - kof-operators chart
    - grafana-operator
    - opentelemetry-operator
    - prometheus-operator-crds
  - kof-mothership chart
    - victoria-metrics-operator
    - cluster-api-visualizer
    - sveltos-dashboard
    - dex
    - k0rdent service templates
    - kof-dashboards
    - kof-operator
    - promxy
  - kof-collectors chart
    - opencost
    - opentelemetry-kube-stack
  - Either kof-istio
    - Certificates
    - ClusterProfiles
  - Or kof-regional and kof-child
    - MultiClusterServices
- Cloud 1..N
  - Region 1..M
    - Regional Cluster
      - kof-operators chart
        - grafana-operator
        - opentelemetry-operator
        - prometheus-operator-crds
      - kof-storage chart
        - victoria-metrics-operator
        - victoria-logs-cluster
        - external-dns
        - jaeger-operator
        - dex
        - kof-dashboards
      - kof-collectors chart
        - opencost
        - opentelemetry-kube-stack
      - cert-manager
      - ingress-nginx
      - istio/gateway
      - kof-istio chart
        - cert-manager-istio-csr
        - istio/base
        - istiod
    - Child Cluster 1
      - cert-manager
      - Optional kof-istio
      - kof-operators chart
        - Disabled grafana-operator
        - opentelemetry-operator
        - prometheus-operator-crds
      - kof-collectors chart
        - opencost
        - opentelemetry-kube-stack
      - observed objects
Helm Charts#
KOF is deployed as a series of Helm charts at various levels.
kof-operators#
- Grafana dashboards platform, managed by grafana-operator
- OpenTelemetry collectors below, managed by opentelemetry-operator
- prometheus-operator-crds, required to create OpenTelemetry collectors and to monitor kof-mothership itself
kof-mothership#
- Local VictoriaMetrics storage for alerting rules only, managed by victoria-metrics-operator
- cluster-api-visualizer for insight into multicluster configuration
- Sveltos dashboard, automatic secret distribution
- Dex SSO chart
- k0rdent service templates used by kof-regional and kof-child charts
- kof-dashboards for Grafana
- kof-operator (don't confuse it with the kof-operators chart) for auto-configuration
- Promxy for aggregating Prometheus metrics from regional clusters
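To illustrate the aggregation role that Promxy plays here, the following is a conceptual sketch only, not Promxy's implementation or API. It fans one query out to several regional backends and merges the returned series, tagging each with the region it came from. The names `federated_query`, `Backend`, and the `region` label, as well as the shape of a series, are assumptions made for this example.

```python
from typing import Callable, Dict, List

# Hypothetical shape of one returned series: {"labels": {...}, "value": float}
Series = Dict[str, object]
# Hypothetical backend: a callable that runs a query against one regional store.
Backend = Callable[[str], List[Series]]


def federated_query(query: str, backends: Dict[str, Backend]) -> List[Series]:
    """Run the same query on every regional backend and merge the results."""
    merged: List[Series] = []
    for region, run_query in backends.items():
        for series in run_query(query):
            labels = dict(series["labels"])
            # Keep an existing "region" label; otherwise record where the series came from.
            labels.setdefault("region", region)
            merged.append({"labels": labels, "value": series["value"]})
    return merged


if __name__ == "__main__":
    # Stub backends standing in for two regional clusters.
    eu = lambda q: [{"labels": {"__name__": q, "cluster": "child-1"}, "value": 1.0}]
    us = lambda q: [{"labels": {"__name__": q, "cluster": "child-2"}, "value": 0.0}]
    for s in federated_query("up", {"region-eu": eu, "region-us": us}):
        print(s)
```

The point of the sketch is that the management cluster does not need to hold the main data itself: it asks each region and presents one combined answer.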
kof-regional#
- MultiClusterService
  which configures and installs kof-storage and other charts to regional clusters
kof-child#
- MultiClusterService
  which configures and installs kof-collectors and other charts to child clusters
kof-istio#
- Optional Istio support for secure connectivity between clusters without external DNS
kof-storage#
- Regional VictoriaMetrics storage with main data, managed by victoria-metrics-operator
- vmauth entrypoint proxy for VictoriaMetrics components
- vmcluster for a highly available, fault-tolerant version of the VictoriaMetrics database
- victoria-logs-cluster for high-performance, cost-effective, scalable logs storage
- external-dns to communicate with other clusters
- Jaeger tracing platform, managed by jaeger-operator
- Dex SSO chart
- kof-dashboards for Grafana
kof-collectors#
- opentelemetry-kube-stack for hardware, OS, and Kubernetes metrics
- OpenCost "shines a light into the black box of Kubernetes spend"
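The chart placement shown in the mid-level diagram can be summarized as a small lookup. The sketch below is an illustration of that layout, not a KOF API: `CHARTS_BY_ROLE`, `charts_for`, and the `use_istio` flag are hypothetical names. It also simplifies by treating kof-istio as a single on/off choice for every cluster role, which is an assumption.

```python
from typing import Dict, List

# Charts listed per cluster type in the mid-level diagram (Istio handled separately).
CHARTS_BY_ROLE: Dict[str, List[str]] = {
    "management": ["kof-operators", "kof-mothership", "kof-collectors"],
    "regional": ["kof-operators", "kof-storage", "kof-collectors"],
    "child": ["kof-operators", "kof-collectors"],
}


def charts_for(role: str, use_istio: bool = False) -> List[str]:
    """Return the KOF charts expected on a cluster of the given role."""
    if role not in CHARTS_BY_ROLE:
        raise ValueError(f"unknown cluster role: {role!r}")
    charts = list(CHARTS_BY_ROLE[role])
    if use_istio:
        # Assumption: Istio is enabled everywhere or nowhere.
        charts.append("kof-istio")
    elif role == "management":
        # Per the diagram: either kof-istio, or kof-regional and kof-child.
        charts += ["kof-regional", "kof-child"]
    return charts


if __name__ == "__main__":
    for role in CHARTS_BY_ROLE:
        print(role, charts_for(role))
```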
Deployment Scenarios#
KOF supports two topologies:
Production (Regional Clusters)#
- Management-cluster telemetry is stored locally on the k0rdent management cluster.
- Child workloads send telemetry to their regional cluster, supporting data sovereignty and isolation.
Development / QA (Regionless)#
- No regions are defined. All telemetry (management and child) is stored on the k0rdent management cluster.
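The two topologies differ only in where telemetry ends up, which can be expressed as a short routing rule. This is a minimal sketch under stated assumptions: `Cluster` and `telemetry_destination` are hypothetical names, and the rule that a regional cluster stores its own telemetry is an assumption based on the regional cluster running both kof-collectors and kof-storage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cluster:
    name: str
    role: str  # "management", "regional", or "child"
    region: Optional[str] = None  # regional cluster serving a child, if any


def telemetry_destination(cluster: Cluster, management: str = "management") -> str:
    """Return the name of the cluster that stores this cluster's telemetry."""
    if cluster.role == "management":
        # Management-cluster telemetry is stored locally in both topologies.
        return management
    if cluster.role == "child":
        # Production: send to the regional cluster. Regionless: fall back to management.
        return cluster.region if cluster.region else management
    if cluster.role == "regional":
        # Assumption: a regional cluster keeps its own telemetry.
        return cluster.name
    raise ValueError(f"unknown cluster role: {cluster.role!r}")


if __name__ == "__main__":
    print(telemetry_destination(Cluster("child-1", "child", region="region-eu")))
    print(telemetry_destination(Cluster("child-2", "child")))
```

A child with a region is routed to that region, and a child without one is routed to the management cluster, which matches the Development / QA case.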
Component Roles & Rationale#
| Component | Role | Notes | 
|---|---|---|
| k0rdent | Orchestration | Multi-cluster lifecycle and service templates |
| OpenTelemetry | Collection | Metrics, logs, traces; auto-instrumentation options | 
| Promxy | Query Federation | Cross-cluster PromQL and alert rule evaluation at the management cluster |
| VictoriaMetrics | Metrics Storage | Scalable TSDB; selected over Prometheus for clustering efficiency | 
| VictoriaLogs | Log Storage | Scalable log TSDB with retention controls | 
| Jaeger | Tracing | Trace store/visualization; regional awareness | 
| Grafana | Visualization | Unified dashboards; SSO/RBAC | 
| Dex | SSO | OIDC provider for Grafana | 
| OpenCost | FinOps | Cost allocation and efficiency ratios | 
Dex Integration#
KOF uses Dex as an identity provider to enable Single Sign‑On (SSO) with OAuth2 and OIDC.
- Authentication flow: Dex issues ID tokens to Grafana and other clients after authenticating against an upstream identity provider (IdP).
- External IdP integration: Dex can delegate to providers such as Okta, Entra ID, GitHub, or LDAP.
- Group membership mapping: Dex propagates group membership claims, which KOF uses to enforce RBAC. Grafana dashboards and KOF namespaces can be restricted based on these groups.
This model centralizes authentication, while authorization remains controlled via Kubernetes RBAC and Grafana roles.
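The group membership mapping can be sketched as a lookup from the `groups` claim of an ID token to a Grafana role. This is an illustrative sketch, not KOF's or Grafana's actual configuration: the group names in `GROUP_TO_ROLE` and the function `grafana_role` are hypothetical, and picking the highest matching role is an assumed policy. Viewer, Editor, and Admin are Grafana's standard organization roles.

```python
from typing import Dict, List, Optional

# Grafana organization roles, ordered from least to most privileged.
ROLE_ORDER: List[str] = ["Viewer", "Editor", "Admin"]

# Hypothetical IdP group names; a real deployment would use its own.
GROUP_TO_ROLE: Dict[str, str] = {
    "kof-viewers": "Viewer",
    "kof-editors": "Editor",
    "kof-admins": "Admin",
}


def grafana_role(id_token_claims: dict) -> Optional[str]:
    """Map the token's group claims to the most privileged matching role."""
    groups = id_token_claims.get("groups", [])
    roles = [GROUP_TO_ROLE[g] for g in groups if g in GROUP_TO_ROLE]
    if not roles:
        return None  # no recognized group: grant nothing
    return max(roles, key=ROLE_ORDER.index)


if __name__ == "__main__":
    claims = {"email": "dev@example.com", "groups": ["kof-viewers", "kof-editors"]}
    print(grafana_role(claims))
```

Returning `None` for users with no recognized group keeps the default closed, so access depends on an explicit group assignment in the upstream IdP.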