Using KOF#
Most of the time, you'll access KOF's data through Grafana.
Access to Grafana#
To make Grafana available, start with these steps:
- Get the Grafana username and password:
  kubectl get secret -n kof grafana-admin-credentials -o yaml | yq '{ "user": .data.GF_SECURITY_ADMIN_USER | @base64d, "pass": .data.GF_SECURITY_ADMIN_PASSWORD | @base64d }'
- Forward a port to the Grafana dashboard:
  kubectl port-forward -n kof svc/grafana-vm-service 3000:3000
- Log in to http://127.0.0.1:3000/dashboards with the username and password printed above.
- Open a dashboard and select any cluster.
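If yq is not installed, the credentials can also be read with kubectl's jsonpath output and base64. The following is a minimal sketch, not part of KOF itself: the function names grafana_secret_field and grafana_credentials are hypothetical helpers, and it assumes the secret name and keys shown in the command above.

```shell
# Print one decoded field of the grafana-admin-credentials secret
# in the kof namespace. The field name is passed as the first argument.
grafana_secret_field() {
  kubectl get secret -n kof grafana-admin-credentials \
    -o "jsonpath={.data.$1}" | base64 --decode
}

# Print the Grafana user and password on separate lines.
# (Hypothetical helper; the key names come from the yq command above.)
grafana_credentials() {
  printf 'user: %s\n' "$(grafana_secret_field GF_SECURITY_ADMIN_USER)"
  printf 'pass: %s\n' "$(grafana_secret_field GF_SECURITY_ADMIN_PASSWORD)"
}
```

After sourcing these definitions, run grafana_credentials in a shell whose kubectl context points at the cluster where KOF is installed.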
Single Sign-On#
Port forwarding, as described above, is a quick solution.
Single Sign-On provides a better experience. To enable it, follow this advanced guide: SSO for Grafana.
Cluster Overview#
From here you can get an overview of the cluster, including:
- Health metrics
- Resource utilization
- Performance trends
- Cost analysis
Logging Interface#
The logging interface is also available and includes:
- Real-time log streaming
- Full-text search
- Log aggregation
- Alert correlation
Dashboard Categories#
KOF ships with dashboards across:
* Infrastructure: Provides infrastructure-related metrics, such as kube clusters, nodes, API server, networking, storage, or GPU.
* Applications: Provides metrics for applications, such as VictoriaMetrics, VictoriaLogs, Jaeger and OpenCost.
* Service Mesh: Provides metrics for service mesh, such as Istio control-plane and traffic.
* Platform: Provides metrics for the platform itself, including KCM, Cluster API, and Sveltos.
Dashboard Lifecycle (GitOps Workflow)#
All dashboards are managed as code to keep environments consistent. To add or change a dashboard, follow these steps:
Add a new dashboard
1. Create a YAML file under charts/kof-dashboards/files/dashboards/ with the new dashboard definition.
2. Commit and push the change to Git.
3. Your CI/CD pipeline applies the Helm chart to the target cluster.
Update an existing dashboard
1. Edit the corresponding YAML file.
2. Commit and push changes.
3. CI/CD will roll out the update automatically.
Delete a dashboard
1. Remove the YAML file.
2. Commit and push changes.
3. CI/CD pipeline removes the dashboard from Grafana.
Warning
Avoid editing dashboards directly in the Grafana UI. Changes will be overwritten by the next Helm release.
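The first two steps of adding a dashboard can be sketched as a small shell helper. This is an illustration only: the function name add_dashboard and the commit message format are hypothetical, and the sketch assumes it is run from the root of the Git repository that contains the charts/kof-dashboards chart.

```shell
# Directory that holds the dashboard definitions (path taken from the steps above).
DASHBOARD_DIR="charts/kof-dashboards/files/dashboards"

# Copy a dashboard YAML file into the chart and commit it.
# Usage: add_dashboard <path-to-dashboard.yaml>
# Assumption: the current directory is the repository root.
add_dashboard() {
  src="$1"
  name="$(basename "$src")"
  if [ ! -f "$src" ]; then
    echo "no such file: $src" >&2
    return 1
  fi
  mkdir -p "$DASHBOARD_DIR"
  cp "$src" "$DASHBOARD_DIR/$name"
  git add "$DASHBOARD_DIR/$name"
  git commit -q -m "Add dashboard $name"
}
```

Pushing is left to the caller, for example add_dashboard my-dashboard.yaml && git push, where my-dashboard.yaml is a placeholder file name. The CI/CD pipeline then applies the Helm chart as described in step 3.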
Cost Management (OpenCost)#
KOF includes OpenCost, which provides cost management features for Kubernetes clusters. Common signals available in Grafana are:
* node_total_hourly_cost (per-node hourly cost)
* Namespace and pod-level cost allocation
* Historical spend trends and efficiency ratios
Once you have this information, you can optimize your cluster. Typical optimizations include:
* Identifying under-utilized resources and right-sizing workloads
* Budgeting and monitoring with Grafana alerts
Common OpenCost metrics include:
Metric | Description |
---|---|
node_total_hourly_cost | Hourly cost per node (includes CPU, memory, storage) |
namespace_cpu_cost | CPU cost aggregated by namespace |
namespace_memory_cost | Memory cost aggregated by namespace |
pod_cost | Cost allocation at pod granularity |
cluster_efficiency | Ratio of requested vs actual resource usage |
These metrics appear in the pre-installed Grafana FinOps dashboards.
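For budgeting, an hourly figure such as node_total_hourly_cost is often converted into a rough monthly estimate. The sketch below shows that arithmetic only. It assumes an average month of 730 hours (365 × 24 / 12), and the function name monthly_cost and the sample value are illustrative rather than part of KOF or OpenCost.

```shell
# Estimate a monthly cost from an hourly cost.
# Assumption: an average month has 730 hours (365 * 24 / 12).
# Usage: monthly_cost <hourly-cost>
monthly_cost() {
  awk -v hourly="$1" 'BEGIN { printf "%.2f\n", hourly * 730 }'
}

# Example with a made-up hourly cost of 0.5 per node:
monthly_cost 0.5   # prints 365.00
```

The result is only as accurate as the hourly value fed into it, so treat it as a starting point for alert thresholds rather than as a billing figure.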
Access to Jaeger#
To access the Jaeger UI of a regional cluster, follow these steps:
- Ensure you have the regional-kubeconfig file created on the verification step.
- If you've applied the Istio section:
  - Forward a port to the Jaeger UI:
    KUBECONFIG=regional-kubeconfig kubectl port-forward \
      -n kof svc/kof-storage-jaeger-query 16686:16686
  - Open the link http://127.0.0.1:16686/search and explore the Jaeger UI.
- If you have not applied the Istio section:
  - Ensure you have the REGIONAL_DOMAIN variable set on the installation step.
  - Get the regional Jaeger username and password:
    KUBECONFIG=regional-kubeconfig kubectl get secret \
      -n kof jaeger-admin-credentials -o yaml | yq '{ "user": .data.username | @base64d, "pass": .data.password | @base64d }'
  - Get the Jaeger UI URL, open it, and log in with the username and password printed above:
    echo https://jaeger.$REGIONAL_DOMAIN
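If REGIONAL_DOMAIN is unset, the echo command above silently prints https://jaeger. with no domain. A small guard makes that mistake visible. This is a minimal sketch: jaeger_url is a hypothetical helper name, and only the URL pattern comes from the steps above.

```shell
# Print the Jaeger UI URL, or fail with a message if REGIONAL_DOMAIN is missing.
# (Hypothetical helper; the URL pattern is the one used in the steps above.)
jaeger_url() {
  if [ -z "${REGIONAL_DOMAIN:-}" ]; then
    echo "REGIONAL_DOMAIN is not set; see the installation step" >&2
    return 1
  fi
  echo "https://jaeger.$REGIONAL_DOMAIN"
}
```

Call jaeger_url in the same shell session where REGIONAL_DOMAIN was exported during installation.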
Access to the KOF UI#
When the TargetAllocator is in use, the configuration of the OpenTelemetryCollectors' Prometheus receivers is distributed across the cluster.
The KOF UI collects metrics metadata from the same endpoints that are scraped by the Prometheus server:
graph TB
KOF_UI[KOF UI] --> C1OTC11
KOF_UI --> C1OTC1N
KOF_UI --> C1OTC21
KOF_UI --> C1OTC2N
KOF_UI --> C2OTC11
KOF_UI --> C2OTC1N
KOF_UI --> C2OTC21
KOF_UI --> C2OTC2N
subgraph Cluster1
subgraph C1Node1[Node 1]
C1OTC11[OTel Collector]
C1OTC1N[OTel Collector]
end
subgraph C1NodeN[Node N]
C1OTC21[OTel Collector]
C1OTC2N[OTel Collector]
end
C1OTC11 --PrometheusReceiver--> C1TA[TargetAllocator]
C1OTC1N --PrometheusReceiver--> C1TA
C1OTC21 --PrometheusReceiver--> C1TA
C1OTC2N --PrometheusReceiver--> C1TA
end
subgraph Cluster2
subgraph C2Node1[Node 1]
C2OTC11[OTel Collector]
C2OTC1N[OTel Collector]
end
subgraph C2NodeN[Node N]
C2OTC21[OTel Collector]
C2OTC2N[OTel Collector]
end
C2OTC11 --PrometheusReceiver--> C2TA[TargetAllocator]
C2OTC1N --PrometheusReceiver--> C2TA
C2OTC21 --PrometheusReceiver--> C2TA
C2OTC2N --PrometheusReceiver--> C2TA
end
You can access the KOF UI by following these steps:
- Forward a port to the KOF UI:
  kubectl port-forward -n kof deploy/kof-mothership-kof-operator 9090:9090
- Open the link http://127.0.0.1:9090
- Check the state of the endpoints.
  If there is a misconfiguration in the Prometheus targets (for example, if multiple targets scrape the same URL), the UI will display an error.
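To illustrate the duplicate-URL case mentioned above, the following sketch finds scrape URLs that appear more than once in a plain list. It does not query the KOF UI or the TargetAllocator. It assumes you have already collected the target URLs, one per line, by some other means, and the function name duplicate_targets is hypothetical.

```shell
# Read scrape URLs from standard input (one per line) and print
# each URL that occurs more than once.
duplicate_targets() {
  sort | uniq -d
}
```

For example, piping a file of collected URLs through duplicate_targets prints nothing when every target is unique, and prints each repeated URL once otherwise.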
The KOF UI also lets you monitor internal telemetry from the OpenTelemetry collectors, VictoriaMetrics, and VictoriaLogs, so you can observe their health and performance.
To identify and debug issues in deployed clusters, check whether the KOF UI shows any errors in these monitored resources:
- ClusterDeployment
- ClusterSummaries
- MultiClusterService
- ServiceSet
- StateManagementProvider
- SveltosCluster