Using KOF#
Optional Grafana#
- Grafana installation and automatic configuration are now disabled in KOF by default.
- If you want to install and enable Grafana, apply the Grafana in KOF guide.
- Otherwise, check the sections below, showing how to use KOF without Grafana.
Metrics and alerts#
- Prometheus UI:
  - Run in the management cluster:

        kubectl port-forward -n kof svc/kof-mothership-promxy 8082:8082

  - Explore the Graph: http://127.0.0.1:8082/graph?g0.expr=up&g0.tab=0
  - Explore the Alerts: http://127.0.0.1:8082/alerts
  - CLI queries for automation (URLs containing `?` are quoted so that shells such as zsh do not treat them as glob patterns):

        curl 'http://localhost:8082/api/v1/query?query=up' \
          | jq '.data.result | map(.metric.cluster) | unique'

        curl 'http://localhost:8082/api/v1/query?query=up' \
          | jq '.data.result | map(.metric.job) | unique'

        curl http://localhost:8082/api/v1/query \
          -d 'query=up{cluster="mothership", job="kof-collectors-opencost"}' \
          | jq
- Alertmanager UI:
  - Run in the management cluster:

        kubectl port-forward -n kof svc/vmalertmanager-cluster 9093:9093

  - Open http://127.0.0.1:9093/
- VictoriaMetrics UI:
  - Run in the regional cluster:

    To get metrics stored from Management to Management (if any), do this port-forward in the management cluster.

        KUBECONFIG=regional-kubeconfig kubectl port-forward \
          -n kof svc/vmselect-cluster 8481:8481

  - Open http://127.0.0.1:8481/select/0/vmui/#/dashboards
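For automation, the queries above can be wrapped in a small helper. The following is a minimal sketch, assuming one of the port-forwards above is active. The names `PROM_URL`, `prom_query`, and `down_targets` are hypothetical and introduced here for illustration. The sketch also assumes that the VictoriaMetrics cluster serves the Prometheus query API under `/select/0/prometheus`, matching the tenant `0` used in the VictoriaMetrics UI link.

```shell
# Base URL of a Prometheus-compatible API.
# Default: the promxy port-forward shown above.
# Assumption: for the vmselect port-forward, set
#   PROM_URL=http://127.0.0.1:8481/select/0/prometheus
PROM_URL="${PROM_URL:-http://localhost:8082}"

# prom_query EXPR: run an instant query and print the raw JSON response.
# -G sends the data as URL query parameters; --data-urlencode escapes
# characters such as braces, quotes, and spaces in the PromQL expression.
prom_query() {
  curl -sS -G "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# down_targets: print "cluster/job" for every series whose `up` value is 0.
down_targets() {
  prom_query 'up == 0' \
    | jq -r '.data.result[] | "\(.metric.cluster)/\(.metric.job)"' \
    | sort -u
}

# Usage (requires an active port-forward):
#   down_targets
#   prom_query 'up{cluster="mothership"}' | jq '.data.result | length'
```

An empty result from `down_targets` means that no scraped target currently reports `up == 0`. It does not prove that every expected target is being scraped.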
Logs#
- VictoriaLogs UI:
  - Run in the regional cluster:

    To get logs stored from Management to Management (if any), do this port-forward in the management cluster.

        KUBECONFIG=regional-kubeconfig kubectl port-forward \
          -n kof svc/kof-storage-victoria-logs-cluster-vlselect 9471:9471

  - Open http://127.0.0.1:9471/select/vmui/
  - CLI query for automation:

        curl http://127.0.0.1:9471/select/logsql/query \
          -d 'query=_time:1h' \
          -d 'limit=10'
- Run inside of Istio mesh:

      curl http://$REGIONAL_CLUSTER_NAME-logs-select:9471/select/logsql/query \
        -d 'query=_time:1h' \
        -d 'limit=10'

- Run without Istio and port-forwarding:

      VM_USER=$(
        kubectl get secret -n kof storage-vmuser-credentials -o yaml \
          | yq .data.username | base64 -d
      )
      VM_PASS=$(
        kubectl get secret -n kof storage-vmuser-credentials -o yaml \
          | yq .data.password | base64 -d
      )
      curl https://vmauth.$REGIONAL_DOMAIN/vls/select/logsql/query \
        -u "$VM_USER":"$VM_PASS" \
        -d 'query=_time:1h' \
        -d 'limit=10'
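The log queries above differ only in the base URL, so a script can parameterize it. The following is a minimal sketch, assuming the port-forward variant is in use. `VL_URL` and `vl_query` are hypothetical names, and the example filter `_time:1h error` is only an illustration of a LogsQL word filter combined with the time filter used above.

```shell
# Base URL of the VictoriaLogs select API.
# Default: the port-forward shown above.
VL_URL="${VL_URL:-http://127.0.0.1:9471}"

# vl_query QUERY [LIMIT]: run a LogsQL query and print the raw response.
# --data-urlencode escapes spaces and quotes in the query;
# LIMIT defaults to 10, the value used in the examples above.
vl_query() {
  curl -sS "${VL_URL}/select/logsql/query" \
    --data-urlencode "query=$1" \
    -d "limit=${2:-10}"
}

# Usage (requires an active port-forward):
#   vl_query '_time:1h'
#   vl_query '_time:1h error' 20 | jq -r '"\(._time) \(._msg)"'
```

The `jq` usage line assumes that the response is a stream of JSON objects, one per log entry, carrying `_time` and `_msg` fields.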
Traces#
VictoriaTraces provides a scalable, cost-efficient distributed tracing backend that helps k0rdent users observe application performance while supporting FinOps goals by reducing storage and query costs.
- VictoriaTraces UI:
  - Run in the regional cluster:

    To get traces stored from Management to Management (if any), do this port-forward in the management cluster.

        KUBECONFIG=regional-kubeconfig kubectl port-forward \
          -n kof svc/kof-storage-vt-cluster-vtselect 10471:10471

  - Open http://127.0.0.1:10471/select/vmui/
- CLI queries for automation:
  - LogsQL:

        curl http://127.0.0.1:10471/select/logsql/query \
          -d 'query=_time:1h' \
          -d 'limit=10'

  - Jaeger HTTP API:

        curl http://127.0.0.1:10471/select/jaeger/api/services

        curl 'http://127.0.0.1:10471/select/jaeger/api/traces?service=test'
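The Jaeger HTTP API calls above can be scripted in the same way. The following is a minimal sketch, assuming the port-forward above is active. `VT_URL`, `vt_jaeger`, `vt_services`, and `vt_traces` are hypothetical names. The `limit` parameter and the `data`, `traceID`, and `spans` fields follow the Jaeger query API JSON format, and their support in this deployment is an assumption.

```shell
# Base URL of the VictoriaTraces select API.
# Default: the port-forward shown above.
VT_URL="${VT_URL:-http://127.0.0.1:10471}"

# vt_jaeger PATH [CURL_ARGS...]: GET a path below the Jaeger HTTP API
# and print the raw JSON response.
vt_jaeger() {
  _vt_path="$1"
  shift
  curl -sS -G "${VT_URL}/select/jaeger/api/${_vt_path}" "$@"
}

# vt_services: print one service name per line.
vt_services() {
  vt_jaeger services | jq -r '.data[]?'
}

# vt_traces SERVICE [LIMIT]: print "traceID span_count" per trace.
# Assumption: the API honors the Jaeger `limit` query parameter.
vt_traces() {
  vt_jaeger traces \
    --data-urlencode "service=$1" \
    --data-urlencode "limit=${2:-20}" \
    | jq -r '.data[]? | "\(.traceID) \(.spans | length)"'
}

# Usage (requires an active port-forward):
#   vt_services
#   vt_traces test 5
```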
Cost Management (OpenCost)#
KOF includes OpenCost, which provides cost management features for Kubernetes clusters. Common metrics (also available in the pre-installed Grafana FinOps dashboards if enabled) are:
| Metric | Description |
|---|---|
| node_total_hourly_cost | Hourly cost per node (includes CPU, memory, storage) |
| namespace_cpu_cost | CPU cost aggregated by namespace |
| namespace_memory_cost | Memory cost aggregated by namespace |
| pod_cost | Cost allocation at pod granularity |
| cluster_efficiency | Ratio of requested vs actual resource usage |
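Without Grafana, these metrics can be read through the same Prometheus-compatible API used in the Metrics and alerts section. The following is a minimal sketch, assuming the promxy port-forward on port 8082 is active and that the cost series carry the `cluster` label seen on the `up` series. `PROM_URL`, `cost_query`, and `hourly_cost_by_cluster` are hypothetical names.

```shell
# Base URL of the Prometheus-compatible API (promxy port-forward).
PROM_URL="${PROM_URL:-http://localhost:8082}"

# cost_query EXPR: run an instant PromQL query and print the raw JSON.
cost_query() {
  curl -sS -G "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# hourly_cost_by_cluster: print "cluster<TAB>hourly cost",
# summing node_total_hourly_cost over all nodes of each cluster.
# Assumption: the series have a `cluster` label.
hourly_cost_by_cluster() {
  cost_query 'sum by (cluster) (node_total_hourly_cost)' \
    | jq -r '.data.result[] | [.metric.cluster, .value[1]] | @tsv'
}

# Usage (requires an active port-forward):
#   hourly_cost_by_cluster
#   cost_query 'topk(5, node_total_hourly_cost)' | jq '.data.result'
```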
Once you have this information, you can optimize your cluster. Typical optimizations include:
- Identifying under-utilized resources and right-sizing workloads
- Setting budgets and monitoring them with alerts
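A budget check can be as simple as comparing a queried cost value against a threshold in a scheduled job. The following is a minimal sketch, not a KOF feature. `over_budget` and `check_budget` are hypothetical helpers, and the cost value is assumed to come from a query such as the ones in the Metrics and alerts section.

```shell
# over_budget COST BUDGET: succeed (exit 0) when COST exceeds BUDGET.
# awk does the comparison because POSIX shell arithmetic
# does not support floating-point numbers.
over_budget() {
  awk -v cost="$1" -v budget="$2" 'BEGIN { exit !(cost > budget) }'
}

# check_budget NAME COST BUDGET: print a one-line verdict and
# return non-zero when the budget is exceeded, so that a cron job
# or CI step can fail on it.
check_budget() {
  if over_budget "$2" "$3"; then
    echo "ALERT: $1 costs $2/h, budget is $3/h"
    return 1
  fi
  echo "OK: $1 costs $2/h, budget is $3/h"
}

# Usage:
#   check_budget mothership 1.25 1.00
```

Returning a non-zero status keeps the helper composable: the caller decides whether an exceeded budget sends a notification, fails a pipeline, or is only logged.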
KOF UI#
When the TargetAllocator is in use, the Prometheus receiver configuration of the OpenTelemetry Collectors is distributed across the cluster.
The KOF UI collects metrics metadata from the same endpoints that are scraped by the Prometheus server:
graph TB
KOF_UI[KOF UI] --> C1OTC11
KOF_UI --> C1OTC1N
KOF_UI --> C1OTC21
KOF_UI --> C1OTC2N
KOF_UI --> C2OTC11
KOF_UI --> C2OTC1N
KOF_UI --> C2OTC21
KOF_UI --> C2OTC2N
subgraph Cluster1
subgraph C1Node1[Node 1]
C1OTC11[OTel Collector]
C1OTC1N[OTel Collector]
end
subgraph C1NodeN[Node N]
C1OTC21[OTel Collector]
C1OTC2N[OTel Collector]
end
C1OTC11 --PrometheusReceiver--> C1TA[TargetAllocator]
C1OTC1N --PrometheusReceiver--> C1TA
C1OTC21 --PrometheusReceiver--> C1TA
C1OTC2N --PrometheusReceiver--> C1TA
end
subgraph Cluster2
subgraph C2Node1[Node 1]
C2OTC11[OTel Collector]
C2OTC1N[OTel Collector]
end
subgraph C2NodeN[Node N]
C2OTC21[OTel Collector]
C2OTC2N[OTel Collector]
end
C2OTC11 --PrometheusReceiver--> C2TA[TargetAllocator]
C2OTC1N --PrometheusReceiver--> C2TA
C2OTC21 --PrometheusReceiver--> C2TA
C2OTC2N --PrometheusReceiver--> C2TA
end
You can access the KOF UI by following these steps:
- Forward a port to the KOF UI:

      kubectl port-forward -n kof deploy/kof-mothership-kof-operator 9090:9090

- Open the link http://127.0.0.1:9090
- Check the state of the endpoints:

If there is a misconfiguration in the Prometheus targets (for example, if multiple targets scrape the same URL), the UI will display an error:

The KOF UI also allows you to monitor internal telemetry from OpenTelemetry collectors and VictoriaMetrics/Logs, enabling comprehensive observability of their health and performance.

To identify and debug issues in deployed clusters, check if KOF UI shows any errors in these monitored resources:
- ClusterDeployment
- ClusterSummaries
- MultiClusterService
- ServiceSet
- StateManagementProvider
- SveltosCluster
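The same resources can also be listed from the command line for a quick cross-check. The following is a minimal sketch, assuming `kubectl` points at the management cluster and that each kind resolves unambiguously by its lowercase singular name. `KOF_DEBUG_KINDS` and `kof_list_resources` are hypothetical names.

```shell
# Resource kinds listed above, in lowercase singular form.
# Assumption: no other installed API group defines a kind with
# the same name; otherwise use the fully qualified resource name.
KOF_DEBUG_KINDS="clusterdeployment clustersummary multiclusterservice serviceset statemanagementprovider sveltoscluster"

# kof_list_resources: list every resource of each kind in all namespaces.
# A failure for one kind is reported and does not stop the loop.
kof_list_resources() {
  for kind in $KOF_DEBUG_KINDS; do
    echo "== ${kind} =="
    kubectl get "$kind" --all-namespaces || echo "(could not list ${kind})"
  done
}

# Usage:
#   kof_list_resources
```

For a resource that looks unhealthy, `kubectl describe` on that kind and name shows its status conditions and recent events.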
