# Installing k0rdent Observability and FinOps
KOF can be installed using Helm charts.
## Helm Charts
KOF is deployed as a series of Helm charts at various levels.
### kof-mothership
- Centralized Grafana dashboard, managed by grafana-operator
- Local VictoriaMetrics storage for alerting rules only, managed by victoria-metrics-operator
- cluster-api-visualizer for insight into multicluster configuration
- Sveltos dashboard and automatic secret distribution
- k0rdent service templates to deploy other charts to regional clusters
- Promxy for aggregating Prometheus metrics from regional clusters
### kof-storage
- Regional Grafana dashboard, managed by grafana-operator
- Regional VictoriaMetrics storage with main data, managed by victoria-metrics-operator
- vmauth entrypoint proxy for VictoriaMetrics components
- vmcluster for a highly available, fault-tolerant version of the VictoriaMetrics database
- victoria-logs-single for high-performance, cost-effective, scalable logs storage
- external-dns to communicate with other clusters
### kof-operators
- prometheus-operator-crds, required to create the OpenTelemetry collectors below and to monitor kof-mothership itself
- OpenTelemetry collectors, managed by opentelemetry-operator
### kof-collectors
- prometheus-node-exporter for hardware and OS metrics
- kube-state-metrics for metrics about the state of Kubernetes objects
- OpenCost "shines a light into the black box of Kubernetes spend"
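Each of these charts ships its own default values. A convenient way to review them before installing is `helm show values`, shown here for `kof-mothership`; the same pattern applies to the other charts, with version `0.1.1` assumed to match the installation steps below:

```shell
# Print the default values of a KOF chart without installing it.
helm show values oci://ghcr.io/k0rdent/kof/charts/kof-mothership --version 0.1.1
```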
## Prerequisites
Before beginning KOF installation, you should have the following components in place:
- A k0rdent management cluster - you can get instructions to create one in the quickstart guide. To test on macOS, you can install one using:
  ```shell
  brew install kind && kind create cluster -n k0rdent
  ```
- You will also need your infrastructure provider credentials, such as those shown in the guide for AWS. Note that you should skip the "Create your ClusterDeployment" and later sections.
- Finally, you need access to create DNS records for service endpoints such as `kof.example.com`.
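As a quick sanity check before proceeding (the context name `kind-k0rdent` is an assumption based on the kind setup above), confirm that kubectl points at the management cluster:

```shell
kubectl config current-context   # e.g. kind-k0rdent for the kind setup above
kubectl get nodes
```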
## DNS auto-config
To avoid manual configuration of DNS records for service endpoints later, you can automate the process now using external-dns.
For example, for AWS you should use the Node IAM Role or IRSA methods in production.
For now, however, just for the sake of this demo based on the aws-standalone template, you can use the most straightforward (though less secure) static credentials method:
- Create an `external-dns` IAM user with this policy.
- Create an access key and an `external-dns-aws-credentials` file, as in:
  ```
  [default]
  aws_access_key_id = <EXAMPLE_ACCESS_KEY_ID>
  aws_secret_access_key = <EXAMPLE_SECRET_ACCESS_KEY>
  ```
- Create the `external-dns-aws-credentials` secret in the `kof` namespace:
  ```shell
  kubectl create namespace kof
  kubectl create secret generic \
    -n kof external-dns-aws-credentials \
    --from-file external-dns-aws-credentials
  ```
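To double-check the result, you can describe the secret; `kubectl describe` shows key names and sizes without printing the values. With the names above, the single data key should be `external-dns-aws-credentials`:

```shell
kubectl describe secret -n kof external-dns-aws-credentials
```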
## Management Cluster
To install KOF on the management cluster, look through the default values of the kof-mothership and kof-operators charts, and apply this example, or use it as a reference:
- Install `kof-operators`, required by `kof-mothership`:
  ```shell
  helm install --wait --create-namespace -n kof kof-operators \
    oci://ghcr.io/k0rdent/kof/charts/kof-operators --version 0.1.1
  ```
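  As an optional sanity check - an assumption about the CRD API groups these operators register, not an official step - you can confirm that the Prometheus and OpenTelemetry CRDs are now present:
  ```shell
  # Both groups should return several CRDs once kof-operators is installed.
  kubectl get crd | grep -E 'monitoring.coreos.com|opentelemetry.io'
  ```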
- Create the `mothership-values.yaml` file:
  ```yaml
  kcm:
    installTemplates: true
  ```
  This enables installation of `ServiceTemplates` such as `cert-manager` and `kof-storage`, making it possible to reference them from the Regional and Child `ClusterDeployments`.
- If you want to use a default storage class, but `kubectl get sc` shows no `(default)`, create it (a sketch of one way to do this follows at the end of this step). Otherwise, you can set a non-default storage class in the `mothership-values.yaml` file:
  ```yaml
  global:
    storageClass: <EXAMPLE_STORAGE_CLASS>
  ```
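  For example, assuming a cluster whose existing storage class is named `standard` (as in the kind setup from the prerequisites; the name is an assumption), one way to mark it as default is:
  ```shell
  # Annotate an existing storage class so Kubernetes treats it as the default.
  kubectl patch storageclass standard -p \
    '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
  ```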
- If you've applied the DNS auto-config section, add the following to the `kcm:` object in the `mothership-values.yaml` file:
  ```yaml
  kof:
    clusterProfiles:
      kof-aws-dns-secrets:
        matchLabels:
          k0rdent.mirantis.com/kof-aws-dns-secrets: "true"
        secrets:
          - external-dns-aws-credentials
  ```
  This enables Sveltos to auto-distribute the DNS secret to regional clusters.
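  After `kof-mothership` is installed below, a hedged way to confirm that Sveltos picked this profile up is to list the resulting cluster profiles (the resource name follows the standard Sveltos API and is an assumption here):
  ```shell
  # The kof-aws-dns-secrets profile should appear once the chart is installed.
  kubectl get clusterprofiles
  ```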
- Two secrets are auto-created by default: `storage-vmuser-credentials` and `grafana-admin-credentials`.
- Install `kof-mothership`:
  ```shell
  helm install --wait -f mothership-values.yaml -n kof kof-mothership \
    oci://ghcr.io/k0rdent/kof/charts/kof-mothership --version 0.1.1
  ```
- Wait for all pods to show that they're `Running`:
  ```shell
  kubectl get pod -n kof
  ```
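Since `installTemplates: true` was set above, the referenced `ServiceTemplates` should also become available on the management cluster. As an optional, unofficial check (assuming they land in `kcm-system`, like the `ClusterTemplates` used below):

```shell
# Templates such as cert-manager-1-16-2 and kof-storage-0-1-1 should be listed.
kubectl get servicetemplate -n kcm-system
```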
## Regional Cluster
To install KOF on the regional cluster, look through the default values of the kof-storage chart, and apply this example for AWS, or use it as a reference:
- Set your KOF variables using your own values:
  ```shell
  REGIONAL_CLUSTER_NAME=cloud1-region1
  REGIONAL_DOMAIN=$REGIONAL_CLUSTER_NAME.kof.example.com
  ADMIN_EMAIL=$(git config user.email)
  echo "$REGIONAL_CLUSTER_NAME, $REGIONAL_DOMAIN, $ADMIN_EMAIL"
  ```
- Use the up-to-date `ClusterTemplate`, as in:
  ```shell
  kubectl get clustertemplate -n kcm-system | grep aws
  TEMPLATE=aws-standalone-cp-0-1-0
  ```
- Compose the following objects:
  - `ClusterDeployment` - the regional cluster
  - `PromxyServerGroup` - for metrics
  - `GrafanaDatasource` - for logs

  ```shell
  cat >regional-cluster.yaml <<EOF
  apiVersion: k0rdent.mirantis.com/v1alpha1
  kind: ClusterDeployment
  metadata:
    name: $REGIONAL_CLUSTER_NAME
    namespace: kcm-system
    labels:
      kof: storage
  spec:
    template: $TEMPLATE
    credential: aws-cluster-identity-cred
    config:
      clusterIdentity:
        name: aws-cluster-identity
        namespace: kcm-system
      controlPlane:
        instanceType: t3.large
      controlPlaneNumber: 1
      publicIP: false
      region: us-east-2
      worker:
        instanceType: t3.medium
      workersNumber: 3
      clusterLabels:
        k0rdent.mirantis.com/kof-storage-secrets: "true"
        k0rdent.mirantis.com/kof-aws-dns-secrets: "true"
    serviceSpec:
      priority: 100
      services:
        - name: ingress-nginx
          namespace: ingress-nginx
          template: ingress-nginx-4-11-3
        - name: cert-manager
          namespace: cert-manager
          template: cert-manager-1-16-2
          values: |
            cert-manager:
              crds:
                enabled: true
        - name: kof-storage
          namespace: kof
          template: kof-storage-0-1-1
          values: |
            external-dns:
              enabled: true
            victoriametrics:
              vmauth:
                ingress:
                  host: vmauth.$REGIONAL_DOMAIN
              security:
                username_key: username
                password_key: password
                credentials_secret_name: storage-vmuser-credentials
            grafana:
              ingress:
                host: grafana.$REGIONAL_DOMAIN
              security:
                credentials_secret_name: grafana-admin-credentials
            cert-manager:
              email: $ADMIN_EMAIL
  ---
  apiVersion: kof.k0rdent.mirantis.com/v1alpha1
  kind: PromxyServerGroup
  metadata:
    labels:
      app.kubernetes.io/name: promxy-operator
      k0rdent.mirantis.com/promxy-secret-name: kof-mothership-promxy-config
    name: $REGIONAL_CLUSTER_NAME-metrics
    namespace: kof
  spec:
    cluster_name: $REGIONAL_CLUSTER_NAME
    targets:
      - "vmauth.$REGIONAL_DOMAIN:443"
    path_prefix: /vm/select/0/prometheus/
    scheme: https
    http_client:
      dial_timeout: "5s"
      tls_config:
        insecure_skip_verify: true
      basic_auth:
        credentials_secret_name: storage-vmuser-credentials
        username_key: username
        password_key: password
  ---
  apiVersion: grafana.integreatly.org/v1beta1
  kind: GrafanaDatasource
  metadata:
    labels:
      app.kubernetes.io/managed-by: Helm
    name: $REGIONAL_CLUSTER_NAME-logs
    namespace: kof
  spec:
    valuesFrom:
      - targetPath: "basicAuthUser"
        valueFrom:
          secretKeyRef:
            key: username
            name: storage-vmuser-credentials
      - targetPath: "secureJsonData.basicAuthPassword"
        valueFrom:
          secretKeyRef:
            key: password
            name: storage-vmuser-credentials
    datasource:
      name: $REGIONAL_CLUSTER_NAME
      url: https://vmauth.$REGIONAL_DOMAIN/vls
      access: proxy
      isDefault: false
      type: "victoriametrics-logs-datasource"
      basicAuth: true
      basicAuthUser: \${username}
      secureJsonData:
        basicAuthPassword: \${password}
    instanceSelector:
      matchLabels:
        dashboards: grafana
    resyncPeriod: 5m
  EOF
  ```
- The `ClusterTemplate` above provides the default storage class `ebs-csi-default-sc`. If you want to use a non-default storage class, add it to the `regional-cluster.yaml` file in the `ClusterDeployment.spec.serviceSpec.services[name=kof-storage].values`:
  ```yaml
  global:
    storageClass: <EXAMPLE_STORAGE_CLASS>
  victoria-logs-single:
    server:
      storage:
        storageClassName: <EXAMPLE_STORAGE_CLASS>
  ```
- Verify and apply the Regional `ClusterDeployment`:
  ```shell
  cat regional-cluster.yaml
  kubectl apply -f regional-cluster.yaml
  ```
- Watch how the cluster is deployed to AWS until all values of `READY` are `True`:
  ```shell
  clusterctl describe cluster -n kcm-system $REGIONAL_CLUSTER_NAME \
    --show-conditions all
  ```
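Once the `READY` conditions are `True` and the DNS records created by external-dns have propagated, a hedged end-to-end smoke test is to query the regional vmauth endpoint directly, assuming the auto-created `storage-vmuser-credentials` secret in the `kof` namespace referenced above:

```shell
# Read the vmuser credentials from the management cluster and query vmauth.
VMUSER=$(kubectl get secret -n kof storage-vmuser-credentials -o jsonpath='{.data.username}' | base64 -d)
VMPASS=$(kubectl get secret -n kof storage-vmuser-credentials -o jsonpath='{.data.password}' | base64 -d)
curl -u "$VMUSER:$VMPASS" \
  "https://vmauth.$REGIONAL_DOMAIN/vm/select/0/prometheus/api/v1/query?query=up"
```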
## Child Cluster
To install KOF on the actual cluster to be monitored, look through the default values of the kof-operators and kof-collectors charts, and apply this example for AWS, or use it as a reference:
- Set your own value below, verifying the variables:
  ```shell
  CHILD_CLUSTER_NAME=$REGIONAL_CLUSTER_NAME-child1
  echo "$CHILD_CLUSTER_NAME, $REGIONAL_DOMAIN"
  ```
- Use the up-to-date `ClusterTemplate`, as in:
  ```shell
  kubectl get clustertemplate -n kcm-system | grep aws
  TEMPLATE=aws-standalone-cp-0-1-0
  ```
- Compose the `ClusterDeployment`:
  ```shell
  cat >child-cluster.yaml <<EOF
  apiVersion: k0rdent.mirantis.com/v1alpha1
  kind: ClusterDeployment
  metadata:
    name: $CHILD_CLUSTER_NAME
    namespace: kcm-system
    labels:
      kof: collector
  spec:
    template: $TEMPLATE
    credential: aws-cluster-identity-cred
    config:
      clusterIdentity:
        name: aws-cluster-identity
        namespace: kcm-system
      controlPlane:
        instanceType: t3.large
      controlPlaneNumber: 1
      publicIP: false
      region: us-east-2
      worker:
        instanceType: t3.small
      workersNumber: 3
      clusterLabels:
        k0rdent.mirantis.com/kof-storage-secrets: "true"
    serviceSpec:
      priority: 100
      services:
        - name: cert-manager
          namespace: kof
          template: cert-manager-1-16-2
          values: |
            cert-manager:
              crds:
                enabled: true
        - name: kof-operators
          namespace: kof
          template: kof-operators-0-1-1
        - name: kof-collectors
          namespace: kof
          template: kof-collectors-0-1-1
          values: |
            global:
              clusterName: $CHILD_CLUSTER_NAME
            opencost:
              enabled: true
              opencost:
                prometheus:
                  username_key: username
                  password_key: password
                  existingSecretName: storage-vmuser-credentials
                  external:
                    url: https://vmauth.$REGIONAL_DOMAIN/vm/select/0/prometheus
                exporter:
                  defaultClusterId: $CHILD_CLUSTER_NAME
            kof:
              logs:
                username_key: username
                password_key: password
                credentials_secret_name: storage-vmuser-credentials
                endpoint: https://vmauth.$REGIONAL_DOMAIN/vls/insert/opentelemetry/v1/logs
              metrics:
                username_key: username
                password_key: password
                credentials_secret_name: storage-vmuser-credentials
                endpoint: https://vmauth.$REGIONAL_DOMAIN/vm/insert/0/prometheus/api/v1/write
  EOF
  ```
- Verify and apply the `ClusterDeployment`:
  ```shell
  cat child-cluster.yaml
  kubectl apply -f child-cluster.yaml
  ```
- Watch while the cluster is deployed to AWS until all values of `READY` are `True`:
  ```shell
  clusterctl describe cluster -n kcm-system $CHILD_CLUSTER_NAME \
    --show-conditions all
  ```
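As a final hedged check, you can fetch the child cluster's kubeconfig and confirm the collectors are running; the secret name below follows the usual Cluster API `<cluster-name>-kubeconfig` convention, which is an assumption here:

```shell
# Extract the child cluster's kubeconfig from the management cluster.
kubectl get secret -n kcm-system $CHILD_CLUSTER_NAME-kubeconfig \
  -o jsonpath='{.data.value}' | base64 -d > child-kubeconfig
# The kof-collectors pods should be Running on the child cluster.
KUBECONFIG=child-kubeconfig kubectl get pod -n kof
```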