Upgrading KOF#
Data Backup#
Before migrating or upgrading, it's recommended to create a backup of your Victoria Logs and Victoria Metrics data. You can export the data from the regional cluster endpoints to local files using the following commands. Repeat the described procedure for each of your regional clusters if needed.
- Look up the regional cluster ConfigMap named kof-<cluster-name> in the kcm-system namespace on the management cluster:
kubectl get cm kof-<cluster-name> -n kcm-system
- Note the read_logs_endpoint, read_metrics_endpoint, and other endpoint values from the ConfigMap (see the extraction example after this list).
- Ensure that you have enough free disk space (e.g. check the total used size of the volumes allocated to your regional cluster). Victoria Logs/Metrics usually compress the stored data, and the commands to fetch the data compress it too, but the backup is stored in a JSON line format, so it requires about 5 times more space.
- Ensure connectivity to the regional cluster (e.g. run the commands on the Mothership cluster, which surely has connectivity). If you are using the Istio service mesh, consider running a container in the kof namespace of the Mothership cluster with a properly sized volume attached, for connectivity to the regional cluster endpoints. Check the k0rdent cluster documentation to learn how to retrieve the kubernetes configuration.
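For example, you can print the endpoint values directly with a jsonpath query. This is a minimal sketch, assuming the endpoints are stored as plain keys under the ConfigMap data field:
kubectl get cm kof-<cluster-name> -n kcm-system \
  -o jsonpath='{.data.read_metrics_endpoint}{"\n"}{.data.read_logs_endpoint}{"\n"}'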
Backup Victoria Metrics#
To back up all metrics data to a local file (victoria-metrics-backup.gz):
curl -H 'Accept-Encoding: gzip' -sSN \
-u "<username>:<password>" \
<read_metrics_endpoint>/api/v1/export?reduce_mem_usage=1 \
-d 'match[]={__name__!=""}' \
> victoria-metrics-backup.gz
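Before relying on the backup, sanity-check that the exported file decompresses cleanly and contains JSON lines:
gzip -t victoria-metrics-backup.gz && echo "archive OK"
zcat victoria-metrics-backup.gz | head -n 1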
To restore all metrics to a regional cluster from a local file run:
Without Istio service mesh:
KUBECONFIG=kubeconfig kubectl port-forward svc/vminsert-cluster 8480:8480 -n kof # use kubeconfig of regional cluster to import data
curl -H 'Content-Encoding: gzip' -sSX POST \
-T victoria-metrics-backup.gz \
http://localhost:8480/insert/0/prometheus/api/v1/import
With Istio service mesh:
curl -H 'Content-Encoding: gzip' -sSX POST \
-T victoria-metrics-backup.gz \
http://<CLUSTER_NAME>-vminsert:8480/insert/0/prometheus/api/v1/import
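After importing metrics, consider resetting the vmselect cache so the restored data becomes visible in queries right away (adjust the host for your port-forward or mesh setup; the same endpoint is used in the migration section below):
curl -I http://vmselect-cluster.kof:8481/internal/resetRollupResultCache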
- The backup command above fetches all metrics from the old regional cluster and saves them as a gzip-compressed file.
- You can later use this file for migration or restoration by uploading it to the new cluster.
Backup Victoria Logs#
Before querying logs, increase the "search.maxQueryDuration" value for the kof-storage-victoria-logs-cluster-vlselect deployment: it is just 30s by default, which is not enough to query all logs in one request.
KUBECONFIG=kubeconfig kubectl edit deployment/kof-storage-victoria-logs-cluster-vlselect -n kof # use kubeconfig of the regional cluster you back up from
Edit and save the deployment spec to add the search.maxQueryDuration argument (remove it after you finish the backup).
spec:
template:
spec:
containers:
- args:
- --search.maxQueryDuration=24h
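If you prefer a non-interactive change, the same argument can be appended with a JSON patch. This sketch assumes the vlselect container is the first container in the pod spec; adjust the index if your deployment differs:
KUBECONFIG=kubeconfig kubectl patch deployment/kof-storage-victoria-logs-cluster-vlselect -n kof \
  --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--search.maxQueryDuration=24h"}]'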
To back up all logs data to a local file (victoria-logs-backup.gz):
curl -H 'Accept-Encoding: gzip' -sSN \
-u "<username>:<password>" \
<read_logs_endpoint>/select/logsql/query \
-d 'query=*' > victoria-logs-backup.gz
- This command fetches all logs from the old regional cluster and saves them as a gzip-compressed file.
- You can later use this file for migration or restoration by uploading it to the new cluster.
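If a single request is still too heavy even with the increased search.maxQueryDuration, you can split the backup into time-based chunks. A sketch, assuming the LogsQL _time range filter syntax:
curl -H 'Accept-Encoding: gzip' -sSN \
  -u "<username>:<password>" \
  <read_logs_endpoint>/select/logsql/query \
  -d 'query=_time:[2025-01-01,2025-01-02]' > victoria-logs-backup-day1.gz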
To restore all logs to a regional cluster from a local file run:
Without Istio service mesh:
KUBECONFIG=kubeconfig kubectl port-forward svc/kof-storage-victoria-logs-cluster-vlinsert 9481:9481 -n kof # use kubeconfig of regional cluster to import data
curl -H 'Content-Encoding: gzip' -sSX POST \
-T victoria-logs-backup.gz \
http://localhost:9481/insert/jsonline
N.B. the default write_logs_endpoint ends with /insert/opentelemetry/v1/logs: replace it with /insert/jsonline.
With Istio service mesh:
curl -H 'Content-Encoding: gzip' -sSX POST \
-T victoria-logs-backup.gz \
http://<CLUSTER_NAME>-logs-insert:9481/insert/jsonline
Notes:
- Replace <username>:<password>, <read_metrics_endpoint>, <read_logs_endpoint>, etc. with your actual credentials and endpoints (look up the storage-vmuser-credentials secret in the kof namespace for basic authentication credentials if you are not using the Istio service mesh).
- For chunked or filtered backups, refer to the official documentation for available query options, or see the time-range export example after these notes.
- Ensure you have sufficient disk space for the backup files.
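For instance, the metrics export can be limited to a time range via the start and end args of the /api/v1/export API:
curl -H 'Accept-Encoding: gzip' -sSN \
  -u "<username>:<password>" \
  '<read_metrics_endpoint>/api/v1/export?reduce_mem_usage=1&start=2025-01-01T00:00:00Z&end=2025-01-02T00:00:00Z' \
  -d 'match[]={__name__!=""}' \
  > victoria-metrics-backup-day1.gz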
Data migration to a new regional cluster#
This section describes the process for migrating data to a new KOF regional cluster, including both the cluster migration steps and the detailed procedure for migrating Victoria Logs and Metrics data.
Migration Overview#
- Deploy a new kof-regional cluster.
- Add the k0rdent.mirantis.com/kof-regional-cluster-name label to a child ClusterDeployment to point all collectors to the new regional cluster (see the label example after this list).
- Ensure metrics and logs are available for the child cluster, confirming the new regional cluster is operational.
- Change all other child clusters from the old regional cluster to the new one.
- Start the data migration.
- Ensure that the data is migrated and explorable in Grafana.
- Delete the old kof-regional cluster.
- Remove the label from child ClusterDeployments.
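A sketch of re-pointing a child cluster, assuming the ClusterDeployment objects live in the kcm-system namespace:
kubectl label clusterdeployment <child-cluster-name> -n kcm-system \
  --overwrite k0rdent.mirantis.com/kof-regional-cluster-name=<new-regional-cluster-name>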
Victoria Logs and Metrics Data Migration#
Follow these steps to migrate Victoria Logs and Metrics data from the old regional cluster to the new one:
- Look up the old regional cluster ConfigMap named kof-<cluster-name> in the kcm-system namespace on the management cluster.
- Note the read_logs_endpoint and read_metrics_endpoint values from the ConfigMap.
- Authentication and Service Mesh:
  - If not using the Istio service mesh: look up the storage-vmuser-credentials secret in the kof namespace for basic authentication credentials.
  - If using the Istio service mesh: copy the istio-remote-secret-<cluster-name> secret from the old regional cluster to the new regional cluster to enable Istio endpoint discovery.
- Prepare a migration environment on the new regional cluster:
KUBECONFIG=regional-kubeconfig kubectl run -it --rm \
  --image=alpine:latest -n kof migration -- sh
apk update
apk add curl pv
- Migrate metrics data by running the following command (replace placeholders as needed):
curl -H 'Accept-Encoding: gzip' -sSN \
  -u "<username>:<password>" \
  <read_metrics_endpoint>/api/v1/export?reduce_mem_usage=1 \
  -d 'match[]={__name__!=""}' \
  | pv | curl -H 'Content-Encoding: gzip' -sSX POST -T - \
  http://vminsert-cluster.kof:8480/insert/0/prometheus/api/v1/import
Check the VictoriaMetrics API documentation for more details.
The query fetches all metrics from the old regional cluster endpoint. If you need to split the migration into chunks, please refer to the documentation for the available match[] filter options (by metric name, time range, etc).
- Reset the vmselect cache after the migration:
curl -I http://vmselect-cluster.kof:8481/internal/resetRollupResultCache
- Migrate logs data by running the following command (replace placeholders as needed):
curl -H 'Accept-Encoding: gzip' -sSN \
  -u "<username>:<password>" \
  <read_logs_endpoint>/select/logsql/query \
  -d 'query=*' \
  | pv | curl -H 'Content-Encoding: gzip' -sSX POST -T - \
  http://kof-storage-victoria-logs-cluster-vlinsert.kof:9481/insert/jsonline
Check the VictoriaLogs FAQ for more details.
The query fetches all logs from the old regional cluster endpoint. If you need to split the migration into chunks, please refer to the documentation for the available logsql query filter options (by stream name, time range, etc).
To migrate data with transformation, please consider one of the following options:
- Back up the data to a file in JSON line format, change each JSON line, and restore the file to the new cluster (see the sketch after this list).
- Use the Python script with your custom transformation function.
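A minimal streaming sketch of the first option, assuming jq is available in the migration environment and using a hypothetical extra field as the transformation:
zcat victoria-logs-backup.gz \
  | jq -c '. + {"migrated_from": "old-regional"}' \
  | gzip \
  | curl -H 'Content-Encoding: gzip' -sSX POST -T - \
  http://kof-storage-victoria-logs-cluster-vlinsert.kof:9481/insert/jsonline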
Explanation of curl parameters:
- -H 'Accept-Encoding: gzip': Requests a gzip-compressed response to reduce bandwidth.
- -sS: Silent mode. Hides progress, shows error messages.
- -N: Disables output buffering, streaming data as it arrives.
- -u "<username>:<password>": Specifies basic authentication credentials (not needed for Istio).
- -d 'match[]={__name__!=""}': Sends POST data to filter all metrics.
- | pv: Pipes output through pv to monitor progress.
- Second curl:
  - -H 'Content-Encoding: gzip': Informs the server that the data is gzip-compressed.
  - -X POST: Uses the HTTP POST method.
  - -T -: Reads data from stdin (the pipe) and uploads it.
Notes: Ensure you have the correct credentials and endpoints for your environment. Adjust the commands as necessary for your authentication method and network setup.
Upgrade to any version#
- Create a backup of each KOF chart values in each cluster, for example:
# Management cluster uses default KUBECONFIG=""
for cluster in "" regional-kubeconfig child-kubeconfig; do
  for namespace in kof istio-system; do
    for chart in $(KUBECONFIG=$cluster helm list -qn $namespace); do
      KUBECONFIG=$cluster helm get values -n $namespace $chart -o yaml \
        > values-$cluster-$chart.bak
    done
  done
done
ls values-*.bak
- Ideally, create a backup of everything, including VictoriaMetrics/Logs data volumes.
- Open the latest version of the Installing KOF guide.
- Make sure you see the expected new version in the top navigation bar.
- Apply the guide step by step, but:
  - Skip unchanged credentials like external-dns-aws-credentials.
  - Before applying new YAML files, verify what has changed, for example:
diff -u values--kof-mothership.bak mothership-values.yaml
  - Run all helm upgrade commands with the new --version and files as documented.
- Do the same for other KOF guides.
- Apply each relevant "Upgrade to" section of this page from older to newer.
- For example, if you're upgrading from v1.1.0 to v1.3.0,
- first apply the Upgrade to v1.2.0 section,
- then apply the Upgrade to v1.3.0 section.
Upgrade to v1.5.0#
Note
To use the new KCM Regional Clusters with KOF, please apply the KCM Region With KOF guide on top of the steps below.
Important
This is for users of kof-istio only.
Starting from the v1.5.0 release, Istio has been moved to a separate repository, which changes the installation and upgrade process. Upgrading KOF will trigger a complete uninstallation of all components across Istio clusters. To prevent data loss and ensure a smooth migration, you must carefully follow the steps in the given order. Again, the steps below are only required if any Istio clusters were deployed.
1. Back Up Data from Istio Clusters#
Note
Perform all backup steps in this section for each Istio cluster individually. Each cluster must have its own backup before proceeding with the upgrade.
Create a PersistentVolumeClaim#
Create a PersistentVolumeClaim named backup-pvc on the management cluster:
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: backup-pvc
namespace: kof
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi # Adjust according to your data size
storageClassName: standard
EOF
Note
To estimate how much storage you need for the backup, open the kof-ui, navigate to the VictoriaMetrics/Logs section, and click the VictoriaMetrics storage pod name on the cluster you want to back up. Find the Data Size metric and multiply this value by at least two (or by at least five for VictoriaLogs). This is because the metric shows data compressed using the VictoriaMetrics algorithm, while the backup will be stored in a simple gzip format, which does not compress as efficiently.
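If you prefer the command line, you can also check the on-disk usage inside the storage pod directly. A sketch, assuming the data path used by operator-managed vmstorage pods; the pod name and path may differ in your setup:
KUBECONFIG=regional-kubeconfig kubectl exec -n kof <VMSTORAGE_POD_NAME> -- du -sh /vm-data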
Create a Backup Pod with curl#
Deploy a temporary pod for performing backups on the management cluster:
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
name: backup-deployment
namespace: kof
spec:
replicas: 1
selector:
matchLabels:
app: backup-deployment
template:
metadata:
labels:
app: backup-deployment
spec:
containers:
- name: backup
image: curlimages/curl:latest
command: ["/bin/sleep", "infinity"]
volumeMounts:
- mountPath: /home/curl_user
name: backup-volume
volumes:
- name: backup-volume
persistentVolumeClaim:
claimName: backup-pvc
EOF
Retrieve Cluster Endpoints#
Locate the regional cluster ConfigMap named kof-<cluster-name> in the kcm-system namespace of the management cluster.
Record the following endpoint values:
- read_logs_endpoint
- read_metrics_endpoint
Connect to the Backup Pod#
kubectl exec -it -n kof <BACKUP_DEPLOYMENT_POD_NAME> -- /bin/sh
Back Up VictoriaMetrics/Logs Data#
For instructions on creating backups of VictoriaMetrics and VictoriaLogs data, refer to the Data Backup section.
(Optional) Download Backups Locally#
kubectl cp -n kof <BACKUP_DEPLOYMENT_POD_NAME>:victoria-metrics-backup.gz ./victoria-metrics-backup.gz
kubectl cp -n kof <BACKUP_DEPLOYMENT_POD_NAME>:victoria-logs-backup.gz ./victoria-logs-backup.gz
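Whether you keep the backups in the pod or download them locally, verify the archives before the destructive cleanup in the next step:
gzip -t victoria-metrics-backup.gz && echo "metrics archive OK"
gzip -t victoria-logs-backup.gz && echo "logs archive OK"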
2. Clean Up the Old Istio Clusters#
Important
Ensure that both VictoriaMetrics and VictoriaLogs backup files have been successfully created and verified before the cluster cleanup.
Pause Synchronization on Managed Clusters#
Before starting the cleanup, execute the following command on the management cluster:
kubectl scale --replicas=0 deployment/addon-controller -n projectsveltos
This command temporarily pauses synchronization between the management cluster and all managed clusters. It ensures that no old configurations are applied during the cleanup process.
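You can confirm that synchronization is paused by checking that the addon-controller deployment reports zero ready replicas:
kubectl get deployment addon-controller -n projectsveltos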
Cleanup Old KOF Components#
Use the kof-nuke.bash script (located in the scripts/ directory of the kof repository) to completely remove all old KOF resources from the remote cluster.
Run the script for each regional and child cluster, for example:
export KUBECONFIG=regional-kubeconfig && ls $KUBECONFIG && scripts/kof-nuke.bash
Cleanup typically takes 1–5 minutes, depending on the cluster size and network speed.
Cleanup Old Istio Components#
Remove the old Istio components for each regional and child cluster:
export KUBECONFIG=regional-kubeconfig
helm uninstall --wait -n istio-system kof-istio-gateway
helm uninstall --wait -n istio-system kof-istio
helm uninstall --wait -n istio-system cert-manager
kubectl delete namespace istio-system --wait
kubectl get crd -o name | grep --color=never 'istio.io' | xargs kubectl delete
Once the cleanup of all regional and child clusters is complete,
run unset KUBECONFIG to use the management cluster in the next steps.
3. Remove the Old Istio Chart#
Remove the old Istio release and related resources from the management cluster:
helm uninstall -n istio-system kof-istio
kubectl delete namespace istio-system --wait
kubectl get crd -o name | grep --color=never 'istio.io' | xargs kubectl delete
4. Deploy New Istio Release#
Install the new Istio release to the management cluster.
Install the k0rdent-istio-base Chart#
helm upgrade -i --wait \
--create-namespace -n istio-system k0rdent-istio-base \
--set cert-manager-service-template.enabled=false \
--set injectionNamespaces="{kof}" \
oci://ghcr.io/k0rdent/istio/charts/k0rdent-istio-base --version 0.1.0
Notes:
- cert-manager-service-template.enabled=false disables the deployment of the cert-manager service template. It should already be deployed as part of KOF.
- injectionNamespaces="{kof}" ensures Istio sidecars are injected only into the kof namespace. To inject sidecars into additional namespaces, list them comma-separated: {kof,<YOUR_NAMESPACE>}.
Install the k0rdent-istio Chart#
helm upgrade -i --wait -n istio-system k0rdent-istio \
--set cert-manager-service-template.enabled=false \
--set "istiod.meshConfig.extensionProviders[0].name=otel-tracing" \
--set "istiod.meshConfig.extensionProviders[0].opentelemetry.port=4317" \
--set "istiod.meshConfig.extensionProviders[0].opentelemetry.service=kof-collectors-daemon-collector.kof.svc.cluster.local" \
oci://ghcr.io/k0rdent/istio/charts/k0rdent-istio --version 0.1.0
5. Upgrade the KOF Version#
Upgrade KOF to the target version following standard upgrade procedures.
6. Restart the KOF Pods#
You must restart the KOF pods on the management cluster to update the istio-proxy sidecar. Without restarting, you may encounter connection errors or Istio pod crashes.
kubectl delete pod --all -n kof
Notes:
- If you have additional namespaces with Istio sidecar injection, make sure to restart pods in those namespaces as well.
- Restarting ensures all pods run with the updated Istio configuration and sidecar versions.
7. Resume Sveltos Synchronization#
Resume synchronization between the management cluster and all managed (regional and child) clusters.
Run the following command on the management cluster:
kubectl scale --replicas=1 deployment/addon-controller -n projectsveltos
8. Add New Labels to ClusterDeployment#
To enable Istio on your clusters, you need to add two new labels to the corresponding ClusterDeployment resources.
- k0rdent.mirantis.com/istio-role: member - apply this label to the ClusterDeployments where Istio should be installed (see the example after this list).
- k0rdent.mirantis.com/istio-gateway: "true" - apply this label only to regional clusters to install the Istio gateway.
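For example, with kubectl, assuming your ClusterDeployment objects live in the kcm-system namespace:
kubectl label clusterdeployment <cluster-name> -n kcm-system \
  k0rdent.mirantis.com/istio-role=member
kubectl label clusterdeployment <regional-cluster-name> -n kcm-system \
  k0rdent.mirantis.com/istio-gateway=true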
9. Restore Data#
Use the backups created in step 1 to restore VictoriaMetrics and VictoriaLogs data.
Connect to the Backup Pod#
kubectl exec -it -n kof <BACKUP_DEPLOYMENT_POD_NAME> -- /bin/sh
Restore VictoriaMetrics/Logs Data#
Follow the restore steps in the Data Backup section to import backups into VM/VL.
Upgrade to v1.4.0#
- The PromxyServerGroup CRD was moved from the crds/ directory to the templates/ directory for auto-upgrade.
- Please use --take-ownership on upgrade of kof-mothership to 1.4.0:
helm upgrade --take-ownership \
  --reset-values --wait -n kof kof-mothership -f mothership-values.yaml \
  oci://ghcr.io/k0rdent/kof/charts/kof-mothership --version 1.4.0
- This will not be required in future upgrades.
Upgrade to v1.3.0#
Before upgrading any helm chart:
- If you have customized VMCluster or VMAlert resources, then update your resources according to the new values under "spec".
- If you have customized VMCluster/VMAlert resources using the "k0rdent.mirantis.com/kof-storage-values" cluster annotation, then keep the old values but put the new values first to reconcile the new ClusterDeployment configuration on the current release, then run the helm chart upgrades. After regional clusters have the new kof-storage helm chart installed, you can remove the old values from the cluster annotation.
- If you have set storage values of the kof-regional chart, update them in the same way.
- If you are not using Istio, then on step 8 of the Management Cluster upgrade please apply this temporary workaround for the Reconciling MultiClusterService issue:
kubectl rollout restart -n kcm-system deploy/kcm-controller-manager
Upgrade to v1.2.0#
- As part of the KOF 1.2.0 overhaul of metrics collection and representation, we switched from the victoria-metrics-k8s-stack metrics and dashboards to opentelemetry-kube-stack metrics and kube-prometheus-stack dashboards.
- Some of the previously collected metrics have slightly different labels.
- If consistency of timeseries labeling is important, users are advised to conduct relabeling of the corresponding timeseries in the metric storage by running a retroactive relabeling procedure of their preference.
- A possible reference solution here would be to use Rules backfilling via vmalert.
- The labels that would require renaming are these (see the check after this list):
  - Replace job="integrations/kubernetes/kubelet" with job="kubelet", metrics_path="/metrics".
  - Replace job="integrations/kubernetes/cadvisor" with job="kubelet", metrics_path="/metrics/cadvisor".
  - Replace job="prometheus-node-exporter" with job="node-exporter".
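Before running a backfill, you can check which of the old job values are actually present in your metrics storage via the standard label-values API:
curl -sS -u "<username>:<password>" \
  '<read_metrics_endpoint>/api/v1/label/job/values'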
Also:
- To upgrade from cert-manager-1-16-4 to cert-manager-v1-16-4, please apply this patch to the management cluster:
kubectl apply -f - <<EOF
apiVersion: k0rdent.mirantis.com/v1beta1
kind: ServiceTemplateChain
metadata:
  name: patch-cert-manager-v1-16-4-from-1-16-4
  namespace: kcm-system
  annotations:
    helm.sh/resource-policy: keep
spec:
  supportedTemplates:
    - name: cert-manager-v1-16-4
    - name: cert-manager-1-16-4
  availableUpgrades:
    - name: cert-manager-v1-16-4
EOF
Upgrade to v1.1.0#
- After you helm upgrade the kof-mothership chart, please run the following:
kubectl apply --server-side --force-conflicts \
  -f https://github.com/grafana/grafana-operator/releases/download/v5.18.0/crds.yaml
- After you get the regional-kubeconfig file on the KOF Verification step, please run the following for each regional cluster:
KUBECONFIG=regional-kubeconfig kubectl apply --server-side --force-conflicts \
  -f https://github.com/grafana/grafana-operator/releases/download/v5.18.0/crds.yaml
- This is noted as required in the grafana-operator release notes.
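To confirm the new CRDs are in place, you can list the grafana-operator CRDs; this assumes the operator's standard grafana.integreatly.org API group:
kubectl get crd | grep grafana.integreatly.org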