# Configuring telemetry
k0rdent enables you to configure telemetry either by using the `--set` flag to set Helm values during installation, or by editing the `Management` object of an existing management cluster, as shown in the Extended Management Configuration.
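For example, here is a minimal sketch of the second approach, assuming the default `Management` object is named `kcm` and that Helm values are passed through `spec.core.kcm.config` as described in the Extended Management Configuration (verify the apiVersion served by your cluster):

```yaml
apiVersion: k0rdent.mirantis.com/v1beta1   # assumption: check `kubectl api-resources`
kind: Management
metadata:
  name: kcm                                # assumed default object name
spec:
  core:
    kcm:
      config:
        regional:
          telemetry:
            mode: online
```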
## Main settings block
Telemetry is configured under the `telemetry` block of the `kcm-regional` chart (see Components segregation).
In a `managements` object, set the following:
```yaml
regional:
  telemetry:
    mode: online      # disabled | local | online (default: online)
    concurrency: 5    # scrape N clusters in parallel (1..100)
    interval: 1h      # scrape cadence
    jitter: 10        # % jitter (1..99)
```
- `mode` controls the collection and storage mechanism: `online` sends to Segment; `local` writes files; `disabled` turns it off. (Default: `online`)
- `concurrency` is how many child clusters are scraped in parallel.
- `interval` is the scrape frequency.
- `jitter` spreads scrapes across time to avoid thundering herds.
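As an illustration (the numbers are arbitrary, not recommendations), a management cluster scraping a large fleet might raise `concurrency` and widen `jitter` so scrapes neither queue up nor all fire at once:

```yaml
telemetry:
  mode: online
  concurrency: 20   # scrape up to 20 child clusters in parallel
  interval: 6h      # collect from each cluster every six hours
  jitter: 30        # spread scrape start times across 30% of the interval
```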
In a corresponding `regions` object, set the following:
```yaml
telemetry:
  mode: online      # disabled | local | online (default: online)
  concurrency: 5    # scrape N clusters in parallel (1..100)
  interval: 1h      # scrape cadence
  jitter: 10        # % jitter (1..99)
```
Note that the only change is the absence of the `regional` subpath.
**Note:** Here and below, all examples are given without the `regional` top-level subpath. Add or omit it depending on which resource you are editing: add it when editing a `managements` object; omit it when editing a `regions` object.
## Local mode storage settings
When `telemetry.mode` is set to `local`, you can configure where and how files are stored:
```yaml
telemetry:
  mode: local
  local:
    baseDir: /var/lib/telemetry
    volume:
      source: hostPath              # pvc | existing | hostPath (default)

      # For source: pvc (dynamic provisioning)
      pvc:
        storageClassName: ""        # default StorageClass if empty
        accessModes: [ReadWriteOnce]
        volumeMode: Filesystem
        size: 200Mi
        annotations: {}

      # For source: existing (use a pre-created PVC)
      existingClaim: ""

      # For source: hostPath (static PV + matching PVC)
      hostPath:
        name: ""                    # defaults to <fullname>-telemetry-data
        reclaimPolicy: Retain
        storageClassName: "manual"
        accessModes: [ReadWriteOnce]
        volumeMode: Filesystem
        size: 200Mi
        annotations: {}
        labels: {}
        nodeAffinity: {}            # pin the PV to a node if needed
```
- `baseDir` is the directory inside the pod where telemetry files appear; when `source: hostPath`, the host path on the node is the same `baseDir`.
- `volume.source` chooses between:
    - `pvc`: the chart creates a PVC (dynamic PV provisioning must be configured beforehand).
    - `existing`: the chart uses an existing PVC that you name in `existingClaim`.
    - `hostPath` (default): the chart creates a static hostPath PV plus a matching PVC mounted at `baseDir`. Default PV settings include `persistentVolumeReclaimPolicy: Retain` and `storageClassName: manual`; optional `nodeAffinity` ties the PV to a specific node.
**Recommendation:** Prefer dynamically provisioned PVCs (`source: pvc`) or existing PVCs (`source: existing`) over hostPath. HostPath ties data to one node and is generally less stable and less portable for controller pods that may reschedule.
## How the chart wires this up
- Only in `local` mode, an init container (`telemetry-data-permissions`) runs as root (with only the `CHOWN` capability) to `chown -R 65532:65532 <baseDir>` so that the main controller, which runs as non-root, can write files. The main pod runs with `runAsNonRoot: true`.
- When `source: hostPath`, the chart creates:
    - a PersistentVolume with `hostPath.path: <baseDir>`, `type: DirectoryOrCreate`, reclaim policy `Retain` (default), and optional `nodeAffinity`;
    - a matching PVC bound by name.
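To confirm this wiring on a live cluster, you can inspect the created objects and the init container's log; the namespace and workload name below are assumptions, so adjust them to your installation:

```bash
# In hostPath mode the chart creates both a PV and a PVC
kubectl get pv,pvc -n kcm-system | grep -i telemetry

# Check that the permissions init container completed successfully
kubectl logs -n kcm-system deploy/kcm-telemetry -c telemetry-data-permissions
```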
**Note on a legacy flag:** `controller.enableTelemetry` was deprecated in version 1.4.0 and removed in version 1.5.0 in favor of the `telemetry` block. It may still appear as a container argument for backward compatibility, but going forward you should configure telemetry via the `telemetry` values.
## Configuration examples
**Note:** Replace only the snippets you need; all other values keep their defaults.
### Disable telemetry entirely
```yaml
telemetry:
  mode: disabled
```
No data is collected or stored.
### Online telemetry (defaults)
```yaml
telemetry:
  mode: online      # default
  concurrency: 5    # default
  interval: 1h      # default
  jitter: 10        # default
```
Sends data to Segment.io CDP.
### Local telemetry with dynamic provisioning
**Warning:** Ensure dynamic PV provisioning is configured beforehand.
```yaml
telemetry:
  mode: local
  local:
    baseDir: /var/lib/telemetry
    volume:
      source: pvc
      pvc:
        storageClassName: gp2
        accessModes: [ReadWriteOnce]
        size: 5Gi
```
The chart creates a PVC; files appear under `/var/lib/telemetry` in the pod and are rotated daily.
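To check that files are actually being written, you can list the directory from inside the controller pod; the workload name and namespace below are assumptions:

```bash
kubectl exec -n kcm-system deploy/kcm-telemetry -- ls -l /var/lib/telemetry
```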
### Local telemetry with an existing PVC
```yaml
telemetry:
  mode: local
  local:
    baseDir: /var/lib/telemetry
    volume:
      source: existing
      existingClaim: kcm-telemetry-pvc
```
The chart uses your PVC; no PV/PVC objects are created by the chart.
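If the claim does not exist yet, a minimal pre-created PVC might look like this (the name matches the snippet above; the namespace and storage class are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kcm-telemetry-pvc
  namespace: kcm-system          # must be the namespace the controller runs in
spec:
  accessModes: [ReadWriteOnce]
  storageClassName: gp2          # any StorageClass available in your cluster
  resources:
    requests:
      storage: 5Gi
```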
### Local telemetry with hostPath (single-node / pinned-node)
```yaml
telemetry:
  mode: local
  local:
    baseDir: /var/lib/telemetry
    volume:
      source: hostPath
      hostPath:
        name: kcm-telemetry-hostpath
        reclaimPolicy: Retain
        storageClassName: manual
        size: 10Gi
        nodeAffinity:
          required:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: In
                    values: ["worker-01"]
```
The chart creates a hostPath PV at `/var/lib/telemetry` and a matching PVC; prefer PVC-backed options in HA clusters.
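You can confirm that the static PV bound to its claim; the object name follows the snippet above and the namespace is an assumption:

```bash
kubectl get pv kcm-telemetry-hostpath
kubectl get pvc -n kcm-system | grep -i telemetry
```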
## Controller settings
**Note:** Added in version 1.5.0.
There are numerous settings for customizing the dedicated telemetry controller. The defaults are as follows:
```yaml
telemetry:
  # Telemetry runner configuration
  controller:
    # Number of replicas
    replicas: 1
    # Image configuration
    image:
      # Name of the image
      repository: ghcr.io/k0rdent/kcm/telemetry
      # Tag of the image
      tag: latest
      # Image pull policy, one of [IfNotPresent, Always, Never]
      pullPolicy: IfNotPresent
    # Container resources
    resources:
      limits:
        cpu: 500m
        memory: 256Mi
      requests:
        cpu: 100m
        memory: 64Mi
    # Node selector to constrain the pod to run on specific nodes
    nodeSelector: {}
    # Affinity rules for pod scheduling
    affinity: {}
    # Tolerations to allow the pod to schedule on tainted nodes
    tolerations: []
  # Global runner's logger settings
  logger:
    # Development mode defaults: encoder=console, logLevel=debug, stackTraceLevel=warn.
    # Production mode defaults: encoder=json, logLevel=info, stackTraceLevel=error.
    devel: false
    # Encoder type, one of [json, console]
    encoder: ""
    # Log level, one of [info, debug, error]
    log-level: ""
    # Level at which to produce stack traces, one of [info, error, panic]
    stacktrace-level: ""
    # Type of time encoding, one of [epoch, millis, nano, iso8601, rfc3339, rfc3339nano]
    time-encoding: rfc3339
```
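For example, to override just a couple of these defaults, such as pinning the image tag instead of tracking `latest` and raising the memory limit (the tag below is illustrative):

```yaml
telemetry:
  controller:
    image:
      tag: v1.5.0        # illustrative; prefer a pinned release over latest
    resources:
      limits:
        memory: 512Mi    # extra headroom when scraping large fleets
```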
## Troubleshooting
- Settings are not applied: make sure the `regional` subpath is either added (when editing a `managements` object) or absent (when editing a `regions` object).
- No local files appear: ensure `telemetry.mode: local` is set and a volume source is configured, and check that the init container `telemetry-data-permissions` completed successfully.
- Pod unschedulable with hostPath: if the pod moves to a different node, the hostPath PV won't be there; either pin the PV with `nodeAffinity` or switch to a PVC-backed option.
- Conflicting settings: if you previously set `controller.enableTelemetry`, migrate to the `telemetry` block; the old flag is deprecated and was removed in version 1.5.0.
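A few generic checks help with most of the cases above; the namespace and workload names are assumptions, so adjust them to your installation:

```bash
# Is the telemetry controller running, and on which node?
kubectl get pods -n kcm-system -o wide | grep -i telemetry

# Are the telemetry PVCs bound?
kubectl get pvc -n kcm-system

# Recent controller logs (workload name is illustrative)
kubectl logs -n kcm-system deploy/kcm-telemetry --tail=100
```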