CSM uses the Cilium Container Network Interface (CNI) plugin. The plugin is responsible for assigning IP addresses to pods and for establishing network connectivity within and between Kubernetes nodes. This document provides guidance on troubleshooting Cilium-related issues in a CSM environment.
(`ncn-mw#`) Check the overall health of Cilium components. Look for any errors or warnings in the output.

```bash
cilium status
```
Expected output resembles the following:

```text
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    OK
 \__/¯¯\__/    Hubble Relay:       OK
    \__/       ClusterMesh:        disabled

DaemonSet              cilium                   Desired: 7, Ready: 7/7, Available: 7/7
DaemonSet              cilium-envoy             Desired: 7, Ready: 7/7, Available: 7/7
Deployment             cilium-operator          Desired: 1, Ready: 1/1, Available: 1/1
Deployment             hubble-relay             Desired: 1, Ready: 1/1, Available: 1/1
Deployment             hubble-ui                Desired: 1, Ready: 1/1, Available: 1/1
Containers:            cilium                   Running: 7
                       cilium-envoy             Running: 7
                       cilium-operator          Running: 1
                       hubble-relay             Running: 1
                       hubble-ui                Running: 1
Cluster Pods:          311/311 managed by Cilium
Helm chart version:    1.16.5
Image versions         cilium             artifactory.algol60.net/csm-docker/stable/quay.io/cilium/cilium:v1.16.5: 7
                       cilium-envoy       artifactory.algol60.net/csm-docker/stable/quay.io/cilium/cilium-envoy:v1.30.8: 7
                       cilium-operator    artifactory.algol60.net/csm-docker/stable/quay.io/cilium/operator-generic:v1.16.5: 1
                       hubble-relay       artifactory.algol60.net/csm-docker/stable/quay.io/cilium/hubble-relay:v1.16.5: 1
                       hubble-ui          artifactory.algol60.net/csm-docker/stable/quay.io/cilium/hubble-ui-backend:v0.13.1: 1
                       hubble-ui          artifactory.algol60.net/csm-docker/stable/quay.io/cilium/hubble-ui:v0.13.1: 1
```
All of the pods should have a `Running` status, and the number of `Ready` pods should match the `Desired` count. Discrepancies may indicate issues with Cilium components. If any errors or warnings are observed, investigate further by checking the logs of the specific Cilium pod or component. Ensure that all Cilium components are running and healthy.
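For example, the logs of a suspect agent or of the operator can be scanned for warnings and errors. The pod name below is only a placeholder (list the Cilium pods as shown in the next step to find the one of interest), and the `grep` filter is simply a convenience.

```bash
# Scan recent Cilium agent logs for warnings or errors; substitute the pod of interest.
kubectl -n kube-system logs cilium-dlp8x -c cilium-agent --tail=200 | grep -iE 'level=(warn|error)'

# The operator logs can be checked the same way.
kubectl -n kube-system logs deployment/cilium-operator --tail=200
```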
Get node-specific status from the Cilium pod.
(`ncn-mw#`) List all Cilium pods.

```bash
kubectl -n kube-system get pod -l k8s-app=cilium -o wide
```
Expected output resembles the following:

```text
NAME           READY   STATUS    RESTARTS   AGE   IP            NODE       NOMINATED NODE   READINESS GATES
cilium-9k9wp   1/1     Running   0          17d   10.252.1.9    ncn-m002   <none>           <none>
cilium-dlp8x   1/1     Running   0          17d   10.252.1.11   ncn-w004   <none>           <none>
cilium-ps8gx   1/1     Running   0          17d   10.252.1.8    ncn-m003   <none>           <none>
cilium-t2dn8   1/1     Running   0          17d   10.252.1.6    ncn-w002   <none>           <none>
cilium-tlzqz   1/1     Running   0          17d   10.252.1.7    ncn-w001   <none>           <none>
cilium-xqm8f   1/1     Running   0          17d   10.252.1.5    ncn-w003   <none>           <none>
cilium-zx2d2   1/1     Running   0          17d   10.252.1.10   ncn-m001   <none>           <none>
```
Look for any errors or warnings that may indicate issues with network connectivity or policies, such as pods that are not `Running` or that are restarting repeatedly.
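As an illustrative shortcut using standard `kubectl` filtering (not a CSM-specific command), the list can be narrowed to any agent pods that are not in the `Running` phase:

```bash
# List any Cilium agent pods that are not Running; a healthy system returns no pods.
kubectl -n kube-system get pod -l k8s-app=cilium --field-selector=status.phase!=Running
```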
(`ncn-mw#`) Get the Cilium status for a particular node.

In this example, the pod `cilium-dlp8x` is running on node `ncn-w004`, and it is used to check the status of Cilium on that node.

```bash
kubectl -n kube-system exec -it cilium-dlp8x -c cilium-agent -- cilium status
```
Expected output resembles the following:

```text
KVStore:                 Ok   Disabled
Kubernetes:              Ok   1.32 (v1.32.5) [linux/amd64]
Kubernetes APIs:         ["EndpointSliceOrEndpoint", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "cilium/v2alpha1::CiliumCIDRGroup", "core/v1::Namespace", "core/v1::Pods", "core/v1::Service", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement:    False
Host firewall:           Disabled
SRv6:                    Disabled
CNI Chaining:            none
CNI Config file:         successfully wrote CNI configuration file to /host/etc/cni/net.d/05-cilium.conflist
Cilium:                  Ok   1.16.5 (v1.16.5-ad688277)
NodeMonitor:             Listening for events on 64 CPUs with 64x4096 of shared memory
Cilium health daemon:    Ok
IPAM:                    IPv4: 49/254 allocated from 10.32.9.0/24,
IPv4 BIG TCP:            Disabled
IPv6 BIG TCP:            Disabled
BandwidthManager:        Disabled
Routing:                 Network: Tunnel [vxlan]   Host: Legacy
Attach Mode:             TCX
Device Mode:             veth
Masquerading:            IPTables [IPv4: Enabled, IPv6: Disabled]
Controller Status:       258/258 healthy
Proxy Status:            OK, ip 10.32.9.170, 0 redirects active on ports 10000-20000, Envoy: external
Global Identity Range:   min 256, max 65535
Hubble:                  Ok   Current/Max Flows: 4095/4095 (100.00%), Flows/s: 210.25   Metrics: Ok
Encryption:              Disabled
Cluster health:          7/7 reachable   (2025-08-07T21:36:12Z)
Modules Health:          Stopped(0) Degraded(0) OK(236)
```
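For a per-node view of connectivity from the same agent, the `cilium-health` tool shipped in the Cilium agent container can also be run; the pod name below is the same example pod used above.

```bash
# Report reachability of every node and its health endpoint, as seen from this agent.
kubectl -n kube-system exec -it cilium-dlp8x -c cilium-agent -- cilium-health status
```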
Cilium monitoring helps to identify issues with network policies, connectivity, and more. Hubble can also be used for more advanced monitoring and troubleshooting. See the Cilium and Hubble Monitoring documentation for more information.
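In addition to passive monitoring, the Cilium CLI includes a built-in connectivity test that deploys test workloads into a dedicated `cilium-test` namespace and exercises pod-to-pod, pod-to-service, and policy paths. This is an optional, illustrative check rather than a documented CSM procedure, and it creates temporary pods on the cluster.

```bash
# Deploy test workloads and run the Cilium CLI connectivity checks.
cilium connectivity test
```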
(`ncn-mw#`) To use the Hubble tool to diagnose network problems, start a port-forwarding session.

```bash
cilium hubble port-forward &
```

Expected output resembles the following:

```text
[1] 903511
ncn-m001:~ # ℹ️ Hubble Relay is available at 127.0.0.1:4245
```
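To confirm that the `hubble` CLI can reach the relay through the forwarded port, the standard `hubble status` command can be used as a quick check.

```bash
# Verify connectivity to Hubble Relay (127.0.0.1:4245 by default) and show flow counters.
hubble status
```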
(`ncn-mw#`) Use the `hubble` command to inspect network traffic and events.

This example looks at traffic that is being dropped because of network policies. This command will show the last 5 dropped packets along with their details.

```bash
hubble observe --verdict DROPPED --last 5
```
Example output:

```text
Aug 7 21:40:47.320: fe80::a0ec:7cff:fe45:72b (ID:32748) <> ff02::2 (unknown) Unsupported L3 protocol DROPPED (ICMPv6 RouterSolicitation)
Aug 7 21:40:49.515: sysmgmt-health/vmagent-vms-0-6cbf9d467c-49nxn:39932 (ID:64899) <> dvs/cray-dvs-mqtt-ss-1:15020 (ID:50714) Policy denied DROPPED (TCP Flags: SYN)
Aug 7 21:40:50.084: sysmgmt-health/vmagent-vms-1-5d5dddd445-q9znc:60136 (ID:3549) <> dvs/cray-dvs-mqtt-ss-0:15020 (ID:50714) Policy denied DROPPED (TCP Flags: SYN)
Aug 7 21:40:50.091: sysmgmt-health/vmagent-vms-1-5d5dddd445-jp7gf:58510 (ID:3549) <> dvs/cray-dvs-mqtt-ss-0:15020 (ID:50714) policy-verdict:none INGRESS DENIED (TCP Flags: SYN)
```
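Hubble flows can also be narrowed by namespace or pod. The filters below are standard `hubble observe` flags; the namespace and pod names are taken from the example output above and are illustrative only.

```bash
# Show only dropped flows destined for the dvs namespace.
hubble observe --verdict DROPPED --to-namespace dvs --last 20

# Follow flows from a specific pod in real time.
hubble observe --from-pod sysmgmt-health/vmagent-vms-0-6cbf9d467c-49nxn --follow
```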
For more information on using the Hubble CLI, see the Hubble CLI documentation.