Kyverno policy management

Kyverno is a policy engine designed specifically for Kubernetes.

Kyverno allows cluster administrators to manage environment-specific configurations (independently of workload configurations) and enforce configuration best practices for their clusters.

Kyverno can be used to scan existing workloads for best practices, or it can be used to enforce best practices by blocking or mutating API requests.

Kyverno enables administrators to do the following:

  • Manage policies as Kubernetes resources.
  • Validate, mutate, and generate resource configurations.
  • Select resources based on labels and wildcards.
  • Block nonconforming resources using admission controls, or report policy violations.
  • View policy enforcement as events.
  • Scan existing resources for violations.

Kyverno policies implement the various levels of Kubernetes Pod Security Standards for CSM services.

The policies are minimally restrictive and enforce the best practices for pods. The policies make sure that the following values are set for workloads (if not present):

securityContext:
  allowPrivilegeEscalation: false
  privileged: false
  runAsUser: 65534
  runAsNonRoot: true
  runAsGroup: 65534

Mutation and Validation policies are enforced for the network services such as load balancer and virtual service.

Mutation

Mutation policies are applied in the admission controller while creating pods.

It mutates the manifest of respective workloads before creating them so that when the resource comes up, it will abide by the policy constraints.

Example mutation policy

  1. Create a policy definition.

    apiVersion: kyverno.io/v1
    kind: Policy
    metadata:
      name: add-default-securitycontext
    spec:
    rules:
      - name: set-container-security-context
        match:
          resources:
            kinds:
            - Pod
            selector:
              matchLabels:
                app: nginx
        mutate:
          patchStrategicMerge:
            spec:
              containers:
              - (name): "*"
                securityContext:
                  +(allowPrivilegeEscalation): false
                  +(privileged): false
    
  2. Create a simple pod definition.

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
    containers:
    - name: nginx
      image: nginx:1.14.2
      ports:
      - containerPort: 80
    
  3. (ncn-mw#) List all of the policies with the following command:

    kubectl get pol -A
    

    Example output:

    NAMESPACE            NAME                        BACKGROUND   ACTION   READY
    default              add-default-securitycontext true         audit    true
    
  4. Check the manifest after applying the policy.

    spec:
      containers:
      - image: nginx:1.14.2
        imagePullPolicy: IfNotPresent
        name: nginx
        ports:
        - containerPort: 80
          protocol: TCP
        resources:
          requests:
            cpu: 10m
            memory: 64Mi
        securityContext:
          allowPrivilegeEscalation: false
          privileged: false
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: default-token-vgggw
          readOnly: true
    
  5. Edit the policy to add one more field and apply the policy again.

    apiVersion: kyverno.io/v1
    kind: Policy
    metadata:
      name: add-default-securitycontext
    spec:
    rules:
      - name: set-container-security-context
        match:
          resources:
            kinds:
            - Pod
            selector:
              matchLabels:
                app: nginx
        mutate:
          patchStrategicMerge:
            spec:
              containers:
              - (name): "*"
                securityContext:
                  +(allowPrivilegeEscalation): false
                  +(privileged): false
                  +(runAsNonRoot): true
    

    If any of the workloads fail to come up after enforcing the policy, then delete the individual policies and restart the workload.

  6. Check the pod description when the pod fails to come up.

    1. (ncn-mw#) Obtain the pod name.

      kubectl get pods
      

      Example output:

      NAME    READY   STATUS                       RESTARTS   AGE
      nginx   0/1     CreateContainerConfigError   0          5s
      
    2. (ncn-mw#) Describe the pod.

      kubectl describe pod nginx
      

      End of example output:

      Events:
      Type     Reason            Age                            From               Message
      ----     ------            ----                           ----               -------
      Normal   Scheduled         <invalid>                      default-scheduler  Successfully assigned default/nginx to ncn-w003-b7534262
      Warning  DNSConfigForming  <invalid> (x9 over <invalid>)  kubelet            Search Line limits were exceeded, some search paths have been omitted, the applied search line is: default.svc.cluster.local svc.cluster.local cluster.local vshasta.io us-central1-b.c.vsha-sri-ram-35682334251634485.internal c.vsha-sri-ram-35682334251634485.internal
      Normal   Pulled            <invalid> (x8 over <invalid>)  kubelet            Container image "nginx:1.14.2" already present on machine
      Warning  Failed            <invalid> (x8 over <invalid>)  kubelet            Error: container has runAsNonRoot and image will run as root (pod: "nginx_default(0ea1d573-219a-4927-b3c3-c76150d35a7a)", container: nginx)
      
  7. (ncn-mw#) If the previous step failed, then delete the policy and restart the workload.

    kubectl delete pol -n default add-default-securitycontext
    
  8. (ncn-mw#) Check the pod status after deleting the policy.

    kubectl get pods
    

    Example output:

    NAME    READY   STATUS    RESTARTS   AGE
    nginx   1/1     Running   0          6s
    

Validation

Validation policies can be applied any time in audit and enforce modes.

In the case of audit mode, violations are only reported. In enforce mode, the resources are blocked from coming up.

Also, it generates the report of policy violation in respective workloads. The following is an example of the validation policy in audit mode.

Example validation policy

  1. Add the following policy before applying the mutation to the workload.

    apiVersion: kyverno.io/v1
    kind: Policy
    metadata:
      name: validate-securitycontext
    spec:
      background: true
      validationFailureAction: audit
      rules:
      - name: container-security-context
        match:
          resources:
            kinds:
            - Pod
            selector:
              matchLabels:
                app: nginx
        validate:
          message: "Non root security context is not set."
          pattern:
            spec:
              containers:
              - (name): "*"
                securityContext:
                  allowPrivilegeEscalation: false
                  privileged: false
    
    • (ncn-mw#) View the policy report status with the following command:

      kubectl get polr -A
      

      Example output:

      NAMESPACE  NAME                   PASS   FAIL   WARN   ERROR   SKIP   AGE
      default    polr-ns-default        0      1      0      0       0      25d
      
    • (ncn-mw#) View a detailed policy report with the following command:

      kubectl get polr -n default polr-ns-default -o yaml
      

      Example output:

      results:
      - message: 'validation error: Non root security context is not set. Rule container-security-context failed at path /spec/containers/0/securityContext/'
        policy: validate-securitycontext
        resources:
        - apiVersion: v1
          kind: Pod
          name: nginx
          namespace: default
          uid: 319e5b09-6027-4d90-b3da-6aa1f14573ff
        result: fail
        rule: container-security-context
        scored: true
        source: Kyverno
        timestamp:
          nanos: 0
          seconds: 1654594319
        summary:
          error: 0
          fail: 1
          pass: 0
          skip: 0
          warn: 0
      
  2. Apply the mutation policy and restart the following workload.

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
    containers:
    - name: nginx
      image: nginx:1.14.2
      ports:
      - containerPort: 80
    
  3. (ncn-mw#) Check the policy report status.

    kubectl get polr -A
    

    Example output:

    NAMESPACE  NAME                   PASS   FAIL   WARN   ERROR   SKIP   AGE
    default    polr-ns-default        1      0      0      0       0      25d
    

    This shows that the mutation policy for the workload was enforced properly.

    If there are any discrepancies, look at the detailed policy report to triage the issue.

What is new in the HPE CSM 1.4 release and above

The upstream Baseline profile is now available for customers as part of the HPE CSM 1.4 release.

The Baseline profile is a collection of policies which implement the various levels of Kubernetes Pod Security Standards.

The Baseline profile is minimally restrictive and denies the most common vulnerabilities. It also follows many of the common security best practices for Kubernetes pods.

Baseline profile consists of 12 policies as listed below.

kubectl get clusterpolicy -A

Example output:

NAME                             BACKGROUND   ACTION   READY
cluster-job-ttl                  true         audit    true
disallow-capabilities            true         audit    true
disallow-host-namespaces         true         audit    true
disallow-host-path               true         audit    true
disallow-host-ports              true         audit    true
disallow-host-process            true         audit    true
disallow-privileged-containers   true         audit    true
disallow-proc-mount              true         audit    true
disallow-selinux                 true         audit    true
restrict-apparmor-profiles       true         audit    true
restrict-seccomp                 true         audit    true
restrict-sysctls                 true         audit    true

The violations for each of the Baseline policies is logged in a policy report, similar to the other policies mentioned in Validation section above. To get more information on each violation, use the following command.

Example to list policy violations at pod level

kubectl get polr -A -o json | jq -r -c '["Name","kind","Namespace","policy","message"],(.items[].results // [] | map(select(.result=="fail")) | select(. | length > 0) | .[] | select (.resources[0].kind == "Pod") | [.resources[0].name,.resources[0].kind,.resources[0].namespace,.policy,.message]) | @csv'

Example output:

"Name","kind","Namespace","policy","message"
"hms-discovery-28031310-lnvtf","Pod","services","disallow-capabilities","Any capabilities added beyond the allowed list (AUDIT_WRITE, CHOWN, DAC_OVERRIDE, FOWNER, FSETID, KILL, MKNOD, NET_BIND_SERVICE, SETFCAP, SETGID, SETPCAP, SETUID, SYS_CHROOT) are disallowed."
"etcd-backup-pvc-snapshots-to-s3-28031285-ssjfc","Pod","services","disallow-capabilities","Any capabilities added beyond the allowed list (AUDIT_WRITE, CHOWN, DAC_OVERRIDE, FOWNER, FSETID, KILL, MKNOD, NET_BIND_SERVICE, SETFCAP, SETGID, SETPCAP, SETUID, SYS_CHROOT) are disallowed."
"cray-dns-unbound-manager-28031310-wrhvj","Pod","services","disallow-capabilities","Any capabilities added beyond the allowed list (AUDIT_WRITE, CHOWN, DAC_OVERRIDE, FOWNER, FSETID, KILL, MKNOD, NET_BIND_SERVICE, SETFCAP, SETGID, SETPCAP, SETUID, SYS_CHROOT) are disallowed."
"cray-console-data-postgres-1","Pod","services","disallow-capabilities","Any capabilities added beyond the allowed list (AUDIT_WRITE, CHOWN, DAC_OVERRIDE, FOWNER, FSETID, KILL, MKNOD, NET_BIND_SERVICE, SETFCAP, SETGID, SETPCAP, SETUID, SYS_CHROOT) are disallowed."
"hms-discovery-28031292-6cxmz","Pod","services","disallow-capabilities","Any capabilities added beyond the allowed list (AUDIT_WRITE, CHOWN, DAC_OVERRIDE, FOWNER, FSETID, KILL, MKNOD, NET_BIND_SERVICE, SETFCAP, SETGID, SETPCAP, SETUID, SYS_CHROOT) are disallowed."

Example to list all the policy violations

kubectl get polr -A -o json | jq -r -c '["Name","kind","Namespace","policy","message"],(.items[].results // [] | map(select(.result=="fail")) | select(. | length > 0) | .[] | select (.resources[0].kind) | [.resources[0].name,.resources[0].kind,.resources[0].namespace,.policy,.message]) | @csv'

Example output:

"Name","kind","Namespace","policy","message"
"cray-nls","Deployment","argo","disallow-host-path","validation error: HostPath volumes are forbidden. The field spec.volumes[*].hostPath must be unset. Rule autogen-host-path failed at path /spec/template/spec/volumes/4/hostPath/"
"cray-ceph-csi-cephfs-nodeplugin","DaemonSet","ceph-cephfs","disallow-host-ports","validation error: Use of host ports is disallowed. The fields spec.containers[*].ports[*].hostPort , spec.initContainers[*].ports[*].hostPort, and spec.ephemeralContainers[*].ports[*].hostPort must either be unset or set to `0`. Rule autogen-host-ports-none failed at path /spec/template/spec/containers/2/ports/0/hostPort/"
"cray-ceph-csi-cephfs-nodeplugin","DaemonSet","ceph-cephfs","disallow-host-namespaces","validation error: Sharing the host namespaces is disallowed. The fields spec.hostNetwork, spec.hostIPC, and spec.hostPID must be unset or set to `false`. Rule autogen-host-namespaces failed at path /spec/template/spec/hostNetwork/"
"cray-ceph-csi-cephfs-nodeplugin","DaemonSet","ceph-cephfs","disallow-capabilities","Any capabilities added beyond the allowed list (AUDIT_WRITE, CHOWN, DAC_OVERRIDE, FOWNER, FSETID, KILL, MKNOD, NET_BIND_SERVICE, SETFCAP, SETGID, SETPCAP, SETUID, SYS_CHROOT) are disallowed."
"cray-ceph-csi-cephfs-nodeplugin","DaemonSet","ceph-cephfs","disallow-privileged-containers","validation error: Privileged mode is disallowed. The fields spec.containers[*].securityContext.privileged and spec.initContainers[*].securityContext.privileged must be unset or set to `false`. Rule autogen-privileged-containers failed at path /spec/template/spec/containers/0/securityContext/privileged/"

What is new in the HPE CSM 1.6 release and above

  1. Kyverno is upgraded from 1.9.5 version to 1.10.7 version and is now available for customers as part of the HPE CSM 1.6 release.

    This is a major upgrade with many new features and bug fixes. For a complete list, refer to the external CHANGELOG.

  2. Container image signature verification Kyverno policy.

Container Image Signature Verification Kyverno Policy

Container image signing and runtime verification policy by name check-image is delivered (through Kyverno policy engine) as part of CSM 1.6 release in Audit only mode. (Audit only mode will log the policy violation warning messages in to the cluster report and as events)

By default, the check-image policy is shipped as a ClusterPolicy and is configured to work in a sample (non-existent) namespace. To enable the policy, end users must customize it to the targeted namespaces.

CSM runs in an air gapped environment, where the images stored in artifactory are mirrored to the local Nexus registry. Kyverno policy engine does not support this environment for image verification. To enable Kyverno policy for such environments, all image specs need to be replaced from artifactory.algol60.net/csm-docker/stable/image:tag to registry.local/artifactory.algol60.net/csm-docker/stable/image:tag (prepend registry.local/ to the image spec). This will force Kyverno to check local Nexus registry for the images. Without registry.local/, Kyverno will try to contact remote registry (artifactory) every time, and this may trigger timeouts due to delays, and eventually lead to policy failure.

Prepending registry.local/ to the image spec can be achieved either manually or through a mutation policy.

For more information on mutation policy, refer to the Mutation Policy.

Policy customization

If any changes are to be made to the policy, for example, including or excluding certain namespaces and adding a new public key, then the end user must change the CSM customizations and redeploy the kyverno-policy chart. For more information on customization and redeployment, see Redeploying a Chart.

For more information on policy exception and matchings, refer to the Kyverno documentation at Policy Exceptions and match/exclude.

Known issues