The Image Management Service (IMS) customization workflow sets up a temporary image customization environment within a Kubernetes pod and mounts the image to be customized in that environment. A system administrator then makes the desired changes to the image root within the customization environment.
Afterwards, the IMS customization workflow automatically copies the NCN CA public key to /etc/cray/ca/certificate_authority.crt
within the image root being customized, in order
to enable secure communications between NCNs and client nodes. IMS then compresses the customized image root and uploads it and its associated initrd
image and kernel image (needed
to boot a node) to the artifact repository.
cray-ims
, the Image Management Service (IMS)cray-nexus
, the Nexus repository manager servicekubectl
is installed locally and configured to point at the SMS Kubernetes cluster.chroot
(if using a `jailed` environment), the image will only have access to
whatever configuration the image already contains. In order to talk to services, including Nexus RPM repositories, the image root must first be configured with DNS and other
settings. A base level of customization is provided by the default Ansible plays used by the Configuration Framework Service (CFS) to enable DNS resolution.root
user.initrd
image and kernel image are not regenerated automatically when the image root is changed. The admin must manually regenerate them while in the
customization environment, if needed..txz
compressed format need to be converted to SquashFS in order to use IMS image customization.Check for an existing IMS public key.
NOTE
If it is known that a public key associated with the user account being used was not previously uploaded to the IMS service, then skip this step and proceed to upload the SSH public key to the IMS service.
The following query may return multiple public key records. The correct one will have a name value including the current username in use.
cray ims public-keys list --format toml
Example output excerpt:
[[results]]
public_key = "ssh-rsa AAAAB3NzaC1yc2EA ... AsVruw1Zeiec2IWt"
id = "a252ff6f-c087-4093-a305-122b41824a3e"
name = "username public key"
created = "2018-11-21T17:19:07.830000+00:00"
id
field and proceed to record the IMS public key ID.Upload the SSH public key to the IMS service.
If an IMS public key record has already been created for the account being used, then note the value of its
id
field and proceed to record the IMS public key ID.
The IMS debug/configuration shell relies on passwordless SSH. This SSH public key needs to be uploaded to IMS to enable interaction with the image customization environment later in this procedure.
Replace the username
value with the actual username being used on the system when setting the public key name.
cray ims public-keys create --name "username public key" --public-key ~/.ssh/id_rsa.pub --format toml
Example output:
public_key = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCl50gK4l9uupxC2KHxMpTNxPTJbnwEdWy1jst5W5LqJx9fdTrc9uNJ33HAq+WIOhPVGbLm2N4GX1WTUQ4+wVOSmmBBJnlu/l5rmO9lEGT6U8lKG8dA9c7qhguGHy7M7WBgdW/gWA16gwE/u8Qc2fycFERRKmFucL/Er9wA0/Qvz7/U59yO+HOtk5hvEz/AUkvaaoY0IVBfdNBCl59CIdZHxDzgXlXzd9PAlrXZNO8jDD3jyFAOvMMRG7py78zj2NUngvsWYoBcV3FcREZJU529uJ0Au8Vn9DRADyB4QQS2o+fa6hG9i2SzfY8L6vAVvSE7A2ILAsVruw1Zeiec2IWt"
id = "a252ff6f-c087-4093-a305-122b41824a3e"
name = "username public key"
created = "2018-11-21T17:19:07.830000+00:00"
Note the value of the id
field and proceed to record the IMS public key ID.
Record the IMS public key ID.
Create a variable for the IMS public key id
value noted previously.
IMS_PUBLIC_KEY_ID=a252ff6f-c087-4093-a305-122b41824a3e
Determine if the image root being used is in IMS and ready to be customized.
Locate the IMS image record for the image that is being customized.
cray ims images list --format toml
Example output excerpt:
[[results]]
created = "2018-12-04T17:25:52.482514+00:00"
id = "4e78488d-4d92-4675-9d83-97adfc17cb19"
name = "sles_15_image.squashfs"
[results.link]
type = "s3"
path = "/4e78488d-4d92-4675-9d83-97adfc17cb19/sles_15_image.squashfs"
etag = ""
Record the IMS image ID.
Create a variable for the id
field value for the image that is being customized.
IMS_IMAGE_ID=4e78488d-4d92-4675-9d83-97adfc17cb19
Create an IMS job record in order to start the image customization job.
After customizing the image, IMS will automatically upload any build artifacts (root file system, kernel, and initrd
) to S3, and associate the S3 artifacts with IMS.
Unfortunately, IMS is not able to dynamically determine the names of the Linux kernel and initrd
to look for, because the file name for these vary depending upon Linux
distribution, Linux version, dracut
configuration, and more. Therefore the user must pass into IMS the name of the kernel and initrd
in the resultant image root’s /boot
directory.
Use the following table to help determine the default kernel and initrd
file names to specify when submitting the job to customize an image. These are just default names.
Consult with the site administrator to determine if these names have been changed for a given image or recipe.
Recipe | Recipe Name | Kernel File Name | initrd File Name |
---|---|---|---|
SLES 15 SP3 Barebones | cray-sles15sp3-barebones |
vmlinuz |
initrd |
COS | cray-shasta-compute-sles15sp3.x86_64-1.4.66 |
vmlinuz |
initrd |
Under normal circumstances, IMS customization jobs will download and mount the
rootfs
for the specified IMS image under the/mnt/image/image-root
directory within the SSH shell. After SSHing into the job container,cd
orchroot
into the/mnt/image/image-root
directory in order to interact with the image root being customized.Optionally, IMS can be told to create a jailed SSH environment by specifying the
--ssh-containers-jail True
parameter.A jailed environment lets users SSH into the SSH container and be immediately within the image root for the image being customized. Users do not need to
cd
orchroot
into the image root. Using a jailed environment has some advantages, such as making the IMS SSH job shell look more like a compute node. This allows applications like the CFS to perform actions on both IMS job pods (pre-boot) and compute nodes (post-boot).Before running the following command, replace the
MY_CUSTOMIZED_IMAGE
value with the name of the image root being used.
cray ims jobs create \
--job-type customize \
--kernel-file-name vmlinuz \
--initrd-file-name initrd \
--artifact-id $IMS_IMAGE_ID \
--public-key-id $IMS_PUBLIC_KEY_ID \
--enable-debug False \
--image-root-archive-name MY_CUSTOMIZED_IMAGE \
--format toml
Example output:
status = "creating"
enable_debug = false
kernel_file_name = "vmlinuz"
artifact_id = "4e78488d-4d92-4675-9d83-97adfc17cb19"
build_env_size = 10
job_type = "customize"
kubernetes_service = "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-service"
kubernetes_job = "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-customize"
id = "ad5163d2-398d-4e93-94f0-2f439f114fe7"
image_root_archive_name = "MY_CUSTOMIZED_IMAGE"
initrd_file_name = "initrd"
created = "2018-11-21T18:22:53.409405+00:00"
kubernetes_namespace = "ims"
public_key_id = "a252ff6f-c087-4093-a305-122b41824a3e"
kubernetes_configmap = "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-configmap"
[[ssh_containers]]
status = "pending"
jail = false
name = "customize"
[ssh_containers.connection_info."cluster.local"]
host = "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-service.ims.svc.cluster.local"
port = 22
[ssh_containers.connection_info.customer_access]
host = "ad5163d2-398d-4e93-94f0-2f439f114fe7.ims.cmn.shasta.cray.com"
port = 22
Create variables for the IMS job ID and Kubernetes job ID.
The values for these variables are taken from the output of the command in the previous step.
Set IMS_JOB_ID
to the value of the id
field. Set the IMS_KUBERNETES_JOB
to the value of
the kubernetes_job
field.
IMS_JOB_ID=ad5163d2-398d-4e93-94f0-2f439f114fe7
IMS_KUBERNETES_JOB=cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-customize
Create variables for the SSH connection values in the returned data.
The IMS customization job enables customization of the image root via an SSH shell accessible by one or more dynamic host names. The user needs to know if they will SSH from inside or outside the Kubernetes cluster to determine which host name to use. Typically, customers access the system from outside the Kubernetes cluster using the Customer Access Network (CAN).
Before setting the SSH values, determine the appropriate method to SSH into the customization pod:
[ssh_containers.connection_info.customer_access]
values (preferred): The customer_access
address is a dynamic hostname that is made available for use by the customer
to access the IMS job from outside the Kubernetes cluster.[ssh_containers.connection_info."cluster.local"]
values: The cluster.local
address is used when trying to access an IMS job from a pod that is running within the HPE Cray
EX Kubernetes cluster. For example, this is the address that CFS uses to talk to the IMS job during a pre-boot customization session.The external IP address should only be used if the dynamic customer_access
hostname does not resolve properly. In the following example, the administrator could then SSH to
the 10.103.2.160
IP address.
kubectl get services -n ims | grep $IMS_JOB_ID
Example output:
cray-ims-06c3dd57-f347-4229-85b3-1d024a947b3f-service LoadBalancer 10.29.129.204 10.103.2.160 22:31627/TCP 21h
To create the variables:
IMS_SSH_HOST=ad5163d2-398d-4e93-94f0-2f439f114fe7.ims.cmn.shasta.cray.com
IMS_SSH_PORT=22
Describe the image create job.
kubectl -n ims describe job $IMS_KUBERNETES_JOB
Example output:
Name: cray-ims-cfa864b3-4e08-49b1-9c57-04573228fd3f-customize
Namespace: default
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 4m job-controller Created pod: cray-ims-cfa864b3-4e08-49b1-9c57-04573228fd3f-customize-xh2jf
Record the name of the Kubernetes pod from the previous command.
The name is located in the Events
section of the output.
POD=cray-ims-cfa864b3-4e08-49b1-9c57-04573228fd3f-customize-xh2jf
Verify that the status of the IMS job is waiting_on_user
.
cray ims jobs describe $IMS_JOB_ID --format toml
Example output:
status = "waiting_on_user"
enable_debug = false
kernel_file_name = "vmlinuz"
artifact_id = "4e78488d-4d92-4675-9d83-97adfc17cb19"
build_env_size = 10
job_type = "customize"
kubernetes_service = "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-service"
kubernetes_job = "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-customize"
id = "ad5163d2-398d-4e93-94f0-2f439f114fe7"
image_root_archive_name = "my_customized_image"
initrd_file_name = "initrd"
created = "2018-11-21T18:22:53.409405+00:00"
kubernetes_namespace = "ims"
public_key_id = "a252ff6f-c087-4093-a305-122b41824a3e"
kubernetes_configmap = "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-configmap"
[[ssh_containers]]
status = "pending"
jail = false
name = "customize"
[ssh_containers.connection_info."cluster.local"]
host = "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-service.ims.svc.cluster.local"
port = 22
[ssh_containers.connection_info.customer_access]
host = "ad5163d2-398d-4e93-94f0-2f439f114fe7.ims.cmn.shasta.cray.com"
port = 22
Customize the image in the image customization environment.
Once in the image root using
chroot
(or if using a `jailed` environment) during image customization, the image will only have access to whatever configuration the image already contains. In order to talk to services, including Nexus RPM repositories, the image root must first be configured with DNS and other settings. A base level of customization is provided by the default Ansible plays used by the CFS to enable DNS resolution.
Option 1: SSH to the image customization environment.
In order for passwordless SSH to work, ensure that the correct public/private key pair is used. The private key will need to match the public key that was uploaded to the IMS service and specified in the IMS job.
IMPORTANT: The following command will work when run on any of the master nodes and worker nodes, except for
ncn-w001
.
ssh -p $IMS_SSH_PORT root@$IMS_SSH_HOST
Once connected to the IMS image customization shell, perform any customizations required. If the SSH shell was created without using the --ssh-containers-jail True
parameter, cd
or chroot
into the image root. The image root is available under /mnt/image/image-root
.
After changes have been made, run the touch
command on the complete
file. The location of the complete
file depends on whether or not the SSH job shell was created
using the --ssh-containers-jail True
parameter. See the table below for more information.
--ssh-containers-jail |
Command used to create the complete file |
---|---|
False (default) |
touch /mnt/image/complete |
True |
touch /tmp/complete |
[root@POD image]# cd /mnt/image/
[root@POD image]# chroot image-root/
:/ # (do touch complete flag)
:/ # exit
[root@POD image]#
When the complete
file has been created, the following actions will occur:
buildenv-sidecar
container will compresses the image root.Option 2: Use Ansible to run playbooks against the image root.
ansible all -i $IMS_SSH_HOST, -m ping --ssh-extra-args \
" -p $IMS_SSH_PORT -i ./pod_rsa_key -o StrictHostKeyChecking=no" -u root
Example output:
ad5163d2-398d-4e93-94f0-2f439f114fe7.ims.cmn.shasta.cray.com | SUCCESS => {
"changed": false,
"ping": "pong"
}
The Ansible inventory file below can also be used. The private key (./pod_rsa_key
) corresponds to the public key file the container has in its authorized_keys
file.
myimage-customize ansible_user=root ansible_host=ad5163d2-398d-4e93-94f0-2f439f114fe7.ims.cmn.shasta.cray.com ansible_port=22 \
ansible_ssh_private_key_file=./pod_rsa_key ansible_ssh_common_args='-o \
StrictHostKeyChecking=no'
A sample playbook can be run on the image root:
---
# The playbook creates a new database test and populates data in the database to test the sharding.
- hosts: all
remote_user: root
tasks:
- name: Look at the image root
command: "ls -l /mnt/image/image-root"
- name: chroot and run dracut
command: "chroot /mnt/image/image-root dracut --force --kver 4.4.143-94.47-default"
- name: example copying file with owner and permissions
copy:
src: sample_playbook.yml
dest: /mnt/image/image-root/tmp
- name: Exit the build container
copy:
src: nothing_file
dest: /mnt/image/complete
The sample playbook can be run with the following command:
ansible-playbook -i ./inventory.ini sample_playbook.yml
Follow the buildenv-sidecar
to ensure that any artifacts are properly uploaded to S3 and associated with IMS.
kubectl -n ims logs -f $POD -c buildenv-sidecar
Example output:
+ python -m ims_python_helper image upload_artifacts sles15_barebones_image 7de80ccc-1e7d-43a9-a6e4-02cad10bb60b
-v -r /mnt/image/sles15_barebones_image.sqsh -k /mnt/image/image-root/boot/vmlinuz
-i /mnt/image/image-root/boot/initrd
...
{
"ims_image_artifacts": [
{
"link": {
"etag": "4add976679c7e955c4b16d7e2cfa114e-32",
"path": "s3://boot-images/d88521c3-b339-43bc-afda-afdfda126388/rootfs",
"type": "s3"
},
"md5": "94165af4373e5ace3e817eb4baba2284",
"type": "application/vnd.cray.image.rootfs.squashfs"
},
{
"link": {
"etag": "f836412241aae79d160556ed6a4eb4d4",
"path": "s3://boot-images/d88521c3-b339-43bc-afda-afdfda126388/kernel",
"type": "s3"
},
"md5": "f836412241aae79d160556ed6a4eb4d4",
"type": "application/vnd.cray.image.kernel"
},
{
"link": {
"etag": "ec8793c07f94e59a2a30abdb1bd3d35a-4",
"path": "s3://boot-images/d88521c3-b339-43bc-afda-afdfda126388/initrd",
"type": "s3"
},
"md5": "86832ee3977ca0515592e5d00271d2fe",
"type": "application/vnd.cray.image.initrd"
},
{
"link": {
"etag": "13af343f3e76b0f8c7fbef7ee3588ac1",
"path": "s3://boot-images/d88521c3-b339-43bc-afda-afdfda126388/manifest.json",
"type": "s3"
},
"md5": "13af343f3e76b0f8c7fbef7ee3588ac1",
"type": "application/json"
}
],
"ims_image_record": {
"created": "2018-12-17T22:59:43.264129+00:00",
"id": "d88521c3-b339-43bc-afda-afdfda126388",
"name": "sles15_barebones_image"
"link": {
"etag": "13af343f3e76b0f8c7fbef7ee3588ac1",
"path": "s3://boot-images/d88521c3-b339-43bc-afda-afdfda126388/manifest.json",
"type": "s3"
},
},
"ims_job_record": {
"artifact_id": "2233c82a-5081-4f67-bec4-4b59a60017a6",
"build_env_size": 10,
"created": "2018-11-21T18:22:53.409405+00:00",
"enable_debug": false,
"id": "ad5163d2-398d-4e93-94f0-2f439f114fe7",
"image_root_archive_name": "sles15_barebones_image",
"initrd_file_name": "initrd",
"job_type": "create",
"kernel_file_name": "vmlinuz",
"kubernetes_configmap": "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-configmap",
"kubernetes_job": "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-create",
"kubernetes_service": "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-service",
"public_key_id": "a252ff6f-c087-4093-a305-122b41824a3e",
"resultant_image_id": "d88521c3-b339-43bc-afda-afdfda126388",
"ssh_port": 0,
"status": "packaging_artifacts"
},
"result": "success"
}
The IMS customization workflow automatically copies the NCN Certificate Authority’s public certificate to /etc/cray/ca/certificate_authority.crt
within the image root
being customized. This can be used to enable secure communications between the NCN and the client node.
Look up the ID of the newly created image.
cray ims jobs describe $IMS_JOB_ID --format toml
Example output:
status = "success"
enable_debug = false
kernel_file_name = "vmlinuz"
artifact_id = "4e78488d-4d92-4675-9d83-97adfc17cb19"
build_env_size = 10
job_type = "customize"
kubernetes_service = "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-service"
kubernetes_job = "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-customize"
id = "ad5163d2-398d-4e93-94f0-2f439f114fe7"
image_root_archive_name = "my_customized_image"
initrd_file_name = "initrd"
resultant_image_id = "d88521c3-b339-43bc-afda-afdfda126388"
created = "2018-11-21T18:22:53.409405+00:00"
kubernetes_namespace = "ims"
public_key_id = "a252ff6f-c087-4093-a305-122b41824a3e"
kubernetes_configmap = "cray-ims-ad5163d2-398d-4e93-94f0-2f439f114fe7-configmap"
Record the IMS image ID of the created image.
Set IMS_RESULTANT_IMAGE_ID
to the value of the resultant_image_id
field in the output of the previous command.
IMS_RESULTANT_IMAGE_ID=d88521c3-b339-43bc-afda-afdfda126388
Verify that the new IMS image record exists.
cray ims images describe $IMS_RESULTANT_IMAGE_ID --format toml
Example output:
created = "2018-12-04T17:25:52.482514+00:00"
id = "d88521c3-b339-43bc-afda-afdfda126388"
name = "my_customized_image.squashfs"
[link]
type = "s3"
path = "/d88521c3-b339-43bc-afda-afdfda126388/my_customized_image.squashfs"
etag = "28f3d78c8cceca2083d7d3090d96bbb7"
Clean up the image customization environment.
Delete the IMS job record.
cray ims jobs delete $IMS_JOB_ID
Deleting the job record also deletes the underlying Kubernetes job, service, and ConfigMap that were created when the job record was submitted.
Jobs left in a ‘Running’ state continue to consume Kubernetes resources until the job is completed, or deleted. If there are enough ‘Running’ IMS jobs on the system it may not be possible to schedule more pods on worker nodes due to insufficient resouces errors.
The image root has been modified, compressed, and uploaded to S3, along with its associated initrd
and kernel files. The image customization environment has also been cleaned up.