The following procedure includes steps to troubleshoot issues with S3FS mount points on worker and master NCNs. Beginning in the CSM 1.2 release, S3FS
is deployed as tool to reduce space usage on NCNs. Below is a list of the mount points on masters and workers:
Master nodes should host the following three mount points:
/var/opt/cray/config-data (config-data S3 bucket)
/var/lib/admin-tools (admin-tools S3 bucket)
/var/opt/cray/sdu/collection-mount (sds S3 bucket)
Worker nodes should host the following mount point:
/var/lib/cps-local/boot-images (boot-images S3 bucket)
Note: If this mount is missing, the cray-cps-cm-pm
pods may be unhealthy (in the CrashLoopBackoff
state). Proceed to Step 1 to resolve the issue.
Run the following command on master nodes to ensure the mounts are present:
ncn-m: # mount | grep 's3fs on'
s3fs on /var/opt/cray/config-data type fuse.s3fs (rw,nosuid,nodev,relatime,user_id=0,group_id=0)
s3fs on /var/lib/admin-tools type fuse.s3fs (rw,relatime,user_id=0,group_id=0,allow_other)
s3fs on /var/opt/cray/sdu/collection-mount type fuse.s3fs (rw,relatime,user_id=0,group_id=0,allow_other)
If the output is missing one or more of the mounts, proceed to Step 2
Run the following command on worker nodes to ensure the mount is present:
ncn-w: # mount | grep 's3fs on'
s3fs on /var/lib/cps-local/boot-images type fuse.s3fs (rw,nosuid,nodev,relatime,user_id=0,group_id=0)
If the output is missing the mount, proceed to Step 2
/etc/fstab
Contains the Mounts/etc/fstab
EntriesEnsure the /etc/fstab
contains the following content:
ncn-m: # grep fuse.s3fs /etc/fstab
sds /var/opt/cray/sdu/collection-mount fuse.s3fs _netdev,allow_other,passwd_file=/root/.sds.s3fs,url=http://rgw-vip.nmn,use_path_request_style,use_cache=/var/lib/s3fs_cache,check_cache_dir_exist,use_xattr,uid=2370,gid=2370,umask=0007,allow_other 0 0
admin-tools /var/lib/admin-tools fuse.s3fs _netdev,allow_other,passwd_file=/root/.admin-tools.s3fs,url=http://rgw-vip.nmn,use_path_request_style,use_cache=/var/lib/s3fs_cache,check_cache_dir_exist,use_xattr 0 0
config-data /var/opt/cray/config-data fuse.s3fs _netdev,allow_other,passwd_file=/root/.config-data.s3fs,url=http://rgw-vip.nmn,use_path_request_style,use_xattr 0 0
If the three entries above are not present in /etc/fstab
, add the above content and proceed to Step 3.
/etc/fstab
EntryEnsure the /etc/fstab
contains the following content:
ncn-w: # grep fuse.s3fs /etc/fstab
boot-images /var/lib/cps-local/boot-images fuse.s3fs _netdev,allow_other,passwd_file=/root/.ims.s3fs,url=http://rgw-vip.nmn,use_path_request_style,use_cache=/var/lib/s3fs_cache,check_cache_dir_exist,use_xattr 0 0
If the above line is not present in /etc/fstab
, add the above content and proceed to Step 3.
This step is the same for master and worker nodes. Run the following command to mount the directories specified in the /etc/fstab
file:
ncn-mw: # mount -a
If the above command fails, then the error likely indicates that there is an issue communicating with Ceph’s Radosgw
endpoint (rgw-vip
).
In this case the Troubleshoot an Unresponsive S3 Endpoint procedure should be followed to ensure the endpoint is healthy.