Configure NTP on NCNs

The management nodes serve Network Time Protocol (NTP) at stratum 10, except for ncn-m001, which serves at stratum 8 (or lower if an upstream NTP server is set). All management nodes peer with each other.

Until an upstream NTP server is configured, the time on the NCNs may not match the current time at the site, but they will stay in sync with each other.

Topics

Change NTP Config
Troubleshooting NTP
Customize NTP
- Set A Local Timezone

Change NTP Config

The three different methods for configuring NTP are described below. The first option is the recommended method.

Edit /etc/chrony.d/cray.conf and restart chronyd on each node.

ncn# vi /etc/chrony.d/cray.conf
ncn# systemctl restart chronyd

Edit the data.json file, restart basecamp, and run the NTP script on each node.

ncn-m001# vi data.json
ncn-m001# systemctl restart basecamp

Run the NTP script on each node.

ncn# /srv/cray/scripts/metal/set-ntp-config.sh

Edit the data.json file, restart basecamp, and restart nodes so cloud-init runs on boot.
```
ncn-m001# vi data.json
ncn-m001# systemctl restart basecamp
```
Reboot each node.
```
ncn# reboot
```
Because cloud-init caches data, this method can produce inconsistent results.

Troubleshooting NTP

Verify NTP is configured correctly and troubleshoot any issues.

The chronyc command can be used to gather information on the state of NTP.

Check whether a given host is allowed NTP client access to this node.

This example checks whether the host 10.252.0.7 is allowed to use this node as an NTP server:
```
ncn# chronyc accheck 10.252.0.7
208 Access allowed
```

Check the system clock performance.

ncn# chronyc tracking
Reference ID    : 0AFC0104 (ncn-s003)
Stratum         : 4
Ref time (UTC)  : Mon Nov 30 20:02:24 2020
System time     : 0.000007622 seconds slow of NTP time
Last offset     : -0.000014609 seconds
RMS offset      : 0.000015776 seconds
Frequency       : 6.773 ppm fast
Residual freq   : -0.000 ppm
Skew            : 0.008 ppm
Root delay      : 0.000075896 seconds
Root dispersion : 0.000484318 seconds
Update interval : 513.7 seconds
Leap status     : Normal
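
The fields most often checked in this output are the stratum, the system time offset, and the leap status. The following is a minimal sketch of a helper that condenses them into one line. It reads `chronyc tracking` output on stdin. The function name `tracking_summary` and the default threshold of 0.1 seconds are assumptions made for illustration; they are not CSM defaults.

```shell
#!/usr/bin/env bash
# Sketch: summarize `chronyc tracking` output read from stdin.
# $1 is a hypothetical maximum acceptable offset in seconds (default 0.1).
tracking_summary() {
    local max_offset="${1:-0.1}"
    awk -v max="$max_offset" -F' *: *' '
        $1 == "Stratum"     { stratum = $2 }
        # Value looks like: "0.000007622 seconds slow of NTP time"
        $1 == "System time" { split($2, f, " "); offset = f[1]; dir = f[3] }
        $1 == "Leap status" { leap = $2 }
        END {
            status = (offset + 0 <= max + 0 && leap == "Normal") ? "OK" : "CHECK"
            printf "stratum=%s offset=%ss(%s) leap=%s status=%s\n",
                   stratum, offset, dir, leap, status
        }'
}

# Usage on a node (not executed here):
#   chronyc tracking | tracking_summary 0.05
```

A `status=CHECK` result only means that the offset exceeded the chosen threshold or the leap status was not `Normal`; it is a prompt to look at the full output, not a diagnosis.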

View information on drift and offset.

ncn# chronyc sourcestats
210 Number of sources = 8
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
ncn-w001                    6   3   42m     -0.029      0.126  +4104ns    28us
ncn-w002                    6   6   42m     -0.028      0.030    +44us  7278ns
ncn-w003                   12   7   23m     -0.059      0.023    -35us  8359ns
ncn-s002                   36  17  213m     -0.001      0.010  +5794ns    54us
ncn-s003                   36  17  212m     -0.000      0.007   -178ns    40us
ncn-m001                    0   0     0     +0.000   2000.000     +0ns  4000ms
ncn-m002                   28  15  192m     -0.007      0.009  +9942ns    49us
ncn-m003                   24  15  197m     -0.005      0.009  +9442ns    46us

View the NTP servers, pools, and peers.

ncn# chronyc sources
210 Number of sources = 8
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
=? ncn-w001                      4   9   377   435   +162us[ +164us] +/-  679us
=? ncn-w002                      4   9   377   505   +118us[ +120us] +/-  277us
=? ncn-w003                      4   7   377    82   +850ns[+2686ns] +/-  504us
=? ncn-s002                      4   9   377   542    -38us[  -36us] +/-  892us
=* ncn-s003                      3   9   377    19    +13us[  +15us] +/-  110us
=? ncn-m001                      0   9     0     -     +0ns[   +0ns] +/-    0ns
=? ncn-m002                      4   8   377   161    -47us[  -45us] +/-  408us
=? ncn-m003                      4   8   377   215    -11us[-9109ns] +/-  446us
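
In this output, the second character of the first column marks the state of each source (`*` is the source currently selected for synchronization), and a Reach value of 0 means that none of the last eight polls of that source succeeded. The sketch below pulls those two facts out of `chronyc sources` output supplied on stdin. The function names are hypothetical helpers, not part of chrony or CSM.

```shell
#!/usr/bin/env bash
# Sketch: inspect `chronyc sources` output read from stdin.
# Data rows start with a mode character (^, =, #) followed by a state character.

# Print the names of sources whose Reach register is 0.
unreachable_sources() {
    awk '$1 ~ /^[=#^][*+?x~-]$/ && $5 == "0" { print $2 }'
}

# Print the name of the currently selected source (state "*").
selected_source() {
    awk '$1 ~ /^[=#^][*]$/ { print $2 }'
}

# Usage on a node (not executed here):
#   chronyc sources | unreachable_sources
#   chronyc sources | selected_source
```

Applied to the example output above, the first helper would report ncn-m001 and the second would report ncn-s003.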

`chrony` Log Files

The chrony logs are stored in the /var/log/chrony/ directory.

Force a Time Sync

If the time is out of sync, force a sync of NTP.

Kubernetes and other services that are already running do not always react well to a large time jump. Ideally, perform this action while the node is booting.
```
ncn# chronyc burst 4/4
```
Wait about 15 seconds while NTP measurements are gathered.
```
ncn# sleep 15
```
Jump the clock manually.
```
ncn# chronyc makestep
```
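
The three steps above can be wrapped in a single function so that they always run in the same order. This is a sketch only: the function name `force_time_sync` and the `DRY_RUN` variable are assumptions introduced here, and the `chronyc` commands are the same ones shown above.

```shell
#!/usr/bin/env bash
# Sketch: burst, wait, then step the clock.
# Set DRY_RUN=1 to print the commands instead of running them.
force_time_sync() {
    local wait_secs="${1:-15}"
    local run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    $run chronyc burst 4/4 || return 1   # gather a quick set of measurements
    $run sleep "$wait_secs"              # give the burst time to complete
    $run chronyc makestep                # step the clock immediately
}

# Usage on a node (not executed here):
#   DRY_RUN=1 force_time_sync    # preview
#   force_time_sync              # run with the default 15 second wait
```

The dry-run mode is there because stepping the clock on a node with running services is disruptive; previewing first is a cheap safeguard.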

Known Issues and Bugs

Older versions of CSM contained some NTP bugs that can carry forward through CSM upgrades. This can result in problems with time syncing correctly. This section describes how to diagnose and fix these.

These issues all relate to certain nodes not being in a correct state.

Correct State

ncn-m001 should have these important settings in /etc/chrony.d/cray.conf:

server time.nist.gov iburst trust
# or
pool time.nist.gov iburst
# ncn-m001 should NOT use itself as a server; doing so is known to cause issues

# this allows the clock to step itself during a restart without affecting running apps if it drifts more than 1 second
initstepslew 1 time.nist.gov
# the other ncns are set to 10, so in the event of a tie, ncn-m001 is chosen as the leader
local stratum 8 orphan

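These settings ensure there is a low-stratum NTP server that ncn-m001 has access to. The other NCNs should have settings like the following. This example appears to be from ncn-m002, which is why ncn-m002 is absent from its own peer list: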

# all non-ncn-m001 NCNs use ncn-m001 as their server, and they trust it
server ncn-m001 iburst trust
# no pools are on the other ncns
# ncn-m001 should NOT use itself as a server; doing so is known to cause issues

# this allows the clock to step itself during a restart without affecting running apps if it drifts more than 1 second
initstepslew 1 ncn-m001
# the ncns peer with each other at a high stratum, and choose ncn-m001 (stratum 8 or lower) in the event of a tie
local stratum 10 orphan

# The nodes should have a max of 9 peers and should not include themselves in the list
peer ncn-m001 minpoll -2 maxpoll 9 iburst
peer ncn-m003 minpoll -2 maxpoll 9 iburst
peer ncn-s001 minpoll -2 maxpoll 9 iburst
peer ncn-s002 minpoll -2 maxpoll 9 iburst
peer ncn-s003 minpoll -2 maxpoll 9 iburst
peer ncn-w001 minpoll -2 maxpoll 9 iburst
peer ncn-w002 minpoll -2 maxpoll 9 iburst
peer ncn-w003 minpoll -2 maxpoll 9 iburst
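
The expected state of a non-ncn-m001 NCN can be checked mechanically. The sketch below tests a configuration file for the directives described above and for an accidental self-peer. The function name `check_ncn_chrony_conf` and its arguments are hypothetical; the directives it looks for are taken from the example configuration above.

```shell
#!/usr/bin/env bash
# Sketch: check a non-ncn-m001 chrony configuration for the expected directives.
# $1 = path to the config file (for example /etc/chrony.d/cray.conf)
# $2 = hostname of the node the file belongs to (for example ncn-m002)
check_ncn_chrony_conf() {
    local conf="$1" self="$2" rc=0
    grep -Eq '^local stratum 10 orphan' "$conf" \
        || { echo "missing: local stratum 10 orphan"; rc=1; }
    grep -Eq '^initstepslew ' "$conf" \
        || { echo "missing: initstepslew"; rc=1; }
    grep -Eq '^server ncn-m001 ' "$conf" \
        || { echo "missing: server ncn-m001"; rc=1; }
    # A node should not list itself as a peer.
    if grep -Eq "^peer ${self}( |\$)" "$conf"; then
        echo "self-peer: ${self}"
        rc=1
    fi
    return "$rc"
}

# Usage on a node (not executed here):
#   check_ncn_chrony_conf /etc/chrony.d/cray.conf "$(hostname -s)"
```

The function prints one line per problem and returns non-zero if any were found, so it can be used in a loop over nodes.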

Quick Fixes

Fix BSS Metadata

If nodes are missing metadata for NTP, you will be required to generate the data using csi and your system’s system_config.yaml. If you do not have your seed data in the system_config.yaml then you will need to open a ticket to help generate the NTP data.

The following steps are written to be executed on one node at a time. However, the step that runs the metadata script generates the relevant files for every node, so if multiple nodes are missing NTP data in BSS, the remaining steps can be repeated for each node.

Update system_config.yaml to have the correct NTP settings:

ntp-servers:
  - ncn-m001
  - time.nist.gov
ntp-timezone: UTC

Generate new configurations:
```
ncn# csi config init
```
In the newly created system/basecamp directory, copy in and execute the metadata script that is included in the upgrade scripts of this documentation:
```
ncn# ./upgrade_ntp_timezone_metadata.sh
```
Find the file(s) that correspond to the node(s) with missing metadata, such as upgrade-metadata-000000000000.json. The file names are based on the MAC address of each node.
Find the component name (xname) for the node that needs to be fixed:
```
ncn# cat /etc/cray/xname
```

From ncn-m001 execute the following command to update BSS:

ncn# csi handoff bss-update-cloud-init --user-data="upgrade-metadata-000000000000.json" --limit=<xname>

Obtain the updated cloud-init code and template files from the scripts directory in the CSM documentation RPM:

ncn# cp ./usr/share/doc/csm/scripts/cc_ntp.py /usr/lib/python3.6/site-packages/cloudinit/config/cc_ntp.py
ncn# cp ./usr/share/doc/csm/scripts/chrony.conf.cray.tmpl /etc/cloud/templates/chrony.conf.cray.tmpl

Alternatively, you can download the latest versions from GitHub:

ncn# wget -O /usr/lib/python3.6/site-packages/cloudinit/config/cc_ntp.py https://raw.githubusercontent.com/Cray-HPE/metal-cloud-init/main/cloudinit/config/cc_ntp.py
ncn# wget -O /etc/cloud/templates/chrony.conf.cray.tmpl https://raw.githubusercontent.com/Cray-HPE/metal-cloud-init/main/config/cray.conf.j2

Continue with the upgrade.
When the upgrade is complete, there might be extra files and configurations. If required, execute the following steps to clean up NTP:
1. Remove pool.conf on all nodes:
```
ncn# rm /etc/chrony.d/pool.conf
```
2. Remove cray.conf.dist on all nodes:
```
ncn# rm /etc/chrony.d/cray.conf.dist
```
3. Comment out the default pool line in /etc/chrony.conf on all nodes:
```
ncn# sed -i 's/^\!/#/' /etc/chrony.conf
```
4. Restart chronyd on all nodes:
```
ncn# systemctl restart chronyd
```
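
Because the four cleanup steps are run on every node, they can be scripted. The following is a rough sketch that assumes passwordless SSH from the node where it is run to each NCN; that assumption, the function names, and the `DRY_RUN` variable are not from this procedure. It also uses `rm -f` rather than `rm` so that a node where a file is already absent does not report an error.

```shell
#!/usr/bin/env bash
# Sketch: run the NTP cleanup steps on a list of nodes.
# Assumes passwordless SSH to each node. Set DRY_RUN=1 to preview only.

# Emit the cleanup commands, one per line.
ntp_cleanup_cmds() {
    cat <<'EOF'
rm -f /etc/chrony.d/pool.conf
rm -f /etc/chrony.d/cray.conf.dist
sed -i 's/^\!/#/' /etc/chrony.conf
systemctl restart chronyd
EOF
}

cleanup_ntp_on_nodes() {
    local node
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "would run on ${node}:"
            ntp_cleanup_cmds | sed 's/^/  /'
        else
            # Feed the commands to a remote shell over SSH.
            ntp_cleanup_cmds | ssh "$node" bash -s
        fi
    done
}

# Usage (not executed here; the node names are examples):
#   DRY_RUN=1 cleanup_ntp_on_nodes ncn-m002 ncn-w001
```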

Fix `ncn-m001`

Most of the bugs from CSM 0.9 carried forward with upgrades. Most commonly, ncn-m001 is the problem because it either does not have a valid upstream server, or it has a bad configuration. This can be quickly remedied by running three commands to download the latest cc_ntp module, download an updated template, and re-run cloud-init.

ncn-m001# wget -O /usr/lib/python3.6/site-packages/cloudinit/config/cc_ntp.py https://raw.githubusercontent.com/Cray-HPE/metal-cloud-init/main/cloudinit/config/cc_ntp.py
ncn-m001# wget -O /etc/cloud/templates/chrony.conf.cray.tmpl https://raw.githubusercontent.com/Cray-HPE/metal-cloud-init/main/config/cray.conf.j2
ncn-m001# cloud-init single --name ntp --frequency always

Fix other NCNs

The other NCNs sometimes have the wrong stratum set or are missing the initstepslew directive. Both can be corrected quickly with sed commands:

Increase the stratum on NCNs (other than ncn-m001):

ncn# sed -i "s/local stratum 3 orphan/local stratum 10 orphan/" /etc/chrony.d/cray.conf

Add the initstepslew directive on a new line after the logchange directive:

ncn# sed -i "/^\(logchange 1.0\)\$/a initstepslew 1 ncn-m001" /etc/chrony.d/cray.conf

Restart chronyd:

ncn# systemctl restart chronyd
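
Since `sed -i` edits the file in place, it can be useful to preview the result first. The sketch below applies the same two substitutions as a filter from stdin to stdout, so the output can be compared against the current file before anything is changed. It assumes GNU sed (for the one-line form of the `a` command), and the function name `preview_ncn_fix` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: apply the stratum and initstepslew fixes as a filter (no in-place edit).
# Assumes GNU sed, which accepts "a text" on a single line.
preview_ncn_fix() {
    sed -e 's/local stratum 3 orphan/local stratum 10 orphan/' \
        -e '/^logchange 1\.0$/a initstepslew 1 ncn-m001'
}

# Usage on a node (not executed here):
#   preview_ncn_fix < /etc/chrony.d/cray.conf | diff /etc/chrony.d/cray.conf -
```

Note that neither this filter nor the in-place commands check whether initstepslew is already present, so running them twice on the same file adds the directive twice.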

Customize NTP

Set A Local Timezone

This procedure needs to be completed on the PIT node before the other management nodes are deployed.

Configure NTP on PIT to Local Timezone

HPE Cray EX systems with CSM software have UTC as the default time zone. To change this, you will need to set an environment variable, as well as chroot into the node images and change some files there. You can find a list of timezones to use in the commands below by running timedatectl list-timezones.
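
A mistyped timezone name would be written into /etc/environment and the NTP script, so it may be worth validating the name first. The sketch below checks that a zone file with the given name exists. It assumes the timezone database is installed under /usr/share/zoneinfo, which is the usual location on Linux but is not stated in this procedure; the function name `tz_is_valid` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if the named timezone has a zone file.
# $1 = timezone name, for example America/Chicago
# $2 = zoneinfo directory (assumed default: /usr/share/zoneinfo)
tz_is_valid() {
    local tz="$1" zoneinfo="${2:-/usr/share/zoneinfo}"
    [ -n "$tz" ] && [ -f "${zoneinfo}/${tz}" ]
}

# Usage on the PIT node (not executed here):
#   tz_is_valid America/Chicago && echo "timezone name is valid"
```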

Run the following commands, replacing America/Chicago with your timezone as needed.

pit# export NEWTZ=America/Chicago
pit# echo -e "\nTZ=${NEWTZ}" >> /etc/environment
pit# sed -i "s#^timedatectl set-timezone UTC#timedatectl set-timezone ${NEWTZ}#" /root/bin/configure-ntp.sh
pit# sed -i 's/--utc/--localtime/' /root/bin/configure-ntp.sh
pit# /root/bin/configure-ntp.sh

The configure-ntp.sh script should have the information for your local timezone in the output.

pit# /root/bin/configure-ntp.sh

Example output:

CURRENT TIME SETTINGS
rtc: 2021-03-26 11:34:45.873331+00:00
sys: 2021-03-26 11:34:46.015647+0000
200 OK
200 OK
NEW TIME SETTINGS
rtc: 2021-03-26 06:35:16.576477-05:00
sys: 2021-03-26 06:35:17.004587-0500

Verify the new timezone setting by running timedatectl and hwclock --verbose.

pit# timedatectl
      Local time: Fri 2021-03-26 06:35:58 CDT
  Universal time: Fri 2021-03-26 11:35:58 UTC
        RTC time: Fri 2021-03-26 11:35:58
       Time zone: America/Chicago (CDT, -0500)
 Network time on: no
NTP synchronized: no
 RTC in local TZ: no

pit# hwclock --verbose
hwclock from util-linux 2.33.1
System Time: 1616758841.688220
Trying to open: /dev/rtc0
Using the rtc interface to the clock.
Last drift adjustment done at 1616758836 seconds after 1969
Last calibration done at 1616758836 seconds after 1969
Hardware clock is on local time
Assuming hardware clock is kept in local time.
Waiting for clock tick...
...got clock tick
Time read from Hardware Clock: 2021/03/26 06:40:42
Hw clock time : 2021/03/26 06:40:42 = 1616758842 seconds since 1969
Time since last adjustment is 6 seconds
Calculated Hardware Clock drift is 0.000000 seconds
2021-03-26 06:40:41.685618-05:00

If the time is off and not accurate to your timezone, you will need to manually set the date and then run the NTP script again.

Manually set the time as close as possible to the real time.
```
pit# timedatectl set-time "2021-03-26 00:00:00"
```
Run the NTP script.
```
pit# /root/bin/configure-ntp.sh
```
The PIT is now configured to your local timezone.

Configure NCN Images to Use Local Timezone

Adjust the node images so that they also boot in the local timezone. This is accomplished by chrooting into the unsquashed images, making some modifications, re-squashing them, and moving the new images into place. This is included as an optional image modification step in the two procedures below.

If the PIT node is booted, see Change NCN Image Root Password and SSH Keys on PIT Node for more information.

Note: When performing the csi handoff of NCN boot artifacts in Redeploy PIT Node, be sure to specify these new images. Otherwise, ncn-m001 will use the default timezone when it boots, and subsequent reboots of the other NCNs will also lose the customized timezone changes.

If the PIT node is not booted, see Change NCN Image Root Password and SSH Keys for more information.