Prepare Compute Nodes

  1. Configure HPE Apollo 6500 XL645d Gen10 Plus Compute Nodes
    1. Gather information
    2. Configure the iLO to use VLAN 4
    3. Configure the switch port for the iLO to use VLAN 4
    4. Clear bad MAC and IP address out of KEA
    5. Clear bad ID out of HSM
    6. Update the BIOS Time on Gigabyte Compute Nodes
  2. Next topic

Configure HPE Apollo 6500 XL645d Gen10 Plus Compute Nodes

The HPE Apollo 6500 XL645d Gen10 Plus compute node uses a NIC/shared iLO network port. The NIC is also referred to as the Embedded LOM (LAN On Motherboard) and is available to the booted OS. Because this shared port is plugged into a port on the TOR Ethernet switch designated for the Hardware Management Network (HMN), the NIC gets an IP address assigned to it from the wrong pool. To prevent this, the iLO VLAN tag must be set to VLAN 4, and the switch port that the NIC/shared iLO is plugged into must be configured to allow only VLAN 4 traffic. The NIC can then no longer communicate over the switch and will no longer obtain an IP address through DHCP.

This procedure needs to be done for each of the HPE Apollo 6500 XL645d servers that will be managed by CSM software. River compute nodes always index their BMCs from 1. For example, the compute BMCs in servers with more than one node will have component names (xnames) as follows: x3000c0s30b1, x3000c0s30b2, x3000c0s30b3, and so on. The node indicator is always a 0. For example, x3000c0s30b1n0 or x3000c0s30b4n0.
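Later steps use both the node xname (for the HSM query) and the BMC xname (for the SSH tunnel). Since the node indicator is always 0 on these servers, the BMC xname can be derived by stripping the trailing node suffix. The following is a minimal sketch; the function name `bmc_xname_from_node` is hypothetical and is not part of CSM.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CSM): derive the BMC xname from a node
# xname by removing the trailing node indicator,
# for example x3000c0s30b1n0 -> x3000c0s30b1.
bmc_xname_from_node() {
    local node_xname="$1"
    # Remove the shortest suffix that looks like n<digit>...
    printf '%s\n' "${node_xname%n[0-9]*}"
}

XNAME=x3000c0s30b1n0
BMC=$(bmc_xname_from_node "$XNAME")
echo "$BMC"
```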

1. Gather information

Gather the Ethernet interface information that HSM has recorded for the compute node.

The following is an example using x3000c0s30b1n0 as the target compute node component name (xname):

ncn# XNAME=x3000c0s30b1n0
ncn# cray hsm inventory ethernetInterfaces list --component-id \
        ${XNAME} --format json | jq '.[]|select((.IPAddresses|length)>0)'

Example output:

{
    "ID": "6805cabbc182",
    "Description": "",
    "MACAddress": "68:05:ca:bb:c1:82",
    "LastUpdate": "2021-04-19T22:15:00.523621Z",
    "ComponentID": "x3000c0s30b1n0",
    "Type": "Node",
    "IPAddresses": [
        {
            "IPAddress": "10.252.1.21"
        }
    ]
}
{
    "ID": "9440c938f7b4",
    "Description": "",
    "MACAddress": "94:40:c9:38:f7:b4",
    "LastUpdate": "2021-05-07T18:37:59.239924Z",
    "ComponentID": "x3000c0s30b1n0",
    "Type": "Node",
    "IPAddresses": [
        {
            "IPAddress": "10.254.1.38"
        }
    ]
}

The second entry indicates that the NIC is receiving an incorrect IP address. The 10.254.x.y address is for the HMN and should not be associated with the node itself (x3000c0s30b1n0).

Make a note of the ID, MACAddress, and IPAddress of the entry that has the 10.254 address listed.

ncn# ID="9440c938f7b4"
ncn# MAC="94:40:c9:38:f7:b4"
ncn# IPADDR="10.254.1.38"

These will be used later to clean up KEA and Hardware State Manager (HSM). If no 10.254 address is associated with the node, that is OK; the KEA and HSM cleanup sections later in this procedure can then be skipped.
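The check described in this section can be sketched as two small shell helpers. Both function names are hypothetical and are not part of CSM. The first relies on the 10.254.x.y range named above for the HMN. The second relies on a pattern seen only in the example output (the ID looks like the MAC address with the colons removed), so treat it as an assumption and not as a documented guarantee.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of CSM).

# Succeeds if the address is in the 10.254.x.y range that this procedure
# treats as belonging to the HMN.
is_hmn_address() {
    case "$1" in
        10.254.*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumption drawn from the example output only: the HSM ID is the MAC
# address in lowercase with the colons removed.
mac_to_hsm_id() {
    printf '%s\n' "$1" | tr -d ':' | tr 'A-F' 'a-f'
}

for ip in 10.252.1.21 10.254.1.38; do
    if is_hmn_address "$ip"; then
        echo "$ip: HMN address, should not be associated with the node"
    else
        echo "$ip: not an HMN address"
    fi
done

mac_to_hsm_id "94:40:c9:38:f7:b4"
```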

2. Configure iLO

Configure the iLO to use VLAN 4.

  1. Connect to BMC web user interface and log in with standard root credentials.

    1. From the administrator’s own machine, create an SSH tunnel (-L forwards local port 9443 to port 443 on the BMC, and -N keeps the session from running a remote command):

      linux# BMC=x3000c0s30b1
      linux# ssh -L 9443:$BMC:443 -N root@example-ncn-m001
      
    2. Open a web browser to https://localhost:9443 to access the BMC’s web interface.

    3. Log in with the root credentials.

  2. Click on iLO Shared Network Port in the left menu.

  3. Make a note of the MAC Address under the Information section; that will be needed later.

    ncn# ILOMAC="<MAC Address>"
    

    For example:

    ncn# ILOMAC="94:40:c9:38:08:c7"
    
  4. Click on General in the top menu.

  5. Under NIC Settings, move the slider to Enable VLAN.

  6. In the VLAN Tag box, enter 4.

  7. Click Apply.

  8. Click Reset iLO when it appears.

  9. Click Yes, Reset when it appears on the right.

  10. After accepting the BMC restart, the connection to the BMC will be lost until the switch port is reconfigured.

3. Configure switch port

Configure the switch port for the iLO to use VLAN 4.

  1. Find the port and the switch the iLO is plugged into using the SHCD.

  2. SSH to the switch and log in with standard admin credentials. Refer to /etc/hosts for exact hostname.

  3. Verify the MAC address on the port.

    Example using port number 46.

    sw-leaf-001# show mac-address-table | include 1/1/46
    

    Example output:

    94:40:c9:38:08:c7    4        dynamic                   1/1/46
    

    Make sure the MAC address shown for that port matches the ILOMAC address noted in step 2.3 from the Information section of the WebUI.

    NOTE: If the MAC address is not correct, double check the server cabling and SHCD for the correct port then start this section over. Do not move on until the ILOMAC address has been found on the switch at the expected port.

  4. Configure the port if the MAC address is correct.

    Example using port number 46.

    1. Configure the port.

      sw-leaf-001# configure t
      sw-leaf-001(config)# int 1/1/46
      sw-leaf-001(config-if)# vlan trunk allowed 4
      sw-leaf-001(config-if)# write mem
      

      Example output:

      Copying configuration: [Success]
      
    2. Exit configure mode.

      sw-leaf-001(config-if)# exit
      sw-leaf-001(config)# exit
      
  5. Verify the settings.

    sw-leaf-001# show running-config interface 1/1/46
    

    Example output:

    interface 1/1/46
        no shutdown
        mtu 9198
        description dl645d
        no routing
        vlan trunk native 1
        vlan trunk allowed 4
        spanning-tree bpdu-guard
        spanning-tree port-type admin-edge
        exit
    

    After a few minutes the switch will be configured and access to the WebUI will be regained.
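The two checks in this section (the MAC address on the port, and the allowed VLAN in the running configuration) can be sketched as shell helpers that inspect text already copied from the switch. The function names are hypothetical; they are not part of CSM or the switch CLI, and they assume the output layouts shown in the examples above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of CSM or the switch CLI). They only
# inspect text that was already captured from the switch.

# Succeeds if the first field of a "show mac-address-table" line equals the
# expected MAC address, ignoring letter case.
port_has_mac() {
    local line="$1" expected="$2" found
    found=$(printf '%s\n' "$line" | awk '{print tolower($1)}')
    [ "$found" = "$(printf '%s' "$expected" | tr 'A-F' 'a-f')" ]
}

# Succeeds if the running-config text on stdin contains a line that reads
# exactly "vlan trunk allowed <vlan>" (ignoring surrounding whitespace).
port_allows_only_vlan() {
    local vlan="$1"
    grep -Eq "^[[:space:]]*vlan trunk allowed ${vlan}[[:space:]]*$"
}

ILOMAC="94:40:c9:38:08:c7"
line="94:40:c9:38:08:c7    4        dynamic                   1/1/46"
if port_has_mac "$line" "$ILOMAC"; then
    echo "MAC matches; safe to configure the port"
else
    echo "MAC mismatch; recheck the cabling and the SHCD"
fi
```

To check the result of the final verification step, the saved output of `show running-config interface 1/1/46` can be piped into `port_allows_only_vlan 4`.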

4. Clean up KEA

Clear bad MAC address and IP address out of KEA.

NOTE: Skip this section if there was no bad MAC address and IP address found in section 1.

  1. Retrieve an API token.

    ncn# TOKEN=$(curl -s -S -d grant_type=client_credentials \
                  -d client_id=admin-client \
                  -d client_secret=`kubectl get secrets admin-client-auth -o jsonpath='{.data.client-secret}' | base64 -d` \
                  https://api-gw-service-nmn.local/keycloak/realms/shasta/protocol/openid-connect/token | jq -r '.access_token')
    
  2. Remove the entry from KEA that is associated with the MAC address and IP address gathered previously.

    ncn# curl -s -k -H "Authorization: Bearer ${TOKEN}" -X POST \
             -H "Content-Type: application/json" \
             -d '{"command": "lease4-del",
                  "service": [ "dhcp4" ],
                  "arguments": {"hw-address": "'"${MAC}"'",
                                "ip-address": "'"${IPADDR}"'"}}' \
             https://api-gw-service-nmn.local/apis/dhcp-kea
    

    Expected results:

    [ { "result": 0, "text": "IPv4 lease deleted." } ]
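Getting the quoting right for inline JSON in a shell command is error-prone, so one option is to build the request body separately. The following is a minimal sketch; the function name `kea_lease4_del_payload` is hypothetical and is not part of CSM or KEA, and the MAC and IP values are the ones from the example in section 1.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CSM or KEA): print the lease4-del request
# body for one MAC/IP pair. Using printf keeps the JSON quoting in one place.
kea_lease4_del_payload() {
    local mac="$1" ip="$2"
    printf '{"command": "lease4-del", "service": [ "dhcp4" ], "arguments": {"hw-address": "%s", "ip-address": "%s"}}\n' \
        "$mac" "$ip"
}

MAC="94:40:c9:38:f7:b4"
IPADDR="10.254.1.38"
kea_lease4_del_payload "$MAC" "$IPADDR"
```

The printed body could then be passed to curl as `-d "$(kea_lease4_del_payload "$MAC" "$IPADDR")"` in place of the inline JSON.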
    

5. Clean up HSM

Clear bad ID out of HSM.

NOTE: Skip this section if there was no bad ID found in section 1.

Tell HSM to delete the bad ID out of the Ethernet Interfaces table.

ncn# cray hsm inventory ethernetInterfaces delete $ID --format json

Expected results:

{
   "code": 0,
   "message": "deleted 1 entry"
}

Everything is now configured and the CSM software will automatically discover the node after several minutes. After it has been discovered, the node is ready to be booted.

6. Update the BIOS time on Gigabyte compute nodes

The BIOS time for Gigabyte compute nodes must be synchronized with the rest of the system. See Update the Gigabyte Node BIOS Time.

Next topic

After completing the preparation for compute nodes, the CSM product stream has been fully installed and configured.

See Next topic for more information on other product streams to be installed and configured after CSM.