The HPE Apollo 6500 XL645d Gen10 Plus compute node uses a NIC/shared iLO network port. The NIC is also referred to as the Embedded LOM (LAN On Motherboard) and is available to the booted OS. Because this shared port is plugged into a port on the TOR Ethernet switch designated for the Hardware Management Network (HMN), the NIC is assigned an IP address from the wrong pool. To prevent this, configure the iLO VLAN tag for VLAN 4 and configure the switch port that the NIC/shared iLO is plugged into to allow only VLAN 4 traffic. The NIC can then no longer communicate over the switch and will no longer obtain an IP address through DHCP.
This procedure needs to be done for each of the HPE Apollo 6500 XL645d servers
that will be managed by CSM software. River compute nodes always index their
BMCs from 1. For example, the compute BMCs in servers with more than one node
will have component names (xnames) as follows: x3000c0s30b1, x3000c0s30b2,
x3000c0s30b3, and so on. The node indicator is always a 0. For example,
x3000c0s30b1n0 or x3000c0s30b4n0.
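The naming pattern above can be sketched as a pair of small shell helpers. This is only an illustration: the function names are hypothetical and not part of CSM, and the sketch assumes the node indicator is always the trailing n0 described above. It is shown without the ncn# prompt so it can be pasted into a script.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a CSM command): derive the BMC xname from a
# node xname by stripping the trailing node indicator, e.g. "n0".
bmc_xname_from_node() {
    local node_xname="$1"
    # Remove the shortest suffix of the form n<digit>...; a BMC xname
    # with no node indicator is returned unchanged.
    printf '%s\n' "${node_xname%n[0-9]*}"
}

# Hypothetical helper: build the node xname for a BMC, assuming the
# node indicator is always 0 as stated in the text.
node_xname_from_bmc() {
    printf '%sn0\n' "$1"
}

XNAME=x3000c0s30b1n0
BMC=$(bmc_xname_from_node "$XNAME")
echo "$BMC"    # x3000c0s30b1
```

The BMC value derived this way is the same form used later when opening the SSH tunnel to the BMC web interface.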
Gather information.
The following is an example using x3000c0s30b1n0 as the target compute node
component name (xname):
ncn# XNAME=x3000c0s30b1n0
ncn# cray hsm inventory ethernetInterfaces list --component-id \
${XNAME} --format json | jq '.[]|select((.IPAddresses|length)>0)'
Example output:
{
"ID": "6805cabbc182",
"Description": "",
"MACAddress": "68:05:ca:bb:c1:82",
"LastUpdate": "2021-04-19T22:15:00.523621Z",
"ComponentID": "x3000c0s30b1n0",
"Type": "Node",
"IPAddresses": [
{
"IPAddress": "10.252.1.21"
}
]
},
{
"ID": "9440c938f7b4",
"Description": "",
"MACAddress": "94:40:c9:38:f7:b4",
"LastUpdate": "2021-05-07T18:37:59.239924Z",
"ComponentID": "x3000c0s30b1n0",
"Type": "Node",
"IPAddresses": [
{
"IPAddress": "10.254.1.38"
}
]
}
The second entry indicates that the NIC is receiving an incorrect IP address.
The 10.254.x.y address is for the HMN and should not be associated with the
node itself (x3000c0s30b1n0).
Make a note of the ID, MACAddress, and IPAddress of the entry that has the
10.254 address listed.
ncn# ID="9440c938f7b4"
ncn# MAC="94:40:c9:38:f7:b4"
ncn# IPADDR="10.254.1.38"
These will be used later to clean up KEA and Hardware State Manager (HSM).
There may not be a 10.254 address associated with the node; that is OK, and
several later steps can then be skipped.
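Before moving on, the recorded values can be sanity-checked with a short sketch like the one below. The helper names are hypothetical. The check that the ID equals the MAC address with the colons removed is an assumption drawn only from the example output above, not from documented HSM behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the address is in the 10.254.x.y range
# that the text identifies as belonging to the HMN.
is_hmn_address() {
    case "$1" in
        10.254.*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: lowercase a MAC address and strip the colons.
# ASSUMPTION: in the example output the ID field has this form.
mac_to_id() {
    printf '%s\n' "$1" | tr -d ':' | tr 'A-F' 'a-f'
}

ID="9440c938f7b4"
MAC="94:40:c9:38:f7:b4"
IPADDR="10.254.1.38"

if is_hmn_address "$IPADDR" && [ "$(mac_to_id "$MAC")" = "$ID" ]; then
    echo "recorded values are consistent"
else
    echo "re-check the recorded values" >&2
fi
```

If no entry has an HMN address, there is nothing to record and this check does not apply.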
Configure the iLO to use VLAN 4.
Connect to the BMC web user interface and log in with standard root credentials.
From the administrator’s own machine, create an SSH tunnel (-L creates the
tunnel, and -N prevents a shell and stubs the connection):
linux# BMC=x3000c0s30b1
linux# ssh -L 9443:$BMC:443 -N root@example-ncn-m001
Opening a web browser to https://localhost:9443 will give access to the BMC’s
web interface.
Log in with the root credentials.
Click on iLO Shared Network Port on the left menu.
Make a note of the MAC Address under the Information section; that will be needed later.
ncn# ILOMAC="<MAC Address>"
For example:
ncn# ILOMAC="94:40:c9:38:08:c7"
Click on General on the top menu.
Under NIC Settings, move the slider to Enable VLAN.
In the VLAN Tag box, enter 4.
Click Apply.
Click Reset iLO when it appears.
Click Yes, Reset when it appears on the right.
After accepting the BMC restart, connection to the BMC will be lost until the switch port reconfiguration is performed.
Configure the switch port for the iLO to use VLAN 4.
Find the port and the switch the iLO is plugged into using the SHCD.
SSH to the switch and log in with standard admin credentials. Refer to /etc/hosts for the exact hostname.
Verify the MAC address on the port.
Example using port number 46.
sw-leaf-001# show mac-address-table | include 1/1/46
Example output:
94:40:c9:38:08:c7 4 dynamic 1/1/46
Make sure the MAC address shown for that port matches the ILOMAC address noted in step 2.3 from the Information section of the WebUI.
NOTE: If the MAC address is not correct, double check the server cabling and SHCD for the correct port then start this section over. Do not move on until the ILOMAC address has been found on the switch at the expected port.
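The comparison described above can be sketched in shell for anyone who saves the switch output to a file or variable. The helper name is hypothetical, and the sketch assumes the column order shown in the example output (MAC address, VLAN, type, port). It compares MAC addresses case-insensitively and requires an exact port match.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a "show mac-address-table" line lists
# the expected MAC address on the expected port.
# ASSUMPTION: columns are MAC, VLAN, type, port, as in the example output.
mac_on_port() {
    local line="$1" want_mac="$2" want_port="$3"
    printf '%s\n' "$line" | awk -v mac="$want_mac" -v port="$want_port" '
        tolower($1) == tolower(mac) && $4 == port { found = 1 }
        END { exit found ? 0 : 1 }'
}

ILOMAC="94:40:c9:38:08:c7"
LINE="94:40:c9:38:08:c7 4 dynamic 1/1/46"

if mac_on_port "$LINE" "$ILOMAC" "1/1/46"; then
    echo "MAC matches; safe to configure the port"
else
    echo "MAC mismatch; re-check the cabling and the SHCD" >&2
fi
```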
Configure the port if the MAC address is correct.
Example using port number 46.
Configure the port.
sw-leaf-001# configure t
sw-leaf-001(config)# int 1/1/46
sw-leaf-001(config-if)# vlan trunk allowed 4
sw-leaf-001(config-if)# write mem
Example output:
Copying configuration: [Success]
Exit configure mode.
sw-leaf-001(config-if)# exit
sw-leaf-001(config)# exit
Verify the settings.
sw-leaf-001# show running-config interface 1/1/46
Example output:
interface 1/1/46
no shutdown
mtu 9198
description dl645d
no routing
vlan trunk native 1
vlan trunk allowed 4
spanning-tree bpdu-guard
spanning-tree port-type admin-edge
exit
After a few minutes the switch will be configured and access to the WebUI will be regained.
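The verification step can also be checked mechanically against saved output. The following is a minimal sketch, assuming the running configuration has been captured as text in the format shown in the example output. The helper name is hypothetical. It succeeds only when a line reads exactly "vlan trunk allowed 4", so a broader list such as "4,7" is rejected.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read an interface configuration on stdin and
# succeed only if one line is exactly "vlan trunk allowed <vlan>".
trunk_allows_only() {
    local vlan="$1"
    awk -v want="vlan trunk allowed ${vlan}" '
        { gsub(/^[ \t]+|[ \t]+$/, "") }   # trim surrounding whitespace
        $0 == want { ok = 1 }
        END { exit ok ? 0 : 1 }'
}

# Sample text copied from the example output above.
CONFIG='interface 1/1/46
    no shutdown
    mtu 9198
    description dl645d
    no routing
    vlan trunk native 1
    vlan trunk allowed 4
    spanning-tree bpdu-guard
    spanning-tree port-type admin-edge
    exit'

if printf '%s\n' "$CONFIG" | trunk_allows_only 4; then
    echo "port allows only VLAN 4"
else
    echo "port is not restricted to VLAN 4" >&2
fi
```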
Clear bad MAC address and IP address out of KEA.
NOTE: Skip this section if there was no bad MAC address and IP address found in section 1.
Retrieve an API token.
ncn# TOKEN=$(curl -s -S -d grant_type=client_credentials \
-d client_id=admin-client \
-d client_secret=`kubectl get secrets admin-client-auth -o jsonpath='{.data.client-secret}' | base64 -d` \
https://api-gw-service-nmn.local/keycloak/realms/shasta/protocol/openid-connect/token | jq -r '.access_token')
Remove the entry from KEA that is associated with the MAC address and IP address gathered previously.
ncn# curl -s -k -H "Authorization: Bearer ${TOKEN}" -X POST \
-H "Content-Type: application/json" \
-d '{"command": "lease4-del", "service": [ "dhcp4" ], "arguments": {"hw-address": "'"${MAC}"'", "ip-address": "'"${IPADDR}"'"}}' \
https://api-gw-service-nmn.local/apis/dhcp-kea
Expected results:
[ { "result": 0, "text": "IPv4 lease deleted." } ]
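Building the request body separately keeps the shell quoting simple and makes it possible to inspect the JSON before sending it. The sketch below uses a hypothetical helper name. The field names and values are taken from the command above, and nothing here contacts KEA.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the lease4-del request body for a given
# MAC address and IP address, using the same fields as the curl command.
kea_lease_del_payload() {
    local mac="$1" ip="$2"
    printf '{"command": "lease4-del", "service": [ "dhcp4" ], "arguments": {"hw-address": "%s", "ip-address": "%s"}}\n' "$mac" "$ip"
}

MAC="94:40:c9:38:f7:b4"
IPADDR="10.254.1.38"

# Print the body for review; it could then be passed to curl with
# -d "$(kea_lease_del_payload "$MAC" "$IPADDR")".
kea_lease_del_payload "$MAC" "$IPADDR"
```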
Clear bad ID out of HSM.
NOTE: Skip this section if there was no bad ID found in section 1.
Tell HSM to delete the bad ID out of the Ethernet Interfaces table.
ncn# cray hsm inventory ethernetInterfaces delete $ID --format json
Expected results:
{
"code": 0,
"message": "deleted 1 entry"
}
Everything is now configured and the CSM software will automatically discover the node after several minutes. After it has been discovered, the node is ready to be booted.
The BIOS time for Gigabyte compute nodes must be synchronized with the rest of the system. See Update the Gigabyte Node BIOS Time.
After completing the preparation for compute nodes, the CSM product stream has been fully installed and configured.
See Next topic for more information on other product streams to be installed and configured after CSM.