This page details the manual procedure to configure and verify BGP neighbors on the management switches.
You will not have BGP peers until CSM install.sh has run, because that is where MetalLB is deployed.
To check the status of the BGP neighbors, run show bgp ipv4 unicast summary on Aruba/HPE switches and show ip bgp summary on Mellanox switches.
Running clear ip bgp all on the Mellanox and clear bgp * on the Arubas will restart the BGP process. This may need to be done when a system is reinstalled or when a worker node is rebuilt.
CSI CLI arguments with --bgp-peers aggregation:
linux# export IPMI_PASSWORD=changeme
linux# csi config init --bootstrap-ncn-bmc-user root --bootstrap-ncn-bmc-pass $IPMI_PASSWORD --ntp-pools cfntp-4-1.us.cray.com,cfntp-4-2.us.cray.com --can-external-dns 10.103.8.113 --can-gateway 10.103.8.1 --site-ip 172.30.56.2/24 --site-gw 172.30.48.1 --site-dns 172.30.84.40 --site-nic em1 --system-name odin --bgp-peers aggregation
For Mellanox there is a script, mellanox_set_bgp_peers.py, and for Aruba there is CANU (Cray Automated Network Utility).
For these tools to work, the commands described below need to be applied on the switches first.
For Aruba, the following command needs to be run on the switches so that the automated process can work:
sw-spine-001(config)# https-server rest access-mode read-write
Run CANU.
CANU requires three parameters: the IP address of switch 1, the IP address of switch 2, and the path to the directory containing the file sls_input_file.json.
The IP addresses in this example should be replaced by the IP addresses of the switches. Make sure the $SLS_PATH variable is set to the correct directory.
pit# SYSTEM_NAME=eniac
pit# SLS_PATH="/var/www/ephemeral/prep/${SYSTEM_NAME}/"
pit# canu -s 1.5 config bgp --ips 10.252.0.2,10.252.0.3 --csi-folder "${SLS_PATH}"
For Mellanox, the following commands need to be run on the switches so that the automated process can work:
sw-spine-001 [standalone: master] > enable
sw-spine-001 [standalone: master] # configure terminal
sw-spine-001 [standalone: master] (config) # json-gw enable
Run the BGP helper script.
The BGP helper script requires three parameters: the IP address of switch 1, the IP address of switch 2, and the path to the CSI-generated network files: CAN.yaml, HMN.yaml, HMNLB.yaml, NMNLB.yaml, and NMN.yaml. The path must include the $SYSTEM_NAME.
The IP addresses in this example should be replaced by the IP addresses of the switches. Make sure the $CSI_PATH variable is set to the correct directory.
pit# SYSTEM_NAME=eniac
pit# CSI_PATH="/var/www/ephemeral/prep/${SYSTEM_NAME}/networks/"
pit# /usr/local/bin/mellanox_set_bgp_peers.py 10.252.0.2 10.252.0.3 "${CSI_PATH}"
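Before running the helper script, it can be useful to confirm that the directory actually contains the network files it expects. The following is a minimal sketch, not part of the helper script; the function name missing_network_files is hypothetical, and the file names are the ones listed above.

```python
from pathlib import Path

# File names taken from the list of CSI-generated network files above.
REQUIRED_FILES = ("CAN.yaml", "HMN.yaml", "HMNLB.yaml", "NMNLB.yaml", "NMN.yaml")


def missing_network_files(csi_path: str) -> list:
    """Return the required file names that are absent from csi_path."""
    directory = Path(csi_path)
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]


# Example path from this page; replace eniac with the real system name.
missing = missing_network_files("/var/www/ephemeral/prep/eniac/networks/")
print("missing files:", ", ".join(missing) if missing else "none")
```

An empty result means all five files are present and the path is likely correct.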
*WARNING*
The mellanox_set_bgp_peers.py script assumes that the prefix length of the CAN is /24. If that value is incorrect for the system being installed, then update the script with the correct prefix length by editing the following line.
cmd_prefix_list_can = "ip prefix-list pl-can seq 30 permit {} /24 ge 24".format()
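To see what the edited line should produce for a different prefix length, the command string can be derived from the CAN subnet itself. This is an illustrative sketch only and is not the code used by mellanox_set_bgp_peers.py; the function name and the choice to set the ge value equal to the prefix length are assumptions that mirror the /24 ge 24 pattern in the line above.

```python
import ipaddress


def can_prefix_list_command(can_cidr: str, seq: int = 30) -> str:
    """Build a pl-can prefix-list line from a CAN subnet in CIDR notation."""
    # strict=False tolerates host bits being set, e.g. 10.102.4.1/24.
    network = ipaddress.ip_network(can_cidr, strict=False)
    length = network.prefixlen
    # Assumption: "ge" follows the prefix length, as in the "/24 ge 24" original.
    return (
        f"ip prefix-list pl-can seq {seq} permit "
        f"{network.network_address} /{length} ge {length}"
    )


print(can_prefix_list_command("10.102.4.0/24"))
print(can_prefix_list_command("10.102.4.0/25"))
```

Comparing the printed line for the real CAN subnet against the line in the script shows which numbers need to change.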
After following the previous steps, verify the configuration and verify that the BGP peers are ESTABLISHED. If it is early in the install process and the CSM services have not been deployed yet, there will not be speakers to peer with, so the peering sessions may not be ESTABLISHED yet. This is expected and not a problem.
On the Aruba switches, the output of the show bgp ipv4 unicast summary command should look like the following if the MetalLB speaker pods are running. If it is early in the install process and the CSM services have not been deployed yet, you may see the neighbors in Idle, Active, or Connect state. This is expected and not a problem.
sw-spine-001# show bgp ipv4 unicast summary
VRF : default
BGP Summary
-----------
Local AS : 65533 BGP Router Identifier : 10.252.0.1
Peers : 4 Log Neighbor Changes : No
Cfg. Hold Time : 180 Cfg. Keep Alive : 60
Neighbor Remote-AS MsgRcvd MsgSent Up/Down Time State AdminStatus
10.252.0.3 65533 31457 31474 00m:02w:04d Established Up
10.252.2.8 65533 54730 62906 00m:02w:04d Established Up
10.252.2.9 65533 54732 62927 00m:02w:04d Established Up
10.252.2.18 65533 54732 62911 00m:02w:04d Established Up
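When this output is captured to a file or a variable, the check for non-Established peers can be scripted. The sketch below is a hypothetical helper, not an Aruba or CANU tool. It assumes the column layout shown above, with the state in the second-to-last column, and the Idle row in the sample is invented for illustration.

```python
import ipaddress


def aruba_peers_not_established(summary: str) -> dict:
    """Return {neighbor_ip: state} for peers whose state is not Established."""
    problems = {}
    for line in summary.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue
        try:
            ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # header or banner line, not a neighbor row
        state = fields[-2]  # columns end with: State AdminStatus
        if state != "Established":
            problems[fields[0]] = state
    return problems


sample = """\
Neighbor Remote-AS MsgRcvd MsgSent Up/Down Time State AdminStatus
10.252.0.3 65533 31457 31474 00m:02w:04d Established Up
10.252.2.8 65533 0 0 00h:00m:00s Idle Up
"""
print(aruba_peers_not_established(sample))
```

An empty dictionary means every listed neighbor is Established.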
On the Mellanox switches, you must first run the switch commands listed for Mellanox in the automated process above. The output of the show ip bgp summary command should look like the following if the MetalLB speaker pods are running. If it is early in the install process and the CSM services have not been deployed yet, you may see the neighbors in IDLE, ACTIVE, or CONNECT state. This is expected and not a problem.
sw-spine-001 [standalone: master] # show ip bgp summary
VRF name : default
BGP router identifier : 10.252.0.1
local AS number : 65533
BGP table version : 308
Main routing table version: 308
IPV4 Prefixes : 261
IPV6 Prefixes : 0
L2VPN EVPN Prefixes : 0
------------------------------------------------------------------------------------------------------------------
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
------------------------------------------------------------------------------------------------------------------
10.252.0.7 4 65533 37421 42948 308 0 0 12:23:16:07 ESTABLISHED/53
10.252.0.8 4 65533 37421 42920 308 0 0 12:23:16:07 ESTABLISHED/51
10.252.0.9 4 65533 37420 42962 308 0 0 12:23:16:07 ESTABLISHED/51
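The Mellanox output can be checked the same way. This is a minimal sketch under the assumption that the last column always has the State/PfxRcd form shown above; the function name is hypothetical and the IDLE row in the sample is invented for illustration.

```python
import ipaddress


def mellanox_peers_not_established(summary: str) -> dict:
    """Return {neighbor_ip: state} for peers whose state is not ESTABLISHED."""
    problems = {}
    for line in summary.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue
        try:
            ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # header, separator, or banner line
        state = fields[-1].split("/")[0]  # "ESTABLISHED/53" -> "ESTABLISHED"
        if state.upper() != "ESTABLISHED":
            problems[fields[0]] = state
    return problems


sample = """\
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
10.252.0.7 4 65533 37421 42948 308 0 0 12:23:16:07 ESTABLISHED/53
10.252.0.8 4 65533 0 0 308 0 0 never IDLE/0
"""
print(mellanox_peers_not_established(sample))
```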
If the BGP neighbors are not in the ESTABLISHED state, make sure the IP addresses are correct for the route-map and BGP configuration and that the MetalLB speaker pods are running on all of the workers.
The worker IP addresses needed for the switch configuration are in the CSI-generated files (NMN.yaml, CAN.yaml, HMN.yaml); these IP addresses are also located in /etc/dnsmasq.d/statics.conf on the PIT node.
pit# grep w00 /etc/dnsmasq.d/statics.conf | grep nmn
host-record=ncn-w003,ncn-w003.nmn,10.252.1.13
host-record=ncn-w002,ncn-w002.nmn,10.252.1.14
host-record=ncn-w001,ncn-w001.nmn,10.252.1.15
The Bond0 Mac0/Mac1 entry is for the NMN.
pit# grep ncn-w /etc/dnsmasq.d/statics.conf | egrep "Bond0|HMN|CAN" | grep -v mgmt
dhcp-host=50:6b:4b:08:d0:4a,10.252.1.13,ncn-w003,20m # Bond0 Mac0/Mac1
dhcp-host=50:6b:4b:08:d0:4a,10.254.1.20,ncn-w003,20m # HMN
dhcp-host=50:6b:4b:08:d0:4a,10.102.4.12,ncn-w003,20m # CAN
dhcp-host=98:03:9b:0f:39:4a,10.252.1.14,ncn-w002,20m # Bond0 Mac0/Mac1
dhcp-host=98:03:9b:0f:39:4a,10.254.1.22,ncn-w002,20m # HMN
dhcp-host=98:03:9b:0f:39:4a,10.102.4.13,ncn-w002,20m # CAN
dhcp-host=98:03:9b:bb:a9:94,10.252.1.15,ncn-w001,20m # Bond0 Mac0/Mac1
dhcp-host=98:03:9b:bb:a9:94,10.254.1.24,ncn-w001,20m # HMN
dhcp-host=98:03:9b:bb:a9:94,10.102.4.14,ncn-w001,20m # CAN
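The grep output above can be turned into a per-worker table of NMN, HMN, and CAN addresses, which is the information the route-maps need. The following sketch assumes the dhcp-host line format shown above, including the trailing comment that names the network; the function and constant names are hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as:
# dhcp-host=50:6b:4b:08:d0:4a,10.252.1.13,ncn-w003,20m # Bond0 Mac0/Mac1
DHCP_HOST = re.compile(
    r"^dhcp-host=(?P<mac>[0-9a-fA-F:]+),(?P<ip>[\d.]+),"
    r"(?P<host>[\w-]+),\S+\s+#\s*(?P<label>.+)$"
)
# The Bond0 Mac0/Mac1 entry is the NMN address.
LABEL_TO_NETWORK = {"Bond0 Mac0/Mac1": "NMN", "HMN": "HMN", "CAN": "CAN"}


def worker_addresses(statics_text: str) -> dict:
    """Return {hostname: {network: ip}} for worker NCN dhcp-host entries."""
    workers = defaultdict(dict)
    for line in statics_text.splitlines():
        match = DHCP_HOST.match(line.strip())
        if not match:
            continue
        network = LABEL_TO_NETWORK.get(match["label"].strip())
        if network and match["host"].startswith("ncn-w"):
            workers[match["host"]][network] = match["ip"]
    return dict(workers)


sample = """\
dhcp-host=50:6b:4b:08:d0:4a,10.252.1.13,ncn-w003,20m # Bond0 Mac0/Mac1
dhcp-host=50:6b:4b:08:d0:4a,10.254.1.20,ncn-w003,20m # HMN
dhcp-host=50:6b:4b:08:d0:4a,10.102.4.12,ncn-w003,20m # CAN
"""
print(worker_addresses(sample))
```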
sw-spine-001# configure terminal
sw-spine-001(config)# no router bgp 65533
This will delete all BGP configurations on this device.
Continue (y/n)? y
sw-spine-001(config)# no route-map ncn-w003
sw-spine-001(config)# no route-map ncn-w002
sw-spine-001(config)# no route-map ncn-w001
ip prefix-list pl-can seq 10 permit 193.167.208.0/25 ge 24
ip prefix-list pl-hmn seq 20 permit 10.94.100.0/24 ge 24
ip prefix-list pl-nmn seq 30 permit 10.92.100.0/24 ge 24
ip prefix-list tftp seq 10 permit 10.92.100.60/32 ge 32 le 32
route-map ncn-w001 permit seq 10
match ip address prefix-list tftp
match ip next-hop 10.252.1.7
set local-preference 1000
route-map ncn-w001 permit seq 20
match ip address prefix-list tftp
match ip next-hop 10.252.1.8
set local-preference 1100
route-map ncn-w001 permit seq 30
match ip address prefix-list tftp
match ip next-hop 10.252.1.9
set local-preference 1200
route-map ncn-w001 permit seq 40
match ip address prefix-list pl-can
set ip next-hop 10.103.10.10
route-map ncn-w001 permit seq 50
match ip address prefix-list pl-hmn
set ip next-hop 10.254.1.14
route-map ncn-w001 permit seq 60
match ip address prefix-list pl-nmn
set ip next-hop 10.252.1.9
route-map ncn-w002 permit seq 10
match ip address prefix-list tftp
match ip next-hop 10.252.1.7
set local-preference 1000
route-map ncn-w002 permit seq 20
match ip address prefix-list tftp
match ip next-hop 10.252.1.8
set local-preference 1100
route-map ncn-w002 permit seq 30
match ip address prefix-list tftp
match ip next-hop 10.252.1.9
set local-preference 1200
route-map ncn-w002 permit seq 40
match ip address prefix-list pl-can
set ip next-hop 10.103.10.9
route-map ncn-w002 permit seq 50
match ip address prefix-list pl-hmn
set ip next-hop 10.254.1.12
route-map ncn-w002 permit seq 60
match ip address prefix-list pl-nmn
set ip next-hop 10.252.1.8
route-map ncn-w003 permit seq 10
match ip address prefix-list tftp
match ip next-hop 10.252.1.7
set local-preference 1000
route-map ncn-w003 permit seq 20
match ip address prefix-list tftp
match ip next-hop 10.252.1.8
set local-preference 1100
route-map ncn-w003 permit seq 30
match ip address prefix-list tftp
match ip next-hop 10.252.1.9
set local-preference 1200
route-map ncn-w003 permit seq 40
match ip address prefix-list pl-can
set ip next-hop 10.103.10.8
route-map ncn-w003 permit seq 50
match ip address prefix-list pl-hmn
set ip next-hop 10.254.1.10
route-map ncn-w003 permit seq 60
match ip address prefix-list pl-nmn
set ip next-hop 10.252.1.7
!
router ospfv3 1
area 0.0.0.0
router bgp 65533
distance bgp 20 70
bgp router-id 10.252.0.2
maximum-paths 8
neighbor 10.252.0.3 remote-as 65533
neighbor 10.252.1.7 remote-as 65533
neighbor 10.252.1.7 passive
neighbor 10.252.1.8 remote-as 65533
neighbor 10.252.1.8 passive
neighbor 10.252.1.9 remote-as 65533
neighbor 10.252.1.9 passive
address-family ipv4 unicast
neighbor 10.252.0.3 activate
neighbor 10.252.1.7 activate
neighbor 10.252.1.7 route-map ncn-w003 in
neighbor 10.252.1.8 activate
neighbor 10.252.1.8 route-map ncn-w002 in
neighbor 10.252.1.9 activate
neighbor 10.252.1.9 route-map ncn-w001 in
exit-address-family
sw-spine-001 [standalone: master] #
sw-spine-001 [standalone: master] # conf t
sw-spine-001 [standalone: master] (config) # no router bgp 65533
sw-spine-001 [standalone: master] (config) # no route-map ncn-w001
sw-spine-001 [standalone: master] (config) # no route-map ncn-w002
sw-spine-001 [standalone: master] (config) # no route-map ncn-w003
## Route-maps configuration
##
route-map rm-ncn-w001 permit 10 match ip address pl-nmn
route-map rm-ncn-w001 permit 10 set ip next-hop 10.252.0.7
route-map rm-ncn-w001 permit 20 match ip address pl-hmn
route-map rm-ncn-w001 permit 20 set ip next-hop 10.254.0.7
route-map rm-ncn-w001 permit 30 match ip address pl-can
route-map rm-ncn-w001 permit 30 set ip next-hop 10.103.8.7
route-map rm-ncn-w002 permit 10 match ip address pl-nmn
route-map rm-ncn-w002 permit 10 set ip next-hop 10.252.0.8
route-map rm-ncn-w002 permit 20 match ip address pl-hmn
route-map rm-ncn-w002 permit 20 set ip next-hop 10.254.0.8
route-map rm-ncn-w002 permit 30 match ip address pl-can
route-map rm-ncn-w002 permit 30 set ip next-hop 10.103.8.8
route-map rm-ncn-w003 permit 10 match ip address pl-nmn
route-map rm-ncn-w003 permit 10 set ip next-hop 10.252.0.9
route-map rm-ncn-w003 permit 20 match ip address pl-hmn
route-map rm-ncn-w003 permit 20 set ip next-hop 10.254.0.9
route-map rm-ncn-w003 permit 30 match ip address pl-can
route-map rm-ncn-w003 permit 30 set ip next-hop 10.103.8.9
##
## BGP configuration
##
protocol bgp
router bgp 65533 vrf default
router bgp 65533 vrf default router-id 10.252.0.1 force
router bgp 65533 vrf default maximum-paths ibgp 32
router bgp 65533 vrf default neighbor 10.252.0.7 remote-as 65533
router bgp 65533 vrf default neighbor 10.252.0.7 route-map ncn-w001
router bgp 65533 vrf default neighbor 10.252.0.7 transport connection-mode passive
router bgp 65533 vrf default neighbor 10.252.0.8 remote-as 65533
router bgp 65533 vrf default neighbor 10.252.0.8 route-map ncn-w002
router bgp 65533 vrf default neighbor 10.252.0.8 transport connection-mode passive
router bgp 65533 vrf default neighbor 10.252.0.9 remote-as 65533
router bgp 65533 vrf default neighbor 10.252.0.9 route-map ncn-w003
router bgp 65533 vrf default neighbor 10.252.0.9 transport connection-mode passive
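Because the Mellanox route-map entries follow a fixed pattern per worker, they can be generated from the worker's NMN, HMN, and CAN addresses rather than typed by hand. This is a sketch of that pattern only; the function name is hypothetical, and the route-map name passed in must match the name referenced by the neighbor statements in the BGP configuration.

```python
def mellanox_route_map_lines(name: str, nmn_ip: str, hmn_ip: str, can_ip: str) -> list:
    """Return the six route-map lines for one worker, in sequence order."""
    steps = (
        (10, "pl-nmn", nmn_ip),
        (20, "pl-hmn", hmn_ip),
        (30, "pl-can", can_ip),
    )
    lines = []
    for seq, prefix_list, next_hop in steps:
        lines.append(f"route-map {name} permit {seq} match ip address {prefix_list}")
        lines.append(f"route-map {name} permit {seq} set ip next-hop {next_hop}")
    return lines


# Addresses copied from the rm-ncn-w001 example above.
for line in mellanox_route_map_lines("rm-ncn-w001", "10.252.0.7", "10.254.0.7", "10.103.8.7"):
    print(line)
```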
Restart the BGP process by running clear ip bgp all on the Mellanox and clear bgp * on the Arubas. (This may need to be done multiple times for all the peers to come up.)
In the MetalLB ConfigMap, the peer-address should be the IP address of the switch that you are doing BGP peering with.
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
    - peer-address: 10.252.0.2
      peer-asn: 65533
      my-asn: 65533
    - peer-address: 10.252.0.3
      peer-asn: 65533
      my-asn: 65533
    address-pools:
    - name: customer-access
      protocol: bgp
      addresses:
      - 10.102.9.112/28
    - name: customer-access-dynamic
      protocol: bgp
      addresses:
      - 10.102.9.128/25
    - name: hardware-management
      protocol: bgp
      addresses:
      - 10.94.100.0/24
    - name: node-management
      protocol: bgp
      addresses:
      - 10.92.100.0/24
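A quick way to confirm that the peer-address entries match the switches is to extract them from the ConfigMap text and compare them with the expected switch IP addresses. The sketch below uses a plain regular expression rather than a YAML parser so that it has no dependencies; the function names are hypothetical, and the ConfigMap text is assumed to have been saved from the cluster beforehand.

```python
import re


def metallb_peer_addresses(configmap_text: str) -> list:
    """Return every peer-address value found in the ConfigMap text."""
    return re.findall(r"peer-address:\s*(\S+)", configmap_text)


def peer_mismatches(configmap_text: str, switch_ips: list) -> tuple:
    """Return (missing, unexpected) peer addresses relative to switch_ips."""
    configured = set(metallb_peer_addresses(configmap_text))
    expected = set(switch_ips)
    return sorted(expected - configured), sorted(configured - expected)


sample = """\
peers:
- peer-address: 10.252.0.2
  peer-asn: 65533
  my-asn: 65533
- peer-address: 10.252.0.3
  peer-asn: 65533
  my-asn: 65533
"""
print(peer_mismatches(sample, ["10.252.0.2", "10.252.0.3"]))
```

Two empty lists mean the ConfigMap peers and the switch IP addresses agree.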