Create System Configuration Using Cluster Discovery Service

This stage walks the user through creating the configuration payload for the system, using the Cluster Discovery Service (CDS) to generate the seed files and the paddle file.

Run the following steps before starting any of the system configuration procedures.

  1. (pit#) Create the prep and cds directories.

    mkdir -pv "${PITDATA}/prep/cds"
    
  2. (pit#) Change into the cds directory.

    cd "${PITDATA}/prep/cds"
    
  3. (pit#) Set the CVT_PATH variable.

    CVT_PATH=/opt/src/cluster-config-verification-tool/
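
    Optionally, verify that the CVT tooling exists at this path before continuing. This is only a quick sanity check, assuming the tool is installed at the default location shown above.

    ls "${CVT_PATH}"/cds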
    

Topics

  1. Customize system_config.yaml
  2. Configure the management network
  3. Configure auto-discovery
    1. Configure dnsmasq
    2. Verify and reset the BMCs
    3. Verify the discover status
    4. Collect and compare the inventory data
  4. Generate seed files and paddle file
  5. Stop discovery
  6. Run CSI
  7. Prepare Site Init
  8. Initialize the LiveCD
  9. Next topic

1. Customize system_config.yaml

  1. (pit#) Change into the prep directory.

    cd "${PITDATA}/prep"
    
  2. (pit#) Create or copy system_config.yaml.

    • If one does not exist from a prior installation, then create an empty one:

      csi config init empty
      
    • Otherwise, copy the existing system_config.yaml file into the working directory.

  3. (pit#) Edit the system_config.yaml file with the appropriate values.

    NOTE:

    • For a short description of each key in the file, run csi config init --help.
    • For more information about these settings and their default values, see Default IP Address Ranges and the other topics in CSM Overview.
    • To enable or disable audit logging, see Audit Logs.
    • If the system is using a cabinets.yaml file, be sure to update the cabinets-yaml field with 'cabinets.yaml' as its value.

    vim system_config.yaml
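
    If the system uses a cabinets.yaml file, a quick check such as the following can confirm the cabinets-yaml setting mentioned in the note above; the expected value is shown in the comment.

    grep 'cabinets-yaml' system_config.yaml   # expected: cabinets-yaml: 'cabinets.yaml'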
    

2. Configure the management network

  1. (pit#) Change into the cds directory.

    cd "${PITDATA}/prep/cds"
    
  2. (pit#) Set up the bond0 network interface as follows:

    • Configure the bond0 interface for a single VLAN:

      "${CVT_PATH}"/cds/network_setup.py -mtl <ipaddr>,<netmask>,<slave0>,<slave1>
      
    • Configure the bond0 interface for an HMN-configured network (VLANs 1, 2, and 4):

      "${CVT_PATH}"/cds/network_setup.py -mtl <ipaddr>,<netmask>,<slave0>,<slave1> -hmn <ipaddr>,<netmask>,<vlanid> -nmn <ipaddr>,<netmask>,<vlanid>
      

      The variables and their descriptions are as follows:

      Variable          Description
      ----------------  --------------------
      ipaddr            IP address
      netmask           Netmask value
      slave0, slave1    Network interfaces
      vlanid            VLAN ID number

      NOTE: The IP address value, netmask value, network interface, and VLAN ID are available in the system_config.yaml file in the prep directory.
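
      For illustration only, an HMN-configured invocation might look like the following. The interface names (p801p1, p801p2) are hypothetical, and the VLAN IDs shown assume the common mapping of VLAN 2 for the NMN and VLAN 4 for the HMN; substitute the actual values from system_config.yaml.

      "${CVT_PATH}"/cds/network_setup.py -mtl 10.x.x.x,255.x.x.x,p801p1,p801p2 \
                                         -hmn 10.x.x.x,255.x.x.x,4 \
                                         -nmn 10.x.x.x,255.x.x.x,2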

3. Configure auto-discovery

3.1 Configure Dnsmasq

  1. (pit#) Configure Dnsmasq and PXE Boot.

    • Configure Dnsmasq for a single VLAN:

      The <leasetime> parameter is optional.

      "${CVT_PATH}"/cds/discover_enable.py -ip <dhcp_ip_start>,<dhcp_ip_end>,<subnet>,<leasetime> -i bond0
      

      Example command:

      "${CVT_PATH}"/cds/discover_enable.py -ip 10.x.x.x,10.x.x.x,255.x.x.x,24h -i bond0
      
    • Configure Dnsmasq for an HMN-configured network (VLANs 1, 2, and 4):

      "${CVT_PATH}"/cds/discover_enable.py -mtl <admin_node_mtl_ip>,<start_ip>,<end_ip> \
                                           -nmn <admin_node_nmn_ip>,<start_ip>,<end_ip> \
                                           -hmn <admin_node_hmn_ip>,<start_ip>,<end_ip>
      

      Example command:

      "${CVT_PATH}"/cds/discover_enable.py -mtl 10.x.x.x,10.x.x.x,10.x.x.x \
                                           -nmn 10.x.x.x,10.x.x.x,10.x.x.x \
                                           -hmn 10.x.x.x,10.x.x.x,10.x.x.x
      

      NOTE: The iprange value for a single VLAN, or the mtliprange, nmniprange, and hmniprange values for an HMN-configured network, are available in the system_config.yaml file in the prep directory.

3.2 Verify and reset the BMCs

The following steps verify that IP addresses are assigned to the BMC nodes and then reset the nodes:

  1. (pit#) If the fabric switches, PDUs, and CMCs are to be dynamically discovered, then ensure that they are set to DHCP mode.

  2. (pit#) List the BMC nodes.

    "${CVT_PATH}"/cds/discover_start.py -u <bmc_username> -l
    

    A prompt to enter the password will appear; enter the correct BMC password.

  3. Verify that all the required nodes are listed.

    Example output:

    ncn-m001:/opt/src/cluster-config-verification-tool/cds # ./discover_start.py -u root -l
    Enter the password:
    File /var/lib/misc/dnsmasq.leases exists and has contents.
    
    ===== Detected BMC MAC info:
    IP: A.B.C.D BMC_MAC: AA-BB-CC-DD-EE-FF
    
    ==== Count check:
    No of BMC Detected: 1
    
    ====================================================================================
    
    If BMC count matches the expected river node count, proceed with rebooting the nodes
    ./discovery_start.py --bmc_username <username> --bmc_passwprd <password> --reset
    
  4. (pit#) Reset the listed BMC nodes.

    "${CVT_PATH}"/cds/discover_start.py -u <bmc_username> -r
    

    A prompt to enter the password will appear; enter the correct BMC password.

3.3 Verify the discover status

NOTE: If the fabric switches, PDUs, and CMCs were not configured to DHCP mode in the step Verify and reset the BMCs, manually create the fabricswlist, pdulist, and cmclist files in the working directory.
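
A minimal sketch of creating these files by hand, assuming each file simply lists one device IP address per line (replace the 10.x.x.x placeholders with the real addresses):

cat > fabricswlist <<'EOF'
10.x.x.x
10.x.x.x
EOF
cat > pdulist <<'EOF'
10.x.x.x
EOF
cat > cmclist <<'EOF'
10.x.x.x
EOF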

The following steps verify the discovery status and list the IP addresses of the nodes, fabric switches, PDUs, and CMCs:

  1. (pit#) Verify the status and list the IP addresses of the nodes.

    "${CVT_PATH}"/cds/discover_status.py nodes -o
    

    NOTE:

    • Ensure that all the NCNs and CNs are discovered and listed. If any nodes are not discovered, identify the nodes using the MAC address and verify the switch connectivity of those nodes.
    • If the discover status step is rerun after a long duration, start the procedure again from the beginning of the Configure the management network section.

    1. Create a bond0 network interface on all the listed nodes.

      pdsh -w^nodelist -x localhost "sh "${CVT_PATH}"/scripts/create_bond0.sh"
      

      NOTE: If there is a host key verification failure, use the ssh-keygen command to remove the stale host key and retry the procedure.

      ssh-keygen -R <node_IP> -f /root/.ssh/known_hosts

      Where <node_IP> is the IP address of the node reporting the error.

    2. Ensure that the bond0 network interface is created on all the nodes listed in the nodelist.

      Example output:

      pit:/var/www/ephemeral/prep/cds # pdsh -w^nodelist -x localhost "sh "${CVT_PATH}"/scripts/create_bond0.sh"
      ncn-w002: Network restarted successfully
      // more nodes to follow - output truncated
      

      NOTE: If creating the bond0 network fails on any of the nodes listed in the nodelist, rerun the script on only those nodes, as shown in the sketch below.
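
      A minimal sketch of rerunning the script on just the failed nodes; replace the 10.x.x.x placeholders with the IP addresses of those nodes (pdsh accepts a comma-separated host list with -w):

      pdsh -w 10.x.x.x,10.x.x.x "sh ${CVT_PATH}/scripts/create_bond0.sh"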

  2. (pit#) Verify the status and list the IP addresses of the fabric switches.

    "${CVT_PATH}"/cds/discover_status.py fabric -u <fabric_username> -o
    

    A prompt to enter the password will appear; enter the correct fabric switch password.

    Example output:

    ncn-m001:/opt/src/cluster-config-verification-tool/cds # ./discover_status.py fabric --username root --out
    Enter the password for fabric:
    File /var/lib/misc/dnsmasq.leases exists and has contents.
    
    fabricswlist:
    A.B.C.D
    
    ===== Save list components in a file:
    Created fabricswlist file: /opt/src/cluster-config-verification-tool/cds/fabricswlist
    
  3. (pit#) Verify the status and list the IP addresses of the PDUs.

    "${CVT_PATH}"/cds/discover_status.py pdu -o
    
  4. (pit#) Verify the status and list the IP addresses of the CMCs.

    "${CVT_PATH}"/cds/discover_status.py cmc -u <cmc_username> -o
    

    A prompt to enter the password will appear; enter the correct CMC password.

3.4 Collect and compare the inventory data

3.4.1 Dynamic Data Collection

  1. (pit#) Copy the SSH key to set up passwordless SSH for the local host.

    ssh-copy-id localhost
    
  2. (pit#) Collect the hardware inventory data.

    "${CVT_PATH}"/system_inventory.py -nf nodelist
    
  3. (pit#) Collect the fabric inventory data.

    "${CVT_PATH}"/fabric_inventory.py -sf fabricswlist -u <switch_username>
    

    A prompt to enter the password will appear; enter the correct fabric password.

  4. (pit#) Collect the management network inventory.

    1. Manually create a mgmtswlist file in the working directory that lists the IP addresses of the management switches (see the sketch after this step).

    2. Collect the management network inventory data.

      "${CVT_PATH}"/management_inventory.py  -u <username> -sf mgmtswlist
      

      A prompt to enter the password will appear; enter the correct management switch password.
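
    A minimal sketch of creating the mgmtswlist file referenced in the first sub-step, assuming it lists one management switch IP address per line; replace the 10.x.x.x placeholders with the real addresses:

    printf '%s\n' 10.x.x.x 10.x.x.x > mgmtswlist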

  5. (pit#) Collect the PDU inventory data.

    "${CVT_PATH}"/cds/pdu_cmc_inventory.py -pf pdulist
    
  6. (pit#) Collect the CMC inventory data.

    "${CVT_PATH}"/cds/pdu_cmc_inventory.py -cf cmclist
    

3.4.2 SHCD data population

  1. Copy the SHCD files to the present working directory.

  2. (pit#) Create a JSON file using the CANU tool.

    See Validate the SHCD.

    Example command:

    canu validate shcd --shcd SHCD.xlsx --architecture V1 --tabs 40G_10G,NMN,HMN --corners I12,Q38,I13,Q21,J20,U38 --json --out cabling.json
    
  3. (pit#) Store the SHCD data in the CVT database.

    "${CVT_PATH}"/parse_shcd.py -c cabling.json
    

3.4.3 Server classification

(pit#) Classify the servers.

In the following command, replace <N> with the number of worker NCNs in the system.

"${CVT_PATH}"/scripts/server_classification.py -n <N>

3.4.4 Compare the SHCD data with CVT inventory data

  1. (pit#) List the snapshot IDs of the SHCD and the CVT inventory data.

    "${CVT_PATH}"/shcd_compare.py -l
    
  2. (pit#) Compare the SHCD data with the CVT inventory data.

    "${CVT_PATH}"/shcd_compare.py -s <SnapshotID_SHCD> -c <SnapshotID_CVT>
    

    Example command:

    "${CVT_PATH}"/shcd_compare.py -s 4ea86f99-47ca-4cd9-b142-ea572d0566bf -c 326a6edd-adce-4d1f-9f30-ae9539284733
    

4. Generate seed files and paddle file

  1. (pit#) Generate the following seed files: switch_metadata.csv, application_node_config.yaml, hmn_connections.json, and ncn_metadata.csv.

    "${CVT_PATH}"/create_seedfile.py
    

    NOTE: The generated seed files will be stored in the seedfiles folder in the working directory.

  2. (pit#) Generate paddle file: cvt-ccj.json.

    In the following command, replace <arch_type> with the type of architecture (Full, V1, or Tds). See Generate configuration files for the characteristics of the architecture.

    "${CVT_PATH}"/create_paddle_file.py --architecture <arch_type>
    
  3. (pit#) Copy the generated paddle file and seed files to the prep directory.

    cp -rv ${PITDATA}/prep/cds/cvt-ccj.json ${PITDATA}/prep/cds/seedfiles/* ${PITDATA}/prep/
    
  4. Create the cabinets.yaml file in the prep directory. See Create cabinets.yaml.

5. Stop discovery

(pit#) Clean and reset the modified files to their original default state.

"${CVT_PATH}"/cds/discover_stop.py -u <bmc_username>

A prompt to enter the password will appear; enter the correct BMC password.

6. Run CSI

  1. (pit#) Change into the prep directory.

    cd "${PITDATA}/prep"
    
  2. (pit#) Generate the initial configuration for CSI.

    This will validate whether the inputs for CSI are correct.

    csi config init
    

7. Prepare Site Init

Follow the Prepare Site Init procedure.

8. Initialize the LiveCD

NOTE: If starting an installation at this point, then be sure to copy the previous prep directory back onto the system.

  1. (pit#) Initialize the PIT.

    The pit-init.sh script will prepare the PIT server for deploying NCNs.

    /root/bin/pit-init.sh
    
  2. (pit#) Set the IPMI_PASSWORD variable.

    read -r -s -p "NCN BMC root password: " IPMI_PASSWORD
    
  3. (pit#) Export the IPMI_PASSWORD variable.

    export IPMI_PASSWORD
    
  4. (pit#) Set up links to the boot artifacts extracted from the CSM tarball.

    NOTE:

    • This will also set all the BMCs to DHCP.
    • Changing into the $HOME directory ensures the proper operation of the script.

    cd $HOME && /root/bin/set-sqfs-links.sh
    

Next topic

After completing this procedure, proceed to import the CSM tarball.

See Import the CSM Tarball.