Node Deployment

This section walks through deploying the bare-metal nodes.

Hypervisor Deployment

This subsection covers deploying the hypervisor nodes.

Deployment

Booting hypervisor nodes.

  1. Set credentials for Redfish

    cp /vms/assets/fawkes-*/gru.yml /etc/gru.yml
    vi /etc/gru.yml
    • Replace the username: and password: values with the credentials for the Redfish endpoints on the cluster.

    • Set insecure: true if the Redfish endpoints do not present a trusted SSL certificate.

  2. Boot the hypervisor nodes

    conman -q | grep ncn-h | gru -c /etc/gru.yml chassis boot http --now
  3. Each node should boot over HTTP and pull media.

  4. Manually wipe and install each node over the console.

    • Connect to the node(s)

      ssh ncn-h002
    • You will be prompted to set a password; the default password is blank.

    • Wipe the node.

      crucible storage wipe -y
    • Install the node.

      crucible install
    • Reboot

      reboot
  5. Move on to the next hypervisor.

If the hypervisor has no IP address after it reboots, you may have to request one over DHCP on mgmt0:

crucible network interface --dhcp mgmt0

This works around a bug in how NetworkManager is configured and set up during the hypervisor install.
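
The workaround above can be made conditional so that it only runs when mgmt0 has no IPv4 address. The following is a minimal sketch, assuming the ip utility from iproute2 is present on the hypervisor. The has_ipv4 and ensure_dhcp helper names are hypothetical and are not part of crucible.

```shell
#!/usr/bin/env bash

# has_ipv4 (hypothetical helper): reads `ip -4 -o addr show` output on stdin
# and succeeds if at least one line carries an "inet" address.
has_ipv4() {
    grep -q ' inet '
}

# ensure_dhcp (hypothetical helper): run the documented DHCP workaround
# only when the given interface has no IPv4 address.
ensure_dhcp() {
    local iface="$1"
    if ip -4 -o addr show dev "$iface" 2>/dev/null | has_ipv4; then
        echo "$iface already has an IPv4 address"
    else
        # Command taken from the workaround described above.
        crucible network interface --dhcp "$iface"
    fi
}

# Usage on a freshly installed hypervisor:
#   ensure_dhcp mgmt0
```

Checking first avoids re-running the workaround on nodes that came up correctly.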

Inventory

  1. Create an inventory file

    cat << EOF > /etc/ansible/inventory
    [hypervisor]
    hypervisor.local
    $(conman -q | grep ncn | sed 's/-mgmt//')
    
    [storage]
    $(conman -q | grep ncn-s | sed 's/-mgmt//')
    EOF
  2. Trust all deployed machines.

    for host in $(conman -q | grep -oP 'ncn-\w\d+'); do ssh-keyscan -H "$host" >> ~/.ssh/known_hosts 2>/dev/null; done
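
To see what the text filters in the two steps above produce, the sketch below runs them against a stand-in for the conman -q output. The console names in sample_consoles are hypothetical; real names depend on the cluster. It assumes GNU grep with -P support, as the original command does.

```shell
#!/usr/bin/env bash

# sample_consoles (hypothetical): stands in for `conman -q` output.
sample_consoles() {
    printf '%s\n' ncn-h001-mgmt ncn-h002-mgmt ncn-s001-mgmt
}

# Strip the -mgmt suffix, as the inventory heredoc does for the [storage] group.
storage_hosts() {
    sample_consoles | grep ncn-s | sed 's/-mgmt//'
}

# Extract the bare node names, as the ssh-keyscan loop does.
node_names() {
    sample_consoles | grep -oP 'ncn-\w\d+'
}

echo "[storage] hosts:"
storage_hosts
echo "hosts to scan:"
node_names
```

With this sample input, storage_hosts yields ncn-s001 and node_names yields ncn-h001, ncn-h002, and ncn-s001. Running the same filters against the real conman -q output before writing the inventory is a quick way to confirm the host names are what Ansible should see.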

Provision

Provision the hypervisor nodes.

  1. Provision the hypervisors.

    /opt/cray/ansible/bin/ansible-playbook \
        -i /etc/ansible/inventory \
        /srv/cray/metal-provision/hypervisor.yml

    If the DHCP workaround noted before the Inventory section was applied, Ansible will hang while setting up VLANs. When it hangs, log in to each node and run:

    systemctl restart NetworkManager
    nmcli device reapply mgmt0
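
When several hypervisors need the fix, the two commands above can be sent to each node in a loop. This is a minimal sketch, assuming the nodes are still reachable over SSH while Ansible is hung; if they are not, use the console instead. The nm_reapply_cmd and reapply_all helper names are hypothetical.

```shell
#!/usr/bin/env bash

# nm_reapply_cmd (hypothetical helper): prints the remote command line,
# built from the two commands in the step above.
nm_reapply_cmd() {
    local iface="${1:-mgmt0}"
    printf 'systemctl restart NetworkManager; nmcli device reapply %s' "$iface"
}

# reapply_all (hypothetical helper): run the workaround on every host
# named on stdin, one host per line. Blank lines are skipped.
reapply_all() {
    local host
    while read -r host; do
        [ -n "$host" ] || continue
        # -n keeps ssh from consuming the host list on stdin.
        ssh -n "$host" "$(nm_reapply_cmd mgmt0)"
    done
}

# Usage, assuming console names follow the ncn-h pattern used earlier:
#   conman -q | grep -oP 'ncn-h\d+' | reapply_all
```

Restarting NetworkManager over SSH can interrupt the session, so check each node afterwards rather than relying on the exit status of the loop.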

Storage