Parallel Clustering allows for multiple systems to run programs together as if they were one system. A Parallel Cluster is also called a Beowulf Cluster.
In my previous article, “Linux Cluster – Basics”, I covered the various types of Clusters. I went through the process of setting up a cluster using VirtualBox to make a virtual cluster.
For this article, I will be using physical hardware: three Raspberry Pi systems. Technically, that is two Raspberry Pi 4 systems and one Raspberry Pi 400.
For the setup, I used the following:
- Three Raspberry Pi systems
- Three 16GB SD Cards (Class 10)
- Wifi for Internet access
- Hub (smart switch or even another router)
- Three Ethernet cables
Of course, you will need a monitor (or three), power cables, keyboards, and mice for the systems. You can get by with one of each to share.
There are two ways to set this all up, and we will take the simpler one, which takes less time. The two ways are:
- Install the updates and software on all systems at the same time
- Perform the install and updates on one system and copy the SD Card to the other two
We will use the second option to set this up. It also gives you good practice at manually changing the hostname and IP Address.
Performing the Setup
You will need to download the Operating System (OS) for the Raspberry Pi. I used Ubuntu Server 20.04.
NOTE: Do NOT use Ubuntu 21.04. The program Mpich does not work on it. At a later date, it may be fixed and then these instructions should work with Ubuntu 21.04.
Go to the site ‘https://ubuntu.com/raspberry-pi’ and select ‘Get Ubuntu for the Raspberry Pi’. Scroll down and select the box ‘Download 64-bit’ under ‘Ubuntu Server 20.04.2 LTS’.
Once your download is done, use Balena Etcher or a similar program to place the image on the first SD Card.
After the SD Card imaging is completed, insert the SD Card into the first Raspberry Pi. Make sure the system is hooked to a keyboard, mouse, and monitor, then power it on. Hook the Ethernet port to your hub or switch.
The system should perform some initial configurations without any input from a user. After a bit, you should be prompted for a username and password. The username is ‘ubuntu’ and the password is also ‘ubuntu’. After logging in, you will be prompted to change the password. Be aware that the first thing it asks for is that you retype the old password. You should then be prompted for the new password and to verify it.
Once you are at a prompt, you will need to enable the Wifi device. To do this, enter the command:
sudo nano /etc/netplan/50-cloud-init.yaml
The command opens the file ‘50-cloud-init.yaml’ in nano. While editing this file, you can add the settings for the wired port as well. The Wifi entry needs your network name and password in the following form:
    "SSID of your Wifi":
        password: "Wifi Password"
Let’s look at what is going on in this file. There are two sections: the first for the ethernet (wired) network and the second for the Wifi (wireless) network. For the ethernet, DHCP is turned off and the address (10.0.0.1) is hardcoded along with the prefix length (/8). The second section needs the SSID of the Wifi network as well as the password. Be sure to keep the quotes. Do not use any tabs, only spaces, and keep everything aligned as it is in the example. The file format will be verified in the next step. You can use an IP Addressing scheme other than 10.0.0.1/8 if needed, but be consistent across all of the systems so they can communicate through the hub or switch.
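Since the wired address is the only value that differs from node to node, it can help to see the whole file built from its parts. The following is a minimal Python sketch, not the exact file from this setup. The function name `netplan_config` is hypothetical, the interface names `eth0` and `wlan0` are assumptions (check yours with `ip link`), and DHCP on the Wifi side is assumed because the Wifi is only used for Internet access.

```python
def netplan_config(address, prefix, ssid, password, eth="eth0", wifi="wlan0"):
    """Return netplan YAML text: static wired address, DHCP on the Wifi.

    The interface names default to eth0 and wlan0, which is an assumption;
    substitute the names your system reports.
    """
    lines = [
        "network:",
        "  version: 2",
        "  ethernets:",
        f"    {eth}:",
        "      dhcp4: false",                     # no DHCP on the wired port
        f"      addresses: [{address}/{prefix}]",  # hardcoded address and prefix
        "  wifis:",
        f"    {wifi}:",
        "      dhcp4: true",                      # assumed: Wifi gets its address by DHCP
        "      access-points:",
        f'        "{ssid}":',                     # quotes are kept around the SSID
        f'          password: "{password}"',
    ]
    # Only spaces are used for indentation, never tabs.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(netplan_config("10.0.0.1", 8, "SSID of your Wifi", "Wifi Password"), end="")
```

Calling the function with "10.0.0.2" or "10.0.0.3" gives the text for the other nodes, where only the wired address changes.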
Press CTRL+O to write the file and then CTRL+X to exit the editor.
To apply these settings after verifying the format of the file, run the command:
sudo netplan apply
If you get an error, edit the file, fix the error, save it, and apply the changes again. Repeat this until the file applies without errors. Then wait a bit to let the Wifi adapter finish connecting. After the connection is made, run the following commands:
- sudo apt update
- sudo apt upgrade -y
- sudo apt install net-tools
- sudo ufw allow ssh
- sudo apt install mpich
- sudo apt install python3-mpi4py
- sudo apt install ubuntu-gnome-desktop
- sudo apt autoremove
- sudo reboot
The last line will cause the system to reboot. We have a few more changes to make and we’ll be ready to copy the SD Card.
When the system starts back up, you should eventually get to a Graphical User Interface (GUI) where you can log in as the user ‘ubuntu’.
Once logged back into the system, select the tiles in the lower-left corner. Once the screen changes, type in ‘users’. A line should appear that lets you select ‘Settings’ for editing Users. Once the ‘Users’ window opens, select the ‘Unlock’ button in the top right. Then enable the option for the user ‘ubuntu’ to log in automatically.
Open a Terminal and execute the following command:
sudo nano /etc/hostname
The file contains a single line that is the hostname of the system. Delete the current name and type in ‘cluster1’. Press CTRL+O and then CTRL+X.
Now, we need to edit the ‘/etc/hosts’ file to set the hostname and IP Address of each system. If you use more systems, add a line for each one. The file should include the following:
10.0.0.1 cluster1
10.0.0.2 cluster2
10.0.0.3 cluster3
Make sure any line that contains ‘127.0.0.1’ is removed. Add the above at the beginning of the file right after any existing hostnames and addresses. Press CTRL+O and then CTRL+X for the information to be saved.
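For a cluster with more than three nodes, the entries follow an obvious pattern and can be generated rather than typed. This is a small Python sketch under the addressing scheme used here. The function name `hosts_entries` and its parameters are hypothetical, and it only prints the lines; pasting them into the file is still a manual step.

```python
def hosts_entries(node_count, network="10.0.0.", name_prefix="cluster"):
    """Return one 'address name' line per node, numbered from 1.

    The defaults mirror the 10.0.0.x / clusterN scheme used in this article;
    change them if you picked a different scheme.
    """
    return [f"{network}{n} {name_prefix}{n}" for n in range(1, node_count + 1)]


if __name__ == "__main__":
    # Print the lines for a three-node cluster.
    print("\n".join(hosts_entries(3)))
```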
At the command prompt, type the command ‘ping cluster2’ and verify that the name resolves to the correct IP Address. Repeat for the remaining cluster nodes. If any name does not resolve correctly, edit the file again and fix any typos. Save and close the file, and then ‘ping’ the names again to test the address resolution.
At this point, you can execute the command ‘ifconfig’. Verify that the IP Address for ‘eth0’ is ‘10.0.0.1’. If you used a different addressing scheme, then it should match what you entered. If the IP Address is not correct, then edit the ‘/etc/netplan/50-cloud-init.yaml’ as was discussed previously.
Perform ‘sudo shutdown now’ and then unplug the power from the system.
Setting Up Other Systems
Remove the SD Card and use a program to copy the whole SD Card. See the article ‘ISO File Manipulation’, specifically the section on the ‘gnome-disk-utility’. If you have another program or method to use, I urge you to use what you are comfortable with using.
Create the image and then copy it to the other two SD Cards. The process could take a little time.
Configuring the Remaining Systems
Once completed, place the SD Cards into the systems. Do not power them all on at once; power on only one at a time, starting with number two. At this point, every copy has the same hostname and would try to use the same IP Address on the Ethernet network.
Once Cluster2 is booted and you have logged in to the GUI, open a Terminal.
From here, you need to edit ‘/etc/hostname’ to change the hostname accordingly.
Finally, edit the file ‘/etc/netplan/50-cloud-init.yaml’ and change the IP Address (10.0.0.2 for Cluster2, 10.0.0.3 for Cluster3). Once this file is saved, run ‘sudo netplan apply’ to verify the file is correct.
Once Cluster2 is completed, you can boot Cluster3 and make the same edits. After each system is complete, move to the next. Once the last node is completed, you can turn Cluster1 on and the Parallel cluster is nearly completed.
On Cluster1, you have a little bit to finish before the whole cluster is ready.
In a Terminal on Cluster1, execute the command ‘sudo ssh-keygen -t rsa’. The command generates a key for the first node. When asked for the filename to save the key as, press ‘enter’. When asked for a passphrase, press ‘enter’, and then ‘enter’ again to verify. Once the command is completed, you will need to copy the key it generated to the other nodes. Use the command ‘sudo ssh-copy-id ubuntu@10.0.0.2’. You will need to type in ‘yes’ to verify you want to continue. You will then be prompted for the password for the user ‘ubuntu’ on the other system, which is the same password you set on Cluster1. By specifying the username ‘ubuntu’, you are placing the key file into the Home folder of the user ‘ubuntu’. Continue copying the key to the remaining nodes by changing the IP Address. Make sure every node other than the first one gets a copy.
At this point, everything is completed.
Testing the Cluster
On each node (you can SSH into each node from Cluster1), run ‘mpiexec -n 1 hostname’. You should get a response of the system’s hostname.
NOTE: You may get the message ‘Invalid MIT-MAGIC-COOKIE-1 key’. Just press the ‘enter’ key and the command should resume.
If a message appears other than the hostname, press ‘enter’ and it should appear. If it does not work, then a step was missed on one or more of the nodes.
From Cluster1, issue the command ‘mpiexec -n 3 -host 10.0.0.1,10.0.0.2,10.0.0.3 hostname’. You should get a response of the hostname from all nodes.
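The host list grows with the cluster, so the command line can be assembled from a list of addresses instead of being typed by hand. The sketch below is one way to do that in Python. The function name `mpiexec_command` is hypothetical, and the ‘-n’ and ‘-host’ options are simply the ones used in the command above.

```python
import shlex


def mpiexec_command(hosts, program, processes=None):
    """Build the mpiexec argument list.

    Starts one process per host unless a process count is given.
    'program' is the command to run on the nodes, as a list of words.
    """
    count = len(hosts) if processes is None else processes
    return ["mpiexec", "-n", str(count), "-host", ",".join(hosts)] + list(program)


if __name__ == "__main__":
    nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    # Print the command so it can be checked before running it on Cluster1.
    print(shlex.join(mpiexec_command(nodes, ["hostname"])))
```

On Cluster1 the returned list could be passed to `subprocess.run` to launch the job directly; the sketch only prints it so that nothing is started by accident.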
If you look at the article “Linux Cluster – Basics”, you can try to run the python script for finding prime numbers.
Extract the compressed file and place the files into a folder named ‘prime’. Copy the ‘prime’ folder into the Home folder of each system. From the Terminal, change to the ‘prime’ directory and then issue the following command:
mpiexec -n 3 --host 10.0.0.1,10.0.0.2,10.0.0.3 python3 prime.py 10000
You should get a result and the time it takes to complete the task.
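To give a feel for how such a script can split its work, here is a minimal mpi4py sketch. It is not the ‘prime.py’ from the earlier article, whose contents are not shown here. The function names and the round-robin split are my own assumptions for illustration. Each process tests every size-th candidate starting at its own rank, and the counts are added together on rank 0.

```python
import sys
import time

try:
    from mpi4py import MPI  # installed earlier as python3-mpi4py
except ImportError:
    MPI = None  # fall back to a single process so the sketch still runs


def is_prime(n):
    """Trial division; slow on purpose, so there is work to share."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def count_primes(limit, rank, size):
    """Count primes in 2..limit among the candidates assigned to this rank.

    Candidates are dealt out round-robin: rank r tests 2+r, 2+r+size, ...
    """
    return sum(1 for n in range(2 + rank, limit + 1, size) if is_prime(n))


def main(limit):
    if MPI is None:
        comm, rank, size = None, 0, 1
    else:
        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
    start = time.perf_counter()
    local = count_primes(limit, rank, size)
    # Sum the per-process counts; only rank 0 receives the total.
    total = local if comm is None else comm.reduce(local, op=MPI.SUM, root=0)
    if rank == 0:
        elapsed = time.perf_counter() - start
        print(f"{total} primes up to {limit} in {elapsed:.3f} s using {size} process(es)")


if __name__ == "__main__":
    arg = sys.argv[1] if len(sys.argv) > 1 else ""
    main(int(arg) if arg.isdigit() else 10000)
```

Saved under a hypothetical name such as ‘prime_sketch.py’ in the same folder on every node, it would be started the same way as the command above, with the script name swapped in. A round-robin split is not perfectly balanced (with two processes, one of them gets only even numbers), but it keeps the example short.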
I have gone over the steps to set up a Parallel (Beowulf) Cluster using Ubuntu 20.04.
If you have the systems, try this out and see how programs enabled for parallel computing can run faster. Try running a test with a single system, two systems, and then more. Enjoy!