This guide explains how to create a cluster of computers running Yellow Dog Linux (YDL) using OpenMPI. It was created using 64-bit XServe G5s running YDL 6.2. This guide assumes that the reader:

  * has already installed YDL on each of the computers that will be in the cluster
  * has assigned each computer a static IP address
  * is familiar with navigating the file system via the command line

====Keyless ssh Entries====

Taken, with minor changes, from [[cluster_setup_guide|Cluster Setup Guide - Debian]]

In order to ssh into each system without entering a password each time, an rsa key from the local computer needs to be placed in the proper file on the remote system. Sound complicated? It's pretty easy, just follow along.

On each system, run the following commands (YDL uses yum rather than apt-get):

  yum install openssh-server
  ssh-keygen

  * Type Y when asked.
  * ssh-keygen will give you three prompts. Leave all three blank.

Now, copy each node's public key to the same place:

  for i in 0 1 2; do scp system$i:~/.ssh/id_rsa.pub ~/.ssh/id_rsa.pub.system$i; done

Put the contents of all the files just made into ''authorized_keys'':

  for i in 0 1 2; do cat ~/.ssh/id_rsa.pub.system$i >> ~/.ssh/authorized_keys; done

Lastly, copy the newly created ''authorized_keys'' file onto all the nodes:

  for i in 0 1 2; do scp ~/.ssh/authorized_keys system$i:~/.ssh/; done

The next time you ssh between nodes, you shouldn't be asked for a password!

====Setup Hosts File====

Taken verbatim from [[cluster_setup_guide|Cluster Setup Guide - Debian]]

This file is used for any system on your local network that you want to connect to (ssh, scp, etc.) using an alias instead of the full IP address.

Edit ''/etc/hosts'':

  * Add a line at the top of the document for each system in the cluster. Here's an example hosts file:

  127.0.0.1      localhost
  192.168.1.100  system0
  192.168.1.101  system1
  192.168.1.102  system2
  # The following lines are…

  * Also, be sure to remove the default line with ''127.0.1.1'' from the file.

**BE SURE TO**: Copy this hosts file to every system in the cluster.

====Installing OpenMPI====

First, the GNU Compiler Collection (gcc) needs to be installed:

  yum install gcc

On 64-bit machines, a header file indirectly referenced by stdio.h is missing, leading to a compile error. To fix this:

  yum install glibc-devel.ppc64

One or two of these three packages may be unnecessary, but it's better to be safe:

  yum install openmpi.ppc64
  yum install openmpi-devel.ppc64
  yum install openmpi

MPI is now installed, but it is inconvenient to have to type things like ''/usr/lib64/openmpi/1.2.5-gcc/bin/mpirun'' to access it. The solution is to add /usr/lib64/openmpi/1.2.5-gcc/bin/ to your shell's $PATH variable when you log in. If the path to this folder is different on your computer, adjust accordingly. Here's how it's done.

==for root user==

Edit the text file /root/.bash_profile. Add a line that reads

  PATH=$PATH:/usr/lib64/openmpi/1.2.5-gcc/bin/

before the line that reads "export PATH". Save the file.

==for all other users==

Follow the same steps as for the root user, except edit /etc/profile instead.

MPI should work at this point; however, all processes will run on the local computer until the cluster is configured below. Still, it may be helpful to make sure that MPI works before continuing.
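A quick way to check the installation is to have mpirun launch copies of an ordinary, non-MPI command. This is a minimal sanity check, assuming you have logged out and back in so the new $PATH takes effect:

  mpirun -np 2 hostname

Both processes run on the local machine for now, so this should simply print the local hostname twice.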
====Write Program====

Taken verbatim from [[cluster_setup_guide|Cluster Setup Guide - Debian]]

Make a new file. It will be ~/hello.c for this example. Copy the contents below into the file:

  /* The Parallel Hello World Program */
  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int node;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &node);

      printf("Hello World from Node %d\n", node);

      MPI_Finalize();
      return 0;
  }

====Compile Program====

MPI programs written in C should be compiled with mpicc. The arguments passed to it are much like those of gcc. So, to compile the hello.c example:

  mpicc -o hello hello.c

====Run Program====

To run an MPI program, type the following:

  mpirun -np N ./hello

where N is the number of processes and hello is the executable file.

====Configuring OpenMPI to run jobs across the cluster====

To run a job across the cluster, mpirun needs to be told which nodes to use, for example with a hostfile (a sketch appears at the end of this guide). If you encounter problems running the program across nodes, make sure a firewall isn't blocking the connections:

  /etc/init.d/iptables stop

If you wish to prevent iptables from running at start up, you can run the following command:

  chkconfig iptables off

And to be safe:

  chkconfig ip6tables off

====Configuring NFS (optional)====

If you wish to avoid having to copy executables to each node prior to every run and to ensure that everything stays in sync, you can configure the master node as an NFS server and the other nodes as clients. I successfully accomplished this by using [[http://nfs.sourceforge.net/nfs-howto/index.html|this guide]] with the following changes:

  * In place of the instructions in Sections 3.3.2 (Starting the Portmapper) and 3.3.3 (The Daemons), I ran:

  chkconfig nfs on

  * The "rpcinfo quota" command is ''rpcinfo -p'' on YDL.
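As a concrete illustration of what the NFS guide produces, a minimal setup that shares the home directories might look like the following sketch. The export path (/home) and the subnet (192.168.1.0/24, matching the example hosts file above) are assumptions; adjust them to your cluster.

  # On the master node (system0), add this line to /etc/exports:
  /home 192.168.1.0/24(rw,sync)
  # Then reload the export table:
  exportfs -ra

  # On each client node, mount the shared directory:
  mount -t nfs system0:/home /home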
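Finally, here is the hostfile sketch promised above, for actually running the hello program across the cluster. It assumes the node names from the example /etc/hosts file and that the compiled ''hello'' executable exists at the same path on every node (which the NFS setup handles); the file name my_hosts is arbitrary:

  # my_hosts: one line per node; slots is how many processes may run there
  system0 slots=1
  system1 slots=1
  system2 slots=1

Then launch the job with:

  mpirun -np 3 --hostfile my_hosts ./hello

If everything is configured correctly, you should see one "Hello World from Node N" line from each of the three processes.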