Gentoo Linux, a special flavor of Linux that can be automatically optimized and customized for just about any application or need. Extreme performance, configurability and a top-notch user and developer community are all hallmarks of the Gentoo experience.
Thanks to a technology called Portage, Gentoo Linux can become an ideal secure server, development workstation, professional desktop, gaming system, embedded solution or... a High Performance Computing system. Because of its near-unlimited adaptability, we call Gentoo Linux a metadistribution.
This document explains how to turn a Gentoo system into a High Performance Computing system. Step by step, it explains what packages one may want to install and helps configure them.
Obtain Gentoo Linux from the website
During the installation process, you will have to set your USE variables in
USE="-oss 3dnow -apm -arts -avi -berkdb -crypt -cups -encode -gdbm -gif gpm -gtk -imlib -java -jpeg -kde -gnome -libg++ -libwww -mikmod mmx -motif -mpeg ncurses -nls -oggvorbis -opengl pam -pdflib -png -python -qt -qtmt -quicktime -readline -sdl -slang -spell -ssl -svga tcpd -truetype -X -xml2 -xmms -xv -zlib"
Or simply:
USE="-* 3dnow gpm mmx ncurses pam sse tcpd"
In step 15 ("Installing the kernel and a System Logger") for stability
reasons, we recommend the vanilla-sources, the official kernel sources
released on
# emerge -p syslog-ng vanilla-sources
When you install miscellaneous packages, we recommend installing the following:
# emerge -p nfs-utils portmap tcpdump ssmtp iptables xinetd
A cluster requires a communication layer to interconnect the slave nodes to
the master node. Typically, a FastEthernet or GigaEthernet LAN can be used
since they have a good price/performance ratio. Other possibilities include
use of products like
A cluster is composed of two node types: master and slave. Typically, your cluster will have one master node and several slave nodes.
The master node is the cluster's server. It is responsible for telling the slave nodes what to do. This server will typically run such daemons as dhcpd, nfs, pbs-server, and pbs-sched. Your master node will allow interactive sessions for users, and accept job executions.
The slave nodes listen for instructions (via ssh/rsh perhaps) from the master node. They should be dedicated to crunching results and therefore should not run any unnecessary services.
The rest of this documentation will assume a cluster configuration as per the
hosts file below. You should maintain on every node such a hosts file
(
# Adelie Linux Research & Development Center # /etc/hosts 127.0.0.1 localhost 192.168.1.100 master.adelie master 192.168.1.1 node01.adelie node01 192.168.1.2 node02.adelie node02
To setup your cluster dedicated LAN, edit your
# Global config file for net.* rc-scripts # This is basically the ifconfig argument without the ifconfig $iface # iface_eth0="192.168.1.100 broadcast 192.168.1.255 netmask 255.255.255.0" # Network Connection to the outside world using dhcp -- configure as required for you network iface_eth1="dhcp"
Finally, setup a DHCP daemon on the master node to avoid having to maintain a network configuration on each slave node.
# Adelie Linux Research & Development Center
# /etc/dhcp/dhcpd.conf
log-facility local7;
ddns-update-style none;
use-host-decl-names on;
subnet 192.168.1.0 netmask 255.255.255.0 {
option domain-name "adelie";
range 192.168.1.10 192.168.1.99;
option routers 192.168.1.100;
host node01.adelie {
# MAC address of network card on node 01
hardware ethernet 00:07:e9:0f:e2:d4;
fixed-address 192.168.1.1;
}
host node02.adelie {
# MAC address of network card on node 02
hardware ethernet 00:07:e9:0f:e2:6b;
fixed-address 192.168.1.2;
}
}
The Network File System (NFS) was developed to allow machines to mount a disk partition on a remote machine as if it were on a local hard drive. This allows for fast, seamless sharing of files across a network.
There are other systems that provide similar functionality to NFS which could
be used in a cluster environment. The
# emerge -p nfs-utils portmap # emerge nfs-utils portmap
Configure and install a kernel to support NFS v3 on all nodes:
CONFIG_NFS_FS=y CONFIG_NFSD=y CONFIG_SUNRPC=y CONFIG_LOCKD=y CONFIG_NFSD_V3=y CONFIG_LOCKD_V4=y
On the master node, edit your
portmap:192.168.1.0/255.255.255.0
Edit the
/home/ *(rw)
Add nfs to your master node's default runlevel:
# rc-update add nfs default
To mount the nfs exported filesystem from the master, you also have to
configure your salve nodes'
master:/home/ /home nfs rw,exec,noauto,nouser,async 0 0
You'll also need to set up your nodes so that they mount the nfs filesystem by issuing this command:
# rc-update add nfsmount default
SSH is a protocol for secure remote login and other secure network services over an insecure network. OpenSSH uses public key cryptography to provide secure authorization. Generating the public key, which is shared with remote systems, and the private key which is kept on the local system, is done first to configure OpenSSH on the cluster.
For transparent cluster usage, private/public keys may be used. This process has two steps:
For user based authentication, generate and copy as follows:
# ssh-keygen -t dsa Generating public/private dsa key pair. Enter file in which to save the key (/root/.ssh/id_dsa): /root/.ssh/id_dsa Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /root/.ssh/id_dsa. Your public key has been saved in /root/.ssh/id_dsa.pub. The key fingerprint is: f1:45:15:40:fd:3c:2d:f7:9f:ea:55:df:76:2f:a4:1f root@masterWARNING! If you already have an "authorized_keys" file, please append to it, do not use the following command. # scp /root/.ssh/id_dsa.pub node01:/root/.ssh/authorized_keys root@master's password: id_dsa.pub 100% 234 2.0MB/s 00:00 # scp /root/.ssh/id_dsa.pub node02:/root/.ssh/authorized_keys root@master's password: id_dsa.pub 100% 234 2.0MB/s 00:00
For host based authentication, you will also need to edit your
node01.adelie node02.adelie master.adelie
And a few modifications to the
# $OpenBSD: sshd_config,v 1.42 2001/09/20 20:57:51 mouring Exp $ # This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin # This is the sshd server system-wide configuration file. See sshd(8) # for more information. # HostKeys for protocol version 2 HostKey /etc/ssh/ssh_host_rsa_key
If your application require RSH communications, you will need to emerge net-misc/netkit-rsh and sys-apps/xinetd.
# emerge -p xinetd # emerge xinetd # emerge -p netkit-rsh # emerge netkit-rsh
Then configure the rsh deamon. Edit your
# Adelie Linux Research & Development Center
# /etc/xinetd.d/rsh
service shell
{
socket_type = stream
protocol = tcp
wait = no
user = root
group = tty
server = /usr/sbin/in.rshd
log_type = FILE /var/log/rsh
log_on_success = PID HOST USERID EXIT DURATION
log_on_failure = USERID ATTEMPT
disable = no
}
Edit your
# Adelie Linux Research & Development Center # /etc/hosts.allow in.rshd:192.168.1.0/255.255.255.0
Or you can simply trust your cluster LAN:
# Adelie Linux Research & Development Center # /etc/hosts.allow ALL:192.168.1.0/255.255.255.0
Finally, configure host authentication from
# Adelie Linux Research & Development Center # /etc/hosts.equiv master node01 node02
And, add xinetd to your default runlevel:
# rc-update add xinetd default
The Network Time Protocol (NTP) is used to synchronize the time of a computer client or server to another server or reference time source, such as a radio or satellite receiver or modem. It provides accuracies typically within a millisecond on LANs and up to a few tens of milliseconds on WANs relative to Coordinated Universal Time (UTC) via a Global Positioning Service (GPS) receiver, for example. Typical NTP configurations utilize multiple redundant servers and diverse network paths in order to achieve high accuracy and reliability.
Select a NTP server geographically close to you from
# /etc/conf.d/ntpd # NOTES: # - NTPDATE variables below are used if you wish to set your # clock when you start the ntp init.d script # - make sure that the NTPDATE_CMD will close by itself ... # the init.d script will not attempt to kill/stop it # - ntpd will be used to maintain synchronization with a time # server regardless of what NTPDATE is set to # - read each of the comments above each of the variable # Comment this out if you dont want the init script to warn # about not having ntpdate setup NTPDATE_WARN="n" # Command to run to set the clock initially # Most people should just uncomment this line ... # however, if you know what you're doing, and you # want to use ntpd to set the clock, change this to 'ntpd' NTPDATE_CMD="ntpdate" # Options to pass to the above command # Most people should just uncomment this variable and # change 'someserver' to a valid hostname which you # can acquire from the URL's below NTPDATE_OPTS="-b ntp1.cmc.ec.gc.ca" ## # A list of available servers is available here: # http://www.eecis.udel.edu/~mills/ntp/servers.html # Please follow the rules of engagement and use a # Stratum 2 server (unless you qualify for Stratum 1) ## # Options to pass to the ntpd process that will *always* be run # Most people should not uncomment this line ... # however, if you know what you're doing, feel free to tweak #NTPD_OPTS=""
Edit your
# Adelie Linux Research & Development Center # /etc/ntp.conf # Synchronization source #1 server ntp1.cmc.ec.gc.ca restrict ntp1.cmc.ec.gc.ca # Synchronization source #2 server ntp2.cmc.ec.gc.ca restrict ntp2.cmc.ec.gc.ca stratum 10 driftfile /etc/ntp.drift.server logfile /var/log/ntp broadcast 192.168.1.255 restrict default kod restrict 127.0.0.1 restrict 192.168.1.0 mask 255.255.255.0
And on all your slave nodes, setup your synchronization source as your master node.
# /etc/conf.d/ntpd NTPDATE_WARN="n" NTPDATE_CMD="ntpdate" NTPDATE_OPTS="-b master"
# Adelie Linux Research & Development Center # /etc/ntp.conf # Synchronization source #1 server master restrict master stratum 11 driftfile /etc/ntp.drift.server logfile /var/log/ntp restrict default kod restrict 127.0.0.1
Then add ntpd to the default runlevel of all your nodes:
# rc-update add ntpd default
To setup a firewall on your cluster, you will need iptables.
# emerge -p iptables # emerge iptables
Required kernel configuration:
CONFIG_NETFILTER=y CONFIG_IP_NF_CONNTRACK=y CONFIG_IP_NF_IPTABLES=y CONFIG_IP_NF_MATCH_STATE=y CONFIG_IP_NF_FILTER=y CONFIG_IP_NF_TARGET_REJECT=y CONFIG_IP_NF_NAT=y CONFIG_IP_NF_NAT_NEEDED=y CONFIG_IP_NF_TARGET_MASQUERADE=y CONFIG_IP_NF_TARGET_LOG=y
And the rules required for this firewall:
# Adelie Linux Research & Development Center # /var/lib/iptables/rule-save *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p tcp -m tcp --dport 22 -j ACCEPT -A INPUT -s 192.168.1.0/255.255.255.0 -i eth1 -j ACCEPT -A INPUT -s 127.0.0.1 -i lo -j ACCEPT -A INPUT -p icmp -j ACCEPT -A INPUT -j LOG -A INPUT -j REJECT --reject-with icmp-port-unreachable COMMIT *nat :PREROUTING ACCEPT [0:0] :POSTROUTING ACCEPT [0:0] :OUTPUT ACCEPT [0:0] -A POSTROUTING -s 192.168.1.0/255.255.255.0 -j MASQUERADE COMMIT
Then add iptables to the default runlevel of all your nodes:
# rc-update add iptables default
The Portable Batch System (PBS) is a flexible batch queueing and workload management system originally developed for NASA. It operates on networked, multi-platform UNIX environments, including heterogeneous clusters of workstations, supercomputers, and massively parallel systems. Development of PBS is provided by Altair Grid Technologies.
# emerge -p openpbs
Before starting using OpenPBS, some configurations are required. The files you will need to personalize for your system are:
Here is a sample sched_config:
# # Create queues and set their attributes. # # # Create and define queue upto4nodes # create queue upto4nodes set queue upto4nodes queue_type = Execution set queue upto4nodes Priority = 100 set queue upto4nodes resources_max.nodect = 4 set queue upto4nodes resources_min.nodect = 1 set queue upto4nodes enabled = True set queue upto4nodes started = True # # Create and define queue default # create queue default set queue default queue_type = Route set queue default route_destinations = upto4nodes set queue default enabled = True set queue default started = True # # Set server attributes. # set server scheduling = True set server acl_host_enable = True set server default_queue = default set server log_events = 511 set server mail_from = adm set server query_other_jobs = True set server resources_default.neednodes = 1 set server resources_default.nodect = 1 set server resources_default.nodes = 1 set server scheduler_iteration = 60
To submit a task to OpenPBS, the command
(submit and request from OpenPBS that myscript be executed on 2 nodes) # qsub -l nodes=2 -j oe -m abe myscript
Normally jobs submitted to OpenPBS are in the form of scripts. Sometimes, you may want to try a task manually. To request an interactive shell from OpenPBS, use the "-I" parameter.
# qsub -I
To check the status of your jobs, use the qstat command:
# qstat Job id Name User Time Use S Queue ------ ---- ---- -------- - ----- 2.geist STDIN adelie 0 R upto1nodes
Message passing is a paradigm used widely on certain classes of parallel machines, especially those with distributed memory. MPICH is a freely available, portable implementation of MPI, the Standard for message-passing libraries.
The mpich ebuild provided by Adelie Linux allows for two USE flags:
# emerge -p mpich # emerge mpich
You may need to export a mpich work directory to all your slave nodes in
/home *(rw)
Most massively parallel processors (MPPs) provide a way to start a program on
a requested number of processors;
Edit this file to reflect your cluster-lan configuration:
# Change this file to contain the machines that you want to use # to run MPI jobs on. The format is one host name per line, with either # hostname # or # hostname:n # where n is the number of processors in an SMP. The hostname should # be the same as the result from the command "hostname" master node01 node02 # node03 # node04 # ...
Use the script
The only argument to
# /usr/local/mpich/sbin/tstmachines LINUX
# /usr/local/mpich/sbin/tstmachines -v LINUX
The output from this command might look like:
Trying true on host1.uoffoo.edu ... Trying true on host2.uoffoo.edu ... Trying ls on host1.uoffoo.edu ... Trying ls on host2.uoffoo.edu ... Trying user program on host1.uoffoo.edu ... Trying user program on host2.uoffoo.edu ...
If
And the required test for every development tool:
# cd ~ # cp /usr/share/mpich/examples1/hello++.c ~ # make hello++ # mpirun -machinefile /usr/share/mpich/machines.LINUX -np 1 hello++
For further information on MPICH, consult the documentation at
(Coming Soon!)
(Coming Soon!)
The original document is published at the