<?xml version='1.0' encoding="UTF-8"?>
<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/hpc-howto.xml,v 1.13 2006/12/18 21:47:19 nightmorph Exp $ -->
<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">

<guide link="/doc/en/hpc-howto.xml">
<title>High Performance Computing on Gentoo Linux</title>

<author title="Author">
  <mail link="marc@adelielinux.com">Marc St-Pierre</mail>
</author>
<author title="Author">
  <mail link="benoit@adelielinux.com">Benoit Morin</mail>
</author>
<author title="Assistant/Research">
  <mail link="jean-francois@adelielinux.com">Jean-Francois Richard</mail>
</author>
<author title="Assistant/Research">
  <mail link="olivier@adelielinux.com">Olivier Crete</mail>
</author>
<author title="Reviewer">
  <mail link="dberkholz@gentoo.org">Donnie Berkholz</mail>
</author>

<!-- No licensing information; this document has been written by a third-party
     organisation without additional licensing information.

     In other words, this is copyright adelielinux R&D; Gentoo only has
     permission to distribute this document as-is and update it when appropriate
     as long as the adelie linux R&D notice stays
-->

<abstract>
This document was written by people at the Adelie Linux R&amp;D Center
&lt;http://www.adelielinux.com&gt; as a step-by-step guide to turn a Gentoo
System into a High Performance Computing (HPC) system.
</abstract>

<version>1.6</version>
<date>2006-12-18</date>

<chapter>
<title>Introduction</title>
<section>
<body>

<p>
Gentoo Linux is a special flavor of Linux that can be automatically optimized
and customized for just about any application or need. Extreme performance,
configurability and a top-notch user and developer community are all hallmarks
of the Gentoo experience.
</p>

<p>
Thanks to a technology called Portage, Gentoo Linux can become an ideal secure
server, development workstation, professional desktop, gaming system, embedded
solution or... a High Performance Computing system. Because of its
near-unlimited adaptability, we call Gentoo Linux a metadistribution.
</p>

<p>
This document explains how to turn a Gentoo system into a High Performance
Computing system. Step by step, it explains what packages one may want to
install and helps configure them.
</p>

<p>
Obtain Gentoo Linux from the website <uri>http://www.gentoo.org</uri>, and
refer to the <uri link="/doc/en/">documentation</uri> at the same location to
install it.
</p>

</body>
</section>
</chapter>

<chapter>
<title>Configuring Gentoo Linux for Clustering</title>
<section>
<title>Recommended Optimizations</title>
<body>

<note>
We refer to the <uri link="/doc/en/handbook/">Gentoo Linux Handbooks</uri> in
this section.
</note>

<p>
During the installation process, you will have to set your USE variables in
<path>/etc/make.conf</path>. We recommend that you deactivate all the
defaults (see <path>/etc/make.profile/make.defaults</path>) by negating them in
make.conf. However, you may want to keep such USE flags as x86, 3dnow, gpm,
mmx, nptl, nptlonly, sse, ncurses, pam and tcpd. Refer to the USE documentation
for more information.
</p>

<pre caption="USE Flags">
USE="-oss 3dnow -apm -arts -avi -berkdb -crypt -cups -encode -gdbm -gif gpm -gtk
-imlib -java -jpeg -kde -gnome -libg++ -libwww -mikmod mmx -motif -mpeg ncurses
-nls nptl nptlonly -oggvorbis -opengl pam -pdflib -png -python -qt3 -qt4 -qtmt
-quicktime -readline -sdl -slang -spell -ssl -svga tcpd -truetype -X -xml2 -xv
-zlib"
</pre>

<p>
Or simply:
</p>

<pre caption="USE Flags - simplified version">
USE="-* 3dnow gpm mmx ncurses pam sse tcpd"
</pre>

<note>
The <e>tcpd</e> USE flag increases security for packages such as xinetd.
</note>
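
<p>
To confirm which USE flags Portage will actually apply after you edit
<path>/etc/make.conf</path>, you can inspect the output of
<c>emerge --info</c>. This is only a quick sanity check; the exact list
depends on your profile.
</p>

<pre caption="Checking the active USE flags">
# <i>emerge --info | grep ^USE</i>
</pre>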

<p>
In step 15 ("Installing the kernel and a System Logger"), we recommend
vanilla-sources, the official kernel sources released on
<uri>http://www.kernel.org/</uri>, for stability reasons, unless you require
special support such as xfs.
</p>

<pre caption="Installing vanilla-sources">
# <i>emerge -p syslog-ng vanilla-sources</i>
# <i>emerge syslog-ng vanilla-sources</i>
</pre>

<p>
When you install miscellaneous packages, we recommend installing the
following:
</p>

<pre caption="Installing necessary packages">
# <i>emerge -p nfs-utils portmap tcpdump ssmtp iptables xinetd</i>
# <i>emerge nfs-utils portmap tcpdump ssmtp iptables xinetd</i>
</pre>

</body>
</section>
<section>
<title>Communication Layer (TCP/IP Network)</title>
<body>

<p>
A cluster requires a communication layer to interconnect the slave nodes to
the master node. Typically, a Fast Ethernet or Gigabit Ethernet LAN can be
used since they have a good price/performance ratio. Other possibilities
include use of products like <uri link="http://www.myricom.com/">Myrinet</uri>,
<uri link="http://quadrics.com/">QsNet</uri> or others.
</p>

<p>
A cluster is composed of two node types: master and slave. Typically, your
cluster will have one master node and several slave nodes.
</p>

<p>
The master node is the cluster's server. It is responsible for telling the
slave nodes what to do. This server will typically run such daemons as dhcpd,
nfs, pbs-server, and pbs-sched. Your master node will allow interactive
sessions for users, and accept job executions.
</p>

<p>
The slave nodes listen for instructions (via ssh/rsh perhaps) from the master
node. They should be dedicated to crunching results and therefore should not
run any unnecessary services.
</p>

<p>
The rest of this documentation will assume a cluster configuration as per the
hosts file below. You should maintain such a hosts file
(<path>/etc/hosts</path>) on every node, with entries for each node
participating in the cluster.
</p>

<pre caption="/etc/hosts">
# Adelie Linux Research &amp; Development Center
# /etc/hosts

127.0.0.1       localhost

192.168.1.100   master.adelie master

192.168.1.1     node01.adelie node01
192.168.1.2     node02.adelie node02
</pre>
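
<p>
A quick way to check that the names in the hosts file resolve as intended is
<c>getent</c>, which queries the same name service the system itself uses.
The node names below are the ones from the sample hosts file above.
</p>

<pre caption="Checking name resolution">
# <i>getent hosts master</i>
# <i>getent hosts node01</i>
# <i>getent hosts node02</i>
</pre>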

<p>
To set up your cluster's dedicated LAN, edit your
<path>/etc/conf.d/net</path> file on the master node.
</p>

<pre caption="/etc/conf.d/net">
# Global config file for net.* rc-scripts

# This is basically the ifconfig argument without the ifconfig $iface
#

iface_eth0="192.168.1.100 broadcast 192.168.1.255 netmask 255.255.255.0"
# Network Connection to the outside world using dhcp -- configure as required for your network
iface_eth1="dhcp"
</pre>
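
<p>
After editing the file, the interface has to be brought up with the new
settings. A minimal sketch, assuming the standard Gentoo
<path>net.eth0</path> init script is present on the master node:
</p>

<pre caption="Activating the cluster interface">
# <i>/etc/init.d/net.eth0 restart</i>
# <i>rc-update add net.eth0 default</i>
# <i>ifconfig eth0</i>
</pre>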

<p>
Finally, set up a DHCP daemon on the master node to avoid having to maintain a
network configuration on each slave node.
</p>

<pre caption="/etc/dhcp/dhcpd.conf">
# Adelie Linux Research &amp; Development Center
# /etc/dhcp/dhcpd.conf

log-facility local7;
ddns-update-style none;
use-host-decl-names on;

subnet 192.168.1.0 netmask 255.255.255.0 {
    option domain-name "adelie";
    range 192.168.1.10 192.168.1.99;
    option routers 192.168.1.100;

    host node01.adelie {
        # MAC address of network card on node 01
        hardware ethernet 00:07:e9:0f:e2:d4;
        fixed-address 192.168.1.1;
    }
    host node02.adelie {
        # MAC address of network card on node 02
        hardware ethernet 00:07:e9:0f:e2:6b;
        fixed-address 192.168.1.2;
    }
}
</pre>
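
<p>
The DHCP server itself still has to be installed and started. The following
is a sketch that assumes the <c>dhcp</c> package from Portage and its
<c>dhcpd</c> init script; adapt it if your setup differs.
</p>

<pre caption="Installing and starting dhcpd on the master node">
# <i>emerge -p dhcp</i>
# <i>emerge dhcp</i>
# <i>rc-update add dhcpd default</i>
# <i>/etc/init.d/dhcpd start</i>
</pre>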

</body>
</section>
<section>
<title>NFS/NIS</title>
<body>

<p>
The Network File System (NFS) was developed to allow machines to mount a disk
partition on a remote machine as if it were on a local hard drive. This allows
for fast, seamless sharing of files across a network.
</p>

<p>
There are other systems that provide similar functionality to NFS which could
be used in a cluster environment. The <uri
link="http://www.openafs.org">Andrew File System from IBM</uri>, recently
open-sourced, provides a file sharing mechanism with some additional security
and performance features. The <uri
link="http://www.coda.cs.cmu.edu/">Coda File System</uri> is still in
development, but is designed to work well with disconnected clients. Many
of the features of the Andrew and Coda file systems are slated for inclusion
in the next version of <uri link="http://www.nfsv4.org">NFS (Version 4)</uri>.
The advantage of NFS today is that it is mature, standard, well understood,
and supported robustly across a variety of platforms.
</p>

<pre caption="Ebuilds for NFS-support">
# <i>emerge -p nfs-utils portmap</i>
# <i>emerge nfs-utils portmap</i>
</pre>

<p>
Configure and install a kernel to support NFS v3 on all nodes:
</p>

<pre caption="Required Kernel Configurations for NFS">
CONFIG_NFS_FS=y
CONFIG_NFSD=y
CONFIG_SUNRPC=y
CONFIG_LOCKD=y
CONFIG_NFSD_V3=y
CONFIG_LOCKD_V4=y
</pre>

<p>
On the master node, edit your <path>/etc/hosts.allow</path> file to allow
connections from slave nodes. If your cluster LAN is on 192.168.1.0/24,
your <path>hosts.allow</path> will look like:
</p>

<pre caption="hosts.allow">
portmap:192.168.1.0/255.255.255.0
</pre>

<p>
Edit the <path>/etc/exports</path> file of the master node to export a work
directory structure (/home is good for this).
</p>

<pre caption="/etc/exports">
/home/  *(rw)
</pre>
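
<p>
The wildcard above exports <path>/home</path> to any host that can reach the
master. If the master is also connected to an outside network, you may prefer
to restrict the export to the cluster LAN. A possible variant, assuming the
192.168.1.0/24 network used throughout this guide:
</p>

<pre caption="/etc/exports restricted to the cluster LAN (sketch)">
/home/  192.168.1.0/255.255.255.0(rw,sync)
</pre>

<p>
If the NFS server is already running, <c>exportfs -ra</c> makes it re-read
the file.
</p>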

<p>
Add nfs to your master node's default runlevel:
</p>

<pre caption="Adding NFS to the default runlevel">
# <i>rc-update add nfs default</i>
</pre>

<p>
To mount the NFS-exported filesystem from the master, you also have to
configure your slave nodes' <path>/etc/fstab</path>. Add a line like this
one:
</p>

<pre caption="/etc/fstab">
master:/home/   /home   nfs   rw,exec,noauto,nouser,async   0 0
</pre>

<p>
You'll also need to set up your nodes so that they mount the NFS filesystem by
issuing this command:
</p>

<pre caption="Adding nfsmount to the default runlevel">
# <i>rc-update add nfsmount default</i>
</pre>
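
<p>
To test the mount by hand on a slave node before rebooting, something along
these lines should work once the NFS service is running on the master and
portmap is running on the slave:
</p>

<pre caption="Testing the NFS mount on a slave node">
# <i>showmount -e master</i>
# <i>mount /home</i>
# <i>df -h /home</i>
</pre>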

</body>
</section>
<section>
<title>RSH/SSH</title>
<body>

<p>
SSH is a protocol for secure remote login and other secure network services
over an insecure network. OpenSSH uses public key cryptography to provide
secure authentication. The first step in configuring OpenSSH on the cluster is
to generate a key pair: the public key, which is shared with remote systems,
and the private key, which is kept on the local system.
</p>

<p>
For transparent cluster usage, private/public keys may be used. This process
has two steps:
</p>

<ul>
  <li>Generate public and private keys</li>
  <li>Copy public key to slave nodes</li>
</ul>

<p>
For user-based authentication, generate and copy as follows:
</p>

<pre caption="SSH key authentication">
# <i>ssh-keygen -t dsa</i>
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa): /root/.ssh/id_dsa
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
f1:45:15:40:fd:3c:2d:f7:9f:ea:55:df:76:2f:a4:1f root@master

<comment>WARNING! If you already have an "authorized_keys" file,
please append to it, do not use the following command.</comment>

# <i>scp /root/.ssh/id_dsa.pub node01:/root/.ssh/authorized_keys</i>
root@node01's password:
id_dsa.pub   100%  234     2.0MB/s   00:00

# <i>scp /root/.ssh/id_dsa.pub node02:/root/.ssh/authorized_keys</i>
root@node02's password:
id_dsa.pub   100%  234     2.0MB/s   00:00
</pre>
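
<p>
If a node already has an <path>authorized_keys</path> file, the key can be
appended instead of overwriting the file, as the warning above asks. One way
to do this, shown here for node01:
</p>

<pre caption="Appending the public key to an existing authorized_keys">
# <i>cat /root/.ssh/id_dsa.pub | ssh node01 "cat &gt;&gt; /root/.ssh/authorized_keys"</i>
# <i>ssh node01 hostname</i>
</pre>

<p>
If the key was created with an empty passphrase, the second command should
run without asking for a password.
</p>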

<note>
Host keys must have an empty passphrase. RSA is required for host-based
authentication.
</note>

<p>
For host-based authentication, you will also need to edit your
<path>/etc/ssh/shosts.equiv</path>.
</p>

<pre caption="/etc/ssh/shosts.equiv">
node01.adelie
node02.adelie
master.adelie
</pre>

<p>
And a few modifications to the <path>/etc/ssh/sshd_config</path> file:
</p>

<pre caption="sshd configurations">
# $OpenBSD: sshd_config,v 1.42 2001/09/20 20:57:51 mouring Exp $
# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin

# This is the sshd server system-wide configuration file. See sshd(8)
# for more information.

# HostKeys for protocol version 2
HostKey /etc/ssh/ssh_host_rsa_key
</pre>
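
<p>
Host-based authentication is disabled by default in OpenSSH, so it normally
has to be switched on explicitly on both sides. The fragment below is a
sketch based on the stock OpenSSH option names and is not part of the
original Adelie configuration; check <c>man sshd_config</c> and
<c>man ssh_config</c> for your version. The server also needs the public host
keys of the connecting machines, typically in
<path>/etc/ssh/ssh_known_hosts</path>.
</p>

<pre caption="Enabling host-based authentication (sketch)">
<comment>(Server side, in /etc/ssh/sshd_config)</comment>
HostbasedAuthentication yes

<comment>(Client side, in /etc/ssh/ssh_config)</comment>
HostbasedAuthentication yes
EnableSSHKeysign yes
</pre>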

<p>
If your application requires RSH communications, you will need to emerge
net-misc/netkit-rsh and sys-apps/xinetd.
</p>

<pre caption="Installing necessary applications">
# <i>emerge -p xinetd</i>
# <i>emerge xinetd</i>
# <i>emerge -p netkit-rsh</i>
# <i>emerge netkit-rsh</i>
</pre>

<p>
Then configure the rsh daemon. Edit your <path>/etc/xinetd.d/rsh</path> file.
</p>

<pre caption="rsh">
# Adelie Linux Research &amp; Development Center
# /etc/xinetd.d/rsh

service shell
{
    socket_type     = stream
    protocol        = tcp
    wait            = no
    user            = root
    group           = tty
    server          = /usr/sbin/in.rshd
    log_type        = FILE /var/log/rsh
    log_on_success  = PID HOST USERID EXIT DURATION
    log_on_failure  = USERID ATTEMPT
    disable         = no
}
</pre>

<p>
Edit your <path>/etc/hosts.allow</path> to permit rsh connections:
</p>

<pre caption="hosts.allow">
# Adelie Linux Research &amp; Development Center
# /etc/hosts.allow

in.rshd:192.168.1.0/255.255.255.0
</pre>

<p>
Or you can simply trust your cluster LAN:
</p>

<pre caption="hosts.allow">
# Adelie Linux Research &amp; Development Center
# /etc/hosts.allow

ALL:192.168.1.0/255.255.255.0
</pre>

<p>
Finally, configure host authentication from <path>/etc/hosts.equiv</path>.
</p>

<pre caption="hosts.equiv">
# Adelie Linux Research &amp; Development Center
# /etc/hosts.equiv

master
node01
node02
</pre>

<p>
And, add xinetd to your default runlevel:
</p>

<pre caption="Adding xinetd to the default runlevel">
# <i>rc-update add xinetd default</i>
</pre>
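
<p>
To try the setup without rebooting, xinetd can be started by hand and a
trivial remote command issued. The test below is run as a regular user,
because rshd does not normally honor <path>/etc/hosts.equiv</path> for root;
it assumes the same user account exists on node01.
</p>

<pre caption="Starting xinetd and testing rsh">
# <i>/etc/init.d/xinetd start</i>
$ <i>rsh node01 hostname</i>
</pre>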

</body>
</section>
<section>
<title>NTP</title>
<body>

<p>
The Network Time Protocol (NTP) is used to synchronize the time of a computer
client or server to another server or reference time source, such as a radio
or satellite receiver or modem. It provides accuracies typically within a
millisecond on LANs and up to a few tens of milliseconds on WANs relative to
Coordinated Universal Time (UTC) via a Global Positioning System (GPS)
receiver, for example. Typical NTP configurations utilize multiple redundant
servers and diverse network paths in order to achieve high accuracy and
reliability.
</p>

<p>
Select an NTP server geographically close to you from <uri
link="http://www.eecis.udel.edu/~mills/ntp/servers.html">Public NTP Time
Servers</uri>, and configure your <path>/etc/conf.d/ntpd</path> and
<path>/etc/ntp.conf</path> files on the master node.
</p>

<pre caption="Master /etc/conf.d/ntpd">
# /etc/conf.d/ntpd

# NOTES:
# - NTPDATE variables below are used if you wish to set your
#   clock when you start the ntp init.d script
# - make sure that the NTPDATE_CMD will close by itself ...
#   the init.d script will not attempt to kill/stop it
# - ntpd will be used to maintain synchronization with a time
#   server regardless of what NTPDATE is set to
# - read each of the comments above each of the variable

# Comment this out if you dont want the init script to warn
# about not having ntpdate setup
NTPDATE_WARN="n"

# Command to run to set the clock initially
# Most people should just uncomment this line ...
# however, if you know what you're doing, and you
# want to use ntpd to set the clock, change this to 'ntpd'
NTPDATE_CMD="ntpdate"

# Options to pass to the above command
# Most people should just uncomment this variable and
# change 'someserver' to a valid hostname which you
# can acquire from the URL's below
NTPDATE_OPTS="-b ntp1.cmc.ec.gc.ca"

##
# A list of available servers is available here:
# http://www.eecis.udel.edu/~mills/ntp/servers.html
# Please follow the rules of engagement and use a
# Stratum 2 server (unless you qualify for Stratum 1)
##

# Options to pass to the ntpd process that will *always* be run
# Most people should not uncomment this line ...
# however, if you know what you're doing, feel free to tweak
#NTPD_OPTS=""
</pre>

<p>
Edit your <path>/etc/ntp.conf</path> file on the master to set up an external
synchronization source:
</p>

<pre caption="Master ntp.conf">
# Adelie Linux Research &amp; Development Center
# /etc/ntp.conf

# Synchronization source #1
server ntp1.cmc.ec.gc.ca
restrict ntp1.cmc.ec.gc.ca
# Synchronization source #2
server ntp2.cmc.ec.gc.ca
restrict ntp2.cmc.ec.gc.ca
stratum 10
driftfile /etc/ntp.drift.server
logfile /var/log/ntp
broadcast 192.168.1.255
restrict default kod
restrict 127.0.0.1
restrict 192.168.1.0 mask 255.255.255.0
</pre>

<p>
On all your slave nodes, set up the master node as the synchronization
source.
</p>

<pre caption="Node /etc/conf.d/ntpd">
# /etc/conf.d/ntpd

NTPDATE_WARN="n"
NTPDATE_CMD="ntpdate"
NTPDATE_OPTS="-b master"
</pre>

<pre caption="Node ntp.conf">
# Adelie Linux Research &amp; Development Center
# /etc/ntp.conf

# Synchronization source #1
server master
restrict master
stratum 11
driftfile /etc/ntp.drift.server
logfile /var/log/ntp
restrict default kod
restrict 127.0.0.1
</pre>

<p>
Then add ntpd to the default runlevel of all your nodes:
</p>

<pre caption="Adding ntpd to the default runlevel">
# <i>rc-update add ntpd default</i>
</pre>

<note>
NTP will not update the local clock if the time difference between your
synchronization source and the local clock is too great.
</note>
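
<p>
Once ntpd is running, <c>ntpq -p</c> lists the configured time sources
together with their current offset; the source currently selected for
synchronization is marked with an asterisk. A short check might look like
this:
</p>

<pre caption="Starting ntpd and checking synchronization">
# <i>/etc/init.d/ntpd start</i>
# <i>ntpq -p</i>
</pre>

<p>
If a node's clock is too far off, stepping it once with
<c>ntpdate -b master</c> while ntpd is stopped is one way to bring it back
within range.
</p>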

</body>
</section>
<section>
<title>IPTABLES</title>
<body>

<p>
To set up a firewall on your cluster, you will need iptables.
</p>

<pre caption="Installing iptables">
# <i>emerge -p iptables</i>
# <i>emerge iptables</i>
</pre>

<p>
Required kernel configuration:
</p>

<pre caption="IPtables kernel configuration">
CONFIG_NETFILTER=y
CONFIG_IP_NF_CONNTRACK=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_MATCH_STATE=y
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_REJECT=y
CONFIG_IP_NF_NAT=y
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=y
CONFIG_IP_NF_TARGET_LOG=y
</pre>

<p>
And the rules required for this firewall. The cluster LAN is accepted on
eth0, the interface configured for 192.168.1.100 in
<path>/etc/conf.d/net</path>:
</p>

<pre caption="rule-save">
# Adelie Linux Research &amp; Development Center
# /var/lib/iptables/rule-save

*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -s 192.168.1.0/255.255.255.0 -i eth0 -j ACCEPT
-A INPUT -s 127.0.0.1 -i lo -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -j LOG
-A INPUT -j REJECT --reject-with icmp-port-unreachable
COMMIT
*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A POSTROUTING -s 192.168.1.0/255.255.255.0 -j MASQUERADE
COMMIT
</pre>

<p>
Then add iptables to the default runlevel of all your nodes:
</p>

<pre caption="Adding iptables to the default runlevel">
# <i>rc-update add iptables default</i>
</pre>
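
<p>
The MASQUERADE rule only has an effect if the master node forwards packets
between its interfaces, which the kernel does not do by default. A sketch of
how to enable forwarding and load the rules by hand, assuming the rule file
shown above:
</p>

<pre caption="Enabling forwarding and loading the rules">
# <i>echo 1 &gt; /proc/sys/net/ipv4/ip_forward</i>
# <i>iptables-restore &lt; /var/lib/iptables/rule-save</i>
# <i>iptables -L -n -v</i>
</pre>

<p>
To keep forwarding enabled across reboots, <c>net.ipv4.ip_forward = 1</c> can
be set in <path>/etc/sysctl.conf</path>.
</p>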

</body>
</section>
</chapter>

<chapter>
<title>HPC Tools</title>
<section>
<title>OpenPBS</title>
<body>

<p>
The Portable Batch System (PBS) is a flexible batch queueing and workload
management system originally developed for NASA. It operates on networked,
multi-platform UNIX environments, including heterogeneous clusters of
workstations, supercomputers, and massively parallel systems. Development of
PBS is provided by Altair Grid Technologies.
</p>

<pre caption="Installing openpbs">
# <i>emerge -p openpbs</i>
# <i>emerge openpbs</i>
</pre>

<note>
The OpenPBS ebuild does not currently set proper permissions on the
var-directories used by OpenPBS.
</note>

<p>
Before you start using OpenPBS, some configuration is required. The files
you will need to personalize for your system are:
</p>

<ul>
  <li>/etc/pbs_environment</li>
  <li>/var/spool/PBS/server_name</li>
  <li>/var/spool/PBS/server_priv/nodes</li>
  <li>/var/spool/PBS/mom_priv/config</li>
  <li>/var/spool/PBS/sched_priv/sched_config</li>
</ul>
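
<p>
As an illustration, the first few of these files might look as follows for
the example cluster. This is only a sketch: it assumes that the master node
acts as the PBS server and that each slave node has a single processor
(<c>np=1</c>).
</p>

<pre caption="Sample OpenPBS files (sketch)">
<comment>(/var/spool/PBS/server_name)</comment>
master

<comment>(/var/spool/PBS/server_priv/nodes)</comment>
node01 np=1
node02 np=1

<comment>(/var/spool/PBS/mom_priv/config on each slave node)</comment>
$logevent 0x1ff
$clienthost master
</pre>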

<p>
Here is a sample sched_config:
</p>

<pre caption="/var/spool/PBS/sched_priv/sched_config">
#
# Create queues and set their attributes.
#
#
# Create and define queue upto4nodes
#
create queue upto4nodes
set queue upto4nodes queue_type = Execution
set queue upto4nodes Priority = 100
set queue upto4nodes resources_max.nodect = 4
set queue upto4nodes resources_min.nodect = 1
set queue upto4nodes enabled = True
set queue upto4nodes started = True
#
# Create and define queue default
#
create queue default
set queue default queue_type = Route
set queue default route_destinations = upto4nodes
set queue default enabled = True
set queue default started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_host_enable = True
set server default_queue = default
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server resources_default.neednodes = 1
set server resources_default.nodect = 1
set server resources_default.nodes = 1
set server scheduler_iteration = 60
</pre>

<p>
To submit a task to OpenPBS, the command <c>qsub</c> is used with some
optional parameters. In the example below, "-l" allows you to specify
the resources required, "-j" provides for redirection of standard out and
standard error, and the "-m" will e-mail the user at beginning (b), end (e)
and on abort (a) of the job.
</p>

<pre caption="Submitting a task">
<comment>(submit and request from OpenPBS that myscript be executed on 2 nodes)</comment>
# <i>qsub -l nodes=2 -j oe -m abe myscript</i>
</pre>
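
<p>
The script passed to <c>qsub</c> is an ordinary shell script, and options can
also be embedded in it as <c>#PBS</c> comment lines instead of being given on
the command line. A minimal sketch of what <path>myscript</path> might look
like follows; the job name <c>myjob</c> is purely illustrative.
</p>

<pre caption="A sample job script (sketch)">
#!/bin/sh
#PBS -N myjob
#PBS -l nodes=2
#PBS -j oe

<comment>(Change to the directory the job was submitted from)</comment>
cd $PBS_O_WORKDIR

<comment>(List the nodes allocated to this job)</comment>
cat $PBS_NODEFILE
</pre>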

<p>
Normally jobs submitted to OpenPBS are in the form of scripts. Sometimes, you
may want to try a task manually. To request an interactive shell from OpenPBS,
use the "-I" parameter.
</p>

<pre caption="Requesting an interactive shell">
# <i>qsub -I</i>
</pre>

<p>
To check the status of your jobs, use the <c>qstat</c> command:
</p>

<pre caption="Checking the status of the jobs">
# <i>qstat</i>
Job id   Name   User    Time Use  S  Queue
------   ----   ----    --------  -  -----
2.geist  STDIN  adelie  0         R  upto1nodes
</pre>
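
<p>
The job id shown in the first column can be passed to other PBS commands. For
example, using the id from the sample output above, <c>qstat -f</c> prints
the full details of a job and <c>qdel</c> removes it from the queue:
</p>

<pre caption="Inspecting and deleting a job">
# <i>qstat -f 2.geist</i>
# <i>qdel 2.geist</i>
</pre>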

</body>
</section>
<section>
<title>MPICH</title>
<body>

<p>
Message passing is a paradigm used widely on certain classes of parallel
machines, especially those with distributed memory. MPICH is a freely
available, portable implementation of MPI, the standard for message-passing
libraries.
</p>

<p>
The mpich ebuild provided by Adelie Linux allows for two USE flags:
<e>doc</e> and <e>crypt</e>. <e>doc</e> will cause documentation to be
installed, while <e>crypt</e> will configure MPICH to use <c>ssh</c> instead
of <c>rsh</c>.
</p>

<pre caption="Installing the mpich application">
# <i>emerge -p mpich</i>
# <i>emerge mpich</i>
</pre>

<p>
You may need to export an MPICH work directory to all your slave nodes in
<path>/etc/exports</path>:
</p>

<pre caption="/etc/exports">
/home  *(rw)
</pre>

<p>
Most massively parallel processors (MPPs) provide a way to start a program on
a requested number of processors; <c>mpirun</c> makes use of the appropriate
command whenever possible. In contrast, workstation clusters require that each
process in a parallel job be started individually, though programs to help
start these processes exist. Because workstation clusters are not already
organized as an MPP, additional information is required to make use of them.
MPICH should be installed with a list of participating workstations in the
file <path>machines.LINUX</path> in the directory
<path>/usr/share/mpich/</path>. This file is used by <c>mpirun</c> to choose
processors to run on.
</p>

<p>
Edit this file to reflect your cluster LAN configuration:
</p>

<pre caption="/usr/share/mpich/machines.LINUX">
# Change this file to contain the machines that you want to use
# to run MPI jobs on. The format is one host name per line, with either
#    hostname
# or
#    hostname:n
# where n is the number of processors in an SMP. The hostname should
# be the same as the result from the command "hostname"
master
node01
node02
# node03
# node04
# ...
</pre>
| 864 |
|
| 865 |
<p>
Use the script <c>tstmachines</c> in <path>/usr/sbin/</path> to ensure that
you can use all of the machines that you have listed. This script performs
an <c>rsh</c> and a short directory listing; this tests both that you have
access to the node and that a program in the current directory is visible on
the remote node. If there are any problems, they will be listed. These
problems must be fixed before proceeding.
</p>

<p>
The only argument to <c>tstmachines</c> is the name of the architecture; this
is the same name as the extension on the machines file. For example, the
following tests that a program in the current directory can be executed by
all of the machines in the LINUX machines list.
</p>

<pre caption="Running a test">
# <i>/usr/sbin/tstmachines LINUX</i>
</pre>

<note>
This program is silent if all is well; if you want to see what it is doing,
use the <c>-v</c> (for verbose) argument:
</note>

<pre caption="Running a test verbosely">
# <i>/usr/sbin/tstmachines -v LINUX</i>
</pre>

<p>
The output from this command might look like:
</p>

<pre caption="Output of the above command">
Trying true on host1.uoffoo.edu ...
Trying true on host2.uoffoo.edu ...
Trying ls on host1.uoffoo.edu ...
Trying ls on host2.uoffoo.edu ...
Trying user program on host1.uoffoo.edu ...
Trying user program on host2.uoffoo.edu ...
</pre>

<p>
If <c>tstmachines</c> finds a problem, it will suggest possible reasons and
solutions. In brief, there are three tests:
</p>

<ul>
<li>
<e>Can processes be started on remote machines?</e> <c>tstmachines</c>
attempts to run the shell command <c>true</c> on each machine in the
machines file by using the remote shell command.
</li>
<li>
<e>Is the current working directory available to all machines?</e> This
test attempts to list a file that <c>tstmachines</c> creates by running
<c>ls</c> through the remote shell command.
</li>
<li>
<e>Can user programs be run on remote systems?</e> This checks that shared
libraries and other components have been properly installed on all
machines.
</li>
</ul>

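<p>
The three tests above can be pictured as three remote commands per host. The
Python sketch below only builds those command lines; it is not how
<c>tstmachines</c> is actually implemented. The function name
<c>remote_checks</c>, the marker file name and the example paths are
hypothetical, and <c>rsh</c> is used as the default remote shell only because
the text above mentions it.
</p>

```python
import posixpath
import shlex
from typing import List


def remote_checks(host: str, workdir: str, marker: str, program: str,
                  remote_shell: str = "rsh") -> List[List[str]]:
    """Return one argv list per check for a single host."""
    def remote(command: str) -> List[str]:
        # The remote shell receives the command as a single string.
        return [remote_shell, host, command]

    marker_path = shlex.quote(posixpath.join(workdir, marker))
    program_path = shlex.quote(posixpath.join(workdir, program))
    return [
        # 1. Can a process be started on the remote machine at all?
        remote("true"),
        # 2. Is the working directory visible there? List a marker file.
        remote("ls " + marker_path),
        # 3. Can a user program from that directory be run remotely?
        remote(program_path),
    ]


if __name__ == "__main__":
    for argv in remote_checks("node01", "/home/user/mpi", "marker.tmp", "hello"):
        print(" ".join(argv))
```

<p>
Each list could be handed to <c>subprocess.run</c> on the master node. Note
that a classic <c>rsh</c> typically does not pass back the exit status of the
remote command, so a real check would have to inspect the command output
rather than rely on the return code.
</p>
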
<p>
Finally, run the obligatory test for every development tool:
</p>

<pre caption="Testing a development tool">
# <i>cd ~</i>
# <i>cp /usr/share/mpich/examples1/hello++.c ~</i>
# <i>make hello++</i>
# <i>mpirun -machinefile /usr/share/mpich/machines.LINUX -np 1 hello++</i>
</pre>

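<p>
The test above starts a single process (<c>-np 1</c>). As a hedged
illustration of scaling it up, the Python sketch below assembles an
<c>mpirun</c> command line that requests one process per listed processor. It
uses only the <c>-machinefile</c> and <c>-np</c> options shown above; the
helper name <c>mpirun_argv</c> and the hard-coded host list are assumptions,
chosen to match the sample machines file.
</p>

```python
from typing import List, Tuple


def mpirun_argv(machinefile: str, machines: List[Tuple[str, int]],
                program: str) -> List[str]:
    """Build an mpirun command that starts one process per listed processor."""
    nprocs = sum(count for _host, count in machines)
    if nprocs == 0:
        raise ValueError("the machines list is empty")
    return ["mpirun", "-machinefile", machinefile, "-np", str(nprocs), program]


# Hypothetical cluster matching the sample machines file: three single-CPU hosts.
MACHINES = [("master", 1), ("node01", 1), ("node02", 1)]
ARGV = mpirun_argv("/usr/share/mpich/machines.LINUX", MACHINES, "hello++")
print(" ".join(ARGV))
```

<p>
Whether one process per processor is the right choice depends on the job;
the value passed to <c>-np</c> can be any count you want to test with.
</p>
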
<p>
For further information on MPICH, consult the documentation at <uri
link="http://www-unix.mcs.anl.gov/mpi/mpich/docs/mpichman-chp4/mpichman-chp4.htm">http://www-unix.mcs.anl.gov/mpi/mpich/docs/mpichman-chp4/mpichman-chp4.htm</uri>.
</p>

</body>
</section>
<section>
<title>LAM</title>
<body>

<p>
(Coming Soon!)
</p>

</body>
</section>
<section>
<title>OMNI</title>
<body>

<p>
(Coming Soon!)
</p>

</body>
</section>
</chapter>

<chapter>
<title>Bibliography</title>
<section>
<body>

<p>
The original document is published at the <uri
link="http://www.adelielinux.com">Adelie Linux R&amp;D Centre</uri> web site,
and is reproduced here with the permission of the authors and <uri
link="http://www.cyberlogic.ca">Cyberlogic</uri>'s Adelie Linux R&amp;D
Centre.
</p>

<ul>
<li><uri>http://www.gentoo.org</uri>, Gentoo Foundation, Inc.</li>
<li>
<uri link="http://www.adelielinux.com">http://www.adelielinux.com</uri>,
Adelie Linux Research and Development Centre
</li>
<li>
<uri link="http://nfs.sourceforge.net/">http://nfs.sourceforge.net/</uri>,
Linux NFS Project
</li>
<li>
<uri link="http://www-unix.mcs.anl.gov/mpi/mpich/">http://www-unix.mcs.anl.gov/mpi/mpich/</uri>,
Mathematics and Computer Science Division, Argonne National Laboratory
</li>
<li>
<uri link="http://www.ntp.org/">http://www.ntp.org/</uri>
</li>
<li>
<uri link="http://www.eecis.udel.edu/~mills/">http://www.eecis.udel.edu/~mills/</uri>,
David L. Mills, University of Delaware
</li>
<li>
<uri link="http://www.ietf.org/html.charters/secsh-charter.html">http://www.ietf.org/html.charters/secsh-charter.html</uri>,
Secure Shell Working Group, IETF, Internet Society
</li>
<li>
<uri link="http://www.linuxsecurity.com/">http://www.linuxsecurity.com/</uri>,
Guardian Digital
</li>
<li>
<uri link="http://www.openpbs.org/">http://www.openpbs.org/</uri>,
Altair Grid Technologies, LLC.
</li>
</ul>

</body>
</section>
</chapter>

</guide>