

FAI Guide (Fully Automatic Installation)
Chapter 8 - How to build a Beowulf cluster using FAI


This chapter describes the details of building a Beowulf cluster using Debian GNU/Linux and FAI. It was written for FAI version 2.x on Debian woody and has not yet been updated. The example configuration files were removed from the fai packages after FAI 2.8.4. For more information about the Beowulf concept, see www.beowulf.org.


8.1 Planning the Beowulf setup

The example Beowulf cluster consists of one master node and 25 clients. A big rack was assembled, into which all the cases were put. A keyboard and a monitor, which are connected to the master server most of the time, were also put into the rack. But since we have very long cables for the monitor and keyboard, they can also be connected to any node if something has to be changed in the BIOS, or when looking for errors if a node does not boot. Power supply is another topic you have to think about. Don't connect many nodes to one power cord and one outlet; distribute them among several breakout boxes and outlets. Also think about heat emission: a dozen nodes in a small room can create too much heat, so you may need an air conditioner. And check whether the power supplies of the nodes go to stand-by mode after a power failure, or whether all nodes are turned on simultaneously.

All computers in this example are connected to a Fast Ethernet switch. The master node (or master server) is called nucleus. It has two network cards: one for the connection to the external Internet, and one for the connection to the internal cluster network. From the external Internet the host is reached as nucleus, but the cluster nodes access the master node by the name atom00, which belongs to the second network interface. The master server is also the install server for the computing nodes. A local Debian mirror will be installed on its local hard disk. The home directories of all user accounts are also located on the master server and will be exported via NFS to all computing nodes. NIS will be used to distribute account, host, and printer information to all nodes.

All client nodes atom01 to atom25 are connected via the switch to the second network card of the master node. They can only connect to the other nodes or the master, but cannot communicate with any host outside their cluster network. So all services (NTP, DNS, NIS, NFS, ...) must be available on the master server. I chose the class C network address 192.168.42.0 for the local Beowulf cluster network. You can replace the subnet 42 with any other number you like. If you have more than 253 computing nodes, choose a class A network address (10.X.X.X).

While preparing the installation, you will have to boot the first install client many times, until there is no fault left in your configuration scripts. Therefore you should have physical access to the master server and one client node. Connect both computers to a monitor/keyboard switch box, so that one keyboard and one monitor can be shared between them.


8.2 Set up the master server

The master server has to be installed by hand if it is your first computer running Debian. If you already have a host running Debian, you can also install the master server via FAI. Create a partition mounted on /files/scratch/debmirror for the local Debian mirror, with more than 22 GB of space available.
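
For example, assuming the mirror gets its own partition on a second disk (the device name /dev/hdb1 is only an example), you could create and mount the file system like this:

     # mke2fs -j /dev/hdb1
     # mkdir -p /files/scratch/debmirror
     # echo "/dev/hdb1 /files/scratch/debmirror ext3 defaults 0 2" >> /etc/fstab
     # mount /files/scratch/debmirror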


8.2.1 Set up the network

Add the following lines for the second network card to /etc/network/interfaces:

     # Beowulf cluster connection
     auto eth1
     iface eth1 inet static
     address 192.168.42.250
     netmask 255.255.255.0
     broadcast 192.168.42.255

Add the IP addresses of all client nodes to /etc/hosts. The FAI package contains an example for this file:

     # create these entries with the Perl one liner
     # perl -e 'for (1..25) {printf "192.168.42.%s atom%02s\n",$_,$_;}'
     
     # Beowulf nodes
     # atom00 is the master server
     192.168.42.250 atom00
     192.168.42.1 atom01
     192.168.42.2 atom02

You can give the internal Beowulf network a name when you add this line to /etc/networks:

     beowcluster 192.168.42.0

Activate the second network interface with: /etc/init.d/networking start.
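
To verify that the interface is up, you can for example inspect its address and ping the master's cluster name:

     # ifconfig eth1
     # ping -c 3 atom00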


8.2.2 Setting up NIS

Add a normal user account tom for the person who edits the configuration space and manages the local Debian mirror:

     # adduser tom
     # addgroup linuxadmin

This user should also be in the group linuxadmin.

     # adduser tom linuxadmin

First set the NIS domain name by creating the file /etc/defaultdomain and calling domainname(8). Edit /etc/default/nis so the host becomes a NIS master server, then initialize the server by calling /usr/lib/yp/ypinit -m. Then copy the file netgroup from the examples directory to /etc and edit the other files there. A minimal sketch of these steps, assuming the NIS domain is called beowulf (the name is only an example):
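
     # echo beowulf > /etc/defaultdomain
     # domainname beowulf
     # editor /etc/default/nis       # set NISSERVER=master
     # /usr/lib/yp/ypinit -m

Next, adjust access to the NIS service: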

     # cat /etc/ypserv.securenets
     # Always allow access for localhost
     255.0.0.0       127.0.0.0
     # This line gives access to the Beowulf cluster
     255.255.255.0 192.168.42.0

Rebuild the NIS maps:

     # cd /var/yp; make

You will find much more information about NIS in the NIS-HOWTO document.


8.2.3 Create a local Debian mirror

Now the user tom can create a local Debian mirror on /files/scratch/debmirror using mkdebmirror. You can add the option --debug to see which files are retrieved. The mirror needs about 22 GB of disk space for Debian 3.0 (aka woody). Export this directory read-only to the netgroup @faiclients. Here's the line for /etc/exports:

     /files/scratch/debmirror @faiclients(ro)
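
After editing /etc/exports, tell the NFS server to re-read it, for example with:

     # exportfs -ra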


8.2.4 Install FAI package on the master server

Add the following packages to the install server:

     nucleus:/# apt-get install ntp tftpd-hpa dhcp3-server \
     nfs-kernel-server etherwake fai
     nucleus:/# tasksel -q -n install dns-server
     nucleus:/# apt-get dselect-upgrade

Configure NTP so that the master server will have the correct system time.
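
A minimal sketch for /etc/ntp.conf, assuming the master server can reach public time servers (the server names are only examples):

     server 0.pool.ntp.org
     server 1.pool.ntp.org
     # allow the cluster nodes to query this time server
     restrict 192.168.42.0 mask 255.255.255.0 nomodify notrap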

It's very important to use the internal network name atom00 for the master server (not the external name nucleus) in /etc/dhcp3/dhcpd.conf and /etc/fai/make-fai-nfsroot.conf. Replace all occurrences of the string FAISERVER with atom00, and uncomment the following line in /etc/fai/make-fai-nfsroot.conf so the Beowulf nodes can use this name for connecting to their master server.

     NFSROOT_ETC_HOSTS="192.168.42.250 atom00"
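
On the DHCP side, the relevant part of /etc/dhcp3/dhcpd.conf could look like this sketch (the MAC address is an example, and the boot file name depends on your tftp setup):

     subnet 192.168.42.0 netmask 255.255.255.0 {
         option routers 192.168.42.250;
         next-server atom00;
         filename "fai/pxelinux.0";
         host atom01 {
             hardware ethernet 00:11:22:33:44:55;
             fixed-address atom01;
         }
     }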


8.2.5 Prepare network booting

Set up the install server daemon as described in Booting from network card with a PXE conforming boot ROM, Section 4.2. If you have many cluster nodes (more than about 10) and use rsh (as set in /etc/fai/fai.conf), raise the number of connections per minute allowed for these services in inetd.conf:

     shell stream tcp  nowait.300  root /usr/sbin/tcpd /usr/sbin/in.rshd
     login stream tcp  nowait.300  root /usr/sbin/tcpd /usr/sbin/in.rlogind

The user tom should have permission to create the symlinks for booting via network card, so change the group of the tftp directory and install some utilities:

     # chgrp -R linuxadmin /srv/tftp/fai; chmod -R g+rwx /srv/tftp/fai
     # cp /usr/share/doc/fai-doc/examples/utils/* /usr/local/bin

Now the user tom sets the boot image for the first Beowulf node:

     fai-chboot -IFv atom01

Now boot the first client node for the first time, then start adjusting the configuration for your client nodes. Don't forget to build the kernel for the cluster nodes using make-kpkg(8) and to store it in /srv/fai/config/files/packages.
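
Once the configuration works for atom01, you can enable network booting for all 25 nodes in one go, for example with this small shell loop:

     for h in $(seq -w 1 25); do fai-chboot -IFv atom$h; done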


8.3 Tools for Beowulf clusters

The following tools are useful for a Beowulf cluster:

tlink

Changes the symbolic link that points to the kernel image used for booting from a network card. Only used when booting via BOOTP.

all_hosts

Prints a list of all hosts, only the hosts which respond to a ping, or only the hosts which do not respond. The complete list of hosts is defined by the netgroup allhosts. Look at /usr/share/doc/fai-doc/examples/etc/netgroup for an example.

rshall

Executes a command via rsh on all hosts which are up. It uses all_hosts to get the list of hosts that are up. You can also use the dsh(1) command (dancer's shell, or distributed shell).

rup

The command rup(1) briefly shows the CPU load of every host.

clusterssh

The package clusterssh allows you to control multiple ssh or rsh sessions at the same time.
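
For example, to control the first three nodes at once (the cssh command is provided by the clusterssh package):

     cssh atom01 atom02 atom03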

These are some common tools for a cluster environment:

rgang

For a huge cluster, try rgang. It is a tool which executes commands on, or distributes files to, many nodes. It uses an algorithm to build a tree-like structure, so the distribution processing time scales very well to 1000 or more nodes (available at fermitools.fnal.gov/abstracts/rgang/abstract.html).

jmon

For observing the resources of all clients (CPU, memory, swap, ...) you can use jmon(1), which installs a simple daemon on every cluster node.

ganglia

This toolkit is very good for monitoring your cluster and provides a nice web frontend. It is available at ganglia.sourceforge.net/.


8.4 Wake on LAN with 3Com network cards

Wake on LAN is a very nice feature for powering on a computer without having physical access to it. By sending a special Ethernet packet to the network card, the computer is turned on. The following things have to be done to use the wake on LAN (WOL) feature:

  1. Connect the network card to the Wake-On-LAN connector on the motherboard using a 3-pin cable.

  2. My ASUS K7M motherboard has a jumper called Vaux (3VSBSLT) which selects the voltage supplied to add-in PCI cards. Set it to Add 3VSB (3 Volt stand-by).

  3. Turn on the wake on LAN feature in the BIOS.

  4. For a 3Com card using the 3c59x driver, enable the WOL feature using the kernel module option enable_wol, as shown in the sketch below.

To wake up a computer, use the command ether-wake(8). Additional information is available from www.scyld.com/expert/wake-on-lan.html.
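
A minimal sketch of the last step and of waking a node, assuming a 2.4 kernel that reads module options from /etc/modules.conf (the MAC address below is an example):

     # in /etc/modules.conf: enable WOL when the 3c59x module is loaded
     options 3c59x enable_wol=1

Then the master server can wake a node by the MAC address of its network card:

     nucleus:/# ether-wake 00:11:22:33:44:55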

