NOTE: starting from 4/15/2004, Wednesday classes will take place in the Unisys lab (D112).
High Performance Computational Cluster: STEP 2
Practicing with MPI and installing Sun Grid Engine
After you make sure MPI works fine on your cluster, you can try to compile
and run the sample codes in /usr/local/mpich-1.2.5/examples/basic, for example,
cpi.c, iotest.c, etc. Compile them with mpicc and run them with mpirun.
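For example, a compile-and-run cycle for cpi.c might look like this; the process count and the machines file path are only illustrative and depend on how MPICH was installed on your cluster:
cd /usr/local/mpich-1.2.5/examples/basic
# build the pi-calculation example with the MPI wrapper compiler
mpicc -o cpi cpi.c
# run it on 4 processes spread over the nodes listed in the machines file
mpirun -np 4 -machinefile /usr/local/mpich-1.2.5/share/machines.LINUX ./cpi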
Eventually, each of you will need to write your own MPI code.
If you are not familiar with MPI yet,
then, as a first step, you can read the articles in Linux Gazette
and Linux Magazine, skipping the PVM material. You should try
running the example codes from the articles.
To share the resources of your cluster between multiple applications and users,
you need to install a queue scheduling system, in our case
Sun Grid Engine.
You need to download and install it. First install it on the master node:
create an account for the queue system administrator, for example, sgeadmin;
create an installation directory, for example, /usr/local/sge;
then follow the installation instructions provided with the
package. On the master node, you will need to run the install_qmaster and
install_execd installation scripts. Then export the installation directory
over NFS to the computational nodes and run install_execd on each computational
node. After the installation is done, you will need to configure the queue
system.
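Schematically, the sequence might look as follows; the account name, the SGE_ROOT directory /usr/local/sge, and the export line are just examples, and the exact prompts and paths depend on the SGE version:
# on the master node
useradd sgeadmin
mkdir /usr/local/sge
# unpack the SGE distribution into /usr/local/sge, then run the installers
cd /usr/local/sge
./install_qmaster
./install_execd
# export the installation directory to the computational nodes,
# e.g. add to /etc/exports:  /usr/local/sge 192.168.5.0/255.255.255.0(rw,sync)
# on each computational node, mount /usr/local/sge and run
cd /usr/local/sge
./install_execd
# to work with the queue system, source the settings file (default cell name assumed)
. /usr/local/sge/default/common/settings.sh
qhost                        # list the execution hosts
echo /bin/hostname | qsub    # submit a trivial test job
qstat                        # watch the job queue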
Building Linux Virtual Server: STEP 2
Following the installation instructions for LVS on RH 9, download and install
the patched Ultra Monkey kernel, ipvsadm, libnet, and the related Perl modules on the balancer machine.
LVS via NAT
Make sure that your traffic director machine is already configured
with two IP addresses and runs masquerading with iptables:
/sbin/iptables -A POSTROUTING -t nat \
-s 192.168.5.0/24 -d ! 192.168.5.0/24 -j MASQUERADE
and that the routing table on the real servers relays
traffic for 192.168.6.0/24 through unisys14:
route add -net 192.168.6.0 netmask 255.255.255.0 gw 192.168.5.14
Assuming your load balancer has VIP 192.168.6.14 on eth0:0 and your real
servers are 192.168.5.34 and 192.168.5.35, set the following rules:
/sbin/ipvsadm -A -t 192.168.6.14:80 -s wlc
/sbin/ipvsadm -a -t 192.168.6.14:80 -r 192.168.5.34:80 -m
/sbin/ipvsadm -a -t 192.168.6.14:80 -r 192.168.5.35:80 -m
Check the active rules:
/sbin/ipvsadm -Ln
Schematically, these rules can be represented in the Figure below:
[Figure: schematic representation of the LVS-NAT rules above]
In /var/www/html/index.html on node14 (one of the real servers),
change "Test Page" to
"Test Page on node14". On node15 (the other real server), change it to
"Test Page on node15".
From unisys15 (the client machine), connect to http://192.168.6.14 and
hit reload several times. The director should distribute the traffic
between the servers.
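You can also watch the alternation from the command line on unisys15, assuming wget is available there:
# fetch the page several times; the responses should alternate between node14 and node15
for i in 1 2 3 4 5; do wget -q -O - http://192.168.6.14/; done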
However, the balancer would not notice if one of the servers dies.
You can drive ipvsadm from the HTTP-monitoring script,
httpdMonitor.sh, which keeps only the responding servers in the ipvsadm table.
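httpdMonitor.sh is provided with the lesson materials; schematically, such a monitor could be organized along the following lines (the wget probe, the 2-second interval, and the variable layout here are assumptions, not the actual script):
#!/bin/sh
# keep the ipvsadm table in sync with the real servers that actually answer HTTP
VIP=192.168.6.14:80
SERVERS="192.168.5.34 192.168.5.35"
CF="-m"          # forwarding method: -m for NAT, -g for Direct Routing
/sbin/ipvsadm -A -t $VIP -s wlc 2>/dev/null
while true; do
    for S in $SERVERS; do
        if wget -q -O /dev/null http://$S/; then
            # server answers: make sure it is in the table
            /sbin/ipvsadm -a -t $VIP -r $S:80 $CF 2>/dev/null
        else
            # server is down: remove it from the table
            /sbin/ipvsadm -d -t $VIP -r $S:80 2>/dev/null
        fi
    done
    sleep 2
done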
Clear the ipvsadm table with /sbin/ipvsadm -C;
modify httpdMonitor.sh with your settings;
start the script with
httpdMonitor.sh &
Connect to one of the real servers and stop Apache:
/etc/rc.d/init.d/httpd stop
On the balancer, check the ipvsadm table with
/sbin/ipvsadm -Ln
From unisys15 (the client machine), try to connect to http://192.168.6.14.
Start Apache:
/etc/rc.d/init.d/httpd start
Then try to connect to http://192.168.6.14 from unisys15 again.
Copy httpdMonitor.sh into the /usr/local/sbin directory and write a
startup script for it, /etc/rc.d/init.d/webmon, to
start/stop httpdMonitor.sh with
/etc/rc.d/init.d/webmon start
and
/etc/rc.d/init.d/webmon stop.
Hint: there was a discussion on startup scripts in Lesson 7.
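A minimal sketch of such a script, assuming httpdMonitor.sh sits in /usr/local/sbin; the chkconfig header and the way the monitor is stopped are up to you (compare with the startup scripts from Lesson 7):
#!/bin/sh
# /etc/rc.d/init.d/webmon - start/stop the httpdMonitor.sh LVS monitor
# chkconfig: 345 99 01
# description: HTTP monitor that maintains the ipvsadm table
case "$1" in
  start)
        echo "Starting webmon"
        /usr/local/sbin/httpdMonitor.sh &
        ;;
  stop)
        echo "Stopping webmon"
        killall httpdMonitor.sh
        ;;
  restart)
        $0 stop
        $0 start
        ;;
  *)
        echo "Usage: webmon {start|stop|restart}"
        exit 1
        ;;
esac
exit 0
Remember to make the script executable (chmod 755 /etc/rc.d/init.d/webmon); if you gave it a chkconfig header, you can also register it with chkconfig --add webmon.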
LVS via Direct Routing
Make sure that your traffic director machine is prepared for packet forwarding,
i.e. /etc/sysctl.conf should look as follows:
# Enables packet forwarding
net.ipv4.ip_forward = 1
# Disables source route verification
net.ipv4.conf.default.rp_filter = 0
# Disables the magic-sysrq key
kernel.sysrq = 0
If you have modified the settings in /etc/sysctl.conf, run
/sbin/sysctl -p
Set your load balancer VIP to 192.168.5.44:
/sbin/ifconfig eth0:0 192.168.5.44 netmask 255.255.255.255 broadcast 192.168.5.44 up
Set IPVSADM rules for Direct Routing:
/sbin/ipvsadm -C
/sbin/ipvsadm -A -t 192.168.5.44:80 -s wlc
/sbin/ipvsadm -a -t 192.168.5.44:80 -r 192.168.5.34 -g
/sbin/ipvsadm -a -t 192.168.5.44:80 -r 192.168.5.35 -g
/sbin/ipvsadm -L
With the Direct Routing technique, the balancer rewrites the Ethernet frame of a
packet that came from the client, replacing the destination MAC address with that
of the chosen real server, and then forwards the frame to that server; the IP packet itself is not modified.
Your real servers keep their original IP addresses, i.e. 192.168.5.34 and 192.168.5.35.
Note that the director's IP, the VIP, and the real servers' IP addresses are
now on the same subnet, 192.168.5.0/24. The client machine should also be on the
same subnet. If you still have the old route for the 192.168.6.0/24 network on the
real servers, remove it:
route del -net 192.168.6.0/24
Set a redirect rule for the VIP address
on the real servers using iptables:
/sbin/iptables -t nat -A PREROUTING -d 192.168.5.44 -j REDIRECT
When a real server receives a packet with destination address 192.168.5.44, this rule
passes it up the local TCP/IP stack to the application, the web service.
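You can confirm that the rule is in place on each real server with:
/sbin/iptables -t nat -L PREROUTING -n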
From unisys15, the client machine with IP address 192.168.5.15, connect to http://192.168.5.44 and
hit reload several times. The director should distribute the traffic
between the servers. The servers send their responses directly to the client
machine, as shown in the Figure above.
Modify the httpdMonitor.sh script to work on the balancer with the Direct Routing
scheme: set the parameter CF="-g". Clear the ipvsadm table with /sbin/ipvsadm -C;
start the script with
httpdMonitor.sh &
Connect to one of the real servers and stop Apache:
/etc/rc.d/init.d/httpd stop
On the balancer, check the ipvsadm table with
/sbin/ipvsadm -Ln
From unisys15 (the client machine), try to connect to http://192.168.5.44.
Start Apache:
/etc/rc.d/init.d/httpd start
Then try to connect to http://192.168.5.44 from unisys15 again.
You can copy httpdMonitor.sh into the /usr/local/sbin directory and
start/stop it with webmon start and
webmon stop, as you already did for the LVS-NAT scheme.