NOTE: starting from 4/15/2004, Wednesday classes will take place in the Unisys lab (D112).

High Performance Computational Cluster: STEP 2

Practicing with MPI and installing Sun Grid Engine

After you make sure MPI works fine on your cluster, you can try to compile
and run the sample codes in /usr/local/mpich-1.2.5/examples/basic, for example,
cpi.c, iotest.c, etc. You can compile them with mpicc and submit them to run
with mpirun, as sketched below. Eventually, each of you will need to write
your own MPI code. If you are not familiar with MPI yet, then, as the first
step, read the articles in Linux Gazette and Linux Magazine, skipping the PVM
material in them. You should try running the example codes from the articles.
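For example, building and launching cpi.c on four processes might look like
this (the process count and the ~/machines file, which lists your node names,
are assumptions; adjust them to your cluster):

cd /usr/local/mpich-1.2.5/examples/basic
# compile the example with the MPI C compiler wrapper
mpicc -o cpi cpi.c
# run on 4 processes across the nodes listed in ~/machines
mpirun -np 4 -machinefile ~/machines cpi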

To share resources on your cluster between multiple applications and users,
you need to install a queue scheduling system, specifically in our case,
Sun Grid Engine. You need to download and install it. First install it on the
master node: create an account for the queue system administrator, for
example, sgeadmin; create an installation directory, for example,
/usr/local/sge; then follow the installation instructions, which are provided
with the package. On the master node, you will need to run the install_qmaster
and install_execd installation scripts. Then you export the installation
directory over NFS to the computational nodes and run install_execd on the
computational nodes. After the installation is done, you will need to
configure the queue system. A sketch of the whole sequence is given below.
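A minimal sketch of the installation sequence, using the example account and
directory names above (the NFS export line is an assumption; substitute your
cluster's private subnet), might look like this:

# on the master node
/usr/sbin/useradd sgeadmin          # queue system administrator account
mkdir /usr/local/sge                # installation directory
cd /usr/local/sge
# unpack the SGE distribution here, then run the installation scripts
./install_qmaster
./install_execd
# export /usr/local/sge over NFS, e.g. add to /etc/exports and restart nfs:
#   /usr/local/sge   <your-cluster-subnet>(rw,no_root_squash)
# on each computational node, mount the directory and run
./install_execd

Once the daemons are running, you can check the execution hosts with qhost,
submit a test job with qsub, and adjust the configuration with qconf.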
 





Building Linux Virtual Server: STEP 2

Following the installation instructions for LVS on RH 9, download and install on the balancer machine  
the patched Ultra Monkey kernel, ipvsadm, libnet and related Perl modules.
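Assuming the components come as RPM packages (the file names below are
hypothetical placeholders; use the exact versions you downloaded), the
sequence might look roughly like this:

# patched Ultra Monkey kernel (hypothetical file name)
rpm -ivh kernel-<version>.um.i686.rpm
# ipvsadm, libnet and the related Perl modules (hypothetical file names)
rpm -ivh ipvsadm-<version>.rpm
rpm -ivh libnet-<version>.rpm
rpm -ivh perl-<module>-<version>.rpm
# reboot into the patched kernel
shutdown -r now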

LVS via NAT

Assuming that your traffic director machine is already configured
 for two IP addresses and runs masquerading with IPTABLES:

/sbin/iptables -A POSTROUTING -t nat \
               -s 192.168.5.0/24 -d ! 192.168.5.0/24 -j MASQUERADE

and the routing table on the real servers is set 
to relay traffic for 192.168.6.0/24 through unisys14:  
route add -net 192.168.6.0 netmask 255.255.255.0 gw 192.168.5.14

Assuming your load balancer has VIP 192.168.6.14 on eth0:0 and your real 
 servers are 192.168.5.34 and 192.168.5.35, set the following rules:

/sbin/ipvsadm -A -t 192.168.6.14:80 -s wlc
/sbin/ipvsadm -a -t 192.168.6.14:80 -r 192.168.5.34:80 -m
/sbin/ipvsadm -a -t 192.168.6.14:80 -r 192.168.5.35:80 -m

Check the active rules:
/sbin/ipvsadm -Ln
Schematically, these rules can be represented in the Figure below:

[Figure: LVS-NAT scheme]

In /var/www/html/index.html on node14 (one of the real servers), change
"Test Page" to "Test Page on node14". On node15 (the other real server),
change it to "Test Page on node15". From unisys15 (the client machine),
connect to http://192.168.6.14 and hit reload several times. The director
should re-direct the traffic between the servers.

However, the balancer wouldn't know if one of the servers dies. You can manage
the ipvsadm table through the http-monitoring script, httpdMonitor.sh (a
minimal sketch of such a script is given at the end of this page). Clear the
ipvsadm table with
/sbin/ipvsadm -C
modify httpdMonitor.sh with your settings, and start the script with
httpdMonitor.sh &

Connect to one of the real servers and stop Apache:
/etc/rc.d/init.d/httpd stop
On the balancer, check the ipvsadm table with
/sbin/ipvsadm -Ln
From unisys15 (the client machine), try to connect to http://192.168.6.14.
Start Apache:
/etc/rc.d/init.d/httpd start
Then try to connect to http://192.168.6.14 from unisys15 again.

Copy httpdMonitor.sh into the /usr/local/sbin directory and write a startup
script for it, /etc/rc.d/init.d/webmon, to start/stop httpdMonitor.sh with
/etc/rc.d/init.d/webmon start and /etc/rc.d/init.d/webmon stop.
Hint: there was a discussion on startup scripts in Lesson 7.

LVS via Direct Routing

Make sure that your traffic director machine is prepared for packet
forwarding, i.e. /etc/sysctl.conf should look as follows:

# Enables packet forwarding
net.ipv4.ip_forward = 1
# Disables source route verification
net.ipv4.conf.default.rp_filter = 0
# Disables the magic-sysrq key
kernel.sysrq = 0

If you have modified the settings in /etc/sysctl.conf, run
/sbin/sysctl -p

Set your load balancer VIP to 192.168.5.44:
/sbin/ifconfig eth0:0 192.168.5.44 netmask 255.255.255.255 broadcast 192.168.5.44 up

Set the IPVSADM rules for Direct Routing:
/sbin/ipvsadm -C
/sbin/ipvsadm -A -t 192.168.5.44:80 -s wlc
/sbin/ipvsadm -a -t 192.168.5.44:80 -r 192.168.5.34 -g
/sbin/ipvsadm -a -t 192.168.5.44:80 -r 192.168.5.35 -g
/sbin/ipvsadm -L

With the Direct Routing technique, the balancer rewrites the frame of the
packet that came from the client, putting in the MAC address of the chosen
real server, and re-directs the frame to that server. Your real servers keep
their original IP addresses, i.e. 192.168.5.34 and 192.168.5.35. Note that the
director's IP, the VIP, and the servers' IP addresses are all on the same
subnet now, 192.168.5.0/24. The client machine should also be on the same
subnet.

If you still have the old route for net 192.168.6.0 on the real servers,
remove it:
route del -net 192.168.6.0/24

Set a redirect rule for the VIP address on the real servers using iptables:
/sbin/iptables -t nat -A PREROUTING -d 192.168.5.44 -j REDIRECT
If a real server receives a packet destined for 192.168.5.44, it will pass it
up its TCP/IP stack to the application, the web service.
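Before testing, you can verify the setup with a couple of standard checks
(assuming the addresses above):

# on the director: the VIP should be up on eth0:0 and the DR rules listed
/sbin/ifconfig eth0:0
/sbin/ipvsadm -Ln
# on each real server: the REDIRECT rule should appear in the nat table
/sbin/iptables -t nat -L PREROUTING -n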
From unisys15, the client machine with IP address 192.168.5.15, connect to
http://192.168.5.44 and hit reload several times. The director should
re-direct the traffic between the servers. The servers send their responses
directly to the client machine, as shown in the Figure above.

Modify the httpdMonitor.sh script to work on the balancer with the Direct
Routing scheme: set the parameter CF="-g". Clear the ipvsadm table with
/sbin/ipvsadm -C
and start the script with
httpdMonitor.sh &

Connect to one of the real servers and stop Apache:
/etc/rc.d/init.d/httpd stop
On the balancer, check the ipvsadm table with
/sbin/ipvsadm -Ln
From unisys15 (the client machine), try to connect to http://192.168.5.44.
Start Apache:
/etc/rc.d/init.d/httpd start
Then try to connect to http://192.168.5.44 from unisys15 again.

You can copy httpdMonitor.sh into the /usr/local/sbin directory and start/stop
it with webmon start and webmon stop, as you already did for the LVS-NAT
scheme.
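The httpdMonitor.sh script itself is not reproduced on this page. For
reference, a minimal sketch of what such an HTTP-monitoring script might look
like is given below; the addresses, port and the CF parameter ("-m" for
LVS-NAT, "-g" for Direct Routing) come from the examples above, while the
check interval and the use of wget are assumptions, not the actual course
script:

#!/bin/bash
# Minimal HTTP-monitoring sketch for the LVS director (an assumption, not the
# actual course script): probe each real server's web service and add or
# remove it from the ipvsadm table accordingly.

VIP=192.168.5.44                     # 192.168.6.14 for the LVS-NAT scheme
PORT=80
SERVERS="192.168.5.34 192.168.5.35"
CF="-g"                              # forwarding method: "-m" NAT, "-g" Direct Routing
SLEEP=10                             # seconds between checks

# make sure the virtual service exists
/sbin/ipvsadm -A -t $VIP:$PORT -s wlc 2>/dev/null

while true; do
    for RS in $SERVERS; do
        # probe the web server; wget fails if the page cannot be fetched
        if wget -q -O /dev/null http://$RS:$PORT/; then
            # server answers: make sure it is in the table
            /sbin/ipvsadm -a -t $VIP:$PORT -r $RS:$PORT $CF 2>/dev/null
        else
            # server does not answer: remove it from the table
            /sbin/ipvsadm -d -t $VIP:$PORT -r $RS:$PORT 2>/dev/null
        fi
    done
    sleep $SLEEP
done

To have it start at boot time, wrap it into the webmon startup script under
/etc/rc.d/init.d, as described in the LVS-NAT part above.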