Installation Guide for the M-grid Clusters

Author: Arto Teräs
Status: Final, version 1.13
Date: 2005-07-28

This guide describes how to install the Rocks cluster distribution, including the CSC customizations, on the M-grid clusters. The guide is brief and concentrates on the M-grid specific details; for a more detailed general description of each step, refer to the Rocks Users Guide.

The physical installation of the clusters, which should be done by the supplier (HP), is described in a separate document titled Installation and Configuration Instructions for the M-grid Clusters.

Download the installation images

The preferred method of installation is a network install from the CSC distribution server rocks.csc.fi. For this you only need a small network boot image, which should be burned onto a cd. The fallback method is installing from a cd set, burning a complete Rocks installation cd and each roll onto a separate cd. All cd images are available from

http://rocks.csc.fi/install/downloads/.

The server is not accessible from everywhere. Contact CSC before installation and give your IP address (or address range) so that we can open access for you.
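
For example, fetching the network boot image and burning it onto a cd might look roughly like the following. The file name boot-mgrid.iso is only a placeholder here; use the actual image names shown on the download page, and check the correct burner device with cdrecord -scanbus if needed.

    # Download the network boot image (placeholder file name, see the download page)
    wget http://rocks.csc.fi/install/downloads/boot-mgrid.iso

    # Burn the image onto a cd; adjust dev= for your burner
    # ("cdrecord -scanbus" lists the available devices)
    cdrecord -v dev=/dev/cdrom boot-mgrid.iso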

Installing the admin server and cluster front end

NOTE 1! We made an error in the installation instructions sent to HP, indicating that the first ethernet interface of the admin server and the frontend should be connected to the public network. However, Rocks uses the first interface for the compute network; the second interface is for the public network. You will probably have to switch the network cables of the first two interfaces before installation to fix this.

NOTE 2! When there is an additional network card installed (front ends and admin nodes), it is somewhat difficult to predict which card is detected first by the Linux kernel. In the mini cluster at CSC, the supplementary card was detected first in the admin server (DL145), but the internal card was first in the frontend (DL585). Therefore the first and second interfaces in the front end are the ports integrated on the motherboard, while in the admin server they are the ports located on the supplementary card. This should be taken into account when connecting the cables. If the order of detection is reversed for some reason, moving the cables to another port may be necessary for the installation to succeed.
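
If you are unsure which physical connector corresponds to which interface name, one way to find out (a suggestion, not part of the official instructions) is to blink the port LED with ethtool from a running Linux system:

    # Blink the LED of the port the kernel calls eth0 for 10 seconds,
    # then watch which physical connector flashes
    ethtool -p eth0 10

    # Repeat for eth1 to confirm the mapping before connecting the cables
    ethtool -p eth1 10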

Install the administration server first and then the cluster front end. The basic installation procedure is the same for both; the network installation is described first.

  1. Boot the machine from cd
  2. Select "mgrid" (just press return) in the initial screen
  3. Configure network settings for the external interface eth1. The machine will first try to get an address using DHCP. If you don't have DHCP, just wait for it to fail and fill in your fixed IP, gateway etc. in the dialog that appears. After that the system should automatically contact rocks.csc.fi and start the network installation.
  4. Fill in relevant cluster information (name, owner etc.)
  5. Roll selection phase 1. For the admin server, select rolls hpc and madm. For the frontend, select rolls hpc, sge and mgrid.
  6. Roll selection phase 2. This is a bit tricky. The system will ask if you have another roll server. For the admin server, select no. For the front end, select yes. Then deselect all the rolls on the upcoming list and select arch => i386 => AMD roll (32-bit compatibility roll). When asked again whether you have yet another roll server, say no.
  7. Disk partitioning. Select Disk Druid to partition your disk, referring to a separate document titled Disk partitioning in M-grid clusters. For the front-end, HP should have preconfigured the system so that you have two visible disks: a 72 GB system disk (actually a RAID-1 setup consisting of two disks) and a 1 TB or 1.75 TB large disk (RAID-5). If this is not the case, please ask the HP representative for assistance in configuring the hardware RAID. In the admin server there is no hardware RAID, so you should create a software RAID-1 of the two system disks using the Disk Druid interface (a quick post-installation check of the RAID is sketched after this list).
    Hint: The user interface in Disk Druid is a bit awkward; you'll have to explicitly select the disk each time to ensure that the new partition will be created on the disk you want.
  8. Network settings. Configure the network settings for the frontend and for the admin server. After the interface-specific settings there is another dialog for the gateway, DNS etc.
  9. Select a root password. The password should be different on the front end and admin server.
  10. Installing packages. This takes some time, so go get some coffee. ;) If something goes wrong, there will be some error message and a reboot; Rocks error handling at this stage is quite poor...
  11. The system will reboot, asking you to take out the cd.
  12. During your first login (as root), an ssh keypair will be generated to enable access to the nodes without a password. Leave the passphrase empty.
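
As referred to in step 7, after the admin server has been installed and rebooted you can quickly verify that the software RAID-1 is active. This is only a suggested check, not part of the official procedure:

    # Software RAID status: the system disk mirror should show state [UU]
    cat /proc/mdstat

    # Mounted partitions and their sizes
    df -h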

If you are installing from the cd set, the procedure is almost the same, but you won't need to configure network settings for a network installation in the beginning. Instead, you'll need to feed in the roll cds one at a time when asked to do so, and then again during the package installation.

Installing nodes

Computing nodes are installed by running a daemon on the front-end which listens to bootp requests, and by powering up the nodes one by one. It is not necessary to wait for each installation to finish before starting with the next node. However, it is a good idea to first install one node completely and see that it works properly before moving on to the other nodes. For example, if partitioning or installing the bootloader fails for some reason, the nodes end up in a state where one needs to manually wipe the hard disks or force booting from the network (and later change the boot order back).

Before starting the node installation, download a partition schema file which suits your hard disk setup. See the disk partitioning document for more details and for the list of available schema files.

The partitioning schema file should be copied to /export/home/install/site-profiles/3.2.0/nodes/replace-auto-partition.xml on the cluster front-end. After that, change to the directory /home/install and run the command rocks-dist dist.
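
For example, assuming the downloaded schema file is called schema.xml (a placeholder name), the commands on the front-end would look roughly like this:

    # Put the schema in place under the Rocks site profiles
    cp schema.xml /export/home/install/site-profiles/3.2.0/nodes/replace-auto-partition.xml

    # Rebuild the distribution so that the node installer picks up the new profile
    cd /home/install
    rocks-dist dist

Then move on to the actual installation of the nodes: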

  1. Type insert-ethers on the front end
  2. Select 'Compute' from the list
  3. Power up the first node. Wait until you can see its MAC address appearing on the frontend "Inserted Appliances" screen
  4. The node should start installing the operating system automatically. You can view the progress by opening another virtual console and logging in to the node during installation using ssh, as sketched after this list (or by connecting another monitor to the node)
  5. If the node is installing packages, everything is fine and you may boot the next node (no need to wait for the first installation to finish)
  6. When all nodes have been installed, move to the post-configuration phase.
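
As mentioned in step 4, you can follow an installation over ssh from the front-end. A possible way to do it (the log file names depend on the installer version and are given here only as an example):

    # From the front-end, log in to a node that is being installed
    ssh compute-0-0

    # The installer's log files are typically under /tmp on the node,
    # e.g. the package installation log (exact names may vary by version)
    tail -f /tmp/install.log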

NOTE! Rocks assigns names and IP addresses to the nodes in successive order: the first node will be called compute-0-0 and have the IP address 10.11.1.254, the second will be compute-0-1 and have 10.11.1.253, etc. This should correspond to the labels on the computers. In case there is a broken node which doesn't boot, exit the insert-ethers program before powering up the next one. Relaunch insert-ethers with the --rank parameter, indicating which number the next node should get. For example, if the sixth node (labeled compute-0-5) doesn't boot, type insert-ethers --rank=6 and then power up the seventh node.

Later you can reinstall nodes using the shoot-node command. See the Rocks manual for details.
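
For example, to reinstall the first compute node (check the exact syntax from the Rocks manual for your version):

    # Reinstall a single compute node
    shoot-node compute-0-0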

Post-configuration

Move the /tmp directory to /var/tmp so that there is no risk of users' temporary files filling up the root partition. This can be done by creating the directory /var/tmp, copying the contents of /tmp there, then moving /tmp to /tmp-old, creating a symbolic link /tmp (ln -s /var/tmp /tmp) and removing /tmp-old. (If the system isn't happy with /tmp disappearing for a while, you can also boot from a rescue cd and do the operation from there.)
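
As a sketch, the operation described above could be done with the following commands:

    # Create the target directory and copy the current contents of /tmp there
    mkdir -p /var/tmp
    chmod 1777 /var/tmp
    cp -a /tmp/. /var/tmp/

    # Replace /tmp with a symbolic link pointing to /var/tmp
    mv /tmp /tmp-old
    ln -s /var/tmp /tmp

    # Remove the old directory once everything works
    rm -rf /tmp-old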

Check that the cluster is up and running properly by running links http://localhost/ and selecting cluster status from the list. It should show the total number of CPUs, including the frontend CPUs.

Configure the keyboard by running redhat-config-keyboard and selecting Finnish latin1 (if you don't happen to use a US keyboard).

Configure X by running redhat-config-xfree86, which should auto-detect the display card (and hopefully your mouse too). This step can be omitted, but if you plan to work locally on the front-end, having X available makes many things nicer. For example, you can follow the progress of node reinstallations in xterms.

Configure the firewall settings. A default configuration is provided by CSC in the file /etc/rc.d/rc.firewall. This file should have different contents on the admin server and the frontend, the admin server version being more strict. Add your own modifications primarily to /etc/rc.d/rc.firewall.local; you will probably want to add at least a few site-specific rules there.
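
As a purely illustrative example (the address range is hypothetical, not from this guide), a rule in /etc/rc.d/rc.firewall.local that allows incoming ssh from an administrative network could look like this:

    # Example only: allow incoming ssh from a hypothetical admin network 192.0.2.0/24
    iptables -A INPUT -s 192.0.2.0/24 -p tcp --dport 22 -j ACCEPT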

Configure the mail server; see the separate guide.

Add local user accounts.

Contact CSC (if you haven't already done so). Proceed to acceptance tests.

Update 2005-08-28: This guide is already a bit outdated; a freshly installed cluster requires a number of updates which are not part of the installation package.


Changelog

2005-08-27 Version 1.13. Added a link to the mail server configuration guide, and a mention that a reinstalled server needs updates after the installation procedure.
2004-09-29 Version 1.12. Removed mention of the (nonexistent) frontend roll.
2004-09-28 Version 1.11. Added a note about configuring the keyboard and added the rocks-dist command to the partitioning schema instructions. (AJT)
2004-09-28 Version 1.1. Added partitioning schema files. (AJT)
2004-09-27 Version 1.01. Added a link to the disk partitioning guide, a mention of software RAID on the admin server, and instructions on how to move /tmp to /var/tmp. (AJT)
2004-09-23 Version 1.0 published.