NorduGrid ARC Installation Instructions for M-grid

Author: Arto Teräs
Status: Final, version 1.4
Date: 2005-10-11

NOTE: At the moment these instructions are out-of-date and should be used as a historical reference only

Introduction

This guide describes how to install the NorduGrid ARC grid middleware on the M-grid clusters. The NorduGrid website has a good collection of documentation; the User Guide, Client Installation Instructions and Server Installation Instructions are especially relevant here. However, the procedure in M-grid differs somewhat from a standard ARC installation, because the current stable ARC release does not yet support the 64-bit environment (installing in 32-bit mode would be possible) or the N1 Grid Engine batch queue system. Therefore a development release is used in M-grid.

Mainly because the most recent version of the Globus Toolkit is not yet available in rpm format, package dependencies are partly ignored and installation requires using command line options such as --nodeps and --force. Consider this a work in progress which will be cleaned up later. ;-)

All installations should be done as root, because there are still components in NorduGrid ARC which cannot be run under normal user accounts. Compiling the software can of course be done under a normal user account.

Installing Globus

Download the Globus Toolkit 4.0.1 source package. We have a mirror available at http://rocks.csc.fi/install/downloads/gt4.0.1-all-source-installer.tar.bz2.

Building parts of the Globus toolkit requires Java and some SQL libraries. Install the Java development kit and the other necessary packages:

rpm -Uvh /home/install/updates/mgrid/1.0/x86_64/RPMS/nordugrid/libiodbc-*
rpm -Uvh /home/install/updates/mgrid/1.0/x86_64/RPMS/nordugrid/MyODBC-*
rpm -Uvh /home/install/updates/mgrid/1.0/x86_64/RPMS/jdk-1_5_0_04-linux-amd64.rpm
export JAVA_HOME=/usr/java/jdk1.5.0_04
export PATH="/usr/java/jdk1.5.0_04/bin:$PATH"

If you have already installed another Java development kit you can use it instead; ARC depends only on the pre-web-services components of Globus, which are not Java based. The Replica Location Service (RLS) uses Java; it is not yet used in the current M-grid configuration but is compiled for future use.

tar jxvf gt4.0.1-all-source-installer.tar.bz2
cd gt4.0.1-all-source-installer
./configure --prefix=/opt/globus
make prews prewsmds rls install

The configuration script gives an ant-related warning, but it can be ignored (those components are not used). Alternatively, you can install ant to get rid of the warning.

Installing NorduGrid ARC packages

NorduGrid ARC rpm packages (version 0.5.30) have been recompiled against Globus Toolkit 4.0.1 by CSC. These, and a few other rpms needed as dependencies, are available on the M-grid clusters in the directory /home/install/updates/mgrid/1.0/x86_64/RPMS/nordugrid/. Please install all the packages in that directory:

rpm -Uvh /home/install/updates/mgrid/1.0/x86_64/RPMS/nordugrid/gsoap*
rpm -Uvh /home/install/updates/mgrid/1.0/x86_64/RPMS/nordugrid/perl-*
rpm -Uvh /home/install/updates/mgrid/1.0/x86_64/RPMS/nordugrid/globus-config-*
rpm -Uvh /home/install/updates/mgrid/1.0/x86_64/RPMS/nordugrid/ca_NorduGrid-*
rpm -Uvh --nodeps /home/install/updates/mgrid/1.0/x86_64/RPMS/nordugrid/nordugrid-*
rpm -Uvh /home/install/updates/mgrid/1.0/x86_64/RPMS/nordugrid/gsincftp*

The --nodeps option is needed because ARC depends on Globus libraries which are available but not part of any rpm package, and are therefore not found by the dependency check.

There is also a patch which fixes a few important bugs in ARC's N1 Grid Engine support and contains Juha Lento's partial rewrite of the ARC information system components (work in progress). These will become part of the main ARC release tree later. Install the patch as follows:

cd /opt/nordugrid/
wget http://rocks.csc.fi/install/downloads/nordugrid-server-0.5.30_mgrid_patch_2005-08-09.tar.gz
tar zxvf nordugrid-server-0.5.30_mgrid_patch_2005-08-09.tar.gz

After the installation, please log out and in again, or source the files /etc/profile.d/globus.sh and /etc/profile.d/nordugrid.sh (or the .csh versions if you're using a csh based shell), to set up environment variables and paths correctly.
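
For example, in a bash shell:

source /etc/profile.d/globus.sh
source /etc/profile.d/nordugrid.sh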

Requesting certificates

In the grid, both servers and users are identified and authenticated using X.509 certificates. The certificates need to be signed by a certification authority, which in the case of M-grid is the NorduGrid CA. CSC acts as a registration authority, forwarding requests from Finnish grid users to the NorduGrid CA and signing them electronically so that the NorduGrid CA can trust that the requests are valid and authentic. Obtaining the certificates takes some time, so it's best to do it as early as possible during the installation. You may want to read the Grid Certificate Mini How-to on the NorduGrid pages.

Generate a host certificate request by typing the command

grid-cert-request -host <your.host.fqdn>
(for example grid-cert-request -host kivi.csc.fi)
This creates the following files:
/etc/grid-security/hostkey.pem
/etc/grid-security/hostcert_request.pem
/etc/grid-security/hostcert.pem

The file hostcert_request.pem is the certificate request. Please do not email it directly to ca@nbi.dk as instructed on screen, but instead to grid-support@csc.fi. Sign the email with your gpg key, which was exchanged in the M-grid administrators' meeting.

The file hostcert.pem is empty and should be replaced with the actual certificate later. The certificate will be sent to you by email.
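
When the signed certificate arrives, it can be installed for example as follows (the name of the received file is hypothetical; the permissions follow common Globus practice):

cp signed-hostcert.pem /etc/grid-security/hostcert.pem
chmod 644 /etc/grid-security/hostcert.pem
chmod 400 /etc/grid-security/hostkey.pem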

Also generate an LDAP certificate request by typing the command

grid-cert-request -service ldap -host <your.host.fqdn>
(for example grid-cert-request -service ldap -host kivi.csc.fi)
This creates a similar set of files in the directory /etc/grid-security/ldap/. Again, the request should be sent to CSC in a signed email.

If you don't already have a personal grid certificate, also log in using your own user account and generate a personal certificate request by typing the command

grid-cert-request -int -ca

This asks a few questions and generates a key and a request in the ~/.globus/ subdirectory. Remember to choose a good passphrase to protect the secret key, and use your home institution email address. The OU value should be your domain, e.g. hut.fi or utu.fi. Again, send the request in a signed email to grid-support@csc.fi.

NOTE: All users need to request a personal certificate before they can start using the grid. The procedure is the same as above, except that users should give their requests to the site admins, who should send them to CSC signing the email with their gpg key. This is a temporary procedure and will be replaced by a service available to all CSC customers later in the fall.

Creating grid users, configuring the firewall and preparing the directory structure

In M-grid, grid users are mapped to local unix accounts so that there's one local account representing each M-grid site. These accounts are named mgucsc, mguhip, mguhel, mguhut, mgujyu, mgulut, mguutu, mgutut, mguoul and mgucsca, and should exist on all sites. The last one, mgucsca, is used for CSC customers, while mgucsc is for CSC personnel. In addition there is the user gridadm, which will be used for administrative purposes in the grid.

To facilitate user account creation, we have prepared a script which is included in the mgrid-arc-conf rpm package. The same package also includes partly preconfigured NorduGrid ARC configuration files. Please install the package (--force is needed because it overwrites default NorduGrid ARC configuration files):

rpm -Uvh --force /home/install/updates/mgrid/1.0/x86_64/RPMS/nordugrid/mgrid-arc-conf-LATEST.noarch.rpm

(Replace LATEST with the latest version number found in the directory.)

Then run the script /opt/mgrid/bin/create-mgrid-users.sh to create the user accounts:

/opt/mgrid/bin/create-mgrid-users.sh

There's another rpm package which updates the M-grid frontend firewall configuration so that grid ports are opened to the M-grid networks. Please install the firewall package:

rpm -Uvh /home/install/updates/mgrid/1.0/x86_64/RPMS/mgrid-firewall-frontend-LATEST.noarch.rpm

(Replace LATEST with the latest version number found in the directory.)

The firewall rpm will overwrite /etc/rc.d/rc.firewall (saving the old file to rc.firewall.rpmsave) but not the local settings file /etc/rc.d/rc.firewall.local. If you have made changes to the main firewall script you'll need to reapply them manually, but local changes should remain intact.
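
To see which of your changes need reapplying, you can compare the saved copy against the new script:

diff /etc/rc.d/rc.firewall.rpmsave /etc/rc.d/rc.firewall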

If you prefer to open the grid ports to the whole world, please uncomment the relevant lines in /etc/rc.d/rc.firewall after the comment "Uncomment the following to open grid ports for the whole world". At the M-grid administrators' meeting on March 11, 2005 it was actually decided that ports should be opened worldwide by default, but to respect stricter site policies the package default opens them only to the M-grid networks.

A number of directories are also needed for the ARC middleware to work. Please create the following directories (example commands are shown after the list):

/export/grid/cache
/export/grid/runtime
/export/grid/session
/var/log/grid
/var/spool/nordugrid/jobstatus
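
For example:

mkdir -p /export/grid/cache /export/grid/runtime /export/grid/session
mkdir -p /var/log/grid /var/spool/nordugrid/jobstatus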

The directories under /export/grid need to be visible also on the computing nodes. Please add the following line to /etc/exports:

/export/grid 10.12.1.0/255.255.255.0(rw)

Make sure that the directory is exported by running

/usr/sbin/exportfs -a
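
You can then check which directories are currently exported with:

/usr/sbin/showmount -e localhost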

Then mount the directories on the nodes and add the relevant line to /etc/fstab:

cluster-fork mkdir -p /export/grid
cluster-fork 'echo "10.12.1.1:/export/grid /export/grid nfs defaults 0 0" >>/etc/fstab'
cluster-fork mount /export/grid

To make this permanent so that it also works after node reinstallation, add the following to the file /home/install/site-profiles/3.2.0/nodes/replace-csc.xml within the <post></post> section:

<!-- Add the grid shared directory mount -->
mkdir -p /export/grid
<file name="/etc/fstab" mode="append">
10.12.1.1:/export/grid  /export/grid            nfs     defaults        0 0
</file>
(If the file replace-csc.xml doesn't exist, take the base from the file csc.xml in the same directory or from /home/install/rocks-dist/enterprise/3/en/os/AMD64/build/nodes/csc.xml.)

After that, change to the directory /home/install and run

rocks-dist dist

Configuring N1 Grid Engine

The queue configuration works as follows. The local queues are left intact. There will be one additional grid queue called mgrid, which has a maximum execution time of 24 hours, as agreed in the researcher meeting on 2004-11-19.

Users are divided into two categories, local users and grid users, implemented as Grid Engine access lists. Grid users are placed in an additional access list called mgrid_users, which only has access to the grid queue. Local users are automatically mapped to the ACL defaultdepartment, which has access to the local queues but not the grid queue.

Both the local queues and the grid queue are mapped to the same nodes, with the same number of slots. However, each node will also be limited to two slots in total (using consumables in the Grid Engine configuration), so that a node may be executing two local jobs, two grid jobs or one of each, but no more than two jobs in total at any time.

The priorities are handled using Grid Engine tickets and the functional policy scheme. Local users (defaultdepartment) will get an 80% share and mgrid_users a 20% share. This works so that when there are empty nodes either group can fill any remaining space, but queued jobs will be ordered fairly using these 80/20 shares. The reordering of jobs in the queue works dynamically. We can fine-tune this later if it seems necessary.

N1 Grid Engine version 6.0u4 or newer is required. This configuration can be achieved as follows:
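
The original step-by-step commands are not reproduced here; the following is a minimal sketch of one way to build the setup described above with qconf (the share values, the local queue placeholder and the host loop are assumptions to be adapted per site):

# Create the mgrid_users access list and put the grid accounts in it
qconf -au mgucsc,mgucsca,mguhip,mguhel,mguhut,mgujyu,mgulut,mguutu,mgutut,mguoul mgrid_users

# Mark mgrid_users as a department with a 20% functional share and give
# defaultdepartment an 80% share (both commands open an editor: set the
# type of mgrid_users to "ACL DEPT", and fshare to 20 and 80 respectively)
qconf -mu mgrid_users
qconf -mu defaultdepartment

# Enable the functional policy in the scheduler configuration
# (set weight_tickets_functional to a nonzero value, e.g. 10000)
qconf -msconf

# Create the grid queue (opens an editor), then set the 24 hour limit
# and restrict access to grid users only
qconf -aq mgrid
qconf -mattr queue h_rt 24:00:00 mgrid
qconf -mattr queue user_lists mgrid_users mgrid

# Keep the local queues closed to grid users
qconf -mattr queue user_lists defaultdepartment <your_local_queue>

# Limit every execution host to two slots in total across all queues
# (slots acts as a consumable at the execution host level)
for h in $(qconf -sel); do
    qconf -mattr exechost complex_values slots=2 $h
done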

NOTE: Sites with modified local queue configurations (especially inhomogeneous clusters) may need to adjust the instructions above to take their setup into account.

The grid queues will be for serial jobs only in the first phase; parallel runtime environments will be added later.

Configuring NorduGrid ARC

The main configuration file of NorduGrid ARC is /etc/arc.conf. The rpm package mgrid-arc-conf installed earlier contains a partly preconfigured configuration file. The entries that need to be modified on each cluster are marked with tags like %MGRID_ENTRY%; replace each of these with your site's values.
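
For illustration only, a filled-in entry could look roughly like the lines below; the key names follow common ARC 0.5 arc.conf conventions, but both the keys shown and the values are examples, so follow the tags in the installed file rather than this sketch:

[cluster]
cluster_alias="Example M-grid cluster"
comment="AMD64 cluster at Example University"
nodememory="2048"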

Authorized grid users are listed in the file /etc/grid-security/grid-mapfile. Normally this file is not managed by hand but with the utility /opt/nordugrid/sbin/nordugridmap. This utility reads the configuration file /etc/grid-security/nordugridmap.conf, which defines authorized groups (known as virtual organizations, VOs, in the grid), and adds the local entries defined in /etc/grid-security/local-grid-mapfile. The mgrid-arc-conf package installs a preconfigured /etc/grid-security/nordugridmap.conf which contains entries for the M-grid VOs. At least in the beginning we plan to manage the VO files at CSC; later, the management of the VOs containing each M-grid site's grid users will be moved to the admins of the respective sites.

To initialize the local-grid-mapfile (it can be left empty) and update the authorized user list, please run the following:

touch /etc/grid-security/local-grid-mapfile
/opt/nordugrid/sbin/nordugridmap
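
Should you need local additions later, local-grid-mapfile uses the standard grid-mapfile format: one quoted certificate subject per line, followed by the local account it maps to. The subject below is purely hypothetical:

"/O=Grid/O=NorduGrid/OU=example.fi/CN=Example User" mgucsca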

Starting the daemons

Before starting the daemons, wait until you have received the host and LDAP certificates for the cluster frontend and copied them to /etc/grid-security/hostcert.pem and /etc/grid-security/ldap/ldapcert.pem. After that you're ready to go:

/etc/init.d/gridftpd start
/etc/init.d/grid-manager start
/etc/init.d/grid-infosys start

Also add the daemons to the default startup configuration so that they are started when the system is rebooted:

/sbin/chkconfig gridftpd on
/sbin/chkconfig grid-manager on
/sbin/chkconfig grid-infosys on

Testing the installation

Here's a simple script and NorduGrid job file which can be used to test the installation:
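
The example files themselves are not reproduced here; a minimal sketch of what they could look like follows (the script name hellogrid.sh is an assumption, hellogrid.xrsl is the job file referenced below). First hellogrid.sh:

#!/bin/sh
# Print a greeting and the name of the execution node
echo "Hello Grid from $(hostname)"

And hellogrid.xrsl (ARC uploads the executable automatically when it is given as a relative path):

&(executable="hellogrid.sh")
 (jobName="hellogrid")
 (stdout="hellogrid.out")
 (stderr="hellogrid.err")
 (cpuTime="10 minutes")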

Save these files to a directory, initialize your grid proxy (you need your personal certificate to do this) and submit the job:

grid-proxy-init
ngsub -d 1 -f hellogrid.xrsl
# To select a specific host, use the -c option, such as
ngsub -c kivi.csc.fi -d 1 -f hellogrid.xrsl

You can monitor the job status using the command ngstat -a and download the result files using ngget <job_id>. You can also check on the NorduGrid web site whether the cluster shows up in the Grid Monitor, and browse its information there. For more information, see the NorduGrid User Guide and the documentation section and tutorials on the website.

Changelog

2005-10-11 Version 1.4. Added user mgucsca which is used to run CSC customers' jobs in M-grid. Added the gsincftp package. (AJT)
2005-08-29 Version 1.3. Added a note about adding the daemons to chkconfig configuration so that they are started after reboot. (AJT)
2005-08-18 Version 1.2. Replaced some references to specific package versions with the LATEST keyword. (AJT)
2005-08-11 Version 1.1. Added instructions on how to update the access list with nordugridmap, added a mention of creating the /var/spool/nordugrid/jobstatus directory. (AJT)
2005-08-10 Version 1.02. Upgraded mgrid-arc-conf package to release 0.1-5, added instructions for running the exportfs and rocks-dist dist commands, modified instructions on how to report node memory. (AJT)
2005-08-10 Version 1.01. Changed -rattr to -mattr in Grid Engine instructions and updated the note respectively. (AJT)
2005-08-09 Version 1.0. Added instructions how to make the nfs mount stay after node reinstallation, other minor fixes. (AJT)
2005-08-09 Version 0.92. Added one file to the NorduGrid ARC infosystem fix patch. (AJT)
2005-08-08 Version 0.91. Upgraded Globus to version 4.0.1. (AJT)
2005-07-28 Version 0.9. Initial public version. (AJT)