Using the Elektra/Minotaur cluster

Setup

The Elektra cluster consists of 16 dual Intel Xeon and 12 dual AMD Athlon Linux boxes on a private network that is separated from the outside world by a switch. The master node (elektra) acts as a firewall: it is the only node that can be accessed from outside. In principle, there is no need to log through to the slave nodes, although you can (machine names: eXX and mXX, with XX the node number).

The private network has the following advantages: (1) added security; (2) no additional IP addresses are needed for the slaves; (3) the fast ethernet connection is completely available to the cluster, as all external traffic is masked by the switch.

A disadvantage of the private network is that partitions residing on other machines (like the /home partition on ariadne) can only be mounted on the master node, and the NFS daemon currently in use (knfsd) cannot re-export them to the private network. In practice, this is not a big problem, as (for efficiency reasons) we want to minimize the use of such partitions anyway. The cluster machines have their own separate /home, which prevents you from writing output to /home on ariadne (which could easily become a bottleneck). In addition, the separate /home partition makes the cluster independent of the external (building-wide) network. Note, however, that you can still log in on elektra with your standard password, as the master node is a NIS slave server to ariadne.

Usage

In practice, you use the cluster as follows:

Prerequisites

  1. Check whether you can log through to all slave nodes via 'ssh eXX', where XX is the node number, without giving your password. This is essential for the delivery of your output files. This link provides instructions on how to arrange matters. You should also be able to log through (from the slave node) back to the master node via 'ssh elektra.mse.uiuc.edu' (use the full name, as that is what PBS uses).
  2. Use the PBS '-M' option to specify a valid e-mail address where PBS can send any error messages. If you receive such e-mail messages, it is quite likely that you skipped step 1.
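
The check in step 1 can be scripted. The following is a minimal Python sketch, not an official cluster tool. All function names are hypothetical, and the two-digit, zero-padded node numbering (e01, e02, ...) is an assumption based on the eXX naming above. It relies on the OpenSSH option BatchMode=yes, which makes ssh fail rather than prompt for a password.

```python
import subprocess

MASTER = "elektra.mse.uiuc.edu"  # full name, as that is what PBS uses


def node_names(prefix, count):
    """Return names such as e01, e02, ... (two-digit padding is an assumption)."""
    return [f"{prefix}{number:02d}" for number in range(1, count + 1)]


def ssh_check_command(host, remote_command=("true",)):
    """Build an ssh command that fails instead of asking for a password."""
    options = ["-o", "BatchMode=yes", "-o", "ConnectTimeout=5"]
    return ["ssh", *options, host, *remote_command]


def round_trip_command(host, master=MASTER):
    """Build a command that logs in to host and, from there, back to the master."""
    back = ("ssh", "-o", "BatchMode=yes", master, "true")
    return ssh_check_command(host, remote_command=back)


def passwordless_round_trip(host):
    """Return True if both hops succeed without a password prompt."""
    try:
        result = subprocess.run(
            round_trip_command(host),
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def failing_nodes(hosts, check=passwordless_round_trip):
    """Return the hosts for which the check does not succeed."""
    return [host for host in hosts if not check(host)]
```

Run on the master node, a call such as failing_nodes(node_names("e", 16)) would list the nodes that still need attention. The node count per prefix is an assumption here, so adjust it to the machines that actually exist.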

Heavy I/O

If your program needs large amounts of temporary disk space, or produces data at a high rate, the private network could still throttle your program. In this case, it is best to perform all I/O on the /scratch-local partition of the slave node, and copy your data to /home at the end of the run. Contact Erik to learn more about this.
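
As an illustration of this pattern, here is a minimal Python sketch that does all work in a private directory under /scratch-local and copies the results to /home only once, at the end. The function name stage_through_scratch and the "job-" directory prefix are hypothetical. The sketch also assumes that you may create directories directly under /scratch-local, which you should verify for your account.

```python
import os
import shutil
import tempfile


def stage_through_scratch(work, results_dir, scratch_root="/scratch-local"):
    """Run work(scratch_dir) on local disk, then copy its output to results_dir.

    work is any callable that writes its files into the directory it is given.
    """
    # Private working directory on the node's local disk.
    scratch_dir = tempfile.mkdtemp(prefix="job-", dir=scratch_root)
    try:
        work(scratch_dir)  # all heavy I/O happens here, off the network
        os.makedirs(results_dir, exist_ok=True)
        # Single copy over the private network at the end of the run.
        for name in os.listdir(scratch_dir):
            source = os.path.join(scratch_dir, name)
            target = os.path.join(results_dir, name)
            if os.path.isdir(source):
                shutil.copytree(source, target, dirs_exist_ok=True)  # Python 3.8+
            else:
                shutil.copy2(source, target)
    finally:
        # Clean up the local disk, even if work() raised an exception.
        shutil.rmtree(scratch_dir, ignore_errors=True)
    return results_dir
```

Note that in this sketch the scratch directory is removed even when the run fails, so partial output is lost. If you need partial results for debugging, copy them before the cleanup.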

Tips and tools

This page was created on April 14, 2002 and last updated on May 20, 2004.