Grid Engine HOWTOs

Table Of Contents

General Grid Engine concepts
Resource management
Cluster management
Special Applications
Tight Integration of Parallel Libraries
Accounting and Reporting Database (ARCo)
DRMAA
Installation, Upgrade, Patches for SGE 5.3 and 6.0

Content

General Grid Engine concepts

[NEW] Introduction to Grid Engine video
Basic Usage
Common Administrative Tasks
Customization of Qmon
Migration of Qmaster to Another Machine
Setting Up a Shadow Master
Commonly Seen Problems
Troubleshooting

Resource management

Managing Resources Abstractly
Consumable Resources
Setting Up Load Sensors to Track Resource Availability/Utilisation
Different resource management approaches with Grid Engine
Tracking interactive idle time of desktop workstations
Relocating Jobs From a User's Workstation
Grid Engine Enterprise Edition
Sun Grid Engine, Enterprise Edition -- Configuration Use Cases and Guidelines
[NEW] Scheduler Policies for Job Prioritization in the N1 Grid Engine 6 System
File Staging
[NEW 6.1] Logical resource expressions
[NEW 6.1] Resource quotas

Cluster management

Tuning guide
[NEW 6.0+6.1] Master monitoring and bottleneck analysis
Command Line and Scripting of Administrative Tasks
Submitting Binaries
Configure qrsh and qlogin to use ssh as transport protocol
Rotating and truncating Log Files
Reducing and Eliminating NFS Usage
Installing on a system with multiple network interfaces
Installing on a system with Solaris IP Multipathing
Deploying PCs with Grid Engine enabled KNOPPIX boot images
Using Host Groups and Cluster Queues
[NEW] What are Solaris 10 containers good for? A hands-on sample.
[NEW 6.0+6.1] Running jobs on data kept (on a USB connected HD) in a separate network via sshfs

Special Applications

[NEW] SGE Transfer Queue to Globus and GridWay
FlexLM Integration, also as slides
[UPDATED] Olesen-FLEXlm-Integration v1.26, also wiki documentation of the Olesen method
Using Clearcase
Using Mentor ModelSim and Mentor JobSpy
[NEW] Mathematica
[NEW] Ansys
Using mpiBLAST
MultiClustering using Transfer Queues
Integration of SGE and Solaris 9 Resource Manager
SGE-Globus integration
Checkpointing jobs using SGE's checkpointing support
Checkpointing under Linux with Berkeley Lab Checkpoint/Restart
Integration of Meiosys MetaCluster HPC with N1[TM] Grid Engine 6
JAM - Job & Application Manager
JGrid - an RMI-based Java interface for Grid Engine

Tight Integration of Parallel Libraries

[IN PREPARATION] Tight Integration of GlobalArrays (TCGMSG) and SGE
Tight Integration of LAM/MPI and SGE
Tight Integration of MPICH and SGE -- With Application Notes
[NEW] Tight Integration of MPICH2 and SGE
Tight Integration of PVM and SGE
Mvapich (MPICH Infiniband) + Loose/Tight SGE Integration
Sun HPC Cluster Tools parallel jobs (MPI, MPI2, OpenMP)
Tight integration of Open MPI with SGE

DRMAA

DRMAA C Binding
File Staging in Grid Engine 6.0 with DRMAA
DRMAA Java[TM] Language Binding
[NEW] DRMAA Python Tutorial

Accounting and Reporting Database (ARCo)

ARCo and Oracle 10g Database
ARCo on MySQL Database
[NEW] Space Requirements for the ARCo database

Installation, Upgrade, Patches

[NEW] Install SGE 6.1 patches
[NEW] Bugfixes for SGE 6.1
Install SGE 6.0 patches
[UPDATED] Bugfixes for SGE 6.0
Install SGE 5.3 patches
Bugfixes for SGE 5.3