Using Solaris™ 9 Resource Manager software with Grid Engine software

What is Solaris 9 Resource Manager software?

Solaris 9 Resource Manager software is a new set of features designed to enhance resource control and accounting on the Solaris™ Operating Environment (Solaris OE). Detailed information about it can be found in the Solaris™ 9 Operating Environment documentation set.
Here are some related links:

FAQ: http://wwws.sun.com/software/solaris/faqs/resource_manager.html
Overview: http://wwws.sun.com/software/solaris/ds/ds-srm/
Solaris 9 Resource Manager manual: http://docs.sun.com/db/doc/806-4076/6jd6amqor?q=Solaris+9+resource+Manager&a=view
Sun BluePrints: http://www.sun.com/solutions/blueprints/0902/816-7753-10.pdf

Two papers on this subject were presented at Sun's SuperG 2002 conference.
With Grid Engine software, Solaris 9 Resource Manager software can be used to generate detailed accounting information, to better enforce limits on a job, and to prevent a job from using more CPUs than requested.
Before you start, set up the projects database and the extended accounting facility. Please refer to the Solaris OE documentation for detailed setup instructions.

The key is to associate each job with a Solaris 9 Resource Manager taskid. To do this, run newtask at job startup, either in the starter method or in the prolog. The starter method is simpler, since it execs the end user's script. An example of a starter method is:

exec /usr/bin/newtask $SGE_STARTER_SHELL_PATH $*

A sample starter method is provided in Appendix E.
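The starter-method idea above can be sketched as a small function that only builds the command line, which makes it easy to try on a host that has no newtask. The function name starter_cmd and the NEWTASK override variable are illustrative assumptions; they are not part of Grid Engine or of the Solaris OE.

```shell
#!/bin/sh
# Sketch only: starter_cmd and NEWTASK are hypothetical names for illustration.
NEWTASK=${NEWTASK:-/usr/bin/newtask}

# Print the command line a starter method would exec for the given
# shell path and job arguments.
starter_cmd() {
    sm_shell=$1
    shift
    if [ -x "$NEWTASK" ]; then
        # newtask is present: run the job shell inside a new task
        echo "$NEWTASK $sm_shell $*"
    else
        # newtask is missing: run the job shell directly
        echo "$sm_shell $*"
    fi
}
```

In a real starter method the printed words would be passed to exec instead of echo, with $SGE_STARTER_SHELL_PATH as the shell path.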
The advantage of using the queue/host prolog is that it can be run as root, and can thus perform privileged operations without the need for setuid helper programs. On the other hand, the prolog process does not exec the user script. To make the changes "stick", you have to change the taskid of the shepherd process instead of the taskid of the prolog's own process:
newtask -c `ps -o ppid= -p $$`

This is the correct form: it changes the shepherd's taskid. Once the shepherd has the new taskid, the starter method, user script, and epilog script will inherit the proper taskid from the shepherd.
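To see which process the prolog command above targets, the parent lookup can be isolated in a helper. This is a minimal sketch: parent_pid is a hypothetical helper name, and it assumes a ps that accepts the -o ppid= -p options used above.

```shell
#!/bin/sh
# Sketch only: parent_pid is a hypothetical helper name.

# Print the parent process id of the given pid, with padding removed.
parent_pid() {
    ps -o ppid= -p "$1" | tr -d ' '
}

# Inside a prolog, $$ is the prolog's own pid, so its parent is the shepherd.
SHEPHERD_PID=`parent_pid $$`
```

On Solaris 9, passing this pid to newtask -c moves the shepherd into a new task, and running ps -o taskid= -p on the same pid before and after shows whether the change took effect.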
Note: The Sun™ ONE Grid Engine, Enterprise Edition software has its own projects, but in this document only the Solaris 9 Resource Manager concept of projects is used.

newtask can create a new taskid and also bill the job to a project. To do this, use the -p <<project name>> flag on the newtask command line inside the starter method or prolog script. A project can have a resource pool associated with it; therefore, billing the job to a project will also bind it to the associated resource pool.
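A minimal sketch of choosing the project for the -p flag: use an explicit project if one is given, otherwise fall back to a default project such as the one reported by /bin/projects -d ${USER}. The helper name newtask_project_args and its two arguments are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: newtask_project_args is a hypothetical helper name.

# Print the newtask option for billing a job to a project.
#   $1 = explicit project name (may be empty)
#   $2 = default project name (for example, the output of /bin/projects -d $USER)
newtask_project_args() {
    if [ -n "$1" ]; then
        echo "-p $1"
    elif [ -n "$2" ]; then
        echo "-p $2"
    else
        # no project known: print nothing, so newtask gets no -p option
        echo ""
    fi
}
```

The result can be placed unquoted on the newtask command line, so that an empty result adds no arguments.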
There are two ways of using resource pools with jobs:

(1) Associate a resource pool with a project and use this project in the newtask call.
(2) Use the poolbind command to bind the job to a resource pool explicitly.

The poolbind command requires root privileges, so it should be used in the queue's prolog or be called from the starter method through a setuid wrapper. The prolog example in Appendix B uses the poolbind method.

Resource pools can be used to limit the number of CPUs a job can use; therefore, a grid administrator can prevent a multithreaded program from using more CPUs than requested. Note that if a parallel job is bound to a resource pool with fewer CPUs than given by the PARALLEL environment variable, the parallel job will slow down severely.
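The slowdown described above can be caught early by comparing the number of CPUs a job asks for with the size of its pool. The sketch below is plain shell arithmetic; the function name check_pool_size and its two arguments are assumptions for illustration, and the caller is expected to supply both numbers.

```shell
#!/bin/sh
# Sketch only: check_pool_size is a hypothetical helper name.

# Compare the CPUs a job requests with the CPUs in its resource pool.
#   $1 = CPUs requested by the job (for example, the value of PARALLEL)
#   $2 = CPUs in the pool the job will be bound to
# Returns 1 and prints a warning when the pool is too small.
check_pool_size() {
    requested=$1
    pool_cpus=$2
    if [ "$requested" -gt "$pool_cpus" ]; then
        echo "warning: job requests $requested CPUs but pool has only $pool_cpus"
        return 1
    fi
    echo "ok: $requested CPUs requested, $pool_cpus available in pool"
    return 0
}
```

A prolog could call it as check_pool_size "$PARALLEL" 2 before binding a job to a two-CPU pool, and choose a larger pool or log the warning when it fails.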
At this point, resource controls do not offer much new functionality to the Grid Engine software. However, you can use resource controls to enforce CPU time limits and Light Weight Process (LWP) limits on a job, by using the prctl command in the prolog or starter method to set the job's resource controls. The prolog in Appendix B illustrates the usage of prctl.
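As a sketch of how a prolog might assemble the prctl call, the function below prints the command for an LWP limit on a task, using the same options as the prolog in Appendix B. The function name lwp_limit_cmd is hypothetical, and the command is printed rather than executed so that it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: lwp_limit_cmd is a hypothetical helper name.
PRCTL=/usr/bin/prctl

# Print a prctl command that caps the number of LWPs in a task and
# sends signal 9 to the offender when the cap is reached.
#   $1 = taskid of the job
#   $2 = maximum number of LWPs
lwp_limit_cmd() {
    taskid=$1
    max_lwps=$2
    echo "$PRCTL -n task.max-lwps -v $max_lwps -e signal=9 -i task $taskid"
}
```

Because prctl changes resource controls on another task, the printed command would normally be run from a prolog configured to run as root.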
There are no reporting tools for exacct at this time. Only the C API and a demo program that prints all the exacct records from a file are available. The demo program, written in C, is in the SUNWosdem package and is usually installed in /usr/demo/libexacct. You will need to compile it before you use it. I wrote a simple tool that scans the exacct file that keeps track of processes and prints selected parts of the records of processes that used more than 0.5 seconds of CPU time. You can download this tool here.
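The filtering step of such a tool can be sketched with awk, assuming the exacct records have already been dumped to text, one process per line, as pid, command, and CPU seconds. That three-column layout is a hypothetical format chosen for illustration; it is not the output of the demo program or of the tool mentioned above.

```shell
#!/bin/sh
# Sketch only: filter_cpu and the input layout are hypothetical.
# Assumption: on Solaris, set AWK=/usr/bin/nawk, because the -v option
# needs a "new" awk.
AWK=${AWK:-awk}

# Print only the records that used more than the given CPU time.
#   $1    = threshold in seconds
#   stdin = lines of the form "pid command cpu_seconds"
filter_cpu() {
    $AWK -v min="$1" '$3 + 0 > min + 0 { print $1, $2, $3 }'
}
```

For example, feeding the dump through filter_cpu 0.5 keeps only the processes above the 0.5 second threshold used by the tool described above.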
The setup presented here only works for BATCH queues. Interactive loads have to be controlled by projects, because in.rlogind starts a new task for the login session, ignoring the task created by the starter method or prolog. Parallel Environments are not covered in this HOWTO.
Appendix: A

The following is a minimal prolog script to assign a taskid to a job. You can download it here.

=== Minimal prolog ===
#!/bin/sh
# (c) 2002 Sun Microsystems, Inc. Use is subject to license terms.
# Copyright © 2002 Sun Microsystems, Inc. All rights reserved.
#****** s9rm_prolog_minimal.sh *********************************************
#
#  NAME
#     s9rm_prolog_minimal.sh -- prolog to associate a job with a Solaris 9
#                               Resource Manager task.
#
#  SYNOPSIS
#     s9rm_prolog_minimal.sh
#
#  FUNCTION
#     This script can be used as prolog in sge_queue(5), to create
#     a new taskid for the job.
#
#  NOTES
#     The /usr/bin/newtask command is not available before Solaris 8.
#
#***************************************************************************
if [ ! -x /usr/bin/newtask ]
then
    echo "Warning: /usr/bin/newtask is not available, skipping Solaris 9 Resource Manager setup."
    exit 0
fi
####
#### The line below creates a new task for this job.
####
/usr/bin/newtask -c `/bin/ps -o ppid= -p $$`
####
exit 0
####
###
### End of the prolog
###
=== Minimal prolog ===

Appendix: B

Here is an example of a fancier prolog script. It creates a new taskid, assigns it to the job, sets resource controls, and binds the job to a resource pool. You can download it here.

=== Fancy prolog ===
#!/bin/sh
# Copyright © 2002 Sun Microsystems, Inc. All rights reserved.
# (c) 2002 Sun Microsystems, Inc. Use is subject to license terms.
#****** util/resources/s9rm_prolog.sh ***************************************
#
#  NAME
#     s9rm_prolog.sh -- Prolog to set up Solaris 9 Resource Manager
#                       resource control and accounting.
#
#  SYNOPSIS
#     s9rm_prolog.sh
#
#  FUNCTION
#     This script can be used as prolog in sge_queue(5). It creates
#     a new taskid for the job, sets the user's default project as projid,
#     sets resource controls, and binds the job to a resource pool.
#     If some of this functionality is not desired, comment out the
#     respective commands.
#
#  NOTES
#     The /usr/bin/newtask command is not available before Solaris 8.
#
#***************************************************************************
if [ ! -x /usr/bin/newtask ]
then
    echo "Warning: /usr/bin/newtask is not available, skipping Solaris 9 Resource Manager setup."
    exit 0
fi
#### Get the shepherd's process id
SHEP_PID="`/bin/ptree $$ | awk 'BEGIN {getline; getline; print $1}'`"
####
##################################################################
####
#### Setting the job's task and project ids
####
##################################################################
####
#### The lines below get the user's default project name and id.
#### You might want a fancier mapping of jobs/queues to projects.
#### The grep pattern is anchored so that only the entry whose name
#### is exactly the default project matches.
####
PROLOG_DEFAULT_PROJECT="`/bin/projects -d ${USER}`"
PROLOG_PROJECT_ID="`grep "^${PROLOG_DEFAULT_PROJECT}:" /etc/project | /usr/bin/awk -F: '{print $2}'`"
####
#### The line below creates a new task for this job and assigns it to the
#### user's default project.
####
/usr/bin/newtask -p $PROLOG_DEFAULT_PROJECT -c $SHEP_PID
PROLOG_JOB_TASKID="`/bin/ps -o taskid= -p $SHEP_PID`"
####
##################################################################
####
#### Binding the job to a resource pool
####
##################################################################
####
PROLOG_POOL=single
PROLOG_MYTASK="`/bin/ps -o taskid= -p $$`"
/usr/sbin/poolbind -p $PROLOG_POOL -i taskid $PROLOG_JOB_TASKID
##################################################################
####
#### Setting resource controls
####
##################################################################
####
/usr/bin/prctl -n task.max-lwps -v 9 -e signal=9 -i task $PROLOG_JOB_TASKID
####
exit 0
####
###
### End of the prolog
###
=== Fancy prolog ===

To use this prolog in a queue, use the following command:

qconf -mqattr prolog root@<<path to the prolog script>> <<queues>>
Appendix: C

Here are the commands that create three resource pools on a 4-CPU machine. One pool is allocated to system resources, and the other two ("single", with 1 CPU, and "dual", with 2 CPUs) can be used by SGE prologs. You can download this file here.

create system mymachine
create pset sys-procs (string pset.comment = "System Pset"; string pool.scheduler = "TS"; uint pset.min = 1; uint pset.max = 1)
create pool sys-procs (string pool.comment = "System resource pool")
associate pool sys-procs (pset sys-procs)
create pset single (string pset.comment = "SGE Pset"; string pool.scheduler = "TS"; uint pset.min = 1; uint pset.max = 1)
create pool single (string pool.comment = "SGE Resource Pool")
associate pool single (pset single)
create pset dual (string pset.comment = "SGE Pset"; string pool.scheduler = "TS"; uint pset.min = 2; uint pset.max = 2)
create pool dual (string pool.comment = "SGE Resource Pool")
associate pool dual (pset dual)
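A quick sanity check on such a command file is that the pset.min values add up to no more than the number of CPUs in the machine (here 1 + 1 + 2 = 4). The sketch below sums them with sed and awk; the function name sum_pset_min and its file argument are hypothetical, and it assumes that each pset.min setting sits whole on one line of the file.

```shell
#!/bin/sh
# Sketch only: sum_pset_min is a hypothetical helper name.

# Print the sum of all pset.min values found in a pool command file.
#   $1 = path to the file with the create/associate commands
sum_pset_min() {
    # keep only the number that follows "pset.min =" on each matching line
    sed -n 's/.*pset\.min *= *\([0-9][0-9]*\).*/\1/p' "$1" |
        awk '{ total += $1 } END { print total + 0 }'
}
```

If the printed sum is larger than the CPU count of the host, at least one processor set cannot get its minimum and the configuration needs to be revised before it is activated.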
Appendix: D

Sample epilog

Below is an epilog script that runs the simple exacct report generator available here. The epilog script can be downloaded here.

#!/bin/sh
# (c) 2002 Sun Microsystems, Inc. Use is subject to license terms.
# Copyright © 2002 Sun Microsystems, Inc. All rights reserved.
#****** util/resources/s9rm_epilog.sh ***************************************
#
#  NAME
#     s9rm_epilog.sh -- epilog to generate accounting reports at
#                       the end of a job
#
#  SYNOPSIS
#     s9rm_epilog.sh
#
#  FUNCTION
#     This script can be used as epilog in sge_queue(5). It runs
#     a program to generate a summary report from the job's exacct records.
#
#  NOTES
#     The /usr/bin/newtask command is not available before Solaris 8.
#     Please set the EPILOG_SUMMARY variable to the path of the report
#     generator program before using this script.
#
#***************************************************************************
####
#### Name the program that generates a summary of the
#### job's exacct records
####
EPILOG_SUMMARY=<<path to report generator>>/proclist
####
####
EPILOG_MYTASK="`/bin/ps -o taskid= -p $$`"
#### You might want to call newtask again here, to finish the previous task,
#### otherwise the exacct task record will not be available at this point
if [ -x $EPILOG_SUMMARY ]
then
    $EPILOG_SUMMARY
fi
####

To use this epilog in a queue, edit the variable EPILOG_SUMMARY to point to the proper path of proclist, then use the following command:

qconf -mqattr epilog <<path to the epilog script>> <<queues>>

Appendix: E

Sample Starter Method

Simple starter_method, similar to the minimal prolog (available for download here):

#!/bin/sh
# (c) 2002 Sun Microsystems, Inc. Use is subject to license terms.
# Copyright © 2002 Sun Microsystems, Inc. All rights reserved.
#****** s9rm_starter_method.sh *********************************************
#
#  NAME
#     s9rm_starter_method.sh -- starter method to associate a job with a
#                               Solaris 9 Resource Manager task.
#
#  SYNOPSIS
#     s9rm_starter_method.sh
#
#  FUNCTION
#     This script can be used as starter_method in sge_queue(5) to create
#     a new taskid for the job.
#
#  NOTES
#     The /usr/bin/newtask command is not available before Solaris 8.
#
#***************************************************************************
if [ -x /usr/bin/newtask ]
then
    SM_DEFAULT_PROJECT="`/bin/projects -d ${USER}`"
    exec /usr/bin/newtask -p $SM_DEFAULT_PROJECT $SGE_STARTER_SHELL_PATH $*
else
    echo "Warning: /usr/bin/newtask is not available, skipping Solaris 9 Resource Manager setup."
    exec $SGE_STARTER_SHELL_PATH $*
fi
####
###
### End of the starter_method
###

To use this starter_method in a queue, use the following command:

qconf -mqattr starter_method <<path to the starter_method script>> <<queues>>
By Paulo Tibério Muradas Bulhões, November 2002.