libdrmaa

package
v0.3.19
Published: Feb 19, 2022 License: Apache-2.0 Imports: 14 Imported by: 4

README

JobTracker Implementation on Top of libdrmaa.so (Grid Engine, SLURM, ...)

For testing in a container, run libdrmaatest.sh in the drmaa2os root directory.

Basic implementation of a JobTracker wrapper for libdrmaa.so. This is the DRMAA version 1 C library which is shipped by many workload managers. The JobTracker implementation can be used by drmaa2os to provide a Go DRMAA2 interface for DRMAA version 1.

The JobTracker uses the github.com/dgruber/drmaa Go wrapper for job submission. It supports Grid Engine (Univa Grid Engine, SGE, Son of Grid Engine), SLURM, and more.

Usage in drmaa2os

Known limitations

The drmaa (v1) implementation (at least in Grid Engine) does not allow creating different job sessions in a single process at the same time. This currently also limits the MonitoringSession: no JobSession can be created while a MonitoringSession is in use, and vice versa. If this functionality is required, a job implementation which works on command-line wrappers needs to be implemented.

Default Usage

The default usage is to create a session manager, which calls NewDRMAATracker(). The DB is used by the session manager only, to store job session names etc.

sm, err := drmaa2os.NewLibDRMAASessionManager("testdb.db")
if err != nil {
    panic(err)
}

Job Persistency

If job persistency is required (for example, to have the jobs available after a restart), the following initialization can be used:

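params := libdrmaa.LibDRMAASessionParams{
    ContactString:           "",              // leave empty unless you know what you are doing
    UsePersistentJobStorage: true,            // keep job information across restarts
    DBFilePath:              "testdbjobs.db", // created if it does not exist
}
sm, err := drmaa2os.NewLibDRMAASessionManagerWithParams(params, "testdb.db")
if err != nil {
    panic(err)
}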

This calls the underlying NewDRMAATrackerWithParams(). The contact string should be empty unless you know what you are doing. If UsePersistentJobStorage is turned on, DBFilePath must be specified; job-related information is written to that file. If the DB file does not exist, it is created. The contact string of the underlying drmaa1 session is written to the session manager DB and is used transparently when reconnecting to the same session name. Hence jobs that are still running remain available after an application restart.

JobTemplate Mapping

| DRMAA2 JobTemplate | Internal Go drmaa job template |
|---|---|
| RemoteCommand | SetRemoteCommand |
| Args | SetArgs |
| InputPath | SetInputPath(":"+InputPath) |
| OutputPath | SetOutputPath(":"+OutputPath) |
| ErrorPath | SetErrorPath(":"+ErrorPath) |
| JobName | "cdrmaatrackerjob" if not set / SetJobName |
| JoinFiles | SetJoinFiles |
| Email | SetEmail |
| JobEnvironment map[key]value | SetJobEnviornment("key=value", ...) |
| ExtensionList map["DRMAA1_NATIVE_SPECIFICATION"]value | SetNativeSpecification("value") |
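
Three of the rows above involve a small transformation rather than a plain copy: the ":" path prefix (an empty host part in the DRMAA v1 "[host]:path" notation), the default job name, and the flattening of the environment map. The following self-contained sketch illustrates them. The `jobTemplate` type and all helper names are hypothetical stand-ins, and sorting the environment entries is an assumption made only to get a stable order.

```go
package main

import (
	"fmt"
	"sort"
)

// jobTemplate is a hypothetical stand-in holding only the DRMAA2
// JobTemplate fields used in this sketch.
type jobTemplate struct {
	JobName        string
	OutputPath     string
	JobEnvironment map[string]string
}

// drmaaPath prefixes a path with ":" as shown in the mapping table.
func drmaaPath(path string) string {
	return ":" + path
}

// jobNameOrDefault returns the default name from the table when no
// job name is set.
func jobNameOrDefault(name string) string {
	if name == "" {
		return "cdrmaatrackerjob"
	}
	return name
}

// flattenEnv converts a map into "key=value" entries. Sorting is an
// assumption of this sketch, Go maps have no defined iteration order.
func flattenEnv(env map[string]string) []string {
	out := make([]string, 0, len(env))
	for k, v := range env {
		out = append(out, k+"="+v)
	}
	sort.Strings(out)
	return out
}

func main() {
	jt := jobTemplate{
		OutputPath:     "/tmp/out.log",
		JobEnvironment: map[string]string{"B": "2", "A": "1"},
	}
	fmt.Println(jobNameOrDefault(jt.JobName))
	fmt.Println(drmaaPath(jt.OutputPath))
	fmt.Println(flattenEnv(jt.JobEnvironment))
}
```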

JobState Mapping

The following table shows how DRMAA version 1 job states are mapped to DRMAA2 job states. Several DRMAA version 1 states can map to the same DRMAA2 state.

| DRMAA2 Job State | Internal Go drmaa job state |
|---|---|
| drmaa2interface.Undetermined | drmaa.PsUndetermined |
| drmaa2interface.Queued | drmaa.PsQueuedActive |
| drmaa2interface.QueuedHeld | drmaa.PsSystemOnHold |
| drmaa2interface.QueuedHeld | drmaa.PsUserOnHold |
| drmaa2interface.QueuedHeld | drmaa.PsUserSystemOnHold |
| drmaa2interface.Running | drmaa.PsRunning |
| drmaa2interface.Suspended | drmaa.PsSystemSuspended |
| drmaa2interface.Suspended | drmaa.PsUserSuspended |
| drmaa2interface.Suspended | drmaa.PsUserSystemSuspended |
| drmaa2interface.Done | drmaa.PsDone |
| drmaa2interface.Failed | drmaa.PsFailed |
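
The table can be read as a many-to-one function from DRMAA version 1 states to DRMAA2 states. The sketch below expresses it as a switch statement. To stay self-contained it uses the hypothetical local types `v1State` and `v2State` instead of drmaa.PsType and drmaa2interface.JobState; falling back to the undetermined state for unknown values is an assumption.

```go
package main

import "fmt"

// v1State is a hypothetical stand-in for drmaa.PsType.
type v1State int

const (
	psUndetermined v1State = iota
	psQueuedActive
	psSystemOnHold
	psUserOnHold
	psUserSystemOnHold
	psRunning
	psSystemSuspended
	psUserSuspended
	psUserSystemSuspended
	psDone
	psFailed
)

// v2State is a hypothetical stand-in for drmaa2interface.JobState.
type v2State string

const (
	undetermined v2State = "Undetermined"
	queued       v2State = "Queued"
	queuedHeld   v2State = "QueuedHeld"
	running      v2State = "Running"
	suspended    v2State = "Suspended"
	done         v2State = "Done"
	failed       v2State = "Failed"
)

// toV2State follows the rows of the mapping table. Unknown values
// fall back to undetermined (an assumption of this sketch).
func toV2State(s v1State) v2State {
	switch s {
	case psQueuedActive:
		return queued
	case psSystemOnHold, psUserOnHold, psUserSystemOnHold:
		return queuedHeld
	case psRunning:
		return running
	case psSystemSuspended, psUserSuspended, psUserSystemSuspended:
		return suspended
	case psDone:
		return done
	case psFailed:
		return failed
	default:
		return undetermined
	}
}

func main() {
	for _, s := range []v1State{psQueuedActive, psUserOnHold, psUserSuspended, psDone} {
		fmt.Println(toV2State(s))
	}
}
```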

JobInfo Mapping

JobInfo is set when the job is in an end state.

| DRMAA2 JobInfo | Internal Go drmaa job info |
|---|---|
| ExitStatus | Only meaningful when drmaa job HasExited() |
| ID | JobID() |
| SubmissionTime | Resource usage map submission_time value |
| DispatchTime | Resource usage map start_time value |
| FinishTime | Resource usage map end_time value |
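
As a rough illustration of the rows above, the following sketch fills a job info structure from a resource usage map. The `jobInfo` type, `usageTime` and `buildJobInfo` are hypothetical; using -1 for an exit status that is not meaningful and treating the values as seconds since the epoch are assumptions of this sketch, not documented behavior.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// jobInfo is a hypothetical stand-in for the DRMAA2 JobInfo fields
// listed in the table.
type jobInfo struct {
	ID             string
	ExitStatus     int
	SubmissionTime time.Time
	DispatchTime   time.Time
	FinishTime     time.Time
}

// usageTime reads a resource usage value as seconds since the epoch.
// It returns the zero time when the key is missing or not a number.
func usageTime(usage map[string]string, key string) time.Time {
	v, ok := usage[key]
	if !ok {
		return time.Time{}
	}
	f, err := strconv.ParseFloat(v, 64)
	if err != nil {
		return time.Time{}
	}
	return time.Unix(int64(f), 0).UTC()
}

// buildJobInfo applies the table: the exit status is only taken over
// when the job has exited (-1 otherwise, an assumption of this sketch).
func buildJobInfo(id string, hasExited bool, exitStatus int, usage map[string]string) jobInfo {
	info := jobInfo{
		ID:             id,
		ExitStatus:     -1,
		SubmissionTime: usageTime(usage, "submission_time"),
		DispatchTime:   usageTime(usage, "start_time"),
		FinishTime:     usageTime(usage, "end_time"),
	}
	if hasExited {
		info.ExitStatus = exitStatus
	}
	return info
}

func main() {
	usage := map[string]string{
		"submission_time": "1590752891.0000",
		"start_time":      "1590752895.0000",
		"end_time":        "1590752950.0000",
	}
	fmt.Printf("%+v\n", buildJobInfo("42", true, 0, usage))
}
```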

TODO add more

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ConvertDRMAA2JobTemplateToDRMAAJobTemplate

func ConvertDRMAA2JobTemplateToDRMAAJobTemplate(jt drmaa2interface.JobTemplate, t *drmaa.JobTemplate) error

ConvertDRMAA2JobTemplateToDRMAAJobTemplate transforms a Go DRMAA2 job template into a C drmaa job template.

func ConvertDRMAAJobInfoToDRMAA2JobInfo

func ConvertDRMAAJobInfoToDRMAA2JobInfo(ji *drmaa.JobInfo) (info drmaa2interface.JobInfo)

ConvertDRMAAJobInfoToDRMAA2JobInfo takes a drmaa v1 JobInfo object and converts it into a DRMAA2 JobInfo object.

func ConvertDRMAAJobTemplateToDRMAA2JobTemplate

func ConvertDRMAAJobTemplateToDRMAA2JobTemplate(jt *drmaa.JobTemplate) (drmaa2interface.JobTemplate, error)

ConvertDRMAAJobTemplateToDRMAA2JobTemplate transforms a C DRMAA job template into a Go DRMAA2 job template.

func ConvertDRMAAStateToDRMAA2State

func ConvertDRMAAStateToDRMAA2State(pt drmaa.PsType) drmaa2interface.JobState

ConvertDRMAAStateToDRMAA2State takes a DRMAA v1 state and converts it into a DRMAA2 job state.

func ConvertQstatJobState added in v0.3.17

func ConvertQstatJobState(state string) drmaa2interface.JobState

func ConvertUnixToTime

func ConvertUnixToTime(t string) time.Time

ConvertUnixToTime converts a timestamp string such as 1590752891.0000 to a time.Time. Some systems report milliseconds since the epoch, others just seconds.
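
One way to handle both units is a magnitude check, sketched below. This is not necessarily how ConvertUnixToTime is implemented: the function name `parseEpoch`, the error return and the threshold value are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// millisThreshold is an assumed cut-off: values at or above it are
// treated as milliseconds. As seconds, this value lies far in the
// future; as milliseconds, it lies in the early 1970s.
const millisThreshold = 100_000_000_000

// parseEpoch parses strings like "1590752891.0000" and decides by
// magnitude whether the number means seconds or milliseconds.
func parseEpoch(s string) (time.Time, error) {
	f, err := strconv.ParseFloat(strings.TrimSpace(s), 64)
	if err != nil {
		return time.Time{}, err
	}
	n := int64(f) // the fractional part is dropped
	if n >= millisThreshold {
		return time.Unix(n/1000, (n%1000)*int64(time.Millisecond)).UTC(), nil
	}
	return time.Unix(n, 0).UTC(), nil
}

func main() {
	for _, in := range []string{"1590752891.0000", "1590752891000"} {
		t, err := parseEpoch(in)
		if err != nil {
			panic(err)
		}
		fmt.Println(in, "->", t.Format(time.RFC3339))
	}
}
```

Both inputs in main describe the same instant, once in seconds and once in milliseconds.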

func NewAllocator

func NewAllocator() *allocator

func ParseLines added in v0.3.17

func ParseLines(out string) []string

func ParseQhostForHostnames added in v0.3.17

func ParseQhostForHostnames(out string) []drmaa2interface.Machine

func ParseQstatForAllJobIDs added in v0.3.17

func ParseQstatForAllJobIDs(out string) []string

func ParseQstatForJobIDs added in v0.3.17

func ParseQstatForJobIDs(out string, ids []string) map[string]string

func Qconf added in v0.3.17

func Qconf(args []string) (string, error)

func QconfSQL added in v0.3.17

func QconfSQL() ([]string, error)

func QhostGetAllHosts added in v0.3.17

func QhostGetAllHosts() ([]drmaa2interface.Machine, error)

func QstatGetJobIDs added in v0.3.17

func QstatGetJobIDs() ([]string, error)

func QstatJobState added in v0.3.17

func QstatJobState(jobID string) (string, error)

Types

type DRMAATracker

type DRMAATracker struct {
	sync.Mutex
	// contains filtered or unexported fields
}

DRMAATracker implements the JobTracker interface with drmaa.so as backend for job management. That allows using drmaa.so through a DRMAA2-compatible interface.

func NewDRMAATracker

func NewDRMAATracker() (*DRMAATracker, error)

NewDRMAATracker creates a new JobTracker interface implementation which manages jobs through the drmaa (version 1) interface.

func NewDRMAATrackerWithParams

func NewDRMAATrackerWithParams(params interface{}) (*DRMAATracker, error)

func (*DRMAATracker) AddArrayJob

func (t *DRMAATracker) AddArrayJob(template drmaa2interface.JobTemplate, begin int, end int, step int, maxParallel int) (string, error)

AddArrayJob submits an array job through the underlying drmaa.so RunBulkJobs function.
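
The begin, end and step parameters describe which task indices the array job consists of. The sketch below enumerates them under the assumption that they follow the usual DRMAA bulk-job semantics (inclusive end, positive increment); `arrayTaskIndices` is a hypothetical helper and maxParallel is not modeled.

```go
package main

import (
	"errors"
	"fmt"
)

// arrayTaskIndices lists begin, begin+step, ... up to and including
// end (assumed bulk-job semantics).
func arrayTaskIndices(begin, end, step int) ([]int, error) {
	if step <= 0 {
		return nil, errors.New("step must be positive")
	}
	if begin > end {
		return nil, errors.New("begin must not exceed end")
	}
	indices := make([]int, 0, (end-begin)/step+1)
	for i := begin; i <= end; i += step {
		indices = append(indices, i)
	}
	return indices, nil
}

func main() {
	indices, err := arrayTaskIndices(1, 10, 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(indices) // prints [1 4 7 10]
}
```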

func (*DRMAATracker) AddJob

func (t *DRMAATracker) AddJob(template drmaa2interface.JobTemplate) (string, error)

AddJob makes a new job submission through the underlying drmaa.so RunJob function.

func (*DRMAATracker) Close

func (jt *DRMAATracker) Close() error

func (*DRMAATracker) CloseMonitoringSession added in v0.3.17

func (m *DRMAATracker) CloseMonitoringSession(name string) error

func (*DRMAATracker) Contact

func (t *DRMAATracker) Contact() (string, error)

Contact returns the contact string. It implements the ContactStringer interface and is used for getting the job session name of DRMAA1 out of Grid Engine.

func (*DRMAATracker) DeleteJob

func (t *DRMAATracker) DeleteJob(jobID string) error

DeleteJob removes a job from the internal DB. It can only be removed when it is in an end state (failed or done).

func (*DRMAATracker) DestroySession

func (t *DRMAATracker) DestroySession() error

DestroySession is not part of the interface but necessary for shutting down the connection to the workload manager.

func (*DRMAATracker) GetAllJobIDs added in v0.3.17

func (m *DRMAATracker) GetAllJobIDs(filter *drmaa2interface.JobInfo) ([]string, error)

func (*DRMAATracker) GetAllMachines added in v0.3.17

func (m *DRMAATracker) GetAllMachines(names []string) ([]drmaa2interface.Machine, error)

GetAllMachines returns all machines the cluster consists of. If names is not nil, it returns only the machines whose names are listed in names and which are part of the cluster.
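
The filter semantics described above can be sketched as follows. The `machine` type is a hypothetical stand-in for drmaa2interface.Machine reduced to a name, and `filterMachines` is not part of this package.

```go
package main

import "fmt"

// machine is a hypothetical stand-in for drmaa2interface.Machine.
type machine struct {
	Name string
}

// filterMachines returns all machines when names is nil, otherwise
// only those cluster machines whose name is listed in names.
func filterMachines(all []machine, names []string) []machine {
	if names == nil {
		return all
	}
	wanted := make(map[string]bool, len(names))
	for _, n := range names {
		wanted[n] = true
	}
	var out []machine
	for _, m := range all {
		if wanted[m.Name] {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	cluster := []machine{{Name: "node1"}, {Name: "node2"}, {Name: "node3"}}
	fmt.Println(filterMachines(cluster, nil))
	fmt.Println(filterMachines(cluster, []string{"node2", "unknown"}))
}
```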

func (*DRMAATracker) GetAllQueueNames added in v0.3.17

func (m *DRMAATracker) GetAllQueueNames(names []string) ([]string, error)

func (*DRMAATracker) JobControl

func (t *DRMAATracker) JobControl(jobID, action string) error

JobControl applies the given control action to the job.

func (*DRMAATracker) JobInfo

func (t *DRMAATracker) JobInfo(jobID string) (drmaa2interface.JobInfo, error)

JobInfo returns more detailed information about a job when the job is finished.

func (*DRMAATracker) JobInfoFromMonitor added in v0.3.17

func (m *DRMAATracker) JobInfoFromMonitor(id string) (drmaa2interface.JobInfo, error)

func (*DRMAATracker) JobState

func (t *DRMAATracker) JobState(jobID string) (drmaa2interface.JobState, string, error)

JobState returns the current state and substate of the given job.

func (*DRMAATracker) JobTemplate

func (jt *DRMAATracker) JobTemplate(jobID string) (drmaa2interface.JobTemplate, error)

JobTemplate returns the stored job template of the job. This job tracker implements the JobTemplater interface in addition to the JobTracker interface.

func (*DRMAATracker) ListArrayJobs

func (t *DRMAATracker) ListArrayJobs(arrayJobID string) ([]string, error)

ListArrayJobs returns the job IDs of all tasks of the given job array.

func (*DRMAATracker) ListJobCategories

func (t *DRMAATracker) ListJobCategories() ([]string, error)

ListJobCategories returns the job categories available at the workload manager. Since this is not a drmaa v1 concept we ignore it for now.

func (*DRMAATracker) ListJobs

func (t *DRMAATracker) ListJobs() ([]string, error)

ListJobs returns all jobs previously submitted and still locally cached.

func (*DRMAATracker) OpenMonitoringSession added in v0.3.17

func (m *DRMAATracker) OpenMonitoringSession(name string) error

func (*DRMAATracker) Wait

func (t *DRMAATracker) Wait(jobid string, timeout time.Duration, state ...drmaa2interface.JobState) error

Wait blocks until the job reaches one of the given states or the timeout expires.

type LibDRMAASessionParams

type LibDRMAASessionParams struct {
	// ContactString is required also for opening job sessions
	// hence do not change the name "ContactString" as SessionManager
	// depends on that through reflection
	ContactString string
	// UsePersistentJobStorage saves job ids in a DB file
	// so that they are available after an application restart
	// (could be slower with massive amounts of jobs)
	UsePersistentJobStorage bool
	// DBFilePath points to an existing or non-existing boltdb file
	// which is used when persistent storage is used.
	DBFilePath string
}

LibDRMAASessionParams contains arguments which can be evaluated during DRMAA2 job session creation.

func (*LibDRMAASessionParams) SetContact

func (l *LibDRMAASessionParams) SetContact(contact string)

type WorkloadManagerType

type WorkloadManagerType int

WorkloadManagerType is related to a specific drmaa.so backend, as there are minor differences in terms of capabilities.

const (
	// UnivaGridEngine as recognized drmaa.so backend
	UnivaGridEngine WorkloadManagerType = iota
	// SonOfGridEngine as recognized drmaa.so backend
	SonOfGridEngine
)
