If the internal submit cluster function completes successfully, the retries counter is reset to 0 and the next job or chunk is submitted.
If the internal submit cluster function returns a fatal error, the submit process is stopped completely and an exception is thrown.
If the internal submit cluster function returns a temporary error, the submit process waits for a certain time, which is determined by calling the user-defined wait function with the current retries counter; the counter is then increased by 1 and the same job is submitted again. If max.retries is reached, the function simply terminates.
Potential temporary submit warnings and errors are logged inside your file directory in the file "submit.log". To keep track you can use tail -f [file.dir]/submit.log in another terminal.
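As a minimal sketch of this retry mechanism, the call below uses a custom wait function and retry limit; the registry setup mirrors the example at the end of this page, while the chosen waiting times and retry count are illustrative assumptions, not package defaults.
library(BatchJobs)
reg = makeRegistry(id = "RetryExample", file.dir = tempfile(), seed = 1)
batchMap(reg, function(x) x + 1, 1:3)
# Wait 5, 10, 20, ... seconds between resubmissions after a temporary error
# (e.g. a filled queue) and give up after 5 attempts per job.
submitJobs(reg, wait = function(retries) 5 * 2^retries, max.retries = 5L)
# Temporary warnings and errors are appended to [file.dir]/submit.log and can
# be followed in a second terminal with: tail -f [file.dir]/submit.log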
submitJobs(
reg,
ids,
resources = list(),
wait,
max.retries = 10L,
chunks.as.arrayjobs = FALSE,
job.delay = FALSE,
progressbar = TRUE
)
reg [Registry]
Registry.
ids [integer]
Vector of job ids or a list of vectors of chunked job ids. Only the corresponding jobs are submitted. Chunked jobs will be executed sequentially as a single job for the scheduler.
Default is all jobs which have not yet been submitted to the batch system.
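For illustration, assuming reg is a registry created with makeRegistry and batchMap as in the examples below, submission can be restricted to specific job ids; findNotSubmitted is a BatchJobs helper, and the concrete id range is only an example.
submitJobs(reg, ids = 1:2)              # submit only jobs 1 and 2
submitJobs(reg, findNotSubmitted(reg))  # same jobs as the default behaviour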
resources [list]
Required resources for all batch jobs. The elements of this list (e.g. "walltime" or "nodes") are defined by your template job file.
Defaults can be specified in your config file.
Default is an empty list.
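A hedged example: the element names below ("walltime" and "memory") are assumptions that must match the placeholders used in your own template job file; submitJobs itself does not prescribe them.
# Request one hour of walltime and 2 GB of memory per job, assuming the
# template file references resources$walltime and resources$memory.
submitJobs(reg, resources = list(walltime = 3600, memory = 2048))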
wait [function(retries)]
Function that defines how many seconds should be waited in case of a temporary error.
Default is exponential back-off with 10*2^retries.
max.retries [integer(1)]
Number of times a job is submitted again in case of a temporary error (like filled queues). Each time, wait is called to wait a certain number of seconds.
Default is 10 times.
chunks.as.arrayjobs [logical(1)]
If ids are passed as a list of chunked job ids, execute the jobs in a chunk as array jobs. Note that your scheduler and your template must be adjusted to use this option.
Default is FALSE.
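Assuming your scheduler and template file are prepared for array jobs, chunks built with chunk() can be submitted as array jobs; the chunking call is the same as in the example at the end of this page.
chunked = chunk(getJobIds(reg), n.chunks = 2, shuffle = TRUE)
submitJobs(reg, chunked, chunks.as.arrayjobs = TRUE)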
job.delay [function(n, i) or logical(1)]
Function that defines how many seconds a job should be delayed before it starts. This is an expert option and only necessary to change when you want to submit an extremely large number of jobs. We then delay the jobs a bit to write the submit messages as early as possible and to avoid writer starvation. n is the number of jobs and i is the index of the current job.
The default function used when job.delay is set to TRUE applies no delay for 100 jobs or less and otherwise runif(1, 0.1*n, 0.2*n) seconds.
If set to FALSE (the default), delaying jobs is disabled.
progressbar [logical(1)]
Set to FALSE to disable the progress bar. To disable all progress bars, see makeProgressBar.
[integer]
Vector of submitted job ids.
# NOT RUN {
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x) x^2
batchMap(reg, f, 1:10)
submitJobs(reg)
waitForJobs(reg)
# Submit the 10 jobs again, now randomized into 2 chunks:
chunked = chunk(getJobIds(reg), n.chunks = 2, shuffle = TRUE)
submitJobs(reg, chunked)
# }