Welcome to the atools (Job Array Tools) documentation
atools has been designed to conveniently deal with job arrays, a feature
supported by many queue systems and schedulers. A job array consists of
a (potentially large) number of individual tasks that can be run in
parallel, independent of one another.
Typically, these tasks originate from a few scenarios such as
- performing the same computation on many input files, or
- running an algorithm with many different parameter sets.
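The second scenario can be sketched in plain shell, without atools itself. This is a minimal illustration, assuming a CSV file with a header row and one parameter set per line, so that task N reads line N + 1. The helper name `params_for_task` and the parameter names `alpha` and `beta` are hypothetical, chosen only for this example.

```shell
# Hypothetical helper: print the parameter line for a given task ID.
# Line 1 of the file is assumed to be a header, so task N is on line N + 1.
params_for_task() {
    sed -n "$(( $2 + 1 ))p" "$1"
}

# Throwaway data file with made-up parameters alpha and beta
demo_file="$(mktemp)"
printf 'alpha,beta\n0.1,10\n0.2,20\n0.3,30\n' > "$demo_file"

# In a real job array the scheduler supplies the task ID and the tasks
# run in parallel; here a loop stands in for the scheduler.
for task_id in 1 2 3; do
    line="$(params_for_task "$demo_file" "$task_id")"
    alpha="${line%%,*}"   # text before the first comma
    beta="${line#*,}"     # text after the first comma
    echo "task ${task_id}: alpha=${alpha} beta=${beta}"
done

rm -f "$demo_file"
```

Each task only needs its own index to find its work, which is what makes the tasks independent of one another.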
atools in combination with a queue system or scheduler will allow you
to conveniently handle such MapReduce scenarios without the overhead,
both in terms of computation and setup, of systems such as Hadoop or
atools supports PBS Torque, Adaptive Computing Moab, Sun Grid
Engine, and the Slurm workload manager. Extending the list to other resource
managers and schedulers should be easy if they support a feature similar in
spirit to job arrays.
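Each of these systems exposes the index of the running task through its own environment variable, which is the common ground a tool needs. The following is a rough sketch of detecting that index in a portable way. It is not how atools is implemented; the function name `array_task_id` is hypothetical, while the variable names are the ones the schedulers themselves set.

```shell
# Print the array task ID set by whichever scheduler started this job.
# Returns non-zero when none of the known variables holds a usable value.
array_task_id() {
    # Torque, Moab, Grid Engine, Slurm (in that order)
    for value in "${PBS_ARRAYID:-}" "${MOAB_JOBARRAYINDEX:-}" \
                 "${SGE_TASK_ID:-}" "${SLURM_ARRAY_TASK_ID:-}"; do
        # Grid Engine reports "undefined" outside of array jobs
        if [ -n "$value" ] && [ "$value" != "undefined" ]; then
            echo "$value"
            return 0
        fi
    done
    return 1
}

if task_id="$(array_task_id)"; then
    echo "running as array task ${task_id}"
else
    echo "not running inside a job array"
fi
```

Supporting a further scheduler in this sketch would amount to adding its variable to the list, provided it offers an equivalent of job arrays.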
This documentation provides a walkthrough of the features and serves as a reference for the more arcane ones. Topics:
- atools features using templates (using
- instantiating parameter values per task (using
- logging task start and completion information (using
- resuming computations if not all tasks were completed
- aggregating output generated by the tasks (using `areduce`,
- analyzing task run times and load balance (using
atools is an open source project hosted on