# Parallel execution
Scheduler options for parallel execution can be provided when executing a workflow:

```python
from ewoksdask import execute_graph

result = execute_graph("/path/to/graph.json", scheduler=..., scheduler_options=...)
```
The different schedulers are:

- `"multithreading"`:
  - `num_workers`: CPU_COUNT by default
- `"multiprocessing"`:
  - `context`: "spawn" or "fork"
  - `num_workers`: CPU_COUNT by default
- `"cluster"`: scheduler with workers in the current process
  - `n_workers`: number of workers (each worker has multiple threads)
  - `threads_per_worker`: number of threads per worker
  - `processes`: subprocesses instead of threads
- `"127.0.0.1:40331"`: address of a remote scheduler
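As an illustration, the scheduler is selected by name and its options passed as a dictionary. The option values below are hypothetical examples, not defaults:

```python
# Hypothetical option sets for each scheduler; with ewoksdask installed they
# would be passed to execute_graph, e.g.:
#
#   from ewoksdask import execute_graph
#   result = execute_graph(
#       "/path/to/graph.json",
#       scheduler="multiprocessing",
#       scheduler_options=options["multiprocessing"],
#   )

options = {
    "multithreading": {"num_workers": 4},
    "multiprocessing": {"context": "spawn", "num_workers": 4},
    "cluster": {"n_workers": 2, "threads_per_worker": 2, "processes": True},
}
```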
## Remote scheduler
Start a scheduler (+ workers) on any host:

```python
from ewoksdask.clusters import local_scheduler

cluster = local_scheduler(n_workers=5)
```
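Once the cluster is running, a workflow can be pointed at it by passing the scheduler address. A minimal sketch, assuming ewoksdask is installed and a scheduler is listening (the address is an example; `scheduler_address` is the usual attribute on dask cluster objects):

```python
# The real calls are commented out because they require a running scheduler;
# the live code only shows the shape of the arguments.
#
# from ewoksdask import execute_graph
# from ewoksdask.clusters import local_scheduler
#
# cluster = local_scheduler(n_workers=5)
# result = execute_graph("/path/to/graph.json", scheduler=cluster.scheduler_address)

kwargs = {"scheduler": "tcp://127.0.0.1:40331"}  # example remote scheduler address
```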
## Separate processes
Start a scheduler on any host:

```shell
dask scheduler
```

Add workers to the scheduler, 4 worker processes per command (older dask versions used `--nprocs` instead of `--nworkers`):

```shell
dask worker 127.0.0.1:8786 --nworkers 4
dask worker 127.0.0.1:8786 --nworkers 4
dask worker 127.0.0.1:8786 --nworkers 4
```
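Rather than typing the worker command repeatedly, the invocations can be generated from a small script. A sketch using only the standard library; actually launching the workers is commented out since it requires the dask CLI and a running scheduler, and the helper name is hypothetical:

```python
import shlex
# import subprocess  # needed only to actually launch the workers

def worker_commands(address="127.0.0.1:8786", nworkers=4, hosts=3):
    """Build one 'dask worker' command line per worker group."""
    return [f"dask worker {address} --nworkers {nworkers}" for _ in range(hosts)]

for cmd in worker_commands():
    print(cmd)
    # subprocess.Popen(shlex.split(cmd))  # would start this group of workers
```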
## Slurm scheduler
Start a scheduler on a Slurm submitter host (spawns one Slurm job for each worker):

```python
from ewoksdask.clusters import slurm_scheduler

cluster = slurm_scheduler(maximum_jobs=5)
```