Executor
========

Execute a function on SLURM with a :code:`concurrent.futures.Executor` implementation called :code:`SlurmRestExecutor`

.. code:: python

    from pyslurmutils.concurrent.futures import SlurmRestExecutor

    with SlurmRestExecutor(
        url=url,                                # SLURM REST URL
        user_name=user_name,                    # SLURM user name
        token=token,                            # SLURM access token
        log_directory="/path/to/log",           # for log files
        data_directory="/path/to/data",         # data exchange via files (TCP when omitted)
        pre_script="module load ewoks",         # load environment
        parameters={"time_limit": "02:00:00"},  # SLURM job parameters
        python_cmd="python",                    # python command (python3 by default)
    ) as executor:
        future = executor.submit(sum, [1, 1])
        assert future.result() == 2

The Python environment can be selected with the :code:`pre_script` argument. Since *pyslurmutils* only relies on Python builtins on the SLURM side, no specific libraries are required there.

When :code:`data_directory` is not provided, data is exchanged over a TCP port that is opened locally before the job is submitted.

.. warning::

    The function is pickled locally and unpickled by the SLURM job, which means the job needs access to the module the function is defined in. When the function is defined locally in a script, its source code is sent instead of being pickled. Even so, everything used inside the function needs to exist remotely.

Multiple tasks per job
++++++++++++++++++++++

By default, a SLURM job is scheduled every time you submit a task. To allow SLURM jobs to execute more than one task, use the :code:`max_tasks_per_worker` argument. In addition, when more than one task is executed in one SLURM job, you may want to initialize the SLURM job when it starts.

For example, run two tasks per SLURM job and initialize a global variable

.. code:: python

    import os
    import socket

    from pyslurmutils.concurrent.futures import SlurmRestExecutor


    def initializer():
        global GLOBAL_VAR
        GLOBAL_VAR = 0


    def job():
        global GLOBAL_VAR
        GLOBAL_VAR += 1
        host = socket.gethostname()
        pid = os.getpid()
        ncpus = len(os.sched_getaffinity(pid))
        return f"{host=}, {pid=}, {ncpus=}, {GLOBAL_VAR=}"


    if __name__ == "__main__":
        from concurrent.futures import as_completed

        with SlurmRestExecutor(
            url=url,
            token=token,
            user_name=user_name,
            max_tasks_per_worker=2,
            initializer=initializer,
            log_directory=log_directory,
        ) as executor:
            futures = [executor.submit(job) for _ in range(16)]
            for future in as_completed(futures):
                print(future.result())

.. warning::

    Note the :code:`__name__` guard. Since the functions are defined in a local script, their source code is sent to and executed by the SLURM job. Without the guard, the submission code at the bottom of the script would be executed by the SLURM job as well.

Keep SLURM jobs alive
+++++++++++++++++++++

SLURM jobs can be kept alive for the duration of the executor context

.. code:: python

    with SlurmRestExecutor(
        url=url,
        token=token,
        user_name=user_name,
        max_workers=4,
        lazy_scheduling=False,
        initializer=initializer,
        log_directory=log_directory,
    ) as executor:
        futures = [executor.submit(job) for _ in range(16)]
        for future in as_completed(futures):
            print(future.result())

This starts four SLURM jobs when entering the context and restarts them when they go down due to :code:`max_tasks_per_worker` or a failure.
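Since :code:`SlurmRestExecutor` implements the standard :code:`concurrent.futures.Executor` interface, the generic parts of that API should apply as well. Below is a minimal sketch using :code:`Executor.map`; it assumes the implementation inherited from the standard interface (not shown in the examples above) and reuses the same placeholder connection arguments:

.. code:: python

    # Minimal sketch, assuming Executor.map behaves as prescribed by the
    # standard concurrent.futures.Executor interface.
    from pyslurmutils.concurrent.futures import SlurmRestExecutor

    with SlurmRestExecutor(
        url=url,
        token=token,
        user_name=user_name,
        log_directory="/path/to/log",
    ) as executor:
        # Each list is passed to `sum` as one task; results are yielded
        # in submission order.
        for result in executor.map(sum, [[1, 1], [2, 2], [3, 3]]):
            print(result)

As with :code:`submit`, each task may be dispatched as its own SLURM job unless :code:`max_tasks_per_worker` is set.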