Arcane  4.1.12.0
User documentation
Concurrency and Multi-threading

Arcane implements concurrency through the notion of a task.

Tasks allow multiple operations to run concurrently on threads.

Tasks complement the domain decomposition provided by Arcane::IParallelMng, so domain decomposition and threads can be combined freely.

Warning
However, if Arcane::IParallelMng is implemented with MPI, calling Arcane::IParallelMng while tasks are running concurrently (for example inside parallelized loops) is not recommended. Most MPI implementations perform poorly in this mode, and some support it only partially.
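
One way to follow this advice is to keep communication outside the threaded region: threads compute local partial results, and a single communication call is made once they have all finished. The sketch below uses only the C++ standard library. standInReduceSum() is a hypothetical placeholder for a collective operation, not an Arcane or MPI function, and globalSum() is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical placeholder for a collective sum across subdomains.
// In this single-process sketch it simply returns its argument.
double standInReduceSum(double local_value)
{
  return local_value;
}

double globalSum(const std::vector<double>& values, int nb_thread)
{
  assert(nb_thread > 0);
  const auto n = static_cast<std::ptrdiff_t>(values.size());
  std::vector<double> partial(nb_thread, 0.0);
  std::vector<std::thread> threads;
  for (int t = 0; t < nb_thread; ++t) {
    threads.emplace_back([&, t] {
      // Threaded part: local computation only, no communication.
      const std::ptrdiff_t begin = n * t / nb_thread;
      const std::ptrdiff_t end = n * (t + 1) / nb_thread;
      partial[t] = std::accumulate(values.begin() + begin, values.begin() + end, 0.0);
    });
  }
  for (std::thread& th : threads)
    th.join();
  // Sequential part: one communication call after all threads are done.
  const double local = std::accumulate(partial.begin(), partial.end(), 0.0);
  return standInReduceSum(local);
}
```

With this structure the communication layer is only ever called from one thread at a time.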

To use tasks, include the Arcane concurrency header (Concurrency.h), which provides the classes, types, and macros for managing concurrency.

There are two mechanisms for using tasks:

  1. Implicitly via the notion of a parallel loop
  2. Explicitly by creating tasks directly

The first solution is the simplest and should be considered first.

Activation

By default, concurrency support is disabled. Activation is done before launching the code, by specifying the number of tasks that can run concurrently on the command line (see page Launching a Calculation to find out how to do this).

It is possible to check in the code whether concurrency is active by calling the Arcane::TaskFactory::isActive() method.

It is not possible to activate concurrency during execution.

Parallel Loops

There are two forms of parallel loops. The first form applies to classic loops, the second to groups of entities.

The mechanism works like the omp parallel for directive in OpenMP.

Warning
The user of this mechanism must ensure that the loop can be parallelized correctly, without side effects. In particular (but not exclusively), the loop iterations must be independent of one another, and the loop body must not exit the loop early (return, break).
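
To make the independence requirement concrete, the following standalone sketch contrasts a loop that can be split into chunks with one that cannot. It uses standard C++ only. The function names are hypothetical and are not part of Arcane.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Independent iterations: p[i] depends only on data at index i, so the
// chunks [begin, begin+size) can be processed in any order or concurrently.
void computePressure(std::vector<double>& p, const std::vector<double>& gamma,
                     const std::vector<double>& rho, const std::vector<double>& e,
                     std::size_t begin, std::size_t size)
{
  assert(begin + size <= p.size());
  for (std::size_t i = begin; i < begin + size; ++i)
    p[i] = (gamma[i] - 1.0) * rho[i] * e[i];
}

// Dependent iterations: a[i] reads a[i-1], which the previous iteration
// wrote. Running chunks of this loop concurrently would give wrong results,
// so it must not be handed to a parallel loop as is.
void prefixSum(std::vector<double>& a)
{
  for (std::size_t i = 1; i < a.size(); ++i)
    a[i] += a[i - 1];
}
```

Calling computePressure() on the two halves of the range in either order produces the same result as one call over the whole range, which is the property a parallel loop relies on.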

The first form is for parallelizing the following sequential loop:

void func()
{
  for( Integer i=0; i<n; ++i )
    p[i] = (gamma[i]-1) * rho[i] * e[i];
}

Parallelization is done as follows: you must first write a functor class that represents the operation you wish to perform over an iteration interval. Then, you must use the arcaneParallelFor() operation, specifying this functor as an argument, as follows:

class Func
{
 public:
  void exec(Integer begin,Integer size)
  {
    for( Integer i=begin; i<(begin+size); ++i )
      p[i] = (gamma[i]-1) * rho[i] * e[i];
  }
};
void func()
{
  Func my_functor;
  Arcane::arcaneParallelFor(0,n,&my_functor,&Func::exec);
}

This syntax is a bit verbose. If the compiler supports the C++11 standard, it is possible to use lambda functions to simplify the writing:

void func()
{
  Arcane::arcaneParallelFor(0,n,[&](Integer begin,Integer size){
    for( Integer i=begin; i<(begin+size); ++i )
      p[i] = (gamma[i]-1.0) * rho[i] * e[i];
  });
}
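
The (begin, size) contract used by the functor and the lambda above can be illustrated without Arcane. The sketch below is a hypothetical stand-in written with std::thread. It is not how Arcane implements arcaneParallelFor(), and the name sketchParallelFor and its nb_thread parameter are assumptions made for this example. Plain int stands in for Arcane's Integer.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in, NOT Arcane's implementation: splits
// [i0, i0+size) into at most nb_thread contiguous chunks and calls
// body(begin, chunk_size) for each chunk in its own thread.
template <typename Body>
void sketchParallelFor(int i0, int size, int nb_thread, Body body)
{
  assert(nb_thread > 0);
  if (size <= 0)
    return;
  nb_thread = std::min(nb_thread, size);
  const int chunk = (size + nb_thread - 1) / nb_thread; // ceiling division
  std::vector<std::thread> threads;
  for (int begin = i0; begin < i0 + size; begin += chunk) {
    const int n = std::min(chunk, i0 + size - begin);
    // body outlives the threads because they are joined below.
    threads.emplace_back([&body, begin, n] { body(begin, n); });
  }
  for (std::thread& t : threads)
    t.join();
}
```

The body receives a sub-range rather than a single index, so it contains its own inner loop, exactly as in the lambda passed to arcaneParallelFor() above.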

A specialization exists for groups of entities. To parallelize an enumeration over a group like the following code:

void func()
{
  ENUMERATE_(Cell, icell, my_group){
    p[icell] = (gamma[icell]-1.0) * rho[icell] * e[icell];
  }
}

You must write it like this:

using namespace Arcane;
class Func
{
 public:
  void exec(CellVectorView view)
  {
    ENUMERATE_(Cell, icell, view){
      p[icell] = (gamma[icell]-1.0) * rho[icell] * e[icell];
    }
  }
};
void func()
{
  Func my_functor;
  arcaneParallelForeach(my_group,&my_functor,&Func::exec);
}

Similarly, with C++11 support, you can simplify:

using namespace Arcane;
void func()
{
  arcaneParallelForeach(my_group,[&](CellVectorView cells){
    ENUMERATE_(Cell, icell, cells){
      p[icell] = (gamma[icell]-1.0) * rho[icell] * e[icell];
    }
  });
}

Both Arcane::arcaneParallelFor() and Arcane::arcaneParallelForeach() accept an instance of ParallelLoopOptions as an argument to configure the parallel loop. For example, it is possible to specify the size of the intervals into which the loop is split:

void func()
{
  ParallelLoopOptions options;
  // Executes the loop in chunks of about 50 cells.
  options.setGrainSize(50);
  arcaneParallelForeach(my_group,options,[&](CellVectorView cells){
    ENUMERATE_(Cell, icell,cells){
      p[icell] = (gamma[icell]-1.0) * rho[icell] * e[icell];
    }
  });
}
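
To see what a grain size means for the iteration range, the sketch below cuts [0, n) into intervals of at most grain_size elements. It uses standard C++ only, and splitByGrain is a hypothetical helper, not an Arcane function. The documentation above describes the grain size as approximate, so the real scheduler may pick different boundaries. This sketch uses exact fixed-size chunks purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: returns the (begin, size) pairs that cover [0, n)
// with intervals of grain_size elements; the last one may be shorter.
std::vector<std::pair<int, int>> splitByGrain(int n, int grain_size)
{
  assert(grain_size > 0);
  std::vector<std::pair<int, int>> chunks;
  for (int begin = 0; begin < n; begin += grain_size)
    chunks.emplace_back(begin, std::min(grain_size, n - begin));
  return chunks;
}
```

For 120 cells and a grain size of 50 this yields three intervals, of 50, 50, and 20 cells. A smaller grain size gives more intervals to balance across threads at the cost of more scheduling overhead.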

Explicit Task Usage

Creating a task is done via the task factory. You must specify a functor as an argument in the same way as with parallel loops:

class Func
{
 public:
  void exec(const Arcane::TaskContext& ctx)
  {
    // Execute the task.
  }
};
void func()
{
  Func my_functor;
  Arcane::ITask* master_task = Arcane::TaskFactory::createTask(&my_functor,&Func::exec);
}

Once the task is created, it can be launched and waited on with the Arcane::ITask::launchAndWait() method. To keep things simple, the task does not start until this method is called.

It is possible to create sub-tasks from a primary task using the Arcane::TaskFactory::createChildTask() method. The user must manage the launching and waiting of sub-tasks. For example:

using namespace Arcane;
ITask* master_task = TaskFactory::createTask(...);
UniqueArray<ITask*> sub_tasks;
sub_tasks.add(TaskFactory::createChildTask(master_task,&my_functor,&Func::exec));
sub_tasks.add(TaskFactory::createChildTask(master_task,&my_functor,&Func::exec));
master_task->launchAndWait(sub_tasks);
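
The launch-then-wait pattern for sub-tasks can be tried outside Arcane with the standard library. The sketch below is a hypothetical analogue built on std::async. It does not use or reproduce the Arcane task API, and launchAndWaitAll is an illustrative name.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <vector>

// Hypothetical analogue of "launch the sub-tasks and wait for them":
// every entry starts in its own thread, and the call returns only
// once all of them have finished.
void launchAndWaitAll(const std::vector<std::function<void()>>& sub_tasks)
{
  std::vector<std::future<void>> pending;
  for (const std::function<void()>& work : sub_tasks) {
    assert(static_cast<bool>(work)); // reject empty functions
    pending.push_back(std::async(std::launch::async, work));
  }
  for (std::future<void>& f : pending)
    f.get(); // blocks until the sub-task is done
}
```

As with the Arcane sub-tasks above, nothing runs until the launching call is made, and the caller resumes only after every sub-task has completed.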

The following complete example shows the implementation of calculating a Fibonacci sequence using the task mechanism.

using namespace Arcane;
class Fibonnaci
{
 public:
  const long n;
  long* const sum;
  Fibonnaci( long n_, long* sum_ ) : n(n_), sum(sum_)
  {}
  void execute(const TaskContext& context)
  {
    if( n<10 ) {
      // Small problem: compute sequentially.
      *sum = SerialFib(n);
    }
    else {
      long x, y;
      Fibonnaci a(n-1,&x);
      Fibonnaci b(n-2,&y);
      ITask* child_tasks[2];
      ITask* parent_task = context.task();
      child_tasks[0] = TaskFactory::createChildTask(parent_task,&a,&Fibonnaci::execute);
      child_tasks[1] = TaskFactory::createChildTask(parent_task,&b,&Fibonnaci::execute);
      parent_task->launchAndWait(ConstArrayView<ITask*>(2,child_tasks));
      // Perform the sum
      *sum = x+y;
    }
  }
  static long SerialFib( long n )
  {
    if( n<2 )
      return n;
    else
      return SerialFib(n-1)+SerialFib(n-2);
  }
  static long ParallelFib( long n )
  {
    long sum;
    Fibonnaci a(n,&sum);
    ITask* task = TaskFactory::createTask(&a,&Fibonnaci::execute);
    task->launchAndWait();
    return sum;
  }
};
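
For readers who want to experiment with the same divide-and-conquer structure without an Arcane build, here is a minimal analogue using std::async. It is a sketch under that assumption, not a translation of the Arcane task API. The cutoff parameter plays the role of the n<10 test above, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <future>

// Sequential version, used below the cutoff.
long serialFib(long n)
{
  return n < 2 ? n : serialFib(n - 1) + serialFib(n - 2);
}

// Parallel version: the n-1 branch runs as a child in another thread,
// and the n-2 branch is computed in the current thread.
long parallelFib(long n, long cutoff = 10)
{
  assert(n >= 0);
  // A cutoff of at least 2 keeps the recursion on non-negative values.
  if (n < std::max(cutoff, 2L))
    return serialFib(n);
  std::future<long> x = std::async(std::launch::async, parallelFib, n - 1, cutoff);
  const long y = parallelFib(n - 2, cutoff);
  // Wait for the child, then perform the sum.
  return x.get() + y;
}
```

Below the cutoff the work is too small to be worth a new task, which is why both this sketch and the Arcane example above switch to the sequential version.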