Arcane variables usually use the default allocator to allocate memory. Without a GPU, local machine memory is used, and with a GPU, unified memory is used.
A new memory allocator (internal to Arcane) is available to allocate memory in the machine's shared memory. Internally, it uses the previously presented class Arcane::MachineShMemWin. We therefore have access to non-contiguous segments.
This mode is compatible with all Arcane variable types except scalar variables without support (e.g., VariableScalarReal) and partial variables.
The main difficulty in using this shared memory mode is ensuring that all calls that reallocate memory are collective.
For variables resized by Arcane, the user does not need to worry about these collective calls; Arcane handles them. For example, with mesh variables, Arcane handles resizing if the mesh evolves.
Conversely, for variables that expose a resize() method (or reshape() for multi-dimensional variables), all subdomains of the machine must call this method, even if some of them only call var.resize(var.size()) because they do not need a new size.
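The rule above can be illustrated without Arcane. The following is a minimal sketch in plain C++ that simulates the subdomains of one node inside a single process. `FakeSharedArray` and `collectiveResize` are hypothetical names introduced for this illustration and are not part of Arcane. The point it shows is that every subdomain takes part in the call, passing its current size when it has nothing to change.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for one subdomain's array (not an Arcane class).
struct FakeSharedArray
{
  std::vector<int> data;
  int resize_calls = 0; // how many times this subdomain took part in a resize

  void resize(std::size_t n)
  {
    data.resize(n);
    ++resize_calls;
  }
};

// Simulates one collective resize over all subdomains of a node.
// wanted[i] holds the new size requested by subdomain i, or std::nullopt
// when subdomain i does not need a new size.
inline void collectiveResize(std::vector<FakeSharedArray>& subdomains,
                             const std::vector<std::optional<std::size_t>>& wanted)
{
  for (std::size_t i = 0; i < subdomains.size(); ++i) {
    FakeSharedArray& var = subdomains[i];
    // Same idea as var.resize(var.size()) for subdomains with nothing to change.
    var.resize(wanted[i].value_or(var.data.size()));
  }
}
```

In a real run each subdomain is a separate process, so the loop disappears and each process issues its own call; the sketch only makes the "everyone calls, possibly with an unchanged size" pattern visible.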
Setting aside these resizing calls, the use of shared memory variables is identical to the use of local memory variables.
To declare a variable in shared memory, simply add the IVariable::PInShMem property when creating it (in AXL files, the corresponding option is in-shmem="true").
The point of placing variables in shared memory is to access data from other subdomains without exchanging messages.
To access data from all subdomains, you can use the MachineShMemWinVariable classes. There is one class per Arcane variable type:
| Variable Type (example) | Class to Use |
|---|---|
| 1D Array Variable without support (Arcane::VariableArrayInt32) | Arcane::MachineShMemWinVariableArrayT |
| Mesh Scalar Variable (Arcane::VariableCellInt32) | Arcane::MachineShMemWinMeshVariableScalarT |
| 2D Array Variable without support (Arcane::VariableArray2Int32) | Arcane::MachineShMemWinVariableArray2T |
| Mesh 1D Array Variable (Arcane::VariableCellArrayInt32) | Arcane::MachineShMemWinMeshVariableArrayT |
| Scalar Multi-dimensional Variable (Arcane::MeshMDVariableRefT<Cell, Real, MDDim2>) | Arcane::MachineShMemWinMeshMDVariableT |
| Vector Multi-dimensional Variable (Arcane::MeshVectorMDVariableRefT<Cell, Real, 7, MDDim2>) | Arcane::MachineShMemWinMeshVectorMDVariableT |
| Matrix Multi-dimensional Variable (Arcane::MeshMatrixMDVariableRefT<Cell, Real, 2, 5, MDDim1>) | Arcane::MachineShMemWinMeshMatrixMDVariableT |
Three methods are common to these classes: machineRanks(), barrier() and updateVariable().
The first two have already been briefly described in the previous section (Usage).
Arcane::MachineShMemWinVariableCommon::machineRanks() allows retrieving the ranks of the computation node's subdomains.
For example, if the returned view contains [0, 2, 4, 6], we know that the computation node possesses these subdomains and that we have access to their data via MachineShMemWin.
Since ranks are numbered contiguously from 0 to commSize() - 1, the Arcane::IParallelMng::commSize() method also lets us determine which subdomains are not in our computation node. For example, if commSize() = 8, then the subdomains that require inter-node communications are subdomains [1, 3, 5, 7].
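The deduction above can be sketched in plain C++. This is an illustration only: `remoteRanks` is a hypothetical helper, not an Arcane function, and it takes a `std::vector` where real code would use the view returned by machineRanks(). It assumes the machine ranks are sorted in ascending order.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the ranks in [0, comm_size) that are absent from machine_ranks,
// i.e. the subdomains located on other computation nodes.
// Assumption: machine_ranks is sorted in ascending order.
inline std::vector<std::int32_t> remoteRanks(const std::vector<std::int32_t>& machine_ranks,
                                             std::int32_t comm_size)
{
  std::vector<std::int32_t> result;
  std::size_t next = 0; // index of the next machine rank still to be matched
  for (std::int32_t rank = 0; rank < comm_size; ++rank) {
    if (next < machine_ranks.size() && machine_ranks[next] == rank)
      ++next; // rank is on this node: reachable through shared memory
    else
      result.push_back(rank); // rank is on another node: needs messages
  }
  return result;
}
```

With machine ranks [0, 2, 4, 6] and a communicator size of 8, the function returns [1, 3, 5, 7], matching the example in the text.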
Arcane::MachineShMemWinVariableCommon::barrier() allows performing a barrier for all subdomains of the computation node (so, if we take the previous example, a barrier for subdomains [0, 2, 4, 6]).
This is useful when subdomains share information through a memory window: the barrier makes the subdomains of the node wait until each of them has written to its window before the others read this data. The granularity is finer than that of Arcane::IParallelMng::barrier().
The real difference from the previous section is the method Arcane::MachineShMemWinMeshVariableArrayT::updateVariable().
Internally, as explained in the introduction, we use an allocator that allocates memory in shared memory and we use the Arcane::MachineShMemWin class to access it.
Arcane::MachineShMemWinVariable in turn uses Arcane::MachineShMemWin to access the shared memory of the variables.
The problem is that the size of an array in Arcane is not necessarily equal to the amount of memory allocated for it. Consequently, internally, we cannot rely on the size returned by Arcane::MachineShMemWin to build views on the variables.
We must therefore retrieve the sizes of the variables from each subdomain in another way. To do this, we use a memory window to share them.
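The gap between logical size and allocated memory is the same one that exists between `size()` and `capacity()` of a `std::vector`. The sketch below uses `std::vector` purely as an analogy; it says nothing about how Arcane arrays are implemented, and `SizeInfo` and `reserveThenResize` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Logical element count versus number of elements the allocation can hold.
struct SizeInfo
{
  std::size_t logical_size;
  std::size_t allocated_capacity;
};

// Reserves room for `reserved` elements, then uses only `used` of them.
inline SizeInfo reserveThenResize(std::size_t reserved, std::size_t used)
{
  std::vector<int> v;
  v.reserve(reserved); // allocation grows to at least `reserved` elements
  v.resize(used);      // logical size becomes `used`; the allocation is kept
  return { v.size(), v.capacity() };
}
```

Deriving the number of elements from the allocation would over-count here, which is why the logical sizes have to be communicated separately.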
When the size of a variable changes (through a change in the mesh or through a resize for array variables), the shared variable sizes must be updated.
Currently, it is up to the user to do this via a call to updateVariable().
It is also possible to destroy the Arcane::MachineShMemWinVariable object and recreate it after updating the variable.
Some examples to illustrate the use of these classes:
In this example, each subdomain has an array of two Int32.
Each subdomain writes its rank into the two cells of its array, and then each subdomain displays the view of each array (var_sh.view(rank) returns a view of the two Int32 of the array owned by subdomain rank).
The call to the updateVariable() method could easily be removed by putting var.resize(2); between the creation of the variable and the creation of the MachineShMemWinVariable:
An alternative to calling updateVariable() is the destruction/recreation of MachineShMemWinVariable:
For mesh quantities, we have access to the operator Arcane::MachineShMemWinMeshVariableScalarT::operator()() which allows accessing the value of an Item using its local_id.
If multiple values need to be read from another subdomain, it is strongly recommended to do so by retrieving a view using the method Arcane::MachineShMemWinMeshVariableScalarT::view(). Example:
In Example 2, the barrier is important, given that each subdomain will access the data of the other subdomains.
Nevertheless, it is also possible to do this to avoid the barrier:
Here, we have a 2D array without support.
The method Arcane::MachineShMemWinVariableArray2T::view() allows retrieving a view (of type Arcane::Span2) on the 2D array of another subdomain of the computation node.
A mesh 1D array variable is a 2D array but with the first dimension corresponding to the number of Items.
We find the method Arcane::MachineShMemWinMeshVariableArrayT::view(), which returns a view of the 2D array of the variable from another subdomain. The first dimension takes a local_id of an Item from the other subdomain and the second dimension is the position in the array of the Item.
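The two-index access described above can be pictured with a small stand-alone sketch. `FlatView2D` is a hypothetical type written for this illustration; it is not Arcane::Span2, and the row-major layout it uses is an assumption made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical read-only 2D view over a flat buffer (not Arcane::Span2).
// Assumption: values are stored row-major, one row per Item.
struct FlatView2D
{
  const int* data;
  std::size_t nb_item; // first dimension: number of Items
  std::size_t dim2;    // second dimension: number of values per Item

  // Value at position `pos` in the array of the Item with the given local_id.
  int operator()(std::size_t local_id, std::size_t pos) const
  {
    return data[local_id * dim2 + pos];
  }
};
```

For a buffer holding two Items with three values each, `view(1, 2)` reads the last value of the second Item.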
With multi-dimensional variables, the method Arcane::MachineShMemWinMeshMDVariableT::view() returns an Arcane::MDSpan with one extra dimension compared to the variable's dimension, the first dimension corresponding to the support.
The operator Arcane::MachineShMemWinMeshMDVariableT::operator()() is also available and allows retrieving a multi-dimensional view of the variable's dimension (since the local_id is also provided).
As mentioned previously, when accessing the arrays of several Items of a given subdomain, it is better to retrieve a complete view via Arcane::MachineShMemWinMeshMDVariableT::view().
For MD vector and matrix variables, we find the same methods as in the previous example.
Shared memory variables are compatible with the checkpoint mechanism.
A variable property has been added so that the arrays of selected subdomains are not saved.
This is the IVariable::PDumpNull property. It is not reserved for shared memory variables.
When this property is set on a variable for a given subdomain, an empty array is saved for that subdomain. This is particularly useful during recovery for shared memory variables, given the obligation to perform collective operations.
We call the master subdomain of a computation node the subdomain with the smallest rank on that node (since machine_ranks is sorted in ascending order, it is the first rank of the array).
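The definition of the master subdomain translates directly into code. The sketch below is plain C++ with hypothetical helper names (`masterRank`, `isMasterRank`); real code would read the ranks from the view returned by machineRanks() rather than from a `std::vector`.

```cpp
#include <cstdint>
#include <vector>

// Master rank of a node: the first entry of the sorted machine ranks.
// Precondition: machine_ranks is non-empty and sorted in ascending order.
inline std::int32_t masterRank(const std::vector<std::int32_t>& machine_ranks)
{
  return machine_ranks.front();
}

// True when `my_rank` is the master subdomain of its node.
inline bool isMasterRank(const std::vector<std::int32_t>& machine_ranks, std::int32_t my_rank)
{
  return my_rank == masterRank(machine_ranks);
}
```

A subdomain for which `isMasterRank` is false would be the one on which a property such as IVariable::PDumpNull is set.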
In this example, during the first iteration of the time loop, we resize the variable for all subdomains.
Then, we assign the IVariable::PDumpNull property to all non-master subdomains.
Finally, during recovery, we check that the master subdomains' arrays have been restored and that the other arrays are empty.