Two implementations are available: one whose segments are all contiguous and whose window size is fixed when the object is constructed, and another whose segments are non-contiguous and can change size.
The first implementation creates a memory window whose segments are all contiguous. This makes it fairly simple to re-partition the window into segments of different sizes during use (for example, to balance a computation).
This part is managed by the Arcane::ContigMachineShMemWin class.
This class can use three implementations of IContigMachineShMemWinBase, one for each type of Arcane::IParallelMng. It can therefore be used whether you have an MpiParallelMng, a SequentialParallelMng, a SharedMemoryParallelMng, or a HybridParallelMng (see "Choosing the Message Exchange Manager").
The creation of an object of this type is collective. An instance of this class will create a memory window composed of several segments (one per subdomain).
Access to the elements of the segments is not collective. Concurrent access to an element is possible using semaphores, mutexes, or std::atomic. For std::atomic, the operations must be address-free.
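As an illustration of the address-free requirement, the sketch below checks whether std::atomic<T> is lock-free for a given element type. The C++ standard recommends, but does not mandate, that lock-free atomic operations also be address-free, so lock-freedom is used here only as a proxy. The helper isLockFreeElement and the type BigElement are hypothetical names and are not part of Arcane.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical 64-byte element: too large for a lock-free atomic on
// common platforms, so std::atomic falls back to an internal lock.
struct BigElement
{
  char data[64];
};

// True when every std::atomic<T> object is lock-free on the target platform.
// Assumption: lock-free implies address-free (recommended by the standard).
template <typename T>
constexpr bool isLockFreeElement()
{
  return std::atomic<T>::is_always_lock_free;
}
```

On common 64-bit platforms, isLockFreeElement<std::int32_t>() is true and isLockFreeElement<BigElement>() is false. An atomic that relies on an internal lock is generally unsuitable for memory shared between processes, because the lock itself lives in process-local memory.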
When this object is constructed, each subdomain provides a segment size. The window size will be equal to the sum of the segment sizes.
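The relation between segment sizes and window size can be sketched in plain C++. This is a model, not Arcane code: windowSize and segmentOffsets are hypothetical helpers, and the sketch assumes that segments are laid out one after another in the order in which the sizes are given.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Total number of elements in the window: the sum of all segment sizes.
std::int64_t windowSize(const std::vector<std::int64_t>& sizes)
{
  std::int64_t total = 0;
  for (std::int64_t s : sizes)
    total += s;
  return total;
}

// Start offset (in elements) of each segment inside the window:
// segment i begins where segments 0..i-1 end.
std::vector<std::int64_t> segmentOffsets(const std::vector<std::int64_t>& sizes)
{
  std::vector<std::int64_t> offsets(sizes.size(), 0);
  std::int64_t running = 0;
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    offsets[i] = running;
    running += sizes[i];
  }
  return offsets;
}
```

With segment sizes 3, 5 and 2, the window holds 10 elements and the segments start at offsets 0, 3 and 8.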
To access its segment, you can use the method Arcane::ContigMachineShMemWin::segmentView().
Once the segment has been modified, you can perform a barrier to ensure that every subdomain has finished writing to its segment before any segment is read.
To find out which subdomains share a window on the node, you can retrieve an array of ranks.
The position of the ranks in this array corresponds to the position of their segment in the window.
To read the segments of other subdomains on the node, you can use the method Arcane::ContigMachineShMemWin::segmentConstView().
The window size cannot be modified. However, the implementation in Arcane allows the segments to be resized collectively, provided that the sum of the new segment sizes is less than or equal to the original window size.
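The constraint on collective resizing can be expressed as a simple check. The function below is a hypothetical helper written for illustration; it is not part of Arcane and only models the rule that the new sizes must fit in the original window.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true when the requested segment sizes fit in a window whose
// size was fixed at construction. Negative sizes are rejected.
bool canResizeSegments(std::int64_t original_window_size,
                       const std::vector<std::int64_t>& new_sizes)
{
  std::int64_t total = 0;
  for (std::int64_t s : new_sizes) {
    if (s < 0)
      return false;
    total += s;
  }
  return total <= original_window_size;
}
```

For a window of 10 elements initially split 5/5, a rebalancing to 8/2 or 6/3 is accepted, while 6/5 is rejected because it would need 11 elements.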
Since the window is contiguous, access to the entire window is possible for all subdomains.
The second implementation is quite different from the first. Here, the segments of the memory window are not contiguous, and each segment can be resized like an ordinary dynamic array.
Resizing is a collective operation, however, which makes most of this implementation's methods collective as well.
This part is managed by the Arcane::MachineShMemWin class.
As with the previous implementation, this one is compatible with all Arcane parallelism modes.
The creation of an object of this type is collective. An instance of this class will create a memory window composed of several segments (one per subdomain).
Like a UniqueArray, it is possible to specify an initial size (here 5):
It is also possible to omit the initial size.
The method Arcane::MachineShMemWin::machineRanks() is available and returns the same array as the Arcane::ContigMachineShMemWin implementation.
To traverse your own segment or the segment of another subdomain, you can use the same methods as before:
However, since the segments are not contiguous, the windowView() methods are not available.
The segments have a size that can be increased or decreased over time.
It is possible to add elements using the method Arcane::MachineShMemWin::add(Arcane::Span<const Type> elem):
This method is collective; all subdomains on a node must call it. If a subdomain does not wish to add elements to its segment, it can call the add() method with an empty array or without arguments (Arcane::MachineShMemWin::add()).
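The collective nature of add() can be modelled in plain C++, with one std::vector per subdomain standing in for the segments. This is a sketch under that assumption, not Arcane code: Segments and addAll are hypothetical names, and the model runs in a single process.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One std::vector per subdomain stands in for the segments of a window.
using Segments = std::vector<std::vector<int>>;

// Models a collective add: contributions[rank] holds what subdomain `rank`
// appends to its own segment. An empty vector means "nothing to add",
// but the subdomain still takes part in the call.
void addAll(Segments& segments, const Segments& contributions)
{
  assert(segments.size() == contributions.size()); // every subdomain participates
  for (std::size_t rank = 0; rank < segments.size(); ++rank) {
    const std::vector<int>& elems = contributions[rank];
    segments[rank].insert(segments[rank].end(), elems.begin(), elems.end());
  }
}
```

If subdomain 0 contributes three elements and subdomain 1 contributes an empty vector, only the first segment grows, yet both entries must be present for the call to be valid.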
This operation can be costly due to memory reallocation. It is therefore advisable to add a large quantity of elements at once rather than element by element.
If element-by-element addition is unavoidable, the method Arcane::MachineShMemWin::reserve(Arcane::Int64 new_capacity) is available to avoid reallocating a segment multiple times:
In this piece of code, every subdomain reserves space for 20 Integer values. The value can be different for each subdomain (a subdomain that does not want to reserve more space can call Arcane::MachineShMemWin::reserve() without arguments).
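The benefit of reserving capacity before adding elements one by one can be illustrated with std::vector, used here as a stand-in for a segment. The helper countReallocations is hypothetical, and the sketch assumes that a segment, like std::vector, reallocates only when its capacity is exceeded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends `count` values one by one and returns how many times the
// capacity changed; each change corresponds to a reallocation.
std::size_t countReallocations(std::vector<int>& values, int count)
{
  std::size_t reallocations = 0;
  for (int i = 0; i < count; ++i) {
    const std::size_t before = values.capacity();
    values.push_back(i);
    if (values.capacity() != before)
      ++reallocations;
  }
  return reallocations;
}
```

Calling reserve(20) on the vector before appending 20 values yields zero reallocations, whereas appending to a vector without reserved capacity triggers at least one.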
The resize() method changes the number of elements of a segment. In our example, resize() increases the number of elements in every segment except that of the subdomain that previously performed the add() operations: that segment holds 15 elements, compared to 5 for the others, and shrinks from 15 elements to 12.
Like the reserve() method, each subdomain can set the value it wants.
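The resize described above can be modelled with one std::vector per subdomain. This is a single-process sketch, not Arcane code: Segments and resizeAll are hypothetical names, and the handling of kept and newly created elements follows std::vector, which may differ from the actual class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One std::vector per subdomain stands in for the segments of a window.
using Segments = std::vector<std::vector<int>>;

// Models a collective resize: every subdomain supplies the new size
// of its own segment, and the sizes may differ between subdomains.
void resizeAll(Segments& segments, const std::vector<std::size_t>& new_sizes)
{
  assert(segments.size() == new_sizes.size()); // every subdomain participates
  for (std::size_t rank = 0; rank < segments.size(); ++rank)
    segments[rank].resize(new_sizes[rank]); // std::vector zero-fills new elements
}
```

Starting from segments of 15, 5 and 5 elements and requesting 12 for each, the first segment shrinks to 12 while the other two grow to 12.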
It is also possible to add elements to the segment of another subdomain using the collective method Arcane::MachineShMemWin::addToAnotherSegment(Arcane::Int32 rank, Arcane::Span<const Type> elem).
The functionality is almost identical to the add() method but with an extra parameter to designate the rank of the subdomain possessing the segment to be modified.
Inter-process shared memory should not be treated like the shared memory of a multithreaded program: only part of the memory is shared, not the entire address space.
Consider this structure:
It can be used like this:
If this structure is used in a window, it would look like this:
You can display the value you assigned, and it works correctly:
But if you want to display the value of another process:
In multi-process mode (launched with mpirun -n 2 ...), the program will crash (segfault), whereas it will not crash in multithreaded mode (launched with -A,S=2).
In multi-process mode, the elements owned by the UniqueArray<Integer> array_integer member of the structure are not allocated in shared memory (the new or malloc calls are made in local memory), so other processes do not have access to them.
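The difference between a structure that owns heap memory and a self-contained one can be checked at compile time. The sketch below uses std::vector as an analogue of UniqueArray; the struct names and the helper isSelfContained are hypothetical. Trivial copyability is only a heuristic: a structure holding a raw pointer passes the check and is still unsafe to share.

```cpp
#include <array>
#include <cassert>
#include <type_traits>
#include <vector>

// Analogue of the problematic structure: std::vector keeps its elements
// in separately allocated heap memory that is local to the process.
struct WithDynamicArray
{
  int id;
  std::vector<int> values;
};

// Self-contained alternative: the elements are stored inside the object,
// so they live wherever the object itself is placed.
struct WithInlineArray
{
  int id;
  std::array<int, 8> values;
};

// Heuristic: a trivially copyable type can be copied byte by byte
// and does not manage resources through constructors or destructors.
template <typename T>
constexpr bool isSelfContained()
{
  return std::is_trivially_copyable_v<T>;
}
```

Here isSelfContained<WithInlineArray>() is true and isSelfContained<WithDynamicArray>() is false, which flags the second structure as a poor candidate for an inter-process window.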
It is also important to note that the same memory location in shared memory is addressed differently between processes. Therefore, if you provide a shared memory allocator to the UniqueArray, the addresses used will only be valid locally.
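One common way around process-specific addresses is to store offsets relative to the start of the shared region instead of pointers. The sketch below simulates two mappings of the same region by copying a byte buffer to a second location; RegionHeader, writeRegion and readValue are hypothetical names, and no real shared memory is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Header stored at the start of the region. It locates its payload by an
// offset from the region base, never by an absolute address.
struct RegionHeader
{
  std::size_t payload_offset; // bytes from the region base
  std::size_t payload_count;  // number of int values
};

// Writes the header followed by the values into `region`.
void writeRegion(std::vector<unsigned char>& region, const std::vector<int>& values)
{
  const RegionHeader header{ sizeof(RegionHeader), values.size() };
  region.resize(sizeof(RegionHeader) + values.size() * sizeof(int));
  std::memcpy(region.data(), &header, sizeof(header));
  if (!values.empty())
    std::memcpy(region.data() + header.payload_offset, values.data(),
                values.size() * sizeof(int));
}

// Reads value `index` relative to the base address seen by the caller.
int readValue(const unsigned char* base, std::size_t index)
{
  RegionHeader header;
  std::memcpy(&header, base, sizeof(header));
  assert(index < header.payload_count);
  int value = 0;
  std::memcpy(&value, base + header.payload_offset + index * sizeof(int), sizeof(int));
  return value;
}
```

Because readValue resolves the offset against whatever base address it is given, the same bytes yield the same values at either location. A stored pointer would only be meaningful at the address where it was created.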