Re: [AMBER-Developers] Sending Structs in MPI

From: Duke, Robert E Jr <>
Date: Tue, 27 Sep 2011 01:14:45 +0000

Hi Dave,
Okay, just to be clear - "marshalling" really refers to a process whereby you transform how data is laid out, in a known way, typically to make transport or storage more efficient; the algorithm for restoring the data to the form in which it is used is then of course known. The point I was making about disordered data is that since I have to tailor what is sent for individual nodes anyway, I have to copy data around, so as long as the transform is simple (like packing into a byte array), the data packing really adds negligible overhead. I guess I have been wandering about trying to give some sense of the various ramifications of your question; hopefully it did not get confusing.
Regards - Bob

From: []
Sent: Monday, September 26, 2011 3:57 PM
To: AMBER Developers Mailing List
Subject: Re: [AMBER-Developers] Sending Structs in MPI

Well, when you put it that way, I've been "marshalling" in some sense all
along. What I have in my direct space decomposition are little
compartment cells, each of which has its own pre-allocated "import" and
"export" regions. There are also "importV" and "exportV" and "importX"
and "exportX" arrays, each containing progressively larger structs for
shipping increasingly detailed information when necessary. I make use of
these buffers in the normal code even without parallel considerations, as
the cells must keep ordered lists of atoms at all times and the buffers
are ideal for transferring abridged forms of data between ordered lists.
The largest importX / exportX buffers are only used when atoms migrate
between cells, and the "global" (non-reimaged) atom positions, previous
positions, velocities and previous velocities must migrate with them.

A higher level of marshalling may occur when I want to go to higher levels
of parallelism, in that separate nodes may get their own communicators and
make a pooled buffer for sending over an interconnect. That's roughly the
plan with direct space communication in MLE implementations, because it'll
be the direct space that kills us unless we get it under control.


> There is no reason that structs cannot be in a memory-contiguous array in
> C. For arrays of strings you do typically have the pointer implementation,
> but unless you set up your struct array as an array of struct pointers,
> that is not true for structs - though it is still not the best way to go.
> As far as sending arrays of data from memory goes, at least for me, the
> bulk of the data exchange problem becomes disordered quickly when working
> in parallel. This means you have to effectively marshal anyway (i.e., copy
> the data into and out of send buffers), unless you are sending all data to
> all nodes (which is occasionally necessary but mostly unnecessary and
> hugely inefficient).
> - Bob
> ________________________________________
> From: Ross Walker []
> Sent: Monday, September 26, 2011 11:22 AM
> To: 'AMBER Developers Mailing List'
> Subject: Re: [AMBER-Developers] Sending Structs in MPI
>> you might think. So you either 1) live with the extra bytes "over the
>> wire" because any other solution has equivalent costs, 2) marshal into
>> and out of a byte stream, or 3) send the data in sequential arrays.
> The 3rd option here is what I was essentially trying to refer to when I
> said sending structs was non-optimal. The problem you have in C is that,
> unlike plain Fortran arrays, structures that contain dynamically allocated
> arrays are not linear in memory. If you allocate arrays in structures,
> what you really have is a pointer in there that points to some other
> piece of memory in a different location. When you go to 2D arrays in
> structures it gets even worse. This is why allocatable variables were not
> allowed in structures in the F95 standard (they were added reluctantly as
> part of F2003), since they prevent the data in your structures from being
> linear in memory.
> So while using the offset approach to build a custom MPI datatype works,
> and works well given that it avoids all the packing etc. that Bob is
> referring to, it does nothing to address the fact that using structures
> in the first place will be hurting your performance. MPI sends of
> structures then mean going and getting data from lots of different
> locations in memory, which hurts performance even before you start
> transmitting over the InfiniBand 'wire'.
> Thus your best option would probably be to go back to the old school
> approach of putting things in linear arrays and sending them. Note, I do
> not mean copying the data into linear arrays; I mean actually having it
> in linear arrays to begin with. Essentially you should write down exactly
> how you will traverse the data in your code and exactly how you will
> communicate it between nodes, then lay out your arrays etc. to match
> these two cases as best you can. This will of course make the code uglier
> and most likely more difficult to follow, but it is a necessary evil if
> you want to get the performance.
> All the best
> Ross
> /\
> \/
> |\oss Walker
> ---------------------------------------------------------
> | Assistant Research Professor |
> | San Diego Supercomputer Center |
> | Adjunct Assistant Professor |
> | Dept. of Chemistry and Biochemistry |
> | University of California San Diego |
> | NVIDIA Fellow |
> | | |
> | Tel: +1 858 822 0854 | EMail:- |
> ---------------------------------------------------------
> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> be read every day, and should not be used for urgent or sensitive issues.
> _______________________________________________
> AMBER-Developers mailing list

AMBER-Developers mailing list
Received on Mon Sep 26 2011 - 18:30:05 PDT