Re: [AMBER-Developers] How do I ensure each MPI message a unique tag?

From: Jason Swails <>
Date: Fri, 23 Dec 2011 23:40:27 -0500

I'm not sure how you're parallelizing things in mdgx, but I'm going to take
a guess here and say that you're probably doing all communications over

Tags only need to be unique within a given MPI communicator. Therefore, if
you found a way to divide your communications among a set of smaller
communicators, that would probably simplify some of the bookkeeping (since
some of it could effectively be rolled into how the communicators are laid
out in the first place) and allow you to reuse tags, since the
sub-communicators will likely be much smaller than the global
MPI_COMM_WORLD.
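
As a rough sketch of what that could look like (not how mdgx is organized,
which I don't know): assume world ranks are grouped into consecutive blocks
of a fixed size. The helper names group_of, rank_in_group, and local_tag are
hypothetical. The first two compute the color and key you would pass to
MPI_Comm_split, and the third builds a tag that only has to be unique inside
the resulting sub-communicator.

```c
#include <assert.h>

/* Hypothetical helper: which sub-communicator a world rank belongs to.
 * This value would be the "color" argument in
 *   MPI_Comm_split(MPI_COMM_WORLD, color, key, &subcomm);
 * ranks that pass the same color end up in the same sub-communicator. */
int group_of(int world_rank, int group_size)
{
    assert(world_rank >= 0 && group_size > 0);
    return world_rank / group_size;
}

/* Hypothetical helper: the "key" argument, which orders ranks inside the
 * new communicator. With consecutive blocks, the rank in the
 * sub-communicator equals this key. */
int rank_in_group(int world_rank, int group_size)
{
    assert(world_rank >= 0 && group_size > 0);
    return world_rank % group_size;
}

/* Tag that is unique within ONE sub-communicator of size group_size:
 *   round * group_size^2 + sender * group_size + receiver
 * where sender and receiver are ranks inside the sub-communicator and
 * rounds are numbered from 0 (an assumption of this sketch). */
int local_tag(int round, int sender, int receiver, int group_size)
{
    assert(round >= 0 && group_size > 0);
    assert(sender >= 0 && sender < group_size);
    assert(receiver >= 0 && receiver < group_size);
    return (round * group_size + sender) * group_size + receiver;
}
```

With blocks of 8 ranks and 128 rounds, the largest tag this produces is
8191, regardless of how many blocks there are.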

Indeed, I'm guessing you could find yourself in a situation where the size
of each communicator actually doing the sending and receiving stays more or
less constant as the processor count grows; only the number of
communicators you create grows. In that case, the range of tags you need
no longer depends on the total processor count, so the ceiling you describe
shouldn't arise.
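
To put rough numbers on that, here is a small sketch of the tag budget for
the scheme in the quoted message (a per-round offset times the square of the
communicator size, plus sender times size plus receiver). It assumes rounds
and ranks are numbered from 0 and that tags are signed 32-bit ints; max_tag
and the two fits_* helpers are hypothetical names. Note that the real upper
limit is the implementation's MPI_TAG_UB attribute, which the MPI standard
only guarantees to be at least 32767.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest value of MPI_TAG_UB that the MPI standard guarantees. */
#define GUARANTEED_TAG_UB 32767

/* Largest tag produced by
 *   tag = round * n^2 + sender * n + receiver
 * for rounds 0..rounds-1 and ranks 0..n-1, computed in 64 bits so the
 * check itself cannot overflow for realistic inputs. */
int64_t max_tag(int64_t rounds, int64_t n)
{
    assert(rounds > 0 && n > 0);
    return rounds * n * n - 1;
}

/* Nonzero if every tag fits in a signed 32-bit int. */
int fits_in_int32(int64_t rounds, int64_t n)
{
    return max_tag(rounds, n) <= INT32_MAX;
}

/* Nonzero if every tag fits in the range any MPI implementation must
 * support, whatever its actual MPI_TAG_UB is. */
int fits_in_guaranteed_ub(int64_t rounds, int64_t n)
{
    return max_tag(rounds, n) <= GUARANTEED_TAG_UB;
}
```

Under these assumptions, 128 rounds over 4096 ranks gives a maximum tag of
exactly 2^31 - 1, so one more rank overflows. The same 128 rounds over a
16-rank communicator stays within the guaranteed 32767, no matter how many
such communicators exist.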

Furthermore, it would make some communications much cheaper and easier to
implement. You can implement some communications that take place over a
subset of processors as collective communications across just those
processors grouped into a communicator (thereby taking advantage of an MPI
implementation's optimization of collective communication) rather than just
doing point-to-points across everyone in that group.

I'm of the opinion that the ability to create multiple, smaller
communicators is a fantastic way of simplifying the program and increasing
scalability, but it's often (unfortunately) overlooked and undervalued.


On Fri, Dec 23, 2011 at 9:27 PM, <> wrote:

> Hello,
> A while back I wrote to AmberDevs and the response confirmed my suspicion
> that I have to guarantee each MPI message a unique tag, particularly
> messages that take place during separate rounds of communication as they
> can still get crossed in fringe cases when one processor gets way behind
> and we try to make the code go as fast as possible by letting every
> process go as far as possible asynchronously. As I contemplate how to
> grant unique tags to every message, it seems that I may be imposing a
> limit, albeit a rather large one, on the number of processors that mdgx
> can utilize.
> If I say that each separate round of communication has its tags boosted by
> some offset determined by a constant (1, 2, 3, 4, ...) times the number of
> processors squared, I can ensure that any message in any round, which has
> the boost plus (# of sender X # of threads + # of receiver) will have a
> unique tag. But, if there are, for example, 128 separate rounds of
> communication, the largest number of processors I can support is 2^12 or
> 4096. At that point I overflow the "tag" integer argument to the
> MPI_Irecv and MPI_Isend functions.
> Obviously, we do not have situations where we need to send asynchronous
> messages from every process to every other process. In fact, I think that
> all of the major dynamics communication rounds involve each process
> communicating with only its six nearest neighbors. But, there could be
> cases in the foreseeable future when I want to permit long-ranged
> restraints or some other contingency when I can't guarantee that a message
> between any two given processes would never occur.
> So, how do I avoid this obvious ceiling on the number of processors?
> We're far from the limit at the moment, but codes like GROMACS, NAMD, and
> DESMOND have pushed through it and I don't want to write stopgap code if I
> can avoid revisions later. Perhaps I could enumerate all possible
> messages in some array and store that on every process, but it seems like
> that would be very tedious. Any ideas would be appreciated.
> Dave
> _______________________________________________
> AMBER-Developers mailing list

Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
Received on Fri Dec 23 2011 - 21:00:02 PST