[AMBER-Developers] MPI Communicator creation from dcerutti.rci.rutgers.edu on 2012-01-14 (Amber Developers Archive Jan 2012)

From: <dcerutti.rci.rutgers.edu>
Date: Sat, 14 Jan 2012 14:21:54 -0500 (EST)

Hello, I'm looking for a developer who can help me understand what an MPI
communicator really is, once I create it. I thought that I had this down
after going through the various LLNL and Argonne National Laboratory
tutorials. As I understand it, the recipe is:

1.) Create a new group (of processes) by first referencing the group
associated with MPI_COMM_WORLD, and then including a subset of the
processes in MPI_COMM_WORLD in a new group. Let's call the reference to
the group of MPI_COMM_WORLD gworld and the new group cpugrp.

2.) Create a new communicator associated with cpugrp. Let's call the new
communicator newcomm.

That leads to the following code:

  MPI_Group gworld, cpugrp;
  MPI_Comm newcomm;

  MPI_Comm_group(MPI_COMM_WORLD, &gworld);
  MPI_Group_incl(gworld, ncpu, CPUlist, &cpugrp);
  MPI_Comm_create(MPI_COMM_WORLD, cpugrp, &newcomm);

where ncpu is the number of processes that I've decided need to be part of
the new group and CPUlist contains the ranks of those processes within
MPI_COMM_WORLD.

Now, my confusion comes about because I want to define a new communicator
for each process to include precisely those other processes that it sends
information to during the direct space nonbonded force calculation. It
would seem that if I have the above code run on each of, say, four
processes, I should get four new communicators. That's not what I think
I'm seeing. If I have each process print out the (integer) handle to the
communicator that it has created, its rank, and the list of ranks that it
has included in the new communicator, I get the following:

Newcomm -2080374780 on process 0 with IDs [ 0 1];
Newcomm -2080374782 on process 1 with IDs [ 1 2 3 0];
Newcomm -2080374782 on process 2 with IDs [ 2 3 1];
Newcomm -2080374782 on process 3 with IDs [ 3 0 1 2];

Notice, there are only two unique handles there. Why is that? What
really confuses me is what happens to the other processes that may be
included in each communicator. If process 2 creates a communicator X that
includes processes 2, 3, and 1, do processes 1 and 3 then know that they
are part of X? I'm sharing the handles to each communicator with
MPI_Allgather() so that every process carries an array of handles to the
various communicators (is it OK to communicate them as MPI_INT?), but I'm
not sure what happens, because each communicator is more than just its
integer handle. Ultimately, I want to pass messages around in ways such as
having process 3 send to process 1 over communicator X. The big reason I'm
doing this is to restrict the talking that has to be done "all in one
room", to prevent messages from being misidentified and to avoid running
out of tag numbers that I can guarantee to be unique.

Any help is appreciated!

Dave

Received on Sat Jan 14 2012 - 11:30:03 PST