
SHMEM_SUM(3)                       Open MPI                       SHMEM_SUM(3)

NAME
       shmem_comp4_sum_to_all(3),     shmem_comp8_sum_to_all(3),    shmem_com-
       plexd_sum_to_all(3),      shmem_complexf_sum_to_all(3),      shmem_dou-
       ble_sum_to_all(3),  shmem_float_sum_to_all(3), shmem_int_sum_to_all(3),
       shmem_int4_sum_to_all(3),                     shmem_int8_sum_to_all(3),
       shmem_long_sum_to_all(3),  shmem_longdouble_sum_to_all(3),  shmem_long-
       long_sum_to_all(3),                          shmem_real4_sum_to_all(3),
       shmem_real8_sum_to_all(3),                  shmem_real16_sum_to_all(3),
       shmem_short_sum_to_all(3) - Performs a sum reduction across  a  set  of
       processing elements (PEs)

SYNOPSIS
       C or C++:
       #include <mpp/shmem.h>

       void shmem_complexd_sum_to_all(double complex *target,
         const double complex *source, int nreduce, int PE_start,
         int logPE_stride, int PE_size, double complex *pWrk,
         long *pSync);

       void shmem_complexf_sum_to_all(float complex *target,
         const float complex *source, int nreduce, int PE_start,
         int logPE_stride, int PE_size, float complex *pWrk,
         long *pSync);

       void shmem_double_sum_to_all(double *target,
         const double *source, int nreduce, int PE_start, int logPE_stride,
         int PE_size, double *pWrk, long *pSync);

       void shmem_float_sum_to_all(float *target, const float *source,
         int nreduce, int PE_start, int logPE_stride, int PE_size,
         float *pWrk, long *pSync);

       void shmem_int_sum_to_all(int *target, const int *source,
         int nreduce, int PE_start, int logPE_stride, int PE_size,
         int *pWrk, long *pSync);

       void shmem_long_sum_to_all(long *target, const long *source,
         int nreduce, int PE_start, int logPE_stride, int PE_size,
         long *pWrk, long *pSync);

       void shmem_longdouble_sum_to_all(long double *target,
         const long double *source, int nreduce, int PE_start, int
         logPE_stride, int PE_size, long double *pWrk, long *pSync);

       void shmem_longlong_sum_to_all(long long *target,
         const long long *source, int nreduce, int PE_start,
         int logPE_stride, int PE_size, long long *pWrk,
         long *pSync);

       void shmem_short_sum_to_all(short *target, const short *source,
         int nreduce, int PE_start, int logPE_stride, int PE_size,
         short *pWrk, long *pSync);

       Fortran:
       INCLUDE "mpp/shmem.fh"

       INTEGER pSync(SHMEM_REDUCE_SYNC_SIZE)
       INTEGER nreduce, PE_start, logPE_stride, PE_size

       CALL SHMEM_COMP4_SUM_TO_ALL(target, source, nreduce,
       & PE_start, logPE_stride, PE_size, pWrk, pSync)

       CALL SHMEM_COMP8_SUM_TO_ALL(target, source, nreduce,
       & PE_start, logPE_stride, PE_size, pWrk, pSync)

       CALL SHMEM_INT4_SUM_TO_ALL(target, source, nreduce,
       & PE_start, logPE_stride, PE_size, pWrk, pSync)

       CALL SHMEM_INT8_SUM_TO_ALL(target, source, nreduce,
       & PE_start, logPE_stride, PE_size, pWrk, pSync)

       CALL SHMEM_REAL4_SUM_TO_ALL(target, source, nreduce,
       & PE_start, logPE_stride, PE_size, pWrk, pSync)

       CALL SHMEM_REAL8_SUM_TO_ALL(target, source, nreduce,
       & PE_start, logPE_stride, PE_size, pWrk, pSync)

       CALL SHMEM_REAL16_SUM_TO_ALL(target, source, nreduce,
       & PE_start, logPE_stride, PE_size, pWrk, pSync)

DESCRIPTION
       The shared memory (SHMEM) reduction routines compute one or more reduc-
       tions across symmetric arrays on multiple virtual PEs. A reduction per-
       forms  an  associative  binary  operation across a set of values. For a
       list of other SHMEM reduction routines, see intro_shmem(3).

       As with all SHMEM collective routines, each of these  routines  assumes
       that  only  PEs  in the active set call the routine. If a PE not in the
       active set calls a SHMEM collective  routine,  undefined  behavior  re-
       sults.

       The  nreduce  argument  determines the number of separate reductions to
       perform. The source array on all PEs in the active set provides one el-
       ement  for  each reduction. The results of the reductions are placed in
       the target array on all PEs in the active set. The active  set  is  de-
       fined by the PE_start, logPE_stride, PE_size triplet.
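
       As an illustration of how the triplet selects PEs, the following plain
       C sketch lists the members of an active set. The helper names
       active_set_pe and in_active_set are hypothetical and not part of the
       SHMEM API. The sketch assumes that member i of the active set has PE
       number PE_start + i * 2^logPE_stride, as the argument descriptions
       imply.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not a SHMEM routine): PE number of the i-th
 * member of the active set, counting from i = 0. */
int active_set_pe(int PE_start, int logPE_stride, int i)
{
    assert(PE_start >= 0 && logPE_stride >= 0 && i >= 0);
    return PE_start + i * (1 << logPE_stride);
}

/* Hypothetical helper: true if PE number pe belongs to the active set
 * described by the PE_start, logPE_stride, PE_size triplet. */
bool in_active_set(int pe, int PE_start, int logPE_stride, int PE_size)
{
    assert(PE_start >= 0 && logPE_stride >= 0 && PE_size >= 0);

    int stride = 1 << logPE_stride;   /* distance between member PEs */
    int offset = pe - PE_start;

    if (offset < 0 || offset % stride != 0)
        return false;                 /* below the start or off-stride */
    return offset / stride < PE_size; /* within the first PE_size members */
}
```

       For example, the triplet (0, 1, 4) selects PEs 0, 2, 4, and 6, so
       in_active_set(6, 0, 1, 4) is true and in_active_set(3, 0, 1, 4) is
       false.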

       The source and target arrays may be the same array, but they may not
       partially overlap.

       The arguments are as follows:

       target A symmetric array of length nreduce elements to receive the  re-
              sults of the reduction operations.
              The data type of target varies with the version of the reduction
              routine being called and the language used.  When  calling  from
              C/C++,  refer to the SYNOPSIS section for data type information.
              When calling from Fortran, the target data types are as follows:

              shmem_comp4_sum_to_all: COMPLEX(KIND=4).

              shmem_comp8_sum_to_all: Complex. It must be a default complex
              value.

              shmem_int4_sum_to_all: INTEGER(KIND=4).

              shmem_int8_sum_to_all: Integer. It must be a default integer
              value.

              shmem_real4_sum_to_all: REAL(KIND=4).

              shmem_real8_sum_to_all: Real. It must be a default real value.

              shmem_real16_sum_to_all: Real. It must be a default real value.

       source A symmetric array, of length nreduce elements, that contains one
              element  for each separate reduction operation. The source argu-
              ment must have the same data type as target.

       nreduce
              The number of elements in the target and source arrays.  nreduce
              must  be of type integer. If you are using Fortran, it must be a
              default integer value.

       PE_start
              The lowest virtual PE number of the active set of PEs.  PE_start
              must  be of type integer. If you are using Fortran, it must be a
              default integer value.

       logPE_stride
              The log (base 2) of the stride between  consecutive  virtual  PE
              numbers in the active set. logPE_stride must be of type integer.
              If you are using Fortran, it must be a default integer value.

       PE_size
              The number of PEs in the active set. PE_size must be of type in-
              teger.  If  you  are using Fortran, it must be a default integer
              value.

       pWrk   A symmetric work array. The pWrk argument  must  have  the  same
              data  type as target. In C/C++, this contains max(nreduce/2 + 1,
              _SHMEM_REDUCE_MIN_WRKDATA_SIZE) elements. In Fortran, this  con-
              tains  max(nreduce/2  +  1,  SHMEM_REDUCE_MIN_WRKDATA_SIZE) ele-
              ments.

       pSync  A symmetric work array. In C/C++, pSync is of type long and size
              _SHMEM_REDUCE_SYNC_SIZE. In Fortran, pSync is a default integer
              array of size SHMEM_REDUCE_SYNC_SIZE. Every element of this
              array must be initialized with the value _SHMEM_SYNC_VALUE (in
              C/C++) or SHMEM_SYNC_VALUE (in Fortran) before any of the PEs in
              the active set enter the reduction routine.
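
       The pWrk sizing rule above can be sketched in plain C as follows. The
       helper name pwrk_elements is hypothetical, and the min_wrkdata_size
       parameter stands in for _SHMEM_REDUCE_MIN_WRKDATA_SIZE, which the SHMEM
       header defines. The sketch assumes that nreduce/2 is integer division,
       as it would be in a C expression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a SHMEM routine): number of elements the
 * pWrk array needs for a reduction over nreduce elements.
 * min_wrkdata_size stands in for _SHMEM_REDUCE_MIN_WRKDATA_SIZE. */
size_t pwrk_elements(int nreduce, int min_wrkdata_size)
{
    assert(nreduce >= 0 && min_wrkdata_size >= 0);

    int needed = nreduce / 2 + 1;   /* assumed to be integer division */

    /* max(nreduce/2 + 1, minimum work array size) */
    return (size_t)(needed > min_wrkdata_size ? needed : min_wrkdata_size);
}
```

       Multiplying the result by the size of one element of the target type
       gives the number of bytes to reserve for pWrk in symmetric memory.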

       The values of arguments nreduce, PE_start,  logPE_stride,  and  PE_size
       must  be equal on all PEs in the active set. The same target and source
       arrays, and the same pWrk and pSync work arrays, must be passed to  all
       PEs in the active set.

       Before any PE calls a reduction routine, you must ensure that the
       following conditions hold (synchronization via a barrier or some other
       method is often needed to ensure this):

       The pWrk and pSync arrays on all PEs in the active set are not still in
       use from a prior call to a collective SHMEM routine.

       The target array on all PEs in the active set is ready to accept the
       results of the reduction.

       Upon return from a reduction routine, the following are true for the
       local PE: the target array is updated, and the values in the pSync
       array are restored to their original values.

NOTES
       The terms collective, symmetric, and cache aligned are defined  in  in-
       tro_shmem(3).

       All  SHMEM reduction routines reset the values in pSync before they re-
       turn, so a particular pSync buffer need only be initialized  the  first
       time it is used.

       You must ensure that the pSync array is not being updated on any PE in
       the active set while any of the PEs participate in processing of a
       SHMEM reduction routine. Be careful of the following situations:

       If the pSync array is initialized at run time, some type of
       synchronization is needed to ensure that all PEs in the active set have
       initialized pSync before any of them enter a SHMEM routine called with
       the pSync synchronization array.

       A pSync or pWrk array can be reused in a subsequent reduction routine
       call only if none of the PEs in the active set are still processing a
       prior reduction routine call that used the same pSync or pWrk arrays.
       In general, this can be assured only by doing some type of
       synchronization. However, in the special case of reduction routines
       being called with the same active set, you can allocate two pSync and
       pWrk arrays and alternate between them on successive calls.
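
       The alternation between two buffer sets can be tracked with a small
       amount of local bookkeeping. The following plain C sketch is one
       possible way to do it; the buffer_rotation type and next_buffer_set
       function are hypothetical names, not part of SHMEM. It assumes that
       every PE in the active set makes the same sequence of reduction calls,
       so that each PE arrives at the same index without communicating.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BUFFER_SETS 2   /* two pWrk/pSync pairs used in turn */

/* Hypothetical bookkeeping (not part of SHMEM): remembers which of the
 * two buffer sets the next reduction call should use. */
typedef struct {
    int next;   /* index of the set to hand out next: 0 or 1 */
} buffer_rotation;

/* Returns the index of the buffer set to use for the current call and
 * advances the rotation to the other set for the following call. */
int next_buffer_set(buffer_rotation *rot)
{
    assert(rot != NULL);
    assert(rot->next >= 0 && rot->next < NUM_BUFFER_SETS);

    int current = rot->next;
    rot->next = (current + 1) % NUM_BUFFER_SETS;
    return current;
}
```

       A caller would index two pWrk arrays and two pSync arrays with the
       returned value, so that successive reductions over the same active set
       never share a buffer set.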

EXAMPLES
       Example  1: This Fortran example statically initializes the pSync array
       and finds the sum of the real variable FOO across all even PEs.
       INCLUDE "mpp/shmem.fh"

       INTEGER PSYNC(SHMEM_REDUCE_SYNC_SIZE)
       DATA PSYNC /SHMEM_REDUCE_SYNC_SIZE*SHMEM_SYNC_VALUE/
       PARAMETER (NR=1)
       REAL(KIND=4) FOO, FOOSUM
       REAL(KIND=4) PWRK(MAX(NR/2+1,SHMEM_REDUCE_MIN_WRKDATA_SIZE))
       COMMON /COM/ FOO, FOOSUM, PWRK
       INTRINSIC MY_PE

       ! Active set: PE_start=0, logPE_stride=1 (stride 2), N$PES/2 PEs
       IF ( MOD(MY_PE(),2) .EQ. 0) THEN
         CALL SHMEM_REAL4_SUM_TO_ALL(FOOSUM, FOO, NR, 0, 1, N$PES/2,
         & PWRK, PSYNC)
         PRINT *, 'Result on PE ', MY_PE(), ' is ', FOOSUM
       ENDIF

       Example 2: Consider the following C/C++ call:
       shmem_int_sum_to_all( target, source, 3, 0, 0, 8, pwrk, psync );

       The preceding call is more efficient than, but semantically equivalent
       to, the combination of the following calls:
       shmem_int_sum_to_all(&(target[0]), &(source[0]), 1, 0, 0, 8,
         pwrk1, psync1);
       shmem_int_sum_to_all(&(target[1]), &(source[1]), 1, 0, 0, 8,
         pwrk2, psync2);
       shmem_int_sum_to_all(&(target[2]), &(source[2]), 1, 0, 0, 8,
         pwrk1, psync1);

       Note that two sets of pWrk and pSync arrays are used alternately because no
       synchronization is done between calls.
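
       The equivalence in Example 2 can be checked with a single-process
       model of the arithmetic. The following plain C sketch does not call
       SHMEM and models no communication or synchronization; the names
       simulate_int_sum and batched_matches_separate are hypothetical.
       source[pe] plays the role of the source array on PE pe in an active
       set of 8 PEs.

```c
#include <assert.h>

#define SIM_PES 8   /* simulated active set size, as in Example 2 */
#define SIM_LEN 3   /* elements per source array, as in Example 2 */

/* Hypothetical model of an integer sum reduction: for each of nreduce
 * elements starting at index first, add the values held by all
 * simulated PEs and store the sum in target. */
void simulate_int_sum(int source[SIM_PES][SIM_LEN], int first,
                      int nreduce, int *target)
{
    assert(first >= 0 && nreduce >= 0 && first + nreduce <= SIM_LEN);

    for (int j = 0; j < nreduce; j++) {
        int sum = 0;
        for (int pe = 0; pe < SIM_PES; pe++)
            sum += source[pe][first + j];
        target[j] = sum;
    }
}

/* Returns 1 if one 3-element reduction yields the same results as
 * three separate 1-element reductions, and 0 otherwise. */
int batched_matches_separate(int source[SIM_PES][SIM_LEN])
{
    int batched[SIM_LEN];
    int separate[SIM_LEN];

    simulate_int_sum(source, 0, SIM_LEN, batched);
    for (int j = 0; j < SIM_LEN; j++)
        simulate_int_sum(source, j, 1, &separate[j]);

    for (int j = 0; j < SIM_LEN; j++) {
        if (batched[j] != separate[j])
            return 0;
    }
    return 1;
}
```

       Each element of the target array depends only on the corresponding
       element of the source arrays, which is why the reductions can be
       batched into one call or issued separately with the same results.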

SEE ALSO
       intro_shmem(3)

4.1.2                            Nov 24, 2021                     SHMEM_SUM(3)
