SYNOPSIS

       pgcc [ -flag ]...  sourcefile...


DESCRIPTION

       pgcc  is  the interface to the Portland Group Inc. (PGI) C compiler for
       AMD64 and IA32/EM64T processors.  pgcc invokes the C  compiler,  assem-
       bler,  and linker with options derived from its command line arguments.

       Suffixes of source file names indicate the type  of  processing  to  be
       done:

       .c     C source; preprocess, compile
       .i     C source after preprocessing; compile
       .s     assembler source; assemble
       .S     assembler source; preprocess, assemble
       .o     object file; passed to linker
       .a     library archive file; passed to linker

       If  coinstalled with pgf77 or pgfortran, Fortran file suffixes are also
       recognized and compiled with the  pgf77  or  pgfortran  compilers;  see
       pgf77, pgfortran, and the PGI User’s Guide.  Other files are passed to the
       linker (if linking is requested) with a warning message.

       Unless one overrides the default action using  a  command-line  option,
       pgcc deletes the intermediate preprocessor and assembler files (see the
       options -c, -E, -P, and -Mkeepasm); if a single C program  is  compiled
       and  linked with one pgcc command, the intermediate object file is also
       deleted.  Linking is the last stage of the compile process, unless  you
       use  one of the -c, -E, -P, or -S options, or unless compilation errors
       stop the whole process.


OPTIONS

       Options must be separate; -cs is different from -c -s.  Here is a  list
       of  all  options,  grouped  by type.  More detailed explanations are in
       following sections.

       Overall Options
              -- -# -### -c -[no]defaultoptions -dryrun -drystdinc --flagcheck
              -flags -help[=option] -Manno -Minform=level -Mkeepasm -M[no]list
              -noswitcherror -o file -rc rcfile -S -show -silent -time -v -V
              -V<ver> --version -w -Wpass,option -Ypass,directory

       Optimization Options
              -alias=option -fast -fastsse -fpic -fPIC -Kpic -KPIC
              -M[no]autoinline=option -Mcache_align -Mconcur=option
              -M[no]depchk -M[no]dse -Mextract=option -M[no]frame
              -Minfo=option -Minline=option -Minstrument=option
              -M[no]ipa=option -M[no]lre[=assoc|noassoc] -M[no]movnt
              -Mneginfo=option -Mnoopenmp -Mnosgimp -Mnovintr -Mpfi[=option]
              -Mpfo[=option] -M[no]pre[=all] -M[no]prefetch=option
              -Mprof=option -M[no]propcond -Mquad -Msafe_lastval
              -M[no]safeptr=option -M[no]scalarsse -M[no]smart

       Assembler Options
              -Wa,argument[,argument]...  -Ya,directory

       Linker Options
              -acclibs --[no-]as-needed -Bdynamic -Bstatic -Bstatic_pgi
              -g77libs -llibrary -Ldirectory -m -M[no]eh_frame -Mlfs
              -Mmpi=option -Mnostartup -Mnostdlib -M[no]rpath -Mscalapack
              -pgcpplibs -pgf77libs -pgf90libs -r -Rdirectory -rpath directory
              -s -shared -soname name -uname --[no-]whole-archive
              -Wl,argument[,argument]...  -YC,directory -Yl,directory
              -YL,directory -YS,directory -YU,directory

       Language Options
              -asmsuffix=suffix -B -c8x -c89 -c9x -c99 -csuffix=suffix
              -M[no]asmkeyword -M[no]builtin -M[no]dalign -Mdollar=char -Mfcon
              -Mlibsuffix=suffix -M[no]llalign -M[no]m128 -Mobjsuffix=suffix
              -Mschar -M[no]signextend -M[no]single -Muchar -Xa -Xc -Xs -Xt

       Target-specific Options
              -K[no]ieee -Ktrap=option -M[no]daz -M[no]flushz
              -M[no]fpapprox=option -M[no]fpmisalign -M[no]fprelaxed=option
              -M[no]func32 -Mgccbugs -M[no]longbranch -M[no]loop32
              -M[no]reg_struct_return -M[no]second_underscore
              -Mwritable-strings -m32 -m64 -mcmodel=small|medium -pc val
              -ta=target -tp=target

       Note: when source files are compiled using any of the -g, -mp,
       -Mconcur, -Mipa, or -Mprof options, the same option(s) should be used
       when using pgcc to link the objects.



Overall Options

       --        Anything after this switch is treated as a filename.  Note
                 that most tools will not allow a filename starting with a
                 dash, so such filenames should be avoided.

       -#        Display the invocations of the compiler, assembler, and
                 linker.  These invocations are the command lines created by
                 pgcc.

       -###      Display invocations of the compiler, assembler and linker,
                 but do not execute them.

       -c        Skip the link step; compile and assemble only.

       -defaultoptions (default) -nodefaultoptions
                 Use (don’t use) the default options set in site-specific or
                 user-specific PREOPTIONS or POSTOPTIONS driver variables.

       -dryrun   Display the invocations of the compiler, assembler, and
                 linker, but do not execute them.

                 standard output.  pgcc -help -otherswitch will give help
                 about -otherswitch.  The default is to list pgcc command line
                 options by group; options are:

                 groups   Print out the groups into which the switches are
                          organized.

                 asm      Print help for assembler command-line options.

                 debug    Print help for debugging command-line options.

                 language Print help for language-specific command-line
                          options.

                 linker   Print help for linker options.

                 opt      Print help for optimization command-line options.

                 other    Print help for any other command-line options.

                 overall  Print help for overall command-line options.

                 phase    Print help for the known compiler phases.

                 prepro   Print help for preprocessor command-line options.

                 suffix   Describe the known file suffixes.

                 switch   Print all switches in alphabetical order.

                 target   Print help for target-specific command-line options.

                 variable Show the pgcc configuration; this is the same as
                          -show.

       -Manno    Produce annotated assembly files, where source code is
                 intermixed with assembly language; implies -Mkeepasm.

       -Minform=level
                 Specify the minimum level of error severity that the compiler
                 displays during compilation.

                 fatal     Instructs the compiler to display fatal error
                           messages.

                 file (default) nofile
                           Print out (don’t print out) the names of files as
                           they are compiled; this is only active when there
                           is more than one file on the command line.

                 severe    Instructs the compiler to display severe and fatal
                           error messages.

       -Mlist -Mnolist
                 Create (don’t create) a listing file.

       -noswitcherror
                 Ignore unknown command line switches after printing a
                 warning message; the default behavior is to print an error
                 message and halt.

       -o file   Use file as the name of the executable program, rather than
                 the default a.out.  If used with -c, -P, or -S, and a single
                 input file, file is used as the name of the object,
                 preprocessor, or assembler output file.

       -rc rcfile
                 Specifies the name of a pgcc startup configuration file.  If
                 rcfile is a full pathname, then use the specified file.  If
                 rcfile is a relative pathname, use the file name as found in
                 the $DRIVER directory.

       -S        Skip the assembly and link steps. Leave the output from the
                 compile step in a file named file.s for each file named
                 file.c.

       -show     Produce help information describing the current pgcc
                 configuration.

       -silent   Do not print warning messages. Same as -Minform=severe.

       -time     Print execution times for the various steps in the compiler
                 itself.

       -v        Verbose mode; print out the command line for each tool before
                 it is executed.

       -V        Display version messages and other information.

       -V<ver>   If the specified version of the compiler is installed, that
                 version of the compiler is invoked.

       --version Display version messages and other information.

       -w        Do not print warning messages.

       -Wpass,option[,option...]
                 Pass option to the specified pass.  Each comma-delimited
                 option is passed as a separate argument.  The passes are:

                 0         for the compiler,

                 a         for the assembler,

                 i         for the interprocedural analyzer, and


                 l         Search for the linker in directory.

                 I         Set the compiler’s standard include directory to
                           directory.  The standard include directory is set
                           to a default value by the driver and can be
                           overridden by this option.

                 L         If the linker supports the -YL option, then pass
                           the option -YL,directory to the linker. Otherwise,
                           use directory as the standard library location.

                 S         Search for the startup object files in directory.

                 U         If the linker supports the -YU option, then pass
                           the option -YU,directory to the linker. Otherwise
                           this option is ignored.



Optimization Options

       -alias=option
              Specifies whether to optimize using ANSI C type-based pointer
              disambiguation rules.  The option can be one of:

              ansi      Assume ANSI C type-based pointer disambiguation rules
                        apply; this can enable better optimization in some
                        cases.  The rules state that a load or store through a
                        pointer of any type will not conflict with a load or
                        store of a variable or through a pointer of a
                        different type.  This is the default with -O2 and
                        above.

              traditional
                        Assume traditional C semantics apply.  The compiler
                        will assume that a load or store through any pointer
                        might conflict with any variable or pointer
                        dereference unless it can prove otherwise.  This is
                        the default with -O1 and below, and when there is a
                        type-cast pointer reference in the function.
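
              As an illustration of what -alias=ansi permits, consider the
              hypothetical function below (the name scale_by_count and its
              parameters are not taken from PGI documentation).  Under the
              type-based rules, a store through the float pointer is assumed
              not to modify the int read through count, so the compiler may
              load *count once before the loop.  Under -alias=traditional it
              would have to allow for the store changing *count.  The code is
              well defined under both settings provided the two pointers refer
              to distinct objects.

```c
#include <stddef.h>

/* Hypothetical example: multiply n floats by an integer count.
 * out is a float* and count is a const int*, so under ANSI C
 * type-based rules a store to out[i] cannot legally change *count. */
void scale_by_count(float *out, const int *count, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        /* With -alias=ansi the load of *count may be hoisted out of
         * the loop; with -alias=traditional it may be reloaded here. */
        out[i] = out[i] * (float)*count;
    }
}
```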

       -fast  Chooses generally optimal flags for the target platform.  Use
              pgcc -fast -help to see the equivalent switches.  Note this sets
              the optimization level to a minimum of 2; see -O.

       -fastsse
              Chooses generally optimal flags for a processor that supports
              SSE instructions (Pentium 3/4, AthlonXP/MP, Opteron) and SSE2
              (Pentium 4, Opteron).  Use pgcc -fastsse -help to see the
              equivalent switches.

       -fpic  (Linux only) Instructs the compiler to generate position-
              independent code which can be used to create shared object files

              levels:n  Inline up to n levels of function calls; the default
                        is to inline up to 10 levels.

              maxsize:n Only inline functions with a size of n or less.  The
                        size roughly corresponds to the number of statements
                        in the function, though the correspondence is not
                        direct.  The default is to inline functions with a
                        size of 100 or less.

              totalsize:n
                        Stop inlining when this function reaches a size of n.
                        The default is to stop inlining when a size of 8000
                        has been reached.

       -Mcache_align
              Align unconstrained data objects of size greater than or equal
              to 16 bytes on cache-line boundaries.  An unconstrained object
              is a variable or array that is not a member of an aggregate
              structure or common block, is not allocatable, and is not an
              automatic array.

       -Mconcur[=option[,option,...]]
              Instructs the compiler to enable auto-concurrentization of
              loops.  This also sets the optimization level to a minimum of 2;
              see -O.  If -Mconcur is specified, multiple processors will be
              used to execute loops which the compiler determines to be
              parallelizable.  When linking, the -Mconcur switch must be
              specified or unresolved references will occur. The
              OMP_NUM_THREADS or NCPUS environment variables control how many
              processors will be used to execute parallelized loops.  The
              options can be one or more of the following:

              allcores  Use all available cores when the environment variables
                        OMP_NUM_THREADS and NCPUS are not set.  This must be
                        specified at link time.

              bind      Bind threads to cores or processors.  This must be
                        specified at link time.

              altcode:n noaltcode
                        Generate (don’t generate) alternate scalar code for
                        parallelized loops.  The parallelizer generates scalar
                        code to be executed whenever the loop count is less
                        than or equal to n.  If noaltcode is specified, the
                        parallelized version of the loop is always executed
                        regardless of the loop count.

              altreduction[:n]
                        Generate alternate scalar code for parallelized loops
                        containing a reduction.  If a parallelized loop
                        contains a reduction, the parallelizer generates

              dist:block
                        Parallelize with block distribution. Contiguous blocks
                        of iterations of a parallelizable loop are assigned to
                        the available processors.

              dist:cyclic
                        Parallelize with cyclic distribution. The outermost
                        parallelizable loop in any loop nest is parallelized.
                        If a parallelized loop is innermost, its iterations
                        are allocated to processors cyclically. For example,
                        if there are 3 processors executing a loop, processor
                        0 performs iterations 0, 3, 6, etc; processor 1
                        performs iterations 1, 4, 7, etc; and processor 2
                        performs iterations 2, 5, 8, etc.

              innermost noinnermost (default)
                        Enable (disable) parallelization of innermost loops.

              levels:n  Parallelize loops nested at most n levels deep; the
                        default is 3.

              numa nonuma
                        (Linux only) Use (don’t use) thread/processor affinity
                        for NUMA architectures; use this option when linking
                        the program.  -Mconcur=numa will link in a numa
                        library and objects to prevent the operating system
                        from migrating threads from one processor to another.
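
              The dist:block and dist:cyclic descriptions above can be made
              concrete with a small sketch that computes which processor would
              execute a given iteration.  The helper names cyclic_owner and
              block_owner are hypothetical, and the block size used here
              (iterations divided evenly, rounded up) is an assumption; this
              page does not say how pgcc sizes its blocks.

```c
/* Cyclic distribution: iteration i goes to processor i mod nproc. */
int cyclic_owner(int i, int nproc)
{
    return i % nproc;
}

/* Block distribution, under an assumed chunking rule: each processor
 * gets one contiguous chunk of ceil(n / nproc) iterations. */
int block_owner(int i, int n, int nproc)
{
    int chunk = (n + nproc - 1) / nproc;   /* assumed chunk size */
    return i / chunk;
}
```

              With 3 processors, cyclic_owner returns 0, 1, and 2 for
              iterations 3, 4, and 5, matching the dist:cyclic example.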

       -Mdepchk (default) -Mnodepchk
              Assume (don’t assume) that potential data dependencies exist.
              -Mnodepchk may result in incorrect code; the -Msafeptr switch
              provides a less dangerous way to accomplish the same thing.
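
              To see why -Mnodepchk can produce incorrect code, compare two
              hypothetical loops.  The first carries a genuine dependence from
              one iteration to the next; the second does not, provided its two
              arrays do not overlap.  Assuming that no dependencies exist
              would be safe only for the second.  The function names are
              illustrative.

```c
/* Loop-carried dependence: a[i] needs the a[i-1] written by the
 * previous iteration, so iterations cannot be reordered safely. */
void running_sum(int *a, int n)
{
    int i;
    for (i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}

/* No dependence between iterations, provided a and b do not
 * overlap: each a[i] uses only its own old value and b[i]. */
void add_each(int *a, const int *b, int n)
{
    int i;
    for (i = 0; i < n; i++)
        a[i] = a[i] + b[i];
}
```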

       -Mdse -Mnodse (default)
              Enable (disable) the dead store elimination optimization.

       -Mextract=[option[,option,...]]
              Run the subprogram extraction phase to prepare for inlining.
              The =lib:filename option must be used with this switch to name
              an extract library.  See -Minline for more details on inlining.

              subprogram[,subprogram]
                     A non-numeric option not containing a period is assumed
                     to be the name of a subprogram to be extracted.

              name:subprogram[,subprogram]
                     Specifies the name of a subprogram or subprograms to be
                     extracted.

              lib:directory
                     Specifies the name of a directory to contain the
                     extracted subprograms; this directory will be created if

       -Minfo[=option[,option,...]]
              Emit useful information to stderr. The options are:

              all       Includes options accel, inline, ipa, loop, lre, mp,
                        opt, par, unified, vect.

              accel     Emit information about accelerator region targeting.

              ccff      Append complete CCFF information to the object files.

              ftn       Emit Fortran-specific information.

              hpf       Emit HPF-specific information.

              inline    Emit information about functions extracted and
                        inlined.

              intensity Emit compute intensity information about loops.

              ipa       Emit information about the optimizations enabled by
                        interprocedural analysis (IPA).

              loop | opt
                        Emit information about loop optimizations.  This
                        includes information about vectorization and loop
                        unrolling.

              lre       Emit information about loop-carried redundancy
                        elimination.

              mp        Emit information about OpenMP parallel regions.

              par       Emit information about loop parallelization.

              pfo       Emit profile feedback information.

              time | stat
                        Emit compilation statistics.

              unified   Emit information about which routines are selected for
                        target-specific optimizations using the PGI Unified
                        Binary.

              vect      Emit information about automatic loop vectorization.

              With no options, -Minfo is the same as
              -Minfo=accel,inline,ipa,loop,lre,mp,opt,par,unified,vect.

       -Minline[=option[,option,...]]
              Pass options to the function inliner. The options are:

              lib:filename.ext
                        Specify an inline library created by a previous
                        A numeric option is assumed to be a size.  Functions
                        containing number or fewer statements are inlined.  If
                        both number and function are specified, then functions
                        matching the given name(s) or meeting the size
                        requirements, are inlined.

              levels:number
                        Perform number levels of inlining.  The default is 1.

              reshape   For Fortran, the default is to not inline subprograms
                        with array arguments if the array shape does not match
                        the shape in the caller. This overrides the default.

       -Minstrument [=option]
              (linux86-64 only) Generate additional code to enable function-
              level instrumentation.  This option implies -Minfo=ccff and
              -Mframe.  The option is

              functions (default)

       -Mipa [=option[,option,...]] -Mnoipa (default)
              Enable and specify options for InterProcedural Analysis (IPA).
              This also sets the optimization level to a minimum of 2; see -O.
              If no option list is specified, then it is equivalent to
              -Mipa=const.  The options are:

              align noalign (default)
                        Enable (disable) recognition when pointer targets are
                        all cache-line aligned, allowing better SSE code
                        generation.

              arg noarg (default)
                        Remove (don’t remove) arguments replaced by
                        -Mipa=ptr,const.  -Mipa=noarg implies
                        -Mipa=nolocalarg.

              cg nocg (default)
                        Generate information for the pgicg call graph display
                        tool.  Run pgicg executable to see the call graph
                        information.

              const (default) noconst
                        Enable (disable) propagation of constants across
                        procedure calls.

              f90ptr nof90ptr (default)
                        Enable (disable) Fortran 90 pointer disambiguation
                        across procedure calls.

              fast      Chooses generally optimal -Mipa flags for the target
                        platform; use pgcc -Mipa -help to see the equivalent

                        nopfo
                            Ignore any profile frequency information from
                            -Mpfo when choosing which functions to inline.

                        reshape noreshape (default)
                            Enable (disable) Fortran inlining with mismatched
                            array shapes.

              ipofile   Save IPA information in a .ipo file instead of the
                        default of appending the information to the object
                        file.

              jobs:n    Use up to n jobs in parallel to reoptimize object
                        files.

              keepobj (default) nokeepobj
                        Keep (don’t keep) the optimized object files, using
                        file name mangling, to reduce recompile time in
                        subsequent application builds.

              libc nolibc (default)
                        Optimize calls to certain standard C library routines.

              libinline nolibinline (default)
                        Allow (don’t allow) inlining from routines in
                        libraries; -Mipa=libinline implies -Mipa=inline.

              libopt nolibopt (default)
                        Allow (don’t allow) recompiling and reoptimizing
                        routines from libraries with IPA information.

              localarg nolocalarg (default)
                        Enable (disable) feature to externalize local
                        variables to allow arguments to be replaced by
                        -Mipa=ptr.  -Mipa=localarg implies -Mipa=arg.

              main:func Specify a function to serve as a global entry point;
                        may appear multiple times; disables linking.

              ptr noptr (default)
                        Enable (disable) pointer disambiguation across
                        procedure calls.

              pure nopure (default)
                        Detect (don’t detect) pure functions.

              quiet     Don’t print out messages about which files are
                        recompiled at link time.

              required  Return an error condition if IPA is inhibited for any
                        reason, rather than the default behavior of linking

              shape noshape (default)
                        Perform (don’t perform) Fortran 90 shape propagation.

              summary   Only collect IPA summary information when compiling;
                        this prevents IPA optimization of this file, but
                        allows optimization for other files linked with this
                        file.

              vestigial novestigial (default)
                        Remove (don’t remove) functions that are not called.

       -Mlre[=assoc|noassoc] -Mnolre
              Enable (disable) loop-carried redundancy elimination.  The assoc
              option allows expression reassociation, and the noassoc option
              disallows expression reassociation.
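
              A rough picture of loop-carried redundancy elimination, written
              by hand: in the first function below, the value b[i + 1] loaded
              in one iteration is loaded again as b[i] in the next.  The
              second function carries it in a scalar instead.  This is a
              sketch of the general technique, not a description of the code
              pgcc generates; both function names are hypothetical, and b is
              assumed to hold at least n + 1 elements that do not overlap a.

```c
/* Original form: b[i + 1] is reloaded as b[i] on the next iteration.
 * b must have at least n + 1 elements. */
void pair_sums(int *a, const int *b, int n)
{
    int i;
    for (i = 0; i < n; i++)
        a[i] = b[i] + b[i + 1];
}

/* Hand-transformed form: the redundant load is carried in 'prev'. */
void pair_sums_lre(int *a, const int *b, int n)
{
    int i, prev, next;
    if (n <= 0)
        return;
    prev = b[0];
    for (i = 0; i < n; i++) {
        next = b[i + 1];      /* the only load from b per iteration */
        a[i] = prev + next;
        prev = next;          /* reuse on the next iteration */
    }
}
```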

       -Mmovnt -Mnomovnt
              Force (disable) generation of nontemporal moves.  -Mmovnt used
              with -fastsse can sometimes be faster than -fastsse alone.
              -Mnomovnt also disables -Mvect=movntaltcode.  By default
              nontemporal moves are generated for loops with large loop
              counts.

       -Mneginfo=option[,option...]
              Instructs the compiler to produce information on why certain
              optimizations are not performed.  Use the -Minfo flag instead.

       -Mnoopenmp
              When -mp is present, ignore the OpenMP pragmas.

       -Mnosgimp
              When -mp is present, ignore the SGI parallelization pragmas.

       -Mnovintr
              Do not generate vector intrinsic calls.

       -Mpfi[=option]
              Generate profile feedback instrumentation; this includes extra
              code to collect run-time statistics to be used in a subsequent
              compile; -Mpfi must also appear when the program is linked.
              When the program is run, a profile feedback file pgfi.out will
              be generated; see -Mpfo.  The allowed options are:

              indirect noindirect (default)
                        Enable (disable) collection of indirect function call
                        targets, which can be used for indirect function call
                        inlining.

       -Mpfo[=option[,option,...]]
              Enable profile feedback optimizations; there must be a profile
              feedback file pgfi.out in the current directory, which contains
              the result of an execution of the program compiled with -Mpfi.
                        feedback information file; the default is the current
                        directory.

       -Mpre[=all] -Mnopre (default)
              Enable (disable) the partial redundancy elimination
              optimization.

              all       Enable aggressive PRE.

       -Mprefetch[=option:n] -Mnoprefetch
              Add (don’t add) prefetch instructions for those processors that
              support them (Pentium 4, Opteron); -Mprefetch is default on
              Opteron; -Mnoprefetch is default on other processors.  The
              options are:

              distance:d
                        Set the fetch-ahead distance for prefetch instructions
                        to d cache lines.

              n:n       Set the maximum number of prefetch instructions to
                        generate in a loop to n.

              nta       Use the prefetchnta instruction.

              plain     Use the prefetch instruction.

              t0        Use the prefetcht0 instruction.

              w         Allow the AMD-specific prefetchw instruction.

       -Mprof[=option[,option,...]]
              Set performance profiling options.  Use of these options will
              cause the resulting executable to create a performance profile
              that can be viewed and analyzed with the PGPROF performance
              profiler.  In the descriptions below, PGI-style profiling
              implies compiler-generated source instrumentation.  MPICH-style
              profiling implies the use of instrumented wrappers for MPI
              library routines.  The -Mprof options are:

              ccff

              dwarf     Generate limited DWARF symbol information sufficient
                        for most performance profilers.

              func      Perform PGI-style function level profiling.

              hwcts     Generate a profile using event-based sampling of
                        hardware counters via the PAPI interface (linux86-64
                        only, PAPI must be installed).

              lines     Perform PGI-style line level profiling.

                        except that the profile is saved in a file named
                        pgprof.out instead of gmon.out.

              On Linux systems that have OProfile installed, PGPROF supports
              collection of performance data without recompilation. Use of
              -Mprof=dwarf is useful for this mode of profiling.

       -Mpropcond (default) -Mnopropcond
              Enable (disable) propagation of constant values derived from
              conditional branches with equality tests.
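
              A small hypothetical example of the kind of code this affects:
              inside the branch guarded by the equality test, x is known to
              equal 3, so the compiler may treat x * 2 as the constant 6.  The
              function name is illustrative.

```c
/* Inside the 'if', the equality test has established x == 3. */
int double_if_three(int x)
{
    if (x == 3)
        return x * 2;   /* x is known to be 3 here; may fold to 6 */
    return 0;
}
```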

       -Mquad Align large objects on quad-word boundaries.

       -Msafe_lastval
              In the case where a scalar is used after a loop, but is not
              defined on every iteration of the loop, the compiler does not by
              default parallelize the loop. However, this option tells the
              compiler it is safe to parallelize the loop.
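
              The situation described above looks like the following
              hypothetical function.  The scalar last is assigned only on
              iterations where the test succeeds, yet it is read after the
              loop, so its final value depends on which iteration wrote it
              last in sequential order.

```c
/* Returns the last negative element of a[0..n-1], or 0 if none. */
int last_negative(const int *a, int n)
{
    int i;
    int last = 0;
    for (i = 0; i < n; i++) {
        if (a[i] < 0)
            last = a[i];    /* not assigned on every iteration */
    }
    return last;            /* scalar used after the loop */
}
```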

       -Msafeptr[=option[,option,...]] -Mnosafeptr (default)
              Override (don’t override) data dependence between C pointers and
              between pointers and variables or arrays.  Use this option with
              care: if the pointers do in fact overlap, the generated code can
              produce results that differ from those ANSI C requires.  When
              used properly, however, it can greatly improve performance,
              especially for floating-point loops.  The options may be
              combined.

              all       All pointers are assumed not to overlap or conflict
                        with other data objects; -Msafeptr with no options
                        implies -Msafeptr=all.

              arg | dummy
                        C dummy arguments (pointers and arrays) are treated
                        with the same copyin/copyout semantics as Fortran
                        dummy arguments.

              auto | local
                        C local or auto variables (pointers and arrays) are
                        assumed not to overlap or conflict with other data
                        objects and are independent.

              global    C global or extern variables (pointers and arrays) are
                        assumed not to overlap or conflict with other data
                        objects and are independent.

              static    C static variables (pointers and arrays) are assumed
                        not to overlap or conflict with other data objects and
                        are independent.
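
              To illustrate the dependence that -Msafeptr overrides, the
              hypothetical function below writes through dst while reading
              through src.  If a caller passes overlapping regions, each
              iteration can read a value that an earlier one wrote, and the
              loop must run in order.  An option such as -Msafeptr=arg lets
              the compiler treat the arguments as independent, so it is
              appropriate only when every caller passes distinct arrays.

```c
/* Hypothetical example: dst[i] = src[i] + 1 for n elements.
 * If dst == src + 1, iteration i reads what iteration i-1 wrote. */
void shift_add(int *dst, const int *src, int n)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i] = src[i] + 1;
}
```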

       -Mscalarsse -Mnoscalarsse
              this switch must be specified when compiling the file containing
              the Fortran, C, or C++ main routine.  This is currently only
              available on 64-bit Linux systems.  The behavior of -Msmartalloc
              can be modified with the following options:

              huge      Link in the huge page runtime library, so dynamic
                        memory will be allocated in huge pages.

              huge:n    Link in the huge page runtime library and allocate n
                        huge pages.

              hugebss   (x86-64 only) Link in the huge page runtime library
                        and allocate the BSS section (containing uninitialized
                        static symbols) in huge pages.  This requires that the
                        huge page runtime library be linked dynamically, so
                        the -rpath option for that directory will be added
                        regardless of the setting of -Mnorpath.

              nohuge    Override any previous -Msmartalloc=huge or
                        -Msmartalloc=hugebss switches; do not link in the huge
                        page runtime library.

       -Mstride0 -Mnostride0 (default)
              Generate (don’t generate) alternate code for a loop that
              contains an induction variable whose increment may be zero.
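
              A sketch of the kind of loop this option concerns,
              assuming a stride that is only known at run time.  The
              function name add_strided is hypothetical.  When stride
              is zero, every iteration updates the same element, so
              the iterations are no longer independent.

```c
#include <stddef.h>

/* Hypothetical example: k is an induction variable whose increment
 * (stride) is supplied by the caller and may be zero. */
void add_strided(float *a, size_t count, size_t stride, float value)
{
    size_t k = 0;
    for (size_t i = 0; i < count; i++) {
        a[k] += value;   /* with stride == 0 every iteration hits a[0] */
        k += stride;
    }
}
```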

       -Munroll[=option[,option...]] -Mnounroll (default)
              Invoke (don’t invoke) the loop unroller.  This also sets the
              optimization level to a minimum of 2; see -O.  The option is one
              of the following:

              c:m       Instructs the compiler to completely unroll loops with
                        a constant loop count less than or equal to m, a
                        supplied constant.  If this value is not supplied, the
                        m count is set to 4.

              n:u       Instructs the compiler to unroll u times, a single-
                        block loop which is not completely unrolled, or has a
                        non-constant loop count.  If u is not supplied, the
                        unroller computes the number of times a candidate loop
                        is unrolled.

              m:u       Instructs the compiler to unroll u times, a multi-
                        block loop which is not completely unrolled, or has a
                        non-constant loop count.  If u is not supplied, the
                        unroller computes the number of times a candidate loop
                        is unrolled.

              -Mnounroll instructs the compiler not to unroll loops.
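
              As a conceptual illustration only, the two functions
              below show a loop with a constant trip count of 4 and a
              hand-written version of what complete unrolling amounts
              to.  The function names are hypothetical, and the code
              the compiler actually generates may differ.

```c
/* Rolled form: the trip count is the constant 4. */
float dot4(const float *x, const float *y)
{
    float sum = 0.0f;
    for (int i = 0; i < 4; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Hand-unrolled form: the loop control is gone and each
 * iteration appears once, in the original order. */
float dot4_unrolled(const float *x, const float *y)
{
    float sum = 0.0f;
    sum += x[0] * y[0];
    sum += x[1] * y[1];
    sum += x[2] * y[2];
    sum += x[3] * y[3];
    return sum;
}
```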

       -Munsafe_par_align -Mnounsafe_par_align
              Use (don’t use) aligned moves for array loads in parallelized
                        loops, depending on such characteristics as array
                        alignments and loop counts.

              assoc (default) noassoc
                        Enable (disable) certain associativity conversions
                        that can change the results of a computation due to
                        floating point roundoff error differences.  A typical
                        optimization is to change the order of additions,
                        which is mathematically correct, but can be
                        computationally different, due to roundoff error.

              cachesize:number (default=automatic)
                        Instructs the vectorizer, when performing cache tiling
                        optimizations, to assume a cache size of number.

              fuse nofuse (default)
                        Enable (disable) loop fusion to combine adjacent loops
                        into a single loop.

              gather (default) nogather
                        Enable (disable) vectorization of loops with indirect
                        array references.

              idiom noidiom (default)
                        Enable idiom recognition; this currently has no
                        effect.

              levels:n  Set maximum nest level of loops to optimize.

              partial   Enable partial loop vectorization via innermost loop
                        distribution.

              prefetch  Use prefetch instructions in loops where profitable.

              short noshort (default)
                        Enable (disable) recognition of short vector
                        operations that arise from scalar code outside of
                        loops or within the body of loops.

              simd[:128|256] nosimd (default)
                        Use vector SIMD instructions (SSE, AVX).
                        The argument may be used to limit usage to 128-bit
                        SIMD instructions.  Specifying 256-bit SIMD
                        instructions is only possible for target processors
                        that support AVX.


              sizelimit[:number] nosizelimit (default)
                        Limit the size of loops that are
                        vectorized; the default is to attempt to
                        vectorize all loops.


       -Mnovect disables the vectorizer, and is the default.
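
              The effect described for the assoc suboption can be
              reproduced by hand.  The sketch below, with hypothetical
              function names, adds the same three values in two
              different groupings; the volatile temporaries are there
              only to keep each grouping as written.

```c
/* Sum three values left to right: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;   /* force the inner sum to be formed first */
    return t + c;
}

/* Same values, regrouped: a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;   /* force the inner sum to be formed first */
    return a + t;
}
```

              With a = 1e20, b = -1e20 and c = 1.0, sum_left returns
              1.0 while sum_right returns 0.0, because 1.0 is lost to
              rounding when it is added to -1e20.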

       -Mzerotrip (default) -Mnozerotrip
              Include (don’t include) a zero-trip test for
              loops.  Use -Mnozerotrip only when all loops are
              known to execute at least once.
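
              A source-level sketch of the difference, using
              hypothetical function names.  The first loop tests its
              condition before the first iteration; the second has no
              such test and is only correct when the caller guarantees
              at least one iteration.

```c
/* Loop with a zero-trip test: the body is skipped when n <= 0. */
int count_guarded(int n)
{
    int iterations = 0;
    for (int i = 0; i < n; i++)
        iterations++;
    return iterations;
}

/* Loop without a zero-trip test: the body always runs at least
 * once, which is only correct when n >= 1. */
int count_unguarded(int n)
{
    int iterations = 0;
    int i = 0;
    do {
        iterations++;
        i++;
    } while (i < n);
    return iterations;
}
```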

       -mp[=option]
              Interpret OpenMP pragmas to explicitly parallelize
              regions of code for execution by multiple threads
              on a multi-processor system. Most OpenMP pragmas
              as well as the SGI parallelization pragmas are
              supported. See Chapters 5 and 6 of the PGI User’s
              Guide for more information on these pragmas.  The
              options allowed are:

              align noalign (default)
                        Modify (don’t modify) default loop
                        iteration scheduling to align iterations
                        with array references.  The default is
                        to use simple static scheduling.

              allcores  Use all available cores when the
                        environment variables OMP_NUM_THREADS
                        and NCPUS are not set.  This must be
                        specified at link time.

              bind      Bind threads to cores or processors.
                        This must be specified at link time.

              numa nonuma
                        Use (don’t use) libraries to give
                        affinity between threads and processors;
                        this is useful with NUMA (non-uniform
                        memory access) parallel architectures,
                        so memory allocated by a particular
                        thread will be allocated close to that
                        processor, and will remain close to that
                        thread.  The default depends on the host
                        machine.
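
              A minimal sketch of a loop annotated with a standard
              OpenMP pragma.  The function name sum_to is
              hypothetical.  The reduction clause gives each thread a
              private partial sum and combines the partial sums when
              the loop ends.

```c
/* Sum the integers 1..n.  When OpenMP pragmas are interpreted, the
 * iterations are divided among threads; otherwise the pragma is
 * expected to be ignored and the loop runs serially. */
long sum_to(int n)
{
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; i++)
        sum += i;
    return sum;
}
```

              The result is the same with or without -mp; only the
              way the iterations are executed changes.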

       -O[level]
              Set the optimization level.  If -O is not
              specified, then the default level is 1 if -g is
              not specified, and 0 if -g is specified.  If a
              number is not supplied with -O then the
              optimization level is set to 2.  The optimization
              levels and their meanings are as follows:

              0         A basic block is generated for each C
                        statement. No scheduling is done between

              3         All level 1 and 2 optimizations are
                        performed.  In addition, this level
                        enables more aggressive code hoisting
                        and scalar replacement optimizations
                        that may or may not be profitable.

              4         All level 1, 2 and 3 optimizations are
                        performed.  In addition, hoisting of
                        guarded invariant floating point
                        expressions is enabled.

       -pg    (Linux only) Enable gprof-style sample-based
              profiling; implies -Mframe.



Debugging Options

       -g     Generate symbolic debug information. This also
              sets the optimization level to zero, unless a -O
              switch is present on the command line. Symbolic
              debugging may give confusing results if an
              optimization level other than zero is selected.
              With -O0, the generated code will be slower than
              code generated at other optimization levels.

       -gopt  Generate symbolic debug information, without
              affecting optimizations.  This may give confusing
              results when debugging with optimizations; it is
              intended for use with other tools that use the
              debug information.

       -Mbounds -Mnobounds (default)
              Add (don’t add) array bounds checking.  Bounds
              checking is not applied to subscripted pointers.

       -Mchkfpstk
              Check for internal consistency of the IA-32
              floating point stack in the prologue of a function
              and after returning from a function or subroutine
              call. If the PGI_CONTINUE environment variable is
              set, the stack will be automatically cleaned up
              and execution will continue. There is a
              performance penalty associated with the stack
              cleanup. If PGI_CONTINUE is set to verbose, the
              stack will be automatically cleaned up and
              execution will continue after a warning message is
              printed.

       -Mchkstk
              Check the stack for available space upon entry to
              and before the start of a parallel region. Useful
              when many private variables are declared.

       -Mnodwarf
              Don’t add the default dwarf information.

       -traceback -notraceback (default)
              Add (don’t add) debug information for runtime
              traceback.



Preprocessor Options

       -C     Preserve comments in preprocessed C source files.

       -Dname[=def]
              Define name to be def in the preprocessor. If def
              is missing, it is assumed to be empty. If the =
              sign is missing, then name is defined to be the
              string 1.
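
              A sketch of source code that reacts to -D definitions.
              The macro names TRACE_LEVEL and VERBOSE and both
              function names are hypothetical.

```c
/* Fallback used when the build does not define TRACE_LEVEL. */
#ifndef TRACE_LEVEL
#define TRACE_LEVEL 0
#endif

/* Returns the value the preprocessor substituted for TRACE_LEVEL. */
int trace_level(void)
{
    return TRACE_LEVEL;
}

/* Returns 1 if VERBOSE was defined for this compilation, else 0. */
int verbose_enabled(void)
{
#ifdef VERBOSE
    return 1;
#else
    return 0;
#endif
}
```

              Following the rules above, -DTRACE_LEVEL=2 would make
              trace_level() return 2, -DTRACE_LEVEL alone would define
              the macro as 1, and -DVERBOSE would make
              verbose_enabled() return 1.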

       -dD    Print to standard output a list of the macros and
              their values as defined in the source files, along
              with the file name and line number where the
              definitions occur.

       -dI    Print to standard output a list of all files
              included by the preprocessor, including the file
              name and line number where the include line
              occurred, and the full path of the included file.

       -dM    Print to standard output a list of all the macros
              and their values as defined in the source files,
              along with the file name and line number where the
              definitions occur, as well as predefined and
              command-line macros.

       -dN    Like -dD, print to standard output a list of macro
              names, but not their values, as defined in the
              source files, along with the file name and line
              number where the definitions occur.

       -E     Preprocess each .c file and send the result to
              standard output. No compilation, assembly, or
              linking is performed.

       -Idirectory
              Add directory to the compiler’s search path for
              include files.  For include files surrounded by <
              >, each -I directory is searched followed by the
              standard area. For include files surrounded by "
              ", the directory containing the file containing
              the #include directive is searched, followed by
              the -I directories, followed by the standard area.

       -M     Generate a list of make dependences and print them
              to stdout.

              m    Print makefile dependencies to stdout, a la
                   -M.

              md   Print makefile dependencies to file.d, a la
                   -MD.

              mm   Print makefile dependencies to stdout,
                   ignoring system includes (includes with angle
                   brackets), a la -MM.

              mmd  Print makefile dependencies to file.d,
                   ignoring system includes (includes with angle
                   brackets), a la -MMD.

              line Include line numbers in the preprocessed
                   output.

              suffix:suff
                   When generating makefile dependencies, name
                   the dependent file file.suff; the default is
                   to name the dependent file file.o.

       -MD    Generate a list of make dependences and print them
              to the file file.d, where file is the root name of
              the file under compilation.

       -MM    Generate a list of make dependences and print them
              to stdout; ignore system includes.

       -MMD   Generate a list of make dependences and print them
              to the file file.d, where file is the root name of
              the file under compilation. Ignore system
              includes.

       -Mnostddef
              Do not predefine any macros to the preprocessor.

       -Mnostdinc
              Do not search in the standard location for include
              files when those files are not found elsewhere.

       -Mpreprocess
              Run the preprocessor on assembler source files.

       -P     Preprocess each file and leave the output in a
              file named file.i for each file named file.c.

       -Uname Remove any initial definition of name in the
              preprocessor. The only names predefined by the
              preprocessor itself are the standard ANSI C
              predefined macros. The driver may predefine other


Assembler Options

       -Wa,option[,option...]
              Pass each comma-delimited option to the assembler.

       -Ya,directory
              Look in directory for the assembler executable.



Linker Options

       -acclibs
              Link-time option to add the accelerator libraries
              to the link line.

       --as-needed --no-as-needed
              (Linux only; not supported by all linkers) Passed
              to the linker.  Instructs the linker to only set
              the DT_NEEDED flag for subsequent shared
              libraries, requiring those libraries at run time,
              if they are used to satisfy references.
              --no-as-needed restores the default behavior.

       -Bdynamic
              (Linux only) Passed to the linker to specify
              dynamic binding.

       -Bstatic
              (Linux only) Passed to the linker to specify
              static binding.

       -Bstatic_pgi
              (Linux only) Statically link in the PGI libraries,
              while using dynamic linking for the system
              libraries; implies -Mnorpath.

       -g77libs
              (Linux only) Link-time option which allows object
              files generated by GNU g77 (or gcc) to be linked
              into pgcc main programs.

       -llibrary
              Passed to the linker; load the library
              liblibrary.a from the standard library directory.
              See also the -L option.

       -Ldirectory
              Add directory to the list of directories in which
              the linker searches for libraries.

       -m     Cause the linker to display a link map.

       -Meh_frame -Mnoeh_frame
              Add (don’t add) arguments to the link line to
              to build an MPI application using MPI libraries
              installed with the PGI Cluster Development Kit
              (CDK). -Mmpi inserts -I$MPIDIR/include into the
              compile line, and -L$MPIDIR/lib -lfmpich -lmpich
              into the link line.  The specified option is used
              to determine whether to select MPICH-1 or MPICH-2
              headers and libraries. The base directories for
              MPICH-1 and MPICH-2 are set in localrc.  The -Mmpi
              options are:

              hpmpi     Select preconfigured HP-MPI libraries.

              mpich1    Select preconfigured MPICH-1 libraries.

              mpich2    Select preconfigured MPICH-2 libraries.

              mvapich1  Select preconfigured MVAPICH libraries.

              The user can set the environment variables MPIDIR
              and MPILIBNAME to override the default values for
              the MPI directory and library name.

       -Mnostartup
              Do not link in the usual startup routine. This
              routine contains the entry point for the program.

       -Mnostdlib
              Do not link in the standard libraries when linking
              a program.

       -Mrpath (default) -Mnorpath
              The default is to add -rpath to the link line
              giving the directories containing the PGI shared
              objects.  Use -Mnorpath to instruct the driver not
              to add any -rpath switches to the link line.

       -Mscalapack
              (PGI CDK only) Add the Scalapack libraries.

       -pgcpplibs
              Link-time option to add the C++ runtime libraries,
              allowing mixed-language programming.

       -pgf77libs
              Link-time option to add the pgf77 runtime
              libraries, allowing mixed-language programming.

       -pgf90libs
              Link-time option to add the pgf90 runtime
              libraries, allowing mixed-language programming.

       -r     Passed to the linker; generate a re-linkable
       -shared
              (Linux only) Passed to the linker. Instructs the
              linker to generate a shared object file
              (dynamically linked library).  Implies -fpic.

       -soname name
              (Linux only) Passed to the linker. When creating a
              shared object, instructs the linker to set the
              internal DT_SONAME field to the specified name.

       -uname Passed to the linker; generate undefined
              reference.

       --whole-archive --no-whole-archive
              (Linux only) Passed to the linker.  Instructs the
              linker to include all objects in subsequent
              archive files.  --no-whole-archive restores the
              default behavior.

       -Wl,option[,option...]
              Pass each comma-delimited option to the linker.

       -YC,directory
              Look in directory for the standard compiler
              library files.

       -Yl,directory
              Look in directory for the linker.

       -YL,directory
              Look in directory for the standard system library
              files.

       -YS,directory
              Look in directory for the standard system startup
              object files.

       -YU,directory
              Passed to the linker; change library search path.



Language Options

       -asmsuffix=suffix
              Define that a file with the given suffix is an
              assembly language file.

       -B        Allow C++-style comments in source code; these
                 begin with the characters ’//’ and continue to
                 the end of the current line. Such comments are
                 stripped unless you specify the -C option.

       -c8x      Use the C89 standard as the C source language.

       -Mbuiltin (default) -Mnobuiltin
                 Compile (don’t compile) with math subroutine
                 builtin support, which causes selected math
                 library routines to be inlined.

       -Mdalign (default) -Mnodalign
                 Align (don’t align) doubles in structures on
                 8-byte boundaries.  -Mnodalign may lead to data
                 alignment exceptions.

       -Mdollar=char
                 Set the character used to replace dollar signs
                 in names to be char.  Default is an underscore
                 (_).

       -Mfcon    Treat non-suffixed floating point constants as
                 float, rather than double.  This may improve
                 the performance of single-precision code.
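
                 A sketch of the distinction, with hypothetical
                 function names.  In standard C the constant 0.1
                 has type double and 0.1f has type float; under
                 -Mfcon the first function would be expected to
                 behave like the second.

```c
/* 0.1 is a double constant: x is converted to double, the multiply
 * is done in double, and the result is converted back to float. */
float scale_double_constant(float x)
{
    return x * 0.1;
}

/* 0.1f is a float constant: the multiply stays in single precision. */
float scale_float_constant(float x)
{
    return x * 0.1f;
}
```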

       -Mlibsuffix=suffix
                 Define that a file with the given suffix is an
                 object library file.

       -Mllalign -Mnollalign (default)
                 Align (don’t align) long longs or INTEGER*8 in
                 structures or common blocks on 8-byte
                 boundaries.  -Mnollalign is the default, and
                 this is a change beginning with release 4.0.
                 Releases prior to 4.0 aligned long longs on
                 8-byte boundaries.

       -Mm128 -Mnom128 (default)
                 (C only) Recognize the datatypes __m128,
                 __m128d and __m128i.

       -Mobjsuffix=suffix
                 Define that a file with the given suffix is a
                 binary object file.

       -Mschar (default)
                 Specify that the char type is signed by
                 default; see -Muchar.

       -Msignextend (default) -Mnosignextend
                 Sign extend (don’t sign extend) when a
                 narrowing conversion overflows.  For example,
                 when -Msignextend is in effect and an integer
                 containing the value 65535 is converted to a
                 short, the value of the short will be -1.  ANSI
                 C specifies that the result of such conversions
                 is undefined.
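
                 The conversion described above can be written as
                 follows.  The function name narrow_to_short is
                 hypothetical, and the sketch assumes a 16-bit
                 short.

```c
/* Narrow an int to a short.  65535 does not fit in a 16-bit signed
 * short, so the result depends on how the overflow is handled. */
short narrow_to_short(int value)
{
    return (short)value;
}
```

                 With -Msignextend in effect, narrow_to_short(65535)
                 yields -1, as stated above; values that fit in a
                 short are unchanged.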

       -Xc       Conformance mode: Specify that the compiled
                 language should conform to all ANSI features,
                 but warnings may be produced about some
                 extensions.

       -Xs       Standard mode: specify that the compiled
                 language should conform to K&R C.  This also
                 implies -ansi=traditional.

       -Xt       Specify that the compiled language should
                 conform to K&R C.  The compiler may produce
                 warnings for semantics where ANSI C and K&R C
                 conflict.  This also implies -ansi=traditional.



Target-specific Options

       -Kieee -Knoieee (default)
              Perform floating-point operations in strict
              conformance with the IEEE 754 standard.  Some
              optimizations are disabled with -Kieee, and a more
              accurate math library is used.  The default
              -Knoieee uses faster but very slightly less
              accurate methods.

       -Ktrap=[option,[option]...]
              Controls the behavior of the processor when
              exceptions occur.  Possible options include

              align   Trap on memory alignment errors, currently
                      ignored.

              denorm  Trap on denormalized operands.

              divz    Trap on divide by zero.

              fp      Trap on floating point exceptions.

              inexact Trap on inexact result.

              inv     Trap on invalid operands.

              none (default)
                      Disable all traps.

              ovf     Trap on floating point overflow.

              unf     Trap on floating point underflow.

              -Ktrap is only processed when compiling a main
              function/program.  -Ktrap=fp is equivalent to
              -Ktrap=divz,inv,ovf.  These options correspond to
              the processor’s exception mask bits.  Normally, the
              Set SSE to flush-to-zero mode.

       -Mfpapprox [=option[,option,...]] -Mnofpapprox (default)
              Perform (don’t perform) certain single-precision
              floating point operations using low-precision
              approximation.  This can be very dangerous; the
              low-precision approximations are much faster than
              the full precision computation, but the results
              will be different.  This option should be used
              only with the utmost care.  The options are

              div       Approximate single precision floating
                        point division.

              rsqrt     Approximate single precision floating
                        point reciprocal square root.

              sqrt      Approximate single precision floating
                        point square root.

              With no options, -Mfpapprox will approximate all
              three operations.

       -Mfpmisalign -Mnofpmisalign
              Allow (don’t allow) vector arithmetic instructions
              with memory operands that are not aligned on
              16-byte boundaries.

       -Mfprelaxed [=option[,option,...]] -Mnofprelaxed (default)
              Perform (don’t perform) certain floating point
              operations using relaxed precision when it
              improves speed.  The options are

              div       Perform divide using relaxed precision.

              order noorder
                        Allow (don’t allow) expression
                        reordering, including factoring such as
                        computing a*b+a*c as a*(b+c).

              recip     Perform reciprocal operations using
                        relaxed precision.

              rsqrt     Perform reciprocal square root (1/sqrt)
                        using relaxed precision.

              sqrt      Perform square root using relaxed
                        precision.

              With no options, -Mfprelaxed will choose to generate
              relaxed precision code for those operations that
              generate a significant performance improvement,
              depending on the target processor.

       -Mreg_struct_return -Mnoreg_struct_return (default)
              Return (don’t return) small struct/union function
              values in registers.  This switch only affects
              32-bit code.

       -Msecond_underscore -Mnosecond_underscore (default)
              Add (don’t add) a second underscore to the name of
              a Fortran global if its name already contains an
              underscore. This option is useful for maintaining
              compatibility with g77, which adds a second
              underscore to such symbols by default.

       -Mwritable-strings
              Store string constants in the writable data
              segment.

       -m32   Compile for 32-bit target.

       -m64   Compile for 64-bit target.

       -mcmodel=small|medium
              (AMD64 and IA32/EM64T only) Use the memory model
              that limits objects to less than 2GB (small) or
              allows data sections to be larger than 2GB
              (medium); -mcmodel=medium implies -Mlarge_arrays.

       -pc val
              The IA-32 architecture implements a floating-point
              stack using eight 80-bit registers. Each register
              uses bits 0-63 as the significand, bits 64-78 as
              the exponent, and bit 79 as the sign bit. This
              80-bit real format is the default format (called
              the extended format).  When values are loaded into
              the floating point stack they are automatically
              converted into extended real format.  The
              precision of the floating point stack can be
              controlled, however, by setting the precision
              control bits (bits 8 and 9) of the floating-point
              control word appropriately. In this way, the
              programmer can explicitly set the precision to
              standard IEEE double using 64 bits, or to single
              precision using 32 bits.  The default precision
              setting is system dependent.  If you use -pc to
              alter the precision setting for a routine, the
              main program must be compiled with the same value
              for -pc.  The command line option -pc val lets the
              programmer set the compiler’s precision
              preference. Valid values for val are:
                  32 single precision
                  64 double precision
                  80 extended precision
              Operations performed exclusively on the floating
                      suboptions valid after -ta=nvidia are:

                      analysis
                          Perform the analysis, but do not
                          generate GPU code.

                      nofma
                          Do not generate fused multiply-add
                          operations.

                      cc1x
                          Generate code for the lowest compute
                          capability 1.x device that supports
                          all the features required in the
                          program.

                      cc10
                          Generate code for a device with
                          compute capability 1.0.

                      cc11
                          Generate code for a device with
                          compute capability 1.1.

                      cc12
                          Generate code for a device with
                          compute capability 1.2.

                      cc13
                          Generate code for a device with
                          compute capability 1.3.

                      cc2x
                          Generate code for the lowest compute
                          capability 2.x device that supports
                          all the features required in the
                          program.

                      cc20
                          Generate code for a device with
                          compute capability 2.0.  This requires
                          -ta=nvidia,cuda3.0, or changing the
                          default CUDA version to 3.0 in the
                          siterc file.

                      cuda4.0
                          Use the CUDA 4.0 toolkit to build the
                          GPU code.

                      4.0 An alias for -Mcuda=cuda4.0.

                      fastmath
                          Use routines from the fast math
                          library.

                      keepgpu
                          Keep the generated CUDA GPU source
                          files, with a .gpu suffix.

                      keepptx
                          Keep the generated portable assembly
                          files, with a .ptx suffix.

                      maxregcount:n
                          Set the maximum number of registers to
                          use in the generated GPU code.

                      mul24
                          Use 24-bit multiplication for array
                          subscripting.

                      time
                          Link with a profile library to collect
                          simple timing information for
                          accelerator regions.
              Note that multiple compute capabilities can be
              specified, and one version will be generated for
              each capability specified.  The default is to
              generate one version for the lowest compute
              capability that will support all the features in
              the program, one version with compute capability
              1.3, and if the CUDA 3.0 toolkit is used, one
              version with compute capability 2.0.

              -ta=host
                      Compile the accelerator regions to run on
                      the host processor.

              The default in the absence of the -ta flag is to
              ignore the accelerator directives and compile for
              the host.  Multiple targets are allowed, such as
              -ta=nvidia,host, in which case two versions of
              each routine with accelerator regions are
              generated, one to run on the NVIDIA GPU and one on
              the host; the selection of which version to
              execute is made at run time.

       -tp=target
              Specify the type of the target processor;
              possibilities are

              -tp= amd64
                      AMD Opteron or Athlon-64 in 64-bit mode

              -tp= amd64e
                      AMD Opteron revision E or later, in 64-bit
                      mode; includes SSE3 instructions


              -tp= bulldozer
                      AMD Bulldozer processor

              -tp= bulldozer-32
                      AMD Bulldozer processor in 32-bit mode

              -tp= bulldozer-64
                      AMD Bulldozer processor in 64-bit mode

              -tp= core2
                      Intel Core 2 processor

              -tp= core2-32
                      Intel Core 2 processor in 32-bit mode

              -tp= core2-64
                      Intel Core 2 processor in 64-bit mode

              -tp= istanbul
                      AMD Istanbul architecture Opteron
                      processor

              -tp= istanbul-32
                      AMD Istanbul architecture Opteron
                      processor, 32-bit mode

              -tp= istanbul-64
                      AMD Istanbul architecture Opteron
                      processor, 64-bit mode

              -tp= k7 AMD Athlon processor

              -tp= k8 AMD Opteron or Athlon-64

              -tp= k8-32
                      AMD Opteron or Athlon-64 in 32-bit mode

              -tp= k8-64
                      AMD Opteron or Athlon-64 in 64-bit mode

              -tp= k8-64e
                      AMD Opteron revision E or later, in 64-bit
                      mode; includes SSE3 instructions

              -tp= nehalem
                      Intel Nehalem architecture Core processor

              -tp= nehalem-32
                      Intel Nehalem architecture Core processor,
                      32-bit mode

                      IA32/EM64T processor in 64-bit mode

              -tp= penryn
                      Intel Penryn architecture Pentium
                      processor

              -tp= penryn-32
                      Intel Penryn architecture Pentium
                      processor, 32-bit mode

              -tp= penryn-64
                      Intel Penryn architecture Pentium
                      processor, 64-bit mode

              -tp= piii
                      Pentium III processor

              -tp= piv
                      Pentium 4 processor

              -tp= px Blended code generation that will work on
                      any x86-compatible processor

              -tp= px-32
                      Blended code generation that will work on
                      any 32-bit x86-compatible processor

              -tp= px-64
                      Blended code generation that will work on
                      any 64-bit x86 processor

              -tp= sandybridge
                      Intel Sandy Bridge architecture Core
                      processor

              -tp= sandybridge-32
                      Intel Sandy Bridge architecture Core
                      processor, 32-bit mode

              -tp= sandybridge-64
                      Intel Sandy Bridge architecture Core
                      processor, 64-bit mode

              -tp= shanghai
                      AMD Shanghai architecture Opteron
                      processor

              -tp= shanghai-32
                      AMD Shanghai architecture Opteron
                      processor, 32-bit mode

              -tp= shanghai-64
                      AMD Shanghai architecture Opteron
                      processor, 64-bit mode

              -tp=target-64 is equivalent to -m64 -tp=target,
              and -tp=target-32 is equivalent to -m32
              -tp=target.  When both 32-bit and 64-bit targets
              are available for a processor, -tp=target by
              itself compiles for a 32-bit or 64-bit target
              depending on whether the 32-bit or 64-bit
              compiler is invoked from your command line path.
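
              The relationship between a suffixed target name
              and the -m32 and -m64 options can be sketched as
              a small shell function.  The name expand_tp is
              hypothetical and used only for illustration; the
              function rewrites text and does not query the
              compiler or check that the target exists.

```shell
# Hypothetical helper (not part of pgcc): rewrite a -tp target name
# with a -32 or -64 suffix into the equivalent -m32/-m64 form.
expand_tp() {
    case "$1" in
        *-64) printf '%s\n' "-m64 -tp=${1%-64}" ;;
        *-32) printf '%s\n' "-m32 -tp=${1%-32}" ;;
        *)    printf '%s\n' "-tp=$1" ;;   # no suffix: leave unchanged
    esac
}

expand_tp sandybridge-64   # prints: -m64 -tp=sandybridge
expand_tp k8-32            # prints: -m32 -tp=k8
expand_tp px               # prints: -tp=px
```

              Names such as amd64 and k8-64e do not end in -32
              or -64, so the sketch passes them through
              unchanged.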



FILES

       a.out       executable output file
       pgpf.out    Profile feedback data file; see -Mpfi
       pgprof.out  PGPROF output file; see -Mprof
       file.a      library of object files
       file.c      C source file
       file.i      C source file after preprocessing
       file.ipa    InterProcedural Analyzer (IPA) file
       file.ipo    InterProcedural Analyzer (IPA) file
       file.o      object file
       file.s      assembler source file
       .mypgccrc   You may add custom switches or make other
                   additions to pgcc by creating a file named
                   .mypgccrc in your home directory.

       The installation of this version of the compiler resides
       in $PGI/target/12.4-0/; other versions may coexist in
       $PGI/target/release/.  $PGI is an environment variable
       that points to the root of the compiler installation
       directory. If $PGI is not set, the default is /usr/pgi.
       The target is one of the following:
       linux86     for 32-bit IA32 Linux targets
       linux86-64  for 64-bit AMD64 or IA32/EM64T Linux targets

       The compiler installation subdirectories are:
       bin/        compiler and tool executables and
                   configuration (rc) files
       include/    compiler include files
       lib/        libraries and object files
       liblf/      libraries and object files
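
       The directory layout described above can be sketched as a
       shell function.  The name pgi_dir is hypothetical and is
       not shipped with the compiler; treating an empty $PGI the
       same as an unset one is an assumption of this sketch.

```shell
# Hypothetical helper (not part of pgcc): print the installation
# directory for a target and release, falling back to /usr/pgi
# when $PGI is unset (or, by assumption, empty).
pgi_dir() {
    # $1 = target (linux86 or linux86-64)
    # $2 = release (for example 12.4-0)
    # $3 = optional subdirectory (bin, include, lib, liblf)
    printf '%s\n' "${PGI:-/usr/pgi}/$1/$2${3:+/$3}"
}

pgi_dir linux86-64 12.4-0 bin
```

       The function only builds the path; it does not check that
       the directory exists.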


SEE ALSO

       pgCC(1), pgf77(1), pgfortran(1), pghpf(1), pgprof(1),
       pgdbg(1), and the PGI User’s Guide.


DIAGNOSTICS

       The compiler produces information and error messages as
       it translates the input program. The linker and assembler
       may issue their own error messages.



                                  April 2012                           pgcc(1)
