pgcc [ -flag ]... sourcefile...
pgcc is the interface to the PGI C compiler for OpenPOWER processors. pgcc invokes the C compiler, assembler, and linker with options derived from its command line arguments.
Suffixes of source file names indicate the type of processing to be done:
.c C source; preprocess, compile
.i C source after preprocessing; compile
.s assembler source; assemble
.S assembler source; preprocess, assemble
.o object file; passed to linker
.a library archive file; passed to linker
Unless one overrides the default action using a command-line option, pgcc deletes the intermediate preprocessor and assembler files (see the options -c, -E, -P, and
-Mkeepasm); if a single C program is compiled and linked with one pgcc command, the intermediate object file is also deleted. Linking is the last stage of the compile process,
unless you use one of the -c, -E, -P, or -S options, or unless compilation errors stop the whole process.
Options must be separate; -cs is different from -c -s. Here is a list of all options, grouped by type. More detailed explanations are in following sections.
-### -c -dryrun -drystdinc -echo -help[=option] -Minform=level -Mkeepasm -o file -rc rcfile -S -show -silent -time -V -V<ver> -v -w
-alias=option -fast -fpic -M[no]autoinline=option -M[no]depchk -Mextract=option -M[no]idiom -Minfo=option -Minline=option -Mneginfo=option -Msafe_lastval
-M[no]safeptr=option -M[no]unroll=option -M[no]vect=option -M[no]zerotrip -mp[=option] -Olevel
-g -gopt -M[no]bounds
-C -Dmacro -dD -dI -dM -dN -E -Idirectory -M -MD -MM -MMD -Mnostddef -Mnostdinc -Mpreprocess -P -Umacro
-acclibs -Bdynamic -Bstatic -Bstatic_pgi -Bsymbolic -cudalibs -Ldirectory -llibrary -m -Mnostartup -Mnostdlib -M[no]rpath -pgc++libs -pgf90libs -Rdirectory -r -rpath
directory -s -shared -soname name -uname --[no-]whole-archive -Wl,argument[,argument]...
-B -c8x -c89 -c9x -c99 -c11 -c1x -M[no]builtin -Mfcon -Mschar -M[no]signextend -M[no]single -Muchar -Xa -Xc -Xs -Xt
-drystdinc Display the standard include directories without invoking the compiler.
-echo Echo the command line flags and stop. This is useful when the compiler is invoked by a script.
-help[=option] Display command-line options recognized by pgcc on the standard output. pgcc -help -otherswitch will give help about -otherswitch. The default is to list pgcc
command line options by group; options are:
groups Print out the groups into which the switches are organized.
asm Print help for assembler command-line options.
debug Print help for debugging command-line options.
language Print help for language-specific command-line options.
linker Print help for linker options.
opt Print help for optimization command-line options.
other Print help for any other command-line options.
overall Print help for overall command-line options.
phase Print help for the known compiler phases.
prepro Print help for preprocessor command-line options.
suffix Describe the known file suffixes.
switch Print all switches in alphabetical order.
target Print help for target-specific command-line options.
variable Show the pgcc configuration; this is the same as -show.
-Minform=level Specify the minimum level of error severity that the compiler displays during compilation.
fatal Instructs the compiler to display fatal error messages.
file (default) nofile
Print out (don't print out) the names of files as they are compiled; this is only active when there is more than one file on the command line.
severe Instructs the compiler to display severe and fatal error messages.
warn Instructs the compiler to display warning, severe and fatal error messages.
inform Instructs the compiler to display all error messages (inform, warn, severe and fatal).
-show Produce help information describing the current pgcc configuration.
-silent Do not print warning messages. Same as -Minform=severe.
-time Print execution times for the various steps in the compiler itself.
-V Display version messages and other information.
-V<ver> If the specified version of the compiler is installed, that version of the compiler is invoked.
-v Verbose mode; print out the command line for each tool before it is executed.
-w Do not print warning messages.
-alias=option Specifies whether to optimize using ANSI C type-based pointer disambiguation rules. The option can be one of:
ansi Assume ANSI C type-based pointer disambiguation rules apply; this can enable better optimization in some cases. The rules state that a load or store through
a pointer of any type will not conflict with a load or store of a variable or through a pointer of a different type. This is the default with -O2 and above.
traditional Assume traditional C semantics apply. The compiler will assume that a load or store through any pointer might conflict with any variable or pointer
dereference unless it can prove otherwise. This is the default with -O1 and below, and when there is a type-cast pointer reference in the function.
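As an illustration of what the type-based rules permit, consider the sketch below. The function names are hypothetical and are not part of pgcc. In store_then_reload, a compiler applying the ansi rules may assume that the store through the float pointer cannot change *count and may therefore reuse the first load; under the traditional rules it must allow for a conflict and reload. Code that really needs to view the same bytes as two different types can use memcpy, which is well defined in standard C whichever aliasing rules are in effect. The second helper assumes that float and uint32_t are both 32 bits wide.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical example: count and scale point to different types, so
 * type-based disambiguation lets the compiler assume that the store
 * through scale does not modify *count. */
int store_then_reload(int *count, float *scale)
{
    int before = *count;
    *scale = 2.0f;
    return before + *count;
}

/* Copy the bytes of a float into an integer without breaking the
 * aliasing rules. Assumes float and uint32_t are both 32 bits wide. */
uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```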
-fast Chooses generally optimal flags for the target platform. Use pgcc -fast -help to see the equivalent switches. Note this sets the optimization level to a minimum of
2; see -O.
-fpic (Linux only) Instructs the compiler to generate position-independent code which can be used to create shared object files (dynamically linked libraries).
-Mautoinline[=option[,option,...]] -Mnoautoinline (default)
Enable inlining of functions with the inline attribute. -Mautoinline is implied with the -fast switch. The options are:
levels:n Inline up to n levels of function calls; the default is to inline up to 10 levels.
maxsize:n Only inline functions with a size of n or less. The size roughly corresponds to the number of statements in the function, though the correspondence is not
direct. The default is to inline functions with a size of 100 or less.
totalsize:n Stop inlining when this function reaches a size of n. The default is to stop inlining when a size of 8000 has been reached.
-Mdepchk (default) -Mnodepchk
Assume (don't assume) that potential data dependencies exist. -Mnodepchk may result in incorrect code; the -Msafeptr switch provides a less dangerous way to
accomplish the same thing.
-Mextract=option Run the subprogram extraction phase to prepare for inlining. The =lib:filename option must be used with this switch to name an extract library. See -Minline for more
details on inlining.
-Midiom Enable loop idiom recognition.
-Minfo[=option[,option,...]] Emit useful information to stderr. The options are:
all Includes options accel, inline, ipa, loop, lre, mp, opt, par, unified, vect.
accel Emit information about accelerator region targeting.
ccff Append complete CCFF information to the object files.
ftn Emit Fortran-specific information.
inline Emit information about functions extracted and inlined.
intensity Emit compute intensity information about loops.
ipa Emit information about the optimizations enabled by interprocedural analysis (IPA).
loop | opt
Emit information about loop optimizations. This includes information about vectorization and loop unrolling.
lre Emit information about loop-carried redundancy elimination.
mp Emit information about OpenMP parallel regions.
par Emit information about loop parallelization.
pfo Emit profile feedback information.
time | stat
Emit compilation statistics.
unified Emit information about which routines are selected for target-specific optimizations using the PGI Unified Binary.
vect Emit information about automatic loop vectorization.
With no options, -Minfo is the same as -Minfo=accel,inline,ipa,loop,lre,mp,opt,par,unified,vect.
-Minline[=option[,option,...]] Pass options to the function inliner. The options are:
Specify an inline library created by a previous -Mextract option. Functions from the specified library are inlined. If no library is specified, functions
are extracted from a temporary library created during an extract prepass.
Specifies which functions should not be inlined.
A non-numeric option is assumed to be a function name. If name: is specified, what follows is always the name of a function.
this option tells the compiler it is safe to parallelize the loop.
-Msafeptr[=option[,option,...]] -Mnosafeptr (default)
Override (don't override) data dependence between C pointers and between pointers and variables or arrays. This option must be used with care since the potential
exists for code to be generated that will result in unexpected or incorrect results as is defined by ANSI C. However, when used properly, this option has the potential
to greatly enhance the performance of code, especially floating point oriented loops. Combinations of the options may be used and interact appropriately.
all All pointers are assumed not to overlap or conflict with other data objects; -Msafeptr with no options implies -Msafeptr=all.
arg | dummy
C dummy arguments (pointers and arrays) are treated with the same copyin/copyout semantics as Fortran dummy arguments.
auto | local
C local or auto variables (pointers and arrays) are assumed not to overlap or conflict with other data objects and are independent.
global C global or extern variables (pointers and arrays) are assumed not to overlap or conflict with other data objects and are independent.
static C static variables (pointers and arrays) are assumed not to overlap or conflict with other data objects and are independent.
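The kind of loop these options target can be sketched as follows. The function name is hypothetical. Without further information the compiler must allow for dst overlapping a or b, which limits how freely it can reorder the loads and stores. The C99 restrict qualifier used here makes a comparable no-overlap promise in the source itself, only for the pointers that carry it. It is shown as a standard-C alternative, not as a description of what -Msafeptr does internally, and it assumes the file is compiled as C99 or later. Either way, calling the function with overlapping arrays breaks the promise and the behavior is undefined.

```c
#include <stddef.h>

/* Hypothetical example: restrict tells the compiler that dst, a and b
 * refer to non-overlapping storage for the duration of the call. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```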
-Munroll[=option[,option...]] -Mnounroll (default)
Invoke (don't invoke) the loop unroller. This also sets the optimization level to a minimum of 2; see -O. The option is one of the following:
c:m Instructs the compiler to completely unroll loops with a constant loop count less than or equal to m, a supplied constant. If this value is not supplied,
the m count is set to 4. If m is set to 1, a compiler heuristic determines the maximum loop count at which such loops will be completely unrolled.
n:u Instructs the compiler to unroll u times, a single-block loop which is not completely unrolled, or has a non-constant loop count. If u is not supplied, the
unroller computes the number of times a candidate loop is unrolled.
m:u Instructs the compiler to unroll u times, a multi-block loop which is not completely unrolled, or has a non-constant loop count. If u is not supplied, the
unroller computes the number of times a candidate loop is unrolled.
-Mnounroll instructs the compiler not to unroll loops.
-Mvect[=option[,option,...]] -Mnovect (default)
Pass options to the internal vectorizer. This also sets the optimization level to a minimum of 2, the equivalent of -O; for more information see optimization levels
under -O. If no option list is specified, then the following vector optimizations are used: assoc,cachesize:c,nosimd, where c is the actual cache size of the machine.
The -Mvect options are:
altcode (default) noaltcode
Enable (disable) alternate code generation for vector loops, depending on such characteristics as array alignments and loop counts.
fuse nofuse (default)
Enable (disable) loop fusion to combine adjacent loops into a single loop.
prefetch Use prefetch instructions in loops where profitable.
simd[:128|256] nosimd (default)
Use vector SIMD instructions (SSE, AVX). The argument may be used to limit usage to 128-bit SIMD instructions. Specifying 256-bit SIMD
instructions is only possible for target processors that support AVX.
uniform nouniform (default)
Perform the same optimizations in the vectorized and residual loops. This may affect the performance of the residual loop.
Enable idiom recognition; this currently has no effect.
levels:n Set maximum nest level of loops to optimize.
partial Enable partial loop vectorization via innermost loop distribution.
short noshort (default)
Enable (disable) recognition of short vector operations that arise from scalar code outside of loops or within the body of loops.
sizelimit[:number] nosizelimit (default)
Limit the size of loops that are vectorized; the default is to attempt to vectorize all loops.
sse nosse (default)
Use (don't use) SSE, SSE2, 3Dnow, and prefetch instructions in loops where possible. The sse option is now deprecated, and the simd option should be used instead.
tile notile (default)
Enable (disable) loop tiling to optimize for cache locality.
-Mnovect disables the vectorizer, and is the default.
-Mzerotrip (default) -Mnozerotrip
Include (don't include) a zero-trip test for loops. Use -Mnozerotrip only when all loops are known to execute at least once.
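What a zero-trip test protects against can be sketched in C itself. Both functions below are hypothetical illustrations, not pgcc output. The first is the source form, where the condition is checked before every iteration. The second shows the shape a compiler might give the loop: a single up-front test, then a body whose condition is checked at the bottom. If the up-front test were dropped, the body would run once even when n is 0, which is why omitting it is only safe when every loop is known to execute at least once.

```c
/* Source form: the loop condition is tested before the first iteration. */
long sum_for(const int *values, int n)
{
    long total = 0;
    for (int i = 0; i < n; i++) {
        total += values[i];
    }
    return total;
}

/* Hand-written equivalent: one zero-trip test guards a loop whose
 * condition is checked at the bottom of each iteration. */
long sum_guarded(const int *values, int n)
{
    long total = 0;
    if (n > 0) {                /* zero-trip test */
        int i = 0;
        do {
            total += values[i];
            i++;
        } while (i < n);
    }
    return total;
}
```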
-mp[=option] Interpret OpenMP directives to explicitly parallelize regions of code for execution by multiple threads on a multi-processor system. Most OpenMP directives as well as
the SGI parallelization directives are supported. See Chapters 5 and 6 of the PGI User's Guide for more information on these directives.
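A minimal sketch of a loop annotated with a standard OpenMP directive follows. The function name is hypothetical. When OpenMP directives are interpreted, the iterations are divided among threads and the per-thread partial sums are combined by the reduction clause; when they are not, the pragma is ignored and the loop runs serially.

```c
/* Hypothetical example: dot product with an OpenMP reduction. Each
 * thread accumulates a private partial sum, and the partial sums are
 * added together when the loop ends. */
double dot_product(const double *x, const double *y, int n)
{
    double sum = 0.0;
    int i;
#pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Because floating-point addition is not associative, the parallel result may differ in the last bits from the serial one for general inputs.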
-Olevel Set the optimization level. If -O is not specified, then the default level is 1 if -g is not specified, and 0 if -g is specified. If a number is not supplied with -O
then the optimization level is set to 2. The optimization levels and their meanings are as follows:
-O0 Sets the optimization level to 0. A basic block is generated for each statement. No scheduling is done between statements. No global optimizations are performed.
-O1 Sets the optimization level to 1. Scheduling within extended basic blocks is performed. No global optimizations are performed.
-O Sets the optimization level to 2, with no SIMD vectorization enabled. All level 1 optimizations are performed. In addition, traditional scalar optimizations
such as induction recognition and loop invariant motion are performed by the global optimizer.
-O2 All -O optimizations are performed. In addition, more advanced optimizations such as SIMD code generation, cache alignment and partial redundancy elimination are enabled.
-O3 All -O1 and -O2 optimizations are performed. In addition, this level enables more aggressive code hoisting and scalar replacement optimizations that may or
may not be profitable.
-O4 All -O1, -O2, and -O3 optimizations are performed. In addition, hoisting of guarded invariant floating point expressions is enabled.
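Loop invariant motion, one of the scalar optimizations named under -O, can be illustrated by writing the transformation by hand. Both functions are hypothetical and compute the same values. The second shows roughly what a global optimizer does to the first; it is a sketch of the idea, not pgcc output.

```c
/* As written: the product a * b appears inside the loop, although its
 * value is the same on every iteration. */
void scale_naive(double *out, const double *in, int n, double a, double b)
{
    for (int i = 0; i < n; i++) {
        out[i] = in[i] * (a * b);
    }
}

/* After loop invariant motion: the product is computed once, before
 * the loop, and reused on every iteration. */
void scale_hoisted(double *out, const double *in, int n, double a, double b)
{
    const double factor = a * b;
    for (int i = 0; i < n; i++) {
        out[i] = in[i] * factor;
    }
}
```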
-g Generate symbolic debug information. This also sets the optimization level to zero, unless a -O switch is present on the command line. Symbolic debugging may give
confusing results if an optimization level other than zero is selected. Using -O0 the generated code will be slower than code generated at other optimization levels.
-dI Print to standard output a list of all files included by the preprocessor, including the file name and line number where the include line occurred, and the full path
of the included file.
-dM Print to standard output a list of all the macros and their values as defined in the source files, along with the file name and line number where the definitions
occur, as well as predefined and command-line macros.
-dN Like -dD, print to standard output a list of macro names, but not their values, as defined in the source files, along with the file name and line number where the definitions occur.
-E Preprocess each .c file and send the result to standard output. No compilation, assembly, or linking is performed.
-Idirectory Add directory to the compiler's search path for include files. For include files surrounded by < >, each -I directory is searched followed by the standard area. For
include files surrounded by " ", the directory containing the file containing the #include directive is searched, followed by the -I directories, followed by the standard area.
-M Generate a list of make dependences and print them to stdout. -MQ and -MT are synonyms.
-MD Generate a list of make dependences and print them to the file file.d, where file is the root name of the file under compilation.
-MM Generate a list of make dependences and print them to stdout; ignore system includes.
-MMD Generate a list of make dependences and print them to the file file.d, where file is the root name of the file under compilation. Ignore system includes.
-Mnostddef Do not predefine any macros to the preprocessor.
-Mnostdinc Do not search in the standard location for include files when those files are not found elsewhere.
-Mpreprocess Run the preprocessor on assembler source files.
-P Preprocess each file and leave the output in a file named file.i for each file named file.c.
-Uname Remove the definition of the name macro in the preprocessor.
Pass each comma-delimited option to the assembler.
-acclibs Link-time option to add the accelerator libraries to the link line.
-Bdynamic (Linux only) Passed to the linker to specify dynamic binding.
-Bstatic (Linux only) Passed to the linker to specify static binding.
-m Cause the linker to display a link map.
-Mnostartup Do not link in the usual startup routine. This routine contains the entry point for the program.
-Mnostdlib Do not link in the standard libraries when linking a program.
-Mrpath (default) -Mnorpath
The default is to add -rpath to the link line giving the directories containing the PGI shared objects. Use -Mnorpath to instruct the driver not to add any -rpath
switches to the link line.
-pgc++libs Link-time option to add the C++ runtime libraries, allowing mixed-language programming.
-pgf90libs Link-time option to add the pgf90 runtime libraries, allowing mixed-language programming.
-Rdirectory Passed to the linker; instructs the linker to hard-code the pathname directory into the search path for generated shared object files. Note that there cannot be a
space between R and directory.
-r Passed to the linker; generate a re-linkable object file.
-rpath directory Passed to the linker to add the directory to the runtime shared library search path.
-s Passed to the linker; strip symbol table information.
-shared (Linux only) Passed to the linker. Instructs the linker to generate a shared object file (dynamically linked library). Implies -fpic.
-soname name (Linux only) Passed to the linker. When creating a shared object, instructs the linker to set the internal DT_SONAME field to the specified name.
-uname Passed to the linker; generate undefined reference.
--[no-]whole-archive (Linux only) Passed to the linker. Instructs the linker to include all objects in subsequent archive files. --no-whole-archive restores the default behavior.
-Wl,argument[,argument]... Pass each comma-delimited option to the linker.
-B Allow C++-style comments in source code; these begin with the characters '//' and continue to the end of the current line. Such comments are stripped unless you
specify the -C option.
-c8x Use the C89 standard as the C source language.
-c89 Use the C89 standard as the C source language.
-c9x Use the C99 standard as the C source language.
-Msignextend (default) -Mnosignextend
Sign extend (don't sign extend) when a narrowing conversion overflows. For example, when -Msignextend is in effect and an integer containing the value 65535 is
converted to a short, the value of the short will be -1. ANSI C specifies that the result of such a conversion is implementation-defined.
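The arithmetic behind the 65535 example can be written out portably. The helper below is hypothetical and is not how pgcc implements the switch. It keeps the low 16 bits of a value and interprets them as a two's-complement signed quantity, using only operations that are fully defined in C. It assumes that int is at least 32 bits wide and models a 16-bit short.

```c
/* Hypothetical helper: keep the low 16 bits of value and interpret
 * them as a signed two's-complement number. Assumes int is at least
 * 32 bits wide. */
int sign_extend_16(unsigned int value)
{
    unsigned int low = value & 0xFFFFu;
    /* Bit 15 set means the 16-bit pattern represents a negative number. */
    return (low & 0x8000u) ? (int)low - 0x10000 : (int)low;
}
```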
-Msingle -Mnosingle (default)
Suppress (don't suppress) the ANSI-specified conversion of float to double when passing arguments to a function with no prototype in scope. -Msingle may result in
faster code when single precision is used a lot, but is non-ANSI compliant.
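The float-to-double conversion referred to here is one of C's default argument promotions. The same promotions apply to the variable arguments of a variadic function, which is the easiest place to observe them in prototyped code. In the sketch below, average is a hypothetical function: callers may pass float values, but each one arrives as a double and must be read as such.

```c
#include <stdarg.h>

/* Hypothetical example: average count floating-point arguments.
 * float arguments passed through "..." are promoted to double, so
 * they are read with va_arg(args, double). */
double average(int count, ...)
{
    va_list args;
    double total = 0.0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, double);
    }
    va_end(args);

    return count > 0 ? total / count : 0.0;
}
```

A call such as average(2, 1.0f, 2.0f) passes two float values, each converted to double at the call site.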
-Muchar Specify that the char type is unsigned by default; see -Mschar.
-Xa ANSI mode: Specify that the compiled language should conform to all ANSI features.
-Xc Conformance mode: Specify that the compiled language should conform to all ANSI features, but warnings may be produced about some extensions.
-Xs Standard mode: specify that the compiled language should conform to K&R C. This also implies -alias=traditional.
-Xt Specify that the compiled language should conform to K&R C. The compiler may produce warnings for semantics where ANSI C and K&R C conflict. This also implies
-acc Enable OpenACC pragmas and directives to explicitly parallelize regions of code for execution by accelerator devices. See the -ta flag to select target accelerators
for which to compile. The options are:
autopar (default) noautopar
Enable loop autoparallelization within parallel constructs.
routineseq noroutineseq (default)
Compile every routine for the device, as if it had a routine seq directive.
sync Ignore async clauses, and run every data transfer and kernel launch on the default sync queue.
wait nowait (default)
Wait for each compute kernel to finish.
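A minimal sketch of a loop marked with a standard OpenACC directive follows. The function name is hypothetical, and the data clauses are one reasonable choice rather than a requirement. When OpenACC directives are enabled and an accelerator target is selected, the loop is compiled as a device kernel; otherwise the pragma is ignored and the loop runs on the host. The restrict qualifiers assume C99 or later and state that x and y do not overlap.

```c
/* Hypothetical example: y = a * x + y over n elements. copyin sends x
 * to the device; copy sends y to the device and brings it back. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
#pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```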
-Kieee -Knoieee (default)
Perform (don't perform) float and double divides in conformance with the IEEE 754 standard. This is done by replacing the usual in-line divide algorithm with a
subroutine call, at the expense of performance. The default algorithm produces results that differ from the correctly rounded result by no more than 3 units in the
last place. Also, on some systems, a more accurate math library may be linked if -Kieee is used during the link step.
-ta=target Specify the type of the accelerator to which to target accelerator regions; accepted values are:
tesla Compile the accelerator regions for a CUDA-enabled NVIDIA GPU. Additional suboptions valid after -ta=tesla are:
cc30 cc35 cc60
Generate code for a device with compute capability 3.0, 3.5 or 6.0. Note that multiple compute capabilities can be specified, and one version will be
generated for each capability specified. The default is to generate a version for compute capability 3.5, and if cuda8.0 is specified, for compute
capability 6.0. Specifying cc60 also implies the cuda8.0 option.
cuda7.0 (default) cuda7.5 cuda8.0
Keep the generated CUDA GPU source files, with a .gpu suffix.
Keep the generated portable assembly files, with a .ptx suffix.
Generate code to cache global memory loads in the L1 or L2 hardware cache.
Set the maximum number of registers to use in the generated GPU code.
managed (Beta feature)
Allocate any dynamically allocated data in CUDA Unified (managed) memory. This may not be used with -ta=tesla:pinned. This option must appear in both the
compile and link lines.
pinned Allocate any dynamically allocated data in CUDA Pinned host memory. This may not be used with -ta=tesla:managed. This option must appear in both the
compile and link lines.
rdc (default) nordc
Generate (do not generate) relocatable device code for separate compilation, and invoke the device linker before the host linker at the link step.
Automatically (do not) unroll inner loops. This is enabled by default at optimization level -O3.
host Compile the accelerator regions to run on the host processor.
The default in the absence of the -ta flag is to ignore the accelerator directives and compile for the host. Multiple targets are allowed, such as -ta=tesla,host, in
which case code is generated for the Tesla GPU as well as the host for each accelerator region, which allows the executable to run on a system with or without an
attached Tesla GPU.
a.out executable output file
file.a library of object files
file.c C source file
file.i C source file after preprocessing
file.o object file
file.s assembler source file
.mypgccrc You may add custom switches or make other additions to pgcc by creating a file named .mypgccrc in your home directory.
The installation of this version of the compiler resides in $PGI/target/16.10/; other versions may coexist in $PGI/target/release/. $PGI is an environment variable that
points to the root of the compiler installation directory. If $PGI is not set, the default is /usr/pgi. The target is one of the following:
linuxpower for 64-bit OpenPOWER (little-endian) Linux targets
The compiler installation subdirectories are:
bin/ compiler and tool executables and configuration (rc) files
include/ compiler include files
include_acc/ compiler include files for OpenACC