pgfortran [ -flag ]... sourcefile...
pgfortran is the interface to the PGI Fortran compiler for OpenPOWER processors. pgfortran invokes the Fortran compiler, assembler and linker with options derived from its
command line arguments.
Suffixes of source file names indicate the type of processing to be done:
.f, .for, .ftn
fixed-format Fortran source; compile
.f90, .f95, .f03
free-format Fortran source; compile
.F, .FOR, .FTN, .fpp, .FPP
fixed-format Fortran source; preprocess, compile
.F90, .F95, .F03
free-format Fortran source; preprocess, compile
.cuf free-format CUDA Fortran source; compile
.CUF free-format CUDA Fortran source; preprocess, compile
.s assembler source; assemble
.S assembler source; preprocess, assemble
.o object file; passed to linker
.a library archive file; passed to linker
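The suffix rules above can be summarized as a small lookup. This is only a sketch: classify_suffix is a hypothetical helper, not part of pgfortran, and the "not listed" fallback is an assumption, since the text does not say how other suffixes are handled.

```shell
#!/bin/sh
# Sketch: map a file name to the processing listed above for its suffix.
# classify_suffix is a hypothetical helper, not a pgfortran feature.
classify_suffix() {
    case "$1" in
        *.f|*.for|*.ftn)              echo "fixed-format Fortran source; compile" ;;
        *.f90|*.f95|*.f03)            echo "free-format Fortran source; compile" ;;
        *.F|*.FOR|*.FTN|*.fpp|*.FPP)  echo "fixed-format Fortran source; preprocess, compile" ;;
        *.F90|*.F95|*.F03)            echo "free-format Fortran source; preprocess, compile" ;;
        *.cuf)                        echo "free-format CUDA Fortran source; compile" ;;
        *.CUF)                        echo "free-format CUDA Fortran source; preprocess, compile" ;;
        *.s)                          echo "assembler source; assemble" ;;
        *.S)                          echo "assembler source; preprocess, assemble" ;;
        *.o)                          echo "object file; passed to linker" ;;
        *.a)                          echo "library archive file; passed to linker" ;;
        *)                            echo "suffix not listed" ;;   # assumption: behavior not described above
    esac
}

classify_suffix solver.F90
classify_suffix kernels.cuf
```

Shell case patterns are case-sensitive, which matches the list: an upper-case suffix selects the preprocessing variant.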
Unless one overrides the default action using a command-line option, pgfortran deletes the intermediate preprocessor and assembler files (see the options -c, -E, -F, and
-Mkeepasm); if a single Fortran program is compiled and linked with one pgfortran command, the intermediate object file is also deleted. Linking is the last stage of the
compile process, unless you use one of the -c, -E, -F, or -S options, or unless compilation errors stop the whole process.
Options must be separate; -cs is different from -c -s. Here is a list of all options, grouped by type. More detailed explanations are in following sections.
-### -c -dryrun -drystdinc -help[=option] -Minform=level -Mkeepasm -o file -rc rcfile -S -show -silent -time -V -V<ver> -v -w
-fast -fpic -M[no]depchk -Mextract=option -Minfo=option -Minline=option -Mneginfo=option -Mnoopenmp -Mnosgimp -Msafe_lastval -M[no]unroll=option -M[no]vect=option
-M[no]zerotrip -mp[=option] -Olevel
-C -g -gopt -M[no]bounds
-acc -K[no]ieee -mcmodel=small|medium -ta=target
When source files are compiled using -g or -mp, the same option should be used when using pgfortran to link the objects.
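A minimal sketch of separate compile and link steps that share the same -g and -mp options. The file names (main.f90, solver.f90, app) are hypothetical, and the command lines are only printed so that the sketch runs without a PGI installation.

```shell
#!/bin/sh
# Sketch with hypothetical file names; the commands are printed, not executed.
FLAGS="-g -mp"    # options that should match at compile time and at link time

compile_cmd="pgfortran $FLAGS -c main.f90 solver.f90"
link_cmd="pgfortran $FLAGS -o app main.o solver.o"

printf '%s\n' "$compile_cmd" "$link_cmd"
```

Keeping the shared options in one variable makes it harder for the compile and link lines to drift apart.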
-### Display invocations of the compiler, assembler and linker, but do not execute them.
-c Skip the link step; compile and assemble only.
-dryrun Display the invocations of the compiler, assembler, and linker, but do not execute them.
-drystdinc Display the standard include directories without invoking the compiler.
-help[=option] Display the command-line options recognized by pgfortran on the standard output. pgfortran -help -otherswitch will give help about -otherswitch. The default is to list
pgfortran command line options by group; options are:
groups Print out the groups into which the switches are organized.
asm Print help for assembler command-line options.
debug Print help for debugging command-line options.
language Print help for language-specific command-line options.
linker Print help for linker options.
opt Print help for optimization command-line options.
other Print help for any other command-line options.
overall Print help for overall command-line options.
phase Print help for the known compiler phases.
prepro Print help for preprocessor command-line options.
suffix Describe the known file suffixes.
switch Print all switches in alphabetical order.
target Print help for target-specific command-line options.
variable Show the pgfortran configuration; this is the same as -show.
-Minform=level Specify the minimum level of error severity that the compiler displays during compilation.
-Mkeepasm Keep the assembly file for each source file, but continue to assemble and link the program. This is mainly for use in compiler performance analysis and debugging.
-o file Use file as the name of the executable program, rather than the default a.out. If used with -c or -S and a single input file, file is used as the name of the object
or assembler output file.
-rc rcfile Specifies the name of a pgfortran startup configuration file. If rcfile is a full pathname, then use the specified file. If rcfile is a relative pathname, use the
file name as found in the $DRIVER directory.
-S Skip the assembly and link steps. Leave the output from the compile step in a file named file.s for each file named, for instance, file.f. See also -o.
-show Produce help information describing the current pgfortran configuration.
-silent Do not print warning messages. Same as -Minform=severe.
-time Print execution times for the various steps in the compiler itself.
-V Display version messages and other information.
-V<ver> If the specified version of the compiler is installed, that version of the compiler is invoked.
-v Verbose mode; print out the command line for each tool before it is executed.
-w Do not print warning messages.
-fast Chooses generally optimal flags for the target platform. Use pgfortran -fast -help to see the equivalent switches. Note this sets the optimization level to a minimum
of 2; see -O.
-fpic (Linux only) Instructs the compiler to generate position-independent code which can be used to create shared object files (dynamically linked libraries).
-Mdepchk (default) -Mnodepchk
Assume (don't assume) that potential data dependencies exist. -Mnodepchk may result in incorrect code.
-Mextract=option Run the subprogram extraction phase to prepare for inlining. The =lib:filename option must be used with this switch to name an extract library. See -Minline for more
details on inlining.
A non-numeric option not containing a period is assumed to be the name of a subprogram to be extracted.
Specifies the name of a subprogram or subprograms to be extracted.
Specifies the name of a directory to contain the extracted subprograms; this directory will be created if it does not exist.
-Minfo=option Emit information messages during compilation. The options are:
inline Emit information about functions extracted and inlined.
intensity Emit compute intensity information about loops.
ipa Emit information about the optimizations enabled by interprocedural analysis (IPA).
loop | opt
Emit information about loop optimizations. This includes information about vectorization and loop unrolling.
lre Emit information about loop-carried redundancy elimination.
mp Emit information about OpenMP parallel regions.
par Emit information about loop parallelization.
pfo Emit profile feedback information.
time | stat
Emit compilation statistics.
unified Emit information about which routines are selected for target-specific optimizations using the PGI Unified Binary.
vect Emit information about automatic loop vectorization.
With no options, -Minfo is the same as -Minfo=accel,inline,ipa,loop,lre,mp,opt,par,unified,vect.
-Minline=option Pass options to the function inliner. The options are:
Specify an inline library created by a previous -Mextract option. Functions from the specified library are inlined. If no library is specified, functions
are extracted from a temporary library created during an extract prepass.
Specifies which functions should not be inlined.
A non-numeric option is assumed to be a function name. If name: is specified, what follows is always the name of a function.
A numeric option is assumed to be a size. Functions containing number or fewer statements are inlined. If both number and function are specified, then
functions matching the given name(s) or meeting the size requirements are inlined.
levels:number Perform number levels of inlining. The default is 1.
reshape For Fortran, the default is to not inline subprograms with array arguments if the array shape does not match the shape in the caller. This overrides the
-Mneginfo=option Instructs the compiler to produce information on why certain optimizations are not performed. Use the -Minfo flag instead.
-M[no]unroll=option
n:u Instructs the compiler to unroll u times a single-block loop that is not completely unrolled or has a non-constant loop count. If u is not supplied, the
unroller computes the number of times a candidate loop is unrolled.
m:u Instructs the compiler to unroll u times a multi-block loop that is not completely unrolled or has a non-constant loop count. If u is not supplied, the
unroller computes the number of times a candidate loop is unrolled.
-Mnounroll instructs the compiler not to unroll loops.
-Mvect [=option[,option,...]] -Mnovect (default)
Pass options to the internal vectorizer. This also sets the optimization level to a minimum of 2, the equivalent of -O; for more information see optimization levels
under -O. If no option list is specified, then the following vector optimizations are used: assoc,cachesize:c,nosimd, where c is the actual cache size of the machine.
The -Mvect options are:
altcode (default) noaltcode
Enable (disable) alternate code generation for vector loops, depending on such characteristics as array alignments and loop counts.
fuse nofuse (default)
Enable (disable) loop fusion to combine adjacent loops into a single loop.
prefetch Use prefetch instructions in loops where profitable.
simd[:128|256] nosimd (default)
Use vector SIMD instructions (SSE, AVX). The argument may be used to limit usage to 128-bit SIMD instructions. Specifying 256-bit SIMD
instructions is only possible for target processors that support AVX.
uniform nouniform (default)
Perform the same optimizations in the vectorized and residual loops. This may affect the performance of the residual loop.
These options are also supported, but are not recommended for use in new development, except by experienced users, and may be phased out in future releases:
assoc (default) noassoc
Enable (disable) certain associativity conversions that can change the results of a computation due to floating point roundoff error differences. A typical
optimization is to change the order of additions, which is mathematically correct, but can be computationally different, due to roundoff error.
cachesize:number Instructs the vectorizer, when performing cache tiling optimizations, to assume a cache size of number.
gather (default) nogather
Enable (disable) vectorization of loops with indirect array references.
idiom noidiom (default)
Enable idiom recognition; this currently has no effect.
levels:n Set maximum nest level of loops to optimize.
partial Enable partial loop vectorization via innermost loop distribution.
short noshort (default)
Enable (disable) recognition of short vector operations that arise from scalar code outside of loops or within the body of loops.
sizelimit[:number] nosizelimit (default)
-mp[=option] Interpret OpenMP directives to explicitly parallelize regions of code for execution by multiple threads on a multi-processor system. Most OpenMP directives as well as
the SGI parallelization directives are supported. See Chapters 5 and 6 of the PGI User's Guide for more information on these directives.
-Olevel Set the optimization level. If -O is not specified, then the default level is 1 if -g is not specified, and 0 if -g is specified. If a number is not supplied with -O
then the optimization level is set to 2. The optimization levels and their meanings are as follows:
-O0 Sets the optimization level to 0. A basic block is generated for each statement. No scheduling is done between statements. No global optimizations are
performed.
-O1 Sets the optimization level to 1. Scheduling within extended basic blocks is performed. No global optimizations are performed.
-O Sets the optimization level to 2, with no SIMD vectorization enabled. All level 1 optimizations are performed. In addition, traditional scalar optimizations
such as induction recognition and loop invariant motion are performed by the global optimizer.
-O2 All -O optimizations are performed. In addition, more advanced optimizations such as SIMD code generation, cache alignment and partial redundancy elimination
are enabled.
-O3 All -O1 and -O2 optimizations are performed. In addition, this level enables more aggressive code hoisting and scalar replacement optimizations that may or
may not be profitable.
-O4 All -O1, -O2, and -O3 optimizations are performed. In addition, hoisting of guarded invariant floating point expressions is enabled.
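The default rules for the optimization level can be traced with a short sketch. effective_opt_level is a hypothetical helper that looks only at -g and -O[level]; it ignores other options that raise the level (such as -fast and -Mvect), and it assumes that the last -O on the command line wins, which the text above does not state.

```shell
#!/bin/sh
# Sketch: compute the optimization level implied by -g and -O[level] alone.
# effective_opt_level is a hypothetical helper, not a pgfortran feature.
effective_opt_level() {
    level=""
    debug=no
    for arg in "$@"; do
        case "$arg" in
            -g)      debug=yes ;;
            -O)      level=2 ;;            # -O with no number means level 2
            -O[0-4]) level=${arg#-O} ;;    # explicit level; assumption: last one wins
        esac
    done
    if [ -z "$level" ]; then
        # No -O given: level 0 with -g, level 1 otherwise.
        if [ "$debug" = yes ]; then level=0; else level=1; fi
    fi
    echo "$level"
}

effective_opt_level -c prog.f90
effective_opt_level -g -c prog.f90
effective_opt_level -g -O -c prog.f90
```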
-C Add array bounds checking; the same as -Mbounds.
-g Generate symbolic debug information. This also sets the optimization level to zero, unless a -O switch is present on the command line. Symbolic debugging may give
confusing results if an optimization level other than zero is selected. With -O0, the generated code will be slower than code generated at other optimization levels.
-gopt Generate symbolic debug information, without affecting optimizations. This may give confusing results when debugging with optimizations; it is intended for use with
other tools that use the debug information.
-Mbounds -Mnobounds (default)
Add (don't add) array bound checking.
-Dname[=def] Define name to be def in the preprocessor. If def is missing, it is assumed to be empty. If the = sign is missing, then name is defined to be the string 1.
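The three cases of the definition rule (value given, empty value, no = sign) can be sketched as follows. parse_define is a hypothetical helper, and the -Dname[=def] spelling is assumed from the conventional preprocessor option form.

```shell
#!/bin/sh
# Sketch: show the macro name and value that one -D argument would define.
# parse_define is a hypothetical helper, not a pgfortran feature.
parse_define() {
    spec=${1#-D}                        # strip the leading -D
    case "$spec" in
        *=*) name=${spec%%=*}           # text before the first =
             value=${spec#*=} ;;        # text after it; may be empty
        *)   name=$spec
             value=1 ;;                 # no = sign: defined as the string 1
    esac
    printf '%s=%s\n' "$name" "$value"
}

parse_define -DDEBUG
parse_define -DLEVEL=3
parse_define -DEMPTY=
```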
-E Preprocess each source file and send the result to standard output. No compilation, assembly, or linking is performed.
-F Stop after preprocessing.
-Idirectory Add directory to the compiler's search path for include files. For include files surrounded by < >, each -I directory is searched followed by the standard area. For
include files surrounded by " ", the directory containing the file containing the #include directive is searched, followed by the -I directories, followed by the
standard area.
Do not predefine any macros to the preprocessor.
Link-time option to add the accelerator libraries to the link line.
--as-needed (Linux only; not supported by all linkers) Passed to the linker. Instructs the linker to only set the DT_NEEDED flag for subsequent shared libraries, requiring those
libraries at run time, if they are used to satisfy references. --no-as-needed restores the default behavior.
(Linux only) Passed to the linker to specify dynamic binding.
(Linux only) Passed to the linker to specify static binding.
(Linux only) Statically link in the PGI libraries, while using dynamic linking for the system libraries; implies -Mnorpath.
-Ldirectory Passed to the linker; add directory to the list of directories in which the linker searches for libraries.
-llibrary Passed to the linker; load the library liblibrary.a from the standard library directory. See also the -L option.
-m Cause the linker to display a link map.
Do not link in the usual startup routine. This routine contains the entry point for the program.
Do not link in the standard libraries when linking a program.
-Mrpath (default) -Mnorpath
The default is to add -rpath to the link line giving the directories containing the PGI shared objects. Use -Mnorpath to instruct the driver not to add any -rpath
switches to the link line.
Link-time option to add the C++ runtime libraries, allowing mixed-language programming.
-r Passed to the linker; generate a re-linkable object file.
-Rdirectory Passed to the linker; instructs the linker to hard-code the pathname directory into the search path for generated shared object files. Note that there cannot be a
space between R and directory.
Passed to the linker to add the directory to the runtime shared library search path.
-s Passed to the linker; strip symbol table information.
(Linux only) Passed to the linker. Instructs the linker to generate a shared object file (dynamically linked library). Implies -fpic.
(Linux only) Passed to the linker. When creating a shared object, instructs the linker to set the internal DT_SONAME field to the specified name.
-i2 Treat INTEGER variables as two bytes.
-i4 Treat INTEGER variables as four bytes.
-i8 Treat default INTEGER and LOGICAL variables as eight bytes. For operations involving integers, use 64 bits for computations.
Select whether to use Fortran 1995 or Fortran 2003 semantics for assignments to allocatable objects and allocatable components of derived types. Fortran 1995
semantics require the user to allocate the object or component and that an array object or component be conformant before the assignment. Fortran 2003 semantics
require the compiler to add code to check whether the object or component is allocated and whether an array object is conformant before the assignment, and to allocate
or reallocate if not.
-Mbackslash -Mnobackslash (default)
Treat (don't treat) backslash as a normal (non-escape) character in strings. -Mnobackslash causes the standard C backslash escape sequences to be recognized in quoted
strings; -Mbackslash causes the backslash to be treated like any other character.
Swap bytes from big-endian to little-endian or vice versa on input/output of unformatted Fortran data. Use of this option enables reading/writing of Fortran
unformatted data files compatible with those produced on Sun or SGI systems.
Force Cray Fortran (CF77) compatibility with respect to the listed options. Possible options include:
pointer For purposes of optimization, assume that pointer-based variables do not overlap the storage of any other variable.
-Mcuda[=option[,option,...]] Enable CUDA Fortran extensions, and link with the CUDA Fortran libraries. -Mcuda is required on the link line if there are no CUDA Fortran source files specified on
the command line. The options are:
cc30 cc35 cc60
Generate code for a device with compute capability 3.0, 3.5 or 6.0. The default is to generate code for compute capability 3.5 and, if cuda8.0
is specified, compute capability 6.0. Specifying cc60 also implies the cuda8.0 option.
cuda7.0 (default) cuda7.5 cuda8.0
Use the CUDA 7.0 (default), 7.5 or 8.0 toolkit to build the GPU code.
fastmath Use the faster (but lower precision) versions of math library routines.
flushz noflushz (default)
Enable (disable) flush-to-zero mode on the GPU.
fma nofma Generate (do not generate) fused multiply-add operations. This is enabled by default at optimization level -O3.
keepbin Keep the generated CUDA binary files, with a .bin suffix.
keepgpu Keep the generated CUDA GPU source files, with a .gpu suffix.
keepptx Keep the generated portable assembly files, with a .ptx suffix.
lineinfo nolineinfo (default)
Generate debugging line information.
Automatically unroll (do not unroll) inner loops. This is enabled by default at optimization level -O3.
Note that multiple compute capabilities can be specified, and one version will be generated for each capability specified.
-Mcudalib Add the named CUDA libraries to the link line. -Mcudalib will use the version of the library appropriate to the CUDA version being used. The libraries
-Mdclchk -Mnodclchk (default)
Require (don't require) that all variables be declared.
-Mdefaultunit -Mnodefaultunit (default)
Treat (don't treat) '*' as stdout/stdin regardless of the status of units 6/5. -Mnodefaultunit causes * to be a synonym for 5 on input and 6 on output;
-Mdefaultunit causes * to be a synonym for stdin on input and stdout on output.
-Mdlines -Mnodlines (default)
Treat (don't treat) lines beginning with D in column 1 as executable statements, ignoring the D.
Allow 132-column source lines.
-Mfixed Process Fortran source using fixed form specifications. The -Mfree options specify free form formatting. By default files with a .f or .F extension use
fixed form formatting.
-Mfree -Mfreeform -Mnofree -Mnofreeform
Process Fortran source using free form specifications. The -Mnofree and -Mfixed options specify fixed form formatting. By default files with a .f90,
.F90, .f95 or .F95 extension use freeform formatting.
-Mi4 (default) -Mnoi4
Treat (don't treat) INTEGER as INTEGER*4. -Mnoi4 treats INTEGER as INTEGER*2.
-Miomutex -Mnoiomutex (default)
Generate (don't generate) critical section calls around Fortran I/O statements.
When the link step is called, don't include the object file which calls the Fortran main program. Useful for using the pgfortran driver to link programs
with the main program written in C or C++ and one or more subroutines written in Fortran.
-Monetrip -Mnoonetrip (default)
Force (don't force) each DO loop to be iterated at least once.
-Mr8 -Mnor8 (default)
-Msave -Mnosave (default)
Assume (don't assume) that all local variables are subject to the SAVE statement. -Msave may allow many older Fortran programs to run but can greatly
-Msignextend (default) -Mnosignextend
Sign extend (don't sign extend) when a narrowing conversion overflows. For example, when -Msignextend is in effect and an integer containing the value
65535 is converted to a short, the value of the short will be -1. ANSI C specifies that the result of such conversions is undefined.
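The 65535 example can be checked with integer arithmetic. The sketch below reinterprets the low 16 bits of a value as a signed two's-complement short; to_int16 is a hypothetical helper that illustrates only the arithmetic, not the code pgfortran generates.

```shell
#!/bin/sh
# Sketch: sign-extend the low 16 bits of an integer, as in a narrowing conversion to a short.
# to_int16 is a hypothetical helper, not a pgfortran feature.
to_int16() {
    # Keep 16 bits, flip the sign bit, then subtract the sign bit's weight.
    echo $(( (($1 & 0xFFFF) ^ 0x8000) - 0x8000 ))
}

to_int16 65535
to_int16 32767
to_int16 32768
```

The three calls print -1, 32767, and -32768: values with bit 15 set come out negative, all others are unchanged.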
-Mstack_arrays -Mnostack_arrays (default)
Allocate automatic arrays on the stack (on the heap).
Flag non-ANSI-Fortran usage.
-Munixlogical -Mnounixlogical (default)
When -Munixlogical is in effect, a logical is considered to be .TRUE. if its value is non-zero and .FALSE. otherwise. When -Mnounixlogical is in
effect (the default), a logical is considered to be .TRUE. if its value is odd and .FALSE. if its value is even.
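The two truth rules differ only for non-zero even values. A small sketch, using hypothetical helper names, makes the difference concrete:

```shell
#!/bin/sh
# Sketch: the two interpretations of a logical's integer value described above.
# Both helpers are hypothetical; they return shell success for .TRUE.
is_true_unixlogical()   { [ "$1" -ne 0 ]; }              # non-zero is .TRUE.
is_true_nounixlogical() { [ $(( $1 & 1 )) -eq 1 ]; }     # odd is .TRUE.

for v in 0 1 2; do
    if is_true_unixlogical "$v";   then u=.TRUE.; else u=.FALSE.; fi
    if is_true_nounixlogical "$v"; then n=.TRUE.; else n=.FALSE.; fi
    echo "value $v: -Munixlogical $u, -Mnounixlogical $n"
done
```

For the value 2 the two modes disagree, which matters when logicals are exchanged with code that stores other non-zero values for true.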
-Mupcase -Mnoupcase (default)
Preserve (don't preserve) case in names. -Mnoupcase causes all names to be converted to lower case. Note that, if -Mupcase is used, then variable name
'X' is different than variable name 'x', and keywords must be in lower case.
Save/search for module files in directory
-r4 Interpret DOUBLE PRECISION variables as REAL.
-r8 Interpret REAL variables as DOUBLE PRECISION. Equivalent to using the options -Mr8 and -Mr8intrinsics.
-acc Enable OpenACC pragmas and directives to explicitly parallelize regions of code for execution by accelerator devices. See the -ta flag to select target
accelerators for which to compile. The options are:
autopar (default) noautopar
Enable loop autoparallelization within parallel constructs.
routineseq noroutineseq (default)
Compile every routine for the device, as if it had a routine seq directive.
sync Ignore async clauses, and run every data transfer and kernel launch on the default sync queue.
wait nowait (default)
Wait for each compute kernel to finish.
-Kieee -Knoieee (default)
Perform (don't perform) real and double precision divides in conformance with the IEEE 754 standard. This is done by replacing the usual in-line divide
algorithm with a subroutine call, at the expense of performance. The default algorithm produces results that differ from the correctly rounded result by
no more than 3 units in the last place. Also, on some systems, a more accurate math library may be linked if -Kieee is used during the link step.
-mcmodel=small|medium Use the memory model that limits objects to less than 2GB (small) or allows data sections to be larger than 2GB (medium).
Enable the fast math library, which includes faster, but lower precision, implementations of certain math and intrinsic functions.
flushz noflushz (default)
Enable (disable) flush-to-zero mode on the GPU.
Generate (do not generate) fused multiply-add operations. This is enabled by default at optimization level -O3.
Keep the generated CUDA binary, with a .bin suffix.
Keep the generated CUDA GPU source files, with a .gpu suffix.
Keep the generated portable assembly files, with a .ptx suffix.
Generate code to cache global memory loads in the L1 or L2 hardware cache.
Set the maximum number of registers to use in the generated GPU code.
managed (Beta feature)
Allocate any dynamically allocated data in CUDA Unified (managed) memory. This may not be used with -ta=tesla:pinned. This option must
appear in both the compile and link lines.
pinned Allocate any dynamically allocated data in CUDA Pinned host memory. This may not be used with -ta=tesla:managed. This option must appear in
both the compile and link lines.
rdc (default) nordc
Generate (do not generate) relocatable device code for separate compilation, and invoke the device linker before the host linker at the link
step.
Automatically unroll (do not unroll) inner loops. This is enabled by default at optimization level -O3.
host Compile the accelerator regions to run on the host processor.
The default in the absence of the -ta flag is to ignore the accelerator directives and compile for the host. Multiple targets are allowed, such as
-ta=tesla,host, in which case code is generated for the Tesla GPU as well as the host for each accelerator region, which allows the executable to run on
a system with or without an attached Tesla GPU.
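A sketch of command lines that use these target options. The file names (saxpy.f90, saxpy.o, saxpy) are hypothetical, the commands are only printed so that the sketch runs without a PGI installation or a GPU, and repeating -acc on the link line is a precaution rather than something the text above requires.

```shell
#!/bin/sh
# Sketch with hypothetical file names; the commands are printed, not executed.

# One executable for systems with or without an attached Tesla GPU.
multi_target_cmd="pgfortran -acc -ta=tesla,host -o saxpy saxpy.f90"

# Managed memory: the same -ta suboption on both the compile and the link line.
TA="-ta=tesla:managed"
compile_cmd="pgfortran -acc $TA -c saxpy.f90"
link_cmd="pgfortran -acc $TA -o saxpy saxpy.o"

printf '%s\n' "$multi_target_cmd" "$compile_cmd" "$link_cmd"
```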
a.out executable output file
file.a library of object files
file.f fixed-format Fortran source file
file.F fixed-format Fortran source file that requires preprocessing
file.f90 free-format Fortran source file
The installation of this version of the compiler resides in $PGI/target/16.10/; other versions may coexist in $PGI/target/release/. $PGI is an environment
variable that points to the root of the compiler installation directory. If $PGI is not set, the default is /usr/pgi. The target is one of the following:
linuxpower for 64-bit OpenPOWER (little-endian) Linux targets
The compiler installation subdirectories are:
bin/ compiler and tool executables and configuration (rc) files
include/ compiler include files
include_acc/ compiler include files for OpenACC
include_man/ compiler include files for OpenACC using managed memory
lib/ libraries and object files
man/ man pages
share/ LLVM sub-directories
pgcc (1), pgc++ (1)
The compiler produces information and error messages as it translates the input program. The linker and assembler may issue their own error messages.
November 2016 pgfortran(1)