IBM binary incompatibility between GCC and XL compilers for vector data types
For -mcompat-align-parm: older versions of GCC (prior to 4.9.0) incorrectly did not align a structure parameter on a 128-bit boundary when that structure contained a member requiring 128-bit alignment. This is corrected in more recent versions of GCC. The option may be used to generate code that is compatible with functions compiled with older versions of GCC.
The -mno-compat-align-parm option is the default. The default for those is as specified in the relevant ABI. The default value of these options is determined when configuring GCC.
The -mcpu options automatically enable or disable a set of related options. If both -mcpu and -mtune are specified, the code generated uses the architecture and registers set by -mcpu, but the scheduling parameters set by -mtune.
With -mcmodel=small, the TOC is limited to 64k. With -mcmodel=medium, the TOC and other static data may be up to a total of 4G in size; this is the default for 64-bit Linux. With -mcmodel=large, the TOC may be up to 4G in size, and other data and code is limited only by the 64-bit address space.
This is the default when targeting a big-endian platform.
This is the default when targeting a little-endian platform. This option is currently ignored when targeting a big-endian platform.
An example of a Cell microcode instruction is a variable shift.
Use -misel and -mno-isel instead.
By default the port uses LRA.
Use -mspe and -mno-spe instead.
Also enable the use of built-in functions that allow more direct access to the vector instructions.
The -mquad-memory option requires use of 64-bit mode. The -mquad-memory-atomic option requires use of 64-bit mode.
This option is currently only available on the MPCx.
The 32-bit environment sets int, long and pointer to 32 bits and generates code that runs on any PowerPC variant. The 64-bit environment sets int to 32 bits and long and pointer to 64 bits, and generates code for PowerPC64, as for -mpowerpc64.
The -mfull-toc option is selected by default. However, only 16,384 entries are available in the TOC.
Specifying -maix64 implies -mpowerpc64, while -maix32 disables the 64-bit ABI and implies -mno-powerpc64. GCC defaults to -maix32.
Pass floating-point arguments to prototyped functions beyond the register save area (RSA) on the stack in addition to argument FPRs.
Do not assume that the most significant double in a 128-bit long double value is properly rounded when comparing values and converting to double. Use XL symbol names for long double support routines.
Link an application written to use message passing with special startup code to enable the application to run. The Parallel Environment does not support threads, so the -mpe option and the -pthread option are incompatible.
Software floating-point emulation is provided if you use the -msoft-float option, and pass the option to GCC when linking. Do not use -mmultiple on little-endian PowerPC systems, since those instructions do not work when the processor is in little-endian mode. Do not use -mstring on little-endian PowerPC systems, since those instructions do not work when the processor is in little-endian mode. These instructions are generated by default.
If you use -mno-update, there is a small window between the time that the stack pointer is updated and the address of the previous frame is stored, which means code that walks the stack frame across interrupts or signals may get corrupted data.
These instructions can incur a performance penalty on Power6 processors in certain situations, such as when stepping through large arrays that cross a 16M boundary.
This option is enabled by default when targeting Power6 and disabled otherwise. These instructions are generated by default if hardware floating point is used. These instructions are generated by default when targeting those processors. This instruction is generated by default when targeting those processors.
For example, by default a structure containing nothing but 8 unsigned bit-fields of length 1 is aligned to a 4-byte boundary and has a size of 4 bytes. By using -mno-bit-align, the structure is aligned to a 1-byte boundary and is 1 byte in size.
Generate code that allows (does not allow) a static executable to be relocated to a different address at run time. A simple embedded PowerPC system loader should relocate the entire contents of.
For this to work, all objects linked together must be compiled with -mrelocatable or -mrelocatable-lib. Like -mrelocatable, -mrelocatable-lib generates a. Objects compiled with -mrelocatable-lib may be linked with objects compiled with any combination of the -mrelocatable options.
The -mlittle-endian option is the same as -mlittle. The -mbig-endian option is the same as -mbig. On Darwin and Mac OS X systems, compile code so that it is not relocatable, but that its external references are relocatable.
The resulting code is suitable for applications, but not shared libraries. Treat the register used for PIC addressing as read-only, rather than loading it in the prologue for each function. The runtime system is responsible for initializing this register with an appropriate value before execution begins. This option controls the priority that is assigned to dispatch-slot restricted instructions during the second scheduling pass.
The -msched-costly-dep option controls which dependences are considered costly by the target during instruction scheduling.
The -minsert-sched-nops option controls which NOP insertion scheme is used during the second scheduling pass. The argument scheme takes one of the following values:
Insert NOPs to force costly dependent insns into separate groups. Insert exactly as many NOPs as needed to force an insn to a new group, according to the estimated processor grouping.
Insert number NOPs to force an insn to a new group.
Select the type of traceback table.
Extend the current ABI with a particular extension, or remove such extension.
Changing long double to IBM extended precision is not likely to work if your system defaults to using IEEE extended-precision long double. If you change the long double type from IEEE extended-precision, the compiler will issue a warning unless you use the -Wno-psabi option.
Changing long double to IEEE extended precision is not likely to work if your system defaults to using IBM extended-precision long double.
If you change the long double type from IBM extended-precision, the compiler will issue a warning unless you use the -Wno-psabi option. Overriding the default ABI requires special system support and is likely to fail in spectacular ways.
Otherwise, the compiler must insert an instruction before every non-prototyped call to set or clear bit 6 of the condition code register (CR) to indicate whether floating-point values are passed in the floating-point registers in case the function takes variable arguments.
With -mprototype, only calls to prototyped variable argument functions set or clear the bit.
On embedded PowerPC systems, assume that the startup module is called sim-crt0. On embedded PowerPC systems, assume that the startup module is called crt0.
Selecting -mno-eabi means that the stack is aligned to a 16-byte boundary, no EABI initialization function is called from main, and the -msdata option only uses r13 to point to a single small data area.
Put small initialized non-const global and static data in the. Put small uninitialized global and static data in the.
Put small uninitialized global data in the. Do not use register r13 to address small data, however. This is the default behavior unless other -msdata options are used.
On embedded PowerPC systems, put all initialized global and static data in the.
Inline all block moves (such as calls to memcpy or structure copies) less than or equal to num bytes.
The minimum value for num is 32 bytes on 32-bit targets and 64 bytes on 64-bit targets. The default value is target-specific.
Generate non-looping inline code for all block compares (such as calls to memcmp or structure compares) less than or equal to num bytes. If num is 0, all inline expansion (non-loop and loop) of block compare is disabled.
Generate an inline expansion using loop code for all block compares that are less than or equal to num bytes, but greater than the limit for non-loop inline block compare expansion. If the block length is not constant, at most num bytes will be compared before memcmp is called to compare the remainder of the block.
Generate at most num pairs of load instructions to compare the string inline. If the difference or end of string is not found at the end of the inline compare, a call to strcmp or strncmp will take care of the rest of the comparison. The default is 8 pairs of loads, which will compare 64 bytes on a 64-bit target and 32 bytes on a 32-bit target.
By default, num is 8. The -G num switch is also passed to the linker.
All modules should be compiled with the same -G num value.
By default assume that all calls are far away so that a longer and more expensive calling sequence is required. This is required for calls farther than 32 megabytes (33,554,432 bytes) from the current location. A short call is generated if the compiler knows the call cannot be that far away. This setting can be overridden by the shortcall function attribute, or by #pragma longcall (0).
Some linkers are capable of detecting out-of-range calls and generating glue code on the fly. On these systems, long calls are unnecessary and generate slower code. The two target addresses represent the callee and the branch island.
The branch island is appended to the body of the calling function; it computes the full 32-bit address of the callee and jumps to it. On Mach-O (Darwin) systems, this option directs the compiler to emit the glue for every direct call, and the Darwin linker decides whether to use or discard it. In the future, GCC may ignore all longcall specifications when the linker is known to generate glue.
The relocation allows the linker to reliably associate a function call with argument setup instructions for TLS optimization, which in turn allows GCC to better schedule the sequence.
The -mrecip option enables use of the reciprocal estimate and reciprocal square root estimate instructions with additional Newton-Raphson steps to increase precision, instead of doing a divide or square root and divide, for floating-point arguments.
You should use the -ffast-math option when using -mrecip (or at least -funsafe-math-optimizations, -ffinite-math-only, -freciprocal-math and -fno-trapping-math). Note that while the throughput of the sequence is generally higher than the throughput of the non-reciprocal instruction, the precision of the sequence can be decreased by up to 2 ulp.
This option controls which reciprocal estimate instructions may be used.
Enable the reciprocal square root approximation instructions for both single and double precision.
Assume (do not assume) that the reciprocal estimate instructions provide higher-precision estimates than is mandated by the PowerPC ABI. The double-precision square root estimate instructions are not generated by default on low-precision machines, since they do not provide an estimate that converges after three steps.
Specifies the ABI type to use for vectorizing intrinsics using an external library. GCC currently emits calls to acosd2, acosf4, acoshd2, acoshf4, asind2, asinf4, asinhd2, asinhf4, atan2d2, atan2f4, atand2, atanf4, atanhd2, atanhf4, cbrtd2, cbrtf4, cosd2, cosf4, coshd2, coshf4, erfcd2, erfcf4, erfd2, erff4, exp2d2, exp2f4, expd2, expf4, expm1d2, expm1f4, hypotd2, hypotf4, lgammad2, lgammaf4, log10d2, log10f4, log1pd2, log1pf4, log2d2, log2f4, logd2, logf4, powd2, powf4, sind2, sinf4, sinhd2, sinhf4, sqrtd2, sqrtf4, tand2, tanf4, tanhd2, and tanhf4 when generating code for power7.
Both -ftree-vectorize and -funsafe-math-optimizations must also be enabled. The MASS libraries must be specified at link time.
Generate (do not generate) the friz instruction when the -funsafe-math-optimizations option is used to optimize rounding of floating-point values to 64-bit integer and back to floating point. The friz instruction does not return the same value if the floating-point number is too large to fit in an integer.
Generate (do not generate) code to load up the static chain register (r11) when calling through a pointer on AIX and 64-bit Linux systems where a function pointer points to a 3-word descriptor giving the function address, TOC value to be loaded in register r2, and static chain value to be loaded in register r11. The -mpointers-to-nested-functions option is on by default. You cannot call through pointers to nested functions or pointers to functions compiled in other languages that use the static chain if you use -mno-pointers-to-nested-functions.
Generate (do not generate) code to save the TOC value in the reserved stack location in the function prologue if the function calls through a pointer on AIX and 64-bit Linux systems.
If the TOC value is not saved in the prologue, it is saved just before the call through the pointer. The -mno-save-toc-indirect option is the default.
Generate (do not generate) code to pass structure parameters with a maximum alignment of 64 bits, for compatibility with older versions of GCC.