A bit of explanation regarding the quiz in the last post
Pascal Cuoq - 20th Jan 2012
There are only positive constants in C, as per section 6.4.4 in the C99 standard:
integer-constant:
    decimal-constant integer-suffix(opt)
    octal-constant integer-suffix(opt)
    hexadecimal-constant integer-suffix(opt)
decimal-constant:
    nonzero-digit
    decimal-constant digit
octal-constant:
    0
    octal-constant octal-digit
hexadecimal-constant:
    hexadecimal-prefix hexadecimal-digit
    hexadecimal-constant hexadecimal-digit
...
The minus sign is not part of the constant according to the grammar.
The expression -0x80000000 is parsed and typed as the application of the unary negation operator - to the constant 0x80000000. The table in section 6.4.4.1 of the standard shows that, when typing hexadecimal constants, unsigned types must be tried. The list of types to try to fit the hexadecimal constant in is, in order, int, unsigned int, long, unsigned long, long long, unsigned long long.
For many architectures (those where int is 32 bits wide), the first type in the list that fits 0x80000000 is unsigned int. Unary negation, when applied to an unsigned int, returns an unsigned int and wraps around modulo 2^32, so that -0x80000000 has type unsigned int and value 2^32 - 0x80000000, that is, 0x80000000.
Following the same reasoning as above, and reading from the "Decimal Constant" column of the table in the C99 standard, the types to try are int, long and long long. This might lead you to expect the expression -2147483648 to have the value -2147483648 when compiled with GCC. Instead, when compiling this expression on a 32-bit architecture, GCC emits a warning and the expression has the value 2147483648. The warning is:
t.c:6: warning: this decimal constant is unsigned only in ISO C90
Indeed, there is a subtlety here for 32-bit architectures. GCC, as of this writing, follows the C90 standard by default. It's not so much that the spirit of the table in section 6.4.4.1 changed between C90 and C99: the spirit remained the same, with unsigned types being tried for octal and hexadecimal constants and mostly only signed types being tried for decimal constants. Here is the relevant snippet from the C90 standard:
The type of an integer constant is the first of the corresponding list in which its value can be represented. Unsuffixed decimal: int, long int, unsigned long int;
The difference really stems from the fact that C90 did not have a long long type, so the list of types to try for a decimal constant ended in unsigned long, since that type contained values that did not fit in any other type. On a 32-bit architecture where long and int are both 32 bits wide, 2147483648 fits neither int nor long and so ends up being typed as unsigned long. Note that on an architecture where long is 64 bits wide, 2147483648 and -2147483648 are typed as long.
Finally, when GCC is told with option -std=c99 to apply C99 rules on an architecture where long is 32 bits wide, 2147483648 is typed as long long, so that the expression -2147483648 has type long long and value -2147483648.
This should explain the results obtained when compiling the three programs from the last post with GCC on 32-bit and on 64-bit architectures.