Discussion:
[patch][pr22217] Use the most recent decl for mangling
(too old to reply)
Rafael Espíndola
2015-01-20 15:29:00 UTC
Permalink
Currently clang will produce a @foo when given

int foo;
extern int foo __asm__("bar");

The attached patch makes it produce a @bar, which is what gcc produces.

I am confused by the output changes in the microsoft abi tests.
Hopefully someone more familiar with it can comment on why using a
more recent decl causes problems and has an idea of what should be
changed to accommodate it.

Cheers,
Rafael
Rafael Espíndola
2015-01-20 15:30:14 UTC
Permalink
Sorry, I attached an older version of the patch. This one has a
slightly simpler test.

On 20 January 2015 at 10:29, Rafael Espíndola
Post by Rafael Espíndola
int foo;
extern int foo __asm__("bar");
I am confused by the output changes in the microsoft abi tests.
Hopefully someone more familiar with it can comment on why using a
more recent decl causes problems and has an idea of what should be
changed to accommodate it.
Cheers,
Rafael
David Majnemer
2015-01-20 17:42:12 UTC
Permalink
I don't think this approach would handle things like:

int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __asm__("pr22217_bar");

With your patch, pr22217_foo gets emitted.
With gcc, pr22217_bar gets emitted.

On Tue, Jan 20, 2015 at 7:30 AM, Rafael Espíndola <
Post by Rafael Espíndola
Sorry, I attached an older version of the patch. This one has a
slightly simpler test.
On 20 January 2015 at 10:29, Rafael Espíndola
Post by Rafael Espíndola
int foo;
extern int foo __asm__("bar");
I am confused by the output changes in the microsoft abi tests.
Hopefully someone more familiar with it can comment on why using a
more recent decl causes problems and has an idea of what should be
changed to accommodate it.
Cheers,
Rafael
David Majnemer
2015-01-20 18:16:21 UTC
Permalink
The following corrects the MS ABI test output:

--- a/lib/AST/MicrosoftMangle.cpp
+++ b/lib/AST/MicrosoftMangle.cpp
@@ -2329,6 +2329,7 @@ void
MicrosoftMangleContextImpl::mangleTypeName(QualType T, raw_ostream &Out) {
void MicrosoftMangleContextImpl::mangleCXXCtor(const CXXConstructorDecl *D,
CXXCtorType Type,
raw_ostream &Out) {
+ D = cast<CXXConstructorDecl>(D->getFirstDecl());
MicrosoftCXXNameMangler mangler(*this, Out);
mangler.mangle(D);
}
@@ -2336,6 +2337,7 @@ void MicrosoftMangleContextImpl::mangleCXXCtor(const
CXXConstructorDecl *D,
void MicrosoftMangleContextImpl::mangleCXXDtor(const CXXDestructorDecl *D,
CXXDtorType Type,
raw_ostream &Out) {
+ D = cast<CXXDestructorDecl>(D->getFirstDecl());
MicrosoftCXXNameMangler mangler(*this, Out, D, Type);
mangler.mangle(D);
}
Post by David Majnemer
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __asm__("pr22217_bar");
With your patch, pr22217_foo gets emitted.
With gcc, pr22217_bar gets emitted.
On Tue, Jan 20, 2015 at 7:30 AM, Rafael Espíndola <
Post by Rafael Espíndola
Sorry, I attached an older version of the patch. This one has a
slightly simpler test.
On 20 January 2015 at 10:29, Rafael Espíndola
Post by Rafael Espíndola
int foo;
extern int foo __asm__("bar");
I am confused by the output changes in the microsoft abi tests.
Hopefully someone more familiar with it can comment on why using a
more recent decl causes problems and has an idea of what should be
changed to accommodate it.
Cheers,
Rafael
Rafael Espíndola
2015-01-21 14:30:17 UTC
Permalink
Post by David Majnemer
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __asm__("pr22217_bar");
With your patch, pr22217_foo gets emitted.
With gcc, pr22217_bar gets emitted.
Gosh, this part is horrible. I can see two options:

* Check in CodeGenModule::getMangledName than the name we get is the
same as the cache and error if not. Very expensive. Just rejecting
__asm__ after the name has been computed might have the same effect
and be cheaper.

* Delay all of codegen. Basically just return false from
CodeGenModule::MayBeEmittedEagerly and try to avoid the order change
in codegen. The more I think about it, the more this looks desirable.
It should also fix pr16187 and allow us to finally cache visibility
computations.

Cheers,
Rafael
David Majnemer
2015-01-21 18:01:25 UTC
Permalink
I think clang's delayed codegen approach is a design flaw, I'd be happy to
see it gone.
Post by Rafael Espíndola
Post by David Majnemer
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __asm__("pr22217_bar");
With your patch, pr22217_foo gets emitted.
With gcc, pr22217_bar gets emitted.
* Check in CodeGenModule::getMangledName than the name we get is the
same as the cache and error if not. Very expensive. Just rejecting
__asm__ after the name has been computed might have the same effect
and be cheaper.
* Delay all of codegen. Basically just return false from
CodeGenModule::MayBeEmittedEagerly and try to avoid the order change
in codegen. The more I think about it, the more this looks desirable.
It should also fix pr16187 and allow us to finally cache visibility
computations.
Cheers,
Rafael
David Majnemer
2015-01-22 01:19:55 UTC
Permalink
Post by David Majnemer
I think clang's delayed codegen approach is a design flaw, I'd be happy to
see it gone.
To be clear, I am in favor of delaying everything and emitting only at the
end.
Post by David Majnemer
On Wednesday, January 21, 2015, Rafael Espíndola <
Post by Rafael Espíndola
Post by David Majnemer
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __asm__("pr22217_bar");
With your patch, pr22217_foo gets emitted.
With gcc, pr22217_bar gets emitted.
* Check in CodeGenModule::getMangledName than the name we get is the
same as the cache and error if not. Very expensive. Just rejecting
__asm__ after the name has been computed might have the same effect
and be cheaper.
* Delay all of codegen. Basically just return false from
CodeGenModule::MayBeEmittedEagerly and try to avoid the order change
in codegen. The more I think about it, the more this looks desirable.
It should also fix pr16187 and allow us to finally cache visibility
computations.
Cheers,
Rafael
Richard Smith
2015-01-21 21:30:03 UTC
Permalink
On Tue, Jan 20, 2015 at 9:42 AM, David Majnemer
Post by David Majnemer
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __asm__("pr22217_bar");
I think we should either reject this code, or ignore the __asm__ with
a warning. __asm__ is effectively an attribute, and we don't support
adding attributes after we've seen the definition of an entity, for
exactly this reason. For instance:

<stdin>:3:39: warning: attribute declaration must precede definition
[-Wignored-attributes]
extern int pr22217_foo __attribute__((section("foobar")));
^
<stdin>:1:5: note: previous definition is here
int pr22217_foo;
^
Post by David Majnemer
With your patch, pr22217_foo gets emitted.
With gcc, pr22217_bar gets emitted.
On Tue, Jan 20, 2015 at 7:30 AM, Rafael Espíndola
Post by Rafael Espíndola
Sorry, I attached an older version of the patch. This one has a
slightly simpler test.
On 20 January 2015 at 10:29, Rafael Espíndola
Post by Rafael Espíndola
int foo;
extern int foo __asm__("bar");
I am confused by the output changes in the microsoft abi tests.
Hopefully someone more familiar with it can comment on why using a
more recent decl causes problems and has an idea of what should be
changed to accommodate it.
Cheers,
Rafael
_______________________________________________
cfe-commits mailing list
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
Rafael Espíndola
2015-01-21 21:52:31 UTC
Permalink
Post by Richard Smith
On Tue, Jan 20, 2015 at 9:42 AM, David Majnemer
Post by David Majnemer
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __asm__("pr22217_bar");
I think we should either reject this code, or ignore the __asm__ with
a warning. __asm__ is effectively an attribute, and we don't support
adding attributes after we've seen the definition of an entity, for
In here int is a tentative definition (C code).

It was also first noticed when building glibc, so it is both support
by gcc and used in at least one location.

I can add an error/warning for now. To be clear, you want to error/warn on

int pr22217_foo;
extern int pr22217_foo __asm__("pr22217_bar");

Even when pr22217_foo is not used in between the two lines?

Cheers,
Rafael
Richard Smith
2015-01-21 22:24:52 UTC
Permalink
On Wed, Jan 21, 2015 at 1:52 PM, Rafael Espíndola
Post by Rafael Espíndola
Post by Richard Smith
On Tue, Jan 20, 2015 at 9:42 AM, David Majnemer
Post by David Majnemer
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __asm__("pr22217_bar");
I think we should either reject this code, or ignore the __asm__ with
a warning. __asm__ is effectively an attribute, and we don't support
adding attributes after we've seen the definition of an entity, for
In here int is a tentative definition (C code).
It is also a tentative definition in the corresponding case with
__attribute__((section)). In that case, both GCC and Clang silently
ignore the attribute.
Post by Rafael Espíndola
It was also first noticed when building glibc, so it is both support
by gcc and used in at least one location.
I can add an error/warning for now. To be clear, you want to error/warn on
int pr22217_foo;
extern int pr22217_foo __asm__("pr22217_bar");
Even when pr22217_foo is not used in between the two lines?
Yes, for now. If we find that breaks too much stuff (or the glibc
folks aren't prepared to change their code) then we can reconsider
whether we want to support this for tentative definitions. (The other
option is what I think you were suggesting before -- defer emitting
tentative definitions more aggressively. I'm OK with doing that too,
since that better matches the model espoused by the C standard, but I
think we should at least reject this for a non-tentative definition
followed by a declaration with an asm label.)
Rafael Espíndola
2015-01-22 20:49:40 UTC
Permalink
Post by Richard Smith
It is also a tentative definition in the corresponding case with
__attribute__((section)). In that case, both GCC and Clang silently
ignore the attribute.
That is not what I am seeing with gcc. Given

int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));

It produces

.section zed,"aw",@progbits
.align 4
.type pr22217_foo, @object
.size pr22217_foo, 4
pr22217_foo:

I am testing with gcc 4.9.2.

Cheers,
Rafael
Richard Smith
2015-01-23 00:36:12 UTC
Permalink
On Thu, Jan 22, 2015 at 12:49 PM, Rafael Espíndola
Post by Rafael Espíndola
Post by Richard Smith
It is also a tentative definition in the corresponding case with
__attribute__((section)). In that case, both GCC and Clang silently
ignore the attribute.
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
It produces
.align 4
.size pr22217_foo, 4
I am testing with gcc 4.9.2.
gcc SVN from ~6 months ago ("GCC: (GNU) 4.10.0 20140708
(experimental)") produces:

.comm pr22217_foo,4,4

in C, and

.globl pr22217_foo
.bss
.align 4
.type pr22217_foo, @object
.size pr22217_foo, 4
pr22217_foo:
.zero 4

in C++.
Richard Smith
2015-01-23 00:38:40 UTC
Permalink
Post by Richard Smith
On Thu, Jan 22, 2015 at 12:49 PM, Rafael Espíndola
Post by Rafael Espíndola
Post by Richard Smith
It is also a tentative definition in the corresponding case with
__attribute__((section)). In that case, both GCC and Clang silently
ignore the attribute.
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
It produces
.align 4
.size pr22217_foo, 4
I am testing with gcc 4.9.2.
gcc SVN from ~6 months ago ("GCC: (GNU) 4.10.0 20140708
.comm pr22217_foo,4,4
in C, and
.globl pr22217_foo
.bss
.align 4
.size pr22217_foo, 4
.zero 4
in C++.
Testing on gcc.godbolt.org shows that GCC 4.5-4.9 all do include the
section, so perhaps this is just a semi-recent regression?
Rafael Espíndola
2015-01-23 14:56:52 UTC
Permalink
Post by Richard Smith
Testing on gcc.godbolt.org shows that GCC 4.5-4.9 all do include the
section, so perhaps this is just a semi-recent regression?
It was a regression and it was explicitly fixed:

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=43096b526a9f23008b9769372f11475ae63487bc

Cheers,
Rafael
Rafael Espíndola
2015-01-23 00:43:49 UTC
Permalink
On 22 January 2015 at 15:49, Rafael Espíndola
Post by Rafael Espíndola
Post by Richard Smith
It is also a tentative definition in the corresponding case with
__attribute__((section)). In that case, both GCC and Clang silently
ignore the attribute.
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
The above works even in c++ and with foo being initialized:

.section zed,"aw",@progbits
.align 4
.type pr22217_foo, @object
.size pr22217_foo, 4
pr22217_foo:
.long 42

I tested gcc 4.9.2 and current trunk (r220018).

Cheers,
Rafael
Rafael Espíndola
2015-01-23 00:52:39 UTC
Permalink
Sent the email a bit early.
Post by Rafael Espíndola
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
.align 4
.size pr22217_foo, 4
.long 42
I tested gcc 4.9.2 and current trunk (r220018).
Which brings the question: should we delay more of codegen?

The argument I have seen for not doing so is that we get better
locality. The advantages that I know of of delaying *all* of codegen
until EOF are:

* We can get the above cases to match gcc.
* We can support all cases of "typedef struct { ...} foo;"
* We can compute the correct visibility in the case of pr16187.
* We can compute visibility of types only once.

Cheers,
Rafael
John McCall
2015-01-23 08:25:49 UTC
Permalink
Post by Rafael Espíndola
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason to allow a declaration following a definition (even a tentative definition) to add a section attribute. We should not be afraid to reject stupidly-written code when it abuses language extensions, even when they’re not “our” extensions.

There are fair arguments against our current emit-as-you-go IRGen model, but allowing us to more perfectly emulate GCC’s bugs is not one of them. Nor is there a need to exactly copy GCC’s visibility model in every conceivable case. One very nice incidental advantage of emit-as-you-go is that it encourages us to ensure that language decisions are made locally by the declarations involved, which — beyond simply being better language design in and of itself — also means that they’re not susceptible to random breakage by differences in module import.

John.
Rafael Espíndola
2015-01-23 13:43:06 UTC
Permalink
Post by John McCall
On Jan 22, 2015, at 4:52 PM, Rafael Espíndola <
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason to
allow a declaration following a definition (even a tentative definition) to
add a section attribute. We should not be afraid to reject
stupidly-written code when it abuses language extensions, even when they’re
not “our” extensions.
Not sure if that is viable fight on our side, but we can try making it an
error and see.
Post by John McCall
There are fair arguments against our current emit-as-you-go IRGen model,
but allowing us to more perfectly emulate GCC’s bugs is not one of them.
Nor is there a need to exactly copy GCC’s visibility model in every
conceivable case.
So, the case in pr16187 is one where I think there is no question that our
answer is worse than gcc's. The *same* type shows up as both hidden and
default. If this was a new language we were designing, it is hard to
imagine a worse compromise.

The reason we got there is that we tried and failed to enforce a stricter
models. First one that says that we can compute the type early and then one
that says we can compute it on first "use". Both have failed to build real
world software, which IMHO is a fundamental requirement for clang.

The case of "typedef struct {...} foo;" doesn't look as widespread, but it
is unfortunately a core part of the language (not a gcc extension) that we
cannot currently implement.
Post by John McCall
One very nice incidental advantage of emit-as-you-go is that it
encourages us to ensure that language decisions are made locally by the
declarations involved, which — beyond simply being better language design
in and of itself — also means that they’re not susceptible to random
breakage by differences in module import.
An interesting language design advice, but given the requirement that clang
continues to build real c++ code, it is important to ask if we are not
pushing ourselves to solutions that are worse than what gcc does (which I
am sure is the case in pr16187).

Cheers,
Rafael
John McCall
2015-01-23 18:43:54 UTC
Permalink
Post by Rafael Espíndola
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason to allow a declaration following a definition (even a tentative definition) to add a section attribute. We should not be afraid to reject stupidly-written code when it abuses language extensions, even when they’re not “our” extensions.
Not sure if that is viable fight on our side, but we can try making it an error and see.
How is it not a viable fight? Is the section attribute coming from a completely different place? Or are you suggesting that it is never viable to tell people that they ought to fix their code, no matter how unnecessarily perverse it is? A section should be an intrinsic part of an definition, saying that you can’t define the same thing in multiple inconsistent ways is not even slightly unreasonable.
There are fair arguments against our current emit-as-you-go IRGen model, but allowing us to more perfectly emulate GCC’s bugs is not one of them. Nor is there a need to exactly copy GCC’s visibility model in every conceivable case.
So, the case in pr16187 is one where I think there is no question that our answer is worse than gcc's. The *same* type shows up as both hidden and default. If this was a new language we were designing, it is hard to imagine a worse compromise.
The reason we got there is that we tried and failed to enforce a stricter models. First one that says that we can compute the type early and then one that says we can compute it on first "use". Both have failed to build real world software, which IMHO is a fundamental requirement for clang.
“Build everything GCC can without modification” has never been a fundamental requirement for clang, though, and that appears to be your standard.

PR16187 is an example that I would feel fairly comfortable diagnosing. You could certainly construct a more challenging example, though.
The case of "typedef struct {...} foo;" doesn't look as widespread, but it is unfortunately a core part of the language (not a gcc extension) that we cannot currently implement.
You’ll need to remind me what it is that we can’t implement here.

Also, I don’t think there’s ever a case where we initiate IRGen during the parsing of a top-level declaration (we even take this to unnecessary extremes, e.g. with namespaces), and if there is, it should not be too difficult to avoid doing so.

John.
One very nice incidental advantage of emit-as-you-go is that it encourages us to ensure that language decisions are made locally by the declarations involved, which — beyond simply being better language design in and of itself — also means that they’re not susceptible to random breakage by differences in module import.
An interesting language design advice, but given the requirement that clang continues to build real c++ code, it is important to ask if we are not pushing ourselves to solutions that are worse than what gcc does (which I am sure is the case in pr16187).
Cheers,
Rafael
Rafael Espíndola
2015-01-23 19:04:28 UTC
Permalink
How is it not a viable fight? Is the section attribute coming from a
Post by John McCall
completely different place? Or are you suggesting that it is never viable
to tell people that they ought to fix their code, no matter how
unnecessarily perverse it is? A section should be an intrinsic part of an
definition, saying that you can’t define the same thing in multiple
inconsistent ways is not even slightly unreasonable.
The bug first got reported to us while trying to build glibc. The bug
Richard noticed was fixed in gcc because it was breaking the linux kernel.
If anyone thinks it is productive to try to get them to change, go for it.

“Build everything GCC can without modification” has never been a
Post by John McCall
fundamental requirement for clang, though, and that appears to be your
standard.
PR16187 is an example that I would feel fairly comfortable diagnosing.
You could certainly construct a more challenging example, though.
It is an example where we *were* diagnosing. Except that then we cannot
build firefox or chrome. You are also more than welcome to get them to
change because you don't like how gcc implemented an extension.

It is not about being bug by bug compatible, it is about building *real*
software versus having a slower (compute visibility multiple times per
type) and fuzzier model for no reason other than not liking reality.

This is not the same case as, for example, requiring "this->" in dependent
contexts, where our model was better (and eventually gcc switched).
Post by John McCall
You’ll need to remind me what it is that we can’t implement here.
test/SemaCXX/anonymous-struct.cpp

Cheers,
Rafael
Richard Smith
2015-01-23 19:28:25 UTC
Permalink
Post by Rafael Espíndola
Post by John McCall
How is it not a viable fight? Is the section attribute coming from a
completely different place? Or are you suggesting that it is never viable
to tell people that they ought to fix their code, no matter how
unnecessarily perverse it is? A section should be an intrinsic part of an
definition, saying that you can’t define the same thing in multiple
inconsistent ways is not even slightly unreasonable.
Post by Rafael Espíndola
The bug first got reported to us while trying to build glibc. The bug
Richard noticed was fixed in gcc because it was breaking the linux kernel.
If anyone thinks it is productive to try to get them to change, go for it.
Post by Rafael Espíndola
Post by John McCall
“Build everything GCC can without modification” has never been a
fundamental requirement for clang, though, and that appears to be your
standard.
Post by Rafael Espíndola
Post by John McCall
PR16187 is an example that I would feel fairly comfortable diagnosing.
You could certainly construct a more challenging example, though.
Post by Rafael Espíndola
It is an example where we *were* diagnosing. Except that then we cannot
build firefox or chrome. You are also more than welcome to get them to
change because you don't like how gcc implemented an extension.
Post by Rafael Espíndola
It is not about being bug by bug compatible, it is about building *real*
software versus having a slower (compute visibility multiple times per
type) and fuzzier model for no reason other than not liking reality.
Post by Rafael Espíndola
This is not the same case as, for example, requiring "this->" in
dependent contexts, where our model was better (and eventually gcc
switched).
Post by Rafael Espíndola
Post by John McCall
You’ll need to remind me what it is that we can’t implement here.
test/SemaCXX/anonymous-struct.cpp
We can implement that (by performing lookahead after parsing an anonymous
class in a typedef and before parsing method bodies etc), we just don't do
so yet. This is also not just an IR generation problem, there are also
semantic reasons why we might care about linkage.
John McCall
2015-01-23 19:52:15 UTC
Permalink
We can implement that (by performing lookahead after parsing an anonymous class in a typedef and before parsing method bodies etc), we just don't do so yet. This is also not just an IR generation problem, there are also semantic reasons why we might care about linkage.
Lookahead is a better solution than delaying diagnostics and not caching linkage. :) We’d need another arbitrary-lookahead tentative parser, because the typedef name isn’t necessarily the first declarator, but the fact that you can’t semantically have top-level expressions in there means that you can indeed do this without incorporating semantic information from prior declarators into lookup.

John.
John McCall
2015-01-23 19:34:15 UTC
Permalink
Post by John McCall
How is it not a viable fight? Is the section attribute coming from a completely different place? Or are you suggesting that it is never viable to tell people that they ought to fix their code, no matter how unnecessarily perverse it is? A section should be an intrinsic part of an definition, saying that you can’t define the same thing in multiple inconsistent ways is not even slightly unreasonable.
The bug first got reported to us while trying to build glibc. The bug Richard noticed was fixed in gcc because it was breaking the linux kernel. If anyone thinks it is productive to try to get them to change, go for it.
Sorry, do these open-source projects no longer accept patches? Adding section attributes after a definition does not seem defensible to me, and I would guess that the declarations are actually in the same file, just in the wrong order.
Post by John McCall
“Build everything GCC can without modification” has never been a fundamental requirement for clang, though, and that appears to be your standard.
PR16187 is an example that I would feel fairly comfortable diagnosing. You could certainly construct a more challenging example, though.
It is an example where we *were* diagnosing. Except that then we cannot build firefox or chrome. You are also more than welcome to get them to change because you don't like how gcc implemented an extension.
There are ways to work with projects so that you can carefully roll out a diagnostic without completely shutting down development for everyone. This is a danger literally every time we add a warning. We’re not going to stop adding warnings.
Post by John McCall
Post by John McCall
You’ll need to remind me what it is that we can’t implement here.
test/SemaCXX/anonymous-struct.cpp
You will note that IRGen is not computing the linkage in this Sema test. Linkage is a semantic property that the type-checker inherently has to care about. We can support this (by delaying the diagnostic about specializing on internal types) without changing anything in IRGen, but only by not caching linkage, either.

John.
Rafael Espíndola
2015-01-23 20:20:38 UTC
Permalink
Post by Rafael Espíndola
Post by John McCall
How is it not a viable fight? Is the section attribute coming from a
completely different place? Or are you suggesting that it is never viable
to tell people that they ought to fix their code, no matter how
unnecessarily perverse it is? A section should be an intrinsic part of an
definition, saying that you can’t define the same thing in multiple
inconsistent ways is not even slightly unreasonable.
The bug first got reported to us while trying to build glibc. The bug
Richard noticed was fixed in gcc because it was breaking the linux kernel.
If anyone thinks it is productive to try to get them to change, go for it.
Sorry, do these open-source projects no longer accept patches? Adding
section attributes after a definition does not seem defensible to me, and I
would guess that the declarations are actually in the same file, just in
the wrong order.
Maybe, but I have better things to do than being a message boy between your
opinion on gnu extensions and their use of them.
Post by Rafael Espíndola
“Build everything GCC can without modification” has never been a
Post by John McCall
fundamental requirement for clang, though, and that appears to be your
standard.
PR16187 is an example that I would feel fairly comfortable diagnosing.
You could certainly construct a more challenging example, though.
It is an example where we *were* diagnosing. Except that then we cannot
build firefox or chrome. You are also more than welcome to get them to
change because you don't like how gcc implemented an extension.
There are ways to work with projects so that you can carefully roll out a
diagnostic without completely shutting down development for everyone. This
is a danger literally every time we add a warning. We’re not going to stop
adding warnings.
If anyone is interested. Check the thread on r175326.That were we decide to
go with "We should just not cache visibility for types." which created the
horrible situation in PR16187.

Going back would required chrome (and any other project trying to optimize
their .so really) to add visibility attributes to every forward declaration
of a type if the definition has it.

Cheers,
Rafael
John McCall
2015-01-23 20:40:25 UTC
Permalink
Post by John McCall
Post by John McCall
How is it not a viable fight? Is the section attribute coming from a completely different place? Or are you suggesting that it is never viable to tell people that they ought to fix their code, no matter how unnecessarily perverse it is? A section should be an intrinsic part of an definition, saying that you can’t define the same thing in multiple inconsistent ways is not even slightly unreasonable.
The bug first got reported to us while trying to build glibc. The bug Richard noticed was fixed in gcc because it was breaking the linux kernel. If anyone thinks it is productive to try to get them to change, go for it.
Sorry, do these open-source projects no longer accept patches? Adding section attributes after a definition does not seem defensible to me, and I would guess that the declarations are actually in the same file, just in the wrong order.
Maybe, but I have better things to do than being a message boy between your opinion on gnu extensions and their use of them.
Rafael, you are taking this very personally and apparently trying to pick a fight. I’m sorry, but putting visibility aside, this is not the right fix, and this code should be considered ill-formed. We should not allow declarations following a definition to add section attributes, and we should not allow redeclarations to add asm attributes at all.

John.
David Majnemer
2015-01-23 17:25:13 UTC
Permalink
Post by John McCall
On Jan 22, 2015, at 4:52 PM, Rafael Espíndola <
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason to
allow a declaration following a definition (even a tentative definition) to
add a section attribute. We should not be afraid to reject
stupidly-written code when it abuses language extensions, even when they’re
not “our” extensions.
There are fair arguments against our current emit-as-you-go IRGen model,
but allowing us to more perfectly emulate GCC’s bugs is not one of them.
Nor is there a need to exactly copy GCC’s visibility model in every
conceivable case. One very nice incidental advantage of emit-as-you-go is
that it encourages us to ensure that language decisions are made locally by
the declarations involved, which — beyond simply being better language
design in and of itself — also means that they’re not susceptible to random
breakage by differences in module import.
Some of my argument against eagerly emitting IR comes stems from how we
handle the following:
struct S;

typedef void (*FP)(struct S);

void f(FP x) { }

struct S { int i; };

void g() { f(0); }

When we are emitting f, we decide that FP should have IR type {}*.
However, the definition of 'S' is available when we are emitting 'g' and so
we decide that FP have IR type void (i32)*.

The fact that types change from under us is very surprising.
Post by John McCall
John.
John McCall
2015-01-23 18:30:57 UTC
Permalink
Post by Rafael Espíndola
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason to allow a declaration following a definition (even a tentative definition) to add a section attribute. We should not be afraid to reject stupidly-written code when it abuses language extensions, even when they’re not “our” extensions.
There are fair arguments against our current emit-as-you-go IRGen model, but allowing us to more perfectly emulate GCC’s bugs is not one of them. Nor is there a need to exactly copy GCC’s visibility model in every conceivable case. One very nice incidental advantage of emit-as-you-go is that it encourages us to ensure that language decisions are made locally by the declarations involved, which — beyond simply being better language design in and of itself — also means that they’re not susceptible to random breakage by differences in module import.
struct S;
typedef void (*FP)(struct S);
void f(FP x) { }
struct S { int i; };
void g() { f(0); }
When we are emitting f, we decide that FP should have IR type {}*.
However, the definition of 'S' is available when we are emitting 'g' and so we decide that FP have IR type void (i32)*.
The fact that types change from under us is very surprising.
Global values changing types is, unfortunately, inevitable, because it is not currently possible to give a global variable an arbitrary IR type when it has a constant initializer, and global variables can have references to each other, even cyclically. It is not actually difficult to deal with; it's good practice to hold persistent references with a value handle anyway.

John.
Richard Smith
2015-01-29 22:47:45 UTC
Permalink
Post by John McCall
On Jan 22, 2015, at 4:52 PM, Rafael Espíndola <
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason to
allow a declaration following a definition (even a tentative definition) to
add a section attribute. We should not be afraid to reject
stupidly-written code when it abuses language extensions, even when they’re
not “our” extensions.
I completely agree with the principle here. It is not reasonable to write
attributes that affect a definition after the definition. It is not
reasonable to write attributes that affect how a symbol is referenced (such
as an asm label) after the first use (and perhaps we should simply require
them on the first declaration).

(Segue away from attributes and towards tentative definitons follows...)

I don't agree with what you said about tentative definitions. The C
standard is very clear on the model for tentative definitions: they act
exactly like non-defining declarations until you get to the end of the
translation unit; if you've not seen a non-tentative definition by that
point "then the behavior is exactly as if the translation unit contains a
file scope declaration of that identifier, with the composite type as of
the end of the translation unit, with an initializer equal to 0."

Based on that simple semantic model, it is not reasonable for us to reject
this:

int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
int pr22217_foo = 123;

See also PR20688, which is a rejects-valid for standard C11 code due to our
being confused about how tentative definitions work.

And here's another case we get wrong:

int a[];
extern int a[5];

We're required to emit a definition of 'a' with type 'int[5]', but we emit
it with type 'int[1]'. We get the corresponding case with an incomplete
struct correct:

struct foo x; // ok, tentative definition
struct foo { int n, m; };
// definition emitted now and has complete type; initializer is {0}.

There are lots of ways we can fix this; perhaps the easiest one would be to
literally follow what the C standard says: synthesize a definition for each
tentatively-defined variable at the end of the translation unit. Then we
can change isThisDeclarationADefinition to simply return 'bool' instead of
an enum, and have it return 'false' for tentative definitions. Sema would
track the tentative definitions it's seen, and consider converting each one
to a definition at end-of-TU.

Or we can try to keep our current model with a tristate for whether a
declaration is a definition, but that requires both Sema and IRGen to get a
lot smarter with regard to handling of tentative definitions.
John McCall
2015-01-29 23:47:02 UTC
Permalink
Post by Rafael Espíndola
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason to allow a declaration following a definition (even a tentative definition) to add a section attribute. We should not be afraid to reject stupidly-written code when it abuses language extensions, even when they’re not “our” extensions.
I completely agree with the principle here. It is not reasonable to write attributes that affect a definition after the definition. It is not reasonable to write attributes that affect how a symbol is referenced (such as an asm label) after the first use (and perhaps we should simply require them on the first declaration).
(Segue away from attributes and towards tentative definitons follows...)
I don't agree with what you said about tentative definitions. The C standard is very clear on the model for tentative definitions: they act exactly like non-defining declarations until you get to the end of the translation unit; if you've not seen a non-tentative definition by that point "then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.”
So, this is interesting. Unix C compilers have traditionally defaulted to -fcommon, i.e. to treating uninitialized variables as common definitions that are overridable not just within a translation unit, but within the entire program. (I’m not sure whether ELF platforms implement this as “program” or “linkage unit”. Darwin uses “linkage unit”.) Whether that’s actually compliant is arguable, but regardless, it’s the semantics we use, and so we really do have to maintain the tri-state, because tentative definitions are semantically quite different from non-tentative definitions.

But in the sense that non-tentative definitions fully replace tentative definitions, I agree that the correct behavior is probably to allow a non-tentative definition with a section attribute to “override” a tentative definition which lacks the attribute.

That's reasonable as long as section attributes never affect the code-generation of accesses to an object. I think we can agree that section attributes that do affect code-generation of references (in an incompatible way) would clearly need to be on all declarations. But that’s more like an address-space attribute than a section attribute.
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
int pr22217_foo = 123;
See also PR20688, which is a rejects-valid for standard C11 code due to our being confused about how tentative definitions work.
int a[];
extern int a[5];
struct foo x; // ok, tentative definition
struct foo { int n, m; };
// definition emitted now and has complete type; initializer is {0}.
There are lots of ways we can fix this; perhaps the easiest one would be to literally follow what the C standard says: synthesize a definition for each tentatively-defined variable at the end of the translation unit. Then we can change isThisDeclarationADefinition to simply return 'bool' instead of an enum, and have it return 'false' for tentative definitions. Sema would track the tentative definitions it's seen, and consider converting each one to a definition at end-of-TU.
Like I mentioned above, this isn’t actually allowed under -fcommon.
Or we can try to keep our current model with a tristate for whether a declaration is a definition, but that requires both Sema and IRGen to get a lot smarter with regard to handling of tentative definitions.
I think this is reasonable. IRGen should be able to just completely replace an existing tentative definition. As I mentioned up-thread, IRGen needs to hold persistent references to global variables with handles anyway just because types can change.

John.
Richard Smith
2015-01-29 23:57:20 UTC
Permalink
Post by Richard Smith
Post by John McCall
On Jan 22, 2015, at 4:52 PM, Rafael Espíndola <
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason to
allow a declaration following a definition (even a tentative definition) to
add a section attribute. We should not be afraid to reject
stupidly-written code when it abuses language extensions, even when they’re
not “our” extensions.
I completely agree with the principle here. It is not reasonable to write
attributes that affect a definition after the definition. It is not
reasonable to write attributes that affect how a symbol is referenced (such
as an asm label) after the first use (and perhaps we should simply require
them on the first declaration).
(Segue away from attributes and towards tentative definitons follows...)
I don't agree with what you said about tentative definitions. The C
standard is very clear on the model for tentative definitions: they act
exactly like non-defining declarations until you get to the end of the
translation unit; if you've not seen a non-tentative definition by that
point "then the behavior is exactly as if the translation unit contains a
file scope declaration of that identifier, with the composite type as of
the end of the translation unit, with an initializer equal to 0.”
So, this is interesting. Unix C compilers have traditionally defaulted to
-fcommon, i.e. to treating uninitialized variables as common definitions
that are overridable not just within a translation unit, but within the
entire program. (I’m not sure whether ELF platforms implement this as
“program” or “linkage unit”. Darwin uses “linkage unit”.) Whether that’s
actually compliant is arguable, but regardless, it’s the semantics we use,
and so we really do have to maintain the tri-state, because tentative
definitions are semantically quite different from non-tentative definitions.
But in the sense that non-tentative definitions fully replace tentative
definitions, I agree that the correct behavior is probably to allow a
non-tentative definition with a section attribute to “override” a tentative
definition which lacks the attribute.
That's reasonable as long as section attributes never affect the
code-generation of accesses to an object. I think we can agree that
section attributes that do affect code-generation of references (in an
incompatible way) would clearly need to be on all declarations. But that’s
more like an address-space attribute than a section attribute.
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
int pr22217_foo = 123;
See also PR20688, which is a rejects-valid for standard C11 code due to
our being confused about how tentative definitions work.
int a[];
extern int a[5];
We're required to emit a definition of 'a' with type 'int[5]', but we emit
it with type 'int[1]'. We get the corresponding case with an incomplete
struct foo x; // ok, tentative definition
struct foo { int n, m; };
// definition emitted now and has complete type; initializer is {0}.
There are lots of ways we can fix this; perhaps the easiest one would be
to literally follow what the C standard says: synthesize a definition for
each tentatively-defined variable at the end of the translation unit. Then
we can change isThisDeclarationADefinition to simply return 'bool' instead
of an enum, and have it return 'false' for tentative definitions. Sema
would track the tentative definitions it's seen, and consider converting
each one to a definition at end-of-TU.
Like I mentioned above, this isn’t actually allowed under -fcommon.
I don't see why not. We just need to make sure that the definition we
create at the end of TU is emitted as a common definition.
Post by Richard Smith
Or we can try to keep our current model with a tristate for whether a
declaration is a definition, but that requires both Sema and IRGen to get a
lot smarter with regard to handling of tentative definitions.
I think this is reasonable. IRGen should be able to just completely
replace an existing tentative definition. As I mentioned up-thread, IRGen
needs to hold persistent references to global variables with handles anyway
just because types can change.
John.
Richard Smith
2015-01-29 23:59:51 UTC
Permalink
Post by Richard Smith
Post by Richard Smith
Post by John McCall
On Jan 22, 2015, at 4:52 PM, Rafael Espíndola <
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason
to allow a declaration following a definition (even a tentative definition)
to add a section attribute. We should not be afraid to reject
stupidly-written code when it abuses language extensions, even when they’re
not “our” extensions.
I completely agree with the principle here. It is not reasonable to write
attributes that affect a definition after the definition. It is not
reasonable to write attributes that affect how a symbol is referenced (such
as an asm label) after the first use (and perhaps we should simply require
them on the first declaration).
(Segue away from attributes and towards tentative definitons follows...)
I don't agree with what you said about tentative definitions. The C
standard is very clear on the model for tentative definitions: they act
exactly like non-defining declarations until you get to the end of the
translation unit; if you've not seen a non-tentative definition by that
point "then the behavior is exactly as if the translation unit contains a
file scope declaration of that identifier, with the composite type as of
the end of the translation unit, with an initializer equal to 0.”
So, this is interesting. Unix C compilers have traditionally defaulted
to -fcommon, i.e. to treating uninitialized variables as common definitions
that are overridable not just within a translation unit, but within the
entire program. (I’m not sure whether ELF platforms implement this as
“program” or “linkage unit”. Darwin uses “linkage unit”.) Whether that’s
actually compliant is arguable, but regardless, it’s the semantics we use,
and so we really do have to maintain the tri-state, because tentative
definitions are semantically quite different from non-tentative definitions.
But in the sense that non-tentative definitions fully replace tentative
definitions, I agree that the correct behavior is probably to allow a
non-tentative definition with a section attribute to “override” a tentative
definition which lacks the attribute.
That's reasonable as long as section attributes never affect the
code-generation of accesses to an object. I think we can agree that
section attributes that do affect code-generation of references (in an
incompatible way) would clearly need to be on all declarations. But that’s
more like an address-space attribute than a section attribute.
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
int pr22217_foo = 123;
See also PR20688, which is a rejects-valid for standard C11 code due to
our being confused about how tentative definitions work.
int a[];
extern int a[5];
We're required to emit a definition of 'a' with type 'int[5]', but we
emit it with type 'int[1]'. We get the corresponding case with an
struct foo x; // ok, tentative definition
struct foo { int n, m; };
// definition emitted now and has complete type; initializer is {0}.
There are lots of ways we can fix this; perhaps the easiest one would be
to literally follow what the C standard says: synthesize a definition for
each tentatively-defined variable at the end of the translation unit. Then
we can change isThisDeclarationADefinition to simply return 'bool' instead
of an enum, and have it return 'false' for tentative definitions. Sema
would track the tentative definitions it's seen, and consider converting
each one to a definition at end-of-TU.
Like I mentioned above, this isn’t actually allowed under -fcommon.
I don't see why not. We just need to make sure that the definition we
create at the end of TU is emitted as a common definition.
(There are also a small number of checks we need to turn off for the
synthesized definition; in particular, for weird cases like:

int a[];
void f() { extern int a[5]; }

... we should not reject even though the synthesized definition has type
'int a[1];' which is not compatible with the declaration in 'f'. I suppose
we could even detect this case and not emit a definition of 'a' at all
here, because we know that another TU must be providing one with the right
size.)
Post by Richard Smith
Or we can try to keep our current model with a tristate for whether a
Post by Richard Smith
declaration is a definition, but that requires both Sema and IRGen to get a
lot smarter with regard to handling of tentative definitions.
I think this is reasonable. IRGen should be able to just completely
replace an existing tentative definition. As I mentioned up-thread, IRGen
needs to hold persistent references to global variables with handles anyway
just because types can change.
John.
John McCall
2015-01-30 00:04:42 UTC
Permalink
Post by John McCall
Post by Rafael Espíndola
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason to allow a declaration following a definition (even a tentative definition) to add a section attribute. We should not be afraid to reject stupidly-written code when it abuses language extensions, even when they’re not “our” extensions.
I completely agree with the principle here. It is not reasonable to write attributes that affect a definition after the definition. It is not reasonable to write attributes that affect how a symbol is referenced (such as an asm label) after the first use (and perhaps we should simply require them on the first declaration).
(Segue away from attributes and towards tentative definitons follows...)
I don't agree with what you said about tentative definitions. The C standard is very clear on the model for tentative definitions: they act exactly like non-defining declarations until you get to the end of the translation unit; if you've not seen a non-tentative definition by that point "then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.”
So, this is interesting. Unix C compilers have traditionally defaulted to -fcommon, i.e. to treating uninitialized variables as common definitions that are overridable not just within a translation unit, but within the entire program. (I’m not sure whether ELF platforms implement this as “program” or “linkage unit”. Darwin uses “linkage unit”.) Whether that’s actually compliant is arguable, but regardless, it’s the semantics we use, and so we really do have to maintain the tri-state, because tentative definitions are semantically quite different from non-tentative definitions.
But in the sense that non-tentative definitions fully replace tentative definitions, I agree that the correct behavior is probably to allow a non-tentative definition with a section attribute to “override” a tentative definition which lacks the attribute.
That's reasonable as long as section attributes never affect the code-generation of accesses to an object. I think we can agree that section attributes that do affect code-generation of references (in an incompatible way) would clearly need to be on all declarations. But that’s more like an address-space attribute than a section attribute.
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
int pr22217_foo = 123;
See also PR20688, which is a rejects-valid for standard C11 code due to our being confused about how tentative definitions work.
int a[];
extern int a[5];
struct foo x; // ok, tentative definition
struct foo { int n, m; };
// definition emitted now and has complete type; initializer is {0}.
There are lots of ways we can fix this; perhaps the easiest one would be to literally follow what the C standard says: synthesize a definition for each tentatively-defined variable at the end of the translation unit. Then we can change isThisDeclarationADefinition to simply return 'bool' instead of an enum, and have it return 'false' for tentative definitions. Sema would track the tentative definitions it's seen, and consider converting each one to a definition at end-of-TU.
Like I mentioned above, this isn’t actually allowed under -fcommon.
I don't see why not. We just need to make sure that the definition we create at the end of TU is emitted as a common definition.
Ah. I see what you’re saying. This does become a channel of out-of-bound information that other clients have to be aware of, but yes, it’s possible.
Post by John McCall
int a[];
void f() { extern int a[5]; }
... we should not reject even though the synthesized definition has type 'int a[1];' which is not compatible with the declaration in 'f'. I suppose we could even detect this case and not emit a definition of 'a' at all here, because we know that another TU must be providing one with the right size.)
Odds that that wouldn’t cause a failure somewhere? :)

John.
Richard Smith
2015-01-30 00:16:51 UTC
Permalink
Post by John McCall
Post by Richard Smith
Post by Richard Smith
Post by John McCall
On Jan 22, 2015, at 4:52 PM, Rafael Espíndola <
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason
to allow a declaration following a definition (even a tentative definition)
to add a section attribute. We should not be afraid to reject
stupidly-written code when it abuses language extensions, even when they’re
not “our” extensions.
I completely agree with the principle here. It is not reasonable to
write attributes that affect a definition after the definition. It is not
reasonable to write attributes that affect how a symbol is referenced (such
as an asm label) after the first use (and perhaps we should simply require
them on the first declaration).
(Segue away from attributes and towards tentative definitons follows...)
I don't agree with what you said about tentative definitions. The C
standard is very clear on the model for tentative definitions: they act
exactly like non-defining declarations until you get to the end of the
translation unit; if you've not seen a non-tentative definition by that
point "then the behavior is exactly as if the translation unit contains a
file scope declaration of that identifier, with the composite type as of
the end of the translation unit, with an initializer equal to 0.”
So, this is interesting. Unix C compilers have traditionally defaulted
to -fcommon, i.e. to treating uninitialized variables as common definitions
that are overridable not just within a translation unit, but within the
entire program. (I’m not sure whether ELF platforms implement this as
“program” or “linkage unit”. Darwin uses “linkage unit”.) Whether that’s
actually compliant is arguable, but regardless, it’s the semantics we use,
and so we really do have to maintain the tri-state, because tentative
definitions are semantically quite different from non-tentative definitions.
But in the sense that non-tentative definitions fully replace tentative
definitions, I agree that the correct behavior is probably to allow a
non-tentative definition with a section attribute to “override” a tentative
definition which lacks the attribute.
That's reasonable as long as section attributes never affect the
code-generation of accesses to an object. I think we can agree that
section attributes that do affect code-generation of references (in an
incompatible way) would clearly need to be on all declarations. But that’s
more like an address-space attribute than a section attribute.
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
int pr22217_foo = 123;
See also PR20688, which is a rejects-valid for standard C11 code due to
our being confused about how tentative definitions work.
int a[];
extern int a[5];
We're required to emit a definition of 'a' with type 'int[5]', but we
emit it with type 'int[1]'. We get the corresponding case with an
struct foo x; // ok, tentative definition
struct foo { int n, m; };
// definition emitted now and has complete type; initializer is {0}.
There are lots of ways we can fix this; perhaps the easiest one would be
to literally follow what the C standard says: synthesize a definition for
each tentatively-defined variable at the end of the translation unit. Then
we can change isThisDeclarationADefinition to simply return 'bool' instead
of an enum, and have it return 'false' for tentative definitions. Sema
would track the tentative definitions it's seen, and consider converting
each one to a definition at end-of-TU.
Like I mentioned above, this isn’t actually allowed under -fcommon.
I don't see why not. We just need to make sure that the definition we
create at the end of TU is emitted as a common definition.
Ah. I see what you’re saying. This does become a channel of out-of-bound
information that other clients have to be aware of, but yes, it’s possible.
Yeah, we were intending on tracking this on VarDecl so that consumers who
care can find out that a particular implicit definition was synthesized to
represent the definition of a tentatively-defined variable. I contend that
very few consumers of the AST will need to be aware of this; fewer than
need to be aware of the isThisDeclarationADefinition special case today.
Post by John McCall
(There are also a small number of checks we need to turn off for the
int a[];
void f() { extern int a[5]; }
... we should not reject even though the synthesized definition has type
'int a[1];' which is not compatible with the declaration in 'f'. I suppose
we could even detect this case and not emit a definition of 'a' at all
here, because we know that another TU must be providing one with the right
size.)
Odds that that wouldn’t cause a failure somewhere? :)
If it does, the code is doing scary things today -- we'll emit 'a' as an
array of 1 element. But just because we could, doesn't mean we should... =)
*jurassic park flashback*
John McCall
2015-01-30 00:35:49 UTC
Permalink
Post by John McCall
Post by John McCall
Post by Rafael Espíndola
Sent the email a bit early.
Post by Rafael Espíndola
That is not what I am seeing with gcc. Given
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
This should be an error in both C and C++. I see absolutely no reason to allow a declaration following a definition (even a tentative definition) to add a section attribute. We should not be afraid to reject stupidly-written code when it abuses language extensions, even when they’re not “our” extensions.
I completely agree with the principle here. It is not reasonable to write attributes that affect a definition after the definition. It is not reasonable to write attributes that affect how a symbol is referenced (such as an asm label) after the first use (and perhaps we should simply require them on the first declaration).
(Segue away from attributes and towards tentative definitons follows...)
I don't agree with what you said about tentative definitions. The C standard is very clear on the model for tentative definitions: they act exactly like non-defining declarations until you get to the end of the translation unit; if you've not seen a non-tentative definition by that point "then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.”
So, this is interesting. Unix C compilers have traditionally defaulted to -fcommon, i.e. to treating uninitialized variables as common definitions that are overridable not just within a translation unit, but within the entire program. (I’m not sure whether ELF platforms implement this as “program” or “linkage unit”. Darwin uses “linkage unit”.) Whether that’s actually compliant is arguable, but regardless, it’s the semantics we use, and so we really do have to maintain the tri-state, because tentative definitions are semantically quite different from non-tentative definitions.
But in the sense that non-tentative definitions fully replace tentative definitions, I agree that the correct behavior is probably to allow a non-tentative definition with a section attribute to “override” a tentative definition which lacks the attribute.
That's reasonable as long as section attributes never affect the code-generation of accesses to an object. I think we can agree that section attributes that do affect code-generation of references (in an incompatible way) would clearly need to be on all declarations. But that’s more like an address-space attribute than a section attribute.
int pr22217_foo;
int *b = &pr22217_foo;
extern int pr22217_foo __attribute__((section("zed")));
int pr22217_foo = 123;
See also PR20688, which is a rejects-valid for standard C11 code due to our being confused about how tentative definitions work.
int a[];
extern int a[5];
struct foo x; // ok, tentative definition
struct foo { int n, m; };
// definition emitted now and has complete type; initializer is {0}.
There are lots of ways we can fix this; perhaps the easiest one would be to literally follow what the C standard says: synthesize a definition for each tentatively-defined variable at the end of the translation unit. Then we can change isThisDeclarationADefinition to simply return 'bool' instead of an enum, and have it return 'false' for tentative definitions. Sema would track the tentative definitions it's seen, and consider converting each one to a definition at end-of-TU.
Like I mentioned above, this isn’t actually allowed under -fcommon.
I don't see why not. We just need to make sure that the definition we create at the end of TU is emitted as a common definition.
Ah. I see what you’re saying. This does become a channel of out-of-bound information that other clients have to be aware of, but yes, it’s possible.
Yeah, we were intending on tracking this on VarDecl so that consumers who care can find out that a particular implicit definition was synthesized to represent the definition of a tentatively-defined variable. I contend that very few consumers of the AST will need to be aware of this; fewer than need to be aware of the isThisDeclarationADefinition special case today.
Yeah. I think this is a reasonable design, and Doug concurs.
Post by John McCall
Post by John McCall
int a[];
void f() { extern int a[5]; }
... we should not reject even though the synthesized definition has type 'int a[1];' which is not compatible with the declaration in 'f'. I suppose we could even detect this case and not emit a definition of 'a' at all here, because we know that another TU must be providing one with the right size.)
Odds that that wouldn’t cause a failure somewhere? :)
If it does, the code is doing scary things today -- we'll emit 'a' as an array of 1 element. But just because we could, doesn't mean we should... =) *jurassic park flashback*
Well, it’s pretty easy to imagine f() not really caring about how long a[] is but putting a useless bound on it anyway. I assume that’s technically a violation, but it’s a harmless one.

That said, the rule about defaulting incomplete array types to length 1 is just bad language design, especially under the official model that lacks common definitions. We should consider warning about it, at least in translation units that actually use the array.

John.

Loading...