Define capture behavior when primary ctor bypassed #7354
base: main
proposals/primary-constructors.md
Outdated
@@ -160,6 +160,8 @@ If a primary constructor parameter is referenced from within an instance member,

Capturing is not allowed for parameters that have ref-like type, and capturing is not allowed for `ref`, `in` or `out` parameters. This is similar to a limitation for capturing in lambdas.

Structs present a challenge for primary constructor parameter capture: there is no way to enforce the execution of constructors on structs. (For example, creating a single-element array of some struct type will produce an instance of that type without running any of its constructors.) Section [§9.2.5](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/variables#925-value-parameters) of the C# spec states that constructor parameters do not come into existence until the constructor is invoked. Since primary constructor parameters are in scope throughout the type, this can create the awkward situation in which a parameter is in scope, but does not actually exist. If a parameter of a struct's primary constructor is captured, this would mean that the member causing its capture would be using a variable that does not exist. To avoid this, we assert that in cases where an instance of a type with a primary constructor was created through a mechanism that bypasses that constructor, all captured parameters come into existence when the instance is created, and are all initialized with the default value for their type.
This seems ok to me, but I want @MadsTorgersen to give some input as well.
I would not have the "Structs present a challenge for primary constructor parameter capture:" part.
Similarly the " this can create the awkward situation in which a parameter is in scope, but does not actually exist."
I would simply state that constructor parameters for structs** are always in existence. And they either have the values provided when the constructor is invoked, or they have the default value otherwise.
** We can also limit this to structs with primary constructors only. Or we can just state it's for any struct.
I would not have the...
That'll be my background as an instructor kicking in: always give a clear motivation so people understand why they should care. But it was probably a bit much for a spec, so I've taken out the negative language.
Regarding:
constructor parameters for structs are always in existence
and:
We can also limit this to structs with primary constructors only. Or we can just state it's for any struct.
I think that expanding it out to apply to all struct constructors (or even all struct primary constructors) would be problematic. Although I appreciate that a more broad-spectrum approach is simpler, and simpler is generally better in specs, I think there are likely to be additional problems with stating that all constructor arguments exist from the start of the lifetime of the class: it would disagree with §9.2.5 of the C# language spec about when the variable comes into existence. (That problem doesn't arise with my current wording, because it only applies in cases where §9.2.5 says the variable doesn't exist, meaning there can't be any disagreement about when it comes into existence: my addition would only define the moment of coming into existence for variables where §9.2.5 does not provide such a definition.) I'm not convinced it's possible to redefine when all struct
constructor arguments come into existence without causing problems for other parts of the spec. This feels like it would be opening a can of worms.
And if the text only describes behaviour for captured parameters for primary constructors, it would then be odd to state that it applies to any struct, not just those with a primary constructor, because in a struct without a primary constructor, there won't be any captured primary constructor parameters. (So we'd be saying it applied in places where it has no effect.)
It really is just the captured primary constructor arguments for which this is relevant. (My wording doesn't even apply to primary constructor arguments which are used only for initialization, because the scenarios where those fail to come into existence are also the scenarios in which they are not used. It's only capture of primary ctor parameters that causes a problem. And it causes a problem because it defines a lexical scope for these parameters that is slightly incompatible with their dynamic lifetime.)
And making it all about structs would miss two other scenarios that occurred to me while writing this. First, binary serialization enables constructors to be bypassed. I know binary serialization has been deprecated for some time, and now generates runtime errors in .NET 8, but you can still use it if you set `EnableUnsafeBinaryFormatterSerialization`. So despite the deprecated status, it is absolutely possible to use it on current .NET 8 previews to create an instance of a class without running its primary constructor.
Second, `MemberwiseClone` also bypasses constructors for both `struct` and `class` types. There do not appear to be any public plans for `MemberwiseClone` to be deprecated.
There may also be other ways to bypass construction that I'm unaware of. But it's certainly true that there is at least one non-deprecated way to instantiate a class without running its primary constructor.
So the effect of saying that it applied to all constructors of structs would, paradoxically, be both too narrow (you can bypass constructors for non-structs too) and also too wide (this situation in which code can use a non-existent variable arises only for captured primary constructor arguments, so it's unnecessary to state it for any other kind of constructor argument).
This is why I scoped it very carefully to this:
in cases where an instance of a type with a primary constructor was created through a mechanism that bypasses that constructor, all captured parameters
The aim here is to characterise precisely the cases where the parameters would otherwise fail to exist, and only those cases.
To me personally all this "captured parameters" wording is kinda confusing. If a PC parameter gets promoted to a field then why not just state that instead? It somehow creates the wrong impression that structs with primary constructors are a special kind of struct, which they are not, really. The same behavior applies to any struct when a 'default' instance is created. No constructor is called. Fields are initialized to default values. That's it.
@En3Tho wrote:
These things are definitely not fields from the perspective of code using them. Take this example:
public struct HasAField
{
    private int myField;
    public int Get() => this.myField;
}
That's allowed. Now consider this:
public struct HasNoField(int notAField)
{
    public int Get() => this.notAField;
}
The compiler rejects the If I remove the If As it happens, the compiler generates a field under the covers to make this work, but that's also what happens when you capture a local variable or method argument inside a lambda. We don't generally think of those as fields either, even though the lambda capture implementation happens to create fields for them.
Honestly, I'm not sure why this is more complicated than saying: A struct's primary constructor parameters are always in existence for the type. Default instances of the struct have default values for the parameters. That's clear and easy without needing mountains of text to explain this.
@idg10 I'm talking about the generated type. This parameter will get promoted to a field in such code. How you access it (with "this" or not) doesn't really change that. Maybe from a language perspective these are different things (or at least treated as such) but I just don't like the fact that it is using a different wording from the one devs are used to. Everyone knows what fields are. Everyone knows how default works. I've been using primary ctors for ages in F# but when I read this I was just like "come on, these are just fields".
It does though. You are talking about an implementation detail. Importantly, the language wants to not specify that, allowing impls to do different things, as long as all the lang rules are maintained. From the language, these are not fields. The roslyn compiler may have chosen to emit as fields here in some cases, but that's not something you should be able to depend on. Note: the same is true for captures in other contexts.
One reason is that as described in my reply to your earlier comment (further up, as part of a conversation thread attached to a particular line in markdown submitted in this PR), it's also possible to bypass constructors with Given that, we'd need to rephrase your statement as:
(I.e., we apply this to all types, not just structs.) (Side note: it's not clear to me what "always in existence for the type" really means. When does "always" start? §9.2.5 considers parameters to be instance-level things. Lexically they are type-level of course, but §9.2.5 talks about "the invocation"—it considers each invocation to cause parameters to come into existence. So I am interpreting what you've written as being equivalent to "A primary constructor's parameters are in existence as soon as an instance of the type is in existence." If that's not what you meant, then I don't know what you meant.) There are two issues. First, you have not dealt with initialization of the parameters. Since they aren't fields, we can't rely on the existing language rules for fields. As far as I've been able to find, the places in the existing spec that define when parameters come into existence also define how the parameter is initialized, because there are no defaults for that. So you would also need to do that. This would not be complex, but it brings the complexity of your version closer to mine. (I.e., I'm gradually answering your "why this is more complicated" question.) And there's another issue. The second issue with your suggestion is it would mean there are now two separate statements defining when a primary constructor's parameters come into existence:
In the specific scenarios that I'm concerned about (where the constructor doesn't actually run), it doesn't matter, because only 2 applies. (The condition described in 1 never occurs in this scenario.) But this now creates uncertainty for the normal case, in which the constructor does actually run. §9.2.5 says its parameters come into existence upon the invocation of this constructor. Your new text says the parameters are "always in existence". §9.2.5 says it is initialized with the argument value, and as discussed earlier, your text is going to have to say that it was initialized with the default. Of course, we can add text to disambiguate, but I think by the time we've cleared all that up, it's not going to be significantly simpler than my wording. To remind you, the relevant text from my proposal is:
Once you are past the text that provides the necessary context, it's not that complicated. It is more explicit about exactly when these parameters come into existence, and unlike yours it does say what values these parameters are initialized with (which we have to do, because they're parameters, not fields). I didn't find your wording on the timing clear, but if I'm in a minority of 1 there, we can just substitute your wording into my version:
So the only "complication" is the initial part where I state that this only applies in specific conditions. Specifically this bit:
Yes, it's a few more words. But the upside is that this avoids the situation where we say two different things about when these parameters come into existence and how they are initialized. Since your text applies "always" it will conflict with §9.2.5 in the situations where §9.2.5 is also applicable (i.e., in the normal expected use case where the primary ctor does actually run). With my wording I was carefully trying to avoid saying anything in the situations where the parameters come into existence thanks to the wording already in §9.2.5. Once you've added wording to resolve the conflict, I don't think it's going to look any simpler than what I've suggested.
How would one do this?
I have no idea what this means. As i mentioned, they have the default-values of their type in the case where you have a default-instance of the struct. I don't know what 'initialization' would have to do with this.
We can trivially update 9.2.5 if there is any concern. I personally do not have any. The two statements work fine with each other to me. But, again, i don't see any problems with the spec right now. I'm willing to add some small normative text just to prevent any confusion. But i see all this as mainly hand-wringing.
Yes, that is what i mean. For classes, you can only get that by going through a constructor. For structs you can get it by going through a constructor or getting a default instance. That's why my statement only covered structs with PCs.
I think it was possible through odd/abusive uses of binary deserialization?
Like my text said: they have the default value of their type. If you want, you can say "they are initialized with the default value of their type". But this is normative text, it's just meant to help clarify in case someone is confused. The text i wrote is simple and clear as to its meaning. It can even have a small code example to show what it means.
That's all outside of the lang. And we'd basically say: "it's undefined" :)
If you go back to the sentence you quoted here, and read the part that you deleted from this quote, you will find that I explained that the answer to your question can be found:
I'm hoping you will at some point read that comment, since it is a reply to your own comment (which was a reply to the discussion attached to line 163 started by @333fred). But to recap, there are two ways to do this:
If it was only 2, then I think it would be tolerable for behaviour to be undefined in this scenario, since use of binary serialization has been deprecated for several versions. But
public class BaseToShowWhetherConstructorRan
{
    public BaseToShowWhetherConstructorRan()
    {
        Console.WriteLine("Constructor ran");
    }
}
public class Point(double x, double y) : BaseToShowWhetherConstructorRan
{
    public override string ToString() => $"({x}, {y})";
    public Point Dup() => (Point)this.MemberwiseClone();
}
if we then do this:
Console.WriteLine("Creating with normal construction");
// Constructor runs when we do this.
Point p = new Point(10, 20);
Console.WriteLine(p);
Console.WriteLine("Creating with MemberwiseClone");
Point p3 = p.Dup();
Console.WriteLine(p3);
Console.WriteLine(object.ReferenceEquals(p, p3));
We get this output:
You can see the constructor ran only once, but as that final |
That's not what §9.2.5 says. It says that a constructor parameter:
So your statement contradicts the specification. The specification does not say that "they have the default-values of the type." It says they are "initialized with the value of the argument given in the invocation." The spec does say (in §16.4.1) that a struct's fields are set to their default value. And this fully explains the behaviour of the current preview implementation, given the implementation detail that it implements capture by generating a field for each captured parameter. But as you pointed out in #7354 (comment) although captured primary constructor parameters might be implemented as fields, this is an implementation detail. From the language's perspective they are not fields. They are constructor parameters, so their behaviour is defined by §9.2.5, and §9.2.5 does not agree with your description of how these are initialized.
Initialization comes into it because primary constructor parameters are (by definition, and obviously) parameters. This makes them a kind of variable. §9.1 states that:
If you look at all of the different kinds of variables that §9.2 describes as "initially assigned", in every case the spec describes explicitly what the initial value is. There is no presumed default initial value for variables. They only have a default initial value if the specification says so. (The only reason struct fields have a defined value when a struct is initialized to its default value is because the specification explicitly says so in §16.4.1.) In the case of primary constructor parameters, this initial value is defined to be "the value of the argument given in the invocation." So if the constructor was not invoked, then the initial value is undefined. It seems like you believe there is some sort of fallback default in which a variable will have its default value if for some reason the part of the spec explicitly describing its initial value doesn't apply. But the spec does not appear to say this for variables in general, and it does not say this about constructor parameters in particular.
I just want to highlight this one sentence from my previous reply, because in my attempt to back up everything that I'm saying with references to the spec, I might have buried the lede:
I think this is at the heart of why we disagree over whether the design document for primary constructors creates a hole.
Neither of these ways are part of the language. And it's up to those systems to behave in a way that would be reasonable. Presumably, any non-lang 'clone' feature would place the object in a similar way as if the new object was created in the same way the original was. Similarly, serialization of objects is entirely outside of the language. I see no reason for the lang to have any concerns or specifications around these concepts. If you want to take up with those systems how they should behave, you are welcome to. However, i imagine they both strive to give objects that behave the same way as the original, so i'm not sure why there is any contention there.
That's entirely irrelevant from the perspective of the language. Or, put another way, it sidesteps things. And such sidestepping has never been a concern for us in terms of specification.
Which is why i said it would be fine to add explanatory text explaining that a default instance (a non-constructed instance) has the default values for those parameters. As i've stated, i think it would be fine to add a simple note to the effect.
Which is why i said explicitly i felt it was sufficient to simply state that a default instance of a struct with primary constructors is initialized with default values of those types. That seems to be all that is necessary to clarify things. It's simple, and means that for anyone unsure of how this should work, things are spelled out. This can even include a simple example if you would like. No such thing is needed for classes afaict as the cases you have brought up are all particularly out of bounds of the language. So those cases are undefined afaict. Now, i would presume the runtime (and serialization libraries) would endeavor to do things sensibly. So, in practice, they'll produce instances that behave sensibly. But there's no need to discuss them in the C# specification afaict. I'm not even sure why there is an argument here. I gave feedback on the original documentation, and it looks like you've revised it heavily to be in line with what i was asking for. I've modified things more to be in line with what i was thinking. And i think the revision is simple and clear and prevents any confusion.
proposals/primary-constructors.md
Outdated
@@ -160,6 +160,8 @@ If a primary constructor parameter is referenced from within an instance member,

Capturing is not allowed for parameters that have ref-like type, and capturing is not allowed for `ref`, `in` or `out` parameters. This is similar to a limitation for capturing in lambdas.

There are circumstances in which constructors do not run. (For example, creating a single-element array of some struct type will produce an instance of that type without running any of its constructors.) Section [§9.2.5](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/variables#925-value-parameters) of the C# spec states that constructor parameters do not come into existence until the constructor is invoked. If a primary constructor parameter is captured in such a scenario, this would mean that the member causing its capture would be using a variable that does not exist. To avoid this, we assert that in cases where an instance of a type with a primary constructor was created through a mechanism that bypasses that constructor, all captured parameters come into existence when the instance is created, and are all initialized with the default value for their type.
There are circumstances in which struct constructors do not run. For example, `default(StructType)` produces such an instance. In those cases, despite no constructor being invoked (Section [§9.2.5](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/variables#925-value-parameters)), all captured parameters of that struct-type come into existence when the instance is created, and are all initialized with the default value for their respective types.
i would also add:
For instance for this declaration and code:
``` c#
public struct Point(int x, int y)
{
public override string ToString() => $"Point({x}, {y})";
}
var point = default(Point);
var text = point.ToString();
```
"text" will have the value `"Point(0, 0)"` due to the captured `x` and `y` parameters having default values.
The language permits them, though. So although the language specification does not explicitly list any of the ways of creating new
If the language permits it as a general mechanism, how can it be irrelevant? I can see how the specific systems that exploit it (such as the .NET runtime with its And up to C# 11.0, this "constructorless initialization" does not cause any problems, because the spec does in fact fully define what it means for all variables of all kinds that could be affected by this. The problem with the primary constructors design as it currently stands (this PR notwithstanding) is that it creates a new situation (the ability to capture constructor parameters in a constructor that might not run—before primary ctors, there was no way to get access to a constructor's parameters without running that constructor) that means the spec no longer fully defines what constructorless initialization implies in one particular case.
I didn't think you had said that. You said this:
That's a statement about the existence of the parameters, not how they are initialized. You followed that with this:
but that didn't look to me like part of what you were proposing putting in the additional statement. I thought it was just a thing you thought was already true. You also said this:
So I hope you can understand why I thought you were arguing that we don't need to say anything about initialization. If your position is that we should state (without prejudice to any particular wording) both:
then I think we are in agreement about that. I think we still disagree on whether it is absolutely necessary to state this, or whether this is purely informative text. Your most recent statement is:
I think you probably meant "non-normative text". But I believe this is normative text because nothing elsewhere in the existing C# language spec or in the current primary constructors design document defines how constructor parameters are initialized in cases where the constructor did not run. Since this would be the only text defining what happens in that case, that would make it normative. I guess as far as this PR goes, it doesn't matter. If there's something in the language spec I haven't seen that already covers this, then it's non-normative, and you believe it to be consistent with the spec, and that's fine. And if I'm right that there isn't actually anything in the language spec that covers this case, then it is in fact normative and it closes the hole that I believe would otherwise exist, and that's also fine. So I suppose we don't actually need to agree to make progress. The one nagging doubt is that since I still don't understand which bit of the language specification you're looking at that makes you think that this is already covered, I can't comprehend your point of view. And I have found in the past that when there is a disagreement and neither party understands why the other person holds their point of view, it's often the case that both parties are wrong. So that's why I've been pushing away at this point. I was really hoping that at some point you'd say "Ian, just go and read §X.Y.Z of the language spec, and then you'll understand why I'm right." Until I've understood why you think what you think I'm going to have a nagging doubt that both of us have missed something here, which is why I think it would be better if I could understand why you think the existing spec has anything to say about the use of parameters to a constructor that was never invoked.
Where does it implicitly permit it to happen? Afaict, it's simply undefined. And we don't need to spec undefined things.
How does the language permit it? The bcl is not the language. The bcl does tons of stuff that is simply not defined in the language. It is free to do so. Our preference is that they make reasonable choices when they do this. But it's strictly outside the language. :)
And i literally said, multiple times variants of "Default instances of the struct have default values for the parameters.". That is me stating explicitly they're initialized to the default value of their respective type. I'm really not sure what is unclear about this. :)
It's literally the statement i was making. They are in existence. They have the default value. It's the same point i've been making since we started talking on this subject. My position on this has not moved at all :)
As i've said, my view is that this falls out from what i think the zero-initialized instance means for a struct. To me, zero-initialized is a strong statement that impacts all further lang work though i have agreed clarity is worthwhile here. To me, zero-initialized means that all aspects of the struct are in the zero state. We state explicitly that that holds for its fields. For me, the deep implication though of 'zero-initialized' is that it holds for everything (which, contextually/historically is obviously the position we've always taken). So personally, i don't see a need to expand on it further to enumerate everything. However, at the same time as i've stated at least a half dozen times, i also have no issue with a little exposition to make it firmly clear that zero-initialization also covers these variables. Basically, i read the spec as broadly taking this property and expanding it to everything for structs in the future (unless something explicitly specifies that it would not behave this way). You read the spec as stating it in a narrow fashion, and that new items need to be explicit that this holds for them. I am fine with your reading, which is why i'm ok adding clarity here.
@@ -160,6 +160,8 @@ If a primary constructor parameter is referenced from within an instance member,

Capturing is not allowed for parameters that have ref-like type, and capturing is not allowed for `ref`, `in` or `out` parameters. This is similar to a limitation for capturing in lambdas.

There are circumstances in which constructors do not run. (For example, `default(StructType)` will produce an instance of `StructType` without running any of its constructors.) Section [§9.2.5](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/variables#925-value-parameters) of the C# spec states that constructor parameters do not come into existence until the constructor is invoked. If a primary constructor parameter is captured in such a scenario, this would mean that the member causing its capture would be using a variable that does not exist. To avoid this, we assert that in cases where an instance of a type with a primary constructor was created through a mechanism that bypasses that constructor, all captured parameters come into existence when the instance is created, and are all initialized with the default value for their type.
There are circumstances in which constructors do not run. For example, `default(StructType)` will produce an instance of `StructType` without running any of its constructors. In those cases, despite no constructor being invoked (Section [§9.2.5](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/variables#925-value-parameters)), all captured constructor parameters of that struct-type come into existence when the instance is created, and are all initialized with the default value for their respective types.
i also think an example would be appropriate here.
I don't really like the phrase "come into existence" in the spec. I think the normative way to phrase it would be "are in scope, and definitely assigned. They are all initialized to the default value for their respective types."
That language works for me Bill!
As discussed at #2691 (comment) the Primary Constructors spec runs into a problem when the execution of the constructor can be bypassed (as can happen with any `struct`.) Capture of constructor parameters results in variables that are in scope but which, according to the C# language specification, do not actually exist. This change defines behaviour for captured parameters that avoids this problem.
This does not require any change in the implementation of this language feature. The preview already works exactly as this spec change describes. This just removes ambiguity arising from the fact that the existing C# language specification does not define what should happen when using a variable that does not exist. The existing implementation chose to resolve that ambiguity in an obvious way, and this spec change simply aligns with that.
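As a minimal sketch of the behaviour described here, assuming a C# 12 compiler: the `Counter` struct, its `start` parameter and the `Demo` class are hypothetical names used only for illustration and do not appear in the proposal. The sketch creates instances through `default(...)` and through array allocation, neither of which runs the primary constructor, and reads the captured parameter in each case.

```csharp
using System;

// Hypothetical type for illustration only; it is not part of the proposal text.
public struct Counter(int start)
{
    // Referencing 'start' from an instance member captures the parameter.
    public int Start => start;
}

public static class Demo
{
    public static void Main()
    {
        // Normal construction: the primary constructor runs with the argument 5.
        var constructed = new Counter(5);
        Console.WriteLine(constructed.Start); // prints 5

        // default(...) produces an instance without running any constructor.
        var viaDefault = default(Counter);
        Console.WriteLine(viaDefault.Start); // prints 0

        // Array elements are likewise created without running a constructor.
        var viaArray = new Counter[1];
        Console.WriteLine(viaArray[0].Start); // prints 0
    }
}
```

In both bypass cases the captured parameter reads as the default value of its type, which is the outcome the added spec wording pins down.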