missing property reference error ignored if followed by unary operator #9564

Closed
msftrncs opened this issue May 8, 2019 · 17 comments
Labels
Issue-Question, Resolution-By Design

Comments

@msftrncs
Contributor

msftrncs commented May 8, 2019

I found this by accident: I left a stray . property reference in place while testing ideas.

I think it is incorrect that this code goes without any errors, and also generates no result.

Steps to reproduce

$Departments = @'
Alpha
Beta
Gamma
Zulu
'@. -split '\r'

Expected behavior

At line:6 char:4
+ '@.
+    ~
Missing property name after reference operator.
+ CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException
+ FullyQualifiedErrorId : MissingPropertyName

Actual behavior


Exactly that: nothing happens. No error, no result. Any unary operator seems to do this. A non-unary operator generates two error messages: missing property name, and unexpected token (for the right operand).

Environment data

PowerShell 6.2 on Windows 10 1809
Windows PowerShell 5.1 (same OS)

@msftrncs msftrncs added the Issue-Question label May 8, 2019
@msftrncs
Contributor Author

msftrncs commented May 8, 2019

I suppose the property reference is consuming the result of the unary operation? This is kind of unexpected, as I wouldn't expect it to accept such a result without parentheses or something similar around it.

@vexx32
Collaborator

vexx32 commented May 8, 2019

What seems to be happening is that PowerShell ignores whitespace immediately after a property accessor. This case in the parser is meant to handle things like this:

$ReallySillyLongName.PropertyOneThatIsLong.
    MethodName()

But it does seem to be a bug that, if you follow it with whitespace and then a defined operator, it simply processes the property name as either just whitespace or possibly an empty string. This is because the hyphen that starts such an operator is actually a token-separating character and isn't usable as a property name character without enclosing it in quotes.

So it's trying to be lenient and helpful and ignore the space, but the subsequent breaking character leaves it with a whitespace-only property name -- which it presumably trims to an empty string -- which isn't a defined property name at all, so it gives back $null as you're not working in StrictMode. Then the split operation is actually applied, but it already has nothing to work with, and also returns $null.

/cc @daxian-dbw seems like a parser bug where it forgets that this should be a parse error if the token is ended before it gets any characters to use for a property name?

@msftrncs
Contributor Author

msftrncs commented May 8, 2019

I was thinking maybe it was going to use the result of the -split unary operator as a member name, but I cannot seem to prove it's doing that either.

@{hello=3}. -split 'hello' # result is $null
@{hello=3}. 'hello' # result is 3

@vexx32
Collaborator

vexx32 commented May 8, 2019

Yeah, if the operator didn't need the hyphen, it would. But since the hyphen is a character that effectively creates a token boundary, it seems to end up with a null property name. Even this fails:

PS> @{'  -split "hello"' = "tricks"}.  -split "hello"
# no result

@msftrncs
Contributor Author

msftrncs commented May 8, 2019

I also tried:

@{hello=3}.(-split 'hello') # result is $null
@{hello=3}.('hello') # result is 3

@msftrncs
Contributor Author

msftrncs commented May 8, 2019

If you leave off the operand for the split operator, the error message implies that the operator was of the unary type, so it's clearly looking for an expression, and if the first term of that expression is an operator, it must be unary.

@msftrncs
Copy link
Contributor Author

msftrncs commented May 8, 2019

This works:

@{hello=3}.$(-split 'hello') # result is 3

@daxian-dbw
Member

@msftrncs you are right that it's using the unary expression as the member of the MemberExpression. It's more obvious if you look at the ASTs:

$s = @"
@'
Alpha
Beta
Gamma
Zulu
'@. -split '\r'
"@
$a = [System.Management.Automation.Language.Parser]::ParseInput($s, [ref]$null, [ref]$null)
$a.EndBlock.Statements[0].PipelineElements[0].Expression.Member.GetType()

> IsPublic IsSerial Name                                     BaseType
> -------- -------- ----                                     --------
> True     False    UnaryExpressionAst                       System.Management.Automation.Language.ExpressionAst

The UnaryExpression -split '\r' will be evaluated first and the result will be used as the member name, which is \r. The member doesn't exist, and thus the MemberExpression returns $null.

It's not that obvious, but the behavior should be by design.

@msftrncs
Contributor Author

@daxian-dbw, I had gathered this so far, but it seems that this doesn't work as designed, because I provided further examples showing that it does not work, specifically with -split. I have since realized that it does work with -join.

Examples:

@{hello=3}.(-split 'hello') # result is $null
@{hello=3}.('hello') # result is 3
@{hello=3}.$(-split 'hello') # result is 3
@{hello=3}. -join 'hello' # result is 3

There seems to be something special going on with -split when it supplies the member expression without a full sub-statement context. This is what I have determined: -split normally returns an array object, but since PowerShell normally unwraps single-element arrays, this particular example appears to output only one string. Using Get-Member -InputObject (-split 'hello'), I can see the array. I am guessing the unwrap doesn't occur when -split's result is passed to the accessor lookup, and the lookup cannot handle the array result because the operation is not defined for it. Using the subexpression causes the unwrap, and then the lookup can occur.
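A minimal sketch of the type difference described above, using only the unary operators already shown in this thread. The variable names are illustrative, and this only demonstrates the result types, not why the member lookup fails:

```powershell
# Unary -split yields a string array even when there is only one element.
$splitResult = -split 'hello'

# Unary -join yields a plain string.
$joinResult = -join 'hello'

# The subexpression operator unrolls a single-element array to its element.
$unrolled = $(-split 'hello')

$splitResult.GetType().FullName   # System.String[]
$joinResult.GetType().FullName    # System.String
$unrolled.GetType().FullName      # System.String
```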

I also want to say that I think this is a bad design. It leads to confusion as to what later member accesses will actually be accessing. Of course, this is easy to clarify by wrapping the expression in parentheses, but I think the user/programmer should be forced to do that to disambiguate the reference.

@{hello=3}. -join 'hello'.ToString() # result is [int]3
@{hello=3}.( -join 'hello').ToString() # result is [string]3

Above (first line), the .ToString() works against the 'hello' of the -join, and further accessors against the original expression will not happen. We should be forced to disambiguate this, up front. When a language is too lax, it can create problems that become hard to troubleshoot.
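One way to check which expression the .ToString() call attaches to is to inspect the AST, following the same approach as the ParseInput example earlier in the thread. This is a sketch only; the variable names are illustrative, and the Child property is assumed to be the operand of the UnaryExpressionAst:

```powershell
$source = "@{hello=3}. -join 'hello'.ToString()"
$tokens = $null
$errors = $null
$ast = [System.Management.Automation.Language.Parser]::ParseInput($source, [ref]$tokens, [ref]$errors)

# The outer expression is the member access on the hashtable.
$memberAccess = $ast.EndBlock.Statements[0].PipelineElements[0].Expression

# The member is the unary -join expression ...
$memberAccess.Member.GetType().Name

# ... and its operand is the method call on 'hello', not on the hashtable lookup.
$memberAccess.Member.Child.GetType().Name
```

If the second type name printed is InvokeMemberExpressionAst, the method call belongs to the operand of -join rather than to the result of the member access.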

Anyway, this is just my opinion. I do wonder, even without strict mode, if an error shouldn't be raised on certain inputs to the accessor member lookup. What good does an array object, or even numeric values, provide to an accessor member lookup?

I have learned things from this at any rate.

  1. Member access is actually an operand (or, put another way, the . (or ::) operator followed by an operand, though it has a strict syntax where the operator must immediately follow the prior object). I already understood that index access worked that way, and that is easy to see; this was not. I must also clarify that this particular operand permits unquoted non-expandable strings, which is what is normally used.
  2. Any operand can start with a unary operator. While it might seem obvious, it's easy to want to treat prefix unary operators as plain operators, but once you realize they are actually part of the operand, processing syntax suddenly becomes much easier.

This is important because I will use this information in PowerShell/EditorSyntax#156 to improve both the accessor syntax scoping and the general operation in expression mode.

@daxian-dbw
Member

There seems to be something special going on with -split when it supplies the member expression without a full sub-statement context.

It's not -split; it's the hashtable that has something special going on. For a dynamic member access, PowerShell evaluates the dynamic member expression and then converts the result to a string. But when the target is an IDictionary, it's handled specially: there is no string conversion, and the result is used directly as the key because IDictionary takes an object key. See the code here:

else if (PSObject.Base(target.Value) is IDictionary)
{
    // We don't want to convert the member name to a string, we'll just try indexing the dictionary and nothing else.
    restrictions = target.PSGetTypeRestriction();
    restrictions = restrictions.Merge(BindingRestrictions.GetExpressionRestriction(
        Expression.Not(Expression.TypeIs(args[0].Expression, typeof(string)))));
    return new DynamicMetaObject(
        Expression.Call(CachedReflectionInfo.PSGetDynamicMemberBinder_GetIDictionaryMember,
                        PSGetMemberBinder.GetTargetExpr(target, typeof(IDictionary)),
                        args[0].Expression.Cast(typeof(object))),
        restrictions).WriteToDebugLog(this);
}

The following would work:

$s = -split 'hello'
@{$s = 3}. $s
> 3

I also want to say that I think this is a bad design. It leads to confusion as to what later member accesses will actually be accessing.

The dynamic member accessing feature is very useful with a variable expression as the member, something like $obj.$member. Yes, it provides the ability to use any expression as the member, but we shouldn't abuse it if we want clear and readable code :)
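As a small illustration of the $obj.$member pattern mentioned above, here is a sketch with a made-up object; the property names Name and Size are hypothetical and exist only for this example:

```powershell
# Hypothetical object and property names, for illustration only.
$obj = [pscustomobject]@{ Name = 'Alpha'; Size = 5 }

# The member name comes from a variable rather than being written literally.
$values = foreach ($member in 'Name', 'Size') {
    $obj.$member
}

$values   # Alpha, then 5
```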

I do find a bug with indexing a Hashtable, see #9580

@msftrncs
Contributor Author

Yikes, I don't know why I didn't see this earlier, but hashtables accept any kind of object for a key (it seems). The EditorSyntax grammar doesn't recognize that.

Since the key can be the [string[]] object returned by -split 'hello', I can see why it doesn't work in the original cases.

[string[]]'hello' !== [string]'hello'

It's a little confusing why this doesn't work:

@{-split 'hello'=4}. -split 'hello'

Yet when you put it in a variable and use the variable, it does. That almost makes it look like it's using the reference, and if the reference isn't the same, it doesn't work? In the above, -split 'hello' in both places is not the same reference, though both produce the same result.

@daxian-dbw
Member

Little confusing why this doesn't work: @{-split 'hello'=4}. -split 'hello'

Well, those two -split 'hello' expressions return two different arrays, which have different hashcode values. So Hashtable doesn't think the latter points to the same key. That's how Hashtable works.
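A short sketch of that point, reusing the variable-as-key form shown earlier in the thread. The variable names are illustrative:

```powershell
# Two separate evaluations produce two distinct array objects with equal contents.
$first  = -split 'hello'
$second = -split 'hello'

$sameReference = [object]::ReferenceEquals($first, $second)
$sameReference                # False

# Use the first array object itself as the hashtable key.
$table = @{ $first = 4 }

$foundWithFirst  = $table.$first    # same array object that was used as the key
$foundWithSecond = $table.$second   # different array object, so no matching key

$foundWithFirst               # 4
$null -eq $foundWithSecond    # True
```

Arrays do not override Equals or GetHashCode, so the lookup only succeeds with the very same array instance.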

@vexx32
Collaborator

vexx32 commented May 12, 2019

I think the real crux of the confusion here is that this works without a subexpression operator, and as such there's no visible differentiator that would indicate that the operator is behaving as unary.

In my own opinion, it would make more sense for this behaviour to require a subexpression operator, and for it to be a syntax error in all other cases where a member-access operator is followed by a unary operator.

i.e., this should be fine, it's pretty clear what's happening:

PS> $thing.$(-split 'hello')

Whereas this is only ever confusing in terms of what it's operating on, how it ought to behave, and what the result should ever be, if anything:

PS> $thing. -split 'hello'

@msftrncs
Contributor Author

msftrncs commented May 12, 2019

That's all fine and dandy, but the average user doesn't see it that way. The result is the same, so why would two identical constants be expected to follow those rules? That's why this stuff confuses us low-level programmers: 'hello' = 'hello' because I compared every bit. Obviously the difference is that [valuetype]s compare directly, while all others use their instance-generated hashcodes for comparison. (Actually, [valuetype]s just generate the same hashcode when their bits match.)

I can also understand the basis for dynamic member access, but if you cannot truly access certain members dynamically, what's the value? Of course, from a processing standpoint, it makes for a simple rule: . followed by an operand. An operand can be a unary operator and its operand; that operand can in turn be a unary operator, and so on. Any operand is valid: $a is an operand, @(x,x) is an operand, the unquoted non-expanding string hello is an operand, and so on. There would be no way to say that $variable is an acceptable operand but @(x,x,x) is not.

Originally I thought only a few things were accepted: the unquoted non-expanding string, a variable, or a quoted string (I have been using all three). I didn't realize it accepted an entire expression. And now I know there are some very big gotchas when that expression doesn't result in a [valuetype].

(and my reference to [valuetype] is probably not accurate, as strings are not [valuetype])

@iSazonov
Collaborator

Is the conclusion "by design"?

@msftrncs
Contributor Author

msftrncs commented Oct 1, 2019

While it is surprising, I do think it is by design.

@iSazonov
Collaborator

iSazonov commented Oct 1, 2019

OK, until we get more negative UX feedback, I'll close this with the "By-Design" label.

@iSazonov iSazonov closed this as completed Oct 1, 2019
@iSazonov iSazonov added the Resolution-By Design label Oct 1, 2019
4 participants