Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: add safe navigation operator #2816

Closed
nodefish opened this issue Jul 4, 2019 · 11 comments
Closed

Proposal: add safe navigation operator #2816

nodefish opened this issue Jul 4, 2019 · 11 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@nodefish
Copy link

nodefish commented Jul 4, 2019

This issue is to propose the addition of the safe navigation operator.

Currently, working with chained optionals causes a lot of friction, because a null check/unwrap must be done at each level of indirection, resulting in code that is both harder to write and read.

Real example: const result = if (root_node.nodes) |nodes| if (nodes[0].symbol) |sym| sym.val else null else null; (only 2 levels of indirection, and it's already on the verge of losing clarity - suffice to say this scales very poorly for n>2 chained optionals).

When an else branch is needed for the situation where any of the chained fields is null, things can even get significantly worse, e.g.:

// We want to get to `root_node.nodes[0].symbol.val` or else if any of the chained fields is null, handle that situation.
// Slightly hyperbolic status quo approach:
var encountered_null= true;

if (root_node.nodes) |nodes| {
    if (nodes[0]) |node| {
        if (node.symbol) |symbol| {
            encountered_null = false;
            return symbol.val;
        } // can't use `else` without having to repeat the handling code
    } // can't use `else` without having to repeat the handling code
} // can't use `else` without having to repeat the handling code

// "else"
if (encountered_null) {
    ...
}

This is an especially significant problem in zig compared to some other languages, because null checks/unwraps cannot be combined in a single expression for safety reasons.

Proposed solution:

Add the safe navigation operator, such that any chained variable/field with this operator causes the whole expression (entire chain) to evaluate to null if it is null.

Syntax option 1:
varname?

The hyperbolic example becomes much clearer:

if (root_node.nodes?[0]?.symbol?.val) |val| {
     return val;
} else { 
    // handle _any_ chained member being null
}

Syntax option 2:
In an IRC discussion about this proposed feature, some people expressed a preference for .? syntax on the basis of consistency with .*; e.g.

varname.?

if (root_node.nodes.?[0]?.symbol.?.val) |val| {
     return val;
} else { 
    // handle _any_ chained member being null
}

Consequences

In both options, the current .? should be deprecated (in order to reduce ambiguity for option 1, and by necessity for option 2). The proposed feature makes it easy to replace this functionality e.g. article?.author?.name orelse unreachable -- which, it can be argued, is superior than using a shorthand, since unreachable should never be reached at runtime (more explicit).

Additional ideas

mikdusan proposes further unwrapping operators in the same vein, particularly the dereferencing operator .* being used to replace .? (general unwrap/"unbox" operator).

Additionally, a similar error unwrapping operator could perform the same function for chained members that might evaluate to an error, resulting in the entire chain evaluating to the first error encountered.

Thank you for discussing this with me on IRC: samtebbs, mikdusan, Flaminator, scientes, mq32, fengb

edit: minor corrections

@mikdusan
Copy link
Member

mikdusan commented Jul 4, 2019

For historical reference see #1095 and #1023

That thread also talked briefly about an operator for unwrapping error unions such as .! and consistent with this proposal's additional ideas, .* seems like an apt place to do all 3 things:

  1. pointer deref (existing behaviour)
  2. optional unwrap (new)
  3. error union unwrap (possibly something new)

It should be noted negatives to overloading .*:

  • losing the symmetry between ?T and .?

@andrewrk andrewrk added this to the 0.6.0 milestone Jul 4, 2019
@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Jul 4, 2019
@SamTebbs33
Copy link
Contributor

I suggest that the question mark and exclamation mark were to be placed before the dot, like ?. and !. as the unwrapping operation is being applied to the left hand side and so the symbol should be on the left as well.

@hryx
Copy link
Contributor

hryx commented Jul 4, 2019

Can you link to examples in the standard library that would benefit from this change?

As mentioned in #1023, all control flow now uses keywords instead of sigils, which lines up with Zig's philosophies and would be nice to uphold. Perhaps it would look weird here, I dunno. For example, if we use a hypothetical keyword then:

if (root_node.nodes then[0] then.symbol then.val) |val| {
     return val;
}

Most examples on the Wikipedia page use your suggested syntax, but the Crystal example uses a keyword.

@nodefish
Copy link
Author

nodefish commented Jul 4, 2019

@hryx

I don't know that stdlib is where this pattern would occur the most, but I do know it occurs a lot in data-driven application code e.g. server backend, games... I encountered it in my admittedly very-wip-and-maybe-should-be-totally-refactored Deflate implementation Huffman code tree attempt, which is where the "real example" above is from.

I would remind of the following points:

  • The "else" situation is uniquely problematic for Zig compared to other languages, because of the requirement to unwrap and capture individually when checking for null (in other languages, you can do a combined if statement).

  • ? would not exactly be introducing a new sigil:

  1. Zig already has ? as a prefix for optional types, e.g. const x: ?SomeType which denotes: "this can be null." The proposed solution (? suffix) would read as a follow-up version of the existing sigil: some_struct? which denotes: "is [thing defined with ?] actually ?"

  2. The current .? would be deprecated in favor of explicit orelse unreachable -> fewer sigils overall.

  • I'm all for being clear and explicit, but that Crystal-ish then is imho sort of a descent into explicitness-madness. We could by the same token remove . and {} and replace them with new keywords accessfield and beginblock ... endblock. But that would be overkill; and so too would then between every field being accessed.

@SamTebbs33
Copy link
Contributor

I agree with all of your points @nodefish , I'm just confused by the value of something? without the member access (like in something?.value) since that would (from what I understand) be equivalent to if (something) |something_notnull| something_notnull else null, which is the same as just something.

@nodefish
Copy link
Author

nodefish commented Jul 4, 2019

You are right, something? on its own should be a compiler error. It is after all called the safe navigation operator, using it without navigation is pointless.

@hryx
Copy link
Contributor

hryx commented Jul 4, 2019

? would not exactly be introducing a new sigil

Fact! I don't think anyone is arguing otherwise. :)

The current .? would be deprecated in favor of explicit orelse unreachable

As @Hejsil has pointed out, the current .? sugar is offered because it positively affects readability. That's what this proposal strives for as well, just optimized for a different use case. So this appears to be a trade. A good one? Maybe! I genuinely don't know.

that Crystal-ish then is imho sort of a descent into explicitness-madness

Hyperbole aside, my counteroffer wasn't meant to increase explicitness, it was meant to preserve the practice of only offering keywords for control flow. Since this proposal adds sugar to control flow, I thought it should be mentioned. (Note that the existing .? does not invoke control flow, so it doesn't violate this pattern.)

@fengb
Copy link
Contributor

fengb commented Jul 6, 2019

.* seems like an apt place to do all 3 things
@mikdusan

I would much prefer not overloading .*. There are quite a few instances where explicitly distinguishing non-nullable has saved me from a bug hunt: e.g. converting *N to ?*N or wrongly assuming a function returns non-nullable pointer.

@raulgrell
Copy link
Contributor

These are some forms that came to mind:

Anonymous chained unwraps with ?. operator, which takes the value of of the previously unwrapped element of the condition list or short circuits to the else block.

if (root_node.nodes, ?.[0], ?.symbol, ?.val) |nodes, node, symbol, val| {
   return val
} else {
   return null
}

Named chained unwraps with the : operator/separator, which allows us to name the intermediate values of the expression and use the current unwrap syntax

if (root_node.nodes) |n| : (n[0]) |x| : (x.symbol) |s| : (s.val) |val| {
   return val
} else {
   return null
}

Labeled chained conditions - something like the chained unwraps, but with nicer left-to-right reading.

if (n: root_node.nodes, x: n[0], s: x.symbol, s.val) |val| {
   return val
} else {
   return null
}

Short-circuit-if-null, or something like that? The .| operator - this is basically what is being suggested above, but it's another option for the operator that fits with unwrapping instead of asserting non-null. I'm not sure what it would mean outside an if statement, which wasn't clear for your proposal either. Maybe null coalesce?

if (root_node.nodes.|[0].|.symbol.|.val) |val| {
   return val
} else {
   return null
}

// Null coalesce? if any value is null, x will be null.
var x = root_node.nodes.|[0].|.symbol.|.val;

@SamTebbs33
Copy link
Contributor

@raulgrell your examples there are much too verbose for my liking. Perhaps they could be refined.

@raulgrell
Copy link
Contributor

raulgrell commented Jul 13, 2019

@SamTebbs33 I agree, but it was the best I could do in a way that I felt is consistent with the rest of the language.

The anonymous chained unwraps with ?. operator avoids the creation of names before they are used, and allow unwrapped values to be ignored with the usual _. We can also make it so that only the last unwrapped value is available:

if (root_node.nodes, ?.[0], ?.symbol, ?.val) |_, _, _, val| ...
if (root_node.nodes, ?.[0], ?.symbol, ?.val) |val| ...

The named chained unwraps with : separator would be better if parentheses weren't required in if statements, but we don't have a solution for that...

if root_node.nodes |n| : n[0]) |x| : x.symbol |s| : s.val |val| ...

The issue with the labeled chained conditions is that the label becomes a source of data as opposed to a destination. This looks better, but is a bit inconsistent with other uses of labels. You could also also consider n: root_node.nodes to be a var named n of type root_node.nodes, which is a bit silly.

The last one, Short-circuit-if-null, is no more verbose than the proposed replacement of .? (syntax 2 of the original proposal) but it allows us to keep the original meaning for .? and creates the same kind of "symmetry" that .? has with ?T, but with the statement (predicate) |name| syntax.
The equivalent "syntax 1" with a postfix | would be

if (root_node.nodes|[0]|.symbol|.val) |val| ...

Another variation on the named chained unwraps would be:

if |nodes, node, symbol| (root_node.nodes, nodes[0], node.symbol, symbol.val) |val| ...
if |n, x, s| (root_node.nodes, n[0], x.symbol, s.val) |val| ...

All of these are inconsistent with the "control flow with keywords" principle that Zig adheres to and are thus flawed. I also think we can't have operators that are both infix and postfix for parsing reasons, so that "syntax 1" variation probably out as well...

It's a doozy =P

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

No branches or pull requests

7 participants