Proposal: Custom Operators / Infix Functions #427

Closed
raulgrell opened this issue Aug 20, 2017 · 8 comments
Labels
enhancement Solving this issue will likely involve adding new logic or components to the codebase. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.

@raulgrell
Contributor

raulgrell commented Aug 20, 2017

Operator overloading can be very useful, but there is often a concern that it hinders the ability to understand code at first glance: not only may you have to check whether + really means add, but it hides a function call. One of Zig's main objectives is clarity, so this makes operator overloading a no-go.

Proposal: allow binary functions to be called as operators.
Corollary: allow infix function calls

The # is there to make it easier for both humans and computers to parse, and marks the infix function call:

const Vec2 = struct {
    data: [2]f32,

    pub fn add(self: &const Vec2, other: &const Vec2) -> Vec2 {
        vec2(self.data[0] + other.data[0], self.data[1] + other.data[1])
    }
    pub fn mul(self: &const Vec2, other: &const Vec2) -> Vec2 {
        vec2(self.data[0] * other.data[0], self.data[1] * other.data[1])
    }
};

const x = vec2(1, 2);
const y = vec2(3, 4);

add(x, y) == x #add y;
add(mul(add(x, y), x), x) == x #add  y #mul x  #add x;
add(add(x, mul(y, x)), x) == x #add (y #mul x) #add x;

// The UFCS way looks like it's mutating the vecs:
x.add(y).mul(x).add(x) == x #add  y #mul x  #add x;
x.add(y.mul(x)).add(x) == x #add (y #mul x) #add x;

Notes:

  • Precedence: all custom operators have the same precedence level and are therefore evaluated left to right. Handling associativity could be more trouble than it's worth.
  • I'd like to be able to have both unary/binary functions. It would also be interesting to allow prefix operators somehow. Since Zig has no postfix operators, we probably don't want to allow them.
  • This will also be useful for things like parser combinators and 'expression-level' DSLs.
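
To make the left-to-right rule concrete, here is a rough sketch written against recent Zig syntax rather than the 2017 syntax used in this thread. Vec2, vec2, add and mul follow the proposal above. The #add / #mul forms are hypothetical syntax, so they appear only in comments; the test uses ordinary calls to show what each grouping would mean.

```zig
const std = @import("std");

const Vec2 = struct {
    data: [2]f32,

    pub fn add(self: Vec2, other: Vec2) Vec2 {
        return vec2(self.data[0] + other.data[0], self.data[1] + other.data[1]);
    }

    pub fn mul(self: Vec2, other: Vec2) Vec2 {
        return vec2(self.data[0] * other.data[0], self.data[1] * other.data[1]);
    }
};

fn vec2(x: f32, y: f32) Vec2 {
    return Vec2{ .data = .{ x, y } };
}

test "equal precedence groups left to right" {
    const x = vec2(1, 2);
    const y = vec2(3, 4);

    // x #add y #mul x #add x  would read as ((x add y) mul x) add x
    const flat = x.add(y).mul(x).add(x);
    const nested = Vec2.add(Vec2.mul(Vec2.add(x, y), x), x);
    try std.testing.expect(flat.data[0] == nested.data[0]);
    try std.testing.expect(flat.data[1] == nested.data[1]);

    // x #add (y #mul x) #add x  needs explicit parentheses and gives a different result
    const grouped = x.add(y.mul(x)).add(x);
    try std.testing.expect(grouped.data[1] != flat.data[1]);
}
```

With a single precedence level, a reader who expects mul to bind tighter than add has to add the parentheses by hand, as in the second case.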

Kotlin does something similar for single-parameter member and extension functions marked infix:
kotlinlang.org/docs/reference/functions.html#infix-notation

I'll expand on this at a later date if there is any interest.

@andrewrk andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Aug 20, 2017
@andrewrk andrewrk added this to the 0.2.0 milestone Aug 20, 2017
@andrewrk
Member

This is a really coherent idea, thank you for the proposal. I can think of 2 reasons that make me hesitant to go down this road, despite the (correct) observation that the UFCS way looks like it's mutating the vecs.

  • Goes against "only one obvious way to do things"
  • Makes the language bigger, increasing knowledge overhead of using it, with arguably low benefit. We're already getting a lot of flak for requiring knowledge of the sigils % and ?. (even though I agree that # makes it easier for humans and compilers to parse)

Milestone 0.2.0 means we'll accept and implement or reject this proposal before 0.2.0 release.

@thejoshwolfe
Contributor

A trivial counter to the point about mutating the parameters is an application level solution: just name the functions differently so they sound more like pure functions.

// Probably not mutating the vecs
x.plus(y).times(x).plus(x);
x.plus(y.times(x)).plus(x);

This has nothing to do with infix operator/function syntax; mutable self parameters are a thing that needs to be understood in every context. Even if we had infix functions, we'd still need to worry about it.
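
A small sketch of that point, in recent Zig syntax (not the 2017 syntax of this thread). The method name addInPlace is made up for illustration. Both calls use the same dot syntax; only the declared type of self says whether the receiver can change.

```zig
const std = @import("std");

const Vec2 = struct {
    data: [2]f32,

    // Takes self by value and returns a new Vec2: the caller's value is not modified.
    pub fn plus(self: Vec2, other: Vec2) Vec2 {
        return Vec2{ .data = .{ self.data[0] + other.data[0], self.data[1] + other.data[1] } };
    }

    // Takes a mutable pointer: the signature itself says the receiver changes.
    // (Hypothetical method, not part of the proposal.)
    pub fn addInPlace(self: *Vec2, other: Vec2) void {
        self.data[0] += other.data[0];
        self.data[1] += other.data[1];
    }
};

test "the signature, not the call syntax, decides mutation" {
    var x = Vec2{ .data = .{ 1, 2 } };
    const y = Vec2{ .data = .{ 3, 4 } };

    const sum = x.plus(y);
    try std.testing.expect(x.data[0] == 1); // x is untouched
    try std.testing.expect(sum.data[0] == 4);

    x.addInPlace(y); // same dot syntax, but x is modified
    try std.testing.expect(x.data[0] == 4);
}
```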

@marler8997
Contributor

This proposal by @raulgrell could serve as an alternative to supporting UFCS. In order to see how this could be useful, consider a common use case in D. UFCS allows you to write a chain of functions that transform an input range, and you can write them in the order that they are applied instead of an awkward set of function calls, i.e.

// the awkward way
raiseToThePowerOf( plus( filterIfGreaterThan(myInputRange, 3), 10 ), 2)
// using UFCS to write them in order
myInputRange.filterIfGreaterThan(3).plus(10).raiseToThePowerOf(2)

In this example filterIfGreaterThan, plus and raiseToThePowerOf are just normal functions that take an input range but UFCS allows them to behave like "postfix operators" meaning they can be chained together.
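
For comparison, here is roughly how the same chain can be written with Zig's existing method syntax (recent Zig, not the 2017 syntax of this thread). Seq is a made-up fixed-capacity stand-in for an input range, and the sketch assumes filterIfGreaterThan keeps the elements greater than the limit. Unlike D's UFCS, the functions have to be declared inside the struct for the dot syntax to find them.

```zig
const std = @import("std");

// Hypothetical fixed-capacity sequence standing in for an "input range".
// Assumes at most 8 elements; more would trip a bounds check in safe builds.
const Seq = struct {
    items: [8]i32 = undefined,
    len: usize = 0,

    pub fn from(values: []const i32) Seq {
        var out = Seq{};
        for (values) |v| {
            out.items[out.len] = v;
            out.len += 1;
        }
        return out;
    }

    // Keeps only the elements strictly greater than `limit`.
    pub fn filterIfGreaterThan(self: Seq, limit: i32) Seq {
        var out = Seq{};
        for (self.items[0..self.len]) |v| {
            if (v > limit) {
                out.items[out.len] = v;
                out.len += 1;
            }
        }
        return out;
    }

    // Adds `n` to every element.
    pub fn plus(self: Seq, n: i32) Seq {
        var out = self;
        for (out.items[0..out.len]) |*v| v.* += n;
        return out;
    }

    // Raises every element to `exponent` by repeated multiplication.
    pub fn raiseToThePowerOf(self: Seq, exponent: u32) Seq {
        var out = self;
        for (out.items[0..out.len]) |*v| {
            const base = v.*;
            var result: i32 = 1;
            var i: u32 = 0;
            while (i < exponent) : (i += 1) result *= base;
            v.* = result;
        }
        return out;
    }
};

test "method chaining reads in application order" {
    const input = Seq.from(&[_]i32{ 1, 4, 5 });

    // Nested calls read inside-out; the chain reads left to right.
    const nested = Seq.raiseToThePowerOf(Seq.plus(Seq.filterIfGreaterThan(input, 3), 10), 2);
    const chained = input.filterIfGreaterThan(3).plus(10).raiseToThePowerOf(2);

    try std.testing.expect(chained.len == 2);
    try std.testing.expect(chained.items[0] == 196 and chained.items[1] == 225);
    try std.testing.expect(nested.items[0] == chained.items[0]);
    try std.testing.expect(nested.items[1] == chained.items[1]);
}
```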

With this proposal the # could generally mean, take the previous expression and pass it as an argument to the following function

// zig, awkward way
raiseToThePowerOf( plus( filterIfGreaterThan(myInputRange, 3), 10 ), 2)
// zig, using proposed syntax
myInputRange #filterIfGreaterThan(3) #plus(10) #raiseToThePowerOf(2)

However, as Andrew pointed out, this violates the "only one obvious way to do things" principle, but there may be a way to make this work only one way. That would be to support a function argument annotation (maybe prefix) that indicates the argument should come before the function name when calling it.

const Vec2 = struct {
    data: [2]f32,

    pub fn plus(prefix self: &const Vec2, other: &const Vec2) -> Vec2 {
        vec2(self.data[0] + other.data[0], self.data[1] + other.data[1])
    }
};

const x = vec2(1, 2);
const y = vec2(3, 4);
plus(x, y); // Error: the 'plus' function requires 1 prefix argument
x #plus y; // OK

Actually since the compiler now knows that the plus function requires 1 prefix argument, the # is not necessary, you could just do

x plus y;
// example of changing order of evaluation
a plus b plus c;
a plus (b plus c);

Original example would be

myInputRange filterIfGreaterThan(3) plus(10) raiseToThePowerOf(2)

If we wanted to support something like this, I propose that we find as many interesting use cases as we can and come up with a set of generalized function annotations that allow the developer to customize the syntax for how their function is called. I think it's clear that providing a way to customize the syntax can be invaluable in certain cases, so the question is, can we make a simple enough set of rules/annotations that is simultaneously "very useful" without being "very complicated".

@thejoshwolfe
Contributor

Language features which are just syntactic sugar are rare in Zig, and they tend to only show up where it makes a big difference. Some examples are ??x desugaring to x ?? unreachable and %return x desugaring to x %% |err| return err. These examples correspond to very specific very common situations where we want to encourage developers to write the proper semantics in their code. For example, %return x encourages developers to propagate errors responsibly, and ??x is better than having dead code check for a null pointer from a C API that will never return one. Also, these language features are very simple: you only need a single placeholder x to illustrate them.
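
For readers on later Zig releases: these operators were respelled after this thread. ??x is now written x.? (short for x orelse unreachable), and %return x is now try x (short for x catch |err| return err). The sketch below shows both pairs side by side in recent syntax; parseDigit and the two doubled... functions are made-up helpers.

```zig
const std = @import("std");

const ParseError = error{NotADigit};

// Hypothetical helper: converts one ASCII digit to its numeric value.
fn parseDigit(c: u8) ParseError!u8 {
    if (c < '0' or c > '9') return ParseError.NotADigit;
    return c - '0';
}

// Sugared form: `try` propagates the error to the caller.
fn doubledWithTry(c: u8) ParseError!u8 {
    const d = try parseDigit(c);
    return d * 2;
}

// Desugared form: the same propagation written out by hand.
fn doubledDesugared(c: u8) ParseError!u8 {
    const d = parseDigit(c) catch |err| return err;
    return d * 2;
}

test "try and its desugared form agree" {
    try std.testing.expect((try doubledWithTry('4')) == 8);
    try std.testing.expect((try doubledDesugared('4')) == 8);
    try std.testing.expectError(ParseError.NotADigit, doubledWithTry('x'));
    try std.testing.expectError(ParseError.NotADigit, doubledDesugared('x'));
}

test ".? is sugar for orelse unreachable" {
    const maybe: ?u8 = 7;
    try std.testing.expect(maybe.? == 7);
    try std.testing.expect((maybe orelse unreachable) == 7);
}
```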

This proposal so far does not encourage or enable any particular semantics. x #f y desugaring to f(x, y) is roughly the same number of characters; its benefits come from rearranging x and f and removing the ) after y. If you compare x #f y desugaring to x.f(y) then it's the same number of characters, and doesn't rearrange anything; all it does is remove parentheses and do something different with operator precedence.

We've already got method chaining syntax that you can use with your matrix, vector, and complex number types like this:

w = x.plus(y).times(x).plus(x);
z = x.plus(y.times(x)).plus(x);

and I see nothing wrong with this.

I understand that this has the limitation of only working with methods defined in the type of the first parameter. I read the Wikipedia article that calls this excessive coupling between classes, but that's a gross overstatement. You'll always be able to use normal function call syntax if you need it, and method syntax is available if the struct author provides it. If your code has a mix of normal function calls and method calls, that might actually be better for readability for the following reason:

An important feature of current function call semantics in Zig is the question: where is this function defined? For f(x, y), f is defined globally, so check the declarations in the current file and check the @imports. For x.f(y), f is defined in the type of x, so figure out what type x is, and then look at its declaration.
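
A minimal sketch of that lookup rule in recent Zig syntax. Point, dot and manhattan are made-up names: dot lives in the struct's namespace, manhattan at file scope, and each is reachable only through the matching call form.

```zig
const std = @import("std");

const Point = struct {
    x: i32,
    y: i32,

    // Declared inside Point, so `p.dot(q)` resolves here.
    pub fn dot(self: Point, other: Point) i32 {
        return self.x * other.x + self.y * other.y;
    }
};

// Declared at file scope, so it is only reachable as `manhattan(p, q)`.
fn manhattan(a: Point, b: Point) i32 {
    const dx = if (a.x > b.x) a.x - b.x else b.x - a.x;
    const dy = if (a.y > b.y) a.y - b.y else b.y - a.y;
    return dx + dy;
}

test "call syntax tells you where to look" {
    const p = Point{ .x = 1, .y = 2 };
    const q = Point{ .x = 4, .y = 6 };

    // Method syntax is shorthand for calling through the type's namespace.
    try std.testing.expect(p.dot(q) == Point.dot(p, q));
    try std.testing.expect(p.dot(q) == 16);

    // p.manhattan(q) would not compile: Point has no such declaration.
    try std.testing.expect(manhattan(p, q) == 7);
}
```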

For more discussion on UFCS in Zig, see #148.

@raulgrell
Contributor Author

I have actually been meaning to retract my confidence in this idea.

The UFCS syntax has been proving itself sufficient for my use cases. I still haven't finished my parser combinator, but I already think the operator syntax isn't going to make code any more elegant.

The #'s were also a way of making the name stand out, make it explicit that it's a function call and not just an identifier. I like sigils, but I concede that we should save them for less trivial things than this. The % and ? are easy to justify because they're part of Zig's value proposition: nullables and errors. This, not so much.

@thejoshwolfe made a good point that naming can make things obvious enough. Even though I was looking for code structure as a way to communicate intent, this did end up being gratuitous sugaring.

@andrewrk
Member

There is discussion happening on the (reopened) UFCS issue, but I think nobody is still making a case for this issue.

@tiehuis tiehuis added proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. rejected labels Sep 15, 2017
@milahu

milahu commented May 8, 2022

// zig, awkward way
raiseToThePowerOf( plus( filterIfGreaterThan(myInputRange, 3), 10 ), 2)

// zig, using proposed syntax
zigInputRange #filterIfGreaterThan(3) #plus(3) #raiseToThePowerOf(2)

compromise solution

const result = blk: {
  var x = zigInputRange.filterIfGreaterThan(3);
  x = x.plus(3);
  x = x.raiseToThePowerOf(2);
  break :blk x;
};

the goal of the chain syntax is to show the "imperative data flow" from left to right
we get the same effect by writing the imperative flow from top to bottom
(just like a complete noob would write the code)
(and the "pros" will say "meh that's not elegant")

the result = at the start announces where the block's result is stored

zig is a compiled language, so the compiler can optimize the code

this could also be solved with functional pipes
but that is more obscure, and requires "well behaved" functions
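
a rough sketch of what such a pipe could look like in recent Zig (pipe, double and increment are made-up names, not anything from std)
"well behaved" here means every stage has the same shape, i32 in and i32 out:

```zig
const std = @import("std");

fn double(x: i32) i32 {
    return x * 2;
}

fn increment(x: i32) i32 {
    return x + 1;
}

// Hypothetical helper: applies each function in the comptime tuple `fns`
// to the running value, left to right.
fn pipe(start: i32, comptime fns: anytype) i32 {
    var acc = start;
    inline for (fns) |f| {
        acc = f(acc);
    }
    return acc;
}

test "pipe applies stages left to right" {
    try std.testing.expect(pipe(3, .{ double, increment }) == 7);
    try std.testing.expect(pipe(3, .{ increment, double }) == 8);
}
```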

another example: https://nelari.us/post/raytracer_with_rust_and_zig/#living-without-operator-overloading

// Zig

if (discriminant > 0.0) {
    // I stared at this monster for a while to ensure I got it right
    return uv.sub(n.mul(dt)).mul(ni_over_nt).sub(n.mul(math.sqrt(discriminant)));
}

infix

(ni_over_nt * (uv - (dt * n))) - (discriminant.sqrt() * n)

"imperative block"

if (discriminant > 0.0) {
  return blk: {
    // a = (ni_over_nt * (uv - (dt * n)))
    var a = n.mul(dt);
    a = uv.sub(a);
    a = a.mul(ni_over_nt);
    // b = (discriminant.sqrt() * n)
    const b = n.mul(math.sqrt(discriminant));
    // a - b
    break :blk a.sub(b);
  };
}

edit: fixed syntax, thanks @leecannon

@leecannon
Contributor

leecannon commented May 8, 2022

@milahu The two given examples are not that different from status quo:

const result = blk: {
  var x = zigInputRange.filterIfGreaterThan(3);
  x = x.plus(3);
  x = x.raiseToThePowerOf(2);
  break :blk x;
};

7 participants