
Make higher-order functions composable #33027

Closed · 1 commit

Conversation

@matbesancon (Contributor)

This proposal allows the direct composition of higher-order functions, starting with map and filter.

This is especially convenient with the piping syntax:

xs = rand(50)
xs |> filter(>(0.5)) |> map(x -> 2x) |> map(string)
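
For concreteness, a minimal sketch of what the proposed one-argument methods might look like (the PR's actual definitions may differ):

# Hypothetical sketch, not the PR's actual code: calling map or filter with
# only a function returns a closure that waits for the collection.
Base.map(f) = itr -> map(f, itr)
Base.filter(f) = itr -> filter(f, itr)

map(string)([1, 2, 3])  # ["1", "2", "3"]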

@matbesancon reopened this on Aug 22, 2019
@StefanKarpinski (Member)

Huh. This is cool and so simple I'm surprised that no one has proposed it before.

@StefanKarpinski added the feature and speculative labels on Aug 22, 2019
@StefanKarpinski (Member)

I would want to have a better notion of the scope of this: where do we stop? With a clear answer to that, I would be pretty happy with this.

@KristofferC (Member)

> starting with map and filter.

This scares me.

@StefanKarpinski (Member)

Hence my request for a notion of the scope of these changes. With the <(2)-style currying, for example, we have a clear criterion for when we do that kind of explicit currying: the function must be a binary operator, and the curried form fixes one of its two arguments. That limits the set of functions we could do that kind of currying for. If we had an equally clear rule for this kind of behavior, it would be a lot less scary.
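
For reference, this much already exists in Base: a comparison with one argument supplied returns a callable.

lt2 = <(2)           # Base.Fix2(<, 2): the second argument is fixed
lt2(1)               # true
filter(<(2), 1:5)    # [1]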

@matbesancon (Contributor, Author)

There was a discussion on Slack about this. Given the worried comments, one heuristic for where to stop would be to apply this only to functions with the signature hof(f, C{A}) -> C{B}, where C is a collection type. The rationale is that these are the functions that are useful to compose.

For sure, we can keep composing the result of a fold, but that would be less common than continuing a pipeline on a collection.

@matbesancon (Contributor, Author)

With this scope, the change does not apply to many other functions; most higher-order functions I can find either (examples below):

  1. don't take a function as a positional argument (zip, sort, flatten), or
  2. reduce the collection to a single element (reduce, mapreduce)
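
To make the proposed scope heuristic concrete, some illustrative signatures (these examples are mine, not from the PR):

map(string, 1:3)           # hof(f, C{A}) -> C{B}: in scope
filter(isodd, 1:3)         # hof(f, C{A}) -> C{A}: in scope
reduce(+, 1:3)             # reduces to a single element: out of scope
sort([3, 1, 2]; by=abs)    # function passed only as a keyword: out of scope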

@matbesancon (Contributor, Author)

Another example that would fit this pattern but isn't implemented in Base is zipWith:
http://learnyouahaskell.com/higher-order-functions
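
A hypothetical curried zipwith in the spirit of this proposal (the name and definition here are illustrative; for the eager case, Base's map already accepts multiple collections):

# Illustrative only, not in Base: curried zipWith à la Haskell.
zipwith(f) = (xs, ys) -> map(f, xs, ys)

zipwith(+)([1, 2, 3], [10, 20, 30])  # [11, 22, 33]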

@bramtayl (Contributor) commented on Aug 22, 2019

Wouldn’t chaining be just as powerful and more flexible? The LightQuery syntax would be

xs = rand(50)
@> xs |> filter((@_ > 0.5), _) |> map((@_ 2*_), _) |> map(string, _)

@bramtayl (Contributor)

And for super lazy powers, just use Base.Generator and Iterators.filter instead.
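
Spelled out with only Base, one way to write the OP's pipeline lazily:

xs = rand(50)
lazy = Base.Generator(string, Base.Generator(x -> 2x, Iterators.filter(>(0.5), xs)))
collect(lazy)  # a single pass; no intermediate arrays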

@ararslan (Member)

This seems like the slowest possible way to compose transformations; adding this just encourages people to write slow, unreadable, surprising, hard-to-reason-about code. One of the things I've appreciated about Julia over the years was that it didn't have this kind of thing, where you forget an argument to the method and suddenly your result is an anonymous function. And what if you want to compose higher order functions based on a different argument? It seems weird to decide that one particular argument of one particular method is The Blessed One which when omitted gives you a function.

@StefanKarpinski (Member)

Why would this be particularly slow?

@ararslan (Member)

Lots of anonymous functions and it tempts people to write things as in the OP, e.g. xs |> filter(>(0.5)) |> map(x -> 2x) |> map(string) does two maps, which makes two passes over the input and allocates two separate arrays.
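
Spelled out eagerly, each stage materializes a fresh array:

xs = rand(50)
ys = filter(>(0.5), xs)  # allocation 1
zs = map(x -> 2x, ys)    # allocation 2
ws = map(string, zs)     # allocation 3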

@StefanKarpinski (Member)

Fair point about the temporary arrays. I don't think the anonymous functions are so bad though.

@tkf (Member) commented on Aug 22, 2019

The temporary arrays are not a problem if you do this only for functions in Base.Iterators. There is no lazy equivalent of map, though, so you'd first need to define something like Iterators.map(f, itr) = Base.Generator(f, itr).
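
With that one-line definition (Iterators.map was indeed added to Base later, in Julia 1.6, with exactly this meaning), the pipeline stays lazy:

# tkf's suggested definition, under a local name so Base is left untouched:
lazymap(f, itr) = Base.Generator(f, itr)

collect(lazymap(string, lazymap(x -> 2x, Iterators.filter(>(0.5), rand(50)))))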

By the way, another nice thing is that xs |> filter(>(0.5)) |> map(x -> 2x) |> map(string) can also be written as

ixf = map(string) ∘ map(x -> 2x) ∘ filter(>(0.5))
ixf(xs)

The "iterator transform" ixf decouples computation from data: you can apply ixf to different arrays and iterators.

@matbesancon (Contributor, Author)

This came up in a conversation on #types with @JeffBezanson; to paraphrase him, building anonymous functions isn't too bad.

Whether temporary arrays are allocated depends on the type of collection being fed in.

@andyferris (Member)

Aren't transducers the way to compose functions like map and filter? E.g. Clojure has them built into its standard library. With transducers we can create compositions like map(f) ∘ filter(g) where everything is lazy and there are no temporary copies; incidentally, this covers the OP's use case as well (which, as written, would tend to create temporary copies of the intermediate result sets).

@tkf (Member) commented on Aug 24, 2019

@andyferris Transducers and iterator transforms (:= curried versions of "iterator factories" like map(f) and filter(f) in the OP) are identical in expressiveness at the level of surface syntax. They "just" differ in how they compose the code into a loop (and hence in how iterator/transducer implementers write the code). It is even possible to execute map(string) ∘ map(x -> 2x) ∘ filter(>(0.5)) as defined in the OP using transducers behind the scenes (and vice versa), given enough machinery. I think it might be reasonable to go down the iterator-transform path, especially when map, filter, etc. are defined as ("eager") iterators.

At this point, I cannot help mentioning that transducers are the "adjoint" of iterator transforms (thanks to @jw3126, who figured this out; see also slides 32 and 33 of my JuliaCon talk). That is to say, if Map and Filter are the corresponding transducers (named the way Clojure names them), the code in the OP would be expressed as

map(string) ∘ map(x -> 2x) ∘ filter(>(0.5))   # iterator transform
Filter(>(0.5)) ∘ Map(x -> 2x) ∘ Map(string)   # transducers

(this is like adjoint in linear algebra: (ABC)' = C'B'A')

So, to convert an iterator transform to a transducer, we can just map each factor (e.g., filter(f)) to the corresponding transducer (e.g., Filter(f)) while reversing the order of ∘.

Going back to the original point, I think defining map(f) to be a transducer could be confusing for users, since the order of composition would be flipped relative to what they'd expect if it returned a curried version (a reasonable expectation, since we have plenty of such examples, like ==(x)). Another difference is that transducers are applied to reducing functions (e.g., +), while iterator transforms are applied to iterators (e.g., a Vector).

@matbesancon (Contributor, Author)

I guess this should be closed, given the discussion above (thanks everyone).
Given what @tkf explained, the most promising direction for this would be iterator transforms, while transducers can live in an external package such as https://github.com/tkf/Transducers.jl
