-
Notifications
You must be signed in to change notification settings - Fork 41
add Polygons + do long standing refactors #166
Conversation
I see you’ve merged and refactored.
Plan to factor out colortypes too? |
Also, it looks good, nice work :-) |
I just really love the syntax
I guess not, I was just still inspired by the StructArray method and thought that would be the clear path forward. But actually, best argument against it would be, that one has a non-hole polygon, and tries to insert it into a multipolygon as a hole - that'd be pretty inconvenient! I'll change it |
A case for this is because the simple feature standard defines polygons to support holes, taken from https://r-spatial.github.io/sf/articles/sf1.html
So if we want to read simple features polygons from GeoJSON.jl or GDAL.jl, we cannot convert it to GeometryTypes polygons if there are holes. |
No the idea would be to convert it to a |
Fair enough, it is nice.
With "it" you mean the polygon? Because |
I do see the point in adhering to an ISO standard if it doesn't come at a cost in other contexts though. @visr would it be possible to implement geometries in accordance with the simple features standard in a way that is consistent with the GeometryTypes classes, and which relies could take advantage of the work you've already done in the Geo packages? |
Yeah I'm not sure about it. Just wanted to point out that if we deviate from it here already, it will make interoperability more difficult. I don't think we can fix it by using multipolygons or this, since that is just a vector of polyons as a single feature. If we can easily make sure this mostly works with simple features it's worth keeping an eye on that. cc @yeesian |
But FWIW I don't think we need to follow the same implementation details, as long as the types circumscribe the same information. Remember that SF had to be expressible in a very non-expressive language such as R. I mean wouldn't it be perfectly consistent with struct ComplexPolygon <: AbstractPolygon
poly::Polygon
holes::Vector{Polygon}
end
struct MultiPolygon{T<:Union{ComplexPolygon, Polygon}} <: AbstractPolygon
polys::Vector{T}
bbox::Rect
end etc |
|
||
struct Polygon{T, A <: AbstractVector{<: AbstractPoint{2, T}}} | ||
points::A | ||
boundingbox::HyperRectangle{2, T} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is boundingbox
required inside the definition of the Point? Can we make the case for applications that deems it expensive (and/or irrelevant) to compute, that don't wish to force it at construction? I think the LightGraphs approach will be to say that it should be metadata that's maintained outside of the type.
Same thing for isconvex
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point (I think LightGraphs design philosophy is a bit too extreme, but that's beside the point). It is really a cache - it makes many operations like intersects potentially a lot faster, at the price of some time at construction. A third approach might be to have boundingbox
be a Union{Rect, Nothing}
and calculate it the first time boundingbox
is called on the polygon?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is really a cache - it makes many operations like intersects potentially a lot faster, at the price of some time at construction.
It's why GEOS has normal geometries, and prepared geometries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah that's a very C++-like way of dealing with that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked a question relevant to this here: https://discourse.julialang.org/t/best-practice-approach-for-caching-data-in-objects/20419
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda agree with @yeesian. I generally prefer minimalism in the basic type definitions, and implement a generic way of appending metadata like this.
A more Julian way would be e.g. a BoundedGeometry
that contains the AbstractGeometry
and a boundingbox
. Often you would compute/store this data as part of a spatial index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense - the only worry I have is to make the type hierarchy too complex too early when what we really want is composability.
|
||
struct MultiPolygon{T, A <: AbstractVector{<: Polygon{T}}} | ||
polygons::A | ||
boundingbox::HyperRectangle{2, T} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is boundingbox
required here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking about this and I came to the conclusion, that we should strictly get rid of any additional information and just figure out a nice infrastructure to pass metadata to the simple types!
Otherwise it's too hard to draw a line, and type will never be "complete", since there will always be new use cases that require different metadata ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then the possibility of "prepared types" that could be made by the user or by batching functions I guess, just like @yeesian suggested?. I can see the logic and since I was ever the only one really wanting to store the extra metadata anyway let's go with that.
Thanks for pushing on this Simon and Michael!
Yeah it feels more natural to me for the MultiPolygon to know whether a given Polygon is a hole or not, than for a Polygon to know whether or not it's the hole of a MultiPolygon. |
Why not go one step further and just say that Polygons have to implement a struct MultiPolygon{T, A <: AbstractVector{<: Polygon{T}}}
polygons::A
boundingbox::HyperRectangle{2, T}
end
holes(mp::MultiPolygon) = (p for p in mp.polygons if ishole(p))
struct ComplexPolygon{A <: AbstractVector{<: Polygon{T}}}
poly::Polygon
holes::A
end
holes(cp::ComplexPolygon) = cp.holes Then people can define whatever Polygons they want, and you can define function in(point::AbstractPoint, polygons)
for polygon in holes(polygons)
point in polygon && return false
end
# [...]
end Btw I'm not sure the current implementation of |
Btw it's not a deal-breaker, but it seems to me the current implementation of Polygon can't seem to decide whether it wants to be a Shapefile.Polygon or GeoJSON.Polygon or LinearRing or its own thing. I'll elaborate later when I find the time. Update: I'm going to postpone the discussion on this until after we agree upon a common geometry model in #166 (comment), since I imagine we want to see something like
at some point. |
It wants to be all of these IMHO. |
makes sense |
A few refactoring ideas:
|
Cool. I think I agree with a lot of what @yeesian has written already. I see this takes the Shapefile approach - the "multipolygon" we've been using is a collection of complex polygons. (As a side note, this makes me feel that we should really share the code we've written...) |
The end goal here is to have a uniform and lightweight representation for geometries throughout the ecosystem. Compasability is a goal, and it would be nice to do as much as possible with concrete types as 1) they shouldn't be too controversial for this case, 2) it's more robust and 3) it will avoid copying/conversions.
Now would seem like a good time to do it, I do agree :-)
I don't care so much but my view is that most operations could stay in this package in a separate files, since they rely on the types here and don't involve further dependencies. Also the names shouldn't conflict with names elsewhere. But I'm curious what it would take for @andyferris to think of this as basic enough? |
I'd like for us to move towards a single model of geometries that we can all work with [1]. It doesn't have to be comprehensive, but it has to be agreed upon. I'll like for such a definition to be sufficiently straightforward that something like We can add more methods to it as we go, but I want to be prudent/conservative in terms of what is required (versus nice-to-have). I'll be happy to flesh out a more detailed description of such a geometry type if we are all in agreement. Without further ado: may I propose the following starting point for all of us? We start with a single eltype(::AbstractGeometry)
ndims(::AbstractGeometry)
copy(::AbstractGeometry) I have some remarks (based on my reading of the current GeometryTypes codebase):
@SimonDanisch @mkborregaard @andyferris @visr (and anyone else following this thread with a vested interest in its outcome, thumbs up to this comment if it does not pose a problem for your model of geometries. Otherwise, provide a description of a case where it might be problematic.) [1]: My personal preference is for us to move towards a trait-like interface of working with geometries. But my understanding from various discussions is that there is still reluctance about it, and greater enthusiasm for building a unified type hierarchy of geometries. All of the discussion above is based on my reading of For
For
My guess from the docstring of
Therefore, the current type hierarchy is meant to be something like
with the following set of methods: eltype(::AbstractGeometry)
ndims(::AbstractGeometry)
vertices(::AbstractGeometry)
nvertices(::AbstractGeometry)
copy(::AbstractGeometry)
push!(::AbstractMutableGeometry, point)
deleteat!(::AbstractMutableGeometry, i) Is that a reasonable reading? What might I be missing? |
Thanks for writing it up @yeesian! Regarding:
It might be good to list some pros and cons of traits vs type hierarchies for this purpose. I'm more familiar with type hierarchies than traits, so that's what I mainly spoke about. But the success of Tables.jl in providing a common interface to many different table-like types might be transferred to geometries? |
So the way I see it there are three possible levels of integration: shared concrete types, abstract types with an informal interface, and traits. The advantage with concrete types is simplicity the way I see it. Everybody writes functions for the same types, it's easy to reason about what a type contains in a function, and it's thus much easier to write code that is optimised for performance. The issue with abstractions is that they may hide a non-performant implementation, which is exactly why specialized methods throughout the ecosystem tend to be faster. Another advantage is that it avoids copying because you never have to convert types. Another advantage is that if using the same basic concrete types becomes widespread throughout the ecosystem, you will have packages working together without ever having to align anything, or even plan for it. For instance if all plotting packages used these basic types it would be possible to plot shapefiles with any package without needing recipes. The big disadvantage of course is that you have to agree on the type implementation, so it must be possible to define an implementation that is actually ideal for all purposes; also refactors can be much more complicated if you don't make sure to still code mainly with abstractions such as getter functions. The advantage of abstract interfaces is that it allows you to have different types and still code to them under the same logic. This is extremely useful if you need to have different types. While abstractions are a lot easier to agree on and implement throughout the ecosystem, it also comes at the cost of unclarity. Say I load a shapefile with Shapefiles which is just julia Vectors of Points, and one with GDAL which wraps a pointer to a GDAL object, then uses LibgGEOS to intersect them - what type of Polygon should I then have? Do I end up with a mix of pure-Julia objects and C pointers etc in my workspace? And how many implicit copies happened by conversion in the process? (Let me know if I misunderstand any of the relationships here). To be clear what I suggest is having both - i.e. to have an abstract interface, but also attempt to use the same concrete types where possible. The advantage of traits is that it gives you many of the same advantages as an abstract interface but 1) it allows you to add traits to a type after it has been created (you can't really add a non-package-specific trait to a type created in a different package though, that would be type piracy), and it does not interfere with the existing type hierarchy, i.e. it allows you to have a semblance of multiple inheritance. It's also less simple, and IMHO clunkier to program with than an abstract interface. Traits are great exactly for something like Tables, because it is an explicit design goal of the Tables ecosystem that there should be many types of Table. Some types are faster for row access, some for columns access, some for big persistent data sets in many different files, some for type stability etc etc. The idea is that the user should choose exactly the type of Table that is best for his/her use case, then be able to apply all the same operations on it. To my mind though that's a very different situation from the Geo case, where we need coherence more than pluralism imho. I do have a question about design though, that relates to Geos and GDAL etc., given that they are the core of the geographical ecosystem in many languages. Are the C++ objects they use the same, or interchangeable? Or do you need converting? That's fine - julia is perfectly great as a scripting language. But it also has more powerful aspects - so then the question is - is there also space for a (simpler) Julia representation, , with julia types and deep integration with the rest of the ecosystem? I think so, which would be what we would try to build here. The ideal would be if these julia types could use geos and gdal natively, and I think I understood from Simon that it might be possible to define the julia types so they end up with the same memory representation as C types, but it's unclear to me how feasible this would be. So maybe the most feasible other possibility is to have one ecosystem based on concrete Julia types, another (like what you already have) wrapping the same C++ types (if all the c++ objects are compatible), all connected by an abstract interface? |
Could you elaborate on this point? What is the inheritance tree you're suggesting? |
well the points from @yeesian ;) Even though it'd would be wonderfull to have one type that can rule them all, we can't really plan with it since there will definitely reasons for different layouts. |
OK but then I don't think we disagree. The idea that it sounds like we're all converging on is to have 1) an ecosystem based around the C libraries that try to be consistent without copying among all the C libraries and 2) simple Julia-based implementations of a subset of the functionality, like Shapefiles.jl, that try to use the same concrete geometry that people like me who might go on a "crusade" to port clipper could work with, and both of these satisfying the same abstract interface, using traits where that is most convenient? |
That sounds perfectly in sync with what I meant ;) |
I don't know GeometryTypes very well but to me @yeesian 's scetch of a type hierarchy sounds reasonable. Philosophically I'm inclined to try to make it as simple as possible. |
I mean, the simplest possible thing to do is to just define a abstract type AbstractGeometry end and to let other package developers interpret what they will of it, haha. Multiple subtype hierarchies might exist and nothing is guaranteed to work across subtype hierarchies, but that can be left for subsequent discussions to resolve rather than holding up this PR. (My proposal was just to kickstart that conversation, but it doesn't have to happen here.) |
I think we should have at least AbstractPoint <: AbstractGeometry
AbstractLineSegment <: AbstractGeometry
AbstractPolygon <: AbstractGeometry might perhaps also be good to have AbstractComplexPolygon <: AbstractPolygon
AbstractMultiPolygon <: AbstractFlexibleGeometry{<:AbstractPolygon}
AbstractMultiLine <: AbstractFlexibleGeometry{<:AbstractLineSegment} Not wedded to the design just trying to circumscribe some types. |
(In case it is of interest, there's now some polygon code (and more) at https://github.com/FugroRoames/RoamesGeometry.jl) |
Nice discussion. I do tend to agree with the above that a package defining an abstract type hierarchy for geometry, with a few simple Julia concrete types, is the way to go. We just need to flesh out the abstract interface functions we use to interact with such geometries to the correct level of granularity. A couple of points on the abstract interface functions - if we use functions from
Finally, I find that when I want to mutate a geometry, I tend to know its concrete type. If you want to |
Good points! |
Answers to my own questions. 1 you mean within, like in the pr. 2. Is not a problem, as sphere does not need to be iterable. |
@SimonDanisch @mkborregaard is this ready to merge / are there any TODOs that we could pick up? @visr et al were considering moving over to GeometryTypes once it's ready. |
Yeah I wonder. I see that there is a Polygon in GeometryBasics.jl. Should we integrate with that package or do you think it will still be merged into this one? |
I plan to switch everything to GeometryBasics :) |
Ok good to know. In that case I guess we best try hopping on the basic bandwagon straightaway and see what we run into ;) |
Ok don't hesitate to spam me with questions ;) I'd love to get this all to work nicely together! |
I would like to implement this paper for fast point-in-poly. Do we want to move GeometryBasics to this org? |
@sjkelly I moved it! :) |
so awesome - action on this PR :-) |
I'm going to merge this, even though its incomplete :D I still plan to move the "real refactor" into GeometryBasics. |
No description provided.