Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request for finding closest feature #111

Closed
MHaringa opened this issue May 8, 2021 · 7 comments · Fixed by #112
Closed

Request for finding closest feature #111

MHaringa opened this issue May 8, 2021 · 7 comments · Fixed by #112

Comments

@MHaringa
Copy link

MHaringa commented May 8, 2021

Firstly, great work with the package! For example I have the following cities:

library(s2)
city_names <- c("Vatican City", "San Marino", "Luxembourg")
cities <- s2_data_cities(city_names)

And I would like to find the closest city to each city:

city_names[s2_closest_feature(cities, cities)]
#> [1] "Vatican City" "San Marino"   "Luxembourg"

This gives the city name itself. I am not sure how I can use s2_closest_feature() to find the closest city, and not the city itself.
Ideally I would like to use for example (with no specification for the vector y):

city_names[s2_closest_feature(cities)]

Can this be added?

Created on 2021-05-08 by the reprex package (v1.0.0)

@edzer
Copy link
Member

edzer commented May 8, 2021

Maybe this?

> sapply(seq_along(cities), function(i) setdiff(seq_along(cities), i)[s2_closest_feature(cities[i], cities[-i])])
[1] 2 1 2

@paleolimbot
Copy link
Collaborator

You're right that this is difficult to reproduce in s2! Is your use-case dealing with so many points that an vapply() or sapply()-based solution is going to be unreasonably slow? The following might be slightly faster:

library(s2)

city_names <- c("Vatican City", "San Marino", "Luxembourg")
cities <- s2_data_cities(city_names)
city_names[s2_closest_feature(cities, cities)]
#> [1] "Vatican City" "San Marino"   "Luxembourg"

ind <- vapply(
  seq_along(cities),
  function(i) s2_closest_feature(cities[i], cities[-i]),
  integer(1)
)

ifelse(ind >= seq_along(cities), ind + 1, ind)
#> [1] 2 1 2

Created on 2021-05-08 by the reprex package (v0.3.0)

@MHaringa
Copy link
Author

MHaringa commented May 8, 2021

Thanks for your solutions! My use-case indeed deals with so many points that looping takes a little while. An option to solve this kind of problems using s2 easily would be very much appreciated.

@edzer
Copy link
Member

edzer commented May 8, 2021

We can do this now in sf, which calls s2 for unprojected data - seems a common enough use case:

> library(sf)
Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1
> library(s2)
> city_names <- c("Vatican City", "San Marino", "Luxembourg")
> cities <- st_as_sfc(s2_data_cities(city_names))
> st_nearest_feature(cities)
[1] 2 1 2

@paleolimbot
Copy link
Collaborator

This is related to k-nearest neighbours, which is also a common request and wasn't possible quite yet. To implement this I added s2_closest_edges() in #112...I think the current function has some rough edges but it can do what you'd like it to do about 20x faster than the vapply() version:

library(s2)
n <- 1e4
geog <- as_s2_geography(s2_point(runif(n, -1, 1), runif(n, -1, 1), runif(n, -1, 1)))

next_closest_native <- function(geog) unlist(s2_closest_edges(geog, geog, k = 2, min_distance = 0))

bench::mark(next_closest_native(geog))
#> # A tibble: 1 x 6
#>   expression                     min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 next_closest_native(geog)    349ms    357ms      2.80     147KB        0

Created on 2021-05-08 by the reprex package (v0.3.0)

@edzer
Copy link
Member

edzer commented May 9, 2021

I believe you hit linear scaling here, where my Q&D does quadratic or worse...

@paleolimbot
Copy link
Collaborator

I think there's an even faster way to do it too ( https://github.com/r-spatial/s2/blob/master/inst/include/s2/s2closest_point_query.h ), but is more conducive to a dedicated S2Point data structure which we don't really have a system for yet.

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Sep 20, 2021
# version 1.0-2
* `st_read()` and `st_write()` using GDAL handle binary attributes
  (OFTBinary fields) ; #1721

* a `pivot_longer` method is added for `sf` objects (the `data.frame`
  method works, but raises a warning)

* `rbind.sf` preserves primary geometry column; #1717

* `configure` constrains using `--static` to `Darwin` platform; #1702,
  #1712, #1713

* old-style `crs` objects created with sf < 0.9-0 generate a message,
  and will cause a warning in the future.

* when `st_crs()` is called with a WKT2 as text input, its `input`
  field will be replaced with the CRS name (if it has one).

* GEOS (>= 3.9.0) operations use `GEOSGeom_setPrecision_r` to set
  precision of geometries; #1535

* `st_read()` with specified `query` ignores argument `layers`, and
  warns if it is given; #1444


# version 1.0-1
* fix regression in `st_intersection()`: when using s2 attributes were
  assigned wrongly; #1704

* `crs` (sf) to `CRS` (sp) conversion no longer needs validation by
  `rgdal`; edzer/sp#107

* retrieve ESRI's WKT version of CRS by `st_crs(id)$WKT1_ESRI`; #1690


# version 1.0-0
* add `s2` to Imports:

* add Dewey Dunnington to contributors

* `sf_use_s2()` prints a message when using s2 has been switched to on or off.

* use `s2` spherical geometry as default when coordinates are ellipsoidal. This can
  be switched off (defaulting to planar geometry, using GEOS, as in sf < 1.0-0)
  by setting environment variable `_SF_USE_S2` to `false` before package `sf`
  is loaded, or by `sf_use_s2(FALSE)`; #1649

* `st_nearest_feature()` with missing `y` returns nearest features in
  the remaining set of `x`; r-spatial/s2#111

* `st_write()` gains an argument `config_options` to set GDAL config
  options; #1618

* fix regression in `sf_project(..., keep = TRUE)`; #1635
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants