-
Notifications
You must be signed in to change notification settings - Fork 379
C API for shared edge simplification? #640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Thanks for starting to think about this. The core functionality of coverage building and simplification is still on the drawing board, to it's a little early to set any kind of API in stone. This is a good reminder that coverage handling (cleaning, simplifying, merging, overlay) are currently un-handled parts of the GIS pantheon that many many users of GEOS are eager to see filled in. |
I would greatly love to see something like this available either through GEOS or another C/C++ library! The sheer amount of complex javascript-to-xxx bridges in use just to expose the mapshaper simplification operations to other non-javascript tools is testament to how widespread the demand for this operation is 🙀 |
This would be fantastic! Just a note that I think that S2 can do this via the builder, which can simplify multiple objects at once without creating new intersections (https://github.com/google/s2geometry/blob/master/src/s2/s2builder.h#L113-L117). Making it accessible via the R package isn't something I've tried yet, though, so I don't know if the result is exactly what y'all are looking for here. |
Thumbs up! My limited experience with S2 (through this issue mattijn/topojson#50) is that the edges are actually not shared when zoomed in. This also fits the narrative that S2 is focused on spheres. I'm also not sure how one could do this without defining a topological representation within GEOS. I will subscribe to this issue, but not sure if there is anything I can contribute here. |
Thanks @brendan-ward for the detailed suggestion and analysis. As Paul alluded to there is some early-days work being done on improving the GEOS functionality for handling polygonal coverages. Clearly coverage simplification is of great interest, so that helps to focus the development effort. As Paul also says, the API will be ultimately be driven by the implementation. Some of the capabilities you suggest are obviously core to coverage simplification, while others imply a greater design effort. For instance, supporting multiple simplification algorithms will require significantly more complex code and a wider API. I think the initial effort will focus on VW simplification, since it is usually considered to provide a better quality result for polygons. Supporting a snapping tolerance is also more complex than simply processing the data as is. But I'm optimistic that coverage misalignment can be handled in a graceful way, where the output won't be any worse than the input, at least. Most likely this work will involve creating an internal topological data structure, but as you say this doesn't need to be exposed in the C API. Definitely the output elements should match the input elements 1-1, so that they can be easily mapped by client code. In particular, this will allow GEOS Coverage Simplification to be exposed as a window function in PostGIS. Stay tuned! |
To be clear, the callback idea was specifically intended to mitigate that, requiring the caller to implement that using available GEOS C API simplification methods. However, this occurs in a very hot loop in the process, so in addition to putting an awful lot of responsibility on the callback (to pass back in a valid geometry), it may a big performance hit.
👍 This is a great idea. Fully understood and agreed that figuring out the API before enough of the implementation is in place is putting the cart before the horse. The goal was definitely not to set the API in stone or use it to constrain the implementation, but rather to use that to help get the conversation rolling, since that is how many of us outside GEOS would interact with it. However, I think one of the key things raised here - which it looks like there is agreement on - is the aspect of the API that involves passing in a collection and getting back a collection that can be related 1:1 with the original. I might be mistaken, but I don't think that approach has been used previously for other C API functions. @paleolimbot this is separate from S2, unless you were suggesting using the API or implementation in S2 for spherical geometries as inspiration here? |
Yes, in order to make coverage simplification performant, the simplification strategy needs to be able to check if/when a change would cause invalidity to occur, and then back off. This requires some spatial indexing to be maintained. This needs to be exposed to the simplification strategy via an internal API. So there's a bit of a library-client dance required, which would be "fun" to implement in C!
Definitely that is a key aspect of the design for this kind of bulk processing. Up until now GEOS (and JTS) have mainly been focussed on operations on single or pairs of geometries. Moving into providing bulk operations needs to follow this new pattern. |
I didn't look into it, but under the hood, spatialite uses the following library: https://gitlab.com/rttopo/rttopo. Maybe a python wrapper of this library is an option? |
Note that the first steps along the path to providing polygonal coverage operations in GEOS has been commited. It includes:
These have not yet been exposed in the C API, pending strategizing about how to define a consistent API for coverage functions in general. Coverage simplification is now being worked in JTS. |
Coverage Simplification has now landed in JTS: locationtech/jts#911 and locationtech/jts#963. Hopefully this will be ported to GEOS soon, and exposed via the C API. See also blog post here. |
Thanks for the work @dr-jts! This is something to look forward to. One question. What type of strategy did you implement to detect junctions. Do you consider a path/arc shared when coordinates are on the same path or when coordinates appear in both paths? Eg. will the following example result in two shared arcs or one? from shapely import geometry
from shapely.plotting import plot_line
import matplotlib.pyplot as plt
data = geometry.MultiLineString([
[(0, 0), (10, 0), (10, 5), (20, 5)],
[(5, 0), (20, 0), (20, 5), (10, 5), (0, 5)]
])
fig = plt.figure(figsize=(6.5, 2.5))
ax = fig.add_subplot(111)
plot_line(data.geoms[0], ax=ax, add_points=True, color='green', linewidth=5, alpha=0.7)
plot_line(data.geoms[1], ax=ax, add_points=True, color='orange')
plt.show() |
The input to the simplifier algorithm is a set of polygons. So not sure whether or how your question applies to that? |
Not exactly sure either.. based on this code comment: * * If the polygon boundary intersects another polygon boundary, the vertices
* and line segments of the intersection match exactly. It seems a coordinated connected strategy is implemented, but again, what this exactly will mean for real-world use cases is yet to see. The goal here is also simplification and not returning coverages or topologies perse. Thank you for the feedback! |
@mattijn check out this comment in the blog post regarding coverage validation:
It's in that "fix" step that the cleanup will need to be implemented to get from a "path connected" to a "coordinate connected" or "valid polygon coverage" state I suppose. |
I'm going to close this as "finished". |
Being able to correctly simplify shared boundaries between polygons (or in theory any linear features with shared edges) is an ongoing need for users of GEOS (e.g., GeoPandas #1387, Shapely #1425, R Simple Features #381 to name a few). Indeed, some of these questions come up because users are surprised that the topology-preserving simplification in GEOS instead means that simplification is performed on a per-geometry basis and that outputs are individually topologically correct, yet the use within a set of polygons of shared boundaries produces gaps and overlaps (not topologically correct for the set of polygons).
There are packages in various ecosystems that help perform this simplification, including topojson, mapshaper, and others. In addition to the algorithms for performing the simplification of shared boundaries, these packages also center around TopoJSON as a representation of the topological relationships.
Whether or not GEOS should include a topological representation is a bigger issue that is outside the scope of what I'm trying to raise here. Instead, I wanted to see if there was interest in a more simplistic approach within GEOS as it exists now and what that API might look like.
To grossly simplify the simplification process (as I understand it):
For GEOS, perhaps this could be presented in an API that consumes and emits collections of geometries without needing an alternative I/O representation. Ideally this could be generalized across simplification algorithms (e.g., DP, VW, etc), rather than needing a separate implementation for each.
This seems like it could also leverage some of the existing capabilities and ongoing work around things like coverage union (e.g., #638) as well as STRtree since shared edge detection is a slow step that needs a spatial index.
From a C API perspective, I think it could look something like
(don't get fixated on the names, they are just examples)
The input GEOSGeometry would probably need to be a GeometryCollection because:
edge_det_tolerance
is for detecting shared edges; I think of this similar togridsize
in terms of set precision. Input edges may be very close to shared but have non-identical vertices; this tolerance parameter could be used to coerce them into being shared edges (I think?).GEOSSharedEdgeSimplifyCallback
is what makes this generalizable. It would look something like:The callback would take as input:
The output would need to be of lesser or equal dimensionality (if we allow collapse). If is_shared is 1, it must collapse no further than a point representing the start and end point locations (if we allow collapse), or a LineString connecting those.
The caller would provide this callback, and perform whatever simplification algorithm they want within it. Perhaps they would call the DP simplification already available in GEOS or VW once it becomes available, or they could provide an algorithm outside GEOS.
userdata
could be a struct that holds whatever simplification parameters they want to pass to their simplification algorithm; this is where they would pass in the actual tolerance parameter for their simplification.One of the important things we have to watch for here is that we need to (at least optionally) be able to get back out a GeometryCollection that is exactly the same length and same order as the input GeometryCollection, where input and output geometries are related 1:1. This is so that we can relate the outputs back to anything we have attached to the inputs (e.g., other record-level data).
Thoughts?
/cc @martinfleis @jorisvandenbossche @mattijn @edzer
The text was updated successfully, but these errors were encountered: