8000 C API for shared edge simplification? · Issue #640 · libgeos/geos · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
C API for shared edge simplification? #640
Closed
@brendan-ward

Description

@brendan-ward

Being able to correctly simplify shared boundaries between polygons (or in theory any linear features with shared edges) is an ongoing need for users of GEOS (e.g., GeoPandas #1387, Shapely #1425, R Simple Features #381 to name a few). Indeed, some of these questions come up because users are surprised that the topology-preserving simplification in GEOS instead means that simplification is performed on a per-geometry basis and that outputs are individually topologically correct, yet the use within a set of polygons of shared boundaries produces gaps and overlaps (not topologically correct for the set of polygons).

There are packages in various ecosystems that help perform this simplification, including topojson, mapshaper, and others. In addition to the algorithms for performing the simplification of shared boundaries, these packages also center around TopoJSON as a representation of the topological relationships.

Whether or not GEOS should include a topological representation is a bigger issue that is outside the scope of what I'm trying to raise here. Instead, I wanted to see if there was interest in a more simplistic approach within GEOS as it exists now and what that API might look like.

To grossly simplify the simplification process (as I understand it):

  • given input linear features that may have underlying shared edges
  • find all shared edges and decompose geometries into sets of those while retaining unshared edges
  • perform some sort of simplification to shared and unshared edges; for each shared edge the simplification is performed once no matter how many geometries share that edge
  • topological cleanup (e.g., dealing with holes in polygons whose boundaries have now been moved, so that output polygons are individually topologically correct)
  • reassemble geometries

For GEOS, perhaps this could be presented in an API that consumes and emits collections of geometries without needing an alternative I/O representation. Ideally this could be generalized across simplification algorithms (e.g., DP, VW, etc), rather than needing a separate implementation for each.

This seems like it could also leverage some of the existing capabilities and ongoing work around things like coverage union (e.g., #638) as well as STRtree since shared edge detection is a slow step that needs a spatial index.

From a C API perspective, I think it could look something like

*GEOSGeometry GEOSSharedEdgeSimplify(
    GEOSGeometry* g, 
    double edge_det_tolerance, 
    GEOSSharedEdgeSimplifyCallback callback, 
    void *userdata
)

(don't get fixated on the names, they are just examples)

The input GEOSGeometry would probably need to be a GeometryCollection because:

  • we may want to support cases of polygons and lines along their boundaries
  • polygons may have overlaps (because data are not clean), which means that MultiPolygons are not valid.

edge_det_tolerance is for detecting shared edges; I think of this similar to gridsize in terms of set precision. Input edges may be very close to shared but have non-identical vertices; this tolerance parameter could be used to coerce them into being shared edges (I think?).

GEOSSharedEdgeSimplifyCallback is what makes this generalizable. It would look something like:

*GEOSGeometry GEOSSharedEdgeSimplifyCallback(GEOSGeometry *geometry, int is_shared, void *userdata)

The callback would take as input:

  • LineString (?) for shared edges, and is_shared would be 1
  • Polygon, LineString, Point, etc for isolated geometries, and is_shared would be 0

The output would need to be of lesser or equal dimensionality (if we allow collapse). If is_shared is 1, it must collapse no further than a point representing the start and end point locations (if we allow collapse), or a LineString connecting those.

The caller would provide this callback, and perform whatever simplification algorithm they want within it. Perhaps they would call the DP simplification already available in GEOS or VW once it becomes available, or they could provide an algorithm outside GEOS.

userdata could be a struct that holds whatever simplification parameters they want to pass to their simplification algorithm; this is where they would pass in the actual tolerance parameter for their simplification.

One of the important things we have to watch for here is that we need to (at least optionally) be able to get back out a GeometryCollection that is exactly the same length and same order as the input GeometryCollection, where input and output geometries are related 1:1. This is so that we can relate the outputs back to anything we have attached to the inputs (e.g., other record-level data).

Thoughts?

/cc @martinfleis @jorisvandenbossche @mattijn @edzer

Metadata

Metadata

Assignees

No one assigned

    Labels

    EnhancementNew feature or feature improvement.

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      0