Docs: Does polars-st support multiple geometry columns? #19
Comments
The thing is... the GeoPandas concept of a geometry column doesn't really exist in this library. It can look like it does, because there is a global configuration with a default geometry column name, but that's all. So yes, you can have as many geometry columns as you want and work on them in parallel, because they're actually just regular Polars binary columns. For example, you could "dissolve" (which is really just a union aggregation) several at once by writing: gdf.group_by("category").agg(st.union_all("geom_1", "geom_2", "geom_3", ...))
Well, that's awesome to hear! I guess I almost wish that the default geometry column behaviour in queries/transforms didn't exist, as I feel it creates a confusing situation where the example code resorts to implicit behaviour. I think the default geometry column name on DataFrame construction might be okay and could make sense, but I'm having trouble getting behind it in examples like this: area = gdf.select(st.area()) In normal Polars, there really isn't this same concept. There's
This is because most of the time, people only deal with dataframes that have a single geometry column, which always has the same name. That's why the concept of a default geometry column exists and has been adopted by GeoPandas (at which point it's basically the de facto standard). The most common operations should be the easiest to write. If you only ever deal with a single column, which is always named "geometry", it can feel very redundant having to write every single time: area = gdf.select(st.area("geometry")) instead of: area = gdf.select(st.area()) At least, that's how it felt to me, especially considering you just have to write
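The trade-off described above (an optional column argument that falls back to a configured default name) can be sketched in plain Python. This is only an illustration of the pattern, not polars-st code: `DEFAULT_GEOMETRY_COLUMN`, `rect_area`, and `area` are hypothetical names, and axis-aligned rectangles stand in for real geometries.

```python
# Hypothetical sketch: none of these names come from polars-st.
DEFAULT_GEOMETRY_COLUMN = "geometry"  # stands in for the global configuration


def rect_area(rect):
    """Area of an axis-aligned rectangle given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return (xmax - xmin) * (ymax - ymin)


def area(table, *columns):
    """Compute areas for the named columns, or for the default column if none is named."""
    names = columns or (DEFAULT_GEOMETRY_COLUMN,)
    return {name: [rect_area(r) for r in table[name]] for name in names}


# A toy "dataframe" with two geometry-like columns.
table = {
    "geometry": [(0, 0, 2, 3), (1, 1, 2, 2)],
    "footprint": [(0, 0, 1, 1), (0, 0, 4, 5)],
}

implicit = area(table)                          # falls back to "geometry"
explicit = area(table, "geometry")              # same result, spelled out
several = area(table, "geometry", "footprint")  # any number of columns
```

In this sketch the implicit and explicit calls return the same result, and naming columns explicitly is what lets one call cover several geometry columns.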
Okay, well, fair enough. I guess just the original docs request is where I'll leave this then.
So I've actually been thinking about this a bit more, and might end up changing my mind. Just so you know. 👍
Supporting multiple geometry columns would be a huge bragging point for this library.
GeoPandas doesn't support multiple geometry columns very well. For example, their .dissolve method (which does a groupby) assumes there's only one geometry column (which is a huge pain, for obvious reasons). It automatically assumes you're talking about "the only geometry column" in a lot of methods, leading to bad assumptions. It seems that this library could have (and probably does have) first-class support for multiple geometry columns, so bragging about that in the docs (maybe in a "Coming from GeoPandas" section) would be huge!