Clustering algorithm · Issue #1 · visioncortex/visioncortex · GitHub

Clustering algorithm #1

Open
dbalabka opened this issue Sep 21, 2021 · 10 comments

Comments

@dbalabka

What clustering algorithm does this library implement?
What distance between colors/pixels is used to compute the clusters?

@tyt2y3
Member
tyt2y3 commented Sep 22, 2021

Thank you for sticking around. Sorry for missing the other issue in vtracer.

This is ancient stuff haha.

Basically the idea is https://www.researchgate.net/publication/8199997_Statistical_Region_Merging (Although I did not start from this paper, the idea is the same).

The key feature of the technique is its ability to handle natural images with an 'infinite' number of colours while retaining a sane amount of information to process.

There is a fundamental trade-off between efficiency and statistical optimality, so my suggestion is to start with smaller regions first, then merge those regions hierarchically.
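
The "start small, then merge" idea can be sketched roughly as follows. This is not the visioncortex implementation and not the full Statistical Region Merging predicate: it assumes a plain Euclidean RGB distance between region means and a fixed `threshold`, and all names (`Regions`, `merge_regions`) are made up for illustration. Every pixel starts as its own region, and neighbouring regions are merged in order of increasing colour difference.

```rust
/// Union-find over pixels. Each root tracks its region's pixel count and
/// colour sum, so the region's mean colour is available at any time.
struct Regions {
    parent: Vec<usize>,
    size: Vec<u32>,
    sum: Vec<[f64; 3]>,
}

fn to_f64(p: [u8; 3]) -> [f64; 3] {
    [p[0] as f64, p[1] as f64, p[2] as f64]
}

/// Euclidean distance in RGB space (an assumption, not the library's metric).
fn dist(a: [f64; 3], b: [f64; 3]) -> f64 {
    ((a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2) + (a[2] - b[2]).powi(2)).sqrt()
}

impl Regions {
    fn new(pixels: &[[u8; 3]]) -> Self {
        Regions {
            parent: (0..pixels.len()).collect(),
            size: vec![1; pixels.len()],
            sum: pixels.iter().map(|&p| to_f64(p)).collect(),
        }
    }

    fn find(&mut self, mut i: usize) -> usize {
        while self.parent[i] != i {
            let grandparent = self.parent[self.parent[i]];
            self.parent[i] = grandparent; // path halving
            i = grandparent;
        }
        i
    }

    fn mean(&self, root: usize) -> [f64; 3] {
        let n = self.size[root] as f64;
        let s = self.sum[root];
        [s[0] / n, s[1] / n, s[2] / n]
    }

    /// Merge two distinct roots, attaching the smaller region to the larger.
    fn union(&mut self, a: usize, b: usize) {
        let (big, small) = if self.size[a] >= self.size[b] { (a, b) } else { (b, a) };
        let (small_size, small_sum) = (self.size[small], self.sum[small]);
        self.parent[small] = big;
        self.size[big] += small_size;
        for c in 0..3 {
            self.sum[big][c] += small_sum[c];
        }
    }
}

/// Returns one region label per pixel (row-major). Assumes `width > 0`.
fn merge_regions(pixels: &[[u8; 3]], width: usize, threshold: f64) -> Vec<usize> {
    let height = pixels.len() / width;
    let mut regions = Regions::new(pixels);

    // Collect edges between 4-connected neighbours, weighted by colour difference.
    let mut edges: Vec<(f64, usize, usize)> = Vec::new();
    for y in 0..height {
        for x in 0..width {
            let i = y * width + x;
            if x + 1 < width {
                edges.push((dist(to_f64(pixels[i]), to_f64(pixels[i + 1])), i, i + 1));
            }
            if y + 1 < height {
                edges.push((dist(to_f64(pixels[i]), to_f64(pixels[i + width])), i, i + width));
            }
        }
    }
    // Most similar neighbours first, so small coherent regions form before big merges.
    edges.sort_by(|a, b| a.0.total_cmp(&b.0));

    for (_, a, b) in edges {
        let (ra, rb) = (regions.find(a), regions.find(b));
        if ra != rb && dist(regions.mean(ra), regions.mean(rb)) <= threshold {
            regions.union(ra, rb);
        }
    }
    (0..pixels.len()).map(|i| regions.find(i)).collect()
}
```

For a 4×1 strip of two reddish and two bluish pixels, `merge_regions(&pixels, 4, 30.0)` puts each colour pair into its own region. The number of colours never has to be specified up front; only the merge threshold does.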

If you know how many colours there are, or are 100% sure that it is a graphical image composed of solid colour patches, then perhaps K-means clustering will produce better results. (The question is K=?)

@dbalabka
Author
dbalabka commented Sep 22, 2021

@tyt2y3, thanks for explaining. It really helps!

Currently, I have to post-process the SVG. I tried to cluster the colors with K-means, but the problem is that I have to know the exact number of colors. To fully automate this process, I used DBSCAN, which produces the same clustering result without needing to know the number of colors. The only catch is that the epsilon input parameter has to be tuned properly.
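
A minimal sketch of DBSCAN over a list of colors, to show why the number of clusters does not have to be known in advance. The thread does not say which distance was used, so Euclidean distance in RGB is an assumption here, as are the names `dbscan` and `NOISE`. The neighbour search is brute force (O(n²)), which is only reasonable for a small palette such as the fill colors of an SVG.

```rust
/// Label for a color that belongs to no cluster.
const NOISE: usize = usize::MAX;

/// Euclidean distance in RGB space (an assumption, see the note above).
fn dist(a: [u8; 3], b: [u8; 3]) -> f64 {
    let d = |i: usize| (a[i] as f64 - b[i] as f64).powi(2);
    (d(0) + d(1) + d(2)).sqrt()
}

/// Indices of all colors within `eps` of color `i`, including `i` itself.
fn neighbours(colors: &[[u8; 3]], i: usize, eps: f64) -> Vec<usize> {
    (0..colors.len())
        .filter(|&j| dist(colors[i], colors[j]) <= eps)
        .collect()
}

/// Returns a cluster index per color, or `NOISE` for outliers.
fn dbscan(colors: &[[u8; 3]], eps: f64, min_pts: usize) -> Vec<usize> {
    // None = not visited yet.
    let mut labels: Vec<Option<usize>> = vec![None; colors.len()];
    let mut cluster = 0;

    for i in 0..colors.len() {
        if labels[i].is_some() {
            continue;
        }
        let seeds = neighbours(colors, i, eps);
        if seeds.len() < min_pts {
            labels[i] = Some(NOISE); // may later become a border point
            continue;
        }
        // `i` is a core point: start a new cluster and grow it.
        labels[i] = Some(cluster);
        let mut queue = seeds;
        while let Some(j) = queue.pop() {
            let current = labels[j];
            match current {
                // Previously marked as noise: it is a border point of this cluster.
                Some(NOISE) => labels[j] = Some(cluster),
                // Already assigned to a cluster: leave it alone.
                Some(_) => {}
                None => {
                    labels[j] = Some(cluster);
                    let reach = neighbours(colors, j, eps);
                    if reach.len() >= min_pts {
                        queue.extend(reach); // `j` is also a core point
                    }
                }
            }
        }
        cluster += 1;
    }
    labels.into_iter().map(|l| l.unwrap_or(NOISE)).collect()
}
```

With two near-red colors, two near-blue colors and a lone green, `dbscan(&colors, 20.0, 2)` yields two clusters and marks the green as noise. The choice of `eps` decides how far apart two colors may be and still count as "the same", which is the tuning problem mentioned above.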

IMO it would be great to have different clustering algorithms.

Also, I'm thinking about implementing shape antialiasing. Do you have any ideas? I can create an additional ticket to discuss this.

@tyt2y3
Member
tyt2y3 commented Sep 22, 2021

Definitely great to have different clustering algorithms.
Though, it seems some refactoring has to be done to make the Cluster interface algo-agnostic.

> Also, I'm thinking about implementing shape antialiasing.

Um... I don't get it. Where do you want to remove the jaggies from?

@dbalabka
Author
dbalabka commented Sep 22, 2021

@tyt2y3 here are two examples to illustrate my idea:
Original (VTracer):
image
Antialiased (VectorMagic):
image

@tyt2y3
Member
tyt2y3 commented Sep 23, 2021

The above images were not generated by vtracer, right?

I think it can be done by first reducing the number of points in the shape, then subdivide-smoothing it afterwards.

There are such methods within the codebase.
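
To illustrate the subdivide-smoothing step, here is a sketch using Chaikin corner cutting on a closed polygon. This is a swapped-in, well-known subdivision scheme chosen for brevity; the thread does not say which scheme the codebase uses, and the names `chaikin_closed` and `smooth` are hypothetical.

```rust
type Point = (f64, f64);

/// One pass of Chaikin corner cutting on a closed polygon.
/// Each edge (p0, p1) is replaced by the two points at 1/4 and 3/4 along it,
/// which cuts every corner and doubles the point count.
fn chaikin_closed(points: &[Point]) -> Vec<Point> {
    let n = points.len();
    let mut out = Vec::with_capacity(2 * n);
    for i in 0..n {
        let (x0, y0) = points[i];
        let (x1, y1) = points[(i + 1) % n]; // wrap around to close the shape
        out.push((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1));
        out.push((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1));
    }
    out
}

/// Apply several passes; more passes give a rounder outline.
fn smooth(points: &[Point], passes: usize) -> Vec<Point> {
    let mut current = points.to_vec();
    for _ in 0..passes {
        current = chaikin_closed(&current);
    }
    current
}
```

One pass over a square with corners (0,0), (4,0), (4,4), (0,4) gives an octagon starting with (1,0) and (3,0). Smoothing works best on a path that has already been reduced, since otherwise every pixel-level stair step is faithfully rounded instead of removed.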

@dbalabka
Author

@tyt2y3 the first variant was generated with VTracer and the second with VectorMagic.
Here is an example:
image

Here is the example image:
color-chart-rabbit_cropped (1)

@tyt2y3
Member
tyt2y3 commented Sep 28, 2021

Oh, that's why. I mean, the source image is already full of jaggies.

Then yes, it's possible to remove the jaggies by reducing the path with the 'radius' algorithm prior to curve fitting.
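
The comment above does not spell out what the 'radius' algorithm does. A plausible reading, and purely an assumption here, is radial-distance path reduction: walk the path and drop every point that lies within `radius` of the last point that was kept. The sketch below shows that technique; `reduce_by_radius` is a hypothetical name, not a function from the codebase.

```rust
type Point = (f64, f64);

/// Radial-distance reduction: keep a point only if it lies more than
/// `radius` away from the most recently kept point.
fn reduce_by_radius(path: &[Point], radius: f64) -> Vec<Point> {
    let mut kept: Vec<Point> = Vec::new();
    for &p in path {
        let far_enough = match kept.last() {
            Some(&(kx, ky)) => (p.0 - kx).hypot(p.1 - ky) > radius,
            None => true, // always keep the first point
        };
        if far_enough {
            kept.push(p);
        }
    }
    kept
}
```

On a one-pixel staircase such as (0,0), (1,0), (1,1), (2,1), (2,2), (3,2), (3,3), a radius of 1.5 keeps only (0,0), (2,1) and (3,3), so the stair steps collapse into a roughly diagonal line before any curve fitting happens. Note that this simple version does not guarantee that the final point of the path survives.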

Just a side question, in what business scenario are you doing this? Can you share a bit?

That may motivate me a bit or quench my curiosity.

@dbalabka
Author

The goal is to restore the vector image. The provided image is a good example of a noisy image that previously had a vector version. I'm sorry, but I cannot share much.

Previously, I had a similar idea for overcoming this problem.

@tyt2y3
Member
tyt2y3 commented Sep 30, 2021

Well it looks cute. Not all graphics are made the same.

The more assumptions we can make on the original graphic, the better the recovered result will be.

Or, if we can model the degradation process, we can develop special processing to counteract it.

@dbalabka
Author

Here is a library which might help to implement different clustering algorithms: https://github.com/rust-ml/linfa
