[DOC] Hierarchical, spectral, or density-based clustering using sklearn and aeon distance metrics #1241

SebastianSchmidl · 2024-02-24T14:32:59Z

Describe the issue linked to the documentation

The clustering component in aeon currently supports only partition-based methods. However, there are also hierarchical, spectral, and density-based clustering methods [1].

Suggest a potential alternative/fix

Using the distance metrics in aeon, we can pre-compute the distance matrix for traditional clustering methods. Some methods are already implemented in sklearn, which is a core dependency of aeon and, thus, available to users. I think we should at least link to the sklearn clusterers in the documentation. With a bit more effort, we could provide examples of how to use sklearn's clusterers with aeon's distance measures (here).

hierarchical clustering:
- sklearn.cluster.AgglomerativeClustering with metric="precomputed"
density-based clustering:
- sklearn.cluster.DBSCAN with metric="precomputed"
- sklearn.cluster.OPTICS with metric="precomputed"
- Density Peaks? https://doi.org/10.1007/s10115-018-1189-7 ➡️ separate issue [ENH] Density Peaks (DP) clusterer #2133
spectral clustering:
- sklearn.cluster.SpectralClustering with affinity="precomputed" and the inverse of the distance matrix (an affinity matrix is expected, where large values indicate greater similarity)

I did not yet test this approach.

[1]: Paparrizos, John, and Luis Gravano. "Fast and Accurate Time-Series Clustering." ACM Transactions on Database Systems 42, no. 2 (2017): 8:1-8:49. https://doi.org/10.1145/3044711.


TonyBagnall · 2024-02-27T13:25:07Z

Thanks for this. I think we have some examples of using precomputed distances with scikit-learn, but if they are not easy to find, it would be great to make them clearer. I would like to get density peaks in; IIRC we have a Java implementation.

SalmanDeveloperz · 2025-01-03T16:18:19Z

Hey, can I work on this issue?

SebastianSchmidl · 2025-01-04T14:48:17Z

Yes, sure.
@aeon-actions-bot assign @SalmanDeveloperz

SalmanDeveloperz · 2025-01-18T17:29:43Z

Hey,
I'm working on this issue and would appreciate your guidance on a few points:

- Where should I add the example? Should it go in an existing documentation file (if so, which one), or should I create a new file in the docs/ directory?
- Are there any specific datasets or clustering algorithms you would like me to include in the examples (e.g., Agglomerative, Spectral Clustering)?
- Is there a preferred format for the documentation (e.g., .md) or specific style guidelines I should follow?
- Should I include the example code in a separate script or keep it embedded within the documentation file?

Once I have clarification, I’ll proceed with the implementation and submit a PR.
Thank you for your guidance!

SebastianSchmidl added the documentation Improvements or additions to documentation label Feb 24, 2024

TonyBagnall added the good first issue Good for newcomers label Jun 8, 2024

aeon-actions-bot bot assigned SalmanDeveloperz Jan 4, 2025

aeon-actions-bot bot removed the good first issue Good for newcomers label Jan 4, 2025
