Dynamic Similarity and Distance Measures Based on Quantiles

Monica J. Ruiz-Miró &
Margaret Miró-Julià

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9520)

Included in the following conference series:

International Conference on Computer Aided Systems Theory

Abstract

Data Mining has emerged in response to technological advances and deals with the treatment of large amounts of data. Its aim is to extract new, valid, comprehensible and useful knowledge by constructing a simple model that describes the data and can also be used in prediction tasks. Extracting knowledge from data is an interdisciplinary challenge that draws upon research in statistics, pattern recognition and machine learning, among others.

A common technique for identifying natural groups hidden in data is clustering. Clustering is a process that automatically discovers structure in data and does not require any supervision. The model’s performance relies heavily on the choice of an appropriate measure. It is important to use the appropriate similarity metric to measure the proximity between two objects, but the separability of clusters must also be taken into account.

This paper addresses the problem of comparing two or more sets of overlapping data as a basis for comparing different partitions of quantitative data. An approach that uses statistical concepts to measure the distance between partitions is presented. The data’s descriptive knowledge is expressed by means of a boxplot that allows for the construction of clusters taking into account conditional probabilities.
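The abstract does not state the measure itself, so the following is only an illustrative sketch of how boxplot quantiles might be used to compare two overlapping sets of quantitative data. The function names `five_number_summary` and `iqr_overlap`, and the overlap score they compute, are assumptions introduced here for illustration; they are not the authors' definitions.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a boxplot displays."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, med, q3, max(values))


def iqr_overlap(a, b):
    """Hypothetical overlap score in [0, 1] (not the paper's measure).

    Length of the intersection of the two interquartile ranges divided by
    the length of the interval spanning both of them.
    """
    _, a1, _, a3, _ = five_number_summary(a)
    _, b1, _, b3, _ = five_number_summary(b)
    inter = max(0.0, min(a3, b3) - max(a1, b1))
    span = max(a3, b3) - min(a1, b1)
    # Both boxes collapse to the same single point: treat as full overlap.
    return inter / span if span > 0 else 1.0


if __name__ == "__main__":
    group_a = [1, 2, 3, 4, 5]
    group_b = [2, 3, 4, 5, 6]
    print(five_number_summary(group_a))
    print(iqr_overlap(group_a, group_b))
```

Under this assumed score, identical samples give 1.0 and samples whose boxes do not intersect give 0.0; one minus the score could serve as a crude dissimilarity between two groups.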

Author information

Authors and Affiliations

Departament de Ciències Matemàtiques i Informàtica, Universitat de les Illes Balears, 07122, Palma de Mallorca, Spain
Monica J. Ruiz-Miró & Margaret Miró-Julià

Corresponding author

Correspondence to Margaret Miró-Julià.

Editor information

Editors and Affiliations

Universidad de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain
Roberto Moreno-Díaz
Johannes Kepler University Linz, Linz, Austria
Franz Pichler
Universidad de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain
Alexis Quesada-Arencibia

About this paper

Cite this paper

Ruiz-Miró, M.J., Miró-Julià, M. (2015). Dynamic Similarity and Distance Measures Based on Quantiles. In: Moreno-Díaz, R., Pichler, F., Quesada-Arencibia, A. (eds) Computer Aided Systems Theory – EUROCAST 2015. EUROCAST 2015. Lecture Notes in Computer Science, vol 9520. Springer, Cham. https://doi.org/10.1007/978-3-319-27340-2_11

DOI: https://doi.org/10.1007/978-3-319-27340-2_11
Published: 17 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27339-6
Online ISBN: 978-3-319-27340-2
eBook Packages: Computer Science, Computer Science (R0)
