[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

Meyerson et al., 2004 - Google Patents

A k-median algorithm with running time independent of data size

Meyerson et al., 2004

View PDF
Document ID
10722351027856968133
Author
Meyerson A
O'callaghan L
Plotkin S
Publication year
Publication venue
Machine Learning

External Links

Snippet

We give a sampling-based algorithm for the k-Median problem, with running time O (k (k^ 2 ∈\log k)^ 2 log (k ∈\log k)), where k is the desired number of clusters and∈ is a confidence parameter. This is the first k-Median algorithm with fully polynomial running time that is …
Continue reading at link.springer.com (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386Retrieval requests
    • G06F17/30424Query processing
    • G06F17/30533Other types of queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386Retrieval requests
    • G06F17/30424Query processing
    • G06F17/30477Query execution
    • G06F17/30483Query execution of query operations
    • G06F17/30486Unary operations; data partitioning operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30705Clustering or classification
    • G06F17/3071Clustering or classification including class or cluster creation or modification
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30312Storage and indexing structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30244Information retrieval; Database structures therefor; File system structures therefor in image databases
    • G06F17/30247Information retrieval; Database structures therefor; File system structures therefor in image databases based on features automatically derived from the image data
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30613Indexing
    • G06F17/30619Indexing indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30017Multimedia data retrieval; Retrieval of more than one type of audiovisual media
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6217Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6267Classification techniques
    • G06K9/6279Classification techniques relating to the number of classes

Similar Documents

Publication Publication Date Title
Meyerson et al. A k-median algorithm with running time independent of data size
Cunningham et al. k-Nearest neighbour classifiers: (with Python examples)
US7797323B1 (en) Producing representative hashes for segments of a file
US6507669B1 (en) Method of selecting clusters of items using a fuzzy histogram analysis
Dohnal et al. D-index: Distance searching index for metric data sets
Nguyen et al. Consensus clusterings
Lejsek et al. NV-Tree: An efficient disk-based index for approximate search in very large high-dimensional collections
Law et al. An adaptive nearest neighbor classification algorithm for data streams
US7475071B1 (en) Performing a parallel nearest-neighbor matching operation using a parallel hybrid spill tree
KR101266358B1 (en) A distributed index system based on multi-length signature files and method thereof
US7603370B2 (en) Method for duplicate detection and suppression
US20100106713A1 (en) Method for performing efficient similarity search
US6760718B2 (en) Database operation processor
WO2000028441A2 (en) A density-based indexing method for efficient execution of high-dimensional nearest-neighbor queries on large databases
JP2004199472A (en) Computer system for generating data structure for information retrieval, its method, computer executable program for generating data structure for information retrieval, computer readable storage medium storing computer executable program, information retrieval system, and graphical user interface
US7120624B2 (en) Optimization based method for estimating the results of aggregate queries
Tavenard et al. Improving the efficiency of traditional DTW accelerators
US20080065682A1 (en) Search index generation apparatus
Cayton et al. A learning framework for nearest neighbor search
Han et al. Ranking the big sky: efficient top-k skyline computation on massive data
Bansal et al. Ad-hoc aggregations of ranked lists in the presence of hierarchies
Andoni et al. Approximate nearest neighbors beyond space partitions
Erdinç et al. MCMSTStream: applying minimum spanning tree to KD-tree-based micro-clusters to define arbitrary-shaped clusters in streaming data
Zhang et al. PARROT: pattern-based correlation exploitation in big partitioned data series
Aslam et al. Scalable information organization