WO2000016250A1 - Data decomposition/reduction method for visualizing data clusters/sub-clusters - Google Patents
Data decomposition/reduction method for visualizing data clusters/sub-clusters
- Publication number
- WO2000016250A1 (PCT/US1999/021363)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- level
- clusters
- projection
- visualization
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims description 25
- 230000009467 reduction Effects 0.000 title claims description 15
- 238000000354 decomposition reaction Methods 0.000 title claims description 12
- 238000012800 visualization Methods 0.000 claims abstract description 68
- 238000012545 processing Methods 0.000 claims abstract description 23
- 238000000513 principal component analysis Methods 0.000 claims abstract description 18
- 230000000007 visual effect Effects 0.000 claims abstract description 11
- 239000011159 matrix material Substances 0.000 claims description 13
- 230000008569 process Effects 0.000 claims description 9
- 230000003044 adaptive effect Effects 0.000 claims description 7
- 238000000605 extraction Methods 0.000 claims description 7
- 230000000295 complement effect Effects 0.000 claims description 4
- 238000007621 cluster analysis Methods 0.000 claims 4
- 230000004044 response Effects 0.000 claims 1
- 239000000203 mixture Substances 0.000 abstract description 15
- 238000009826 distribution Methods 0.000 abstract description 8
- 238000007476 Maximum Likelihood Methods 0.000 description 6
- 238000004458 analytical method Methods 0.000 description 4
- 238000013459 approach Methods 0.000 description 4
- 230000000694 effects Effects 0.000 description 4
- 238000013079 data visualisation Methods 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 238000010187 selection method Methods 0.000 description 3
- 238000010276 construction Methods 0.000 description 2
- 238000013506 data mapping Methods 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 238000009472 formulation Methods 0.000 description 2
- 238000000926 separation method Methods 0.000 description 2
- 238000000638 solvent extraction Methods 0.000 description 2
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 238000007405 data analysis Methods 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 238000002156 mixing Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000003936 working memory Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/231—Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/40—Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor
Definitions
- the present invention relates generically to the field of data analysis and data presentation and, more particularly, to the analysis of data sets having higher dimensionality data points in order to optimally present the data in a lower-dimensional context, i.e., in a hierarchy of two- or three-dimensional visual contexts to reveal data structures within the data set.
- the visualization of data sets having a large number of data points with multiple variables or attributes associated with each data point represents a complex problem.
- it is difficult, a priori, to easily identify groups or subgroups of data points that have relational attributes such that structures and sub-structures existing within the data set can be visualized.
- Various techniques have been developed for processing the data sets to reveal internal structures as an aid to understanding the data.
- a large data set will oftentimes have data points that are multi-variant, that is, a single data point can have a multitude of attributes, including attributes that are completely independent from one another or have some degree of inter-attribute relationship or dependency.
- a single projection of a higher-order data set onto a visualization space may not be able to present all of the structures and substructures within the data set of interest in such a way that the structures or sub-structures can be visually distinguished or discriminated.
- one presentation schema involves hierarchical visualization, by which the data set is first viewed at a highest-level, whole-data-set viewpoint. Thereafter, features within the highest-level projection are identified in accordance with an algorithm(s) or other identification criteria, and those next-highest-level features are further processed to reveal their respective internal structure in another projection(s).
- This hierarchical process can be repeated for successive levels to present successively finer and more detailed views of the data set.
- in a hierarchical visualization scheme, an image tree is provided with the successively lower images of the tree revealing more detail.
- the data set is subjected by Bishop and Tipping to a form of linear latent variable modelling to find a representation of the multidimensional data set in terms of two latent, or "hidden," variables that are determined indirectly from the data set.
- the modelling is similar to principal component analysis, but defines a probability density in the data space.
- a single top-level latent variable model is generated with the posterior mean of each data point plotted in the latent space. Any cluster centers identified in this initial plot are used as the basis for initiating the next-lower-level analysis leading to a mixture of the latent variable models.
- the parameters, including the optimal projections, are determined by maximum likelihood; this criterion need not always lead to the most interesting or interpretable visualization plots.
Disclosure of Invention
- the present invention provides a data decomposition/reduction method for visualizing large sets of multi-variant data, including the processing of the multi-variant data down to two- or three-dimensional space in order to optimally reveal otherwise hidden structures within the data set, including the principal data cluster or clusters at a first or top level of processing and additional sub-clusters within the principal data clusters in successive lower-level visualizations.
- the identification of the morphology of clusters and subclusters and inter-cluster separation and relative positioning within a large data set allows investigation of the underlying drive that created the data set morphology and the intra-data-set features.
- the data set, constituted by a multitude of data points each having a plurality of attributes, is initially processed as a whole using multiple finite normal mixture models and hierarchical visualization spaces to develop the multi-level data visualization and interpretation.
- the top-level model and its projection explain the entire data set revealing the presence of clusters and cluster relationships, while lower-level models and projections display internal structure within individual clusters, such as the presence of subclusters, which might not be apparent in the higher-level models and projections.
- each level is relatively simple, while the complete hierarchy maintains overall flexibility and still conveys considerable structural information.
- the arrangement combines (a) minimax entropy modeling by which the models are determined and various parameters estimated and (b) principal component analysis to optimize structure decomposition and dimensionality reduction.
- the present invention advantageously performs a probabilistic principal component analysis to project the softly partitioned data space down to a desired two-dimensional visualization space, leading to an optimal dimensionality reduction and allowing the best extraction and visualization of local clusters.
- the minimax entropy principle is used to select the model structures and estimate their parameter values, where the soft partitioning of the data set results in a standard finite normal mixture model with minimum conditional bias and variance.
- the present invention treats structure decomposition and dimensionality reduction as two separate but complementary operations, where the criterion used to optimize dimensionality reduction is the separation of clusters rather than the maximum likelihood approach of Bishop and Tipping.
- the resulting projections, in turn, enhance the performance of structure decomposition at the next lower level.
- a model selection procedure is applied to determine the number of subclusters inside each cluster at each level using an information-theoretic criterion based upon the minimum of alternate calculations of the Akaike Information Criterion (AIC) and the minimum description length (MDL) criterion. This determination allows the process of the present invention to automatically determine whether a further split of a subspace should be implemented or whether to terminate further processing, as sketched below.
- AIC Akaike Information Criterion
- MDL minimum description length
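As an illustration of this split/terminate decision, the following is a minimal sketch (not the patent's own formulation) that scores candidate numbers of mixture components with AIC and with MDL, here approximated by BIC, using scikit-learn's GaussianMixture and keeping the K giving the smallest criterion value; the function name `select_num_clusters` is hypothetical:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def select_num_clusters(X, k_max=6, seed=0):
    """Pick the number of mixture components by the minimum of AIC and MDL.

    Illustrative sketch only: MDL is approximated here by BIC
    (-2 log L + p log N), one common formulation; the patent's exact
    criteria are not reproduced.
    """
    best_k, best_score = 1, np.inf
    for k in range(1, k_max + 1):
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(X)
        aic = gmm.aic(X)          # -2 log L + 2 * n_params
        mdl = gmm.bic(X)          # -2 log L + n_params * log(N)  (BIC as MDL proxy)
        score = min(aic, mdl)
        if score < best_score:
            best_k, best_score = k, score
    return best_k
```

In a hierarchical run, the same routine would be invoked on the data assigned to each cluster to decide whether that cluster should be split further or left as a leaf.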
- a probabilistic adaptive principal component extraction (PAPEX) algorithm is also applied to estimate the desired number of principal axes. When the dimensionality of the raw data is high, this PAPEX approach is computationally very efficient.
- the present invention defines a probability distribution in data space which naturally induces a corresponding distribution in projection space through a Radon transform. This defined probability distribution permits an independent procedure for determining values of the intrinsic model parameters without concurrent estimation of the projection mapping matrix.
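As a hedged illustration (a standard property of linear projections of Gaussian mixtures, not an equation reproduced from the patent): if the data-space density is a finite normal mixture, the induced density in the projection space x = Wᵀt is again a finite normal mixture with unchanged mixing proportions:

$$
p(t)=\sum_{k=1}^{K_0}\pi_k\,\mathcal{N}\!\left(t\mid \mu_{tk},\,C_{tk}\right)
\;\Longrightarrow\;
p(x)=\sum_{k=1}^{K_0}\pi_k\,\mathcal{N}\!\left(x\mid W^{\mathsf T}\mu_{tk},\,W^{\mathsf T}C_{tk}W\right),
\qquad x=W^{\mathsf T}t .
$$

This is consistent with estimating the intrinsic model parameters in data space independently of the projection mapping matrix.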
- the underlying "drive" that gives rise to the data points often causes those points to form clusters because more than one variable may be a function of that same underlying drive.
- the data set (designated herein as the t-space) is projected onto a single x-space (i.e., two-dimensional space), in which a descriptor W is determined from the sample covariance matrix C_t by fitting a single Gaussian model to the data set over t-space.
- a value f(t) is then determined for K_0, in which the values of π_k, z_ik, μ_tk, and C_tk are further refined by maximizing the likelihood over t-space.
- G_k(t) is determined by repeating the above process steps to thus construct multiple x-subspaces at the third level; the hierarchy is completed under the information-theoretic criteria using the AIC and the MDL, and all x-space subspaces are plotted for visual evaluation.
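The construction just outlined — a single Gaussian fit and global projection at the top level, an SFNM of K_0 submodels refined over t-space at the second level, and a repetition of the same step within each subspace at the third — can be sketched in Python. This is a minimal, illustrative sketch assuming scikit-learn's GaussianMixture for the SFNM/EM fit and a direct eigendecomposition of each posterior-weighted covariance for the local axes; the names `top_two_axes` and `hierarchical_projections` are hypothetical, not the patent's:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def top_two_axes(cov):
    # W: eigenvectors of the covariance for the two largest eigenvalues
    vals, vecs = np.linalg.eigh(cov)
    return vecs[:, np.argsort(vals)[::-1][:2]]

def hierarchical_projections(T, k0):
    """Two-level sketch: global projection, then one projection per cluster."""
    # Top level: single Gaussian over t-space, W from the sample covariance C_t
    mu, C_t = T.mean(axis=0), np.cov(T, rowvar=False)
    W = top_two_axes(C_t)
    top_view = (T - mu) @ W

    # Second level: SFNM with k0 submodels fitted over t-space (EM)
    gmm = GaussianMixture(n_components=k0).fit(T)
    z = gmm.predict_proba(T)                      # posteriors z_ik
    sub_views = []
    for k in range(k0):
        w_k = z[:, k][:, None]
        mu_k = (w_k * T).sum(0) / w_k.sum()
        D = T - mu_k
        C_tk = (w_k * D).T @ D / w_k.sum()        # posterior-weighted covariance
        sub_views.append(D @ top_two_axes(C_tk))  # local 2-D subspace
    return top_view, sub_views
```

The third level would repeat the same fit-then-project step on the data soft-assigned to each submodel, with the split/terminate decision made by the AIC/MDL selection described above.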
- the present invention advantageously provides a data decomposition/reduction method for visualizing data clusters/sub-clusters within a large data space that is optimally effective and computationally efficient.
- FIG. 1 is a schematic block diagram of a system for processing a raw multi-variant data set in accordance with the present invention;
- FIG. 2 is a flow diagram of the process flow of the present invention
- FIG. 2A is an alternative visualization of the process flow of the present invention.
- FIG. 3 is an example of the projection of a data set onto a 2-dimensional visualization space after determination of the principal axis;
- FIG. 4A is a 2-dimensional visualization space of one of the clusters of FIG. 3;
- FIG. 4B is a 2-dimensional visualization space of another of the clusters of FIG. 3;
- FIG. 5 is an example of the projection of a data set onto a 2-dimensional visualization space after determination of the principal axis;
- FIG. 6A is a 2-dimensional visualization space of one of the clusters of FIG. 5;
- FIG. 6B is a 2-dimensional visualization space of a second of the clusters of FIG. 5;
- FIG. 6C is a 2-dimensional visualization space of a third of the clusters of FIG. 5.
- a processing system for implementing the dimensionality reduction using probabilistic principal component analysis and structure decomposition using adaptive expectation maximization methods for visualizing data in accordance with the present invention is shown in FIG. 1 and designated generally therein by the reference character 10.
- the system 10 includes a working memory 12 that accepts the raw multi-variant data set, indicated at 14, and which bi-directionally interfaces with a processor 16.
- the processor 16 processes the raw t-space data set 14 as explained in more detail below and presents that data to a graphical user interface (GUI) 18 which presents a two- or three-dimensional visual presentation to the user as also explained below.
- GUI graphical user interface
- a plotter or printer 20 can be provided to generate a printed record of the display output of the graphical user interface (GUI) .
- the processor 16 may take the form of a software- or firmware-programmed CPU, ALU, ASIC, or microprocessor, or a combination thereof.
- the data set is subjected to a global principal component analysis to thereafter effect a top-most projection.
- This step is initiated by determining the value of a variable W for the top-most projection in the hierarchy of projections.
- W is directly found by evaluating the covariance matrix C_t.
- APEX adaptive principal component extraction
- the two-step expectation maximization (EM) algorithm can be applied to allow a standard finite normal mixture model (SFNM), i.e., f(t) = Σ_k π_k g(t | μ_tk, C_tk) for k = 1, ..., K_0, where g(·) denotes a Gaussian component density, to be fitted to the data.
- the standard finite normal mixture (SFNM) modeling solution addresses the estimation of the regional parameters (π_k, μ_tk) and the detection of the structural parameter K_0 in the relationship given above.
- the EM algorithm is implemented as a two-step process, i.e., the E-step and the M-step as follows:
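The E-step and M-step equations themselves do not survive in this text; the following is a minimal sketch of a standard EM update for a finite normal mixture (posterior responsibilities in the E-step, weighted re-estimation of π_k, μ_k, and C_k in the M-step), offered as an illustration rather than the patent's exact recursion:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_sfnm(X, K, n_iter=100, seed=0):
    """Minimal EM sketch for a finite normal mixture (not the patent's exact equations)."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(N, K, replace=False)]
    C = np.array([np.cov(X, rowvar=False) + 1e-6 * np.eye(d) for _ in range(K)])
    for _ in range(n_iter):
        # E-step: posterior z_ik that point i belongs to component k
        z = np.column_stack(
            [pi[k] * multivariate_normal.pdf(X, mu[k], C[k]) for k in range(K)])
        z /= z.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixing proportions, means, and covariances
        Nk = z.sum(axis=0)
        pi = Nk / N
        mu = (z.T @ X) / Nk[:, None]
        for k in range(K):
            D = X - mu[k]
            C[k] = (z[:, k][:, None] * D).T @ D / Nk[k] + 1e-6 * np.eye(d)
    return pi, mu, C, z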
- the values of Akaike's Information Criterion (AIC) and the Minimum Description Length (MDL) are evaluated for each candidate K, with selection of the model in which K corresponds to the minimum of these criteria.
- the values of EQ. 9 are then used as the initial means of the respective submodels. Since the mixing proportions π_k are projection-invariant, a 2 x 2 unit matrix is assigned to the remaining parameters of the covariance matrix C_tk.
- the expectation-maximization (EM) algorithm can be again applied to allow a standard finite normal mixture (SFNM) with K_0 submodels to be fitted to the data over t-space.
- SFNM standard finite normal mixture
- the corresponding EM algorithm can be derived by replacing all x in the E-step and the M-step equations, above, by t.
- C_tk can be directly evaluated to obtain W_k as described above.
- an algorithm termed the probabilistic adaptive principal component extraction (PAPEX) is applied, with updates of the form
- w_k(i+1) = w_k(i) + η [ y_k(i) t_k(i) − y_k²(i) w_k(i) ]
- a_k(i+1) = a_k(i) − η [ y_k(i) y(i) + y_k²(i) a_k(i) ]
- W_k converges to the eigenvector associated with the second largest eigenvalue of the covariance matrix C_k.
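A rough sketch of the PAPEX idea follows, under the assumption that it behaves like the APEX family of adaptive rules cited in the references, with the "probabilistic" element entering through posterior-weighted effective inputs; this is illustrative only, and the patent's exact recursion and step-size schedule are not reproduced:

```python
import numpy as np

def papex_like_axes(T, z_k, mu_k, eta=0.01, epochs=20, seed=0):
    """APEX-style adaptive estimate of two principal axes of a posterior-weighted
    cluster (a sketch of the PAPEX idea, not the patent's exact recursion)."""
    rng = np.random.default_rng(seed)
    d = T.shape[1]
    w1 = rng.standard_normal(d); w1 /= np.linalg.norm(w1)
    w2 = rng.standard_normal(d); w2 /= np.linalg.norm(w2)
    a = 0.0                                      # lateral weight coupling unit 2 to unit 1
    for _ in range(epochs):
        for i in rng.permutation(len(T)):
            t = np.sqrt(z_k[i]) * (T[i] - mu_k)  # effective (posterior-weighted) input
            y1 = w1 @ t
            w1 += eta * (y1 * t - y1**2 * w1)    # Oja/APEX rule: first principal axis
            y2 = w2 @ t + a * y1                 # output of second unit with lateral term
            w2 += eta * (y2 * t - y2**2 * w2)
            a  -= eta * (y1 * y2 + y2**2 * a)    # anti-Hebbian lateral update
    return np.column_stack([w1 / np.linalg.norm(w1), w2 / np.linalg.norm(w2)])
```

Here w1 follows an Oja-type rule toward the first principal axis of the weighted inputs, while the lateral weight a decorrelates the second unit from the first, so that at convergence w2 approximates the second axis without ever forming the full covariance matrix.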
- the determination of the parameters of the models at the third level can again be viewed as a two-step estimation problem, in which a further split of the models at the second level is determined within each of the subspaces over x-space, and then the parameters of the selected models are fine-tuned over t-space.
- the learning of the corresponding mixture over x-space can again be performed using the expectation-maximization (EM) algorithm and the model selection procedures described above.
- the third-level EM algorithm has the same form as the EM algorithm at the second level, except that in the E-step, the posterior probability that a data point x_i belongs to submodel j is given by
- the values of EQ. 19 are then used to initialize the means of the respective submodels, and the expectation maximization (EM) algorithm can be applied to allow a standard finite normal mixture (SFNM) distribution with K_0 submodels to be fitted to the data over t-space.
- the formulation can be derived by simply replacing all x in the second-level M-step by t. With the resulting z_i(k,j) in t-space, the PAPEX algorithm can be applied to estimate W_(k,j), in which the effective input values are expressed by
- t̃_i = √( z_i(k,j) ) ( t_i − μ_(k,j) )   EQ. 20
- the next-level visualization subspace is generated by plotting each data point t_i at the corresponding projected location in the x-space of that submodel.
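A small sketch of how such a next-level plot could be produced, assuming a posterior vector z_k, a submodel mean mu_k, and local axes W_k obtained as above (matplotlib-based and illustrative only; `plot_subspace` is a hypothetical name):

```python
import matplotlib.pyplot as plt

def plot_subspace(T, z_k, mu_k, W_k, ax=None):
    """Scatter the next-level 2-D view of one submodel: each t_i is plotted at
    its projection onto the local axes W_k, shaded by its posterior z_ik."""
    x = (T - mu_k) @ W_k                 # N x 2 coordinates in the local x-space
    ax = ax or plt.gca()
    ax.scatter(x[:, 0], x[:, 1], c=z_k, cmap="viridis", s=10)
    ax.set_xlabel("local axis 1"); ax.set_ylabel("local axis 2")
    return ax
```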
- a first exemplary two-level implementation of the present invention is shown in FIGS. 3, 4A, and 4B, in which the entire data set is present in the top-level projection and two local clusters within that top-level projection are each individually presented in FIGS. 4A and 4B.
- the entire data set is subjected to principal component analysis as described above to obtain the principal axis or axes (axis A_x being representative) for the top-level display. Additionally, the axis (unnumbered) for each of the apparent clusters is displayed. Thereafter, the apparent centers of the two clusters are identified and the data subjected to the aforementioned processing to further reveal the local cluster of FIG. 4A and the local cluster of FIG. 4B.
- a second exemplary two-level implementation of the present invention is shown in FIGS. 5, 6A, 6B, and 6C, in which the entire data set is present in the top-level projection and three local clusters within that top-level projection are each individually presented in FIGS. 6A, 6B, and 6C.
- the entire data set is subjected to principal component analysis as described above to obtain the principal axis (A_x) and the axis (unnumbered) for each of the apparent clusters, as displayed.
- the t-space raw data set arises from a mixture of three Gaussians consisting of 300 data points as presented in FIG. 5.
- two cloud-like clusters are well separated while a third cluster appears spaced in between the two well-separated cloud-like clusters.
- the second-level visual space is generated with a mixture of two local principal component axis subspaces, where the line A_x indicates the global principal axis.
- the plot on the "right" of FIG. 5 shows evidence of a further split.
- a hierarchical model is adopted, which illustrates that there are indeed a total of three clusters within the data set, as shown in FIGS. 6A, 6B, and 6C.
- An alternate visualization of the process flow of the present invention is shown in FIG. 2A.
- the present invention has use in all applications requiring the analysis of data, particularly multi-dimensional data, for the purpose of optimally visualizing various underlying structures and distributions present within the universe of data. Applications include the detection of data clusters and sub-clusters and their relative relationships in areas of medical, industrial, geophysical imaging, and digital library processing, for example.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Health & Medical Sciences (AREA)
- Quality & Reliability (AREA)
- Probability & Statistics with Applications (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Complex Calculations (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP99946966A EP1032918A1 (en) | 1998-09-17 | 1999-09-17 | Data decomposition/reduction method for visualizing data clusters/sub-clusters |
CA002310333A CA2310333A1 (en) | 1998-09-17 | 1999-09-17 | Data decomposition/reduction method for visualizing data clusters/sub-clusters |
AU59262/99A AU5926299A (en) | 1998-09-17 | 1999-09-17 | Data decomposition/reduction method for visualizing data clusters/sub-clusters |
JP2000570715A JP2002525719A (en) | 1998-09-17 | 1999-09-17 | Data decomposition / reduction method for visualizing data clusters / subclusters |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10062298P | 1998-09-17 | 1998-09-17 | |
US60/100,622 | 1998-09-17 | ||
US39842199A | 1999-09-17 | 1999-09-17 | |
US09/398,421 | 1999-09-17 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2000016250A1 true WO2000016250A1 (en) | 2000-03-23 |
Family
ID=26797375
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1999/021363 WO2000016250A1 (en) | 1998-09-17 | 1999-09-17 | Data decomposition/reduction method for visualizing data clusters/sub-clusters |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP1032918A1 (en) |
JP (1) | JP2002525719A (en) |
AU (1) | AU5926299A (en) |
CA (1) | CA2310333A1 (en) |
WO (1) | WO2000016250A1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7440986B2 (en) | 2003-12-17 | 2008-10-21 | International Business Machines Corporation | Method for estimating storage requirements for a multi-dimensional clustering data configuration |
US9202178B2 (en) | 2014-03-11 | 2015-12-01 | Sas Institute Inc. | Computerized cluster analysis framework for decorrelated cluster identification in datasets |
CN105447001A (en) * | 2014-08-04 | 2016-03-30 | 华为技术有限公司 | Dimensionality reduction method and device for high dimensional data |
US9424337B2 (en) | 2013-07-09 | 2016-08-23 | Sas Institute Inc. | Number of clusters estimation |
US9996543B2 (en) | 2016-01-06 | 2018-06-12 | International Business Machines Corporation | Compression and optimization of a specified schema that performs analytics on data within data systems |
CN110287978A (en) * | 2018-03-19 | 2019-09-27 | 国际商业机器公司 | For having the computer implemented method and computer system of the machine learning of supervision |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4670010B2 (en) * | 2005-10-17 | 2011-04-13 | 株式会社国際電気通信基礎技術研究所 | Mobile object distribution estimation device, mobile object distribution estimation method, and mobile object distribution estimation program |
US8239379B2 (en) * | 2007-07-13 | 2012-08-07 | Xerox Corporation | Semi-supervised visual clustering |
US20090232388A1 (en) * | 2008-03-12 | 2009-09-17 | Harris Corporation | Registration of 3d point cloud data by creation of filtered density images |
JP5332647B2 (en) * | 2009-01-23 | 2013-11-06 | 日本電気株式会社 | Model selection apparatus, model selection apparatus selection method, and program |
JP6586764B2 (en) * | 2015-04-17 | 2019-10-09 | 株式会社Ihi | Data analysis apparatus and data analysis method |
US11847132B2 (en) | 2019-09-03 | 2023-12-19 | International Business Machines Corporation | Visualization and exploration of probabilistic models |
-
1999
- 1999-09-17 CA CA002310333A patent/CA2310333A1/en not_active Abandoned
- 1999-09-17 WO PCT/US1999/021363 patent/WO2000016250A1/en not_active Application Discontinuation
- 1999-09-17 JP JP2000570715A patent/JP2002525719A/en active Pending
- 1999-09-17 EP EP99946966A patent/EP1032918A1/en not_active Withdrawn
- 1999-09-17 AU AU59262/99A patent/AU5926299A/en not_active Abandoned
Non-Patent Citations (8)
Title |
---|
AKAIKE H: "A NEW LOOK AT THE STATISTICAL MODEL IDENTIFICATION", IEEE TRANSACTIONS ON AUTOMATIC CONTROL,US,IEEE INC. NEW YORK, vol. AC-19, no. 6, December 1974 (1974-12-01), pages 716-723, XP000675871, ISSN: 0018-9286 * |
ANONYMOUS: "Data Preprocessing With Clustering Algorithms.", IBM TECHNICAL DISCLOSURE BULLETIN, vol. 33, no. 10B, March 1991 (1991-03-01), New York, US, pages 26 - 27, XP000109861 * |
ANONYMOUS: "Multivariate Statistical Data Reduction Method", IBM TECHNICAL DISCLOSURE BULLETIN, vol. 36, no. 4, April 1993 (1993-04-01), New York, US, pages 181 - 184, XP000364481 * |
BISHOP C M ET AL: "A HIERARCHICAL LATENT VARIABLE MODEL FOR DATA VISUALIZATION", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,US,IEEE INC. NEW YORK, vol. 20, no. 3, March 1998 (1998-03-01), pages 281-293, XP000767918, ISSN: 0162-8828 * |
CHATTERJEE C ET AL: "ON SELF-ORGANIZING ALGORITHMS AND NETWORKS FOR CLASS-SEPARABILITY FEATURES", IEEE TRANSACTIONS ON NEURAL NETWORKS,US,IEEE INC, NEW YORK, vol. 8, no. 3, May 1997 (1997-05-01), pages 663-678, XP000656917, ISSN: 1045-9227 * |
JIANCHANG MAO ET AL: "ARTIFICIAL NEURAL NETWORKS FOR FEATURE EXTRACTION AND MULTIVARIATE DATA PROJECTION", IEEE TRANSACTIONS ON NEURAL NETWORKS,US,IEEE INC, NEW YORK, vol. 6, no. 2, 2 March 1995 (1995-03-02), pages 296-316, XP000492664, ISSN: 1045-9227 * |
KUNG S Y ET AL: "ADAPTIVE PRINCIPAL COMPONENT EXTRACTION (APEX) AND APPLICATIONS", IEEE TRANSACTIONS ON SIGNAL PROCESSING,US,IEEE, INC. NEW YORK, vol. 42, no. 5, May 1994 (1994-05-01), pages 1202-1216, XP000460366, ISSN: 1053-587X * |
PAO Y -H ET AL: "Visualization of pattern data through learning of non-linear variance-conserving dimension-reduction mapping", PATTERN RECOGNITION,US,PERGAMON PRESS INC. ELMSFORD, N.Y, vol. 30, no. 10, 1 October 1997 (1997-10-01), pages 1705-1717, XP004094254, ISSN: 0031-3203 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7440986B2 (en) | 2003-12-17 | 2008-10-21 | International Business Machines Corporation | Method for estimating storage requirements for a multi-dimensional clustering data configuration |
US7912798B2 (en) | 2003-12-17 | 2011-03-22 | International Business Machines Corporation | System for estimating storage requirements for a multi-dimensional clustering data configuration |
US9424337B2 (en) | 2013-07-09 | 2016-08-23 | Sas Institute Inc. | Number of clusters estimation |
US9202178B2 (en) | 2014-03-11 | 2015-12-01 | Sas Institute Inc. | Computerized cluster analysis framework for decorrelated cluster identification in datasets |
CN105447001A (en) * | 2014-08-04 | 2016-03-30 | 华为技术有限公司 | Dimensionality reduction method and device for high dimensional data |
US9996543B2 (en) | 2016-01-06 | 2018-06-12 | International Business Machines Corporation | Compression and optimization of a specified schema that performs analytics on data within data systems |
CN110287978A (en) * | 2018-03-19 | 2019-09-27 | 国际商业机器公司 | For having the computer implemented method and computer system of the machine learning of supervision |
CN110287978B (en) * | 2018-03-19 | 2023-04-25 | 国际商业机器公司 | Computer-implemented method and computer system for supervised machine learning |
Also Published As
Publication number | Publication date |
---|---|
EP1032918A1 (en) | 2000-09-06 |
AU5926299A (en) | 2000-04-03 |
JP2002525719A (en) | 2002-08-13 |
CA2310333A1 (en) | 2000-03-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Stanford et al. | Finding curvilinear features in spatial point patterns: principal curve clustering with noise | |
Clausi | K-means Iterative Fisher (KIF) unsupervised clustering algorithm applied to image texture segmentation | |
Tirandaz et al. | A two-phase algorithm based on kurtosis curvelet energy and unsupervised spectral regression for segmentation of SAR images | |
Ugarriza et al. | Automatic image segmentation by dynamic region growth and multiresolution merging | |
Sharma et al. | A review on image segmentation with its clustering techniques | |
Attene et al. | Hierarchical mesh segmentation based on fitting primitives | |
Keuchel et al. | Binary partitioning, perceptual grouping, and restoration with semidefinite programming | |
WO2000016250A1 (en) | Data decomposition/reduction method for visualizing data clusters/sub-clusters | |
Krasnoshchekov et al. | Order-k α-hulls and α-shapes | |
Allassonniere et al. | A stochastic algorithm for probabilistic independent component analysis | |
Cai et al. | A new partitioning process for geometrical product specifications and verification | |
Tsuchie et al. | High-quality vertex clustering for surface mesh segmentation using Student-t mixture model | |
Lavoué et al. | Markov Random Fields for Improving 3D Mesh Analysis and Segmentation. | |
AlZu′ bi et al. | 3D medical volume segmentation using hybrid multiresolution statistical approaches | |
Blanchet et al. | Triplet Markov fields for the classification of complex structure data | |
Vilalta et al. | An efficient approach to external cluster assessment with an application to martian topography | |
Huang et al. | Texture classification by multi-model feature integration using Bayesian networks | |
Kouritzin et al. | A graph theoretic approach to simulation and classification | |
Gehre et al. | Feature Curve Co‐Completion in Noisy Data | |
Marras et al. | 3D geometric split–merge segmentation of brain MRI datasets | |
Guizilini et al. | Iterative continuous convolution for 3d template matching and global localization | |
Li et al. | High resolution radar data fusion based on clustering algorithm | |
Li | Unsupervised texture segmentation using multiresolution Markov random fields | |
Roy et al. | A finite mixture model based on pair-copula construction of multivariate distributions and its application to color image segmentation | |
Dokur et al. | Segmentation of medical images by using wavelet transform and incremental self-organizing map |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1999946966 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 2310333 Country of ref document: CA Ref country code: CA Ref document number: 2310333 Kind code of ref document: A Format of ref document f/p: F |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
ENP | Entry into the national phase |
Ref country code: JP Ref document number: 2000 570715 Kind code of ref document: A Format of ref document f/p: F |
|
WWE | Wipo information: entry into national phase |
Ref document number: 59262/99 Country of ref document: AU |
|
WWP | Wipo information: published in national office |
Ref document number: 1999946966 Country of ref document: EP |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 1999946966 Country of ref document: EP |