CN104537380A - Clustering method and device
Publication number: CN104537380A
Application number: CN201410841904.7A
Authority: China (CN)
Prior art keywords: class, target class, distance, noise, target
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Abstract
The invention discloses a clustering method and device. The method includes the steps that noise objects in each target class are respectively recognized; the class distance between the first target class and the second target class is calculated according to non-noise objects in the first target class and the second target class; if the class distance between the first target class and the second target class meets a preset condition, the first target class and the second target class are combined so that a new target class can be formed. Due to the technical scheme, the accuracy of a clustering result can be improved.
Description
Technical field
The present disclosure relates to the technical field of data processing, and in particular to a clustering method and device.
Background
Clustering is the process of dividing a set of physical or abstract objects into multiple classes composed of similar objects. A class (or cluster) produced by clustering is a set of data objects that are similar to the other objects in the same class and dissimilar to the objects in other classes.
In the related art, clustering can be performed with a hierarchical clustering algorithm. Such an algorithm usually calculates the distance between two classes from all of the objects in each class, which can lower the accuracy of the clustering result.
Summary of the invention
To overcome the problems in the related art, the present disclosure provides a clustering method and device that address the low accuracy of clustering results in the related art.
According to a first aspect of the embodiments of the present disclosure, a clustering method is provided, comprising:
Identifying the noise objects in each target class;
Calculating the class distance between a first target class and a second target class according to the non-noise objects in the first target class and the second target class;
If the class distance between the first target class and the second target class meets a preset condition, merging the first target class and the second target class to form a new target class.
Optionally, identifying the noise objects in each target class comprises:
For each object in the target class, taking that object as the target object and judging whether the number of objects whose distance to the target object is within a preset distance is less than a first threshold;
If that number is less than the first threshold, confirming that the target object is a noise object.
Optionally, identifying the noise objects in each target class comprises:
Calculating the class center of the target class;
Judging whether the distance between a target object in the target class and the class center is greater than a second threshold;
If the distance between the target object and the class center is greater than the second threshold, confirming that the target object is a noise object.
Optionally, before identifying the noise objects in each target class, the method further comprises:
Expanding an initial class according to a preset clustering algorithm;
Judging whether the number of objects in the expanded initial class is greater than or equal to a third threshold;
If the number of objects in the expanded initial class is greater than or equal to the third threshold, confirming that the expanded initial class is a target class.
Optionally, after merging the first target class and the second target class, the method further comprises:
Judging whether there are two target classes whose class distance meets the preset condition;
If there are two target classes whose class distance meets the preset condition, merging the two target classes.
According to a second aspect of the embodiments of the present disclosure, a clustering device is provided, comprising:
A noise identification unit, configured to identify the noise objects in each target class;
A distance calculation unit, configured to calculate the class distance between a first target class and a second target class according to the non-noise objects in the first target class and the second target class;
A first merging unit, configured to merge the first target class and the second target class to form a new target class when the class distance between the first target class and the second target class meets a preset condition.
Optionally, the noise identification unit comprises:
A first judging subunit, configured to judge, for each object in the target class taken as the target object, whether the number of objects whose distance to the target object is within a preset distance is less than a first threshold;
A first confirming subunit, configured to confirm that the target object is a noise object when that number is less than the first threshold.
Optionally, the noise identification unit comprises:
A center calculation subunit, configured to calculate the class center of the target class;
A second judging subunit, configured to judge whether the distance between a target object in the target class and the class center is greater than a second threshold;
A second confirming subunit, configured to confirm that the target object is a noise object when the distance between the target object and the class center is greater than the second threshold.
Optionally, the device further comprises:
An initial expansion unit, configured to expand an initial class according to a preset clustering algorithm;
A quantity judging unit, configured to judge whether the number of objects in the expanded initial class is greater than or equal to a third threshold;
A target confirmation unit, configured to confirm that the expanded initial class is a target class when the number of objects in the expanded initial class is greater than or equal to the third threshold.
Optionally, the device further comprises:
A distance judging unit, configured to judge, after the first target class and the second target class are merged, whether there are two target classes whose class distance meets the preset condition;
A second merging unit, configured to merge the two target classes when there are two target classes whose class distance meets the preset condition.
According to a third aspect of the embodiments of the present disclosure, a clustering device is provided, comprising:
A processor;
A memory for storing instructions executable by the processor;
Wherein the processor is configured to:
Identify the noise objects in each target class;
Calculate the class distance between a first target class and a second target class according to the non-noise objects in the first target class and the second target class;
If the class distance between the first target class and the second target class meets a preset condition, merge the first target class and the second target class to form a new target class.
The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects:
By identifying the noise objects in each target class, the present disclosure can exclude noise objects when calculating the class distance between the first target class and the second target class. The class distance is calculated from the non-noise objects in the first target class and the second target class, and the two classes are merged when the class distance meets the condition, which improves the accuracy of the clustering result.
The present disclosure can judge whether a target object in a target class is a core object and, when the target object is not a core object, confirm that it is a noise object. This improves the accuracy of noise object identification and, in turn, the accuracy of the clustering result.
The present disclosure can judge whether the distance between a target object in a target class and the class center is greater than a preset second threshold and, when it is, confirm that the target object is a noise object. This improves the accuracy of noise object identification and, in turn, the accuracy of the clustering result.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
Fig. 1 is a flowchart of a clustering method according to an exemplary embodiment.
Fig. 2 is a flowchart of another clustering method according to an exemplary embodiment.
Fig. 3 is a flowchart of a method for identifying noise objects in a target class according to an exemplary embodiment.
Fig. 4 is a flowchart of another method for identifying noise objects in a target class according to an exemplary embodiment.
Fig. 5 is a block diagram of a clustering device according to an exemplary embodiment.
Fig. 6 is a block diagram of another clustering device according to an exemplary embodiment.
Fig. 7 is a block diagram of another clustering device according to an exemplary embodiment.
Fig. 8 is a block diagram of another clustering device according to an exemplary embodiment.
Fig. 9 is a block diagram of another clustering device according to an exemplary embodiment.
Fig. 10 is a schematic structural diagram of a device for clustering according to an exemplary embodiment.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of devices and methods consistent with some aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flowchart of a clustering method according to an exemplary embodiment.
As shown in Fig. 1, the clustering method may be used in a terminal and comprises the following steps:
In step S101, the noise objects in each target class are identified.
Each target class generally contains multiple objects, and this step identifies the noise objects in each of them. For example, a density-based clustering algorithm can be used to judge whether a target object in the target class is a core object; if the target object is not a core object, it can be confirmed as a noise object. Alternatively, it can be judged whether the distance between a target object in the target class and the class center is greater than a threshold; if it is, the target object can be confirmed as a noise object.
In step S102, the class distance between a first target class and a second target class is calculated according to the non-noise objects in the first target class and the second target class.
In this embodiment, noise objects are excluded when calculating the class distance between the first target class and the second target class; the class distance between the two classes is calculated from the non-noise objects in the first target class and the second target class.
In step S103, if the class distance between the first target class and the second target class meets a preset condition, the first target class and the second target class are merged to form a new target class.
Based on the result of step S102, this step judges whether the class distance between the first target class and the second target class meets the preset condition, for example whether the class distance is less than a preset threshold. If the class distance is less than the threshold, the first target class and the second target class are merged to form a new target class.
As can be seen from the above description, by identifying the noise objects in each target class, the present disclosure can exclude noise objects when calculating the class distance between the first target class and the second target class. The class distance is calculated from the non-noise objects in the two classes, and the classes are merged when the class distance meets the condition, which improves the accuracy of the clustering result.
Fig. 2 is a flowchart of another clustering method according to an exemplary embodiment.
As shown in Fig. 2, the clustering method may be used in a terminal and comprises the following steps:
In step S201, an initial class is expanded according to a preset clustering algorithm.
The clustering method provided by the present disclosure can be applied to the clustering of facial images. For example, given multiple facial images, the method can gather the facial images of the same person together to form a cluster. The method can of course also be applied to the clustering of other objects, and the present disclosure places no particular restriction on this.
Taking the clustering of facial images as an example, suppose there are 1000 facial images. Each facial image can be treated as one object; for example, the features of the facial image are converted into a vector, so that the object is represented by that vector. The distance between two objects is then the distance between their vectors. Those skilled in the art can implement this process with methods provided in the related art, and the details are not repeated here.
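The object-to-object distance described above can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and uses made-up 3-dimensional feature vectors; the source does not specify the distance metric or the feature extractor, and the names `euclidean`, `face_a`, `face_b`, and `face_c` are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical feature vectors standing in for three facial images.
face_a = (0.0, 0.0, 0.0)
face_b = (3.0, 4.0, 0.0)
face_c = (0.1, 0.0, 0.0)

d_ab = euclidean(face_a, face_b)  # distance between objects a and b
d_ac = euclidean(face_a, face_c)  # distance between objects a and c
```

Under this assumption, a smaller distance (here between `face_a` and `face_c`) indicates more similar objects.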
In this embodiment, clustering can start by treating each object as an initial class, so that each initial class contains one object. The initial classes can then be expanded according to a preset clustering algorithm. For example, with a hierarchical clustering algorithm, the class distance between any two chosen initial classes is calculated, which at this point is the distance between the objects in the two initial classes. If the class distance meets a preset condition, the two initial classes are merged into a new initial class containing 2 objects. Those skilled in the art can of course use other clustering algorithms from the related art in this step, for example the density-based algorithm DBSCAN (Density-Based Spatial Clustering of Applications with Noise), which expands the initial class according to a preset scan radius (EPS) and a preset minimum number of contained points (minPts). The present disclosure places no particular restriction on this.
In step S202, it is judged whether the number of objects in the expanded initial class is greater than or equal to a third threshold. If it is, step S203 is performed. If the number of objects in the expanded initial class is less than the third threshold, step S201 can be performed again.
Based on step S201, after the initial class has been expanded, it can be judged whether the number of objects in the resulting initial class is greater than or equal to the third threshold, where the third threshold can be set by the developer.
In this step, if the number of objects in the expanded initial class is greater than or equal to the third threshold, the expanded initial class may contain noise objects, so step S203 is performed. If the number of objects in the expanded initial class is less than the third threshold, the expansion flow of step S201 can continue.
In step S203, the expanded initial class is confirmed as a target class.
Based on the result of step S202, if the number of objects in the expanded initial class is greater than or equal to the third threshold, the expanded initial class is no longer expanded and is confirmed as a target class.
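The size check in steps S202 and S203 can be sketched as below. This is only an illustration: the function name `split_by_size` and the list-of-lists representation of classes are assumptions, and the expansion itself (step S201) is left to whichever clustering algorithm is chosen.

```python
def split_by_size(classes, third_threshold):
    """Separate classes that reached the size threshold from those that did not.

    Classes with at least `third_threshold` objects are confirmed as target
    classes; the rest remain initial classes that may be expanded further.
    """
    target_classes = [c for c in classes if len(c) >= third_threshold]
    still_initial = [c for c in classes if len(c) < third_threshold]
    return target_classes, still_initial

# Hypothetical expanded initial classes, each a list of object identifiers.
classes = [["a", "b", "c"], ["d"], ["e", "f"]]
targets, pending = split_by_size(classes, third_threshold=2)
```

Here the classes of size 3 and 2 are promoted to target classes, while the single-object class stays in the expansion flow.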
In step S204, the noise objects in each target class are identified.
Based on step S203, this step starts after the expanded initial class has been confirmed as a target class, and identifies the noise objects in that target class.
Referring to Fig. 3, which is a flowchart of identifying noise objects in a target class according to an exemplary embodiment of the present disclosure, identifying the noise objects in the target class may comprise the following steps:
In step S301, for each object in the target class, taken as the target object, it is judged whether the number of objects whose distance to the target object is within a preset distance is less than a first threshold. If that number is less than the first threshold, step S302 is performed. If that number is greater than or equal to the first threshold, this step is repeated for the next object.
In this embodiment, each object in the target class can be traversed. For each target object, all objects whose distance to the target object is within a preset scan radius (EPS) are obtained, and it is then judged whether the number of obtained objects is less than the first threshold. The first threshold is usually the preset minimum number of contained points (minPts); in other words, this step judges whether the target object is a core object. If the number of obtained objects is greater than or equal to the first threshold, the target object is a core object, and this step is repeated for the next target object until all objects in the target class have been judged. If the number of obtained objects is less than the first threshold, the target object is not a core object, and step S302 is performed.
In step S302, the target object is confirmed as a noise object.
Based on the result of step S301, if the number of objects whose distance to the target object is within the preset distance is less than the first threshold, the target object is not a core object, and in this step it can be confirmed as a noise object.
In this embodiment, by judging whether a target object in the target class is a core object and confirming it as a noise object when it is not, the accuracy of noise object identification is improved, which in turn improves the accuracy of the clustering result.
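The density-based check of steps S301 and S302 can be sketched as follows. This is a minimal illustration under stated assumptions: Euclidean distance, an object counted as its own neighbour (as in DBSCAN), and hypothetical names `find_noise_by_density`, `eps`, and `min_pts`; the source does not fix any of these details.

```python
import math

def find_noise_by_density(objects, eps, min_pts):
    """Return indices of objects that are not core objects.

    An object is a core object when at least `min_pts` objects of the class
    lie within distance `eps` of it. Assumption: the object counts itself.
    """
    noise = []
    for i, p in enumerate(objects):
        neighbours = sum(1 for q in objects if math.dist(p, q) <= eps)
        if neighbours < min_pts:  # fewer neighbours than the first threshold
            noise.append(i)
    return noise

# Hypothetical target class: four tightly grouped points and one outlier.
target_class = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
noise_idx = find_noise_by_density(target_class, eps=1.5, min_pts=3)
```

Note that this follows the rule as described in this embodiment, where every non-core object is treated as noise. Standard DBSCAN is less strict: it keeps non-core "border" points that lie within the radius of a core object.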
Referring to Fig. 4, which is a flowchart of another way of identifying noise objects in a target class according to an exemplary embodiment of the present disclosure, identifying the noise objects in the target class may comprise the following steps:
In step S401, the class center of the target class is calculated.
In this embodiment, the class center is calculated for each target class. For example, following the K-Means algorithm, the class center can be obtained as the mean of all objects in the target class. Those skilled in the art can of course use other algorithms to calculate the class center, and the present disclosure places no particular restriction on this.
In step S402, it is judged whether the distance between a target object in the target class and the class center is greater than a second threshold. If it is, step S403 is performed. If the distance is less than or equal to the second threshold, this step is repeated for the next object.
Based on step S401, after the class center of the target class has been calculated, the distance between each object in the target class and the class center is calculated in turn, and the judgment of this step is then performed. If the distance between the target object and the class center is less than or equal to the second threshold, the target object can be confirmed as not being a noise object, and this step is repeated for the next target object until all objects in the target class have been judged. If the distance between the target object and the class center is greater than the second threshold, step S403 is performed. The second threshold can be set by the developer, and the present disclosure places no particular restriction on this.
In step S403, the target object is confirmed as a noise object.
Based on the result of step S402, if the distance between the target object and the class center is greater than the second threshold, the target object can be confirmed as a noise object.
In this embodiment, by judging whether the distance between a target object in the target class and the class center is greater than the preset second threshold, and confirming the target object as a noise object when it is, the accuracy of noise object identification is improved, which in turn improves the accuracy of the clustering result.
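Steps S401 to S403 can be sketched as below, assuming the class center is the component-wise mean of the objects and the distance is Euclidean. The names `class_center`, `find_noise_by_center`, and `second_threshold`, and the sample data, are hypothetical.

```python
import math

def class_center(objects):
    """Component-wise mean of the objects' feature vectors (step S401)."""
    n = len(objects)
    dim = len(objects[0])
    return tuple(sum(obj[k] for obj in objects) / n for k in range(dim))

def find_noise_by_center(objects, second_threshold):
    """Indices of objects farther than `second_threshold` from the center."""
    center = class_center(objects)
    return [
        i for i, obj in enumerate(objects)
        if math.dist(obj, center) > second_threshold  # steps S402 and S403
    ]

# Hypothetical target class: four nearby points and one distant point.
target_class = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (21.0, 21.0)]
center = class_center(target_class)
noise_idx = find_noise_by_center(target_class, second_threshold=10.0)
```

One property of this sketch worth noting is that the center is computed from all objects, including those later flagged as noise, so a distant outlier pulls the center toward itself. Whether the center should be recomputed after removing noise is not addressed in the source.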
In step S205, the class distance between a first target class and a second target class is calculated according to the non-noise objects in the first target class and the second target class.
Based on step S204, after the noise objects in the target classes have been identified, any two target classes can be chosen; in the present disclosure these two chosen classes are called the first target class and the second target class. When calculating the class distance between the first target class and the second target class, noise objects are excluded and the class distance is calculated from the non-noise objects in the two classes.
In this step, the class distance can be calculated with a class distance algorithm provided in the related art, for example the shortest-distance method, the longest-distance method, the average-distance method, or the sum-of-squared-deviations method. The present disclosure places no particular restriction on this.
In step S206, if the class distance between the first target class and the second target class meets a preset condition, the first target class and the second target class are merged to form a new target class.
Based on step S205, after the class distance between the first target class and the second target class has been calculated, it is judged whether the class distance meets the preset condition. If it does, the first target class and the second target class are merged in this step to form a new target class. If it does not, step S205 is performed again: a third target class whose noise objects have been identified is chosen, and the class distance between the first target class and the third target class is calculated from the non-noise objects in those two classes. This step is then performed to judge whether the class distance between the first target class and the third target class meets the preset condition, and the two classes are merged if it does. The process continues in the same way until no two of the remaining target classes have a class distance that meets the preset condition. The target classes obtained at that point are the clustering result of this embodiment.
In this step, the preset condition can be set by those skilled in the art according to the chosen class distance algorithm. For example, if the average-distance method is used in step S205, the preset condition can be set to "less than or equal to a fourth threshold". In that case, the first target class and the second target class are merged when the class distance between them is less than or equal to the fourth threshold.
As can be seen from the above description, by identifying the noise objects in each target class, the present disclosure can exclude noise objects when calculating the class distance between the first target class and the second target class. The class distance is calculated from the non-noise objects in the two classes, and the classes are merged when the class distance meets the condition, which improves the accuracy of the clustering result.
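The repeated merging of steps S205 and S206 can be sketched as below. This is a rough illustration under several assumptions not stated in the source: the average-distance method with Euclidean distance, a preset condition of "less than or equal to a fourth threshold", every class having at least one non-noise object, and noise flags that are carried into the merged class rather than re-identified. The names `average_distance`, `merge_target_classes`, `core`, and `noise` are hypothetical.

```python
import math
from itertools import combinations, product

def average_distance(a, b):
    """Mean pairwise Euclidean distance between two lists of objects."""
    dists = [math.dist(p, q) for p, q in product(a, b)]
    return sum(dists) / len(dists)

def merge_target_classes(classes, fourth_threshold):
    """Merge classes until no pair meets the preset condition.

    Each class is a dict with 'core' (non-noise objects) and 'noise' lists.
    Only the 'core' objects take part in the class distance.
    """
    classes = [{"core": list(c["core"]), "noise": list(c["noise"])} for c in classes]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(classes)), 2):
            d = average_distance(classes[i]["core"], classes[j]["core"])
            if d <= fourth_threshold:
                # Merge class j into class i, then rescan all remaining pairs.
                classes[i]["core"] += classes[j]["core"]
                classes[i]["noise"] += classes[j]["noise"]
                del classes[j]
                merged = True
                break
    return classes

# Hypothetical target classes with noise objects already identified.
classes = [
    {"core": [(0.0, 0.0), (0.0, 1.0)], "noise": [(4.0, 0.0)]},
    {"core": [(1.0, 0.0), (1.0, 1.0)], "noise": []},
    {"core": [(20.0, 20.0)], "noise": []},
]
result = merge_target_classes(classes, fourth_threshold=2.0)
```

The first two classes are merged because the average distance between their non-noise objects is below the threshold, while the distant third class is left as it is.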
Corresponding to the foregoing clustering method embodiments, the present disclosure also provides embodiments of a clustering device.
Fig. 5 is a block diagram of a clustering device according to an exemplary embodiment.
Referring to Fig. 5, the clustering device 500 may be used in a terminal and includes a noise identification unit 501, a distance calculation unit 502 and a first merging unit 503.
The noise identification unit 501 is configured to identify the noise objects in each target class.
The distance calculation unit 502 is configured to calculate the class distance between a first target class and a second target class according to the non-noise objects in the first target class and the second target class.
The first merging unit 503 is configured to merge the first target class and the second target class to form a new target class when the class distance between the first target class and the second target class meets a preset condition.
In the above embodiment, by identifying the noise objects in each target class, noise objects can be excluded when calculating the class distance between the first target class and the second target class. The class distance is calculated from the non-noise objects in the two classes, and the classes are merged when the class distance meets the condition, which improves the accuracy of the clustering result.
Fig. 6 is a block diagram of another clustering device according to an exemplary embodiment.
Referring to Fig. 6, in this embodiment, on the basis of the embodiment shown in Fig. 5, the noise identification unit 501 may include a first judging subunit 5011 and a first confirming subunit 5012.
The first judging subunit 5011 is configured to judge, for each object in the target class taken as the target object, whether the number of objects whose distance to the target object is within a preset distance is less than a first threshold.
The first confirming subunit 5012 is configured to confirm that the target object is a noise object when the number of objects whose distance to the target object is within the preset distance is less than the first threshold.
In the above embodiment, by judging whether a target object in the target class is a core object and confirming it as a noise object when it is not, the accuracy of noise object identification is improved, which in turn improves the accuracy of the clustering result.
Fig. 7 is a block diagram of another clustering apparatus according to an exemplary embodiment.
Referring to Fig. 7, this embodiment builds on the embodiment shown in Fig. 5. The noise recognition unit 501 may also comprise a center calculation subunit 5013, a second judgment subunit 5014, and a second confirmation subunit 5015.
The center calculation subunit 5013 is configured to: calculate the class center of the target class.
The second judgment subunit 5014 is configured to: judge whether the distance between a target object in the target class and the class center is greater than a second threshold.
The second confirmation subunit 5015 is configured to: when the distance between the target object and the class center is greater than the second threshold, confirm that the target object is a noise object.
In the above embodiment, it is judged whether the distance between a target object in the target class and the class center is greater than the preset second threshold, and when it is, the target object is confirmed to be a noise object. This improves the accuracy of noise object identification and, in turn, the accuracy of the clustering result.
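The center-distance test of subunits 5013 to 5015 can be sketched as follows. The sketch assumes that the class center is the arithmetic mean of the objects' feature vectors and that distance is Euclidean; this passage does not define either, and the function names are hypothetical.

```python
import math


def class_center(objects):
    """Arithmetic mean of the feature vectors (an assumed definition of the class center)."""
    dims = len(objects[0])
    return tuple(sum(o[d] for o in objects) / len(objects) for d in range(dims))


def find_noise_by_center(objects, second_threshold):
    """Return the indices of objects farther than `second_threshold` from the class center."""
    center = class_center(objects)
    return [i for i, o in enumerate(objects) if math.dist(o, center) > second_threshold]


target_class = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (11.0, 11.0)]
center = class_center(target_class)
noise_indices = find_noise_by_center(target_class, second_threshold=5.0)
print(center, noise_indices)
```

Note that a far-away object also pulls a mean-based center toward itself, so in practice the second threshold would need to be chosen with that effect in mind.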
Fig. 8 is a block diagram of another clustering apparatus according to an exemplary embodiment.
Referring to Fig. 8, this embodiment builds on the embodiment shown in Fig. 5. The clustering apparatus 500 may also comprise an initial expansion unit 504, a quantity judgment unit 505, and a target confirmation unit 506.
The initial expansion unit 504 is configured to: expand an initial class according to a preset clustering algorithm.
The quantity judgment unit 505 is configured to: judge whether the number of objects in the expanded initial class is greater than or equal to a third threshold.
The target confirmation unit 506 is configured to: when the number of objects in the expanded initial class is greater than or equal to the third threshold, confirm that the expanded initial class is a target class.
It should be noted that the initial expansion unit 504, the quantity judgment unit 505, and the target confirmation unit 506 of the apparatus embodiment shown in Fig. 8 may also be included in the apparatus embodiments of Fig. 6 and Fig. 7; the present disclosure does not limit this.
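As a small illustration of units 505 and 506, the sketch below keeps only the expanded initial classes that contain at least the third threshold number of objects. The expansion step itself (unit 504) depends on the preset clustering algorithm, which is not specified here, so the sketch starts from classes that are assumed to have been expanded already; the function name and sample data are hypothetical.

```python
def confirm_target_classes(expanded_classes, third_threshold):
    """Keep the expanded initial classes whose object count reaches the third threshold."""
    return [c for c in expanded_classes if len(c) >= third_threshold]


# Hypothetical output of the expansion step: three classes of object identifiers.
expanded_classes = [["a", "b", "c", "d"], ["e"], ["f", "g", "h"]]
target_classes = confirm_target_classes(expanded_classes, third_threshold=3)
print(target_classes)
```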
Fig. 9 is a block diagram of another clustering apparatus according to an exemplary embodiment.
Referring to Fig. 9, this embodiment builds on the embodiment shown in Fig. 5. The clustering apparatus 500 may also comprise a distance judgment unit 507 and a second merging unit 508.
The distance judgment unit 507 is configured to: after the first target class and the second target class have been merged, judge whether there are two target classes whose class distance meets the preset condition.
The second merging unit 508 is configured to: when there are two target classes whose class distance meets the preset condition, merge the two target classes.
It should be noted that the distance judgment unit 507 and the second merging unit 508 of the apparatus embodiment shown in Fig. 9 may also be included in the apparatus embodiments of Fig. 6 to Fig. 8; the present disclosure does not limit this.
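The repeated check-and-merge behavior of units 507 and 508 can be sketched as a loop that stops once no pair of target classes meets the preset condition. The class distance and the condition are passed in as functions because this passage does not define them; the one-dimensional data, the mean-gap distance, and the "less than 3.0" condition in the usage example are purely illustrative assumptions.

```python
from itertools import combinations


def merge_until_stable(classes, class_distance, meets_condition):
    """Merge pairs of classes until no pair's class distance meets the condition."""
    classes = [list(c) for c in classes]  # work on copies
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(classes)), 2):
            if meets_condition(class_distance(classes[i], classes[j])):
                classes[i] = classes[i] + classes[j]  # form the new target class
                del classes[j]                        # safe because i < j
                merged = True
                break  # restart the search over the updated set of classes
    return classes


def mean_gap(a, b):
    """Illustrative class distance: gap between the means of two 1-D classes."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


result = merge_until_stable(
    [[0.0, 1.0], [2.0, 3.0], [10.0, 11.0]],
    class_distance=mean_gap,
    meets_condition=lambda d: d < 3.0,
)
print(result)
```

Each merge reduces the number of classes by one, so the loop always terminates.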
For the implementation of the functions and roles of each unit in the above apparatus, refer to the implementation of the corresponding steps in the above method; details are not repeated here.
Because the apparatus embodiments essentially correspond to the method embodiments, refer to the description of the method embodiments for the relevant details. The apparatus embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the disclosed solution. Those of ordinary skill in the art can understand and implement it without creative effort.
Accordingly, the present disclosure also provides a clustering apparatus, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to: identify the noise objects in each target class respectively; calculate the class distance between a first target class and a second target class according to the non-noise objects in the first target class and the second target class; and, if the class distance between the first target class and the second target class meets a preset condition, merge the first target class and the second target class to form a new target class.
Accordingly, the present disclosure also provides a non-transitory computer-readable storage medium. When the instructions in the storage medium are executed by a processor of a terminal, the terminal is enabled to perform a clustering method, the method comprising: identifying the noise objects in each target class respectively; calculating the class distance between a first target class and a second target class according to the non-noise objects in the first target class and the second target class; and, if the class distance between the first target class and the second target class meets a preset condition, merging the first target class and the second target class to form a new target class.
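The following sketch illustrates why the class distance is computed from non-noise objects only. It assumes, for illustration, that the class distance is the average pairwise Euclidean distance between the non-noise objects of the two classes, that the preset condition is "less than a distance threshold", and that the merged class keeps all objects of both classes, including those marked as noise. None of these choices is fixed by this passage, and all function and parameter names are hypothetical.

```python
import math


def class_distance(class_a, noise_a, class_b, noise_b):
    """Average pairwise distance between the non-noise objects of two classes (assumed metric)."""
    clean_a = [o for i, o in enumerate(class_a) if i not in noise_a]
    clean_b = [o for i, o in enumerate(class_b) if i not in noise_b]
    if not clean_a or not clean_b:
        return math.inf  # nothing to compare, so never merge
    total = sum(math.dist(a, b) for a in clean_a for b in clean_b)
    return total / (len(clean_a) * len(clean_b))


def merge_if_close(class_a, noise_a, class_b, noise_b, distance_threshold):
    """Return the merged class if the class distance is below the threshold, else None."""
    if class_distance(class_a, noise_a, class_b, noise_b) < distance_threshold:
        return class_a + class_b  # assumption: noise objects stay in the merged class
    return None


first_class = [(0.0, 0.0), (1.0, 0.0), (9.0, 0.0)]  # index 2 was identified as noise
second_class = [(2.0, 0.0), (3.0, 0.0)]

with_noise_removed = class_distance(first_class, {2}, second_class, set())
with_noise_kept = class_distance(first_class, set(), second_class, set())
new_class = merge_if_close(first_class, {2}, second_class, set(), distance_threshold=3.0)
print(with_noise_removed, with_noise_kept, new_class)
```

In this example the noise object inflates the class distance from 2.0 to 3.5, so the two classes would be merged under a threshold of 3.0 only when the noise object is excluded from the calculation.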
Fig. 10 is a block diagram of an apparatus 1000 for clustering according to an exemplary embodiment. For example, the apparatus 1000 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Fig. 10, the apparatus 1000 may include one or more of the following components: a processing component 1002, a memory 1004, a power component 1006, a multimedia component 1008, an audio component 1010, an input/output (I/O) interface 1012, a sensor component 1014, and a communication component 1016.
The processing component 1002 generally controls the overall operation of the apparatus 1000, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 1002 may include one or more processors 1020 to execute instructions so as to complete all or part of the steps of the above method. In addition, the processing component 1002 may include one or more modules that facilitate interaction between the processing component 1002 and other components. For example, the processing component 1002 may include a multimedia module to facilitate interaction between the multimedia component 1008 and the processing component 1002.
The memory 1004 is configured to store various types of data to support operation of the apparatus 1000. Examples of such data include instructions for any application or method operating on the apparatus 1000, contact data, phone book data, messages, pictures, videos, and so on. The memory 1004 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The power component 1006 supplies power to the various components of the apparatus 1000. The power component 1006 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 1000.
The multimedia component 1008 includes a screen that provides an output interface between the apparatus 1000 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 1008 includes a front camera and/or a rear camera. When the apparatus 1000 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 1010 is configured to output and/or input audio signals. For example, the audio component 1010 includes a microphone (MIC), which is configured to receive external audio signals when the apparatus 1000 is in an operating mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signals may be further stored in the memory 1004 or sent via the communication component 1016. In some embodiments, the audio component 1010 also includes a speaker for outputting audio signals.
The I/O interface 1012 provides an interface between the processing component 1002 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 1014 includes one or more sensors for providing status assessments of various aspects of the apparatus 1000. For example, the sensor component 1014 may detect the open/closed state of the apparatus 1000 and the relative positioning of components, such as the display and keypad of the apparatus 1000. The sensor component 1014 may also detect a change in position of the apparatus 1000 or of one of its components, the presence or absence of user contact with the apparatus 1000, the orientation or acceleration/deceleration of the apparatus 1000, and temperature changes of the apparatus 1000. The sensor component 1014 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 1014 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 1014 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1016 is configured to facilitate wired or wireless communication between the apparatus 1000 and other devices. The apparatus 1000 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 1016 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 1016 also includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio-frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 1000 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions is also provided, for example the memory 1004 comprising instructions, where the instructions can be executed by the processor 1020 of the apparatus 1000 to perform the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Those skilled in the art will readily conceive of other embodiments of the present disclosure after considering the specification and practicing the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
Claims (11)
1. A clustering method, characterized by comprising:
Identifying the noise objects in each target class respectively;
Calculating the class distance between a first target class and a second target class according to the non-noise objects in the first target class and the second target class;
If the class distance between the first target class and the second target class meets a preset condition, merging the first target class and the second target class to form a new target class.
2. The clustering method according to claim 1, characterized in that identifying the noise objects in each target class respectively comprises:
For each object in the target class, taken in turn as the target object, judging whether the number of objects within a preset distance of the target object is less than a first threshold;
If the number of objects within the preset distance of the target object is less than the first threshold, confirming that the target object is a noise object.
3. The clustering method according to claim 1, characterized in that identifying the noise objects in each target class respectively comprises:
Calculating the class center of the target class;
Judging whether the distance between a target object in the target class and the class center is greater than a second threshold;
If the distance between the target object and the class center is greater than the second threshold, confirming that the target object is a noise object.
4. The clustering method according to claim 1, characterized by further comprising, before identifying the noise objects in each target class respectively:
Expanding an initial class according to a preset clustering algorithm;
Judging whether the number of objects in the expanded initial class is greater than or equal to a third threshold;
If the number of objects in the expanded initial class is greater than or equal to the third threshold, confirming that the expanded initial class is a target class.
5. The clustering method according to claim 1, characterized by further comprising, after merging the first target class and the second target class:
Judging whether there are two target classes whose class distance meets the preset condition;
If there are two target classes whose class distance meets the preset condition, merging the two target classes.
6. A clustering apparatus, characterized by comprising:
A noise recognition unit, configured to identify the noise objects in each target class respectively;
A distance calculation unit, configured to calculate the class distance between a first target class and a second target class according to the non-noise objects in the first target class and the second target class;
A first merging unit, configured to merge the first target class and the second target class to form a new target class when the class distance between the first target class and the second target class meets a preset condition.
7. The clustering apparatus according to claim 6, characterized in that the noise recognition unit comprises:
A first judgment subunit, configured to, for each object in the target class, taken in turn as the target object, judge whether the number of objects within a preset distance of the target object is less than a first threshold;
A first confirmation subunit, configured to confirm that the target object is a noise object when the number of objects within the preset distance of the target object is less than the first threshold.
8. The clustering apparatus according to claim 6, characterized in that the noise recognition unit comprises:
A center calculation subunit, configured to calculate the class center of the target class;
A second judgment subunit, configured to judge whether the distance between a target object in the target class and the class center is greater than a second threshold;
A second confirmation subunit, configured to confirm that the target object is a noise object when the distance between the target object and the class center is greater than the second threshold.
9. The clustering apparatus according to claim 6, characterized by further comprising:
An initial expansion unit, configured to expand an initial class according to a preset clustering algorithm;
A quantity judgment unit, configured to judge whether the number of objects in the expanded initial class is greater than or equal to a third threshold;
A target confirmation unit, configured to confirm that the expanded initial class is a target class when the number of objects in the expanded initial class is greater than or equal to the third threshold.
10. The clustering apparatus according to claim 6, characterized by further comprising:
A distance judgment unit, configured to judge, after the first target class and the second target class have been merged, whether there are two target classes whose class distance meets the preset condition;
A second merging unit, configured to merge the two target classes when there are two target classes whose class distance meets the preset condition.
11. A clustering apparatus, characterized by comprising:
A processor;
A memory for storing processor-executable instructions;
Wherein the processor is configured to:
Identify the noise objects in each target class respectively;
Calculate the class distance between a first target class and a second target class according to the non-noise objects in the first target class and the second target class;
If the class distance between the first target class and the second target class meets a preset condition, merge the first target class and the second target class to form a new target class.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410841904.7A CN104537380A (en) | 2014-12-30 | 2014-12-30 | Clustering method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104537380A true CN104537380A (en) | 2015-04-22 |
Family
ID=52852900
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410841904.7A Pending CN104537380A (en) | 2014-12-30 | 2014-12-30 | Clustering method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104537380A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20150422 |