CN104182511B

CN104182511B - A kind of fuzzy distribution clustering method that compacts of cluster characteristic weighing

Info

Publication number: CN104182511B
Application number: CN201410413719.8A
Authority: CN
Inventors: 周媛; 王丽娜; 何军
Original assignee: Nanjing University of Information Science and Technology
Current assignee: Nanjing Ditavi Data Technology Co ltd
Priority date: 2014-08-20
Filing date: 2014-08-20
Publication date: 2017-09-26
Anticipated expiration: 2034-08-20
Also published as: CN104182511A

Abstract

For existing WFCM algorithms sample hard plot actual conditions are not accounted in cluster and bad to the unbalanced situation data clusters effect of sample distribution, FCS algorithms do not account for the situation of hard plot boundary point and ignore the problem of sample characteristics parameter influences on all kinds of clusters, the invention discloses a kind of fuzzy distribution clustering method that compacts of cluster characteristic weighing, by to sample degree of membership, feature weight is adjusted, it have followed the actual conditions of sample hard plot, and take into full account influence of the sample characteristics parameter to all kinds of clusters, as far as possible so that being compacted in sample class, disperse between class, solve the problems, such as the sample degree of membership positioned at hard plot border, in the case of sample distribution is unbalanced more effective division is realized for noise data and abnormal data.This method clustering performance is good, fast convergence rate, iteration efficiency high.Clustering performance is good, iteration efficiency high, is suitable for application in the occasion that unbalanced sample distribution in Industry Control, real-time, accuracy rate require high.

Description

Fuzzy compact scattering clustering method based on cluster feature weighting

Technical Field

The invention belongs to the technical field of data processing, and particularly relates to a fuzzy compact scattering clustering method based on cluster feature weighting.

Background

In natural science and social science, a large number of classification problems exist, a clustering method is a statistical analysis method for researching (sample or index) classification problems, and is also an important algorithm for data mining, and the application field is very wide. A fuzzy C-means (FCM) clustering algorithm is a common unsupervised pattern recognition method, and is continuously improved by many people, and the fuzzy C-means (FCM) clustering algorithm considers the influence of each characteristic parameter of a sample on a clustering center and improves the conditions of noise, abnormal data influence and the like. However, these FCM-based clustering algorithms essentially only consider intra-class compactness (intra-class divergence) of samples, but ignore inter-class scatter (inter-class divergence) of samples, and cannot well deal with the problem of data clustering with sample maldistribution. The FCS (fuzzy Compactness and separation) algorithm proposed by Kuo-Lung Wu et al considers the intra-class Compactness and the inter-class dispersion and is compatible with the hard division and the fuzzy division of the sample, which is more practical; a classification method of a maximum divergence difference judgment criterion is proposed by Song Fenxi and other people in China, and the criterion comprehensively considers the inter-class divergence and the intra-class divergence to obtain an optimal projection vector so as to classify samples; the Satsuga et al introduces the ambiguity into the maximum divergence difference discrimination criterion and provides an FMSDC (fuzzy approximation scanner difference criterion) algorithm, and dimension reduction is carried out while fuzzy clustering is carried out; the algorithm of the Satsu et al is a correct version of the algorithm of the Satsu et al, membership and sample mean values are initialized by using an FCM algorithm, dimension reduction is carried out by using an FMSDC algorithm, dimension reduction data are clustered by using an FCS algorithm, and the clustering essence adopts the FCS algorithm.

In the process of classifying data by using the above algorithm, we find that, some actual data are in some hard division areas, the membership degrees of the data do not need to be fuzzified, and how to effectively divide data with unbalanced sample distribution cannot be solved by the FCM algorithm and the related extended FCM algorithm. Although the FCS algorithm considers the problem of hard partitioning of samples, the FCS algorithm does not consider the sample situation at the boundary of hard partitioning, which results in the problem that the algorithm fails when boundary data is encountered when actual data is classified.

Disclosure of Invention

The invention discloses a fuzzy compact scattering clustering method based on cluster feature weighting, which aims at solving the problems that the actual situation of sample hard division is not considered in the clustering process of the conventional WFCM algorithm, the unbalanced data division of sample distribution cannot be well managed, the situation of hard division boundary points is not considered in the FCS algorithm, and the influence of sample feature parameters on various clusters is ignored.

In order to achieve the purpose, the invention provides the following technical scheme:

a cluster feature weighted fuzzy compact scattering clustering method comprises the following steps:

step one, setting membership index m and characteristic weighting index α∈ [ -10, -1 [ ]]∪(1,10]β∈ {0.005,0.05,0.5,1}, an initial iteration number p of 0 and an iteration error > 0, randomly generating an initial clustering center a_i，(s is the number of characteristic parameters);

step two, calculating the coefficient η according to the following formula_i：

Wherein,is the sample mean;

step three: updating the sample membership mu according to_ij：

Note the book

When the sample point x_jWhen there is a fall on a hard division boundary, at this time Δ_ij0, for all Δ s, while ensuring that the distance scale of each sample point with respect to the ith class is unchanged_ijP (. DELTA.) utilization of not less than 0_ij) And (3) adjusting:

after adjustment, the new μ is calculated using the following equation_ij：

Because there is a sample point x_jFalls within the class i hard partition region, so there will be μ_ij< 0, therefore for μ_ijAnd (3) carrying out hard division adjustment:

step four: the feature weight ω is calculated according to the following equation_ik：

Note the book

If Δ_ik< 0 because of ω_ik∈[0,1]Therefore, it is necessary to adjust Δ_ikProjecting to an interval larger than 0 and ensuring the kth characteristic parameter and the ith class of each sampleThe distance scale of the hard segment is unchanged, and then Δ is adjusted using the following equation_k：

After adjustment, a new omega is calculated by utilizing a characteristic weight formula_ik；

Step five: calculating the clustering center a according to the following formula_ik：

Step six: let iteration number p be p +1 untilOtherwise, turning to the step two;

step seven: the mu obtained by the p iteration_ijOutput according toI.e. the jth sample belongs to the ith class.

Further, the sample membership μ_ijAnd a feature weight ω_ikCalculated by the following steps:

establishing an objective function:

the cluster feature weighted FCS clustering problem is expressed as follows:

obtained by using a Lagrange multiplier method:

in the above formula, λ_i、λ_jIs a lagrange multiplier;

respectively corresponding to mu according to the above formula_ij、ω_ik、λ_i、λ_jCalculating the offset and making the offset result zero to obtain mu_ij、ω_ik。

The invention also provides an industrial data classification method based on the fuzzy compact scattering clustering method based on cluster feature weighting, which comprises the following steps: after the data acquired by the sensor is obtained, the acquired data is classified by the CWFCS method (steps one to seven) provided by the invention, and then the current state of the industrial equipment or the process is judged according to the classification result.

Furthermore, the sensor collects the state data of the aircraft engine, and the health state of the aircraft engine is judged.

Has the advantages that:

the method follows the actual situation of sample hard division, fully considers the influence of sample characteristic parameters on the sample division, makes the sample class compact and disperse as much as possible, solves the problem of sample membership degree at the hard division boundary, and realizes more effective division of noise data and abnormal data under the condition of unbalanced sample distribution. The clustering performance is good, the convergence rate is high, and the iteration efficiency is high. Experiments prove that the algorithm has good clustering performance, high convergence speed and high iteration efficiency. Compared with the prior art, the method has high clustering accuracy and obviously reduced time consumption, and is suitable for being applied to occasions with unbalanced sample distribution and high real-time requirement in industrial control.

Drawings

FIG. 1 is a schematic flow chart of the steps of a fuzzy compact scattering clustering method based on cluster feature weighting;

FIG. 2 is a schematic diagram of data distribution of an Iris data set, clustering effects and clustering centers of a CWBCs algorithm, an FCS algorithm and a WFCM algorithm;

fig. 3 is a schematic diagram of the clustering result, the hard partitioning result and the convergence of the CWFCS algorithm when β is 1;

fig. 4 is a schematic diagram of the clustering result, the hard partitioning result and the convergence of the CWFCS algorithm when β is 0.5;

fig. 5 is a schematic diagram of the clustering result, the hard partitioning result and the convergence of the CWFCS algorithm when β is 0.05;

fig. 6 is a schematic diagram of the clustering result, the hard partitioning result and the convergence of the CWFCS algorithm when β is 0.005;

FIG. 7 is a schematic diagram showing the influence of different values of parameters α, β, and m on the clustering result.

Detailed Description

The technical solutions provided by the present invention will be described in detail below with reference to specific examples, and it should be understood that the following specific embodiments are only illustrative of the present invention and are not intended to limit the scope of the present invention.

The invention provides an improved fuzzy compact scattering clustering method based on the thought, wherein the unsupervised clustering of data in real life exists in the hard partition of a sample to a clustering center, the sample on the hard partition boundary has the maximum membership degree to the class compared with the sample outside the hard partition region, but the sample in the relatively hard partition region is relatively fuzzy, and each characteristic parameter of the sample has different influence on various clustering results.

Firstly, defining the intra-class divergence and the inter-class divergence of the weighting of the cluster characteristics as follows:

a characteristic weighting coefficient α ∈ [ -10,0) — (1,10 ];

establishing an objective function:

the cluster feature weighted FCS clustering problem is expressed as follows:

obtained by using a Lagrange multiplier method:

in the above formula, λ_i、λ_jIs a lagrange multiplier;

respectively corresponding to mu according to the above formula_ij、ω_ik、λ_i、λ_jAnd solving the partial derivatives and making the partial derivatives result be zero, and solving:

the fuzzy compact scattering clustering method based on cluster feature weighting, as shown in fig. 1, includes the following steps:

Wherein,is the sample mean;

step three: updating the sample membership mu according to the formula (3)_ij：

Note the book

Consider a sample point x_jIn the case of hard-divided boundary, if μ is calculated directly by equation (3)_ijThe algorithm is invalid when the result is positive infinity; the sample point falling on the i-th class hard division boundary has ambiguity per se, if the sample point is subjected to hard division, the sample point is inconsistent with the actual situation, but is not subjected to hard division with other sample points falling on the hard division boundaryComparison of sample points outside the subarea, x_jThe higher fuzzy membership degree exists for the ith class, and all delta are subjected to the condition of ensuring that the distance scale of each sample point relative to the ith class is unchanged_ijUtilization adjustment function P (Delta) of more than or equal to 0_ij) And (3) adjusting:

after adjustment, the new μ is calculated using the following equation_ij：

Note the book

When delta_ikWhen 0, the k-th characteristic parameter has the same influence on the i-th class, so ω is_ik＝0。

If the sample distribution is very uneven, there is Δ_ik< 0 because of ω_ik∈[0,1]Therefore, it is necessary to adjust Δ_ikProject to an interval greater than 0 and ensure eachThe k characteristic parameter of the sample is invariant to the distance scale of the i hard-divided region, and then Δ is adjusted using the following equation_ik：

Through the steps, the actual situation of sample hard division is followed, the influence of sample characteristic parameters on various divisions is fully considered, the sample classes are compact and dispersed as much as possible, the problem of sample membership degree at the hard division boundary is solved, and more effective division is realized on noise data and abnormal data under the condition of unbalanced sample distribution.

The first embodiment is as follows:

to better illustrate the performance of the present invention, we adopted the method of the present invention to target one of the real data sets of UCI retrieval of machine learning databases: iris data set was subjected to classification experiments, fuzzyThe indexes m are respectively set to be (1.5,2,2.5,3,3.5), and the iteration error precision is 10^-6The Iris data set retains all data of a first class and a second class and randomly selects 10 samples from a third class, and the 110 samples are divided into 3 classes, wherein the 2 nd class and the 3 rd class are crossed, the clustering result obtained by the algorithm (CWFSC algorithm for short) is shown in figures 2-6, the algorithm has a basic clustering function, the clustering result is approximately the same as the original data distribution shown in figure 2(a), figures 3-6 show that the distance between the three classes of clustering centers changes along with β, when β is reduced from 1 to 0.05, the system fuzzy degree is increased, the three classes of clustering centers are gradually closed, the number of the third class samples is far less than that of the first class and the second class and is overlapped with the first class and the second class, in order to make the cluster internal compactness as large as possible, the sample internal distribution is also as large as possible, the parameters of CWFSC algorithm are set to be (0.005, 0.5, 1), the weighted sample distribution speed is higher, the weighted sample distribution speed is set as the hard, the weighted sample distribution efficiency of the algorithm is set to be higher, the weighted sample distribution efficiency of the algorithm is set as high as the weighted sample distribution of the weighted sample distribution algorithm is set as 0.05, the weighted sample distribution of the first class 2-3 cluster algorithm is set as the weighted distribution of the cluster algorithm is higher, the cluster algorithm is set as the hard distribution of the cluster algorithm is set as the cluster distribution of the cluster algorithm is set as the cluster characteristics of the cluster internal distribution of the cluster characteristics of the cluster distribution of the cluster 2-3 cluster algorithm is higher, the cluster algorithm is set as the weighted cluster-3 cluster-6, the weighted cluster-3 cluster algorithm is set as the weighted cluster algorithm is.

Fig. 7 shows the impact of different values of the parameters α, β, m on the clustering. The smaller beta is, the larger the error fraction is; no matter what value beta takes, the algorithm is sensitive to the values of alpha and m when the average error fraction is minimum and beta is less than 0.5 for the same value of beta, alpha is 2 and m belongs to {1.5,2 }. Fig. 7(a) β is 1, and α > 3, when m is an integer (2, 3), the larger α is, the smaller the error fraction is, otherwise, the smaller α is; when α < 0, the error fraction becomes smaller as α becomes smaller, and m has little influence. FIGS. 7(b) - (d) show that when β < 1, the algorithm has substantially the same tendency to be influenced by α and m, and the larger m is for a certain α, the larger the error rate is; for a certain m (without considering the optimal case of α ═ 2), the larger α is the error fraction smaller if α > 0, and the smaller α is the error fraction smaller if α < 0.

Example two:

to verify the superiority of the present invention, we tested the Iris dataset using FCS, WFCM and CFCS provided by the present invention.

In the experiment, fuzzy indexes m in the experiment are respectively set as (1.5,2,2.5,3 and 3.5), and the iteration error precision is 10^-6The parameters β in the CWFSC algorithm are respectively set as (0.005, 0.05,0.5, 1), the experiment is repeated 100 times, and the optimal result and the average result are obtained, the optimal performance of the algorithm is measured by three indexes of Accuracy (Accuracy), iteration frequency (Iter) and execution Time (Time), the overall performance of the algorithm is measured by average Accuracy (avg _ Accuracy, number of correctly divided samples/total number of samples), average iteration frequency (avg _ Iter) and average execution Time (avg _ Time), and the best and average result in the clustering results of the three algorithms is shown in Table 1:

Algorithm	Accury	IterNO	Time	avg_Accury	avg_Iterno	avg_Time
							FCS	0.754545	28	0.028236	0.689091	35	0.193956
WFCM	0.854545	30	0.103216	0.852424	29	0.090867
							CWFCS	0.981818	48	0.055334	0.966364	55	0.063656

TABLE 1

As can be seen from Table 1, for the Iris dataset, both the highest accuracy and the average accuracy of the CFCS algorithm are higher than those of the other two algorithms; the execution time of the CWFS is the shortest, the average execution time of the CWFS is shortened by about 67 percent compared with an FCS algorithm, the average accuracy is improved by 40 percent compared with the FCS algorithm, the time of the CWFS is shortened by 21 percent compared with a WFCM algorithm, and the average accuracy is improved by 23 percent.

The experimental results are obtained based on the noiseless Iris data set, and we can also use three methods of FCS, WFCM and CWFCS provided by the present invention to perform experiments on the noisy Iris data set, and the experimental parameters and environment are the same as those described above for the noiseless Iris data set. The best and average of the clustering results for the three algorithms are shown in table 2:

Algorithm	Accury	IterNO	Time	avg_Accury	avg_Iterno	avg_Time
							FCS	0.754545	40	0.386212	0.720606	62	0.468495
WFCM	0.845455	26	0.109535	0.845455	29	0.101066
							CWFCS	0.972727	29	0.031420	0.887879	43	0.049336

TABLE 2

As can be seen from table 2, the highest and average accuracy of the CWFCS algorithm is also significantly higher for the noisy Iris dataset than the other two algorithms.

Example three:

we then performed experiments on the clean Cancer data set by using FCS, WFCM and CWFCS provided by the present invention, wherein the clean Cancer data set has 30 attributes, and in order to indicate that the samples are unevenly distributed, 10 samples are randomly selected in the first category and 367 samples in the second category, and the results are shown in table 2. Table 3 shows that the CWFCS algorithm performs most stably, the iteration number is slightly higher than the WFCM algorithm, the execution time is within 0.1 second, and the clustering accuracy is higher than the other two algorithms.

Algorithm	Accury	IterNO	Time	avg_Accury	avg_Iterno	avg_Time
							FCS	0.737401	45	0.827577	0.737401	43	0.533281
WFCM	0.819629	11	0.026210	0.767109	11	0.030475
							CWFCS	0.965517	13	0.074786	0.960212	12	0.075808

TABLE 3

Example four:

the aeroengine gas circuit simulation data set (noise adding) is tested by using FCS, WFCM and CWFCS provided by the invention respectively, and the result is shown in Table 4. The GasPath data set is aeroengine gas path data and comprises three characteristic parameters of DEGT, DNH and DFF, wherein 200 healthy data samples are obtained, and 5 fault data samples are randomly selected.

Algorithm	Accury	IterNO	Time	avg_Accury	avg_Iterno	avg_Time
							FCS	0.614634	24	0.290102	0.614634	24	0.181671
WFCM	0.6	19	0.046147	0.6	21	0.052607
							CWFCS	0.917073	15	0.023733	0.86878	23	0.033184

TABLE 4

As shown in table 4, for the GasPath data set, the robustness is very good for the data polluted by noise in engineering application, and the data can be divided more accurately, and for such data, the accuracy of the algorithm for clustering by using the intra-class compactness and the inter-class dispersion of the samples is higher than that of the WFCM algorithm only considering the intra-class compactness.

Example five:

the invention also provides a specific application method in the industrial control, which comprises the following steps:

firstly, state monitoring must be performed on important specific parameters in industrial control (various sensors are usually required to be arranged to obtain comprehensive data), after the data acquired by the sensors are acquired, the acquired data are classified by the CWFCS method (steps one to seven) provided by the invention, and then the current state of the industrial equipment or process is judged according to the classification result. For example, the state of the aircraft engine is monitored by a sensor, and whether the aircraft engine is in an unhealthy state is judged by classifying the collected data (through the CWFS method provided by the invention, steps one to seven).

The technical means disclosed in the invention scheme are not limited to the technical means disclosed in the above embodiments, but also include the technical scheme formed by any combination of the above technical features. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also considered to be within the scope of the present invention.

Claims

1. A cluster feature weighted fuzzy compact scattering clustering method is applied to the field of data processing and is characterized in that after data collected by a sensor are obtained, the collected data are classified by the cluster feature weighted fuzzy compact scattering clustering method, and then the current state of industrial equipment or technology is judged according to the classification result, wherein the cluster feature weighted fuzzy compact scattering clustering method comprises the following steps:

step one, setting membership index m and characteristic weighting index α∈ [ -10, -1 [ ]]∪(1,10]β∈ {0.005,0.05,0.5,1}, initial iteration number p equal to 0 and an iterationThe generation error is more than 0, and an initial clustering center a is randomly generated_i，s is the number of characteristic parameters;

<mrow> <msub> <mi>&eta;</mi> <mi>i</mi> </msub> <mo>=</mo> <mfrac> <mi>&beta;</mi> <mn>4</mn> </mfrac> <mfrac> <mrow> <msub> <mi>min</mi> <mrow> <mi>i</mi> <mo>&NotEqual;</mo> <msup> <mi>i</mi> <mo>&prime;</mo> </msup> </mrow> </msub> <mo>|</mo> <mo>|</mo> <msub> <mi>a</mi> <mi>i</mi> </msub> <mo>-</mo> <msub> <mi>a</mi> <msup> <mi>i</mi> <mo>&prime;</mo> </msup> </msub> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> </mrow> <mrow> <msub> <mi>max</mi> <mi>t</mi> </msub> <mo>|</mo> <mo>|</mo> <msub> <mi>a</mi> <mi>t</mi> </msub> <mo>-</mo> <mover> <mi>X</mi> <mo>&OverBar;</mo> </mover> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> </mrow> </mfrac> </mrow>

Wherein,is the sample mean;

step three: updating the sample membership mu according to_ij：

<mrow> <msub> <mi>&mu;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> </msub> <mo>=</mo> <mfrac> <msup> <mrow> <mo>(</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>k</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>s</mi> </munderover> <msubsup> <mi>&omega;</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> <mi>&alpha;</mi> </msubsup> <mo>(</mo> <mrow> <mo>|</mo> <mo>|</mo> <msub> <mi>x</mi> <mrow> <mi>j</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> <mo>-</mo> <msub> <mi>&eta;</mi> <mi>i</mi> </msub> <mo>|</mo> <mo>|</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <mover> <msub> <mi>X</mi> <mi>k</mi> </msub> <mo>&OverBar;</mo> </mover> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> </mrow> <mo>)</mo> <mo>)</mo> </mrow> <mfrac> <mn>1</mn> <mrow> <mn>1</mn> <mo>-</mo> <mi>m</mi> </mrow> </mfrac> </msup> <mrow> <munderover> <mo>&Sigma;</mo> <mrow> <mi>t</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>c</mi> </munderover> <msup> <mrow> <mo>(</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>k</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>s</mi> </munderover> <msubsup> <mi>&omega;</mi> <mrow> <mi>t</mi> <mi>k</mi> </mrow> <mi>&alpha;</mi> </msubsup> <mo>(</mo> <mrow> <mo>|</mo> <mo>|</mo> <msub> <mi>x</mi> <mrow> <mi>j</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <msub> <mi>a</mi> <mrow> <mi>t</mi> <mi>k</mi> </mrow> </msub> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> <mo>-</mo> <msub> <mi>&eta;</mi> <mi>t</mi> </msub> <mo>|</mo> <mo>|</mo> <msub> <mi>a</mi> <mrow> <mi>t</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <mover> <msub> <mi>X</mi> <mi>k</mi> </msub> <mo>&OverBar;</mo> </mover> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> </mrow> <mo>)</mo> <mo>)</mo> </mrow> <mfrac> <mn>1</mn> <mrow> <mn>1</mn> <mo>-</mo> <mi>m</mi> </mrow> </mfrac> </msup> </mrow> </mfrac> </mrow>

Note the book

<mrow> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> </msub> <mo>=</mo> <munderover> <mo>&Sigma;</mo> <mrow> <mi>k</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>s</mi> </munderover> <msubsup> <mi>&omega;</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> <mi>&alpha;</mi> </msubsup> <mrow> <mo>(</mo> <mo>|</mo> <mo>|</mo> <msub> <mi>x</mi> <mrow> <mi>j</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> <mo>-</mo> <msub> <mi>&eta;</mi> <mi>i</mi> </msub> <mo>|</mo> <mo>|</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <mover> <msub> <mi>X</mi> <mi>k</mi> </msub> <mo>&OverBar;</mo> </mover> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> <mo>)</mo> </mrow> </mrow>

When the sample point x_jWhen there is a fall on a hard division boundary, at this time Δ_ij0, for all Δ s, while ensuring that the distance scale of each sample point with respect to the ith class is unchanged_ijSample Point utilization of P (Δ) of ≧ 0_ij) And (3) adjusting:

<mrow> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> </msub> <mo>=</mo> <mi>P</mi> <mrow> <mo>(</mo> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> </msub> <mo>&GreaterEqual;</mo> <mn>0</mn> <mo>)</mo> </mrow> <mo>=</mo> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> </msub> <mo>+</mo> <mi>r</mi> <mi>a</mi> <mi>n</mi> <mi>d</mi> <mo>*</mo> <munder> <mrow> <mi>m</mi> <mi>i</mi> <mi>n</mi> </mrow> <mi>j</mi> </munder> <mrow> <mo>(</mo> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> </msub> <mo>></mo> <mn>0</mn> <mo>)</mo> </mrow> <mo>,</mo> <mrow> <mo>(</mo> <mi>j</mi> <mo>=</mo> <mn>1</mn> <mo>,</mo> <mo>...</mo> <mo>,</mo> <mi>n</mi> <mo>)</mo> </mrow> </mrow>

after adjustment, the new μ is calculated using the following equation_ij：

<mrow> <msub> <mi>&mu;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> </msub> <mo>=</mo> <mfrac> <mrow> <msup> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> </msub> <mfrac> <mn>1</mn> <mrow> <mn>1</mn> <mo>-</mo> <mi>m</mi> </mrow> </mfrac> </msup> </mrow> <mrow> <munderover> <mo>&Sigma;</mo> <mrow> <mi>t</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>c</mi> </munderover> <msup> <msub> <mi>&Delta;</mi> <mrow> <mi>t</mi> <mi>j</mi> </mrow> </msub> <mfrac> <mn>1</mn> <mrow> <mn>1</mn> <mo>-</mo> <mi>m</mi> </mrow> </mfrac> </msup> </mrow> </mfrac> </mrow>

<mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <mrow> <msub> <mi>&mu;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> </msub> <mo>=</mo> <mn>1</mn> <mo>,</mo> </mrow> </mtd> <mtd> <mrow> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> </msub> <mo><</mo> <mn>0</mn> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <msub> <mi>&mu;</mi> <mrow> <msup> <mi>i</mi> <mo>&prime;</mo> </msup> <mi>j</mi> </mrow> </msub> <mo>=</mo> <mn>0</mn> <mo>,</mo> </mrow> </mtd> <mtd> <mrow> <msup> <mi>i</mi> <mo>&prime;</mo> </msup> <mo>&NotEqual;</mo> <mi>i</mi> </mrow> </mtd> </mtr> </mtable> </mfenced>

<mrow> <msub> <mi>&omega;</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>=</mo> <mfrac> <msup> <mrow> <mo>(</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>n</mi> </munderover> <msubsup> <mi>&mu;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> <mi>m</mi> </msubsup> <mo>(</mo> <mrow> <mo>|</mo> <mo>|</mo> <msub> <mi>x</mi> <mrow> <mi>j</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> <mo>-</mo> <msub> <mi>&eta;</mi> <mi>i</mi> </msub> <mo>|</mo> <mo>|</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <mover> <msub> <mi>X</mi> <mi>k</mi> </msub> <mo>&OverBar;</mo> </mover> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> </mrow> <mo>)</mo> <mo>)</mo> </mrow> <mfrac> <mn>1</mn> <mrow> <mn>1</mn> <mo>-</mo> <mi>&alpha;</mi> </mrow> </mfrac> </msup> <mrow> <munderover> <mo>&Sigma;</mo> <mrow> <mi>t</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>s</mi> </munderover> <msup> <mrow> <mo>(</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>n</mi> </munderover> <msubsup> <mi>&mu;</mi> <mrow> <mi>t</mi> <mi>j</mi> </mrow> <mi>m</mi> </msubsup> <mo>(</mo> <mrow> <mo>|</mo> <mo>|</mo> <msub> <mi>x</mi> <mrow> <mi>j</mi> <mi>t</mi> </mrow> </msub> <mo>-</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>t</mi> </mrow> </msub> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> <mo>-</mo> <msub> <mi>&eta;</mi> <mi>i</mi> </msub> <mo>|</mo> <mo>|</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>t</mi> </mrow> </msub> <mo>-</mo> <mover> <msub> <mi>X</mi> <mi>t</mi> </msub> <mo>&OverBar;</mo> </mover> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> </mrow> <mo>)</mo> <mo>)</mo> </mrow> <mfrac> <mn>1</mn> <mrow> <mn>1</mn> <mo>-</mo> <mi>&alpha;</mi> </mrow> </mfrac> </msup> </mrow> </mfrac> </mrow>1

Note the book

<mrow> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>=</mo> <munderover> <mo>&Sigma;</mo> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>n</mi> </munderover> <msubsup> <mi>&mu;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> <mi>m</mi> </msubsup> <mrow> <mo>(</mo> <mo>|</mo> <mo>|</mo> <msub> <mi>x</mi> <mrow> <mi>j</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> <mo>-</mo> <msub> <mi>&eta;</mi> <mi>i</mi> </msub> <mo>|</mo> <mo>|</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <mover> <msub> <mi>X</mi> <mi>k</mi> </msub> <mo>&OverBar;</mo> </mover> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> <mo>)</mo> </mrow> </mrow>

If Δ_ik< 0 because of ω_ik∈[0,1]Therefore, it is necessary to adjust Δ_ikProjecting to an interval larger than 0 and ensuring that the distance scale of the kth characteristic parameter of each sample and the hard dividing area of the ith class is unchanged, and then adjusting delta by using the following formula_ik：

<mrow> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>=</mo> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <munder> <mrow> <mi>m</mi> <mi>i</mi> <mi>n</mi> </mrow> <mi>k</mi> </munder> <mrow> <mo>(</mo> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>)</mo> </mrow> <mo>+</mo> <munder> <mrow> <mi>m</mi> <mi>i</mi> <mi>n</mi> </mrow> <mi>k</mi> </munder> <mrow> <mo>(</mo> <msub> <mi>&Delta;</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>></mo> <mn>0</mn> <mo>)</mo> </mrow> </mrow>

<mrow> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>=</mo> <mfrac> <mrow> <munderover> <mo>&Sigma;</mo> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>n</mi> </munderover> <msubsup> <mi>&mu;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> <mi>m</mi> </msubsup> <mrow> <mo>(</mo> <msub> <mi>x</mi> <mrow> <mi>j</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <msub> <mi>&eta;</mi> <mi>i</mi> </msub> <mover> <msub> <mi>X</mi> <mi>k</mi> </msub> <mo>&OverBar;</mo> </mover> <mo>)</mo> </mrow> </mrow> <mrow> <munderover> <mo>&Sigma;</mo> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>n</mi> </munderover> <msubsup> <mi>&mu;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> <mi>m</mi> </msubsup> <mrow> <mo>(</mo> <mn>1</mn> <mo>-</mo> <msub> <mi>&eta;</mi> <mi>i</mi> </msub> <mo>)</mo> </mrow> </mrow> </mfrac> </mrow>

2. The cluster-feature-weighted fuzzy compact scattering clustering method of claim 1, wherein: the sample membership mu_ijAnd a feature weight ω_ikCalculated by the following steps:

establishing an objective function:

<mrow> <msub> <mi>J</mi> <mrow> <mi>C</mi> <mi>W</mi> <mi>F</mi> <mi>C</mi> <mi>S</mi> </mrow> </msub> <mo>=</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>c</mi> </munderover> <munderover> <mi>&Sigma;</mi> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>n</mi> </munderover> <munderover> <mi>&Sigma;</mi> <mrow> <mi>k</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>s</mi> </munderover> <msubsup> <mi>&mu;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> <mi>m</mi> </msubsup> <msubsup> <mi>&omega;</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> <mi>&alpha;</mi> </msubsup> <mo>|</mo> <mo>|</mo> <msub> <mi>x</mi> <mrow> <mi>j</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> <mo>-</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>c</mi> </munderover> <munderover> <mi>&Sigma;</mi> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>n</mi> </munderover> <munderover> <mi>&Sigma;</mi> <mrow> <mi>k</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>s</mi> </munderover> <msub> <mi>&eta;</mi> <mi>i</mi> </msub> <msubsup> <mi>&mu;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> <mi>m</mi> </msubsup> <msubsup> <mi>&omega;</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> <mi>&alpha;</mi> </msubsup> <mo>|</mo> <mo>|</mo> <msub> <mi>a</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>-</mo> <mover> <msub> <mi>X</mi> <mi>k</mi> </msub> <mo>&OverBar;</mo> </mover> <mo>|</mo> <msup> <mo>|</mo> <mn>2</mn> </msup> </mrow>

the cluster feature weighted FCS clustering problem is expressed as follows:

<mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <mrow> <mi>min</mi> <mi> </mi> <msub> <mi>J</mi> <mrow> <mi>C</mi> <mi>W</mi> <mi>F</mi> <mi>C</mi> <mi>S</mi> </mrow> </msub> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <mi>s</mi> <mo>.</mo> <mi>t</mi> <mo>.</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>c</mi> </munderover> <msub> <mi>&mu;</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> </msub> <mo>=</mo> <mn>1</mn> <mo>,</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>k</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>s</mi> </munderover> <msub> <mi>&omega;</mi> <mrow> <mi>i</mi> <mi>k</mi> </mrow> </msub> <mo>=</mo> <mn>1</mn> </mrow> </mtd> </mtr> </mtable> </mfenced>

obtained by using a Lagrange multiplier method:

in the above formula, λ_i、λ_jIs a lagrange multiplier;

respectively corresponding to mu according to the above formula_ij、ω_ik、λ_i、λ_jObtaining a deviation and making the deviation result zero to obtain mu_ij、ω_ik。

3. The cluster-feature-weighted fuzzy compact scattering clustering method of claim 1, wherein: the sensor collects the state data of the aircraft engine, and the health state of the aircraft engine is judged.