
CN115238683B - Method, device, equipment and medium for recognizing stop words of circulating self-attention - Google Patents

Method, device, equipment and medium for recognizing stop words of circulating self-attention Download PDF

Info

Publication number
CN115238683B
CN115238683B (Application CN202210949814.4A)
Authority
CN
China
Prior art keywords
word
attention
weighted
word vector
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210949814.4A
Other languages
Chinese (zh)
Other versions
CN115238683A (en)
Inventor
舒畅
陈又新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202210949814.4A priority Critical patent/CN115238683B/en
Publication of CN115238683A publication Critical patent/CN115238683A/en
Application granted granted Critical
Publication of CN115238683B publication Critical patent/CN115238683B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to the technical field of speech and semantics, and discloses a method, an apparatus, a device and a medium for cyclic self-attention stop word recognition. The method comprises the following steps: performing word segmentation on a text sentence to obtain a phrase set; performing quantization coding on the phrase set, and performing a filling connection operation on the coding result according to a preset filling ordering strategy to obtain an initial quantized text matrix; performing a matrix product of the initial quantized text matrix and a random initialization weight matrix to obtain a text quantization matrix; performing attention weight configuration on the word vector corresponding to each phrase in the text quantization matrix by using a pre-constructed attention configuration network to obtain an attention text quantization matrix, and repeating this a preset number of times to obtain an updated attention text quantization matrix; and performing a low weight vector recognition operation on the updated attention text quantization matrix to obtain the stop word vectors. The method and the device can recognize stop words dynamically without querying a stop word dictionary.

Description

Method, apparatus, device and medium for cyclic self-attention stop word recognition
Technical Field
The invention relates to the technical field of speech and semantics, and in particular to a method, an apparatus, a device and a medium for cyclic self-attention stop word recognition.
Background
With advances in algorithms, semantic recognition has become a common technique in intelligent text processing. In the process of semantic recognition, the text must first be segmented into words, the stop words among the segmented words are then deleted, and the remaining words are finally submitted to semantic recognition. Stop words are words that occur frequently but carry no practical meaning; deleting them during semantic recognition saves storage space and improves recognition efficiency.
At present, stop word processing relies on a stop word dictionary built from a manually collected series of words. Before a text sentence is fed into the model, it is segmented, each segmented word is compared in turn against the words in the stop word dictionary, and any word that appears in the stop word dictionary is deleted. Both the construction of the stop word dictionary and the word-by-word traversal and comparison consume a great deal of computing resources.
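For reference, the conventional approach just described can be sketched in a few lines of Python; the stop word list and tokens here are illustrative placeholders, not taken from the patent.

import string

# Conventional dictionary-lookup stop word removal (illustrative sketch).
stop_words = {"the", "a", "of", "and", "is"}   # manually collected stop word dictionary

def remove_stop_words(tokens):
    """Compare each segmented word against the dictionary and drop the matches."""
    return [w.strip(string.punctuation) for w in tokens if w not in stop_words]

print(remove_stop_words(["the", "model", "is", "a", "classifier"]))  # ['model', 'classifier']

Each lookup is cheap, but the dictionary itself must be built and maintained by hand, which is the cost the present invention aims to remove.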
Disclosure of Invention
The invention provides a method, an apparatus, a device and a medium for cyclic self-attention stop word recognition, with the main aim of recognizing stop words dynamically without querying a stop word dictionary.
To achieve the above object, the present invention provides a cyclic self-attention stop word recognition method, comprising:
performing word segmentation processing on the pre-constructed text sentence by using a word segmentation tool to obtain a phrase set;
performing quantization coding on the phrase set by using a one-hot quantization tool to obtain an initial word vector set, and performing filling connection operation on each initial word vector in the initial word vector set according to a preset filling ordering strategy to obtain an initial quantized text matrix;
performing matrix multiplication operation on the initial quantized text matrix by using a pre-constructed random initialization weight matrix to obtain a text quantized matrix;
performing attention weight configuration on word vectors corresponding to each phrase in the text quantization matrix by using a pre-constructed attention configuration network to obtain an attention text quantization matrix;
according to a preset cycling strategy, repeating the attention weight configuration operation on the word vectors corresponding to each phrase in the text quantization matrix a preset number of times by using the pre-constructed attention configuration network, to obtain an updated attention text quantization matrix;
and performing low-weight vector recognition operation on the updated attention text quantization matrix by using a pre-trained downstream task classifier to obtain a stop word vector.
Optionally, the performing, by using a pre-constructed attention configuration network, attention weight configuration operation on word vectors corresponding to each phrase in the text quantization matrix to obtain an attention text quantization matrix includes:
extracting, in sequence, the word vector corresponding to each phrase from the text quantization matrix by using a pre-constructed attention configuration network, and performing weighted calculation on the word vector according to a preset first tensor, a preset second tensor and a preset third tensor, respectively, to obtain a first word vector set, a second word vector set and a third word vector set;
sequentially extracting a first word vector from the first word vector set, carrying out the vector inner product of the extracted first word vector with each second word vector in the second word vector set in turn to obtain a vector association value set, and performing a normalization operation on the vector association value set by using a softmax function to obtain an attention weight set;
carrying out weight calculation on each third word vector in the third word vector set and each attention weight in the attention weight set according to the corresponding relation of each phrase to obtain a weighted vector set;
respectively carrying out weighted calculation on the weighted vector sets according to a first random weight matrix, a second random weight matrix and a third random weight matrix which are pre-constructed to obtain a first weighted word vector set, a second weighted word vector set and a third weighted word vector set;
sequentially extracting a first weighted word vector from the first weighted word vector set, carrying out the vector inner product of the extracted first weighted word vector with each second weighted word vector in the second weighted word vector set in turn to obtain a weighted vector association value set, and performing a normalization operation on the weighted vector association value set by using the softmax function to obtain a weighted attention weight set;
carrying out weight calculation on each third weighted word vector in the third weighted word vector set and each weighted attention weight in the weighted attention weight set according to the corresponding relation of each phrase to obtain a re-weighted vector set;
and performing word-level-based full-connection operation on each re-weighted vector in the re-weighted vector set to obtain an attention text quantization matrix.
Optionally, the performing filling connection operation on each initial word vector in the initial word vector set according to a preset filling ordering policy to obtain an initial quantized text matrix includes:
configuring word vector length according to the number of the phrases in the phrase set;
performing length complementation operation on each initial word vector in the initial word vector set to obtain a complement word vector;
And connecting the complementary word vectors corresponding to the phrases according to the sequence of the phrases in the text sentence to obtain an initial quantized text matrix.
Optionally, before the low weight vector recognition operation is performed on the updated attention text quantization matrix by using the pre-trained downstream task classifier, the method further includes:
connecting the attention configuration network with a pre-constructed downstream task classifier to obtain a downstream task classification model;
acquiring a training sample set corresponding to a downstream task, performing forward network calculation on the downstream task classification model by utilizing training samples in the training sample set to obtain a prediction result, and calculating a loss value of a real result corresponding to the prediction result and the training samples by utilizing a cross entropy loss function;
minimizing the loss value to obtain the model parameters at the minimum loss value, and using these model parameters to update the downstream task classification model through network back-propagation;
judging whether the loss value converges or not;
when the loss value is not converged, returning to a step of acquiring a training sample set corresponding to a downstream task, and acquiring a new sample to perform iterative training on the downstream task classification model;
And when the loss value converges, obtaining a trained downstream task classification model.
Optionally, after performing the low weight vector recognition operation on the updated attention text quantization matrix by using the pre-trained downstream task classifier to obtain the stop word vector, the method may further include:
according to the stop word vector, performing stop word filtering on the updated attention text quantization matrix to obtain a text screening quantization matrix;
and performing downstream task processing on the text screening quantization matrix by using the downstream task classifier.
In order to solve the above-mentioned problems, the present invention also provides a stop word recognition apparatus for circulating self-attention, the apparatus comprising:
the word segmentation module is used for carrying out word segmentation on the pre-constructed text sentence by using a word segmentation tool to obtain a phrase set;
the quantization module is used for carrying out quantization coding on the phrase set by using a one-hot quantization tool to obtain an initial word vector set, carrying out filling connection operation on each initial word vector in the initial word vector set according to a preset filling ordering strategy to obtain an initial quantized text matrix, and carrying out matrix product operation on the initial quantized text matrix by using a pre-built random initialization weight matrix to obtain a text quantization matrix;
The attention configuration module is used for performing attention weight configuration on the word vectors corresponding to each phrase in the text quantization matrix by using a pre-constructed attention configuration network to obtain an attention text quantization matrix, and for repeating the attention weight configuration operation on the word vectors corresponding to each phrase in the text quantization matrix a preset number of times by using the pre-constructed attention configuration network according to a preset cycling strategy to obtain an updated attention text quantization matrix;
and the stop word recognition module is used for carrying out low weight vector recognition operation on the updated attention text quantization matrix by utilizing the pre-trained downstream task classifier to obtain a stop word vector.
Optionally, the performing, by using a pre-constructed attention configuration network, attention weight configuration operation on word vectors corresponding to each phrase in the text quantization matrix to obtain an attention text quantization matrix includes:
extracting, in sequence, the word vector corresponding to each phrase from the text quantization matrix by using a pre-constructed attention configuration network, and performing weighted calculation on the word vector according to a preset first tensor, a preset second tensor and a preset third tensor, respectively, to obtain a first word vector set, a second word vector set and a third word vector set;
sequentially extracting a first word vector from the first word vector set, carrying out the vector inner product of the extracted first word vector with each second word vector in the second word vector set in turn to obtain a vector association value set, and performing a normalization operation on the vector association value set by using a softmax function to obtain an attention weight set;
carrying out weight calculation on each third word vector in the third word vector set and each attention weight in the attention weight set according to the corresponding relation of each phrase to obtain a weighted vector set;
performing weight calculation on the weighted vector set according to a first random weight matrix, a second random weight matrix and a third random weight matrix which are pre-constructed to obtain a first weighted word vector set, a second weighted word vector set and a third weighted word vector set;
sequentially extracting a first weighted word vector from the first weighted word vector set, carrying out the vector inner product of the extracted first weighted word vector with each second weighted word vector in the second weighted word vector set in turn to obtain a weighted vector association value set, and performing a normalization operation on the weighted vector association value set by using the softmax function to obtain a weighted attention weight set;
carrying out weight calculation on each third weighted word vector in the third weighted word vector set and each weighted attention weight in the weighted attention weight set according to the corresponding relation of each phrase to obtain a re-weighted vector set;
and performing word-level-based full-connection operation on the re-weighted vector set to obtain an attention text quantization matrix.
Optionally, the performing filling connection operation on each initial word vector in the initial word vector set according to a preset filling ordering policy to obtain an initial quantized text matrix includes:
configuring word vector length according to the number of the phrases in the phrase set;
performing length complementation operation on each initial word vector in the initial word vector set to obtain a complement word vector;
and connecting the complementary word vectors corresponding to the phrases according to the sequence of the phrases in the text sentence to obtain an initial quantized text matrix.
In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:
at least one processor; the method comprises the steps of,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the cyclic self-attention stop word recognition method described above.
In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium having stored therein at least one computer program that is executed by a processor in an electronic device to implement the above-mentioned cyclic self-attention stop word recognition method.
According to the embodiment of the invention, a word segmentation tool is used to obtain the phrase set, which is then quantization-coded to obtain the text quantization matrix; no stop word deletion is performed on the text quantization matrix, so the stop word query and the construction and updating of a stop word dictionary are avoided, which saves computing resources and time. The invention uses the pre-constructed attention configuration network to perform the attention weight configuration operation on the word vector corresponding to each phrase in the text quantization matrix to obtain the attention text quantization matrix. The attention configuration network is built by combining several units, each formed from a cyclic attention mechanism and a fully connected network; each unit carries out one cycle of attention distribution and configures weights according to the relative importance of the phrases, so the weights of stop words are reduced automatically. Arranging several such units in the attention configuration network according to the preset cycling strategy reduces the stop word weights further, and the stop words are finally obtained by low weight recognition. Therefore, the method, apparatus, device and storage medium for cyclic self-attention stop word recognition provided by the embodiments of the invention can recognize stop words dynamically without querying a stop word dictionary.
Drawings
FIG. 1 is a flowchart of a method for recognizing a stop word with cyclic self-attention according to an embodiment of the present invention;
FIG. 2 is a detailed flowchart of a step in a method for recognizing a stop word with cyclic self-attention according to an embodiment of the present invention;
FIG. 3 is a detailed flowchart of a step in a method for recognizing a stop word with cyclic self-attention according to an embodiment of the present invention;
FIG. 4 is a detailed flowchart of a step in a method for recognizing a stop word with cyclic self-attention according to an embodiment of the present invention;
FIG. 5 is a functional block diagram of a circulating self-attention stop word recognition device according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an electronic device implementing the method for recognizing a stop word with cyclic self-attention according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the application provides a cyclic self-attention stop word recognition method. In the embodiment of the present application, the execution body of the cyclic self-attention stop word recognition method includes, but is not limited to, at least one of a server, a terminal, and the like that can be configured to execute the method provided in the embodiment of the present application. In other words, the cyclic self-attention stop word recognition method may be performed by software or hardware installed on a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to: a single server, a server cluster, a cloud server, a cloud server cluster, and the like. The server may be an independent server, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, a Content Delivery Network (CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
Referring to fig. 1, a flowchart of a method for recognizing a stop word with cyclic self-attention according to an embodiment of the invention is shown. In this embodiment, the method for recognizing the stop word of cyclic self-attention includes the following steps S1 to S6:
s1, performing word segmentation on the pre-constructed text sentence by using a word segmentation tool to obtain a phrase set.
In the embodiment of the invention, the word segmentation tool may be an open-source tool such as jieba or the NLPIR Chinese word segmentation system, which performs the word segmentation operation to obtain the phrase set.
In the embodiment of the invention, the text sentence may be, for example, "I have already started working at home", and the word segmentation process yields the phrase set consisting of the segmented words of the sentence, six phrases in this example.
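As an illustration of step S1, a minimal Python sketch using the open-source jieba segmenter mentioned above; the input sentence is a hypothetical stand-in for the translated example, not the patent's original text.

import jieba

text_sentence = "我已经开始在家工作"      # hypothetical sentence standing in for the example above
phrase_set = jieba.lcut(text_sentence)   # word segmentation, preserving sentence order
print(phrase_set)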
S2, carrying out quantization coding on the phrase set by using a one-hot quantization tool to obtain an initial word vector set, and carrying out filling connection operation on each initial word vector in the initial word vector set according to a preset filling ordering strategy to obtain an initial quantized text matrix.
The one-hot quantization tool is an encoding tool that encodes a plurality of states using a multi-bit state register, wherein each phrase may be a state.
In detail, referring to fig. 2, in the embodiment of the present invention, according to a preset filling and sorting policy, filling and connecting each initial word vector in the initial word vector set to obtain an initial quantized text matrix, including steps S21 to S23:
s21, configuring word vector length according to the number of phrases in the phrase set;
s22, performing length complementation operation on each initial word vector in the initial word vector set to obtain a complement word vector;
s23, connecting the complement word vectors corresponding to the phrases according to the sequence of the phrases in the text sentence, and obtaining an initial quantized text matrix.
In the embodiment of the invention, quantization coding is performed by the one-hot quantization tool, so that the first phrase may be quantized as 1, the second phrase as 10, the third phrase as 100, and so on.
Since the phrase set is known to contain 6 phrases, the coding length can be set to 6, and each quantized code is padded to that length, giving the quantized codes "000001", "000010", ..., "100000".
Further, in the embodiment of the present invention, the sequence of each phrase in the phrase set obtained by the word segmentation tool is disordered, and in order to enhance the text recognition accuracy, it is necessary to arrange according to the sequence of the text sentence, so as to obtain an initial quantized text matrix, for example:
(The initial quantized text matrix of this example is the 6 × 6 matrix whose rows are the padded one-hot codes "000001" through "100000", arranged in the order of the phrases in the text sentence.)
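A minimal numpy sketch of steps S2 / S21-S23 as described above: one-hot coding, padding each code to the phrase count, and connecting the padded codes in sentence order. The placeholder phrase names are illustrative.

import numpy as np

def initial_quantized_text_matrix(phrase_set):
    # Word-vector length equals the number of phrases (S21).
    n = len(phrase_set)
    matrix = np.zeros((n, n), dtype=np.float32)
    for i, _phrase in enumerate(phrase_set):
        # Pad each one-hot code to length n (S22): "000001", "000010", ..., "100000".
        matrix[i, n - 1 - i] = 1.0
    # Rows are already connected in the order of the phrases in the sentence (S23).
    return matrix

phrases = ["w1", "w2", "w3", "w4", "w5", "w6"]  # placeholder six-phrase set
print(initial_quantized_text_matrix(phrases))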
s3, performing matrix product operation on the initial quantized text matrix by utilizing the pre-constructed random initialization weight matrix to obtain the text quantized matrix.
In the embodiment of the present invention, a random initialization weight matrix W is generated by a matrix generating tool, where W is N × 512 and N is the number of phrases in the phrase set, here 6. A 6 × 512 text quantization matrix is then obtained by taking the matrix product of the 6 × 6 initial quantized text matrix and the 6 × 512 random initialization weight matrix W.
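A short sketch of step S3 under the dimensions used in the example (6 phrases, 512-dimensional word vectors); the random seed and generator are illustrative choices.

import numpy as np

N, D = 6, 512
rng = np.random.default_rng(0)
W = rng.standard_normal((N, D)).astype(np.float32)        # random initialization weight matrix, N x 512

initial_matrix = np.fliplr(np.eye(N, dtype=np.float32))   # rows "000001" ... "100000" from the example
text_quantization_matrix = initial_matrix @ W             # matrix product -> 6 x 512 text quantization matrix
print(text_quantization_matrix.shape)                     # (6, 512)

Because each row of the initial matrix is one-hot, the product simply selects one row of W per phrase, so W plays the role of a randomly initialized embedding table.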
S4, performing attention weight configuration operation on word vectors corresponding to each phrase in the text quantization matrix by using a pre-constructed attention configuration network to obtain an attention text quantization matrix.
In the embodiment of the invention, the attention configuration network is a multi-unit network formed from an attention mechanism and a fully connected layer, and is used to continuously widen the gap between ordinary phrases and stop phrases through the attention mechanism.
Further, the text quantization matrix has 6 rows, and the 512 values in each row represent the word vector of one phrase.
In detail, referring to fig. 3, in the embodiment of the present invention, the performing an attention weight configuration operation on word vectors corresponding to each phrase in the text quantization matrix by using a pre-constructed attention configuration network to obtain an attention text quantization matrix includes steps S41 to S47:
s41, extracting, in sequence, the word vector corresponding to each phrase from the text quantization matrix by using a pre-constructed attention configuration network, and performing weighted calculation on the word vector according to a preset first tensor, a preset second tensor and a preset third tensor, respectively, to obtain a first word vector set, a second word vector set and a third word vector set;
s42, sequentially extracting a first word vector from the first word vector set, carrying out the vector inner product of the extracted first word vector with each second word vector in the second word vector set in turn to obtain a vector association value set, and performing a normalization operation on the vector association value set by using a softmax function to obtain an attention weight set;
s43, carrying out weight calculation on each third word vector in the third word vector set and each attention weight in the attention weight set according to the corresponding relation of each phrase to obtain a weighted vector set;
S44, respectively carrying out weighted calculation on the weighted vector sets according to the first random weight matrix, the second random weight matrix and the third random weight matrix which are pre-constructed to obtain a first weighted word vector set, a second weighted word vector set and a third weighted word vector set;
s45, sequentially extracting a first weighted word vector from the first weighted word vector set, carrying out the vector inner product of the extracted first weighted word vector with each second weighted word vector in the second weighted word vector set in turn to obtain a weighted vector association value set, and performing a normalization operation on the weighted vector association value set by using the softmax function to obtain a weighted attention weight set;
s46, carrying out weight calculation on each third weighted word vector in the third weighted word vector set and each weighted attention weight in the weighted attention weight set according to the corresponding relation of each phrase to obtain a re-weighted vector set;
s47, performing word-level-based full-connection operation on each re-weighted vector in the re-weighted vector set to obtain an attention text quantization matrix.
In the embodiment of the present invention, the word vector x corresponding to each phrase is extracted in turn from the text quantization matrix as a 1 × 512 vector, and the word vector is then weighted into three parts that serve as the first tensor q, the second tensor k and the third tensor v, where the three weight configurations can be the same.
After all word vectors in the text quantization matrix have been extracted, a first word vector set qi, a second word vector set ki and a third word vector set vi are obtained, where i ranges from 1 to 6.
A first word vector, for example q1, is then extracted from the first word vector set qi, and vector inner products of q1 with k1, k2, k3, and so on are computed: the inner product of q1 and k1 gives a1, the inner product of q1 and k2 gives a2, the inner product of q1 and k3 gives a3, and so on. Since q1 and each ki are 1 × 512 vectors and, by the mathematical convention that the inner product of two vectors measures their degree of association, a1, a2, a3, etc. are scalar values rather than vectors, a1, a2, a3, etc. are taken as the inputs of softmax to obtain the attention weight set a1', a2', a3', etc., with a1' + a2' + a3' + ... = 1. Then v1 is multiplied by a1', v2 by a2', v3 by a3', and so on, to obtain the weighted vector set v1', v2', v3', etc.
In order to achieve model data alignment, the embodiment of the invention performs the weighted attention configuration operation on v1', v2', v3', ... again through a first random weight matrix Wq, a second random weight matrix Wk and a third random weight matrix Wv to obtain v1'', v2'', v3'', ..., where Wq, Wk and Wv are all 512 × 512.
Then, in the embodiment of the invention, v1'', v2'', v3'', etc. are given brand-new weights in a word-level full-connection operation through the fully connected layer, yielding x' composed of w1·v1'', w2·v2'', w3·v3'', and so on, where the weight coefficients w1, w2, w3, etc. on the fully connected neurons can be optimized in the subsequent training process.
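Putting steps s41-s47 together, the following numpy sketch is one possible reading of a single attention-configuration unit. The preset tensors, the 512 × 512 random matrices standing in for Wq, Wk and Wv, and the word-level fully connected weights are all illustrative placeholders, and the exact way the per-query outputs are combined is an assumption where the description leaves room.

import numpy as np

N, D = 6, 512
rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Preset first/second/third tensors; the description notes the three weight
# configurations can be the same, so identity weights are used as a placeholder.
t_q = t_k = t_v = np.ones(D, dtype=np.float32)
# Random 512 x 512 matrices standing in for Wq, Wk, Wv (second weighting round).
Wq, Wk, Wv = (rng.standard_normal((D, D)).astype(np.float32) for _ in range(3))
# Word-level fully connected weights w1, w2, ...; optimized during training, random here.
w_fc = rng.standard_normal(N).astype(np.float32)

def attention_unit(X):
    """One attention-configuration unit (s41-s47); X is the N x 512 text quantization matrix."""
    Q, K, V = X * t_q, X * t_k, X * t_v              # s41: first/second/third word vector sets
    X_out = np.zeros_like(X)
    for i in range(N):                               # take one first word vector (query) at a time
        a = softmax(K @ Q[i])                        # s42: inner products -> attention weight set
        V1 = a[:, None] * V                          # s43: weighted vector set v1', v2', ...
        Q2, K2, V2 = V1 @ Wq, V1 @ Wk, V1 @ Wv       # s44: re-weighting with Wq, Wk, Wv
        a2 = softmax(K2 @ Q2[i])                     # s45: weighted attention weight set
        V3 = a2[:, None] * V2                        # s46: re-weighted vector set v1'', v2'', ...
        X_out[i] = w_fc @ V3                         # s47: word-level full connection -> x'
    return X_out                                     # attention text quantization matrix

X here would be the 6 × 512 text quantization matrix produced in step S3.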
S5, according to a preset cycling strategy, repeating the attention weight configuration operation on the word vectors corresponding to each phrase in the text quantization matrix a preset number of times by using the pre-constructed attention configuration network, to obtain an updated attention text quantization matrix.
In the embodiment of the present invention, the cycling strategy means that the attention text quantization matrix produced by the previous unit is used as the input of the next unit, and this is repeated a preset number of times to obtain the updated attention text quantization matrix.
When the preset number of cycles is set to 10, the invention obtains a good attention weight configuration effect.
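The cycling strategy of step S5 then amounts to a short loop; attention_unit refers to the unit sketched after step S4, and the count of 10 follows the example above.

def updated_attention_matrix(X, unit, n_cycles=10):
    # Each unit's attention text quantization matrix becomes the input of the next unit.
    for _ in range(n_cycles):
        X = unit(X)
    return X

# e.g. updated = updated_attention_matrix(text_quantization_matrix, attention_unit)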
S6, performing low weight vector recognition operation on the updated attention text quantization matrix by using a pre-trained downstream task classifier to obtain a stop word vector.
In the embodiment of the invention, the downstream task classifier is a common model in the text recognition field, such as a binary or multi-class text classification model.
In the embodiment of the invention, the downstream task classifier is similar to the Decoder in a Transformer, and the attention configuration network can serve as the Encoder.
In detail, referring to fig. 4, in the embodiment of the present invention, before the low weight vector recognition operation is performed on the updated attention text quantization matrix by using the pre-trained downstream task classifier, the method further includes steps S61 to S65:
s61, connecting the attention configuration network with a pre-constructed downstream task classifier to obtain a downstream task classification model;
s62, acquiring a training sample set corresponding to a downstream task, performing forward network calculation on the downstream task classification model by utilizing training samples in the training sample set to obtain a prediction result, and calculating a loss value of a real result corresponding to the prediction result and the training samples by utilizing a cross entropy loss function;
s63, minimizing the loss value to obtain the model parameters at the minimum loss value, and using these model parameters to update the downstream task classification model through network back-propagation;
S64, judging whether the loss value is converged or not;
when the loss value does not converge, returning to step S62 and obtaining new samples to perform iterative training on the downstream task classification model;
and S65, when the loss value converges, obtaining the trained downstream task classification model.
According to the invention, the training sample set corresponding to the downstream task is acquired and the downstream task classification model is trained on it; the training result is obtained through a gradient descent method and the cross entropy loss function, and is then fed back through a preset BP neural network, so that the updated downstream task classification model is obtained. The training process is controlled by the convergence of the loss value: when the loss value converges, the training effect of the model has become stable, over-fitting is avoided, and the trained downstream task classification model is obtained.
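A minimal PyTorch-style sketch of the training procedure in steps s61-s65, assuming model wraps the attention configuration network connected to the downstream task classifier and loader yields (sample, label) pairs from the downstream training set; the epoch count, learning rate and convergence tolerance are illustrative, not values from the patent.

import torch
import torch.nn as nn

def train(model, loader, epochs=10, lr=1e-3, tol=1e-4):
    criterion = nn.CrossEntropyLoss()                        # cross entropy loss function (s62)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)   # gradient descent
    prev_loss = float("inf")
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            pred = model(x)                                  # forward network calculation (s62)
            loss = criterion(pred, y)                        # loss between prediction and real result
            loss.backward()                                  # back-propagate the training result (s63)
            optimizer.step()
        if abs(prev_loss - loss.item()) < tol:               # crude convergence check on the loss (s64)
            break                                            # converged: trained model obtained (s65)
        prev_loss = loss.item()
    return model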
In addition, in another embodiment of the present invention, after performing the low weight vector recognition operation on the updated attention text quantization matrix by using the pre-trained downstream task classifier to obtain the stop word vector, the method may further include: performing stop word filtering on the updated attention text quantization matrix according to the stop word vector to obtain a text screening quantization matrix; and performing downstream task processing on the text screening quantization matrix by using the downstream task classifier.
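The stop word filtering of this embodiment can be sketched as follows; the norm-threshold rule used to flag low weight vectors is a hypothetical stand-in for the classifier-based recognition, since the patent does not spell out the exact criterion.

import numpy as np

def filter_stop_word_vectors(updated_matrix, threshold=0.1):
    """Treat rows whose magnitude has been driven close to zero by the attention cycles as
    stop word vectors, and keep the remaining rows as the text screening quantization matrix."""
    norms = np.linalg.norm(updated_matrix, axis=1)
    keep = norms >= threshold * norms.max()
    stop_word_vectors = updated_matrix[~keep]
    text_screening_matrix = updated_matrix[keep]
    return stop_word_vectors, text_screening_matrix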
According to the embodiment of the invention, through continuous self-attention cycles and full-connection operations, the difference between the weights of meaningful words and stop words is increased according to the correlation among the words, so that the weights of the unnecessary words become smaller and smaller in the subsequent processing, which produces the same effect as traditional stop word deletion.
According to the embodiment of the invention, a word segmentation tool is used to obtain the phrase set, which is then quantization-coded to obtain the text quantization matrix; no stop word deletion is performed on the text quantization matrix, so the stop word query and the construction and updating of a stop word dictionary are avoided, which saves computing resources and time. The invention uses the pre-constructed attention configuration network to perform the attention weight configuration operation on the word vector corresponding to each phrase in the text quantization matrix to obtain the attention text quantization matrix. The attention configuration network is built by combining several units, each formed from a cyclic attention mechanism and a fully connected network; each unit carries out one cycle of attention distribution and configures weights according to the relative importance of the phrases, so the weights of stop words are reduced automatically. Arranging several such units in the attention configuration network according to the preset cycling strategy reduces the stop word weights further, and the stop words are finally obtained by low weight recognition. Therefore, the cyclic self-attention stop word recognition method can recognize stop words dynamically without querying a stop word dictionary.
FIG. 5 is a functional block diagram of a circulating self-attention stop word recognition device according to an embodiment of the present invention.
The cyclic self-attention stop word recognition apparatus 100 of the present invention may be installed in an electronic device. Depending on the functions implemented, the cyclic self-attention stop word recognition apparatus 100 may include a word segmentation module 101, a quantization module 102, an attention configuration module 103, and a stop word recognition module 104. The modules of the invention, which may also be referred to as units, are series of computer program segments that are stored in the memory of the electronic device, can be executed by the processor of the electronic device, and perform a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
the word segmentation module 101 is configured to perform word segmentation on the pre-constructed text sentence by using a word segmentation tool to obtain a phrase set;
the quantization module 102 is configured to perform quantization encoding on the phrase set by using a one-hot quantization tool to obtain an initial word vector set, perform filling connection operation on each initial word vector in the initial word vector set according to a preset filling ordering policy to obtain an initial quantized text matrix, and perform matrix product operation on the initial quantized text matrix by using a pre-constructed random initialization weight matrix to obtain a text quantization matrix;
The attention configuration module 103 is configured to perform attention weight configuration on the word vectors corresponding to each phrase in the text quantization matrix by using a pre-constructed attention configuration network to obtain an attention text quantization matrix, and to repeat the attention weight configuration operation on the word vectors corresponding to each phrase in the text quantization matrix a preset number of times by using the pre-constructed attention configuration network according to a preset cycling strategy to obtain an updated attention text quantization matrix;
the stop word recognition module 104 is configured to perform a low weight vector recognition operation on the updated attention text quantization matrix by using a pre-trained downstream task classifier, so as to obtain a stop word vector.
In detail, each module of the cyclic self-attention stop word recognition apparatus 100 in the embodiment of the present application adopts the same technical means as the cyclic self-attention stop word recognition method described with reference to figs. 1 to 5, and can produce the same technical effects, which are not repeated here.
Fig. 6 is a schematic structural diagram of an electronic device 1 implementing a method for recognizing a stop word with cyclic self-attention according to an embodiment of the present invention.
The electronic device 1 may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program stored in the memory 11 and executable on the processor 10, such as a cyclic self-attentive stop-word recognition program.
The processor 10 may be formed by an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be formed by a plurality of integrated circuits packaged with the same function or different functions, including one or more central processing units (Central Processing Unit, CPU), a microprocessor, a digital processing chip, a graphics processor, a combination of various control chips, and so on. The processor 10 is the Control Unit of the electronic device 1; it connects the components of the entire electronic device using various interfaces and lines, runs or executes programs or modules stored in the memory 11 (for example, the cyclic self-attention stop word recognition program), and invokes data stored in the memory 11 to perform the various functions of the electronic device and process data.
The memory 11 includes at least one type of readable storage medium including flash memory, a removable hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, such as a mobile hard disk of the electronic device. The memory 11 may in other embodiments also be an external storage device of the electronic device, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only for storing application software installed in the electronic device and various types of data, such as codes of a stop word recognition program that loops self-attention, but also for temporarily storing data that has been output or is to be output.
The communication bus 12 may be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable a connection communication between the memory 11 and at least one processor 10 etc.
The communication interface 13 is used for communication between the electronic device 1 and other devices, including a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., WI-FI interface, bluetooth interface, etc.), typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may be a Display (Display), an input unit such as a Keyboard (Keyboard), or alternatively a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the electronic device and for displaying a visual user interface.
Fig. 6 shows only an electronic device with components, it being understood by a person skilled in the art that the structure shown in fig. 6 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or may combine certain components, or may be arranged in different components.
For example, although not shown, the electronic device 1 may further include a power source (such as a battery) for supplying power to each component, and preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management, and the like are implemented through the power management device. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure detection circuit, power converter or inverter, power status indicator, etc. The electronic device 1 may further include various sensors, bluetooth modules, wi-Fi modules, etc., which will not be described herein.
It should be understood that the described embodiments are for illustrative purposes only, and the scope of the patent application is not limited to this configuration.
The cyclic self-attentive stop word recognition program stored in the memory 11 of the electronic device 1 is a combination of instructions which, when executed in the processor 10, may implement:
Performing word segmentation processing on the pre-constructed text sentence by using a word segmentation tool to obtain a phrase set;
performing quantization coding on the phrase set by using a one-hot quantization tool to obtain an initial word vector set, and performing filling connection operation on each initial word vector in the initial word vector set according to a preset filling ordering strategy to obtain an initial quantized text matrix;
performing matrix multiplication operation on the initial quantized text matrix by using a pre-constructed random initialization weight matrix to obtain a text quantized matrix;
performing attention weight configuration on word vectors corresponding to each phrase in the text quantization matrix by using a pre-constructed attention configuration network to obtain an attention text quantization matrix;
according to a preset cycling strategy, repeating the attention weight configuration operation on the word vectors corresponding to each phrase in the text quantization matrix a preset number of times by using the pre-constructed attention configuration network, to obtain an updated attention text quantization matrix;
and performing low-weight vector recognition operation on the updated attention text quantization matrix by using a pre-trained downstream task classifier to obtain a stop word vector.
In particular, the specific implementation method of the above instructions by the processor 10 may refer to the description of the relevant steps in the corresponding embodiment of the drawings, which is not repeated herein.
Further, the modules/units integrated in the electronic device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. The computer readable storage medium may be volatile or nonvolatile. For example, the computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM).
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, can implement:
performing word segmentation processing on the pre-constructed text sentence by using a word segmentation tool to obtain a phrase set;
performing quantization coding on the phrase set by using a one-hot quantization tool to obtain an initial word vector set, and performing filling connection operation on each initial word vector in the initial word vector set according to a preset filling ordering strategy to obtain an initial quantized text matrix;
Performing matrix multiplication operation on the initial quantized text matrix by using a pre-constructed random initialization weight matrix to obtain a text quantized matrix;
performing attention weight configuration on word vectors corresponding to each phrase in the text quantization matrix by using a pre-constructed attention configuration network to obtain an attention text quantization matrix;
according to a preset cycling strategy, repeating the attention weight configuration operation on the word vectors corresponding to each phrase in the text quantization matrix a preset number of times by using the pre-constructed attention configuration network, to obtain an updated attention text quantization matrix;
and performing low-weight vector recognition operation on the updated attention text quantization matrix by using a pre-trained downstream task classifier to obtain a stop word vector.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in a form of hardware or a form of hardware and a form of software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The Blockchain (Blockchain), which is essentially a decentralised database, is a string of data blocks that are generated by cryptographic means in association, each data block containing a batch of information of network transactions for verifying the validity of the information (anti-counterfeiting) and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
The embodiment of the application can acquire and process the related data based on the artificial intelligence technology. Among these, artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and extend human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims can also be implemented by means of software or hardware by means of one unit or means. The terms first, second, etc. are used to denote a name, but not any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (6)

1. A cyclic self-attention stop word recognition method, the method comprising:
Performing word segmentation processing on the pre-constructed text sentence by using a word segmentation tool to obtain a phrase set;
performing quantization coding on the phrase set by using a one-hot quantization tool to obtain an initial word vector set, and performing filling connection operation on each initial word vector in the initial word vector set according to a preset filling ordering strategy to obtain an initial quantized text matrix;
performing matrix multiplication operation on the initial quantized text matrix by using a pre-constructed random initialization weight matrix to obtain a text quantized matrix;
performing attention weight configuration on word vectors corresponding to each phrase in the text quantization matrix by using a pre-constructed attention configuration network to obtain an attention text quantization matrix;
according to a preset cycling strategy, repeating the attention weight configuration operation on the word vectors corresponding to each phrase in the text quantization matrix a preset number of times by using the pre-constructed attention configuration network, to obtain an updated attention text quantization matrix;
performing low-weight vector recognition operation on the updated attention text quantization matrix by using a pre-trained downstream task classifier to obtain a stop word vector;
wherein the performing attention weight configuration on word vectors corresponding to each phrase in the text quantization matrix by using the pre-constructed attention configuration network to obtain the attention text quantization matrix comprises: extracting, in sequence, the word vector corresponding to each phrase from the text quantization matrix by using the pre-constructed attention configuration network, and performing weighted calculation on the word vector according to a preset first tensor, a preset second tensor and a preset third tensor, respectively, to obtain a first word vector set, a second word vector set and a third word vector set; sequentially extracting a first word vector from the first word vector set, carrying out the vector inner product of the extracted first word vector with each second word vector in the second word vector set in turn to obtain a vector association value set, and performing a normalization operation on the vector association value set by using a softmax function to obtain an attention weight set; carrying out weight calculation on each third word vector in the third word vector set and each attention weight in the attention weight set according to the corresponding relation of each phrase to obtain a weighted vector set; respectively carrying out weighted calculation on the weighted vector set according to a first random weight matrix, a second random weight matrix and a third random weight matrix which are pre-constructed to obtain a first weighted word vector set, a second weighted word vector set and a third weighted word vector set; sequentially extracting a first weighted word vector from the first weighted word vector set, carrying out the vector inner product of the extracted first weighted word vector with each second weighted word vector in the second weighted word vector set in turn to obtain a weighted vector association value set, and performing a normalization operation on the weighted vector association value set by using the softmax function to obtain a weighted attention weight set; carrying out weight calculation on each third weighted word vector in the third weighted word vector set and each weighted attention weight in the weighted attention weight set according to the corresponding relation of each phrase to obtain a re-weighted vector set; and performing a word-level full-connection operation on each re-weighted vector in the re-weighted vector set to obtain the attention text quantization matrix;
wherein the performing filling connection operation on each initial word vector in the initial word vector set according to the preset filling ordering strategy to obtain the initial quantized text matrix comprises: configuring a word vector length according to the number of phrases in the phrase set; performing a length complementing operation on each initial word vector in the initial word vector set to obtain complemented word vectors; and connecting the complemented word vectors corresponding to the phrases in the order in which the phrases appear in the text sentence to obtain the initial quantized text matrix.
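Similarly, a minimal sketch of the quantization, padding and random-weight embedding steps of claim 1, under assumed toy values (the phrase set, the length-configuration rule and the embedding width are illustrative choices, not taken from the patent):

```python
import numpy as np

phrases = ["we", "often", "go", "to", "the", "park"]      # hypothetical phrase set from the tokenizer
vocab = {w: i for i, w in enumerate(dict.fromkeys(phrases))}

# One-hot quantization coding: each phrase becomes an indicator vector.
one_hot = np.eye(len(vocab))[[vocab[w] for w in phrases]]

# Filling connection: complement every vector to a common length derived
# from the phrase count, then connect them in sentence order.
padded_len = max(len(vocab), len(phrases))                # assumed length-configuration rule
initial_matrix = np.zeros((len(phrases), padded_len))
initial_matrix[:, :len(vocab)] = one_hot                  # complemented word vectors, in order

# Matrix multiplication with a randomly initialized weight matrix yields
# the dense text quantization matrix fed to the attention network.
rng = np.random.default_rng(0)
embed_dim = 16                                            # hypothetical embedding width
W_embed = rng.normal(size=(padded_len, embed_dim))
text_quantization_matrix = initial_matrix @ W_embed       # shape: (num_phrases, embed_dim)
```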
2. The cyclic self-attention stop word recognition method of claim 1, wherein, before performing the low-weight vector recognition operation on the updated attention text quantization matrix by using the pre-trained downstream task classifier, the method further comprises:
connecting the attention configuration network with a pre-constructed downstream task classifier to obtain a downstream task classification model;
acquiring a training sample set corresponding to a downstream task, performing forward network calculation on the downstream task classification model by using training samples in the training sample set to obtain a prediction result, and calculating a loss value between the prediction result and a real result corresponding to the training samples by using a cross entropy loss function;
minimizing the loss value to obtain model parameters corresponding to the minimum loss value, and updating the downstream task classification model by network back propagation using the model parameters;
judging whether the loss value converges or not;
when the loss value does not converge, returning to the step of acquiring the training sample set corresponding to the downstream task, and acquiring new samples to perform iterative training on the downstream task classification model;
and when the loss value converges, obtaining a trained downstream task classification model.
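For claim 2, the training procedure (forward calculation, cross-entropy loss, back-propagation update, convergence check) can be sketched as follows. The encoder stand-in, optimizer, convergence tolerance and toy data are assumptions for illustration, not the patent's actual downstream task classification model.

```python
import torch
import torch.nn as nn

# Stand-in for the attention configuration network connected to a classifier head.
encoder = nn.TransformerEncoderLayer(d_model=16, nhead=4, batch_first=True)
head = nn.Linear(16, 2)                         # hypothetical two-class downstream task
params = list(encoder.parameters()) + list(head.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 6, 16)                       # toy training batch: 8 sentences, 6 words, width 16
y = torch.randint(0, 2, (8,))                   # toy real results (labels)

prev_loss, tol = float("inf"), 1e-4             # assumed convergence tolerance
for step in range(1000):
    logits = head(encoder(x).mean(dim=1))       # forward network calculation -> prediction result
    loss = loss_fn(logits, y)                   # cross-entropy loss against the real results
    optimizer.zero_grad()
    loss.backward()                             # network reverse update of the model parameters
    optimizer.step()
    if abs(prev_loss - loss.item()) < tol:      # loss value judged to have converged
        break
    prev_loss = loss.item()
```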
3. The cyclic self-attention stop word recognition method according to claim 1, wherein, after performing the low-weight vector recognition operation on the updated attention text quantization matrix by using the pre-trained downstream task classifier to obtain the stop word vector, the method further comprises:
according to the stop word vector, performing stop word filtering on the updated attention text quantization matrix to obtain a text screening quantization matrix;
and performing downstream task processing on the text screening quantization matrix by using the downstream task classifier.
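As an illustration of claims 1 and 3, low-weight recognition and stop word filtering might look like the following sketch. The importance scores, threshold and matrix contents are invented toy values; in the method itself they are derived from the trained downstream task classifier.

```python
import numpy as np

phrases = ["we", "often", "go", "to", "the", "park"]
# Hypothetical per-word importance, e.g. attention mass assigned to each word
# in the updated attention text quantization matrix after classifier training.
word_weights = np.array([0.22, 0.20, 0.18, 0.05, 0.04, 0.31])

threshold = 0.10                                 # assumed low-weight cut-off
stop_mask = word_weights < threshold             # low-weight vector recognition
stop_words = [w for w, is_stop in zip(phrases, stop_mask) if is_stop]
print(stop_words)                                # -> ['to', 'the'] with these toy values

# Stop word filtering: drop the corresponding rows before the downstream task.
rng = np.random.default_rng(0)
updated_matrix = rng.normal(size=(len(phrases), 16))   # stand-in updated attention matrix
text_screening_matrix = updated_matrix[~stop_mask]     # fed to the downstream task classifier
```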
4. A device for cyclic self-attention stop word recognition, the device comprising:
the word segmentation module is used for carrying out word segmentation on the pre-constructed text sentence by using a word segmentation tool to obtain a phrase set;
the quantization module is used for performing quantization coding on the phrase set by using a one-hot quantization tool to obtain an initial word vector set, performing filling connection operation on each initial word vector in the initial word vector set according to a preset filling ordering strategy to obtain an initial quantized text matrix, and performing matrix multiplication operation on the initial quantized text matrix by using a pre-constructed random initialization weight matrix to obtain a text quantization matrix;
the attention configuration module is used for performing attention weight configuration on the word vectors corresponding to each phrase in the text quantization matrix by using a pre-constructed attention configuration network to obtain an attention text quantization matrix, and repeating the attention weight configuration operation on the word vectors corresponding to each phrase in the text quantization matrix a preset number of times by using the pre-constructed attention configuration network according to a preset cycle strategy, to obtain an updated attention text quantization matrix;
the stop word recognition module is used for performing low-weight vector recognition operation on the updated attention text quantization matrix by using a pre-trained downstream task classifier to obtain stop word vectors;
wherein the obtaining of the attention text quantization matrix comprises: sequentially extracting a word vector corresponding to a phrase from the text quantization matrix by using the pre-constructed attention configuration network, and performing weighted calculation on the word vector according to a preset first tensor, a preset second tensor and a preset third tensor respectively to obtain a first word vector set, a second word vector set and a third word vector set; sequentially extracting a first word vector from the first word vector set, sequentially performing a vector inner product between the first word vector and each second word vector in the second word vector set to obtain a vector association value set, and performing a normalization operation on the vector association value set by using a softmax function to obtain an attention weight set; performing weight calculation on each third word vector in the third word vector set and each attention weight in the attention weight set according to the correspondence of each phrase to obtain a weighted vector set; performing weighted calculation on the weighted vector set according to a pre-constructed first random weight matrix, second random weight matrix and third random weight matrix respectively to obtain a first weighted word vector set, a second weighted word vector set and a third weighted word vector set; sequentially extracting a first weighted word vector from the first weighted word vector set, sequentially performing a vector inner product between the first weighted word vector and each second weighted word vector in the second weighted word vector set to obtain a weighted vector association value set, and performing a normalization operation on the weighted vector association value set by using the softmax function to obtain a weighted attention weight set; performing weight calculation on each third weighted word vector in the third weighted word vector set and each weighted attention weight in the weighted attention weight set according to the correspondence of each phrase to obtain a re-weighted vector set; and performing a word-level full-connection operation on each re-weighted vector in the re-weighted vector set to obtain the attention text quantization matrix;
wherein the performing filling connection operation on each initial word vector in the initial word vector set according to the preset filling ordering strategy to obtain the initial quantized text matrix comprises: configuring a word vector length according to the number of phrases in the phrase set; performing a length complementing operation on each initial word vector in the initial word vector set to obtain complemented word vectors; and connecting the complemented word vectors corresponding to the phrases in the order in which the phrases appear in the text sentence to obtain the initial quantized text matrix.
5. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the cyclic self-attention stop word recognition method of any one of claims 1 to 3.
6. A computer readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the cyclic self-attention stop word recognition method of any one of claims 1 to 3.
CN202210949814.4A 2022-08-09 2022-08-09 Method, device, equipment and medium for recognizing stop words of circulating self-attention Active CN115238683B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210949814.4A CN115238683B (en) 2022-08-09 2022-08-09 Method, device, equipment and medium for recognizing stop words of circulating self-attention

Publications (2)

Publication Number Publication Date
CN115238683A CN115238683A (en) 2022-10-25
CN115238683B true CN115238683B (en) 2023-06-20

Family

ID=83678629

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210949814.4A Active CN115238683B (en) 2022-08-09 2022-08-09 Method, device, equipment and medium for recognizing stop words of circulating self-attention

Country Status (1)

Country Link
CN (1) CN115238683B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530389A (en) * 2013-10-22 2014-01-22 北京奇虎科技有限公司 Method and device for improving stopword searching effectiveness
CN109992668A (en) * 2019-04-04 2019-07-09 上海冰鉴信息科技有限公司 A kind of enterprise public opinion analysis method and apparatus based on self-attention
CN110457699A (en) * 2019-08-06 2019-11-15 腾讯科技(深圳)有限公司 A kind of stop word mining method, device, electronic equipment and storage medium
CN112732915A (en) * 2020-12-31 2021-04-30 平安科技(深圳)有限公司 Emotion classification method and device, electronic equipment and storage medium
CN114662477A (en) * 2022-03-10 2022-06-24 平安科技(深圳)有限公司 Stop word list generating method and device based on traditional Chinese medicine conversation and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110782871B (en) * 2019-10-30 2020-10-30 百度在线网络技术(北京)有限公司 Rhythm pause prediction method and device and electronic equipment
US11544456B2 (en) * 2020-03-05 2023-01-03 Adobe Inc. Interpretable label-attentive encoder-decoder parser

Also Published As

Publication number Publication date
CN115238683A (en) 2022-10-25

Similar Documents

Publication Publication Date Title
CN113157927B (en) Text classification method, apparatus, electronic device and readable storage medium
CN113656690B (en) Product recommendation method and device, electronic equipment and readable storage medium
CN113378970B (en) Sentence similarity detection method and device, electronic equipment and storage medium
CN113706322B (en) Service distribution method, device, equipment and storage medium based on data analysis
CN112115322B (en) User grouping method, device, electronic equipment and storage medium
CN114398557B (en) Information recommendation method and device based on double images, electronic equipment and storage medium
CN115221276A (en) Chinese image-text retrieval model training method, device, equipment and medium based on CLIP
CN113628043B (en) Complaint validity judging method, device, equipment and medium based on data classification
CN112269875B (en) Text classification method, device, electronic equipment and storage medium
WO2023178798A1 (en) Image classification method and apparatus, and device and medium
CN116630712A (en) Information classification method and device based on modal combination, electronic equipment and medium
CN116383766A (en) Auxiliary diagnosis method, device, equipment and storage medium based on multi-mode data
CN115238683B (en) Method, device, equipment and medium for recognizing stop words of circulating self-attention
CN113486238A (en) Information pushing method, device and equipment based on user portrait and storage medium
CN116521867A (en) Text clustering method and device, electronic equipment and storage medium
CN116701635A (en) Training video text classification method, training video text classification device, training video text classification equipment and storage medium
CN115098644B (en) Image and text matching method and device, electronic equipment and storage medium
CN116340516A (en) Entity relation cluster extraction method, device, equipment and storage medium
CN116341646A (en) Pretraining method and device of Bert model, electronic equipment and storage medium
WO2022142019A1 (en) Question distribution method and apparatus based on intelligent robot, and electronic device and storage medium
CN111414452B (en) Search word matching method and device, electronic equipment and readable storage medium
CN116578690B (en) Insurance customer service method and system based on artificial intelligence
CN114386392B (en) Document generation method, device, equipment and storage medium
CN114968412B (en) Configuration file generation method, device, equipment and medium based on artificial intelligence
CN113592606B (en) Product recommendation method, device, equipment and storage medium based on multiple decisions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant