CN112487162A

CN112487162A - Method, device and equipment for determining text semantic information and storage medium

Info

Publication number: CN112487162A
Application number: CN202011346527.1A
Authority: CN
Inventors: 王景禾
Original assignee: Tencent Music Entertainment Technology Shenzhen Co Ltd
Current assignee: Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority date: 2020-11-25
Filing date: 2020-11-25
Publication date: 2021-03-12
Anticipated expiration: 2040-11-25
Also published as: CN112487162B

Abstract

The application discloses a method, an apparatus, a device and a storage medium for determining text semantic information, and belongs to the field of computer technologies. The method comprises: performing word segmentation processing on a target text, and acquiring a plurality of pieces of feature information corresponding to each word to obtain a feature information set corresponding to each word; for each word, performing the following feature fusion steps according to the feature information set of the word to obtain the fusion feature information of the word: determining each piece of feature information of the word in turn as target feature information, pairing the target feature information with each piece of feature information of the word to form feature information pairs, and determining a fusion correlation coefficient of the two pieces of feature information in each feature information pair; determining fusion feature information corresponding to the target feature information based on each piece of feature information of the word and the corresponding fusion correlation coefficient; and determining semantic information corresponding to the target text based on the fusion feature information of each word. The method can improve the accuracy of text recognition.

Description

Method, device and equipment for determining text semantic information and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for determining text semantic information.

Background

With the advancement of technology, task-based dialog systems are increasingly applied in practical scenarios, for example in products such as Siri. In a task-based dialog system, the natural language understanding module plays an important role. The natural language understanding module mainly processes a text input by a user, or a text generated by voice recognition, to determine the semantic information corresponding to the text, so that related content can then be recommended to the user.

In the related art, the obtained text can be subjected to word segmentation processing, a plurality of pieces of feature information corresponding to each word are extracted, at least one piece of the obtained feature information is input into a pre-trained natural language understanding module, and the semantic information corresponding to the text is then output. In the process of implementing the present application, the inventor found that the text recognition accuracy of the related art is low.

Disclosure of Invention

In order to solve technical problems in the related art, embodiments of the present application provide a method, an apparatus, a device, and a storage medium for determining text semantic information. The technical scheme of the application is as follows:

in a first aspect, an embodiment of the present application provides a method for determining text semantic information, where the method includes:

performing word segmentation processing on the target text to obtain a plurality of words;

acquiring a plurality of pieces of feature information corresponding to each word to obtain a feature information set corresponding to each word;

for each word, performing the following feature fusion steps according to the feature information set of the word to obtain the fusion feature information of the word:

determining each piece of feature information of the word in turn as target feature information, and pairing the target feature information with each piece of feature information of the word to form feature information pairs;

determining a fusion correlation coefficient of the two pieces of feature information in each feature information pair;

determining fusion feature information corresponding to the target feature information based on each piece of feature information of the word and the fusion correlation coefficient between that feature information and the target feature information;

and determining semantic information corresponding to the target text based on the fusion feature information of each word.

Optionally, the determining a fusion correlation coefficient of two feature information in the feature information pair includes:

determining a first sub-correlation coefficient and a second sub-correlation coefficient of the two pieces of feature information in the feature information pair, where the first sub-correlation coefficient represents the degree of correlation between the two pieces of feature information in terms of feature values, and the second sub-correlation coefficient reflects the feature categories to which the two pieces of feature information belong;

and fusing a first sub-correlation coefficient of the two pieces of feature information in the feature information pair with a second sub-correlation coefficient of the two pieces of feature information in the feature information pair to obtain a fused correlation coefficient of the two pieces of feature information in the feature information pair.

Optionally, the determining a first sub-correlation coefficient of two pieces of feature information in the feature information pair includes:

respectively carrying out linear mapping on the two pieces of feature information in the feature information pair to obtain feature values of the two pieces of feature information;

and calculating the mutual information of the feature values of the two pieces of feature information to obtain the first sub-correlation coefficient of the two pieces of feature information in the feature information pair.

Optionally, the determining a second sub-correlation coefficient of two pieces of feature information in the feature information pair includes:

obtaining, according to the feature type numbers of the two pieces of feature information in the feature information pair, the coding vector corresponding to the two pieces of feature information from a pre-trained feature interaction embedding matrix;

and determining a second sub-correlation coefficient of two pieces of feature information in the feature information pair based on the target feature information in the feature information pair and the coding vector.

Optionally, the determining, based on the target feature information in the feature information pair and the encoding vector, a second sub-correlation coefficient of two feature information in the feature information pair includes:

calculating the second sub-correlation coefficient of the two pieces of feature information in the feature information pair according to the formula

$$\mathrm{fea\_no}_{ij} = \mathrm{fea\_key}_i^{T} \, W_{no} \, \mathrm{fea\_cross\_embed}_{(i-1) \cdot n + j - 1}, \qquad W_{no} \in \mathbb{R}^{d \times d}$$

where the two pieces of feature information in the feature information pair are denoted fea_i and fea_j; fea_i is the target feature information; i is the feature type number of the target feature information; fea_key_i is the feature value of the target feature information fea_i; fea_no_ij is the second sub-correlation coefficient of the target feature information fea_i and the feature information fea_j; j is the feature type number of the feature information fea_j; fea_cross_embed with subscript (i-1)*n + j-1 is the coding vector with index (i-1)*n + j-1 in the feature interaction embedding matrix; n is the number of pieces of feature information in the feature information set of the word; W_no represents the mapping, a d × d real matrix; $\mathbb{R}$ denotes the set of real numbers; and d is the dimension of the mapping space.
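As an illustration only, the formula above can be evaluated as in the following Python sketch. The function name `second_sub_coefficient`, the plain-list representation of vectors and matrices, and the toy values are assumptions made for this example and are not part of the application. The feature type numbers i and j are taken to start at 1, which is what the index (i-1)*n + j-1 suggests.

```python
def second_sub_coefficient(fea_key_i, w_no, fea_cross_embed, i, j, n):
    """Compute fea_no_ij = fea_key_i^T * W_no * fea_cross_embed[(i-1)*n + j-1].

    fea_key_i: feature value of the target feature information (length d).
    w_no: d x d mapping matrix, given as a list of rows.
    fea_cross_embed: feature interaction embedding matrix, one length-d
        coding vector per (i, j) pair, i.e. n * n rows.
    i, j: feature type numbers, assumed here to start at 1.
    n: number of pieces of feature information in the word's set.
    """
    d = len(fea_key_i)
    code_vector = fea_cross_embed[(i - 1) * n + (j - 1)]
    # W_no * code_vector
    mapped = [sum(w_no[r][c] * code_vector[c] for c in range(d)) for r in range(d)]
    # fea_key_i^T * (W_no * code_vector)
    return sum(fea_key_i[r] * mapped[r] for r in range(d))


# Toy values (assumed): d = 2, n = 2, identity mapping matrix.
fea_key_1 = [1.0, 2.0]
w_no = [[1.0, 0.0], [0.0, 1.0]]
fea_cross_embed = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
fea_no_12 = second_sub_coefficient(fea_key_1, w_no, fea_cross_embed, 1, 2, 2)
```

Because the coding vector is selected by the pair of feature type numbers alone, two pairs with identical feature values but different categories can still receive different coefficients.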

Optionally, the determining, based on each feature information of the word and the fusion correlation coefficient between each feature information and the target feature information, fusion feature information corresponding to the target feature information includes:

calculating the product of each piece of feature information of the word and the fusion correlation coefficient between that feature information and the target feature information;

and summing the products to obtain the fusion feature information of the target feature information.

In a second aspect, an embodiment of the present application provides an apparatus for determining text semantic information, where the apparatus includes:

the dividing module is configured to perform word segmentation processing on the target text to obtain a plurality of words;

the extraction module is configured to acquire a plurality of feature information corresponding to each word to obtain a feature information set corresponding to each word;

a first determining module configured to perform, for each word, the following feature fusion steps according to the feature information set of the word to obtain the fusion feature information of the word: determining each piece of feature information of the word in turn as target feature information, and pairing the target feature information with each piece of feature information of the word to form feature information pairs; determining a fusion correlation coefficient of the two pieces of feature information in each feature information pair; determining fusion feature information corresponding to the target feature information based on each piece of feature information of the word and the fusion correlation coefficient between that feature information and the target feature information;

and a second determining module configured to determine semantic information corresponding to the target text based on the fusion feature information of each word.

Optionally, the first determining module is configured to:

calculate the second sub-correlation coefficient of the two pieces of feature information in the feature information pair according to the formula

$$\mathrm{fea\_no}_{ij} = \mathrm{fea\_key}_i^{T} \, W_{no} \, \mathrm{fea\_cross\_embed}_{(i-1) \cdot n + j - 1}, \qquad W_{no} \in \mathbb{R}^{d \times d}$$

where the two pieces of feature information in the feature information pair are denoted fea_i and fea_j; fea_i is the target feature information; i is the feature type number of the target feature information; fea_key_i is the feature value of the target feature information fea_i; fea_no_ij is the second sub-correlation coefficient of the target feature information fea_i and the feature information fea_j; j is the feature type number of the feature information fea_j; fea_cross_embed with subscript (i-1)*n + j-1 is the coding vector with index (i-1)*n + j-1 in the feature interaction embedding matrix; n is the number of pieces of feature information in the feature information set of the word; W_no represents the mapping, a d × d real matrix; $\mathbb{R}$ denotes the set of real numbers; and d is the dimension of the mapping space.

Optionally, the first determining module is configured to:

In a third aspect, an embodiment of the present application provides a computer device, where the computer device includes a processor and a memory, where the memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the operations performed by the method for determining text semantic information according to the first aspect.

In a fourth aspect, the present application provides a computer-readable storage medium, where at least one instruction is stored, and the instruction is loaded and executed by a processor to implement the operations performed by the method for determining text semantic information according to the first aspect.

The technical scheme provided by the embodiment of the application has the following beneficial effects:

In the embodiments of the present application, the fusion feature information of a word is determined from each piece of feature information of the word and the fusion correlation coefficient between that feature information and the target feature information. The fusion correlation coefficient is in turn determined from a first sub-correlation coefficient and a second sub-correlation coefficient of the two pieces of feature information in a feature information pair: the first sub-correlation coefficient considers the relationship between the two pieces of feature information in terms of feature values, and the second sub-correlation coefficient considers their feature categories. Because the fusion correlation coefficient accounts for both the similarity of the feature values and the feature categories of the two pieces of feature information, it is more accurate, which improves the accuracy of word recognition and hence of text recognition.

Drawings

To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of an implementation environment for determining semantic information of a text according to an embodiment of the present disclosure;

FIG. 2 is a flowchart of a method for determining semantic information of a text according to an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of determining semantic information of a text according to an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of determining semantic information of a text according to an embodiment of the present disclosure;

FIG. 5 is a schematic diagram of determining semantic information of a text according to an embodiment of the present disclosure;

FIG. 6 is a schematic diagram of determining semantic information of a text according to an embodiment of the present disclosure;

FIG. 7 is a schematic diagram of determining semantic information of a text according to an embodiment of the present disclosure;

FIG. 8 is a schematic structural diagram of an apparatus for determining semantic information of a text according to an embodiment of the present application;

FIG. 9 is a schematic structural diagram of a terminal according to an embodiment of the present application;

FIG. 10 is a schematic structural diagram of a server according to an embodiment of the present application.

Detailed Description

To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.

In the course of implementing the related art, the inventor found that its text recognition accuracy is low. Analysis identified one cause of this technical problem: a text includes a plurality of words, each word corresponds to a plurality of pieces of feature information, and some correlation exists between every two pieces of feature information of a word. The related art does not take this correlation into account and inputs the words directly into the pre-trained natural language understanding module, which lowers the accuracy of the module in recognizing the text.

In order to solve the above technical problem, an embodiment of the present application provides a method for determining text semantic information. FIG. 1 is a schematic diagram of an implementation environment of a method for determining text semantic information according to an embodiment of the present application. Referring to FIG. 1, the implementation environment includes a terminal 101.

The terminal 101 may be at least one of a smartphone, a smart watch, a desktop computer, a laptop computer, a virtual reality terminal, an augmented reality terminal, a wireless terminal, and the like. The terminal 101 has a communication function and can access the internet. The terminal 101 may refer generally to one of a plurality of terminals; the embodiment of the present application uses the terminal 101 only as an example. Those skilled in the art will appreciate that the number of terminals may be greater or fewer. The terminal 101 may be equipped with a microphone and an application program having a text recognition function. The microphone acquires a voice signal of the user, and the application program converts the voice signal into a text, so that the semantic information corresponding to the text can be obtained according to the method provided by the embodiment of the present application. Alternatively, the input device of the terminal 101 includes a text input module that obtains the text input by the user directly, so that this text can be processed according to the method provided by the embodiment of the present application to obtain the corresponding semantic information.

In the method provided by the embodiment of the present application, the terminal 101 may determine the semantic information corresponding to a text as follows. After receiving an instruction to acquire a voice signal, the terminal acquires the voice signal and converts it into a text; alternatively, the terminal directly obtains the text entered by the user on the terminal. The terminal performs word segmentation processing on the obtained text and obtains a plurality of pieces of feature information corresponding to each word. The pieces of feature information of each word are then fused according to the fusion correlation coefficient between every two pieces of feature information of the word, which gives the fusion feature information of each word. Because the fusion correlation coefficient accounts for both the similarity of the feature values and the feature categories of the two pieces of feature information in a feature information pair, it is more accurate, which improves the accuracy of word recognition and hence of text recognition.

Alternatively, the method of determining text semantic information may be implemented by a server.

Alternatively, the method for determining text semantic information may be implemented jointly by the terminal and the server. In one arrangement, the terminal is provided with a voice input module and a communication module; the voice input module converts a voice signal input by the user into text data, which is sent to the server through the communication module, or the terminal sends the voice signal directly to the server and the server converts it into text data. In another arrangement, the terminal is provided with a text input module and a communication module, and the text data input by the user is sent directly to the server through the communication module. The server may be a single server or a server group. A single server may be responsible for all processing in the following scheme. In a server group, different servers may be responsible for different parts of the processing, and the specific allocation may be set by a technician according to actual needs and is not described here.

FIG. 2 is a flowchart of a method for determining text semantic information according to an embodiment of the present disclosure. Referring to FIG. 2, the embodiment includes:

Step 201, performing word segmentation processing on the target text to obtain a plurality of words.

The target text is the text to be recognized, and may be a text converted from a voice signal or a text input by a user. In implementation, the text to be segmented is obtained, the words in the text that match words in a pre-stored lexicon are identified, and those words are taken as the word segmentation result. Alternatively, the text is input into a pre-trained word segmentation module, and the words it outputs are taken as the word segmentation result.
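The lexicon-based option can be sketched as follows. The application does not say how lexicon matches are selected, so this example assumes greedy forward longest matching, a common choice for dictionary-based segmentation. The function name `segment`, the `max_word_len` parameter, and the sample lexicon are hypothetical.

```python
def segment(text, lexicon, max_word_len=5):
    """Split text into words by greedy longest match against a lexicon."""
    words = []
    pos = 0
    while pos < len(text):
        longest = min(max_word_len, len(text) - pos)
        for size in range(longest, 0, -1):
            candidate = text[pos:pos + size]
            # Fall back to a single character when nothing in the lexicon matches.
            if candidate in lexicon or size == 1:
                words.append(candidate)
                pos += size
                break
    return words


# Sample lexicon and input (assumed); text without spaces mimics Chinese input.
lexicon = {"play", "jazz", "music"}
words = segment("playjazzmusic", lexicon)
```

A pre-trained segmentation module, the other option described above, would replace this function entirely while keeping the same input and output.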

Optionally, the text is obtained as follows. When a user wants to control the terminal by voice, the terminal detects whether a voice instruction with specific content has been received. When the terminal detects such a voice instruction, it acquires the voiceprint features of the instruction and determines whether the corresponding user has the authority to control the terminal. If the user has that authority, the terminal determines whether a voice signal is detected within a preset time. When a voice signal is detected within the preset time, the terminal converts it into a text and takes that text as the text to be recognized.

When the voiceprint features of the voice instruction are the same as the pre-stored voiceprint features, the user is determined to have the authority to control the terminal. When they differ from the pre-stored voiceprint features, the user is determined not to have that authority.

It should be noted that, after the terminal detects the voice signal within the preset time, the method may further include extracting the voiceprint features of the voice signal and comparing them with either the pre-stored voiceprint features or the voiceprint features of the voice instruction. When the voiceprint features of the voice signal are the same as those they are compared against, the voice signal is converted into a text; when they differ, the voice signal is not converted. By checking the voiceprint features of the voice signal, the terminal avoids executing control instructions from other users.
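The gating logic described above can be summarised in a short sketch. The function `text_from_voice`, its parameters, and the representation of voiceprint features as directly comparable values are assumptions made for illustration; the equality test stands in for whatever voiceprint comparison an implementation actually uses.

```python
def text_from_voice(command_print, stored_print, signal, transcribe,
                    signal_print=None):
    """Return the recognised text, or None if any check in the flow fails."""
    # The voice instruction must come from a user with control authority.
    if command_print != stored_print:
        return None
    # No voice signal was detected within the preset time.
    if signal is None:
        return None
    # Optional second check: the voice signal must come from the same speaker.
    if signal_print is not None and signal_print not in (stored_print, command_print):
        return None
    return transcribe(signal)


# Hypothetical values; transcribe is a stand-in for speech recognition.
owner = "voiceprint-A"
text = text_from_voice(owner, owner, b"audio", lambda s: "play some jazz",
                       signal_print=owner)
```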

Optionally, the text may also be obtained by: the user opens an application program with a query function on the terminal, and a text is input in a search bar on the application program, so that the terminal can acquire the text input by the user. The application with the query function may be a search engine, a music application, a video application, a shopping application, etc. Alternatively, after the user opens the application having the query function, the text may be acquired by a voice signal, and then the text may be used as the text input in the search bar.

Step 202, obtaining a plurality of feature information corresponding to each word, and obtaining a feature information set corresponding to each word.

The feature information corresponding to each word may be feature information acquired in a plurality of feature categories. For example, the plurality of feature information corresponding to a word includes feature information of the word in terms of word sense, feature information of the word in terms of context, feature information of the word in terms of position in the target text, and the like.

In implementation, each word is input into the feature extraction module, which outputs a plurality of pieces of feature information for the word. These pieces of feature information together form the feature information set corresponding to the word.
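One possible in-memory shape for a feature information set is sketched below. The `FeatureInfo` class, the `build_feature_sets` function, and the three toy extractors are hypothetical; an actual feature extraction module would produce learned representations rather than the simple numbers used here. Feature type numbers are assumed to start at 1.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FeatureInfo:
    type_no: int          # feature type number (1-based, assumed)
    category: str         # e.g. "word_sense", "context", "position"
    vector: List[float]   # the feature information itself


def build_feature_sets(words, extractors):
    """Return one feature information set (a list) per word.

    extractors: ordered list of (category, function) pairs; each function is
    a hypothetical extractor taking (words, index) and returning a vector.
    """
    feature_sets = []
    for index in range(len(words)):
        feature_set = [
            FeatureInfo(type_no, category, extract(words, index))
            for type_no, (category, extract) in enumerate(extractors, start=1)
        ]
        feature_sets.append(feature_set)
    return feature_sets


# Toy extractors (assumed): word length, previous word length, relative position.
extractors = [
    ("word_sense", lambda ws, k: [float(len(ws[k]))]),
    ("context", lambda ws, k: [float(len(ws[k - 1])) if k > 0 else 0.0]),
    ("position", lambda ws, k: [k / len(ws)]),
]
feature_sets = build_feature_sets(["play", "jazz", "music"], extractors)
```

Keeping the feature type number next to each vector matters later, because the second sub-correlation coefficient is looked up by feature type number.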

Step 203, for each word, performing the following feature fusion steps according to the feature information set of the word to obtain the fusion feature information of the word: determining each piece of feature information of the word in turn as target feature information, and pairing the target feature information with each piece of feature information of the word to form feature information pairs; determining a fusion correlation coefficient of the two pieces of feature information in each feature information pair; and determining fusion feature information corresponding to the target feature information based on each piece of feature information of the word and the fusion correlation coefficient between that feature information and the target feature information.

In implementation, each piece of feature information of a word is taken in turn as the target feature information and paired with each piece of feature information of the word, including itself, which yields a plurality of feature information pairs for the word. The fusion correlation coefficient of the two pieces of feature information in each feature information pair is then calculated. Finally, the fusion feature information corresponding to the target feature information is determined from each piece of feature information of the word and the fusion correlation coefficient between that feature information and the target feature information.

For example, the word a corresponds to three pieces of feature information: feature information 1, feature information 2, and feature information 3. For the feature information 1, a feature information pair 11 composed of the feature information 1 and the feature information 1, a feature information pair 12 composed of the feature information 1 and the feature information 2, and a feature information pair 13 composed of the feature information 1 and the feature information 3 are obtained. A fusion correlation coefficient 11 of the feature information 1 and the feature information 1 in the feature information pair 11, a fusion correlation coefficient 12 of the feature information 1 and the feature information 2 in the feature information pair 12, and a fusion correlation coefficient 13 of the feature information 1 and the feature information 3 in the feature information pair 13 are calculated. The fusion feature information corresponding to the feature information 1 is determined from the feature information 1 and the fusion correlation coefficient 11, the feature information 2 and the fusion correlation coefficient 12, and the feature information 3 and the fusion correlation coefficient 13. The fusion feature information corresponding to the feature information 2 and to the feature information 3 is calculated in the same way. It can be understood that each word has a plurality of pieces of feature information and each piece yields its own fusion feature information, so each word has a plurality of pieces of fusion feature information. For example, a word with n pieces of feature information has n pieces of fusion feature information.
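The pairing and fusion in this example can be sketched as below, taking the fusion feature information as the sum of each piece of feature information multiplied by its fusion correlation coefficient, as the summary of the application describes. The function name `fuse_features`, the use of equal-length plain lists, and all numeric values are assumptions made for illustration; indices here start at 0.

```python
def fuse_features(features, coefficient):
    """Return one fused vector per target feature.

    features: list of n equal-length vectors for one word.
    coefficient: function (i, j) -> fusion correlation coefficient of the
        pair (target feature i, feature j); supplied by the caller.
    """
    n = len(features)
    dim = len(features[0])
    fused = []
    for i in range(n):                      # each feature in turn is the target
        total = [0.0] * dim
        for j in range(n):                  # pair (i, j), including (i, i)
            weight = coefficient(i, j)
            for k in range(dim):
                total[k] += weight * features[j][k]
        fused.append(total)
    return fused


# Word "a" with three pieces of feature information (toy values, assumed).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
coefficients = [[0.5, 0.25, 0.25],
                [0.0, 1.0, 0.0],
                [0.25, 0.25, 0.5]]
fused = fuse_features(features, lambda i, j: coefficients[i][j])
```

With n = 3 pieces of feature information, the sketch evaluates 9 pairs and returns 3 fused vectors, matching the count stated above.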

Optionally, the fusion correlation coefficient in the embodiment of the present application consists of two parts: a first sub-correlation coefficient and a second sub-correlation coefficient. Specifically, a first sub-correlation coefficient and a second sub-correlation coefficient of the two pieces of feature information in a feature information pair are determined, where the first sub-correlation coefficient represents the degree of association between the two pieces of feature information in terms of feature values, and the second sub-correlation coefficient reflects the feature categories to which the two pieces of feature information belong. The first sub-correlation coefficient and the second sub-correlation coefficient are then fused to obtain the fusion correlation coefficient of the two pieces of feature information in the feature information pair.

The first sub-correlation coefficient and the second sub-correlation coefficient may be calculated by a self-attention learning module. As shown in FIG. 3, one example of a self-attention learning module includes a linear mapping layer (linear layer), a merge layer, a query layer (lookup_and_linear layer), and an add layer. The linear layer performs a linear transformation on the plurality of pieces of feature information of each word to obtain the feature value of each piece of feature information. The merge layer determines the first sub-correlation coefficient of the two pieces of feature information in each feature information pair of the word. The lookup_and_linear layer determines the second sub-correlation coefficient of the two pieces of feature information in each feature information pair of the word. The add layer adds the first sub-correlation coefficient and the second sub-correlation coefficient of the two pieces of feature information in a feature information pair to obtain their fusion correlation coefficient.

Specifically, the plurality of pieces of feature information of each word, together with the feature type number of each piece, are input into the self-attention learning module. The linear layer in the self-attention learning module performs a linear transformation on each piece of feature information to obtain its feature value. Each piece of feature information of a word is then taken in turn as the target feature information, and for each target feature information the fusion correlation coefficient between it and each piece of feature information in the feature information set to which it belongs is obtained as follows.

The merge layer in the self-attention learning module calculates the first sub-correlation coefficient of the two pieces of feature information in each feature information pair. The lookup_and_linear layer in the self-attention learning module determines the second sub-correlation coefficient of the two pieces of feature information in the feature information pair. The add layer in the self-attention learning module fuses the first sub-correlation coefficient and the second sub-correlation coefficient of the two pieces of feature information in the feature information pair to obtain the fusion correlation coefficient of the two pieces of feature information in the feature information pair. After the fusion correlation coefficient of the two pieces of feature information in each feature information pair is obtained, the following operation may further be performed: determining the fusion feature information corresponding to each feature information according to each feature information of the word and the corresponding fusion correlation coefficient. The fusion feature information corresponding to the other words in the target text is determined in the same way, so that the self-attention learning module outputs the fusion feature information corresponding to each word.

How the first sub-correlation coefficient and the second sub-correlation coefficient are obtained is described below.

In one implementation, linear mapping is performed on two pieces of feature information in the feature information pair respectively to obtain feature values of the two pieces of feature information; and calculating the mutual information of the characteristic values of the two pieces of characteristic information to obtain a first sub-correlation coefficient of the two pieces of characteristic information in the characteristic information pair.

Before obtaining the feature values of the two feature information, the feature information of the words needs to be linearly transformed, and then the feature information corresponding to each word is mapped to the same mapping space to obtain the feature value of each feature information. This process may be implemented by the linear layer in fig. 3.

Taking the mapping of two pieces of feature information of a word to the same mapping space as an example, the specific steps are as follows: by the formulas

$$\mathrm{fea\_key}_i = W_{ki}\,\mathrm{fea}_i,\quad W_{ki} \in \mathbb{R}^{d \times d_i}$$

$$\mathrm{fea\_key}_j = W_{kj}\,\mathrm{fea}_j,\quad W_{kj} \in \mathbb{R}^{d \times d_j}$$

the target feature information fea_i and the feature information fea_j are mapped to the same mapping space, so as to obtain a feature value fea_key_i and a feature value fea_key_j.

Here, fea_key_i represents the feature value corresponding to the target feature information fea_i, fea_key_j represents the feature value corresponding to the feature information fea_j, fea_i represents the i-th feature information corresponding to a word, fea_j represents the j-th feature information corresponding to the word, W_ki represents the mapping matrix required for mapping the target feature information fea_i to the feature value fea_key_i, W_kj represents the mapping matrix required for mapping the feature information fea_j to the feature value fea_key_j, d × d_i represents the dimension of W_ki, d × d_j represents the dimension of W_kj, and R denotes the set of real numbers.

After the above transformation, the dimension of the feature value fea_key_i is the same as the dimension of the feature value fea_key_j. Each feature information corresponding to the word is mapped to the same mapping space in the same way, and the feature value corresponding to each feature information is determined.
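As a minimal sketch of the linear mapping described above, the following pure-Python snippet projects two pieces of feature information of different lengths into one shared d-dimensional space. The concrete dimensions, the random initialization, and the helper names `random_matrix` and `matvec` are assumptions for illustration only; in the embodiment, W_ki and W_kj would be learned parameters of the linear layer.

```python
import random

def random_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random values (stand-in for learned weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    assert all(len(row) == len(vector) for row in matrix)
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

rng = random.Random(0)
d = 4                                 # shared mapping-space dimension (assumed value)
fea_i = [0.2, -0.5, 1.0]              # feature information with d_i = 3 (toy values)
fea_j = [0.7, 0.1, -0.3, 0.9, 0.4]    # feature information with d_j = 5 (toy values)

W_ki = random_matrix(d, len(fea_i), rng)   # W_ki has shape d x d_i
W_kj = random_matrix(d, len(fea_j), rng)   # W_kj has shape d x d_j

fea_key_i = matvec(W_ki, fea_i)       # feature value of fea_i, length d
fea_key_j = matvec(W_kj, fea_j)       # feature value of fea_j, length d
```

Because both feature values have length d after the mapping, they can be compared directly when the first sub-correlation coefficient is computed.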

Further, mutual information of the feature values of the two pieces of feature information, namely, the correlation degree of the feature values of the two pieces of feature information is calculated to obtain a first sub-correlation coefficient of the two pieces of feature information in the feature information pair. For example, the embodiment of the present application may determine the mutual information between the feature values of two feature information in the feature information pair in the following three ways. This process may be implemented by merge layer in fig. 3.

In the first mode, the mutual information between the feature values of the two pieces of feature information in a feature information pair is determined through a first formula. Assuming that the two pieces of feature information in the feature information pair are fea_i and fea_j, where fea_i is the target feature information, the first formula may be

$$\mathrm{fea\_rel}_{ij} = \frac{\mathrm{fea\_key}_i^{T}\,\mathrm{fea\_key}_j}{\sqrt{d}}$$

where fea_key_i is the feature value of the target feature information fea_i, fea_key_j is the feature value of the feature information fea_j, fea_rel_ij represents the mutual information between the feature value fea_key_i and the feature value fea_key_j, fea_key_i^T denotes the transpose of the feature value fea_key_i, and d denotes the dimension of the feature value fea_key_i.

In the second mode, the mutual information between the feature values of the two pieces of feature information in the feature information pair is determined through a second formula, which may be

$$\mathrm{fea\_rel}_{ij} = W_{rel}\,\tanh\!\left(W\,[\mathrm{fea\_key}_i, \mathrm{fea\_key}_j]\right),\quad W \in \mathbb{R}^{d \times 2d},\ W_{rel} \in \mathbb{R}^{1 \times d}$$

where W represents the mapping matrix applied to the combination obtained from the feature value fea_key_i and the feature value fea_key_j, W_rel represents the mapping matrix required for mapping tanh(W[fea_key_i, fea_key_j]) to a one-dimensional mapping space, d represents the dimension of the feature value fea_key_i, and the other parameters are as described for the first formula.

In the third mode, the mutual information between the feature values of the two pieces of feature information in the feature information pair is determined through a third formula, which may be

$$\mathrm{fea\_rel}_{ij} = \mathrm{fea\_key}_i^{T}\,W\,\mathrm{fea\_key}_j$$

where W represents the mapping matrix required for mapping the feature value fea_key_i, and the other parameters are as described for the first formula.
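The three modes can be sketched in pure Python as follows. The second and third functions follow the second and third formulas as stated; the first function uses a scaled dot product, which is an assumption based on the symbols defined for the first formula (the transpose of fea_key_i and the dimension d). The toy weight values and the helper names `dot` and `matvec` are illustrative only.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]

def rel_scaled_dot(key_i, key_j):
    """First mode (ASSUMED form): key_i^T key_j divided by sqrt(d)."""
    return dot(key_i, key_j) / math.sqrt(len(key_i))

def rel_additive(key_i, key_j, W, W_rel):
    """Second mode: W_rel * tanh(W [key_i, key_j]); W is d x 2d, W_rel is 1 x d."""
    hidden = [math.tanh(h) for h in matvec(W, key_i + key_j)]
    return dot(W_rel, hidden)

def rel_bilinear(key_i, key_j, W):
    """Third mode: key_i^T W key_j, with W treated here as a d x d matrix."""
    return dot(key_i, matvec(W, key_j))

# Toy feature values with d = 2 (illustrative numbers only).
key_i = [1.0, 0.0]
key_j = [0.5, 2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
W_concat = [[1.0, 0.0, 0.0, 0.0],   # selects key_i[0] from the concatenation
            [0.0, 0.0, 0.0, 1.0]]   # selects key_j[1] from the concatenation
W_rel = [1.0, 1.0]

fea_rel_first = rel_scaled_dot(key_i, key_j)
fea_rel_second = rel_additive(key_i, key_j, W_concat, W_rel)
fea_rel_third = rel_bilinear(key_i, key_j, identity)
```

All three functions return a single scalar per feature information pair, so any one of them can serve as the first sub-correlation coefficient.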

The first sub-correlation coefficient of the two pieces of feature information can be obtained through any one of the above formulas.

One process for calculating the second sub-correlation coefficient is as follows: according to the feature type numbers of the two pieces of feature information in the feature information pair, the coding vector corresponding to the two pieces of feature information is obtained from a pre-trained feature interaction embedding matrix; the second sub-correlation coefficient of the two pieces of feature information in the feature information pair is then determined based on the target feature information in the feature information pair and the coding vector. This process may be implemented by the lookup_and_linear layer in fig. 3.

It should be noted that the feature type number corresponding to each feature information is obtained as follows: each word is input into the feature extraction module, which outputs the feature information corresponding to the word and the feature type number corresponding to each feature information. Each feature information and each feature type number are in one-to-one correspondence; that is, if there are n pieces of feature information, the feature type numbers may be 1 to n.

It should be noted that the pre-trained feature interaction embedding matrix includes at least one coding vector, and the index of each coding vector is related to the feature type numbers of the two pieces of feature information corresponding to the coding vector. In implementation, the coding vector of the two pieces of feature information in a feature information pair is obtained from the feature interaction embedding matrix according to the feature type numbers of the two pieces of feature information. The coding vector is multiplied by the transpose of the feature value of the target feature information to determine the second sub-correlation coefficient of the two pieces of feature information in the feature information pair.

It should be noted that, for a word, the feature extraction module may extract feature information of the word in the 1st feature category, feature information in the 2nd feature category, ..., and feature information in the nth feature category. For example, feature information of the word in terms of context, feature information of the word in terms of part of speech, and the like may be extracted. The feature extraction module may set the feature type number of the feature information extracted in the 1st feature category to 1, the feature type number of the feature information extracted in the 2nd feature category to 2, ..., and the feature type number of the feature information extracted in the nth feature category to n. In this way, after each word is input into the feature extraction module, a plurality of feature information and the feature type number corresponding to each feature information are output.

Further, the second sub-correlation coefficient of the two pieces of feature information in the feature information pair can be calculated according to the formula

$$\mathrm{fea\_no}_{ij} = \mathrm{fea\_key}_i^{T}\,W_{no}\,\mathrm{fea\_cross\_embed}_{(i-1) \cdot n + j - 1},\quad W_{no} \in \mathbb{R}^{d \times d}$$

where it is assumed that the two pieces of feature information in the feature information pair are fea_i and fea_j, fea_i is the target feature information, i is the feature type number of the target feature information, fea_key_i is the feature value of the target feature information fea_i, fea_no_ij represents the second sub-correlation coefficient of the target feature information fea_i and the feature information fea_j, j is the feature type number of the feature information fea_j, fea_cross_embed_{(i-1)*n+j-1} represents the coding vector with index (i-1)*n+j-1 in the feature interaction embedding matrix, n represents the number of feature information in the feature information set of the word, W_no represents the mapping matrix, R denotes the set of real numbers, and d represents the dimension of the mapping space.
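The lookup and the multiplication for the second sub-correlation coefficient can be sketched as below. The index (i-1)*n+j-1 is taken from the description above; storing the feature interaction embedding matrix as a list of n*n coding vectors of length d, the toy values, and the function names `pair_index` and `second_sub_coefficient` are assumptions made for this illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]

def pair_index(i, j, n):
    """Index of the coding vector for feature type numbers i and j (both 1-based)."""
    if not (1 <= i <= n and 1 <= j <= n):
        raise ValueError("feature type numbers must lie in 1..n")
    return (i - 1) * n + j - 1

def second_sub_coefficient(fea_key_i, i, j, n, W_no, fea_cross_embed):
    """Compute fea_no_ij = fea_key_i^T * W_no * fea_cross_embed[(i-1)*n + j-1]."""
    code_vector = fea_cross_embed[pair_index(i, j, n)]
    return dot(fea_key_i, matvec(W_no, code_vector))

n, d = 3, 2
# Toy embedding matrix: one coding vector per ordered pair of feature type numbers.
fea_cross_embed = [[float(row), 1.0] for row in range(n * n)]
W_no = [[1.0, 0.0], [0.0, 1.0]]   # identity, standing in for a learned d x d matrix
fea_key_2 = [0.5, -1.0]           # feature value of the target feature information (type 2)

fea_no_23 = second_sub_coefficient(fea_key_2, 2, 3, n, W_no, fea_cross_embed)
```

Since the index depends on the ordered pair (i, j), the coding vector for (2, 3) is distinct from the one for (3, 2), and the pair (i, i) has its own coding vector as well.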

It should be noted that the target feature information corresponding to the feature value fea_key_i and the feature information corresponding to the feature value fea_key_j may be the same feature information. The pre-trained feature interaction embedding matrix may be set in the lookup_and_linear layer. In fig. 3, the two lowermost boxes in the lookup_and_linear layer correspond to the feature type numbers of the two pieces of feature information; the middle box corresponds to looking up the coding vector of the two pieces of feature information in the feature interaction embedding matrix according to the feature type numbers; the box beside the middle box corresponds to the feature value of the feature information fea_i, where fea_i is assumed to be the target feature information; and the box above the middle box corresponds to the second sub-correlation coefficient fea_no_ij.

After the first sub-correlation coefficient of the two pieces of feature information in the feature information pair and the second sub-correlation coefficient of the two pieces of feature information are determined according to the method, the fusion correlation coefficient of the two pieces of feature information in the feature information pair can be obtained based on the sum of the first sub-correlation coefficient and the second sub-correlation coefficient.

In practice, the sum of the first sub-correlation coefficient and the second sub-correlation coefficient of the two pieces of feature information in the feature information pair is determined as

$$\mathrm{fea\_rel}_{ij} + \mathrm{fea\_no}_{ij}$$

where it is assumed that the two pieces of feature information in the feature information pair are fea_i and fea_j, fea_i is the target feature information, fea_rel_ij represents the first sub-correlation coefficient of the two pieces of feature information, and fea_no_ij represents the second sub-correlation coefficient of the two pieces of feature information.

The fusion correlation coefficient of the two pieces of feature information is then acquired by normalizing this sum over the corresponding sums for all feature information of the word. Here, n represents the total number of feature information corresponding to the word, k ranges from 1 to n, any feature information corresponding to the word is denoted fea_k, and the sum for fea_k is the sum of the first sub-correlation coefficient between the target feature information fea_i and the feature information fea_k and the second sub-correlation coefficient between the target feature information fea_i and the feature information fea_k.
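A minimal sketch of this fusion step follows. Adding the two sub-correlation coefficients comes from the description; using a softmax over k = 1..n as the normalization is an assumption made here for illustration, as are the function name `fusion_coefficients` and the toy numbers.

```python
import math

def fusion_coefficients(rel_row, no_row):
    """Fuse the sub-correlation coefficients of one target feature with all n features.

    rel_row[k] and no_row[k] hold the first and second sub-correlation coefficients
    between the target feature information and the (k+1)-th feature information.
    The softmax normalization is an ASSUMED choice, not confirmed by the source.
    """
    scores = [rel + no for rel, no in zip(rel_row, no_row)]   # add layer
    peak = max(scores)                                        # subtract max for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]

rel_row = [0.2, 1.0, -0.4]   # toy first sub-correlation coefficients
no_row = [0.1, 0.0, 0.4]     # toy second sub-correlation coefficients
coeffs = fusion_coefficients(rel_row, no_row)
```

Under this assumption the n fusion correlation coefficients of one target feature information are positive and sum to 1, so they can be used directly as weights.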

Optionally, calculating a product of each feature information of the word and a fusion correlation coefficient of the feature information and the target feature information; and adding the products corresponding to the target characteristic information to obtain fusion characteristic information of the target characteristic information.

In implementation, it is assumed that a word has a plurality of feature information, any one of which may be denoted fea_k, and that one of them is regarded as the target feature information, denoted fea_i. The fusion feature information fea_attn_i of the target feature information fea_i is determined by multiplying, for each k from 1 to n, the feature value fea_key_k of the feature information fea_k by the fusion correlation coefficient of the feature information fea_k and the target feature information fea_i, and adding the n products. Here, fea_attn_i represents the fusion feature information of the target feature information fea_i, fea_key_k is the feature value of any feature information fea_k, and n represents the total number of feature information corresponding to the word.

For example, a word is input into the feature extraction module to obtain feature information fea_1, fea_2, fea_3, and fea_4, whose feature type numbers are 1, 2, 3, and 4, respectively. As shown in fig. 4, the self-attention learning module receives fea_1 and its feature type number 1, fea_2 and its feature type number 2, fea_3 and its feature type number 3, and fea_4 and its feature type number 4, and outputs the fusion feature information of fea_1, the fusion feature information of fea_2, the fusion feature information of fea_3, and the fusion feature information of fea_4. Each fusion feature information of the other words is obtained in the same way. The fusion feature information corresponding to each word in the target text is then input into the natural language understanding module for semantic recognition.

In order to simplify the calculation, for each word, each feature information may be linearly transformed to obtain a plurality of linearly transformed feature information (i.e., feature values of the feature information), products of the linearly transformed feature information and corresponding fusion correlation coefficients may be calculated, and the products may be added to obtain fusion feature information of the target feature information.
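The weighted sum described above can be sketched as follows, using the linearly transformed feature information (the feature values) as the terms being summed. The function name `fuse_feature` and the toy coefficients are assumptions for illustration.

```python
def fuse_feature(coeffs_i, fea_keys):
    """Return the fusion feature information of one target feature information.

    coeffs_i[k] is the fusion correlation coefficient between the target feature
    information and the (k+1)-th feature information; fea_keys[k] is the feature
    value of the (k+1)-th feature information.
    """
    d = len(fea_keys[0])
    fused = [0.0] * d
    for coeff, key in zip(coeffs_i, fea_keys):
        for t in range(d):
            fused[t] += coeff * key[t]   # product of coefficient and feature value
    return fused

fea_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy feature values, n = 3, d = 2
coeffs_1 = [0.5, 0.25, 0.25]                      # toy fusion correlation coefficients for fea_1
fea_attn_1 = fuse_feature(coeffs_1, fea_keys)
```

Repeating the call with each feature information of the word as the target yields one fusion feature information per feature information.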

Optionally, as shown in fig. 5, the self-attention learning module may be further divided according to function into a first sub-correlation determining module, a second sub-correlation determining module, and a fusion module. The first sub-correlation determining module is configured to determine the first sub-correlation coefficient between every two pieces of feature information, the second sub-correlation determining module is configured to determine the second sub-correlation coefficient between every two pieces of feature information, and the fusion module is configured to fuse the first sub-correlation coefficient and the second sub-correlation coefficient to generate the fusion correlation coefficient, so as to obtain each fusion feature information of each word.
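Putting the pieces together, one forward pass of the self-attention learning module for a single word might look like the sketch below. It assumes a scaled dot product for the first sub-correlation coefficient and a softmax for the normalization, neither of which is confirmed by the text; the function name `self_attention_fuse` and all toy weights are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(matrix, vector):
    return [dot(row, vector) for row in matrix]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention_fuse(features, W_keys, W_no, fea_cross_embed):
    """Fuse the n feature information of one word.

    features[k] is the feature information with feature type number k+1,
    W_keys[k] is its mapping matrix, and fea_cross_embed holds n*n coding vectors.
    """
    n = len(features)
    keys = [matvec(W, fea) for W, fea in zip(W_keys, features)]   # linear layer
    d = len(keys[0])
    fused = []
    for i in range(1, n + 1):                  # each feature information as target
        key_i = keys[i - 1]
        scores = []
        for j in range(1, n + 1):
            rel = dot(key_i, keys[j - 1]) / math.sqrt(d)          # first sub-module (assumed form)
            code_vector = fea_cross_embed[(i - 1) * n + j - 1]
            no = dot(key_i, matvec(W_no, code_vector))            # second sub-module
            scores.append(rel + no)                               # fusion: add
        coeffs = softmax(scores)                                  # fusion: normalize (assumed)
        fused.append([sum(c * keys[k][t] for k, c in enumerate(coeffs))
                      for t in range(d)])
    return fused

# Toy word with n = 2 feature information of lengths 2 and 3, mapped to d = 2.
features = [[1.0, 0.0], [0.0, 1.0, 1.0]]
W_keys = [[[1.0, 0.0], [0.0, 1.0]],
          [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]]
W_no = [[1.0, 0.0], [0.0, 1.0]]
fea_cross_embed = [[0.0, 0.0] for _ in range(4)]   # zero coding vectors for simplicity
fused = self_attention_fuse(features, W_keys, W_no, fea_cross_embed)
```

In this toy setup each target feature information gives the largest weight to itself, because its feature value is most strongly associated with itself and the coding vectors contribute nothing.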

Step 204, determining semantic information corresponding to the target text based on each fusion feature information of each word.

After the fusion feature information of each word is obtained, the fusion feature information of each word may be input to a combine layer (combination layer), which outputs a feature vector corresponding to each word. The combine layer is configured to map the fusion feature information of each word into a vector of fixed length; the mapping may be performed by splicing, fusion gating, or pooling the fusion feature information of each word. The feature vector corresponding to each word is then input into an upper layer (subsequent processing layer) to obtain the output result of the upper layer, where the upper layer is used for processing the feature vector corresponding to each word to obtain a processing result. For example, iterative processing is performed on the feature vector corresponding to each word to obtain the semantic information corresponding to the target text. It should be noted that the combine layer and the upper layer may be disposed in the natural language understanding module, and the natural language understanding module may include an LSTM, a GRU, a CNN, a Transformer, and the like.
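Two of the mapping options named for the combine layer, splicing and pooling, can be sketched as below. Mean pooling is an assumed choice of pooling, fusion gating is omitted, and the function names are hypothetical.

```python
def combine_concat(fused_features):
    """Splice the fusion feature information of one word into a single vector."""
    return [value for feature in fused_features for value in feature]

def combine_mean_pool(fused_features):
    """Average the fusion feature information of one word element-wise (assumed pooling)."""
    n = len(fused_features)
    d = len(fused_features[0])
    return [sum(feature[t] for feature in fused_features) / n for t in range(d)]

fused_features = [[1.0, 2.0], [3.0, 4.0]]        # toy fusion feature information, n = 2, d = 2
word_vector_concat = combine_concat(fused_features)
word_vector_pooled = combine_mean_pool(fused_features)
```

Splicing yields a vector of length n*d, while pooling keeps the length at d; both are fixed for a given number of feature categories, which is what the upper layer requires.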

In one embodiment, the semantic information may be an instruction, and the terminal may perform a corresponding operation according to the instruction. The semantic information may include an instruction type and an instruction parameter, where the instruction type is the operation to be performed and the instruction parameter is the content of the operation. For example, the instruction type may be "call" and the instruction parameter may be "Xiaoming".

Optionally, when a user wants to control the terminal through a voice signal, the terminal determines whether a voice instruction with specific content is detected. When the terminal detects the voice instruction with the specific content, it acquires the voiceprint feature corresponding to the voice instruction and determines whether the user corresponding to the voiceprint feature has the authority to control the terminal. When the user corresponding to the voiceprint feature has the authority to control the terminal, the terminal determines whether a voice signal is detected within a preset time. If the terminal detects a voice signal within the preset time, it converts the obtained voice signal into a text. As shown in fig. 6, the text is input into the word segmentation module to obtain n words. The n words are input into the feature extraction module to acquire a plurality of feature information corresponding to each word and the feature type number corresponding to each feature information. The plurality of feature information corresponding to each word and the feature type number corresponding to each feature information are input into the self-attention learning module, which outputs each fusion feature information of each word. Each fusion feature information of each word is input into the natural language understanding module, which outputs the instruction corresponding to the voice signal (i.e., the semantic information corresponding to the text in fig. 6). The terminal then executes the operation corresponding to the instruction.

For example, when the terminal detects the voice instruction "siri" with the specific content, it acquires the voiceprint feature corresponding to "siri" and determines whether the user corresponding to the voiceprint feature has the authority to control the terminal. When the user corresponding to the voiceprint feature has the authority to control the terminal, the terminal records the sound signal within the preset time and converts the recorded sound signal into the text "call Xiaoming". Word segmentation is performed on the text "call Xiaoming" to obtain "Xiaoming" and "call". "Xiaoming" and "call" are input into the feature extraction module to obtain a plurality of feature information corresponding to "Xiaoming" and to "call", respectively. The plurality of feature information corresponding to "Xiaoming" and to "call" are input into the self-attention learning module, which outputs a plurality of fusion feature information corresponding to "Xiaoming" and to "call", respectively. The plurality of fusion feature information corresponding to "Xiaoming" and to "call" are input into the natural language understanding module, which outputs the instruction type and the instruction parameter corresponding to "call Xiaoming", where the instruction type may be "call" and the instruction parameter may be "Xiaoming". The terminal dials Xiaoming's telephone number according to the determined instruction type and instruction parameter.
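The flow from text to instruction can be sketched as a chain of four modules. Everything below is a toy stand-in: the whitespace segmenter, the two hand-made features, the identity fusion, and the capitalization rule in `toy_understand` are hypothetical placeholders that only show how the modules connect, not how the embodiment implements them.

```python
from dataclasses import dataclass

@dataclass
class Instruction:
    instruction_type: str        # operation to be performed, e.g. "call"
    instruction_parameter: str   # content of the operation, e.g. "Xiaoming"

def determine_semantics(text, segment, extract, fuse, understand):
    """Run word segmentation, feature extraction, fusion, and understanding in order."""
    words = segment(text)
    feature_sets = [extract(word) for word in words]
    fused = [fuse(feature_set) for feature_set in feature_sets]
    return understand(words, fused)

def toy_segment(text):
    """Placeholder word segmentation module: split on whitespace."""
    return text.split()

def toy_extract(word):
    """Placeholder feature extraction module: feature 1 is length, feature 2 is capitalization."""
    return [[float(len(word))], [1.0 if word[0].isupper() else 0.0]]

def toy_fuse(feature_set):
    """Placeholder self-attention learning module: returns the features unchanged."""
    return feature_set

def toy_understand(words, fused):
    """Placeholder understanding module: the capitalized word is taken as the parameter."""
    parameter = next(w for w, f in zip(words, fused) if f[1][0] == 1.0)
    instruction_type = next(w for w in words if w != parameter)
    return Instruction(instruction_type, parameter)

result = determine_semantics("call Xiaoming", toy_segment, toy_extract,
                             toy_fuse, toy_understand)
```

Passing the modules in as arguments mirrors the fact that the same flow can run on the terminal or on a server, with each module replaced by its trained counterpart.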

It should be noted that the voice signal obtained within the preset time may be uploaded to the server, the command type and the command parameter are determined by the server, and the command type and the command parameter are sent to the terminal, so that the terminal executes the operation of the command type and the command parameter.

Optionally, as shown in fig. 7, in one training process, a sample text is randomly selected from the sample text library, and the semantic information of the sample text is determined based on human experience and used as the reference semantic information of the sample text. The sample text is input into the word segmentation module to obtain a plurality of sample words. Each sample word is input into the feature extraction module to obtain a plurality of sample feature information corresponding to each sample word and the feature type number corresponding to each sample feature information. The sample feature information corresponding to each sample word and the feature type number corresponding to each sample feature information are input into the self-attention learning module, which outputs the sample fusion feature information of each sample word. All the sample fusion feature information of each sample word is input into the natural language understanding module, which outputs the semantic information of the sample text. Loss information between the output semantic information and the reference semantic information is determined according to the output semantic information, the reference semantic information, and a preset loss function. Based on the loss information, the initial coding vectors in the initial feature interaction embedding matrix in the self-attention learning module and the parameters of the self-attention learning module are adjusted to obtain a trained feature interaction embedding matrix and a trained self-attention learning module. Other sample texts are then selected from the sample text library, and the trained feature interaction embedding matrix and the trained self-attention learning module are trained further in the same manner, so as to obtain the pre-trained feature interaction embedding matrix and the pre-trained self-attention learning module.

It should be noted that, in the training process, the method provided by the embodiment of the present application fully considers the correlation between every two pieces of feature information in the multiple pieces of feature information corresponding to the same word, and thus the accuracy of the trained machine learning model for recognizing the text is higher.

In the embodiment of the present application, each piece of fused feature information of each word is determined according to each piece of feature information and a fused correlation coefficient of each piece of feature information and target feature information, and the fused correlation coefficient of two pieces of feature information in a pair of feature information is determined by a first sub-correlation coefficient of the two pieces of feature information in the pair of feature information and a second sub-correlation coefficient of the two pieces of feature information in the pair of feature information, where the first sub-correlation coefficient takes into account a relationship between feature values of the two pieces of feature information in the pair of feature information, and the second sub-correlation coefficient takes into account a feature type of the two pieces of feature information in the pair of feature information. Therefore, the calculated fusion correlation coefficient is more accurate, the word recognition accuracy is higher, and the accuracy of recognizing the target text is improved.

Based on the same technical concept, an embodiment of the present application further provides an apparatus, where the apparatus is used for a server or a terminal, and as shown in fig. 8, the apparatus includes:

a dividing module 801 configured to perform word segmentation processing on a target text to obtain a plurality of words;

an extraction module 802 configured to obtain a plurality of feature information corresponding to each word, and obtain a feature information set corresponding to each word;

a first determining module 803 configured to perform, for each word, the following feature fusion steps according to the feature information set of the word to obtain each fusion feature information of the word: respectively determining each feature information corresponding to the word as target feature information, and respectively grouping the target feature information with each feature information of the word as a feature information pair; determining a fusion correlation coefficient of the two feature information in the feature information pair; determining fusion feature information corresponding to the target feature information based on each feature information of the word and the fusion correlation coefficient of each feature information and the target feature information;

a second determining module 804, configured to determine semantic information corresponding to the target text based on the respective fusion feature information of each word.

Optionally, the first determining module 803 is configured to:

calculate a second sub-correlation coefficient of the two pieces of feature information in the feature information pair according to the formula

$$\mathrm{fea\_no}_{ij} = \mathrm{fea\_key}_i^{T}\,W_{no}\,\mathrm{fea\_cross\_embed}_{(i-1) \cdot n + j - 1},\quad W_{no} \in \mathbb{R}^{d \times d}$$

where it is assumed that the two pieces of feature information in the feature information pair are fea_i and fea_j, fea_i is the target feature information, i is the feature type number of the target feature information, fea_key_i is the feature value of the target feature information fea_i, fea_no_ij represents the second sub-correlation coefficient of the target feature information fea_i and the feature information fea_j, j is the feature type number of the feature information fea_j, fea_cross_embed_{(i-1)*n+j-1} represents the coding vector with index (i-1)*n+j-1 in the feature interaction embedding matrix, n represents the number of feature information in the feature information set of the word, W_no represents the mapping matrix, R denotes the set of real numbers, and d represents the dimension of the mapping space.

Optionally, the first determining module 803 is configured to:

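calculate a product of each feature information of the word and the fusion correlation coefficient of the feature information and the target feature information; and add the products corresponding to the target feature information to obtain the fusion feature information of the target feature information.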

It should be noted that: in the apparatus for determining text semantic information according to the foregoing embodiment, when determining text semantic information, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the apparatus for determining text semantic information and the method for determining text semantic information provided by the above embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiments and are not described herein again.

Fig. 9 shows a block diagram of a terminal 900 according to an exemplary embodiment of the present application. The terminal 900 may be: a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a notebook computer, or a desktop computer. The terminal 900 may also be referred to as user equipment, a portable terminal, a laptop terminal, a desktop terminal, or by other names.

In general, terminal 900 includes: a processor 901 and a memory 902.

Processor 901 may include one or more processing cores, such as a 4-core processor, a 9-core processor, and so forth. The processor 901 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 901 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 901 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, the processor 901 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.

Memory 902 may include one or more computer-readable storage media, which may be non-transitory. The memory 902 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 902 is used to store at least one instruction for execution by processor 901 to implement the method of determining textual semantic information provided by method embodiments herein.

In some embodiments, terminal 900 can also optionally include: a peripheral interface 903 and at least one peripheral. The processor 901, memory 902, and peripheral interface 903 may be connected by buses or signal lines. Various peripheral devices may be connected to the peripheral interface 903 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 904, a touch display screen 905, a camera 906, an audio circuit 907, a positioning component 908, and a power supply 909.

The peripheral interface 903 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 901 and the memory 902. In some embodiments, the processor 901, memory 902, and peripheral interface 903 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 901, the memory 902 and the peripheral interface 903 may be implemented on a separate chip or circuit board, which is not limited in this application.

The radio frequency circuit 904 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 904 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 904 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 904 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 904 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, the various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 904 may also include NFC (Near Field Communication) related circuits, which are not limited in this application.

The display screen 905 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 905 is a touch display screen, the display screen 905 is also able to capture touch signals on or above its surface. The touch signal may be input to the processor 901 as a control signal for processing. In this case, the display screen 905 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 905, disposed on the front panel of the terminal 900; in other embodiments, there may be at least two display screens 905, each disposed on a different surface of the terminal 900 or arranged in a foldable design; in still other embodiments, the display screen 905 may be a flexible display screen disposed on a curved surface or a folded surface of the terminal 900. Furthermore, the display screen 905 may be arranged in a non-rectangular irregular shape, i.e. a shaped screen. The display screen 905 may be an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) display, or a display made with other technologies.

The camera assembly 906 is used to capture images or video. Optionally, the camera assembly 906 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, the camera assembly 906 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.

The audio circuit 907 may include a microphone and a speaker. The microphone is used for collecting sound waves from the user and the environment, converting the sound waves into electrical signals, and inputting the electrical signals to the processor 901 for processing, or to the radio frequency circuit 904 to realize voice communication. For stereo sound acquisition or noise reduction, there may be multiple microphones disposed at different locations of the terminal 900. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 901 or the radio frequency circuit 904 into sound waves. The speaker may be a conventional membrane speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert an electrical signal not only into a sound wave audible to humans, but also into a sound wave inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuit 907 may also include a headphone jack.

The positioning component 908 is used to determine the current geographic location of the terminal 900 for navigation or LBS (Location Based Service). The positioning component 908 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.

The power supply 909 is used to provide power to the various components in the terminal 900. The power supply 909 may be an alternating current source, a direct current source, a disposable battery, or a rechargeable battery. When the power supply 909 comprises a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also support fast charging technology.

In some embodiments, terminal 900 can also include one or more sensors 910. The one or more sensors 910 include, but are not limited to: acceleration sensor 911, gyro sensor 912, pressure sensor 913, fingerprint sensor 914, optical sensor 915, and proximity sensor 916.

The acceleration sensor 911 can detect the magnitude of acceleration along the three coordinate axes of a coordinate system established with respect to the terminal 900. For example, the acceleration sensor 911 may be used to detect the components of the gravitational acceleration along the three coordinate axes. The processor 901 can control the touch display screen 905 to display the user interface in a horizontal view or a vertical view according to the gravitational acceleration signal acquired by the acceleration sensor 911. The acceleration sensor 911 may also be used to acquire motion data of a game or of the user.
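The view selection described above can be illustrated with a minimal sketch. The function name `choose_view` and the axis convention (x along the short edge of the screen, y along the long edge) are assumptions made for illustration; the application does not specify how the gravity components are compared.

```python
def choose_view(gx: float, gy: float) -> str:
    """Pick a view from the gravity components along the device x and y axes.

    Assumption (not from the application): x runs along the short edge of
    the screen and y along the long edge, so gravity dominated by the x
    component means the device is held on its side.
    """
    return "horizontal" if abs(gx) > abs(gy) else "vertical"


# Device held upright: gravity lies mostly along the y axis.
print(choose_view(0.3, 9.7))
# Device turned on its side: gravity lies mostly along the x axis.
print(choose_view(9.6, 0.5))
```

A real implementation would typically add hysteresis so that the view does not flip back and forth when the device is held near the diagonal.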

The gyroscope sensor 912 can detect the body orientation and rotation angle of the terminal 900, and can cooperate with the acceleration sensor 911 to capture the user's 3D motion on the terminal 900. Based on the data collected by the gyroscope sensor 912, the processor 901 can implement the following functions: motion sensing (such as changing the UI according to a tilt operation by the user), image stabilization while shooting, game control, and inertial navigation.

The pressure sensor 913 may be disposed on the side frame of the terminal 900 and/or underneath the touch display screen 905. When the pressure sensor 913 is disposed on the side frame of the terminal 900, it can detect the user's holding signal on the terminal 900, and the processor 901 performs left-hand/right-hand recognition or shortcut operations according to the holding signal collected by the pressure sensor 913. When the pressure sensor 913 is disposed underneath the touch display screen 905, the processor 901 controls the operable controls on the UI according to the user's pressure operation on the touch display screen 905. The operable controls comprise at least one of a button control, a scroll bar control, an icon control, and a menu control.

The fingerprint sensor 914 is used for collecting the user's fingerprint, and either the processor 901 identifies the user according to the fingerprint collected by the fingerprint sensor 914, or the fingerprint sensor 914 itself identifies the user according to the collected fingerprint. Upon recognizing the user's identity as trusted, the processor 901 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 914 may be disposed on the front, back, or side of the terminal 900. When a physical key or manufacturer logo is provided on the terminal 900, the fingerprint sensor 914 may be integrated with the physical key or manufacturer logo.

The optical sensor 915 is used to collect ambient light intensity. In one embodiment, the processor 901 may control the display brightness of the touch display 905 based on the ambient light intensity collected by the optical sensor 915. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 905 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 905 is turned down. In another embodiment, the processor 901 can also dynamically adjust the shooting parameters of the camera assembly 906 according to the ambient light intensity collected by the optical sensor 915.
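As a minimal sketch of the brightness control described above, the following maps an ambient light reading to a display brightness level. The linear mapping, the function name `display_brightness`, and the parameters `min_level`, `max_level`, and `full_scale_lux` are illustrative assumptions; the application only states that brightness rises with high ambient light and falls with low ambient light.

```python
def display_brightness(
    ambient_lux: float,
    min_level: float = 0.1,
    max_level: float = 1.0,
    full_scale_lux: float = 1000.0,
) -> float:
    """Map ambient light intensity to a display brightness level.

    Assumption (not from the application): brightness grows linearly with
    ambient light and is clamped to the range [min_level, max_level].
    """
    ratio = min(max(ambient_lux / full_scale_lux, 0.0), 1.0)
    return min_level + (max_level - min_level) * ratio


# Brighter surroundings yield a higher brightness level.
for lux in (0.0, 200.0, 800.0, 5000.0):
    print(lux, display_brightness(lux))
```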

The proximity sensor 916, also known as a distance sensor, is typically disposed on the front panel of the terminal 900. The proximity sensor 916 is used to measure the distance between the user and the front face of the terminal 900. In one embodiment, when the proximity sensor 916 detects that the distance between the user and the front face of the terminal 900 gradually decreases, the processor 901 controls the touch display screen 905 to switch from the bright screen state to the dark screen state; when the proximity sensor 916 detects that the distance between the user and the front face of the terminal 900 gradually increases, the processor 901 controls the touch display screen 905 to switch from the dark screen state to the bright screen state.

Those skilled in the art will appreciate that the configuration shown in Fig. 9 does not constitute a limitation of the terminal 900, and that the terminal may include more or fewer components than those shown, may combine certain components, or may employ a different arrangement of components.

Fig. 10 is a schematic structural diagram of a server according to an embodiment of the present application. The server 1000 may vary considerably depending on its configuration or performance, and may include one or more processors (CPUs) 1001 and one or more memories 1002, where the memory 1002 stores at least one instruction, and the at least one instruction is loaded and executed by the processor 1001 to implement the methods provided by the foregoing method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for performing input and output, and may also include other components for implementing the functions of the device, which are not described herein again.

In an exemplary embodiment, a computer-readable storage medium, such as a memory, is also provided that includes instructions executable by a processor in a terminal to perform the method of determining textual semantic information in the above embodiments. For example, the computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. A method for determining text semantic information, the method comprising:

2. The method of claim 1, wherein determining the fusion correlation coefficient of two feature information in the pair of feature information comprises:

3. The method of claim 2, wherein determining the first sub-correlation coefficient for two feature information in the pair of feature information comprises:

4. The method of claim 2, wherein determining the second sub-correlation coefficient for two feature information in the pair of feature information comprises:

5. The method of claim 4, wherein the determining the second sub-correlation coefficient of the two feature information in the feature information pair based on the target feature information in the feature information pair and the encoding vector comprises:

according to the formula $\mathrm{fea\_no}_{ij} = \mathrm{fea\_key}_i^{T} \, W_{no} \, \mathrm{fea\_cross\_embed}_{(i-1) \cdot n + j - 1}$, with $W_{no} \in \mathbb{R}^{d \times d}$, calculating a second sub-correlation coefficient of the two pieces of feature information in the feature information pair; suppose that the two pieces of feature information in the feature information pair are fea_i and fea_j, where fea_i is the target feature information, i is the feature type number of the target feature information, fea_key_i is a feature value of the target feature information fea_i, fea_no_ij represents the second sub-correlation coefficient of the target feature information fea_i and the feature information fea_j, j is the feature type number of the feature information fea_j, fea_cross_embed_{(i-1)·n+j-1} represents the encoding vector with index (i-1)·n+j-1 in the feature interaction embedding matrix, n represents the number of pieces of feature information in the feature information set of the word, W_no represents the mapping space, R^{d×d} denotes the set of real d×d matrices, and d represents the dimension of the mapping space.
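For illustration only, the computation recited in claim 5 can be sketched in plain Python. The function names `cross_index` and `second_sub_coefficient`, the use of nested lists for W_no and the feature interaction embedding matrix, and the toy values are assumptions; the sketch reads i and j as 1-based feature type numbers, so that the index (i-1)*n+j-1 is a 0-based row of the embedding matrix.

```python
from typing import Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def cross_index(i: int, j: int, n: int) -> int:
    """Row index (i-1)*n + j-1 for 1-based feature type numbers i and j."""
    return (i - 1) * n + (j - 1)


def second_sub_coefficient(
    fea_key_i: Vector, w_no: Matrix, fea_cross_embed: Matrix,
    i: int, j: int, n: int,
) -> float:
    """Compute fea_key_i^T * W_no * fea_cross_embed[(i-1)*n + j-1]."""
    e = fea_cross_embed[cross_index(i, j, n)]
    # W_no @ e: project the encoding vector with the d x d matrix.
    projected = [sum(w * x for w, x in zip(row, e)) for row in w_no]
    # Inner product of fea_key_i with the projected vector.
    return sum(k * p for k, p in zip(fea_key_i, projected))


# Toy example with n = 2 feature types and d = 2 (hypothetical values).
fea_key_1 = [1.0, 2.0]
w_no = [[0.0, 1.0], [1.0, 0.0]]  # swaps the two coordinates
fea_cross_embed = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
print(second_sub_coefficient(fea_key_1, w_no, fea_cross_embed, i=1, j=2, n=2))
```

With n feature types the embedding matrix needs n*n rows, one encoding vector per ordered pair (i, j).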

6. The method according to claim 1, wherein the determining fused feature information corresponding to the target feature information based on each feature information of the word and a fused correlation coefficient of each feature information and the target feature information comprises:

7. An apparatus for determining text semantic information, the apparatus comprising:

8. The apparatus of claim 7, wherein the first determining module is configured to:

9. A computer device comprising a processor and a memory, wherein at least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the operations performed by the method for determining text semantic information according to any one of claims 1 to 6.

10. A computer-readable storage medium having stored therein at least one instruction which is loaded and executed by a processor to perform operations performed by the method for determining text semantic information according to any one of claims 1 to 6.