US20210286954A1 - Apparatus and Method for Applying Image Encoding Recognition in Natural Language Processing
- Publication number: US20210286954A1 (application US16/820,667)
- Authority: US (United States)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F40/56 — Natural language generation (G06F — Electric digital data processing; G06F40/00 — Handling natural language data; G06F40/40 — Processing or translation of natural language; G06F40/55 — Rule-based translation)
- G06F40/30 — Semantic analysis (G06F40/00 — Handling natural language data)
- G06N3/08 — Learning methods (G06N — Computing arrangements based on specific computational models; G06N3/00 — Computing arrangements based on biological models; G06N3/02 — Neural networks)
- G06N3/044 — Recurrent networks, e.g. Hopfield networks (G06N3/04 — Architecture, e.g. interconnection topology)
- G06N3/045 — Combinations of networks (G06N3/04 — Architecture, e.g. interconnection topology)
- G06N3/048 — Activation functions (G06N3/04 — Architecture, e.g. interconnection topology)
Description
- A portion of the disclosure of this patent document contains material, which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
- The present invention generally relates to the field of natural language processing (NLP). More specifically, the present invention relates to techniques of applying image encoding recognition to handle NLP tasks.
- Natural language processing (NLP) is a technology used to aid computers in analyzing, understanding, and deriving meaning from human language in a smart and useful way. NLP has become an important research topic in recent years. Applications of NLP help humans to process massive numbers of electronic documents, including review comments, news reports, etc., and to interact with humans through applications such as electronic personal assistants and social chatbots. To date, the majority of NLP applications relate to information retrieval, article auto-summarization, and polarity analysis (i.e., positive/negative classification).
- Machine learning has recently contributed much to NLP technology. Machine learning is able to generalize and handle novel cases: in a machine learning model, if a case resembles something the model has seen before, the model can use its “learning” to evaluate the case. With machine learning, the goal is to create a system in which the model can be continuously improved.
- A machine learning model with text analytics by NLP is able to identify aspects of text, so as to understand the meaning of text documents. In other words, the model can be provided to accelerate and automate the underlying text analytics functions, such that unstructured text is turned into usable data and insights.
- However, current machine learning models are developed on linguistic models, which have significant performance limitations. For example, some models analyze a target sentence's structure, such as the subject-word-object relationships in a sentence, to understand its meaning. These types of analysis are specific to particular languages and cannot be generalized to other languages; they also exhibit low accuracy in handling complex sentences. Therefore, there is a need in the art for a new approach to machine learning models with NLP that can process and turn massive numbers of electronic documents into usable data without missing hidden features, in turn improving accuracy.
- The present invention provides a method and an apparatus for applying image encoding recognition in the execution of natural language processing (NLP) tasks. In accordance with one aspect of the present invention, the method comprises the following process steps. A sentence from a textual source is extracted by an NLP-based feature extractor. A word vector is generated in response to the sentence by the NLP-based feature extractor. The word vector is converted into a feature vector $\vec{b}$ by the NLP-based feature extractor, in which the feature vector $\vec{b}$ satisfies $\vec{b} \in \mathbb{R}^m$ and the parameter m is a positive integer. The feature vector is transformed into an image set having a plurality of two-dimensional images by a transformer. The image set is input into a neural network to execute image recognition by a processor, so as to analyze the sentence.
- In accordance with another aspect of the present invention, the apparatus includes an NLP-based feature extractor, a transformer, and a processor. The NLP-based feature extractor is configured to extract a sentence from a textual source, to generate a word vector in response to the sentence, and to convert the word vector into a feature vector $\vec{b}$ such that $\vec{b} \in \mathbb{R}^m$, where the parameter m is a positive integer. The transformer is configured to transform the feature vector into an image set having a plurality of two-dimensional images. The processor is configured to input the image set into a neural network to execute image recognition, so as to analyze the sentence.
- In various embodiments, the method further includes a step of modifying the resolution of the images by a modifier. Modifying the resolution includes an up-sampling process and a feature-enhancement process.
- The advantages of the present invention include: (1) Since a textual source is converted into a feature vector and then transformed into an image set for recognition, higher accuracy for an NLP task result is achieved. The reason is that the image set having the two-dimensional images contains richer information and unravels previously hidden features of the feature vector, so as to give a more complete description of the feature vector. (2) The NLP task, such as polarity classification, can be achieved via image recognition, in which the image recognition can be configured and executed based on one of many relatively mature imaging-based models, thereby enhancing the performance and accuracy of the results for the NLP task. (3) The image set can be further modified. The modification includes up-sampling the image set and performing feature enhancement on the image set. The modification benefits the image set by reorganizing image features such that a simple image classifier can be used to classify the images, improving the classification accuracy.
- Embodiments of the invention are described in more detail hereinafter with reference to the drawings, in which:
- FIG. 1 depicts a simplified logical structural and dataflow diagram of a method for applying image encoding recognition to handle NLP tasks by an NLP-based recognition system in accordance with various embodiments of the present invention;
- FIG. 2A depicts a first schematic diagram illustrating an imaging process in accordance with various embodiments of the present invention;
- FIG. 2B depicts a second schematic diagram illustrating the imaging process;
- FIG. 3 depicts a simplified logical structural and dataflow diagram of the resolution modification process in accordance with various embodiments of the present invention;
- FIG. 4 depicts a schematic diagram illustrating the up-sampling of an image set in accordance with various embodiments of the present invention;
- FIG. 5 is a schematic diagram of resampling through interpolation in accordance with various embodiments of the present invention;
- FIG. 6 is a schematic diagram illustrating a feature enhancement in accordance with various embodiments of the present invention;
- FIG. 7 illustrates a simplified logical structural and dataflow diagram of executing an NLP task in accordance with various embodiments of the present invention;
- FIG. 8 shows exemplary patterns generated using a method for image encoding recognition in accordance with various embodiments of the present invention; and
- FIG. 9 depicts a block diagram of the configuration of an NLP-based recognition system in accordance with various embodiments of the present invention.

- In the following description, methods and apparatuses for applying image encoding recognition to handle natural language processing (NLP) tasks, and the like, are set forth as preferred examples. It will be apparent to those skilled in the art that modifications, including additions and/or substitutions, may be made without departing from the scope and spirit of the invention. Specific details may be omitted so as not to obscure the invention; however, the disclosure is written to enable one skilled in the art to practice the teachings herein without undue experimentation.
- FIG. 1 depicts a simplified logical structural and dataflow diagram of a method for applying image encoding recognition to handle NLP tasks by an NLP-based recognition system 100 in accordance with various embodiments of the present invention. The NLP-based recognition system 100 includes a feature extractor 110, a transformer 120, a modifier 130, and a processor 140, which are configured to execute the different stages of the method. The NLP-based recognition system 100 executes a method comprising stages S10, S20, S30, S40, and S50: stage S10 is data pre-processing, executed by the feature extractor 110; stage S20 is forming a feature vector, executed by the feature extractor 110; stage S30 is imaging, executed by the transformer 120; stage S40 is resolution modification, executed by the modifier 130; and stage S50 is executing the NLP task, executed by the processor 140.
- In various embodiments, the NLP-based recognition system 100 may be implemented by an electronic device, such as a computer, laptop, cell phone, tablet, or other portable device. The components of the NLP-based recognition system 100 can be designed based on machine learning or deep learning models, some of which serve to automatically discover the representations needed for feature detection or classification from raw data. Accordingly, the components of the NLP-based recognition system 100 may contain a series of neural network layers, where each layer is configured to learn a representation from its input data whose result is subsequently used for classification/regression. Specifically, according to a predetermined purpose, the kind of knowledge to be learned by the model can be determined during training or learning, such that the model becomes tuned to that knowledge based on the data fed to it.
- The method is performed on a textual source 10 to execute an NLP task. In various embodiments, the NLP task may be a polarity analysis, a text composition, or the like. For the application of the NLP task to polarity analysis, the output of the method is a positive/negative classification, and the input textual source 10 may contain at least one sentence, such as a comment, a review, or a note with a positive/negative meaning. For example, in the case of reviewing a restaurant, the textual source 10 may contain a positive review such as “service is friendly and inviting” or a negative review such as “this restaurant suffers from not trying hard enough”.
- In the exemplary illustration of FIG. 1, the textual source 10 is fed to the feature extractor 110 to begin stage S10. The feature extractor 110 is an NLP-based feature extractor designed to be trainable by machine learning. At design time, the feature extractor 110 is trained with a training dataset containing words and characters of a selected language, in which the training dataset may be a word vector dictionary, millions of raw data records and text documents, or the like. During training, the feature extractor 110 learns to generate a vector representation by mapping words or phrases from a set to vectors of real numbers. Thus, the feature extractor 110 is able to extract common features among word vectors from a sentence to form a vector representation of it, thereby obtaining word vectors from the sentence.
- In stage S10, the data pre-processing of the textual source 10 includes stages S12 and S14, which are tokenization and word embedding, respectively.
- In stage S12, the feature extractor 110 is configured to execute tokenization, so as to extract and tokenize all words in a sentence from the textual source 10. Specifically, tokenization is a process by which a large quantity of text from the textual source 10 is divided into smaller parts, which are called tokens.
- In stage S14, the feature extractor 110 is configured to embed each word of the sentence into a vector, so as to generate a word vector $\vec{W}$ in response to the sentence. For example, if the textual source 10 were a sentence such as “service is friendly and inviting”, a set of word vectors $\vec{W}$ with respect to the textual source 10 is formed as $\vec{W} = \{\vec{v}_1, \vec{v}_2, \vec{v}_3, \vec{v}_4, \vec{v}_5\}$, where $\vec{v}_i$ is a word vector serving as a natural language element (NLE).
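- As a concrete illustration of stages S12 and S14, the following Python sketch tokenizes a sentence and looks up a vector for each token. It is a minimal sketch under assumptions not in the patent: a whitespace tokenizer and a randomly initialized 100-dimensional embedding table standing in for a trained word vector dictionary.

```python
import numpy as np

# Hypothetical stand-in for a trained word vector dictionary: every
# vocabulary word maps to a 100-dimensional embedding.
rng = np.random.default_rng(0)
vocab = ["service", "is", "friendly", "and", "inviting"]
embedding = {word: rng.standard_normal(100) for word in vocab}

def tokenize(sentence: str) -> list[str]:
    # Stage S12: divide the sentence into smaller parts (tokens).
    return sentence.lower().split()

def embed(tokens: list[str]) -> np.ndarray:
    # Stage S14: map each token to its word vector v_i, forming W.
    return np.stack([embedding[t] for t in tokens])

W = embed(tokenize("service is friendly and inviting"))
print(W.shape)  # (5, 100): one word vector per token
```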
- After the word embedding of stage S14, stage S20 continues to process the word vectors $\vec{W}$, thereby forming a feature vector $\vec{b}$. In stage S20, the formation of the feature vector $\vec{b}$ is executed by the feature extractor 110. In various embodiments, the training of the feature extractor 110 further constructs a feature vector database therein. The feature vector database may be implemented in one or more databases and/or file systems, local or remote to the feature extractor's run-time execution computing devices and/or servers. In various embodiments, the training for constructing the feature vector database is performed by using one or more neural networks, such as long short-term memory (LSTM), gated recurrent unit (GRU), or combinations thereof. Such neural networks are useful in named-entity recognition and time-series pattern analysis. For example, an LSTM neural network can be used for classifying and making predictions based on the real-time normal discrepancy data (or the normal discrepancy training data set during training). As such, the feature extractor 110 is able to filter out the common features of the word vectors $\vec{W}$ prepared in stage S14 and convert them into a single feature vector $\vec{b}$, which satisfies $\vec{b} \in \mathbb{R}^m$, where m is a positive integer. In this regard, $\vec{b} \in \mathbb{R}^m$ means the feature vector $\vec{b}$ is an m-dimensional vector extracted from the word vectors of the sentence. The value of m is related to the information shared by the word vectors. More specifically, in an NLP task regarding polarity, the objective is to understand the polarity of a sentence, and using the word vectors of the sentence to generate a feature vector is an important means to that end. Since the polarity feature occupies only a small portion of all features in the word vectors, the dimension of the corresponding feature vector can be small. For example, if a 100-dimensional word vector dictionary is used to perform word embedding, a 20-dimensional feature vector is enough to represent the polarity information. Conversely, if a more complex task is to be processed, a larger m should be used.
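- The following sketch shows one way stage S20 could reduce the word vectors $\vec{W}$ to a single m-dimensional feature vector $\vec{b}$ using an LSTM, as the text suggests. The single-layer PyTorch architecture and the use of the final hidden state are illustrative assumptions; the patent does not fix these details.

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Stage S20 sketch: compress a sequence of word vectors into a single
    m-dimensional feature vector b via an LSTM. Sizes follow the polarity
    example in the text (100-d word vectors, m = 20); the single-layer
    design and use of the final hidden state are assumptions."""
    def __init__(self, word_dim: int = 100, m: int = 20):
        super().__init__()
        self.lstm = nn.LSTM(input_size=word_dim, hidden_size=m,
                            batch_first=True)

    def forward(self, W: torch.Tensor) -> torch.Tensor:
        # W: (batch, sentence_length, word_dim)
        _, (h_n, _) = self.lstm(W)  # final hidden state summarizes the sentence
        return h_n.squeeze(0)       # b: (batch, m), i.e. one vector in R^m

extractor = FeatureExtractor()
W = torch.randn(1, 5, 100)          # the five word vectors from stage S14
b = extractor(W)
print(b.shape)                      # torch.Size([1, 20])
```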
- After the formation of the feature vector $\vec{b}$, stage S30 is executed to generate one or more images by an imaging process. In the imaging process, the one or more images are formed by transforming the feature vector $\vec{b}$ prepared in stage S20. Specifically, the feature vector $\vec{b}$ serves as an input source to the transformer 120, and the transformer 120 then outputs one or more images. In this regard, the transformer 120 is an NLP-based transformer which is trained to transform a feature vector into an image or a set of images. In some embodiments, the training of the transformer 120 involves forming a matrix from an input vector, in which the formed matrix can be treated as an image such that reasonable guides can be made to construct a transformation model for machine learning. Furthermore, the information stored in the transformation model is related to the training data. For example, when the NLP task is positive/negative classification, the information stored in the transformation model concerns polarity analysis.
- Furthermore, the NLP base of the transformer 120 means the transformer 120 depends on a selected language and contains transform vectors $\{\vec{d}_i\}_{i=1}^{q}$, where $\vec{d}_i \in \mathbb{R}^n$, the parameter i is a positive integer starting from 1, n is a positive integer, each $\vec{d}_i$ is orthogonal to the others, and the parameter q represents the number of channels for the images. In various embodiments, the parameter q is dependent on the selected language and is set equal to the number of parts of speech of that language. For example, if the transformer 120 is applied to an English-based model, the parameter q is 8, English having eight parts of speech. In this regard, the number of transform vectors $\vec{d}_i$ is the same as the number of images to be output (i.e., q transform vectors generate q images). Specifically, with the feature vector $\vec{b}$, an image set I can be obtained by computing $I = [\vec{b}\vec{d}_1^{\,T}, \vec{b}\vec{d}_2^{\,T}, \ldots, \vec{b}\vec{d}_q^{\,T}] \in \mathbb{R}^{q \times m \times n}$, where $\vec{d}_i^{\,T}$ is the transpose of $\vec{d}_i$. For the image set I, the dimensions are related: with the feature vector $\vec{b} \in \mathbb{R}^m$, each transform vector $\vec{d}_i \in \mathbb{R}^n$, and the q transform vectors $\{\vec{d}_i\}_{i=1}^{q}$, the result is an image set I such that $I \in \mathbb{R}^{q \times m \times n}$, having a plurality of two-dimensional images.
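- The matrix products above are simple outer products, so stage S30 can be sketched in a few lines of NumPy. The sizes m = 20 and n = 30 and the construction of the orthogonal transform vectors from a QR factorization are assumptions for illustration; only q = 8 (the eight English parts of speech) comes from the text.

```python
import numpy as np

def imaging(b: np.ndarray, D: np.ndarray) -> np.ndarray:
    """Stage S30 sketch: form the image set I = [b d_1^T, ..., b d_q^T].
    b has shape (m,); D holds q mutually orthogonal transform vectors,
    shape (q, n). The result I has shape (q, m, n)."""
    return np.stack([np.outer(b, d) for d in D])

m, n, q = 20, 30, 8                  # q = 8: the eight English parts of speech
rng = np.random.default_rng(1)
b = rng.standard_normal(m)
# Rows of an orthogonal matrix supply q mutually orthogonal d_i vectors.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
D = Q[:q]
I = imaging(b, D)
print(I.shape)                       # (8, 20, 30): q two-dimensional images
```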
- To illustrate, as shown in FIG. 2A, the feature vector $\vec{b}$ prepared in stage S20 and the transform vector $\vec{d}_i$ prepared in stage S30 are illustrated as bars with blocks. And as shown in FIG. 2B, a two-dimensional image $[I]_i$ is formed by the product of the feature vector $\vec{b}$ and the transform vector $\vec{d}_i$. As described above, q transform vectors $\vec{d}_i$ generate q images, so multiple images are formed as the image set I.
- Referring to FIGS. 1 and 3, after the imaging process, stage S40 is executed to perform a resolution modification on the images formed in stage S30, comprising stages S42 and S44 for up-sampling and feature enhancement, respectively.

- In stage S42, the image set I prepared in stage S30 is up-sampled. The image set I is up-sampled into I′, such that $I \in \mathbb{R}^{q \times m \times n}$ and $I' \in \mathbb{R}^{q \times m \times t}$, where the parameter t is a positive integer greater than the parameter n. For example, the parameter t can be twice the parameter n, and thus the up-sampled image set I′ satisfies $I' \in \mathbb{R}^{q \times m \times 2n}$. In some embodiments, the up-sampling (i.e., increasing the resolution) is performed in both dimensions, such that the up-sampled image set I′ satisfies $I' \in \mathbb{R}^{q \times p \times t}$, where p > m and t > n.
- FIG. 4 illustrates the up-sampling of the image set I. In various embodiments, the up-sampling can be achieved by interpolation, such as resampling through interpolation. Specifically, as shown in FIG. 5, the original image data I (i.e., the image set I prepared in stage S30, $I \in \mathbb{R}^{q \times m \times n}$) is treated as an R×C image, and the goal is to resize the original image data to R′×C′, called the new image data J (i.e., the up-sampled image set, $I' \in \mathbb{R}^{q \times m \times t}$). The original image data I has four points A1 $[f_{i,j+1}]$, A2 $[f_{i+1,j+1}]$, A3 $[f_{i,j}]$, and A4 $[f_{i+1,j}]$. During the up-sampling, a linear interpolation approach is applied, such that two points A5 $[f_{i^*,j+1}]$ and A6 $[f_{i^*,j}]$ are obtained by linear interpolation between the points A1 $[f_{i,j+1}]$ and A2 $[f_{i+1,j+1}]$ and between A3 $[f_{i,j}]$ and A4 $[f_{i+1,j}]$, respectively. Next, a point A7 $[f_{i^*,j^*}]$ is obtained by linear interpolation between the points A5 $[f_{i^*,j+1}]$ and A6 $[f_{i^*,j}]$. As such, the new image data J having points A1-A7 is obtained.
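- A minimal NumPy sketch of this resampling, for one channel along one axis, is given below. The index mapping and edge handling are assumptions; the patent only requires that the new points be produced by linear interpolation between neighboring old points.

```python
import numpy as np

def upsample_width(img: np.ndarray, t: int) -> np.ndarray:
    """Stage S42 sketch: stretch an m-by-n channel to m-by-t along its
    last axis by linear interpolation between neighboring columns."""
    m, n = img.shape
    out = np.empty((m, t))
    for j_new in range(t):
        pos = j_new * (n - 1) / (t - 1)  # fractional position on the old grid
        j = min(int(pos), n - 2)         # left neighbor
        w = pos - j                      # weight of the right neighbor
        out[:, j_new] = (1 - w) * img[:, j] + w * img[:, j + 1]
    return out

channel = np.arange(12.0).reshape(3, 4)    # one m-by-n channel, m = 3, n = 4
print(upsample_width(channel, t=8).shape)  # (3, 8): t = 2n
```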
- Referring to FIG. 3 again, the image set I′ is then transmitted to stage S44, following stage S42, to execute the feature enhancement, which can be done by an activation function, on the image set I′, thereby producing a feature-enhanced image set I″.
- To illustrate, FIG. 6 is a schematic diagram of the feature enhancement in accordance with various embodiments of the present invention. Each image of the image set I′ is introduced into the activation function, such that the image $[I'']_q$ (a.k.a. the feature-enhanced image set I″) is obtained. For example, the image $[I']_q$ of the image set I′ is first multiplied by matrices F and B, where F and B are orthogonal matrices. This operation is developed in accordance with the principle of Singular Value Decomposition (SVD). In SVD, a matrix A decomposes into $U \times S \times V^T$, in which the matrices U and V are orthogonal matrices and the matrix S is a diagonal matrix whose diagonal values are the singular values of matrix A. Matrix S carries the characteristic features of matrix A in a better representation. Equivalently, matrix S is found by multiplying matrix A with matrices U and V (i.e., $S = U^T \times A \times V$). Therefore, the features of an image $[I']_q$ are rearranged by multiplying suitable orthogonal matrices in the front and back (i.e., $F^T \times [I']_q \times B = [I'']_q$). In various embodiments, after this operation, the features of $[I'']_q$ are further enhanced by applying an activation function on $[I'']_q$. In various embodiments, these two operations are repeated several times to achieve better feature enhancement. As a result, the image set I′ is rearranged into a feature-enhanced image set I″ for which $I'' \in \mathbb{R}^{q \times r \times s}$, where the parameter r is a positive integer equal to or different from the parameter m, and the parameter s is a positive integer equal to or different from the parameters n and t. Therefore, the image set I′ is converted into a characteristic form for NLP purposes, such as positive/negative classification in polarity analysis.
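- The two operations, an orthogonal rearrangement followed by an activation, can be sketched as follows. Deriving F and B from the SVD of the channel itself and using tanh as the activation are assumptions made for illustration; the patent leaves the choice of orthogonal matrices and activation function open.

```python
import numpy as np

def enhance(channel: np.ndarray, repeats: int = 2) -> np.ndarray:
    """Stage S44 sketch: rearrange features as F^T @ X @ B with orthogonal
    F and B, then apply an activation; repeat a few times. Taking F and B
    from the SVD of the channel itself (so F^T @ X @ B equals the singular
    value matrix S) and using tanh are illustrative assumptions."""
    X = channel
    for _ in range(repeats):
        F, _, Bt = np.linalg.svd(X)  # full orthogonal F (r x r), Bt (s x s)
        X = F.T @ X @ Bt.T           # S = U^T A V: concentrated features
        X = np.tanh(X)               # activation-based enhancement
    return X

channel = np.random.default_rng(3).standard_normal((20, 40))  # one [I']_q
print(enhance(channel).shape)        # (20, 40): feature-enhanced channel
```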
- In various embodiments, only stage S42 is executed but stage S44 is omitted. That is, stage S42 is one of the conditions for executing stage S44. In view of this, the reason for executing stage S44 is that some negative side effects may accompany the image resolution modification. For example, blurred image may occur. Thus, the feature enhancement of stage S44 may be executed to compensate for such negative side effects. In various embodiments, to compensate for the side effects, other proper approach may be selected by machine learning.
- Referring to
- Referring to FIGS. 1 and 7, stage S50 is executed to process an NLP task. As described above, the NLP task in the exemplary illustration of FIGS. 1 and 7 is polarity classification, and hence the processor 140 can act as a classifier. That is, for the purpose of polarity classification, the processor 140 can be trained to become an image classifier by feeding it images in a machine learning approach, thereby finding a result for the NLP task according to the image set I″. Stage S50 includes stages S52, S54, and S56, which are image set inputting, image processing, and classification, respectively.
processor 140. In the present embodiment, the resolution of the image set I prepared in stage S30 is modified by both stages S42 and S44, and thus stage S44 is taken as an image source to theprocessor 140. In other embodiments, stage S30 or S42 may be taken as an image source to the processor 140 (i.e. in the cases of skipping stage S40 or skipping stage S42). - In stage S54, the image set of stage S52 is input into a neural network to execute an image recognition by extracting features from the images. Such neural network can be convolutional neural network (CNN), fully convolutional network (FCN), multilayer perceptron (MLP), or combinations thereof. That is, the neural network used for image classification can be based on one of many relatively mature imaging-based models. As such, with the image recognition executed by the
processor 140, the NLP task with analyzing the sentence for the polarity classification can be achieved by image classification, which will be advantageous to improve accuracy in polarity classification for the sentence. - Further, the image set having two-dimensional images contains richer information so that higher accuracy can be achieved. In this regard, the two-dimensional images unravel hidden features of the feature vector to give a more complete description of the feature vector, and therefore using the transformed image set can reveal more information of the
Generally speaking, traditional approaches that process only the word vector, which may be compressed, into an NLP result would miss many hidden features in the word vector.
- In stage S56, the processor 140 outputs the result of the polarity classification, which may show whether the textual source 10 has positive or negative polarity.
Briefly, once a comment (or a review/note) is input into the NLP-based recognition system 100, the comment is transformed into an image set, and then the polarity of the comment can be classified according to the image set, thereby showing whether the comment is positive or negative.
- A practical example of polarity classification is provided as follows. There are reviews of a restaurant from a thousand individuals, and each of the reviews is either a positive or a negative sentence. The reviews are input into the NLP-based recognition system 100 for polarity classification.
For the NLP-based recognition system 100, the test ratio is set to 0.2, which means 800 sentences are used for training and 200 sentences are used for testing. The training is performed several times with different cross-validation data sets to ensure that no overfitting occurs. Here, "overfitting" means that the result corresponds too closely or exactly to a particular set of data and therefore fails to fit additional data or to predict future observations reliably. Overfitting may result from the processing model containing more parameters than can be justified by the data. The result of this example provides high accuracy and a high F1 score, and the number of model training parameters required is reduced.
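A minimal sketch of this evaluation protocol follows, using scikit-learn. The 0.2 test ratio and the cross-validation check come from the text above; the feature matrix, labels, and stand-in classifier are placeholders, not the patent's model.

```python
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import accuracy_score, f1_score
from sklearn.linear_model import LogisticRegression  # stand-in classifier

# X: flattened image sets (one per review), y: polarity labels (1 pos / 0 neg).
rng = np.random.default_rng(0)
X = rng.random((1000, 64))          # placeholder features for 1000 reviews
y = rng.integers(0, 2, 1000)        # placeholder labels

# Test ratio 0.2: 800 training sentences, 200 test sentences.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Cross-validation on the training split to check for overfitting.
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X_tr, y_tr, cv=5)
print("CV accuracy:", cv_scores.mean())

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("accuracy:", accuracy_score(y_te, pred), "F1:", f1_score(y_te, pred))
```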
- Some of the image sets transformed from feature vectors in the afore-described manner are shown in FIG. 8. Each of patterns P1-P3 and N1-N3 is constructed from an image set and is in a two-dimensional form.
- Patterns P1-P3 correspond to three different comments with positive polarity. For example, pattern P1 is transformed from a feature vector that is generated in response to the sentence "my boyfriend and I sat at the bar and had a completely delightful experience"; pattern P2 is transformed from a feature vector that is generated in response to the sentence "he also came back to check on us regularly excellent service"; and pattern P3 is transformed from a feature vector that is generated in response to the sentence "service is friendly and inviting".
- Patterns N1-N3 correspond to three different comments with negative polarity. For example, pattern N1 is transformed from a feature vector that is generated in response to the sentence "I've had better not only from dedicated boba tea spots but even from jenni pho"; pattern N2 is transformed from a feature vector that is generated in response to the sentence "third the cheese on my friend's burger was cold"; and pattern N3 is transformed from a feature vector that is generated in response to the sentence "I think this restaurant suffers from not trying hard enough".
- Patterns P1-P3, corresponding to the positive polarity comments, have similar features that are different from those of patterns N1-N3, corresponding to the negative polarity comments. Such differences between the positive polarity patterns and the negative polarity patterns are identifiable by machine learning using at least one neural network, so as to achieve polarity classification by image classification. Although patterns P1-P3 and N1-N3 are shown in grayscale, it should be understood that these patterns are exemplary and may be presented in multiple colors in other real-life cases.
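For readers who wish to reproduce this kind of visual comparison, a small matplotlib sketch is given below. It assumes, for illustration only, that an image set is rendered as one two-dimensional pattern by tiling its q images side by side; the patent does not specify how FIG. 8 was rendered, and the placeholder arrays stand in for real P/N patterns.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_patterns(image_sets, titles):
    """Render each image set as a single 2-D grayscale pattern
    (tiling is an assumed rendering choice, not from the patent)."""
    fig, axes = plt.subplots(1, len(image_sets), figsize=(3 * len(image_sets), 3))
    for ax, imgs, title in zip(axes, image_sets, titles):
        pattern = np.hstack(list(imgs))   # (q, m, n) -> (m, q*n) tile
        ax.imshow(pattern, cmap="gray")
        ax.set_title(title)
        ax.axis("off")
    plt.tight_layout()
    plt.show()

# Example with placeholder image sets in place of real P1/N1 patterns
rng = np.random.default_rng(2)
show_patterns([rng.random((4, 8, 8)), rng.random((4, 8, 8))], ["P1", "N1"])
```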
- Referring to FIG. 9, an NLP-based recognition system 100 may comprise a feature extractor 110, a transformer 120, a modifier 130, and a processor 140. The NLP-based recognition system 100 may further include a director 150 and a displayer 160. The director 150 is configured to link to a target textual source and receive comments from the textual source, so as to input the comments into the feature extractor 110. In various embodiments, the target textual source may be a website, a software application, a bulletin board system (BBS), an app, or another suitable source. After the NLP-based recognition system 100 processes the textual source through the feature extractor 110, the transformer 120, the modifier 130, and the processor 140 in the afore-described manner, a classification result of an NLP task can be transmitted from the processor 140 to the displayer 160, so as to physically and visually show the classification result. For example, a user can set the NLP-based recognition system 100 to execute an NLP task for polarity classification by linking to a target textual source containing many comments/reviews via the director 150. Then, the displayer 160 of the NLP-based recognition system 100 can show a classification result of the polarity classification, so as to inform the user whether the polarity of each comment/review is positive or negative, as well as the numbers of positive and negative comments/reviews.
- The electronic embodiments disclosed herein may be implemented using computing devices, computer processors, or electronic circuitries including but not limited to application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), and other programmable logic devices configured or programmed according to the teachings of the present disclosure.
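As one illustration of such software, here is a minimal Python sketch of the FIG. 9 data flow. The component names mirror the reference numerals, but the callables and their behavior are assumptions for illustration, not interfaces specified by the disclosure.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List

@dataclass
class NLPRecognitionSystem:
    """Pipeline mirroring FIG. 9: director 150 -> feature extractor 110 ->
    transformer 120 -> modifier 130 -> processor 140 (-> displayer 160)."""
    fetch_comments: Callable[[str], Iterable[str]]  # director 150: link to source
    extract_features: Callable[[str], Any]          # feature extractor 110
    to_image_set: Callable[[Any], Any]              # transformer 120
    modify_images: Callable[[Any], Any]             # modifier 130 (stages S42/S44)
    classify: Callable[[Any], str]                  # processor 140: "positive"/"negative"

    def run(self, source: str) -> List[str]:
        labels = []
        for comment in self.fetch_comments(source):
            vec = self.extract_features(comment)
            images = self.modify_images(self.to_image_set(vec))
            labels.append(self.classify(images))
        return labels

# The displayer 160 could then summarize the result, e.g.:
# labels = system.run("https://example.com/reviews")
# print(labels.count("positive"), "positive /", labels.count("negative"), "negative")
```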
- Computer instructions or software codes running in the computing devices, computer processors, or programmable logic devices can readily be prepared by practitioners skilled in the software or electronic art based on the teachings of the present disclosure.
- All or portions of the electronic embodiments may be executed in one or more computing devices including server computers, personal computers, laptop computers, mobile computing devices such as smartphones and tablet computers.
- The electronic embodiments include computer storage media having computer instructions or software codes stored therein which can be used to program computers or microprocessors to perform any of the processes of the present invention. The storage media can include, but are not limited to, floppy disks, optical discs, Blu-ray Disc, DVD, CD-ROMs, and magneto-optical disks, ROMs, RAMs, flash memory devices, or any type of media or devices suitable for storing instructions, codes, and/or data.
- Various embodiments of the present invention also may be implemented in distributed computing environments and/or Cloud computing environments, wherein the whole or portions of machine instructions are executed in distributed fashion by one or more processing devices interconnected by a communication network, such as an intranet, Wide Area Network (WAN), Local Area Network (LAN), the Internet, and other forms of data transmission medium.
- The foregoing description of the present invention has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations will be apparent to the practitioner skilled in the art.
- The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, thereby enabling others skilled in the art to understand the invention for various embodiments and with various modifications that are suited to the particular use contemplated.
Claims (20)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/820,667 US11132514B1 (en) | 2020-03-16 | 2020-03-16 | Apparatus and method for applying image encoding recognition in natural language processing |
PCT/CN2020/080514 WO2021184385A1 (en) | 2020-03-16 | 2020-03-20 | Apparatus and method for applying image encoding recognition in natural language processing |
CN202080000364.3A CN111566665B (en) | 2020-03-16 | 2020-03-20 | Apparatus and method for applying image code recognition in natural language processing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/820,667 US11132514B1 (en) | 2020-03-16 | 2020-03-16 | Apparatus and method for applying image encoding recognition in natural language processing |
Publications (2)
Publication Number | Publication Date |
---|---|
US20210286954A1 (en) | 2021-09-16 |
US11132514B1 US11132514B1 (en) | 2021-09-28 |
Family
ID=77663714
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/820,667 Active 2040-04-07 US11132514B1 (en) | 2020-03-16 | 2020-03-16 | Apparatus and method for applying image encoding recognition in natural language processing |
Country Status (2)
Country | Link |
---|---|
US (1) | US11132514B1 (en) |
WO (1) | WO2021184385A1 (en) |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120253792A1 (en) | 2011-03-30 | 2012-10-04 | Nec Laboratories America, Inc. | Sentiment Classification Based on Supervised Latent N-Gram Analysis |
US9690772B2 (en) | 2014-12-15 | 2017-06-27 | Xerox Corporation | Category and term polarity mutual annotation for aspect-based sentiment analysis |
CN107220220A (en) | 2016-03-22 | 2017-09-29 | 索尼公司 | Electronic equipment and method for text-processing |
CN106372107B (en) | 2016-08-19 | 2020-01-17 | 中兴通讯股份有限公司 | Method and device for generating natural language sentence library |
US10102453B1 (en) | 2017-08-03 | 2018-10-16 | Gyrfalcon Technology Inc. | Natural language processing via a two-dimensional symbol having multiple ideograms contained therein |
US10810467B2 (en) | 2017-11-17 | 2020-10-20 | Hong Kong Applied Science and Technology Research Institute Company Limited | Flexible integrating recognition and semantic processing |
CN108345633A (en) | 2017-12-29 | 2018-07-31 | 天津南大通用数据技术股份有限公司 | A method and device for natural language processing |
CN109829306B (en) | 2019-02-20 | 2023-07-21 | 哈尔滨工程大学 | A Malware Classification Method Based on Optimized Feature Extraction |
Patent Citations (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5983237A (en) * | 1996-03-29 | 1999-11-09 | Virage, Inc. | Visual dictionary |
US20170109838A1 (en) * | 2015-10-15 | 2017-04-20 | International Business Machines Corporation | Cognitive Marketing Based on Social Networking of Positive Reviewers |
US9547821B1 (en) * | 2016-02-04 | 2017-01-17 | International Business Machines Corporation | Deep learning for algorithm portfolios |
US20180349158A1 (en) * | 2017-03-22 | 2018-12-06 | Kevin Swersky | Bayesian optimization techniques and applications |
US20210089571A1 (en) * | 2017-04-10 | 2021-03-25 | Hewlett-Packard Development Company, L.P. | Machine learning image search |
US20200218857A1 (en) * | 2017-07-26 | 2020-07-09 | Siuvo Inc. | Semantic Classification of Numerical Data in Natural Language Context Based on Machine Learning |
US20190108446A1 (en) * | 2017-10-10 | 2019-04-11 | Alibaba Group Holding Limited | Image processing engine component generation method, search method, terminal, and system |
US20190156122A1 (en) * | 2017-11-17 | 2019-05-23 | Adobe Inc. | Intelligent digital image scene detection |
US20210158226A1 (en) * | 2018-04-12 | 2021-05-27 | Nippon Telegraph And Telephone Corporation | Machine learning system, machine learning method, and program |
US20200028686A1 (en) * | 2018-07-23 | 2020-01-23 | Florida Atlantic University Board Of Trustees | Systems and methods for extending the domain of biometric template protection algorithms from integer-valued feature vectors to real-valued feature vectors |
US20200026908A1 (en) * | 2018-07-23 | 2020-01-23 | The Mitre Corporation | Name and face matching |
US20200042838A1 (en) * | 2018-08-02 | 2020-02-06 | International Business Machines Corporation | Semantic understanding of images based on vectorization |
US20200126584A1 (en) * | 2018-10-19 | 2020-04-23 | Microsoft Technology Licensing, Llc | Transforming Audio Content into Images |
US20200195779A1 (en) * | 2018-12-13 | 2020-06-18 | Nice Ltd. | System and method for performing agent behavioral analytics |
US20200327639A1 (en) * | 2019-04-10 | 2020-10-15 | Eagle Technology, Llc | Hierarchical Neural Network Image Registration |
US20200341974A1 (en) * | 2019-04-25 | 2020-10-29 | Chevron U.S.A. Inc. | Context-sensitive feature score generation |
US20210034993A1 (en) * | 2019-08-01 | 2021-02-04 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Poi valuation method, apparatus, device and computer storage medium |
US20210109966A1 (en) * | 2019-10-15 | 2021-04-15 | Adobe Inc. | Video retrieval using temporal visual content |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220083570A1 (en) * | 2020-09-14 | 2022-03-17 | Accenture Global Solutions Limited | Enhanced data driven intelligent cloud advisor system |
US11789983B2 (en) * | 2020-09-14 | 2023-10-17 | Accenture Global Solutions Limited | Enhanced data driven intelligent cloud advisor system |
Also Published As
Publication number | Publication date |
---|---|
WO2021184385A1 (en) | 2021-09-23 |
US11132514B1 (en) | 2021-09-28 |
Legal Events

Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: HONG KONG APPLIED SCIENCE AND TECHNOLOGY RESEARCH INSTITUTE COMPANY LIMITED, HONG KONG. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: NG, YU KEUNG; LIU, YANG; LEI, ZHI BIN; REEL/FRAME: 052182/0107. Effective date: 20200316 |
FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |