1. Introduction
With the development of e-commerce, more and more people are willing to post opinions and comments on products after consumption, producing a large volume of comment texts. These short texts are generally highly subjective and sometimes contain different emotional tendencies within a single sentence. Additionally, short text comments are highly colloquial. This makes the topic of the text vague and difficult to identify, the semantics incoherent, and, more problematically, the text difficult for researchers to use directly. At the same time, most current research on the sentiment analysis of short comment texts is conducted from a coarse-grained perspective, that is, only one sentiment tendency is obtained for an entire passage. However, user reviews generally contain a variety of complex emotions. For example, a product comment in Chinese read as follows: “I love this nice new look of the phone and its glass case, which makes it look fashionable. But it may lag after two weeks”. In a coarse-grained sentiment analysis, this comment would be marked as simply positive. In fact, the user only gave favorable comments on the appearance of the product, and negative comments on the product quality. It can be seen that coarse-grained sentiment analysis cannot accurately reflect the specific aspects that users really care about. Aspect-based sentiment analysis can determine the sentiment tendencies of different aspects in product review ontologies. Based on this advantage, aspect-based sentiment analysis of short user review texts can better help consumers to make judgments and decisions, and can also help businesses to make targeted product improvements and increase user satisfaction.
In response to the above problems, the aim of this study was to improve the accuracy of aspect-based sentiment analysis through deep learning methods. A convolutional neural network (CNN) can extract local feature information from a Chinese comment corpus very well, but can easily miss the long-distance features of the text during extraction. A recurrent neural network (RNN) has good memory ability and is often used to extract the long-distance dependency information of a review corpus, making up for the shortcomings of the convolutional neural network. At the same time, the bidirectional gated recurrent unit (BiGRU) is an “upgraded” version of the RNN, which can better mitigate the problems of gradient explosion and gradient disappearance. Based on the complementary advantages of these two neural networks, this paper describes the construction of a hybrid model based on CNN and BiGRU, referred to as the CNN + BiGRU model. This model can extract the required local feature information and capture the aspect level of comments. Additionally, it can avoid the problems of gradient explosion and gradient disappearance, improve the accuracy of the model, and reduce the computational overhead. In practical applications, this model can also be used for the sentiment analysis of short texts such as microblog comments and social public opinion posts, playing a guiding role in many related fields.
2. Literature Review
At present, researchers in text sentiment analysis mostly use three types of methods: (1) statistical methods based on a sentiment polarity dictionary; (2) methods built on a machine learning framework; (3) deep learning methods based on hierarchical models. However, the sentiment dictionary approach requires researchers to define the judgment rules, and the model cannot break away from these fixed sentiment word restrictions. Moreover, training a machine learning sentiment analyzer is a long task that depends heavily on the category labels of the training set. In general, both the sentiment dictionary and machine learning approaches have inherent problems, which motivated the emergence of deep learning methods. Common deep learning models include CNN, RNN, and Long Short-Term Memory (LSTM), which are often used in the field of sentiment analysis and have made notable contributions.
In related research on deep learning methods, Hinton et al. [1] put forward the concept of the deep network model; this model uses a layer-by-layer greedy algorithm to overcome the problems of deep networks, effectively improving the performance of deep learning in all respects. Zhang et al. [2] proved that convolutional neural networks can extract local n-gram features from text, have strong learning power for local features, perform well in feature extraction and text classification, and relatively reduce the amount of computation. However, a traditional CNN cannot deeply learn pooled features, and has the following shortcomings in feature extraction:
1. The use of sigmoid causes the problems of gradient disappearance and slow convergence [3].
2. The deeper the learning layers, the more serious the overfitting problem may be [4].
3. The adoption of a gradient descent strategy may lead to an increase in cumulative error [5,6].
Therefore, this study used a hybrid model based on CNN and BiGRU (CNN + BiGRU) for feature learning.
In order to reduce the impact of the vanishing gradient problem in CNN, scholars designed the Gated Recurrent Unit (GRU) to alleviate it. GRU introduces an update gate and a reset gate. The update gate controls the extent to which state information from the previous moment is brought into the current state, and the reset gate controls how much information from the previous state is written to the current candidate set. Together, these gates alleviate the problems of the vanishing gradient and long-term dependencies [7]. However, a standard GRU has some problems, such as incomplete learning of the feature matrix and the inconsistent influence of the beginning and end of a sentence sequence on the state of the one-way hidden layer [8]. For this reason, the BiGRU network was proposed. BiGRU is a neural network model jointly determined by the states of two unidirectional, opposite GRUs. At each moment, the input is provided to the two GRUs in opposite directions simultaneously, and the output is determined jointly by both. For Chinese text, Wang et al. [9] proposed using a joint CNN and BiGRU network model to learn text features and extract sentence feature representations, thereby improving the accuracy of text sentiment analysis and the calculation speed of the model. Wang et al. [10] constructed a neural network model based on BiGRU which used BiGRU to extract features from the deep information of text, and proved through experiments that the model had better accuracy and a lower loss rate compared with classical models. Geng et al. [11] proposed a model based on BiGRU and an attention mechanism for the prediction of the novel coronavirus epidemic, and showed experimentally that BiGRU could reduce the computational cost and make full use of bidirectional data. In terms of fine-grained text sentiment analysis, Feng et al. [12] established a fine-grained feature extraction model based on BiGRU and attention, and proved through experiments the role of the BiGRU model in improving the accuracy of sentiment analysis.
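The update gate and reset gate described above can be sketched as a minimal scalar GRU step, with a BiGRU pairing one forward and one backward pass. The weights in `w` are hypothetical placeholders; a real GRU operates on vectors with learned weight matrices and bias terms.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h_prev, w):
    """One GRU step for scalar inputs; w holds hypothetical gate weights."""
    z = sigmoid(w["wz"] * x + w["uz"] * h_prev)               # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h_prev)               # reset gate
    h_cand = math.tanh(w["wh"] * x + w["uh"] * (r * h_prev))  # candidate state
    return (1.0 - z) * h_prev + z * h_cand                    # interpolated new state

def bigru(seq, w):
    """Run one GRU forward and one backward over seq; pair states per position."""
    fwd, h = [], 0.0
    for x in seq:
        h = gru_step(x, h, w)
        fwd.append(h)
    bwd, h = [], 0.0
    for x in reversed(seq):
        h = gru_step(x, h, w)
        bwd.append(h)
    bwd.reverse()
    return list(zip(fwd, bwd))
```

Because the output at each position concatenates a forward and a backward state, the beginning and end of a sentence influence the hidden representation symmetrically, which is the motivation for BiGRU given in the text.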
Chinese text sentiment analysis is the process of analyzing sentences and judging the subjective feelings, opinions, and attitudes of their authors through a series of methods. Deep learning has been widely used in the field of text sentiment analysis. Traditional deep networks such as RNN and LSTM have been applied by many scholars in aspect-based sentiment analysis [13]. The function of sentiment analysis technology is to judge the emotional tendencies of Chinese sentences, determine whether the reviewer is positive or negative, and divide the text into several categories according to the reviewer’s attitude. According to the granularity and emphasis of the evaluated content, it can be divided into three levels: discourse level, sentence level, and aspect level. Zhu et al. [14] combined a multi-hop inference network to transform a sentiment analysis task into a reading comprehension task, and proposed a text sentiment analysis model based on multi-hop inference. Furthermore, scholars have refined the research objects to study sentence-level text sentiment analysis. Wang et al. [15] proposed an algorithm based on the contribution of emotional polarity which predicts the sentence polarity of a corpus based on the position of words in the sentence, and proved the effectiveness of the algorithm through experiments. At present, discourse-level and sentence-level sentiment analysis methods and technologies are relatively mature, but both focus on overall sentiment analysis. This can lead to the omission of details and miscalculations in application. Aspect-level sentiment analysis technology can be used to discover the different objects in a text and to identify the emotional information expressed for each aspect, effectively solving the aforementioned problem.
Aspect-based sentiment analysis was only proposed in 2010, and there have been few studies on this topic so far. It consists of two subtasks: aspect item extraction and aspect sentiment classification [16]. Specifically, aspect item extraction aims to extract the attributes of goods or services from a comment text, while aspect-level sentiment classification judges the emotional tendency corresponding to each aspect.
In aspect extraction, Paltoglou et al. [17] treated aspect extraction as a sequence labeling problem and used a linear-chain conditional random field to address it. Traditional methods (such as constructing sentiment dictionaries) completely separate text representation from feature extraction and model training, and focus on text representation and feature extraction. Due to the randomness, high ambiguity, irregularity, and other characteristics of short text, this can easily lead to feature dispersion and context independence during text representation and feature extraction. All of these factors may lower the accuracy of feature extraction and disconnect contextual semantic relations when traditional sentiment analysis methods are used [18]. To improve the accuracy of sentiment analysis based on sentiment lexicons, Bravo-Marquez et al. [19] proposed a time-varying sentiment lexicon based on incremental word vectors, which trains an incremental word sentiment classifier from dynamic word vectors to automatically update the lexicon.
In aspect-based sentiment classification, multiple emotions co-exist in short-text aspect-based sentiment analysis. Although this problem can be addressed by refining the types of sentiment labels, this approach may make the model too complex without solving the essence of the problem [20]. If the number of network layers in an RNN is too great, gradient explosion or gradient disappearance will occur [21]. At the same time, existing heuristic methods cannot efficiently extract the semantic features of polysemous words, resulting in a poor classification effect and poor generalization of existing deep learning classification models. Therefore, how to effectively solve the above problems and improve the accuracy and generalization of aspect-based sentiment analysis is attracting extensive attention. In order to solve these technical problems, Zhang [22] proposed a short-text sentiment analysis algorithm based on Bi-LSTM, aimed at the problem that statistics-based feature selection methods ignore semantic information while deep learning methods do not incorporate the statistical and sentiment information of the features. Tran et al. [23] proposed a model using BiGRU and demonstrated its effectiveness experimentally with GloVe embeddings trained on the SemEval 2014 dataset. Han [24] proposed a sentiment classification model based on BiGRU and knowledge transfer, which uses BiGRU to classify sentiments more accurately according to the semantics of aspect words, and obtains domain knowledge in combination with the knowledge transfer method. Song et al. [25] used a network model based on a bidirectional gated recurrent neural network, using BiGRU to ensure that the model would have fewer network parameters and work faster than CNN.
In recent years, aspect-level sentiment analysis has also been applied in other fields. Alamoodi et al. [26] applied aspect-level sentiment analysis to research on the public’s acceptance of vaccines during the COVID-19 pandemic. Alam et al. [27] applied aspect-based sentiment analysis based on parallel dilated convolutional neural networks to smart city applications.
In summary, aspect-based sentiment analysis has become a hot topic in the field of sentiment analysis in the past two years and has attracted the attention of many scholars. It has also been improved in actual application scenarios, and its advantage in accuracy is leading to the gradual replacement of sentence-level and text-level sentiment analysis. Chinese short text reviews usually contain both explicit and implicit aspect-level information. Aspect-level sentiment analysis technology therefore requires not only explicit structural analysis but also attention to implicit expression. For these reasons, this paper combines CNN and BiGRU for aspect-level semantic sentiment analysis of short texts.
4. Experimental Study, Performance Evaluation, and Comparison
This section mainly discusses the dataset used in the experiment, the preprocessing, the evaluation indices, and the comparison methods. This article used 5G mobile phone review data from JD Mall. The main indicators were the accuracy rate, recall rate, and F1 value, which were used as standards to compare the proposed model against three classic models.
4.1. Dataset and Preprocessing
In the data collection stage, this study used Scrapy scripts and the Octopus collection software to obtain 6573 comment texts from JD Mall. The original format of the data was an XLS file. First, the original data were converted into a TXT file, and the text was then split into one comment per line. The length of the obtained texts was at the sentence level. We then deleted the comment texts with fewer than five words, as well as repetitive and irrelevant texts, leaving 6233 comment texts as the final research object. The data were then preprocessed.
In the data preprocessing stage, the data were first cleaned to filter out all special symbols, punctuation, English letters, numbers, etc. This study used the re.sub function to remove special characters, with the statement re.sub(“[a-zA-Z0-9]”, “”, text) removing English letters and numbers. Next, word segmentation was performed using the Jieba segmenter in Python. Jieba first recognizes which strings in the Chinese text are words, and then uses a number of expressions to filter the characters recognized in the previous step. Next, we removed stop words, i.e., words that carry little practical meaning but are extremely common in Chinese (such as modal particles and interjections). This step referred to the stop word list of the Chinese Academy of Sciences. There were two main steps in the process of removing stop words: (1) read the Chinese stop word list; (2) traverse the previously processed sentence, match the words in it against the stop word list of the Chinese Academy of Sciences, and delete any word that appears in the list. Finally, part-of-speech screening was performed.
This study used the Jieba word segmentation component to mark parts of speech. After tagging according to the needs of the present research, all other word categories were deleted, leaving only four part-of-speech categories, namely nouns (n), other proper nouns (nz), noun-verbs (vn), and idioms (l), i.e., pos = [‘n’, ‘nz’, ‘vn’, ‘l’]. Data preprocessing was thus completed through four steps: data cleaning, word segmentation, stop word removal, and part-of-speech filtering. The results are shown in
Table 1.
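The cleaning, stop-word removal, and part-of-speech filtering steps above can be sketched as follows. The stop-word set here is a tiny illustrative placeholder for the Chinese Academy of Sciences list, and the tagged input stands in for the output of Jieba’s part-of-speech tagger (jieba.posseg.cut), which is not reproduced here.

```python
import re

# Illustrative stop-word set; the study uses the Chinese Academy of
# Sciences stop-word list instead of this tiny placeholder.
STOP_WORDS = {"的", "了", "很"}

# Part-of-speech tags retained in the study: pos = ['n', 'nz', 'vn', 'l']
KEPT_POS = {"n", "nz", "vn", "l"}

def clean(text):
    """Remove English letters, digits, and non-Chinese symbols."""
    text = re.sub("[a-zA-Z0-9]", "", text)        # strip English and numbers
    return re.sub(r"[^\u4e00-\u9fff]", "", text)  # keep CJK characters only

def filter_tokens(tagged):
    """Apply stop-word removal and POS screening to (word, pos) pairs.

    In the actual pipeline, `tagged` would come from jieba.posseg.cut.
    """
    return [w for w, pos in tagged if w not in STOP_WORDS and pos in KEPT_POS]
```

For example, `clean("手机abc123！好")` yields only the Chinese characters, and `filter_tokens` then discards stop words and any token whose tag is outside the four retained categories.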
4.2. Aspect-Level Feature Extraction Based on TF-IDF Vectorization
4.2.1. TF-IDF Algorithm
The TF-IDF algorithm is a feature extraction method recognized by academia. Compared with other algorithms, TF-IDF extraction is more accurate. Therefore, this study used the TF-IDF feature extraction algorithm to implement vectorization processing for 5G mobile phone review text. Through TF-IDF calculation, we were able to find whether a certain word was critical in this text sample. The specific calculation formula is as follows:
The term frequency (TF) and inverse document frequency (IDF) of a word $t_i$ in document $d_j$ are defined as

$$\mathrm{TF}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}, \qquad \mathrm{IDF}_{i} = \log \frac{|D|}{1 + |\{j : t_i \in d_j\}|},$$

where $n_{i,j}$ is the number of occurrences of $t_i$ in $d_j$ and $|D|$ is the total number of documents. The combined weight is

$$\mathrm{TF\text{-}IDF}_{i,j} = \mathrm{TF}_{i,j} \times \mathrm{IDF}_{i}.$$

The values of TF and IDF are calculated separately, and then the total TF-IDF weight value is obtained and sorted in descending order. Choosing the first few keywords with the highest scores plays a role in dimensionality reduction. The workflow is shown in
Figure 4.
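The TF-IDF computation and the descending-order keyword selection described above can be sketched in plain Python. The +1 smoothing in the IDF denominator follows a common convention and may differ in detail from the exact formula used in the study.

```python
import math
from collections import Counter

def tf_idf(docs):
    """TF-IDF weights for a list of tokenized documents (lists of tokens)."""
    n_docs = len(docs)
    # document frequency: in how many documents each term appears
    df = Counter(t for doc in docs for t in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / (1 + df[term]))
            for term, count in counts.items()
        })
    return weights

def top_keywords(weights, k):
    """Merge per-document weights and keep the k highest-scoring terms."""
    merged = Counter()
    for w in weights:
        for term, score in w.items():
            merged[term] = max(merged[term], score)
    return [term for term, _ in merged.most_common(k)]
```

Sorting the merged weights in descending order and truncating to the top k terms is what gives the dimensionality reduction mentioned above; in this study, k = 90 keywords were retained.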
4.2.2. TF-IDF Keyword Table
The Scikit-learn machine learning library in Python contains a variety of functions for numerical operations, and also provides the TfidfTransformer function required by the TF-IDF algorithm in this article. The weight was calculated through the above process, so as to filter out suitable keywords. In this paper, 90 keywords from 5G mobile phone reviews were finally screened out, and their respective weights were obtained.
4.2.3. Feature Induction
After summarizing and sorting the keywords extracted by TF-IDF, the consumer review characteristics of 5G mobile phones were grouped into the following six categories: battery, appearance, function, performance, price, and service, which represent the most important considerations of consumers when buying 5G mobile phones. For these six major elements, the specific extracted feature words are shown in
Table 2 below.
First of all, if there is a problem with the battery of a mobile phone, the product must be repaired or discarded; therefore, battery performance is a major factor that consumers pay attention to. Secondly, as the user group of electronic products becomes younger, appearance has also become a major factor influencing purchase decisions. Thirdly, the comprehensiveness of a phone’s functions and the superiority of its performance are the internal driving forces behind the vast majority of users’ purchase decisions. Moreover, price also impacts consumer choices: consumers will evaluate phones of different prices and grades differently. Finally, service is also a major factor considered by consumers, one which best reflects the sense of responsibility and service attitude of the merchant; it likewise affects consumers’ evaluations of mobile phones.
4.2.4. Annotating Emotional Polarity
Using a manual labeling method, we referred to the summarized evaluation features and labeled the feature subject and emotional polarity of each 5G mobile phone review. The emotional polarity was represented by the numbers “0” and “1”: this study adopted a binary classification scheme, where “0” represents negative emotions and “1” represents positive emotions. Finally, 7003 pieces of data were marked, of which 6002 were training data and 1001 were test data. A total of 2777 negative and 3215 positive instances were marked in the training set. The positive and negative emotions were balanced and suitable as input data for model training. Examples of comment annotation are shown in
Table 3.
The evaluation text feature–polarity distribution diagram in
Figure 5 below shows the ranked aspects of consumer focus on 5G mobile phone elements. It can be seen that the most concerning 5G mobile phone element for consumers is appearance, followed by performance and function. There were relatively few comments on the price. This may be due to the fact that consumers have a detailed understanding of the price before purchasing, so there is no excessive evaluation.
4.3. Experimental Settings and Evaluation Criteria
In this section, we first set up the experimental environment and built the experimental platform. The experimental environment parameters are shown in
Table 4.
This study used Python as the implementation language and TensorFlow as the experimental framework. The Chinese comments were vectorized using the one-hot model, and then the parameters of the model were set. The dimension of the word vector and the size of the hidden layer were both 300. The hyperparameters in the model that needed to be adjusted were determined using the grid search method. After many iterations, the hyperparameter settings used for the experiment were as shown in
Table 5.
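The grid search over hyperparameters mentioned above can be sketched as an exhaustive loop. The search space and the scoring function below are illustrative placeholders; in the actual experiment, each configuration would be scored by training the CNN + BiGRU model in TensorFlow and measuring validation accuracy, and the real ranges come from Table 5.

```python
from itertools import product

# Hypothetical search space modeled on Table 5-style hyperparameters.
GRID = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [32, 64, 128],
    "dropout": [0.3, 0.5],
}

def evaluate(cfg):
    """Placeholder score; in the experiment this would train the
    CNN + BiGRU model and return validation accuracy for cfg."""
    return -abs(cfg["learning_rate"] - 1e-3) - abs(cfg["batch_size"] - 64) / 1000

def grid_search(grid, score):
    """Enumerate every combination in the grid and keep the best one."""
    configs = [dict(zip(grid, vals)) for vals in product(*grid.values())]
    return max(configs, key=score)
```

Grid search is exhaustive, so its cost is the product of the sizes of all value lists; with the small, discrete ranges typical of this kind of experiment, that cost stays manageable.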
In order to verify the effect of the CNN + BiGRU model proposed in this section, this study also used accuracy, precision, and F1 value (F1 measure) as experimental evaluation indicators. The specific formulas of each evaluation index are shown in
Table 6.
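The evaluation indices referenced in Table 6 reduce to the standard confusion-matrix formulas, which can be written as a small helper, assuming true positive (TP), false positive (FP), false negative (FN), and true negative (TN) counts are available from the test set.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)                           # P = TP / (TP + FP)
    recall = tp / (tp + fn)                              # R = TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of P and R
    return precision, recall, f1

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)
```

For instance, a model with 8 true positives, 2 false positives, and 2 false negatives scores 0.8 on precision, recall, and F1 alike.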
4.4. Comparison and Analysis of Experimental Results
After many iterations, the CNN + BiGRU model constructed in this article reached its optimal state and the accuracy and loss curve of the model was obtained, as shown in
Figure 6 below. It can be seen that as the number of batches increased, the accuracy of the model continued to rise, the loss continued to drop, the learning rate was suitable, and there was no overfitting situation in which the loss and accuracy decreased simultaneously, indicating that the model performed relatively well. Another important finding is that the CNN + BiGRU model has a relatively simple structure, and the network training batch size of 64 will not cause memory explosion and may reduce the training difficulty. Therefore, the CNN + BiGRU model reduced the calculation time and improved the operational efficiency.
At the same time, in order to prove the superior reliability of the CNN + BiGRU model described in this paper, it was necessary to show that this model is superior to other methods. Therefore, this study included a comparative experiment in which the CNN + BiGRU model was compared with the traditional CNN model, LSTM model, and C-LSTM model. All experimental models can be simply divided into three parts, namely input, processing, and output. The experimental environment, experimental parameters, and number of iterations were all the same. This ensured that the internal structure of the model was unique and made the comparison results more convincing. The experimental results are shown in
Table 7 below.
Among all the models, the CNN model had the worst performance in terms of accuracy, recall, and F1 value. One reason may be that, although the number of 5G mobile phone reviews was relatively small, they were all more than 30 words long; CNN cannot model long-distance dependencies in the context of a sentence, resulting in the loss of some information and thus very poor performance. For the LSTM model, for a comment sentence to enter the current processing structure, all previous units need to be traversed, which not only increases the workload of sentiment analysis tasks but also easily causes the problem of gradient disappearance. The C-LSTM model uses CNN as an auxiliary, which improves the model’s ability to extract local features, so its information extraction was more complete and all three evaluation index scores improved. However, this model still consumes considerable resources even for components that do not need training, which increases the training difficulty and wastes resources. Finally, the experiment found that the accuracy of the CNN + BiGRU model was improved by 12.12%, 8.37%, and 4.46% compared with the other three models. This verified that the model could not only extract the local features of the evaluated text but also solve the contextual dependence problem. Its structure is relatively simple while ensuring effectiveness; taking into account its calculation speed and reduced computational consumption, it can be seen that the CNN + BiGRU model constructed in this study clearly outperforms the other traditional models.
5. Conclusions and Future Outlook
In this paper, we propose an aspect-level text sentiment analysis method based on a convolutional neural network and a bidirectional gated recurrent unit network. By cleaning the crawled 5G review text, performing Chinese word segmentation, deleting meaningless stop words, and filtering by part of speech, the original dirty data were turned into a corpus that could be directly input into the model. In the aspect ontology extraction stage, the TF-IDF algorithm was used to obtain feature word vector weights. In the model building stage, the one-hot word embedding technique was used to vectorize the text; CNN extracted local feature information, GRU extracted long-distance dependency information, and both preceding and following context were processed through a bidirectional structure. On this basis, this study used CNN + BiGRU to obtain the contextual information of the text, fully extracting the local features of the review text, improving the accuracy of sentiment classification, avoiding the gradient explosion and gradient disappearance problems, and reducing the loss value of the model by constantly adjusting the experimental parameter settings. Finally, by analyzing the accuracy, recall, and F1 score metrics, the CNN + BiGRU model constructed in this paper was shown to have improved significantly in each metric compared with the CNN, LSTM, and C-LSTM models.
Compared with the most basic CNN network model, the CNN + BiGRU model in this paper improved the accuracy by 12%; compared with the LSTM and C-LSTM models, it improved the accuracy by 8% and 4%, respectively. The model not only improved the classification effect and achieved the best result in all criteria, but also reduced the operation time significantly. It can be seen that the model has applicable advantages in aspect-based sentiment analysis research on short comment texts, and can provide a reference for future directions in sentiment analysis.
Aspect-level sentiment analysis can extract all the sentiment tendencies expressed by consumers regarding every aspect of a product, and merchants can formulate targeted policies on this basis. Applying aspect-level sentiment analysis to recommendation systems can improve their practicability. Therefore, aspect-level sentiment analysis has high research value. This paper constructed a CNN + BiGRU model for aspect-level sentiment analysis, which improved the relevant indicators to a certain extent. Regarding aspect-level sentiment analysis, there are other valuable research works: Zhang [32] used graph neural networks to capture the implicit features between nodes in sentence relation graphs for sentiment analysis. Considering the problem of processing graph-structured data, Wu et al. [33] proposed a sentiment analysis model based on distance and graph-structured convolutional neural networks.
The authors’ research time and resources are limited. Future studies could conduct in-depth research in the following areas:
In the sentiment classification section, this paper only divided sentiments into positive and negative ones. Further research could carry out specific grading calculations for certain emotions, and further refine the classification of words of praise and derogation.
In specific short-text contexts, different emotional entities are usually given different degrees of importance. Therefore, further research could introduce an attention mechanism to help exclude useless information and find key information from big data quickly.
The CNN + BiGRU model could be used for feature fusion and applied to other fields, such as text backdoor attack technology [34] and pattern recognition [35].
Similar models have been applied to sentiment analysis in other languages. Ayoobi et al. [36] experimentally demonstrated the effect of the GRU structure on improving the accuracy of sentiment analysis of Arabic text, and introduced the Multilingual Universal Sentence Encoder to further improve accuracy. However, the adaptability of such models to other languages needs further study.