Introduction

Text classification has become a pressing need for the current generation [1]. The processing capability of new-generation computers has grown exponentially in tandem with the amount of data handled. This is partly because there are now more end users, which calls for efficient data management. To handle data effectively, it must be uploaded and retrieved as quickly as possible: handling data means sending and retrieving it in the shortest possible time without loss, which is exactly the requirement for scientific literature, health records, e-books, web applications, banking details, and digital content. Text classification is now applied in many areas [2,3,4,5]. One of them is data processing and structuring, where a large amount of data is sorted and organized on the basis of relevancy. Opinion mining is an equally essential application, and there are numerous initiatives to expand the use of categorized data for collaborative filtering. Email classification is also important; one use of text classification is identifying spam messages. Given their criticality and precision requirements, these applications challenge the performance of existing models. Recently, researchers have launched several initiatives to address these real-world situations [6].

In the recent past, researchers have taken several remarkable initiatives in response to such real-world situations. Significant attempts are being made to manage large amounts of text, and categorization techniques expose the data to text mining and Natural Language Processing (NLP) with the aim of making the process more effective [7]. The raw data, here the Twitter data used for the present study, is subjected to text classification through a series of processes, namely feature extraction, dimensionality reduction, feature selection, and finally evaluation with metrics that make it easier to measure classification accuracy [8]. The major goal of this study is to create a hybrid deep learning text classification system using an RNN embedded with BiLSTM and GRU layers. The study employs 83,267 rows of Twitter data with this hybrid deep learning technique.

This paper is organized as follows: Section II describes the relevant work in the field of text categorization. Section III outlines the standard and hybrid text categorization algorithms employed in this work. Section IV presents the evaluation criteria used to compare the proposed technique with other algorithms. Section V describes the results of the algorithm comparison. Section VI summarizes the study's findings.

Related Works

Gang Liu and Jiabao Guo proposed a hybrid deep learning approach called attention-based BiLSTM with CNN (AC-BiLSTM). In this model, word embedding vectors feed a convolutional layer, which extracts high-level phrase representations. BiLSTM is then used to obtain both the preceding and the subsequent context representations, and an attention mechanism applies varying degrees of attention to the output of the BiLSTM's hidden layers. The processed contextual information is classified using a Softmax classifier. AC-BiLSTM demonstrates a high degree of classification accuracy when compared with other cutting-edge models [9]. The authors of [10] surveyed over 150 deep learning models developed in the last few years; these models significantly advanced the state of the art in text categorization across different classification challenges. In addition, the authors provided an overview of more than 40 widely used datasets and compared the models on a number of published open benchmarks. For the classification of texts, [11] introduced a new CNN model that focuses on multi-scaling and dense connectivity. Because of the large number of connections, the model is able to generate large n-gram features dynamically from variable small n-gram features, and by attending to several scales of features it can adaptively select task-friendly yet effective features from a large set of multi-scale characteristics for text categorization. The model shows competitive performance on six benchmark datasets, and attention visualization indicates that it can choose the best scale to provide meaningful representations for text classification.

The authors of [12] proposed the Text Graph Convolutional Network (Text GCN) as a text classification method. Text GCN learns embeddings for words and documents jointly, supervised by the pre-defined document labels, after initializing each with a one-hot representation. On a variety of standard datasets, the results showed that Text GCN, without any external word embeddings or knowledge, outperforms cutting-edge text categorization algorithms, and it also learns predictive word and document embeddings. Furthermore, experimental results show that as the percentage of training data is reduced, Text GCN becomes increasingly superior to state-of-the-art comparison techniques, demonstrating that it is robust to limited training data in text classification. For semi-supervised short text categorization, [13] introduced a novel heterogeneous graph neural network based technique that makes use of both small amounts of labelled data and large amounts of unlabelled data. According to [14], convolutional neural networks can be applied to graph-of-words representations of texts, so that non-contiguous and long-distance semantics are captured and the CNN models learn at several semantic levels. The results on the RCV1 and NYTimes datasets indicated that this hierarchical approach was superior to traditional hierarchical text classification and current deep models.

The authors of [15] proposed two text classification methods, NA-CNN-COIF-LSTM and NA-CNN-LSTM, created by combining a CNN with no activation function with an LSTM or with one of its hybrid variants, COIF-LSTM. Comparative tests show that combining a CNN with no activation function and an LSTM or its variants improves performance. In [16], a new text classification algorithm integrates CNN, LSTM, and attention mechanisms: initial features are extracted using convolutional layers, the LSTM then preserves the context history, and the attention mechanism generates semantic codes that represent attentional probability distributions and emphasize the effect of the inputs on the outputs. The work in [17] builds a new model by combining long short-term memory (LSTM) and a convolutional neural network (CNN), which are standard neural network models. Long text sequences benefit from the LSTM's ability to preserve historical information while the CNN structure extracts the text's local attributes; in the hybrid model, a CNN is built on top of the LSTM, and the CNN extracts text feature vectors from those features. In the experiments, the performance of the hybrid model was compared with other models, and the experimental data show that the hybrid model in [18] significantly improves text classification.

The authors of [19] proposed a new hybrid deep learning technique to detect deception by combining recurrent neural networks (RNN) and convolutional neural networks (CNN). The model outperformed state-of-the-art AI and ML models when evaluated on a benchmark dataset. The work in [20] introduced a hybrid technique that improves the reliability and transparency of classification decisions for medical documents. This model classifies medical text using a three-level hybrid approach, combining a gated attention-based BiLSTM (ABLSTM) with a regular expression-based classifier, and goes beyond the state of the art in selecting domain- and topic-related features.

The recent research discussed above covers various deep learning models for text classification, highlighting their architectures and performance metrics. However, several research gaps and potential areas for further exploration can be identified, as listed below.

  • While the authors compare their proposed models with current cutting-edge models, there is a lack of comprehensive benchmarking across a wider range of datasets and classification challenges. Further research should explore the models' performance under diverse conditions to assess their generalizability.

  • The literature discusses attention mechanisms in several models, but there is limited discussion on the interpretability of these mechanisms. Research could delve into understanding how attention is applied and whether it aligns with linguistic or semantic importance, enhancing the models’ interpretability.

  • Although some studies mention robustness to reduced training data, further investigation is needed to assess the models' performance across various domains and genres when data availability is limited.

  • Several researchers introduce hybrid models, but a systematic evaluation comparing the efficacy of the different hybrid architectures is missing.

  • Some researchers briefly touch upon applications in medical document classification, but there is a need for more in-depth exploration of domain-specific challenges and the adaptability of these models to diverse industries beyond the benchmark datasets mentioned.

Addressing these research gaps can contribute to a more comprehensive understanding of the strengths and limitations of the proposed models, facilitate their practical application in real-world scenarios, and clarify the impact of combining various neural network components on classification performance.

Methodology

Flaws are inevitable in neural network models, but they can be minimized by combining different designs. In this paper, we therefore look at how layers from different architectures can be added to classic neural network models and how they affect model performance.

Depending on the architecture adopted, recurrent neural networks (RNNs) can take two forms: the gated recurrent unit (GRU) [20] and long short-term memory (LSTM) [19]. The ability of the LSTM layer (Fig. 1) to maintain long-range relationships is its greatest asset [21], whereas the GRU layer (Fig. 2) does not retain them as well. The LSTM is particularly useful for dealing with the vanishing gradient problem; the GRU, on the other hand, has the advantage that training is faster because the amount of training data it needs is significantly smaller [22]. The hyperbolic tangent activation function is used at the output of both the LSTM and the GRU unit, which makes it possible to extract information even after a long period of time. GRUs are easier to train than LSTMs because they require fewer data points and can deliver comparable or even better performance, and they need less computing space and time. This has led researchers to use GRUs instead of LSTMs when long-range memory is not strictly necessary. A more sophisticated LSTM variant is the Bidirectional LSTM (BiLSTM), which can capture long-term dependencies in both the forward and the backward direction [23]; a classical LSTM is a unidirectional network that captures dependencies in only one direction.
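To illustrate the practical difference between these layer types, the following minimal sketch (assuming TensorFlow/Keras; the layer sizes are illustrative and not the settings used in this study) builds one tiny model per recurrent layer type and prints its parameter count, showing that the GRU is the lightest and the BiLSTM the heaviest of the three.

```python
# Minimal sketch (assuming TensorFlow/Keras): build one tiny model per recurrent
# layer type and compare parameter counts. Sizes are illustrative only.
import tensorflow as tf
from tensorflow.keras import layers


def recurrent_block(kind: str, units: int = 64) -> tf.keras.Model:
    """Return a small model whose only recurrent layer is of the given kind."""
    inputs = layers.Input(shape=(None, 100))                   # (timesteps, embedding_dim)
    if kind == "lstm":
        x = layers.LSTM(units)(inputs)                         # forget/input/output gates
    elif kind == "gru":
        x = layers.GRU(units)(inputs)                          # update/reset gates only
    else:  # "bilstm"
        x = layers.Bidirectional(layers.LSTM(units))(inputs)   # forward + backward pass
    outputs = layers.Dense(1, activation="sigmoid")(x)
    return tf.keras.Model(inputs, outputs)


for kind in ("gru", "lstm", "bilstm"):
    print(kind, recurrent_block(kind).count_params())          # GRU < LSTM < BiLSTM
```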

Fig. 1

Long short-term memory (LSTM) unit and the gradient problem

Fig. 2

Gated recurrent units

An earlier version of this RNN model included two layers of GRU [24] and three layers of LSTM. In this paper, the authors attempt to improve the model by using BiLSTM and GRU layers to build a hybrid RNN model. Figure 3 shows a BiLSTM cell that compares the text with both the next and the previous word, predicting the relationship substantially better than a conventional LSTM cell. The “Green” path represents the backward data flow, whereas the “Red” path represents the forward data flow, as in a standard LSTM cell. The data (x_0, x_1, x_2, ..., x_i) are compared in “red” with the next word in each pass and in “green” with the previous word, and the final result is (y_0, y_1, y_2, ..., y_n). The proposed hybrid model for the current study is RNN + 2-BiLSTM + 2-GRU, which is compared against the RNN + 3-LSTM + 2-GRU, RNN + 4-GRU, and RCNN + 4-LSTM models.

Fig. 3

Bidirectional LSTM (BiLSTM) Units

Figure 4 depicts the enhancement of a typical RNN [25] model by integrating two GRU layers and two Bidirectional LSTM layers. The performance of this hybrid model is compared with three other hybrid models: a traditional RNN model with 2 layers of GRU and 3 layers of LSTM (Fig. 5), a standard RNN model extended with 4 layers of GRU (Fig. 6), and a traditional RCNN model [26] improved by incorporating 4 LSTM layers in addition to the convolutional layers, as shown in Fig. 7. As shown in Fig. 1, the LSTM cells used in the hybrid RNN and RCNN models contain a forget gate together with an input gate and an output gate.
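The following sketch shows one plausible Keras realization of the proposed RNN + 2-BiLSTM + 2-GRU stack of Fig. 4; the vocabulary size, sequence length, layer widths, and number of output classes are illustrative placeholders rather than the exact configuration used in this study.

```python
# A sketch of one plausible Keras realization of the RNN + 2-BiLSTM + 2-GRU hybrid
# (Fig. 4). VOCAB_SIZE, MAX_LEN, layer widths, and NUM_CLASSES are assumptions,
# not the exact configuration reported in this study.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE, EMBED_DIM, MAX_LEN, NUM_CLASSES = 20000, 100, 60, 3

inputs = layers.Input(shape=(MAX_LEN,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)                    # e.g. GloVe-initialized
x = layers.SimpleRNN(64, return_sequences=True)(x)                     # plain RNN layer
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)    # BiLSTM layer 1
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)    # BiLSTM layer 2
x = layers.GRU(64, return_sequences=True)(x)                           # GRU layer 1
x = layers.GRU(64)(x)                                                  # GRU layer 2 (final state)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

The comparison models of Figs. 5, 6 and 7 can be sketched analogously by swapping or removing the recurrent layers.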

Fig. 4

RNN + 2-GRU Layers + 2-BiLSTM Layers

Fig. 5

3-LSTM Layers + RNN + 2-GRU Layers

Fig. 6

RNN + 4-GRU Layers

Fig. 7

RCNN + 4-LSTM Layers

The output gate layer is the most important layer for maintaining long-term reliability. Based on the current cell's input vector, the previous cell's output vector, and the cell's previous state, the forget gate layer computes a value, which is also passed on to the input gate. In the LSTM's forget gate layer, values are generated by a sigmoid function combined with a point-wise multiplication operator. The input gate performs two tasks: the sigmoid activation function is applied to the incoming vector data, and the result is then combined with the value of the hyperbolic tangent activation function.

The previous cell state is then updated with the new values generated by this combination. The final gate, the output gate layer, combines the hyperbolic tangent and sigmoid activation functions, applying them to the updated cell state and the incoming input vector. Unlike the LSTM, a fully gated GRU comprises only an update gate and a reset gate. Although it was introduced more recently, the GRU is ideal in some circumstances since it needs a smaller dataset and takes less time to train. Since the LSTM contains dedicated input, forget, and output gates, it is clearly more complex; the GRU can be used to avoid this complexity, which gives better control over a model that contains GRU units.
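For reference, a standard textbook formulation of the LSTM gates described above (a general form, not reproduced from the cited works) is given below, with $\sigma$ the sigmoid function, $x_{t}$ the current input vector, $h_{t-1}$ the previous cell's output, and $c_{t-1}$ the previous cell state:

$$f_{t}=\sigma\left(W_{f}\left[h_{t-1},x_{t}\right]+b_{f}\right),\quad i_{t}=\sigma\left(W_{i}\left[h_{t-1},x_{t}\right]+b_{i}\right)$$
$$\tilde{c}_{t}=\text{tanh}\left(W_{c}\left[h_{t-1},x_{t}\right]+b_{c}\right),\quad c_{t}=f_{t}\odot c_{t-1}+i_{t}\odot \tilde{c}_{t}$$
$$o_{t}=\sigma\left(W_{o}\left[h_{t-1},x_{t}\right]+b_{o}\right),\quad h_{t}=o_{t}\odot \text{tanh}\left(c_{t}\right)$$

The fully gated GRU replaces these with an update gate and a reset gate acting directly on the hidden state, which is why it has fewer parameters and trains faster.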

On the basis of these considerations, four models with integrated LSTM units, GRUs, and combined BiLSTM-GRU layers are trained and their performance is assessed. The GloVe representation of the Twitter dataset makes it easier to train the models by accurately capturing the syntactic and semantic representation of words. GloVe's [26] flaw is its inability to handle terms that are outside its vocabulary, and deep learning model training requires a lot of data, which escalates memory needs. Although GloVe serves a similar purpose to Word2Vec [27], its training process is not hampered by the weights attached to frequent word pairs. Owing to the advantages mentioned above, the models considered in this study can be trained using vectorized data from the Twitter dataset with GloVe, which offers interesting linear word substructures in vector space.
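A minimal sketch of how GloVe vectors can be turned into an embedding matrix for the models above is shown below; the file name (the publicly released glove.twitter.27B.100d.txt) and the tokenizer word index are assumptions for illustration, not details reported in this study. Out-of-vocabulary words keep a zero vector, mirroring the GloVe limitation mentioned above.

```python
# Sketch: turn pre-trained GloVe vectors into an embedding matrix for the models above.
# The file name and the `word_index` (tokenizer vocabulary) are assumptions.
import numpy as np

EMBED_DIM = 100

def load_glove(path: str) -> dict:
    """Parse the GloVe text format: one token followed by its vector per line."""
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors

def build_embedding_matrix(word_index: dict, glove: dict) -> np.ndarray:
    """Row i holds the GloVe vector of the word with tokenizer index i."""
    matrix = np.zeros((len(word_index) + 1, EMBED_DIM), dtype="float32")
    for word, idx in word_index.items():
        if word in glove:
            matrix[idx] = glove[word]      # out-of-vocabulary words keep the zero vector
    return matrix

# Usage (hypothetical paths/objects):
# glove = load_glove("glove.twitter.27B.100d.txt")
# embedding_matrix = build_embedding_matrix(tokenizer.word_index, glove)
```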

Evaluation

This research employs 83,267 rows of Twitter data with the hybrid deep learning technique. Precision and recall are the two factors used to evaluate the models. Precision is the ratio of true positive predictions to all positive predictions (Eq. 1), while recall is the ratio of true positive predictions to the sum of true positive and false negative predictions (Eq. 2). Recall reflects the model's sensitivity, while precision highlights its exactness. Increasing precision reduces the incidence of false positives but tends to produce more false negatives, and a rise in false negatives lowers the recall value. Since precision and recall have opposing tendencies, a model is useful for a given application only when the two are suitably balanced. The F1 score quantifies the area below the precision-recall curve of a model and provides an indication of the combined effect of precision and recall. It is derived from the harmonic mean of precision and recall (Eq. 3) and is an important statistic for evaluating model performance.

$$Precision\ \left(P\right)=\frac{True\ Positive}{True\ Positive+False\ Positive}$$
(1)
$$Recall\ \left(R\right)=\frac{True\ Positive}{True\ Positive+False\ Negative}$$
(2)
$$F_{1}=2\left[\frac{P\times R}{P+R}\right]$$
(3)
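A small sketch that computes Eqs. (1)-(3) directly from raw prediction counts is given below; the counts in the example call are toy values, not results from this study.

```python
# Sketch: compute Eqs. (1)-(3) from raw prediction counts; the counts in the example
# call are toy values, not results from this study.
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple:
    precision = tp / (tp + fp)                                 # Eq. (1)
    recall = tp / (tp + fn)                                    # Eq. (2)
    f1 = 2 * (precision * recall) / (precision + recall)       # Eq. (3), harmonic mean
    return precision, recall, f1

print(precision_recall_f1(tp=70, fp=30, fn=20))                # (0.70, 0.778, 0.737)
```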

Results and Discussions

The graphs below compare the proposed RNN-BiLSTM-GRU hybrid model with the other models. Figures 8 and 9 show the result plots for the four models in terms of precision and recall. The RNN models routinely beat the RCNN model, even though, after 18 iterations, RNN and RCNN provide nearly similar precision and recall values. However, as illustrated in Figs. 8 and 9, compared with the RNN + BiLSTM + GRU and RNN + LSTM + GRU hybrid models, the RNN model embedded with GRU layers requires less time and data to achieve high precision.

Fig. 8

Assessment of precision for Deep Learning Models

Fig. 9

Assessment of recall for Deep Learning Models

The recall-precision curves, shown in Figs. 10, 11, 12 and 13, are extremely important for assessing effectiveness. A polynomial curve fitted with the least-squares approach is used to display the findings, and the area under each curve corresponds to the F1 score, indicating the model with the best balance of recall and precision. Compared with the RNN + 2-BiLSTM + 2-GRU, RNN + 3-LSTM + 2-GRU, and RNN + 4-GRU models, the RCNN + 4-LSTM model spreads over a larger area, indicating a wider range of recall-precision values.
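The following sketch (using NumPy; the sample recall-precision points are placeholders, not the measured values behind Figs. 10, 11, 12 and 13) shows one way to fit such a least-squares polynomial curve and compute the area under it.

```python
# Sketch: summarize a recall-precision curve (as in Figs. 10-13) with a least-squares
# polynomial fit and the area under the fitted curve. The sample points are placeholders.
import numpy as np

recall = np.array([0.10, 0.30, 0.50, 0.70, 0.90])
precision = np.array([0.95, 0.88, 0.80, 0.72, 0.60])

coeffs = np.polyfit(recall, precision, deg=2)          # least-squares polynomial fit
fitted = np.poly1d(coeffs)

grid = np.linspace(recall.min(), recall.max(), 200)
values = fitted(grid)
area = np.sum((values[1:] + values[:-1]) / 2 * np.diff(grid))   # trapezoidal area
print(f"area under fitted precision-recall curve: {area:.3f}")
```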

Fig. 10

Precision-to-recall deviation for the RCNN + 4-LSTM Model

Fig. 11

Precision-to-recall deviation for the RNN + 4-GRU Model

Fig. 12

Precision-to-recall deviation for the RNN + 3-LSTM + 2-GRU Model

Fig. 13

Precision vs. recall for the RNN-BiLSTM-GRU Model

Figure 14 shows the difference more clearly, and Fig. 15 shows the corresponding variation in loss. The average F1 scores for the RNN + 2-BiLSTM + 2-GRU, RNN + 3-LSTM + 2-GRU, RNN + 4-GRU, and RCNN + 4-LSTM models are shown in Table 1. The results show that the RNN + 4-GRU model is 10% more efficient than RCNN + 4-LSTM and 4% more efficient than RNN + 3-LSTM + 2-GRU, while the difference between the RNN + 4-GRU and RNN + 2-BiLSTM + 2-GRU models is 0.3%. The initial slopes of accuracy and loss against normalized time agree well for RNN + 4-GRU and RNN + 2-BiLSTM + 2-GRU, whereas the other models differ significantly from the RNN + 4-GRU curves. The F1 curve of the GRU-based RNN model is higher than those of the RCNN + 4-LSTM and RNN + 3-LSTM + 2-GRU models because its GRU layers require a shorter training time and a smaller dataset.

Fig. 14

Comparison of Accuracy for Deep Learning Models

Fig. 15

Variation of loss in values for Deep Learning Models

The RNN + 4-GRU model cannot resolve long-term dependencies, since it relies solely on GRU layers. This disadvantage is, however, satisfactorily reduced by including BiLSTM layers between the RNN and GRU layers. RNN + 2-BiLSTM + 2-GRU, which combines the advantages of the LSTM and GRU layers, is therefore recommended over RNN-LSTM-GRU where long-term dependencies are crucial, such as in text categorization. RNN-BiLSTM-GRU is also preferable to the RCNN variant: the LSTM layers present in RCNN can resolve long-term dependencies but require more time and data to train.

The relationships between accuracy and normalized time are depicted in Figs. 16 and 17. The RNN + 3-LSTM + 2-GRU model catches up with the RNN + 4-GRU and RNN + 2-BiLSTM + 2-GRU models after a normalized time of 0.75. Both models are almost equally accurate, suggesting that the RNN + 2-BiLSTM + 2-GRU model can replace the RNN + 4-GRU and RNN + 3-LSTM + 2-GRU models, since it retrieves long-term dependencies with a similar level of accuracy. The RNN + 3-LSTM + 2-GRU model has LSTM layers that require more training time, which results in the discrepancy in slopes between normalized times 0 and 0.6.
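One way to obtain accuracy-versus-normalized-time curves such as those in Figs. 16 and 17 is sketched below (assuming Keras); the callback records wall-clock time and training accuracy at each epoch, and the time axis is rescaled to [0, 1] after training.

```python
# Sketch (assuming Keras): record accuracy against wall-clock time during training,
# then normalize time to [0, 1] to produce curves like Figs. 16 and 17.
import time
import tensorflow as tf

class TimedHistory(tf.keras.callbacks.Callback):
    def on_train_begin(self, logs=None):
        self.start = time.time()
        self.times, self.accuracies = [], []

    def on_epoch_end(self, epoch, logs=None):
        self.times.append(time.time() - self.start)      # elapsed seconds
        self.accuracies.append(logs.get("accuracy"))     # training accuracy

# Usage (hypothetical): history = TimedHistory()
# model.fit(x_train, y_train, epochs=20, callbacks=[history])
# normalized_time = [t / history.times[-1] for t in history.times]
```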

Table 1 A comparison of average precision, average recall, and F1 score
Fig. 16

Deep learning model performance evaluation

By matching the slope, the GRU layers compensate for the initial delay when training continues beyond a normalized time of 0.75. The F1 score is determined by the area under the precision-recall curve (Fig. 17). The differences in F1 values for the models are shown in Fig. 18. The RNN + 2-BiLSTM + 2-GRU model consistently improves on the RNN + 4-GRU model, while the RNN + 3-LSTM + 2-GRU and RCNN + 4-LSTM models have average F1 values that are higher than the other models but marginally lower than that of RNN + 4-GRU. The same comparison is depicted graphically in Figs. 16 and 19.

Fig. 17

Comparison of precision-recall for Deep Learning Models

Fig. 18

Variation of F1 Score for Deep Learning Models

Fig. 19

P-values for Text Classification

Figure 19 displays the p-values from the paired t-test for each dataset, along with the significance threshold of 0.05, shown as a red dashed line. Each p-value indicates the statistical significance of the difference between the accuracies of Model 1 and Model 2, as shown in Fig. 20.
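A sketch of the paired t-test behind these p-values is shown below (assuming SciPy); the per-dataset accuracy lists are illustrative placeholders, not the reported results.

```python
# Sketch (assuming SciPy) of the paired t-test behind the reported p-values; the
# per-dataset accuracies below are illustrative placeholders, not the study's results.
from scipy import stats

model_1_acc = [0.81, 0.79, 0.84, 0.78, 0.82]   # accuracy of model 1 per dataset/fold
model_2_acc = [0.76, 0.77, 0.80, 0.74, 0.79]   # accuracy of model 2 on the same splits

t_stat, p_value = stats.ttest_rel(model_1_acc, model_2_acc)    # paired t-test
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, significant: {p_value < 0.05}")
```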

Fig. 20

ROC curve for the Models

Conclusions

The current comprehensive study tackled a hybrid text classification approach by fitting a model and evaluating its performance in terms of F1 score, recall, and accuracy. It was observed that developing a model for a particular purpose requires trade-offs between aspects such as training time, dataset size, and long-term dependency management in order to establish an effective application model. The RNN + 2-BiLSTM + 2-GRU model proved to be more accurate than RNN + 4-GRU and RNN + 3-LSTM + 2-GRU, with the BiLSTM layers handling long-term dependencies and the GRU layers enabling quick model training. The proposed RNN + 2-BiLSTM + 2-GRU hybrid model has an average F1 score of 0.76, the RNN with 4 GRU layers has an average F1 score of 0.77, and the RCNN with 4 LSTM layers has an average F1 score of 0.69. Moreover, thanks to its BiLSTM layers, the RNN + 2-BiLSTM + 2-GRU model is able to maintain long-term dependencies without storing redundant context information, even though it requires a slightly longer training period and a slightly larger dataset.