Improving Accuracy using The ASERLU layer in CNN-BiLSTM Architecture on Sentiment Analysis

  • Sandi Hermawan President University
  • Rilla Mandala Bandung Institute of Technology
Keywords: CNN-BiLSTM, ReLU, Sentiment Analysis, SERLU, US Election 2020

Abstract

Over the last ten years, interactions on social networks among users of different cultures and educational backgrounds have generated 350,000 tweets. The user comments express a wide range of sentiments, from support to hatred, regarding the United States general election in 2020. The dataset, obtained from previous research, contains 3,000 samples; we augmented it to 15,000 samples to facilitate training and provide the amount of data required. Sentiment detection is carried out using the CNN-BiLSTM architecture, chosen because CNN can filter essential words while BiLSTM retains memory in both directions; together, they make the training process more effective. However, this architecture has a weakness in its activation functions. ReLU suffers from the "zero-hard rectifier" and "ReLU dropout" problems, which can cause training to stall, while in SERLU the exponential function cannot be tuned, so the activation remains rigid with respect to the output value. To overcome these problems, we propose a novel activation method, the ASERLU activation function, to replace the activation in the CNN-BiLSTM architecture. ASERLU can adjust its positive output value, negative output value, and exponential value through setter variables, so it adapts more readily to the output value and becomes a flexible activation function that can be increased or decreased as needed. This is the first research to apply it in this architecture. Based on the experimental results, our proposed method gives higher accuracy than ReLU and SERLU.
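The contrast between the three activations in the abstract can be sketched in a few lines. This is a minimal illustration only: the SERLU constants below are approximate defaults, and the ASERLU branch forms and setter variables `a`, `b`, and `c` are hypothetical stand-ins for the formulation defined in the paper.

```python
import numpy as np

# ReLU: max(0, x). Its zero output (and zero gradient) for all x < 0 is the
# "dying ReLU" behavior the abstract refers to as the cause of stalled training.
def relu(x):
    return np.maximum(0.0, x)

# SERLU (Zhang & Li, 2018): a scaled linear branch for x >= 0 and a scaled
# x * exp(x) branch for x < 0. lam and alpha are fixed constants derived in
# the SERLU paper; the values here are illustrative defaults, not the exact
# derived constants. Because the exponent itself is fixed, the shape of the
# negative branch cannot be tuned.
def serlu(x, lam=1.0786, alpha=2.9094):
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, lam * x, lam * alpha * x * np.exp(x))

# Hypothetical ASERLU-style sketch: setter variables a (positive output),
# b (negative output), and c (exponential rate) make each part adjustable,
# which is the flexibility the abstract claims. The exact ASERLU function
# is given in the paper; this form is an assumption for illustration.
def aserlu(x, a=1.0, b=1.0, c=1.0):
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, a * x, b * x * np.exp(c * x))
```

Note how lowering `c` flattens the exponential decay of the negative branch, while `a` and `b` rescale the two output ranges independently; in the fixed-constant SERLU, none of these knobs exist.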


References

A. Gaydhani, V. Doma, S. Kendre, and L. Bhagwat, “Detecting hate speech and offensive language on Twitter using machine learning: An N-gram and TFIDF based approach,” arXiv, 2018. https://arxiv.org/pdf/1809.08651.

L. Grimminger and R. Klinger, “Hate Towards the Political Opponent: A Twitter Corpus Study of the 2020 US Elections on the Basis of Offensive Speech and Stance Detection,” 2021. https://arxiv.org/abs/2103.01664.

R. Alshalan and H. Al-Khalifa, “A deep learning approach for automatic hate speech detection in the saudi twittersphere,” Appl. Sci., 2020. https://doi.org/10.3390/app10238614.

H. Mohaouchane, A. Mourhir, and N. S. Nikolov, “Detecting Offensive Language on Arabic Social Media Using Deep Learning,” in 2019 6th International Conference on Social Networks Analysis, Management and Security, SNAMS 2019, 2019. https://doi.org/10.1109/SNAMS.2019.8931839.

I. Abu Farha and W. Magdy, “Multitask Learning for Arabic Offensive Language and Hate-Speech Detection,” Proc. 4th Work. Open-Source Arab. Corpora Process. Tools, with a Shar. Task Offensive Lang. Detect., 2020. https://aclanthology.org/2020.osact-1.14.

S. Qiu, X. Xu, and B. Cai, “FReLU: Flexible Rectified Linear Units for Improving Convolutional Neural Networks,” in Proceedings - International Conference on Pattern Recognition, 2018. https://doi.org/10.1109/ICPR.2018.8546022.

L. Parisi, D. Neagu, R. Ma, and F. Campean, “QReLU and m-QReLU: Two novel quantum activation functions to aid medical diagnostics,” arXiv, 2020. https://orcid.org/0000-0002-5865-8708.

A. Ashiquzzaman, A. K. Tushar, S. Dutta, and F. Mohsin, “An efficient method for improving classification accuracy of handwritten Bangla compound characters using DCNN with dropout and ELU,” in Proceedings - 2017 3rd IEEE International Conference on Research in Computational Intelligence and Communication Networks, ICRCICN 2017, 2017. https://doi.org/10.1109/ICRCICN.2017.8234497.

G. Zhang and H. Li, “Effectiveness of scaled exponentially-regularized linear units (SERLUs),” arXiv. 2018. https://arxiv.org/pdf/1807.10117.

K. Hara, D. Saito, and H. Shouno, “Analysis of function of rectified linear unit used in deep learning,” in Proceedings of the International Joint Conference on Neural Networks, 2015. https://doi.org/10.1109/IJCNN.2015.7280578.

A. F. M. Agarap, “Deep Learning using Rectified Linear Units (ReLU),” arXiv, 2018. https://arxiv.org/pdf/1803.08375.

X. Hu, P. Niu, J. Wang, and X. Zhang, “A Dynamic Rectified Linear Activation Units,” IEEE Access, 2019. https://doi.org/10.1109/ACCESS.2019.2959036.

S. C. Douglas and J. Yu, “Why RELU Units Sometimes Die: Analysis of Single-Unit Error Backpropagation in Neural Networks,” Conf. Rec. - Asilomar Conf. Signals, Syst. Comput., vol. 2018-October, no. 3, pp. 864–868, 2019. https://arxiv.org/pdf/1812.05981.

J. Wei and K. Zou, “EDA: Easy data augmentation techniques for boosting performance on text classification tasks,” EMNLP-IJCNLP 2019 - 2019 Conf. Empir. Methods Nat. Lang. Process. 9th Int. Jt. Conf. Nat. Lang. Process. Proc. Conf., pp. 6382–6388, 2020. https://arxiv.org/pdf/1901.11196.

Y. Xu and R. Goodacre, “On Splitting Training and Validation Set: A Comparative Study of Cross-Validation, Bootstrap and Systematic Sampling for Estimating the Generalization Performance of Supervised Learning,” J. Anal. Test., 2018. https://doi.org/10.1007/s41664-018-0068-2.

Published
2021-10-31
How to Cite
Hermawan, S., & Mandala, R. (2021). Improving Accuracy using The ASERLU layer in CNN-BiLSTM Architecture on Sentiment Analysis. Jurnal RESTI (Rekayasa Sistem Dan Teknologi Informasi), 5(5), 1001-1007. https://doi.org/10.29207/resti.v5i5.3534
Section
Artikel Teknologi Informasi