Global Journal of Computer Science and Technology, D: Neural & Artificial Intelligence, Volume 22 Issue 1

problem using the supervised approach: the models first undergo a training session during which they are fed labelled data, from which they learn the relationships present in the data. At a later stage, they are presented with unseen data from the same domain and are able to make remarkable inferences from it by applying what they have learned, while also reaching strong classification accuracy.

From our experiments we conclude that, for the models and data we used, the Mel spectrograms performed better than the MFCCs, and that features saved as images with three colour channels performed better than features saved with an extended dimension. Overall, our model showed state-of-the-art performance for accent classification of the Kashmiri language with five output classes.

VI. Future Improvements

There is considerable room for improvement in this area of research. Since this is, to our knowledge, the first study of its kind for the Kashmiri language, the scope for improvement is vast. We propose the following enhancements:

• Collection of more data for efficient model training:
a. The dataset can be increased in size.
b. The dataset can be designed so that it captures as many of the features and variations present in the language as possible.
• The model can be made more complex and sophisticated, so that it can handle more data without underfitting.
• Building an efficient model that captures most of the features relevant to accent classification.
• Reducing the classification error, and thus being able to classify a wider range of the language.
• Making use of the different architectures and techniques available, to make the overall application as fruitful as possible.
• The number of classification classes can be increased beyond five accents or regions.
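The feature pipeline compared above (log-mel spectrograms versus MFCCs, saved as three-channel images) can be sketched in plain NumPy. This is only an illustrative sketch under our own assumptions; the function names, frame sizes, and filterbank parameters below are not taken from the paper, and a real pipeline would typically rely on a tuned library implementation such as librosa.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def log_mel_spectrogram(y, sr, n_fft=512, hop=256, n_mels=40):
    """Frame the waveform, take the power spectrum, apply a mel filterbank."""
    frames = np.lib.stride_tricks.sliding_window_view(y, n_fft)[::hop]
    power = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, c, hi = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, lo:c] = (np.arange(lo, c) - lo) / max(c - lo, 1)
        fbank[i, c:hi] = (hi - np.arange(c, hi)) / max(hi - c, 1)
    return 10.0 * np.log10(np.maximum(power @ fbank.T, 1e-10))  # (frames, n_mels) in dB

def mfcc(log_mel, n_mfcc=13):
    """MFCCs are a DCT-II of the log-mel energies, computed here by hand."""
    n = log_mel.shape[1]
    basis = np.cos(np.pi * np.arange(n_mfcc)[:, None] * (2 * np.arange(n) + 1) / (2 * n))
    return log_mel @ basis.T  # (frames, n_mfcc)

def to_rgb_image(feat):
    """Min-max normalise and replicate to 3 channels, CNN-image style."""
    norm = (feat - feat.min()) / (feat.max() - feat.min() + 1e-10)
    return np.repeat(norm[..., None], 3, axis=2)  # (frames, bands, 3)
```

For a one-second waveform at 16 kHz with these settings, `log_mel_spectrogram` yields a (61, 40) matrix, `mfcc` a (61, 13) matrix, and `to_rgb_image` a (61, 40, 3) array suitable as CNN input; replicating the channel is the simplest way to obtain three channels, whereas applying a perceptual colormap to the spectrogram is another common choice.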
