One-Hot Encoding and Bag-of-Words Methods in Processing the Uzbek Language Corpus Texts

EasyChair Preprint no. 11048
6 pages • Date: October 9, 2023

Abstract

Computers are designed to process information in digital or numerical form, but data is not always numerical. This article describes how to process data in the form of characters, words, and text, and how two methods of teaching a computer to process natural language, one-hot encoding and bag-of-words, are applied to the Uzbek language. How do Alexa, Google Home, and many other "smart" assistants understand and respond to our speech today? This article presents approaches to processing Uzbek language corpus texts using bag-of-words (BoW) and one-hot encoding, methods from the field of artificial intelligence known as natural language processing.

Keyphrases: one-hot encoding, text processing Bag of words, Uzbek language corpus
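The two methods named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy sentences, the whitespace tokenizer, and the helper names (tokenize, build_vocabulary, one_hot, bag_of_words) are all assumptions made here for illustration, and a real Uzbek corpus would need proper tokenization and normalization.

```python
from collections import Counter

# Toy sentences chosen for illustration only; they are not from the paper's corpus.
corpus = [
    "men maktabga boraman",
    "sen maktabga borasan",
    "men keldim men ketdim",
]


def tokenize(text):
    """Lowercase and split on whitespace (a simplifying assumption)."""
    return text.lower().split()


def build_vocabulary(sentences):
    """Map each distinct word to an index, in alphabetical order."""
    words = sorted({word for s in sentences for word in tokenize(s)})
    return {word: index for index, word in enumerate(words)}


def one_hot(word, vocab):
    """Return a vector with a single 1 at the word's vocabulary index."""
    vector = [0] * len(vocab)
    vector[vocab[word]] = 1
    return vector


def bag_of_words(sentence, vocab):
    """Return per-word counts for a sentence; out-of-vocabulary words are ignored."""
    counts = Counter(tokenize(sentence))
    vector = [0] * len(vocab)
    for word, count in counts.items():
        if word in vocab:
            vector[vocab[word]] = count
    return vector


vocab = build_vocabulary(corpus)
print(vocab)
print(one_hot("men", vocab))
print(bag_of_words("men keldim men ketdim", vocab))
```

One-hot encoding represents a single word as a vector with exactly one non-zero entry, while bag-of-words represents a whole sentence as word counts and discards word order.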