ON USAGE OF MACHINE LEARNING FOR NATURAL LANGUAGE PROCESSING TASKS AS ILLUSTRATED BY EDUCATIONAL CONTENT MINING


Abstract

In this paper, we review the most popular approaches to a variety of natural language processing (NLP) tasks, primarily those that involve machine learning, from classic methods to state-of-the-art technologies. Most modern approaches fall into three rough categories: those based on the distributional hypothesis, those that extract information from graph-like structures (such as ontologies), and those that look for lexico-syntactic patterns in text documents. We focus mainly on the first of the three. Before analysis can begin, an important preparatory step in NLP is representing words and documents as numeric vectors. Approaches range from the simple Bag-of-Words model to sophisticated machine learning methods such as word embedding. Today, in information retrieval, the best quality for both English and Russian is achieved by approaches based on word embedding algorithms trained on carefully selected text corpora, combined with deep syntactic and semantic analysis using various deep neural networks. A wide variety of machine learning algorithms is applied to NLP tasks such as part-of-speech tagging, text summarization, named entity recognition, document classification, topic and relation extraction, and natural language question answering. We also review the possibilities of applying these approaches and methods to educational content analysis, and propose a novel approach to using NLP and machine learning to analyze and synthesize educational content in the form of a decision support system.
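
As a minimal illustration of the vector-representation step mentioned in the abstract, the Python sketch below builds Bag-of-Words count vectors for two short documents. Whitespace tokenization, lowercasing, the function names `build_vocabulary` and `bag_of_words`, and the sample sentences are all assumptions made for illustration; this is not the authors' method.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index (sorted order).

    Assumption: tokens are obtained by naive whitespace splitting.
    """
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: index for index, tok in enumerate(tokens)}


def bag_of_words(document, vocabulary):
    """Return a count vector for one document; unknown tokens are ignored."""
    counts = Counter(document.lower().split())
    # Dicts preserve insertion order, so iterating the vocabulary
    # yields tokens in column-index order.
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical sample documents, not taken from the paper.
documents = [
    "students learn machine learning",
    "machine learning courses teach students",
]
vocabulary = build_vocabulary(documents)
vectors = [bag_of_words(doc, vocabulary) for doc in documents]

print(list(vocabulary))  # column order of the vectors
for vector in vectors:
    print(vector)
```

Each document becomes a fixed-length vector of token counts, with word order discarded. Word embedding methods, by contrast, learn dense vectors in which words used in similar contexts end up close together.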

About the authors

A.V. Melnikov

Chelyabinsk State University, Institute of Information Technology, Chelyabinsk, Russia

Author for correspondence.
Email: mav@csu.ru

D.S. Botov

Chelyabinsk State University, Institute of Information Technology, Chelyabinsk, Russia

Email: dmbotov@gmail.com

J.D. Klenin

Chelyabinsk State University, Institute of Information Technology, Chelyabinsk, Russia

Email: jklen@yandex.ru

Copyright (c) 2018 A.V. Melnikov, D.S. Botov, J.D. Klenin



This mass media outlet is registered by the Federal Service for Supervision of Communications, Information Technology and Mass Media (Roskomnadzor).
Registration number and date of the registration decision: series ФС 77 - 70157, dated 16.06.2017.
