Using Multi-Context Information for Word Representation Learning in Natural Language Processing

Dewalkar, Swapnil Ashok and Desarkar, Maunendra Sankar (2019) Using Multi-Context Information for Word Representation Learning in Natural Language Processing. Masters thesis, Indian Institute of Technology Hyderabad.

Text
Thesis_Mtech_CS_5551.pdf - Submitted Version
Restricted to Repository staff only until June 2024.

Abstract

Most existing word embedding techniques are based on bag-of-words models, in which words that co-occur with each other are considered to be related. However, similar or related words do not necessarily occur in the same context window. Although a few other methods draw on the different lexical resources available in Natural Language Processing, most of them use each resource independently, as there is no unified method for using all of the resources together. In this thesis, we propose a new method that combines different types of resources for training word embeddings. The lexical resources used in our work are dependency parse trees and WordNet. We also attempt to explain the usefulness of the various relations in dependency parsing. We present evaluations on several tasks, e.g. Semantic Textual Similarity, Word Similarity, Concept Categorization, and Word Analogy, along with qualitative results.


IITH Creators: