Article

A comparison of text representation methods for predicting political views of social media users

A. Glazkova,
2021

The paper focuses on the task of predicting the political views of social media users. The aim of this study is to identify the most effective method for representing textual information from user profiles. We compared several text representation methods: bag-of-words modeling, averaged word2vec embeddings, Sentence Transformers representations, and text representations obtained with three BERT-based models (Multilingual BERT, SlavicBERT, and RuBERT). We conducted our experiments on a dataset of VKontakte user data collected with the VK API. We evaluated the effectiveness of binary classification that separates the pages of users with radical political views (ultraconservatives, communists, and libertarians) from the pages of users who are indifferent to politics. Further, we compared the impact of the various text representations on distinguishing users who belong to different radical political movements: communists vs. libertarians, libertarians vs. ultraconservatives, and ultraconservatives vs. communists. As expected, the BERT-based models showed the best results. However, the best-performing model differed from task to task. © 2021 CEUR-WS. All rights reserved.
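To make two of the compared representations concrete, the following is a minimal sketch of bag-of-words counts and averaged word embeddings in plain Python. It is an illustration only, not the paper's implementation: the function names (`tokenize`, `bag_of_words`, `average_embedding`) and the two-dimensional `TOY_EMBEDDINGS` table are hypothetical, and a real setup would presumably use trained word2vec vectors and a proper tokenizer.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(texts):
    """Return a sorted vocabulary and one count vector per input text."""
    vocab = sorted({tok for text in texts for tok in tokenize(text)})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for text in texts:
        vec = [0] * len(vocab)
        for tok, count in Counter(tokenize(text)).items():
            vec[index[tok]] = count
        vectors.append(vec)
    return vocab, vectors


def average_embedding(text, embeddings, dim):
    """Average the vectors of known tokens; zero vector if none are known."""
    known = [embeddings[tok] for tok in tokenize(text) if tok in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Hypothetical toy embedding table, invented for this example only.
TOY_EMBEDDINGS = {
    "free": [1.0, 0.0],
    "market": [0.0, 1.0],
    "music": [-1.0, -1.0],
}

if __name__ == "__main__":
    vocab, vectors = bag_of_words(["free market free", "music"])
    print(vocab, vectors)
    print(average_embedding("free market", TOY_EMBEDDINGS, dim=2))
```

The bag-of-words vector grows with the vocabulary and ignores word meaning, while the averaged embedding has a fixed dimension regardless of vocabulary size. Either vector could then be passed to any binary classifier.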

Versions

  • 1. Version of Record, 2021-08-23

Metadata

About the authors
  • A. Glazkova
    University of Tyumen, 6, Volodarskogo Str., Tyumen, 625003, Russian Federation
Subject heading
  • COVID-19
Journal title
  • CEUR Workshop Proceedings
Volume
  • 2843
Keywords
  • Decision making; Bag-of-words models; Binary classification; Political movements; Political views; Social media; Text representation; Textual information; User profile; Social networking (online)
Publisher
  • CEUR-WS
Document type
  • Conference Paper
Creative Commons license type
  • CC-BY
Legal status of the document
  • Free license
Source
  • Scopus