Article

A comparison of text representation methods for predicting political views of social media users

A. Glazkova,
2021

The paper focuses on the task of predicting the political views of social media users. The aim of this study is to identify the most effective method for representing textual information from user profiles. We compared several text representation methods: bag-of-words modeling, averaged word2vec embeddings, Sentence Transformers representations, and text representations obtained with three BERT-based models (Multilingual BERT, SlavicBERT, and RuBERT). We conducted our experiments on a dataset of VKontakte user data collected with the VK API. We evaluated the effectiveness of binary classification that separates the pages of users with radical political views (ultraconservatives, communists, and libertarians) from the pages of users who are indifferent to politics. Further, we compared the impact of the text representations on distinguishing users who belong to different radical political movements: communists vs. libertarians, libertarians vs. ultraconservatives, and ultraconservatives vs. communists. As expected, the BERT-based models showed the best results. Moreover, the best-performing model differed from task to task.
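
Two of the simpler representations named in the abstract, bag of words and averaged word embeddings, can be illustrated with a minimal sketch. The vocabulary, the two-dimensional toy vectors, and the function names below are assumptions made for illustration only; they are not the paper's data or implementation, and a real setup would use trained word2vec vectors instead of hand-written ones.

```python
from collections import Counter


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in the token list."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def averaged_embedding(tokens, embeddings, dim):
    """Average the vectors of the tokens that have an embedding.

    Tokens without an embedding are skipped; if none are known,
    a zero vector of length `dim` is returned.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Hypothetical toy data (not taken from the paper).
vocabulary = ["freedom", "market", "party", "music"]
toy_embeddings = {
    "freedom": [1.0, 0.0],
    "market": [0.0, 1.0],
    "party": [1.0, 1.0],
}
tokens = "freedom market freedom".split()

bow_vector = bag_of_words(tokens, vocabulary)
avg_vector = averaged_embedding(tokens, toy_embeddings, dim=2)
```

The bag-of-words vector grows with the vocabulary and ignores word similarity, while the averaged embedding has a fixed length regardless of text size. Either vector could then be passed to a binary classifier.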

Versions

  • 1. Version of Record, 2021-01-01

Metadata

About the authors
  • A. Glazkova
    University of Tyumen
Journal title
  • CEUR Workshop Proceedings
Volume
  • 2843
Funding organization
  • Russian Foundation for Basic Research (RFBR)
Grant number
  • 20-011-32031
Document type
  • journal article
Creative Commons license type
  • CC BY
Legal status of the document
  • Free license
Source
  • Scopus