I am wondering what the most appropriate way is to model the interaction between two words/variables in a language model for a sentiment analysis task. For example, consider the following dataset:

You didn't solve my problem,NEU
I never made that purchase,NEU
You never solve my problems,NEG

The words "solve" and "never", in isolation, doesn't have a negative sentiment. But, when they appear together, they do. Formally speaking: assuming we have a feature «solve» that takes the value 0 when the word «solve» is absent, and 1 when the word is present, and another feature «never» with the same logic: the difference in the probability of Y=NEG between «solve»=0 and «solve»=1 is different when «never»=0 and «never»=1. But a basic logistic regression (using, for example, sklearn), wouldn't be able to handle this kind of situation (it doesn't add interaction terms).

  • It depends on how you teach the models those objectives. The semantic rules can be learned thanks to attention mechanisms, which determine whether a specific structure is NEG or NEU. Do you want to train a new model or reuse an existing one? – Nicolas Martin Jan 20 '23 at 08:50
  • Thank you @NicolasMartin. That's what I was wondering: which machine learning or deep learning architectures can be used to model the interaction between words. I want to train a model from scratch. It is a more experimental exercise. What could I try? – pedritoanonimo Jan 20 '23 at 12:06
  • You should train a BERT model from scratch (check online guides or ChatGPT); a minimal from-scratch attention sketch follows these comments. – Nicolas Martin Jan 20 '23 at 13:42
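Following up on the comments' suggestion, here is a minimal, hypothetical sketch of an attention-based classifier trained from scratch in PyTorch (a toy stand-in for a full BERT pretraining run); the vocabulary, model sizes, and training loop are all illustrative assumptions:

```python
# Toy from-scratch attention classifier; sizes and loop are illustrative.
import torch
import torch.nn as nn

texts = [
    "you didn't solve my problem",
    "i never made that purchase",
    "you never solve my problems",
]
labels = torch.tensor([0, 0, 1])  # 0 = NEU, 1 = NEG

# Build a toy vocabulary (id 0 is reserved for padding) and pad sequences.
vocab = {w: i + 1 for i, w in
         enumerate(sorted({w for t in texts for w in t.split()}))}
max_len = max(len(t.split()) for t in texts)

def encode(text):
    ids = [vocab.get(w, 0) for w in text.split()]
    return ids + [0] * (max_len - len(ids))

X = torch.tensor([encode(t) for t in texts])

class TinyAttentionClassifier(nn.Module):
    def __init__(self, vocab_size, dim=16, heads=2, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim, padding_idx=0)
        # Self-attention lets each word's representation depend on the
        # other words, so "solve" reads differently when "never" is present.
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.out = nn.Linear(dim, classes)

    def forward(self, ids):
        h = self.embed(ids)
        h, _ = self.attn(h, h, h)
        return self.out(h.mean(dim=1))  # mean-pool over tokens

model = TinyAttentionClassifier(len(vocab) + 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for _ in range(200):  # tiny toy loop; real training needs far more data
    opt.zero_grad()
    loss = loss_fn(model(X), labels)
    loss.backward()
    opt.step()

print(model(X).argmax(dim=1))  # typically fits the three toy examples
```

Self-attention mixes each token's representation with the others in the sentence, which is exactly the mechanism the first comment points to for letting "solve" be interpreted differently when "never" is nearby.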

0 Answers