I'm still new to machine learning and I've been trying to expand my knowledge of it. For my first project, I want to classify whether a tweet is suicidal or not using the gradient boosting algorithm.
I do know that ML models can't process plain text, which is why we have to represent it as numbers. These numeric values are then the input features to the machine learning model (correct me if I'm wrong).
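To make concrete what I mean by "representing text as numbers", here is a minimal sketch of my current understanding, assuming a simple bag-of-words word count. The example tweets and the helper names (`tokenize`, `build_vocabulary`, `vectorize`) are made up for illustration, and I'm not sure this is the representation I should actually use:

```python
from collections import Counter

def tokenize(text):
    # Lowercase and split on whitespace (a real tokenizer would also
    # handle punctuation, hashtags, emojis, and so on).
    return text.lower().split()

def build_vocabulary(texts):
    # Assign each distinct word a column index, in sorted order.
    words = sorted({word for text in texts for word in tokenize(text)})
    return {word: index for index, word in enumerate(words)}

def vectorize(text, vocabulary):
    # One count per vocabulary word; words outside the vocabulary are ignored.
    counts = Counter(tokenize(text))
    return [counts.get(word, 0) for word in vocabulary]

# Hypothetical training tweets, invented for this example.
tweets = ["i feel so alone", "great day with friends", "i feel great"]
vocabulary = build_vocabulary(tweets)

# Each tweet becomes a fixed-length list of numbers (one feature per word).
features = [vectorize(tweet, vocabulary) for tweet in tweets]
print(list(vocabulary))
print(features)
```

As far as I can tell, each row of `features` is what would be handed to the model as one training example, together with its label.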
But what I don't understand is how the model processes these numbers/vectors, both when it is being trained and when it makes a prediction.
Hopefully someone can explain how plain text is converted into numbers and what happens internally when those numbers are taken as input by the machine learning model.