I'm having a hard time understanding why people treat any vector they happen to have as a suitable candidate for a recommender system.
In my mind, a recommender system requires a space in which distance represents similarity. Before you can construct such a space, you first need to settle on the type of distance you want to use (Euclidean, angular, or something else). Then you need a model (assuming we're talking about ML) that maps your input (an image, text, or anything else) to a point in that space. One major requirement of this model is that it is aware of the distance we've chosen: if no notion of that distance ever enters the model's training, there's no reason to expect its output to have the property that "distance means similarity".
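For concreteness, here is a minimal sketch of what I mean by a model that is "aware" of the distance (PyTorch; the toy encoder, layer sizes, and fake batch are made up purely for illustration): the training objective is written in terms of the very metric that will later be used to retrieve recommendations.

```python
import torch
import torch.nn as nn

# Toy encoder standing in for whatever maps raw inputs to the embedding space.
encoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))

# The loss is expressed in the *same* metric we plan to use at recommendation
# time (Euclidean here, since p=2), so training explicitly pushes similar items
# close together under that metric.
loss_fn = nn.TripletMarginLoss(margin=1.0, p=2)

# Fake batch: anchor items, items known to be similar, items known to be dissimilar.
anchor, positive, negative = (torch.randn(8, 128) for _ in range(3))

loss = loss_fn(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
```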
I'm asking because I've seen people take whatever vectors they have on hand and build a recommender system on top of them. Here's an example of using a VAE's latent vectors for recommendations:
I've also seen people use fastText word embeddings in the same way. I understand that all of these embeddings/latent vectors form clusters in their spaces, with some interesting patterns. But I don't think that's enough to assume that the "distance represents similarity" requirement for a recommender system is satisfied.
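To illustrate what I mean by "the same way", the pattern I usually see is roughly the sketch below (gensim; the model name is one of the pre-trained fastText models distributed via gensim-data, and the query word is arbitrary): take off-the-shelf vectors and serve their nearest neighbours under cosine similarity as recommendations, even though the vectors were trained on a word-prediction objective that never references that metric.

```python
import gensim.downloader as api

# Pre-trained fastText vectors, trained for word prediction rather than with
# any item-similarity objective in mind.
vectors = api.load("fasttext-wiki-news-subwords-300")

# "Recommend" the nearest neighbours under cosine similarity and hope that
# proximity in this space actually means the items are related.
print(vectors.most_similar("guitar", topn=5))
```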
Please let me know if I'm missing anything here.