Machine Learning - Euclidian Distance Classifier exercise

Question

I'm taking part in an elective subject at university which mainly focuses on the foundations of Machine Learning. Now we got our first exercise - this task should be done practically in any language (I've chosen Python). Our teacher doesn't explain the relations between theory and practice well and so it's hard for all of us to follow along - so I decided to post this question here. I don't want anyone to give me a solution, I just don't understand what he wants and may be a hint how to approach this question.:

Here is the full exercise:

Euclidean distance classifier

Develop an Euclidian distance classifier as below: Generate 1000 random points corresponding to each class out of 3 classes with feature size 2 for a 3-class classification problem. For the simplicity consider the classes following N([0 1 2], I), N([0 0 1], I) and N([1 0 0],I) respectively.

Generate the output an 1000-dimensional vector whose ith component contains the class where the corresponding vector is assigned, according to the minimum Euclidean distance classifier.

I understand that I should generate random points with two features which are belonging to one of the three classes - okay. But I don't get the second part of the sentence. The classes are normally distributes with a mean(?)-vector of [0, 1, 2], [0, 0, 1] and [1, 0, 0]?

For what does the second paramter I stand in the normal distribution
Does the vector stand for the position/mean of the multivariate normal distribution?
How would you approach this question? Using a k nearest neighbor algorithm?

Thanks for any helpful answers!

Max

2nd parameter is standard deviation. The output of the classifier is a long vector (1000 itens) each item represents the class of this input vector (the ith one) — Nikos M., May 06 '21 at 15:52

score 0 · Accepted Answer · answered May 06 '21 at 13:14

Clearly your exercice is not very clear, at least it is not to me.

I guess you should consider $I$ as the variance or std deviation of the variable, just make sure that is a parameter of your code so you can change it later in case it is not what you assumed it was. Check up in your course if it corresponds to something, but it doesn't seem anything familiar to me.

Not sure what your second question means, but here are the vectors that need to be created according to me :

Vector of points (1st question), each points lives in 2-D space (feature size).
Vector of classes (2nd question), this vector looks something like [1, 0, 2, 2, 1, 1, ...]. For this vector, it would mean that first value of your vector of points belongs to class 1, second value to class 0 , third to class 2, ...

As the exercise is asking about euclidian distance classifier, you just need to create an algorithm that takes a point, computes the euclidian distance with each class center, and classify it in the closest class. (Not sure that kind of classifier can be considered Machine Learning but w/e).

You probably noticed it, but the mean values given are 3-D vectors, which is quite weird as I said that our points live in 2-D space. I guess the last value of each mean vector is the class of the point generated (just my guess, not blaming your teacher, but as I said your exercise isn't very clear). So according to that interpretation, the classes would be distributed according to distribution :

[0, 1], I variance or std deviation for class 2.
[0, 0], I variance or std deviation for class 1.
[1, 0], I variance or std deviation for class 0.

Hope it helps, if you have any remaining questions, feel free to ask.

Machine Learning - Euclidian Distance Classifier exercise

1 Answers1