Suppose you have an array of n normally distributed numbers whose values are initially unknown (the parameters of the distribution, i.e. the mean and variance, are unknown too). You must choose exactly one number, and you want it to be as large as possible. You examine the numbers one at a time. You may only choose a number right after it is examined (that is, once we learn its value) and before the next number is examined. What strategy would you use to maximize the expected value of the chosen number?
My intuition and thinking:
We can use the numbers seen so far to estimate the distribution. We can then calculate the expected value of the maximum of the remaining numbers, and if it is less than the current number, we take the current number. First of all, is my thinking correct at all? Second, can someone help me with the actual calculations? I have only taken one undergraduate probability course so far. Thank you!
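
To make the idea concrete, here is a rough Python sketch of the heuristic described above. It is only an illustration under my own assumptions, not a claim that the rule is optimal: the numbers are treated as independent draws from one normal distribution, the mean and standard deviation are estimated from the values seen so far (including the current one), and the first value is never taken because a single sample gives no estimate of the spread. The function names (`expected_max_std_normal`, `choose`, `simulate`) are made up for this post. The expected maximum of k standard normals is computed by numerically integrating x * k * pdf(x) * cdf(x)^(k-1).

```python
import random
from functools import lru_cache
from statistics import NormalDist, mean, stdev

_STD = NormalDist()  # standard normal, mean 0 and standard deviation 1


@lru_cache(maxsize=None)
def expected_max_std_normal(k, lo=-8.0, hi=8.0, steps=4000):
    """Approximate E[max of k independent standard normals] with the
    trapezoidal rule applied to x * k * pdf(x) * cdf(x)**(k - 1)."""
    if k <= 0:
        return float("-inf")
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        f = x * k * _STD.pdf(x) * _STD.cdf(x) ** (k - 1)
        total += f * (0.5 if i in (0, steps) else 1.0)
    return total * h


def choose(values, min_seen=2):
    """Apply the heuristic to a list of numbers examined in order.

    Assumption: values are independent draws from one normal distribution
    whose mean and standard deviation are unknown and estimated on the fly.
    """
    n = len(values)
    for i, x in enumerate(values):
        remaining = n - i - 1
        if remaining == 0:
            return x  # last number: we are forced to take it
        seen = values[: i + 1]
        if len(seen) < min_seen:
            continue  # not enough data to estimate the spread yet
        mu_hat = mean(seen)
        sigma_hat = stdev(seen)  # sample standard deviation
        # Estimated expected maximum of the numbers not yet examined.
        threshold = mu_hat + sigma_hat * expected_max_std_normal(remaining)
        if x >= threshold:
            return x
    return values[-1]


def simulate(n=20, trials=2000, mu=0.0, sigma=1.0, seed=0):
    """Return (average chosen value, average true maximum) over many trials."""
    rng = random.Random(seed)
    picked = best = 0.0
    for _ in range(trials):
        values = [rng.gauss(mu, sigma) for _ in range(n)]
        picked += choose(values)
        best += max(values)
    return picked / trials, best / trials
```

Calling `simulate(n=20, trials=2000)` returns the average value this rule picks next to the average true maximum, which gives a rough sense of how much the heuristic leaves on the table and a baseline to compare other stopping rules against.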