
Rigor Theory. I wish to learn the scientific method and how to apply it in machine learning: specifically, how to verify that a model has captured the pattern in the data, and how to rigorously reach conclusions based on well-justified empirical evidence.

Verification in Practice. My colleagues in both academia and industry tell me that measuring the model's accuracy on test data is sufficient, but I am not confident that this criterion is enough.
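For illustration, here is the kind of quick check that feeds my doubt (a minimal sketch of my own, using scikit-learn on a synthetic dataset): the same model evaluated on different random train/test splits of the same data can give noticeably different accuracy numbers, so a single test-set figure may say less than it appears to.

    # Minimal sketch (my own illustration): the same model, evaluated on different
    # random train/test splits of the same data, can give noticeably different
    # accuracy numbers. Synthetic data and logistic regression are placeholders.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    accuracies = []
    for seed in range(10):
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=seed)
        model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        accuracies.append(accuracy_score(y_te, model.predict(X_te)))

    print("accuracy per split:", np.round(accuracies, 3))
    print("mean +/- std: %.3f +/- %.3f" % (np.mean(accuracies), np.std(accuracies)))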

Data Science Books. I have picked up several data science books, like Skiena's manual, Dell EMC's book, and Waikato's data mining book. Each has a section on diagnosing the model and measuring results, but my instinct is that these are heuristics rather than rigorous methods.

Scientific Method Books. Searching for the scientific method, I found Statistics and Scientific Method: An Introduction for Students and Researchers and Principles of Scientific Methods, which seem to address the crux of my question. I am planning to study both of them.

My Questions. Here are a couple of questions I hope to get guidance on from your wonderful community.

  • Is it feasible to rigorously apply the scientific method in machine learning applications such as recommendation engines or the social sciences, or has our scientific and technological maturity not yet reached that point, so that the best we can hope for is heuristics-based approximation?
  • Is it feasible to apply the scientific method to machine learning in industry practice, or do industry leaders prefer cheap heuristics in order to minimize a project's costs?
  • Are the scientific method books I mentioned above useful for enhancing my own skills in machine learning? Are they worth the effort and time?
  • Are you aware of better alternative resources for learning the scientific method? Are there more helpful courses or recorded lectures?
  • Do you have any recommendations or advice, while studying the scientific method, for someone who is mainly motivated by industrial machine learning applications such as recommendation engines and logistics optimization?
  • I love this question. I feel like I'm having to discover for myself how to do rigorous work in this respect; my department didn't prepare me for that particular aspect of data science. Personally, I've been using tools like Benchling, designed for experimental biologists, to record my ML experiments. – John Madden Mar 21 '22 at 18:53
  • That is a lot of questions... – Peter Mortensen Mar 21 '22 at 20:56

1 Answer


I think it's an excellent idea to acquire and apply a solid scientific basis in general and in ML in particular. Here are a few comments:

  • The scientific method is a general set of good principles for obtaining reliable scientific conclusions. It is not especially precise, and it's not always clear whether it has been applied correctly in a specific case; there is no binary way to say whether a study satisfies the scientific method or not.
  • Keep in mind that what is considered scientifically valid evolves over time. For example, various shortcomings of significance testing have been demonstrated in recent years.
  • In science, and in statistics in particular, the main point is usually not to prove something with 100% confidence (virtually never possible) but to quantify the confidence in some reliable way. A common mistake is to expect an ML prediction to be 100% reliable: of course this is not possible (it wouldn't be statistical learning otherwise), but it is possible to measure how likely a prediction is to be correct, to some extent (see the first sketch after this list).
  • So the level of maturity of the field doesn't matter: solid scientific principles can be applied to any existing method. This even applies to heuristics: it's just a matter of estimating their performance reliably.
  • Last point: let's not forget that the scientific method is not a rigid set of formal rules to apply systematically; it's also about questioning whether a particular dataset or approach is suited to the goal. Another common mistake I see is applying a specific kind of evaluation to a system without asking whether that evaluation truly measures the task the system is supposed to do (see the second sketch after this list).
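To make the "quantify the confidence" point concrete, here is a minimal sketch of my own (the test-set labels and predictions below are hypothetical placeholders): instead of reporting a single test accuracy, a bootstrap over the test set gives an interval that conveys how much that number could plausibly vary.

    # Bootstrap confidence interval for test-set accuracy: resample the test set
    # to estimate how much the single accuracy number could vary.
    # y_true / y_pred are hypothetical placeholders for real test-set results.
    import numpy as np

    rng = np.random.default_rng(0)
    y_true = rng.integers(0, 2, size=200)                          # hypothetical labels
    y_pred = np.where(rng.random(200) < 0.85, y_true, 1 - y_true)  # ~85% correct

    correct = (y_true == y_pred).astype(float)
    boot_accs = [correct[rng.integers(0, len(correct), size=len(correct))].mean()
                 for _ in range(2000)]
    low, high = np.percentile(boot_accs, [2.5, 97.5])
    print("accuracy = %.3f, 95%% bootstrap CI = [%.3f, %.3f]"
          % (correct.mean(), low, high))

And for the last point about the evaluation measuring the right task, another hedged example: on a heavily imbalanced problem, a model that always predicts the majority class can reach a high accuracy while being useless at the actual task of finding the rare class, so accuracy alone would be the wrong evaluation there.

    # On an imbalanced problem, plain accuracy can look good for a useless model
    # that always predicts the majority class; a metric tied to the actual task
    # (here, recall on the rare class) tells a very different story.
    import numpy as np
    from sklearn.metrics import accuracy_score, recall_score

    rng = np.random.default_rng(0)
    y_true = (rng.random(1000) < 0.02).astype(int)  # ~2% positives: the class we care about
    y_pred = np.zeros_like(y_true)                  # "model" that always predicts the majority

    print("accuracy:", accuracy_score(y_true, y_pred))           # high, looks fine
    print("recall on rare class:", recall_score(y_true, y_pred)) # 0.0, useless for the task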

(I don't have any specific book recommendations.)

Erwan