4

As a researcher and instructor, I'm looking for open-source books (or similar materials) that provide a relatively thorough overview of data science from an applied perspective. To be clear, I'm especially interested in a thorough overview that provides material suitable for a college-level course, not particular pieces or papers.

IgorS
statsRus
  • 4
    List questions are usually not suited for Stack Exchange websites since there isn't an "objective" answer or a way to measure the usefulness of an answer. Having said that, one of my recommendations would be MacKay's "Information Theory, Inference, and Learning Algorithms." – Ansari May 14 '14 at 00:38
  • 3
    This question appears to be off-topic because it asks for a favorite resource. On other SE sites, this would immediately be closed. Since this is a new site, we still have to decide if this is a valid question here. – demongolem May 14 '14 at 01:16
  • Fair enough regarding what constitutes a "valid" question, although on other SE sites this question would **not** be immediately closed as you've stated: e.g., [2495 votes](http://stackoverflow.com/questions/194812/list-of-freely-available-programming-books), [1440 votes](http://stackoverflow.com/questions/1711/what-is-the-single-most-influential-book-every-programmer-should-read), [168 votes](http://tex.stackexchange.com/questions/11/what-is-the-best-book-to-start-learning-latex), and so on. There's great interest in these kinds of questions, even if this isn't deemed the right place. – statsRus May 14 '14 at 02:35
  • 1
    @statsRus: Try posting a question like that to SO, and it'll be closed; these questions exist because they have historical significance, but they are not considered good, on-topic questions for Stack Exchange sites, so **please do not use them as evidence that you can ask similar questions here.** – blunders May 15 '14 at 21:08

3 Answers

13

One book that's freely available is "The Elements of Statistical Learning" by Hastie, Tibshirani, and Friedman (published by Springer): see Tibshirani's website.

Another fantastic source, although it isn't a book, is Andrew Ng's Machine Learning course on Coursera. It has a much more applied focus than the above book, and Prof. Ng does a great job of explaining the thinking behind several different machine learning algorithms and situations.

Nick Peterson
  • 2
    Nice one, @Nicholas... Another book from Hastie and Tibshirani is [Introduction to Statistical Learning](http://www-bcf.usc.edu/~gareth/ISL/), which is a bit gentler of an entry compared to ESL. – buruzaemon May 14 '14 at 02:16
8

The Data Science Specialization from Johns Hopkins University on Coursera would be a great start: https://www.coursera.org/specialization/jhudatascience/1

IgorS
6

There is a free ebook "Introduction to Data Science" based on language