While reading introductory texts on machine learning, I found that my statistics background was not sufficient. Classical statistics is not used much in today’s machine learning: hypothesis tests, sampling distributions, and p-values are rarely mentioned. Instead, people talk about priors and posteriors. This book was recommended to me, so I gave it a try during winter vacation.

Overall, the book was a good experience. It’s targeted at people with some exposure to statistics: those who have had to deal with data and wanted to make sense of it at some point. It delivers what it promises, and the material is very accessible.

The first half of the book is the part worth reading. The latter half is not very well written, possibly because more sophisticated Bayesian methods had not yet been developed when it was written, and because the author prefers analytic solutions to numerical computation.

Given those caveats, this is a MUST read for those who have experience with only one of the two approaches, frequentist or Bayesian. The book compares the two while solving several classical statistical problems in a clear and concise manner.

Will I convert to Bayesian statistics after reading this book? Yes and no.

Bayesian statistics models our beliefs explicitly. It is well suited to machine learning tasks, and also to thorough statistical studies and surveys. However, it’s hard to come up with a reasonable prior. For an amateur statistician, it makes more sense to just take an off-the-shelf procedure that gives p-values and whatever other measures are commonly accepted in publications. Choosing a prior is just not worth the hassle. And the frequentist approach gives formulas that are easier to explain. You prefer k/n to (k+1)/(n+2), because the former is simpler and easier to understand for people with no statistical background.
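To make the k/n versus (k+1)/(n+2) comparison concrete, here is a minimal sketch. It assumes that (k+1)/(n+2) is the posterior mean of a success probability under a uniform Beta(1, 1) prior (Laplace’s rule of succession), which is the usual source of that formula. The function names and the `alpha`/`beta` parameters are my own, chosen for illustration.

```python
from fractions import Fraction


def frequentist_estimate(k, n):
    """Observed proportion: k successes out of n trials, i.e. k/n."""
    return Fraction(k, n)


def bayesian_estimate(k, n, alpha=1, beta=1):
    """Posterior mean of the success probability under a Beta(alpha, beta) prior.

    With the default uniform prior Beta(1, 1) this is (k + 1) / (n + 2).
    """
    return Fraction(k + alpha, n + alpha + beta)


if __name__ == "__main__":
    # Compare the two estimates on a few (k, n) pairs, small and large.
    for k, n in [(0, 5), (3, 10), (300, 1000)]:
        print(k, n, frequentist_estimate(k, n), bayesian_estimate(k, n))
```

With zero successes in five trials, k/n says the probability is exactly 0, while (k+1)/(n+2) gives 1/7. As n grows, the two estimates get closer, which is part of why the simpler formula is usually good enough for a lay audience.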

When forced to compare the frequentist and Bayesian approaches, one is constantly reminded of one’s own philosophy. It seems that, after all, humans are always subjective. As long as you are only calculating numbers, you are on the safe side of mathematics. But once you try to make sense of those numbers, you start to impose assumptions. Reasonable assumptions lead to reasonable conclusions, and unreasonable ones do not. Those who hide their heads in the sand, thinking that numbers can give them a definitive answer, are wrong. To sum up, I’d like to say:

Lies, damned lies, and statistics
