Data Analysis with Open Source Tools - Philipp Janert

Data Analysis with Open Source Tools does a great job of covering a lot of topics in a way that balances theoretical explanation and practical demonstration. True to its title, it identifies and explores a wealth of tools (and data sources).

Because the book balances explanation and demonstration, it can be read in two different ways. First, you can read the chapters without getting involved with the code to get a better understanding of the whys and hows of the different analysis techniques. On the other hand, if you are more of a brass-tacks person, you can focus on the code, run the examples, and just skim the explanations.

For those who are exploring the world of data analysis, this book is a great complement to Segaran's Programming Collective Intelligence and Russell's Mining the Social Web. Where the books overlap, the explanations and examples differ, which helps enormously when trying to master the concepts and techniques. However, each book contains topics not in the others. Collectively they offer a rather powerful set of tools.

Having read the other books prior to this one, I really appreciated the time spent on the mathematics behind each technique. The others get your hands dirty very quickly - and I appreciated that greatly when first exploring data mining - but I found myself wanting a deeper understanding, which this book so nicely provides. As Janert mentions in the first chapter, the succinct notation of mathematics is much clearer than having to try to extract the essence of twenty lines of source code. Without a doubt, though, Data Analysis is dense, and that might turn a few people off.

All said and done, I'm glad I took the time to read the book and will definitely keep it nearby.

Full information: Data Analysis with Open Source Tools by Philipp Janert, O'Reilly Media, Inc.

