Data Mining with Decision Trees: Theory and Applications (Machine Perception and Artificial Intelligence) (English) Hardcover – 2007/12/17
This is the first comprehensive book dedicated entirely to the field of decision trees in data mining and covers all aspects of this important technique. Decision trees have become one of the most powerful and popular approaches in knowledge discovery and data mining, the science and technology of exploring large and complex bodies of data in order to discover useful patterns. The area is of great importance because it enables modeling and knowledge extraction from the abundance of data available. Both theoreticians and practitioners are continually seeking techniques to make the process more efficient, cost-effective and accurate. Decision trees, originally implemented in decision theory and statistics, are highly effective tools in other areas such as data mining, text mining, information extraction, machine learning, and pattern recognition. This book invites readers to explore the many benefits in data mining that decision trees offer:
The first thing you notice about this book is its very academic style. It has numbered paragraphs like 2.0 and 18.104.22.168. It has been used as a graduate text, presumably for mathematicians and computer scientists, and I think it would be good for that purpose. It could also work quite well for statisticians who are interested in the details of data mining algorithms. It is part of a series on Machine Perception and Artificial Intelligence. Other titles include "Fundamentals of Robotics" and "Bridging the Gap Between Graph Edit Distance and Kernel Machines", so don't confuse this book with something like Data Mining Techniques, which is written for a general audience. It opens the 2nd chapter with (condensed): "A training set is a bag instance of a bag schema. A bag instance is a collection of tuples that may contain duplicates." The folks that I work with can instantly divide themselves into those who would consider a book like this and those who wouldn't. It cites references in almost every sentence, which can be distracting to the casual reader, and eventually convinced me that I need to read the original authors, like Breiman (Classification and Regression Trees).
So, having issued a warning, there is plenty to like. The authors have made a real attempt to cover everything - I found 1/3 that I knew, 1/3 that will be quite useful to me, and 1/3 that is too much detail for me. Chapter 3, "Evaluation of Classification Trees", will be great for statisticians who have wondered how to judge the efficacy of a tree that was built without hypothesis testing. Also, I was very pleased to see a chapter on "Decision Forests", which is a discussion of "ensemble methods" - in other words, combining a set of tree models.
I was hoping for something that would have a detailed chapter on each of the most common decision tree algorithms, with briefer sections on the obscure ones. It has all this information, but in a way that makes me work pretty hard to get to it. If you want a quick overview of data mining (even if you think that trees are the method you are going to use), try Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management. If you want to know the details, but are content to learn them only for the well-known techniques (like CHAID and CART), then Larose's Discovering Knowledge in Data: An Introduction to Data Mining is a good choice.
If you're looking for a collection of organized references to important papers on the topic of decision trees and you've access to the archives of the cited journals, then this book is useful as a jumping-off point to see how the various papers relate. If you're looking for a standalone book on the topic, look elsewhere.