Data Mining: Practical Machine Learning Tools and Techniques, Third Edition (The Morgan Kaufmann Series in Data Management Systems) (English) Paperback – 2011/1/20
"...offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations."
"Co-author Witten is the author of other well-known books on data mining, and he and his co-authors of this book excel in statistics, computer science, and mathematics. Their in- depth backgrounds and insights are the strengths that have permitted them to avoid heavy mathematical derivations in explaining machine learning algorithms so they can help readers from different fields understand algorithms. I strongly recommend this book to all newcomers to data mining, especially to those who wish to understand the fundamentals of machine learning algorithms."--INFORMS Journal of Computing
"The third edition of this practical guide to machine learning and data mining is fully updated to account for technological advances since its previous printing in 2005 and is now even more closely aligned with the use of the Weka open source machine learning, data mining and data modeling application. Beginning with an introduction to data mining, the volume explores basic inputs, outputs and algorithms, the implementation of machine learning schemes and in-depth exploration of the many uses of the Weka data analysis software. Numerous illustration, tables and equations are included throughout and additional resources are available through a companion website. Witten, Frank and Hall are academics with the department of computer science at the University of Waikato, New Zealand, the home of the Weka software project."--Book News, Reference & Research
"I would recommend this book to anyone who is getting started in either data mining or machine learning and wants to learn how the fundamental algorithms work. I liked that the book slowly teaches you the different algorithms piece by piece and that there are also a lot of examples. I plan on taking a machine learning course this upcoming fall semester and feel that the book gave me great insight that the course will be based on mathematics more than I had originally expected. My favorite part of the book was the last chapter where it explains how you can solve different practical data mining scenarios using the different algorithms. If there were more chapters like the last one, the book would have been perfect. This book might not be that useful if you do not plan on using the Weka software or if you are already familiar with the various machine learning algorithms. Overall, Data Mining: Practical Machine Learning Tools and Techniques is a great book to learn about the core concepts of data mining and the Weka software suite."--ACM SIGSOFT Software Engineering Notes
"This book is a must-read for every aspiring data mining analyst. Its many examples and the technical background it imparts would be a unique and welcome addition to the bookshelf of any graduate or advanced undergraduate student. The book is written for both academic and application-oriented readers, and I strongly recommend it to any reader working in the area of machine learning and data mining."--Computing Reviews.com
Ian H. Witten is a professor of computer science at the University of Waikato in New Zealand. He directs the New Zealand Digital Library research project. His research interests include information retrieval, machine learning, text compression, and programming by demonstration. He received an MA in Mathematics from Cambridge University, England; an MSc in Computer Science from the University of Calgary, Canada; and a PhD in Electrical Engineering from Essex University, England. He is a fellow of the ACM and of the Royal Society of New Zealand. He has published widely on digital libraries, machine learning, text compression, hypertext, speech synthesis and signal processing, and computer typography. He has written several books, the latest being Managing Gigabytes (1999) and Data Mining (2000), both from Morgan Kaufmann.
Eibe Frank lives in New Zealand with his Samoan spouse and two lovely boys, but originally hails from Germany, where he received his first degree in computer science from the University of Karlsruhe. He moved to New Zealand to pursue his Ph.D. in machine learning under the supervision of Ian H. Witten, and joined the Department of Computer Science at the University of Waikato as a lecturer on completion of his studies. He is now an associate professor at the same institution. As an early adopter of the Java programming language, he laid the groundwork for the Weka software described in this book. He has contributed a number of publications on machine learning and data mining to the literature and has refereed for many conferences and journals in these areas.
Mark A. Hall was born in England but moved to New Zealand with his parents as a young boy. He now lives with his wife and four young children in a small town situated within an hour's drive of the University of Waikato. He holds a bachelor's degree in computing and mathematical sciences and a Ph.D. in computer science, both from the University of Waikato. Throughout his time at Waikato, as a student and lecturer in computer science and more recently as a software developer and data mining consultant for Pentaho, an open-source business intelligence software company, Mark has been a core contributor to the Weka software described in this book. He has published a number of articles on machine learning and data mining and has refereed for conferences and journals in these areas.
- Publisher: Morgan Kaufmann; 3rd edition (2011/1/20)
- Publication date: 2011/1/20
- Language: English
- Paperback: 664 pages
- ISBN-10: 0123748569
- ISBN-13: 978-0123748560
- Dimensions: 19.05 x 3.81 x 23.5 cm
- Amazon Best Sellers Rank: 243,198 in Foreign Language Books
I use this book for a module at my university, and it is very reasonably priced (circa £21.00), which is affordable for a student book.
The chapters follow a logical order: data input, output representations, data mining algorithms for supervised (classification) and unsupervised (clustering) learning, and examples for Weka. It covers all the major models, from linear and statistical models to rule representations and decision trees, as well as basic algorithms such as 1R (OneR) and various clustering methods such as K-Means.
There are a few places where the methods are unclear or not followed through, and I had to spend some time replicating the book's numbers. Some of the examples seem to rely on coincidental numbers, which makes it harder to apply a technique to different data sets. For example, in decision tree induction with entropy as the splitting criterion, the book stops short of working the example through the deeper levels of the tree, so I had to replicate the numbers myself to understand the method.
The k-nearest neighbour chapter lacks depth: it does not seem to cover how Voronoi tessellation boundary diagrams are created and jumps straight into kd-trees and so on. It could use more introduction and more coverage of KNN.
On association rule mining, the book is not really clear on support and confidence, at least compared with other books on the subject that state the calculations more clearly. It could have better and more varied examples that apply the methods to differing data sets.
A worthwhile book to have on your bookshelf for machine learning applied to data mining.
This volume hits the mark for the practitioner who wishes to come up to speed on the tools and techniques for data mining currently available.
Beginning with core introductory concepts, the text moves on to more advanced topic areas in a readable format, without all the Greek notation and mathematical proofs that academic writers clutter pages with. The text is nicely separated into three distinct parts: introduction, advanced topics, and the software application.
As with all learning, seeing the forest rather than every tree will greatly help with the topic areas covered in each part of the book. If the book is read in chapter sequence, the references to other chapters can be ignored on a first pass or used for reference when re-reading topic areas.
The text provides a bonus: a detailed overview of the WEKA software, which is a free download. The software is a very useful tool for exploring the concepts discussed in the text. It is both functional and easily learned, with few of the bugs normally associated with open source software.
Beginning students may find some of the topic areas too advanced, but working through the text from start to finish will allow for the development of useful skills in this evolving area of machine learning.
For those whose academic past includes advanced statistics, the coverage will be a useful review leading into the application of previously learned tools to data mining.
This text was a pleasant surprise and is a good reference book for the practitioner's shelf.
I am fairly new to the field, but I found this book right for my approach: seeing the practical side of things first (understanding what they are for) and then moving on to the theory. The book certainly covers the practical side, with many examples that run through the whole explanation (introduced at the start and then revisited continually) and little theory, not to say practically none. I already had some basics to start from, but this book goes much deeper. Overall the material is fairly clear and can be understood almost entirely in a single reading; few passages require "intense" effort, if I can call it that.
On the software side, the book uses only Weka, which I do not know, just as I do not know any other similar program ("I am fairly new [...]"), but since the authors of the book are also the developers of the software, I would say that is an excellent reason to recommend the book.
I have no idea whether the software is actually used in business settings, but just having some code in front of you to start from is a great thing.