Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems Paperback – April 11, 2017
Data is at the center of many challenges in system design today. Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords?
In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain the same. With this book, software engineers and architects will learn how to apply those ideas in practice, and how to make full use of data in modern applications.
- Peer under the hood of the systems you already use, and learn how to use and operate them more effectively
- Make informed decisions by identifying the strengths and weaknesses of different tools
- Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity
- Understand the distributed systems research upon which modern databases are built
- Peek behind the scenes of major online services, and learn from their architectures
Martin is a researcher in distributed systems at the University of Cambridge. Previously he was a software engineer and entrepreneur at Internet companies including LinkedIn and Rapportive, where he worked on large-scale data infrastructure. In the process he learned a few things the hard way, and he hopes this book will save you from repeating the same mistakes.
Martin is a regular conference speaker, blogger, and open source contributor. He believes that profound technical ideas should be accessible to everyone, and that deeper understanding will help us develop better software.
Nowhere else, perhaps, is this more prominent than in the data space, which promotes libraries and frameworks as the conversation starter. That gets in the way of success. It is indeed impossible to model Cassandra "tables" without understanding, at the very least, quorums, compaction, and log-structured merge trees. Because present-day solutions are each built to fit one use case perfectly well, a solution that is poorly matched to the particular domain means failure is just a release away.
Mr. Kleppmann does a great job of articulating the "systems" aspects of data engineering. He starts by building a database from four functional lines of code and works up to how one can interpret and implement concurrency, serializability, isolation, and linearizability (the latter for distributed systems). His book also has over 800 pointers to state-of-the-art research as well as some of computer science's classic papers. The pace slows in the chapter on distributed systems and in the final chapter. A good editor could have trimmed about 120 pages and still retained most of the value one could get from the book.
That said, if you have ever worked on data systems, especially across paradigms (IMS -> RDBMS -> NoSQL -> Map-Reduce -> Spark -> Streaming -> Polyglot), this book is pretty much the only resource out there that ties up the "loose ends" and paints a coherent narrative. Highly recommended!
If you are interested in distributed systems or scalability, this book is a must-read for you. It gives you a high-level understanding of different technologies, including the ideas behind them, their pros and cons, and the problems they are trying to solve. A great book for practitioners who want to learn all the essential concepts quickly.
I didn't come from a traditional CS background, but I did have some basic knowledge of hardware and data structures. You will need some of that, such as hard disks vs. SSDs and AVL trees, to understand the material. If you are completely new to backend or DS, you may want to start with another book, "Web Scalability for Startup Engineers." After that book, you can read the free article "Distributed Systems for Fun and Profit" and you are good to go for this amazing book :D
Kleppmann has coherently blended the relevant computer science theory with modern use cases and applications. The focus is primarily on the core principles and thought processes that one must apply when building data services. Design concepts don't go out of date quickly, so the book has a very long shelf life.
The high-point of this book is the author's lucid prose, which indicates mastery of the subject matter and clarity of thought. Conceptualizing reality is an art and the author really shines here. You’ll find that whenever you have a question after reading a particular sentence, the answer to that will be found in the upcoming sentences. It’s like mind-reading.
Also kudos to the author for those nice diagrams and interesting maps (and for avoiding mathematical formulas with Greek symbols). The bibliography at the end of each chapter is thorough enough for unending personal research.
If you are working in or interviewing for big data engineering, systems design, cloud consulting, or DevOps/SRE, then this book is a keeper for a long, long time.