Flexible Pattern Matching in Strings: Practical On-Line Search Algorithms for Texts and Biological Sequences (English) Paperback – July 26, 2007
String matching problems range from the relatively simple task of searching a single text for a string of characters to searching a database for approximate occurrences of a complex pattern. Recent years have witnessed a dramatic increase of interest in sophisticated string matching problems, especially in information retrieval and computational biology. This book presents a practical approach to string matching problems, focusing on the algorithms and implementations that perform best in practice. It covers searching for simple, multiple and extended strings, as well as regular expressions, and exact and approximate searching. It includes all the most significant new developments in complex pattern searching. The clear explanations, step-by-step examples, algorithm pseudocode, and implementation efficiency maps will enable researchers, professionals and students in bioinformatics, computer science, and software engineering to choose the most appropriate algorithms for their applications.
'If you need efficient pattern matching for any kind of string then this is the only book I know that comes even close to providing you [with] the tools for the job.' The Journal of the ACCU
'I really enjoyed reading and studying this book. I am convinced it is a must-read, especially chapters 4 through 6, for anyone who is involved in the task of designing algorithms for modern string or sequence matching.' Computing Reviews
The main topics, chapter by chapter, are simple matching of one desired word to a string, matching of multiple words, two levels of complexity in wildcards and regular expressions, and approximate matching. A number of important and historical algorithms are discussed in each chapter, in great detail. There's pseudo-code for the most important algorithms. Quite a few also have examples worked in detail. The mechanics are tedious and somewhat bulky, but anyone actually trying to implement these techniques will appreciate the examples.
What's really interesting is what's not in this book. You won't find a lot of theory, and you won't find some of the most famous algorithms in string matching. The authors make it clear that this is about practical algorithms with efficient implementations. Lots of the algorithms beloved by theoreticians are impractically complex or just plain slow. Those may be mentioned in passing or as the base for more practical algorithms, but are not welcome on these pages.
It's not an easy read, but it's not a book for people with easy problems. It discusses tradeoffs, like when one technique works well for short strings but another works better on long strings. It addresses the different needs of English-language processing and bioinformatics - just the different numbers of letters in each alphabet make a difference, in some cases.
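The alphabet-size point can be made concrete with a small sketch. The code below is an illustrative implementation of Horspool's algorithm, a well-known simplification of Boyer-Moore; it is not taken from the book, and the function name `horspool_search` and the window counter are assumptions added for this example. The search skips ahead based on the last character of the current text window, so a larger alphabet (English text) tends to allow longer skips than a four-letter one (DNA), where most characters also occur in the pattern.

```python
def horspool_search(pattern, text):
    """Return (match positions, number of text windows examined).

    Illustrative sketch of Horspool's algorithm; the window counter is
    only there to make the cost of a search visible.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return [], 0

    # Shift table: for each character in pattern[:-1], the distance from
    # its last occurrence to the end of the pattern.
    shift = {}
    for i in range(m - 1):
        shift[pattern[i]] = m - 1 - i

    positions = []
    windows = 0
    pos = 0
    while pos <= n - m:
        windows += 1
        if text[pos:pos + m] == pattern:
            positions.append(pos)
        # Characters that do not occur in pattern[:-1] allow a full
        # shift of m positions.
        pos += shift.get(text[pos + m - 1], m)
    return positions, windows
```

For instance, `horspool_search("abc", "xxabcxxabc")` returns `([2, 7], 4)`: both occurrences are found after examining four windows rather than all eight possible alignments. On DNA-like text the shifts are usually shorter, because nearly every text character has an entry in the shift table.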
This is a good one for anyone who takes string processing seriously. There's no cut-and-paste code here, but plenty for a knowledgeable programmer to use. Even better, it offers references to the literature and to working code, and pointers to some books on related topics. I expect to get a lot of use out of this one.