
Spidering Hacks (English) Paperback – 2003/11

1 customer review

Product Description

Book Description

The Internet, with its profusion of information, has made us hungry for ever more, ever better data. Out of necessity, many of us have become pretty adept with search engine queries, but there are times when even the most powerful search engines aren't enough. If you've ever wanted your data in a different form than it's presented, or wanted to collect data from several sites and see it side-by-side without the constraints of a browser, then Spidering Hacks is for you.

Spidering Hacks takes you to the next level in Internet data retrieval--beyond search engines--by showing you how to create spiders and bots to retrieve information from your favorite sites and data sources. You'll no longer feel constrained by the way host sites think you want to see their data presented--you'll learn how to scrape and repurpose raw data so you can view it in a way that's meaningful to you.

Written for developers, researchers, technical assistants, librarians, and power users, Spidering Hacks provides expert tips on spidering and scraping methodologies. You'll begin with a crash course in spidering concepts, tools (Perl, LWP, out-of-the-box utilities), and ethics (how to know when you've gone too far: what's acceptable and unacceptable). Next, you'll collect media files and data from databases. Then you'll learn how to interpret and understand the data, repurpose it for use in other applications, and even build authorized interfaces to integrate the data into your own content.

By the time you finish Spidering Hacks, you'll be able to:

  • Aggregate and associate data from disparate locations, then store and manipulate the data as you like
  • Gain a competitive edge in business by knowing when competitors' products are on sale, and comparing sales ranks and product placement on e-commerce sites
  • Integrate third-party data into your own applications or web sites
  • Make your own site easier to scrape and more usable to others
  • Keep up-to-date with your favorite comic strips, news stories, stock tips, and more without visiting the site every day

Like the other books in O'Reilly's popular Hacks series, Spidering Hacks brings you 100 industrial-strength tips and tools from the experts to help you master this technology. If you're interested in data retrieval of any type, this book provides a wealth of data for finding a wealth of data.

About the Authors

Kevin Hemenway, coauthor of Mac OS X Hacks, is better known as Morbus Iff, the creator of disobey.com, which bills itself as "content for the discontented." Publisher and developer of more home cooking than you could ever imagine, he'd love to give you a Fry Pan of Intellect upside the head. Politely, of course. And with love.

Tara Calishain is the creator of the site ResearchBuzz. She is an expert on Internet search engines and how they can be used effectively in business situations.

Product Details

  • Paperback: 402 pages
  • Publisher: O'Reilly & Associates Inc (2003/11)
  • Language: English
  • ISBN-10: 0596005776
  • ISBN-13: 978-0596005771
  • Release date: 2003/11
  • Product dimensions: 15.2 x 2.5 x 22.9 cm
  • Average customer review: 5.0 out of 5 stars (1 customer review)
  • Amazon Bestsellers Rank: 298,421 in Foreign Books


Customer Reviews

5.0 out of 5 stars

  • 5 stars: 1
  • 4 stars: 0
  • 3 stars: 0
  • 2 stars: 0
  • 1 star: 0

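Most Helpful Customer Reviews

3 of 6 people found this review helpful. By びーぐる on 2003/12/28
Format: Paperback
The book covers many modules I had never heard of, and I learned a great deal from it. It is packed with techniques that would have saved me a lot of trouble had I known them sooner. A translation will probably come out, but I bought the original because it looks useful for a program I am currently developing. I don't think the English is especially difficult. Like the other books in the Hacks series, it is divided into 100 small Hacks (techniques?), each independent of the others, so I think it is easy to read.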

Most Helpful Customer Reviews on Amazon.com (beta)

Amazon.com: 19 customer reviews
32 of 33 people found this review helpful.
Good book with a light start 2004/2/15
By A Williams - (Amazon.com)
Format: Paperback
The `Hacks' series from O'Reilly seems to be breeding as fast as viruses in a Windows network - every time you turn around there is another one. While the quality of the writing and editing has remained high, some, such as `eBay Hacks', have not really had great material. `Spidering Hacks' is an improvement, almost back to the quality I remember in the last contribution from Calishain, `Google Hacks'.
She and Kevin Hemenway have taken a fairly complex topic, spidering and scraping web sites and reduced it to manageable chunks in their hundred hacks. The writing has the same light, readable feel you can quickly grow to expect from O'Reilly. Certainly I have never found myself faulting their editing.
There are some caveats. It seems that O'Reilly and Dornfest (the Editor of this book and the series) have fallen in love with having a hundred hacks and little in the way of an introduction. I think this might have been a better book with 90 `hacks' and a much larger introduction: the first chapter's hacks are all too light and are really introductory material, such as how an HTML page is built and how to properly register your spider. Given that only someone with a fair amount of web knowledge is going to consider spidering a website in the first place, this early material is way too slight. From Hack 9 on it quickly gets down to useful and informative chunks in each and no longer feels `lightweight'.
This may be a reflection on trying to extend the `Hacks' series into places it has to be forced. While the format worked well for Google and Amazon I felt the entire topic of eBay too light for a topic in this series and perhaps spidering is too heavy or complex. If this book had been written in a more traditional format some of my complaints would disappear.
All the examples are in Perl and the serious part of the book starts with examples using LWP::Simple to grab a page before going on to LWP::UserAgent and much more complex requests using authentication, custom headers and posting form data. It also covers using curl and wget.
Then it gets down to the nitty gritty of scraping using HTML::TreeBuilder and HTML::TokeParser. This is all further expanded through the next few hacks until, starting at Hack 39 and running through to 89, there is a good series of examples (perhaps a few too many). Finally there are two chapters on maintaining your collection and `Giving Back To The World', which tells how to make it easy to scrape your site and how to use RSS.
O'Reilly have a page for the book with ten example hacks, index, Table of Contents and errata and you can also visit hacks.oreilly.com for the same ten hacks with the possibility of more being added.
As a whole this volume seems a little thin. If you've been doing the maths then you've realised that only about thirty of the hundred hacks actually give any details on building and running a serious web spider. Sure, a number of the examples provide good information on how to perform various tasks and some of the last eleven hacks are good to know but in all the book feels like it lacks solid information throughout. A bit more information on various crawling and page parsing techniques would have been good.
After that criticism I now surprise myself: I'm going to recommend this book. This isn't a large field, and when you consider that most other books on writing spiders and crawlers are less than practical and more than expensive, "Spidering Hacks" has many good points. It's written for the practical Perl programmer, it examines several methods and gives lots of examples, and while not cheap it's certainly inexpensive. Given that I found it both useful and inspiring, the complaints above may be a little like nitpicking. I should also say that I found this volume immensely useful in writing my own spider and scraper (it gets a list of new books from the web sites of several publishers). I have to be honest and admit that there are three publishers, O'Reilly, Addison Wesley and Prentice Hall, from whom I expect a decent standard and whom I criticise a little harder when they move from that norm. If this book had come from SAMS or Wrox I may well have not looked quite so hard for flaws and been a little more generous in my treatment of the ones I found.
That said, I recommend this book to you if you want a practical introduction to building a web spider in Perl.
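
To give a feel for the token-based scraping the review attributes to HTML::TokeParser, here is a minimal sketch. It is not code from the book, which uses Perl; it is a rough analogue written with Python's standard-library `html.parser`, and the names `LinkExtractor`, `extract_links` and `SAMPLE`, as well as the example URLs, are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect (absolute URL, link text) pairs from <a href=...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the anchor currently being read, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html, base_url):
    """Parse an HTML string and return its links in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page fragment, standing in for a fetched publisher page.
SAMPLE = (
    '<p><a href="/catalog/new.html">New titles</a> and '
    '<a href="http://example.org/">elsewhere</a></p>'
)

if __name__ == "__main__":
    for url, text in extract_links(SAMPLE, "http://example.com/index.html"):
        print(url, text)
```

In a real spider the HTML string would come from an HTTP fetch; the sketch keeps the parsing step separate so it can be tried without network access.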
24 of 24 people found this review helpful.
Many examples of how to use spiders 2004/4/9
By W Boudville - (Amazon.com)
Format: Paperback
The book has a nice collection of case studies on how to gather data from disparate websites. You might consider this as showing a simple way for you to use Web Services.
Spidering is the way that search engines gather their data. But you do not have to be Altavista or Google to use spiders. Nor do you have to be scanning a large fraction of the Web. The authors demystify spiders. If you can follow their examples, then you get concrete instances of usage that might help your particular application.
Thoughtfully, the examples are mostly written in Perl, with a few in Java. These languages should be familiar to many. Though even if you don't know them, the logic of the code can still be useful. (That is, you can treat the code as pseudocode.)
While spiders are probably best known as being used by search engines, they are really only the starting point for the latter. The much harder problems start when you have the data amassed by a spider. Now you have to efficiently find correlations between the various web pages. You should be aware that the book does not discuss these with any significant depth. Not surprising, because these are outside the scope of the book. The examples do show how to use the data found by spiders. But most of these are for web pages that sit in a given domain. So the pages are closely affiliated in content and structure.
18 of 19 people found this review helpful.
Lots of great ideas 2004/3/23
By Jack D. Herrington - (Amazon.com)
Format: Paperback
Once in a long while you get a book that inspires you with a lot of great small ideas. Spidering Hacks is just that type of book. The web has a wealth of structured and semi-structured data that is just waiting to be mined with automated tools. This book not only teaches you how to get the data out of these sources, but gives you ideas about where to look for information and what to do with it.
This book demonstrates everything I like in a technical book. It not only describes how things are done. It also gives practical examples of how the technology can be useful in the real world, and presents them enthusiastically. It makes you want to go out and implement all of the ideas and to keep on going with some of your own.
Nitpicks I have with the book are minor. The 'Hacks' format seems imposed; for example, hack #8 is about installing CPAN. I don't think that section should be left out, but I don't think it's a hack either. But hey, I don't care that much about the structure as long as it isn't an imposing flaw and the content within the structure is great, as it is with this book.
Have to say, O'Reilly is on a roll with the Hacks series. They have all been fine books.
15 of 17 people found this review helpful.
Great Book 2004/1/6
By A Customer - (Amazon.com)
Format: Paperback
Are you ready to be the next Google? It is widely known that Google pulled out in front of (and largely obsoleted) major search engine players like Altavista and Yahoo largely because of Google's highly accurate search results -- you find what you search for. They are so confident in their search engine spiders they even have a "I'm feeling lucky" button to transport you to the first search result found -- it's arrogance, but well deserved arrogance. In a sentence, Google works.
Enter Kevin Hemenway and Tara Calishain's latest O'Reilly book: Spidering Hacks. Continuing in the O'Reilly "Hacks" tradition, this comprehensive guidebook provides a hundred clear, useful tools for designing and implementing the next generation -- or maybe just your own customized -- spider (or bot, if you prefer).
So why build your own spider? Well, if you have a large website, your spider could check link integrity, HTML standards and check meta-tags. If you are researching a topic and Google is not returning what you want, creating your own spider might be just what you need. This handy book (with examples in Perl) will show you how to:
* Create a site-friendly bot that won't get you banned by webmasters (Hack #16 -- Respecting your Scrapee's Bandwidth, and Hack #17 -- Respecting robots.txt)
* Interested in graphics, audio and video? Hacks #33 through #42 step you through collecting media files. Specific examples including scraping films from [...] (Hack #24), gathering movies from the Library of Congress (Hack #35) and archiving images from Webshots. You'll have your own personalized library in no time.
* Weblog-Free Google Results -- Weblogs (aka Blogs) are amazingly popular these days. With Google's pagerank algorithm, that means they get heavy emphasis in your search results. Hack #50 skims down the search results by eliminating those annoying Blogs.
In addition, you'll find multiple hacks covering Amazon.com and RSS Feeds. The book includes much information regarding spider automation (e.g. Cron jobbing your spiders). You'll find content filtering and even a hack using PHP code (Hack #84).
This book is extraordinarily helpful and is a great resource for any Perl hacker. I highly recommend it to any computer hobbyist interested in data mining, spidering and scraping. Well done, O'Reilly!
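
The two courtesy hacks the review mentions, respecting robots.txt and respecting the site's bandwidth, can be illustrated with a short sketch. This is not the book's Perl code; it is an assumed Python analogue using the standard-library `urllib.robotparser`. The helper names `build_rules` and `polite_urls`, the bot name `ExampleBot`, and the sample rules are all hypothetical.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real spider would download this
# from the target site before requesting anything else.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_rules(robots_txt):
    """Parse robots.txt text into a rule set that can be queried per URL."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def polite_urls(rules, user_agent, urls, delay_seconds=1.0, sleep=time.sleep):
    """Yield only the URLs robots.txt permits, pausing between them.

    The pause spares the site's bandwidth; `sleep` is injectable so the
    function can be exercised without actually waiting.
    """
    first = True
    for url in urls:
        if not rules.can_fetch(user_agent, url):
            continue  # disallowed for this user agent: skip it entirely
        if not first:
            sleep(delay_seconds)
        first = False
        yield url


if __name__ == "__main__":
    rules = build_rules(ROBOTS_TXT)
    candidates = [
        "http://example.com/index.html",
        "http://example.com/private/notes.html",
        "http://example.com/news.html",
    ]
    for url in polite_urls(rules, "ExampleBot", candidates, delay_seconds=0.1):
        print("would fetch", url)
```

The one-second default delay is an arbitrary illustrative value, not a recommendation from the book.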
8 of 8 people found this review helpful.
Example-filled and easy-to-follow 2004/3/7
By Midwest Book Review - (Amazon.com)
Format: Paperback
The knowledgeable collaboration of Kevin Hemenway and Tara Calishain, Spidering Hacks: 100 Industrial-Strength Tips & Tools is an extensive, 402-page instructional guidebook and reference to Internet data retrieval through the use of spiders and scrapers. Including information on methodology, philosophies, and ethical considerations, as well as freely available modules, scripts, frameworks, and templates, information on how to build alternative interfaces to online databases, how to keep one's data current and share it in a user-friendly manner, and so much more, Spidering Hacks is an example-filled, easy-to-follow, highly recommended computer shelf resource.