The Hypertext Transfer Protocol (HTTP) is the protocol that web servers and web browsers use to exchange information. Java lets you program HTTP directly, so you can write programs that access the web much as a human user would. These programs, called bots, can collect information or automate common web tasks. This book presents a collection of reusable recipes for Java bot programming.
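To give a rough sense of what programming HTTP directly can look like, the sketch below builds the text of a minimal GET request and reads the status code out of a response status line. It is only an illustration, not code from the book: the class name `HttpRequestSketch`, the method names, the `ExampleBot/0.1` user agent, and the host `www.example.com` are all assumptions chosen for the example.

```java
// A minimal sketch, not taken from the book. All names here are hypothetical.
class HttpRequestSketch {

    // Builds the text of a minimal HTTP/1.1 GET request.
    // Each header line ends with CRLF, and a blank line ends the header block.
    static String buildGetRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "User-Agent: ExampleBot/0.1\r\n"   // assumed bot name
             + "Connection: close\r\n"
             + "\r\n";
    }

    // Extracts the numeric status code from a status line
    // such as "HTTP/1.1 200 OK".
    static int parseStatusCode(String statusLine) {
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Not a status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        // Show the request text a bot might send, then parse a sample reply line.
        System.out.print(buildGetRequest("www.example.com", "/"));
        System.out.println(parseStatusCode("HTTP/1.1 200 OK"));
    }
}
```

A real bot would write this request text to a `java.net.Socket` connected to the server, or let a higher-level class such as `java.net.HttpURLConnection` build and send the request for it.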
This book also introduces the Heaton Research Spider, an open source spider framework. With it, you can create spiders that crawl a web site, much as a real spider crawls its web. The Heaton Research Spider is available for both Java and Microsoft .NET.
Table of Contents
- The Structure of HTTP Requests
- Examining HTTP Traffic
- Introducing the Java HTTP Classes
- Beyond Simple Requests
- Secure HTTP Requests
- Extracting Data