HtmlUnit

HtmlUnit is a "GUI-Less browser for Java programs".

Overview

HtmlUnit lets Java programs model HTML documents and interact with web pages programmatically, for example by loading pages, submitting forms, and clicking links. It supports HTTP and HTTPS, cookies, form submission, JavaScript, and more. It is commonly used for testing and web scraping.

Features

  • Support for HTTP and HTTPS protocols
  • Ability to handle cookies
  • Support for various submit methods (POST, GET, HEAD, DELETE)
  • Customizable request headers
  • Wrapper for HTML pages for easy access to information
  • Form submission and link clicking support
  • Proxy server support
  • Support for basic and NTLM authentication
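
To make the page-wrapper and link features above concrete, here is a minimal sketch rather than a definitive recipe. It assumes HtmlUnit 3.x or later, where the classes live in the `org.htmlunit` package (2.x releases use `com.gargoylesoftware.htmlunit` instead). The class name `HtmlUnitSketch`, the method `titleAndLinks`, and the `http://localhost/` URL are hypothetical names chosen for illustration. The sketch serves a canned response through `MockWebConnection` so that it runs without network access.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.htmlunit.MockWebConnection;
import org.htmlunit.WebClient;
import org.htmlunit.html.HtmlAnchor;
import org.htmlunit.html.HtmlPage;

public class HtmlUnitSketch {

    /**
     * Loads the given HTML through a WebClient and returns the page title
     * followed by the href attribute of every anchor on the page.
     * (Hypothetical helper, not part of HtmlUnit.)
     */
    static List<String> titleAndLinks(String html) {
        // WebClient is AutoCloseable, so try-with-resources releases it.
        try (WebClient client = new WebClient()) {
            // The canned page needs neither scripts nor styles.
            client.getOptions().setJavaScriptEnabled(false);
            client.getOptions().setCssEnabled(false);

            // Assumption for this sketch: answer every request with the
            // supplied HTML instead of going to the network.
            MockWebConnection connection = new MockWebConnection();
            connection.setDefaultResponse(html);
            client.setWebConnection(connection);

            // Placeholder URL; the mock connection answers any address.
            HtmlPage page = client.getPage("http://localhost/");

            List<String> result = new ArrayList<>();
            result.add(page.getTitleText());
            for (HtmlAnchor anchor : page.getAnchors()) {
                result.add(anchor.getHrefAttribute());
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Demo</title></head>"
                + "<body><a href='/next'>next</a></body></html>";
        System.out.println(titleAndLinks(html));
    }
}
```

To fetch a real site instead, drop the three `MockWebConnection` lines and pass the site's URL to `getPage`. JavaScript and CSS processing can be left enabled if the target page depends on them.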

Maven

Add the following to your pom.xml:

<dependency>