Website Image Extractor

This Java program is a crude way to extract and download images from the HTML of various websites. Given a URL, the program scans the HTML for that page and looks for image links and other image references. Then, the list of images is presented to the user and unwanted images can be removed.

Images can be checked or unchecked in the master list automatically, using keywords chosen by the user (explained in detail below).

Additionally, the order in which the images should be downloaded can be changed (for images that need to be named in sequential order). Before downloading the images, various output file types can be selected and custom image naming schemes can be used (explained in detail below).

Currently, the program does not support extracting images from pages which use JavaScript or other languages to render images after the page is loaded; only image URLs found in the HTML code of the given URL can be downloaded.
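As a rough illustration of what scanning static HTML for image references can look like, here is a minimal sketch. It is not taken from the program's source: the class name `ImageLinkFinder`, the regular expression, and the list of recognized extensions are all assumptions made for the example. It only sees `src` and `href` attributes present in the HTML text, which is also why images added later by JavaScript would be missed.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the program's actual extraction code.
final class ImageLinkFinder {
    // Matches src="..." or href="..." values that end in a common image
    // extension, optionally followed by a query string (assumed list).
    private static final Pattern IMAGE_REF = Pattern.compile(
        "(?:src|href)\\s*=\\s*[\"']([^\"']+?\\.(?:png|jpe?g|gif|bmp|webp)(?:\\?[^\"']*)?)[\"']",
        Pattern.CASE_INSENSITIVE);

    /** Returns absolute image URLs found in the HTML, in order, without duplicates. */
    static List<String> findImageUrls(String html, String pageUrl) {
        URI base = URI.create(pageUrl);
        Set<String> found = new LinkedHashSet<>();
        Matcher m = IMAGE_REF.matcher(html);
        while (m.find()) {
            try {
                // Resolve relative references such as "/img/a.png" against the page URL.
                found.add(base.resolve(m.group(1).trim()).toString());
            } catch (IllegalArgumentException e) {
                // The reference is not a valid URI (for example, it contains spaces); skip it.
            }
        }
        return new ArrayList<>(found);
    }
}
```

A regular expression is a blunt tool for HTML, so a real implementation might prefer an HTML parser; the sketch only shows the general idea of collecting image references from the page source.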

Additional information can be found on the GitHub repository.



URL Entry

A URL from which to extract the images must be provided first. To do this, enter the URL in the top field and select the “Grab Images” button. The images will then be extracted, and the other fields in the program can be used for further manipulation.

Website Image Extractor URL entry

Image Preview and Order

The leftmost panel in the GUI is reserved for image previews and for changing the order of images in the image list (center panel, explained in detail below). After an image is selected from the dropdown, a preview of that image appears in the space below the “Move up” and “Move down” buttons. If the image cannot be loaded (for example, because its URL is invalid), a “File not found” image is displayed instead.

The “Move up” and “Move down” buttons move the image currently selected in the dropdown up or down in the list of all images. This is useful for correcting the order of images that need to be named sequentially.
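Reordering of this kind amounts to swapping an entry with its neighbor. The sketch below shows one way to do it; the class `ImageOrder` and its methods are hypothetical names for illustration, and treating a move past either end of the list as a no-op is an assumption.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper illustrating "Move up" / "Move down" on the image list.
final class ImageOrder {
    /** Moves the entry at index one place toward the start; returns its new index. */
    static int moveUp(List<String> images, int index) {
        if (index <= 0 || index >= images.size()) {
            return index; // already first, or out of range: nothing to do (assumed behavior)
        }
        Collections.swap(images, index, index - 1);
        return index - 1;
    }

    /** Moves the entry at index one place toward the end; returns its new index. */
    static int moveDown(List<String> images, int index) {
        if (index < 0 || index >= images.size() - 1) {
            return index; // already last, or out of range: nothing to do (assumed behavior)
        }
        Collections.swap(images, index, index + 1);
        return index + 1;
    }
}
```

Returning the new index lets a caller keep the same image selected after the move.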

Website Image Extractor image preview and order changer

Image List

The center panel lists all of the extracted images in the order in which they will be downloaded and named. The only manipulation available in this panel is checking or unchecking individual images to mark them for renaming and downloading. The “Delete unchecked boxes” button at the bottom of the panel removes unchecked images from the list.

Website Image Extractor image list

Auto-Select Settings

The auto-select settings are used to check or uncheck images in the image list automatically, based on keywords entered by the user. The “Including” field (populated with a comma-separated list) examines every image in the list and checks an image only if its name contains ALL of the keywords in the list. The “Excluding” field (populated with a comma-separated list) unchecks every image in the list whose name contains ANY of the keywords in the list.
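The ALL versus ANY distinction can be sketched as follows. This is an illustration only: the class `KeywordFilter` and its methods are hypothetical, and the case-insensitive matching, the trimming of whitespace around keywords, and the handling of an empty keyword list are assumptions rather than documented behavior.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper illustrating the "Including" and "Excluding" rules.
final class KeywordFilter {
    /** Splits a comma-separated field into trimmed, lower-cased, non-empty keywords. */
    static List<String> parseKeywords(String field) {
        return Arrays.stream(field.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(s -> s.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    /** "Including" rule: true only if the name contains ALL keywords. */
    static boolean matchesAll(String imageName, List<String> keywords) {
        String name = imageName.toLowerCase(Locale.ROOT);
        // Note: with an empty keyword list this returns true (assumed behavior).
        return keywords.stream().allMatch(name::contains);
    }

    /** "Excluding" rule: true if the name contains ANY keyword. */
    static boolean matchesAny(String imageName, List<String> keywords) {
        String name = imageName.toLowerCase(Locale.ROOT);
        return keywords.stream().anyMatch(name::contains);
    }
}
```

Under these assumptions, an image would be checked when `matchesAll` is true for the “Including” keywords and unchecked when `matchesAny` is true for the “Excluding” keywords.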

Website Image Extractor auto-select settings

Image Output Settings

The image output settings determine the output image file names. The text in the “File prefix” field is added at the beginning of each image file name. The “Start index” spinner sets the integer from which sequential image names begin counting. The “Extension” dropdown selects the image file type.

If you wish to preserve the original image file names, simply uncheck the “Enable custom file names” checkbox, and those fields will be disabled and ignored.
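To make the naming scheme concrete, here is a small sketch of how an output name could be composed from these settings. The class `OutputNames` and its parameters are hypothetical, and the exact format (prefix directly followed by the number, no zero padding, extension given without a leading dot) is an assumption.

```java
import java.net.URI;

// Hypothetical helper illustrating the output naming settings.
final class OutputNames {
    /**
     * position is the image's zero-based place in the image list.
     * With custom names enabled the result is prefix + (startIndex + position) + "." + extension;
     * otherwise the last path segment of the image URL is kept.
     */
    static String outputName(String imageUrl, int position, boolean customNames,
                             String prefix, int startIndex, String extension) {
        if (customNames) {
            return prefix + (startIndex + position) + "." + extension;
        }
        // Assumes a hierarchical URL such as http://host/path/name.png;
        // getPath() excludes any query string.
        String path = URI.create(imageUrl).getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```

For example, with the prefix `photo_`, a start index of 5, and the extension `jpg`, the third image in the list (position 2) would be named `photo_7.jpg` under this scheme.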

The “Browse” button can be used to select a different image output location (default is the same location that the program is running in).

Website Image Extractor image output settings

Download Images

Once all the settings are chosen and the correct images are checked and in the desired order in the image list, use the “Download!” button to begin downloading. The progress bar shows the current download progress, and accompanying text displays how many images have been downloaded out of the total number to download.
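A download loop with progress reporting might look roughly like the sketch below. It is not the program's actual code: `ImageDownloader`, its parameters, and the choice to skip a failed image instead of stopping are assumptions. The sketch copies each file's bytes unchanged, so it does not cover any conversion between image file types.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical helper illustrating a download loop with progress updates.
final class ImageDownloader {
    /**
     * Saves urls.get(i) into outputDir under names.get(i) and calls
     * progress with (processed, total) after every attempt.
     * Returns the number of images actually saved.
     */
    static int downloadAll(List<String> urls, List<String> names, Path outputDir,
                           BiConsumer<Integer, Integer> progress) {
        int saved = 0;
        for (int i = 0; i < urls.size(); i++) {
            Path target = outputDir.resolve(names.get(i));
            try (InputStream in = URI.create(urls.get(i)).toURL().openStream()) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                saved++;
            } catch (IOException | IllegalArgumentException e) {
                // Assumption: a failed image is skipped rather than aborting the batch.
                System.err.println("Skipped " + urls.get(i) + ": " + e.getMessage());
            }
            progress.accept(i + 1, urls.size());
        }
        return saved;
    }
}
```

In a GUI, the `progress` callback is where a progress bar and an “x of y” label would be updated; in a Swing application that update would normally be handed to the event dispatch thread while the downloads run in the background.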

Website Image Extractor download images

Releases

View the GitHub repository!

Version    GitHub    JAR (local)
1.1                  [593 KB]
1.0                  [591 KB]
