wget and curl are a pair of great data-transfer tools for Linux. Here are some details on using wget.

Download a single file/page:

wget http://required_site/file

Download the entire site, using the -r option:

wget -r http://required_site/

Download only certain file types, using the -A option.

For example, to download only PDF and MP3 files:

wget -r -A pdf,mp3 http://required_site/

To follow links to external sites, use the -H option:

wget -r -H -A pdf,mp3 http://required_site/

To limit which domains to follow, use the -D option:

wget -r -H -A pdf,mp3 -D files.site.com http://required_site/

The recursion depth used with the -r option can be limited using the -l option:

wget -r -l 2 http://required_site/

Download all images from the site:

wget -e robots=off -r -l 1 --no-parent -A .gif,.jpg http://required_site/


Still more... (tricky)

Using wget to download content protected by referer and cookies:
#1. Get the base URL and save its cookies to a file.
#2. Get the protected content using the stored cookies.

wget --keep-session-cookies --save-cookies=cookie.txt http://first_page
wget --referer=http://first_page --load-cookies=cookie.txt --keep-session-cookies --save-cookies=cookie.txt http://second_page
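The two steps above can be sketched as a small script. The page addresses are placeholders for the real site; cookies are enabled in wget by default, so only the jar and referer options are needed.

```shell
# Placeholder URLs; substitute the real pages behind the referer/cookie check.
BASE='http://first_page'
PROTECTED='http://second_page'
JAR='cookie.txt'

# Step 1: visit the base page and save its session cookies.
wget --keep-session-cookies --save-cookies="$JAR" "$BASE" || true  # placeholder host; ignore failure offline

# Step 2: replay the cookies, presenting the base page as the referer.
wget --referer="$BASE" --load-cookies="$JAR" \
     --keep-session-cookies --save-cookies="$JAR" "$PROTECTED" || true
```

Keeping --save-cookies on the second call as well means any cookies the protected page sets are also captured for later requests.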

Mirror a website into a static copy for local browsing:

wget --mirror -w 2 -p --html-extension --convert-links http://required_site/

Run wget in the background:

wget -t 45 -o log http://required_site &

Wget for FTP (login and password? wget handles anonymous access itself):

wget ftp://required_site
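When anonymous access isn't enough, credentials can be passed explicitly via the --ftp-user and --ftp-password options; the host, user, and password below are placeholders.

```shell
# Placeholder host and login; substitute real credentials.
FTP_URL='ftp://ftp.example.com/pub/file.tar.gz'

wget --ftp-user=user --ftp-password=password "$FTP_URL" || true  # placeholder host; ignore failure offline
```

Passing the password as an option keeps it out of the URL, though it is still visible in the process list on a shared machine.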

Read a list of URLs from a file:

wget -i file
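A minimal sketch of the -i workflow: put one URL per line in a file (the addresses below are placeholders), then hand the list to wget.

```shell
# Build a URL list, one per line (placeholder addresses).
cat > urls.txt <<'EOF'
http://example.com/a.pdf
http://example.com/b.pdf
EOF

# -i reads every URL in the file in order.
wget -i urls.txt || true  # placeholder URLs; ignore failures offline
```

This combines well with the options above, e.g. adding -c to resume partial downloads across the whole list.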


Thanks to http://www.h3manth.com/2009/01/wget-tircks-and-tips.html

Published on November 24th, 2023