wget - Linux File Download Tool

Supplementary notes

The wget command downloads files from a specified URL. wget is very stable and copes well with narrow bandwidth and unstable networks. If a download fails because of a network problem, wget keeps retrying until the whole file has been retrieved or the retry limit is reached. If the server interrupts the transfer, wget reconnects and continues from where it stopped, which is useful when downloading large files from servers that limit connection time. wget supports the HTTP, HTTPS, and FTP protocols and can work through an HTTP proxy.

wget can also download unattended: it can keep running in the background after the user logs out. You can log in, start a wget download task, and log out again, and wget will carry on until the task is complete. Most browsers, by contrast, need the user to stay involved while a large amount of data is downloaded, so this saves a lot of trouble. If no directory is specified, downloaded files are saved to the current directory. wget is powerful but easy to use:

1. It supports resumable downloads (continuing from a breakpoint), which was once the biggest selling point of NetAnts and FlashGet. wget offers the same feature, so users on poor connections can rest easy.
2. It supports both FTP and HTTP downloads. Most software can be downloaded over HTTP, but FTP is still needed sometimes.
3. It supports proxy servers. Systems with strict security requirements are generally not exposed directly to the Internet, so proxy support is essential for a download tool.
4. It is simple and convenient to configure. Users accustomed to graphical interfaces may find the command line unfamiliar, but it has real advantages for configuration: far fewer mouse clicks, and no worrying about clicking the wrong thing.
5. It is small and completely free. Small size matters little now that hard disks are so large, but being completely free does matter: much so-called free software on the Internet comes bundled with advertising that nobody wants.

Syntax

wget [options] [URL]

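Options

The list below follows the help text of an older wget release. Some option names have changed since then (for example, --http-passwd is now --http-password and --cache=off is now --no-cache), so check wget --help on your system.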

Startup parameters:
  -V, --version              Display the wget version and exit
  -h, --help                 Print syntax help
  -b, --background           Go to the background after startup
  -e, --execute=COMMAND      Execute a command in wgetrc format (for the format, see /etc/wgetrc or ~/.wgetrc)
Logging and input file parameters:
  -o, --output-file=FILE     Write log messages to FILE
  -a, --append-output=FILE   Append log messages to FILE
  -d, --debug                Print debug output
  -q, --quiet                Quiet mode (no output)
  -v, --verbose              Verbose mode (the default)
  -nv, --non-verbose         Turn off verbose mode without being quiet
  -i, --input-file=FILE      Download the URLs listed in FILE
  -F, --force-html           Treat the input file as HTML
  -B, --base=URL             Prepend URL to relative links in the file given by -F -i
      --sslcertfile=FILE     Optional client certificate
      --sslcertkey=KEYFILE   Optional key file for the client certificate
      --egd-file=FILE        File name of the EGD socket
Download parameters:
      --bind-address=ADDRESS   Bind to ADDRESS (host name or IP) on the local host, for machines with several local IPs or names
  -t, --tries=NUMBER           Set the maximum number of attempts (0 means unlimited)
  -O, --output-document=FILE   Write the document to FILE
  -nc, --no-clobber            Do not overwrite existing files or use .# suffixes
  -c, --continue               Continue downloading a partially downloaded file
      --progress=TYPE          Select the progress indicator type
  -N, --timestamping           Do not re-download a file unless it is newer than the local copy
  -S, --server-response        Print the server response
      --spider                 Do not download anything
  -T, --timeout=SECONDS        Set the response timeout to SECONDS
  -w, --wait=SECONDS           Wait SECONDS between retrievals
      --waitretry=SECONDS      Wait 1...SECONDS between retries
      --random-wait            Wait 0...2*WAIT seconds between retrievals
  -Y, --proxy=on/off           Turn the proxy on or off
  -Q, --quota=NUMBER           Set the download quota to NUMBER
      --limit-rate=RATE        Limit the download rate to RATE
Directory parameters:
  -nd, --no-directories           Do not create directories
  -x, --force-directories         Force creation of directories
  -nH, --no-host-directories      Do not create host directories
  -P, --directory-prefix=PREFIX   Save files to PREFIX/...
      --cut-dirs=NUMBER           Ignore NUMBER levels of remote directories
HTTP option parameters:
      --http-user=USER         Set the HTTP user name to USER
      --http-passwd=PASS       Set the HTTP password to PASS
  -C, --cache=on/off           Allow or disallow server-side cached data (normally allowed)
  -E, --html-extension         Save all text/html documents with the .html extension
      --ignore-length          Ignore the Content-Length header field
      --header=STRING          Insert STRING among the request headers
      --proxy-user=USER        Set the proxy user name to USER
      --proxy-passwd=PASS      Set the proxy password to PASS
      --referer=URL            Include a Referer: URL header in the HTTP request
  -s, --save-headers           Save the HTTP headers to the file
  -U, --user-agent=AGENT       Identify as AGENT instead of Wget/VERSION
      --no-http-keep-alive     Disable HTTP keep-alive (persistent connections)
      --cookies=off            Do not use cookies
      --load-cookies=FILE      Load cookies from FILE before the session starts
      --save-cookies=FILE      Save cookies to FILE after the session ends
FTP option parameters:
  -nr, --dont-remove-listing   Do not remove .listing files
  -g, --glob=on/off            Turn file name globbing on or off
      --passive-ftp            Use passive transfer mode (the default)
      --active-ftp             Use active transfer mode
      --retr-symlinks          When recursing, download the files that symbolic links point to (not directories)
Recursive download parameters:
  -r, --recursive              Recursive download (use with caution!)
  -l, --level=NUMBER           Maximum recursion depth (inf or 0 for infinite)
      --delete-after           Delete files locally after downloading them
  -k, --convert-links          Convert non-relative links to relative links
  -K, --backup-converted       Before converting file X, back it up as X.orig
  -m, --mirror                 Equivalent to -r -N -l inf -nr
  -p, --page-requisites        Download all images and other files needed to display the HTML page
Recursive accept/reject parameters:
  -A, --accept=LIST                Comma-separated list of accepted extensions
  -R, --reject=LIST                Comma-separated list of rejected extensions
  -D, --domains=LIST               Comma-separated list of accepted domains
      --exclude-domains=LIST       Comma-separated list of rejected domains
      --follow-ftp                 Follow FTP links found in HTML documents
      --follow-tags=LIST           Comma-separated list of HTML tags to follow
  -G, --ignore-tags=LIST           Comma-separated list of HTML tags to ignore
  -H, --span-hosts                 Go to foreign hosts when recursing
  -L, --relative                   Follow relative links only
  -I, --include-directories=LIST   List of allowed directories
  -X, --exclude-directories=LIST   List of excluded directories
  -np, --no-parent                 Do not ascend to the parent directory

To show only the server response without downloading anything, use: wget -S --spider URL

Parameters

URL: the address of the resource to download.

Examples

Download a single file using wget

wget http://www.jsdig.com/testfile.zip

This example downloads a file from the network and saves it in the current directory. While downloading, wget displays a progress bar showing the completion percentage, the bytes downloaded, the current download speed, and the estimated time remaining.

Download and save with a different file name

wget -O wordpress.zip "http://www.jsdig.com/download.aspx?id=1080"

By default, wget names the saved file after whatever follows the last / in the URL. For dynamic links, that name is usually wrong. Wrong: the following command downloads the file and saves it as download.aspx?id=1080:

wget "http://www.jsdig.com/download.aspx?id=1080"

Even though the downloaded file is a zip archive, it is still saved as download.aspx?id=1080. Right: to fix this, use the -O option to specify the file name:

wget -O wordpress.zip "http://www.jsdig.com/download.aspx?id=1080"
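
The naming rule described above can be approximated in the shell itself. The sketch below is only an illustration: default_name is a hypothetical helper, not part of wget, and it ignores redirects and server-supplied file names.

```shell
# Approximate the file name wget picks when -O is not given:
# everything after the last "/" in the URL.
# default_name is a hypothetical helper, not part of wget.
default_name() {
    url=$1
    name=${url##*/}
    # wget falls back to index.html when the URL ends with "/"
    printf '%s\n' "${name:-index.html}"
}

default_name "http://www.jsdig.com/download.aspx?id=1080"   # prints download.aspx?id=1080
default_name "http://www.jsdig.com/testfile.zip"            # prints testfile.zip
```

Running a check like this first makes it easy to spot URLs that need -O.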

Limit the download speed

wget --limit-rate=300k http://www.jsdig.com/testfile.zip

By default, wget uses all the bandwidth available. When you are downloading a large file and still need bandwidth for other downloads, it makes sense to limit the rate.
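
To judge whether a given limit is acceptable, it can help to estimate how long the transfer will take. This is a rough sketch: estimate_seconds is a hypothetical helper, and the 150 MB file size is an assumed figure, not one from this article.

```shell
# estimate_seconds SIZE_KB RATE_KB: seconds needed to move SIZE_KB kilobytes
# at RATE_KB kilobytes per second, rounded up (hypothetical helper).
estimate_seconds() {
    size_kb=$1
    rate_kb=$2
    echo $(( (size_kb + rate_kb - 1) / rate_kb ))
}

# An assumed 150 MB file (150 * 1024 KB) at --limit-rate=300k
estimate_seconds $((150 * 1024)) 300    # prints 512
```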

Resume an interrupted download with wget

wget -c http://www.jsdig.com/testfile.zip

Restarting an interrupted download with wget -c is very helpful when a large file transfer is cut off by a network problem or some other cause: the download continues from where it stopped instead of starting over. Use the -c option whenever you need to resume an interrupted download.
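
If wget itself gives up, the -c option can be wrapped in a small loop that starts it again. The following is a minimal sketch, not a standard tool: fetch_with_resume, the WGET override, and RETRY_DELAY are names invented for this example.

```shell
# WGET can be overridden (for a dry run or a test); defaults to wget.
WGET=${WGET:-wget}

# fetch_with_resume URL [MAX_ROUNDS]: hypothetical wrapper that re-runs
# "wget -c" until it succeeds or MAX_ROUNDS (default 5) is used up.
fetch_with_resume() {
    url=$1
    max_rounds=${2:-5}
    round=1
    while [ "$round" -le "$max_rounds" ]; do
        # -c resumes from the partial file left by the previous round
        if "$WGET" -c "$url"; then
            return 0
        fi
        round=$((round + 1))
        sleep "${RETRY_DELAY:-5}"   # pause before the next round
    done
    return 1
}

# Example: fetch_with_resume http://www.jsdig.com/testfile.zip 10
```

Because each round passes -c, the bytes already on disk are kept rather than downloaded again.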

Background download using wget

wget -b http://www.jsdig.com/testfile.zip

When downloading very large files, use the -b option to run the download in the background. wget then writes its output to the file wget-log, so you can follow the progress with:

tail -f wget-log
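
For a one-off check instead of a live view, the latest percentage can be pulled out of the log. This is a sketch under an assumption: it expects the dot-style progress lines that wget normally writes to a log file, and last_progress is a hypothetical helper.

```shell
# last_progress LOGFILE: print the most recent percentage found in a wget
# log (hypothetical helper; assumes the dot-style progress lines that wget
# writes to a log file, each ending in something like "42%  300K 12s").
last_progress() {
    grep -o '[0-9][0-9]*%' "$1" | tail -n 1
}

# Example: last_progress wget-log
```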

Download with a spoofed user agent

wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" http://www.jsdig.com/testfile.zip

Some websites reject a download request when the user agent is not a browser. You can use the --user-agent option to make wget identify itself as one.

Test a download link

Before a scheduled download, you should check that the download link is valid. Add the --spider option to check it:

wget --spider URL

If the download link is correct, it will display:

Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.

This confirms that the download can go ahead at the scheduled time. If the link is wrong, wget reports an error instead:

wget --spider url
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!

You can use the --spider option in the following cases:

1. Checking a link before a scheduled download.
2. Checking at intervals whether a website is available.
3. Checking a web page for dead links.
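
The checks listed above can be scripted by looking at wget's exit status, which is 0 when the remote file exists. The following sketch makes some assumptions: check_links and the WGET override are hypothetical names, and the input is a plain file with one URL per line.

```shell
# WGET can be overridden (for a dry run or a test); defaults to wget.
WGET=${WGET:-wget}

# check_links FILE: run "wget --spider" on every URL in FILE (one per line)
# and print OK or BROKEN for each. Hypothetical helper; it relies only on
# wget's exit status.
check_links() {
    while IFS= read -r url; do
        [ -n "$url" ] || continue          # skip empty lines
        if "$WGET" -q --spider "$url"; then
            printf 'OK      %s\n' "$url"
        else
            printf 'BROKEN  %s\n' "$url"
        fi
    done < "$1"
}

# Example: check_links filelist.txt
```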

Increase the number of retries

wget --tries=40 URL

A download may fail if the network is unreliable or the file is large. By default, wget retries a download up to 20 times. If necessary, use --tries to increase the number of retries.

Download multiple files

wget -i filelist.txt

First, save the download links in a file, one per line:

cat filelist.txt
url1
url2
url3
url4

Then pass this file to wget with the -i option.
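
When the links follow a pattern, the list file can be generated rather than typed. This is only a sketch: make_filelist is a hypothetical helper, and the partN.zip naming scheme and the /files path are made up for illustration.

```shell
# make_filelist BASE_URL COUNT: print COUNT URLs of the form
# BASE_URL/part1.zip ... BASE_URL/partN.zip, one per line.
# The partN.zip naming scheme is made up for illustration.
make_filelist() {
    base=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s/part%d.zip\n' "$base" "$i"
        i=$((i + 1))
    done
}

# Example:
#   make_filelist http://www.jsdig.com/files 4 > filelist.txt
#   wget -i filelist.txt
```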

Mirror site

wget --mirror -p --convert-links -P ./LOCAL URL

This downloads an entire website to the local machine.

--mirror turns on the options suited to mirroring.

-p downloads all the files needed to display the HTML pages properly.

--convert-links converts the links to local links after downloading.

-P ./LOCAL saves all files and directories under the specified local directory.

Exclude a file format from the download

wget --reject=gif URL

Use this command to download a website without its GIF images.

Save download messages to a log file

wget -o download.log URL

Use this when you do not want the download messages printed on the terminal but written to a log file instead.

Limit the total download size

wget -Q5m -i filelist.txt

Use this when you want wget to stop once the total downloaded exceeds 5 MB. Note: this option has no effect when downloading a single file. It only applies to recursive downloads and to downloads from an input file.
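
To see what a quota such as 5m amounts to in bytes, the suffix can be expanded by hand. The sketch below assumes that k means 1024 bytes and m means 1024 * 1024 bytes, and quota_bytes is a hypothetical helper, not a wget feature.

```shell
# quota_bytes VALUE: convert a size such as 5m or 300k to bytes, assuming
# k = 1024 bytes and m = 1024 * 1024 bytes (hypothetical helper).
quota_bytes() {
    case $1 in
        *[kK]) echo $(( ${1%?} * 1024 )) ;;
        *[mM]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *)     echo "$1" ;;
    esac
}

quota_bytes 5m    # prints 5242880
```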

Download files of a specified format

wget -r -A.pdf URL

You can use this feature to:

1. Download all the images from a website.
2. Download all the videos from a website.
3. Download all the PDF files from a website.
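
The -A option accepts a comma-separated list, so several extensions can be accepted in one run. A small sketch of building that list follows; accept_list is a hypothetical helper, and the image extensions are example values.

```shell
# accept_list EXT...: join extensions with commas for use with -A
# (hypothetical helper).
accept_list() {
    ( IFS=,; printf '%s\n' "$*" )
}

accept_list jpg jpeg png gif    # prints jpg,jpeg,png,gif
# Example: wget -r -A "$(accept_list jpg jpeg png gif)" URL
```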

FTP download

wget ftp-url
wget --ftp-user=USERNAME --ftp-password=PASSWORD url

You can use wget to download from FTP links. Anonymous FTP download with wget:

wget ftp-url

FTP download with user name and password authentication:

wget --ftp-user=USERNAME --ftp-password=PASSWORD url

Tags: Linux Operation & Maintenance CentOS server wget

Posted on Wed, 17 Nov 2021 22:53:14 -0500 by knighthawk1337