download.file {utils} | R Documentation |
This function can be used to download a file from the Internet.
download.file(url, destfile, method, quiet = FALSE, mode = "w", cacheOK = TRUE)
url |
A character string naming the URL of a resource to be downloaded. |
destfile |
A character string with the name where the downloaded file is saved. Tilde-expansion is performed. |
method |
Method to be used for downloading files. Currently
download methods "internal", "wget" and "lynx"
are available, and there is a value "auto": see Details. The
method can also be set through the option
"download.file.method": see options(). |
quiet |
If TRUE, suppress status messages (if any). |
mode |
character. The mode with which to write the file. Useful
values are "w", "wb" (binary), "a" (append) and
"ab". Only used for the "internal" method. |
cacheOK |
logical. Is a server-side cached value acceptable?
Implemented for the "internal" and "wget" methods. |
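As an illustration of the mode argument, the following is a minimal sketch that copies a small binary file through a "file://" URL using mode = "wb". The payload bytes and temporary file names are made up for the example, and the URL construction assumes an absolute path as returned by normalizePath.

```r
# Write a few raw bytes (including CR, LF and 0x1a) to a temporary source file.
payload <- as.raw(c(0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a))
src <- tempfile(fileext = ".bin")
writeBin(payload, src)

# Build a "file://" URL; a Windows path needs an extra slash before the drive.
src_path <- normalizePath(src, winslash = "/")
src_url <- if (substr(src_path, 1, 1) == "/") {
  paste0("file://", src_path)
} else {
  paste0("file:///", src_path)
}

# Download in binary mode so that no byte is translated on the way.
dest <- tempfile(fileext = ".bin")
download.file(src_url, dest, mode = "wb", quiet = TRUE)

# Read the copy back and compare it with the original bytes.
copied <- readBin(dest, what = "raw", n = length(payload) + 16L)
identical(copied, payload)
```

Text mode ("w") may translate line endings on some platforms, which is why "wb" is the safer choice for anything that is not plain text.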
The function download.file can be used to download a single
file as described by url from the internet and store it in
destfile.
The url must start with a scheme such as
"http://", "ftp://" or "file://".
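A basic call might look like the sketch below. It uses a "file://" URL pointing at a temporary file so that it runs without network access; the file contents and variable names are assumptions made for the example, and an "http://" or "ftp://" URL would be passed in the same way.

```r
# Create a small text file to act as the remote resource.
src <- tempfile(fileext = ".txt")
writeLines(c("alpha", "beta"), src)

# Build a "file://" URL; a Windows path needs an extra slash before the drive.
src_path <- normalizePath(src, winslash = "/")
src_url <- if (substr(src_path, 1, 1) == "/") {
  paste0("file://", src_path)
} else {
  paste0("file:///", src_path)
}

# Download to a new temporary file and keep the (invisible) status code.
dest <- tempfile(fileext = ".txt")
status <- download.file(src_url, dest, quiet = TRUE)

# Inspect what was written to destfile.
downloaded <- readLines(dest)
```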
If method = "auto" is chosen (the default), the internal method
is used on Windows.
cacheOK = FALSE is useful for "http://" URLs, and will
attempt to get a copy directly from the site rather than from an
intermediate cache. (Not all platforms support it.)
This setting is used by available.packages.
The remaining details apply to method "internal" only.
See url for how "file://" URLs are interpreted,
especially on Windows. This function does decode encoded URLs.
The timeout for many parts of the transfer can be set by the option
timeout, which defaults to 60 seconds.
The level of detail provided during transfer can be set by the
quiet argument and the internet.info option. The
details depend on the platform and scheme, but setting
internet.info to 0 gives all available details, including
all server responses. Using 2 (the default) gives only serious
messages, and 3 or more suppresses all messages.
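The timeout and internet.info options can be set temporarily around a single download and restored afterwards. The wrapper below is a sketch only: fetch_with_options is a hypothetical helper name, and the values 120 and 0 are arbitrary choices for illustration.

```r
# Hypothetical helper: raise the timeout and request full detail for one
# download, then restore whatever option values were in effect before.
fetch_with_options <- function(url, destfile, timeout = 120, info = 0) {
  old <- options(timeout = timeout, internet.info = info)
  on.exit(options(old), add = TRUE)  # restore even if the download fails
  download.file(url, destfile, quiet = FALSE)
}
```

Because the previous values are restored by on.exit, later downloads in the same session are not affected by the temporary settings.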
A progress bar tracks the transfer. If the file length is known, the full width of the bar is the known length. Otherwise the initial width represents 100Kbytes and is doubled whenever the current width is exceeded.
There is an alternative method if you have Internet Explorer 4 or
later installed. You can use the flag --internet2, in which case
the ‘Internet Options’ of the system are used to choose proxies
and so on; these are set in the Control Panel and are those used for
Internet Explorer. This version does not support cacheOK = FALSE.
Method "wget" can be used with proxy firewalls which require
user/password authentication if proper values are stored in the
configuration file for wget.
An (invisible) integer code, 0 for success and non-zero for
failure. For the "wget" and "lynx" methods this is the
status code returned by the external program. The "internal"
method can return 1, but will in most cases throw an error.
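Since failure may be signalled either by a non-zero return code or by an error, a caller that wants a simple success flag has to handle both. The following is one possible sketch; safe_download is a hypothetical helper, and the value -1L is an arbitrary sentinel chosen here, not a code returned by download.file.

```r
# Hypothetical helper: returns TRUE on success and FALSE on any failure,
# whether reported as a non-zero status or as an error.
safe_download <- function(url, destfile, ...) {
  status <- suppressWarnings(tryCatch(
    download.file(url, destfile, quiet = TRUE, ...),
    error = function(e) {
      message("download failed: ", conditionMessage(e))
      -1L  # arbitrary sentinel standing in for "an error was thrown"
    }
  ))
  status == 0
}
```

A caller can then branch on the result, for example `if (!safe_download(u, f)) stop("could not fetch ", u)`, where u and f stand for the caller's own URL and file name.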
The proxy settings described here apply to the internal code only.
Proxies can be specified via environment variables.
Setting "no_proxy" stops any proxy being tried.
Otherwise the setting of "http_proxy" or "ftp_proxy"
(or failing that, the all upper-case version) is consulted and if
non-empty used as a proxy site. For FTP transfers, the username
and password on the proxy can be specified by "ftp_proxy_user"
and "ftp_proxy_password". The form of "http_proxy"
should be "http://proxy.dom.com/" or
"http://proxy.dom.com:8080/" where the port defaults to
80 and the trailing slash may be omitted. For
"ftp_proxy" use the form "ftp://proxy.dom.com:3128/"
where the default port is 21. These environment variables
must be set before the download code is first used: they cannot be
altered later by calling Sys.putenv.
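Because the proxy variables are consulted only when the download code is first used, it can help to check what the R process actually sees before starting any transfer. The snippet below is a small sketch that only reads the variables named above; it does not set or change anything.

```r
# Names of the proxy-related environment variables described above,
# including the all upper-case fallbacks.
proxy_vars <- c("no_proxy", "http_proxy", "ftp_proxy",
                "HTTP_PROXY", "FTP_PROXY")

# Unset variables are reported as NA rather than as empty strings.
current <- Sys.getenv(proxy_vars, unset = NA)

# Show only the variables that are defined in this session.
print(current[!is.na(current)])
```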
Usernames and passwords can be set for HTTP proxy transfers via
environment variable http_proxy_user in the form
user:passwd. Alternatively, "http_proxy" can be of the
form "http://user:pass@proxy.dom.com:8080/" for compatibility
with wget. Only the HTTP/1.0 basic authentication scheme is
supported.
Under Windows, if "http_proxy_user" is set to "ask" then
a dialog box will come up for the user to enter the username and
password. NB: you will be given only one opportunity to enter this,
but if proxy authentication is required and fails there will be one
further prompt per download.
Methods "wget" and "lynx" are for historical
compatibility. They will block all other activity on the R process.
For methods "wget" and "lynx" a system call is made to
the tool given by method, and the respective program must be
installed on your system and be in the search path for executables.
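Whether an external tool is on the search path can be checked before choosing it as the method. The sketch below assumes that Sys.which is available in the R version in use; pick_method is a hypothetical helper, and falling back to "internal" is just one reasonable choice.

```r
# Hypothetical helper: use the preferred external tool if it can be found
# on the search path for executables, otherwise fall back to "internal".
pick_method <- function(preferred = "wget") {
  if (nzchar(Sys.which(preferred))) preferred else "internal"
}

method <- pick_method("wget")
```

The result can then be passed on as the method argument, for example `download.file(u, f, method = method)`, where u and f stand for the caller's own URL and file name.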
options to set the timeout and internet.info options.
url for a finer-grained way to read data from URLs.
url.show, available.packages, download.packages for applications.