updatePACKAGES Update Existing PACKAGES Files
Description
Update an existing repository by reading the PACKAGES file, retaining entries which are still valid, removing entries which are no longer valid, and only processing built package tarballs which do not match existing entries.
update_PACKAGES can be much faster than write_PACKAGES for small to moderate changes to large repository indexes, particularly in non-strict mode (see Details).
Usage
update_PACKAGES(dir = ".", fields = NULL, type = c("source",
"mac.binary", "win.binary"), verbose.level = as.integer(dryrun),
latestOnly = TRUE, addFiles = FALSE, rds_compress = "xz",
strict = TRUE, dryrun = FALSE)
Arguments
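dir | See write_PACKAGES.
fields | See write_PACKAGES.
type | See write_PACKAGES.
verbose.level | (0, 1, 2) The level of informative messages displayed throughout the process. Defaults to 0 if dryrun is FALSE (the default) and 1 otherwise. See Details.
latestOnly | See write_PACKAGES.
addFiles | See write_PACKAGES.
rds_compress | See write_PACKAGES.
strict | logical. Should 'strict mode' be used when checking existing PACKAGES entries? See Details. Defaults to TRUE.
dryrun | logical. Should the updates to existing PACKAGES files be computed but not applied? Defaults to FALSE.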
Details
Throughout this section, package tarball is defined to mean any archive file in dir whose name can be interpreted as <package>_<version>.<ext> - with <ext> the appropriate extension for built packages of type type - (or that is pointed to by the File field of an existing PACKAGES entry). Novel package tarballs are those which do not match an existing PACKAGES file entry.
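As a rough illustration of the naming convention described above, the following sketch splits file names of the form <package>_<version>.<ext> into their parts. The helper name parse_tarball_names is hypothetical and is not part of the tools package; the default extension assumes type = "source" (.tar.gz).

```r
# Hypothetical helper (not part of 'tools'): split file names of the form
# <package>_<version>.<ext> into package name and version.
# 'ext' is a regular expression; the default assumes source tarballs.
parse_tarball_names <- function(files, ext = "\\.tar\\.gz") {
  pattern <- paste0("^([^_]+)_(.+)", ext, "$")
  ok <- grepl(pattern, files)          # ignore files that do not fit the scheme
  data.frame(
    File    = files[ok],
    Package = sub(pattern, "\\1", files[ok]),
    Version = sub(pattern, "\\2", files[ok]),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_tarball_names(
  c("foo_1.0.tar.gz", "bar_0.2-1.tar.gz", "README.md")
)
print(parsed)
```

Files whose names do not fit the pattern (README.md here) are simply left out, which mirrors the definition above apart from entries reached through the File field.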
update_PACKAGES calls directly down to write_PACKAGES with a warning (and thus all package tarballs will be processed), if any of the following conditions hold:
- type is win.binary and strict is TRUE (no MD5 checksums are included in win.binary PACKAGES files)
- No PACKAGES file exists under dir
- A PACKAGES file exists under dir but is empty
- fields is not NULL and one or more specified fields are not present in the existing PACKAGES file
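The four fallback conditions can be expressed as a simple predicate. This is only a sketch of the documented rules, not the actual implementation; the function name needs_full_rewrite is hypothetical.

```r
# Hypothetical predicate mirroring the documented fallback conditions.
# Returns TRUE when a full write_PACKAGES run would be needed.
needs_full_rewrite <- function(dir, fields = NULL, type = "source",
                               strict = TRUE) {
  # win.binary PACKAGES files carry no MD5 checksums to verify against
  if (type == "win.binary" && strict) return(TRUE)
  pkgs_file <- file.path(dir, "PACKAGES")
  if (!file.exists(pkgs_file)) return(TRUE)      # no index yet
  if (file.size(pkgs_file) == 0) return(TRUE)    # index exists but is empty
  existing <- read.dcf(pkgs_file)
  # requested fields that the existing index does not contain
  if (!is.null(fields) && !all(fields %in% colnames(existing))) return(TRUE)
  FALSE
}

# Small demonstration in a throwaway directory
repo <- tempfile("repo")
dir.create(repo)
no_index <- needs_full_rewrite(repo)

file.create(file.path(repo, "PACKAGES"))
empty_index <- needs_full_rewrite(repo)

writeLines(c("Package: foo", "Version: 1.0"), file.path(repo, "PACKAGES"))
usable_index   <- needs_full_rewrite(repo)
missing_field  <- needs_full_rewrite(repo, fields = "License")
strict_windows <- needs_full_rewrite(repo, type = "win.binary")
```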
update_PACKAGES avoids (re)processing package tarballs in cases where a PACKAGES file entry already exists and appears to remain valid. The logic for detecting still-valid entries is as follows:
Any package tarball which was last modified more recently than the existing PACKAGES file is considered novel; existing PACKAGES entries appearing to correspond to such tarballs are always considered stale and replaced by newly generated ones. Similarly, all PACKAGES entries that do not correspond to any package tarball found in dir are considered invalid and are excluded from the resulting updated PACKAGES files.
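The two checks in this paragraph can be sketched with base R file utilities. The directory layout, file names, and timestamps below are made up for the demonstration, and the real function may differ in detail.

```r
# Build a throwaway repository with one old tarball, one new tarball,
# and an index that also lists a package whose tarball is gone.
repo <- tempfile("repo")
dir.create(repo)
pkgs_file <- file.path(repo, "PACKAGES")
writeLines(c("Package: old", "Version: 1.0", "",
             "Package: gone", "Version: 2.0"), pkgs_file)
for (f in c("old_1.0.tar.gz", "new_1.0.tar.gz")) {
  file.create(file.path(repo, f))
}

# Set modification times explicitly so the example is deterministic
now <- Sys.time()
Sys.setFileTime(file.path(repo, "old_1.0.tar.gz"), now - 7200)
Sys.setFileTime(pkgs_file, now - 3600)
Sys.setFileTime(file.path(repo, "new_1.0.tar.gz"), now)

tarballs <- list.files(repo, pattern = "\\.tar\\.gz$", full.names = TRUE)

# Tarballs modified after the index was written are treated as novel
novel_by_time <- basename(tarballs)[file.mtime(tarballs) > file.mtime(pkgs_file)]

# Index entries with no matching tarball in the directory are dropped
existing <- as.data.frame(read.dcf(pkgs_file), stringsAsFactors = FALSE)
expected <- paste0(existing$Package, "_", existing$Version, ".tar.gz")
orphaned <- existing$Package[!expected %in% basename(tarballs)]
```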
When strict is TRUE, PACKAGES entries that match a package tarball (by package name and version) are confirmed via MD5 checksum; only those that pass are retained as valid. All novel package tarballs are fully processed by the standard machinery underlying write_PACKAGES and the resulting entries are added. Finally, if latestOnly is TRUE, package-version pruning is performed across the entries.
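A minimal sketch of the strict-mode confirmation step, using tools::md5sum to compare a tarball against the checksum recorded in its entry (the MD5sum field of a PACKAGES file). The helper entry_is_valid is hypothetical, and the file contents are placeholders rather than a real package.

```r
# Hypothetical check: does the file on disk still match the recorded MD5?
entry_is_valid <- function(path, recorded_md5) {
  identical(unname(tools::md5sum(path)), recorded_md5)
}

repo <- tempfile("repo")
dir.create(repo)
tarball <- file.path(repo, "foo_1.0.tar.gz")

# Placeholder contents standing in for a built package
writeBin(charToRaw("original contents"), tarball)
recorded <- unname(tools::md5sum(tarball))   # what the index would store

valid_before <- entry_is_valid(tarball, recorded)

# Same file name and version, different contents: the entry is now stale
writeBin(charToRaw("changed contents"), tarball)
valid_after <- entry_is_valid(tarball, recorded)
```

This is the case non-strict mode cannot detect: the file name still says foo 1.0, but the contents no longer match the existing entry.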
When strict is FALSE, package tarballs are assumed to encode correct metadata in their filenames. PACKAGES entries which appear to match a package tarball are retained as valid (No MD5 checksum testing occurs). If latestOnly is TRUE, package-version pruning is performed across the full set of retained entries and novel package tarballs before the processing of the novel tarballs, at significant computational and time savings in some situations. After the optional pruning, any relevant novel package tarballs are processed via the standard machinery and added to the set of retained entries.
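The saving from pruning before processing can be illustrated as follows. The data frame and the helper keep_latest are hypothetical; versions are compared with package_version rather than as strings.

```r
# Retained entries (novel = FALSE) and novel tarballs (novel = TRUE),
# described only by what their file names encode.
candidates <- data.frame(
  Package = c("foo", "foo", "bar", "foo"),
  Version = c("1.0", "1.10", "0.2-1", "1.9"),
  novel   = c(FALSE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only the highest version of each package
keep_latest <- function(df) {
  pieces <- lapply(split(df, df$Package), function(d) {
    v <- package_version(d$Version)
    d[which(v == max(v))[1], , drop = FALSE]
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

pruned <- keep_latest(candidates)

# Only novel tarballs that survive pruning need full processing
to_process <- pruned[pruned$novel, ]
```

In this example foo 1.9 is a novel tarball, yet it is never processed because foo 1.10 supersedes it.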
In both cases, after the above process concludes, entries are sorted alphabetically by the string concatenation of Package and Version. This should match the entry order write_PACKAGES outputs.
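The described ordering amounts to sorting on a concatenated key. The entries below are invented for illustration.

```r
entries <- data.frame(
  Package = c("zoo", "abc", "abc"),
  Version = c("1.8", "2.0", "1.0"),
  stringsAsFactors = FALSE
)

# Sort by the string concatenation of Package and Version
sort_key <- paste0(entries$Package, entries$Version)
sorted <- entries[order(sort_key), ]
rownames(sorted) <- NULL
print(sorted)
```

Because the key is a plain string, versions are ordered as text in this sketch and not by version semantics.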
The fields within the entries are ordered as follows: canonical fields - i.e., those appearing as columns when available.packages is called on a CRAN mirror - appear first in their canonical order, followed by any non-canonical fields.
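Field ordering of this kind can be sketched with set operations. The canonical vector below is an illustrative subset chosen for the example, not the actual list of available.packages columns, and MyCustomField is a made-up name.

```r
# Illustrative subset of canonical fields (an assumption for this example)
canonical <- c("Package", "Version", "Depends", "Imports", "License", "MD5sum")

# Fields found in some hypothetical set of entries, in arbitrary order
fields_present <- c("MD5sum", "MyCustomField", "Version", "Package", "License")

# Canonical fields first, in canonical order, then everything else
field_order <- c(intersect(canonical, fields_present),
                 setdiff(fields_present, canonical))
print(field_order)
```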
After entry and field reordering, the final database of PACKAGES entries is written to all three PACKAGES files, overwriting the existing versions.
When verbose.level is 0, no extra messages are displayed to the user. When it is 1, detailed information about what is happening is conveyed via messages, but underlying machinery from write_PACKAGES is invoked with verbose = FALSE. Behavior when verbose.level is 2 is identical to verbose.level 1, except that underlying machinery from write_PACKAGES is invoked with verbose = TRUE, which individually lists every processed tarball.
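The three levels can be summarised as a small mapping. The function verbosity_plan and its return names are hypothetical and serve only to restate the rules above.

```r
# Hypothetical summary of what each verbose.level implies
verbosity_plan <- function(verbose.level = 0L) {
  stopifnot(verbose.level %in% 0:2)
  list(
    show_messages = verbose.level >= 1,   # update_PACKAGES' own messages
    inner_verbose = verbose.level == 2    # 'verbose' passed to the write_PACKAGES machinery
  )
}

str(verbosity_plan(0))
str(verbosity_plan(1))
str(verbosity_plan(2))
```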
Note
While both strict and non-strict modes can offer speedups when updating small percentages of large repositories, non-strict mode is much faster and is recommended in situations where the assumption it makes about tarballs' filenames encoding accurate information is safe.
Note
Users should expect significantly smaller speedups over write_PACKAGES in the type == "win.binary" case on at least some operating systems. This is due to write_PACKAGES being significantly faster in this context, rather than update_PACKAGES being slower.
Author(s)
Gabriel Becker (adapted from previous, related work by him in the switchr package which is copyright Genentech, Inc.)
See Also
write_PACKAGES
Examples
## Not run:
write_PACKAGES("c:/myFolder/myRepository") # on Windows
update_PACKAGES("c:/myFolder/myRepository") # on Windows
write_PACKAGES("/pub/RWin/bin/windows/contrib/2.9",
type = "win.binary") # on Linux
update_PACKAGES("/pub/RWin/bin/windows/contrib/2.9",
type = "win.binary") # on Linux
## End(Not run)
Copyright (©) 1999–2012 R Foundation for Statistical Computing.
Licensed under the GNU General Public License.