agrep Approximate String Matching (Fuzzy Matching)
Description
Searches for approximate matches to pattern (the first argument) within each element of the character vector x (the second argument) using the generalized Levenshtein edit distance (the minimal, possibly weighted, number of insertions, deletions and substitutions needed to transform one string into another).
Usage
agrep(pattern, x, max.distance = 0.1, costs = NULL,
      ignore.case = FALSE, value = FALSE, fixed = TRUE,
      useBytes = FALSE)
agrepl(pattern, x, max.distance = 0.1, costs = NULL,
       ignore.case = FALSE, fixed = TRUE, useBytes = FALSE)
Arguments
| pattern | a non-empty character string to be matched. For  | 
| x | character vector where matches are sought. Coerced by  | 
| max.distance | Maximum distance allowed for a match. Expressed either as integer, or as a fraction of the pattern length times the maximal transformation cost (will be replaced by the smallest integer not less than the corresponding fraction), or a list with possible components 
 If  | 
| costs | a numeric vector or list with names partially matching insertions, deletions and substitutions giving the respective costs for computing the generalized Levenshtein distance, or NULL (the default) indicating unit cost for all three possible transformations. | 
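| ignore.case | if FALSE, the pattern matching is case sensitive, and if TRUE, case is ignored during matching. | 
| value | if FALSE, a vector containing the (integer) indices of the matching elements is returned, and if TRUE, a vector containing the matching elements themselves is returned. | 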
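| fixed | logical. If TRUE (the default), the pattern is matched literally (as is). Otherwise, it is matched as a regular expression. | 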
| useBytes | logical. In a multibyte locale, should the comparison be character-by-character (the default) or byte-by-byte? | 
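The fractional form of max.distance can be checked by hand. The following is a minimal sketch, assuming unit costs (costs = NULL) so that the maximal transformation cost is 1; the variable names are illustrative only and are not part of the interface.

```r
pattern <- "lasy"
x <- c("1 lazy 2", "1 lasy 2")

# Fraction 0.1 of the pattern length (4 characters) times the maximal
# transformation cost (assumed to be 1), rounded up to the next integer.
bound <- ceiling(0.1 * nchar(pattern) * 1)

# Under these assumptions the default fraction and the explicit integer
# bound should select the same elements.
by_fraction <- agrep(pattern, x, max.distance = 0.1)
by_integer  <- agrep(pattern, x, max.distance = bound)

# A list bounds individual operations; "sub" is the abbreviated
# component name used in the Examples section of this page.
no_subst <- agrep(pattern, x, max.distance = list(sub = 0))
```

Comparing by_fraction with by_integer is a quick way to confirm how a fractional bound is interpreted for a given pattern length.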
Details
The Levenshtein edit distance is used as measure of approximateness: it is the (possibly cost-weighted) total number of insertions, deletions and substitutions required to transform one string into another.
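To make the distance concrete, the sketch below uses adist from package utils (referenced elsewhere on this page) to compute whole-string distances, first with unit costs and then with a weighted cost vector. The particular weights are an arbitrary illustration, not recommended settings.

```r
# Unit costs: "lasy" becomes "lazy" with a single substitution (s -> z).
d_unit <- utils::adist("lasy", "lazy")

# A classic pair: two substitutions and one insertion.
d_kitten <- utils::adist("kitten", "sitting")

# Weighted variant: substitutions cost 2, insertions and deletions cost 1.
# The cheapest transformation is then either one substitution or one
# deletion plus one insertion.
w <- c(insertions = 1, deletions = 1, substitutions = 2)
d_weighted <- utils::adist("lasy", "lazy", costs = w)
```

adist returns a matrix with one row per element of its first argument and one column per element of its second, so each result here is a 1 x 1 matrix.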
This uses the tre code by Ville Laurikari (https://github.com/laurikari/tre), which supports MBCS character matching. 
The main effect of useBytes is to avoid errors/warnings about invalid inputs and spurious matches in multibyte locales. It inhibits the conversion of inputs with marked encodings, and is forced if any input is found which is marked as "bytes" (see Encoding). 
Value
agrep returns a vector giving the indices of the elements that yielded a match, or, if value is TRUE, the matched elements (after coercion, preserving names but no other attributes). 
agrepl returns a logical vector. 
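The three return forms describe the same set of matches. The sketch below, which uses a named input vector chosen purely for illustration, shows one way to check that they agree.

```r
x <- c(a = "1 lazy", b = "1", c = "1 LAZY")

idx  <- agrep("laysy", x, max.distance = 2)                # indices
vals <- agrep("laysy", x, max.distance = 2, value = TRUE)  # matched elements
hits <- agrepl("laysy", x, max.distance = 2)               # one logical per element

# Names are dropped before comparing so that only positions and
# contents are checked.
same_idx  <- identical(unname(which(hits)), as.integer(idx))
same_vals <- identical(unname(vals), unname(x[hits]))
```

agrepl is convenient for subsetting or for use inside conditions, since its result always has the same length as x.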
Note
Since someone who read the description carelessly even filed a bug report on it, do note that this matches substrings of each element of x (just as grep does) and not whole elements. See also adist in package utils, which optionally returns the offsets of the matched substrings. 
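The difference between substring and whole-element matching can be seen in a small sketch. It assumes that adist, called with its defaults, compares complete strings; thresholding its result is shown only as one possible way to obtain whole-element matching, not as the recommended one.

```r
x <- c("1 lazy 2", "lazy")

# agrep() looks for an approximate match anywhere inside each element,
# so the surrounding "1 " and " 2" do not count against the first element.
sub_hits <- agrep("lasy", x, max.distance = 1)

# Whole-element distances: every extra character in x[1] now costs an edit.
d_whole <- drop(utils::adist("lasy", x))
whole_hits <- which(d_whole <= 1)
```

Here sub_hits covers both elements, while whole_hits keeps only the element that is within one edit of the pattern as a whole.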
Author(s)
Original version in R < 2.10.0 by David Meyer. Current version by Brian Ripley and Kurt Hornik.
See Also
grep, adist. A different interface to approximate string matching is provided by aregexec(). 
Examples
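## Default bound (10% of the pattern length, rounded up)
agrep("lasy", "1 lazy 2")
## Disallow substitutions via a list-valued max.distance
agrep("lasy", c(" 1 lazy 2", "1 lasy 2"), max.distance = list(sub = 0))
## Integer bound: at most 2 edits; matching is case sensitive by default
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2)
## Return the matched elements instead of their indices
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2, value = TRUE)
## Ignore case while matching
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2, ignore.case = TRUE)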
    Copyright (©) 1999–2012 R Foundation for Statistical Computing.
Licensed under the GNU General Public License.