unidecode
This module is based on Python's Unidecode module by Tomaz Solc, which in turn is based on the Text::Unidecode
Perl module by Sean M. Burke (http://search.cpan.org/~sburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm).
It provides a single proc that performs Unicode to ASCII transliteration: it finds the sequence of ASCII characters that is the closest approximation to the Unicode string.
For example, the closest ASCII approximation of the string "Äußerst" is "Ausserst". Some information is lost in this transformation, since several Unicode strings can map to the same ASCII representation, so it is strictly one-way. However, a human reader will probably still be able to guess from the context which original string was meant.
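A minimal usage sketch of the transliteration described above. It assumes the module can be imported as `std/unidecode` (older Nim versions may need plain `import unidecode`) and that the translation table is embedded, which is the default. The input string is the one from the text; the second call only illustrates why the transformation is one-way.

```nim
import std/unidecode

# Transliterate a UTF-8 string to its closest ASCII approximation.
let ascii = unidecode("Äußerst")
echo ascii

# A plain ASCII string passes through unchanged, so two different inputs
# end up with the same output and the original cannot be recovered.
doAssert unidecode("Ausserst") == ascii
```

Because the result is lossy, it is probably best suited to things like search keys or file names, not to storage of the original text.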
This module needs the data file "unidecode.dat" to work. By default this file is embedded as a resource into your application. Alternatively, you can define the symbol --define:noUnidecodeTable at compile time and use the loadUnidecodeTable proc to initialize this module at run time.
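The two initialization modes can be sketched as follows. This is an illustration under assumptions, not the module's prescribed setup: `dataPath` is a hypothetical location for the data file, and `app.nim` in the compile command is a hypothetical file name. With the default build the `when` branch is skipped entirely; with `nim c --define:noUnidecodeTable app.nim` the table is loaded from disk at startup.

```nim
import std/unidecode

# Hypothetical path: point this at wherever unidecode.dat is shipped
# alongside the application.
const dataPath = "unidecode.dat"

when defined(noUnidecodeTable):
  # The table was not embedded at compile time, so load it once on the
  # main thread before any other thread calls unidecode.
  loadUnidecodeTable(dataPath)

echo unidecode("Äußerst")
```

In the run-time loading mode the data file has to be readable at `dataPath` when the program starts, so it needs to be distributed together with the executable.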
Imports
Procs
proc loadUnidecodeTable(datafile = "unidecode.dat") {.raises: [], tags: [].}
- Loads the datafile that unidecode needs to work. This is only required if the module was compiled with the --define:noUnidecodeTable switch. It must be called by the main thread before any thread makes a call to unidecode.
proc unidecode(s: string): string {.raises: [], tags: [].}
- Finds the sequence of ASCII characters that is the closest approximation to the UTF-8 string s.
Example:
unidecode("北京")
Results in: "Bei Jing"
© 2006–2021 Andreas Rumpf
Licensed under the MIT License.
https://nim-lang.org/docs/unidecode.html