std.csv
Implements functionality to read Comma Separated Values and its variants from an input range of dchar.
Comma Separated Values provide a simple means to transfer and store tabular data. It has been common for programs to use their own variant of the CSV format. This parser loosely follows RFC 4180. CSV input should adhere to the following criteria (differences from RFC 4180 in parentheses):
- A record is separated by a new line (CRLF,LF,CR)
- A final record may end with a new line
- A header may be provided as the first record in input
- A record has fields separated by a comma (customizable)
- A field containing new lines, commas, or double quotes should be enclosed in double quotes (customizable)
- Double quotes in a field are escaped with a double quote
- Each record should contain the same number of fields
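As a hedged illustration of the quoting and record-separator criteria above, the following sketch reads fields that contain commas, escaped double quotes, and new lines. The helper parseRows is a hypothetical name introduced for this example and is not part of std.csv.

```d
import std.array : array;
import std.csv : csvReader;

// Hypothetical helper (not part of std.csv): collect every record
// as an array of string fields.
string[][] parseRows(string text)
{
    string[][] rows;
    foreach (record; csvReader!string(text))
        rows ~= record.array; // consume the fields of the current record
    return rows;
}

void main()
{
    // The second field holds a comma; the third holds escaped double quotes.
    // Records end with CRLF and LF, and the final new line is optional.
    auto rows = parseRows("a,\"b,c\",\"say \"\"hi\"\"\"\r\n1,2,3\n");
    assert(rows == [["a", "b,c", "say \"hi\""], ["1", "2", "3"]]);

    // A quoted field may span more than one line.
    assert(parseRows("x,\"line1\nline2\"") == [["x", "line1\nline2"]]);
}
```

Collecting each record with array inside the loop keeps the fields available after the reader has moved on to the next record.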
- Example
    import std.algorithm;
    import std.array;
    import std.csv;
    import std.stdio;
    import std.typecons;

    void main()
    {
        auto text = "Joe,Carpenter,300000\nFred,Blacksmith,400000\r\n";

        foreach (record; csvReader!(Tuple!(string, string, int))(text))
        {
            writefln("%s works as a %s and earns $%d per year",
                     record[0], record[1], record[2]);
        }

        // To read the same string from the file "filename.csv":
        auto file = File("filename.csv", "r");
        foreach (record;
                 file.byLine.joiner("\n").csvReader!(Tuple!(string, string, int)))
        {
            writefln("%s works as a %s and earns $%d per year",
                     record[0], record[1], record[2]);
        }
    }

When an input contains a header, the Contents can be specified as an associative array. Pass null to signify that a header is present.

    auto text = "Name,Occupation,Salary\r" ~
        "Joe,Carpenter,300000\nFred,Blacksmith,400000\r\n";

    foreach (record; csvReader!(string[string])(text, null))
    {
        writefln("%s works as a %s and earns $%s per year.",
                 record["Name"], record["Occupation"], record["Salary"]);
    }

    // To read the same string from the file "filename.csv":
    auto file = File("filename.csv", "r");
    foreach (record; csvReader!(string[string])(file.byLine.joiner("\n"), null))
    {
        writefln("%s works as a %s and earns $%s per year.",
                 record["Name"], record["Occupation"], record["Salary"]);
    }

This module allows content to be iterated by record stored in a struct, class, associative array, or as a range of fields. Upon detection of an error a CSVException is thrown (can be disabled). csvNextToken has been made public to allow for attempted recovery.

Disabling exceptions will lift many restrictions specified above. A quote can appear in a field if the field was not quoted. In a quoted field, any quote by itself that is not at the end of the field ends processing for that field. The field is ended when there is no input, even if the quote was not closed.
- See Also:
- Wikipedia Comma-separated values
- License:
- Boost License 1.0.
- Authors:
- Jesse Phillips
- Source
- std/csv.d
- class CSVException: object.Exception;
-
Exception containing the row and column for when an exception was thrown.
Numbering of both row and col starts at one and corresponds to the location in the file rather than any specified header. Special consideration should be made when there is a failure to match the header; see HeaderMismatchException for details.
When performing type conversions, std.conv.ConvException is stored in the next field.
- Examples:
-
    import std.exception : collectException;
    import std.algorithm.searching : count;

    string text = "a,b,c\nHello,65";
    auto ex = collectException!CSVException(csvReader(text).count);
    // "(Row: 0, Col: 0) Row 2's length 2 does not match previous length of 3."
    writeln(ex.toString);
- Examples:
-
    import std.exception : collectException;
    import std.algorithm.searching : count;
    import std.typecons : Tuple;

    string text = "a,b\nHello,65";
    auto ex = collectException!CSVException(csvReader!(Tuple!(string,int))(text).count);
    // "(Row: 1, Col: 2) Unexpected 'b' when converting from type string to type int"
    writeln(ex.toString);
- size_t row;
- size_t col;
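A minimal sketch of reading row, col, and next from a caught CSVException, assuming (string, int) records as in the example above. The helper firstError is a hypothetical name used only for illustration.

```d
import std.conv : ConvException;
import std.csv : csvReader, CSVException;
import std.typecons : Tuple;

// Hypothetical helper (not part of std.csv): return the first
// CSVException raised while reading (string, int) records, or null.
CSVException firstError(string text)
{
    try
    {
        foreach (record; csvReader!(Tuple!(string, int))(text))
        {
            // Nothing to do: each record is converted as it is read.
        }
    }
    catch (CSVException e)
    {
        return e;
    }
    return null;
}

void main()
{
    import std.stdio : writefln;

    // "b" in the first record cannot be converted to int.
    auto ex = firstError("a,b\nHello,65");
    assert(ex !is null);
    writefln("failed at row %s, column %s: %s", ex.row, ex.col, ex.msg);

    // The underlying conversion error is chained through the next field.
    assert(cast(ConvException) ex.next !is null);
}
```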
- class IncompleteCellException: std.csv.CSVException;
-
Exception thrown when a token is found to be incomplete: a quote is found in an unquoted field, data continues after a closing quote, or the quoted field was not closed before data was empty.
- Examples:
-
    import std.exception : assertThrown;

    string text = "a,\"b,c\nHello,65,2.5";
    assertThrown!IncompleteCellException(text.csvReader(["a","b","c"]));
- dstring partialData;
-
Data pulled from input before finding a problem.
This field is populated when using csvReader but not by csvNextToken as this data will have already been fed to the output range.
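The following sketch, offered as one possible way to inspect partialData rather than a prescribed pattern, collects the exception raised by csvReader for an unclosed quoted field. The helper findIncomplete is a hypothetical name.

```d
import std.algorithm.searching : count;
import std.csv : csvReader, IncompleteCellException;
import std.exception : collectException;

// Hypothetical helper (not part of std.csv): walk all records and return
// the IncompleteCellException that was raised, or null if there was none.
IncompleteCellException findIncomplete(string text)
{
    return collectException!IncompleteCellException(text.csvReader.count);
}

void main()
{
    import std.stdio : writeln;

    // The last field opens a quote that is never closed.
    auto ex = findIncomplete("a,b,c\nHello,65,\"2.5");
    assert(ex !is null);
    writeln(ex.msg);
    writeln(ex.partialData); // data read before the problem was found
}
```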
- class HeaderMismatchException: std.csv.CSVException;
-
Exception thrown under different conditions based on the type of Contents.
Structure, Class, and Associative Array
- When a header is provided but a matching column is not found
Other
- When a header is provided but a matching column is not found
- Order did not match that found in the input
Since a row and column is not meaningful when a column specified by the header is not found in the data, both row and col will be zero. Otherwise row is always one and col is the first instance found in header that occurred before the previous starting at one.
- Examples:
-
    import std.exception : assertThrown;

    string text = "a,b,c\nHello,65,2.5";
    assertThrown!HeaderMismatchException(text.csvReader(["b","c","invalid"]));
- enum Malformed: int;
-
Determines the behavior for when an error is detected.
When exceptions are disabled, the following rules apply:
- A quote can appear in a field if the field was not quoted.
- In a quoted field, any quote by itself that is not at the end of the field ends processing for that field.
- The field is ended when there is no input, even if the quote was not closed.
- If the given header does not match the order in the input, the content will return as it is found in the input.
- If the given header contains columns not found in the input they will be ignored.
- Examples:
-
    import std.algorithm.comparison : equal;
    import std.algorithm.searching : count;
    import std.exception : assertThrown;

    string text = "a,b,c\nHello,65,\"2.5";
    assertThrown!IncompleteCellException(text.csvReader.count);

    // ignore the exceptions and try to handle invalid CSV
    auto firstLine = text.csvReader!(string, Malformed.ignore)(null).front;
    assert(firstLine.equal(["Hello", "65", "2.5"]));
- ignore
-
No exceptions are thrown due to incorrect CSV.
- throwException
-
Use exceptions when input has incorrect CSV.
- auto csvReader(Contents = string, Malformed ErrorLevel = Malformed.throwException, Range, Separator = char)(Range input, Separator delimiter = ',', Separator quote = '"')
Constraints: if (isInputRange!Range && is(immutable(ElementType!Range) == immutable(dchar)) && isSomeChar!Separator && !is(Contents T : T[U], U : string));
auto csvReader(Contents = string, Malformed ErrorLevel = Malformed.throwException, Range, Header, Separator = char)(Range input, Header header, Separator delimiter = ',', Separator quote = '"')
Constraints: if (isInputRange!Range && is(immutable(ElementType!Range) == immutable(dchar)) && isSomeChar!Separator && isForwardRange!Header && isSomeString!(ElementType!Header));
auto csvReader(Contents = string, Malformed ErrorLevel = Malformed.throwException, Range, Header, Separator = char)(Range input, Header header, Separator delimiter = ',', Separator quote = '"')
Constraints: if (isInputRange!Range && is(immutable(ElementType!Range) == immutable(dchar)) && isSomeChar!Separator && is(Header : typeof(null)));
-
Returns an input range for iterating over records found in input.
An optional header can be provided. The first record will be read in as the header. If Contents is a struct then the header provided is expected to correspond to the fields in the struct. When Contents is not a type which can contain the entire record, the header must be provided in the same order as the input or an exception is thrown.
- Returns:
- An input range R as defined by std.range.primitives.isInputRange. When Contents is a struct, class, or an associative array, the element type of R is Contents, otherwise the element type of R is itself a range with element type Contents. If a header argument is provided, the returned range provides a header field for accessing the header from the input in array form.
- Throws:
-
CSVException when a quote is found in an unquoted field, data continues after a closing quote, the quoted field was not closed before data was empty, a conversion failed, or when the row's length does not match the previous length.
HeaderMismatchException when a header is provided but a matching column is not found or the order did not match that found in the input. Read the exception documentation for specific details of when the exception is thrown for different types of Contents.
- Examples:
- The Contents of the input can be provided if all the records are the same type such as all integer data:
    import std.algorithm.comparison : equal;

    string text = "76,26,22";
    auto records = text.csvReader!int;
    assert(records.equal!equal([
        [76, 26, 22],
    ]));
- Examples:
- Using a struct with modified delimiter:
    import std.algorithm.comparison : equal;

    string text = "Hello;65;2.5\nWorld;123;7.5";
    struct Layout
    {
        string name;
        int value;
        double other;
    }

    auto records = text.csvReader!Layout(';');
    assert(records.equal([
        Layout("Hello", 65, 2.5),
        Layout("World", 123, 7.5),
    ]));
- Examples:
- Specifying ErrorLevel as Malformed.ignore will lift restrictions on the format. This example shows that an exception is not thrown when finding a quote in a field not quoted.
    string text = "A \" is now part of the data";
    auto records = text.csvReader!(string, Malformed.ignore);
    auto record = records.front;
    writeln(record.front); // text
- Examples:
- Read only column "b"
    import std.algorithm.comparison : equal;

    string text = "a,b,c\nHello,65,63.63\nWorld,123,3673.562";
    auto records = text.csvReader!int(["b"]);
    assert(records.equal!equal([
        [65],
        [123],
    ]));
- Examples:
- Read while rearranging the columns by specifying a header with a different order
    import std.algorithm.comparison : equal;

    string text = "a,b,c\nHello,65,2.5\nWorld,123,7.5";
    struct Layout
    {
        int value;
        double other;
        string name;
    }

    auto records = text.csvReader!Layout(["b","c","a"]);
    assert(records.equal([
        Layout(65, 2.5, "Hello"),
        Layout(123, 7.5, "World")
    ]));
- Examples:
- The header can also be left empty if the input contains a header row and all columns should be iterated. The header from the input can always be accessed from the header field.
    string text = "a,b,c\nHello,65,63.63";
    auto records = text.csvReader(null);
    writeln(records.header); // ["a", "b", "c"]
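The header and delimiter arguments can be combined. As a hedged sketch, the following reads tab-separated text whose first record is a header into associative arrays; the helper columnValues is a hypothetical name and not part of std.csv.

```d
import std.csv : csvReader;

// Hypothetical helper (not part of std.csv): read tab-separated text whose
// first record is a header and return the values of one named column.
string[] columnValues(string text, string column)
{
    string[] values;
    // null: take the header from the input; '\t': custom delimiter.
    foreach (record; csvReader!(string[string])(text, null, '\t'))
        values ~= record[column]; // copy the field out right away
    return values;
}

void main()
{
    auto text = "Name\tOccupation\nJoe\tCarpenter\nFred\tBlacksmith\n";
    assert(columnValues(text, "Occupation") == ["Carpenter", "Blacksmith"]);
}
```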
- void csvNextToken(Range, Malformed ErrorLevel = Malformed.throwException, Separator, Output)(ref Range input, ref Output ans, Separator sep, Separator quote, bool startQuoted = false)
Constraints: if (isSomeChar!Separator && isInputRange!Range && is(immutable(ElementType!Range) == immutable(dchar)) && isOutputRange!(Output, dchar));
-
Lower level control over parsing CSV.
This function consumes the input. After each call the input will start with either a delimiter or record break (\n, \r\n, \r) which must be removed for subsequent calls.
- Parameters:
- Range input: Any CSV input
- Output ans: The first field in the input
- Separator sep: The character to represent a comma in the specification
- Separator quote: The character to represent a quote in the specification
- bool startQuoted: Whether the input should be considered to already be in quotes
- Throws:
-
IncompleteCellException when a quote is found in an unquoted field, data continues after a closing quote, or the quoted field was not closed before data was empty.
- Examples:
-
    import std.array : appender;
    import std.range.primitives : popFront;

    string str = "65,63\n123,3673";
    auto a = appender!(char[])();

    csvNextToken(str, a, ',', '"');
    writeln(a.data); // "65"
    writeln(str); // ",63\n123,3673"

    str.popFront();
    a.shrinkTo(0);
    csvNextToken(str, a, ',', '"');
    writeln(a.data); // "63"
    writeln(str); // "\n123,3673"

    str.popFront();
    a.shrinkTo(0);
    csvNextToken(str, a, ',', '"');
    writeln(a.data); // "123"
    writeln(str); // ",3673"
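The manual steps above can be wrapped in a loop. The following is a minimal sketch, assuming the caller only wants the fields of the first record; the helper splitRecord is a hypothetical name and not part of std.csv.

```d
import std.array : appender;
import std.csv : csvNextToken;
import std.range.primitives : empty, front, popFront;

// Hypothetical helper (not part of std.csv): split the first record of
// `input` into fields by calling csvNextToken repeatedly.
string[] splitRecord(string input, char sep = ',', char quote = '"')
{
    string[] fields;
    auto token = appender!(char[])();
    while (true)
    {
        csvNextToken(input, token, sep, quote);
        fields ~= token.data.idup; // copy before the buffer is reused
        token.shrinkTo(0);

        // Stop at the end of input or at a record break.
        if (input.empty || input.front != sep)
            break;
        input.popFront(); // drop the delimiter before the next call
    }
    return fields;
}

void main()
{
    // Parsing stops at the first record break; the quoted comma is kept.
    assert(splitRecord("65,\"a,b\",x\n123,4,y") == ["65", "a,b", "x"]);
}
```

Because input is passed by value here, the caller's string is left untouched; a variant that takes the input by ref could continue with the next record after removing the record break.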
© 1999–2021 The D Language Foundation
Licensed under the Boost License 1.0.
https://dlang.org/phobos/std_csv.html