Structs DSV Format

The XINA Structs DSV (delimiter separated values) format provides a standard delimited text data file format. This is recommended for data files attached to events, and forms the basis for the structs buffer file format.

Files have certain standard requirements:

Must be UTF-8 encoded
New lines will be interpretted from either \n or \r\n
Blank lines will be ignored
Lines starting with the # character are treated as comments and ignored

The conf object may define other customization of the format:

Key	Value	Default	Description
delimiter	string	auto detect (`','`, `'\t'`, `';'`)	value delimiter
quote_char	character	`"` (double quote character)	value quote character
ignore_lines	number	`0`	lines to ignore at the start of the file
mode	`"row"` or `"col"`	auto-detect	file row/col format (see below)
t	`"auto"`, `"iso8601"`, `"s"`, `"ms"`, or `"us"`	`"auto"`	time format (see below)
zone	string		time zone to use if not provided
invalid	`"ignore"`, `null`, or number	`"ignore"`	preferred interpretation of invalid literal
nan	`"ignore"`, `null`, or number	`"ignore"`	preferred interpretation of `'Nan'` literal
p_infinity	`"ignore"`, `null`, or number	`"ignore"`	preferred interpretation of positive `'Infinity'` literal
n_infinity	`"ignore"`, `null`, or number	`"ignore"`	preferred interpretation of negative `'Infinity'` literal

It is strongly recommended to include a unique appropriately generated 128-bit UUID in the standard 36 character format as a comment in the first processed line of each file. (If ignore_lines > 0, this would be the first line after that number of lines.)

The first processed uncommented line will be intepretted as the column header. If the mode property is "row", the file must contain three columns:

Name	Description	Alternate Names
t	Unix time or ISO8601 zoned timestamp	time, timestamp
k	key	key, mn, mnemonic, n, name
v	value (numeric, empty, or `null`)	val, value

The header is used to determine the order of the columns.

For example (whitespace added for clarity, not required):

# 123e4567-e89b-12d3-a456-426614174000
t , k     , v
0 , v_mon , 1
0 , i_mon , 5
1 , t_mon , 100
2 , v_mon , 1.1
2 , i_mon , 4
3 , t_mon ,
4 , v_mon , 1.2
4 , i_mon , 3
5 , t_mon , 101

If mode is "col", the file must first contain a time column, followed by a column for each mnemonic. The column headers must specify the mnemonic name or ID for each column. Unlike row, null values must be spelled out explicitly, as empty values will not create a point in the database.

For example, the following is equivalent to the above example (whitespace added for clarity, not required):

# 123e4567-e89b-12d3-a456-426614174000
t       , v_mon , i_mon , t_mon
0       , 1     , 5     ,
1       ,       ,       , 100
2       , 1.1   , 4     ,
3       ,       ,       , null
4       , 1.2   , 3     ,
5       ,       ,       , 101

If the mode property is not specified, the mode will be determined by the number of columns in the file. If there are exactly 3 columns with names matching the required columns for the "row" mode, that mode is used; otherwise the file is assumed to use the column mode.

Time Parsing

The mode of time processing is determined by the value for t in conf. The auto mode attempts to interpret the most likely formatting for the timestamp. If the value is an integer or floating point format, it will be interpretted as a Unix timestamp, with precision based on these rules:

t > 1e16: error, value above typical range
t > 1e14: microseconds
t > 1e11: milliseconds
t > 1e8: seconds
t <= 1e8: error, value below typical range

Otherwise it will be interpretted as a zoned ISO8601 timestamp. If t is set explicitly in the configuration the time will always be interpretted in that context. The ISO timestamp may use the standard format: 2023-05-31T17:55:07.000 or condensed 20230531T175507.000. If the zone property provided in the configuration, the timestamps do not require a zone. Otherwise they must include an explicit zone.