Skip to main content

Structs DSV Format

The XINA Structs DSV (delimiter separated values) format provides a standard delimited text data file format. This is recommended for data files attached to events, and forms the basis for the structs buffer file format.

Files have certain standard requirements:

  • Must be UTF-8 encoded
  • New lines will be interpretted from either \n or \r\n
  • Blank lines will be ignored
  • Lines starting with the # character are treated as comments and ignored

The conf object may define other customization of the format:

literals(see
Key Value Default Description
delimiter string auto detect (',', '\t',(comma ';')character) value delimiter
quote_char character " (double quote character) value quote character
ignore_lines number 0 lines to ignore at the start of the file
mode"row" or "col"auto-detectfile row/col format (see below)
t"auto", "iso8601", "s", "ms", or "us""auto"time format (see below)
zone string UTC time zone to use if not provided
invalidvalues "ignore",JSON null, or numberobject "ignore" preferred interpretation of invalidstring literal
nan"ignore", null, or number"ignore"preferred interpretation of 'Nan' literal
p_infinity"ignore", null, or number"ignore"preferred interpretation of positive 'Infinity' literal
n_infinity"ignore", null, or number"ignore"preferred interpretation of negative 'Infinity' literalbelow)

It is strongly recommended to include a unique appropriately generated 128-bit UUID in the standard 36 character format as a comment in the first processed line of each file. (If ignore_lines > 0, this would be the first line after that number of lines.)

There are two format modes for DSV files: the row format, in which each line contains a time, label, and value, and the column format, in which each row contains a time and one or more values. The first processed uncommented line will be intepretted as the column header.header, If the mode propertywhich is "row",used to determine the file mustformat. containThe file will be treated as row mode if it contains exactly three columns:columns, with each having one of the reserved column names in the table below.

Name Description Alternate Names
t Unix time or ISO8601 zoned timestamp ts, time, timestamptimestamp, datetime
k key key, m, m_id, mn, mn_id, mnemonic, mnemonic_id, n, name
v value (numeric, empty, or null) val, value

The header is used to determine the order of the columns.

For example (whitespace added for clarity, not required):

# 123e4567-e89b-12d3-a456-426614174000
t , k     , v
0 , v_mon , 1
0 , i_mon , 5
1 , t_mon , 100
2 , v_mon , 1.1
2 , i_mon , 4
3 , t_mon , null
4 , v_mon , 1.2
4 , i_mon , 3
5 , t_mon , 101

If mode is "col",Otherwise, the file mustwill be interpretted as the column format, where the first containmust abe the time column, followed by a column for each mnemonic. The column headers must specify the mnemonic name or ID for each column. Unlike row, null values must be spelled out explicitly, as empty values will not create a point in the database.

For example, the following is equivalent to the above example (whitespace added for clarity, not required):

# 123e4567-e89b-12d3-a456-426614174000
t , v_mon , i_mon , t_mon
0 , 1     , 5     ,
1 ,       ,       , 100
2 , 1.1   , 4     ,
3 ,       ,       , null
4 , 1.2   , 3     ,
5 ,       ,       , 101

If the mode property is not specified, the mode will be determined by the number of columns in the file. If there are exactly 3 columns with names matching the required columns for the "row" mode, that mode is used; otherwise the file is assumed to use the column mode.

Time Parsing

The mode of time processing is determined by the value for t in conf. The auto mode attempts to interpret the most likely formatting for the timestamp. If the value is an integer or floating point format, it will be interpretted as a Unix timestamp, with precision based on these rules:

  • t > 1e16: error, value above typical range
  • t > 1e14: microseconds
  • t > 1e11: milliseconds
  • t > 1e8: seconds
  • t <= 1e8: error, value below typical range

Otherwise it will be interpretted as a zoned ISO8601 timestamp. If t is set explicitly in the configuration the time will always be interpretted in that context. The ISO timestamp may use the standard format:

2023-05-31T17:55:07.000

or

Or condensed format:

20230531T175507.000.

If the zone property provided in the configuration, the timestamps do not require a zone. Otherwise they must include an explicit zone.

Non-Numeric Values

Values which are non-numeric may be ignored, treated as null, or mapped to explicit values using the values property of the conf object. Ignored values are treated by XINA as though they do not exist in the file. null values are stored as an actual data point in the XINA database, but with the value null instead of a numeric value. (This is primarily useful to create a visual gap in plots.)

The following values are ignored by default (note, case-insensitive and whitespace agnostic):

  • "" (empty string)
  • "nv" (no value, ITOS)
  • "na"
  • "n/a"

The following values are interpretted as null by default (note, case-insensitive and whitespace agnostic):

  • "null"
  • "nil"
  • "none"
  • "nan"
  • "inf"
  • "+inf"
  • "-inf"
  • "infinity"
  • "+infinity"
  • "-infinity"

Custom value interpretations may be specified in the values object as either "ignore", null, or a numeric value. For example:

{
 "?": "ignore",
 "notta": null,
 "onetwothree": 123
}