
Telemetry

Telemetry refers to the raw data files for mnemonics (and potentially instants and intervals) in a data model.

Data Source

Abstractly, a data source (or simply source) is a single point of data import to a model. In many cases a model will have only a single data source; for example, if all data is provided directly from a single instrument, or if multiple components are merged into a single data stream through FEDS before import into XINA. In these cases delineation by data source is not required in the model organization, which should use this pattern:

In a model group, the data group will be used as the default location for telemetry which does not specify a source.
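For example (a sketch; the database names are illustrative and follow the default pattern described above):

model (group)
└── data (group)
    ├── instant (instant database)
    ├── interval (interval database)
    └── mnemonic (mnemonic database)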

However, in environments with multiple import points running in parallel, databases must be designed with multiple sources.
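For example (a sketch; source_a and source_b are illustrative source names):

model (group)
├── source_a (group)
│   ├── instant (instant database)
│   ├── interval (interval database)
│   └── mnemonic (mnemonic database)
└── source_b (group)
    ├── instant (instant database)
    ├── interval (interval database)
    └── mnemonic (mnemonic database)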

In this example each telemetry file would need to specify either source_a or source_b. Additionally, each source has distinct databases for instant, interval, and mnemonic data, which would be required if each data source provided all three data types. As the requirements for instants and intervals are less stringent than those for mnemonics, in some circumstances instants and intervals could be considered a single shared source and populated independently:
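For example (a sketch; here the default data group holds the shared instant and interval databases, while each source keeps its own mnemonic database):

model (group)
├── data (group)
│   ├── instant (instant database)
│   └── interval (interval database)
├── source_a (group)
│   └── mnemonic (mnemonic database)
└── source_b (group)
    └── mnemonic (mnemonic database)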

Telemetry Database

Each model must contain a single telemetry file database.

Required Fields

Field     Type                Description
src       asciivstring(32)    data source name (may be null)
u_id      uuid                universally unique ID
name      utf8vstring(128)    file name
t_start   instant(us)         start time of telemetry data
t_end     instant(us)         end time of telemetry data
meta      jsonobject          arbitrary metadata as needed (may be null)
format    asciivstring(16)    file format (see below)
conf      jsonobject          configuration parameters, depending on format (may be null)

If src is null, the telemetry file will be associated with the default source (the data group).
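For example, a telemetry file record might look like the following (a sketch, shown as JSON; the values are illustrative):

{
  "src": "source_a",
  "u_id": "123e4567-e89b-12d3-a456-426614174000",
  "name": "example.csv",
  "t_start": 1700000000000000,
  "t_end": 1700000005000000,
  "meta": {"bldg": 37, "room": 123},
  "format": "csv",
  "conf": null
}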

Telemetry Formats

Currently there is only one natively supported general purpose format, using the codes csv and tsv. The full documentation is available here. Additional formats will be added in the future, and custom project-specific formats may be added as needed.

CSV / TSV

The csv and tsv formats provide a standard delimited text file format. Files should be ASCII or UTF-8 encoded. New lines will be interpreted from either \n or \r\n. The conf object may define further customization of the format:

Key       Value       Default                          Description
delimit   string      csv: , (comma); tsv: \t (tab)    value delimiter
quote     character   " (double quote)                 value quote character
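For example, a conf overriding both keys with a semicolon delimiter and single-quote quoting (a sketch; the values are illustrative):

{
  "delimit": ";",
  "quote": "'"
}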

The first line must contain an appropriately generated 128-bit UUID in the standard 36-character format.

This may optionally be followed by one or more metadata values. These are treated as (key, value) pairs stored in XINA alongside the file. Each key must be unique within the file, must not be empty, and must not start with the $ (dollar sign) character. The values may be any JSON value, interpreted as follows:

  • if the value starts with a [ (bracket) or { (brace) character it will be interpreted as a JSON array or object, respectively
  • the literal values true and false will be stored as booleans
  • if the value is numeric it will be stored as a number
  • if the value is empty it will be stored as null
  • otherwise, it will be stored as a string
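For example, the following metadata lines (the keys are illustrative) would be stored as an array, a boolean, a number, null, and a string, respectively:

channels, [1, 2, 3]
active, true
bldg, 37
operator,
status, nominal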

The file must then have a row starting with one of two values: $mn_row or $mn_col. These indicate the end of any metadata and the start of the file data.

For $mn_row, the file must contain three columns, in this order:

  • time (Unix time or ISO8601 zoned timestamp)
  • mnemonic (name or ID)
  • value (numeric, empty string, or null)

For example (whitespace added for clarity, not required):

123e4567-e89b-12d3-a456-426614174000
bldg, 37
room, 123
$mn_row
0, v_mon, 1
0, i_mon, 5
1, t_mon, 100
2, v_mon, 1.1
2, i_mon, 4
3, t_mon,
4, v_mon, 1.2
4, i_mon, 3
5, t_mon, 101

For $mn_col, the file must first contain a time column, followed by a column for each mnemonic. The row starting with $mn_col must specify the column headers with the mnemonic name or ID for each column. Unlike $mn_row, null values must be spelled out explicitly, as empty string values will not create a point in the database.

For example, the following is equivalent to the above example (whitespace added for clarity, not required):

123e4567-e89b-12d3-a456-426614174000
bldg, 37
room, 123
$mn_col , v_mon , i_mon , t_mon
0       , 1     , 5     ,
1       ,       ,       , 100
2       , 1.1   , 4     ,
3       ,       ,       , null
4       , 1.2   , 3     ,
5       ,       ,       , 101

Assumptions and Limitations

Each telemetry file is considered the single source of truth for all mnemonics, instants, and intervals for its associated data source over its time range. This has the following implications:

Telemetry files with the same source cannot contain overlapping time ranges. If an import operation is performed with a file violating this constraint, the operation will fail and return an error.

Within a single model, each mnemonic may only come from a single source. Because mnemonics are not necessarily strictly associated with models, and the source may vary between models, this cannot be verified on import and must be verified on the client prior to importing data.
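For example, a client could track mnemonic-to-source assignments across staged files before import (a Python sketch; the data layout is illustrative):

# Verify each mnemonic maps to exactly one source across all staged files.
# `files` is an illustrative list of (source, mnemonics) pairs.
def check_mnemonic_sources(files):
    assigned = {}  # mnemonic -> source it was first seen with
    for source, mnemonics in files:
        for mn in mnemonics:
            prior = assigned.setdefault(mn, source)
            if prior != source:
                raise ValueError(
                    f"mnemonic {mn!r} assigned to both {prior!r} and {source!r}"
                )

check_mnemonic_sources([
    ("source_a", ["v_mon", "i_mon"]),
    ("source_b", ["t_mon"]),
])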

Data Flow

Telemetry data flow involves two phases: the import phase and the mining phase. Telemetry files must be imported with the MODEL_TM_IMPORT action. Files are parsed with some degree of validation (the details depend on the file format) and imported into the model telemetry database. Once this is complete the mining phase begins. Mining is performed asynchronously as a XINA Run task, and mining tasks are executed sequentially for each model by specifying a XINA Run thread for the model.

Full details of the mining process vary depending on the file format. The default csv format mining tool is documented here.
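As a sketch of the flow (hypothetical; only the MODEL_TM_IMPORT action name comes from this page — the client object and its action method are stand-ins for whatever transport a deployment actually uses):

# Import phase: parse and validate the file, then load it into the
# model telemetry database via the MODEL_TM_IMPORT action.
client.action("MODEL_TM_IMPORT", model="example_model", file="example.csv")

# Mining phase: runs afterwards as an asynchronous XINA Run task;
# mining tasks for a model execute sequentially on that model's
# XINA Run thread, so no further client coordination is needed.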

All of these concepts combine into two core paradigms of data flow management: server side aggregation and client side aggregation.

Server Side Aggregation

With server side aggregation, the XINA server is responsible for aggregating one or more data sources into a model. Typically this means XINA is also responsible for management of mnemonic definitions, using the MODEL_MN_IMPORT API call.
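Reusing the hypothetical client stand-in from the sketch above, mnemonic definitions might be pushed to the server along these lines (the definition fields shown are assumptions, not documented here):

client.action("MODEL_MN_IMPORT", model="example_model", mnemonics=[
    {"name": "v_mon"},  # illustrative mnemonic definitions
    {"name": "i_mon"},
    {"name": "t_mon"},
])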

Pros

  • minimal client side configuration required to get started
  • flexible and responsive to changing environments, mnemonics, requirements

Cons

  • performance is worse than client side aggregation
  • not recommended above 1k total data points per second
  • less stringent validation means user mistakes may go unnoticed

Client Side Aggregation

With client side aggregation, data for a model is aggregated entirely into a single data source on the client. This solution is common for telemetry generated directly by an instrument, or for multiple sources merged through FEDS. Typically the merged file(s) are in a binary format and require custom utilities to convert them to XINA formats; these utilities can be deployed within the XINA ecosystem to XINA Run servers.
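For instance, a conversion utility might emit the $mn_row format documented above along these lines (a Python sketch; in practice the records argument would come from a project-specific binary decoder):

import csv
import uuid

def write_mn_row(path, records, meta=None):
    # records is an iterable of (time, mnemonic, value) tuples decoded
    # from the project-specific binary format
    with open(path, "w", newline="") as f:
        out = csv.writer(f)
        out.writerow([str(uuid.uuid4())])   # required leading UUID
        for key, value in (meta or {}).items():
            out.writerow([key, value])      # optional metadata pairs
        out.writerow(["$mn_row"])           # end of metadata, start of data
        out.writerows(records)              # one (time, mnemonic, value) per row

write_mn_row("merged.csv", [(0, "v_mon", 1), (0, "i_mon", 5)], {"bldg": 37})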

Pros

  • much higher performance ceiling than server side aggregation
  • stringent validation ensures data conforms to standard

Cons

  • more complex initial setup
  • mnemonic definitions need coordination between client and server
  • changes are more complex and likely involve human interaction