XBin Format Reference
The XBin (XINA Binary) format provides a XINA standard binary format for time based data files. It uses the file extension xbin.
The xbin format organizes key-value data by time. The data content is a series of rows in ascending time order, with each row having a single microsecond precision Unix time, unique within the file.
Segment Format
XBin data is often encoded in segments, which are defined by an initial 1, 2, or 4 byte unsigned integer length, then that number of bytes. These are referred to in this document as:
- seg1 (up to 255 bytes)
- seg2 (up to 65,535 bytes)
- seg4 (up to 2,147,483,647 bytes)
If the length value of a segment is zero there is no following data and the value is considered empty.
Examples
The string "foo" has a 3 byte UTF-8 encoding: 0x66, 0x6f, 0x6f.
As a seg1, this is encoded with a total of 4 bytes (the initial byte containing the length, 3):
0x03 0x66 0x6f 0x6f
As a seg2, 5 bytes:
0x00 0x03 0x66 0x6f 0x6f
And as a seg4, 7 bytes:
0x00 0x00 0x00 0x03 0x66 0x6f 0x6f
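The segment encoding above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, and big-endian byte order for the length prefix is an assumption read off the seg2 and seg4 examples above.

```python
import struct

# Length-prefix formats by prefix width in bytes.
# Big-endian byte order is an assumption taken from the examples above.
_LEN_FORMATS = {1: ">B", 2: ">H", 4: ">I"}


def encode_segment(payload: bytes, width: int) -> bytes:
    """Prefix payload with its length as a `width`-byte unsigned integer."""
    # seg4 is capped at 2,147,483,647 bytes as documented above.
    limit = min(2 ** (8 * width) - 1, 2_147_483_647)
    if len(payload) > limit:
        raise ValueError(f"payload too long for seg{width}")
    return struct.pack(_LEN_FORMATS[width], len(payload)) + payload


def decode_segment(buf: bytes, offset: int, width: int):
    """Return (payload, offset of the next byte) for a segment at `offset`."""
    (length,) = struct.unpack_from(_LEN_FORMATS[width], buf, offset)
    start = offset + width
    end = start + length
    if end > len(buf):
        raise ValueError("truncated segment")
    return bytes(buf[start:end]), end
```

A zero length decodes to an empty payload, which matches the rule that an empty segment carries no following data.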
Value Format
Each value starts with a 1 byte unsigned integer indicating the value type, followed by additional byte(s) containing the value itself, as applicable.
Value Type Definition
Code | Value | Length (bytes) | Description |
---|---|---|---|
0 | null | 0 | literal null / empty string |
1 | ref dict index | 1 | index 0 to 255 (see below) |
2 | ref dict index | 2 | index 256 to 65,535 |
3 | ref dict index | 4 | index 65,536 to 2,147,483,647 |
4 | true | 0 | boolean literal |
5 | false | 0 | boolean literal |
6 | int1 | 1 | 1 byte signed integer |
7 | int2 | 2 | 2 byte signed integer |
8 | int4 | 4 | 4 byte signed integer |
9 | int8 | 8 | 8 byte signed integer |
10 | float4 | 4 | 4 byte floating point |
11 | float8 | 8 | 8 byte floating point |
12 | string1 | variable | seg1 UTF-8 encoded string |
13 | string2 | variable | seg2 UTF-8 encoded string |
14 | string4 | variable | seg4 UTF-8 encoded string |
15 | json1 | variable | seg1 UTF-8 encoded JSON |
16 | json2 | variable | seg2 UTF-8 encoded JSON |
17 | json4 | variable | seg4 UTF-8 encoded JSON |
18 | jsonarray1 | variable | seg1 UTF-8 encoded JSON array |
19 | jsonarray2 | variable | seg2 UTF-8 encoded JSON array |
20 | jsonarray4 | variable | seg4 UTF-8 encoded JSON array |
21 | jsonobject1 | variable | seg1 UTF-8 encoded JSON object |
22 | jsonobject2 | variable | seg2 UTF-8 encoded JSON object |
23 | jsonobject4 | variable | seg4 UTF-8 encoded JSON object |
24 | bytes1 | variable | seg1 raw byte array |
25 | bytes2 | variable | seg2 raw byte array |
26 | bytes4 | variable | seg4 raw byte array |
27 | xstring1 | variable | seg1 xstring |
28 | xstring2 | variable | seg2 xstring |
29 | xstring4 | variable | seg4 xstring |
30 | xjsonarray1 | variable | seg1 xjson array |
31 | xjsonarray2 | variable | seg2 xjson array |
32 | xjsonarray4 | variable | seg4 xjson array |
33 | xjsonobject1 | variable | seg1 xjson object |
34 | xjsonobject2 | variable | seg2 xjson object |
35 | xjsonobject4 | variable | seg4 xjson object |
36 - 255 | unused, reserved | | |
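As a rough illustration of how a reader might dispatch on the type code, the sketch below decodes the literal, numeric, string, JSON, and byte array types (codes 0 and 4 to 26) using the codes from the table above. The function name is hypothetical, big-endian byte order is assumed, and returning None for an empty JSON segment is an assumption of this sketch. Reference dict indexes and the x-types are deliberately left out.

```python
import json
import struct

# Fixed-width numeric types: type code -> struct format (big-endian assumed).
_FIXED = {6: ">b", 7: ">h", 8: ">i", 9: ">q", 10: ">f", 11: ">d"}
# Codes 12..26 come in groups of three using seg1, seg2, seg4 in that order.
_SEG_WIDTHS = (1, 2, 4)


def decode_value(buf: bytes, offset: int = 0):
    """Decode one value; return (python_value, offset of the next byte)."""
    code = buf[offset]
    offset += 1
    if code == 0:
        return None, offset
    if code in (4, 5):
        return code == 4, offset
    if code in _FIXED:
        fmt = _FIXED[code]
        (value,) = struct.unpack_from(fmt, buf, offset)
        return value, offset + struct.calcsize(fmt)
    if 12 <= code <= 26:
        width = _SEG_WIDTHS[(code - 12) % 3]
        length = int.from_bytes(buf[offset:offset + width], "big")
        offset += width
        if offset + length > len(buf):
            raise ValueError("truncated segment")
        payload = bytes(buf[offset:offset + length])
        offset += length
        if code <= 14:                      # string1, string2, string4
            return payload.decode("utf-8"), offset
        if code <= 23:                      # json, jsonarray, jsonobject
            # Treating an empty JSON segment as None is an assumption.
            return (json.loads(payload) if payload else None), offset
        return payload, offset              # bytes1, bytes2, bytes4
    raise NotImplementedError(f"type code {code} is not handled in this sketch")
```

The second element of the returned pair lets a caller decode consecutive values by feeding the offset back in.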
XString Format
The xstring value type allows chaining multiple encoded values to be interpreted as a single string. The xstring segment length must be the total number of bytes of all encoded values in the string.
Note that although any data type may be included in an xstring, the exact string representation of certain values may vary depending on the decoding environment (specifically, the formatting of floating point values) and thus it is not recommended to include them in xstring values. JSON values will be converted to their minimal string representation. Byte arrays will be converted to a hex string. Null values will be treated as an empty string.
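The chaining rule can be sketched as a loop over the segment body that decodes one value at a time and concatenates the string forms. The helper name is hypothetical, big-endian integers are assumed, and only null, signed integers, and string1 are handled to keep the example short. Type codes follow the Value Type Definition table.

```python
import struct

# int1, int2, int4, int8 from the type table (big-endian assumed).
_INTS = {6: ">b", 7: ">h", 8: ">i", 9: ">q"}


def xstring_to_str(body: bytes) -> str:
    """Join the string forms of the values chained inside an xstring body.

    `body` is the segment content only, without the xstring type code
    or the length prefix.
    """
    parts = []
    pos = 0
    while pos < len(body):
        code = body[pos]
        pos += 1
        if code == 0:                    # null contributes an empty string
            parts.append("")
        elif code in _INTS:
            fmt = _INTS[code]
            (number,) = struct.unpack_from(fmt, body, pos)
            pos += struct.calcsize(fmt)
            parts.append(str(number))
        elif code == 12:                 # string1: 1 byte length, then UTF-8
            length = body[pos]
            pos += 1
            parts.append(body[pos:pos + length].decode("utf-8"))
            pos += length
        else:
            raise NotImplementedError(f"type code {code}")
    return "".join(parts)
```

Under these assumptions, the string "foo123" can be carried as a string1 "foo" followed by an int1 123, for a body of 7 bytes.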
XJSON Array Format
The xjsonarray value type allows chaining multiple encoded values to be interpreted as a JSON array. The xjsonarray segment length must be the total number of bytes of all encoded values in the array.
XJSON Object Format
The xjsonobject value type allows chaining multiple encoded values to be interpreted as a JSON object. Each pair of values in the list is interpreted as a key-value pair. The xjsonobject segment length must be the total number of bytes of all encoded key-value pairs in the object. Note that key values must resolve to a string, xstring, number, boolean, or null (which will be interpreted as an empty string key).
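To make the pairing rule concrete, the sketch below decodes a chained body into a list and then, for objects, groups consecutive values into key-value pairs. All names are hypothetical and big-endian integers are assumed. Only null, booleans, signed integers, and string1 are decoded, and only string and null keys are accepted, because the string form of numeric and boolean keys is not spelled out here.

```python
import struct

_INTS = {6: ">b", 7: ">h", 8: ">i", 9: ">q"}  # big-endian assumed


def _read_values(body: bytes) -> list:
    """Decode chained values (null, booleans, integers, string1 only)."""
    values, pos = [], 0
    while pos < len(body):
        code = body[pos]
        pos += 1
        if code == 0:
            values.append(None)
        elif code in (4, 5):
            values.append(code == 4)
        elif code in _INTS:
            fmt = _INTS[code]
            (number,) = struct.unpack_from(fmt, body, pos)
            pos += struct.calcsize(fmt)
            values.append(number)
        elif code == 12:
            length = body[pos]
            values.append(body[pos + 1:pos + 1 + length].decode("utf-8"))
            pos += 1 + length
        else:
            raise NotImplementedError(f"type code {code}")
    return values


def xjsonarray_to_list(body: bytes) -> list:
    """Every chained value becomes one array element."""
    return _read_values(body)


def xjsonobject_to_dict(body: bytes) -> dict:
    """Consecutive values are read as key, value, key, value, ..."""
    values = _read_values(body)
    if len(values) % 2:
        raise ValueError("xjsonobject body must hold an even number of values")
    result = {}
    for key, value in zip(values[0::2], values[1::2]):
        if key is None:
            key = ""                      # null key reads as an empty string
        elif not isinstance(key, str):
            raise NotImplementedError("non-string keys are not handled here")
        result[key] = value
    return result
```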
Examples
Null Value:
Code | Content (0 bytes) |
---|---|
0x00 | |
300 (as int2):
Code | Content (2 bytes) |
---|---|
0x07 | 0x01 0x2c |
0.24 (as float8):
Code | Content (8 bytes) |
---|---|
0x0b | 0x3f 0xce 0xb8 0x51 0xeb 0x85 0x1e 0xb8 |
"foo" (as string1):
Code | Content (4 bytes) |
---|---|
0x0c | 0x03 0x66 0x6f 0x6f |
{"foo":"bar"} (as json1):
Code | Content (14 bytes) |
---|---|
0x0f | 0x0d 0x7b 0x22 0x66 0x6f 0x6f 0x22 0x3a 0x22 0x62 0x61 0x72 0x22 0x7d |
"foo123" (as xstring1, split as string1 "foo" and int1 123):
Code | Content (8 bytes) |
---|---|
0x1b | 0x07 (total length) 0x0c 0x03 0x66 0x6f 0x6f ("foo") 0x06 0x7b (123) |
Dictionaries
The xbin format provides user-managed compression through the dictionary value types. These come in two flavors: the type code byte values (128 - 255) and the reference dictionary.
The type code dict is reserved for the most heavily used values, as each encoded value uses only a single byte. The reference dictionary can hold up to 2,147,483,647 values, the limit of the 4 byte signed integer index space.
Binary File Format
UUID
The file starts with a 16 byte binary encoded UUID. This is intended to uniquely identify the file, but the exact implementation and usage beyond this is not explicitly defined as part of the format definition. For XINA purposes two xbin files with the same UUID would be expected to be identical.
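A minimal sketch of reading and writing this field with Python's standard uuid module follows. The function names are hypothetical. The sketch assumes the 16 bytes appear in the same order as the hex digits of the canonical UUID string, which is what the example file in this document shows.

```python
import uuid


def write_uuid(file_id: uuid.UUID) -> bytes:
    """Return the 16 bytes that open the file."""
    return file_id.bytes


def read_uuid(buf: bytes) -> uuid.UUID:
    """Parse the UUID from the first 16 bytes of a file."""
    if len(buf) < 16:
        raise ValueError("file is too short to contain a UUID")
    return uuid.UUID(bytes=bytes(buf[:16]))
```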
Header
A value which must either be null or a jsonobject1, jsonobject2, or jsonobject4. This is currently a placeholder with no defined parameters.
Reference Dict
A seg4 containing 0 to 2,147,483,647 encoded values, which may be referenced by zero based index with the reference dict index value type. The index size is determined by the total number of values in the reference dict (1 byte up to 255, 2 bytes up to 65,535, 4 bytes otherwise).
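The sketch below shows one way to build a reference dict of string values and to emit references into it. The function names are hypothetical, big-endian byte order is assumed, and string1 uses type code 12 from the Value Type Definition table. Here the reference width is chosen from the index value itself, following the index ranges in that table. A writer that sizes indexes by the total dictionary size would choose differently.

```python
def encode_string1(text: str) -> bytes:
    """Type code 12, 1 byte length, then the UTF-8 bytes."""
    raw = text.encode("utf-8")
    if len(raw) > 255:
        raise ValueError("too long for string1")
    return bytes([12, len(raw)]) + raw


def build_ref_dict(values: list) -> bytes:
    """Wrap the encoded values in a seg4 (4 byte length prefix)."""
    body = b"".join(encode_string1(value) for value in values)
    return len(body).to_bytes(4, "big") + body


def encode_ref(index: int) -> bytes:
    """Encode a zero based reference dict index with type code 1, 2, or 3."""
    if not 0 <= index <= 2_147_483_647:
        raise ValueError("index out of range")
    if index <= 0xFF:
        return bytes([1, index])
    if index <= 0xFFFF:
        return bytes([2]) + index.to_bytes(2, "big")
    return bytes([3]) + index.to_bytes(4, "big")
```

For a dictionary holding "voltage", "current", and "label", the body is 9 + 9 + 7 = 25 bytes, so the whole seg4 is 29 bytes and a reference to "label" is the two bytes 0x01 0x02.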
Lookup Table
A 1 byte unsigned integer giving the number of (time, byte offset) pairs that follow, each pair made up of an 8 byte time and an 8 byte offset. The pairs are references intended to improve file parsing speed. The feature may be omitted by writing a count of 0. Each byte offset is calculated relative to the first byte of the first row (the first byte immediately following this table).
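A reader could use the table roughly as follows: parse the pairs, then binary search for the last entry at or before a target time and start scanning rows from its offset. The function names are hypothetical. The sketch assumes big-endian signed 8 byte integers for both fields and assumes the pairs are stored in ascending time order, neither of which is stated explicitly here.

```python
import bisect
import struct


def parse_lookup_table(buf: bytes, offset: int = 0):
    """Return (list of (time, row offset) pairs, offset of the first row)."""
    count = buf[offset]
    offset += 1
    entries = []
    for _ in range(count):
        # Signed big-endian fields are an assumption of this sketch.
        time_us, row_offset = struct.unpack_from(">qq", buf, offset)
        entries.append((time_us, row_offset))
        offset += 16
    return entries, offset


def seek_offset(entries, target_time: int) -> int:
    """Row-relative offset to start scanning from when looking for target_time."""
    times = [time_us for time_us, _ in entries]
    i = bisect.bisect_right(times, target_time) - 1
    # With no entry at or before the target, start at the first row.
    return entries[i][1] if i >= 0 else 0
```

The offset returned by `seek_offset` is added to the position of the first row, which is the second value returned by `parse_lookup_table`.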
Rows
Each row contains:
- 8 byte signed integer containing Unix time with microsecond precision
- seg4 of row data, containing
  - header, a single value which must either be null or a jsonobject1, jsonobject2, or jsonobject4
  - one or more key, value pairs
The row header is currently a placeholder with no defined parameters.
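The row framing can be sketched without decoding the values inside it: read the 8 byte time, read the seg4 length, and slice out the row data. The function names are hypothetical and big-endian byte order is assumed. The ascending-time check reflects the requirement that row times are unique and in ascending order.

```python
import struct


def read_row(buf: bytes, offset: int = 0):
    """Return (time in microseconds, row data bytes, offset of the next row)."""
    # ">qI": 8 byte signed time, then the 4 byte seg4 length (12 bytes total).
    time_us, length = struct.unpack_from(">qI", buf, offset)
    start = offset + 12
    end = start + length
    if end > len(buf):
        raise ValueError("truncated row")
    return time_us, bytes(buf[start:end]), end


def iter_rows(buf: bytes, offset: int = 0):
    """Yield (time, row data) for every row from `offset` to the end of buf."""
    previous = None
    while offset < len(buf):
        time_us, data, offset = read_row(buf, offset)
        if previous is not None and time_us <= previous:
            raise ValueError("row times must be unique and ascending")
        previous = time_us
        yield time_us, data
```

The row data returned here still begins with the row header value, followed by the key, value pairs.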
Example File
Given a data set with UUID 9462ef87-f232-4694-922c-12b93c95e27c:
t | voltage | current | label |
---|---|---|---|
0 | 5 | 10 | "foo" |
1 | | | "bar" |
2 | 5 | null | |
A corresponding xbin file containing the same data would be:
UUID (16 bytes)
0x94 0x62 0xef 0x87 0xf2 0x32 0x46 0x94 0x92 0x2c 0x12 0xb9 0x3c 0x95 0xe2 0x7c
Header (1 byte)
0x00 (null, 1 byte)
Reference Dict, three values, "voltage", "current", "label" (29 bytes)
0x00 0x00 0x00 0x19 (seg4 length, 25)
0x0c 0x07 0x76 0x6f 0x6c 0x74 0x61 0x67 0x65 ("voltage", 9 bytes)
0x0c 0x07 0x63 0x75 0x72 0x72 0x65 0x6e 0x74 ("current", 9 bytes)
0x0c 0x05 0x6c 0x61 0x62 0x65 0x6c ("label", 7 bytes)
Lookup Table (33 bytes)
0x02 (two pairs of lookup values)
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 (time, 1, 8 bytes)
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x1b (byte offset, 27, 8 bytes)
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x02 (time, 2, 8 bytes)
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x2f (byte offset, 47, 8 bytes)
Row t0 (27 bytes)
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 (time, 0, 8 bytes)
0x00 0x00 0x00 0x0f (row length, 15, 4 bytes)
0x00 (header, null, 1 byte)
0x01 0x00 (reference to index 0, "voltage", 2 bytes)
0xff (type code reference to index 0, 5, 1 byte)
0x01 0x01 (reference to index 1, "current", 2 bytes)
0x06 0x0a (integer value 10, 2 bytes)
0x01 0x02 (reference to index 2, "label", 2 bytes)
0x0c 0x03 0x66 0x6f 0x6f (string "foo", 5 bytes)
Row t1 (20 bytes)
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 (time, 1, 8 bytes)
0x00 0x00 0x00 0x08 (row length, 8, 4 bytes)
0x00 (header, null, 1 byte)
0x01 0x02 (reference to index 2, "label", 2 bytes)
0x0c 0x03 0x62 0x61 0x72 (string "bar", 5 bytes)
Row t2 (19 bytes)
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x02 (time, 2, 8 bytes)
0x00 0x00 0x00 0x07 (row length, 7, 4 bytes)
0x00 (header, null, 1 byte)
0x01 0x00 (reference to index 0, "voltage", 2 bytes)
0xff (type code reference to index 0, 5, 1 byte)
0x01 0x01 (reference to index 1, "current", 2 bytes)
0x00 (null, 1 byte)
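As a cross-check on the sizes and offsets, the example file can be assembled programmatically. This is a sketch under several assumptions: the helper names are hypothetical, byte order is big-endian, type codes are taken from the Value Type Definition table (string1 = 12, int1 = 6), and the byte 0xff is emitted as-is for the type code dict entry that stands for the value 5, since how that dictionary is declared is not covered here.

```python
import uuid

STRING1, INT1, REF1, NULL = 0x0C, 0x06, 0x01, 0x00
FIVE = 0xFF  # type code dict entry standing for the value 5 (taken as given)


def seg4(payload: bytes) -> bytes:
    return len(payload).to_bytes(4, "big") + payload


def string1(text: str) -> bytes:
    raw = text.encode("utf-8")
    return bytes([STRING1, len(raw)]) + raw


def ref(index: int) -> bytes:
    return bytes([REF1, index])


def row(time_us: int, pairs: bytes) -> bytes:
    # 8 byte time, then a seg4 holding a null row header and the pairs.
    return time_us.to_bytes(8, "big", signed=True) + seg4(bytes([NULL]) + pairs)


rows = [
    row(0, ref(0) + bytes([FIVE]) + ref(1) + bytes([INT1, 10]) + ref(2) + string1("foo")),
    row(1, ref(2) + string1("bar")),
    row(2, ref(0) + bytes([FIVE]) + ref(1) + bytes([NULL])),
]

# Lookup entries for rows t1 and t2, relative to the first byte of row t0.
offsets = [len(rows[0]), len(rows[0]) + len(rows[1])]
lookup = bytes([2])
for time_us, row_offset in zip((1, 2), offsets):
    lookup += time_us.to_bytes(8, "big") + row_offset.to_bytes(8, "big")

ref_dict = seg4(string1("voltage") + string1("current") + string1("label"))

file_bytes = (
    uuid.UUID("9462ef87-f232-4694-922c-12b93c95e27c").bytes
    + bytes([NULL])          # file header, null
    + ref_dict
    + lookup
    + b"".join(rows)
)
```

Under these assumptions the rows come out at 27, 20, and 19 bytes, the lookup offsets at 27 and 47, and the whole file at 16 + 1 + 29 + 33 + 66 = 145 bytes.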