XBin Format Reference

XBin (XINA Binary) is a XINA standard binary format for time-based data files. It uses the file extension xbin.

The xbin format organizes key-value data by time. The data content is a series of rows in ascending time order. Each row carries a single Unix time with microsecond precision, which is unique within the file.
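As a rough illustration of the time model described above, the following Python sketch converts a timezone-aware datetime to whole microseconds since the Unix epoch and checks that a list of row times is strictly ascending. The function names are hypothetical, and the big-endian 8 byte packing is an assumption based on the byte order shown in the example file, not something the format text states explicitly.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_micros(dt: datetime) -> int:
    """Whole microseconds since the Unix epoch (dt must be timezone-aware)."""
    # Integer division of timedeltas avoids floating point rounding.
    return (dt - EPOCH) // timedelta(microseconds=1)


def pack_time(micros: int) -> bytes:
    # Assumption: signed 64-bit, most significant byte first.
    return struct.pack(">q", micros)


def is_ascending_unique(times: list[int]) -> bool:
    """True if every time is strictly greater than the one before it."""
    return all(a < b for a, b in zip(times, times[1:]))
```

Strict comparison covers both requirements at once: a repeated time fails the check just as an out-of-order time does.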

Segment Format

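XBin data is often encoded in segments. A segment consists of a signed integer byte length followed by that number of bytes. The length is written most significant byte first, as the examples below show. Segments are referred to in this document by the size of their length field:

  • seg1 (1 byte length, up to 127 bytes of data)
  • seg2 (2 byte length, up to 32,767 bytes of data)
  • seg4 (4 byte length, up to 2,147,483,647 bytes of data)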

If the length value of a segment is zero, no data follows and the value is considered empty.

Examples

The string "foo" has a 3 byte UTF-8 encoding: 0x66, 0x6f, 0x6f.

As a seg1, this is encoded with a total of 4 bytes (the initial byte containing the length, 3):

0x03 0x66 0x6f 0x6f

As a seg2, 5 bytes:

0x00 0x03 0x66 0x6f 0x6f

And as a seg4, 7 bytes:

0x00 0x00 0x00 0x03 0x66 0x6f 0x6f
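The three encodings above can be reproduced with a small Python sketch. The helper names `encode_seg` and `decode_seg` are hypothetical, and big-endian byte order is assumed because it matches the example bytes.

```python
import struct

# struct formats for the signed length prefix of seg1, seg2, and seg4.
_SEG_FORMATS = {1: ">b", 2: ">h", 4: ">i"}


def encode_seg(payload: bytes, width: int) -> bytes:
    """Prefix payload with a signed big-endian length of `width` bytes."""
    max_len = (1 << (8 * width - 1)) - 1
    if len(payload) > max_len:
        raise ValueError(f"payload too long for seg{width}")
    return struct.pack(_SEG_FORMATS[width], len(payload)) + payload


def decode_seg(buf: bytes, pos: int, width: int) -> tuple[bytes, int]:
    """Read one segment at pos; return (payload, position after it)."""
    (length,) = struct.unpack_from(_SEG_FORMATS[width], buf, pos)
    start = pos + width
    end = start + length
    if length < 0 or end > len(buf):
        raise ValueError("invalid or truncated segment")
    return buf[start:end], end
```

A zero length segment round-trips as an empty byte string, which corresponds to the empty value described above.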

Value Format

Each value starts with a 1 byte signed integer indicating the value type, followed by additional byte(s) containing the value itself, as applicable.

Value Type Definition

Code       | Value                 | Length (bytes) | Description
0          | null                  | 0              | literal null / empty string
1          | reference dict index  | variable       | see below
2          | true boolean literal  | 0              |
3          | false boolean literal | 0              |
4          | 1 byte signed integer | 1              |
5          | 2 byte signed integer | 2              |
6          | 4 byte signed integer | 4              |
7          | 8 byte signed integer | 8              |
8          | 4 byte floating point | 4              |
9          | 8 byte floating point | 8              |
10         | string1               | variable       | seg1 UTF-8 encoded string
11         | string2               | variable       | seg2 UTF-8 encoded string
12         | string4               | variable       | seg4 UTF-8 encoded string
13         | json1                 | variable       | seg1 UTF-8 encoded JSON
14         | json2                 | variable       | seg2 UTF-8 encoded JSON
15         | json4                 | variable       | seg4 UTF-8 encoded JSON
16         | jsonarray1            | variable       | seg1 UTF-8 encoded JSON array
17         | jsonarray2            | variable       | seg2 UTF-8 encoded JSON array
18         | jsonarray4            | variable       | seg4 UTF-8 encoded JSON array
19         | jsonobject1           | variable       | seg1 UTF-8 encoded JSON object
20         | jsonobject2           | variable       | seg2 UTF-8 encoded JSON object
21         | jsonobject4           | variable       | seg4 UTF-8 encoded JSON object
22         | bytes1                | variable       | seg1 raw byte array
23         | bytes2                | variable       | seg2 raw byte array
24         | bytes4                | variable       | seg4 raw byte array
25         | xstring1              | variable       | seg1 xstring
26         | xstring2              | variable       | seg2 xstring
27         | xstring4              | variable       | seg4 xstring
28         | xjsonarray1           | variable       | seg1 xjson array
29         | xjsonarray2           | variable       | seg2 xjson array
30         | xjsonarray4           | variable       | seg4 xjson array
31         | xjsonobject1          | variable       | seg1 xjson object
32         | xjsonobject2          | variable       | seg2 xjson object
33         | xjsonobject4          | variable       | seg4 xjson object
-128 to -1 | code dict index       | 0              | see below

XString Format

The xstring value type allows chaining multiple encoded values to be interpreted as a single string. The xstring segment length must be the total number of bytes of all encoded values in the string.

XJSON Array Format

The xjsonarray value type allows chaining multiple encoded values to be interpreted as a JSON array. The xjsonarray segment length must be the total number of bytes of all encoded values in the array.

XJSON Object Format

The xjsonobject value type allows chaining multiple encoded values to be interpreted as a JSON object. Each pair of values in the list is interpreted as a key-value pair. The xjsonobject segment length must be the total number of bytes of all encoded key-value pairs in the object. Note that key values must resolve to a string, xstring, number, boolean, or null (which will be interpreted as an empty string key).
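The chaining described in the three sections above can be sketched as a small recursive decoder in Python. This is a minimal sketch, not a reference implementation. It handles only null, booleans, the fixed-width numeric types, string1, and the seg1 variants of the three chained types. It assumes each chained value carries its own type code byte as described under Value Format. The rule for turning non-string values into text (`as_text`) is also an assumption, and all function names are hypothetical.

```python
import struct

# Fixed-width numeric type codes and their big-endian struct formats.
FIXED = {4: ">b", 5: ">h", 6: ">i", 7: ">q", 8: ">f", 9: ">d"}


def read_seg1(buf, pos):
    """Return (payload, next_pos) for a seg1 starting at pos."""
    length = struct.unpack_from(">b", buf, pos)[0]
    return buf[pos + 1:pos + 1 + length], pos + 1 + length


def as_text(value):
    # Assumption: null -> "", booleans -> "true"/"false", numbers -> str().
    if value is None:
        return ""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def decode_all(body):
    """Decode every value chained back to back in body."""
    items, pos = [], 0
    while pos < len(body):
        value, pos = decode_value(body, pos)
        items.append(value)
    return items


def decode_value(buf, pos=0):
    """Decode one value at pos; return (value, next_pos)."""
    code = buf[pos]
    pos += 1
    if code == 0:
        return None, pos
    if code in (2, 3):
        return code == 2, pos
    if code in FIXED:
        fmt = FIXED[code]
        return struct.unpack_from(fmt, buf, pos)[0], pos + struct.calcsize(fmt)
    if code in (10, 25, 28, 31):
        body, pos = read_seg1(buf, pos)
        if code == 10:                      # string1
            return body.decode("utf-8"), pos
        items = decode_all(body)
        if code == 25:                      # xstring1
            return "".join(as_text(v) for v in items), pos
        if code == 28:                      # xjsonarray1
            return items, pos
        if len(items) % 2:                  # xjsonobject1 needs key/value pairs
            raise ValueError("xjsonobject has an unpaired value")
        return {as_text(k): v for k, v in zip(items[0::2], items[1::2])}, pos
    raise ValueError(f"type code {code} is not handled in this sketch")
```

For instance, under these assumptions an xjsonobject1 holding the string1 key "a" and the 1 byte integer 1 is the seven bytes 0x1f 0x05 0x0a 0x01 0x61 0x04 0x01, where 0x05 is the combined length of the two chained values.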

Examples

Null Value:

Code Content (0 bytes)
0x00

300 (as 2 byte integer):

Code Content (2 bytes)
0x05 0x01 0x2c

0.24 (as 8 byte float):

Code Content (8 bytes)
0x09 0x3f 0xce 0xb8 0x51 0xeb 0x85 0x1e 0xb8

"foo" (as string1):

Code Content (4 bytes)
0x0a 0x03 0x66 0x6f 0x6f

{"foo":"bar"} (as json1):

Code Content (14 bytes)
0x0d 0x0d 0x7b 0x22 0x66 0x6f 0x6f 0x22 0x3a 0x22 0x62 0x61 0x72 0x22 0x7d

"foo123" (as xstring1, split as "foo" string1 and integer 123):

Code Content (8 bytes)
0x19 [ 0x07 ](total length) [ 0x0a 0x03 0x66 0x6f 0x6f ]("foo") [ 0x04 0x7b ](123)
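The null, integer, float, string1, and json1 examples above can be reproduced with the following Python sketch. The encoder names are hypothetical, big-endian byte order is assumed from the example bytes, and choosing the smallest integer type that fits is an illustrative policy rather than a requirement of the format.

```python
import json
import struct


def encode_null() -> bytes:
    return b"\x00"


def encode_int(n: int) -> bytes:
    """Use the smallest signed integer type (codes 4 to 7) that holds n."""
    for code, size in ((4, 1), (5, 2), (6, 4), (7, 8)):
        limit = 1 << (8 * size - 1)
        if -limit <= n < limit:
            return bytes([code]) + n.to_bytes(size, "big", signed=True)
    raise OverflowError("integer does not fit in 8 bytes")


def encode_float64(x: float) -> bytes:
    # Type code 9: 8 byte floating point.
    return b"\x09" + struct.pack(">d", x)


def _seg1(data: bytes) -> bytes:
    if len(data) > 127:
        raise ValueError("too long for seg1")
    return bytes([len(data)]) + data


def encode_string1(text: str) -> bytes:
    # Type code 10: seg1 UTF-8 encoded string.
    return b"\x0a" + _seg1(text.encode("utf-8"))


def encode_json1(obj) -> bytes:
    # Type code 13: seg1 UTF-8 encoded JSON, written without extra whitespace.
    text = json.dumps(obj, separators=(",", ":"))
    return b"\x0d" + _seg1(text.encode("utf-8"))
```

Because 300 does not fit in one signed byte, `encode_int(300)` falls through to the 2 byte type, matching the example above.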

Dictionaries

The xbin format provides user-managed compression through the dictionary value types. These come in two flavors: the negative type code byte values, which index the type code dict, and the reference dict index value type, which indexes the reference dict.

The negative code index space is reserved for the most heavily used values, as each encoded value uses only a single byte. The reference dictionary can hold up to 2,147,483,647 values, the limit of a 4 byte signed integer index.
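One way to resolve a negative type code to a position in the type code dict is sketched below in Python. The mapping used here (code -1 is index 0, code -2 is index 1, and so on down to -128) is an assumption inferred from the example file, where the byte 0xff refers to index 0. The format text does not spell out the ordering, and the function names are hypothetical.

```python
def code_to_dict_index(code_byte: int) -> int:
    """Map a raw type code byte (0x80 to 0xff) to a type code dict index."""
    # Reinterpret the unsigned byte as a signed 8-bit integer.
    signed = code_byte - 256 if code_byte >= 0x80 else code_byte
    if signed >= 0:
        raise ValueError("not a type code dict reference")
    # Assumption: -1 -> index 0, -2 -> index 1, ..., -128 -> index 127.
    return -signed - 1


def dict_index_to_code(index: int) -> int:
    """Inverse mapping: dict index to the raw type code byte."""
    if not 0 <= index <= 127:
        raise ValueError("index outside the negative type code space")
    return (-(index + 1)) & 0xFF
```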

Binary File Format

UUID

The file starts with a 16 byte binary encoded UUID. This is intended to uniquely identify the file, but the exact implementation and usage beyond this is not explicitly defined as part of the format definition. For XINA purposes two xbin files with the same UUID would be expected to be identical.
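Reading the leading UUID can be sketched in Python with the standard `uuid` module. The function name is hypothetical, and the sketch assumes the 16 bytes are stored in the same order as the canonical hexadecimal text form, which is how the example file in this document lays them out.

```python
import uuid


def read_file_uuid(data: bytes) -> uuid.UUID:
    """Interpret the first 16 bytes of an xbin file as a UUID."""
    if len(data) < 16:
        raise ValueError("file is shorter than the 16 byte UUID")
    # uuid.UUID(bytes=...) expects the bytes in canonical (big-endian) order.
    return uuid.UUID(bytes=data[:16])
```

Comparing two `uuid.UUID` objects with `==` is then enough to test whether two files claim the same identity.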

Header

A value which must either be null or a json1, json2, or json4 containing a single JSON object. This is currently a placeholder with no defined parameters.

Type Code Dict

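A seg4 containing 0 to 128 encoded values, which correspond to the 128 negative type codes from -1 to -128. In the example file below, the first value (index 0) is referenced by type code -1 (0xff).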

Reference Dict

A seg4 containing 0 to 2,147,483,647 encoded values, which may be referenced by zero based index with the reference dict index value type. The index size is determined by the total number of values in the reference dict (one byte up to 127, two bytes up to 32,767, four bytes otherwise).
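The index sizing rule above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the index is written as a signed big-endian integer directly after type code 1, consistent with the two byte references in the example file.

```python
def reference_index_width(dict_size: int) -> int:
    """Bytes used per index, based on the number of reference dict values."""
    if dict_size <= 127:
        return 1
    if dict_size <= 32767:
        return 2
    return 4


def encode_reference(index: int, dict_size: int) -> bytes:
    """Encode a reference dict index value (type code 1)."""
    if not 0 <= index < dict_size:
        raise IndexError("index outside the reference dict")
    width = reference_index_width(dict_size)
    return b"\x01" + index.to_bytes(width, "big", signed=True)


def decode_reference(buf: bytes, pos: int, dict_size: int) -> tuple[int, int]:
    """Decode a reference at pos; return (index, next_pos)."""
    if buf[pos] != 1:
        raise ValueError("not a reference dict index value")
    width = reference_index_width(dict_size)
    index = int.from_bytes(buf[pos + 1:pos + 1 + width], "big", signed=True)
    return index, pos + 1 + width
```

Note that a reader must know the total dict size before it can decode any reference, which is one reason the reference dict precedes the rows in the file.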

Lookup Table

A single signed byte giving the number of lookup entries, followed by that many (time, byte offset) pairs, each made of two 8 byte integers. The table lets a reader jump to a row near a given time instead of scanning every row. The feature may be omitted by writing a count of 0. Each byte offset is calculated relative to the first byte of the first row.
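A reader might parse and use the lookup table roughly as in the Python sketch below. The function names are hypothetical. The sketch assumes big-endian signed 8 byte integers and assumes the entries are stored in ascending time order, which the format text does not state but which follows naturally from the ascending row order.

```python
import struct
from bisect import bisect_right


def parse_lookup_table(buf: bytes, pos: int = 0):
    """Return ([(time, offset), ...], position after the table)."""
    (count,) = struct.unpack_from(">b", buf, pos)
    if count < 0:
        raise ValueError("negative lookup entry count")
    pos += 1
    entries = []
    for _ in range(count):
        time_us, offset = struct.unpack_from(">qq", buf, pos)
        entries.append((time_us, offset))
        pos += 16
    return entries, pos


def seek_offset(entries, target_time: int) -> int:
    """Offset of the last entry at or before target_time, else 0 (first row)."""
    # Assumption: entries are sorted by time.
    times = [t for t, _ in entries]
    i = bisect_right(times, target_time)
    return entries[i - 1][1] if i else 0
```

The returned offset is relative to the first byte of the first row, so a reader adds it to the file position where the rows begin and scans forward from there.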

Rows

Each row contains:

  • 8 byte signed integer containing Unix time with microsecond precision
  • seg4 of row data, containing
    • header, single value which must either be null or a json1, json2, or json4 containing a single JSON object
    • one or more key, value pairs

The row header is currently a placeholder with no defined parameters.
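The row layout above can be walked with a short Python generator that splits a buffer into (time, row data) pairs without decoding the values inside. The name `iter_rows` is hypothetical and big-endian byte order is assumed. The sketch does not treat a trailing zero-time row specially, so a current value table, if present, is yielded like any other row.

```python
import struct


def iter_rows(buf: bytes, pos: int = 0):
    """Yield (time_us, row_data) for each row from pos to the end of buf."""
    while pos < len(buf):
        # 8 byte signed time followed by the 4 byte signed seg4 length.
        time_us, length = struct.unpack_from(">qi", buf, pos)
        start = pos + 12
        end = start + length
        if length < 0 or end > len(buf):
            raise ValueError("invalid or truncated row")
        yield time_us, buf[start:end]
        pos = end
```

Each yielded `row_data` begins with the row header value, followed by the encoded key, value pairs.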

Current Value Table

A file may optionally end with a final row with a time value of zero, which indicates the start of the current value table. This uses the same format as the entire rows block, but should only include the last value in the file for each key.
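The content of a current value table can be derived from already decoded rows as in this minimal Python sketch. The function name and the (time, dict) row representation are hypothetical. The sketch assumes an explicit null counts as the last value for its key, which is how the example file in this document records a null current.

```python
def current_values(rows):
    """Return the last value seen for each key.

    rows is an iterable of (time, {key: value}) in ascending time order.
    """
    latest = {}
    for _time, pairs in rows:
        for key, value in pairs.items():
            # Later rows overwrite earlier ones, including with None (null).
            latest[key] = value
    return latest
```

Applied to the example data set in this document, the result is voltage 5, current null, and label "bar".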

Example File

Given a data set with UUID 9462ef87-f232-4694-922c-12b93c95e27c:

t | voltage | current | label
0 | 5       | 10      | "foo"
1 |         |         | "bar"
2 | 5       | null    |

A corresponding xbin file containing the same data would be:

UUID (16 bytes)

0x94 0x62 0xef 0x87 0xf2 0x32 0x46 0x94 0x92 0x2c 0x12 0xb9 0x3c 0x95 0xe2 0x7c

Header (1 byte)

0x00 (null, 1 byte)

Type Code Dict, one value, 5 (6 bytes)

0x00 0x00 0x00 0x02 (seg4 length, 2, 4 bytes)

0x04 0x05 (1 byte signed integer, 5, 2 bytes)

Reference Dict, three values, "voltage", "current", "label" (29 bytes)

0x00 0x00 0x00 0x19 (seg4 length, 25)

0x0a 0x07 0x76 0x6f 0x6c 0x74 0x61 0x67 0x65 ("voltage", 9 bytes)

0x0a 0x07 0x63 0x75 0x72 0x72 0x65 0x6e 0x74 ("current", 9 bytes)

0x0a 0x05 0x6c 0x61 0x62 0x65 0x6c ("label", 7 bytes)

Lookup Table (33 bytes)

0x02 (two pairs of lookup values)

0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 (time, 1, 8 bytes)

0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x1b (byte offset, 27, 8 bytes)

0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x02 (time, 2, 8 bytes)

0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x2f (byte offset, 47, 8 bytes)

Row t0 (27 bytes)

0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 (time, 0, 8 bytes)

0x00 0x00 0x00 0x0f (row length, 15, 4 bytes)

0x00 (header, null, 1 byte)

0x01 0x00 (reference to index 0, "voltage", 2 bytes)

0xff (type code reference to index 0, 5, 1 byte)

0x01 0x01 (reference to index 1, "current", 2 bytes)

0x04 0x0a (integer value 10, 2 bytes)

0x01 0x02 (reference to index 2, "label", 2 bytes)

0x0a 0x03 0x66 0x6f 0x6f (string "foo", 5 bytes)

Row t1 (20 bytes)

0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 (time, 1, 8 bytes)

0x00 0x00 0x00 0x08 (row length, 8, 4 bytes)

0x00 (header, null, 1 byte)

0x01 0x02 (reference to index 2, "label", 2 bytes)

0x0a 0x03 0x62 0x61 0x72 (string "bar", 5 bytes)

Row t2 (19 bytes)

0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x02 (time, 2, 8 bytes)

0x00 0x00 0x00 0x07 (row length, 7, 4 bytes)

0x00 (header, null, 1 byte)

0x01 0x00 (reference to index 0, "voltage", 2 bytes)

0xff (type code reference to index 0, 5, 1 byte)

0x01 0x01 (reference to index 1, "current", 2 bytes)

0x00 (null, 1 byte)

Current Value Table

0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 (time, 0, 8 bytes)

0x00 0x00 0x00 0x0e (row length, 14, 4 bytes)

0x00 (header, null, 1 byte)

0x01 0x00 (reference to index 0, "voltage", 2 bytes)

0xff (type code reference to index 0, 5, 1 byte)

0x01 0x01 (reference to index 1, "current", 2 bytes)

0x00 (null, 1 byte)

0x01 0x02 (reference to index 2, "label", 2 bytes)

0x0a 0x03 0x62 0x61 0x72 (string "bar", 5 bytes)