Example

BOX represents each node in the DOM tree of an XML document by a type code followed by either a text value or an index to a text value. If the text value is being encountered for the first time, such as the name of the root element, the text itself is written. If the text value has been seen before, such as the name of an element that occurs more than once, an index to that text is written instead. The first unique text value gets an index of zero, the second gets an index of one, and so on.
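The index assignment described above can be sketched as a small lookup table. This is a minimal illustration rather than the BOX implementation; the class name `TextTable` and its method `lookup_or_add` are hypothetical.

```python
class TextTable:
    """Assigns each distinct text value an index in order of first appearance."""

    def __init__(self):
        self._index_of = {}  # text value -> index

    def lookup_or_add(self, text):
        """Return (index, is_new); is_new is True the first time text is seen."""
        if text in self._index_of:
            return self._index_of[text], False
        index = len(self._index_of)  # first value gets 0, second gets 1, ...
        self._index_of[text] = index
        return index, True


table = TextTable()
print(table.lookup_or_add("xml-stylesheet"))  # (0, True)
print(table.lookup_or_add("family"))          # (1, True)
print(table.lookup_or_add("family"))          # (1, False)
```

A writer would emit the text itself when `is_new` is true and only the index otherwise.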

BOX type codes come in increments of ten from 0 to 90, one for each node type BOX can handle. For example, the code for the start of an element is 40. If the element name has not been encountered yet, the type code 40 is followed by that text. If the name has been encountered before, the code is 41, 42 or 44, depending on the number of bytes required to hold the index to the text: 41 for one byte, 42 for two bytes and 44 for four bytes.
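The code selection rule can be written as a short function. This is a sketch under one assumption the text does not state: indices are treated as unsigned, so one byte holds 0 to 255. The function name `type_code` is hypothetical.

```python
ELEMENT_START = 40  # base code for the start of an element


def type_code(base, index=None):
    """Return the type code to write for a node.

    index is None when the text is new (the text itself follows the code);
    otherwise it is the index of the previously seen text.
    Assumption: indices are unsigned.
    """
    if index is None:
        return base
    if index < 0:
        raise ValueError("index must be non-negative")
    if index <= 0xFF:
        return base + 1  # index fits in one byte
    if index <= 0xFFFF:
        return base + 2  # index fits in two bytes
    if index <= 0xFFFFFFFF:
        return base + 4  # index fits in four bytes
    raise ValueError("index does not fit in four bytes")


print(type_code(ELEMENT_START))         # 40
print(type_code(ELEMENT_START, 15))     # 41
print(type_code(ELEMENT_START, 300))    # 42
print(type_code(ELEMENT_START, 70000))  # 44
```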

The attributes of an element, including namespace declarations, are encoded before the element whose start tag contains them. The reasons for this are explained in the FAQ.
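The ordering can be illustrated with a function that lists the logical records for one start tag. The codes 0 (attribute name), 10 (attribute value) and 40 (element start) are the ones used in the example on this page; the function name `start_tag_records` and the tuple layout are hypothetical, and namespace declarations are left out to keep the sketch short.

```python
ATTRIBUTE_NAME = 0
ATTRIBUTE_VALUE = 10
ELEMENT_START = 40


def start_tag_records(name, attributes):
    """Return (type code, text) records for a start tag, attributes first."""
    records = []
    for attr_name, attr_value in attributes:
        records.append((ATTRIBUTE_NAME, attr_name))
        records.append((ATTRIBUTE_VALUE, attr_value))
    # The element start comes only after all of its attributes.
    records.append((ELEMENT_START, name))
    return records


for record in start_tag_records(
    "father", [("age", "40"), ("occupation", "software engineer")]
):
    print(record)
# (0, 'age')
# (10, '40')
# (0, 'occupation')
# (10, 'software engineer')
# (40, 'father')
```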

Here's a side-by-side example of XML and BOX. Each BOX type code is shown with its meaning in parentheses. Note that separate whitespace text values appear only when the whitespace characters, or the number of them, differ. "pi" stands for processing instruction. Integer values are written in big-endian order. Text values are written in ASCII or UTF-8, as indicated by the first byte.
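Big-endian order means the most significant byte is written first. The sketch below shows this for an index, choosing the smallest of the one, two and four byte widths. It assumes indices are unsigned, which the text does not state, and the function name `encode_index` is hypothetical.

```python
def encode_index(index):
    """Encode a non-negative index in 1, 2 or 4 bytes, big-endian.

    Assumption: indices are unsigned.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    for width in (1, 2, 4):
        if index < 1 << (8 * width):
            return index.to_bytes(width, "big")
    raise ValueError("index does not fit in four bytes")


print(encode_index(8).hex())      # 08
print(encode_index(300).hex())    # 012c
print(encode_index(70000).hex())  # 00011170
```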

| XML | BOX |
| --- | --- |
| n/a | 'A' for ASCII |
| `<?xml-stylesheet type="text/xsl" href="family.xsl"?>` | 70 (pi target) |
|  | 14 (length) |
|  | "xml-stylesheet" (index=0) |
|  | 80 (pi value) |
|  | 33 (length) |
|  | "type="text/xsl" href="family.xsl"" (index=1) |
| `<!-- This is a test document for the BOX package. -->` | 20 (comment) |
|  | 46 (length) |
|  | " This is a test document for the BOX package. " (index=2) |
| `<family xmlns="http://www.ociweb.com/geneology" xmlns:kids="http://www.ociweb.com/kids">` | 50 (namespace prefix) |
|  | 0 (length) |
|  | "" (index=3) |
|  | 60 (namespace uri) |
|  | 31 (length) |
|  | "http://www.ociweb.com/geneology" (index=4) |
|  | 50 (namespace prefix) |
|  | 4 (length) |
|  | "kids" (index=5) |
|  | 60 (namespace uri) |
|  | 26 (length) |
|  | "http://www.ociweb.com/kids" (index=6) |
|  | 40 (element start) |
|  | 6 (length) |
|  | "family" (index=7) |
| whitespace | 90 (text) |
|  | 3 (length) |
|  | whitespace (index=8) |
| `<father age="40" occupation="software engineer">` | 0 (attribute name) |
|  | 3 (length) |
|  | "age" (index=9) |
|  | 10 (attribute value) |
|  | 2 (length) |
|  | "40" (index=10) |
|  | 0 (attribute name) |
|  | 10 (length) |
|  | "occupation" (index=11) |
|  | 10 (attribute value) |
|  | 17 (length) |
|  | "software engineer" (index=12) |
|  | 40 (element start) |
|  | 6 (length) |
|  | "father" (index=13) |
| whitespace | 90 (text) |
|  | 5 (length) |
|  | whitespace (index=14) |
| `<name>` | 40 (element start) |
|  | 4 (length) |
|  | "name" (index=15) |
| Mark&< | 90 (text) |
|  | 6 (length) |
|  | "Mark&<" (index=16) |
| `</name>` | 30 (end of element) |
| whitespace | 91 (text) |
|  | 8 (index) |
| `</father>` | 30 (end of element) |
| whitespace | 91 (text) |
|  | 8 (index) |
| `<mother age="40">` | 1 (attribute name) |
|  | 9 (index) |
|  | 11 (attribute value) |
|  | 10 (index) |
|  | 40 (element start) |
|  | 6 (length) |
|  | "mother" (index=17) |
| whitespace | 91 (text) |
|  | 14 (index) |
| `<name>` | 41 (element start) |
|  | 15 (index) |
| Tami | 90 (text) |
|  | 4 (length) |
|  | "Tami" (index=18) |
| `</name>` | 30 (end of element) |
| whitespace | 91 (text) |
|  | 8 (index) |
| `</mother>` | 30 (end of element) |
| whitespace | 91 (text) |
|  | 8 (index) |
| `<kids:daughter age="16">` | 1 (attribute name) |
|  | 9 (index) |
|  | 10 (attribute value) |
|  | 2 (length) |
|  | "16" (index=19) |
|  | 40 (element start) |
|  | 13 (length) |
|  | "kids:daughter" (index=20) |
| whitespace | 91 (text) |
|  | 14 (index) |
| `<!-- An excellent artist! -->` | 20 (comment) |
|  | 22 (length) |
|  | " An excellent artist! " (index=21) |
| whitespace | 91 (text) |
|  | 14 (index) |
| `<?homework class="math"?>` | 70 (pi target) |
|  | 8 (length) |
|  | "homework" (index=22) |
|  | 80 (pi value) |
|  | 12 (length) |
|  | "class="math"" (index=23) |
| whitespace | 91 (text) |
|  | 14 (index) |
| `<name>` | 41 (element start) |
|  | 15 (index) |
| Amanda | 90 (text) |
|  | 6 (length) |
|  | "Amanda" (index=24) |
| `</name>` | 30 (end of element) |
| whitespace | 91 (text) |
|  | 8 (index) |
| `</kids:daughter>` | 30 (end of element) |
| whitespace | 91 (text) |
|  | 8 (index) |
| `<kids:son xmlns="http://www.ociweb.com/children" kids:age="14">` | 51 (namespace prefix) |
|  | 3 (index) |
|  | 60 (namespace uri) |
|  | 30 (length) |
|  | "http://www.ociweb.com/children" (index=25) |
|  | 0 (attribute name) |
|  | 8 (length) |
|  | "kids:age" (index=26) |
|  | 10 (attribute value) |
|  | 2 (length) |
|  | "14" (index=27) |
|  | 40 (element start) |
|  | 8 (length) |
|  | "kids:son" (index=28) |
| whitespace | 91 (text) |
|  | 14 (index) |
| `<!-- An excellent chess player! -->` | 20 (comment) |
|  | 28 (length) |
|  | " An excellent chess player! " (index=29) |
| whitespace | 91 (text) |
|  | 14 (index) |
| `<name>` | 41 (element start) |
|  | 15 (index) |
| Jeremy | 90 (text) |
|  | 6 (length) |
|  | "Jeremy" (index=30) |
| `</name>` | 30 (end of element) |
| whitespace | 91 (text) |
|  | 8 (index) |
| `</kids:son>` | 30 (end of element) |
| whitespace | 91 (text) |
|  | 8 (index) |
| `<empty>` | 40 (element start) |
|  | 5 (length) |
|  | "empty" (index=31) |
| `</empty>` | 30 (end of element) |
| whitespace | 90 (text) |
|  | 1 (length) |
|  | whitespace (index=32) |
| `</family>` | 30 (end of element) |
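The pattern in the example, where a text value is written in full the first time and referenced by index afterwards, can be reproduced with a short sketch that builds logical records rather than bytes. The function name `encode_records` and the tuple layout are hypothetical. Lengths are taken as character counts, which matches the ASCII values in the example, and indices are assumed to be unsigned.

```python
ELEMENT_START, TEXT, END_OF_ELEMENT = 40, 90, 30


def encode_records(nodes):
    """Turn (base code, text) pairs into BOX-like logical records.

    text is None for nodes that carry no text, such as end of element.
    New text      -> (base, length, text)
    Repeated text -> (base + index width in bytes, index)
    """
    index_of = {}
    records = []
    for base, text in nodes:
        if text is None:
            records.append((base,))
        elif text in index_of:
            index = index_of[text]
            width = 1 if index <= 0xFF else 2 if index <= 0xFFFF else 4
            records.append((base + width, index))
        else:
            index_of[text] = len(index_of)
            records.append((base, len(text), text))
    return records


nodes = [
    (ELEMENT_START, "name"), (TEXT, "Tami"), (END_OF_ELEMENT, None),
    (ELEMENT_START, "name"), (TEXT, "Amanda"), (END_OF_ELEMENT, None),
]
for record in encode_records(nodes):
    print(record)
# (40, 4, 'name')
# (90, 4, 'Tami')
# (30,)
# (41, 0)
# (90, 6, 'Amanda')
# (30,)
```

Because this fragment starts with an empty table, "name" gets index 0 here rather than the 15 it has in the full document above.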