This results in a hierarchical tree structure within the XML document.
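The hierarchy can be made visible by walking a parsed document. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the sample document and the `outline` helper are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
DOC = """<library>
  <book id="b1">
    <title>XML Basics</title>
    <author>A. Writer</author>
  </book>
  <book id="b2">
    <title>Schemas in Practice</title>
  </book>
</library>"""


def outline(element, depth=0):
    """Return one (depth, tag) pair per element, in document order."""
    rows = [(depth, element.tag)]
    for child in element:  # iterating an element yields its direct children
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(DOC)
tree_rows = outline(root)

# Print the tree with two spaces of indentation per nesting level.
for depth, tag in tree_rows:
    print("  " * depth + tag)
```

Each nested element becomes a child node of its enclosing element, so the indentation of the printed outline mirrors the nesting in the markup.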

Alternatively, one could store the XML files on the file system.

XML Schema Language was developed to overcome this shortfall (W3C, 2000).

XML Schema defines many more data types and allows rules to be specified not only for the structure of the XML document but also for the data it contains.

Harrusi, Averbuch, and Yehudai (2006) have also published an XML-aware compression technique.

Because many applications are data-centric and are interested in the contents of the XML document, the first approach is not suitable.

Before the development of XML, a certain amount of a priori agreement on data and its meaning was required between systems.

With the development of XML, data can be exchanged between systems without any prior agreement, so long as both systems understand the same vocabulary, that is, speak the same language.

That is, DTD treats all data contained within the XML document as a string.

This suits document markup languages, but is not suitable when an application needs to control the contained data.
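As an illustration of this limitation, consider a hypothetical `order` document whose DTD declares `quantity` as `#PCDATA`. The sketch below assumes Python's standard `xml.etree.ElementTree` parser, which reads the DTD but does not validate against it; the element names and the `parse_quantity` helper are invented for this example. Because the DTD cannot require a number, the application has to check the value itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an internal DTD subset. The DTD can only say
# that <quantity> holds character data (#PCDATA); it cannot say "integer".
DOC = """<!DOCTYPE order [
  <!ELEMENT order (quantity)>
  <!ELEMENT quantity (#PCDATA)>
]>
<order><quantity>twelve</quantity></order>"""


def parse_quantity(xml_text):
    """Return the quantity as an int, or None if the text is not numeric.

    The check lives in application code because the DTD cannot express it.
    """
    text = ET.fromstring(xml_text).findtext("quantity")
    try:
        return int(text)
    except (TypeError, ValueError):
        return None


raw = ET.fromstring(DOC).findtext("quantity")
print(repr(raw))                                      # the parser hands back a plain string
print(parse_quantity(DOC))                            # non-numeric text is rejected here
print(parse_quantity(DOC.replace("twelve", "12")))    # numeric text converts cleanly
```

The document containing `twelve` satisfies the DTD's content model just as well as one containing `12`, so the type check necessarily falls to the consuming application.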

DTDs were introduced with SGML, and they conform to Extended Backus-Naur Form (EBNF).

XML Schema documents, on the other hand, are written using XML syntax.
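One consequence is that a schema can be read with the same tools as any other XML document. The following sketch, which assumes Python's standard `xml.etree.ElementTree` module and a hypothetical `order` schema, parses a schema as ordinary XML and lists the element declarations it contains. It does not perform validation, which the standard library does not provide.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C XML Schema vocabulary.
XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema; the element names are illustrative only.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="quantity" type="xs:positiveInteger"/>
        <xs:element name="shipDate" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# The schema is itself well-formed XML, so an ordinary XML parser reads it.
schema_root = ET.fromstring(SCHEMA)

# Map each declared element name to its declared type (None when the type
# is given inline as a nested complexType rather than by a type attribute).
declared = {
    el.get("name"): el.get("type")
    for el in schema_root.iter(f"{{{XS}}}element")
}

for name, type_name in declared.items():
    print(name, "->", type_name)
```

The built-in types `xs:positiveInteger` and `xs:date` used here are examples of the data types that a schema can attach to element content.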

XML bridges this gap by being both human- and machine-readable, while being flexible enough to support platform- and architecture-independent data interchange.

