Before an application can read an XML document, it must learn what XML markup tags the document uses. It does this by reviewing the document type definition (DTD). A valid document includes a document type declaration that identifies the DTD the document satisfies. The DTD lists all the elements, attributes, and entities the document uses and the contexts in which it uses them.

Creating Your Own Markup Languages

When you create an XML document, you aren't really using XML to code the document. Instead, you are using a markup language that was created in XML. In other words, XML is used to create markup languages that are then used to create XML documents.

When you create your own markup language, you are basically establishing which elements (tags) and attributes are used to create documents in that language. Not only is it important to fully describe the different elements and attributes, but you must also describe how they relate to one another.

Creating Your First DTD

A DTD can be declared inline inside an XML document, or as an external reference.

Internal DTD Declaration

If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:

<!DOCTYPE root-element [element-declarations]>

Example XML document with an internal DTD:

<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>

The DTD above is interpreted like this:

!DOCTYPE note defines that the root element of this document is note.
!ELEMENT note defines that the note element contains four elements: "to,from,heading,body".
!ELEMENT to defines the to element to be of the type "#PCDATA".
!ELEMENT from defines the from element to be of the type "#PCDATA".
!ELEMENT heading defines the heading element to be of the type "#PCDATA".
!ELEMENT body defines the body element to be of the type "#PCDATA".

External DTD Declaration

If the DTD is declared in an external file, it should be wrapped in a DOCTYPE definition with the following syntax:

<!DOCTYPE root-element SYSTEM "filename">

This is the same XML document as above, but with an external DTD (Open it, and select view source):

<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

And this is the file "note.dtd" which contains the DTD:

<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

As you know by now, the goal of most XML documents is to be valid. Document validity is extremely important because it guarantees that the data within a document conforms to a standard set of guidelines as laid out in a schema (DTD or XSD).

An XML application can certainly determine if a document is well formed without any other information, but it requires a schema in order to assess document validity. This schema typically comes in the form of a DTD (Document Type Definition) or XSD (XML Schema Definition), which you learned about in next article.

To recap, schemas allow you to establish the following ground rules that XML documents must adhere to in order to be considered valid:

- Establish the elements that can appear in an XML document, along with the attributes that can be used with each
- Determine whether an element is empty or contains content (text and/or child elements)
- Determine the number and sequence of child elements within an element
- Set the default value for attributes

Sty - Knowledge is Free