Open XML File

Information, tips and instructions

XML Format

An XML file is formatted in accordance with the Extensible Markup Language (XML) 1.0 specification published by the World Wide Web Consortium (W3C). An XML 1.1 specification is also available, but it is not widely used or supported by software.

The structure of an XML document can be defined by a Document Type Definition (DTD) file. A DTD defines which elements the XML document can contain and which attributes can be assigned to those elements. It also defines the hierarchy of the document by specifying the parent-child relationships between elements.

XML documents that conform to the syntax rules of the XML specification are called well-formed. Well-formed XML documents can be parsed by any conforming XML parser and displayed in XML viewers.
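A well-formedness check can be sketched with Python's standard xml.etree.ElementTree module, which raises ParseError when the input breaks the XML syntax rules. The helper name is_well_formed and the sample strings are illustrative assumptions, not part of any specification.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses without a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A single root element with correctly nested tags.
good = "<card><title id='30'>Consultant</title></card>"
# The <title> tag is opened but never closed.
unclosed = "<card><title>Consultant</card>"
# Two top-level elements, so there is no single root.
two_roots = "<heading>Business Card</heading><title>Consultant</title>"

for sample in (good, unclosed, two_roots):
    print(is_well_formed(sample))
```

This only tests syntax. It says nothing about whether the document is valid against a DTD or schema.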

XML documents that reference a DTD and can be successfully verified against it are called valid XML documents.

Besides validation against a DTD, an XML document can also be validated against an XML schema. An XML schema is similar to a DTD in that it also describes the structure of the XML document. Two of the most popular XML schema languages are W3C XML Schema and RELAX NG. A W3C XML Schema is typically stored in a file with the .xsd extension.

Below is an example of an XML document that uses a DTD file for its structure definition. An XML document consists of tags and the data enclosed within them. Each element has an opening tag and a closing tag. In the example below, <heading> is an opening tag and </heading> is a closing tag. As you can see, a closing tag always starts with a forward slash. Each opening tag in XML must have a matching closing tag. The area between an opening tag and its closing tag is called the scope of the tag; if another tag is opened within that scope, it must also be closed within the same scope. Tags can enclose text and can carry attributes. In the example below, “Consultant” is text and id="30" is an attribute.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE card SYSTEM "card.dtd">
<card>
  <fname>John</fname>
  <lname>Doe</lname>
  <heading>Business Card</heading>
  <title id="30">Consultant</title>
</card>

As you can see above, the first line of the XML document declares the specification version and the encoding it uses. The version is typically 1.0, but 1.1 can also be used. The encoding depends on the application and the language you need to support; UTF-8 is the most popular choice, but others are available, including UTF-16, ISO-8859-1 (Latin-1), and US-ASCII. For more details about XML encodings and the internationalization process, see XML Encodings Support.
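The effect of the encoding declaration can be illustrated with a short Python sketch using the standard xml.etree.ElementTree module. The sample element and the name “Müller” are made up for illustration. The point is that the bytes of the file must actually be stored in the encoding the first line declares.

```python
import xml.etree.ElementTree as ET

text = '<?xml version="1.0" encoding="ISO-8859-1"?><lname>Müller</lname>'

# Bytes stored in the same encoding the declaration names: parses fine.
latin1_bytes = text.encode("iso-8859-1")
latin1_name = ET.fromstring(latin1_bytes).text

# The same characters stored as UTF-8, with a matching declaration.
utf8_bytes = text.replace("ISO-8859-1", "UTF-8").encode("utf-8")
utf8_name = ET.fromstring(utf8_bytes).text

# Declaration says UTF-8, but the bytes are Latin-1: the parser rejects it.
mismatched = text.replace("ISO-8859-1", "UTF-8").encode("iso-8859-1")
try:
    ET.fromstring(mismatched)
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True

print(latin1_name, utf8_name, mismatch_rejected)
```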

The second line of the XML document, <!DOCTYPE card SYSTEM "card.dtd">, references the card.dtd file, which defines the structure of the XML document. Below is the source code of the card.dtd file.

<!-- card.dtd: the root element is named by <!DOCTYPE card ...> in the XML document -->
<!ELEMENT card (fname,lname,heading,title)>
<!ELEMENT fname (#PCDATA)>
<!ELEMENT lname (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ATTLIST title id CDATA "0">

The !DOCTYPE card declaration names the root element of the XML document, “card”. The line <!ELEMENT card (fname,lname,heading,title)> specifies that the card element must contain fname, lname, heading and title elements as its children, in that order. The following lines declare that each child element contains #PCDATA (parsed character data), and the <!ATTLIST title id CDATA "0"> line declares an id attribute on title with a default value of "0".

When an XML document is processed by a validating parser, the DTD file is read first and its rules are loaded into memory. As the XML document is read, it is validated against the structure described in the DTD, and any error found is reported to the user. During parsing, the XML is also checked for well-formedness: that tags are opened and closed correctly, that only one root element is present, that encodings are correct, and so on.
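The two checks described above can be approximated in Python. The standard xml.etree.ElementTree parser checks well-formedness but does not validate against a DTD, so this sketch hand-codes the single rule <!ELEMENT card (fname,lname,heading,title)> as a stand-in for real DTD validation. The function name check_card and the placeholder values “John” and “Doe” are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Child sequence required by <!ELEMENT card (fname,lname,heading,title)>.
EXPECTED_CHILDREN = ["fname", "lname", "heading", "title"]


def check_card(xml_text):
    """Return a list of problems; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text)  # well-formedness check
    except ET.ParseError as err:
        return ["not well-formed: %s" % err]

    problems = []
    if root.tag != "card":
        problems.append("root element is <%s>, expected <card>" % root.tag)
    children = [child.tag for child in root]
    if children != EXPECTED_CHILDREN:
        problems.append(
            "children are %s, expected %s" % (children, EXPECTED_CHILDREN)
        )
    return problems


valid_card = """<card>
  <fname>John</fname>
  <lname>Doe</lname>
  <heading>Business Card</heading>
  <title id="30">Consultant</title>
</card>"""

# Same card without the fname and lname elements the content model requires.
missing_names = """<card>
  <heading>Business Card</heading>
  <title id="30">Consultant</title>
</card>"""

print(check_card(valid_card))     # no problems reported
print(check_card(missing_names))  # one problem: wrong child sequence

# Element text and attributes are available once parsing succeeds.
title = ET.fromstring(valid_card).find("title")
print(title.text, title.get("id"))
```

A validating parser derives these rules from the DTD itself instead of hard-coding them, and it also checks attribute declarations and element content types.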

This brief overview of the XML/DTD document format is enough to give you a basic understanding of how an XML document is structured and parsed. For more details, refer to the fifth edition of the XML 1.0 specification at

File Extension Info

XML Quick Info
  Extensible Markup Language
Opens with
  Microsoft XML Notepad
  Altova XMLSpy