The parser returns a tree built during document analysis. The returned value is an **xmlDocPtr** (i.e., a pointer to an **xmlDoc** structure). This structure holds information such as the file name, the document type, and a **children** pointer to the first node under the document, the document itself being the root of the tree. The tree is made of **xmlNode**s, chained in doubly linked lists of siblings and connected through children/parent pointers. An xmlNode can also carry properties (a chain of xmlAttr structures). An attribute may have a value, which is a list of TEXT or ENTITY_REF nodes.
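
The linkage described above can be explored without compiling against the library. What follows is a minimal sketch that uses Python's standard `xml.dom.minidom` module as a stand-in; it is not the libxml2 API, and the sample string is a hypothetical document made up for illustration. The minidom properties `firstChild`, `nextSibling`, `previousSibling`, and `parentNode` play roles analogous to the sibling and children/parent links of an xmlNode.

```python
from xml.dom import minidom

# Hypothetical sample document, written without whitespace between
# elements so that no extra TEXT nodes appear among the children.
SAMPLE = '<EXAMPLE prop1="gnome is great"><head/><chapter/></EXAMPLE>'

doc = minidom.parseString(SAMPLE)
root = doc.documentElement      # the single top-level element
head = root.firstChild          # first node in the root's child list
chapter = head.nextSibling      # forward link in the sibling list

# Siblings are linked in both directions.
assert chapter.previousSibling is head
# Every child points back to its parent.
assert head.parentNode is root and chapter.parentNode is root
# Attributes are stored apart from the child list.
prop1 = root.getAttributeNode("prop1")
print(prop1.name, "=", prop1.value)
```

Note that attributes are reached through a separate accessor and never show up among `root`'s children, which matches the separate chain of properties described above.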

Here is an example (invalid with respect to the XML specification, which allows only one ELEMENT under the root):

![structure.gif](http://xmlsoft.org/structure.gif)

The source package includes a small program (not installed by default) called **xmllint**, which parses the XML files given as arguments and prints them back as parsed. This is useful for detecting errors both in XML code and in the XML parser itself. Its **--debug** option prints the actual in-memory structure of the document; here is the result for the [example](http://xmlsoft.org/tree.html#example) given above:

```
DOCUMENT
version=1.0
standalone=true
  ELEMENT EXAMPLE
    ATTRIBUTE prop1
      TEXT
      content=gnome is great
    ATTRIBUTE prop2
      ENTITY_REF
      TEXT
      content= linux too
    ELEMENT head
      ELEMENT title
        TEXT
        content=Welcome to Gnome
    ELEMENT chapter
      ELEMENT title
        TEXT
        content=The Linux adventure
      ELEMENT p
        TEXT
        content=bla bla bla ...
      ELEMENT image
        ATTRIBUTE href
          TEXT
          content=linus.gif
      ELEMENT p
        TEXT
        content=...
```
This should be useful for learning the internal representation model.
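
To experiment with this kind of dump, a similar listing can be produced with a short recursive walk. The sketch below again relies on Python's standard `xml.dom.minidom` rather than libxml2, so it only approximates the output above: the `dump` helper and the sample string are hypothetical, and because the underlying parser expands entity references, no ENTITY_REF nodes will appear.

```python
from xml.dom import minidom

def dump(node, depth=0, out=None):
    """Return a list of indented lines describing node and its subtree."""
    out = [] if out is None else out
    pad = "  " * depth
    if node.nodeType == node.DOCUMENT_NODE:
        out.append(pad + "DOCUMENT")
    elif node.nodeType == node.ELEMENT_NODE:
        out.append(pad + "ELEMENT " + node.tagName)
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            # Each attribute value is shown as a single TEXT entry.
            out.append(pad + "  ATTRIBUTE " + attr.name)
            out.append(pad + "    TEXT")
            out.append(pad + "      content=" + attr.value)
    elif node.nodeType == node.TEXT_NODE:
        out.append(pad + "TEXT")
        out.append(pad + "  content=" + node.data)
    # Follow the sibling chain of the children, one level deeper.
    child = node.firstChild
    while child is not None:
        dump(child, depth + 1, out)
        child = child.nextSibling
    return out

# Hypothetical sample input, loosely modelled on the chapter element above.
SAMPLE = ('<chapter><title>The Linux adventure</title>'
          '<image href="linus.gif"/></chapter>')
lines = dump(minidom.parseString(SAMPLE))
print("\n".join(lines))
```

The walk descends through `firstChild` and then follows `nextSibling`, so the indentation depth of each line reflects how far the node is from the document.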

Daniel Veillard