module yeti.xml

Simple DOM-like XML API for Yeti.

This module internally uses the Streaming API for XML (StAX) for parsing and generating XML. With JDK 1.6 or later, a StAX implementation comes with the JRE. Otherwise, you need to include a StAX implementation in the classpath.

typedef xml_value =
CData string |
CData section.
Comment string |
XML comment string.
DTD string |
Document Type Definition. May exist only in the root element's tailValues list.
PCData string
Character data section.

typedef xmlns_declaration = {
prefix is string,
Namespace prefix for XML elements and attributes.
uri is string
Namespace URI.
}

typedef xml_element = ('a is {
attributes is hash<string, string>,
XML element attributes in a hash map. The map keys may be either attribute local names (if no prefix is intended) or in the form "{namespace-URI}local-name".

The xmlParse function inserts the {namespace-URI} prefix only when the KeepNS () option was given and the attribute had a prefix.

The xmlWrite function requires that xmlns prefixes are defined for all namespace URIs used.
elements is array<'a>,
Nested XML child elements.
name is string,
XML element local name.
var tailValues is list<xml_value>,
List of non-element XML sections directly after this element's closing tag (including XML comments).

The root element can also have the DTD (document type definition) and preceding comments here.
var text is string,
Text contained in this XML element. When read, this property is the result of concatenating all PCData and CData sections from this element's values and the nested elements' tailValues fields.

When written, the values field is assigned [PCData value], and the tailValues of all nested elements are assigned the empty list [] (removing all non-element sections from this XML element).

This property is only for convenience (the xmlWrite function ignores it).
var uri is string,
Namespace URI for the XML element's name. The value undef_str is used when no namespace prefix is present.
var values is list<xml_value>,
List of non-element XML sections directly after this element's starting tag (and before any nested element), including XML comments.
var xmlns is list<xmlns_declaration>
List of xmlns namespaces declared with this element (these use the xmlns:prefix="uri" form in the actual XML).
})
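
The way mixed content is split between the values and tailValues fields can be inspected with a minimal sketch like the following. The sample XML and the variable names are made up for illustration. According to the field descriptions above, the text before <b> would be expected in the values of <p>, and the text after </b> in the tailValues of the <b> element.

 load yeti.xml;
 // Hypothetical sample document with mixed content.
 p = xmlParse [Str "<p>Hello <b>big</b> world</p>"];
 // Sections directly after the <p> start tag, before any nested element.
 for p.values do v:
     case v of
     PCData s: println "values of \(p.name): PCData '\(s)'";
     CData s: println "values of \(p.name): CData '\(s)'";
     _: ()
     esac
 done;
 // Sections directly after each nested element's closing tag.
 for p.elements do b:
     for b.tailValues do v:
         case v of
         PCData s: println "tailValues of \(b.name): PCData '\(s)'";
         _: ()
         esac
     done
 done;
 // Convenience property that concatenates the character data.
 println p.text;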

Module signature

{
xmlByPath path element is list<string> → ('a is {.elements is list?<'a>, .name is string}) → list<'a>,
Finds all child elements by the given path.
path - list of element names forming the path
element - element where to start

Description

First finds all child elements whose name matches the first path element, then their child elements matching the next path element, and so on for each path element. Finally returns the child elements matching the last path element.

Examples

 load yeti.xml;
 xml = xmlParse [Str """
    <foo>
        <nothing>???</nothing>
        <something>
           <bar>33</bar>
           <baz>whatever</baz>
           <bar>42</bar>
        </something>
    </foo>"""];
An XML tree is obtained by parsing the above XML. The root element (in xml) is the <foo> element. The bar elements can be found using the following query:
 bars = xmlByPath ["something", "bar"] xml;
The contents of bars can be obtained by mapping the (.text) field accessor over the list:
 bar_values = map (.text) bars;
 for bar_values println;
In this case the for loop prints the values 33 and 42:
 33
 42
xmlElement name is string → xml_element,
Creates a new empty XML element structure.
name - local name for the XML element
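
Examples

A minimal sketch of building a small document by hand, assuming only the fields described in the xml_element typedef. The element and attribute names are hypothetical. The text is assigned before the element is nested, since assigning text clears the tailValues of nested elements.

 load yeti.xml;
 // Child element with an attribute and a text value.
 item = xmlElement "item";
 item.attributes["id"] := "42";
 item.text := "first";
 // The root element holds the child in its elements array.
 root = xmlElement "items";
 push root.elements item;
 // Print the generated XML to standard output.
 xmlWrite [Out println] root;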
xmlParse options is
list?<
Coalescing. boolean |
Whether to require the processor to coalesce adjacent character data. With coalescing, CDATA sections lying between PCDATA may be absorbed into the surrounding character data. Default is false (for conforming XML Streaming API implementations).
File. string |
Read XML from a file with this name.
InputStream. ~java.io.InputStream |
Read XML from this InputStream instance (expects UTF-8 encoding).
KeepAll. () |
Turns on both KeepNS (preserving attribute namespaces) and KeepWS (preserving ignorable whitespace).
KeepNS. () |
Element attribute names (in the .attributes hash) will be in the form '{namespace-URI}attribute-name' for attributes that had a namespace prefix in the XML. When combined with the NSAware false option, the prefix part of attribute names will be preserved.
KeepWS. () |
Keep ignorable whitespace (as defined in the XML 1.1 recommendation).
NSAware. boolean |
Turns on/off namespace processing for XML 1.0 support. Default is true (for conforming XML Streaming API implementations).
Reader. ~java.io.Reader |
Read XML from this Reader instance.
Source. ~javax.xml.transform.Source |
Read XML from this Source instance.
Str. string |
Parse this string as XML.
Validate. ()
Turns on implementation-specific DTD validation.
>
xml_element
,
Parses XML from the specified source into an xml_element structure. The options list must contain one of the following source options: InputStream, Reader, Source, Str or File.
options - XML source and parsing options
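
Examples

A sketch of combining a source option with a parsing option. The sample XML and the namespace URI urn:example are made up for illustration. With KeepNS (), the key of the prefixed attribute is expected to take the '{namespace-URI}attribute-name' form described above, while the unprefixed attribute keeps its plain local name.

 load yeti.xml;
 // Hypothetical document with one prefixed and one plain attribute.
 doc = xmlParse [Str '<r xmlns:x="urn:example"><e x:id="1" plain="2"/></r>',
                 KeepNS ()];
 // Print every attribute key and value of each child element.
 for doc.elements do e:
     forHash e.attributes do name value:
         println "\(name) = \(value)"
     done
 done;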
xmlWrite options document is
list?<
File. string |
Write UTF-8 encoded XML to file with this name.
Indent. string |
Format generated XML by inserting "\n" before each start element, and indenting the lines with this string (repeated by element nesting depth).
Out. (string → ()) |
Apply this function to the generated XML (string).
OutputStream. ~java.io.OutputStream |
Write XML to this OutputStream instance in UTF-8 encoding.
Result. ~javax.xml.transform.Result |
Write XML to this Result instance.
Writer. ~java.io.Writer
Write XML to this Writer instance.
>
('b is {
.attributes is hash<string, string>,
XML element attributes in a hash map. The map keys may be either attribute local names (if no prefix is intended) or in the form "{namespace-URI}local-name".

The xmlParse function inserts the {namespace-URI} prefix only when the KeepNS () option was given and the attribute had a prefix.

The xmlWrite function requires that xmlns prefixes are defined for all namespace URIs used.
.elements is array<'b>,
.name is string,
.tailValues is list<xml_value>,
var .text is string,
Text contained in this XML element. When read, this property is the result of concatenating all PCData and CData sections from this element's values and the nested elements' tailValues fields.

When written, the values field is assigned [PCData value], and the tailValues of all nested elements are assigned the empty list [] (removing all non-element sections from this XML element).

This property is only for convenience (the xmlWrite function ignores it).
.uri is string,
.values is list<xml_value>,
.xmlns is list<xmlns_declaration>
}) →
()
Generates a textual representation of the XML document and writes it to the specified destination. The options list must contain one of the following destination options: OutputStream, Writer, File, Result or Out.
options - destination and formatting options
document - XML root element representing a document

Examples

Print XML to standard output:
 xmlWrite [Out println, Indent '  '] document;
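Collect the generated XML into a string, or write it to a file. This is a sketch only: the sample document and the file name out.xml are hypothetical, and the Out function appends to the buffer because it is not specified here whether it is applied once or several times.
 load yeti.xml;
 document = xmlParse [Str "<a><b>1</b></a>"];
 // Accumulate whatever is passed to the Out function.
 var xmlText = "";
 xmlWrite [Out (do s: xmlText := xmlText ^ s done)] document;
 println xmlText;
 // Write the same document, indented, to a file in UTF-8.
 xmlWrite [File "out.xml", Indent "  "] document;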
}