XML 6

Extensible Markup Language
Acknowledgements and copyrights: these slides are the result of
combining notes and slides with contributions from: Michael
Kifer, Arthur Bernstein, Philip Lewis, Hanspeter Mössenböck,
Wolfgang Beer, Dietrich Birngruber, Albrecht
Wöß, Mark Sapossnek, Bill Andreopoulos, Divakaran Liginlal,
Michael Morrison, Anestis Toptsis, Deitel and Associates, Prentice
Hall, Addison Wesley, Microsoft AA.
They serve for teaching purposes only and only for the students that are
registered in CSE4413 and should not be published as a book or in any
form of commercial product, unless written permission is obtained
from each of the above listed names and/or organizations.
1
[XSL cont’d]
Other conditional tags
• <xsl:choose> is similar to the switch statement
• <xsl:when test="..."> is used inside choose, one per case
• <xsl:otherwise> is used once, for the case where no when matches
2
Example
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <xsl:for-each select="/contacts/contact/name">
      <xsl:choose>
        <xsl:when test="text() = 'Frank Rizzo'">
          <br/>
          This is the Frank Rizzo Contact::
          <!-- Alternative ways to do the same thing: select="text()" here, select="." in the next xsl:when. -->
          <xsl:value-of select="text()"></xsl:value-of>
          <br/>
          <hr/>
        </xsl:when>
        <xsl:when test="text() = 'Sol Rosenberg'">
          <br/>
          This is the <xsl:value-of select="."/> Contact.
          <br/>
          <hr/>
        </xsl:when>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
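The stylesheet above has no <xsl:otherwise> branch. The following is a minimal sketch of all three conditional tags together, run from C#. It assumes XslCompiledTransform (the .NET 2.0+ replacement for XslTransform), a trimmed-down contacts document, and a made-up third contact ("Jane Doe") that exists only to trigger the otherwise branch.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class ChooseDemo
{
    // Hypothetical, trimmed-down contacts document ("Jane Doe" is made up).
    const string Xml =
        "<contacts>" +
        "<contact><name>Frank Rizzo</name></contact>" +
        "<contact><name>Sol Rosenberg</name></contact>" +
        "<contact><name>Jane Doe</name></contact>" +
        "</contacts>";

    // choose/when/otherwise: the first matching when wins, otherwise is the default.
    const string Xslt =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>" +
        "<xsl:output method='text'/>" +
        "<xsl:template match='/'>" +
        "<xsl:for-each select='/contacts/contact/name'>" +
        "<xsl:choose>" +
        "<xsl:when test=\". = 'Frank Rizzo'\">[Frank]</xsl:when>" +
        "<xsl:when test=\". = 'Sol Rosenberg'\">[Sol]</xsl:when>" +
        "<xsl:otherwise>[other]</xsl:otherwise>" +
        "</xsl:choose>" +
        "</xsl:for-each>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    public static string Run()
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        using (XmlReader style = XmlReader.Create(new StringReader(Xslt)))
        {
            xslt.Load(style);
        }

        StringWriter output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(Xml)))
        {
            xslt.Transform(input, null, output);
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Each name produces one marker, so the third contact falls through to the otherwise branch.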
3
Result …
4
Sorting
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/">
<xsl:for-each select="contacts/contact">
<xsl:sort select="name" order="descending"/>
<xsl:apply-templates/>
<br/>
<hr/>
</xsl:for-each>
</xsl:template>
<xsl:template match ="city">
<xsl:value-of select="."/>
<br/>
</xsl:template>
<xsl:template match ="state">
<xsl:value-of select="."/>
<br/>
</xsl:template>
<xsl:template match ="zip">
<xsl:value-of select="."/>
<br/>
</xsl:template>
<xsl:template match ="company">
<xsl:value-of select="."/>
<br/>
</xsl:template>
<xsl:template match ="notes">
<xsl:value-of select="."/>
<br/>
</xsl:template>
<xsl:template match ="name">
<xsl:value-of select="."/>
<br/>
</xsl:template>
<xsl:template match ="address">
<xsl:value-of select="."/>
<br/>
</xsl:template>
<xsl:template match ="email">
<xsl:value-of select="."/>
<br/>
</xsl:template>
<xsl:template match ="phone">
<xsl:apply-templates/>
<br/>
</xsl:template>
<!-- The templates: one per element; each prints the element's value followed by a line break. -->
<xsl:template match ="voice">
<xsl:value-of select="."/>
<br/>
</xsl:template>
<xsl:template match ="fax">
<xsl:value-of select="."/>
<br/>
</xsl:template>
<xsl:template match ="mobile">
<xsl:value-of select="."/>
<br/>
</xsl:template>
</xsl:stylesheet>
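To see the effect of <xsl:sort> in isolation, here is a small sketch that applies a descending sort on name and prints the names in the resulting order. It assumes XslCompiledTransform and an abbreviated contacts document; the third contact ("Jane Doe") is hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class SortDemo
{
    // Abbreviated contacts document; "Jane Doe" is a made-up entry.
    const string Xml =
        "<contacts>" +
        "<contact><name>Frank Rizzo</name></contact>" +
        "<contact><name>Sol Rosenberg</name></contact>" +
        "<contact><name>Jane Doe</name></contact>" +
        "</contacts>";

    // xsl:sort must be the first child of xsl:for-each.
    const string Xslt =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>" +
        "<xsl:output method='text'/>" +
        "<xsl:template match='/'>" +
        "<xsl:for-each select='contacts/contact'>" +
        "<xsl:sort select='name' order='descending'/>" +
        "<xsl:value-of select='name'/><xsl:text>;</xsl:text>" +
        "</xsl:for-each>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    public static string Run()
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        using (XmlReader style = XmlReader.Create(new StringReader(Xslt)))
        {
            xslt.Load(style);
        }

        StringWriter output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(Xml)))
        {
            xslt.Transform(input, null, output);
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```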
5
Result …
The corresponding
XML/HTML file
6
The XML DOM
• XML Document Object Model (DOM)
• Provides a programming interface for manipulating XML documents
in memory
• Includes a set of objects and interfaces that represent the content and
structure of an XML document. Defines the document structure
through an object model
– Tree-view of a document
– Nodes, elements and attributes, text elements, etc
• Enables a program to traverse an XML tree
• Allows elements, attributes, etc., to be added/deleted in an XML tree
• Allows new XML documents to be created programmatically.
• W3C defined the DOM Level 1 and Level 2 Core
– http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/
– http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/
7
According to W3C ..
• The DOM is a “Platform- and language-
neutral interface that will allow
programs and scripts to dynamically
access and update the content,
structure and style of documents”
8
Generating The DOM
[Figure: an XML document (<?xml version="1.0"?> ...) is fed to a parser, which produces the DOM tree: a root element with child elements, each holding text.]
9
DOM API for XML
– Provides access to an in-memory tree representation of the
XML Document
– Performs parsing of the document to load into memory
– Returns a handle to the root element
– Can search for any node in the DOM
– Can add or delete elements, attributes etc.
– Can transform the DOM into a new XML Document
– Can work with Document fragments
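The capabilities listed above can be sketched with the .NET XmlDocument class. This is an illustrative sketch only: the one-contact input, the "count" attribute, and the added city element are assumptions chosen to keep the example short.

```csharp
using System;
using System.Xml;

public static class DomApiDemo
{
    public static string Run()
    {
        XmlDocument doc = new XmlDocument();
        // Parse the text and build the in-memory tree.
        doc.LoadXml("<contacts><contact><name>Frank Rizzo</name></contact></contacts>");

        // Handle to the root element.
        XmlElement root = doc.DocumentElement;

        // Search for a node anywhere in the tree.
        XmlNode name = doc.SelectSingleNode("//name");
        XmlNode contact = name.ParentNode;

        // Add an element and an attribute.
        XmlElement city = doc.CreateElement("city");
        city.InnerText = "New York";
        contact.AppendChild(city);
        root.SetAttribute("count", "1");

        // Delete an element.
        contact.RemoveChild(name);

        // Work with a document fragment: build it separately, then attach it.
        XmlDocumentFragment frag = doc.CreateDocumentFragment();
        frag.InnerXml = "<contact><name>Sol Rosenberg</name></contact>";
        root.AppendChild(frag);

        // Turn the modified tree back into XML text.
        return doc.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```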
10
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="contacts.xsl"?>
<contacts>
<!-- This is my good friend Frank. -->
<contact>
<name>Frank Rizzo</name>
<address>1212 W 304th Street</address>
<city>New York</city>
<state>New York</state>
<zip>10011</zip>
<phone>
<voice>212-555-1212</voice>
<fax>212-555-1342</fax>
<mobile>212-555-1115</mobile>
</phone>
<email>[email protected]</email>
<web> http://abc.com </web>
<company> </company>
</contact>
<!-- This is my old college roommate Sol. -->
<contact>
<name>Sol Rosenberg</name>
<address>1162 E 412th Street</address>
<city>New York</city>
<state>New York</state>
<zip>10011</zip>
<phone>
<voice>212-555-1818</voice>
<fax>212-555-1828</fax>
<mobile>212-555-1521</mobile>
</phone>
<email>[email protected]</email>
<web> www.abc.org </web>
<company>Rosenberg&apos;s Shoes &amp; Glasses</company>
<notes>Sol collects Civil War artifacts.</notes>
</contact>
</contacts>
Example DOM: The XML document (contacts.xml, again)
11
Example DOM: The DOM tree
[Figure: the DOM tree for contacts.xml, abbreviated]
document
  xml (declaration)
  contacts
    contact
      name, address, city, web, email, company
    contact
      name, address, city, state, zip, company, notes
      phone
        voice, fax, mobile
12
The DOM tree is autogenerated by most XML editors, such as the one
we see here from <oXygen/>.
[Figure: data tree generated by <oXygen/>. Only the xml file is needed to produce it.]
13
.NET Supports XML
• XML 1.0
• http://www.w3.org/TR/1998/REC-xml-19980210
• XML Namespaces
• http://www.w3.org/TR/1999/REC-xml-names-19990114/
• XML Schemas
• http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/
• http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/
• XPath expressions
• http://www.w3.org/TR/1999/REC-xpath-19991116
• XSL/T transformations
• http://www.w3.org/TR/1999/REC-xslt-19991116
• DOM Level 1 and Level 2 Core
• http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/
• http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/
• More … (later)
14
Processing XML data in .NET
[Figure: how the .NET XML classes fit together: XmlReader, XmlDocument, XmlWriter, XPathNavigator (XPath), and XslTransform (XSLT stylesheet).]
• XmlReader: Reading XML data
• XmlDocument, XmlNode: Object model for XML data (DOM)
• XmlWriter: Writing XML data
• XPathNavigator: XPath selections
• XslTransform: Transformation of XML documents
15
XML Namespaces in .NET
System.Xml
.Xsl
.XPath
.Schema
.Serialization
From the actual .NET
documentation.
16
XML Namespaces in .NET…/
System.Xml (sub-namespaces: .Xsl, .XPath, .Schema, .Serialization)
Types in System.Xml include:
EntityHandling, Formatting, NameTable, ReadState, TreePosition, Validation, WriteState,
XmlAttribute, XmlAttributeCollection, XmlCDataSection, XmlCharacterData, XmlCharType,
XmlComment, XmlConvert, XmlDataDocument, XmlDeclaration, XmlDocument,
XmlDocumentFragment, XmlDocumentType, XmlElement, XmlEntity, XmlEntityReference,
XmlNamedNodeMap, XmlNode, XmlNodeReader, XmlNodeType, XmlNotation, XmlReader,
XmlSpace, XmlText, XmlTextReader, XmlTextWriter, XmlUrlResolver, XmlWhitespace,
XmlWriter, ...
17
System.Xml Namespace
• Overall namespace for classes that provide XML support in .NET.
• Contains classes for creating, editing, navigating XML documents
• Reading, writing and manipulating documents via the DOM
– Use the XmlDocument class for XML documents
• Classes that correspond to every “type” of XML “element”:
– XmlElement, XmlAttribute, XmlComment, etc
18
XmlReader
• Abstract base class for reading XML
• Reads in depth-first order
– Same order as textual XML data
• Fast, forward-only (cannot jump within XML document),
non-cached XML stream reader.
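• .NET's counterpart to SAX (Simple API for XML). Unlike SAX, which
pushes events to callbacks, XmlReader is a pull-model reader: the
program asks for the next node by calling Read()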
• Implementations are:
– XmlTextReader: efficient, no immediate storage of elements
– XmlValidatingReader: validates document against DTD or XSD
– XmlNodeReader: reading from an XmlNode (DOM)
19
XmlReader
public abstract class XmlReader {
  // Properties of the current element
  public abstract string Name { get; }              // full name
  public abstract string LocalName { get; }         // local name
  public abstract string Value { get; }             // value
  public abstract XmlNodeType NodeType { get; }     // type
  public abstract int AttributeCount { get; }       // number of attributes
  public abstract int Depth { get; }                // depth in document
  public abstract bool Read();                      // reads the next element
  public virtual void Skip();                       // skips the current element and its sub-elements
  public abstract string GetAttribute(int i);       // gets the element's attributes
  public abstract void Close();                     // closes the reader
  ...
}
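A short sketch of how these members behave while a reader walks a document in depth-first order. It uses XmlTextReader over an in-memory string; the one-element input is a made-up fragment, and the log format is only for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class ReaderDemo
{
    public static string Run()
    {
        // Made-up input; any well-formed XML works.
        string xml = "<contact id='1'><name>Frank Rizzo</name></contact>";
        StringBuilder log = new StringBuilder();

        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        try
        {
            // Read() advances to the next node and returns false at the end.
            while (r.Read())
            {
                log.AppendFormat("{0} name='{1}' depth={2} attrs={3} value='{4}'\n",
                    r.NodeType, r.Name, r.Depth, r.AttributeCount, r.Value);
            }
        }
        finally
        {
            r.Close();
        }
        return log.ToString();
    }

    public static void Main()
    {
        Console.Write(Run());
    }
}
```

Start tags, text, and end tags each show up as separate nodes, which is why the address example in these slides calls Read() a second time to reach the text inside an element.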
20
XmlWriter
• Abstract base class for writing XML
• Fast, forward-only, non-cached XML stream writer
• Base class for XmlTextWriter
21
XmlTextReader & XmlTextWriter
• Derived from the XmlReader & XmlWriter abstract classes
• Implement all the functionality defined by their base classes
• Designed to work with a text based stream
– As opposed to an in-memory DOM
• XmlTextReader methods support reading XML elements
– Read, MoveToElement, ReadString, etc
• XmlTextWriter methods support writing XML elements
– WriteDocType, WriteComment, WriteName, etc
22
XML in .NET
XmlTextReader
• Forward-only, read-only, non-cached access to stream-based
XML data
• Implements XmlReader
• Access to data by parsing text input from:
– Streams
– TextReader objects
– Strings
• Properties and methods to view elements and attributes
• Event support for validation
23
XML in .NET
XmlTextReader coding bits (C#)
XmlTextReader reader = new XmlTextReader("c:\\Sample.xml");
while (reader.Read()) {
...
}
24
A C# example using the XmlTextReader class …
using System;
using System.Xml;
// find and print all addresses in file contacts.xml
namespace ConsoleApplication1XmlTextReader
{
class Class1
{
static void Main(string[] args)
{
XmlTextReader r;
r = new XmlTextReader("E:\\4413 -- e-commerce course\\xmlTest\\ConsoleApplication1XmlTextReader\\bin\\Debug\\contacts.xml");
while (r.Read())
{
if (r.IsStartElement("address"))
{
r.Read(); // read the value
Console.WriteLine("{0}, ", r.Value);
}
}
r.Close();
}
}
}
25
The output …
26
XmlTextWriter coding bits (C#)
XmlTextWriter writer = new XmlTextWriter("c:\\Sample.xml", null);
writer.WriteStartDocument();        // Write the declaration
writer.WriteStartElement("ROOT");   // Write the root element
writer.WriteEndElement();           // Write the close tag of root
writer.Flush();                     // Write XML to file
writer.Close();                     // Close writer
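A slightly fuller sketch of the same forward-only writing style, assuming we write into a StringWriter instead of a file so the result can be inspected. The element and attribute names are borrowed from the contacts example.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WriterDemo
{
    public static string Run()
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(sw);
        writer.Formatting = Formatting.Indented;

        writer.WriteStartDocument();                       // XML declaration
        writer.WriteStartElement("contacts");              // <contacts>
        writer.WriteStartElement("contact");               //   <contact
        writer.WriteAttributeString("id", "1");            //     id="1">
        writer.WriteElementString("name", "Frank Rizzo");  //     <name>Frank Rizzo</name>
        writer.WriteEndElement();                          //   </contact>
        writer.WriteEndElement();                          // </contacts>
        writer.WriteEndDocument();
        writer.Close();                                    // flushes and closes the StringWriter

        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Because the writer is forward-only, elements must be closed in the reverse order in which they were opened; there is no way to go back and edit earlier output.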
27
XmlDocument
• Derived from the XmlNode class
• Represents an entire (in memory) XML document
• Supports DOM Level 1 and Level 2
Core functionality
• Reading & writing built on top of XmlReader &
XmlWriter
• Load a document and generate the DOM
– Using: URI, file, XmlReader, XmlTextReader or Stream
28
XML in .NET
XmlDocument
• Implements the W3C XML Document Object Model (DOM) Level 1 and
Level 2 specifications
• Derives from XmlNode
– XML elements are represented by XmlNode objects
• Represents an entire XML document
– Construction of object structure in main memory
– All nodes are available to view/manipulate (efficient
manipulation of XML data, but not efficient in terms of space).
• Cached in memory
• Events for changing, inserting and removing nodes. (These are change
notifications on the in-memory tree; they are not SAX parsing events.)
29
Class XmlDocument …
public class XmlDocument : XmlNode {
  public XmlDocument();
  public XmlElement DocumentElement { get; }              // root element
  public virtual XmlDocumentType DocumentType { get; }    // document type (DOCTYPE declaration)
  // Loading the XML data
  public virtual void Load(Stream inStream);
  public virtual void Load(string url);
  public virtual void LoadXml(string data);
  // Saving
  public virtual void Save(Stream outStream);
  public virtual void Save(string url);
30
Class XmlDocument …/
  // Creation of: declaration node, elements, text nodes, comments
  public virtual XmlDeclaration CreateXmlDeclaration
    (string version, string encoding, string standalone);
  public XmlElement CreateElement(string name);
  public XmlElement CreateElement
    (string qualifiedName, string namespaceURI);
  public virtual XmlElement CreateElement
    (string prefix, string lName, string nsURI);
  public virtual XmlText CreateTextNode(string text);
  public virtual XmlComment CreateComment(string data);
  // Events
  public event XmlNodeChangedEventHandler NodeChanged;
  public event XmlNodeChangedEventHandler NodeChanging;
  public event XmlNodeChangedEventHandler NodeInserted;
  public event XmlNodeChangedEventHandler NodeInserting;
  public event XmlNodeChangedEventHandler NodeRemoved;
  public event XmlNodeChangedEventHandler NodeRemoving;
}
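The events listed above can be observed with a small sketch like the following. It assumes an empty contacts document, subscribes to NodeInserted and NodeRemoved, and records what each handler sees; the log wording is illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class NodeEventsDemo
{
    public static List<string> Run()
    {
        List<string> log = new List<string>();
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<contacts/>");

        // Subscribe after loading so that only our own edits are logged.
        doc.NodeInserted += delegate(object sender, XmlNodeChangedEventArgs e)
        {
            log.Add("inserted " + e.Node.Name + " under " + e.NewParent.Name);
        };
        doc.NodeRemoved += delegate(object sender, XmlNodeChangedEventArgs e)
        {
            log.Add("removed " + e.Node.Name + " from " + e.OldParent.Name);
        };

        XmlElement contact = doc.CreateElement("contact");
        doc.DocumentElement.AppendChild(contact);   // raises NodeInserting, then NodeInserted
        doc.DocumentElement.RemoveChild(contact);   // raises NodeRemoving, then NodeRemoved
        return log;
    }

    public static void Main()
    {
        foreach (string line in Run())
        {
            Console.WriteLine(line);
        }
    }
}
```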
31
XML in .NET
XmlDocument coding bits…
// Load an XML doc in memory
XmlDocument myXmlDoc = new XmlDocument();
myXmlDoc.Load("c:\\Sample.xml");
// Create new document and add declaration
XmlDocument doc = new XmlDocument();
XmlDeclaration decl = doc.CreateXmlDeclaration("1.0", null, null);
doc.AppendChild(decl);
Result:
<?xml version="1.0"?>
32
XmlDocument coding bits…
// Create the root element, give it an attribute, and add it to the document
XmlElement rootElem = doc.CreateElement("contacts");
rootElem.SetAttribute("myAttribute", "1");
doc.AppendChild(rootElem);
Result:
<contacts myAttribute="1">
33
XmlDocument coding bits…
// Create and add a contact element and its subelements
XmlElement aContact = doc.CreateElement("contact");
aContact.SetAttribute("id", "1");
XmlElement f = doc.CreateElement("firstname");
f.AppendChild(doc.CreateTextNode("Frank"));
aContact.AppendChild(f);
XmlElement l = doc.CreateElement("lastname");
l.AppendChild(doc.CreateTextNode("Rizzo"));
aContact.AppendChild(l);
// Attach the new contact to the root element created earlier;
// without this step the contact never becomes part of the document.
rootElem.AppendChild(aContact);
Result:
<?xml version="1.0"?>
<contacts myAttribute="1">
<contact id="1">
<firstname>Frank</firstname>
<lastname>Rizzo</lastname>
</contact>
</contacts>
[Figure: the resulting tree: contacts (myAttribute = 1) contains contact (id = 1), which contains firstname (Frank) and lastname (Rizzo).]
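Put together, the coding bits from the last three slides form a complete program. The sketch below assembles them; the class and method names are hypothetical, and the document is returned as a string via OuterXml rather than saved to a file.

```csharp
using System;
using System.Xml;

public static class BuildContactsDemo
{
    public static string Run()
    {
        // New document with an XML declaration.
        XmlDocument doc = new XmlDocument();
        XmlDeclaration decl = doc.CreateXmlDeclaration("1.0", null, null);
        doc.AppendChild(decl);

        // Root element with one attribute.
        XmlElement rootElem = doc.CreateElement("contacts");
        rootElem.SetAttribute("myAttribute", "1");
        doc.AppendChild(rootElem);

        // One contact with two subelements.
        XmlElement aContact = doc.CreateElement("contact");
        aContact.SetAttribute("id", "1");

        XmlElement f = doc.CreateElement("firstname");
        f.AppendChild(doc.CreateTextNode("Frank"));
        aContact.AppendChild(f);

        XmlElement l = doc.CreateElement("lastname");
        l.AppendChild(doc.CreateTextNode("Rizzo"));
        aContact.AppendChild(l);

        // Nodes created by the document are detached until appended.
        rootElem.AppendChild(aContact);

        return doc.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```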
34
XmlDocument coding bits … (C#)
using System.Xml;
// Create an XmlDocument, load it, write it to the console.
// The three ways below are alternatives; use one of them.
// One way:
XmlDocument xDoc = new XmlDocument();
xDoc.Load("C:\\myData.xml");
xDoc.Save(Console.Out);
// Second way (use an XmlTextReader to read the XML):
XmlDocument xDoc = new XmlDocument();
XmlTextReader reader = new XmlTextReader("C:\\myData.xml");
xDoc.Load(reader);
xDoc.Save(Console.Out);
// Third way (use an XmlTextWriter to output the XML document):
XmlTextWriter writer = new XmlTextWriter(Console.Out);
writer.Formatting = Formatting.Indented;
xDoc.WriteContentTo(writer);
writer.Flush();
Console.WriteLine();
writer.Close();
35
System.Xml.Xsl
Namespace
• Provides support for XSL Transformations
• Some of the classes:
– XslTransform: Transforms using a stylesheet
– (XsltException: Used to handle transformation exceptions)
• Four simple steps to perform a transformation
1. Instantiate a XslTransform object
2. Load a stylesheet
3. Load the data (xml file)
4. Transform!
36
Coding bits …
using System.Xml.Xsl;
// 1. Create a XslTransform object
XslTransform xslt = new XslTransform();
// 2. Load an XSL stylesheet
xslt.Load("http://somewhere/favorite.xsl");
// 3 & 4. Load the XML data file & transform!
xslt.Transform("http://somewhere/mydata.xml",
               "C:\\somewhere_else\\TransformedXmlOutput.xml");
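The same four steps, sketched as a self-contained program. Two things here are assumptions rather than part of the slides: it uses XslCompiledTransform, which replaced XslTransform from .NET 2.0 onward, and it writes small hypothetical input files into the temp directory instead of using the URLs above.

```csharp
using System;
using System.IO;
using System.Xml.Xsl;

public static class FaxTransformDemo
{
    public static string Run()
    {
        // Hypothetical scratch files in the temp directory.
        string dir = Path.Combine(Path.GetTempPath(), "xslt-demo");
        Directory.CreateDirectory(dir);
        string xslPath = Path.Combine(dir, "contactsFaxes.xsl");
        string xmlPath = Path.Combine(dir, "contacts.xml");
        string outPath = Path.Combine(dir, "faxes.html");

        File.WriteAllText(xslPath,
            "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>" +
            "<xsl:template match='/'><html><body>" +
            "<xsl:for-each select='//fax'><p><xsl:value-of select='.'/></p></xsl:for-each>" +
            "</body></html></xsl:template></xsl:stylesheet>");
        File.WriteAllText(xmlPath,
            "<contacts>" +
            "<contact><phone><fax>212-555-1342</fax></phone></contact>" +
            "<contact><phone><fax>212-555-1828</fax></phone></contact>" +
            "</contacts>");

        // 1. Create the transform object.
        XslCompiledTransform xslt = new XslCompiledTransform();
        // 2. Load the stylesheet.
        xslt.Load(xslPath);
        // 3 & 4. Load the XML data file and transform it into the output file.
        xslt.Transform(xmlPath, outPath);

        return File.ReadAllText(outPath);
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```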
37
Example…(XSLT)
Upon clicking the button,
xsl and xml documents are
opened and a XSL
Transform is performed.
The progress of the process
is displayed in the white
area, and upon completion
of the process the browser
is opened and the resulting
HTML document is
displayed.
38
Example (XSLT) …/
In the XSLT, the fax
numbers of the
contacts.xml file are
extracted.
39
The code …
// Use the appropriate namespace (System.Xml.Xsl).
using System;
using System.Drawing;
using System.Collections;
using System.ComponentModel;
using System.Windows.Forms;
using System.Data;
using System.Xml.Xsl;
// Form1 (below) declares the button and the ListBox (white area).
namespace WindowsApplication1XSLT
{
/// <summary>
/// Summary description for Form1.
/// </summary>
public class Form1 : System.Windows.Forms.Form
{
private System.Windows.Forms.Button button1;
private System.Windows.Forms.ListBox listBox1;
/// <summary>
/// Required designer variable.
/// </summary>
private System.ComponentModel.Container components = null;
public Form1()
{
//
// Required for Windows Form Designer support
//
InitializeComponent();
//
// TODO: Add any constructor code after InitializeComponent call
//
}
40
/// <summary>
/// Clean up any resources being used.
/// </summary>
protected override void Dispose( bool disposing )
{
if( disposing )
{
if (components != null)
{
components.Dispose();
}
}
base.Dispose( disposing );
}
…
// Hook up the button to the event mechanism of C# (see the button1.Click line in InitializeComponent).
#region Windows Form Designer generated code
/// <summary>
/// Required method for Designer support - do not modify
/// the contents of this method with the code editor.
/// </summary>
private void InitializeComponent()
{
// Create the components.
this.button1 = new System.Windows.Forms.Button();
this.listBox1 = new System.Windows.Forms.ListBox();
this.SuspendLayout();
//
// button1
//
this.button1.Location = new System.Drawing.Point(80, 56);
this.button1.Name = "button1";
this.button1.Size = new System.Drawing.Size(136, 24);
this.button1.TabIndex = 0;
this.button1.Text = "perform XSLT";
41
this.button1.Click += new System.EventHandler(this.button1_Click);
//
// listBox1
//
this.listBox1.Location = new System.Drawing.Point(48, 120);
this.listBox1.Name = "listBox1";
this.listBox1.Size = new System.Drawing.Size(184, 95);
this.listBox1.TabIndex = 1;
//
// Form1
//
this.AutoScaleBaseSize = new System.Drawing.Size(5, 13);
this.ClientSize = new System.Drawing.Size(292, 273);
// Add the GUI components to the Form
this.Controls.Add(this.listBox1);
this.Controls.Add(this.button1);
this.Name = "Form1";
this.Text = "Form1";
this.ResumeLayout(false);
…
}
#endregion
// Start the program
/// <summary>
/// The main entry point for the application.
/// </summary>
[STAThread]
static void Main()
{
Application.Run(new Form1());
}
42
private void button1_Click(object sender, System.EventArgs e)
{
// 1. Create a XslTransform object
XslTransform xslt = new XslTransform();
// 2. Load an XSL stylesheet (the .xsl file)
listBox1.Items.Add("Loading .xsl file ..");
xslt.Load("E:\\4413 -- e-commerce course\\xmlTest\\contactsFaxes.xsl");
listBox1.Items.Add(" .... xsl file LOADED!");
// 3 & 4. Load the XML data file & transform!
// Perform the transformation on the given xml file, using the given xsl file.
listBox1.Items.Add(" Performing xsl transformation ...");
xslt.Transform("E:\\4413 -- e-commerce course\\xmlTest\\contacts.xml",
"E:\\4413 -- e-commerce course\\xmlTest\\TransformedContactsXmlOutput.html", null );
listBox1.Items.Add(" ... xsl transformation completed!");
listBox1.Items.Add("\n");
listBox1.Items.Add("... wait for display!");
// Invoke Internet Explorer and open the resulting html file.
System.Diagnostics.Process.Start("IExplore", "E:\\4413 -- e-commerce course\\xmlTest\\TransformedContactsXmlOutput.html");
}
}
}
43
XML in .NET
Core Classes in System.XML
Abstract (Base) Class            Concrete (Derived) Class
XmlNode                          XmlDocument (DOM in .NET), XmlAttribute, XmlDataDocument (derived from XmlDocument)
XmlLinkedNode                    XmlElement
XmlReader (.NET's pull-model     XmlTextReader, XmlNodeReader, XmlValidatingReader
counterpart of SAX)
XmlWriter                        XmlTextWriter
Good news: the Visual Studio .NET documentation contains code examples
that use the features of these classes.
44
XML in .NET
XmlNode
• Represents a single node in a XML document
hierarchy.
• An abstract class.
• Properties and methods to
– traverse XML document hierarchy.
– query properties. Support for XPath expressions
through IXPathNavigable interface
– view, modify, copy, delete nodes
– Can select and navigate a subset of a document
45
XmlNode class …
public abstract class XmlNode : ICloneable, IEnumerable, IXPathNavigable {
  // Properties of node (name, value, attributes, etc).
  public abstract string Name { get; }
  public abstract string LocalName { get; }
  public abstract XmlNodeType NodeType { get; }
  public virtual string Value { get; set; }
  public virtual XmlAttributeCollection Attributes { get; }
  public virtual XmlDocument OwnerDocument { get; }
  public virtual bool IsReadOnly { get; }
  public virtual bool HasChildNodes { get; }
  public virtual string Prefix { get; set; }
  // Accessing adjacent nodes (children, siblings, parent, etc)
  public virtual XmlNodeList ChildNodes { get; }
  public virtual XmlNode FirstChild { get; }
  public virtual XmlNode LastChild { get; }
  public virtual XmlNode NextSibling { get; }
  public virtual XmlNode PreviousSibling { get; }
  public virtual XmlNode ParentNode { get; }
  public virtual XmlElement this[string name] { get; }
  public virtual XmlElement this[string localname, string ns] { get; }
  …
46
XmlNode class …/
  ...
  // Adding and removing nodes
  public virtual XmlNode AppendChild(XmlNode newChild);
  public virtual XmlNode PrependChild(XmlNode newChild);
  public virtual XmlNode InsertAfter(XmlNode newChild, XmlNode refChild);
  public virtual XmlNode InsertBefore(XmlNode newChild, XmlNode refChild);
  public virtual XmlNode RemoveChild(XmlNode oldChild);
  public virtual void RemoveAll();
  // Selection of nodes
  public XPathNavigator CreateNavigator();
  public XmlNodeList SelectNodes(string xpath);
  public XmlNode SelectSingleNode(string xpath);
  // Writing
  public abstract void WriteContentTo(XmlWriter w);
  public abstract void WriteTo(XmlWriter w);
  ...
}
47
XML in .NET
XmlLinkedNode
• Implements XmlNode
• Retrieves the node immediately preceding or
following the current node
• An abstract class from which XmlElement is
derived
– Declaration:
public abstract class XmlLinkedNode : XmlNode
48
XML in .NET
XmlElement
• Represents an element in the DOM tree
• Properties and methods to view, modify,
and create element objects
– Declaration:
public class XmlElement : XmlLinkedNode
– Code example:
XmlDocument myXmlDoc = new XmlDocument();
myXmlDoc.Load("c:\\Sample.xml");
// DocumentElement retrieves the root element
XmlElement root = myXmlDoc.DocumentElement;
49
XML in .NET
XmlAttribute
• Implements XmlNode
• Represents an attribute of an XmlElement
• Valid and/or default values defined by
schema
– Declaration:
public class XmlAttribute : XmlNode
50
XML in .NET
XmlAttribute
• Coding bits:
XmlDocument myXmlDoc = new XmlDocument();
myXmlDoc.Load("c:\\Sample.xml");
// Get the attribute collection of the root element
XmlAttributeCollection attrColl = myXmlDoc.DocumentElement.Attributes;
// Create a new attribute and set its value to "1".
XmlAttribute newAttr = myXmlDoc.CreateAttribute("value");
newAttr.Value = "1";
// Append the new attribute to the collection
attrColl.Append(newAttr);
51
XPath
• XPath is a language for identifying elements in an XML document
• An XPath expression (location path) selects a set of nodes
• A location path consists of location steps, which are separated by "/":
/step/step/step
Examples of location paths:
– "*" selects all element children of the current node
– "/contacts/*" selects all child elements of the contacts element
– "/contacts/contact[1]" returns the first contact element under the contacts
element
– "/contacts/*/firstname" returns the firstname elements that are
grandchildren of the contacts element
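The location paths above can be tried out with XmlNode.SelectNodes. The sketch below counts the matches for each path; it assumes a reduced contacts document that uses the firstname/lastname elements from the earlier XmlDocument slides.

```csharp
using System;
using System.Xml;

public static class LocationPathDemo
{
    // Reduced contacts document (assumed for this sketch).
    const string Xml =
        "<contacts>" +
        "<contact><firstname>Frank</firstname><lastname>Rizzo</lastname></contact>" +
        "<contact><firstname>Sol</firstname><lastname>Rosenberg</lastname></contact>" +
        "</contacts>";

    // Number of nodes selected by the path, evaluated from the document node.
    public static int Count(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        string[] paths = { "*", "/contacts/*", "/contacts/contact[1]", "/contacts/*/firstname" };
        foreach (string p in paths)
        {
            Console.WriteLine("{0} selects {1} node(s)", p, Count(p));
        }
    }
}
```

Here the relative path "*" is evaluated with the document node as the current node, so it selects only the contacts element.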
52
XPath
• Extends path expressions with query facility
• XPath views an XML document as a tree
– Root of the tree is a node which does not
correspond to anything in the document
– Internal nodes are XML elements
– Leaves are either
• Attributes
• Text nodes
• Comments
• Other things that we didn’t discuss (processing instructions, …)
53
XPath Document Tree
[Figure: the XPath tree for a document. The root of the XPath tree is distinct from the root element of the XML document.]
54
Document Corresponding to the Tree
• A fragment of the report document that we used frequently
<?xml version="1.0" ?>
<!-- Some comment -->
<Students>
  <Student StudId="111111111">
    <Name><First>John</First><Last>Doe</Last></Name>
    <Status>U2</Status>
    <CrsTaken CrsCode="CS308" Semester="F1997" />
    <CrsTaken CrsCode="MAT123" Semester="F1997" />
  </Student>
  <Student StudId="987654321">
    <Name><First>Bart</First><Last>Simpson</Last></Name>
    <Status>U4</Status>
    <CrsTaken CrsCode="CS308" Semester="F1994" />
  </Student>
</Students>
<!-- Some other comment -->
55
XPath Basics
• An XPath expression takes a document tree as
input and returns a set of nodes of the tree
• Expressions that start with / are absolute path
expressions
– Expression / – returns root node of XPath tree
– /Students/Student – returns all Student elements that
are children of Students elements, which in turn must
be children of the root
– /Student – returns empty set (no such children at
root)
56
XPath Basics …/
– Current (or context) node: exists during the evaluation
of XPath expressions
– . denotes the current node; .. denotes the parent
– foo/bar returns all bar elements that are children of foo nodes,
which in turn are children of the current node
– ./foo/bar: same
– ../abc/cde returns all cde elements that are children of abc elements,
which in turn are children of the parent of the current node
– Expressions that don't start with / are relative (to the
current node)
57
Attributes, Text, etc.
– /Students/Student/@StudId – returns the StudId attribute of Student
elements (which are children of Students, which is a child of the root).
The @ denotes an attribute.
– /Students/Student/Name/Last/text() –
returns the text children of Last children of …
– /comment() – returns comment nodes under root
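A sketch of these three node kinds in C#, assuming the Students document from the earlier slide (without its XML declaration) is loaded into an XmlDocument. The helper name Values is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class NodeKindsDemo
{
    // The Students fragment from the slides, including both comments.
    const string Xml =
        "<!-- Some comment -->" +
        "<Students>" +
        "<Student StudId='111111111'>" +
        "<Name><First>John</First><Last>Doe</Last></Name><Status>U2</Status>" +
        "<CrsTaken CrsCode='CS308' Semester='F1997'/>" +
        "<CrsTaken CrsCode='MAT123' Semester='F1997'/>" +
        "</Student>" +
        "<Student StudId='987654321'>" +
        "<Name><First>Bart</First><Last>Simpson</Last></Name><Status>U4</Status>" +
        "<CrsTaken CrsCode='CS308' Semester='F1994'/>" +
        "</Student>" +
        "</Students>" +
        "<!-- Some other comment -->";

    // Values of all nodes selected by the path (attribute, text, or comment nodes).
    public static List<string> Values(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        List<string> result = new List<string>();
        foreach (XmlNode n in doc.SelectNodes(xpath))
        {
            result.Add(n.Value);
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Values("/Students/Student/@StudId").ToArray()));
        Console.WriteLine(string.Join(", ", Values("/Students/Student/Name/Last/text()").ToArray()));
        Console.WriteLine(string.Join(", ", Values("/comment()").ToArray()));
    }
}
```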
58
Overall Idea and Semantics
• An XPath expression is: locationStep1/locationStep2/…
• Location step: axis::nodeSelector[predicate]
– This is called full syntax. We used abbreviated syntax before.
Full syntax is better for describing meaning. Abbreviated syntax is
better for programming.
• Navigation axis:
– child, parent (seen already)
– ancestor, descendant, ancestor-or-self, descendant-or-self
– some others
• Node selector: node name or wildcard; e.g.,
– ./child::Student (we used ./Student, which is an abbreviation)
– ./child::* – any element child (abbreviation: ./*)
• Predicate: a selection condition; e.g.,
Students/Student[CrsTaken/@CrsCode = "CS532"]
59
XPath Semantics
• locationStep1/locationStep2/… means:
– Find all nodes specified by locationStep1
– For each such node N:
• Find all nodes specified by locationStep2 using N as the
current node
• Take union
– For each node returned by locationStep2 do the same
• locationStep = axis::node[predicate]
– Find all nodes specified by axis::node
– Select only those that satisfy predicate
60
More on Navigation Primitives
• 1st Student child of Students:
/Students/Student[1]
• Last Student element within Students:
/Students/Student[last()]
61
XPath Queries – Examples
• Students who have taken CS532:
– //Student[CrsTaken/@CrsCode="CS532"]
• Students who have either a first name or have taken a
course in some semester or have status U4:
– //Student[Name/First or CrsTaken/@Semester or Status/text() = "U4"]
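Predicates, positions, and the full versus abbreviated syntax can be checked against the Students document with a sketch like this one. The course codes queried are the ones that actually occur in that document (CS308, MAT123); the helper names are hypothetical.

```csharp
using System;
using System.Xml;

public static class PredicateDemo
{
    // The Students fragment from the slides.
    const string Xml =
        "<Students>" +
        "<Student StudId='111111111'>" +
        "<Name><First>John</First><Last>Doe</Last></Name><Status>U2</Status>" +
        "<CrsTaken CrsCode='CS308' Semester='F1997'/>" +
        "<CrsTaken CrsCode='MAT123' Semester='F1997'/>" +
        "</Student>" +
        "<Student StudId='987654321'>" +
        "<Name><First>Bart</First><Last>Simpson</Last></Name><Status>U4</Status>" +
        "<CrsTaken CrsCode='CS308' Semester='F1994'/>" +
        "</Student>" +
        "</Students>";

    static XmlDocument Load()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc;
    }

    // Number of nodes the path selects.
    public static int Count(string xpath)
    {
        return Load().SelectNodes(xpath).Count;
    }

    // Text content of the first selected node, or null if nothing matches.
    public static string First(string xpath)
    {
        XmlNode n = Load().SelectSingleNode(xpath);
        return n == null ? null : n.InnerText;
    }

    public static void Main()
    {
        // Full syntax and its abbreviation select the same nodes.
        Console.WriteLine(Count("/child::Students/child::Student") == Count("/Students/Student"));
        // Predicate on an attribute of a child element.
        Console.WriteLine(Count("//Student[CrsTaken/@CrsCode='CS308']"));
        Console.WriteLine(First("//Student[CrsTaken/@CrsCode='MAT123']/Name/Last"));
        // Positional predicates.
        Console.WriteLine(First("/Students/Student[1]/Name/First"));
        Console.WriteLine(First("/Students/Student[last()]/Name/Last"));
    }
}
```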
62
Class XPathNavigator allows navigation in document.
public abstract class XPathNavigator : ICloneable {
  // Properties of current node
  public abstract string Name { get; }
  public abstract string Value { get; }
  public abstract bool HasAttributes { get; }
  public abstract bool HasChildren { get; }
  // Selection of nodes by XPath expression
  public virtual XPathNodeIterator Select(string xpath);
  public virtual XPathNodeIterator Select(XPathExpression expr);
  public virtual XPathExpression Compile(string xpath);
  // Moving to adjacent nodes
  public abstract bool MoveToNext();
  public abstract bool MoveToFirstChild();
  public abstract bool MoveToParent();
  // etc
}
// IXPathNavigable (implemented by XmlNode) returns XPathNavigator
public interface IXPathNavigable {
  XPathNavigator CreateNavigator();
}
63
XPathNavigator
• Code example:
// Load XmlDocument and create XPathNavigator
XmlDocument myXmlDoc = new XmlDocument();
myXmlDoc.Load("c:\\Sample.xml");
XPathNavigator nav = myXmlDoc.CreateNavigator();
nav.MoveToRoot();   // move to the root node of the tree
nav.MoveToNext();   // move to the next sibling of the current node (returns false if there is none)
64
XPathNavigator
// Select name elements, iterate over selected elements and put out name values.
XPathNodeIterator iterator = nav.Select("/contacts/*/name");
while (iterator.MoveNext())
  Console.WriteLine(iterator.Current.Value);
// For better run-time efficiency compile expression and use compiled expression.
XPathExpression expr = nav.Compile("/contacts/contact[name='Frank Rizzo']/email");
iterator = nav.Select(expr);
while (iterator.MoveNext()) Console.WriteLine(iterator.Current.Value);
65
XPathNavigator
• Example: iterate over a subset of nodes
XmlDocument doc = new XmlDocument();
doc.Load("person.xml");
XPathNavigator nav = doc.CreateNavigator();
XPathNodeIterator iter = nav.Select("/person/name");
while (iter.MoveNext()) {
// process selection here… with iter.Current.Value
}
• Example: sum all Prices in a document
public static void SumPriceNodes(XPathNavigator nav) {
// in this case, evaluate returns a number
Console.WriteLine("sum=" + nav.Evaluate("sum(//Price)"));
}
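Both examples can be combined into one runnable sketch. The order document and its Price values are made up for illustration; the program iterates over the selected nodes with a compiled expression and then evaluates sum(//Price).

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class NavigatorDemo
{
    // Made-up document with two Price elements.
    const string Xml =
        "<order>" +
        "<item><name>pen</name><Price>1.5</Price></item>" +
        "<item><name>ink</name><Price>2.5</Price></item>" +
        "</order>";

    static XPathNavigator CreateNavigator()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc.CreateNavigator();
    }

    // Iterate over a subset of nodes using a compiled expression.
    public static List<string> ItemNames()
    {
        XPathNavigator nav = CreateNavigator();
        XPathExpression expr = nav.Compile("/order/item/name");
        XPathNodeIterator iter = nav.Select(expr);
        List<string> names = new List<string>();
        while (iter.MoveNext())
        {
            names.Add(iter.Current.Value);
        }
        return names;
    }

    // For a numeric XPath expression, Evaluate returns a boxed double.
    public static double SumPrices()
    {
        return (double)CreateNavigator().Evaluate("sum(//Price)");
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ItemNames().ToArray()));
        Console.WriteLine("sum=" + SumPrices());
    }
}
```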
66
C# and .NET
How to access the C# API: select "Start | Programs |
Microsoft Visual Studio .NET 2003 | Microsoft
Visual Studio .NET 2003 Documentation"
The XML
APIs
67
The C# (and .NET in general) API
Choose …
68
…/
System is a major
namespace for C#
69