How I learned to love XML! (FME Certified Professional, Peter

How I learned to ‘love’ XML
Peter Laulund
National Survey and Cadastre
Agenda
• KMS and INSPIRE
• About XML/GML
• Writing XML with FME
– Templates
– Schema mapping, semantic
– Schema mapping, geometry
– Workflow – design
INSPIRE
• INSPIRE is a European initiative to
create a common spatial data infrastructure (SDI)
• Specific datasets have to be made available
in a harmonized way
• KMS is the National contact point
• KMS has five Annex 1 datasets
• I will talk about XML not INSPIRE
KMS datasets
• Transport Networks
• Hydrography
• Cadastral Parcels
• Geographical Names
• Administrative Units
• Documentation - http://inspire.jrc.ec.europa.eu
Warning: getting through the PDF and the related
XSD(s) is a tough read!
Inspire-foss
FME and XML
XML
• XML: eXtensible Markup Language
– Syntax used to describe data
• GML: Geography Markup Language
– XML dialect for describing geography
– GML 2, GML 3.1.1, GMLSF, GML 3.2.1
• XSD: XML Schema Definition
– XML dialect for describing the content
of XML files
XML - example
<?xml version="1.0" encoding="UTF-8"?>
<!-- created by Pel, KMS, 3 August 2012 -->
<venner xmlns:p="http://www.kms.dk/xmlschmas"
        xmlns:d="http://www.kms.dk/xmlschmas">
  <p:person id="345">
    <p:navn>Peter Laulund</p:navn>
    <p:adresse>Sognegårds alle 54</p:adresse>
    <p:født>
      <d:dato>
        <d:dag>3</d:dag>
        <d:måned>maj</d:måned>
        <d:år>1957</d:år>
        <d:klokken/>
      </d:dato>
    </p:født>
    <p:telefon type="fastnet">+45 36499408</p:telefon>
    <p:telefon type="mobil">+45 26273031</p:telefon>
    <p:giftMed href="#445"/>
    <p:arbjedsgiver/>
  </p:person>
</venner>
FME and XML
• FME reads and writes XML/GML
• Converts geometry to GML
• XMLSampleGenerator
• XMLTemplater
• XMLValidator
• XMLFormatter
XMLTemplater
• An XML template is an XML document
with XQuery functions
<gn:text>{fme:get-attribute("name")}</gn:text>
<au:geometry>
{fme:get-xml-attribute("gml_geom")}
</au:geometry>
{fme:get-xml-list-attribute("level{}.xml")}
<gml:featureMembers>
{fme:process-features("FEATURE")}
</gml:featureMembers>
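To make the idea of a template concrete, here is a minimal Python sketch that imitates what happens to a `{fme:get-attribute("...")}` placeholder. It is not how FME works internally: the function `fill_template` and the regular expression are assumptions made for illustration, and the sketch ignores every other XQuery function shown above.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical stand-in for the XMLTemplater: it only recognises
# {fme:get-attribute("...")} and leaves all other text untouched.
PLACEHOLDER = re.compile(r'\{fme:get-attribute\("([^"]+)"\)\}')


def fill_template(template, feature):
    """Replace each get-attribute placeholder with the feature's value."""
    def lookup(match):
        # A missing attribute becomes an empty string (an assumption).
        value = feature.get(match.group(1), "")
        # Escape &, < and > so the result stays well-formed XML.
        return escape(str(value))
    return PLACEHOLDER.sub(lookup, template)


template = '<gn:text>{fme:get-attribute("name")}</gn:text>'
filled = fill_template(template, {"name": "Peter & Co"})
print(filled)
```

The point of the sketch is only that the template stays an ordinary XML document, and the feature's attribute values are dropped into it one placeholder at a time.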
XMLTemplater
• The document may be loaded
from
– an attribute
– a file
– entered into the transformer
• We use a file that is loaded
into an attribute
Templates
• Use the XMLSampleGenerator to
create the template
• Edit the template in a text editor
– Delete
– Add XQuery function calls
• Use XMLValidator to evaluate the
result
• Use XMLFormatter to make it look
pretty
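Outside FME, the last two steps can be approximated with the Python standard library. This is a rough sketch under assumptions: `check_and_format` is a hypothetical helper, and it only tests well-formedness, whereas a validator can also check a document against its XSD.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def check_and_format(xml_text):
    """Return an indented copy of xml_text, or raise if it is not well-formed."""
    try:
        document = minidom.parseString(xml_text)
    except ExpatError as error:
        raise ValueError(f"not well-formed: {error}") from error
    return document.toprettyxml(indent="  ")


sample = '<venner><person id="345"><navn>Peter Laulund</navn></person></venner>'
pretty = check_and_format(sample)
print(pretty)
```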
Writing INSPIRE GML
Challenges
• Five datasets, some with more than
one feature type
• Data for download via FTP is produced with
FME
• WFS with Snowflake
• WMS with ?
• All data in one Oracle database
Dataflow in KMS
[Diagram: data flows from Oracle to *.GML.
Stages shown: Read; Schema mapping, semantic; Schema mapping, geometry; Transform to XML; Write.
Tools labelled in the diagram: SQL, FME tools, Tcl, Aggregate, SetTraits, OGCGeometry, XMLTemplate, XMLValidate, XMLFormater.]
Schema mapping
Schema mapping is basic FME
functionality
• Add or remove attributes
• Change feature types
• Alter domain values
• All our data are in an Oracle database;
we will therefore use SQL for schema
mapping
Schema mapping
FeatureType : oldAttribute : newAttribute
REGION      : REGIONKODE   : nationalCode
REGION      : DAGI_ID      : localId
REGION      : FEAT_ID      : localIdGeom
REGION      : TIMEOF_CRE   : beginLifespanVersion
REGION      : FEAT_TYPE    : nationalLevelName
REGION      : REGIONNAVN   : name
REGION      : DQ_RESPONS   : sourceOfName

FeatureType : newAttribute  : value
REGION      : namespace     : dk.kms.au
REGION      : namespaceGeom : dk.kms.au.geom
REGION      : country       : DK
REGION      : gmlTemplate   : AdministrativeUnit
REGION      : nationalLevel : 2ndOrder
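The mapping tables above can be expressed as one SQL query: column aliases rename the old attributes, and literals supply the constant values. The sketch below runs such a query in Python against an in-memory SQLite database as a stand-in for Oracle. The table layout, the column types, and the sample row are invented for illustration; only the attribute names and constant values come from the tables.

```python
import sqlite3

# In-memory SQLite stands in for the Oracle database (an assumption).
connection = sqlite3.connect(":memory:")
connection.row_factory = sqlite3.Row

# Hypothetical table layout and sample row, for illustration only.
connection.execute(
    "CREATE TABLE REGION (REGIONKODE TEXT, DAGI_ID INTEGER, "
    "TIMEOF_CRE TEXT, REGIONNAVN TEXT)"
)
connection.execute(
    "INSERT INTO REGION VALUES ('0001', 1, '2012-03-13T18:09:26', 'Example region')"
)

# Aliases rename old attributes; literals add the constant attributes.
MAPPING_SQL = """
SELECT REGIONKODE           AS nationalCode,
       DAGI_ID              AS localId,
       TIMEOF_CRE           AS beginLifespanVersion,
       REGIONNAVN           AS name,
       'dk.kms.au'          AS namespace,
       'DK'                 AS country,
       'AdministrativeUnit' AS gmlTemplate,
       '2ndOrder'           AS nationalLevel
FROM REGION
"""

feature = dict(connection.execute(MAPPING_SQL).fetchone())
print(feature)
```

Doing the renaming in the query means the features arrive in FME already carrying the target attribute names, so the workspace itself can stay generic.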
Schema mapping
[Diagram: mapping of road-class domain values between Read and Write; the mapping itself is marked "?".]
Read, first classification: Europavej; Primærvej; Sekundærvej; Anden vigtig vej; Større lokalvej; Lokalvej; Indkørselsvej; Anden vej; Hovedsti; Cykelsti langs vej; Sti, diverse
Read, second classification: Trafikvej-Gennemfart; Trafikvej-Fordeling; Lokalvej-Primær; Lokalvej-Sekundær; Lokalvej-Tertiær; Ikke tildelt
Write: mainRoad; firstClass; secondClass; thirdClass; fourthClass; fifthClass; sixthClass; seventhClass; eighthClass; ninthClass
inspireId
All features must have an inspireId. It is a complex
type made of
• A namespace - <country>.<organisation>.<dataset>
• Id – the feature's database id
• Version – null, sequence or timestamp
<cp:inspireId>
  <base:Identifier>
    <base:localId> {fme:get-attribute("localId")} </base:localId>
    <base:namespace> {fme:get-attribute("namespace")} </base:namespace>
    <base:versionId> {fme:get-attribute("beginLifespanVersion")} </base:versionId>
  </base:Identifier>
</cp:inspireId>
<base:Identifier>
  <base:localId>595944</base:localId>
  <base:namespace>dk.kms.tn.roadnode</base:namespace>
  <base:versionId>2012-03-13T18:09:26</base:versionId>
</base:Identifier>
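For comparison, the same identifier structure can be built outside FME. The Python sketch below is illustrative only: `build_identifier` is a hypothetical helper, the namespace URI is a placeholder rather than the real INSPIRE base-types namespace, and leaving out `versionId` when the version is null is an assumption.

```python
import xml.etree.ElementTree as ET

# Placeholder URI: substitute the namespace of the base-types schema you target.
BASE_NS = "http://example.com/inspire/base"
ET.register_namespace("base", BASE_NS)


def build_identifier(local_id, namespace, version_id=None):
    """Build a base:Identifier element from the three inspireId parts."""
    identifier = ET.Element(f"{{{BASE_NS}}}Identifier")
    ET.SubElement(identifier, f"{{{BASE_NS}}}localId").text = str(local_id)
    ET.SubElement(identifier, f"{{{BASE_NS}}}namespace").text = namespace
    if version_id is not None:
        # Assumption: a null version is expressed by omitting the element.
        ET.SubElement(identifier, f"{{{BASE_NS}}}versionId").text = version_id
    return identifier


identifier = build_identifier(595944, "dk.kms.tn.roadnode", "2012-03-13T18:09:26")
print(ET.tostring(identifier, encoding="unicode"))
```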
Example - geometry
Coordinate System: ` EPSG:25832'
Geometry Type: IFMEPoint
Number of Geometry Traits: 1
GeometryTrait(string): `gml_id' has value
`dk.kms.tn.roadnode.594897.20120803145439'
Coordinate Dimension: 3
(725261.96,6187842.58,2.5)
@GMLGeometry(TO_ATTRIBUTE, GML_3.2.1, gml_geom)
<net:geometry>
{fme:get-xml-attribute("gml_geom")}
</net:geometry>
<net:geometry>
  <gml:Point
      gml:id="dk.kms.tn.roadnode.594897.20120803145439"
      srsName="EPSG:25832"
      srsDimension="3">
    <gml:pos>725261.96 6187842.58 2.5</gml:pos>
  </gml:Point>
</net:geometry>
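To show what the geometry conversion produces, here is a small Python sketch that assembles the same `gml:Point` by hand. It is not FME's converter: `build_point` is a hypothetical helper, and deriving `srsDimension` from the number of coordinates is an assumption. The GML 3.2 namespace URI is the standard one.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"
ET.register_namespace("gml", GML_NS)


def build_point(gml_id, srs_name, coordinates):
    """Build a gml:Point whose gml:pos holds space-separated coordinates."""
    point = ET.Element(
        f"{{{GML_NS}}}Point",
        {
            f"{{{GML_NS}}}id": gml_id,
            "srsName": srs_name,
            # Assumption: the dimension equals the number of coordinates.
            "srsDimension": str(len(coordinates)),
        },
    )
    pos = ET.SubElement(point, f"{{{GML_NS}}}pos")
    pos.text = " ".join(str(value) for value in coordinates)
    return point


point = build_point(
    "dk.kms.tn.roadnode.594897.20120803145439",
    "EPSG:25832",
    (725261.96, 6187842.58, 2.5),
)
print(ET.tostring(point, encoding="unicode"))
```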
gml:id
• gml:id is mandatory
• Unique within the document
• Must start with a letter
• In FME the default is a UUID
• We build it the same way as inspireId
– <namespace>.<id>.<timestamp>
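The rules above can be sketched in a few lines of Python. The helpers `make_gml_id` and `find_duplicates` are hypothetical, and the timestamp layout `YYYYMMDDHHMMSS` is inferred from the road-node example rather than stated as a rule.

```python
from collections import Counter
from datetime import datetime


def make_gml_id(namespace, feature_id, timestamp):
    """Build <namespace>.<id>.<timestamp> and check the first character."""
    gml_id = f"{namespace}.{feature_id}.{timestamp:%Y%m%d%H%M%S}"
    if not gml_id[0].isalpha():
        raise ValueError(f"gml:id must start with a letter: {gml_id}")
    return gml_id


def find_duplicates(gml_ids):
    """Return the ids that occur more than once in a document."""
    counts = Counter(gml_ids)
    return sorted(gml_id for gml_id, count in counts.items() if count > 1)


road_node_id = make_gml_id("dk.kms.tn.roadnode", 594897, datetime(2012, 8, 3, 14, 54, 39))
print(road_node_id)
```

Because the namespace starts with a letter, ids built this way satisfy the first-character rule automatically, which a bare database id would not.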
gml:id
NAVNE (ID, FraDato, Navn) X MONTAGE (ID, FraDato, Geometri)
<gn:geometry>
  <gml:MultiCurve
      gml:id="dk.kms.gn.114294-0"
      srsName="EPSG:25832" srsDimension="2">
    <gml:curveMember>
      <gml:LineString gml:id="dk.kms.gn.geom.24795424.20080613T085805">
        <gml:posList>723071.8 6194337.45 ....</gml:posList>
      </gml:LineString>
    </gml:curveMember>
    <gml:curveMember>
      <gml:LineString gml:id="dk.kms.gn.geom.24793173.20080613T085805">
        <gml:posList>723273.34 6194566.81 .....</gml:posList>
      </gml:LineString>
    </gml:curveMember>
    .......
  </gml:MultiCurve>
</gn:geometry>
FME script design
Design
When we are designing an FME script we
should reflect on
• Schema mapping – FME or database
• Generic or specific
• Design of dataflow – design patterns
• Pre- and post processing
• Existing system architecture
Dataflow design
[Diagrams: dataflow designs #1, #2 and #3]
Conclusion design
• Testing shows that design #3 is
best
– Design #1 will not work with big
datasets because of the list
– Design #3 uses 10 to 15 percent less
memory than #2
– The timing is identical to #2
– We can validate individual features
Conclusion
• After the first tests it has been easy to work
with templates
• Schema mapping in Oracle with SQL
• A generic solution based on Design #3
• Templates as individual files read into an
attribute
• The only problem is that FME can't handle big
datasets (60,000+ features) yet
Questions?
Peter Laulund
Rentemestervej 8
DK-2400 Copenhagen NV
Denmark
Phone: +45 72 54 51 73
E-mail: [email protected]