Recent additions from gonuts issues:
- append.go illustrates turning a sub-element into a list (or appending to a list)
- order.go illustrates working with a list of mixed tags and preserving their sequence
Examples of using ValuesFromTagPath().
A number of interesting examples have shown up in the gonuts discussion group
that could be handled - after a fashion - using the ValuesFromTagPath() function.
gonuts1.go -
Here we see that the message stream has a problem with multiple tag spellings,
though the message structure remains constant. In this example we 'anonymize'
the tag for the variant spellings.
values := m.ValuesForPath("data.*)
where '*' is any possible spelling - "netid" or "idnet"
and the result is a list with 1 member of map[string]interface{} type.
Once we've retrieved the Map, we can parse it using the known keys - "disable",
"text1" and "word1".
gonuts1a.go - (03-mar-14)
Here we just permute the tag labels using m.NewMap() to make all the messages
consistent. Then they can be decoded into a single structure definition.
gonuts2.go -
This is an interesting case where there was a need to handle messages with lists
of "ClaimStatusCodeRecord" entries as well as messages with NONE. (Here we see
some of the vagaries of dealing with mixed messages that are verging on becoming
anonymous.)
msg1 - the message with two ClaimStatusCodeRecord entries
msg2 - the message with one ClaimStatusCodeRecord entry
msg3 - the message with NO ClaimStatusCodeRecord entries
ValuesForPath options:
path == "Envelope.Body.GetClaimStatusCodesResponse.GetClaimStatusCodesResult.ClaimStatusCodeRecord"
for msg == msg1:
returns: a list - []interface{} - with two values of map[string]interface{} type
for msg == msg2:
returns: a list - []interface{} - with one map[string]interface{} type
for msg == msg3:
returns 'nil' - no values
path == "*.*.*.*.*"
for msg == msg1:
returns: a list - []interface{} - with two values of map[string]interface{} type
path == "*.*.*.*.*.Description
for msg == msg1:
returns: a list - []interface{} - with two values of string type, the individual
values from parsing the two map[string]interface{} values where key=="Description"
path == "*.*.*.*.*.*"
for msg == msg1:
returns: a list - []interface{} - with six values of string type, the individual
values from parsing all keys in the two map[string]interface{} values
Think of the wildcard character "*" as anonymizing the tag in the position of the path where
it occurs. The books.go example has a range of use cases.
gonuts3.go -
Uses the ValuesForKey method to extract a list of image "src" file names that are encoded as
attribute values.
gonuts4.go -
Here we use the ValuesForPath to extract attribute values for country names. The attribute
is included in the 'path' argument by prepending it with a hyphen: ""doc.some_tag.geoInfo.country.-name".
gonuts5.go (10-mar-14) -
Extract a node of values using ValuesForPath based on name="list3-1-1-1". Then get the values
for the 'int' entries based on attribute 'name' values - mv.ValuesForKey("int", "-name:"+n).
gonuts5a.go (10-mar-14) -
Extract a node of values using ValuesForPath based on name="list3-1-1-1". Then get the values
for the 'int' entries based on attribute 'name' values - mv.ValuesForKey("*", "-name:"+n).
(Same as gonuts5.go but with wildcarded key value, since we're matching elements on subkey.)
EAT YOUR OWN DOG FOOD ...
I needed to convert a large (14.9 MB) XML data set from an Eclipse metrics report on an
application that had 355,100 lines of code in 211 packages into CSV data sets. The report
included application-, package-, class- and method-level metrics reported in an element,
"Value", with varying attributes.
<Value value=""/>
<Value name="" package="" value=""/>
<Value name="" source="" package="" value=""/>
<Value name="" source="" package="" value="" inrange=""/>
In addition, the metrics were reported with two different "Metric" compound elements:
<Metrics>
<Metric id="" description="">
<Values>
<Value.../>
...
</Values>
</Metric>
...
<Metric id="" description="">
<Value.../>
</Metric>
...
</Metrics>
Using the mxj package seemed a more straightforward approach than using Go vernacular
and the standard xml package. I wrote the program getmetrics.go to do this. Here are
three version to illustrate using
getmetrics1.go - pass os.File handle for metrics_data.xml to NewMapXmlReader.
getmetrics2.go - load metrics_data.xml into an in-memory buffer, then pass it to NewMapXml.
getmetrics3.go - demonstrates overhead of extracting the raw XML while decoding with NewMapXmlReaderRaw.
To run example getmetrics1.go, extract a 120,000+ row data set from metrics_data.zip. Then:
go run getmetrics1.go -file=metrics_data.xml