Below is an example of an Adobe XML swidtag used to track inventory. I need to extract the relevant fields from it using xmllint in bash and write them to a new text file.

For example, I would like to extract the following elements:

swid:entitlement_required_indicator
swid:product_title
swid:product_version
swid:name
swid:numeric
swid:major
swid:minor
swid:build
swid:review

I tried the following, but xmllint does not recognize the namespace prefix:

xmllint --xpath '//swid:product_version/swid:name/text()' file.xml

I've also tried

xmllint --xpath "//*[local-name1()='product_version']/*[local-name2()='name']/text()" file.xml

But got these errors

xmlXPathCompOpEval: function local-nameame1 not found
XPath error : Unregistered function
XPath error : Stack usage errror
XPath evaluation failure

Sample tag file for Creative Suite 5. The following sample is for Adobe Photoshop CS5 serialized as Creative Suite 5 Master Collection (Suite):

<?xml version="1.0" encoding="utf-8"?>
<swid:software_identification_tag xsi:schemaLocation="http://standards.iso.org/iso/19770/-2/2008/schema.xsd software_identification_tag.xsd" 
     xmlns:swid="http://standards.iso.org/iso/19770/-2/2008/schema.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<!--Mandatory Identity elements -->
<swid:entitlement_required_indicator>true</swid:entitlement_required_indicator>
<swid:product_title>Acrobat XI Pro</swid:product_title>
<swid:product_version>
    <swid:name>1.0</swid:name>
    <swid:numeric>
        <swid:major>1</swid:major>
        <swid:minor>0</swid:minor>
        <swid:build>0</swid:build>
        <swid:review>0</swid:review>
    </swid:numeric>
</swid:product_version>
<swid:software_creator>
    <swid:name>Adobe Systems Incorporated</swid:name>
    <swid:regid>regid.1986-12.com.adobe</swid:regid>
</swid:software_creator>
<swid:software_licensor>
    <swid:name>Adobe Systems Incorporated</swid:name>
    <swid:regid>regid.1986-12.com.adobe</swid:regid>
</swid:software_licensor>
<swid:software_id>
    <swid:unique_id>CreativeCloud-CS6-Mac-GM-MUL</swid:unique_id>
    <swid:tag_creator_regid>regid.1986-12.com.adobe</swid:tag_creator_regid>
</swid:software_id>

<swid:tag_creator>
    <swid:name>Adobe Systems Incorporated</swid:name>
    <swid:regid>regid.1986-12.com.adobe</swid:regid>
</swid:tag_creator>
<!--Optional Identity elements -->
<swid:license_linkage>
    <swid:activation_status>activated</swid:activation_status>
    <swid:channel_type>SUBSCRIPTION</swid:channel_type>
    <swid:customer_type>RETAIL</swid:customer_type>
</swid:license_linkage>
<swid:serial_number>909702426602037824854600</swid:serial_number>
</swid:software_identification_tag>
macman

4 Answers

This discussion is enlightening.

At the very least, even if not ideal, you should be able to do:

xmllint --xpath "//*[local-name()='product_version']/*[local-name()='name']/text()" file.xml

Or use xmlstarlet instead:

xmlstarlet sel -t -v //swid:product_version/swid:name file.xml
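
Note that `local-name()` is a built-in XPath 1.0 function, not a name you choose, which is why `local-name1()` is reported as an unregistered function. As a rough sketch of how the same idea could cover every field in the question and write the result to a text file: the helper names `swid_value` and `swid_report` and the output file name are assumptions made for illustration, and `string()` is used so that xmllint prints one plain value per query.

```shell
#!/bin/bash
# Sketch only: swid_value and swid_report are hypothetical helper names.

# swid_value FILE NAME...: print the string value of the first element reached
# by following the given local names down from the root element.
swid_value() {
    local file="$1" xpath="" name
    shift
    for name in "$@"; do
        xpath="${xpath}/*[local-name()='${name}']"
    done
    # The leading /* selects the root element regardless of its namespace.
    xmllint --xpath "string(/*${xpath})" "$file"
}

# swid_report FILE: print one "field: value" line per field from the question.
swid_report() {
    local file="$1" part
    echo "entitlement_required_indicator: $(swid_value "$file" entitlement_required_indicator)"
    echo "product_title: $(swid_value "$file" product_title)"
    echo "product_version name: $(swid_value "$file" product_version name)"
    for part in major minor build review; do
        echo "product_version numeric ${part}: $(swid_value "$file" product_version numeric "$part")"
    done
}

# Run only when a file is given, e.g.: ./swid_report.sh file.xml > swid_report.txt
if [ "$#" -ge 1 ]; then
    swid_report "$1"
fi
```

Spelling out the full chain of local names keeps `product_version/name` from being confused with the `name` elements under `software_creator` and the other blocks.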
Stéphane Chazelas
  • The discussion you listed is very enlightening, thank you. For "//*[local-name()='product_version'] is local-name, something that I create? Ex. "//*[name1()='product_version']. I tried renaming it and got – macman Nov 27 '12 at 16:19
  • The reason this works in XMLStarlet is [a feature](http://xmlstar.sourceforge.net/doc/UG/xmlstarlet-ug.html#idm47077139523728): "In order to handle namespaces with greater ease, XMLStarlet (versions 1.2.1+) will use the namespace prefixes declared on the root element of the input document." – Tanz87 Jul 06 '20 at 18:00

Try using a here-doc. Example:

#!/bin/bash
xmllint --shell file.xml <<EOF
setns swid=http://standards.iso.org/iso/19770/-2/2008/schema.xsd
xpath //swid:product_version/swid:name/text()
EOF

Works with later versions of xmllint that support the --xpath parameter.

roblogic

With an older version of xmllint (which doesn't support --xpath) you can set a namespace and query more intuitively thus (but you have to grep out some additional garbage):

#!/bin/bash
echo 'setns swid=http://standards.iso.org/iso/19770/-2/2008/schema.xsd
      cat //swid:product_version/swid:name/text()' | \
xmllint --shell file.xml | grep -Ev '^(/ >| -----)'
Ed Randall

I had similar issues reading pom.xml (a Maven configuration file) in a shell script for Jenkins. To ensure a good result, I would spell out the full path:

xmllint --xpath "/*[local-name()='software_identification_tag']/*[local-name()='product_version']/*[local-name()='name']/text()" file.xml

You don't seem to have that problem here, but suppose your XML has this kind of additional content:

<swid:product_specifics>
<swid:product_version>
...
</swid:product_version>
</swid:product_specifics>

Then xmllint --xpath "//*[local-name()='product_version']/*[local-name()='name']/text()" file.xml may not return what you expect, because // matches product_version at any depth.

In my situation, a pom.xml has many "version" elements, so if you want a specific one, the path should be exact, otherwise you'll get multiple values you don't want.
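
One way to see the difference is to count the nodes each expression selects before trusting its output. The following is a minimal sketch, assuming xmllint supports `--xpath`; `count_matches` is a hypothetical helper name, and `count()` is the standard XPath 1.0 function.

```shell
#!/bin/bash
# Sketch only: count_matches is a hypothetical helper name.

# count_matches FILE XPATH: print how many nodes XPATH selects in FILE.
count_matches() {
    xmllint --xpath "count($2)" "$1"
}

# Run the comparison only when a file is given, e.g.: ./count_names.sh file.xml
if [ "$#" -ge 1 ]; then
    # Loose: any element called "name", at any depth.
    loose="//*[local-name()='name']"
    # Exact: only the "name" directly under the root's product_version.
    exact="/*[local-name()='software_identification_tag']/*[local-name()='product_version']/*[local-name()='name']"
    echo "loose: $(count_matches "$1" "$loose")"
    echo "exact: $(count_matches "$1" "$exact")"
fi
```

With the sample tag from the question, the loose expression should count four `name` elements (one each under `product_version`, `software_creator`, `software_licensor` and `tag_creator`), while the exact path counts one.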

Anthon