3

I need to audit XML file structures and generate a report that shows only the DOM tree structure, omitting the values. Essentially, I just want the node names and no values. I tried using xmllint and xmlstarlet but can't figure out how to do this.

Does anyone know of any tools that do this, or examples of how to do it with the tools above?

cat $filename.xml | xmlstarlet format -t gives me what I need, but I want to omit all values.

Kinnara
  • 2
    Please [edit] your question and add an example input file and the output you would want from that example so we can test our answers. I realize you want a general solution, but we'll still need a file to play with. – terdon Jul 08 '21 at 14:37
  • 2
    "_I need to audit XML file structures_" if you have an XSD that describes the XML you want to validate, then there are tools such as `xmlstarlet` that will do this for you already. – roaima Jul 08 '21 at 15:25
  • 2
    Do you want to output the elements, but without textual contents? Or just the tag names used? What about attributes? What about comments or XML processing instructions? Would a straight list of each element's XPath suffice? What have you searched/researched/tried? – C. M. Jul 08 '21 at 20:06

2 Answers

9

The xmllint interactive shell command du appears to provide what you want:

   du PATH
       Show the structure of the subtree under the given path or the current node.

If you want something non-interactive, then perhaps

printf '%s\n' du exit | xmllint --shell file.xml

or

xmllint --shell file.xml <<EOF
du
exit
EOF

For example:

$ printf '%s\n' du exit | xmllint --shell rss.xml
/ > /
rss
  channel
    title
    link
    description
    copyright
    language
    lastBuildDate
    image
      url
      title
      link
    item
      title
      link
      description
      pubDate
    item
      title
      link
      description
      pubDate
    item
      title
      link
      description
      pubDate
/ >
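
The prompt lines in the output above (the ones beginning with "/ >") can be filtered out if only the tree is wanted in a report. What follows is a minimal sketch, not part of xmllint itself: strip_prompts is a hypothetical helper name, and it assumes the prompt always looks like the one in the sample output, which should hold as long as the shell never leaves the document root.

```shell
# Hypothetical helper: drop xmllint shell prompt lines from stdin.
# Assumes the prompt always starts with "/ >", as in the sample output above.
# Element names cannot contain "/", so no tree line is removed by mistake.
strip_prompts() {
  sed -e '/^\/ >/d'
}

# Intended usage (needs xmllint installed):
#   printf '%s\n' du exit | xmllint --shell file.xml | strip_prompts
```

The first prompt line also carries the "/" that du prints for the document node, so that root marker is dropped along with the prompt.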
steeldriver
6

Since you're already using xmlstarlet you may as well continue using it.

The xmlstarlet tool has an el (elements) sub-command which is used to "Display element structure of XML document".

By default, it outputs data like this:

$ xmlstarlet el /usr/X11R6/share/xcb/ge.xml
xcb
xcb/request
xcb/request/field
xcb/request/field
xcb/request/reply
xcb/request/reply/pad
xcb/request/reply/field
xcb/request/reply/field
xcb/request/reply/pad

You may also get attributes:

$ xmlstarlet el -a /usr/X11R6/share/xcb/ge.xml
xcb
xcb/@header
xcb/@extension-xname
xcb/@extension-name
xcb/@major-version
xcb/@minor-version
xcb/request
xcb/request/@name
xcb/request/@opcode
xcb/request/field
xcb/request/field/@type
xcb/request/field/@name
xcb/request/field
xcb/request/field/@type
xcb/request/field/@name
xcb/request/reply
xcb/request/reply/pad
xcb/request/reply/pad/@bytes
xcb/request/reply/field
xcb/request/reply/field/@type
xcb/request/reply/field/@name
xcb/request/reply/field
xcb/request/reply/field/@type
xcb/request/reply/field/@name
xcb/request/reply/pad
xcb/request/reply/pad/@bytes

See also xmlstarlet el --help.
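
If an indented tree is preferred over full paths, the path list printed by el can be post-processed with awk. This is only a sketch under stated assumptions: paths_to_tree is a hypothetical helper name, and it assumes one slash-separated path per line, as in the output shown above.

```shell
# Hypothetical helper: turn slash-separated paths (one per line on stdin)
# into an indented tree that shows only the last component of each path.
paths_to_tree() {
  awk -F/ '{
    indent = ""
    # Two spaces of indentation for every ancestor in the path.
    for (i = 1; i < NF; i++) indent = indent "  "
    print indent $NF
  }'
}

# Intended usage (needs xmlstarlet installed):
#   xmlstarlet el file.xml | paths_to_tree
#   xmlstarlet el -a file.xml | paths_to_tree   # attributes appear as @name
```

For an audit where repeated elements only add noise, the el sub-command also has a -u option that prints sorted unique lines; check xmlstarlet el --help for the exact behaviour in your version.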

Using the val (validate) sub-command ("Validate XML document(s) (well-formed/DTD/XSD/RelaxNG)"), xmlstarlet may validate your XML document for you. By default, it just checks whether the document is well formed, but it may also validate your document against a provided XSD schema, the document's DTD, or a RelaxNG schema.

See also xmlstarlet val --help.

Kusalananda