I'm having difficulty parsing a huge XML file (about 100 GB, with some very large nodes). I'm trying to shrink the nodes by deleting unnecessary elements, for example every <text> element.
If I use a conventional XML tool such as xmlstarlet, e.g.

xmlstarlet ed -P -d '//text' file.xml

I run out of memory again, since it builds the whole document tree before editing it.
Is there a safe, low-memory way to remove all <text>...</text> pairs without breaking the XML structure?
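For context, I suspect a streaming (SAX-style) filter could work, since it never holds the whole tree in memory. Here is a minimal sketch of the kind of thing I have in mind, using Python's standard library; it assumes plain tag names with no namespaces, and the `DropTextFilter` class name and the tiny inline document are just for illustration:

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

class DropTextFilter(xml.sax.handler.ContentHandler):
    """Forward SAX events to an XMLGenerator, skipping <text> subtrees."""

    def __init__(self, out):
        super().__init__()
        self._gen = XMLGenerator(out, encoding="utf-8")
        self._depth = 0  # >0 while we are inside a <text> element

    def startDocument(self):
        self._gen.startDocument()

    def startElement(self, name, attrs):
        # Entering a <text> element, or any element nested inside one,
        # increases the suppression depth instead of being written out.
        if name == "text" or self._depth:
            self._depth += 1
        else:
            self._gen.startElement(name, attrs)

    def endElement(self, name):
        if self._depth:
            self._depth -= 1
        else:
            self._gen.endElement(name)

    def characters(self, content):
        if not self._depth:
            self._gen.characters(content)

# Tiny in-memory demo; for the real 100 GB file the input and output
# would be file objects opened on disk instead.
src = io.StringIO("<doc><keep>a</keep><text>drop<inner/>me</text><keep>b</keep></doc>")
out = io.BytesIO()
xml.sax.parse(src, DropTextFilter(out))
result = out.getvalue().decode("utf-8")
print(result)
```

Memory use stays proportional to the nesting depth, not the file size, which is why I think an event-based approach is the right direction; I'm asking whether there is a safe, established way to do this (with this or another tool).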