Remove XML comments using Regex in bash


I want to remove XML comments in bash using regex (awk, sed, grep...). I have looked at other questions, but I am missing something. Here's the XML code:

<table>
    <!--
    removed bla bla bla bla bla bl............
      removeee
      removeddddd
     -->
  <row>
        <column name="example"  value="1" ></column>
    </row>
</table>

So I'm comparing 2 XML files and I don't want the comparison to take the comments into account. I do this:

diff file1.xml file2.xml | sed '/<!--/,/-->/d' 

But this only removes the line that starts with <!-- and the last line. It does not remove the lines in between.

In the end, you're going to have to recommend to your client/friend/instructor that they need to install some kind of XML processor. xmlstarlet is a command line tool, and there are a number (or at least a number greater than 2) of implementations of XSLT which can be compiled for any standard Unix, and in most cases for Windows as well. You cannot really do XML processing with regex-based tools, and whatever you do will be hard to read, harder to maintain, and will fail on some corner cases, sometimes with disastrous consequences.
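As a rough illustration of the XML-processor route, xmlstarlet can delete comment nodes by XPath before the files are compared. This is only a sketch, assuming xmlstarlet is installed under that name (some builds install the binary as xml); the function name strip_comments_xml and the sample file are hypothetical.

```shell
# Hypothetical helper: print an XML file with all comment nodes removed.
# Assumes the xmlstarlet binary is on PATH.
strip_comments_xml() {
    # "ed -d XPATH" deletes every node the XPath selects;
    # //comment() selects all comment nodes in the document.
    xmlstarlet ed -d '//comment()' "$1"
}

# Guarded demo so the snippet is harmless where xmlstarlet is missing.
if command -v xmlstarlet >/dev/null 2>&1; then
    sample=$(mktemp)
    printf '%s\n' '<table>' '  <!-- note' '       spanning lines -->' '  <row/>' '</table>' > "$sample"
    strip_comments_xml "$sample"
    rm -f "$sample"
else
    echo "xmlstarlet not found; skipping demo" >&2
fi
```

With both files passed through a helper like this, a plain diff of the two results no longer sees comment text, although whitespace around the removed comments may still differ.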

I haven't spent a lot of time polishing or reviewing the following little awk program. I think it will remove comments from compliant XML documents. Note that the following comment is not compliant:

<!-- xml comments cannot include -- comment illegal --> 

and it will not be treated correctly by this script.

The following is also illegal, but since I've seen it in the wild and it wasn't hard to deal with, I did so:

<!-------------- comment ill-formed but... --------------> 

Here it is. No guarantees. I know it's hard to read, and I wouldn't want to maintain it. It may well fail on arbitrary corner cases.

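awk '# Closing line of a multi-line comment: drop text through the closer.
     in_comment&&/-->/{sub(/([^-]|-[^-])*--+>/,"");in_comment=0}
     # Still inside a comment: skip the line.
     in_comment{next}
     # Otherwise remove one-line comments, then note an unclosed opener.
     {gsub(/<!--+([^-]|-[^-])*--+>/,"");
      in_comment=sub(/<!--+.*/,"");
      print}'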
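One way to apply the script to the original comparison is to strip the comments from each file first and then diff the results, rather than filtering the diff output. A minimal sketch: the wrapper name strip_comments, the temporary files, and the extra awk 'NF' step (which drops the empty or whitespace-only lines the script leaves where comments used to be) are additions for illustration, not part of the program above.

```shell
# strip_comments is a hypothetical wrapper around the awk program above.
strip_comments() {
    awk 'in_comment&&/-->/{sub(/([^-]|-[^-])*--+>/,"");in_comment=0}
         in_comment{next}
         {gsub(/<!--+([^-]|-[^-])*--+>/,"");
          in_comment=sub(/<!--+.*/,"");
          print}' "$1" |
    awk 'NF'   # assumption: discard lines left empty or whitespace-only
}

# Demo with two throwaway files that differ only in their comments.
tmp=$(mktemp -d)
printf '%s\n' '<table>' '  <!-- first' '       version -->' '  <row/>' '</table>' > "$tmp/file1.xml"
printf '%s\n' '<table>' '  <!-- second version -->' '  <row/>' '</table>' > "$tmp/file2.xml"

strip_comments "$tmp/file1.xml" > "$tmp/a.xml"
strip_comments "$tmp/file2.xml" > "$tmp/b.xml"

# diff exits with status 0 when the stripped files are identical.
if diff "$tmp/a.xml" "$tmp/b.xml" >/dev/null; then
    echo "no differences outside comments"
fi
rm -rf "$tmp"
```

Dropping blank lines also hides differences that consist only of blank lines, so leave that step out if those matter for the comparison.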
