Python 2.7 - How to remove <?xml version="1.0" encoding="utf-8"?> when using "xml" in Beautiful Soup
from bs4 import BeautifulSoup
xmlcontent = "some text <tags>"
bs = BeautifulSoup(xmlcontent, "xml")
print bs
This outputs:
<?xml version="1.0" encoding="utf-8"?> text <tags>
Is it possible to not output:
<?xml version="1.0" encoding="utf-8"?>
I know that when using lxml, I can remove the added <body> tags with:
bs = BeautifulSoup(xmlcontent, "lxml")
print bs.body.next
Is there an equivalent for the "xml" parser, so that the XML version and encoding are not included?
I am choosing "xml" over "lxml" because the contents being parsed are in XML format. Is this the best choice, or can I use "lxml" for XML content?
This seems to work:
from bs4 import BeautifulSoup
xmlcontent = "some text <tags>"
bs = BeautifulSoup(xmlcontent, "xml")
bs = bs.encode_contents()
print type(bs)  # it's a string
print bs  # text <tags>
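If you would rather keep the normal serialized output and just drop the declaration afterwards, a regular expression on the resulting string is another option. This is only a sketch of a different technique, not something Beautiful Soup provides: strip_xml_declaration is a hypothetical helper name, and the sample string is made up for illustration. It uses only the standard library and should run on both Python 2.7 and Python 3.

```python
import re

# Hypothetical helper, not part of Beautiful Soup.
# Matches a leading <?xml ...?> declaration plus surrounding whitespace.
# The \s after "xml" keeps other processing instructions such as
# <?xml-stylesheet ...?> from being removed.
_XML_DECL = re.compile(r'^\s*<\?xml\s[^>]*\?>\s*')


def strip_xml_declaration(serialized):
    """Return the serialized document without a leading XML declaration."""
    return _XML_DECL.sub('', serialized, count=1)


# Made-up sample input standing in for a serialized document.
with_decl = '<?xml version="1.0" encoding="utf-8"?>\n<root>text</root>'
without_decl = strip_xml_declaration(with_decl)
print(without_decl)
```

Strings that do not start with a declaration are returned unchanged, so the helper can be applied to any serialized output.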