Python 2.7 - How to remove <?xml version="1.0" encoding="utf-8"?> when using "xml" in Beautiful Soup
    from bs4 import BeautifulSoup
    xmlcontent = "some text <tags>"
    bs = BeautifulSoup(xmlcontent, "xml")
    print bs

outputs:

    <?xml version="1.0" encoding="utf-8"?>
    some text <tags>

Is it possible to not output:
    <?xml version="1.0" encoding="utf-8"?>

I know that if I am using lxml, I can remove the added <body> tags like this:

    bs = BeautifulSoup(xmlcontent, "lxml")
    print bs.body.next

Is there an equivalent when using xml, so that the XML version and encoding are not included?
I am choosing to use xml over lxml because the contents being parsed are in XML format. Is this the best choice, or can I use lxml for XML content?
This seems to work:

    from bs4 import BeautifulSoup
    xmlcontent = "some text <tags>"
    bs = BeautifulSoup(xmlcontent, "xml")
    bs = bs.encode_contents()
    print type(bs)  # it's a string
    print bs  # some text <tags>
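If relying on encode_contents() is not an option, another possibility is to strip the declaration from the already serialized string. This is a different technique from the one above, shown only as a minimal sketch: the helper name strip_xml_declaration and the regular expression are illustrative assumptions, not part of Beautiful Soup, and it uses only the standard library.

```python
import re

# Hypothetical helper (not a Beautiful Soup API): matches one leading
# XML declaration such as <?xml version="1.0" encoding="utf-8"?>
# together with any whitespace around it.
_XML_DECLARATION = re.compile(r'^\s*<\?xml[^>]*\?>\s*')


def strip_xml_declaration(serialized):
    """Return `serialized` without a leading XML declaration, if present."""
    return _XML_DECLARATION.sub('', serialized, count=1)


# Stand-in for the string that printing the soup produces in the question.
serialized = '<?xml version="1.0" encoding="utf-8"?>\nsome text <tags>'

# Parenthesised single-argument print behaves the same on Python 2.7 and 3.
print(strip_xml_declaration(serialized))  # some text <tags>
```

With the question's code, the helper would be applied to str(bs). A string that has no declaration is returned unchanged.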