PHP HTML DOM, XPath - weird characters?


Assume $html_dom contains a page that has HTML entities. In the output below, those entities come out as weird characters.

$html_dom = new DOMDocument();
@$html_dom->loadHTML($html_doc);
$xpath = new DOMXPath($html_dom);

$query   = '//div[@class="foo"]/div/p';
$my_foos = $xpath->query($query);
foreach ($my_foos as $my_foo) {
    // Print the text of the first matching node, then stop.
    echo html_entity_decode($my_foo->nodeValue);
    die;
}

How do I handle this so that I don't get weird characters? I tried the following with no success:

// Convert the raw markup before it is parsed.
$html_doc = mb_convert_encoding($html_doc, 'HTML-ENTITIES', 'UTF-8');
$html_dom = new DOMDocument();
$html_dom->resolveExternals = true;
@$html_dom->loadHTML($html_doc);
$xpath = new DOMXPath($html_dom);

$query   = '//div[@class="foo"]/div/p';
$my_foos = $xpath->query($query);
foreach ($my_foos as $my_foo) {
    echo html_entity_decode($my_foo->nodeValue);
    die;
}

mb_convert_encoding is a good idea, but it does not work as expected here because DOMDocument seems to be a little bit buggy when it comes to encoding.
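The behaviour can be made visible with a small sketch. It assumes the dom extension is loaded, and the sample markup (including the `&nbsp;` entity) is made up for illustration. Whatever the input looked like, `nodeValue` hands back a UTF-8 string in which the entity has already been decoded, so converting the raw markup beforehand does not change what comes out of the node.

```php
<?php
// Hypothetical sample markup; &nbsp; stands in for whatever the real page contains.
$html_doc = '<div class="foo"><div><p>a&nbsp;b</p></div></div>';

$html_dom = new DOMDocument();
@$html_dom->loadHTML($html_doc);
$xpath = new DOMXPath($html_dom);

$node = $xpath->query('//div[@class="foo"]/div/p')->item(0);

// nodeValue is UTF-8: the entity is now the two bytes 0xC2 0xA0 (U+00A0).
$hex = bin2hex($node->nodeValue);
echo $hex, PHP_EOL; // 61c2a062
```

If those two bytes are then displayed as ISO-8859-1 rather than UTF-8, they show up as a stray "Â" followed by a non-breaking space, which is likely the kind of weird character described in the question.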

Moving the mb_convert_encoding call to the actual node output did the trick.

$html_dom = new DOMDocument();
$html_dom->resolveExternals = true;
@$html_dom->loadHTML($html_doc);
$xpath = new DOMXPath($html_dom);

$query   = '//div[@class="foo"]/div/p';
$my_foos = $xpath->query($query);
foreach ($my_foos as $my_foo) {
    // Convert the UTF-8 node text to entities at output time.
    echo mb_convert_encoding($my_foo->nodeValue, 'HTML-ENTITIES', 'UTF-8');
    die;
}
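On newer PHP versions the 'HTML-ENTITIES' target of mb_convert_encoding is deprecated (as of PHP 8.2), so the same idea may need a different call. The following is a minimal sketch of one possible substitute, not the original answer's code: it uses mb_encode_numericentity to turn every non-ASCII character of the node text into a numeric entity. The helper name `to_ascii_entities` and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: replace each non-ASCII character of a UTF-8 string
// with a decimal numeric entity, leaving ASCII untouched.
function to_ascii_entities(string $utf8): string
{
    // Map format: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8, $map, 'UTF-8');
}

// Made-up sample page containing two entities.
$html_doc = '<div class="foo"><div><p>caf&eacute;&nbsp;au lait</p></div></div>';

$html_dom = new DOMDocument();
@$html_dom->loadHTML($html_doc);
$xpath = new DOMXPath($html_dom);

$out = [];
foreach ($xpath->query('//div[@class="foo"]/div/p') as $my_foo) {
    // Convert at output time, as in the answer above.
    $out[] = to_ascii_entities($my_foo->nodeValue);
}
echo implode(PHP_EOL, $out), PHP_EOL; // caf&#233;&#160;au lait
```

Because the result is pure ASCII, it should display the same way regardless of the charset the receiving page declares. Unlike the named entities in the input, the output uses numeric ones, which browsers render identically.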
