python - HTML page vastly different when using a headless webkit implementation using PyQT


I was under the impression that a headless browser implementation of WebKit using PyQt would automatically give me the full HTML for each URL, even one with heavy JS code in it. But I am only seeing part of it when I compare the result with the page I get by saving it from a Firefox window.

I am using the following code:

    import sys

    # Assumes PyQt4, where QWebPage lives in QtWebKit
    from PyQt4.QtCore import QTimer, QUrl
    from PyQt4.QtGui import QApplication
    from PyQt4.QtWebKit import QWebPage

    # SEC and USER_AGENT are module-level constants not shown here.


    class JabbaWebkit(QWebPage):
        # 'html' is a class variable

        def __init__(self, url, wait, app, parent=None):
            super(JabbaWebkit, self).__init__(parent)
            JabbaWebkit.html = ''

            if wait:
                # Quit after a fixed delay
                QTimer.singleShot(wait * SEC, app.quit)
            else:
                # Quit as soon as the initial page load finishes
                self.loadFinished.connect(app.quit)

            self.mainFrame().load(QUrl(url))

        def save(self):
            JabbaWebkit.html = self.mainFrame().toHtml()

        def userAgentForUrl(self, url):
            return USER_AGENT


    def get_page(url, wait=None):
        # Here is the trick for calling it several times
        app = QApplication.instance()  # checks if a QApplication already exists

        if not app:  # create a QApplication if it doesn't exist
            app = QApplication(sys.argv)

        form = JabbaWebkit(url, wait, app)
        app.aboutToQuit.connect(form.save)
        app.exec_()
        return JabbaWebkit.html

Can anyone see what is wrong with the code?

After running the code through a few URLs, here is one I found that shows the problem I am running into quite clearly: http://www.chilis.com/en/pages/menu.aspx

Thanks for any pointers.

The page has AJAX code. When the load finishes, the page still needs some time to update itself via AJAX, but your code quits as soon as the load finishes.

You should add some code that waits for a while and processes events in WebKit:

    for i in range(200):  # wait about 2 seconds
        app.processEvents()
        time.sleep(0.01)
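The wait loop can also be wrapped in a small helper so the delay is set in seconds rather than by an iteration count. This is only a minimal sketch: `pump_events` and its parameters are hypothetical names, not part of PyQt. It takes the event-processing function as an argument, which lets it run without Qt installed. It uses a deadline rather than a fixed number of passes, because each `processEvents()` call can itself take some time.

```python
import time


def pump_events(process_events, duration=2.0, interval=0.01):
    """Call process_events() repeatedly for roughly `duration` seconds.

    Returns the number of times process_events was called.
    """
    deadline = time.monotonic() + duration
    calls = 0
    while time.monotonic() < deadline:
        process_events()      # e.g. app.processEvents in a Qt program
        calls += 1
        time.sleep(interval)  # pause briefly between passes
    return calls


# Hypothetical usage with the question's QApplication instance:
#     pump_events(app.processEvents, duration=2.0)
```

How long to wait is a guess in either form. A page whose AJAX requests take longer than the chosen delay will still come back incomplete.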
