PhantomJS acts differently than Firefox webdriver


问题内容

I’m working on some code in which I use Selenium web driver - Firefox. Most of
things seems to work but when I try to change the browser to PhantomJS, It
starts to behave differently.

The page I’m processing is needed to be scrolled slowly to load more and more
results and that’s probably the problem.

Here is the code which works with Firefox webdriver, but doesn’t work with
PhantomJS:

def get_url(destination,start_date,end_date): #the date is like %Y-%m-%d 
    return "https://www.pelikan.sk/sk/flights/listdfc=%s&dtc=C%s&rfc=C%s&rtc=%s&dd=%s&rd=%s&px=1000&ns=0&prc=&rng=0&rbd=0&ct=0&view=list" % ('CVIE%20BUD%20BTS',destination, destination,'CVIE%20BUD%20BTS', start_date, end_date)



def load_whole_page(self,destination,start_date,end_date):
        deb()

        url = get_url(destination,start_date,end_date)

        self.driver.maximize_window()
        self.driver.get(url)

        wait = WebDriverWait(self.driver, 60)
        wait.until(EC.invisibility_of_element_located((By.XPATH, '//img[contains(@src, "loading")]')))
        wait.until(EC.invisibility_of_element_located((By.XPATH,
                                                       u'//div[. = "Poprosíme o trpezlivosť, hľadáme pre Vás ešte viac letov"]/preceding-sibling::img')))
        i=0
        old_driver_html = ''
        end = False
        while end==False:
            i+=1

            results = self.driver.find_elements_by_css_selector("div.flightbox")
            print len(results)
            if len(results)>=__THRESHOLD__: # for testing purposes. Default value: 999
                break
            try:
                self.driver.execute_script("arguments[0].scrollIntoView();", results[0])
                self.driver.execute_script("arguments[0].scrollIntoView();", results[-1])            
            except:
                self.driver.save_screenshot('screen_before_'+str()+'.png')
                sleep(2)

                print 'EXCEPTION<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<'
                continue

            new_driver_html = self.driver.page_source
            if new_driver_html == old_driver_html:
                print 'END OF PAGE'
                break
            old_driver_html = new_driver_html

            wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, 'div.flightbox'), len(results)))
        sleep(10)

To detect when the page is full loaded, I compare old copy of html and new
html which is probably not what I’m supposed to do but with Firefox it is
sufficient.

Here is the screen of PhantomJS when the loading is stopped:enter image
description here

With Firefox, it loads more and more results, but with PhantomJS it is stucked
on for example 10 results.

Any ideas? What are the differences between these two drivers?


问题答案:

Two key things that helped me to solve it:

  • do not use that custom wait I’ve helped you with before
  • set the window.document.body.scrollTop first to 0 and then to document.body.scrollHeight in a row

Working code:

results = []
while len(results) < 200:
    results = driver.find_elements_by_css_selector("div.flightbox")

    print len(results)

    # scroll
    driver.execute_script("arguments[0].scrollIntoView();", results[0])
    driver.execute_script("window.document.body.scrollTop = 0;")
    driver.execute_script("window.document.body.scrollTop = document.body.scrollHeight;")
    driver.execute_script("arguments[0].scrollIntoView();", results[-1])

Version 2 (endless loop, stop if there is nothing loaded on scroll
anymore):

results = []
while True:
    try:
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print len(results)

    # scroll
    for _ in xrange(5):
        try:
            self.driver.execute_script("""
                arguments[0].scrollIntoView();
                window.document.body.scrollTop = 0;
                window.document.body.scrollTop = document.body.scrollHeight;
                arguments[1].scrollIntoView();
            """, results[0], results[-1])
        except StaleElementReferenceException:
            break  # here it means more results were loaded

print "DONE. Result count: %d" % len(results)

Note that I’ve changed the comparison in the wait_for_more_than_n_elements
expected condition. Replaced:

return count >= self.count

with:

return count > self.count

Version 3 (scrolling from header to footer multiple times):

header = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'header')))
footer = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'footer')))

results = []
while True:
    try:
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print len(results)

    # scroll
    for _ in xrange(5):
        self.driver.execute_script("""
            arguments[0].scrollIntoView();
            arguments[1].scrollIntoView();
        """, header, footer)
        sleep(1)