PhantomJS acts differently than Firefox webdriver
Question:
I'm working on some code that uses the Selenium Firefox webdriver. Most
things seem to work, but when I switch the browser to PhantomJS, it
starts to behave differently.
The page I'm processing has to be scrolled slowly so that it loads more and
more results, and that is probably the problem.
Here is the code, which works with the Firefox webdriver but not with
PhantomJS:
from time import sleep

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def get_url(destination, start_date, end_date):  # dates are formatted as %Y-%m-%d
    return "https://www.pelikan.sk/sk/flights/listdfc=%s&dtc=C%s&rfc=C%s&rtc=%s&dd=%s&rd=%s&px=1000&ns=0&prc=&rng=0&rbd=0&ct=0&view=list" % ('CVIE%20BUD%20BTS',destination, destination,'CVIE%20BUD%20BTS', start_date, end_date)


def load_whole_page(self, destination, start_date, end_date):
    deb()
    url = get_url(destination, start_date, end_date)
    self.driver.maximize_window()
    self.driver.get(url)

    # wait until both loading indicators have disappeared
    wait = WebDriverWait(self.driver, 60)
    wait.until(EC.invisibility_of_element_located((By.XPATH, '//img[contains(@src, "loading")]')))
    wait.until(EC.invisibility_of_element_located((By.XPATH,
        u'//div[. = "Poprosíme o trpezlivosť, hľadáme pre Vás ešte viac letov"]/preceding-sibling::img')))

    i = 0
    old_driver_html = ''
    end = False
    while not end:
        i += 1
        results = self.driver.find_elements_by_css_selector("div.flightbox")
        print len(results)
        if len(results) >= __THRESHOLD__:  # for testing purposes. Default value: 999
            break
        try:
            # scroll to the first and then to the last result to trigger loading
            self.driver.execute_script("arguments[0].scrollIntoView();", results[0])
            self.driver.execute_script("arguments[0].scrollIntoView();", results[-1])
        except Exception:
            self.driver.save_screenshot('screen_before_' + str(i) + '.png')
            sleep(2)
            print 'EXCEPTION<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<'
            continue

        # stop once scrolling no longer changes the page source
        new_driver_html = self.driver.page_source
        if new_driver_html == old_driver_html:
            print 'END OF PAGE'
            break
        old_driver_html = new_driver_html

        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, 'div.flightbox'), len(results)))
    sleep(10)
To detect when the page is fully loaded, I compare the previous copy of the
HTML with the new one. That is probably not what I'm supposed to do, but with
Firefox it is sufficient.
Here is a screenshot of PhantomJS at the point where loading stops: (screenshot)
With Firefox, the page keeps loading more and more results, but with PhantomJS
it gets stuck at, for example, 10 results.
Any ideas? What are the differences between these two drivers?
Answer:
Two key things helped me solve it:
- do not use the custom wait I helped you with before
- set window.document.body.scrollTop first to 0 and then to document.body.scrollHeight, one right after the other
Working code:
results = []
while len(results) < 200:
    results = driver.find_elements_by_css_selector("div.flightbox")
    print len(results)

    # scroll
    driver.execute_script("arguments[0].scrollIntoView();", results[0])
    driver.execute_script("window.document.body.scrollTop = 0;")
    driver.execute_script("window.document.body.scrollTop = document.body.scrollHeight;")
    driver.execute_script("arguments[0].scrollIntoView();", results[-1])
Version 2 (endless loop that stops once scrolling no longer loads anything):
from selenium.common.exceptions import StaleElementReferenceException, TimeoutException

results = []
while True:
    try:
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break  # no new results appeared within the wait timeout

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print len(results)

    # scroll
    for _ in xrange(5):
        try:
            self.driver.execute_script("""
                arguments[0].scrollIntoView();
                window.document.body.scrollTop = 0;
                window.document.body.scrollTop = document.body.scrollHeight;
                arguments[1].scrollIntoView();
            """, results[0], results[-1])
        except StaleElementReferenceException:
            break  # here it means more results were loaded

print "DONE. Result count: %d" % len(results)
Note that I’ve changed the comparison in the wait_for_more_than_n_elements
expected condition. Replaced:
return count >= self.count
with:
return count > self.count
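The wait_for_more_than_n_elements condition itself is not shown in this answer. As a rough sketch only, assuming it is a callable class in the style of Selenium's built-in expected conditions, and that it stores the locator tuple and the previous count as attributes (these names are guesses, not the original implementation), the strict comparison could look like this:

```python
class wait_for_more_than_n_elements(object):
    """Hypothetical expected condition: truthy once strictly more than
    `count` elements match `locator`."""

    def __init__(self, locator, count):
        self.locator = locator  # (by, value) tuple, e.g. (By.CSS_SELECTOR, "div.flightbox")
        self.count = count      # number of elements seen on the previous pass

    def __call__(self, driver):
        # WebDriverWait.until() calls this repeatedly with the driver
        # until it returns a truthy value or the timeout expires.
        count = len(driver.find_elements(*self.locator))
        return count > self.count
```

With `>=` the condition is already satisfied by the elements found on the previous pass, so the wait returns immediately and never times out. With `>` it only succeeds when new results have actually been added, which is what lets the TimeoutException end the loop.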
Version 3 (scrolling from header to footer multiple times):
header = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'header')))
footer = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'footer')))

results = []
while True:
    try:
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print len(results)

    # scroll
    for _ in xrange(5):
        self.driver.execute_script("""
            arguments[0].scrollIntoView();
            arguments[1].scrollIntoView();
        """, header, footer)
        sleep(1)
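
Versions 2 and 3 share the same control flow: wait for the result count to grow, scroll a few times, and stop when a wait times out. The sketch below isolates that flow so it can be tried without a browser. The function name, its parameters, and the two callables are assumptions made for illustration; they are not part of Selenium or of the code above.

```python
import time


def scroll_until_exhausted(count_items, scroll, timeout=10.0, poll=0.5,
                           scrolls_per_round=5):
    """Return the final item count once scrolling stops producing new items.

    count_items: hypothetical callable returning the current number of results
    scroll:      hypothetical callable performing one scroll action
    """
    seen = 0
    while True:
        # Poll until the count grows past what we have already seen,
        # or give up when the timeout expires.
        deadline = time.time() + timeout
        current = count_items()
        while current <= seen and time.time() < deadline:
            time.sleep(poll)
            current = count_items()
        if current <= seen:
            return seen  # nothing new arrived within the timeout
        seen = current
        # Scroll several times to trigger loading of the next batch.
        for _ in range(scrolls_per_round):
            scroll()
```

With Selenium, count_items could wrap the find_elements_by_css_selector("div.flightbox") call and scroll could wrap one of the execute_script calls shown above. Counting elements, rather than comparing page_source, avoids treating unrelated markup changes as new results.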