Error using Selenium WebDriver with Python in the skills lab of IBM's online Jupyter notebook

Problem description

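I have tried different approaches to get Selenium working in a Python script so that I can scrape the web from an online Jupyter notebook, without success. I have read several other guides and many answers to similar questions, but none of them helped. In the virtual environment, I downloaded both Firefox (v81.0) and geckodriver (v0.27) into my development folder and granted them all permissions: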

jupyterlab@jupyterlab-sps:/resources/testDevelop$ ls -l
total 7797

drwxrwsr-x 8 jupyterlab resources    4096 Oct  8 13:24 firefox
-rwxrwxrwx 1 jupyterlab resources 7274984 Oct  8 13:21 geckodriver
-rw-rw-r-- 1 jupyterlab resources     120 Oct 12 08:47 geckodriver.log
-rw-rw-r-- 1 jupyterlab resources   31813 Oct 12 09:42 testDevelop.ipynb

jupyterlab@jupyterlab-sps:/resources/testDevelop/firefox$ ls -l
total 165651
-rw-rw-r-- 1 jupyterlab resources       825 Sep 30 14:26 Throbber-small.gif
-rw-rw-r-- 1 jupyterlab resources       895 Sep 30 15:49 application.ini
drwxrwsr-x 4 jupyterlab resources      4096 Oct  8 13:24 browser
-rwxrwxr-x 1 jupyterlab resources    241720 Sep 30 16:28 crashreporter
-rw-rw-r-- 1 jupyterlab resources      4003 Sep 30 14:26 crashreporter.ini
drwxrwsr-x 3 jupyterlab resources      4096 Oct  8 13:24 defaults
-rw-rw-r-- 1 jupyterlab resources       174 Sep 30 16:28 dependentlibs.list
-rwxrwxr-x 1 jupyterlab resources     14656 Sep 30 16:28 firefox
-rwxrwxr-x 1 jupyterlab resources    569104 Sep 30 16:28 firefox-bin
-rw-rw-r-- 1 jupyterlab resources      1449 Sep 30 16:32 firefox-bin.sig
-rw-rw-r-- 1 jupyterlab resources      1449 Sep 30 16:32 firefox.sig
drwxrwsr-x 2 jupyterlab resources      4096 Oct  8 13:24 fonts
drwxrwsr-x 3 jupyterlab resources      4096 Oct  8 13:24 gmp-clearkey
drwxrwsr-x 2 jupyterlab resources      4096 Oct  8 13:24 gtk2
drwxrwsr-x 2 jupyterlab resources      4096 Oct  8 13:24 icons
-rwxrwxr-x 1 jupyterlab resources    895568 Sep 30 16:28 libfreeblpriv3.so
-rwxrwxr-x 1 jupyterlab resources    691064 Sep 30 16:28 libgraphitewasm.so
-rwxrwxr-x 1 jupyterlab resources     43408 Sep 30 16:28 liblgpllibs.so
-rwxrwxr-x 1 jupyterlab resources   2175768 Sep 30 16:28 libmozavcodec.so
-rwxrwxr-x 1 jupyterlab resources    220128 Sep 30 16:28 libmozavutil.so
-rwxrwxr-x 1 jupyterlab resources     14352 Sep 30 16:28 libmozgtk.so
-rwxrwxr-x 1 jupyterlab resources    113512 Sep 30 16:28 libmozsandbox.so
-rwxrwxr-x 1 jupyterlab resources   1207424 Sep 30 16:28 libmozsqlite3.so
-rwxrwxr-x 1 jupyterlab resources     18376 Sep 30 16:28 libmozwayland.so
-rwxrwxr-x 1 jupyterlab resources    243728 Sep 30 16:28 libnspr4.so
-rwxrwxr-x 1 jupyterlab resources    694896 Sep 30 16:28 libnss3.so
-rwxrwxr-x 1 jupyterlab resources    465616 Sep 30 16:28 libnssckbi.so
-rwxrwxr-x 1 jupyterlab resources    191728 Sep 30 16:28 libnssutil3.so
-rwxrwxr-x 1 jupyterlab resources    184120 Sep 30 16:28 liboggwasm.so
-rwxrwxr-x 1 jupyterlab resources     22872 Sep 30 16:28 libplc4.so
-rwxrwxr-x 1 jupyterlab resources     14592 Sep 30 16:28 libplds4.so
-rwxrwxr-x 1 jupyterlab resources    168024 Sep 30 16:28 libsmime3.so
-rwxrwxr-x 1 jupyterlab resources    326208 Sep 30 16:28 libsoftokn3.so
-rwxrwxr-x 1 jupyterlab resources    406208 Sep 30 16:28 libssl3.so
-rwxrwxr-x 1 jupyterlab resources 131841712 Sep 30 16:28 libxul.so
-rw-rw-r-- 1 jupyterlab resources      1449 Sep 30 16:32 libxul.so.sig
-rwxrwxr-x 1 jupyterlab resources   1260688 Sep 30 16:28 minidump-analyzer
-rw-rw-r-- 1 jupyterlab resources  26270759 Sep 30 16:32 omni.ja
-rwxrwxr-x 1 jupyterlab resources    614144 Sep 30 16:28 pingsender
-rw-rw-r-- 1 jupyterlab resources       166 Sep 30 16:28 platform.ini
-rwxrwxr-x 1 jupyterlab resources    564936 Sep 30 16:28 plugin-container
-rw-rw-r-- 1 jupyterlab resources      1449 Sep 30 16:32 plugin-container.sig
-rw-rw-r-- 1 jupyterlab resources      2017 Sep 30 16:32 precomplete
-rw-rw-r-- 1 jupyterlab resources         0 Sep 30 16:28 removed-files
-rw-rw-r-- 1 jupyterlab resources       132 Sep 30 16:28 update-settings.ini
-rwxrwxr-x 1 jupyterlab resources    101864 Sep 30 16:28 updater
-rw-rw-r-- 1 jupyterlab resources       638 Sep 30 16:28 updater.ini

I also added the firefox and geckodriver directories to the PATH environment variable:

jupyterlab@jupyterlab-sps:/resources/testDevelop/firefox$ echo $PATH
/resources/testDevelop:/resources/testDevelop/firefox:/resources/firefox:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/home/jupyterlab/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/jre1.8.0_211/bin:/home/jupyterlab/hadoop-2.9.2/bin:/home/jupyterlab/spark-2.4.3/bin
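The PATH printed in a terminal and the PATH seen by the notebook kernel are not necessarily the same, so it can help to check from inside the notebook what the Python process actually resolves. The sketch below is only a diagnostic under that assumption; find_on_path is a hypothetical helper name, and the two extra directories are the ones from the listing above.

```python
import os
import shutil


def find_on_path(name, extra_dirs=()):
    """Return the full path `name` resolves to for this Python process, or None."""
    # Search the optional extra directories first, then the kernel's own PATH.
    dirs = list(extra_dirs) + os.environ.get("PATH", "").split(os.pathsep)
    search_path = os.pathsep.join(d for d in dirs if d)
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    # What the kernel sees with its own PATH only.
    print("firefox     ->", find_on_path("firefox"))
    print("geckodriver ->", find_on_path("geckodriver"))
    # Same lookup with the development folders (from the listing above) added explicitly.
    extra = ["/resources/testDevelop", "/resources/testDevelop/firefox"]
    print("firefox (extra dirs) ->", find_on_path("firefox", extra))
```

If the first two lookups print None while the last one prints a path, the kernel was probably started with a different PATH than the terminal.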

However, if I try this code:

import os
import selenium
from selenium import webdriver
from selenium.webdriver import Firefox
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

driver=Firefox(executable_path='/resources/testDevelop/geckodriver',)

I get:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-23-e332a8e620e3> in <module>
      6 from webdriver_manager.firefox import GeckoDriverManager
      7 
----> 8 driver=Firefox(executable_path='/resources/testDevelop/geckodriver',)
      9 cap = DesiredCapabilities().FIREFOX
     10 cap["marionette"] = False

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/firefox/webdriver.py in __init__(self,firefox_profile,firefox_binary,timeout,capabilities,proxy,executable_path,options,service_log_path,firefox_options,service_args,desired_capabilities,log_path,keep_alive)
    177         else:
    178             if self.binary is None:
--> 179                 self.binary = FirefoxBinary()
    180             if self.profile is None:
    181                 self.profile = FirefoxProfile()

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/firefox/firefox_binary.py in __init__(self,firefox_path,log_file)
     45         self.command_line = None
     46         if self._start_cmd is None:
---> 47             self._start_cmd = self._get_firefox_start_cmd()
     48         if not self._start_cmd.strip():
     49             raise WebDriverException(

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/firefox/firefox_binary.py in _get_firefox_start_cmd(self)
    167                 raise RuntimeError(
    168                     "Could not find firefox in your system PATH." +
--> 169                     " Please specify the firefox binary location or install firefox")
    170         return start_cmd
    171 

RuntimeError: Could not find firefox in your system PATH. Please specify the firefox binary location or install firefox
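The traceback above ends in the else branch that builds a FirefoxBinary(), which is the legacy (non-Marionette) code path, and the cell it shows also contains cap["marionette"] = False. One possible explanation, offered as an assumption rather than a confirmed diagnosis, is that DesiredCapabilities().FIREFOX hands back a shared class-level dictionary in Selenium 3, so setting marionette to False in one cell changes the default for every later Firefox() call until the kernel restarts. The sketch below reproduces that pitfall with a hypothetical stand-in class (Presets) instead of Selenium itself:

```python
class Presets:
    # Hypothetical stand-in: a preset stored as a class-level dict, which is
    # how Selenium 3 is assumed to store DesiredCapabilities.FIREFOX.
    FIREFOX = {"browserName": "firefox", "marionette": True}


def default_capabilities():
    # Stand-in for a driver that copies the preset when no capabilities are passed.
    return Presets.FIREFOX.copy()


before = default_capabilities()["marionette"]

cap = Presets().FIREFOX        # the shared dict itself, not a copy
cap["marionette"] = False      # silently changes the preset for later callers

after = default_capabilities()["marionette"]

# Safer pattern: copy first, then modify only the copy.
Presets.FIREFOX["marionette"] = True   # restore the preset for this demo
private = dict(Presets.FIREFOX)
private["marionette"] = False
restored = default_capabilities()["marionette"]

print(before, after, restored)  # True False True
```

If this is what happened, restarting the kernel and modifying only a copy of the capabilities would keep later cells on the default code path.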

So I tried:

import os
import selenium
from selenium import webdriver
from selenium.webdriver import Firefox
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities


cap = DesiredCapabilities().FIREFOX
cap["marionette"] = False
driver = os.path.normpath(os.path.join(os.getcwd(),'geckodriver'))
binary = os.path.normpath(os.path.join(os.getcwd(),'firefox','firefox'))
ff_binary = webdriver.firefox.firefox_binary.FirefoxBinary(firefox_path=binary,log_file='ff_log.log')
#driver = webdriver.Firefox(executable_path=GeckoDriverManager().install())
browser = webdriver.Firefox(firefox_binary=ff_binary,capabilities=cap,executable_path=driver)

But I got a strange error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-26-0bb63b20498c> in <module>
     12 binary = os.path.normpath(os.path.join(os.getcwd(),'firefox'))
     13 ff_binary = webdriver.firefox.firefox_binary.FirefoxBinary(firefox_path=binary,log_file='ff_log.log')
---> 14 browser = webdriver.Firefox(firefox_binary=ff_binary,executable_path=driver)

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/firefox/webdriver.py in __init__(self,keep_alive)
    189 
    190             executor = ExtensionConnection("127.0.0.1",self.profile,
--> 191                                            self.binary,timeout)
    192             RemoteWebDriver.__init__(
    193                 self,

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/firefox/extension_connection.py in __init__(self,host,timeout)
     50         self.profile.add_extension()
     51 
---> 52         self.binary.launch_browser(self.profile,timeout=timeout)
     53         _URL = "http://%s:%d/hub" % (HOST,PORT)
     54         RemoteConnection.__init__(

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/firefox/firefox_binary.py in launch_browser(self,profile,timeout)
     70         self.profile = profile
     71 
---> 72         self._start_from_profile_path(self.profile.path)
     73         self._wait_until_connectable(timeout=timeout)
     74 

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/firefox/firefox_binary.py in _start_from_profile_path(self,path)
     93         self.process = Popen(
     94             command,stdout=self._log_file,stderr=STDOUT,
---> 95             env=self._firefox_env)
     96 
     97     def _wait_until_connectable(self,timeout=30):

~/conda/envs/python/lib/python3.6/subprocess.py in __init__(self,args,bufsize,executable,stdin,stdout,stderr,preexec_fn,close_fds,shell,cwd,env,universal_newlines,startupinfo,creationflags,restore_signals,start_new_session,pass_fds,encoding,errors)
    685         (p2cread,p2cwrite,
    686          c2pread,c2pwrite,
--> 687          errread,errwrite) = self._get_handles(stdin,stderr)
    688 
    689         # We wrap OS handles *before* launching the child,otherwise a

~/conda/envs/python/lib/python3.6/subprocess.py in _get_handles(self,stderr)
   1202             else:
   1203                 # Assuming file-like object
-> 1204                 c2pwrite = stdout.fileno()
   1205 
   1206             if stderr is None:

AttributeError: 'str' object has no attribute 'fileno'
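The last frames show subprocess.Popen calling stdout.fileno() on a string. In the code above, log_file='ff_log.log' is a file name, while Popen expects an open file object (or a file descriptor) for stdout, so a likely fix is to pass an opened file instead, for example log_file=open('ff_log.log', 'wb'). The sketch below reproduces the same AttributeError with the standard library only; run_with_log is a hypothetical helper and the child command is just a placeholder for the browser process.

```python
import subprocess
import sys


def run_with_log(log_target):
    """Start a short child process whose output goes to log_target; return its exit code."""
    proc = subprocess.Popen(
        [sys.executable, "-c", "print('hello from child')"],
        stdout=log_target,
        stderr=subprocess.STDOUT,
    )
    return proc.wait()


if __name__ == "__main__":
    # A file name (str) is not a file object, so Popen fails before starting the child.
    try:
        run_with_log("ff_log.log")
    except AttributeError as exc:
        print("string path:", exc)

    # An opened file works: the child's output is written into it.
    with open("ff_log.log", "wb") as log_file:
        print("open file: exit code", run_with_log(log_file))
```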

I do not know what causes this error. I checked the path values individually and they look correct:

  • binary returns: '/resources/StockScreener/firefox/firefox'
  • driver returns: '/resources/StockScreener/geckodriver'
  • ff_binary returns:

I also tried using GeckoDriverManager this way:

import os
import selenium
from selenium import webdriver
from selenium.webdriver import Firefox
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from webdriver_manager.firefox import GeckoDriverManager


cap = DesiredCapabilities().FIREFOX
cap["marionette"] = False
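driver = os.path.normpath(os.path.join(os.getcwd(),'geckodriver'))
binary = os.path.normpath(os.path.join(os.getcwd(),'firefox','firefox'))
ff_binary = webdriver.firefox.firefox_binary.FirefoxBinary(firefox_path=binary,log_file='ff_log.log')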
#browser = webdriver.Firefox(firefox_binary=ff_binary,executable_path=driver)
driver = webdriver.Firefox(executable_path=GeckoDriverManager().install()) 

and it returns:

[WDM] - Driver [/home/jupyterlab/.wdm/drivers/geckodriver/linux64/v0.27.0/geckodriver] found in cache

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-32-012cab2ea574> in <module>
     13 ff_binary = webdriver.firefox.firefox_binary.FirefoxBinary(firefox_path=binary,log_file='ff_log.log')
     14 #browser = webdriver.Firefox(firefox_binary=ff_binary,executable_path=driver)
---> 15 driver = webdriver.Firefox(executable_path=GeckoDriverManager().install())
     16 #browser.get('http://google.com/')
     17 #Simple assignment

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/firefox/webdriver.py in __init__(self,log_file)
     45         self.command_line = None
     46         if self._start_cmd is None:
---> 47             self._start_cmd = self._get_firefox_start_cmd()
     48         if not self._start_cmd.strip():
     49             raise WebDriverException(

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/firefox/firefox_binary.py in _get_firefox_start_cmd(self)
    167                 raise RuntimeError(
    168                     "Could not find firefox in your system PATH." +
--> 169                     " Please specify the firefox binary location or install firefox")
    170         return start_cmd
    171 

RuntimeError: Could not find firefox in your system PATH. Please specify the firefox binary location or install firefox

Note that in all of these cases I have:

jupyterlab@jupyterlab-sps:/resources/testDevelop/firefox$ whereis firefox
firefox: /resources/testDevelop/firefox /resources/testDevelop/firefox/firefox.sig /resources/testDevelop/firefox/firefox
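whereis only shows that the files exist; it does not show that the Firefox binary can actually start in this environment (a missing shared library, for instance, would stop it). As a hedged diagnostic, the sketch below runs a binary with --version and returns its exit code and output. probe_binary is a hypothetical helper, and the Firefox path is the one from the listing above.

```python
import subprocess


def probe_binary(path, args=("--version",), timeout=30):
    """Run `path` with `args`.

    Returns (exit code, combined output), or (None, error text) if the
    process cannot be started or does not finish in time.
    """
    try:
        done = subprocess.run(
            [path, *args],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,  # text output; works on Python 3.6
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return None, str(exc)
    return done.returncode, done.stdout.strip()


if __name__ == "__main__":
    print(probe_binary("/resources/testDevelop/firefox/firefox"))
    print(probe_binary("/resources/testDevelop/geckodriver"))
```

A non-zero exit code or an error message here would point at the environment rather than at the Selenium code.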

Finally, if I only write:

import os
import selenium
from selenium import webdriver
from selenium.webdriver import Firefox
#from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
#from webdriver_manager.firefox import GeckoDriverManager

driver=Firefox(executable_path='/resources/testDevelop/geckodriver')

after restarting the kernel without installing webdriver-manager, I get the following error:

---------------------------------------------------------------------------
SessionNotCreatedException                Traceback (most recent call last)
<ipython-input-2-89dbd2507c70> in <module>
      6 #from webdriver_manager.firefox import GeckoDriverManager
      7 
----> 8 driver=Firefox(executable_path='/resources/testDevelop/geckodriver')
      9 #cap = DesiredCapabilities().FIREFOX
     10 #cap["marionette"] = False

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/firefox/webdriver.py in __init__(self,keep_alive)
    172                 command_executor=executor,
    173                 desired_capabilities=capabilities,
--> 174                 keep_alive=True)
    175 
    176         # Selenium remote

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py in __init__(self,command_executor,browser_profile,keep_alive,file_detector,options)
    155             warnings.warn("Please use FirefoxOptions to set browser profile",
    156                           DeprecationWarning,stacklevel=2)
--> 157         self.start_session(capabilities,browser_profile)
    158         self._switch_to = SwitchTo(self)
    159         self._mobile = Mobile(self)

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py in start_session(self,browser_profile)
    250         parameters = {"capabilities": w3c_caps,
    251                       "desiredCapabilities": capabilities}
--> 252         response = self.execute(Command.NEW_SESSION,parameters)
    253         if 'sessionId' not in response:
    254             response = response['value']

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py in execute(self,driver_command,params)
    319         response = self.command_executor.execute(driver_command,params)
    320         if response:
--> 321             self.error_handler.check_response(response)
    322             response['value'] = self._unwrap_value(
    323                 response.get('value',None))

~/conda/envs/python/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py in check_response(self,response)
    240                 alert_text = value['alert'].get('text')
    241             raise exception_class(message,screen,stacktrace,alert_text)
--> 242         raise exception_class(message,stacktrace)
    243 
    244     def _value_or_default(self,obj,key,default):

SessionNotCreatedException: Message: Unable to find a matching set of capabilities
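With geckodriver, "Unable to find a matching set of capabilities" can be reported when the driver is unable to locate or start Firefox. A minimal sketch, assuming Selenium 3.x (the version family the tracebacks suggest) and assuming the notebook server has no display, so the browser has to run headless, passes the binary location explicitly through the Firefox Options object. firefox_paths and start_headless_firefox are hypothetical helper names, and this has not been verified in the IBM environment.

```python
import os

BASE_DIR = "/resources/testDevelop"  # development folder from the listing above


def firefox_paths(base_dir=BASE_DIR):
    """Return (firefox binary, geckodriver) paths under base_dir."""
    return (
        os.path.join(base_dir, "firefox", "firefox"),
        os.path.join(base_dir, "geckodriver"),
    )


def start_headless_firefox(base_dir=BASE_DIR):
    """Start Firefox through geckodriver with an explicit binary path (Selenium 3.x keywords)."""
    # Imported here so the path helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    binary, driver_path = firefox_paths(base_dir)

    options = Options()
    options.binary_location = binary   # tell geckodriver where Firefox lives
    options.add_argument("-headless")  # assumption: no display on the server

    # executable_path is the Selenium 3.x keyword for the geckodriver location.
    return webdriver.Firefox(options=options, executable_path=driver_path)
```

Calling start_headless_firefox() would return a driver whose get() and quit() methods can then be used as usual. In Selenium 4 the executable_path keyword is replaced by a Service object, so the last line would need adjusting there.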

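This problem seems to be affected by many version updates, so new releases may introduce incompatibilities and further problems. How can I solve it? Alternatively, can you suggest another similarly easy way to scrape web pages that run JavaScript?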

Solution

No working solution to this problem has been found yet.
