Python 3.7 Scrapy error: KeyError: 'Spider not found: baidu'

Popularity: 95   Published: 2023-12-12 22:17:35.0

Scrapy could not find a spider named baidu, so I followed its prompts step by step until it ran.
First I installed scrapy.
Then I created the scrapy01 project:
scrapy startproject scrapy01

Then, following the prompt, I went into the scrapy01 directory and ran scrapy genspider example example.com

Next I modified example.py, which produced the run log below, but it did not contain text such as "百度知道"...
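
Why the edit to example.py matters: `scrapy crawl baidu` looks a spider up by its `name` attribute, as the `return self._spiders[spider_name]` line in the traceback shows. The following is a minimal sketch of that lookup, not Scrapy's actual loader. `SpiderLoaderSketch`, `try_load`, and the two placeholder classes are hypothetical names, and the sketch assumes the edit changed the generated spider's name from `example` to `baidu`.

```python
class SpiderLoaderSketch:
    """Toy stand-in for a spider loader: maps each spider's `name` to its class."""

    def __init__(self, spider_classes):
        self._spiders = {cls.name: cls for cls in spider_classes}

    def load(self, spider_name):
        try:
            return self._spiders[spider_name]
        except KeyError:
            # Same message format as the one seen in the traceback.
            raise KeyError(f"Spider not found: {spider_name}")


class ExampleSpider:
    # `scrapy genspider example example.com` names the spider "example".
    name = "example"


class BaiduSpider:
    # Assumed edit: the spider's name changed to "baidu".
    name = "baidu"


def try_load(loader, spider_name):
    """Return the spider class name on success, or the error message on failure."""
    try:
        return loader.load(spider_name).__name__
    except KeyError as exc:
        return exc.args[0]


if __name__ == "__main__":
    # Before the edit: only a spider named "example" is registered.
    print(try_load(SpiderLoaderSketch([ExampleSpider]), "baidu"))
    # After the edit: the name matches what `scrapy crawl baidu` asks for.
    print(try_load(SpiderLoaderSketch([BaiduSpider]), "baidu"))
```

In other words, the argument to `scrapy crawl` has to match the spider's `name`, not the file name example.py.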

So in settings.py I changed
ROBOTSTXT_OBEY = True
to ROBOTSTXT_OBEY = False
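
The `Forbidden by robots.txt: <GET http://baidu.com/>` line in the log is why the 19:23 run crawled nothing: with ROBOTSTXT_OBEY = True, the request is checked against the site's robots.txt and dropped. Below is a rough sketch of that kind of check, using Python's standard `urllib.robotparser` as a stand-in for whatever parser Scrapy uses internally. `SAMPLE_ROBOTS_TXT` is a made-up rule set for illustration, not Baidu's actual file, and `is_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only: one named crawler is allowed
# on most paths, every other user agent is disallowed everywhere.
SAMPLE_ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A generic crawler falls under the catch-all "User-agent: *" rule.
    print(is_allowed(SAMPLE_ROBOTS_TXT, "Scrapy", "http://baidu.com/"))
    # The explicitly named crawler is only kept out of /private/.
    print(is_allowed(SAMPLE_ROBOTS_TXT, "Baiduspider", "http://baidu.com/"))
```

Setting ROBOTSTXT_OBEY = False skips this check, which is why RobotsTxtMiddleware no longer appears among the enabled downloader middlewares in the last run of the log and the request reaches http://www.baidu.com/.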

Here is the run log:

<!--
D:\pyFile>scrapy startproject scrapy01
New Scrapy project 'scrapy01', using template directory 'd:\python37-32\lib\site-packages\scrapy\templates\project', created in:
    D:\pyFile\scrapy01

You can start your first spider with:
    cd scrapy01
    scrapy genspider example example.com

D:\pyFile>cd scrapy01

D:\pyFile\scrapy01>scrapy genspider example example.com
Created spider 'example' using template 'basic' in module:
  scrapy01.spiders.example

D:\pyFile\scrapy01>cd ..

D:\pyFile>scrapy crawl baidu
Scrapy 2.5.0 - no active project

Unknown command: crawl

Use "scrapy" to see available commands

D:\pyFile>cd scrapy01

D:\pyFile\scrapy01>scrapy crawl baidu
2021-08-01 19:14:38 [scrapy.utils.log] INFO: Scrapy 2.5.0 started (bot: scrapy01)
2021-08-01 19:14:38 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.5, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.7.0, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 20.0.1 (OpenSSL 1.1.1k 25 Mar 2021), cryptography 3.4.7, Platform Windows-10-10.0.17763-SP0
2021-08-01 19:14:38 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
Traceback (most recent call last):
  File "d:\python37-32\lib\site-packages\scrapy\spiderloader.py", line 75, in load
    return self._spiders[spider_name]
KeyError: 'baidu'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\python37-32\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\python37-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\Python37-32\Scripts\scrapy.exe\__main__.py", line 7, in <module>
  File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 145, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 100, in _run_print_help
    func(*a, **kw)
  File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 153, in _run_command
    cmd.run(args, opts)
  File "d:\python37-32\lib\site-packages\scrapy\commands\crawl.py", line 22, in run
    crawl_defer = self.crawler_process.crawl(spname, **opts.spargs)
  File "d:\python37-32\lib\site-packages\scrapy\crawler.py", line 191, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "d:\python37-32\lib\site-packages\scrapy\crawler.py", line 224, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "d:\python37-32\lib\site-packages\scrapy\crawler.py", line 228, in _create_crawler
    spidercls = self.spider_loader.load(spidercls)
  File "d:\python37-32\lib\site-packages\scrapy\spiderloader.py", line 77, in load
    raise KeyError(f"Spider not found: {spider_name}")
KeyError: 'Spider not found: baidu'

D:\pyFile\scrapy01>scrapy crawl baidu
2021-08-01 19:16:00 [scrapy.utils.log] INFO: Scrapy 2.5.0 started (bot: scrapy01)
2021-08-01 19:16:00 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.5, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.7.0, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 20.0.1 (OpenSSL 1.1.1k 25 Mar 2021), cryptography 3.4.7, Platform Windows-10-10.0.17763-SP0
2021-08-01 19:16:00 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
Traceback (most recent call last):
  File "d:\python37-32\lib\site-packages\scrapy\spiderloader.py", line 75, in load
    return self._spiders[spider_name]
KeyError: 'baidu'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\python37-32\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\python37-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\Python37-32\Scripts\scrapy.exe\__main__.py", line 7, in <module>
  File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 145, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 100, in _run_print_help
    func(*a, **kw)
  File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 153, in _run_command
    cmd.run(args, opts)
  File "d:\python37-32\lib\site-packages\scrapy\commands\crawl.py", line 22, in run
    crawl_defer = self.crawler_process.crawl(spname, **opts.spargs)
  File "d:\python37-32\lib\site-packages\scrapy\crawler.py", line 191, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "d:\python37-32\lib\site-packages\scrapy\crawler.py", line 224, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "d:\python37-32\lib\site-packages\scrapy\crawler.py", line 228, in _create_crawler
    spidercls = self.spider_loader.load(spidercls)
  File "d:\python37-32\lib\site-packages\scrapy\spiderloader.py", line 77, in load
    raise KeyError(f"Spider not found: {spider_name}")
KeyError: 'Spider not found: baidu'

D:\pyFile\scrapy01>scrapy crawl baidu
2021-08-01 19:16:12 [scrapy.utils.log] INFO: Scrapy 2.5.0 started (bot: scrapy01)
2021-08-01 19:16:12 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.5, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.7.0, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 20.0.1 (OpenSSL 1.1.1k 25 Mar 2021), cryptography 3.4.7, Platform Windows-10-10.0.17763-SP0
2021-08-01 19:16:12 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
Traceback (most recent call last):
  File "d:\python37-32\lib\site-packages\scrapy\spiderloader.py", line 75, in load
    return self._spiders[spider_name]
KeyError: 'baidu'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\python37-32\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\python37-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\Python37-32\Scripts\scrapy.exe\__main__.py", line 7, in <module>
  File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 145, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 100, in _run_print_help
    func(*a, **kw)
  File "d:\python37-32\lib\site-packages\scrapy\cmdline.py", line 153, in _run_command
    cmd.run(args, opts)
  File "d:\python37-32\lib\site-packages\scrapy\commands\crawl.py", line 22, in run
    crawl_defer = self.crawler_process.crawl(spname, **opts.spargs)
  File "d:\python37-32\lib\site-packages\scrapy\crawler.py", line 191, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "d:\python37-32\lib\site-packages\scrapy\crawler.py", line 224, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "d:\python37-32\lib\site-packages\scrapy\crawler.py", line 228, in _create_crawler
    spidercls = self.spider_loader.load(spidercls)
  File "d:\python37-32\lib\site-packages\scrapy\spiderloader.py", line 77, in load
    raise KeyError(f"Spider not found: {spider_name}")
KeyError: 'Spider not found: baidu'

D:\pyFile\scrapy01>scrapy crawl baidu
2021-08-01 19:23:14 [scrapy.utils.log] INFO: Scrapy 2.5.0 started (bot: scrapy01)
2021-08-01 19:23:14 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.5, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.7.0, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 20.0.1 (OpenSSL 1.1.1k 25 Mar 2021), cryptography 3.4.7, Platform Windows-10-10.0.17763-SP0
2021-08-01 19:23:14 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2021-08-01 19:23:14 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'scrapy01',
 'NEWSPIDER_MODULE': 'scrapy01.spiders',
 'ROBOTSTXT_OBEY': True,
 'SPIDER_MODULES': ['scrapy01.spiders']}
2021-08-01 19:23:14 [scrapy.extensions.telnet] INFO: Telnet Password: 8519c1b805fe8f82
2021-08-01 19:23:14 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats']
2021-08-01 19:23:14 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2021-08-01 19:23:14 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2021-08-01 19:23:14 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2021-08-01 19:23:14 [scrapy.core.engine] INFO: Spider opened
2021-08-01 19:23:14 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-08-01 19:23:14 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2021-08-01 19:23:15 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://baidu.com/robots.txt> (referer: None)
2021-08-01 19:23:15 [scrapy.downloadermiddlewares.robotstxt] DEBUG: Forbidden by robots.txt: <GET http://baidu.com/>
2021-08-01 19:23:15 [scrapy.core.engine] INFO: Closing spider (finished)
2021-08-01 19:23:15 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 1,
 'downloader/exception_type_count/scrapy.exceptions.IgnoreRequest': 1,
 'downloader/request_bytes': 219,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 2702,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'elapsed_time_seconds': 0.357149,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2021, 8, 1, 11, 23, 15, 311116),
 'log_count/DEBUG': 2,
 'log_count/INFO': 10,
 'response_received_count': 1,
 'robotstxt/forbidden': 1,
 'robotstxt/request_count': 1,
 'robotstxt/response_count': 1,
 'robotstxt/response_status_count/200': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2021, 8, 1, 11, 23, 14, 953967)}
2021-08-01 19:23:15 [scrapy.core.engine] INFO: Spider closed (finished)

D:\pyFile\scrapy01>scrapy crawl baidu
2021-08-01 20:14:58 [scrapy.utils.log] INFO: Scrapy 2.5.0 started (bot: scrapy01)
2021-08-01 20:14:58 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.5, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.7.0, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 20.0.1 (OpenSSL 1.1.1k 25 Mar 2021), cryptography 3.4.7, Platform Windows-10-10.0.17763-SP0
2021-08-01 20:14:58 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2021-08-01 20:14:58 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'scrapy01',
 'NEWSPIDER_MODULE': 'scrapy01.spiders',
 'SPIDER_MODULES': ['scrapy01.spiders']}
2021-08-01 20:14:58 [scrapy.extensions.telnet] INFO: Telnet Password: d78762731a57c67f
2021-08-01 20:14:58 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats']
2021-08-01 20:14:59 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2021-08-01 20:14:59 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2021-08-01 20:14:59 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2021-08-01 20:14:59 [scrapy.core.engine] INFO: Spider opened
2021-08-01 20:14:59 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-08-01 20:14:59 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2021-08-01 20:14:59 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (meta refresh) to <GET http://www.baidu.com/> from <GET http://baidu.com/>
2021-08-01 20:14:59 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://www.baidu.com/> (referer: None)
[<Selector xpath='//html/head/title/text()' data='百度一下,你就知道'>]
2021-08-01 20:14:59 [scrapy.core.engine] INFO: Closing spider (finished)
2021-08-01 20:14:59 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 422,
 'downloader/request_count': 2,
 'downloader/request_method_count/GET': 2,
 'downloader/response_bytes': 1838,
 'downloader/response_count': 2,
 'downloader/response_status_count/200': 2,
 'elapsed_time_seconds': 0.366301,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2021, 8, 1, 12, 14, 59, 488070),
 'httpcompression/response_bytes': 2381,
 'httpcompression/response_count': 1,
 'log_count/DEBUG': 2,
 'log_count/INFO': 10,
 'response_received_count': 1,
 'scheduler/dequeued': 2,
 'scheduler/dequeued/memory': 2,
 'scheduler/enqueued': 2,
 'scheduler/enqueued/memory': 2,
 'start_time': datetime.datetime(2021, 8, 1, 12, 14, 59, 121769)}
2021-08-01 20:14:59 [scrapy.core.engine] INFO: Spider closed (finished)
-->