Problem description
I finished editing a script that checks whether a URL requires WWW basic authentication and prints the result for each URL, as in the following script:
#!/usr/bin/python
# Importing libraries
from urllib2 import urlopen, HTTPError
import socket
import urllib2
import threading
import time

# Setting up variables
url = open("oo.txt", 'r')
response = None

# Executing commands
start = time.time()
for line in url:
    try:
        response = urlopen(line, timeout=1)
    except HTTPError as exc:
        # A 401 unauthorized will raise an exception
        response = exc
    except socket.timeout:
        print ("{0} | Request timed out !!".format(line))
    except urllib2.URLError:
        print ("{0} | Access error !!".format(line))
    auth = response and response.info().getheader('WWW-Authenticate')
    if auth and auth.lower().startswith('basic'):
        print "requires basic authentication"
    elif socket.timeout or urllib2.URLError:
        print "Yay"
    else:
        print "Not requires basic authentication"
print "Elapsed Time: %s" % (time.time() - start)
There is one thing I need help with in the script: I want it to check all 10 URLs together and produce the results for all of them at once in a text file. I read about multithreading and multiprocessing, but could not find a simple approach that matches my case.
I also have a problem in the results when a timeout or URL error occurs: the script prints the result on two lines, like this:

http://www.test.test
| Access error !!

I want it on one line, so why is it shown on two?

Any help with this? Thanks in advance.
Answer 1
The package concurrent.futures makes it very easy to work with concurrency in Python. You define a function check_url that should be called for each URL. Then you can use the map function to apply that function to each URL in parallel and iterate over the return values.
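As for the two-line output: iterating over a file keeps each line's trailing newline character, so formatting the raw line breaks the output right after the URL. Stripping the newline fixes it. A minimal sketch with a hypothetical line value:

```python
# File iteration keeps each line's trailing '\n', so formatting the raw
# line splits the output across two lines.
line = "http://www.test.test\n"          # what "for line in url" yields
two_lines = "{0} | Access error !!".format(line)
one_line = "{0} | Access error !!".format(line.rstrip('\n'))
print(two_lines)  # URL and message end up on separate lines
print(one_line)   # URL and message stay on one line
```

This is why the code below reads the URLs with line.rstrip('\n').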
#! /usr/bin/env python3
import concurrent.futures
import urllib.error
import urllib.request
import socket

def load_urls(pathname):
    with open(pathname, 'r') as f:
        return [line.rstrip('\n') for line in f]

class BasicAuth(Exception): pass

class CheckBasicAuthHandler(urllib.request.BaseHandler):
    def http_error_401(self, req, fp, code, msg, hdrs):
        if hdrs.get('WWW-Authenticate', '').lower().startswith('basic'):
            raise BasicAuth()
        return None

def check_url(url):
    try:
        opener = urllib.request.build_opener(CheckBasicAuthHandler())
        with opener.open(url, timeout=1) as u:
            return 'requires no authentication'
    except BasicAuth:
        return 'requires basic authentication'
    except socket.timeout:
        return 'request timed out'
    except urllib.error.URLError as e:
        return 'access error ({!r})'.format(e.reason)

if __name__ == '__main__':
    urls = load_urls('/tmp/urls.txt')
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        for url, result in zip(urls, executor.map(check_url, urls)):
            print('{}: {}'.format(url, result))
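If the results should also land in a single text file in one go rather than on stdout, the final loop can collect the (url, result) pairs and write them out together. A minimal sketch, where the output path results.txt, the example URLs, and the stubbed check_url are assumptions for illustration (the stub stands in for the real check_url above so the sketch is self-contained):

```python
import concurrent.futures

def check_url(url):
    # stub standing in for the real check_url above
    return 'requires no authentication'

urls = ['http://a.example', 'http://b.example']
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    # materialize the results so they can be written out in one pass
    results = list(executor.map(check_url, urls))

# write every "url: result" line to the file at once
with open('results.txt', 'w') as out:
    for url, result in zip(urls, results):
        out.write('{}: {}\n'.format(url, result))
```

executor.map returns the results in the same order as the input URLs, so zip pairs each URL with its own result.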