
Openresty+Lua+Memcached Anti-Crawler Strategy

Popularity: 19   Published: 2024-01-09 16:42:17.0

http://www.07net01.com/2015/04/822090.html

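Nginx was replaced outright with Openresty. Nginx's embedded Lua, together with a Memcached instance, implements an anti-crawler check that does not depend on the backend (similar to CloudFlare's CAPTCHA challenge). Any client whose IP has an identify_IP key in Memcached is redirected to identify.php for handling. identify.php can verify that the visitor is human with a CAPTCHA or with JavaScript; once verification passes, the identify_IP key is deleted and that IP can continue browsing.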

server {
    #...
    location / {
        index index.php;
    }

    location ~ \.php$ {
        content_by_lua '
            local uri = ngx.var.uri
            -- The verification page itself must never be intercepted.
            if uri == "/identify.php" then
                ngx.exec("@bypass")
                return
            end

            local clientIP = ngx.var.remote_addr
            local memcached = require "resty.memcached"
            local memc, err = memcached:new()
            if not memc then
                ngx.say("failed to instantiate memc: ", err)
                return
            end

            local ok, err = memc:connect("127.0.0.1", 11211)
            if not ok then
                ngx.say("failed to connect: ", err)
                return
            end

            -- A lookup error lets the request through instead of blocking it.
            local res, flags, err = memc:get("identify_"..clientIP)
            if err then
                ngx.exec("@bypass")
                return
            end

            -- Flagged IPs are sent to the verification page.
            if res == "1" then
                ngx.exec("@identify")
                return
            end

            ngx.exec("@bypass")
        ';
    }

    location @bypass {
        #echo 'bypass';
        #rewrite break
        fastcgi_pass unix:/var/run/php5-fpm.sock;
        fastcgi_index  index.php;
        fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
        include        fastcgi_params;
    }

    location @identify {
        #echo 'identify';
        #identify.php
        rewrite ^/(.*)$ /identify.php?url=$request_uri redirect;       #redirect
    }

    location ~ /\.ht {
        deny  all;
    }
}
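
The configuration only reads the identify_IP flag; clearing it after a successful check is left to identify.php, which the post does not show. As a rough sketch of that step, written in Python as a stand-in for the PHP script, the key can be removed with the Memcached text protocol's delete command. The function names build_delete_command and clear_flag are hypothetical, and the host and port simply mirror the values used in the Lua code.

```python
import socket


def build_delete_command(ip: str) -> bytes:
    """Build the text-protocol command that removes the identify_<ip> key."""
    return f"delete identify_{ip}\r\n".encode("ascii")


def clear_flag(ip: str, host: str = "127.0.0.1", port: int = 11211) -> bool:
    """Send the delete command; return True if the key existed and was removed.

    Hypothetical helper: call it only after the CAPTCHA or JavaScript
    check has passed for this client IP.
    """
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(build_delete_command(ip))
        reply = sock.recv(64)
    # Memcached answers "DELETED" on success and "NOT_FOUND" otherwise.
    return reply.startswith(b"DELETED")
```

After clear_flag returns, the Lua check no longer finds the key, so the next request from that IP goes to @bypass.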

The identify_IP keys can be set automatically by analyzing the Nginx logs. AWK extracts the last 10 minutes of the access log:

tac chd_access.log | awk 'BEGIN{ "date -d \"-10 minute\" +\"%H:%M:%S\"" | getline min5 } { if (substr($4, 14) > min5) print; else exit;}' | tac
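
The one-liner compares only the HH:MM:SS part of the timestamp, so it is likely to misbehave in the minutes just after midnight. A possible alternative, sketched here in Python, parses the full date instead. It assumes the default combined log format with a timestamp such as [09/Jan/2024:16:42:17 +0800], ignores the zone offset, and relies on an English locale for the month abbreviation; parse_time and recent_lines are hypothetical names.

```python
from datetime import datetime, timedelta


def parse_time(line: str) -> datetime:
    """Extract the timestamp of one access-log line (zone offset ignored)."""
    # The timestamp sits between the first '[' and the following ']'.
    stamp = line.split("[", 1)[1].split("]", 1)[0].split()[0]
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")


def recent_lines(lines, now, minutes=10):
    """Return the lines logged within the last `minutes` before `now`."""
    cutoff = now - timedelta(minutes=minutes)
    return [line for line in lines if parse_time(line) > cutoff]
```

Unlike the tac-based pipeline, this scans every line rather than stopping at the first old entry, so on a large log it would be worth feeding it only the tail of the file.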

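Then write a Python script, run from cron, to analyze the result: for example, find the clients that requested more than 100 pages within 10 minutes and insert them into Memcached.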
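
A minimal sketch of such a script follows. It assumes the client IP is the first field of each log line and writes the value "1" that the Lua check compares against. The function names, the ttl parameter, and the raw-socket client are illustrative choices, not the original author's code; ttl defaults to 0 (no expiry) so that only the verification step removes the key.

```python
import socket
from collections import Counter


def count_requests(lines):
    """Count requests per client IP (first whitespace-separated field)."""
    return Counter(line.split(None, 1)[0] for line in lines if line.strip())


def over_threshold(counts, limit=100):
    """Return the IPs with more than `limit` requests, sorted."""
    return sorted(ip for ip, n in counts.items() if n > limit)


def build_set_command(ip, ttl=0):
    """Text-protocol command storing "1" under identify_<ip>."""
    # Format: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
    return f"set identify_{ip} 0 {ttl} 1\r\n1\r\n".encode("ascii")


def flag_ips(ips, host="127.0.0.1", port=11211):
    """Flag every IP in `ips`; Memcached should answer STORED for each."""
    with socket.create_connection((host, port), timeout=2) as sock:
        for ip in ips:
            sock.sendall(build_set_command(ip))
            sock.recv(64)
```

A cron entry could pipe the filtered log into a small wrapper that calls flag_ips(over_threshold(count_requests(sys.stdin))).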
