A Python implementation of an inverted index
Tags: python, index, search, documents
2016-02-01 22:35
Copyright notice: this is the blogger's original article and may not be reproduced without the blogger's permission.
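- Used spider.py to scrape 10 bilingual (Chinese and English) Andersen fairy tales and saved them in the "documents_cn" directory
- Used inverted_index_cn.py to build an inverted index over the documents in the "documents_cn" directory
- Queried the positions of "第三根火柴", "kindled third", and "kindled match"
- Obtained the following results
Note: the search function first looks for a phrase match (every Chinese character or word is at distance 1 from the previous one); if that returns nothing, it falls back to a proximity match (distance 2 or distance 3).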
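The phrase-then-proximity lookup described in the note can be sketched as follows. This is a minimal Python 3 illustration and not the code from inverted_index_cn.py: the names `tokenize`, `build_index`, `match_with_gap`, and `search` are hypothetical, the toy document is invented, jieba segmentation and stop-word removal are left out, and each Chinese character is treated as its own token. It also assumes that "distance 2 or distance 3" means "at most 2, then at most 3 positions apart".

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase English words and single CJK characters, in order."""
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text.lower())


def build_index(docs):
    """Positional inverted index: token -> {doc_id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, token in enumerate(tokenize(text)):
            index[token][doc_id].append(pos)
    return index


def match_with_gap(index, tokens, max_gap):
    """Find start positions where the tokens occur in order, each one
    at most max_gap positions after the previous one."""
    if not tokens or any(t not in index for t in tokens):
        return {}
    # Only documents that contain every query token can match.
    doc_ids = set(index[tokens[0]])
    for t in tokens[1:]:
        doc_ids &= set(index[t])
    hits = {}
    for doc_id in doc_ids:
        starts = []
        for start in index[tokens[0]][doc_id]:
            # frontier: positions where the most recently matched token may sit
            frontier = {start}
            for t in tokens[1:]:
                frontier = {p for p in index[t][doc_id]
                            for q in frontier if 0 < p - q <= max_gap}
                if not frontier:
                    break
            if frontier:
                starts.append(start)
        if starts:
            hits[doc_id] = starts
    return hits


def search(index, query):
    """Try an exact phrase (gap 1) first, then widen the gap to 2 and 3."""
    tokens = tokenize(query)
    for max_gap in (1, 2, 3):
        hits = match_with_gap(index, tokens, max_gap)
        if hits:
            return max_gap, hits
    return None, {}


# Invented toy document, not text from the scraped corpus.
docs = {"toy_doc": "She kindled a third match. 她又擦了第三根火柴。"}
index = build_index(docs)
for q in ("第三根火柴", "kindled third", "kindled match"):
    print(q, search(index, q))
```

On this toy document the three queries match at gaps 1, 2, and 3 respectively. Storing token positions per document is what makes the fallback cheap: widening the search only changes the allowed gap, and the index itself is never rebuilt.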
spider.py
<code class="hljs python has-numbering" style="display: block; padding: 0px; background-color: transparent; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; word-wrap: normal; background-position: initial initial; background-repeat: initial initial;"># Python 2 script: fetch the first 10 stories linked from the index page
# and save each one as a text file under documents_cn.
from lxml import html
import os
import sys

reload(sys)
sys.setdefaultencoding('utf-8')

seed_url = u"http://www.kekenet.com/read/essay/ats/"
x = html.parse(seed_url)
# Links to the individual stories on the index page.
spans = x.xpath("*//ul[@id='menu-list']//li/h2/a")
for span in spans[:10]:
    details_url = span.xpath("attribute::href")[0]
    xx = html.parse(details_url)
    # File name: the link text with spaces replaced by underscores.
    name = 'documents_cn//' + span.text.replace(u' ', u'_')
    f = open(name, 'a')
    try:
        contents = xx.xpath("//div[@id='article']//p/text()")
        for content in contents:
            if len(str(content)) > 1:  # skip empty or one-character fragments
                f.write(content.encode('raw_unicode_escape') + '\n')
    except Exception, e:
        # On any failure, report it and discard the partially written file.
        print "wrong!!!!", e
        f.close()
        os.remove(name)
    else:
        f.close()
</code>
inverted_index_cn.py
<code class="hljs python has-numbering" style="display: block; padding: 0px; background-color: transparent; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; word-wrap: normal; background-position: initial initial; background-repeat: initial initial;"># coding:utf-8
import os
import jieba
import re
import sys

reload(sys)
sys.setdefaultencoding('utf-8')

# English stop words.
_STOP_WORDS = frozenset([
    'a', 'about', 'above', 'across', 'after', 'afterwards', 'again',
    'against', 'all', 'almost', 'alone', 'along', 'already', 'also', 'although',
    'always', 'am', 'among', 'amongst', 'amoungst', 'amount', 'an', 'and', 'another',
    'any', 'anyhow', 'anyone', 'anything', 'anyway', 'anywhere', 'are', 'around', 'as',
    'at', 'back', 'be', 'became', 'because', 'become', 'becomes', 'becoming', 'been',
    'before', 'beforehand', 'behind', 'being', 'below', 'beside', 'besides',
    'between', 'beyond', 'bill', 'both', 'bottom', 'but', 'by', 'call', 'can',
    'cannot', 'cant', 'co', 'con', 'could', 'couldnt', 'cry', 'de', 'describe',
    'detail', 'do', 'done', 'down', 'due', 'during', 'each', 'eg', 'eight',
    'either', 'eleven', 'else', 'elsewhere', 'empty', 'enough', 'etc', 'even',
    'ever', 'every', 'everyone', 'everything', 'everywhere', 'except', 'few',
    'fifteen', 'fifty', 'fill', 'find', 'fire', 'first', 'five', 'for', 'former',
    'formerly', 'forty', 'found', 'four', 'from', 'front', 'full', 'further', 'get',
    'give', 'go', 'had', 'has', 'hasnt', 'have', 'he', 'hence', 'her', 'here',
    'hereafter', 'hereby', 'herein', 'hereupon', 'hers', 'herself', 'him',
    'himself', 'his', 'how', 'however', 'hundred', 'ie', 'if', 'in', 'inc',
    'indeed', 'interest', 'into', 'is', 'it', 'its', 'itself', 'keep', 'last',
    'latter', 'latterly', 'least', 'less', 'ltd', 'made', 'many', 'may', 'me',
    'meanwhile', 'might', 'mill', 'mine', 'more', 'moreover', 'most', 'mostly',
    'move', 'much', 'must', 'my', 'myself', 'name', 'namely', 'neither', 'never',
    'nevertheless', 'next', 'nine', 'no', 'nobody', 'none', 'noone', 'nor', 'not',
    'nothing', 'now', 'nowhere', 'of', 'off', 'often', 'on', 'once', 'one', 'only',
    'onto', 'or', 'other', 'others', 'otherwise', 'our', 'ours', 'ourselves', 'out',
    'over', 'own', 'part', 'per', 'perhaps', 'please', 'put', 'rather', 're', 'same',
    'see', 'seem', 'seemed', 'seeming', 'seems', 'serious', 'several', 'she',
    'should', 'show', 'side', 'since', 'sincere', 'six', 'sixty', 'so', 'some',
    'somehow', 'someone', 'something', 'sometime', 'sometimes', 'somewhere',
    'still', 'such', 'system', 'take', 'ten', 'than', 'that', 'the', 'their',
    'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby',
    'therefore', 'therein', 'thereupon', 'these', 'they', 'thick', 'thin', 'third',
    'this', 'those', 'though', 'three', 'through', 'throughout', 'thru', 'thus',
    'to', 'together', 'too', 'top', 'toward', 'towards', 'twelve', 'twenty', 'two',
    'un', 'under', 'until', 'up', 'upon', 'us', 'very', 'via', 'was', 'we', 'well',
    'were', 'what', 'whatever', 'when', 'whence', 'whenever', 'where', 'whereafter',
    'whereas', 'whereby', 'wherein', 'whereupon', 'wherever', 'whether', 'which',
    'while', 'whither', 'who', 'whoever', 'whole', 'whom', 'whose', 'why', 'will',
    'with', 'within', 'without', 'would', 'yet', 'you', 'your', 'yours', 'yourself',
    <span class="hljs-string" 
style="color: rgb(0, 136, 0); box-sizing: border-box;">'yourselves'</span>, <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'the'</span>])<span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">word_split</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(text)</span>:</span>word_list = []pattern = re.compile(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'[\u4e00-\u9fa5]+'</span>)jieba_list = list(jieba.cut(text))time = {}<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i, c <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> enumerate(jieba_list):<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> c <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> time: <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># record appear time</span>time[c] += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">else</span>:time.setdefault(c, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>) != <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> pattern.search(c): <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># if Chinese</span>word_list.append((len(word_list), (text.<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; 
background-repeat: initial initial;">index</span>(c, time[c]), c)))<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">continue</span><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> c.isalnum(): <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># if English or number</span>word_list.append((len(word_list), (text.<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>(c, time[c]), c.lower()))) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># include normalize</span><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> word_list<span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">words_cleanup</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(words)</span>:</span>cleaned_words = []<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>, (offset, word) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> words: <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># words-(word <span style="color: rgb(0, 0, 0); box-sizing: border-box; background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span> for search,(letter offset for display,word))</span><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: 
border-box;">if</span> word <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> _STOP_WORDS:<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">continue</span>cleaned_words.append((<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>, (offset, word)))<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> cleaned_words<span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">word_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span></span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(text)</span>:</span>words = word_split(text)words = words_cleanup(words)<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> words<span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;"><span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span></span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(text)</span>:</span><span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 
255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span> = {}<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>, (offset, word) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> word_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>(text):locations = <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>.setdefault(word, [])locations.append((<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>, offset))<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span><span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;"><span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>_add</span><span 
class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>, doc_id, doc_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>)</span>:</span><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> word, locations <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> doc_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>.iteritems():indices = <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>.setdefault(word, {})indices[doc_id] = locations<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span><span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">search</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>, query)</span>:</span>words = [word <span class="hljs-keyword" style="color: rgb(0, 0, 136); 
box-sizing: border-box;">for</span> _, (offset, word) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> word_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>(query) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> word <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>] <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># query_words_list</span>results = [set(<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>[word].keys()) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> word <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> words]<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># x = map(lambda old: old+1, x) </span>doc_set = reduce(<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">lambda</span> x, y: x & y, results) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> results <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">else</span> []precise_doc_dic = {}<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> doc_set:<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> doc <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> 
doc_set:<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>_list = [[indoff[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> indoff <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>[word][doc]] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> word <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> words]offset_list = [[indoff[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> indoff <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>[word][doc]] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> word <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> words]precise_doc_dic = precise(precise_doc_dic, doc, <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>_list, offset_list, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>) <span class="hljs-comment" style="color: rgb(136, 0, 0); 
box-sizing: border-box;"># phrase query (distance 1)</span>precise_doc_dic = precise(precise_doc_dic, doc, <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>_list, offset_list, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># proximity query (distance 2)</span>precise_doc_dic = precise(precise_doc_dic, doc, <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>_list, offset_list, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">3</span>) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># proximity query (distance 3)</span><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> precise_doc_dic<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">else</span>:<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> {}<span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">precise</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(precise_doc_dic, doc, <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>_list, offset_list, range)</span>:</span><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> precise_doc_dic:<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> range != <span 
class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>:<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> precise_doc_dic <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># a phrase match was already found, so skip the proximity query</span>phrase_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span> = reduce(<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">lambda</span> x, y: set(map(<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">lambda</span> old: old + range, x)) & set(y), <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>_list)phrase_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span> = map(<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">lambda</span> x: x - (len(<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>_list) - <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>) * range, phrase_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>)<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> len(phrase_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial 
initial;">index</span>):phrase_offset = []<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> po <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> phrase_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>:phrase_offset.append(offset_list[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>_list[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>].<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>(po)]) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># offset_list[0] holds the character offsets of the first query word</span>precise_doc_dic[doc] = phrase_offset<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> precise_doc_dic<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> __name__ == <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'__main__'</span>:<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># Build <span style="color: rgb(0, 0, 0); box-sizing: border-box; background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">Inverted</span>-<span style="color: rgb(0, 0, 0); box-sizing: border-box; background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">Index</span> for documents</span><span style="box-sizing: 
border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span> = {}<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># documents = {}</span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#</span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># doc1 = u"开发者可以指定自己自定义的词典,以便包含jieba词库里没有的词"</span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># doc2 = u"军机处长到底是谁,<span style="color: rgb(0, 0, 0); box-sizing: border-box; background-color: rgb(255, 255, 102); background-position: initial initial; background-repeat: initial initial;">Python</span> Perl"</span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#</span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># documents.setdefault("doc1", doc1)</span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># documents.setdefault("doc2", doc2)</span>documents = {}<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> filename <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> os.listdir(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'documents_cn'</span>):f = open(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'documents_cn//'</span> + filename).read()documents.setdefault(filename.decode(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'utf-8'</span>), f)<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> doc_id, text <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> documents.iteritems():doc_<span style="box-sizing: border-box; color: rgb(0, 
0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span> = <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>(text)<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>_add(<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>, doc_id, doc_<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>)<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># Print <span style="color: rgb(0, 0, 0); box-sizing: border-box; background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">Inverted</span>-<span style="color: rgb(0, 0, 0); box-sizing: border-box; background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">Index</span></span><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> word, doc_locations <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> <span style="box-sizing: border-box; color: rgb(0, 0, 0); 
background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>.iteritems():<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">print</span> word, doc_locations<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># Run the example queries and print the results</span>queries = [<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'第三根火柴'</span>, <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'kindled third'</span>, <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'kindled match'</span>]<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> query <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> queries:result_docs = search(<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(102, 255, 255); background-position: initial initial; background-repeat: initial initial;">inverted</span>, query)<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">print</span> <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">"Search for '%s': %s"</span> % (query, <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u','</span>.join(result_docs.keys())) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># %s formats with str(); %r formats with repr()</span><span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">extract_text</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(doc, <span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: 
initial initial; background-repeat: initial initial;">index</span>)</span>:</span><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> documents[doc].decode(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'utf-8'</span>)[<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span>:<span style="box-sizing: border-box; color: rgb(0, 0, 0); background-color: rgb(255, 204, 153); background-position: initial initial; background-repeat: initial initial;">index</span> + <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">30</span>].replace(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'\n'</span>, <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">' '</span>)<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> result_docs:<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> doc, offsets <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> result_docs.items():<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> offset <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> offsets:<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">print</span> <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">' - %s...'</span> % extract_text(doc, offset)<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">else</span>:<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">print</span> <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'Nothing found!'</span><span 
class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">print</span> </code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: border-box; 
padding: 0px 5px;">92</li><li style="box-sizing: border-box; padding: 0px 5px;">93</li><li style="box-sizing: border-box; padding: 0px 5px;">94</li><li style="box-sizing: border-box; padding: 0px 5px;">95</li><li style="box-sizing: border-box; padding: 0px 5px;">96</li><li style="box-sizing: border-box; padding: 0px 5px;">97</li><li style="box-sizing: border-box; padding: 0px 5px;">98</li><li style="box-sizing: border-box; padding: 0px 5px;">99</li><li style="box-sizing: border-box; padding: 0px 5px;">100</li><li style="box-sizing: border-box; padding: 0px 5px;">101</li><li style="box-sizing: border-box; padding: 0px 5px;">102</li><li style="box-sizing: border-box; padding: 0px 5px;">103</li><li style="box-sizing: border-box; padding: 0px 5px;">104</li><li style="box-sizing: border-box; padding: 0px 5px;">105</li><li style="box-sizing: border-box; padding: 0px 5px;">106</li><li style="box-sizing: border-box; padding: 0px 5px;">107</li><li style="box-sizing: border-box; padding: 0px 5px;">108</li><li style="box-sizing: border-box; padding: 0px 5px;">109</li><li style="box-sizing: border-box; padding: 0px 5px;">110</li><li style="box-sizing: border-box; padding: 0px 5px;">111</li><li style="box-sizing: border-box; padding: 0px 5px;">112</li><li style="box-sizing: border-box; padding: 0px 5px;">113</li><li style="box-sizing: border-box; padding: 0px 5px;">114</li><li style="box-sizing: border-box; padding: 0px 5px;">115</li><li style="box-sizing: border-box; padding: 0px 5px;">116</li><li style="box-sizing: border-box; padding: 0px 5px;">117</li><li style="box-sizing: border-box; padding: 0px 5px;">118</li><li style="box-sizing: border-box; padding: 0px 5px;">119</li><li style="box-sizing: border-box; padding: 0px 5px;">120</li><li style="box-sizing: border-box; padding: 0px 5px;">121</li><li style="box-sizing: border-box; padding: 0px 5px;">122</li><li style="box-sizing: border-box; padding: 0px 5px;">123</li><li style="box-sizing: border-box; padding: 0px 
5px;">124</li><li style="box-sizing: border-box; padding: 0px 5px;">125</li><li style="box-sizing: border-box; padding: 0px 5px;">126</li><li style="box-sizing: border-box; padding: 0px 5px;">127</li><li style="box-sizing: border-box; padding: 0px 5px;">128</li><li style="box-sizing: border-box; padding: 0px 5px;">129</li><li style="box-sizing: border-box; padding: 0px 5px;">130</li><li style="box-sizing: border-box; padding: 0px 5px;">131</li><li style="box-sizing: border-box; padding: 0px 5px;">132</li><li style="box-sizing: border-box; padding: 0px 5px;">133</li><li style="box-sizing: border-box; padding: 0px 5px;">134</li><li style="box-sizing: border-box; padding: 0px 5px;">135</li><li style="box-sizing: border-box; padding: 0px 5px;">136</li><li style="box-sizing: border-box; padding: 0px 5px;">137</li><li style="box-sizing: border-box; padding: 0px 5px;">138</li><li style="box-sizing: border-box; padding: 0px 5px;">139</li><li style="box-sizing: border-box; padding: 0px 5px;">140</li><li style="box-sizing: border-box; padding: 0px 5px;">141</li><li style="box-sizing: border-box; padding: 0px 5px;">142</li><li style="box-sizing: border-box; padding: 0px 5px;">143</li><li style="box-sizing: border-box; padding: 0px 5px;">144</li><li style="box-sizing: border-box; padding: 0px 5px;">145</li><li style="box-sizing: border-box; padding: 0px 5px;">146</li><li style="box-sizing: border-box; padding: 0px 5px;">147</li><li style="box-sizing: border-box; padding: 0px 5px;">148</li><li style="box-sizing: border-box; padding: 0px 5px;">149</li><li style="box-sizing: border-box; padding: 0px 5px;">150</li><li style="box-sizing: border-box; padding: 0px 5px;">151</li><li style="box-sizing: border-box; padding: 0px 5px;">152</li><li style="box-sizing: border-box; padding: 0px 5px;">153</li><li style="box-sizing: border-box; padding: 0px 5px;">154</li><li style="box-sizing: border-box; padding: 0px 5px;">155</li><li style="box-sizing: border-box; padding: 0px 
5px;">156</li><li style="box-sizing: border-box; padding: 0px 5px;">157</li><li style="box-sizing: border-box; padding: 0px 5px;">158</li><li style="box-sizing: border-box; padding: 0px 5px;">159</li><li style="box-sizing: border-box; padding: 0px 5px;">160</li><li style="box-sizing: border-box; padding: 0px 5px;">161</li><li style="box-sizing: border-box; padding: 0px 5px;">162</li><li style="box-sizing: border-box; padding: 0px 5px;">163</li><li style="box-sizing: border-box; padding: 0px 5px;">164</li><li style="box-sizing: border-box; padding: 0px 5px;">165</li><li style="box-sizing: border-box; padding: 0px 5px;">166</li><li style="box-sizing: border-box; padding: 0px 5px;">167</li><li style="box-sizing: border-box; padding: 0px 5px;">168</li><li style="box-sizing: border-box; padding: 0px 5px;">169</li><li style="box-sizing: border-box; padding: 0px 5px;">170</li><li style="box-sizing: border-box; padding: 0px 5px;">171</li><li style="box-sizing: border-box; padding: 0px 5px;">172</li><li style="box-sizing: border-box; padding: 0px 5px;">173</li><li style="box-sizing: border-box; padding: 0px 5px;">174</li><li style="box-sizing: border-box; padding: 0px 5px;">175</li><li style="box-sizing: border-box; padding: 0px 5px;">176</li><li style="box-sizing: border-box; padding: 0px 5px;">177</li><li style="box-sizing: border-box; padding: 0px 5px;">178</li><li style="box-sizing: border-box; padding: 0px 5px;">179</li><li style="box-sizing: border-box; padding: 0px 5px;">180</li><li style="box-sizing: border-box; padding: 0px 5px;">181</li><li style="box-sizing: border-box; padding: 0px 5px;">182</li></ul>
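The result-printing step at the end of the listing above can be sketched in Python 3 as follows. This is a minimal sketch, not the author's exact code: the sample `documents` and `result_docs` values are hypothetical stand-ins for the data the script builds, the `format_results` helper and the `width` parameter are introduced here for illustration, and the original `.decode('utf-8')` call is dropped on the assumption that documents are already held as Python 3 `str` values.

```python
# Hypothetical sample data. In the original script, `documents` maps a
# document name to its text and `result_docs` maps a document name to the
# character offsets where the query matched.
documents = {
    "match_girl.txt": "She kindled a third match.\nAgain the flame shot up.",
}
result_docs = {"match_girl.txt": [4, 14]}


def extract_text(doc, index, width=30):
    """Return up to `width` characters starting at `index`, with newlines flattened."""
    # Python 3 strings are already Unicode, so slicing counts characters
    # (including Chinese characters) rather than bytes.
    return documents[doc][index:index + width].replace("\n", " ")


def format_results(result_docs):
    """Build one ' - snippet...' line per match offset, or a fallback message."""
    if not result_docs:
        return ["Nothing found!"]
    return [
        " - %s..." % extract_text(doc, offset)
        for doc, offsets in result_docs.items()
        for offset in offsets
    ]


for line in format_results(result_docs):
    print(line)
```

Returning the lines as a list instead of printing inside the loop is a design choice made only for this sketch; it keeps the formatting logic easy to check separately from the output.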