Lucene 2.3.2 vs. Lucene 3.0.2 search comparison: time spent aggregating the results of three searches (a fairly specialized application)
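Preliminary summary after testing:
Lucene 3.0.2 improvements:
Search time improved by about 50%, and memory use was 3 GB lower (26.5 GB vs. 23.5 GB).
Lucene 3.0.2 drawback: across several test runs, initial index loading took longer than with Lucene 2.3.2.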
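The headline figures can be rechecked from the raw numbers in the test log. The sketch below is only an illustration: it takes the total times recorded for the three runs (a, b, c) of the second query in the quoted log, treats "old" as Lucene 2.3.2 and "new" as Lucene 3.0.2, and the class and method names are made up for this example.

```java
// Recomputes the summary figures from totals recorded in the test log.
class SpeedupCheck {
    /** Percentage by which newMs is lower than oldMs. */
    static double reductionPercent(long oldMs, long newMs) {
        return 100.0 * (oldMs - newMs) / oldMs;
    }

    public static void main(String[] args) {
        // Total times of runs a, b, c for the second query in the log.
        long[] oldTotals = {4524, 3167, 3213};
        long[] newTotals = {1872, 1544, 1419};
        for (int i = 0; i < oldTotals.length; i++) {
            System.out.printf("run %c: %.1f%% less total time%n",
                    (char) ('a' + i), reductionPercent(oldTotals[i], newTotals[i]));
        }
        // Memory after loading: 26.5 GB (old) versus 23.5 GB (new).
        System.out.printf("memory saved: %.1f GB%n", 26.5 - 23.5);
    }
}
```

For these three runs the reduction in total time comes out between roughly 51% and 59%, which is in line with the "about 50%" stated above.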
2010-1-5
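1\ Test the data volume a single search can handle
2\ Test the data volume a single search can handle, with category statistics added
Conditions:
Machine configuration
Intel(R) Xeon(R) CPU E5506 @2.13GHz (2 processors)
Memory: 32 GB
System type: 64-bit operating system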
1\
condition :bi=1+stype=0+channel=9+sf=THREE+sort=60+tis=1+nowPage=1+keyword=供应+lay=2
document num:58,293,970 (110G)
<page>
<perPage>10</perPage>
<total>4916415</total>
<count>7500</count>
<countone>7500</countone>
<counttwo>0</counttwo>
<countthree>0</countthree>
<begin>1</begin>
<end>10</end>
<time>858</time>
</page>
[whole search] total time: 936
2\
document number 116,587,940 (220G)
condition :bi=1+stype=0+channel=9+sf=THREE+sort=60+tis=1+nowPage=1+keyword=供应+lay=2
used memory
5.37G
<page>
<perPage>10</perPage>
<total>9832830</total>
<count>7500</count>
<countone>7500</countone>
<counttwo>0</counttwo>
<countthree>0</countthree>
<begin>1</begin>
<end>10</end>
<time>3635</time>
</page>
[whole search] total time: 3807
3\ With more keywords
document number 116,587,940
condition :bi=1+stype=0+channel=9+sf=THREE+sort=60+tis=1+nowPage=1+keyword=供应产品+lay=2
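Results now take more than 10 seconds to come back.
Preliminary conclusions
At 50 million documents, search with category clustering is still acceptable.
At the 100-million level, results take more than 10 seconds. With single-threaded search this is basically unusable, so a parallel algorithm needs to be considered.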
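One way to read "parallel algorithm" here is to split the index into slices, query the slices concurrently, and merge the per-slice top hits. The following is a minimal sketch under that assumption, not the setup used in these tests: `Shard`, `Hit`, and `ParallelSearchSketch` are hypothetical names, and the shards in `main` are canned stand-ins rather than real Lucene searchers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSearchSketch {
    /** One scored hit; the fields are illustrative. */
    static final class Hit {
        final int docId;
        final float score;
        Hit(int docId, float score) { this.docId = docId; this.score = score; }
    }

    /** Hypothetical stand-in for a searcher over one slice of the index. */
    interface Shard {
        List<Hit> search(String keyword, int topN) throws Exception;
    }

    /** Queries every shard concurrently and merges the per-shard top-N lists. */
    static List<Hit> search(List<Shard> shards, String keyword, int topN) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Future<List<Hit>>> pending = new ArrayList<>();
            for (Shard shard : shards) {
                pending.add(pool.submit(() -> shard.search(keyword, topN)));
            }
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> f : pending) {
                merged.addAll(f.get()); // blocks until that shard has answered
            }
            // Highest score first, then keep only the global top N.
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Two fake shards with fixed results, only to exercise the merge.
        Shard a = (kw, n) -> Arrays.asList(new Hit(1, 0.9f), new Hit(2, 0.4f));
        Shard b = (kw, n) -> Arrays.asList(new Hit(7, 0.7f));
        for (Hit h : search(Arrays.asList(a, b), "keyword", 2)) {
            System.out.println(h.docId + " " + h.score);
        }
    }
}
```

Category counts would have to be summed across shards in the same merge step. Lucene 3.0.x also ships a `ParallelMultiSearcher` class that may cover part of this; whether it suits the category statistics used here is not something these tests establish.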
[quote]condition java -Xmx24g -Xms24g -Xmn23g -Xss128k -XX:+UseConcMarkSweepGC -XX:CMSFullGCsBeforeCompaction=8 -XX:+UseCMSCompactAtFullCollection -XX:ParallelGCThreads=8 -XX:CMSInitiatingOccupancyFraction=500m
document number :12000000
1\
new version: load + load julei (category clustering): 111088, 71074 ms (about 2 min); memory 23.5 GB
old version: load + load julei: 66925 ms; memory 26.5 GB
2\bi=1+stype=0+channel=9+sf=THREE+sort=60+tis=1+lay=2+nowPage=10+keyword=技术规格
new :
a\
<page>
<perPage>10</perPage>
<total>5667685</total>
<count>7500</count>
<countone>0</countone>
<counttwo>6463</counttwo>
<countthree>1037</countthree>
<begin>91</begin>
<end>100</end>
<time>1638</time>
</page>
[whole search] total time: 1872
old
<page>
<perPage>10</perPage>
<total>5658667</total>
<count>7500</count>
<countone>0</countone>
<counttwo>6463</counttwo>
<countthree>1037</countthree>
<begin>91</begin>
<end>100</end>
<time>3354</time>
</page>
[whole search] total time: 4524
b\
<page>
<perPage>10</perPage>
<total>5667685</total>
<count>7500</count>
<countone>0</countone>
<counttwo>6463</counttwo>
<countthree>1037</countthree>
<begin>91</begin>
<end>100</end>
<time>1388</time>
</page>
[whole search] total time: 1544
old
<page>
<perPage>10</perPage>
<total>5658667</total>
<count>7500</count>
<countone>0</countone>
<counttwo>6463</counttwo>
<countthree>1037</countthree>
<begin>91</begin>
<end>100</end>
<time>2028</time>
</page>
[whole search] total time: 3167
c\
<page>
<perPage>10</perPage>
<total>5667685</total>
<count>7500</count>
<countone>0</countone>
<counttwo>6463</counttwo>
<countthree>1037</countthree>
<begin>91</begin>
<end>100</end>
<time>1295</time>
</page>
[whole search] total time: 1419
old
<page>
<perPage>10</perPage>
<total>5658667</total>
<count>7500</count>
<countone>0</countone>
<counttwo>6463</counttwo>
<countthree>1037</countthree>
<begin>91</begin>
<end>100</end>
<time>2012</time>
</page>
[whole search] total time: 3213
3\ bi=1+stype=0+channel=9+sf=THREE+sort=60+tis=1+lay=2+nowPage=10
new :
a\
<page>
<perPage>10</perPage>
<total>11658794</total>
<count>7500</count>
<countone>7500</countone>
<counttwo>0</counttwo>
<countthree>0</countthree>
<begin>91</begin>
<end>100</end>
<time>1123</time>
</page>
[whole search] total time: 1248
old:
<page>
<perPage>10</perPage>
<total>11639726</total>
<count>7500</count>
<countone>7500</countone>
<counttwo>0</counttwo>
<countthree>0</countthree>
<begin>91</begin>
<end>100</end>
<time>1841</time>
</page>
[whole search] total time: 2933
b\
<page>
<perPage>10</perPage>
<total>11658794</total>
<count>7500</count>
<countone>7500</countone>
<counttwo>0</counttwo>
<countthree>0</countthree>
<begin>91</begin>
<end>100</end>
<time>1108</time>
</page>
[whole search] total time: 1248
c\
<page>
<perPage>10</perPage>
<total>11658794</total>
<count>7500</count>
<countone>7500</countone>
<counttwo>0</counttwo>
<countthree>0</countthree>
<begin>91</begin>
<end>100</end>
<time>1045</time>
</page>
[whole search] total time: 1232
old:
<page>
<perPage>10</perPage>
<total>11639726</total>
<count>7500</count>
<countone>7500</countone>
<counttwo>0</counttwo>
<countthree>0</countthree>
<begin>91</begin>
<end>100</end>
<time>1576</time>
</page>
[whole search] total time: 2699
4\ bi=1+stype=0+channel=9+sf=THREE+sort=60+tis=1+lay=2+nowPage=10+keyword=凯迪系列散热器
new \
<page>
<perPage>10</perPage>
<total>2621837</total>
<count>7501</count>
<countone>1</countone>
<counttwo>0</counttwo>
<countthree>7500</countthree>
<begin>91</begin>
<end>100</end>
<time>562</time>
</page>
[whole search] total time: 733
old
<page>
<perPage>10</perPage>
<total>2619057</total>
<count>7500</count>
<countone>1</countone>
<counttwo>0</counttwo>
<countthree>7499</countthree>
<begin>91</begin>
<end>100</end>
<time>1014</time>
</page>
[whole search] total time: 2153
<page>
<perPage>10</perPage>
<total>2619057</total>
<count>7500</count>
<countone>1</countone>
<counttwo>0</counttwo>
<countthree>7499</countthree>
<begin>91</begin>
<end>100</end>
<time>998</time>
</page>
[whole search] total time: 2059
5\ bi=1+stype=0+channel=9+sf=THREE+sort=60+tis=1+nowPage=10+keyword=凯迪系列散热器
Lowest value taken (over 5 to 10 runs):
new:
<page>
<perPage>10</perPage>
<total>2621837</total>
<count>7501</count>
<countone>1</countone>
<counttwo>0</counttwo>
<countthree>7500</countthree>
<begin>91</begin>
<end>100</end>
<time>468</time>
</page>
[whole search] total time: 546
old
<page>
<perPage>10</perPage>
<total>2619057</total>
<count>7500</count>
<countone>1</countone>
<counttwo>0</counttwo>
<countthree>7499</countthree>
<begin>91</begin>
<end>100</end>
<time>609</time>
</page>
[whole search] total time: 1747[/quote]
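The last test above reports the lowest time over 5 to 10 runs. A small harness for that kind of measurement might look like the sketch below. It is an assumption about how such a number could be collected, not the code used for these tests; `MinOfRuns` and `fastestMillis` are hypothetical names, and the sleep in `main` is a placeholder where the real search request would go.

```java
import java.util.concurrent.Callable;

class MinOfRuns {
    /** Runs the task the given number of times and returns the fastest wall-clock time in ms. */
    static long fastestMillis(Callable<?> task, int runs) throws Exception {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.call();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            best = Math.min(best, elapsedMs);
        }
        return best;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload; a real test would issue the search request here.
        long ms = fastestMillis(() -> { Thread.sleep(20); return null; }, 5);
        System.out.println("fastest of 5 runs: " + ms + " ms");
    }
}
```

Taking the minimum filters out warm-up and garbage-collection pauses, which is useful for comparing versions but understates what a cold first request would cost.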