<h4 class="cat-hd fst-cat-hd ">
<i class="cat-icon fst-cat-icon active-trigger"></i>
<a class="cat-name fst-cat-name"
href="http://bosidengny.tmall.com/category-907362758.htm?search=y&catName=%D0%C2%C6%B7%D7%A8%C7%F8"
>新品专区</a>
</h4>
</li>
<li class="cat fst-cat">
<h4 class="cat-hd fst-cat-hd has-children">
<i class="cat-icon fst-cat-icon active-trigger"></i>
<a class="cat-name fst-cat-name"
href="http://bosidengny.tmall.com/category-907362759.htm?search=y&catName=%B1%A3%C5%AF%C9%CF%D7%B0"
>保暖上装</a>
</h4>
<div class="snd-pop">
<div class="snd-pop-inner">
<ul class="fst-cat-bd">
<li class="cat snd-cat">
<h4 class="cat-hd snd-cat-hd">
<i class="cat-icon snd-cat-icon"></i>
<a class="cat-name snd-cat-name"
href="http://bosidengny.tmall.com/category-907362760.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=%BC%D9%C1%BD%BC%FE%A3%A8%B3%C4%C9%C0%C1%EC%A3%A9"
>
假两件(衬衫领)
</a>
</h4>
</li>
<li class="cat snd-cat">
<h4 class="cat-hd snd-cat-hd">
<i class="cat-icon snd-cat-icon"></i>
<a class="cat-name snd-cat-name"
href="http://bosidengny.tmall.com/category-907362761.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=V%C1%EC%C9%CF%D7%B0"
>
V领上装
</a>
</h4>
</li>
Above is the category markup. This is the pattern I use to match the first-level categories:
$dafht='#<h4 class="cat-hd fst-cat-hd (.*)category-(.*).htm\?search\=y\&catName\=(.*)">(.*)</a></h4>#iUs';
preg_match_all($dafht, $fenlei, $dafenlei);
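But it matches nothing. Can anyone help?
For first-level categories I mainly need the numeric ID in category-907362761.htm and the name that follows it.
For second-level categories I need both numeric IDs, the one in category-907362761.htm and the one in parentCatId=907362759, plus the name that follows.
How should I write this? I've been stuck on it for hours.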
------Solution----------------------
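On why the posted pattern matches nothing: in the markup as posted, a line break separates the closing `"` of the href from `>`, and another separates `</a>` from `</h4>`, so the literal `">` and `</a></h4>` in the pattern are never found where they are needed. Below is a minimal sketch of patterns that tolerate that whitespace. The names `$firstPattern` and `$secondPattern` are made up for illustration, the sample is an abridged copy of the posted markup, and the patterns assume the query parameters keep the order shown in the post.

```php
<?php
// Abridged copy of the posted markup; in the question this string is $fenlei.
$fenlei = <<<'HTML'
<h4 class="cat-hd fst-cat-hd has-children">
<i class="cat-icon fst-cat-icon active-trigger"></i>
<a class="cat-name fst-cat-name"
href="http://bosidengny.tmall.com/category-907362759.htm?search=y&catName=%B1%A3%C5%AF%C9%CF%D7%B0"
>保暖上装</a>
</h4>
<h4 class="cat-hd snd-cat-hd">
<i class="cat-icon snd-cat-icon"></i>
<a class="cat-name snd-cat-name"
href="http://bosidengny.tmall.com/category-907362761.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=V%C1%EC%C9%CF%D7%B0"
>
V领上装
</a>
</h4>
HTML;

// First level: capture the category ID and the link text.
// \s+ and \s* absorb the line breaks around the href and the closing ">".
$firstPattern = '#<a class="cat-name fst-cat-name"\s+href="[^"]*category-(\d+)\.htm'
    . '\?search=y&catName=[^"&]*"\s*>\s*(.*?)\s*</a>#is';

// Second level: capture the category ID, the parentCatId, and the link text.
$secondPattern = '#<a class="cat-name snd-cat-name"\s+href="[^"]*category-(\d+)\.htm'
    . '\?search=y&parentCatId=(\d+)&[^"]*"\s*>\s*(.*?)\s*</a>#is';

// PREG_SET_ORDER groups the captures per match: [full match, group 1, group 2, ...].
preg_match_all($firstPattern, $fenlei, $first, PREG_SET_ORDER);
preg_match_all($secondPattern, $fenlei, $second, PREG_SET_ORDER);

foreach ($first as $m) {
    echo 'level 1: id=', $m[1], ' name=', $m[2], PHP_EOL;
}
foreach ($second as $m) {
    echo 'level 2: id=', $m[1], ' parent=', $m[2], ' name=', $m[3], PHP_EOL;
}
```

The `U` modifier is dropped here in favour of explicit lazy quantifiers (`.*?`), which makes it easier to see which parts of the pattern are greedy.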
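Why go to the trouble of writing this by hand? If the site's markup changes even slightly, all that effort is wasted.
There are plenty of simple, practical tools available online. Why not use one?
For example, this: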
$s =<<< TXT
<h4 class="cat-hd fst-cat-hd ">
<i class="cat-icon fst-cat-icon active-trigger"></i>
<a class="cat-name fst-cat-name"
href="http://bosidengny.tmall.com/category-907362758.htm?search=y&catName=%D0%C2%C6%B7%D7%A8%C7%F8"
>新品专区</a>
</h4>
</li>
<li class="cat fst-cat">
<h4 class="cat-hd fst-cat-hd has-children">
<i class="cat-icon fst-cat-icon active-trigger"></i>
<a class="cat-name fst-cat-name"
href="http://bosidengny.tmall.com/category-907362759.htm?search=y&catName=%B1%A3%C5%AF%C9%CF%D7%B0"
>保暖上装</a>
</h4>
<div class="snd-pop">
<div class="snd-pop-inner">
<ul class="fst-cat-bd">
<li class="cat snd-cat">
<h4 class="cat-hd snd-cat-hd">
<i class="cat-icon snd-cat-icon"></i>
<a class="cat-name snd-cat-name"
href="http://bosidengny.tmall.com/category-907362760.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=%BC%D9%C1%BD%BC%FE%A3%A8%B3%C4%C9%C0%C1%EC%A3%A9"
>
假两件(衬衫领)
</a>
</h4>
</li>
<li class="cat snd-cat">
<h4 class="cat-hd snd-cat-hd">
<i class="cat-icon snd-cat-icon"></i>
<a class="cat-name snd-cat-name"
href="http://bosidengny.tmall.com/category-907362761.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=V%C1%EC%C9%CF%D7%B0"
>
V领上装
</a>
</h4>
</li>
TXT;
include 'simple_html_dom.php';
$p = new simple_html_dom;
$p->load($s);
foreach($p->find('a') as $v) {
echo $v->class, PHP_EOL; // the class, which distinguishes the category level
echo $v->href, PHP_EOL; // the URL
echo trim($v->innertext()), PHP_EOL; // the link text
}
Output:
cat-name fst-cat-name
http://bosidengny.tmall.com/category-907362758.htm?search=y&catName=%D0%C2%C6%B7%D7%A8%C7%F8
新品专区
cat-name fst-cat-name
http://bosidengny.tmall.com/category-907362759.htm?search=y&catName=%B1%A3%C5%AF%C9%CF%D7%B0
保暖上装
cat-name snd-cat-name
http://bosidengny.tmall.com/category-907362760.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=%BC%D9%C1%BD%BC%FE%A3%A8%B3%C4%C9%C0%C1%EC%A3%A9
假两件(衬衫领)
cat-name snd-cat-name
http://bosidengny.tmall.com/category-907362761.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=V%C1%EC%C9%CF%D7%B0
V领上装
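The output above still carries the numeric IDs inside the URL. The sketch below shows one way to pull them out once the links have been found by a parser. It swaps simple_html_dom for PHP's built-in DOM extension so that it runs without an extra file. The sample is an abridged copy of the posted markup, re-wrapped in `<ul>` elements and assumed to be UTF-8; the array keys `id`, `parent_id`, `name`, and `level` are made up for illustration.

```php
<?php
// Abridged, re-wrapped copy of the posted markup (assumed to be UTF-8 here).
$html = <<<'HTML'
<ul>
<li class="cat fst-cat">
<h4 class="cat-hd fst-cat-hd has-children">
<a class="cat-name fst-cat-name"
href="http://bosidengny.tmall.com/category-907362759.htm?search=y&catName=%B1%A3%C5%AF%C9%CF%D7%B0"
>保暖上装</a>
</h4>
<ul class="fst-cat-bd">
<li class="cat snd-cat">
<h4 class="cat-hd snd-cat-hd">
<a class="cat-name snd-cat-name"
href="http://bosidengny.tmall.com/category-907362761.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=V%C1%EC%C9%CF%D7%B0"
>
V领上装
</a>
</h4>
</li>
</ul>
</li>
</ul>
HTML;

// Silence libxml warnings about the unescaped "&" characters in the hrefs.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
// The meta prefix tells libxml that the input is UTF-8.
$doc->loadHTML('<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$categories = [];
foreach ($xpath->query('//a[contains(@class, "cat-name")]') as $a) {
    $href = $a->getAttribute('href');
    // The category ID is the number in "category-<digits>.htm".
    if (!preg_match('#category-(\d+)\.htm#', $href, $m)) {
        continue;
    }
    // parentCatId only exists on second-level links, so read it from the query string.
    parse_str((string) parse_url($href, PHP_URL_QUERY), $query);
    $categories[] = [
        'id'        => $m[1],
        'parent_id' => $query['parentCatId'] ?? null,
        'name'      => trim($a->textContent),
        'level'     => strpos($a->getAttribute('class'), 'snd-cat-name') !== false ? 2 : 1,
    ];
}

print_r($categories);
```

Reading `parentCatId` with `parse_str` rather than a second regular expression means the order of the query parameters no longer matters.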