The HTML looks something like this:
<td colspan="2" class="ly2"><span title='嫁娶:男娶女嫁,举行结婚大典的吉日。'><a href='2012yiE5AB81E5A8B6.htm'>嫁娶</a></span></td>
I need to extract the text inside the tags below: '嫁娶', '祈福', and so on.
<a href='2012yiE5AB81E5A8B6.htm'>嫁娶</a>
<a href='2012jiE7A588E7A68F.htm'>祈福</a>
Could someone please help me solve this?
------Solution--------------------------------------------------------
(?i)(?<=<a\b[^>]*?>)[^<>]+(?=</a>)
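A minimal sketch of how the pattern above could be applied in C#. The class and method names (`AnchorTextDemo`, `ExtractAnchorText`) are made up for illustration and are not from the original answer. It relies on .NET's support for variable-length lookbehind, so the same pattern may not work unchanged in other regex engines.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class AnchorTextDemo
{
    // Returns the inner text of every <a ...>text</a> element found in html.
    public static string[] ExtractAnchorText(string html)
    {
        // (?<=<a\b[^>]*?>)  the match must be directly preceded by an opening <a ...> tag
        // [^<>]+            the text itself, containing no angle brackets
        // (?=</a>)          the match must be directly followed by the closing tag
        const string pattern = @"(?i)(?<=<a\b[^>]*?>)[^<>]+(?=</a>)";

        return Regex.Matches(html, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

Calling `AnchorTextDemo.ExtractAnchorText(html)` on the two `<a>` lines from the question should return the two link texts in document order. Note that this pattern picks up every link on the page, not only those inside particular table cells.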
------Solution--------------------------------------------------------
- C# code
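using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

// Read the text file, which is GB2312-encoded.
// On .NET Core / .NET 5+, register CodePagesEncodingProvider first, otherwise GB2312 is not available.
string tempStr = File.ReadAllText(@"C:\Documents and Settings\Administrator\桌面\Test.txt", Encoding.GetEncoding("GB2312"));

// Link text inside <td class="ly2"> cells.
string pattern1 = @"(?is)(?<=<td[^>]*?class=['""]ly2[^>]*?>((?!</td>)[\s\S])*<a[^>]*?>)[^<>]+";
// Link text inside <td class="lj2"> cells.
string pattern2 = @"(?is)(?<=<td[^>]*?class=['""]lj2[^>]*?>((?!</td>)[\s\S])*<a[^>]*?>)[^<>]+";

string[] temp_array1 = Regex.Matches(tempStr, pattern1).Cast<Match>().Select(a => a.Value).ToArray();
/*
 * [0] "嫁娶" string
 * [1] "出行" string
 * [2] "开市" string
 * [3] "安床" string
 * [4] "入殓" string
 */
string[] temp_array2 = Regex.Matches(tempStr, pattern2).Cast<Match>().Select(a => a.Value).ToArray();
/*
 * [0] "祈福" string
 * [1] "动土" string
 */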
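Since the two patterns differ only in the class name, they could be folded into one helper. The following is a sketch under assumptions: the names `CellLinkText` and `ExtractByCellClass` are hypothetical, and the `lj2` cell is assumed to have the same layout as the `ly2` cell shown in the question.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class CellLinkText
{
    // Returns the text of every <a> that sits inside a <td> whose class
    // attribute starts with cssClass (for example "ly2" or "lj2").
    public static string[] ExtractByCellClass(string html, string cssClass)
    {
        // Same structure as the patterns above, with the class name spliced in.
        // Regex.Escape guards against metacharacters in cssClass.
        string pattern =
            @"(?is)(?<=<td[^>]*?class=['""]" + Regex.Escape(cssClass) +
            @"[^>]*?>((?!</td>)[\s\S])*<a[^>]*?>)[^<>]+";

        return Regex.Matches(html, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

The `((?!</td>)[\s\S])*` part keeps the lookbehind from crossing a closing `</td>`, so a link is attributed only to the cell that actually contains it. For pages with more irregular markup, an HTML parser is generally a more robust choice than a regular expression.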