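I fetch a page's HTML source and want to extract the URLs in it that match a condition, for example http://style.sina.com.cn/industry/2012-10-08/1147106801.shtml, i.e. URLs that end in a run of digits followed by ".shtml". How can I do this with a regular expression?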
------Solution--------------------------------------------------------
- C# code
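// Requires: using System.Text.RegularExpressions;
// Note: this pattern matches any URL up to the next whitespace, not only ".shtml" ones.
string s = ", for example http://style.sina.com.cn/industry/2012-10-08/1147106801.shtml like this";
// [a-zA-Z] (not [a-zA-z], which also admits the characters [ \ ] ^ _ `)
s = Regex.Match(s, @"[a-zA-Z]+://[^\s]*").ToString();
System.Diagnostics.Debug.Print(s);
MessageBox.Show(s);
// Output:
// http://style.sina.com.cn/industry/2012-10-08/1147106801.shtml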
------Solution--------------------------------------------------------
(?i)https?://\S*?/\d+\.shtml
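To pull every matching link out of a whole page rather than just the first one, something along these lines might work. It is only a sketch: the class name ShtmlLinkExtractor, the Extract helper and the sample HTML are made up for illustration, and the character class [^\s"'<>] is a swapped-in variation on \S from the pattern above, on the assumption that the URLs sit inside quoted attributes and the match should not run past the closing quote.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ShtmlLinkExtractor
{
    // http/https URL whose last path segment is digits followed by ".shtml".
    // [^\s"'<>]*? stops the match at quotes and angle brackets (assumption:
    // URLs appear inside HTML attributes); \b rejects e.g. "123.shtmlx".
    private static readonly Regex ShtmlUrl = new Regex(
        @"https?://[^\s""'<>]*?/\d+\.shtml\b",
        RegexOptions.IgnoreCase);

    // Returns every matching URL found in the given HTML text, in order.
    public static List<string> Extract(string html)
    {
        var urls = new List<string>();
        foreach (Match m in ShtmlUrl.Matches(html))
        {
            urls.Add(m.Value);
        }
        return urls;
    }

    public static void Main()
    {
        // Hypothetical sample input: one link that qualifies, one that does not.
        string html =
            "<a href=\"http://style.sina.com.cn/industry/2012-10-08/1147106801.shtml\">a</a>" +
            "<a href=\"http://style.sina.com.cn/index.html\">b</a>";

        foreach (string url in Extract(html))
        {
            Console.WriteLine(url);
        }
        // prints http://style.sina.com.cn/industry/2012-10-08/1147106801.shtml
    }
}
```

Regex.Matches returns all non-overlapping matches, so the loop collects every qualifying link on the page. If the same URL can appear several times, wrapping the result in a HashSet<string> would de-duplicate it.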