Date: 2014-05-17

Second question: an ASP regex problem
HTML code
<meta http-equiv=Content-Type content=text/html;charset=gb2312>
<meta http-equiv=Content-Type content="text/html;charset=gb2312">
<meta http-equiv=Content-Type content="text/html;charset=gb2312"/>
<meta http-equiv=Content-Type content=‘text/html;charset=gb2312’/>


I asked about this once before: I needed ASP and a regular expression to extract the encoding value from each of the cases above.

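An expert then gave me the pattern <meta[^>]+charset=([\w\-]+)[^>]*> and it worked in all of my tests. Today, however, I ran into <meta charset="UTF-8" /> and the code can no longer extract the value UTF-8. Could someone help me modify the regex?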

------Solution--------------------
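The problem is not "whitespace between the closing quote and the following >". It is that the double quote right after charset= is not matched: a double quote is not in the class [\w\-], so the capture group cannot start there and the whole match fails.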
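To make the diagnosis concrete, here is a minimal sketch that runs the pattern from the question and a pattern with an optional double quote against the failing tag. The helper name PatternMatches and the variable names are assumptions for illustration, not part of the thread.

```vbscript
' Hypothetical helper (not from the thread): True when pattern matches html.
Function PatternMatches(pattern, html)
    Dim re
    Set re = New RegExp
    re.Pattern = pattern
    re.IgnoreCase = True
    PatternMatches = re.Test(html)
End Function

Dim sample, oldPattern, newPattern, oldResult, newResult
sample = "<meta charset=""UTF-8"" />"

' Pattern from the question: the capture group must start right after charset=
oldPattern = "<meta[^>]+charset=([\w\-]+)[^>]*>"
' Same pattern with an optional double quote before the capture group
newPattern = "<meta[^>]+charset=[""]?([\w\-]+)[^>]*>"

oldResult = PatternMatches(oldPattern, sample)  ' the quote is not in [\w\-], so no match
newResult = PatternMatches(newPattern, sample)  ' the quote is consumed by [""]?
```

The only difference between the two patterns is [""]? (a literal double quote, doubled because it sits inside a VBScript string), which is enough to let the capture group reach UTF-8.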
VBScript code
<script language="vbscript">
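Dim samples, re, matches, i
' One sample per variant from the question, plus the new <meta charset="UTF-8" /> case.
' (Assigning them all to one variable would test only the last one.)
samples = Array( _
    "<meta http-equiv=Content-Type content=text/html;charset=gb2312>", _
    "<meta http-equiv=Content-Type content=""text/html;charset=gb2312"">", _
    "<meta http-equiv=Content-Type content=""text/html;charset=gb2312""/>", _
    "<meta http-equiv=Content-Type content=‘text/html;charset=gb2312’/>", _
    "<meta charset=""UTF-8"" />")

Set re = New RegExp
' [""]? allows an optional double quote right after charset=
re.Pattern = "<meta[^>]+charset=[""]?([\w\-]+)[^>]*>"
re.Global = True
re.IgnoreCase = True
re.MultiLine = True

' Show the captured encoding name for every sample
For i = 0 To UBound(samples)
    Set matches = re.Execute(samples(i))
    If matches.Count > 0 Then
        MsgBox matches(0).SubMatches(0)
    End If
Next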

</script>
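The pattern above only allows a double quote after charset=. A tag such as <meta charset='utf-8'> (single quotes) was not part of the question, but if it can occur, the same idea extends naturally. The following is a sketch under that assumption. The function name ExtractCharset and the tolerance for spaces around the equals sign are illustrative choices, not something the thread specifies.

```vbscript
' Hypothetical helper (not from the thread): returns the charset name from the
' first matching <meta> tag in html, or an empty string if none is found.
Function ExtractCharset(html)
    Dim re, matches
    Set re = New RegExp
    ' [""']? accepts an optional double OR single quote after charset=
    ' \s* tolerates spaces around the equals sign (an assumption about the input)
    re.Pattern = "<meta[^>]+charset\s*=\s*[""']?([\w\-]+)"
    re.IgnoreCase = True
    re.Global = False   ' only the first match is needed

    ExtractCharset = ""
    Set matches = re.Execute(html)
    If matches.Count > 0 Then
        ExtractCharset = matches(0).SubMatches(0)
    End If
End Function
```

In a classic ASP page this could be called as Response.Write ExtractCharset(pageSource), where pageSource is a hypothetical variable holding the HTML to inspect. Returning an empty string for "not found" keeps the caller free of Count checks.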