Date: 2014-05-17

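Pulling part of another web page into my own site
I want to scrape a news list from another website into my own site. I can already fetch the target page's source and display it in a TextBox. How do I process that text so that only the HTML fragment I want is kept?

In other words, I want to take just one part of another page, for example the RMB exchange-rate section, and put it on my own page. Fetching the whole page already works; what I need to know is how to extract only the part I want.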

------Solution--------------------
If you don't know regular expressions, you can also pick two marker strings and take the text that sits between them.

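' Returns the text between the first Start marker and the next Last marker.
Function GetKey(ByVal HTML As String, ByVal Start As String, ByVal Last As String) As String
    Dim body1 As String() = Split(HTML, Start, 2)
    If body1.Length < 2 Then Return String.Empty ' Start marker not found
    Dim body2 As String() = Split(body1(1), Last, 2) ' if Last is missing, everything after Start is returned
    Return body2(0)
End Function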
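
For anyone working in C# like the reply further down, the same "text between two markers" idea could look roughly like this. It is only a sketch: the class name HtmlSlice, the method name TextBetween, and the sample page source are made up for illustration, and the real markers depend on the target site's HTML.

```csharp
using System;

public static class HtmlSlice
{
    // Returns the text between the first `start` marker and the next `end` marker,
    // or null when either marker is missing.
    public static string TextBetween (string html, string start, string end)
    {
        int begin = html.IndexOf (start, StringComparison.Ordinal);
        if (begin < 0) return null;
        begin += start.Length;                       // skip past the start marker itself
        int stop = html.IndexOf (end, begin, StringComparison.Ordinal);
        if (stop < 0) return null;
        return html.Substring (begin, stop - begin);
    }

    public static void Main ()
    {
        // Made-up page source; a real page will need different markers.
        string html = "<html><div id=\"rate\"><b>USD/CNY</b></div><p>other</p></html>";
        Console.WriteLine (TextBetween (html, "<div id=\"rate\">", "</div>"));
        // prints: <b>USD/CNY</b>
    }
}
```

Returning null for a missing marker makes it easy to notice when the target site changes its layout, instead of silently showing the wrong text.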

------Solution--------------------
C# code
using System;
using System.Net;
using System.Text;
using System.IO;


    public class Test
    {
        // Specify the URL to receive the request.
        public static void Main (string[] args)
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create (args[0]);

            // Set some reasonable limits on resources used by this request
            request.MaximumAutomaticRedirections = 4;
            request.MaximumResponseHeadersLength = 4;
            // Set credentials to use for this request.
            request.Credentials = CredentialCache.DefaultCredentials;
            // Dispose the response, stream, and reader even if reading throws.
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse ())
            using (Stream receiveStream = response.GetResponseStream ())
            // Wrap the stream in a reader. UTF-8 is assumed here; a page served in another charset (e.g. GB2312) needs a different Encoding.
            using (StreamReader readStream = new StreamReader (receiveStream, Encoding.UTF8))
            {
                Console.WriteLine ("Content length is {0}", response.ContentLength);
                Console.WriteLine ("Content type is {0}", response.ContentType);

                Console.WriteLine ("Response stream received.");
                Console.WriteLine (readStream.ReadToEnd ());
            }
        }
    }

/*
The output from this example will vary depending on the value passed into Main 
but will be similar to the following:

Content length is 1542
Content type is text/html; charset=utf-8
Response stream received.
<html>
...
</html>

*/
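
Coming back to the original question: once ReadToEnd () has returned the page source, the wanted part still has to be cut out of it. One possible way is a regular expression, the alternative the first reply alludes to. The sketch below assumes the fragment sits in a <div> with a known id; the names FragmentDemo and DivById, the id "rate", and the sample HTML are invented for illustration, and nested <div> elements inside the target are not handled.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FragmentDemo
{
    // Returns the inner HTML of the first <div> carrying the given id,
    // or null if no such <div> is found. Nested <div>s are not supported.
    public static string DivById (string html, string id)
    {
        string pattern = "<div[^>]*\\bid=\"" + Regex.Escape (id) + "\"[^>]*>(.*?)</div>";
        // Singleline lets '.' match newlines, since fragments usually span several lines.
        Match m = Regex.Match (html, pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main ()
    {
        // Stand-in for the string returned by readStream.ReadToEnd ().
        string html = "<body><div id=\"news\">n</div>\n"
                    + "<div class=\"box\" id=\"rate\">\n<span>rate goes here</span>\n</div></body>";
        string fragment = DivById (html, "rate");
        Console.WriteLine (fragment == null ? "fragment not found" : fragment.Trim ());
        // prints: <span>rate goes here</span>
    }
}
```

For pages with deeply nested or irregular markup a real HTML parser is usually more reliable than a pattern like this, but for a single well-delimited block the regex tends to be enough.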