c# - HtmlAgilityPack - Getting data from an HTML table

Tags: c# html screen-scraping html-agility-pack

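My program uses HtmlAgilityPack to fetch an HTML page and store it in a variable. From that HTML I am trying to get the two tables that sit under a specific div class (boardcontainer). My current code searches the whole page for every table and prints them, but when a cell is empty it throws an exception:

"NullReferenceException was unhandled - Object reference not set to an instance of an object."

HTML snippet (in this example, I searched the site for "Microsoft"):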

<div class="boardcontainer">
<table cellpadding="4" cellspacing="1" border="0" width="100%">
<tr><td colspan="6" class="catbg" height="18" >Main Database</td></tr>
<tr>
    <td class="windowbg" width="28%" align="center">Company Name</td>
    <td class="windowbg" width="12%" align="center">0870 / 0871</td>
    <td class="windowbg" width="12%" align="center">0844 / 0845</td>
    <td class="windowbg" width="12%" align="center">01 / 02 / 03</td>
    <td class="windowbg" width="12%" align="center">Freephone</td>
    <td class="windowbg" width="24%" align="center">Other Information</td>
</tr>
    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.websitename.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC>&nbsp;0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC>&nbsp;0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC>&nbsp;01954 713950</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC>&nbsp;</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC>&nbsp;<b>Customer Support</b><br><i>Straight to agent (no menu)</i><br><font size=1>Also for 0870 6010200</font></td></tr>
    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.websitename.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC>&nbsp;0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC>&nbsp;0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC>&nbsp;0118 909 7800</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC>&nbsp;</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC>&nbsp;<b>Main UK Switchboard</b><br><i>Ask to be put through to required department</i><br><font size=1>Also for 0870 6010200</font></td></tr>
    <tr>

Here is my current code. It just grabs every table and prints the rows and cells, then throws the exception when it hits a null.

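        // GetResponse() returns a WebResponse, not a string, so read the body
        // through a StreamReader (needs using System.IO).
        string html;
        using (var reader = new StreamReader(myRequest.GetResponse().GetResponseStream()))
        {
            html = reader.ReadToEnd();
        }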
        HtmlDocument htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);


        foreach (HtmlNode table in htmlDoc.DocumentNode.SelectNodes("//table"))
        {
            Console.WriteLine("Found: " + table.Id);
            foreach (HtmlNode row in table.SelectNodes("tr"))
            {
                Console.WriteLine("row");
                foreach (HtmlNode cell in row.SelectNodes("th|td")) //Exception is thrown here
                {
                    Console.WriteLine("cell: " + cell.InnerText);
                }
            }
        }

How can I change it to search for the specific div class and extract the tables from it?

Thanks for reading.

Full HTML:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

<form method="post" action="sdfsd.php">
<html>
<head><title>SAYNOTO0870.COM - Non-Geographical Alternative Telephone Numbers</title>
<meta name='copyright' content='SAYNOTO0870.COM - 1999-2010'>
<META name="y_key" content="5a00e35b9f1986b0" >
</head>

<body>
<BODY bgColor=#ffffe6>
<table border="0" width="100%" id="headertable1">
    <tr>

        <td width="335" valign="top">
<font face='Tahoma' size='2'>
        <center><b>
        <font size="6">SAY<font color="#FF0000">NO</font>TO<font color="#FF0000">0870</font>.COM</font>
</b></center>
</font>
<font face='Tahoma' size='4'>
<center><b><font size="2">Non-Geographical Alternative Telephone Numbers</font></b><font size="3"></font><font face='Tahoma' size='2'></font><br>

<span style="font-weight: 700"><font size="1">Awarded Website Of The Day by 
BBC Radio 2, and  featured<br>
on the BBC Radio 2's 
Jeremy Vine show and The Guardian.</font></span></center>
        </td>
        <td width="403" rowspan="2" align="center">

<a href="http://energy.saynoto0870.com" target="_blank"><img src="/banners/energyheader.gif" alt="Save Money on your Gas and Electricity" width="420" height="60" border="0" align="middle"></a>        
        </td>  </tr></table>
<table width="92%" cellspacing="1" cellpadding="0" border="0" align="CENTER">
  <tr> 
    <td align="center"> 
      <table bgcolor="#AFC6DB" width="100%" cellspacing="0" cellpadding="0" align="center">

        <tr> 
          <td width="100%" align="center"> 
            <table border="0" width="100%" cellpadding="3" cellspacing="0" bgcolor="#AFC6DB" align="center">
              <tr> 
                <td valign="middle" bgcolor="#CCFFCC" align="center" width="180">
<font face='Tahoma' size='2'>
                <b>
<a href="/">
                <img src="/images/home.gif" alt="Home" border="0">Home</a></td>
                <td valign="middle" bgcolor="#CCFFCC" align="center" width="143">
                <b>

<font face='Tahoma' size='2'>

                <a href="/cgi-bin/forum/YaBB.cgi">
                <img src="/images/forum.gif" alt="Discussion Forum" border="0">Discussion Forum</a></td>
                <td valign="middle" bgcolor="#CCFFCC" align="center" width="134">
                <font face="Tahoma" size="2">
                <b>
                <a href="/links.php">
                <img src="/images/links.gif" alt="Links" border="0">Links</a></td>

                <td valign="middle" bgcolor="#CCFFCC" align="center" width="103">
                <font face="Tahoma" size="2">
                <b>
                <a href="/help.php">
                <img src="/images/help.gif" alt="Help" border="0">Help</a></td>
                <td valign="middle" bgcolor="#CCFFCC" align="center" width="114">
                <font face="Tahoma" size="2">
                <b>

                <a href="/contact">
                <img src="/images/contact.gif" alt="Contact Us" border="0">Contact Us</a>
                </td>
              </tr>
              <tr> 
                <td valign="middle" bgcolor="#CCFFCC" align="center" width="321" colspan="2">
<font face='Tahoma' size='2'>
                <a href="/search.php">
                <font face="Tahoma">

                <b>
                <font size="2">
                <img src="/images/search.gif" alt="Search" border="0"></font></b></font><font size="2"><b>Search 
                to find an alternative number</b></font></a></td>
                <td valign="middle" bgcolor="#CCFFCC" align="center" width="365" colspan="3">
<font face='Tahoma' size='2'>
                <a href="/add.php">
                <font face="Tahoma">
                <b>
                <font size="2">

                <img src="/images/addno.gif" alt="Add A New Number" border="0"></font></b></font><font size="2"><b>Click 
                here to add a new alternative number</b></font></a></td>
              </tr>
            </table>
          </td>
        </tr>
      </table>
    </td>
  </tr>

</table>

<br>
<center>
<script type="text/javascript"><!--
google_ad_client = "pub-9959843696187618";
google_ad_width = 468;
google_ad_height = 60;
google_ad_format = "468x60_as";
google_ad_type = "text_image";
//2007-06-07: SAYNOTO0870-Header
google_ad_channel = "6422558175";
google_color_border = "ffffe6";
google_color_bg = "ffffe6";
google_color_link = "32527A";
google_color_text = "000000";
google_color_url = "2D8930";
//-->
</script>
<script type="text/javascript"
  src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
</script>
</center>
<BR><input type=hidden name="search_name" value="Microsoft">
</form>
<link rel="stylesheet" href="search.css" type="text/css" />

  <table width="100%" align="center" border="0">
  <tr>

    <td><font size="2">

<div class="seperator"></div>

<div class="boardcontainer">
<table cellpadding="4" cellspacing="1" border="0" width="100%">
<tr><td colspan="6" class="catbg" height="18" >Main Database</td></tr>

<tr>
    <td class="windowbg" width="28%" align="center">Company Name</td>
    <td class="windowbg" width="12%" align="center">0870 / 0871</td>

    <td class="windowbg" width="12%" align="center">0844 / 0845</td>
    <td class="windowbg" width="12%" align="center">01 / 02 / 03</td>
    <td class="windowbg" width="12%" align="center">Freephone</td>
    <td class="windowbg" width="24%" align="center">Other Information</td>
</tr>


    <tr>

<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 01954 713950</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Customer Support</b><br><i>Straight to agent (no menu)</i><br><font size=1>Also for 0870 6010200</font></td></tr>
    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0118 909 7800</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Main UK Switchboard</b><br><i>Ask to be put through to required department</i><br><font size=1>Also for 0870 6010200</font></td></tr>

    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> +35314502113</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Customer Support</b><br><i>Answers as Microsoft Ireland with same options as UK 08 numbers</i><br>Reduce cost using 1899 (or similar)<br><font size=1>Also for 0870 6010200</font></td></tr>
    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 241 1963</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 020 3147 4930</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 0188354</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Product Activation</b><br><i>Home & Business (Volume Licensing)</i><br><font size=1>Also: 0800 018 8364 & +800 2284 8283<br>Also for 0870 6010100 & 0870 6010200</font></td></tr>

    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 241 1963</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 9179016</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Volume Licensing</b></td></tr>
    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 020 3027 6039</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 7318457</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Online Services Support</b><br><i>MSN, Hotmail, Live, Messenger etc</i><br><font size=1>Also: 0800 587 2920</font></td></tr>

    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 607 0700</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 6006</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> +35317065353</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Ask Partner Hotline</b><br><i>Answers with same options</i><br>Reduce cost using 1899 (or similar)</td></tr>
    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 607 0700</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 6006</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 9173128</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Partner Network Regional Service Centre</b><br><i>Help with membership questions and tools, benefits and resource queries</i></td></tr>

    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 0324479</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Direct Services</b><br><font size=1>Also for 0870 6010200</font></td></tr>
    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk/msdn target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> +35318831002</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 0517215</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>MSDN (Microsoft Developers Network)</b><br>When calling +353 reduce cost using 1899 (or similar)<br><font size=1>Also for 0870 6010200</font></td></tr>

    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.microsoft.co.uk/technet target="_blank">Microsoft</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0870 601 0100</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0844 800 2400</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> +35318831002</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 281221</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Microsoft Technet</b><br>When calling +353 reduce cost using 1899 (or similar)<br><font size=1>Also for 0870 6010200</font></td></tr>
    <tr>
<td class=windowbg2 width=28% align=center BGCOLOR=#FFFFCC><a href=http://www.saynoto0870.com/exit.php?site=www.xbox.co.uk target="_blank">Microsoft XBOX</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> </a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 020 7365 9792</a></td><td class=windowbg2 width=12% align=center BGCOLOR=#FFFFCC> 0800 5871102</a></td><td class=windowbg2 width=24% align=center BGCOLOR=#FFFFCC> <b>Customer Support</b></td></tr>

    <tr>

</tr>
</table>
</div><br />    

<table width="100%" align="center" border="0">
  <tr><td><font size="2">
<div class="seperator"></div>

<div class="boardcontainer">
<table cellpadding="4" cellspacing="1" border="0" width="100%">

<tr><td colspan="6" class="catbg" height="18" >Unverified Numbers Database</td></tr>

<tr>
    <td class="windowbg" width="28%" align="center">Company Name</td>
    <td class="windowbg" width="12%" align="center">0870 / 0871</td>
    <td class="windowbg" width="12%" align="center">0844 / 0845</td>
    <td class="windowbg" width="12%" align="center">01 / 02 / 03</td>
    <td class="windowbg" width="12%" align="center">Freephone</td>
    <td class="windowbg" width="24%" align="center">Other Information</td>

</tr>

<td class=windowuv width=28% align=center BGCOLOR=#CCFFFF> Microsoft</td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 0870 501 0800</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 0844 800 8338</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 0118 909 7994</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> </a></td><td class=windowuv width=24% align=center BGCOLOR=#CCFFFF> <b>Premier Support</b></td></tr>
    <tr>
<td class=windowuv width=28% align=center BGCOLOR=#CCFFFF>Microsoft AskPartner (Licensing)</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 0870 607 0700</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> </a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 020 8784 1000</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> </a></td><td class=windowuv width=24% align=center BGCOLOR=#CCFFFF> Switchboard of Sitel UK in Kingston where the AskPartner team is based. Ask for Microsoft Team. 0800 - 1800.</td></tr>

    <tr>
<td class=windowuv width=28% align=center BGCOLOR=#CCFFFF> Microsoft Office Live Meeting</td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> </a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> </a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 020 3024 9260</a></td><td class=windowuv width=12% align=center BGCOLOR=#CCFFFF> 0800 0854811</a></td><td class=windowuv width=24% align=center BGCOLOR=#CCFFFF> EMC Conferencing on Meeting Place</td></tr>

</tr>
</table>
</div><br />

<center>
<a href="http://homephone.consumerchoices.co.uk/?partner=saynoto0870" target="_blank">

<img src="/banners/consumerchoices.png" border="0" alt="ConsumerChoices" align="middle"></img></a>
<BR><BR>
</center>

<div class="seperator">
<table cellpadding="4" cellspacing="1" border="0" width="100%">
<tr>
    <td class="titlebg" align="center" colspan="2">
        Info Centre
    </td>
</tr>

    <td class="windowbg2">
        <div style="float: left; width: 59%; text-align: left;">

        <span class="small">Please use the Contact Us option, to report any incorrect numbers that you notice on the site.  Thanks for your support.</span><br />
        </div>
        <div style="float: left; width: 40%; text-align: left;">
        <div class="small" style="float: left; width: 49%;"><span style="color: red;"><b>lllll</b></span> Main Database - A number that has been checked and at the time it was checked worked correctly.  Please let us know of any numbers that no longer work as expected.</div><div class="small" style="float: left; width: 49%;"><span style="color: #CCFFFF;"><b>lllll</b></span> Unverified Number - A number that has been added by a visitor to the website, and hasn't yet been verified as correct.  Please use the Contact Us link at the top of the page to let us know if these work (or don't work) for you.</div>
        </div>

    </td>
</tr>
</table>

</div>
    </font></td>
  </tr>
</table>


<br>

<head>
<style>
<!--.smallfont{ font: 11px verdana, geneva, lucida, 'lucida grande', arial, helvetica, sans-serif;}-->

</style>
</head>
<b>
<center>
<font color='red'>
</center>
</b>
</font>
<BR>
<center>

<script type="text/javascript"><!--
google_ad_client = "pub-9959843696187618";
google_ad_width = 728;
google_ad_height = 90;
google_ad_format = "728x90_as";
google_ad_type = "text_image";
//2007-06-07: SAYNOTO0870-Footer
google_ad_channel = "7459969292";
google_color_border = "FFFFE6";
google_color_bg = "FFFFE6";
google_color_link = "32527A";
google_color_text = "000000";
google_color_url = "2D8930";
//-->
</script>
<script type="text/javascript"  src="http://pagead2.googlesyndication.com/pagead/show_ads.js"></script>
<BR></center>
<BR><center><B>

<font face="Tahoma" size="2">
Website and Content © 1999-2011 SAYNOTO0870.COM.&nbsp; All Rights Reserved</b>.
<br><b>Written permission is required to duplicate any of the content within this site. </b></center></font>
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script>
<script type="text/javascript">_uacct = "UA-194609-1";urchinTracker();</script>
</body></html>

Best answer

The following XPath lets you search the HTML document for a specific div (the one with class "boardcontainer"):

//div[@class='boardcontainer']/table
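
Note that @class='boardcontainer' compares the whole attribute value, which is enough for the page shown in the question. If the div ever carried more than one class, a token-style match would be needed. The following is a minimal sketch under that assumption; the inline HTML string and the extra class name "wide" are made up for illustration and are not part of the original page:

```csharp
using HtmlAgilityPack;

var doc = new HtmlDocument();
// Hypothetical markup: the div has two classes, so @class='boardcontainer' would not match it.
doc.LoadHtml("<div class='boardcontainer wide'><table id='t1'></table></div>");

// XPath 1.0 has no dedicated class test, so pad the normalized attribute
// value with spaces and look for the class name as a whole token.
const string xpath =
    "//div[contains(concat(' ', normalize-space(@class), ' '), ' boardcontainer ')]/table";

// SelectNodes returns null when nothing matches, so guard before using the result.
HtmlNodeCollection tables = doc.DocumentNode.SelectNodes(xpath);
int tableCount = tables == null ? 0 : tables.Count;
System.Console.WriteLine("Tables found: " + tableCount);
```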

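To handle empty rows, just check whether the returned HtmlNodeCollection is null. SelectNodes returns null, not an empty collection, when nothing matches.

Here is a complete example: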

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);

// Select only the tables that are direct children of <div class="boardcontainer">.
foreach (HtmlNode table in htmlDoc.DocumentNode.SelectNodes("//div[@class='boardcontainer']/table"))
{
  Console.WriteLine("Found: " + table.Id);

  foreach (HtmlNode row in table.SelectNodes("tr"))
  {
    Console.WriteLine("row");

    // SelectNodes returns null (not an empty collection) when a row has no th/td children.
    HtmlNodeCollection cells = row.SelectNodes("th|td");

    if (cells == null)
    {
      continue; // skip the empty row instead of throwing
    }

    foreach (HtmlNode cell in cells)
    {
      Console.WriteLine("cell: " + cell.InnerText);
    }
  }
}

You should also check whether a table was found at all, and whether the table that was found contains any rows.
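
One way to cover all three checks (tables, rows, cells) is to route every SelectNodes call through a small helper that turns null into an empty sequence. This is only a sketch: BoardTableReader, SelectOrEmpty and ReadRows are hypothetical names introduced here, not part of HtmlAgilityPack, and decoding and trimming the cell text is an extra step the original code does not perform.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

static class BoardTableReader
{
    // Hypothetical helper: SelectNodes returns null when nothing matches,
    // so map that case to an empty sequence that is safe to enumerate.
    static IEnumerable<HtmlNode> SelectOrEmpty(HtmlNode node, string xpath)
    {
        HtmlNodeCollection found = node.SelectNodes(xpath);
        return found != null ? (IEnumerable<HtmlNode>)found : Enumerable.Empty<HtmlNode>();
    }

    // Returns one list of cell texts for every row that has at least one th/td.
    public static List<List<string>> ReadRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var rows = new List<List<string>>();
        foreach (HtmlNode table in SelectOrEmpty(doc.DocumentNode, "//div[@class='boardcontainer']/table"))
        {
            foreach (HtmlNode row in SelectOrEmpty(table, "tr"))
            {
                // Decode entities such as &nbsp; and trim surrounding whitespace.
                List<string> cells = SelectOrEmpty(row, "th|td")
                    .Select(cell => HtmlEntity.DeEntitize(cell.InnerText).Trim())
                    .ToList();

                if (cells.Count > 0)
                {
                    rows.Add(cells); // rows without cells are skipped
                }
            }
        }
        return rows;
    }
}
```

Calling BoardTableReader.ReadRows(html) and joining each inner list with a separator prints one line per row. With this shape, a page that has no matching table yields an empty list instead of a NullReferenceException.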

This Q&A about c# - HtmlAgilityPack - getting data from an HTML table is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/8580762/
