php - Removing styling from HTML

Tags: php html css regex vba

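I have a database full of product descriptions that are riddled with horrible computer-generated HTML and interspersed with all kinds of styling information: style attributes, font tags, background attributes...

I have to redesign the site, but first I need to strip all the styling out of the product descriptions. Before anybody suggests doing it by hand, there are 100,000 products. I think some creative regex in PHP might do the trick.

Ideally I would remove all the HTML and keep only plain text, but the descriptions contain tables, and tables within tables... so that would only end in tears.

Looking forward to your creative solutions :)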

Edit-

On second thought, I could also do this in VBA, since I can export the descriptions to an Excel sheet. So either a PHP or a VBA solution would be great.

Edit-

    <div class="XXXX-template-06">
          <table border="0" cellpadding="0" cellspacing="0" style="border-collapse: collapse" bordercolor="#111111" width="694" id="AutoNumber1">
            <tbody><tr>
              <td width="516" height="18" bgcolor="#999966" align="center">
              <p align="center"><font face="Verdana" color="#FFFFFF"><b>Mont Blanc Scott Roof mounted cycle bike carrier<br>
              <br>
              Part Number: 728540</b></font></p></td>
              <td width="178" height="18" bgcolor="#999966" align="center">
              <a href="/shippingcalculator.html?SKU=728540" target="_blank"><img border="0" src="http://images.ZZZZpro.com/2145/" width="88" height="33"></a></td>
            </tr>
            <tr>
              <td width="694" height="57" bgcolor="#CCCC99" align="center" colspan="2">
              <b><font face="Verdana" size="2" class="CustomStyle-CycleCarrier">
    <script type="text/javascript">
    <!--function click() { if (event.button==2) { alert('All graphics, descriptions and other information, including the HTML code of this listing are the property of XXXX Limited and may not be reproduced in any form without the express permission of XXXX Limited. Email us: sales@XXXX.com'); } } document.onmousedown=click // -->
    <!---->
    <!---->
    <!---->
    <!---->
    <!---->
    <!---->
    <!---->
    <!---->
    <!---->
    <!---->
    <!---->
    <!----> -->
    </script>


    <div align="center">
      <center>
        <table height="336" background="http://images.ZZZZpro.com/2145/I/21/fade1.jpg" width="680" border="0">
          <tbody><tr>
            <td height="49" width="136"><p align="center"><img height="62" src="http://XXXXbiz.ipage.com/XXXX/Images/Mont%20Blanc/montblanc.jpg" width="165" border="0"></p></td>
            <td height="49" width="378"><p align="center"><font face="Verdana" color="#0000ff" size="5"><u><strong>Mont Blanc </strong></u></font><u><strong><font face="Verdana" color="#0000FF" size="5">Scott Roof Bar Rack 1 Cycle Carrier</font></strong></u></p></td>
            <td height="49" width="146"><img height="69" src="http://images.ZZZZpro.com/2145/I/20/logomed.gif" width="174" border="0"></td>
          </tr>
          <tr>
            <td height="241" colspan="3" width="672"><hr><p align="center"><img height="223" src="http://XXXXbiz.ipage.com/XXXX/Images/Mont%20Blanc/scottlrg.jpg" width="237" border="0"></p><p><font color="black"><b>Scott</b> </font></p><ul><li>Stylish, easy to use roof mounted cycle carrier, distinctive oval carrying bar.<br></li><li>Extra Soft Frame clamps hold cycle safely and gently<br></li><li>Extra wide wheel holders take the fattest tyres<br></li><li>Strong Webbing straps fasten wheels securely to carrier<br></li><li><font size="3" color="black">Upright, roof bar mounted, locking cycle carrier<br></font></li><li><font size="3" color="black">&nbsp;Locks to roof rails and locks bikes<br></font></li><li><font size="3" color="black">&nbsp;Quick and easy to use<br></font></li><li><font size="3" color="black">Adjustable for most cycle styles</font></li></ul><center><table cellspacing="0" width="100%" cellpadding="20" border="0" height="1" class="featuretable">
                  <tbody><tr>
                    <td height="55" class="featuretd" width="110"><p align="center"><a target="_blank" href="http://www.montblancuk.co.uk/support/inst/scott.pdf"><img width="20" alt="Open document" src="http://espimages.biz/2145/I/20/mount_link.gif" border="0" height="20"></a></p></td>
                    <td height="55" class="featuretd">To view Fitting Instructions in PDF format please click the spanner</td>
                  </tr>
                </tbody></table>
                <table height="317">
                  <tbody><tr class="technicaltr" valign="top">
                    <td height="1" class="technicalfirstcolumn"><font class="technicalheader">Technical data</font></td>
                    <td height="1" class="technicalsecondcolumn"><p><font class="heading1">Mont </font>Blanc Scott</p><p align="center"><img height="107" src="http://XXXXbiz.ipage.com/XXXX/Images/Mont%20Blanc/scottfaint.jpg" width="127" border="0"></p></td>
                  </tr>
                  <tr class="technicaltr" valign="top">
                    <td height="21" class="technicalfirstcolumn"><div>Max number of bikes</div></td>
                    <td height="21" class="technicalsecondcolumn"><div>1</div></td>
                  </tr>
                  <tr class="technicaltr" valign="top">
                    <td height="18" class="technicalfirstcolumn"><div>Load capacity (kg)</div></td>
                    <td height="18" class="technicalsecondcolumn"><div>15 KG</div></td>
                  </tr>
                  <tr class="technicaltr" valign="top">
                    <td height="21" class="technicalfirstcolumn"><div>Weight (kg)</div></td>
                    <td height="21" class="technicalsecondcolumn"><div>2.2KG</div></td>
                  </tr>
                  <tr class="technicaltr" valign="top">
                    <td height="21" class="technicalfirstcolumn"><div>Fits frame-dimensions (mm)</div></td>
                    <td height="21" class="technicalsecondcolumn">Up to 80mm</td>
                  </tr>
                  <tr class="technicaltr" valign="top">
                    <td height="21" class="technicalfirstcolumn"><div>Fits wheel-dimensions</div></td>
                    <td height="21" class="technicalsecondcolumn"><div>All</div></td>
                  </tr>
                  <tr class="technicaltr" valign="top">
                    <td height="21" class="technicalfirstcolumn"><div>Locks bikes to carrier</div></td>
                    <td height="21" class="technicalsecondcolumn"><div>Yes</div></td>
                  </tr>
                  <tr class="technicaltr" valign="top">
                    <td height="21" class="technicalfirstcolumn"><div>Locks carrier to car</div></td>
                    <td height="21" class="technicalsecondcolumn"><div>Yes</div></td>
                  </tr>
                  <tr class="technicaltr" valign="top">
                    <td height="21" class="technicalfirstcolumn"><div>Tilt function, with bikes</div></td>
                    <td height="21" class="technicalsecondcolumn"><div>NA</div></td>
                  </tr>
                  <tr class="technicaltr" valign="top">
                    <td height="21" class="technicalfirstcolumn"><div>TÜV/EuroBE approved</div></td>
                    <td height="21" class="technicalsecondcolumn"><div>NA</div></td>
                  </tr>
                  <tr class="technicaltr" valign="top">
                    <td height="21" class="technicalfirstcolumn"><div>Fullfills City Crash norms</div></td>
                    <td height="21" class="technicalsecondcolumn"><div>NA</div></td>
                  </tr>
                  <tr class="technicaltr" valign="top">
                    <td height="21" class="technicalfirstcolumn"><div>Miscellaneous</div></td>
                    <td height="21" class="technicalsecondcolumn"><div><p>Fits all types of Roof Bars,</p></div></td>
                  </tr>
                </tbody></table>
                <p align="center">
                  <font size="2" face="Verdana">The cycle carrier is 
                  guaranteed for Five year from date of purchase.                  
<br>                  
<br>We stock a wide range of towbars and towing accessories.                   
<a href="mailto:sales@XXXX.com?subject=Witter ZX88 Cycle Carrier"><br>Click 
                  here to email us</a> if you require details of our other 
                  towing equipment.</font>
                </p>


<hr>                
              </center>

            </td>

          </tr>
        </tbody></table>
      </center>

    </div>

  <br>
              Please note that with the Type of cycle carrier where you mount it
              <br>
              onto a flange ball you may need the long reach ball which will <br>
              allow you enough clearance from the bumper</font></b></td>
            </tr>
            <tr>
              <td width="694" height="57" bgcolor="#CCCC99" align="center" colspan="2">
              <a href="http://www.XXXXeuro.ZZZZprostorefront.co.uk/products/728540-mont-blanc-scott-roof-mounted-cycle-bike-carrier-728540.html" target="_blank"><img border="0" src="http://images.ZZZZpro.com/2145/" width="55" height="40"></a>
              <b><font face="Verdana" size="2">Not from the UK ? Click the flag
              to purchase this item from our EU site </font></b><a href="http://www.XXXXeuro.ZZZZprostorefront.co.uk/products/728540-mont-blanc-scott-roof-mounted-cycle-bike-carrier-728540.html" target="_blank"><img border="0" src="http://images.ZZZZpro.com/2145/" width="57" height="40"></a></td>
            </tr>
          </tbody></table>
</div>

Edit-

Going through it, I think I need to get rid of the following:

Attributes: style, bgcolor, background

Tags: font

Best answer

I would suggest using XSLT to strip out everything you don't want. A simple identity template would be a good starting point.
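As a rough illustration of the identity-template idea, here is a minimal PHP sketch. It is not taken from the original answer: the function name `stripStyles` and the wrapper id `strip-root` are made up for this example, and it assumes PHP's `dom` and `xsl` extensions are enabled. The stylesheet copies everything unchanged, drops the `style`, `bgcolor` and `background` attributes, and unwraps `font` elements while keeping their contents.

```php
<?php
// Sketch only: stripStyles() is a hypothetical helper, not a library function.
// Requires the dom and xsl extensions.
function stripStyles(string $html): string
{
    $xsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Drop the attributes listed in the question. -->
  <xsl:template match="@style|@bgcolor|@background"/>
  <!-- Unwrap font elements: keep the children, discard the tag. -->
  <xsl:template match="font">
    <xsl:apply-templates select="node()"/>
  </xsl:template>
</xsl:stylesheet>
XSL;

    // Collect parser warnings silently; this kind of markup is rarely valid.
    $previous = libxml_use_internal_errors(true);

    // Wrap the fragment so it can be located again after the transform.
    // The XML prolog is a common trick to make libxml read the input as UTF-8.
    $source = new DOMDocument();
    $source->loadHTML('<?xml encoding="UTF-8"><div id="strip-root">' . $html . '</div>');

    $stylesheet = new DOMDocument();
    $stylesheet->loadXML($xsl);

    $processor = new XSLTProcessor();
    $processor->importStylesheet($stylesheet);
    $result = $processor->transformToDoc($source);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($result === false) {
        throw new RuntimeException('XSLT transformation failed');
    }

    $root = (new DOMXPath($result))->query('//div[@id="strip-root"]')->item(0);
    if ($root === null) {
        throw new RuntimeException('Wrapper element not found after transform');
    }

    // Serialize only the children of the wrapper, i.e. the cleaned fragment.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $result->saveHTML($child);
    }
    return $out;
}
```

Calling `stripStyles()` on each description and writing the result back would be one way to process all 100,000 rows. Further attributes or tags (for example `face`, `align`, or `center`) could be handled by adding more empty or unwrapping templates. Since the parser has to repair invalid markup, it is worth spot-checking the output on a sample of real descriptions before running it over the whole table.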

Source: the original question "php - Removing styling from HTML" on Stack Overflow: https://stackoverflow.com/questions/7119923/
