php - Remove \r\n and escape characters from HTML

Tags: php html regex preg-replace imap

I have the following HTML, which I extracted from an email using imap_fetchbody:

<div dir=\"ltr\"><br><div class=\"gmail_quote\"><div dir=\"ltr\"><br><div class=\"gmail_quote\"><div class=\"\">
---------- Forwarded message ----------<br>
<span style=\"font-family:&quot;Helvetica&quot;,&quot;sans-serif&quot;\"><\/span>
From: <span style=\"font-family:&quot;Helvetica&quot;,&quot;sans-serif&quot;\">&quot;
<span>xyz<\/span>&quot; &lt;<a href=\"mailto:[email protected]\" target=\"_blank\">support@<span>xyz<\/span>.com<\/a>&gt;<\/span><br>
\r\n\r\n\r\n\r\nDate: Fri, Apr 18, 2014 at 7:17 PM<br>
Subject: Bla bla xyz<br><\/div><div><div class=\"h5\">To: XYZ &lt;<a href=\"mailto:[email protected]\" target=\"_blank\">[email protected]<\/a>&gt;<br><br><br>\r\n\r\n<div dir=\"ltr\">\r\n\r\n\r\n\r\n
<div class=\"gmail_quote\"><div><div><div dir=\"ltr\"><div class=\"gmail_quote\"><div dir=\"ltr\"><div><div class=\"gmail_quote\">
<div dir=\"ltr\"><div><div><div class=\"gmail_quote\"><div style=\"word-wrap:break-word\" lang=\"EN-US\">\r\n\r\n\r\n\r\n
<div>
<div>
<div>
<blockquote style=\"margin-top:5pt;margin-bottom:5pt\">
<div><div>
<table style=\"width:100%;background:none repeat scroll 0% 0% rgb(207,207,207)\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\" width=\"100%\">
<tbody>
<tr>\r\n\r\n\r\n\r\n
<td style=\"width:325pt;padding:0in\" width=\"650\">\r\n\r\n<div align=\"center\"><table style=\"width:325pt;background:none repeat scroll 0% 0% rgb(207,207,207)\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\" width=\"650\">\r\n\r\n\r\n\r\n
<tbody><tr>
<td style=\"padding:0in 0in 5.25pt\"><p style=\"text-align:center\" align=\"center\">
<span style=\"font-size:7.5pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:rgb(64,64,64)\">If you are unable to see this message, 
<a href=\"http:\/\/click.e.xyz.com\/?qs=3771d7c90c958f02a4b2e78494f12a3116ddb15df79b8d04cdf5aeba42012b118\" target=\"_blank\">
<span style=\"color:rgb(64,64,64)\">click here<\/span><\/a> to view.<br>
\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\nTo ensure delivery to your inbox, please add <a href=\"mailto:[email protected]\" target=\"_blank\">[email protected]<\/a> to your address book. <\/span><\/p>
<\/td>
<\/tr>
<\/tbody>
<\/table>
<\/div><\/div><\/div><\/div>

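I want to get rid of every \r\n while still keeping the HTML's <> tags. I tried stripslashes, stripcslashes, nl2br, and htmlspecialchars_decode, but none of them gave me what I wanted. This is what I tried, together with the imap_qprint function: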

$text = stripslashes(imap_qprint($text));
$body = preg_replace('/(\v|\s)+/', ' ', $text );

Result: it does not remove all of the whitespace characters.

Best answer

Match the following regular expression:

(\\r|\\n|\\) with the g modifier

and replace it with

'' (the empty string)

Demo: http://regex101.com/r/mS3wM2
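
For anyone applying this in PHP, here is a minimal sketch of the pattern above passed to preg_replace. The sample string and the variable names ($escaped, $pattern, $clean) are made up for illustration and only imitate the shape of the HTML in the question. Two assumptions are worth stating: the \r\n in the input are literal backslash-letter pairs rather than real CR/LF bytes (which would explain why \s does not match them), and PHP needs no g modifier because preg_replace already replaces every match.

```php
<?php
// Hypothetical sample shaped like the question's HTML. Single quotes keep
// the backslashes literal, so each \r and \n below is two characters.
$escaped = '<div dir=\"ltr\">\r\n\r\nDate: Fri<br><\/div>';

// The answer's pattern (\\r|\\n|\\) inside PHP delimiters. Every regex
// backslash is doubled again for the PHP string literal. The alternation
// order matters: \r and \n are tried before the bare backslash.
$pattern = '/(\\\\r|\\\\n|\\\\)/';

// preg_replace replaces all matches by default.
$clean = preg_replace($pattern, '', $escaped);

// For comparison: stripslashes only drops the backslash, so the
// letters r and n stay behind in the text.
$stripped = stripslashes($escaped);

echo $clean, PHP_EOL;    // <div dir="ltr">Date: Fri<br></div>
echo $stripped, PHP_EOL; // <div dir="ltr">rnrnDate: Fri<br></div>
```

Note that this pattern removes every backslash, so it is only suitable when the content has no backslashes that should survive. If the input also contains real line-break bytes, they would need a separate pass, for example a character class such as [\r\n].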

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/23171173/
