I'm trying to write a Windows batch file that looks at a specific HTML file, which looks like this (simplified):
<input name="pattern" value="*.var" type="text" /><img style="width: 16px; height: 16px; vertical-align:middle; cursor:pointer" onclick="this.parentNode.submit()" class="icon-go-next icon-sm" src="/static/474743c8/images/16x16/go-next.png" /></form></div><table class="fileList"><tr><td><img style="width: 16px; height: 16px; " class="icon-text icon-sm" src="/static/474743c8/images/16x16/text.png" /></td><td><a href="./address.var.varapplication-varapplication-varwebservice-05.05.07-SNAPSHOT.var">address.var.varapplication-varapplication-varwebservice-05.05.07-SNAPSHOT.var</a></td><td class="fileSize">133.49 MB</td><td><a href="./address.var.varapplication-varapplication-varwebservice-05.05.07-SNAPSHOT.var/*fingerprint*/"><img style="width: 16px; height: 16px; " class="icon-fingerprint icon-sm" src="/static/474743c8/images/16x16/fingerprint.png" /></a> <a href="./address.var.varapplication-varapplication-varwebservice-05.05.07-SNAPSHOT.var/*view*/">view</a></td></tr><tr><td style="text-align:right;" colspan="3"><div style="margin-top: 1em;"><a href="./*.var/*zip*/target.zip"><img style="width: 16px; height: 16px; " class="icon-package icon-sm" src="/static/474743c8/images/16x16/package.png" />
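and uses the build version (for example 05.05.07-SNAPSHOT; next time it will be a different version, but the format stays the same) as a variable in another batch file. I tried findstr, without success: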
for /F "delims=" %%a in ('findstr /ic "webservice" a.html') do set "line=%%a"
set "line=%line:*webservice=%"
for /F "delims=" %%a in ("%line%") do set string=%%a
for %%b in ("%line%") do @ set "var=%%b"
SET build=%var:~-11,8%
ECHO. %build%
Best answer
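When parsing structured markup, it is best to treat it as a hierarchical object rather than as flat text. Navigating the hierarchy is not only easier than trying to match strings against tokens or regular expressions; the object-oriented approach is also more resistant to formatting changes (whether the code is minified, beautified, has line breaks introduced, or whatever).
With that in mind, I suggest using querySelectorAll to select the anchor tags that are descendants of the table element with the class name "fileList", then using a regular expression to grab the version information from each anchor's href attribute.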
@if (@CodeSection == @Batch) @then
@echo off & setlocal
set "html=test.html"
for /f "delims=" %%I in ('cscript /nologo /e:JScript "%~f0" "%html%"') do set "%%I"
echo %build%
goto :EOF
@end // end batch / begin JScript hybrid code
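// Load the HTML file named on the command line into an "htmlfile" COM document.
var htmlfile = WSH.CreateObject('htmlfile'),
    fso = WSH.CreateObject('Scripting.FileSystemObject'),
    file = fso.OpenTextFile(WSH.Arguments(0), 1), // 1 = ForReading
    html = file.ReadAll();
file.Close();

// The x-ua-compatible meta tag asks the parser for IE9 document mode, where querySelectorAll is supported.
htmlfile.write('<meta http-equiv="x-ua-compatible" content="IE=9" />' + html);

// Echo build=<version> for the first matching link; the batch "for /f" loop hands that line to "set".
var anchors = htmlfile.querySelectorAll('table.fileList a');
for (var i = 0; i < anchors.length; i++) {
    if (/webservice-((\d+\.)*\d.+)\.var$/i.test(anchors[i].href)) {
        WSH.Echo('build=' + RegExp.$1);
        WSH.Quit(0);
    }
}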
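To see what the regular expression in the script above captures, here is a minimal sketch that runs the same pattern against two href values from the sample HTML. The helper name extractBuild is hypothetical and not part of the original script. The href values returned by the htmlfile object may carry a resolved prefix in front of the relative path; the pattern is not anchored at the start, so that should not matter.

```javascript
// Sketch only: extractBuild is a hypothetical helper, not part of the original answer.
function extractBuild(href) {
    // webservice-      literal text just before the version
    // ((\d+\.)*\d.+)   group 1: dotted digits plus any suffix, e.g. 05.05.07-SNAPSHOT
    // \.var$           the href must end in ".var"
    var match = /webservice-((\d+\.)*\d.+)\.var$/i.exec(href);
    return match ? match[1] : null;
}

// href values copied from the sample HTML in the question
var fileHref = './address.var.varapplication-varapplication-varwebservice-05.05.07-SNAPSHOT.var';
var viewHref = fileHref + '/*view*/';

var build = extractBuild(fileHref);   // plain file link: ends in ".var"
var skipped = extractBuild(viewHref); // view link: does not end in ".var"
```

The first call yields 05.05.07-SNAPSHOT. The second yields null because the href does not end in .var, which is why the fingerprint and view links in the same table row are skipped.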
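Even cooler, if the HTML file you are scraping is served by a web server, you can use the Microsoft.XMLHTTP methods to retrieve the HTML without depending on wget, curl, or a similar tool. This takes only a few small changes to the code above.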
@if (@CodeSection == @Batch) @then
@echo off & setlocal
set "URL=http://www.domain.com/file.html"
for /f "delims=" %%I in ('cscript /nologo /e:JScript "%~f0" "%URL%"') do set "%%I"
echo %build%
goto :EOF
@end // end batch / begin JScript hybrid code
var xhr = WSH.CreateObject('Microsoft.XMLHTTP'),
    htmlfile = WSH.CreateObject('htmlfile');

// Send the request asynchronously, then poll until it completes (readyState 4).
xhr.open('GET', WSH.Arguments(0), true);
xhr.setRequestHeader('User-Agent', 'XMLHTTP/1.0');
xhr.send('');
while (xhr.readyState != 4) WSH.Sleep(50);

// Parse the response text the same way as the local-file version.
htmlfile.write('<meta http-equiv="x-ua-compatible" content="IE=9" />' + xhr.responseText);
var anchors = htmlfile.querySelectorAll('table.fileList a');
for (var i = 0; i < anchors.length; i++) {
    if (/webservice-((\d+\.)*\d.+)\.var$/i.test(anchors[i].href)) {
        WSH.Echo('build=' + RegExp.$1);
        WSH.Quit(0);
    }
}
Source: the Stack Overflow question "html - Windows batch file to find a variable string in an HTML file": https://stackoverflow.com/questions/38636600/