regex - Error when using the ismatch() function with a regular expression in Julia

Tags: regex function julia ijulia-notebook

I am trying to write a very simple program that finds matches using the ismatch() function in Julia. Suppose my pattern is

e_pat = r".+@.+"

Then I create a list named input that contains some arbitrary elements:

input = ["pipo@gmail.com", 23, "trapo", "holi@gmail.com"]

Now I want to determine how many matches there are and then print them, using e_pat as the reference:

for i in input
    println(ismatch(e_pat, i)) && println(i)
end

With that code I only get "true", followed by this error:

true

TypeError: non-boolean (Void) used in boolean context

Stacktrace:
 [1] macro expansion at ./In[27]:4 [inlined]
 [2] anonymous at ./<missing>:?
 [3] include_string(::String, ::String) at ./loading.jl:522

What should I do to get the following output?

"pipo@gmail.com"
"holi@gmail.com"

I read the ismatch() documentation but did not find anything helpful. Any help would be much appreciated.

Best answer

The problem is that, although this expression returns true:

julia> @show ismatch(e_pat, "pipo@gmail.com");
ismatch(e_pat,"pipo@gmail.com") = true

wrapping it in println only prints true and returns nothing:

julia> @show println(ismatch(e_pat, "pipo@gmail.com"));
true
println(ismatch(e_pat,"pipo@gmail.com")) = nothing

whose type is Void:

julia> typeof(nothing)
Void

The error tells you that an object of type Void cannot be used in a boolean context. nothing is simply the only instance of Void, which Julia treats as a singleton type:

julia> nothing && true
ERROR: TypeError: non-boolean (Void) used in boolean context
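
The same behavior can be checked in current Julia. This is a minimal sketch assuming Julia 1.x, where Void has been renamed Nothing (so the message reads "non-boolean (Nothing)"); the helper name left_of_and is hypothetical and exists only for illustration:

```julia
# println prints its argument and returns `nothing`, not the value it printed.
ret = println(true)
println(ret === nothing)

# Hypothetical helper: puts an arbitrary value on the left-hand side of &&.
left_of_and(x) = x && true

# A non-Bool on the left of && raises a TypeError at run time.
threw_type_error = try
    left_of_and(ret)
    false
catch err
    err isa TypeError
end
println(threw_type_error)
```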

Once that is fixed, note that there is a second error:

julia> @show ismatch(e_pat, 42);
ERROR: MethodError: no method matching ismatch(::Regex, ::Int32)
Closest candidates are:
  ismatch(::Regex, ::SubString{T<:AbstractString}) at regex.jl:151
  ismatch(::Regex, ::SubString{T<:AbstractString}, ::Integer) at regex.jl:151
  ismatch(::Regex, ::AbstractString) at regex.jl:145
  ...

This tells you that ismatch has no method for that combination of argument types, (Regex, Int), so it cannot be called that way.
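
As a rough illustration of the same point in current Julia, here is a sketch assuming Julia 1.x, where occursin(pattern, string) replaces the removed ismatch. The helper safe_match is a hypothetical name, not part of Base; it simply skips values that are not strings instead of raising an error:

```julia
e_pat = r".+@.+"

# Calling the matcher with an Int fails because no such method is defined.
is_method_error = try
    occursin(e_pat, 23)
    false
catch err
    err isa MethodError
end
println(is_method_error)

# Hypothetical guard: only strings are tested; anything else counts as "no match".
safe_match(pat, x) = x isa AbstractString && occursin(pat, x)

println(safe_match(e_pat, 23))
println(safe_match(e_pat, "pipo@gmail.com"))
```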

You can do something like this to make sure every element is a String:

julia> input = string.(["pipo@gmail.com", 23, "trapo", "holi@gmail.com"])
4-element Array{String,1}:
 "pipo@gmail.com"
 "23"
 "trapo"
 "holi@gmail.com"

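Finally, to debug what is going on, you can use the @show macro (which prints the expression and its result, then returns the result) instead of the println function (which prints the result and returns nothing):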

julia> for i in input
           @show(ismatch(e_pat, i)) && println(i)
       end
ismatch(e_pat,i) = true
pipo@gmail.com
ismatch(e_pat,i) = false
ismatch(e_pat,i) = false
ismatch(e_pat,i) = true
holi@gmail.com

So, to print your expected result, just remove the left-hand println:

julia> for i in input
           ismatch(e_pat, i) && println(i)
       end
pipo@gmail.com
holi@gmail.com
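
For readers on a newer release, the loop above can be sketched as follows. This assumes Julia 1.x, where ismatch was removed in favor of occursin; the sample addresses are illustrative values:

```julia
e_pat = r".+@.+"

# Convert every element to a String first, as in the answer above.
input = string.(["pipo@gmail.com", 23, "trapo", "holi@gmail.com"])

for s in input
    # occursin returns a Bool, so it is safe on the left-hand side of &&.
    occursin(e_pat, s) && println(s)
end
```

The short-circuit form `cond && action` behaves like a one-line `if cond; action; end`, which is why the left-hand side must evaluate to a Bool.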

If you want to store them instead of printing them, you can use an array comprehension:

julia> result = [str for str in input if ismatch(e_pat, str)]
2-element Array{String,1}:
 "pipo@gmail.com"
 "holi@gmail.com"

Or an indexing expression like this:

julia> ismatch.(e_pat, input)
4-element BitArray{1}:
  true
 false
 false
  true

julia> result = input[ismatch.(e_pat, input)]
2-element Array{String,1}:
 "pipo@gmail.com"
 "holi@gmail.com"

That way you can print them later without having to recompute anything:

julia> println.(result)
pipo@gmail.com
holi@gmail.com
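
Since the question also asks how many matches there are, the boolean mask can be reused for counting. This is a sketch assuming Julia 1.x (occursin in place of ismatch); wrapping the pattern in Ref is a conservative choice so that the regex is treated as a single value while broadcasting over the strings:

```julia
e_pat = r".+@.+"
input = string.(["pipo@gmail.com", 23, "trapo", "holi@gmail.com"])

# One Bool per element: true where the pattern matches.
mask = occursin.(Ref(e_pat), input)

# count on a boolean collection returns the number of true entries.
n_matches = count(mask)

# Logical indexing keeps only the matching strings.
result = input[mask]

println(n_matches)
foreach(println, result)
```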

Regarding "regex - Error when using the ismatch() function with a regular expression in Julia", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48716095/
