ruby - Extracting URLs (into an array) in Ruby

Good afternoon,

I'm learning how to use regular expressions in Ruby and have hit a problem I need some help with. I'm trying to extract zero or more URLs from a string.

Here is the code I'm using:

sStrings = ["hello world: http://www.google.com", "There is only one url in this string http://yahoo.com . Did you get that?", "The first URL in this string is http://www.bing.com and the second is http://digg.com","This one is more complicated http://is.gd/12345 http://is.gd/4567?q=1", "This string contains no urls"]
sStrings.each  do |s|
  x = s.scan(/((http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.[\w-]*)?)/ix)
  x.each do |url|
    puts url
  end
end

Here is what it returns:

http://www.google.com
http
.google
nil
nil
http://yahoo.com
http
nil
nil
nil
http://www.bing.com
http
.bing
nil
nil
http://digg.com
http
nil
nil
nil
http://is.gd/12345
http
nil
/12345
nil
http://is.gd/4567
http
nil
/4567
nil

What is the best way to extract only the full URLs, and not the sub-matches of the regex?

Best answer

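You can use non-capturing groups (?:...) instead of (...). When a pattern contains capture groups, String#scan returns the captured groups for each match rather than the full matched text, which is why the sub-matches (and nil for groups that did not participate) show up in your output.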
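
As a rough sketch of that suggestion, here is the pattern from the question with each (...) replaced by (?:...). The names URL_PATTERN and extract_urls are made up for illustration and are not part of the original code. The pattern itself is otherwise unchanged, so it keeps the same limitations as before (for example, it stops at the ? of a query string).

```ruby
# Same pattern as in the question, with every (...) changed to (?:...).
# The /x flag is dropped here because the pattern has no whitespace or comments.
URL_PATTERN = /(?:http|https):\/\/[a-z0-9]+(?:[\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(?:(?:[0-9]{1,5})?\/.[\w\-]*)?/i

# Hypothetical helper: with no capture groups in the pattern,
# scan returns a flat array of the full matches.
def extract_urls(text)
  text.scan(URL_PATTERN)
end

strings = [
  "hello world: http://www.google.com",
  "The first URL in this string is http://www.bing.com and the second is http://digg.com",
  "This one is more complicated http://is.gd/12345 http://is.gd/4567?q=1",
  "This string contains no urls"
]

strings.each do |s|
  p extract_urls(s)
end
```

Each call now yields only whole URLs, and a string without URLs yields an empty array, so no nil filtering is needed.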

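I see you are doing this to learn regular expressions, but if you really just want to extract URLs from a string, take a look at URI.extract, which extracts URIs from a string. (You need to require 'uri' to use it.)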
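
A minimal sketch of the URI.extract route, assuming you only care about http and https links. The helper name extract_http_urls is hypothetical. Note that newer Ruby versions flag URI.extract as obsolete (it warns when run in verbose mode), so treat this as an illustration rather than a recommendation.

```ruby
require 'uri'

# Hypothetical helper: restrict extraction to the http and https schemes
# so that other "word:" sequences in the text are not picked up.
def extract_http_urls(text)
  URI.extract(text, %w[http https])
end

p extract_http_urls("The first URL in this string is http://www.bing.com and the second is http://digg.com")
p extract_http_urls("This string contains no urls")
```

Unlike the hand-written pattern, this approach follows the URI grammar, so parts such as query strings are kept as part of the extracted URL.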

Regarding "ruby - Extracting URLs (into an array) in Ruby", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/2592174/
