regex - How do I use a Rust regex on bytes (Vec<u8> or &[u8])?

Tags: regex, rust

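I have a &[u8] and I need to verify that it conforms to some pattern. There are examples of using regular expressions on &[u8] in the Regex documentation and in the module documentation. I took the code from the examples section, put it inside main(), and added a few declarations: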

extern crate regex;
use regex::Regex;

fn main() {
    let re = Regex::new(r"'([^']+)'\s+\((\d{4})\)").unwrap();
    let text = b"Not my favorite movie: 'Citizen Kane' (1941).";
    let caps = re.captures(text).unwrap();
    assert_eq!(&caps[1], &b"Citizen Kane"[..]);
    assert_eq!(&caps[2], &b"1941"[..]);
    assert_eq!(&caps[0], &b"'Citizen Kane' (1941)"[..]);
    // You can also access the groups by index using the Index notation.
    // Note that this will panic on an invalid index.
    assert_eq!(&caps[1], b"Citizen Kane");
    assert_eq!(&caps[2], b"1941");
    assert_eq!(&caps[0], b"'Citizen Kane' (1941)");
}

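I do not understand how this example code differs from regular string matching, and indeed the compiler complains that it needs a &str. In general, nothing in the code hints at how it differs from ordinary string matching, which I have no trouble with.

I assume I have made some fundamental mistake, such as a missing or insufficiently precise import. I am playing a guessing game here because the docs fail to provide working examples (as they regularly do), and this time the compiler also fails to nudge me in the right direction.

Here are the compiler messages: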

error[E0308]: mismatched types
 --> src/main.rs:7:28
  |
7 |     let caps = re.captures(text).unwrap();
  |                            ^^^^ expected str, found array of 45 elements
  |
  = note: expected type `&str`
             found type `&[u8; 45]`

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8]>` is not satisfied
 --> src/main.rs:8:5
  |
8 |     assert_eq!(&caps[1], &b"Citizen Kane"[..]);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8]`
  |
  = help: the trait `std::cmp::PartialEq<[u8]>` is not implemented for `str`
  = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8]>` for `&str`
  = note: this error originates in a macro outside of the current crate

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8]>` is not satisfied
 --> src/main.rs:9:5
  |
9 |     assert_eq!(&caps[2], &b"1941"[..]);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8]`
  |
  = help: the trait `std::cmp::PartialEq<[u8]>` is not implemented for `str`
  = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8]>` for `&str`
  = note: this error originates in a macro outside of the current crate

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8]>` is not satisfied
  --> src/main.rs:10:5
   |
10 |     assert_eq!(&caps[0], &b"'Citizen Kane' (1941)"[..]);
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8]`
   |
   = help: the trait `std::cmp::PartialEq<[u8]>` is not implemented for `str`
   = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8]>` for `&str`
   = note: this error originates in a macro outside of the current crate

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8; 12]>` is not satisfied
  --> src/main.rs:13:5
   |
13 |     assert_eq!(&caps[1], b"Citizen Kane");
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8; 12]`
   |
   = help: the trait `std::cmp::PartialEq<[u8; 12]>` is not implemented for `str`
   = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8; 12]>` for `&str`
   = note: this error originates in a macro outside of the current crate

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8; 4]>` is not satisfied
  --> src/main.rs:14:5
   |
14 |     assert_eq!(&caps[2], b"1941");
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8; 4]`
   |
   = help: the trait `std::cmp::PartialEq<[u8; 4]>` is not implemented for `str`
   = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8; 4]>` for `&str`
   = note: this error originates in a macro outside of the current crate

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8; 21]>` is not satisfied
  --> src/main.rs:15:5
   |
15 |     assert_eq!(&caps[0], b"'Citizen Kane' (1941)");
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8; 21]`
   |
   = help: the trait `std::cmp::PartialEq<[u8; 21]>` is not implemented for `str`
   = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8; 21]>` for `&str`
   = note: this error originates in a macro outside of the current crate

Best answer

and added a few declarations

Unfortunately, you added the wrong ones. Note that the documentation you linked to is for the struct regex::bytes::Regex, not regex::Regex. They are two different types!

extern crate regex;
use regex::bytes::Regex;
//         ^^^^^

fn main() {
    let re = Regex::new(r"'([^']+)'\s+\((\d{4})\)").unwrap();
    let text = b"Not my favorite movie: 'Citizen Kane' (1941).";
    let caps = re.captures(text).unwrap();

    assert_eq!(&caps[1], &b"Citizen Kane"[..]);
    assert_eq!(&caps[2], &b"1941"[..]);
    assert_eq!(&caps[0], &b"'Citizen Kane' (1941)"[..]);

    assert_eq!(&caps[1], b"Citizen Kane");
    assert_eq!(&caps[2], b"1941");
    assert_eq!(&caps[0], b"'Citizen Kane' (1941)");
}

as the docs fail to provide working examples (as they regularly do)

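Note that code blocks in the documentation are compiled and executed by default, so in my experience it is pretty rare for the examples not to work.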
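To illustrate how documentation examples get exercised: rustdoc treats code blocks inside doc comments as tests, and cargo test compiles and runs them for library crates. The following is only a sketch, assuming a library crate with the hypothetical name my_crate and a hypothetical function is_four_digit_year (neither comes from the regex crate). The inner example uses a ~~~ fence, which as a CommonMark fence should be handled by rustdoc the same way as a backtick fence:

```rust
/// Returns `true` if `input` is exactly four ASCII digits.
///
/// ~~~
/// use my_crate::is_four_digit_year; // `my_crate` is a hypothetical crate name
///
/// assert!(is_four_digit_year(b"1941"));
/// assert!(!is_four_digit_year(b"194x"));
/// ~~~
pub fn is_four_digit_year(input: &[u8]) -> bool {
    input.len() == 4 && input.iter().all(u8::is_ascii_digit)
}
```

If the example in the doc comment stops compiling or one of its assertions fails, that shows up as a failing test, which is why published examples rarely go stale.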

This question, "regex - How do I use a Rust regex on bytes (Vec<u8> or &[u8])?", comes from Stack Overflow: https://stackoverflow.com/questions/46491927/
