rust - Equivalent of F#'s cons pattern for strings in Rust

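I'm experimenting with Rust by reimplementing a small piece of my F# code.

I've reached the point where I want to destructure a string. Here is the F#: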

let rec internalCheck acc = function
    | w :: tail when Char.IsWhiteSpace(w) -> 
        internalCheck acc tail
    | other
    | matches
    | here

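..which can be called like this: internalCheck [] "String here", where the :: operator means that the right-hand side is "the rest of the list".

So I looked at the Rust documentation, and there are examples of destructuring vectors like this: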

let v = vec![1,2,3];

match v {
    [] => ...
    [first, second, ..rest] => ...
}

..and so on. However, this is now behind the slice_patterns feature gate. I tried something similar:

match input.chars() {
    [w, ..] => ...
}

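This told me that the feature gate can only be used on a non-stable release.

So I downloaded multirust and installed the latest nightly I could find (2016-01-05), and when I finally got the slice_patterns feature working... I ran into endless errors about the syntax and about "rest" (in the example above) not being allowed.

So, is there an equivalent way to destructure strings in Rust, using something like the :: functionality? Basically I want to match 1 character with a guard and use "everything else" in the expression that follows.

It's perfectly acceptable if the answer is "No, there isn't". I certainly can't find many examples of this sort anywhere online, and slice pattern matching doesn't seem to be high on the feature list.

(I'll happily delete the question if I've missed something in the Rust documentation)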

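Best answer

You can pattern match on a byte slice (this uses the nightly-only slice_patterns syntax that was current at the time of writing):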

#![feature(slice_patterns)]

fn internal_check(acc: &[u8]) -> bool {
    match acc {
        &[b'-', ref tail..] => internal_check(tail),
        &[ch, ref tail..] if (ch as char).is_whitespace() => internal_check(tail),
        &[] => true,
        _ => false,
    }
}

fn main() {
    for s in ["foo", "bar", "   ", " - "].iter() {
        println!("text '{}', checks? {}", s, internal_check(s.as_bytes()));
    }
}
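For readers on a current toolchain: slice patterns were later stabilized (in Rust 1.42, as far as I can tell), and the rest of the slice is now bound with `name @ ..` instead of `ref name..`. The following is a minimal sketch of the same byte-slice check under that assumption. It is an adaptation of the snippet above, not part of the original answer.

```rust
// Assumes a stable toolchain with stabilized slice patterns; no feature attribute needed.
fn internal_check(acc: &[u8]) -> bool {
    match acc {
        // Head is a literal dash; `tail` binds the remaining bytes as a sub-slice.
        [b'-', tail @ ..] => internal_check(tail),
        // `ch` is bound as `&u8` here, so it is dereferenced before the cast.
        [ch, tail @ ..] if (*ch as char).is_whitespace() => internal_check(tail),
        // The empty slice plays the role of F#'s `[]`.
        [] => true,
        _ => false,
    }
}
```

Calling it as `internal_check(" - ".as_bytes())` should behave like the nightly version above for ASCII input.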

You can also use it with a char slice (where a char is a Unicode scalar value):

#![feature(slice_patterns)]

fn internal_check(acc: &[char]) -> bool {
    match acc {
        &['-', ref tail..] => internal_check(tail),
        &[ch, ref tail..] if ch.is_whitespace() => internal_check(tail),
        &[] => true,
        _ => false,
    }
}

fn main() {
    for s in ["foo", "bar", "   ", " - "].iter() {
        println!("text '{}', checks? {}",
                 s, internal_check(&s.chars().collect::<Vec<char>>()));
    }
}

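But as of now it does not work with &str (it produces E0308). I think that is for the best, because &str is neither here nor there: under the hood it is a byte slice, but Rust tries to guarantee that it is valid UTF-8 and nudges you to work with &str in terms of Unicode sequences and characters rather than bytes. So to match on a &str efficiently we have to call the as_bytes method explicitly, essentially telling Rust "we know what we're doing".

That's my reading, anyway. If you want to dig deeper into the Rust compiler's source code, you can start with issue 1844 and browse the commits and issues linked from there.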
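To make the bytes-versus-characters distinction concrete, here is a small sketch. The two helper names are hypothetical and exist only for illustration. It compares a check that casts each byte to a char with one that decodes the string into Unicode scalar values; the two can disagree as soon as the input contains non-ASCII whitespace such as U+2003 (EM SPACE).

```rust
// Hypothetical helpers for illustration; they are not part of the original answer.

// Treats every byte as if it were a complete character.
// This is only meaningful for ASCII input.
fn all_whitespace_bytes(s: &str) -> bool {
    s.bytes().all(|b| (b as char).is_whitespace())
}

// Decodes the UTF-8 and inspects each Unicode scalar value.
fn all_whitespace_chars(s: &str) -> bool {
    s.chars().all(|c| c.is_whitespace())
}
```

For `"\u{2003}"`, which is one char but three bytes in UTF-8, the byte-based helper returns false while the char-based helper returns true, so the as_bytes route is best reserved for input known to be ASCII.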

Basically I want to match 1 character with a guard and use "everything else" in the expression that follows.

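If you only want to match a single character, then getting characters from the chars iterator and matching on the character itself is probably better than converting the whole UTF-8 &str into a &[char] slice. For one thing, with the chars iterator you don't have to allocate memory for an array of characters.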

fn internal_check(acc: &str) -> bool {
    for ch in acc.chars() {
        match ch {
            '-' => (),
            ch if ch.is_whitespace() => (),
            _ => return false,
        }
    }
    true
}

fn main() {
    for s in ["foo", "bar", "   ", " - "].iter() {
        println!("text '{}', checks? {}", s, internal_check(s));
    }
}

You can also use the chars iterator to split a &str on Unicode scalar value boundaries:

fn internal_check(acc: &str) -> bool {
    let mut chars = acc.chars();
    match chars.next() {
        Some('-') => internal_check(chars.as_str()),
        Some(ch) if ch.is_whitespace() => internal_check(chars.as_str()),
        None => true,
        _ => false,
    }
}

fn main() {
    for s in ["foo", "bar", "   ", " - "].iter() {
        println!("text '{}', checks? {}", s, internal_check(s));
    }
}

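But keep in mind that, as of now, Rust does not guarantee that this tail-recursive function will be optimized into a loop. (Tail call optimization would have been a welcome addition to the language, but so far it has not been implemented because of LLVM-related difficulties.)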
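If the lack of guaranteed tail call optimization is a concern, the same head/tail split can be driven by a loop. The sketch below wraps the split in a helper, here called `uncons` (a hypothetical name, not a standard library function), that returns the first char and the remaining &str, much like F#'s `head :: tail`. It relies only on `str::chars` and `Chars::as_str`, both already used above.

```rust
// `uncons` is a hypothetical helper name, not a standard library function.
// It splits a string into its first char and the remainder, or returns None if empty.
fn uncons(s: &str) -> Option<(char, &str)> {
    let mut chars = s.chars();
    chars.next().map(|head| (head, chars.as_str()))
}

fn internal_check(acc: &str) -> bool {
    let mut rest = acc;
    // Peel one character off the front per iteration instead of recursing.
    while let Some((head, tail)) = uncons(rest) {
        if head != '-' && !head.is_whitespace() {
            return false;
        }
        rest = tail;
    }
    true
}
```

No allocation takes place: `tail` is always a sub-slice of the original string, and the split always lands on a char boundary.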

This question and its answer come from Stack Overflow: https://stackoverflow.com/questions/34667670/
