recursion - Rust-在递归函数中收集Vec的切片

标签 recursion rust lifetime huffman-code ownership

我目前正在尝试构建霍夫曼编码程序,并且遇到遍历生成的霍夫曼树以创建查找表时遇到的问题。我决定使用递归函数实现上述遍历。在实际的实现中,我使用bitvec条板箱来保存位序列,但为简单起见,在本文中,我将使用Vec<bool>
我的想法是将所有代码字的集合保存在Vec codewords中,然后仅将向量中的一个切片保存为实际的查询表,为此我使用了HashMap
问题是我将如何解决为左遍历和右遍历都添加0或1的问题。我的想法是保存当前序列切片的克隆,将0附加到codewords,然后在向左遍历后将该克隆附加到codewords的末尾,以便我可以推1并向右遍历。我想到的功能是这样的:

use std::collections::HashMap;

// ignore everything being public, I use getters in the real code
pub struct HufTreeNode {
    pub val: u8,
    pub freq: usize,
    pub left: i16,
    pub right: i16,
}

fn traverse_tree<'a>(
    cur_index: usize,
    height: i16,
    codewords: &'a mut Vec<bool>,
    lookup_table: &mut HashMap<u8, &'a [bool]>,
    huffman_tree: &[HufTreeNode],
) {
    let cur_node = &huffman_tree[cur_index];

    // if the left child is -1, we reached a leaf
    if cur_node.left == -1 {
        // the last `height` bits in codewords
        let cur_sequence = &codewords[(codewords.len() - 1 - height as usize)..];
        lookup_table.insert(cur_node.val, cur_sequence);
        return;
    }

    // save the current sequence so we can traverse to the right afterwards
    let mut cur_sequence = codewords[(codewords.len() - 1 - height as usize)..].to_vec();
    codewords.push(false);
    traverse_tree(
        cur_node.left as usize,
        height + 1,
        codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
        lookup_table,
        huffman_tree,
    );

    // append the previously saved current sequence
    codewords.append(&mut cur_sequence); // second mutable borrow occurs here
    codewords.push(true); // third mutable borrow occurs here
    traverse_tree(
        cur_node.right as usize,
        height + 1,
        codewords, // fourth mutable borrow occurs here
        lookup_table,
        huffman_tree,
    );
}

fn main() {
    // ...
}
显然,在那段代码中存在生存期和借用的问题,我很明白问题出在哪里。据我了解,当我在递归调用中将codewords作为参数时,只要我将切片保存在lookup_table中,就必须借用向量,这显然是不可能的,从而导致错误。我该如何解决?
这是cargo check给我的:
error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
  --> untitled.rs:43:5
   |
14 |   fn traverse_tree<'a>(
   |                    -- lifetime `'a` defined here
...
34 | /     traverse_tree(
35 | |         cur_node.left as usize,
36 | |         height + 1,
37 | |         codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
   | |         --------- first mutable borrow occurs here
38 | |         lookup_table,
39 | |         huffman_tree,
40 | |     );
   | |_____- argument requires that `*codewords` is borrowed for `'a`
...
43 |       codewords.append(&mut cur_sequence); // second mutable borrow occurs here
   |       ^^^^^^^^^ second mutable borrow occurs here

error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
  --> untitled.rs:44:5
   |
14 |   fn traverse_tree<'a>(
   |                    -- lifetime `'a` defined here
...
34 | /     traverse_tree(
35 | |         cur_node.left as usize,
36 | |         height + 1,
37 | |         codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
   | |         --------- first mutable borrow occurs here
38 | |         lookup_table,
39 | |         huffman_tree,
40 | |     );
   | |_____- argument requires that `*codewords` is borrowed for `'a`
...
44 |       codewords.push(true); // third mutable borrow occurs here
   |       ^^^^^^^^^ second mutable borrow occurs here

error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
  --> untitled.rs:48:9
   |
14 |   fn traverse_tree<'a>(
   |                    -- lifetime `'a` defined here
...
34 | /     traverse_tree(
35 | |         cur_node.left as usize,
36 | |         height + 1,
37 | |         codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
   | |         --------- first mutable borrow occurs here
38 | |         lookup_table,
39 | |         huffman_tree,
40 | |     );
   | |_____- argument requires that `*codewords` is borrowed for `'a`
...
48 |           codewords, // fourth mutable borrow occurs here
   |           ^^^^^^^^^ second mutable borrow occurs here
我在这里想念什么?在向量API中,我缺少一些神奇的功能吗,为什么这首先会造成生命周期问题呢?据我所知,我的所有一生都是正确的,因为codewords的生存时间足够长,以至于lookup_table可以保存所有这些分片,而且我从不可变地借两次东西。如果我的一生有问题,编译器会在if cur_node.left == -1块内提示,而我所拥有的cur_sequence是拥有的Vec,因此不会有任何借用问题。因此,问题的实质在于具有以可变引用作为参数的递归函数的核心思想。
有什么办法可以解决这个问题吗?我曾尝试让codewords拥有并返回它,但随后编译器无法确保我保存在lookup_table中的位序列生命周期足够长。我仍然唯一的想法是将拥有的向量保存在lookup_table内,但是到那时codewords向量已经过时了,我可以简单地通过将cur_sequence向量作为参数来实现这一点,我在每次调用中都将其克隆,但是我选择了我的方法是在实际编码过程中获得更好的缓存性能,此后我就会迷失方向。

最佳答案

问题是,当您像在cur_sequence中一样从codewords创建切片let cur_sequence = &codewords[(codewords.len() - 1 - height as usize)..];时,编译器将对codewords的引用的生存期延长到至少与cur_sequence相同(为什么:编译器要确保切片cur_sequence始终有效,但是如果您更改codewords(例如,将其清除),则cur_sequence可能无效。通过对codewords保留不变的引用,则当切片仍然存在时,借用规则将禁止对codewords进行修改)。不幸的是,您将cur_sequence保存在lookup_table中,从而使对codewords的引用在整个函数中都保持有效,因此您不能再可变地借用codewords
解决方案是自己维护切片的索引:创建一个结构:

struct Range {
    start: usize,
    end: usize
}

impl Range {
    fn new(start: usize, end: usize) -> Self {
        Range{ start, end}
    }
}

然后使用它代替切片:
let cur_range = Range::new(
    codewords.len() - 1 - height as usize,
    codewords.len() - 1
);
lookup_table.insert(cur_node.val, cur_range);
这样,保持范围有效的责任就在于您。
完整的代码:
use std::collections::HashMap;

// ignore everything being public, I use getters in the real code
pub struct HufTreeNode {
    pub val: u8,
    pub freq: usize,
    pub left: i16,
    pub right: i16,
}

struct Range {
    start: usize,
    end: usize
}

impl Range {
    fn new(start: usize, end: usize) -> Self {
        Range{ start, end}
    }
}

fn traverse_tree(
    cur_index: usize,
    height: i16,
    codewords: &mut Vec<bool>,
    lookup_table: &mut HashMap<u8, Range>,
    huffman_tree: &[HufTreeNode],
) {
    let cur_node = &huffman_tree[cur_index];

    // if the left child is -1, we reached a leaf
    if cur_node.left == -1 {
        // the last `height` bits in codewords
        // let cur_sequence = &codewords[(codewords.len() - 1 - height as usize)..];
        let cur_range = Range::new(
            codewords.len() - 1 - height as usize,
            codewords.len() - 1
        );
        lookup_table.insert(cur_node.val, cur_range);
        return;
    }

    // save the current sequence so we can traverse to the right afterwards
    let mut cur_sequence = codewords[(codewords.len() - 1 - height as usize)..].to_vec();
    codewords.push(false);
    traverse_tree(
        cur_node.left as usize,
        height + 1,
        codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
        lookup_table,
        huffman_tree,
    );

    // append the previously saved current sequence
    codewords.append(&mut cur_sequence); // second mutable borrow occurs here
    codewords.push(true); // third mutable borrow occurs here
    traverse_tree(
        cur_node.right as usize,
        height + 1,
        codewords, // fourth mutable borrow occurs here
        lookup_table,
        huffman_tree,
    );
}

fn main() {
    // ...
}

关于recursion - Rust-在递归函数中收集Vec的切片,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/66289524/

相关文章:

java - 使用递归求和

javascript - 这个递归组合函数中的 "return a"有什么作用?

c# - 递归迭代器 block 方法的提前终止

github - 如何在 Cargo.toml 中的依赖项中指定某个提交?

c++ - std::initializer 列表全局/静态对象的生命周期

我可以存储指向 __PRETTY_FUNCTION__ 的指针以供以后使用,还是需要立即复制该字符串?

algorithm - 调用了多少次生成器?

rust - 如何为notify-rust建立外部超时

rust - Rust 中的函数向量

rust - 为什么这个 MutexGuard 没有被丢弃?