haskell - Why does the Data.Binary instance of ByteString add the length of the ByteString as a prefix

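Looking at the put instances for the various ByteString types, we find that the length of the bytestring is always written as a prefix before the bytes themselves. For example here - https://hackage.haskell.org/package/binary-0.8.8.0/docs/src/Data.Binary.Class.html#put

Taking an example: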

instance Binary B.ByteString where
    put bs = put (B.length bs) -- Why this??
             <> putByteString bs
    get    = get >>= getByteString

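Is there any particular reason for doing this? And is the only way to write a ByteString without prefixing the length to create our own newtype wrapper with its own Binary instance?

Best answer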

Is there any particular reason for doing this?

The idea of get and put is that you can combine several objects. For example, you can write:

write_func :: ByteString -> Char -> Put
write_func some_bytestring some_char = do
    put some_bytestring
    put some_char

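You then want to define a function that reads the data back, and evidently you want the two functions together to act as an identity function: if the writer writes a certain ByteString and a certain Char, then the reading function should read back the same ByteString and Char.

The reader function should look like: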

read_fun :: Get (ByteString, Char)
read_fun = do
    bs <- get
    c <- get
    return (bs, c)

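But the problem is: when does the ByteString end? An 'A' character could also be part of the ByteString. You therefore need to indicate somehow where the ByteString ends. This can be done by saving the length, or by writing a marker at the end. With a marker, you need to "escape" the bytestring so that it cannot contain the marker itself.

Either way, you need some mechanism that specifies where the ByteString ends.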
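To make the role of the prefix concrete, here is a minimal sketch that runs the writer and reader from above and exposes the bytes they produce. The helper names encodedBytes, roundTrip and example are assumptions introduced for illustration; the byte layout noted in the comment assumes the standard binary package instances (an Int written as a 64-bit big-endian value, a Char written as UTF-8).

```haskell
import Data.Binary (Binary (..), Get, Put)
import Data.Binary.Get (runGet)
import Data.Binary.Put (runPut)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- Writer from the answer: uses the length-prefixed Binary instance.
write_func :: B.ByteString -> Char -> Put
write_func some_bytestring some_char = do
    put some_bytestring
    put some_char

-- Reader from the answer: the length prefix tells it where the ByteString ends.
read_fun :: Get (B.ByteString, Char)
read_fun = do
    bs <- get
    c <- get
    return (bs, c)

-- Hypothetical helper: the raw bytes emitted by the writer.
-- encodedBytes (BC.pack "hi") 'A' == [0,0,0,0,0,0,0,2,104,105,65]
--   eight bytes of length (2), then 'h', 'i', then the Char 'A'.
encodedBytes :: B.ByteString -> Char -> [Word8]
encodedBytes bs c = BL.unpack (runPut (write_func bs c))

-- Hypothetical helper: write, then read back.
roundTrip :: B.ByteString -> Char -> (B.ByteString, Char)
roundTrip bs c = runGet read_fun (runPut (write_func bs c))

-- The ByteString itself ends in 'A', yet the reader is not confused.
example :: (B.ByteString, Char)
example = roundTrip (BC.pack "hiA") 'A'
```

Because the reader first learns the length, a ByteString that happens to end in the same byte as the following Char is still split at the right position.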

And is the only way to write Bytestring without prefixing the length - creating our own newtype wrapper and having an instance for Binary?

No, in fact it is already used in the instance definition. If you want to write a ByteString without its length, you can use putByteString :: ByteString -> Put:

write_func :: ByteString -> Char -> Put
write_func some_bytestring some_char = do
    putByteString some_bytestring
    put some_char

But when reading the ByteString back, you will need to work out how many bytes have to be read.
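One possible way to read such data back is sketched below, assuming the reader learns the length from somewhere outside the stream (for example a fixed-size field in a file format). The names write_raw, read_raw, roundTripRaw and example are assumptions for illustration; getByteString :: Int -> Get ByteString comes from Data.Binary.Get.

```haskell
import Data.Binary (Binary (..), Get, Put)
import Data.Binary.Get (getByteString, runGet)
import Data.Binary.Put (putByteString, runPut)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Writes the raw bytes with no length prefix, then the Char.
write_raw :: B.ByteString -> Char -> Put
write_raw some_bytestring some_char = do
    putByteString some_bytestring
    put some_char

-- Reads exactly n raw bytes, then the Char.
-- The caller has to supply n, because the stream does not record it.
read_raw :: Int -> Get (B.ByteString, Char)
read_raw n = do
    bs <- getByteString n
    c <- get
    return (bs, c)

-- Hypothetical helper: here the length is known only because we still
-- hold the original ByteString; a real format would fix or store it.
roundTripRaw :: B.ByteString -> Char -> (B.ByteString, Char)
roundTripRaw bs c = runGet (read_raw (B.length bs)) (runPut (write_raw bs c))

example :: (B.ByteString, Char)
example = roundTripRaw (BC.pack "hiA") 'A'
```

If the wrong n is passed, the reader splits the stream at the wrong position or fails, which is exactly the ambiguity the length prefix in the Binary instance avoids.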

Related question on Stack Overflow: https://stackoverflow.com/questions/63558292/