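Background

I am teaching myself databases in my spare time and trying to learn by implementing one from scratch.

The first thing you have to implement is the underlying data format and storage mechanism.

Databases use a structure called a "Slotted Page", which looks like this: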

+-----------------------------------------------------------+
| +----------------------+  +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ |
| | HEADER               |  | | | | | | | | | | | | | | | | |
| |                      |  | | | | | | | | | | | | | | | | |
| +----------------------+  +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ |
|                                     SLOT ARRAY            |
|                                                           |
|                                                           |
|                                                           |
|                 +--------------------+ +----------------+ |
|                 |  TUPLE #4          | |  TUPLE #3      | |
|                 |                    | |                | |
|                 +--------------------+ +----------------+ |
|         +--------------------------+ +------------------+ |
|         |  TUPLE #2                | |  TUPLE #1        | |
|         |                          | |                  | |
|         +--------------------------+ +------------------+ |
+-----------------------------------------------------------+

The page data is stored in a file via binary serialization. The slot is the simplest part, and its definition might look like this:

#include <cstdint>

struct Slot {
  uint32_t offset;  // byte offset of the tuple within the page
  uint32_t length;  // tuple length in bytes
};
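To make the diagram concrete, here is a minimal sketch of how the slot array and the tuple area could share one page: slots grow forward from the header while tuples grow backward from the end. The class name `SlottedPageSketch`, the 8-byte header (slot count plus end of free space), and the method names are all assumptions made for illustration, not something the question specifies. It uses the absolute bulk `put`/`get` overloads, which need Java 13 or later.

```java
import java.nio.ByteBuffer;

/** Hypothetical slotted page: slot array grows forward, tuple data grows backward. */
final class SlottedPageSketch {
    // Assumed header layout: int slotCount at byte 0, int freeEnd at byte 4.
    static final int HEADER_SIZE = 2 * Integer.BYTES;
    static final int SLOT_SIZE = 2 * Integer.BYTES; // offset + length

    private final ByteBuffer page;

    SlottedPageSketch(int pageSize) {
        page = ByteBuffer.allocate(pageSize);
        page.putInt(0, 0);        // no slots yet
        page.putInt(4, pageSize); // free space ends at the end of the page
    }

    /** Stores a tuple and returns its slot index, or -1 if it does not fit. */
    int insert(byte[] tuple) {
        int slotCount = page.getInt(0);
        int freeEnd = page.getInt(4);
        int slotArrayEnd = HEADER_SIZE + (slotCount + 1) * SLOT_SIZE;
        int tupleOffset = freeEnd - tuple.length;
        if (tupleOffset < slotArrayEnd) {
            return -1; // the new slot and the tuple data would overlap
        }
        page.put(tupleOffset, tuple); // absolute bulk put (Java 13+)
        int slotPos = HEADER_SIZE + slotCount * SLOT_SIZE;
        page.putInt(slotPos, tupleOffset);
        page.putInt(slotPos + Integer.BYTES, tuple.length);
        page.putInt(0, slotCount + 1);
        page.putInt(4, tupleOffset);
        return slotCount;
    }

    /** Looks up the slot, then copies the tuple bytes it points at. */
    byte[] read(int slotIdx) {
        int slotPos = HEADER_SIZE + slotIdx * SLOT_SIZE;
        int offset = page.getInt(slotPos);
        int length = page.getInt(slotPos + Integer.BYTES);
        byte[] out = new byte[length];
        page.get(offset, out); // absolute bulk get (Java 13+)
        return out;
    }
}
```

Because every access here uses an absolute index, the buffer's position never moves, so lookups do not depend on the order of earlier reads and writes.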

In C++, reading and writing could be done with std::memcpy:

#include <cstring>

// The header size is ignored in the offsets below.
void write_to_buffer(char *buffer, const Slot& slot, uint32_t slot_idx) {
    std::memcpy(buffer + sizeof(Slot) * slot_idx, &slot.offset, sizeof(uint32_t));
    std::memcpy(buffer + sizeof(Slot) * slot_idx + sizeof(uint32_t), &slot.length, sizeof(uint32_t));
}

void read_from_buffer(const char *buffer, Slot& slot, uint32_t slot_idx) {
    std::memcpy(&slot.offset, buffer + sizeof(Slot) * slot_idx, sizeof(uint32_t));
    // length follows offset within the same slot, so skip sizeof(uint32_t), not sizeof(Slot)
    std::memcpy(&slot.length, buffer + sizeof(Slot) * slot_idx + sizeof(uint32_t), sizeof(uint32_t));
}

In Java, as far as I know, you can do one of two things:

ByteBuffer

record Slot(int offset, int length) {
    void write(ByteBuffer buffer) {
        buffer.putInt(offset).putInt(length);
    }
    
    static Slot read(ByteBuffer buffer) {
        return new Slot(buffer.getInt(), buffer.getInt());
    }
}
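One detail worth pinning down for an on-disk format is byte order. A freshly created ByteBuffer is big-endian, whereas the C++ memcpy version writes in the host's byte order and ValueLayout.JAVA_INT uses the platform's native order, so files written one way may not read back correctly another way. The following is a sketch only: the choice of little-endian, the `slotIdx` parameter, and the names `SlotByteOrderSketch` and `newPage` are assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class SlotByteOrderSketch {
    record Slot(int offset, int length) {
        static final int BYTES = 2 * Integer.BYTES;

        // Absolute puts: write at a computed index without touching the position.
        void write(ByteBuffer buffer, int slotIdx) {
            int pos = slotIdx * BYTES;
            buffer.putInt(pos, offset).putInt(pos + Integer.BYTES, length);
        }

        static Slot read(ByteBuffer buffer, int slotIdx) {
            int pos = slotIdx * BYTES;
            return new Slot(buffer.getInt(pos), buffer.getInt(pos + Integer.BYTES));
        }
    }

    // Assumed convention: pages are always little-endian, regardless of host.
    static ByteBuffer newPage(int size) {
        return ByteBuffer.allocate(size).order(ByteOrder.LITTLE_ENDIAN);
    }
}
```

Fixing the order once, where the page buffer is created, keeps the file format independent of the machine that wrote it.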

The new Foreign Memory API

record Slot(int offset, int length) {
    public static final MemoryLayout LAYOUT = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("offset"),
            ValueLayout.JAVA_INT.withName("length"));

    // Reads a slot from the start of the given segment.
    public static Slot from(MemorySegment memory) {
        return new Slot(
                memory.get(ValueLayout.JAVA_INT, 0),
                memory.get(ValueLayout.JAVA_INT, Integer.BYTES));
    }

    // Writes this slot to the start of the given segment.
    public void to(MemorySegment memory) {
        memory.set(ValueLayout.JAVA_INT, 0, offset);
        memory.set(ValueLayout.JAVA_INT, Integer.BYTES, length);
    }
}
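For comparison, here is a rough sketch of a full round trip through a MemorySegment, with the field offsets derived from the layout rather than hard-coded. It assumes the finalized `java.lang.foreign` package (JDK 22 or later) rather than the `jdk.incubator.foreign` incubator module; the class name `SegmentSlotSketch`, the `slotIdx` parameter, and the 4096-byte page size are assumptions for illustration.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class SegmentSlotSketch {
    static final MemoryLayout LAYOUT = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("offset"),
            ValueLayout.JAVA_INT.withName("length"));

    // Field offsets are looked up by name, so reordering the layout stays safe.
    static final long OFFSET_FIELD = LAYOUT.byteOffset(PathElement.groupElement("offset"));
    static final long LENGTH_FIELD = LAYOUT.byteOffset(PathElement.groupElement("length"));

    record Slot(int offset, int length) {
        static Slot from(MemorySegment page, long slotIdx) {
            long base = slotIdx * LAYOUT.byteSize();
            return new Slot(
                    page.get(ValueLayout.JAVA_INT, base + OFFSET_FIELD),
                    page.get(ValueLayout.JAVA_INT, base + LENGTH_FIELD));
        }

        void to(MemorySegment page, long slotIdx) {
            long base = slotIdx * LAYOUT.byteSize();
            page.set(ValueLayout.JAVA_INT, base + OFFSET_FIELD, offset);
            page.set(ValueLayout.JAVA_INT, base + LENGTH_FIELD, length);
        }
    }

    /** Writes a slot into a temporary native page and reads it back. */
    static Slot roundTrip(Slot slot, long slotIdx) {
        // The arena frees the native memory deterministically when closed.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment page = arena.allocate(4096);
            slot.to(page, slotIdx);
            return Slot.from(page, slotIdx);
        }
    }
}
```

Note that the segment is indexed with `long` offsets and carries no cursor of its own, so every access states its position explicitly.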

What is the performance difference between them?

If it is negligible, I would prefer the ByteBuffer API.

Best answer

Paul Sandoz replied on the panama-dev mailing list:

Hi Gavin,

Using MemorySegment will give you far more control over the description (layout) and management (freeing and pooling) than ByteBuffer. Also, if it’s an issue, you will not be constrained by ByteBuffer’s size limitation. Performance-wise, using MemorySegment should be as good as or better than ByteBuffer.

In many respects MemorySegment is a better API to interact with native memory. ByteBuffer was introduced in Java 1.4 with NIO and had additional design constraints in mind that are less relevant today (such as an internal mutable index).

Paul.
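The "internal mutable index" mentioned in the reply presumably refers to ByteBuffer's position, which relative `getInt()` and `putInt()` calls advance as a side effect. The sketch below contrasts relative and absolute reads; the class and method names are hypothetical and chosen only for this illustration.

```java
import java.nio.ByteBuffer;

final class MutableIndexSketch {
    /** Builds an 8-byte buffer holding the ints 10 and 20, ready for reading. */
    static ByteBuffer sample() {
        ByteBuffer b = ByteBuffer.allocate(8);
        b.putInt(10).putInt(20); // relative puts move position to 8
        b.flip();                // limit = 8, position = 0
        return b;
    }

    /** Relative reads: each call advances the buffer's position by 4. */
    static int[] readTwiceRelative(ByteBuffer buffer) {
        int first = buffer.getInt();  // reads at position 0
        int second = buffer.getInt(); // reads at position 4, a different value
        return new int[] {first, second};
    }

    /** Absolute reads: the index is explicit and the position is untouched. */
    static int[] readTwiceAbsolute(ByteBuffer buffer) {
        int first = buffer.getInt(0);
        int second = buffer.getInt(0); // same index, same value
        return new int[] {first, second};
    }
}
```

With the relative form, the result of a read depends on every earlier call made on the same buffer, which is the hidden state that MemorySegment avoids by always taking an explicit offset.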

A similar question about the performance of ByteBuffer versus the jdk.incubator.foreign (Panama) foreign memory API (MemoryLayout/MemorySegment) in Java can be found on Stack Overflow: https://stackoverflow.com/questions/73423726/
