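I'm writing a simple Linux USB character driver that lets a short string be read from the device node it creates.
It works fine, but I noticed a difference between reading from the device node with cat and reading it from a Java program with Files.readAllBytes.
When reading with cat, the first call to the file_operations.read function is passed a buffer of size 131072, and the 5-byte string is copied: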
kernel: [46863.186331] usbtherm: Device was opened
kernel: [46863.186407] usbtherm: buffer: 131072, read: 5, offset: 5
kernel: [46863.186444] usbtherm: done, returning 0
kernel: [46863.186481] usbtherm: Device was released
When reading with Files.readAllBytes, the first call is passed a buffer of size 1 and the second a buffer of size 8191, into which the remaining 4 bytes are copied:
kernel: [51442.728879] usbtherm: Device was opened
kernel: [51442.729032] usbtherm: buffer: 1, read: 1, offset: 1
kernel: [51442.729102] usbtherm: buffer: 8191, read: 4, offset: 5
kernel: [51442.729140] usbtherm: done, returning 0
kernel: [51442.729158] usbtherm: Device was released
The file_operations.read function (including the debug printk calls) is:
static ssize_t device_read(struct file *filp, char __user *buffer, size_t length,
                           loff_t *offset)
{
    int err = 0;
    size_t msg_len = 0;
    size_t len_read = 0;

    msg_len = strlen(message);
    if (*offset >= msg_len)
    {
        /* the whole message has already been read: report EOF */
        printk(KERN_INFO "usbtherm: done, returning 0\n");
        return 0;
    }

    /* copy at most as many bytes as the caller's buffer can hold */
    len_read = msg_len - *offset;
    if (len_read > length)
    {
        len_read = length;
    }

    /* copy_to_user() returns the number of bytes it could NOT copy */
    err = copy_to_user(buffer, message + *offset, len_read);
    if (err)
    {
        err = -EFAULT;
        goto error;
    }

    *offset += len_read;
    /* %zu is the printk format specifier for size_t */
    printk(KERN_INFO "usbtherm: buffer: %zu, read: %zu, offset: %lld\n",
           length, len_read, *offset);
    return len_read;

error:
    return err;
}
The string read is the same in both cases, so I guess it doesn't matter; I'm just curious why the behavior differs.
Best answer
From the source of GNU cat:
insize = io_blksize (stat_buf);
You can see that the buffer size is determined by coreutils' io_blksize(), which has a rather interesting comment on the subject:
/* As of May 2014, 128KiB is determined to be the minimium blksize to best minimize system call overhead.
That explains the result with cat: 128 KiB is 131072 bytes, and the GNU gurus determined that size to be the best for minimizing system call overhead.
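For comparison, here is a minimal Java sketch of a cat-style read loop that hands the stream one fixed 128 KiB block per call. The class name CatLikeRead, the method readLikeCat and the device path /dev/usbtherm0 are hypothetical names chosen for illustration; the exact length that reaches the driver still depends on the JDK's stream implementation, so treat this as an approximation rather than an exact reproduction of what cat does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CatLikeRead {
    // 128 KiB = 131072 bytes, the minimum block size named in the coreutils comment
    static final int BLOCK_SIZE = 128 * 1024;

    /** Reads the whole file by repeatedly asking for up to BLOCK_SIZE bytes. */
    static byte[] readLikeCat(Path path) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] block = new byte[BLOCK_SIZE];
        try (InputStream in = Files.newInputStream(path)) {
            int n;
            // read() returns -1 at EOF, which ends the loop
            while ((n = in.read(block)) > 0) {
                out.write(block, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "/dev/usbtherm0" is an assumed node name; pass the real path as an argument
        Path path = Paths.get(args.length > 0 ? args[0] : "/dev/usbtherm0");
        byte[] data = readLikeCat(path);
        System.out.println("read " + data.length + " bytes");
    }
}
```

Because the caller allocates the block up front, the requested length does not depend on what the file reports as its size, which is the key difference from Files.readAllBytes.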
Files.readAllBytes is a bit harder to grasp, at least for a simple soul like me. The source of readAllBytes:
public static byte[] readAllBytes(Path path) throws IOException {
    try (SeekableByteChannel sbc = Files.newByteChannel(path);
         InputStream in = Channels.newInputStream(sbc)) {
        long size = sbc.size();
        if (size > (long)MAX_BUFFER_SIZE)
            throw new OutOfMemoryError("Required array size too large");

        return read(in, (int)size);
    }
}
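This shows that it simply calls read(InputStream, initialSize), where the initial size is determined by the size of the byte channel. The size() method also has an interesting comment: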
The size of files that are not isRegularFile() files is implementation specific and therefore unspecified.
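To see what size the channel reports for a given path, a small probe along these lines could help. It is only a sketch: SizeProbe, channelSize and the device path /dev/usbtherm0 are hypothetical names, and the value printed for a device node is whatever the platform chooses to report.

```java
import java.io.IOException;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SizeProbe {
    /** Returns the size the byte channel reports, i.e. the initialSize readAllBytes would pass on. */
    static long channelSize(Path path) throws IOException {
        try (SeekableByteChannel sbc = Files.newByteChannel(path)) {
            return sbc.size();
        }
    }

    public static void main(String[] args) throws IOException {
        // "/dev/usbtherm0" is an assumed node name; pass the real path as an argument
        Path path = Paths.get(args.length > 0 ? args[0] : "/dev/usbtherm0");
        System.out.println("regular file: " + Files.isRegularFile(path));
        System.out.println("channel size: " + channelSize(path));
    }
}
```

For a regular file the channel size matches the file length; for anything else the result is unspecified, as the quoted documentation says.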
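Finally, read(InputStream, initialSize) calls InputStream.read(byteArray, offset, length) to do the reading (the comments in the code are from the original source and are confusing here, because capacity - nread = 0, so the first time the while loop is reached it does not read to EOF):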
private static byte[] read(InputStream source, int initialSize)
        throws IOException {
    int capacity = initialSize;
    byte[] buf = new byte[capacity];
    int nread = 0;
    int n;
    for (;;) {
        // read to EOF which may read more or less than initialSize (eg: file
        // is truncated while we are reading)
        while ((n = source.read(buf, nread, capacity - nread)) > 0)
            nread += n;

        // if last call to source.read() returned -1, we are done
        // otherwise, try to read one more byte; if that failed we're done too
        if (n < 0 || (n = source.read()) < 0)
            break;

        // one more byte was read; need to allocate a larger buffer
        if (capacity <= MAX_BUFFER_SIZE - capacity) {
            capacity = Math.max(capacity << 1, BUFFER_SIZE);
        } else {
            if (capacity == MAX_BUFFER_SIZE)
                throw new OutOfMemoryError("Required array size too large");
            capacity = MAX_BUFFER_SIZE;
        }
        buf = Arrays.copyOf(buf, capacity);
        buf[nread++] = (byte)n;
    }
    return (capacity == nread) ? buf : Arrays.copyOf(buf, nread);
}
The declaration of BUFFER_SIZE in Files is:
// buffer size used for reading and writing
private static final int BUFFER_SIZE = 8192;
The documentation/source of InputStream.read(byteArray, offset, length) contains a relevant comment:
If length is zero, then no bytes are read and 0 is returned;
Since size() returns 0 bytes for your device node, this is what happens in read(InputStream source, int initialSize):
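In the first round of the for (;;) loop, capacity=0 and nread=0. So source.read in while ((n = source.read(buf, nread, capacity - nread)) > 0) reads 0 bytes into buf and returns 0: the condition of the while loop is false, and all it does is set n = 0 as a side effect of evaluating the condition. Since n = 0, source.read() in if (n < 0 || (n = source.read()) < 0) break; reads 1 byte and the expression evaluates to false: our for loop does not exit. This results in your "buffer: 1, read: 1, offset: 1". The capacity of the buffer is then set to BUFFER_SIZE, the single byte that was read is placed into buf[0], and nread is incremented.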
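The second round of the for (;;) loop therefore has capacity=8192 and nread=1, which makes while ((n = source.read(buf, nread, capacity - nread)) > 0) nread += n; request up to 8191 bytes at offset 1 and keep reading until source.read returns -1: EOF! This happens after the remaining 4 bytes have been read, and it results in your "buffer: 8191, read: 4, offset: 5". The follow-up call that hits EOF is the one your driver logs as "done, returning 0". Since n = -1 now, the expression in if (n < 0 || (n = source.read()) < 0) break; short-circuits on n < 0, which makes our for loop exit without reading any more bytes.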
Finally, the method returns Arrays.copyOf(buf, nread): a copy of the part of the buffer that holds the bytes that were read.
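The walkthrough above can be replayed without the driver. The sketch below copies the read(InputStream, initialSize) logic quoted earlier and feeds it a stream that records the length of every request. ReadTrace and TracingStream are hypothetical names, the 5-byte message "hello" stands in for the string the driver returns, and MAX_BUFFER_SIZE is assumed to be Integer.MAX_VALUE - 8. Zero-length requests are not recorded, since the kernel log shows they never reach the device.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadTrace {
    static final int BUFFER_SIZE = 8192;
    static final int MAX_BUFFER_SIZE = Integer.MAX_VALUE - 8; // assumed value

    /** Hypothetical stand-in for the device node: serves a fixed message, logs request sizes. */
    static class TracingStream extends InputStream {
        private final InputStream in;
        final List<Integer> requests = new ArrayList<>();

        TracingStream(byte[] message) {
            in = new ByteArrayInputStream(message);
        }

        @Override
        public int read() throws IOException {
            requests.add(1); // a single-byte read asks for exactly 1 byte
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (len == 0)
                return 0;    // InputStream contract: nothing requested, 0 returned
            requests.add(len);
            return in.read(b, off, len);
        }
    }

    /** Same logic as the read(InputStream, initialSize) helper quoted above. */
    static byte[] read(InputStream source, int initialSize) throws IOException {
        int capacity = initialSize;
        byte[] buf = new byte[capacity];
        int nread = 0;
        int n;
        for (;;) {
            while ((n = source.read(buf, nread, capacity - nread)) > 0)
                nread += n;
            if (n < 0 || (n = source.read()) < 0)
                break;
            if (capacity <= MAX_BUFFER_SIZE - capacity) {
                capacity = Math.max(capacity << 1, BUFFER_SIZE);
            } else {
                if (capacity == MAX_BUFFER_SIZE)
                    throw new OutOfMemoryError("Required array size too large");
                capacity = MAX_BUFFER_SIZE;
            }
            buf = Arrays.copyOf(buf, capacity);
            buf[nread++] = (byte) n;
        }
        return (capacity == nread) ? buf : Arrays.copyOf(buf, nread);
    }

    public static void main(String[] args) throws IOException {
        TracingStream dev = new TracingStream("hello".getBytes(StandardCharsets.US_ASCII));
        byte[] data = read(dev, 0); // initialSize 0, as size() reports for the device node
        System.out.println(dev.requests); // prints [1, 8191, 8187]
        System.out.println(data.length);  // prints 5
    }
}
```

The first two recorded requests match the "buffer: 1" and "buffer: 8191" lines in the kernel log. The third request, for the 8187 bytes of capacity still free, is the call that hits EOF and corresponds to the "done, returning 0" line.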