c - Reading a line of a text file in C on Unix: is my read_line broken?

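I want to write a function that reads a chosen line from a given text file. The function takes an open file descriptor (int fd) and an int line_number as arguments. It must do this in C using Unix system calls (read and/or open). It should also read any whitespace, and it must have no practical limit (that is, the line must be able to have whatever length you choose). The function I wrote is this: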

char* read_line(int file, int numero_riga){
    char myb[1];
    if (numero_riga < 1) {
        return NULL;
    }
    char* myb2 = malloc(sizeof(char)*100);
    memset(myb2, 0, sizeof(char));
    ssize_t n;
    int i = 1;
    while (i < numero_riga) {
        if((n = read(file, myb, 1)) == -1){
            perror("read fail");
            exit(EXIT_FAILURE);
        }
        if (strncmp(myb, "\n", 1) == 0) {
            i++;
        }else if (n == 0){
            return NULL;
        }
    }
    numero_riga++;
    int j = 0;
    while (i < numero_riga) {
        ssize_t n = read(file, myb, 1);
        if (strncmp(myb, "\n", 1) == 0) {
            i++;
        }else if (n == 0){
            return myb2;
        }else{
            myb2[j] = myb[0];
            j++;
        }
    }

    return myb2;
}

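Until recently I thought this worked, but it does have some problems. When I use a message queue, the string read by read_line is received as an empty string ("\0"). I know the message queue is not the problem, because passing an ordinary string causes no trouble. If possible, I would like a fix together with an explanation of why I should correct it that way, because if I don't understand my mistake I risk repeating it in the future.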

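Edit 1. Based on the answers, I decided to add some questions. How do I terminate myb2? Can someone give me an example based on my code? How can I know in advance how many characters make up the line of text to be read?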

Edit 2. I don't know how many characters the line has, so I don't know how many to allocate; that is why I used *100.

Best answer

Partial analysis

You have a memory leak at:

char* myb2 = (char*) malloc((sizeof(char*))*100);
memset(myb2, 0, sizeof(char));
if (numero_riga < 1) {
    return NULL;
}

Check numero_riga before allocating the memory.
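As a minimal sketch of that ordering (the helper name alloc_line_buffer and its size parameter are hypothetical, not part of the question's code), validate first and allocate afterwards, so the early return has nothing to free. Two side notes on the quoted lines: sizeof(char*) sizes each element as a pointer rather than a character, and memset(myb2, 0, sizeof(char)) clears only the first byte. Using calloc() zeroes the whole buffer, which also leaves the string terminated wherever the copying stops, as long as the last byte is never overwritten.

```c
#include <stdlib.h>

/* Hypothetical helper: validate the argument first, then allocate.
 * Returns a zero-filled buffer of `size` chars, or NULL on bad input
 * or allocation failure. The caller must free() the result. */
char *alloc_line_buffer(int numero_riga, size_t size)
{
    if (numero_riga < 1 || size == 0)
        return NULL;    /* nothing allocated yet, so nothing can leak */

    /* calloc() zeroes all `size` bytes, not just the first one. */
    return calloc(size, sizeof(char));
}
```

A fixed size still limits the line length; growing the buffer with realloc() removes that limit.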

The following loop is dubious at best:

int i = 1;
while (i < numero_riga) {
    ssize_t n = read(file, myb, 1);
    if (strncmp(myb, "\n", 1) == 0) {
        i++;
    }else if (n == 0){
        return NULL;
    }
}

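You don't check what read() actually returned soon enough; when you do check it, you leak memory (again) and discard anything read beforehand, and you never detect errors (n < 0). When you do detect a newline, you simply add 1 to i. At no point do you save the characters you read in a buffer (such as myb2). All in all, this looks thoroughly broken... unless you are trying to read the Nth line of the file counting from the start, rather than the next line in the file, which is the more usual requirement.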
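To make the ordering concrete, here is a minimal sketch of a line-skipping loop that inspects the return value of read() before it looks at the byte. The name skip_lines and its return codes are assumptions made for this illustration. The point is that when read() returns 0 or -1 the byte variable still holds whatever was read last time, so comparing it with '\n' first can count a newline that was never read.

```c
#include <unistd.h>

/* Hypothetical helper: consume `count` complete lines from fd.
 * Returns 0 on success, 1 if EOF arrived first, -1 on a read error. */
int skip_lines(int fd, int count)
{
    char ch;

    while (count > 0)
    {
        ssize_t n = read(fd, &ch, 1);
        if (n < 0)
            return -1;      /* error: errno was set by read() */
        if (n == 0)
            return 1;       /* EOF: ch holds no new data, do not test it */
        if (ch == '\n')
            count--;        /* only now is ch known to be a fresh byte */
    }
    return 0;
}
```

Calling skip_lines(fd, N - 1) leaves the file offset at the start of line N, provided it returns 0.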

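What you need to do is:

Scan past N-1 lines, watching for EOF.
While another byte is available:
- If it is a newline, terminate the string and return it.
- Otherwise, add it to the buffer, allocating more space if there is no room.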

Implementation

I think I would probably use a function get_ch() like this:

static inline int get_ch(int fd)
{
    char c;
    if (read(fd, &c, 1) == 1)
        return (unsigned char)c;
    return EOF;
}
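Two details of get_ch() may be worth spelling out. The cast to unsigned char means a data byte such as 0xFF comes back as 255 rather than a negative value that could be confused with EOF. Also, get_ch() reports a read error the same way as end of file. The sketch below is a variant, not the answer's code: the name get_ch_retry is hypothetical, and retrying when read() fails with EINTR is an added assumption about what the caller wants.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical variant of get_ch(): returns the next byte as a value in
 * 0..255, or EOF at end of file or on an unrecoverable read error. */
int get_ch_retry(int fd)
{
    unsigned char c;
    ssize_t n;

    do
    {
        n = read(fd, &c, 1);
    } while (n < 0 && errno == EINTR);  /* interrupted by a signal: try again */

    return (n == 1) ? c : EOF;
}
```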

Then, in the main char *read_nth_line(int fd, int line_no) function, you can do:

char *read_nth_line(int fd, int line_no)
{
    if (line_no <= 0)
        return NULL;

    /* Skip preceding lines */
    for (int i = 1; i < line_no; i++)
    {
        int c;
        while ((c = get_ch(fd)) != '\n')
        {
            if (c == EOF)
                return NULL;
        }
    }

    /* Capture next line */
    size_t max_len = 8;
    size_t act_len = 0;
    char  *buffer  = malloc(max_len);
    if (buffer == NULL)     /* allocation failed; nothing to free yet */
        return NULL;
    int c;
    while ((c = get_ch(fd)) != EOF && c != '\n')
    {
        if (act_len + 2 >= max_len)
        {
            size_t new_len = max_len * 2;
            char *new_buf = realloc(buffer, new_len);
            if (new_buf == 0)
            {
                free(buffer);
                return NULL;
            }
            buffer = new_buf;
            max_len = new_len;
        }
        buffer[act_len++] = c;
    }
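    if (c == EOF && act_len == 0)
    {
        /* EOF before any byte was read: the requested line does not exist */
        free(buffer);
        return NULL;
    }
    if (c == '\n')
        buffer[act_len++] = c;  /* keep the newline, as fgets() does */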
    buffer[act_len] = '\0';
    return buffer;
}

Test code added:

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

extern char *read_nth_line(int fd, int line_no);

…code from main answer…

int main(void)
{
    char *line;
    while ((line = read_nth_line(0, 3)) != NULL)
    {
        printf("[[%s]]\n", line);
        free(line);
    }
    return 0;
}

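This reads every third line from standard input. It seems to work correctly. It would be a good idea to check the boundary conditions (short lines and so on) more exhaustively to make sure it does not abuse memory. (Testing lines of length 1, which is just a newline, up to 18 characters under valgrind shows it is OK. Random longer tests also seemed to be correct.)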
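One way to make those boundary checks repeatable is to feed known input through a pipe and compare the result. The sketch below is only an illustration: line_matches is a hypothetical harness, and it repeats a compact version of get_ch() and read_nth_line() so that it compiles on its own. That compact copy carries one labeled assumption, namely that hitting EOF before any byte of the requested line has been read means "no such line" and yields NULL.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* One byte per read() call, as in get_ch(). */
static int get_ch(int fd)
{
    unsigned char c;
    return (read(fd, &c, 1) == 1) ? c : EOF;
}

/* Compact copy of read_nth_line() so this sketch is self-contained. */
static char *read_nth_line(int fd, int line_no)
{
    if (line_no <= 0)
        return NULL;
    for (int i = 1; i < line_no; i++)       /* skip preceding lines */
    {
        int c;
        while ((c = get_ch(fd)) != '\n')
            if (c == EOF)
                return NULL;
    }
    size_t max_len = 8, act_len = 0;
    char *buffer = malloc(max_len);
    if (buffer == NULL)
        return NULL;
    int c;
    while ((c = get_ch(fd)) != EOF && c != '\n')
    {
        if (act_len + 2 >= max_len)         /* room for byte, '\n', '\0' */
        {
            char *new_buf = realloc(buffer, max_len * 2);
            if (new_buf == NULL)
            {
                free(buffer);
                return NULL;
            }
            buffer = new_buf;
            max_len *= 2;
        }
        buffer[act_len++] = (char)c;
    }
    if (c == EOF && act_len == 0)           /* assumption: no line at EOF */
    {
        free(buffer);
        return NULL;
    }
    if (c == '\n')
        buffer[act_len++] = (char)c;        /* the newline is kept */
    buffer[act_len] = '\0';
    return buffer;
}

/* Hypothetical harness: write `input` into a pipe, request line `line_no`,
 * and return 1 when the result equals `expected` (NULL means "no line"). */
int line_matches(const char *input, int line_no, const char *expected)
{
    int fds[2];
    if (pipe(fds) != 0)
        return 0;
    size_t len = strlen(input);
    /* Inputs are kept small so the write cannot block on a full pipe. */
    int wrote_all = (write(fds[1], input, len) == (ssize_t)len);
    close(fds[1]);                          /* reader sees EOF after the data */
    char *line = wrote_all ? read_nth_line(fds[0], line_no) : NULL;
    close(fds[0]);
    int ok = wrote_all &&
             ((expected == NULL) ? (line == NULL)
                                 : (line != NULL && strcmp(line, expected) == 0));
    free(line);
    return ok;
}
```

For example, line_matches("one\ntwo", 2, "two") exercises a final line with no trailing newline, and an 18-character line forces the buffer to grow twice from its initial 8 bytes. Running the same checks under valgrind covers the memory side.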

Regarding "c - Reading a line of a text file in C on Unix: is my read_line broken?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27893186/
