C - write() position obtained with lseek() is inaccurate when using multiple threads

I'm having trouble getting the correct file position when multiple threads write to different parts of the same file at the same time.

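I have a global file descriptor for a file. In my write function, I first lock a mutex, then call lseek(global_fd, 0, SEEK_CUR) to get the current file position. Next, I use write() to write 31 zero bytes (31 is my entry size), effectively reserving space for later. Then I unlock the mutex.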

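Later in the function, I declare a local fd variable and open the same file with it. I then lseek() on that local fd to the position I recorded earlier, where my space was reserved. Finally, I write() the 31 bytes of entry data there and close the local fd.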

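On rare occasions, an entry is not written to the expected location. The data is not corrupted; it looks as if the entry was swapped with a different one, or two entries were written to the same location. Multiple threads run the write function I described.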

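I have since learned that pwrite() can write at a specific offset, which would be more efficient and would eliminate the lseek(). But first I want to figure out: what is wrong with my original algorithm? Is there any kind of buffering that could cause a discrepancy between the intended write position and where the data actually ends up in the file?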
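For reference, the pwrite() approach mentioned above might look roughly like the sketch below. The helper name write_at is hypothetical and is not part of the original code; the sketch assumes a regular file opened for writing. Because pwrite() takes the offset as an argument and leaves the descriptor's file position untouched, no separate lseek() call is involved.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): write exactly len
   bytes starting at the given offset. pwrite() does not move the file
   position of fd, so concurrent callers do not disturb each other's
   seek state. Returns 0 on success, -1 on error. */
static int write_at(int fd, const void *buf, size_t len, off_t offset)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, offset);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        if (n <= 0)
            return -1;          /* real error, or no progress */
        p += n;
        len -= (size_t)n;
        offset += n;            /* advance past a partial write */
    }
    return 0;
}
```

With a helper like this, the second phase of db_entry_add() could in principle call write_at(fd, dbrecord, DB_ENTRY_SIZE, ndb_offset) instead of seeking first. Whether that removes the rare misplacement depends on what actually causes it.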

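The relevant code snippet is below. This matters because in a second data file I record the location where the entry I am writing is stored. If the position obtained from lseek() before the write is inaccurate, my data does not match up correctly, which happens occasionally (it is hard to reproduce; it occurs in perhaps 1 in 100,000 writes). Thanks!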

db_entry_add(...)
{
   char dbrecord[DB_ENTRY_SIZE];
   int retval;

   pthread_mutex_lock(&db_mutex);

   /* determine the EOF index, at which we will add the log entry */
   off_t ndb_offset = lseek(cfg.curr_fd, 0, SEEK_CUR);
   if (ndb_offset == -1)
   {
      fprintf(stderr, "Unable to determine ndb offset: %s\n", strerror_s(errno, ebuf, sizeof(ebuf)));
      pthread_mutex_unlock(&db_mutex);
      return 0;
   }

   /* reserve entry-size bytes at the location, at which we will
      later add the log entry */
   memset(dbrecord, 0, sizeof(dbrecord));

   /* note: db_write() is a write() loop */ 
   if (db_write(cfg.curr_fd, (char *) &dbrecord, DB_ENTRY_SIZE) < 0)
   {
      fprintf(stderr, "db_entry_add2db - db_write failed!");
      close(curr_fd);
      pthread_mutex_unlock(&db_mutex);

      return 0;
   }

   pthread_mutex_unlock(&db_mutex);

   /* in another data file, we now record that the entry we're going to write 
      will be at the specified location. if it's not (which is the problem,
      on rare occasion), our data will be inconsistent */ 
   advertise_entry_location(ndb_offset);
   ...

   /* open the data file */
   int write_fd = open(path, O_CREAT|O_LARGEFILE|O_WRONLY, 0644);
   if (write_fd < 0)
   {
      fprintf(stderr, "%s: Unable to open file %s: %s\n", __func__, cfg.curr_silo_db_path, strerror_s(errno, ebuf, sizeof(ebuf)));
      return 0;
   }

   pthread_mutex_lock(&db_mutex);

   /* seek to our reserved write location */
   if (lseek(write_fd, ndb_offset, SEEK_SET) == -1)
   {
      fprintf(stderr, "%s: lseek failed: %s\n", __func__, strerror_s(errno, ebuf, sizeof(ebuf)));
      close(write_fd);
      return 0;
   }

   pthread_mutex_unlock(&db_mutex);

   /* write the entry */
   /* note: db_write_with_mutex is a write() loop wrapped with db_mutex lock and unlock */ 
   if (db_write_with_mutex(write_fd, (char *) &dbrecord, DB_ENTRY_SIZE) < 0)
   {
      fprintf(stderr, "db_entry_add2db - db_write failed!");         
      close(write_fd);

      return 0;
   }

   /* close the data file */
   close(write_fd);

   return 1; 
}

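For completeness, one more thing. I have a similar but simpler routine that may also be causing the problem. It uses buffered output (FILE*, fopen, fwrite) but calls fflush() at the end of every write. It writes to a different file than the earlier routine, but could produce the same symptoms.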

pthread_mutex_lock(&data_mutex);

/* determine the offset at which the data will be written. this has to be accurate,    
otherwise it could be causing the problem */ 
offset = ftell(current_fp);

fwrite(data);
fflush(current_fp);

pthread_mutex_unlock(&data_mutex);

Best answer

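There seem to be several places where things could go wrong. I would make the following changes: (1) be consistent and use the same I/O library throughout, as bdonlan suggested, and (2) make the lseek() and the write a single atomic operation protected by a mutex, so that only one thread at a time can perform the operations that add to the two files. SEEK_CUR seeks relative to the current position of the file offset pointer, so wouldn't you want SEEK_END, to seek to the end of the file and append there? Then, if you are modifying a specific part of the file, you would use SEEK_SET to reposition to the place you want to write. And you want to do this inside the mutex-protected section, so that only a single thread at a time is allowed to do the file positioning and the file update.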
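The suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names file_mutex, write_all, append_entry and overwrite_entry are hypothetical and do not come from the question's code, ENTRY_SIZE stands in for the 31-byte entry size, and every thread is assumed to share one descriptor and one mutex within a single process.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

#define ENTRY_SIZE 31   /* assumed entry size, taken from the question */

/* Hypothetical mutex guarding every seek+write pair on the file. */
static pthread_mutex_t file_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Write all len bytes at the descriptor's current position.
   The caller must hold file_mutex. Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Seek to the end of the file and append one entry inside a single
   critical section. Returns the offset where the entry starts, or -1. */
static off_t append_entry(int fd, const char *entry)
{
    pthread_mutex_lock(&file_mutex);
    off_t offset = lseek(fd, 0, SEEK_END);
    if (offset != -1 && write_all(fd, entry, ENTRY_SIZE) != 0)
        offset = -1;
    pthread_mutex_unlock(&file_mutex);
    return offset;
}

/* Reposition with SEEK_SET and overwrite one entry, holding the same
   mutex so positioning and writing cannot interleave across threads. */
static int overwrite_entry(int fd, off_t offset, const char *entry)
{
    int rc = -1;

    pthread_mutex_lock(&file_mutex);
    if (lseek(fd, offset, SEEK_SET) != -1)
        rc = write_all(fd, entry, ENTRY_SIZE);
    pthread_mutex_unlock(&file_mutex);
    return rc;
}
```

The key design point is that the mutex is released only after both the positioning call and the write have finished, and that every code path unlocks it, including the error paths. A mutex like this only serializes threads that actually use it; it does nothing for other processes writing to the same file.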

Regarding "C - write() position obtained with lseek() is inaccurate when using multiple threads", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/11286736/
