Counting words in a word-count program that takes a filename

Tags: c, linux

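I seem to be getting the line, character, and space counts correctly. However, I'm having trouble figuring out how to count words. They don't have to be dictionary words; for example, fdasf fdjsak fea would count as three words.

Here is what I have: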

#include <stdio.h>

int main(int argc, char* argv[]) {
  int ccount = 0;
  int wcount = 1;
  int lcount = 0;
  int scount = 0;

  char* filename = argv[1];
  printf("filename is: %s\n", filename);

  FILE* infile;
  infile = fopen(filename, "r");

  if (infile == NULL) {
    printf("%s: is not a valid file\n", filename);
    return -1;
  }

  char c;
  while ((c = fgetc(infile)) != EOF) {
    if (c == ' ') {
      wcount++;
    }
    if (c == '\n') {
      lcount++;
    }
    if (c != ' ' || c != '\n') {
      ccount++;
    }
    if (c == ' ') {
      scount ++;
    }
  }

  printf("total number of lines: %d\n", lcount);
  printf("total number of characters: %d \n", ccount);
  printf("total number of non-whitespace characters: %d \n", scount );
  printf("total number of words: %d \n", wcount);

  return 0;
}

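Best answer

There are many ways to do this, but here is a short example that reads from stdin; for your purposes you can simply change stdin to infile (after opening infile). It does not count an empty line (a lone '\n') as a word. You can modify it to suit your needs. It includes comments explaining the logic. Let me know if you have any questions: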

#include <stdio.h>

int main (void) {

    char *line = NULL;  /* pointer to use with getline ()  */
    char *p = NULL;     /* pointer to parse getline return */
    ssize_t read = 0;   /* actual chars read per-line      */
    size_t n = 0;       /* buffer size (0 = getline allocates) */
    int spaces = 0;     /* counter for spaces and newlines */
    int total = 0;      /* counter for total words read    */

    printf ("\nEnter a line of text (or ctrl+d to quit)\n");

    while (printf ("\n input: ") && (read = getline (&line, &n, stdin)) != -1) 
    {
        /* strip trailing '\n' or '\r' (stop at 0 so line[-1] is never read) */
        while (read > 0 && (line[read-1] == '\n' || line[read-1] == '\r'))
            line[--read] = 0;

        spaces = 0;
        p = line;

        if (read > 0) {        /* read = 0 covers '\n' case (blank line with [enter])  */
            while (*p) {                            /* for each character in line      */
                if (*p == '\t' || *p == ' ') {      /* if space,       */
                    while (*p == '\t' || *p == ' ') /* read all spaces */
                        p++;
                    spaces += 1;                    /* consider sequence of spaces 1   */
                } else
                    p++;                            /* if not space, increment pointer */
            }
            total += spaces + 1;                    /* words per-line = spaces + 1     */
        }

        printf (" chars read: %2zd,  spaces: %2d,  words: %2d,  total: %3d   | '%s'\n",
                read, spaces, (read > 0) ? spaces+1 : 0,   /* one-word lines count as 1 */
                total, (read > 0) ? line : "[enter]");
    }

    printf ("\n\n  Total words read: %d\n\n", total);

    return 0;

}

Output:

$ ./bin/countwords

Enter a line of text (or ctrl+d to quit)

 input: my dog has fleas
 chars read: 16,  spaces:  3,  words:  4,  total:   4   | 'my dog has fleas'

 input:
 chars read:  0,  spaces:  0,  words:  0,  total:   4   | '[enter]'

 input: fee fi fo fum
 chars read: 13,  spaces:  3,  words:  4,  total:   8   | 'fee fi fo fum'

 input:

  Total words read: 8

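Note: to extend the set of whitespace characters recognized, you can include the ctype.h header and use the isspace() function instead of simply checking for spaces and tabs as above. It was left out intentionally, to limit the required headers to stdio.h.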

Source: a similar question on Stack Overflow about counting words in a word-count program: https://stackoverflow.com/questions/28778108/
