linux - Reformatting a file with AWK

Tags: linux shell awk scripting

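I'm struggling to reformat a comma-separated file with awk. The file contains one day of per-minute data for multiple servers and multiple metrics,
for example 2 records per server per minute, covering 24 hours.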

Sample input file:

server01,00:01:00,AckDelayAverage,9999  
server01,00:01:00,AckDelayMax,8888  
server01,00:02:00,AckDelayAverage,666  
server01,00:02:00,AckDelayMax,5555  
.....  
server01,23:58:00,AckDelayAverage,4545  
server01,23:58:00,AckDelayMax,8777  
server01,23:59:00,AckDelayAverage,4686  
server01,23:59:00,AckDelayMax,7820  
server02,00:01:00,AckDelayAverage,1231  
server02,00:01:00,AckDelayMax,4185  
server02,00:02:00,AckDelayAverage,1843  
server02,00:02:00,AckDelayMax,9982  
.....  
server02,23:58:00,AckDelayAverage,1022  
server02,23:58:00,AckDelayMax,1772  
server02,23:59:00,AckDelayAverage,1813  
server02,23:59:00,AckDelayMax,9891  

I'm trying to reformat the file so that it has one row per minute, with each unique concatenation of fields 1 and 3 used as a column header.

For example, the expected output file would look like this:

Minute, server01-AckDelayAverage,server01-AckDelayMax, server02-AckDelayAverage,server02-AckDelayMax  

00:01:00,9999,8888,1231,4185  
00:02:00,666,5555,1843,9982  
...  
...  
23:58:00,4545,8777,1022,1772  
23:59:00,4686,7820,1813,9891  

Best answer

A solution using GNU awk (it relies on asorti, which is a gawk extension). Invoke it as awk -F, -f script input_file:

# Store each value keyed by (minute, server).
/Average/ { average[$2, $1] = $4 }
/Max/     { maximum[$2, $1] = $4 }
{
    # Remember every minute and server seen; only the indices matter.
    minutes[$2] = 1
    servers[$1] = 1
}
END {
    # asorti() puts the sorted indices into smin/sserv and returns the count.
    mcount = asorti(minutes, smin)
    scount = asorti(servers, sserv)
    # Header row: two columns per server.
    printf "minutes"
    for (col = 1; col <= scount; col++) {
        printf ",%s-average,%s-maximum", sserv[col], sserv[col]
    }
    print ""
    # One row per minute; data is passed as arguments, never as the format string.
    for (row = 1; row <= mcount; row++) {
        key = smin[row]
        printf "%s", key
        for (col = 1; col <= scount; col++) {
            printf ",%s,%s", average[key, sserv[col]], maximum[key, sserv[col]]
        }
        print ""
    }
}
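
The script above hard-codes the two metrics and labels the columns "-average" and "-maximum". As a rough sketch only, a variant could build the headers from fields 1 and 3, as the question asks, and avoid asorti so that it also runs on non-GNU awk. It assumes that rows and columns may simply follow first-seen order, which matches the sample because the input is already grouped by server and ordered by time. The wrapper name pivot_metrics is hypothetical and not part of the original answer.

```shell
#!/bin/sh
# pivot_metrics [FILE...]: hypothetical helper, reads stdin when no file is given.
pivot_metrics() {
    awk -F, '
    # Ignore blank or filler lines that lack the four expected fields.
    NF < 4 { next }
    {
        # Drop trailing whitespace or a carriage return from the value.
        sub(/[ \t\r]+$/, "", $4)
        col = $1 "-" $3
        # Assumption: first-seen order is the wanted column and row order.
        if (!(col in seen_col)) { seen_col[col] = 1; cols[++ncol] = col }
        if (!($2 in seen_min))  { seen_min[$2] = 1;  mins[++nmin] = $2 }
        value[$2, col] = $4
    }
    END {
        printf "%s", "Minute"
        for (c = 1; c <= ncol; c++) printf ",%s", cols[c]
        print ""
        for (r = 1; r <= nmin; r++) {
            printf "%s", mins[r]
            # A missing (minute, column) pair prints as an empty cell.
            for (c = 1; c <= ncol; c++) printf ",%s", value[mins[r], cols[c]]
            print ""
        }
    }' "$@"
}
```

It could then be called as pivot_metrics input_file > output_file. If the input is not already in the desired order, sorting it beforehand (for example with sort -t, -k1,1 -k2,2) would be one way to keep the first-seen assumption valid.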

Regarding linux - Reformatting a file with AWK, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37589871/