awk - 用小数秒创建时间戳

awk可以使用 strftime 函数生成时间戳，例如

$ awk 'BEGIN {print strftime("%Y/%m/%d %H:%M:%S")}'
2019/03/26 08:50:42

但我需要一个带小数秒的时间戳，最好低至纳秒。 gnu date可以通过 %N 做到这一点元素:

$ date "+%Y/%m/%d %H:%M:%S.%N"
2019/03/26 08:52:32.753019800

但是调用date效率比较低从内部 awk相比于调用 strftime ，我需要高性能，因为我正在使用 awk 处理许多大文件并且需要在处理文件时生成许多时间戳。有没有办法让awk可以有效地生成一个包含小数秒的时间戳(理想情况下是纳秒，但毫秒是可以接受的)？

添加我正在尝试执行的示例:

awk -v logFile="$logFile" -v outputFile="$outputFile" '
BEGIN {
   print "[" strftime("%Y%m%d %H%M%S") "] Starting to process " FILENAME "." >> logFile
}
{
    data[$1] += $2
}
END {
    print "[" strftime("%Y%m%d %H%M%S") "] Processed " NR " records." >> logFile
    for (id in data) {
        print id ": " data[id] >> outputFile
    }
}
' oneOfManyLargeFiles

最佳答案

如果您确实需要亚秒级计时，则可以调用任何外部命令，例如 date或读取外部系统文件，例如 /proc/uptime或 /proc/rct违背了亚秒级精度的目的。这两种情况都需要很多资源来检索请求的信息(即时间)

由于 OP 已经使用了 GNU awk，因此您可以使用动态扩展。动态扩展是通过实现用 C 或 C++ 编写的新函数并用 gawk 动态加载它们来向 awk 添加新功能的一种方式。如何编写这些函数在 GNU awk manual 中有广泛的记载。 .

幸运的是，GNU awk 4.2.1 附带了一组可以随意加载的默认动态库。这些库之一是 time具有两个简单功能的库:

the_time = gettimeofday() Return the time in seconds that has elapsed since 1970-01-01 UTC as a floating-point value. If the time is unavailable on this platform, return -1 and set ERRNO. The returned time should have sub-second precision, but the actual precision may vary based on the platform. If the standard C gettimeofday() system call is available on this platform, then it simply returns the value. Otherwise, if on MS-Windows, it tries to use GetSystemTimeAsFileTime().

result = sleep(seconds) Attempt to sleep for seconds seconds. If seconds is negative, or the attempt to sleep fails, return -1 and set ERRNO. Otherwise, return zero after sleeping for the indicated amount of time. Note that seconds may be a floating-point (nonintegral) value. Implementation details: depending on platform availability, this function tries to use nanosleep() or select() to implement the delay.

_{source: GNU awk manual}

现在可以用一种相当直接的方式调用这个函数:

awk '@load "time"; BEGIN{printf "%.6f", gettimeofday()}'
1553637193.575861

为了证明这种方法比更经典的实现更快，我使用 gettimeofday() 对所有 3 个实现进行计时。 :

awk '@load "time"
     function get_uptime(   a) {
        if((getline line < "/proc/uptime") > 0)
        split(line,a," ")
        close("/proc/uptime")
        return a[1]
     }
     function curtime(    cmd, line, time) {
        cmd = "date \047+%Y/%m/%d %H:%M:%S.%N\047"
        if ( (cmd | getline line) > 0 ) {
           time = line
        }
        else {
           print "Error: " cmd " failed" | "cat>&2"
        }
        close(cmd)
        return time
      }
      BEGIN{
        t1=getimeofday(); curtime(); t2=gettimeofday();
        print "curtime()",t2-t1
        t1=getimeofday(); get_uptime(); t2=gettimeofday();
        print "get_uptime()",t2-t1
        t1=getimeofday(); gettimeofday(); t2=gettimeofday();
        print "gettimeofday()",t2-t1
      }'

输出:

curtime() 0.00519109
get_uptime() 7.98702e-05
gettimeofday() 9.53674e-07

虽然很明显 curtime()是最慢的，因为它加载外部二进制文件，令人吃惊的是，awk 在处理额外的外部/proc/文件时非常快。

关于awk - 用小数秒创建时间戳，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/55358991/

awk - 用小数秒创建时间戳

上一篇：google-cloud-platform - GCP 上的 Terraform 共享 VPC - 静态内部 IP 地址

下一篇：game-development - 有没有办法在 GDScript 中通过引用传递内置变量？