shell - How does rand() work in awk?

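I am trying to sample the second column of a CSV file using awk and rand() (any number of samples is fine). However, I noticed that I always get the same number of samples: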

cat toy.txt | awk -F',' 'rand()<0.2 {print $2}' | wc -l

I dug into it a bit, and rand() does not seem to behave the way I expected. For example, a in the command below always seems to be 1:

cat toy.txt | awk -F',' 'a=rand() a<0.2 {print a}'

Why?

Best answer

From the documentation:

CAUTION: In most awk implementations, including gawk, rand() starts generating numbers from the same starting number, or seed, each time you run awk. Thus, a program generates the same results each time you run it. The numbers are random within one awk run but predictable from run to run. This is convenient for debugging, but if you want a program to do different things each time it is used, you must change the seed to a value that is different in each run. To do this, use srand().
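
The seeding advice in the quote can be sketched as follows. This is a minimal example under stated assumptions, not a definitive recipe: the 1000-line CSV is generated on the spot as a hypothetical stand-in for toy.txt, and `seed`, `varying`, `fixed_1` and `fixed_2` are names chosen only for illustration.

```shell
# Hypothetical 1000-line CSV standing in for toy.txt.
tmp=$(mktemp)
awk 'BEGIN {for (i = 1; i <= 1000; i++) printf "%d,value%d,x\n", i, i}' > "$tmp"

# srand() with no argument seeds from the time of day, so runs that are
# at least a second apart should select different lines.
varying=$(awk -F',' 'BEGIN {srand()} rand() < 0.2 {print $2}' "$tmp")

# An explicit seed gives a sample that looks random but is repeatable.
# "seed" is just a variable name chosen here and passed in with -v.
fixed_1=$(awk -F',' -v seed=42 'BEGIN {srand(seed)} rand() < 0.2 {print $2}' "$tmp")
fixed_2=$(awk -F',' -v seed=42 'BEGIN {srand(seed)} rand() < 0.2 {print $2}' "$tmp")

echo "time-seeded sample size: $(printf '%s\n' "$varying" | wc -l)"
[ "$fixed_1" = "$fixed_2" ] && echo "same seed, same sample"

rm -f "$tmp"
```

Because the time-based seed has a resolution of one second, two runs started within the same second will likely repeat each other. If that matters, one option is to pass a seed in from the shell, for example `-v seed="$RANDOM"` in bash (`$RANDOM` is a bash/ksh/zsh feature, not POSIX sh).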

A similar question about "shell - How does rand() work in awk" can be found on Stack Overflow: https://stackoverflow.com/questions/45901042/
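
A note on the second experiment in the question, where a always prints as 1: this looks like an operator-precedence effect rather than a seeding problem. In awk, two expressions separated only by a space are concatenated, concatenation binds tighter than `<`, and assignment binds loosest, so `a=rand() a<0.2` is most likely parsed as `a = ((rand() a) < 0.2)`. The variable then holds the 0/1 result of the comparison, and the action only fires when that result is 1. The sketch below shows what was probably intended, with the assignment moved into its own action; the generated CSV is a hypothetical stand-in for toy.txt, and the seed 42 is arbitrary.

```shell
# Hypothetical 1000-line CSV standing in for toy.txt.
tmp=$(mktemp)
awk 'BEGIN {for (i = 1; i <= 1000; i++) printf "%d,value%d,x\n", i, i}' > "$tmp"

# Original form: a receives the result of a comparison, so the only value
# that ever gets printed is 1.
original=$(awk -F',' 'a=rand() a<0.2 {print a}' "$tmp" | sort -u)

# Intended form: assign in an action first, then test a in a separate pattern.
# srand(42) fixes the seed so this run is reproducible.
fixed=$(awk -F',' 'BEGIN {srand(42)} {a = rand()} a < 0.2 {print a}' "$tmp")

echo "distinct values printed by the original form: $original"
echo "first few values printed by the intended form:"
printf '%s\n' "$fixed" | head -n 3

rm -f "$tmp"
```

Splitting the work into an unconditional action `{a = rand()}` followed by the pattern `a < 0.2` keeps the assignment and the test from being merged into one expression.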
