Python data.table row filter by regex

What is the equivalent of %like% in data.table for Python?

A short example:

dt_foo_bar = dt.Frame({"n": [1, 3], "s": ["foo", "bar"]})
dt_foo_bar[re.match("foo", f.s), :]  # works to filter by "foo"

I expected something like this to work:

dt_foo_bar[re.match("fo", f.s), :]

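But it returns "expected string or bytes-like object". I would love to start using the new datatable package in Python the way I use data.table in R, but I work with far more text data than numeric data.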

Thanks in advance.

Best answer

As of version 0.9.0, datatable includes the function .re_match(), which performs regular-expression filtering. For example:

>>> import datatable as dt
>>> dt_foo_bar = dt.Frame(N=[1, 3, 5], S=["foo", "bar", "fox"])
>>> dt_foo_bar[dt.f.S.re_match("fo."), :]
     N  S  
--  --  ---
 0   1  foo
 1   5  fox

[2 rows x 2 columns]

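In general, .re_match() is applied to a column expression and produces a new boolean column that indicates, for each value, whether it matches the given regular expression.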
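To make the idea of a boolean match column concrete, here is a minimal sketch in plain Python using only the standard re module. It is not how datatable implements .re_match(). The helper name regex_mask is hypothetical, and the sketch assumes, as the "fo." pattern in the example above suggests, that the pattern is tested against the whole string rather than a prefix.

```python
import re


def regex_mask(values, pattern):
    """Return one bool per value: True when the whole string matches `pattern`.

    Hypothetical helper for illustration only; whole-string matching is an
    assumption about the semantics, not a statement about datatable internals.
    """
    compiled = re.compile(pattern)
    # None stands in for a missing value and never matches.
    return [v is not None and compiled.fullmatch(v) is not None for v in values]


numbers = [1, 3, 5]
names = ["foo", "bar", "fox"]

# One boolean per row, analogous to the bool column described above.
mask = regex_mask(names, "fo.")

# Keep only the rows whose mask entry is True, as a row filter would.
kept = [(n, s) for n, s, keep in zip(numbers, names, mask) if keep]

print(mask)
print(kept)
```

Under this whole-string assumption, the shorter pattern "fo" matches neither "foo" nor "fox". The regex module's own re.match("fo", f.s) fails for a different reason: re.match expects a str or bytes value as its second argument, and f.s is a column expression rather than a string.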

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/54621252/
