R has a nice feature for running a regression with dummy variables for each level of a categorical variable. For example: Automatically expanding an R factor into a collection of 1/0 indicator variables for every factor level

Is there an equivalent way to do this in Julia?

using DataFrames, GLM

x = randn(1000)
group = repmat(1:25 , 40)
groupMeans = randn(25)
y = 3*x + groupMeans[group]

data = DataFrame(x=x, y=y, g=group)
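# Add one Boolean indicator column (I1, I2, ...) per group level
for i in levels(group)
    data[parse("I$i")] = data[:g] .== i
end
# The start of the model call is missing from the post; only its tail survives:
#     ... I21+I22+I23+I24, data)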


If you are using the DataFrames package, then once you pool the data, the package takes care of the rest:

Pooling columns is important for working with the GLM package. When fitting regression models, PooledDataArray columns in the input are translated into 0/1 indicator columns in the ModelMatrix - with one column for each of the levels of the PooledDataArray.
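To make the "0/1 indicator columns" idea concrete, here is a minimal sketch that builds the indicator columns by hand and solves the least-squares problem directly, using only base Julia. It is not what the package does internally. The names `lvls`, `indicators`, `X` and `beta` are illustrative, it uses current Julia syntax (`repeat` instead of `repmat`), and it assumes the first level is dropped as a baseline so the columns are not collinear with the intercept.

```julia
using Random

Random.seed!(1)                       # fixed seed so the run is repeatable
x = randn(1000)
group = repeat(1:25, 40)              # 25 groups, 40 observations each
groupMeans = randn(25)
y = 3 .* x .+ groupMeans[group]

lvls = sort(unique(group))
# One 0/1 column per level except the first, which acts as the baseline
# (assumption: dummy coding with an intercept)
indicators = [Float64(g == l) for g in group, l in lvls[2:end]]

# Design matrix: intercept, x, then the 24 indicator columns
X = hcat(ones(length(x)), x, indicators)

# Ordinary least squares via the backslash operator
beta = X \ y
```

Under these assumptions `beta[2]` recovers the slope on `x`, `beta[1]` is the mean of the baseline group, and the remaining entries are each group's offset from that baseline.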

You can see the rest of the documentation on pooling data here
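For readers on later package versions, where `PooledDataArray` is no longer the categorical column type, the same idea can be sketched with a categorical column and a formula. This assumes the DataFrames, GLM and CategoricalArrays packages are installed. It is a sketch rather than the exact code the quoted documentation refers to, and the variable name `model` is illustrative.

```julia
using DataFrames, GLM, CategoricalArrays

x = randn(1000)
group = repeat(1:25, 40)
groupMeans = randn(25)
y = 3 .* x .+ groupMeans[group]

# Marking g as categorical is what tells the model-fitting code to
# expand it into indicator columns instead of treating it as a number
data = DataFrame(x = x, y = y, g = categorical(group))

# No manual I1, I2, ... columns are needed in the formula
model = lm(@formula(y ~ x + g), data)
```

With an intercept in the model, one level serves as the baseline, so the fit has 26 coefficients: the intercept, the slope on `x`, and 24 group offsets.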

