f# - Rearranging rows and columns in Deedle

Tags: f# deedle

I have a table:

 Month Cluster Year ActualAmount TargetedAmount 
 1     1       2015 100          200            
 1     1       2016 300          400            
 1     1       2017 300          400            
 2     1       2015 500          600            
 2     2       2016 700          800  

I want the row values of Year to become columns:

 Month Cluster ActualAmount.2015 ActualAmount.2016 ActualAmount.2017 TargetedAmount.2015 ...
 1     1        100          300          300          200 ...             
 2     1        500            -            -          600 ...
...

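I tried to solve it with pivotTable (see below), but the result does not have the right index, and each cell holds a nested frame instead of a value.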

#r "nuget: Deedle"

open System
open Deedle

type Record =
    { Month: int
      Cluster: int
      Year: int
      ActualAmount: int
      TargetedAmount: int }

let Records =
    [ { Month = 1
        Cluster = 1
        Year = 2015
        ActualAmount = 100
        TargetedAmount = 200 }
      { Month = 1
        Cluster = 1
        Year = 2016
        ActualAmount = 300
        TargetedAmount = 400 }
      { Month = 1
        Cluster = 1
        Year = 2017
        ActualAmount = 300
        TargetedAmount = 400 }
      { Month = 2
        Cluster = 1
        Year = 2015
        ActualAmount = 500
        TargetedAmount = 600 }
      { Month = 2
        Cluster = 2
        Year = 2016
        ActualAmount = 700
        TargetedAmount = 800 } ]

let df = Frame.ofRecords Records

df.Print()

let pdf = df |> Frame.pivotTable (fun k r -> r.GetAs<int>("Month")) (fun k r -> r.GetAs<int>("Year")) id
    
pdf.Print()

This prints:

     2015                                       2016                                       2017                                       
1 -> Deedle.Frame`2[System.Int32,System.String] Deedle.Frame`2[System.Int32,System.String] Deedle.Frame`2[System.Int32,System.String] 
2 -> Deedle.Frame`2[System.Int32,System.String] Deedle.Frame`2[System.Int32,System.String] <missing>                                      

Any help is appreciated.

Best answer

I'm not a Deedle expert, but this seems to work:

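// Pivot one value column: rows keyed by (Month, Cluster), columns keyed by Year.
let pivot col df =
    df
        |> Frame.pivotTable
            (fun k r -> r.GetAs<int>("Month"), r.GetAs<int>("Cluster"))
            (fun k r -> r.GetAs<int>("Year"))
            // Sum the column within each cell and keep it as a string,
            // so that missing cells can be filled with "-" afterwards.
            (fun frm -> frm.GetColumn(col).Sum().ToString())
        |> Frame.fillMissingWith "-"
        // Prefix each year with the column name, e.g. "ActualAmount.2015".
        |> Frame.mapColKeys (fun c -> $"{col}.{c}")

let actualAmounts =
    df |> pivot "ActualAmount"
let targetedAmounts =
    df |> pivot "TargetedAmount"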

// Turn the composite (Month, Cluster) row key back into ordinary columns.
let monthClusters =
    actualAmounts
        |> Frame.mapRows (fun (month, cluster) _ ->
            [
                "Month", month
                "Cluster", cluster
            ] |> Series.ofObservations)
        |> Frame.ofRows

let pdf = monthClusters.Join(actualAmounts).Join(targetedAmounts)
pdf.Print()

The output is:

       Month Cluster ActualAmount.2015 ActualAmount.2016 ActualAmount.2017 TargetedAmount.2015 TargetedAmount.2016 TargetedAmount.2017
1 1 -> 1     1       100               300               300               200                 400                 400
2 1 -> 2     1       500               -                 -                 600                 -                   -
  2 -> 2     2       -                 700               -                 -                   800                 -

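The trick is to compute a separate pivot table for the actual amounts and for the targeted amounts, both keyed by (Month, Cluster), and then join them together.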
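
To make the "pivot each measure, then join" idea easier to see in isolation, here is a minimal sketch of the same reshaping in plain F#, without Deedle. It is not part of the answer above: the names `pivot`, `wide` and `cell` are hypothetical, and nested `Map` values stand in for frames, so a missing cell is simply an absent key rather than "-".

```fsharp
// Plain-F# sketch of the reshaping; no Deedle involved (illustrative only).
type Record =
    { Month: int; Cluster: int; Year: int; ActualAmount: int; TargetedAmount: int }

let records =
    [ { Month = 1; Cluster = 1; Year = 2015; ActualAmount = 100; TargetedAmount = 200 }
      { Month = 1; Cluster = 1; Year = 2016; ActualAmount = 300; TargetedAmount = 400 }
      { Month = 1; Cluster = 1; Year = 2017; ActualAmount = 300; TargetedAmount = 400 }
      { Month = 2; Cluster = 1; Year = 2015; ActualAmount = 500; TargetedAmount = 600 }
      { Month = 2; Cluster = 2; Year = 2016; ActualAmount = 700; TargetedAmount = 800 } ]

// Pivot one measure: (Month, Cluster) -> map from "name.year" to the summed value.
let pivot (name: string) (measure: Record -> int) (rows: Record list) =
    rows
    |> List.groupBy (fun r -> r.Month, r.Cluster)
    |> List.map (fun (key, group) ->
        let cells =
            group
            |> List.groupBy (fun r -> r.Year)
            |> List.map (fun (year, rs) -> sprintf "%s.%d" name year, List.sumBy measure rs)
            |> Map.ofList
        key, cells)
    |> Map.ofList

// Join the two pivots on their shared (Month, Cluster) key.
let wide =
    let actual = pivot "ActualAmount" (fun r -> r.ActualAmount) records
    let targeted = pivot "TargetedAmount" (fun r -> r.TargetedAmount) records
    actual
    |> Map.map (fun key cells ->
        // Both pivots are built from the same rows, so the key is always present.
        Map.fold (fun acc column value -> Map.add column value acc) cells targeted.[key])

// Look up one cell; returns None where the (key, year) combination has no data.
let cell key column =
    wide |> Map.tryFind key |> Option.bind (Map.tryFind column)
```

For example, `cell (1, 1) "ActualAmount.2016"` evaluates to `Some 300`, while `cell (2, 1) "ActualAmount.2016"` evaluates to `None`, matching the "-" in the Deedle output above.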

Regarding f# - Rearranging rows and columns in Deedle, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/70937857/
