python - I need the column-wise median of a list_of_dicts_of_lists in Python

I have this data:

list_of_dicts_of_lists = [
    {'a': [1,2], 'b': [3,4], 'c': [3,2], 'd': [2,5]},
    {'a': [2,2], 'b': [2,2], 'c': [1,6], 'd': [4,7]},
    {'a': [2,2], 'b': [5,2], 'c': [3,2], 'd': [2,2]},
    {'a': [1,2], 'b': [3,4], 'c': [1,6], 'd': [5,5]}
    ]

I need this result:

median_dict_of_lists = (
    {'a': [1.5,2], 'b': [3,3], 'c': [2,4], 'd': [3,5]}
    )

...where each value is the median of the corresponding column above.

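I need the mode dict when one is available, and the median dict when there is no mode. I was able to get a quick-and-dirty statistics.mode() by converting each dict to a string, taking the mode of the list of strings, and then calling ast.literal_eval(most_common_string) to turn it back into a dict, but when there is no mode I need a column-wise median.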
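The stringify-and-count approach described above might look roughly like the sketch below. It is an illustration, not the asker's actual code: the helper name whole_dict_mode is hypothetical, and it uses collections.Counter instead of statistics.mode() so that a tie can be detected explicitly (since Python 3.8, statistics.mode() returns the first mode it encounters rather than raising StatisticsError on a tie). It also assumes every dict lists its keys in the same order, because repr() is order-sensitive.

```python
import ast
from collections import Counter

def whole_dict_mode(dicts):
    """Return the most frequent dict, or None when no single dict is most frequent."""
    # repr() of a dict of lists of numbers is a literal that
    # ast.literal_eval can parse back into an equal dict.
    counts = Counter(repr(d) for d in dicts).most_common()
    if not counts:
        return None
    # A tie for first place means there is no unique mode.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return ast.literal_eval(counts[0][0])

sample = [
    {'a': [1.0, 2.0]},
    {'a': [2.0, 2.0]},
    {'a': [1.0, 2.0]},
]
print(whole_dict_mode(sample))      # {'a': [1.0, 2.0]}
print(whole_dict_mode(sample[:2]))  # None
```

A None result is the signal to fall back to the column-wise median.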

I know how to use statistics.median(); however, the nested notation needed to apply it column-wise in this case is confusing me.

The data are all float; I wrote them as int only to make them easier to read.

Best answer

You can use statistics.median together with itertools.groupby:

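import statistics
import itertools

list_of_dicts_of_lists = [
  {'a': [1,2], 'b': [3,4], 'c': [3,2], 'd': [2,5]},
  {'a': [2,2], 'b': [2,2], 'c': [1,6], 'd': [4,7]},
  {'a': [2,2], 'b': [5,2], 'c': [3,2], 'd': [2,2]},
  {'a': [1,2], 'b': [3,4], 'c': [1,6], 'd': [5,5]}
]
# Flatten every dict into (key, list) pairs and sort by key,
# because groupby only groups adjacent items with equal keys.
pairs = sorted(itertools.chain(*map(lambda x: x.items(), list_of_dicts_of_lists)), key=lambda x: x[0])
new_listing = [(a, list(b)) for a, b in itertools.groupby(pairs, key=lambda x: x[0])]
# For each key, transpose its lists so that each tuple holds one column.
d = {a: zip(*map(lambda x: x[-1], b)) for a, b in new_listing}
# Every list has exactly two elements here, so each key unpacks into two columns.
last_data = ({a: [statistics.median(b), statistics.median(c)] for a, [b, c] in d.items()},)
print(last_data)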

Output:

({'a': [1.5, 2.0], 'b': [3.0, 3.0], 'c': [2.0, 4.0], 'd': [3.0, 5.0]},)
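The same column-wise median can also be sketched without groupby, using a plain loop and zip to transpose. This is an alternative illustration rather than part of the original answer. The function name columnwise_median is hypothetical, and it assumes every dict has the same keys as the first one and that the lists under a given key have equal length (zip silently truncates to the shortest otherwise).

```python
import statistics

def columnwise_median(dicts):
    """Median per key and per list position across a list of dicts of lists."""
    result = {}
    for key in dicts[0]:
        # zip(*...) transposes this key's lists into one tuple per position.
        columns = zip(*(d[key] for d in dicts))
        result[key] = [statistics.median(col) for col in columns]
    return result

data = [
    {'a': [1, 2], 'b': [3, 4], 'c': [3, 2], 'd': [2, 5]},
    {'a': [2, 2], 'b': [2, 2], 'c': [1, 6], 'd': [4, 7]},
    {'a': [2, 2], 'b': [5, 2], 'c': [3, 2], 'd': [2, 2]},
    {'a': [1, 2], 'b': [3, 4], 'c': [1, 6], 'd': [5, 5]},
]
print(columnwise_median(data))
# {'a': [1.5, 2.0], 'b': [3.0, 3.0], 'c': [2.0, 4.0], 'd': [3.0, 5.0]}
```

Unlike the two-element unpacking above, this version works for lists of any length.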

This question, "python - I need the column-wise median of a list_of_dicts_of_lists in Python", comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/49353986/
