python - Filter dataclass instances by unique attribute value

I have a list of dataclass instances of the form:

dataclass_list = [DataEntry(company="Microsoft", users=["Jane Doe", "John Doe"]), DataEntry(company="Google", users=["Bob Whoever"]), DataEntry(company="Microsoft", users=[])]

Now I want to filter that list and keep only the instances that are unique by a certain key (company in this case).

The desired list:

new_list = [DataEntry(company="Microsoft", users=["Jane Doe", "John Doe"]), DataEntry(company="Google", users=["Bob Whoever"])]

My first idea was to use something along the lines of Python's set() or filter() functions, but neither seems to work here.

My working solution so far:

tup_list = [(dataclass, dataclass.company) for dataclass in dataclass_list]
new_list = []
check_list = []
for tup in tup_list:
    if tup[1].lower() not in check_list:
        new_list.append(tup[0])
        check_list.append(tup[1].lower())

This gives me the desired output, but I was wondering whether there is a more Pythonic or more elegant solution?

Best answer

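In your DataEntry dataclass, you need to override the __eq__(...) and __hash__(...) methods. There you specify which attribute is used to compute the object's hash and when two objects are considered equal.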
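Applied to the question's DataEntry, this could look roughly like the sketch below. The question does not show the class definition, so the field types are assumptions, and the `_key` helper is a hypothetical name introduced here. The `.lower()` call mirrors the case-insensitive check in the question's loop. `dict.fromkeys` is used rather than `set()` so that the first occurrence of each company is kept in its original position.

```python
from dataclasses import dataclass, field


# eq=False: the dataclass generates no __eq__/__hash__, we supply our own.
@dataclass(eq=False)
class DataEntry:  # assumed definition, not shown in the question
    company: str
    users: list[str] = field(default_factory=list)

    def _key(self) -> str:
        # Hypothetical helper: the value that decides uniqueness.
        return self.company.lower()

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, DataEntry):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self) -> int:
        return hash(self._key())


dataclass_list = [
    DataEntry(company="Microsoft", users=["Jane Doe", "John Doe"]),
    DataEntry(company="Google", users=["Bob Whoever"]),
    DataEntry(company="Microsoft", users=[]),
]

# dict keys are unique by __hash__/__eq__ and keep insertion order,
# so the first entry seen for each company survives.
new_list = list(dict.fromkeys(dataclass_list))
print([(entry.company, entry.users) for entry in new_list])
```

One caveat with this design: the hash depends on the mutable `company` field, so an entry should not have its company changed while it sits in a set or is used as a dict key.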

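Here is a short example in which the name attribute of a class Company is used by default to determine whether two objects are equal. I have also extended your case with an option that lets you choose, when constructing the object, which attribute is considered for uniqueness. Note that all objects being compared need to have the same comparison_attr.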

import pprint

class Company:

    def __init__(self, name, location, comparison_attr="name") -> None:
        # By default we use the attribute `name` for comparison
        self.name = name
        self.location = location
        self.__comparison_attr = comparison_attr

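    def __hash__(self) -> int:
        # Hash only the attribute chosen for comparison
        return hash(getattr(self, self.__comparison_attr))

    def __eq__(self, other: object) -> bool:
        # Defer to Python for comparisons with unrelated types
        if not isinstance(other, Company):
            return NotImplemented
        return getattr(self, self.__comparison_attr) == getattr(other, self.__comparison_attr)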

    def __repr__(self) -> str:
        return f"name={self.name}, location={self.location}"

for attribute_name in ["name", "location"]:
    companies = [
        Company("Google", "Palo Alto", comparison_attr=attribute_name), 
        Company("Google", "Berlin", comparison_attr=attribute_name),
        Company("Microsoft", "Berlin", comparison_attr=attribute_name),
        Company("Microsoft", "San Francisco", comparison_attr=attribute_name),
        Company("IBM", "Palo Alto", comparison_attr=attribute_name),
    ]

    print(f"Attribute considered for uniqueness: {attribute_name}")
    pprint.pprint(set(companies))

Output (a set is unordered, so the order of the elements may differ between runs):

Attribute considered for uniqueness: name
{name=Microsoft, location=Berlin,
 name=Google, location=Palo Alto,
 name=IBM, location=Palo Alto}

Attribute considered for uniqueness: location
{name=Microsoft, location=San Francisco,
 name=Google, location=Berlin,
 name=Google, location=Palo Alto}
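If changing the class is not an option, the same result can be reached without touching `__eq__` or `__hash__`. The sketch below is an alternative to the approach in the answer, not part of it: it collects entries in a dict keyed by the lower-cased company name. The `unique_by` function is a hypothetical helper named for illustration, and the DataEntry definition is again assumed because the question does not show it.

```python
from dataclasses import dataclass, field


@dataclass
class DataEntry:  # assumed definition, not shown in the question
    company: str
    users: list[str] = field(default_factory=list)


def unique_by(items, key):
    """Keep the first item for each distinct key(item), preserving order."""
    seen = {}
    for item in items:
        # setdefault stores the item only if the key is not present yet.
        seen.setdefault(key(item), item)
    return list(seen.values())


dataclass_list = [
    DataEntry(company="Microsoft", users=["Jane Doe", "John Doe"]),
    DataEntry(company="Google", users=["Bob Whoever"]),
    DataEntry(company="Microsoft", users=[]),
]

new_list = unique_by(dataclass_list, key=lambda entry: entry.company.lower())
print(new_list)
```

This keeps the question's case-insensitive comparison and replaces the list membership test, which scans the whole list on every check, with a dict lookup.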

Regarding python - Filter dataclass instances by unique attribute value, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/71426366/
