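Note

df2 is the only thing that can be used here; using df or df1 would mean using data that is not available. The data arrives in the form of df2 and needs to be manipulated into the form of df1. Neither df1 nor df can be used as part of the solution (because df1 is the solution).

Setting up the test data

This is only setup for this post.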

import pandas as pd

# sample data
reps = {1: "dog", 2: "ant", 3: "cat", 6: "orange", 7: "apple", 8: "grape"}
df = pd.DataFrame(
    {"one": [1, 1, 1, 2, 2, 2, 3, 3, 3], "two": [6, 7, 8, 6, 7, 8, 6, 7, 8]}
)
df = df.replace(reps).copy()
df1 = df.copy()
df2 = df.sample(frac=1, random_state=1).replace(reps).reset_index(drop=True)

Question

Sort df2 so that it is in the same order as df1.

df2:

   one     two
0  cat   grape
1  dog   grape
2  cat  orange
3  cat   apple
4  dog   apple
5  dog  orange
6  ant   apple
7  ant  orange
8  ant   grape

df1:

   one     two
0  dog  orange
1  dog   apple
2  dog   grape
3  ant  orange
4  ant   apple
5  ant   grape
6  cat  orange
7  cat   apple
8  cat   grape

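Conditions

You cannot use df1 or df as part of the solution. The data is df2, and it needs to be sorted into the order of df1.

Attempt

I tried using pd.Categorical, but could not get it to work.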

order_one = ["dog", "ant", "cat"]
order_two = ["orange", "apple", "grape"]

df2 = (
    df2.groupby(["two"])
    .apply(lambda a: a.iloc[pd.Categorical(a["one"], order_one).argsort()])
    .reset_index(drop=True)
)

df2 = (
    df2.groupby(["one"])
    .apply(lambda a: a.iloc[pd.Categorical(a["two"], order_two).argsort()])
    .reset_index(drop=True)
)
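
One likely reason the attempt above does not produce the desired order: groupby sorts the group keys by default, so the groups come back in alphabetical order (ant, cat, dog) no matter how the rows inside each group were arranged. The minimal sketch below only illustrates that behaviour. It rebuilds df2 from the table printed in the question, and the names default_keys and appearance_keys are made up for this illustration.

```python
import pandas as pd

# df2 as printed in the question (already shuffled).
df2 = pd.DataFrame(
    {
        "one": ["cat", "dog", "cat", "cat", "dog", "dog", "ant", "ant", "ant"],
        "two": ["grape", "grape", "orange", "apple", "apple", "orange",
                "apple", "orange", "grape"],
    }
)

# With the default sort=True, the group keys are iterated in sorted order.
default_keys = [name for name, _ in df2.groupby("one")]

# With sort=False, the group keys keep their order of first appearance.
appearance_keys = [name for name, _ in df2.groupby("one", sort=False)]

print(default_keys)
print(appearance_keys)
```

Neither ordering is the wanted dog, ant, cat, which suggests the custom order has to be given to the sort itself rather than applied inside each group.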

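Edit

The solution should be based purely on df2. df1 is only part of the test data, there to show how df2 should end up sorted. A solution that uses df1 is not viable, because df1 is the result of sorting df2 and I cannot use it as part of the solution.

Best answer

Let's try pd.Categorical. Note that this first version reads the category order from df1; the Update further down passes explicit order lists instead.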

df2["one"] = pd.Categorical(df2["one"], categories=df1["one"].unique())
df2["two"] = pd.Categorical(df2["two"], categories=df1["two"].unique())
df2 = df2.sort_values(["one", "two"])
df2
   one     two
5  dog  orange
4  dog   apple
1  dog   grape
7  ant  orange
6  ant   apple
8  ant   grape
2  cat  orange
3  cat   apple
0  cat   grape

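Turn it into a function

def yourfunc(x, y):
    # Use the order in which values first appear in y as the category order for x.
    for c in x.columns:
        x[c] = pd.Categorical(x[c], categories=y[c].unique())
    return x.sort_values(x.columns.tolist())

# This demo sorts df1 into the order in which values first appear in df2.
yourfunc(df1, df2)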
   one     two
8  cat   grape
6  cat  orange
7  cat   apple
2  dog   grape
0  dog  orange
1  dog   apple
5  ant   grape
3  ant  orange
4  ant   apple

Update

order_fruit = ["orange", "apple", "grape"]
order_animals = ["dog", "ant", "cat"]

def yourfunc(x, y):
    # y holds one explicit category order per column of x, in column order.
    for c, categories in zip(x.columns, y):
        x[c] = pd.Categorical(x[c], categories=categories)
    return x.sort_values(x.columns.tolist())

yourfunc(df2, [order_animals, order_fruit])
   one     two
5  dog  orange
4  dog   apple
1  dog   grape
7  ant  orange
6  ant   apple
8  ant   grape
2  cat  orange
3  cat   apple
0  cat   grape
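
The function above converts the columns of the frame it receives to categorical dtype as a side effect, and the result keeps the shuffled index labels. If that is not wanted, one possible variation is to sort a helper frame of integer ranks and reuse its index order. This is a sketch under assumptions, not the answer's method: sort_by_custom_order and the orders dict are hypothetical names, df2 is rebuilt from the table in the question, and the index is assumed to be unique.

```python
import pandas as pd

# df2 as printed in the question (already shuffled).
df2 = pd.DataFrame(
    {
        "one": ["cat", "dog", "cat", "cat", "dog", "dog", "ant", "ant", "ant"],
        "two": ["grape", "grape", "orange", "apple", "apple", "orange",
                "apple", "orange", "grape"],
    }
)

# Desired order per column, keyed by column name (lists taken from the question).
orders = {
    "one": ["dog", "ant", "cat"],
    "two": ["orange", "apple", "grape"],
}


def sort_by_custom_order(frame, orders):
    """Hypothetical helper: return sorted rows, leaving frame's dtypes untouched."""
    # Map every value to its position in the wanted order.
    # Values missing from the order list become NaN and sort last.
    ranks = pd.DataFrame(
        {
            col: frame[col].map({value: pos for pos, value in enumerate(order)})
            for col, order in orders.items()
        }
    )
    # Sort the ranks, then reuse the resulting index order on the original frame.
    # This assumes frame has a unique index.
    sorted_index = ranks.sort_values(list(orders)).index
    return frame.loc[sorted_index]


result = sort_by_custom_order(df2, orders)

# reset_index(drop=True) renumbers the rows 0..8 so the frame lines up with df1.
print(result.reset_index(drop=True))
```

Keeping the wanted orders in a dict keyed by column name avoids relying on the column positions matching the position of each order list.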

Regarding python - how to sort a pandas DataFrame on two (or more) different columns in a specific order, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/62110302/
