python - Is there an efficient way to assign IDs to cells when comparing two DataFrames?

Tags: python pandas csv dataframe

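I want to assign a sequential ID to each row of df2 and then, based on that ID, replace every occurrence of the matching value in df1 (named df in the code below). The code I wrote takes a very long time to run. Is there another way?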

for i in range(35261):
    for j in range(54793):
        if df2.V_ID[i] == df.V_ID[j]:
            df.V_ID[j] = i
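
As a rough illustration of why this loop is slow, the sketch below counts the comparisons it performs. It assumes the two range() bounds are the row counts of df2 and df; the variable names are made up for this estimate. Each comparison also goes through pandas scalar indexing, which is far slower than a plain Python comparison, so the real cost is higher than the raw count suggests.

```python
# Row counts taken from the range() bounds in the loop above (assumed to be
# the lengths of df2 and df respectively).
n_df2 = 35261
n_df = 54793

# Nested loop: every df2 row is compared against every df row.
nested_comparisons = n_df2 * n_df

# Dictionary-based lookup: one pass over df2 to build the dict,
# then one hash lookup per row of df.
dict_operations = n_df2 + n_df

print(f"nested loop comparisons: {nested_comparisons:,}")
print(f"dict-based operations:   {dict_operations:,}")
```

That is roughly 1.9 billion comparisons against about 90 thousand dictionary operations, which is why a mapping-based approach is worth trying.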

Sample data for df:

        time               IP1           IP2        GETVIDEO    V_ID                       IP3
0   2008-03-11 17:28:17 63.22.65.77 205.181.173.92  GETVIDEO    ORDhCi6JQaY&signature   254.212.25.169
1   2008-03-11 17:28:20 63.22.65.94 35.139.184.95   GETVIDEO    xEcFchOvj4Y&signature   254.212.19.255
2   2008-03-11 17:28:22 63.22.65.73 35.139.176.183  GETVIDEO    z-oBoCMSfbw&signature   254.212.19.196
3   2008-03-11 17:28:23 63.22.65.73 102.15.230.123  GETVIDEO    pSo-_TavE1U&signature   254.212.25.206
4   2008-03-11 17:28:23 63.22.65.77 102.15.134.225  GETVIDEO    kHtaORb0LUk&signature   254.212.22.122
5   2008-03-11 17:28:23 63.22.65.77 102.15.111.222  GETVIDEO    t7qjlPPmeJE&origin  105.136.78.115
6   2008-03-11 17:28:27 63.22.65.73 35.139.31.8     GETVIDEO    2UPaRi0WY7c&origin  105.136.78.115
7   2008-03-11 17:28:28 63.22.65.73 102.15.143.68   GETVIDEO    lAzrUxpybs0&signature   254.212.21.130
8   2008-03-11 17:28:30 63.22.65.73 205.181.139.118 GETVIDEO    J_KKyw8V-l0&origin  105.136.78.115
9   2008-03-11 17:28:31 63.22.65.73 102.15.143.20   GETVIDEO    xnsPfRdSU0Q&origin  105.136.78.115
10  2008-03-11 17:28:34 63.22.65.94 102.15.141.151  GETVIDEO    qDKx6CkQM04&origin  105.136.78.115

Sample data for df2:

        V_ID            count
0   2UPaRi0WY7c&origin  768
1   t7qjlPPmeJE&origin  142
2   CKrTlXN9-iE&origin  107
3   IZtPejST9IQ&origin  103
4   FKb3qRljGBc&origin  93
5   LcM0OT6mnqA&origin  67
6   7sei-eEjy4g&origin  62
7   qDKx6CkQM04&origin  53
8   4rb8aOzy9t4&origin  46
9   wjv4Fp7GiGk&origin  46
10  SKDXBvPIepI&sign    44

Expected output:

        time               IP1           IP2        GETVIDEO    V_ID    IP3
0   2008-03-11 17:28:17 63.22.65.77 205.181.173.92  GETVIDEO    42  254.212.25.169
1   2008-03-11 17:28:20 63.22.65.94 35.139.184.95   GETVIDEO    13  254.212.19.255
2   2008-03-11 17:28:22 63.22.65.73 35.139.176.183  GETVIDEO    21  254.212.19.196
3   2008-03-11 17:28:23 63.22.65.73 102.15.230.123  GETVIDEO    14  254.212.25.206
4   2008-03-11 17:28:23 63.22.65.77 102.15.134.225  GETVIDEO    23  254.212.22.122
5   2008-03-11 17:28:23 63.22.65.77 102.15.111.222  GETVIDEO    1   105.136.78.115
6   2008-03-11 17:28:27 63.22.65.73 35.139.31.8     GETVIDEO    0   105.136.78.115
7   2008-03-11 17:28:28 63.22.65.73 102.15.143.68   GETVIDEO    33  254.212.21.130
8   2008-03-11 17:28:30 63.22.65.73 205.181.139.118 GETVIDEO    42  105.136.78.115
9   2008-03-11 17:28:31 63.22.65.73 102.15.143.20   GETVIDEO    19  105.136.78.115
10  2008-03-11 17:28:34 63.22.65.94 102.15.141.151  GETVIDEO    7   105.136.78.115

Best answer

import pandas as pd 

df2 = pd.DataFrame({'V_ID': ['a','b','c','d'], 'count':[12,5,7,9]})
df = pd.DataFrame({'time':['2008-03-11', '2008-03-11', '2008-03-11','2008-03-11', '2008-03-11', '2008-03-11', '2008-03-11'],
                   'V_ID': ['a', 'sdf', 'c','rge', 'gfg', 'a', 'a']})

# Create an index column for df2
df2 = df2.reset_index()

# Key-value pairs of index and V_ID
mapping = df2['V_ID'].to_dict()

# Invert key-value pairs 
mapping = {v: k for k, v in mapping.items()}

# Replace values in df['V_ID'] that match a key in mapping with the corresponding value
df['V_ID'] = df['V_ID'].replace(mapping)

print(df)

         time V_ID
0  2008-03-11    0
1  2008-03-11  sdf
2  2008-03-11    2
3  2008-03-11  rge
4  2008-03-11  gfg
5  2008-03-11    0
6  2008-03-11    0
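
As an alternative sketch (not part of the original answer), Series.map with a plain dict does one hash lookup per row and is often reported to scale better than replace on large frames. The example reuses the toy data above and assumes the V_ID values in df2 are unique. The column name V_ID_num is hypothetical and chosen only for this illustration. Unlike replace, unmatched values become missing (<NA>) rather than keeping their original string, so the new column stays numeric.

```python
import pandas as pd

df2 = pd.DataFrame({'V_ID': ['a', 'b', 'c', 'd'], 'count': [12, 5, 7, 9]})
df = pd.DataFrame({'time': ['2008-03-11'] * 7,
                   'V_ID': ['a', 'sdf', 'c', 'rge', 'gfg', 'a', 'a']})

# Build {V_ID string: row position in df2}.
# Assumption: V_ID values in df2 are unique; with duplicates the last one wins.
lookup = {v_id: position for position, v_id in enumerate(df2['V_ID'])}

# One dict lookup per row of df; V_IDs absent from df2 become NaN.
# 'Int64' is the pandas nullable integer dtype, so matched IDs stay integers.
# V_ID_num is a hypothetical column name used only in this sketch.
df['V_ID_num'] = df['V_ID'].map(lookup).astype('Int64')

print(df)
```

Writing to a separate column keeps the original strings available for checking which rows had no match in df2. If every V_ID in df is guaranteed to appear in df2, the result can be assigned back to df['V_ID'] directly.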

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/53928168/
