Python list to DataFrame with conditions

I have a very long list (sample below):

df_list = ['Joe',
 'UK',
 'Buyout',
 '10083',
 '4323',
 'http://info2.com',
 'Linda',
 'US',
 'Liquidate',
 '97656',
 '1223',
 'http://global.com',
 'linda@global.com'
          ]

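As you can see, the list contains information about individuals (Joe and Linda). The problem is that for some observations (Joe in this example) the 7th element, which corresponds to the entity's email address, is missing. For Linda we do have the email, so it is populated.

I want to turn this list into a DataFrame with 7 columns (listed below). For observations without a valid email address (the element does not contain "@"), I want to insert a Null/empty value instead of taking the next element, which would put the next observation's NAME into the EMAIL column.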

cols = ['NAME'
,'COUNTRY'
,'STRATEGIES'
,'TOTAL FUNDS'
,'ESTIMATED PAYOFF'
,'WEBSITE'
,'EMAIL']

This is where I am so far:

big_list = []  #intention is to append N (number of unique entity) small_lists into a big_list and call pd.DataFrame(big_list)
small_list = [] #intention is to create a small_list for each observation/entity, containing 7 values, including email or null if empty
for element in df_list:
    small_list.append(element)
if ("@" not in small_list):
    small_list[-1] = None

Any help would be greatly appreciated! Thanks

Best answer

You can use a generator:

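import numpy as np
import pandas as pd

def gen_batch(df_list):
    i = 6  # index where the optional email of the current record would sit
    while i <= len(df_list):
        if i < len(df_list) and '@' in df_list[i]:
            # Email present: emit all 7 fields and move past them.
            yield df_list[i-6: i+1]
            i += 7
        else:
            # Email missing: pad with NaN; df_list[i] starts the next record.
            yield df_list[i-6: i] + [np.nan]
            i += 6

pd.DataFrame(gen_batch(df_list), columns=cols)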

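Output: a DataFrame with two rows and the seven columns above, where Joe's EMAIL is NaN and Linda's EMAIL holds her address.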
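The same idea can also be written as a plain loop, which may be easier to step through than the generator. The sketch below is only an illustration: `split_records`, `n_fixed`, `COLS` and the `linda@example.com` address are placeholder names and data introduced here, and it assumes every record has exactly six mandatory fields and that no NAME contains "@". Note that the check is `'@' in flat[i]` (a substring test on one string); `"@" not in small_list` on a list only asks whether some element equals "@" exactly.

```python
import numpy as np
import pandas as pd

COLS = ['NAME', 'COUNTRY', 'STRATEGIES', 'TOTAL FUNDS',
        'ESTIMATED PAYOFF', 'WEBSITE', 'EMAIL']

def split_records(flat, n_fixed=6):
    """Group a flat list into rows of n_fixed fields plus an optional email."""
    rows = []
    i = 0
    while i < len(flat):
        row = flat[i:i + n_fixed]  # the fields every record is assumed to have
        i += n_fixed
        # The email is optional: consume the next item only if it contains '@'.
        if i < len(flat) and '@' in flat[i]:
            row.append(flat[i])
            i += 1
        else:
            row.append(np.nan)
        rows.append(row)
    return rows

# Placeholder data shaped like the sample in the question.
sample = ['Joe', 'UK', 'Buyout', '10083', '4323', 'http://info2.com',
          'Linda', 'US', 'Liquidate', '97656', '1223', 'http://global.com',
          'linda@example.com']

df = pd.DataFrame(split_records(sample), columns=COLS)
print(df)
```

Building the full list of rows first makes it simple to inspect the intermediate result before handing it to `pd.DataFrame`; for a very long input the generator version avoids holding that intermediate list.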

A similar question about converting a Python list to a DataFrame with conditions can be found on Stack Overflow: https://stackoverflow.com/questions/60716777/
