python - How to sort a DataFrame using a list of column names

Tags: python pandas csv sorting dataframe

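I am using Pandas 0.22.0 on Python 2.7, with PyCharm as my IDE.

I am trying to sort several DataFrames in a loop. The DataFrames are created from .csv files and then written out as .xlsx using the "xlsxwriter" engine in pandas.

I created a list, sorts, that holds the sort requirements for every report. When the loop runs, it picks up a csv file, loads it into a DataFrame, sorts it (this is where I am stuck), and then writes the whole thing to an .xlsx file so that it can be worked with in MS Excel.

If I use df = df.sort_values(by=['SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME']), there is no problem.

However, if I use df = df.sort_values(by=sorts[0]), the code crashes: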

    Traceback (most recent call last):
      File "D:/OneDrive/Programming Practice/Python/Rubaiyat/test1.py", line 55, in <module>
        df = df.sort_values(by=(sorts[0]))
      File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 3619, in sort_values
        k = self.xs(by, axis=other_axis).values
      File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 2335, in xs
        return self[key]
      File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2139, in __getitem__
        return self._getitem_column(key)
      File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2146, in _getitem_column
        return self._get_item_cache(key)
      File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 1842, in _get_item_cache
        values = self._data.get(item)
      File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 3843, in get
        loc = self.items.get_loc(item)
      File "C:\Python27\lib\site-packages\pandas\core\indexes\base.py", line 2527, in get_loc
        return self._engine.get_loc(self._maybe_cast_indexer(key))
      File "pandas\_libs\index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
      File "pandas\_libs\index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
      File "pandas\_libs\hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
      File "pandas\_libs\hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
    KeyError: "'SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'"

The full code is as follows:

    import pandas
    import sys

    reload(sys)
    sys.setdefaultencoding('utf-8')

    reportDF = ["assetReport", "assetTypeReport", "assetStatusReport", "locationReport", "departmentReport", "siteReport",
                "userReport"]

    sheetNames = ["Asset Report", "Asset Types", "Asset Status", "Locations", "Cost Centers", "Sites", "Users"]


    columnNames = [("EPC", "Creation Date", "Modification Date", "Inventory Date", "Asset Name", "Asset Status",
                    "Asset Type", "Asset User", "Location", "Site", "Cost Center", "Description"),
                "Asset Type Name",
                ("Asset Status", "Asset Status Description"),
                ("Location Name", "EPC", "Floor", "GPS", "Capacity", "Lead Time", "Site Name"),
                "Cost Center",
                ("Site", "Country", "Postal Code", "City", "Address", "GPS"),
                ("User Name", "User Role", "First Name", "Last Name", "Email", "User Disabled?")]

    sorts = ["'SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'",
            'ASSET_TYPE_NAME', 
            'ASSET_STATUS_NAME',
            "'SITE_NAME', 'LOCATION_NAME'",
            'DEPARTMENT_NAME',
            'SITE_NAME',
            'USER_NAME']

    writer = pandas.ExcelWriter('mergedSheet.xlsx')

    for i in range(0, 7):
        df = pandas.read_csv(reportDF[i], delimiter=';')
        df = df.sort_values(by=sorts[i])
        df.to_excel(writer, sheet_name=sheetNames[i], engine='xlsxwriter', header=columnNames[i], freeze_panes=(1, 0))

    writer.save()
    writer.close()

Any help or guidance would be greatly appreciated. Thank you.

Accepted answer

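You are creating a single string, namely "'SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'". pandas therefore looks for one column whose name is that entire text, which is exactly the key reported in the KeyError.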
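The difference can be reproduced on a tiny frame. This is only a sketch: the two-column DataFrame and the variable names as_string and as_list are made up for illustration, and it is written for Python 3 with a current pandas rather than the Python 2.7 / pandas 0.22.0 setup in the question.

```python
import pandas as pd

# Hypothetical two-column frame; the real reports have more columns.
df = pd.DataFrame({"SITE": ["B", "A"], "DEPARTMENT": ["x", "y"]})

# One str object: pandas treats the whole text as a single column label.
as_string = "'SITE', 'DEPARTMENT'"

# A list of two str objects: pandas treats each element as a column label.
as_list = ["SITE", "DEPARTMENT"]

try:
    df.sort_values(by=as_string)
    error = None
except KeyError as exc:
    # No column is literally named "'SITE', 'DEPARTMENT'".
    error = exc

# Sorting by the list works: rows are ordered by SITE, then DEPARTMENT.
sorted_df = df.sort_values(by=as_list)
print(type(error).__name__)
print(sorted_df)
```

Quoting the names inside one string does not turn that string into a list; it only produces a longer label that matches no column.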

I think it should look like this:

    sorts = [['SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'],
             'ASSET_TYPE_NAME',
             'ASSET_STATUS_NAME',
             ['SITE_NAME', 'LOCATION_NAME'],
             'DEPARTMENT_NAME',
             'SITE_NAME',
             'USER_NAME']
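With sorts shaped this way, each entry can be passed straight to sort_values inside the loop, because by accepts either a single column name or a list of names. Below is a minimal sketch of that loop under a few assumptions: the CSV contents in csv_texts are invented stand-ins read from memory instead of the real report files, only two of the seven reports are shown, the Excel export is left out, and it targets Python 3 with a current pandas.

```python
import io

import pandas as pd

# Invented stand-ins for two of the semicolon-delimited report files.
csv_texts = [
    "SITE;DEPARTMENT;ASSET_NAME\nB;x;p\nA;z;q\nA;y;r\n",
    "ASSET_TYPE_NAME\nprinter\nlaptop\n",
]

# One sort key per report: a list for multi-column sorts, a str for one column.
sorts = [["SITE", "DEPARTMENT"], "ASSET_TYPE_NAME"]

sorted_frames = []
for text, sort_key in zip(csv_texts, sorts):
    df = pd.read_csv(io.StringIO(text), delimiter=";")
    # sort_values accepts both shapes of sort_key unchanged.
    sorted_frames.append(df.sort_values(by=sort_key))

for frame in sorted_frames:
    print(frame)
```

Iterating with zip keeps each report paired with its sort key, so no hard-coded range(0, 7) is needed if reports are added or removed later.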

Source: this question and answer come from Stack Overflow: https://stackoverflow.com/questions/48540634/
