python - Identifying permutations in a csv file (and counting them)

Tags: python csv

I am trying to extract data from a csv file. This is what my csv file looks like:

a,0,b,2,c,6,G,4,l,6,mi,2,m,0,s,4
a,2,b,2,c,0,G,4,l,6,mi,4,m,0,s,6
a,4,b,2,c,6,G,6,l,2,mi,4,m,0,s,0
a,2,b,0,c,2,G,6,l,4,mi,4,m,0,s,6
a,2,b,2,c,6,G,4,l,0,mi,6,m,0,s,4
a,2,b,6,c,0,G,6,l,0,mi,4,m,2,s,4
a,0,b,6,c,4,G,2,l,0,mi,6,m,4,s,2
a,6,b,6,c,4,G,0,l,0,mi,2,m,4,s,2

So, for example, in line[0], depending on the values in columns 1, 3, 5, 7, 9, 11, 13, 15, I need to get the names in columns 0, 2, 4, 6, 8, 10, 12, 14.

A more detailed example: from line 1, I need to get

a,m = 0
b,mi = 2
c,l = 6
G,s = 4

Finally, I need to tally which combinations of two occur most often, so essentially a count of each combination.

To do this:

# Sanitize filelist to keep only *.csv files    
def sanitize_filelist(filelist):

    sanitized_filelist = []

    # Keep only the log file
    for file in range(len(filelist)):
        if string.lower(filelist[file][-4:]) == '.csv':
            sanitized_filelist += [filelist[file]]
#    print sanitized_filelist
    return sanitized_filelist


def parse_files(dataset_path,file):
    threads = [0,2,4,6,10,12,14]
    coreid  = [1,3,5,7,9,11,13,15]
    cores = [0,2,4,6]
    thread_data = [[],[],[],[],[],[],[]]
    #core = [[],[],[],[],[],[],[]]
    threadcorecount = [[0 for a in range(0,4)] for b in range(0,8)]
    dataset = csv.reader(open(dataset_path, 'rb'), delimiter=',')
    for line in dataset:
        #print line
        for thread in range(len(threads)):
            thread_data[thread] = line[threads[thread]]
        for core in range(len(threads)):
            if line[coreid[core]] == cores[0]:
                sub = core - 1
                print thread_data[sub],cores[0]

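I wrote this snippet; it is still a test version. I cannot get the values and print them. There are no errors, and I don't understand what is wrong.

Best answer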

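If I understood all your requirements, the code below should do the trick. Use the counter variable if you want to access the values in each row (or save it), and sorted_results to get the counts of the possible permutations.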
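One likely reason the snippet in the question prints nothing without raising an error: csv.reader yields every field as a string, while the cores list holds integers, so a test such as line[coreid[core]] == cores[0] compares '0' with 0 and is always False. A minimal sketch of the effect, written for Python 3 and using only the first row of the sample data (the helper name matching_columns is made up for this illustration):

```python
import csv
import io

# First row of the sample file from the question.
SAMPLE = "a,0,b,2,c,6,G,4,l,6,mi,2,m,0,s,4\n"


def matching_columns(row, target):
    """Return the names whose value (the next column) equals target.

    Hypothetical helper for illustration only, not part of the csv module.
    """
    return [row[i] for i in range(0, len(row), 2) if row[i + 1] == target]


row = next(csv.reader(io.StringIO(SAMPLE)))

# csv.reader returns strings, so comparing against the integer 0 never matches.
print(matching_columns(row, 0))    # []
# Comparing against the string '0' (or converting with int() first) does match.
print(matching_columns(row, "0"))  # ['a', 'm']
```

Either converting the fields with int() or storing the expected values as strings should make the comparison behave as intended.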


The code is as follows:

import csv
from collections import Counter
import operator

def parse_files(dataset_path,f):  # avoid shadowing built-in names such as file
    threads = range(0,16,2)
    dataset = csv.reader(open(dataset_path,'rb'), delimiter=',')
    results = []
    for line in dataset:
        counter = {str(x):[] for x in range(0,8,2)}
        # map(lambda x:counter[line[x+1]].append(line[x]), threads)
        # the map(lambda ...) call above is a one-line equivalent (in Python 2) of the following two lines
        for index in threads:
            counter[line[index+1]].append(line[index])
        # for the first row, counter is now
        #{'0': ['a', 'm'], '2': ['b', 'mi'], '4': ['G', 's'], '6': ['c', 'l']}

        results.extend([','.join(v)+'='+k for k,v in counter.items()])
        # results now also holds one string per group, e.g.
        # 'a,m=0', 'b,mi=2', 'G,s=4', 'c,l=6'

    sorted_results = sorted(dict(Counter(results)).iteritems(), key=operator.itemgetter(1), reverse=True)
    print '\n'.join(['The couple %s appears %d times'%el for el in sorted_results])

    # >>> The couple a,b=2 appears 2 times
    # >>> The couple c,m=4 appears 2 times
    # >>> The couple G,s=4 appears 2 times
    # >>> The couple c,mi=6 appears 1 times
    # >>> The couple a,m=2 appears 1 times
    # >>> ...
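The code above targets Python 2 (print statement, iteritems, opening the file in 'rb' mode). As a rough sketch of the same idea for Python 3, assuming the eight sample rows from the question and a made-up function name count_pairs, something like the following could be used. It reads from an in-memory string instead of a file so that it is self-contained:

```python
import csv
import io
from collections import Counter

# The sample rows from the question, kept in memory for this sketch.
SAMPLE = """\
a,0,b,2,c,6,G,4,l,6,mi,2,m,0,s,4
a,2,b,2,c,0,G,4,l,6,mi,4,m,0,s,6
a,4,b,2,c,6,G,6,l,2,mi,4,m,0,s,0
a,2,b,0,c,2,G,6,l,4,mi,4,m,0,s,6
a,2,b,2,c,6,G,4,l,0,mi,6,m,0,s,4
a,2,b,6,c,0,G,6,l,0,mi,4,m,2,s,4
a,0,b,6,c,4,G,2,l,0,mi,6,m,4,s,2
a,6,b,6,c,4,G,0,l,0,mi,2,m,4,s,2
"""


def count_pairs(lines):
    """Count how often each 'names=value' group occurs across all rows.

    count_pairs is a hypothetical name chosen for this sketch.
    """
    totals = Counter()
    for row in csv.reader(lines):
        groups = {}
        # Even columns hold names; the following odd column holds the value.
        for name, value in zip(row[0::2], row[1::2]):
            groups.setdefault(value, []).append(name)
        totals.update(",".join(names) + "=" + value
                      for value, names in groups.items())
    return totals


counts = count_pairs(io.StringIO(SAMPLE))
for pair, n in counts.most_common(3):
    print("The couple %s appears %d times" % (pair, n))
```

With the sample data, three groups occur twice (a,b=2, c,m=4 and G,s=4), which agrees with the output listed above. To read a real file, passing open(dataset_path, newline='') to count_pairs should work the same way.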

Regarding python - Identifying permutations in a csv file (and counting them), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/15365549/
