I am trying to parse an XML file, extract values from it, and write them to a .csv file. So far I have the following code:
for shift_i in shift_list:
    # Iterates through all values in 'shift_list' for later comparison to ensure all tags are only counted once
    for node in tree.xpath("//Data/Status[@Name and @Reason]"):
        # Iterates through all nodes containing a 'Name' and 'Reason' attribute
        state = node.attrib["Name"]
        reason = node.attrib["Reason"]
        end = node.attrib["End"]
        start = node.attrib["Start"]
        # Stores each attribute value in a lowercase variable
        try:
            shift = node.attrib["Shift"]
        except KeyError:
            continue
        # Skips the node if it has no Shift attribute
        if shift == shift_i:
            # If the Shift attribute equals the current entry from 'shift_list', takes the
            # difference of start and end and appends it (in minutes) to the list for the
            # given Name, Reason, and Shift
            tdelta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
            d[state, reason, shift].append(tdelta.total_seconds() / 60)
    for node in tree.xpath("//Data/Status[not(@Reason)]"):
        # Iterates through Status nodes with no Reason attribute
        state = node.attrib["Name"]
        end = node.attrib["End"]
        start = node.attrib["Start"]
        # Stores each attribute value in a lowercase variable
        try:
            shift = node.attrib["Shift"]
        except KeyError:
            continue
        # Skips the node if it has no Shift attribute
        if shift == shift_i:
            # If the Shift attribute equals the current entry from 'shift_list', takes the
            # difference of start and end and appends it (in minutes) to the list for the
            # given Name, the "No Reason" string, and Shift
            tdelta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
            d[state, 'No Reason', shift].append(tdelta.total_seconds() / 60)
    for item in d:
        # Iterates through all keys of d
        d[item] = sum(d[item])
        # Sums all values related to 'item' and replaces the list in the dictionary
    a.update(d)
    # Copies the current keys and values from temporary dictionary 'd' to permanent
    # dictionary 'a' for further analysis
    d.clear()
    # Clears dictionary d so the next iteration starts fresh. If this is not done,
    # d[item] = sum(d[item]) raises "TypeError: 'float' object is not iterable"
This creates a dictionary whose contents look like this:
{('Name1','Reason','Shift'):Value,('Name2','Reason','Shift'):Value....}
print(a) returns this:
defaultdict(<class 'list'>, {('Test Run', 'No Reason', 'Night'): 5.03825, ('Slow Running', 'No Reason', 'Day'): 10.72996666666667, ('Idle', 'Shift Start Up', 'Day'): 5.425433333333333, ('Idle', 'Unscheduled', 'Afternoon'): 470.0, ('Idle', 'Early Departure', 'Day'): 0.32965, ('Idle', 'Break Creep', 'Day'): 24.754250000000003, ('Idle', 'Break', 'Day'): 40.0, ('Micro Stoppage', 'No Reason', 'Day'): 39.71673333333333, ('Idle', 'Unscheduled', 'Night'): 474.96175, ('Running', 'No Reason', 'Day'): 329.4991500000004, ('Idle', 'No Reason', 'Day'): 19.544816666666666})
I want to create a .csv whose columns are the "Name" + "Reason" combinations with their totals, and whose rows are the shifts. Like this:
Name1-Reason Name2-Reason Name3-Reason Name4-Reason
Shift1 value value value value
Shift2 value value value value
Shift3 value value value value
I don't know how to do this. I tried using nested dictionaries to describe my data better, but I get a TypeError when using
d[state][reason][shift].append((tdelta.total_seconds()) / 60)
If there is a better way, please let me know. I'm new to this and happy to hear any suggestions.
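On the TypeError with the nested form `d[state][reason][shift]`: the post does not show how `d` was defined at that point, so this is only a guess. Assuming it was still a `defaultdict(list)`, `d[state]` would be a list, and indexing a list with a string raises TypeError. A minimal sketch of the difference follows; the names `flat`, `nested`, and `error_type` are made up for illustration.

```python
from collections import defaultdict

# Assumption: d was a one-level defaultdict(list), as the printed output suggests.
flat = defaultdict(list)
try:
    # flat["Idle"] is an empty list, so ["Break"] is a string index into a list.
    flat["Idle"]["Break"]["Day"].append(40.0)
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__

# Three nested levels: state -> reason -> shift -> list of minutes.
nested = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
nested["Idle"]["Break"]["Day"].append(40.0)
nested["Idle"]["Break"]["Day"].append(0.5)

total_minutes = sum(nested["Idle"]["Break"]["Day"])
```

Tuple keys such as `d[state, reason, shift]` avoid the nesting entirely, which is why the original code works with a single-level `defaultdict(list)`.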
Best answer
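I think the following does what you want, or at least comes close. The CSV layout you describe overlooks an important point: every row must have a column for each Name-Reason combination that occurs anywhere in the data, even when a particular shift has no value for that combination, because every row of a CSV file has the same set of columns.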
from collections import defaultdict
import csv

# Dictionary keys are (Name, Reason, Shift)
d = {('Test Run', 'No Reason', 'Night'): 5.03825,
     ('Slow Running', 'No Reason', 'Day'): 10.72996666666667,
     ('Idle', 'Shift Start Up', 'Day'): 5.425433333333333,
     ('Idle', 'Unscheduled', 'Afternoon'): 470.0,
     ('Idle', 'Early Departure', 'Day'): 0.32965,
     ('Idle', 'Break Creep', 'Day'): 24.754250000000003,
     ('Idle', 'Break', 'Day'): 40.0,
     ('Micro Stoppage', 'No Reason', 'Day'): 39.71673333333333,
     ('Idle', 'Unscheduled', 'Night'): 474.96175,
     ('Running', 'No Reason', 'Day'): 329.4991500000004,
     ('Idle', 'No Reason', 'Day'): 19.544816666666666}

# Transfer data to a defaultdict of dicts: {shift: {'Name-Reason': value}}.
dd = defaultdict(dict)
for (name, reason, shift), value in d.items():
    name_reason = name + '-' + reason  # Merge together to form lower level keys
    dd[shift][name_reason] = value

# Create a csv file from the data in the defaultdict.
ABSENT = '---'  # Placeholder for empty fields
# A set keeps each Name-Reason column once, even if it occurs in several shifts.
name_reasons = sorted({name_reason for shift in dd
                       for name_reason in dd[shift]})
# Python 3: open in text mode with newline='' as the csv module expects.
with open('dict.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file, delimiter=',')
    writer.writerow(['Shift'] + name_reasons)  # column headers row
    for shift in sorted(dd):
        row = [shift] + [dd[shift].get(name_reason, ABSENT)
                         for name_reason in name_reasons]
        writer.writerow(row)

Here are the contents of the dict.csv file created by the code above:
Shift,Idle-Break,Idle-Break Creep,Idle-Early Departure,Idle-No Reason,Idle-Shift Start Up,Idle-Unscheduled,Micro Stoppage-No Reason,Running-No Reason,Slow Running-No Reason,Test Run-No Reason
Afternoon,---,---,---,---,---,470.0,---,---,---,---
Day,40.0,24.754250000000003,0.32965,19.544816666666666,5.425433333333333,---,39.71673333333333,329.4991500000004,10.72996666666667,---
Night,---,---,---,---,---,474.96175,---,---,---,5.03825
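As a possible alternative, `csv.DictWriter` can fill in the missing fields itself through its `restval` parameter, so each row only needs the values that actually exist for that shift. The sketch below is one way to do that, not the answer's original method. It uses a small subset of the totals above, writes to an in-memory buffer rather than a file, and the names `totals`, `by_shift`, `columns`, and `csv_text` are chosen here for illustration.

```python
import csv
import io
from collections import defaultdict

# Illustrative subset of the totals, keyed by (Name, Reason, Shift).
totals = {
    ('Idle', 'Break', 'Day'): 40.0,
    ('Idle', 'Unscheduled', 'Afternoon'): 470.0,
    ('Idle', 'Unscheduled', 'Night'): 474.96175,
    ('Test Run', 'No Reason', 'Night'): 5.03825,
}

# Regroup into one dict per shift: {shift: {'Name-Reason': minutes}}.
by_shift = defaultdict(dict)
for (name, reason, shift), minutes in totals.items():
    by_shift[shift][name + '-' + reason] = minutes

# Unique, sorted column names collected across all shifts.
columns = sorted({col for row in by_shift.values() for col in row})

# In-memory buffer for the demo; for a real file use
# open('dict.csv', 'w', newline='') instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['Shift'] + columns, restval='---')
writer.writeheader()
for shift in sorted(by_shift):
    # Keys missing from this row are written as restval ('---').
    writer.writerow({'Shift': shift, **by_shift[shift]})

csv_text = buffer.getvalue()
```

With this approach the placeholder logic lives in the writer rather than in a list comprehension, which may be easier to read when rows are already dictionaries.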
This question about creating a .csv from the values in a Python dictionary comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/40047315/