Group by and Count in Python
The grouping is not working properly for me. I have included my code and the result I need from it. I am trying to group lines and count them per group in Python.
My Code:
#!/usr/bin/env python
counts = {}
logfile = open("/tmp/test.out", "r")
for line in logfile:
    if line.startswith("20") in line:
        seq = line.strip()
        substr = seq[0:13]
        if substr not in counts:
            counts[substr] = 0
            counts[substr] += 1
for substr, count in counts.items():
    print(count,substr)
The result that I require as output:
6 2019-06-17T00
13 2019-06-17T01
9 2019-06-17T02
7 2019-06-17T03
6 2019-06-17T04
There is a small issue in this program: counts[substr] += 1 is indented too far, so it does not run for every matching line. To count per group, un-indent it so that it increments every single time a substring is found, not only the first time.
Updated Code:
for line in logfile:
    # startswith() already returns a bool; "<bool> in line" would raise TypeError
    if line.startswith("20"):
        seq = line.strip()
        substr = seq[0:13]
        if substr not in counts:
            counts[substr] = 0
        # Un-indented below
        counts[substr] += 1
# Print output only after loop completes
for substr, count in counts.items():
    print(count, substr)
Output would be:
6 2019-06-17T00
13 2019-06-17T01
9 2019-06-17T02
7 2019-06-17T03
6 2019-06-17T04
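As an alternative, the same grouping can be sketched with collections.Counter from the standard library, which removes the need for the "if substr not in counts" check. This is only a minimal sketch: the function name count_by_hour and the sample lines are hypothetical stand-ins for the contents of /tmp/test.out, whose exact format is not shown here.

```python
from collections import Counter


def count_by_hour(lines, prefix="20", key_len=13):
    """Count lines per group, keyed by the first key_len characters.

    count_by_hour is a hypothetical helper name; prefix and key_len
    mirror the "20" and seq[0:13] used in the code above.
    """
    counts = Counter()
    for line in lines:
        if line.startswith(prefix):
            # e.g. "2019-06-17T00:01:10 ..." -> "2019-06-17T00"
            counts[line.strip()[:key_len]] += 1
    return counts


# Hypothetical sample lines standing in for the real log file
sample = [
    "2019-06-17T00:01:10 first event\n",
    "2019-06-17T00:45:00 second event\n",
    "2019-06-17T01:02:03 third event\n",
    "not a timestamped line\n",
]

counts = count_by_hour(sample)
for substr, count in sorted(counts.items()):
    print(count, substr)
```

To run it against the real file, pass an open file object instead of the sample list, for example with open("/tmp/test.out") as logfile: counts = count_by_hour(logfile).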