Python Recipe: Read CSV/TSV Text Files and Ignore Comment Lines
Scientific data often comes as tab-separated text files that contain comment lines. What is the best way to read such data? Following the recipe given by skip.montanaro, wrap the file object in a decorator class that filters out comment lines before the csv module sees them:
import csv

class CommentedFile:
    """File wrapper that skips lines starting with a comment string (Python 2)."""
    def __init__(self, f, commentstring="#"):
        self.f = f
        self.commentstring = commentstring

    def next(self):
        # Advance past comment lines; StopIteration from the file ends the loop.
        line = self.f.next()
        while line.startswith(self.commentstring):
            line = self.f.next()
        return line

    def __iter__(self):
        return self

tsv_file = csv.reader(CommentedFile(open("inputfile.txt", "rb")),
                      delimiter='\t')
for row in tsv_file:
    print row[2]  # prints column 3 of each line
Tags: english, programming, python | Filed on February 15th, 2010 | 7 Comments
Thanks for posting this; just found your blog searching around. Keep up the good work!
Sorry for my English, but your post touched me so that I could not remain silent. Thank you for such posts, write more often.
Very useful code. many many thanks
I changed the line
    while line.startswith(self.commentstring):
to
    while line.startswith(self.commentstring) or not line.strip():
so now it also skips blank lines.
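The same condition can also be written as a generator instead of a class. The following is only a sketch in Python 3 syntax; the function name skip_comments_and_blanks and the in-memory sample data are made up for illustration and are not part of the original recipe.

```python
import csv
import io


def skip_comments_and_blanks(lines, commentstring="#"):
    """Yield only lines that are neither comments nor blank (hypothetical helper)."""
    for line in lines:
        # Same test as the modified while-condition in the comment above.
        if line.startswith(commentstring) or not line.strip():
            continue
        yield line


# In-memory stand-in for a real file; the contents are invented sample data.
sample = io.StringIO("# comment\n\n1\t2\t3\n   \n4\t5\t6\n")
rows = list(csv.reader(skip_comments_and_blanks(sample), delimiter="\t"))
print(rows)
```

A generator avoids writing the iterator methods by hand, because csv.reader accepts any iterable that yields lines.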
Nice code! thanks
Hi,
I get this error in python 3.1
TypeError: argument 1 must be an iterator
why?
suresh
The code above has only been tested with Python versions below 3.0. I have not switched to Python 3 yet, so I am not familiar with it and cannot give you a good answer on the fly. Maybe you can just try the Python 2 -> 3 converter and inspect the result.
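A likely cause of the error: Python 3 renamed the iterator method next to __next__, so the class as posted no longer counts as an iterator there and csv.reader rejects it. Below is a minimal sketch of a Python 3 port, not tested against 3.1 specifically; the in-memory sample data is invented for illustration.

```python
import csv
import io


class CommentedFile:
    """Iterator wrapper that skips lines starting with a comment string."""

    def __init__(self, f, commentstring="#"):
        self.f = f
        self.commentstring = commentstring

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3 iterator protocol: __next__ instead of next,
        # and the built-in next() instead of f.next().
        line = next(self.f)
        while line.startswith(self.commentstring):
            line = next(self.f)
        return line


# In-memory stand-in for a real file; the contents are invented sample data.
sample = io.StringIO("# header comment\na\tb\tc\n# another comment\nd\te\tf\n")
rows = list(csv.reader(CommentedFile(sample), delimiter="\t"))
third_column = [row[2] for row in rows]
print(third_column)
```

With a real file in Python 3, open it in text mode, for example open("inputfile.txt", newline=""), rather than "rb", because the csv module expects strings and not bytes.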