Python Recipe: Read CSV/TSV Textfiles and Ignore Comment-lines

Scientific data often comes as tab-separated text files that contain comment lines. What is a good way to read such data? Analogous to the recipe given by skip.montanaro, wrap the file object in a small decorator class that drops comment lines before the csv module sees them:

import csv

class CommentedFile:
    """Wrap a file object and skip lines starting with commentstring (Python 2)."""
    def __init__(self, f, commentstring="#"):
        self.f = f
        self.commentstring = commentstring
    def next(self):
        # Read on until a non-comment line turns up; the StopIteration
        # raised by the underlying file ends the iteration at end of file.
        line = self.f.next()
        while line.startswith(self.commentstring):
            line = self.f.next()
        return line
    def __iter__(self):
        return self

tsv_file = csv.reader(CommentedFile(open("inputfile.txt", "rb")),
                      delimiter='\t')
for row in tsv_file:
    print row[2] # prints column 3 of each line
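
For comparison, the same filtering can be sketched without a wrapper class, since csv.reader accepts any iterable that yields lines. The version below is written for Python 3 and is only an illustration: the function name skip_comments and the in-memory sample data are assumptions, not part of the original recipe.

```python
import csv
import io

def skip_comments(lines, commentstring="#"):
    """Yield only the lines that do not start with the comment marker."""
    for line in lines:
        if not line.startswith(commentstring):
            yield line

# Hypothetical sample data standing in for "inputfile.txt".
sample = io.StringIO(
    "# measurement run 1\n"
    "a\tb\tc\n"
    "# another comment\n"
    "1\t2\t3\n"
)

rows = list(csv.reader(skip_comments(sample), delimiter="\t"))
third_column = [row[2] for row in rows]
print(third_column)
```

A generator keeps no state beyond the underlying iterator, so it may be the simpler choice when the wrapper does not need extra methods.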

7 Responses to “Python Recipe: Read CSV/TSV Textfiles and Ignore Comment-lines”

  1. Thanks for posting this; just found your blog searching around. Keep up the good work!

  2. Sorry for my English, but your post touched me so that I could not remain silent. Thank you for such posts, write more often.

  3. Very useful code. Many, many thanks.

  4. I changed the line

    line.startswith(self.commentstring):

    to

    line.startswith(self.commentstring) or not line.strip():

    so now it can also skip blank lines.

  5. Nice code! Thanks.

  6. Hi,
    I get this error in Python 3.1:

    TypeError: argument 1 must be an iterator

    Why?
    suresh

  7. The code above has only been tested with Python versions below 3.0. I have not switched to Python 3 yet, so I am not familiar with it and cannot give you a good answer right away. You could try the Python 2-to-3 converter (2to3) and inspect the result.
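
The error in comment 6 is consistent with a change in the iterator protocol: Python 3 looks for a __next__ method, while the class above only defines the Python 2 style next. A minimal sketch of a Python 3 port follows, with the blank-line check from comment 4 folded in. It has not been run against the original input file; the in-memory sample and the variable names are assumptions for illustration.

```python
import csv
import io

class CommentedFile:
    """Iterator wrapper that skips comment lines and blank lines."""

    def __init__(self, f, commentstring="#"):
        self.f = f
        self.commentstring = commentstring

    def __next__(self):
        # Python 3 iterator protocol: __next__ replaces Python 2's next().
        line = next(self.f)
        # Skip comments and, as suggested in comment 4, blank lines.
        while line.startswith(self.commentstring) or not line.strip():
            line = next(self.f)
        return line

    next = __next__  # alias so Python 2 can also iterate the class

    def __iter__(self):
        return self

# Hypothetical sample data standing in for "inputfile.txt".
sample = io.StringIO(
    "# header comment\n"
    "\n"
    "x\ty\tz\n"
    "# trailing comment\n"
)

rows = list(csv.reader(CommentedFile(sample), delimiter="\t"))
print(rows)
```

With a real file on Python 3, the csv documentation recommends opening it in text mode with newline="" (for example open("inputfile.txt", newline="")) rather than in "rb" mode.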