Python Read Large Text Files Problem
I am trying to compare two large text files line by line (10 GB each) without loading the entire files into memory. I used the following code, as indicated in other threads:
with open(in_file1, "r") as f1, open(in_file2, "r") as f2:
    for (line1, line2) in zip(f1, f2):
        compare(line1, line2)
But it seems that Python fails to read the files line by line. I observed memory usage of more than 20 GB while running the code. I also tried using:
import fileinput
for (line1, line2) in zip(fileinput.input([in_file1]), fileinput.input([in_file2])):
    compare(line1, line2)
This one also tries to load everything into memory. I'm using Python 2.7.4 on CentOS 5.9, and I didn't store any of the lines in my code.
What is going wrong in my code? How should I change it to avoid loading everything into RAM?
Any suggestion is appreciated! Thank you!
In Python 2, the zip function returns a list of tuples, so it reads both files completely in order to build that list. Use itertools.izip instead. It returns an iterator of tuples, so only one pair of lines is held at a time.
from itertools import izip

with open(in_file1, "r") as f1, open(in_file2, "r") as f2:
    for (line1, line2) in izip(f1, f2):
        compare(line1, line2)
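
For anyone who needs the same loop to run on both Python 2 and Python 3, here is a minimal sketch. It assumes that "compare" just means checking whether the two lines are equal. The names lazy_zip and count_differing_lines are made up for illustration and are not from the question. Like izip, it stops at the end of the shorter file, so trailing extra lines in the longer file are not counted.

```python
import itertools

# itertools.izip exists only in Python 2. In Python 3 the built-in zip
# is already lazy, so fall back to it when izip is missing.
lazy_zip = getattr(itertools, "izip", zip)


def count_differing_lines(path1, path2):
    """Return how many paired lines differ, reading both files lazily.

    Hypothetical stand-in for the asker's compare(); only one pair of
    lines is held in memory at a time.
    """
    differing = 0
    with open(path1, "r") as f1, open(path2, "r") as f2:
        for line1, line2 in lazy_zip(f1, f2):
            if line1 != line2:
                differing += 1
    return differing
```

If the files may have different lengths and the extra lines matter, itertools.izip_longest (zip_longest in Python 3) can be swapped in the same way.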