But one credit-card account doesn't give me the
records in an order I can understand. Usually, I want to know the date range
of last week's download, so I type something like
$ grep "^D" visa.QIF | sort
and based on the output, I'll search for transactions dated the
following day or later. This way, I won't confuse myself by seeing the
same transactions I saw last week (this is where déjà vu
is for real).
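One caveat: grep and sort compare the dates as text, so mm/dd/yyyy lines only come out in true date order when they all fall in the same year. Here is a minimal Python sketch of the same idea that parses the dates first. The name `date_range` is a hypothetical helper, and it assumes every line starting with "D" is a date in mm/dd/yyyy form.

```python
from datetime import datetime

def date_range(lines):
    '''Return (earliest, latest) date among the D-lines, or None if there are none.'''
    dates = [datetime.strptime(line.strip()[1:], '%m/%d/%Y').date()
             for line in lines if line.startswith('D')]
    if not dates:
        return None
    return min(dates), max(dates)

# Made-up sample lines spanning a year boundary.
sample = ['!Type:Bank', 'D07/16/2012', 'T-31.30', '^',
          'D12/30/2011', 'T-5.00', '^']
first, last = date_range(sample)
print(first, last)   # 2011-12-30 2012-07-16
```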
Tonight I was confused the other way: I ended up with a .QIF file that was missing some transactions. As usual, the transactions were listed in some apparently random order, and I didn't trust the bank to display transactions on-screen in the same order. Fortunately, the website can be told to display the transactions in some nice order (by transaction date, for example, or post date, ascending or descending). I like this concept, but it's too bad I haven't figured out how to download those transactions in a nice order. It may not be possible.
I wanted to have the on-screen stuff and my QIF file in the same order so I could see what was missing more easily, and feel confident that I had caught everything. How to sort the "QIF" file? For those who don't know (but why are you reading this?), a QIF file looks like this:
!Type:Bank
D07/16/2012
T-31.30
C*
N
PWITHDRAWAL SMITH'S DEPT STORE -PAYMENT
^
D07/12/2012
T-3,300.00
C*
N
PWITHDRAWAL MASTER CARD -ONLINE PMT
^
…etc.

A few things to note:
- Entries consist of 6 lines, at least they do here; the 6th line begins with ^ and indicates the end of the entry
- Dollar amounts are preceded by "T" and use commas to separate thousands
- Dates are preceded by "D" and are in the American format, sort of: mm/dd/yyyy
#!/usr/bin/python3 -utt
# vim:et:sw=4
'''Program to sort records from a quicken 'qif' file.
Parameters: filename'''

import os
import sys
from time import localtime, mktime, strftime, strptime

def main(args):
    '''Read quicken records from a qif file, creating a dict for each one.
    At end of file, sort them by US-style date and print them.'''
    if len(args) != 1:
        print('Expected exactly one parameter, viz., filename; got', args,
              file=sys.stderr)
        sys.exit(1)
    if not os.path.exists(args[0]):
        print("Can't find inputfile %s" % args[0], file=sys.stderr)
        sys.exit(1)
    xactions = list()
    curr = dict()
    for aline in open(args[0], 'r'):
        aline = aline.strip()
        if not aline:
            continue
        akey = aline[0]
        aval = aline[1:]
        # end of record?
        if akey == '^':
            if curr:
                xactions.append(curr)
            curr = dict()
        elif akey == 'T':
            curr[akey] = float(aval.replace(',', ''))  # 1,234 => 1234
        elif akey == 'D':
            curr[akey] = mktime(strptime(aval, '%m/%d/%Y'))
        else:
            curr[akey] = aval
    xactions.sort(key=lambda X: X['D'])
    for one in xactions:
        print('%s $%8.2f %4s %s' % (strftime('%Y-%m-%d', localtime(one['D'])),
                                    one.get('T', 99999.99),
                                    one.get('N', '#?')[:4],
                                    one.get('P', '<payee unspecified>')))

# def date_cmp(a, b):
#     '''Compare dates of two qif-based entries.'''
#     return cmp(a['D'], b['D'])

if __name__ == '__main__':
    main(sys.argv[1:])

Oh, the date_cmp() function is struck out, because it's not needed in this program. Why did I write it then? Because I was thinking of the way sorting works in python2.x, where you pass a comparison function into sort() if you want to use a non-default one. This post from last year shows how that one works. We don't need it here, though, and actually can't use it, because the sort of Python3 uses a "key" function, rather than a "cmp" function.
With that out of the way, here's how it works, basically: we read the input file ("Activity.QIF" or "Quicken.qif" or whatever) one line at a time. When we see a '^', we're done with one entry and stash it in a list.
Each entry, by the way, is a python "dict" (like a hash for you Perl
guys) -- the key is the first character in each line, and the value is
everything after that. For some reason, I wanted to convert the dollar
amounts into floating-point numbers (what was I thinking?) and in order
to do that, I needed to kill off any ','; that's why the
float(aval.replace(',', ''))
construct above.
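If exact cents matter, one alternative to float is the standard library's decimal module. This is only a sketch of that option, not what the program above does, and `parse_amount` is a hypothetical helper name.

```python
from decimal import Decimal

def parse_amount(aval):
    '''Convert a QIF amount such as "-3,300.00" to an exact Decimal.'''
    # Same comma-stripping as the float version: 1,234 => 1234
    return Decimal(aval.replace(',', ''))

total = parse_amount('-3,300.00') + parse_amount('-31.30')
print(total)   # -3331.30
```

Decimal keeps the two decimal places exactly as written, so sums of amounts don't pick up binary rounding noise.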
For the dates, I wanted to be able to sort them easily, and I thought the easy way to do that was to convert them into scalar values corresponding to "number of seconds since the epoch", so that's what the mktime/strptime stuff is about. By the way, I usually like to code like "import os" and then later "os.path.exists(...)." This avoids embarrassing name collisions, and besides makes it clear where each thing came from. But with all those "time" routines, is there any doubt where they came from? I mean, localtime, mktime, str*time.
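Here's roughly what that round trip looks like for a single date, next to an alternative that skips the seconds-since-the-epoch step. The datetime.date version is just another route I'm assuming would work equally well for sorting; it is not what the program uses.

```python
from datetime import datetime
from time import localtime, mktime, strftime, strptime

# What the program does: text -> struct_time -> seconds since the epoch...
stamp = mktime(strptime('07/16/2012', '%m/%d/%Y'))
# ...and back to text again for printing.
roundtrip = strftime('%Y-%m-%d', localtime(stamp))
print(roundtrip)   # 2012-07-16

# Alternative: date objects compare and sort directly.
dates = [datetime.strptime(s, '%m/%d/%Y').date()
         for s in ('07/16/2012', '07/12/2012', '12/30/2011')]
in_order = [d.isoformat() for d in sorted(dates)]
print(in_order)    # ['2011-12-30', '2012-07-12', '2012-07-16']
```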
When we get to the end of the file, we've got all the records, so we sort them. A sanity-check (to make sure we don't have data left over in curr) might be nice, but that's "an exercise for the reader."
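That exercise could look something like this. It's a stand-alone sketch rather than a patch to the program: `read_records` is a hypothetical name, it keeps every value as a plain string, and it hands back any record that was never closed with '^' so the caller can complain about it instead of silently dropping it.

```python
import sys

def read_records(lines):
    '''Collect QIF-style records (dicts keyed by first character).
    Returns (records, leftover); leftover is the unterminated final
    record, or an empty dict if the input ended cleanly.'''
    records = []
    curr = {}
    for aline in lines:
        aline = aline.strip()
        if not aline:
            continue
        if aline[0] == '^':        # end of record
            if curr:
                records.append(curr)
            curr = {}
        else:
            curr[aline[0]] = aline[1:]
    return records, curr

# Made-up input whose second record is missing its closing '^'.
records, leftover = read_records(
    ['D07/16/2012', 'T-31.30', '^', 'D07/12/2012', 'T-3,300.00'])
if leftover:
    print('Warning: unterminated record at end of file:', leftover,
          file=sys.stderr)
print(len(records))   # 1
```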
As I mentioned above, the sorting uses "key" rather than "cmp", and the "key" function just takes the value in each entry corresponding to the 'D' element. That's a less-than-one-liner, so I used a lambda to say, "hey, map X to X['D']"; it's not worth a whole "def funcName(foo):".
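For comparison, here are two other ways to write the same sort, both from the standard library: operator.itemgetter builds the key function for you, and functools.cmp_to_key wraps an old-style comparison function so Python 3 can still use it. The sample records are made up for illustration, and this date_cmp spells out the comparison by hand because cmp() itself is gone in Python 3.

```python
from functools import cmp_to_key
from operator import itemgetter

# Made-up records; 'D' stands in for the seconds-since-the-epoch value.
xactions = [{'D': 3, 'P': 'c'}, {'D': 1, 'P': 'a'}, {'D': 2, 'P': 'b'}]

# Same effect as key=lambda X: X['D']
by_key = sorted(xactions, key=itemgetter('D'))

def date_cmp(a, b):
    '''Old-style comparison: negative, zero, or positive.'''
    return (a['D'] > b['D']) - (a['D'] < b['D'])

# cmp_to_key adapts a comparison function into a key function.
by_cmp = sorted(xactions, key=cmp_to_key(date_cmp))

print([x['P'] for x in by_key])   # ['a', 'b', 'c']
print(by_key == by_cmp)           # True
```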
We then take the sorted list and print one element at a time: date, dollar amount, "check number" (that's what the "N" is), and payee. (Is it something else for deposits? I dunno.)
If it weren't past my bedtime, I'd explain further. But it is, so I won't. Except to point out that in python3, "print" is a function, not a statement. I may add more explanation later.