Look! A short post with just one or two “things learned” in it! Follow-through! :)
I’ve rediscovered something more annoying than Unicode (or rather ASCII; I blame ASCII, since we should all be using Unicode, always, at this point) – time.
There’s been a time element in most of the scripts I’ve written so far, but (1) someone else wrote it and I’ve just been reusing it, and (2) it explicitly skipped the hard parts. See?
from datetime import date, timedelta
days_back = 7  # example value; the real scripts set this elsewhere
start_date = str(date.today() - timedelta(days=days_back))
But if you start with a string, possibly a string that won’t be formatted the same in every chunk of data/metadata* you’re looking at, it’s a little trickier. I wandered through numerous date and time libraries and, finally, this happened—and I think it was my coworker who came up with it, actually:
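from datetime import datetime
from dateutil.parser import parse
# Fallback values for any fields missing from the string (no leading zeros: 01 is a syntax error in Python 3)
DEFAULT = datetime(2010, 1, 1)
a_date = parse(stringthatrepresentsadate, yearfirst=True, default=DEFAULT).isoformat()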
Seems so simple, right?
You’re probably wondering, “why the weird default?” Wellllll, we had some yyyy-mm dates (ex: 2011-02) that were parsing correctly as, for example, February, WhateverYear (yay!). But if only the month and the year are available, parse() helpfully supplies a day, so that it can generate a valid datetime for you. (OK, that’s actually helpful. Valid data is valid.) If you haven’t specified a default, the day it gives you is, oh so helpfully, the day on which you run it. Today is the 29th, which is “out of range for month,” as Python kept helpfully barking at me, every time the month was February. (Shh, pedants. Shhhh.) You get around this by setting a default date. You can’t just supply a default day; you need a whole datetime. I found a funny mistake in my code as I was writing this post, so I’ll probably pick a funnier/more important default date when I make that other change. :)
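For the curious, here is a rough stdlib-only sketch of why the default matters. It assumes parse() overlays whatever fields it found onto the default datetime, more or less the way datetime.replace() does. The fill_missing helper and the sample dates are made up for illustration, and newer dateutil releases may handle an out-of-range day differently.

```python
from datetime import datetime


def fill_missing(parsed_fields, default):
    """Hypothetical helper: overlay the parsed fields onto a default datetime."""
    return default.replace(**parsed_fields)


# What a string like "2011-02" gives us: a year and a month, but no day.
fields = {"year": 2011, "month": 2}

# Stand-in for "the day on which you run it": some 29th of a month.
run_day = datetime(2012, 3, 29)
try:
    fill_missing(fields, run_day)
    failed = False
except ValueError as err:
    # February 2011 has no 29th, so the borrowed day is invalid.
    failed = True
    print("ValueError:", err)

# With a fixed default, the missing day always comes from a safe value.
DEFAULT = datetime(2010, 1, 1)
safe = fill_missing(fields, DEFAULT)
print(safe.isoformat())  # 2011-02-01T00:00:00
```

Any default whose day is 28 or lower would avoid the error; the 1st just happens to be the obvious choice.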
So, great, now we have a datetime object. I’m sure JSON parsers will be thrilled to take that in. (Even if they would, which they would not, our schema specified an ISO 8601-formatted string, NOT a datetime object.) That’s where .isoformat() comes in, converting your datetime object back into an ISO 8601-formatted string. It’s a convenient thing to have in your pocket.
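A quick sketch of that round trip, using only the standard library. The "updated" key is a hypothetical field name, not our real schema.

```python
import json
from datetime import datetime

a_datetime = datetime(2011, 2, 1)

# The json module refuses to serialize a raw datetime object.
try:
    json.dumps({"updated": a_datetime})
    raised = False
except TypeError:
    raised = True

# Converting to an ISO 8601 string first gives it something it can handle.
payload = json.dumps({"updated": a_datetime.isoformat()})
print(payload)  # {"updated": "2011-02-01T00:00:00"}
```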
Fun Python fact: putting a variable in all caps is used to signify “this is a constant that we are not going to change.” You CAN change it; Python won’t stop you; but convention is that you don’t do that.
*Metadata is in the eye of the beholder, it seems. We grab metadata with an API, and there’s actually metadata about that metadata (when it was updated, where it came from) that we also capture. And then we save the metadata and metadata-metadata as objects in a system, and there’s metadata about that. I threatened to make a song to the tune of “I Am the Very Model of a Modern Major-General” about the situation, but then I think we got back to work.