I have a pet project–maybe it’s the sort of project only a web librarian would find interesting: I am trying to fix the location associated with my campus’s IP addresses. … Which might not even make sense, but wait! I can explain! Go into WorldCat.org, and go into an item record. Look at “Worldwide libraries that own this item,” and do you see how it auto-filled your ZIP code? Yeah, it does that wrong for us, showing a ZIP code for a city 257 miles away, as the raven flies. Foursquare also gets our location wrong when you search for venues 1) from a desktop computer, 2) on campus, 3) while logged out of the site–not a major use case, honestly: GPS-enabled mobile devices provide their own locations, and if you log in before searching it automatically uses your home city. Besides OCLC and 4sq, WhatIsMyIP.com and Google have also been known to get it wrong, as have other services.
And I wanted to fix it.
Background of how this happened to us, specifically:
You see, my university (UAA) is part of a statewide system of universities, each of which has several campuses. The main offices for University of Alaska are in Fairbanks, a few hundred miles from my university/campus, in Anchorage. And the main office serves as our internet service provider (ISP), meaning all of our IP addresses are provided by them, as are all of the IP addresses at every other campus in the system. (Look, it’s a small state, at least in terms of population and fiber optic cables.)
Where your IP address comes from isn’t actually important, most of the time, but in our case it matters, because apparently Fairbanks reported all of the IP addresses that belonged to them as being located there. (To whom this was originally reported, I’m not sure. DNS? ICANN? Eh, I should probably know, but, ultimately, it doesn’t matter.) I’m not mad at them for it: probably, back when it happened, geo-location wasn’t a big deal, and I bet they move blocks of addresses between campuses from time to time, anyway, so even if they’d done it “right,” there would be errors now. It’s cool. And they were as helpful as they could be, when I contacted them about this problem, providing me with a bunch of IP range information. We’re still friends. :)
Back to the world at large–and OCLC specifically:
A while back, some business people, or tech people, or somebody got this great idea: they decided to make huge databases of all of the IP address blocks in the world and what location each one comes from, so that people visiting websites could have their locations auto-filled for them. And website builders like Google and Foursquare and, yes, OCLC/WorldCat now pay for access to these databases. Apparently, they don’t actually query these databases each time someone loads a webpage (too slow!); instead, they buy access, load/cache all of the data, and then go on their merry ways, presumably updating from time to time. And, at least in the case of OCLC, they do not offer any other way to identify or fix the location that goes with the IP address of the browser accessing their site–yes, there’s IP address recognition that tells them to offer on-campus patrons the option to go to consortiumlibrary.worldcat.org, rather than www.worldcat.org, but they haven’t built in the extra step to help their service realize that, if patrons are on campus at the University of Alaska Anchorage, they aren’t in Fairbanks. They instead go with the data they loaded from their IP-to-location database vendor. Which they are unwilling to change, even at the specific request of individual libraries–we aren’t the only library to have asked, either. (I’m going to be a little petulant and point out that Google and Foursquare both ignore their vendor-supplied data when a user is logged in. WorldCat doesn’t even do that.)
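For the curious, here is a rough sketch of what a cached IP-to-location lookup with that missing "extra step" might look like. It is not how OCLC, MaxMind, or anyone else actually does it; the table names, the function name, the address blocks (reserved documentation ranges, not real University of Alaska ranges), and the ZIP codes are all made up for illustration.

```python
import ipaddress

# Hypothetical vendor data, loaded once and kept in memory.
# The blocks are reserved documentation ranges, not real campus ranges,
# and the ZIP codes are only illustrative.
VENDOR_BLOCKS = [
    (ipaddress.ip_network("198.51.100.0/24"), "99775"),  # whole block filed under the main office
    (ipaddress.ip_network("203.0.113.0/24"), "98101"),
]

# Hypothetical local corrections a site could layer on top of the vendor data.
OVERRIDES = [
    (ipaddress.ip_network("198.51.100.128/25"), "99508"),  # the slice actually used on another campus
]


def lookup_zip(ip, overrides=OVERRIDES, vendor=VENDOR_BLOCKS):
    """Return the ZIP code for an IP address, or None if no block matches.

    Local overrides are checked before the vendor table; within a table,
    the most specific (longest-prefix) matching block wins.
    """
    addr = ipaddress.ip_address(ip)
    for table in (overrides, vendor):
        matches = [(net, zip_code) for net, zip_code in table if addr in net]
        if matches:
            return max(matches, key=lambda m: m[0].prefixlen)[1]
    return None
```

With these made-up tables, an address in the overridden slice gets the corrected ZIP code, while the rest of the block still falls through to whatever the vendor data says. Without the override table, everything in the block gets the vendor's answer, which is the situation described above.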
There are a couple of mitigating factors that make this not a huge deal: the big one is that, once you set your ZIP code, your browser keeps it in a cookie. So when our patrons get a Fairbanks ZIP code, if they notice and change it, they probably won’t have to change it a second time. Second–and cooler–our Head of Systems, ever clever, hacked together a fix in EZproxy, to force the right ZIP code into the box; it only works if a user accesses WorldCat from the library’s website (so, not from Google Books or the other vendors who work with WorldCat), but that’s still pretty good! That’s probably the bulk of our WorldCat users, right there.
It’s not perfect, though, and it’s stupid that there should be this kind of problem with something as big as OCLC, or worse, Google, which is why I set out to fix this–and learned/put together most of the facts I’ve just shared with you, in the process.
What I did:
I exchanged more emails than you’d believe with OCLC–because, even with screenshots, this is not an easy problem to explain, and we kept getting stuck on “no, there is no way in our system to change this,” leading me to decide to go over their heads, so to speak, and fix their original data source. After many emails I got them to give me the name of their IP-to-location database provider, a company called MaxMind. I know our IP ranges (not by heart, but we have them in our intranet ;)), so I got in touch with MaxMind directly, to see if they’d fix our location. I didn’t go all super-altruistic and provide the IP ranges and locations for other UA or UAA campuses, opting just to provide the ones for the campus in Anchorage. I figured that, if it worked, I would share what I found with the right people, or I’d submit the others to MaxMind, or something.
MaxMind fixed us right up. It took a followup email; I guess they lost my request in their ticketing system. But they were friendly and helpful, and I was able to confirm that our IP addresses now show up as being in Anchorage. Woo!
That was in February. I checked in, once I’d confirmed that the source–MaxMind–was right, and OCLC said they reload that data every quarter or so. So I waited.
I kept checking WorldCat, once every week or two, and, due to a caching issue (I didn’t reset Safari, so it remembered my Anchorage ZIP code from a previous search), I actually thought the problem was fixed, earlier this summer. But I was wrong. It is still not fixed.
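Why a remembered ZIP code can hide the problem: if a site prefers a saved cookie over its IP-based guess, a browser that has ever been corrected will keep looking "fixed." Here is a minimal sketch of that precedence, assuming a cookie named "zip" (a made-up name; I don't know what WorldCat actually calls its cookie or how it reads it).

```python
from http.cookies import SimpleCookie


def zip_to_prefill(cookie_header, ip_guess):
    """Pick the ZIP code to pre-fill: a saved cookie value beats the IP-based guess."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")  # tolerate a missing Cookie header
    saved = cookie.get("zip")  # "zip" is a hypothetical cookie name
    return saved.value if saved is not None else ip_guess
```

In this sketch, a request carrying "zip=99508" gets that value back no matter what the IP lookup says, and only a browser with no saved cookie shows the IP-based guess. So a fair test of the underlying data needs a fresh browser or cleared cookies.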
Several(!) emails with OCLC went back and forth, over the summer. I had to keep explaining the problem, which I guess makes sense–they have different people answer tickets, right? Anyway, from what I gather, it seems like maybe they don’t reload as often as was originally suggested. But it’s been put on some kind of priority list, and so it should happen any time now. When it does, I will be ever so delighted and will brag all over social media, and things will be great!
In the meantime…
Since Foursquare and a few other services still have the wrong data for us, I started trying to hunt for other IP-to-location database companies. Foursquare, it turns out, also uses MaxMind and hasn’t reloaded in a while. (They were SUPER nice via email, by the way, and seemed apologetic about not having reloaded that data. It’s a really uncommon use of their site that causes the problem to come up, so I hope I didn’t alarm them.) Google is apparently a MaxMind customer, although I haven’t confirmed that with them in any official way–they just suddenly started getting our location right, in the spring or early summer. WhatIsMyIP.com also suddenly started getting it right. Fmbip.com (which will tell you all kinds of scary/interesting things about your browser) does not have it right. I guess maybe they’ll be the next company I ask. If they also use MaxMind, I may back off of this quest, beyond continuing to email OCLC about refreshing their data every three months or so, so that I eventually get to have my brag-fest. At this point, it looks as though this project will have taken over a year to complete!