They say that the only thing worse than finding a worm in an apple is finding half a worm. I had such a moment at work last week. We had sent a list of target companies to a data mining firm in India with instructions to find the web and email addresses and contact names of people at these companies. A fairly standard practice these days. Data miners charge as little as a few cents per record for this kind of work.
I received a batch of such data and was giving it a once-over before having it imported into our database. You can’t really quality check thousands of names and addresses in any meaningful way, so you just leaf through and look for anything really odd. And completely at random, I found something really odd.
Company: Hebrew Academy of [redacted]
Contact: Mahmoud Ahmadinejad, President
It was truly a half-a-worm moment. We had already marketed to tens of thousands of records mined by this company. I checked the Hebrew Academy web site, and there was no evidence that Mr. “wipe Israel off the map” was in charge of a Midwestern yeshiva. I checked all the other Jewish organizations in the list and a bunch of others for good measure but found nothing else odd. Several troubling thoughts came to mind:
- Since there’s no way this was a random error, somebody over there is making stuff up, and probably thinks this is funny.
- Imagine how awful it would have been if we had sent a letter addressed to Mr. Ahmadinejad at the Hebrew Academy.
- I happened to catch this one. How many others are lurking in the data?
As a marketer, I figured that there are lots of other data mining fish in the sea, and the cost of redoing all this work was still small compared to the potential cost of not just erroneous but offensive material lurking in there. Trying not to be too political about the specifics of the problem, I sent a note to the vendor simply saying that we could no longer trust the quality of the data for marketing purposes, and would have to cancel the project.
There was some back and forth, and then I found the other half of the worm. My data miner sent me a link to a press release posted on the Hebrew Academy’s site titled, “Israeli leader calls for tougher sanctions against Iran,” and including the line, “In Iran, President Mahmoud Ahmadinejad pledged to push ahead with his country’s nuclear program and said his people would not bow to Western intimidation.” The miners had searched the site for the word “president” and grabbed the name next to it without paying much attention. Had they searched for “prime minister” they would have put Ehud Olmert in charge.
Sure, this suggested less than stellar QA practice, appalling lack of awareness of current events, and maybe even laziness, but at least it didn’t mean that there was a bigot on the loose out to embarrass us. Much more in line with my expectations of cents-per-record data mining.
So what makes this a Purim story? Not much other than coincidence. Tonight is Purim, a holiday that marks the escape of the Jewish people from genocide at the hands of the Persians. Weird how these things sometimes fold back on themselves.
Enjoy the holiday and don’t forget to double-check your mailing lists.
Nice post. I’ve been thinking myself about writing about some of the things going on at work… but I’m not sure if it would be so useful. It’d be nice to explain the situation with the site and what’s wrong with it in terms of markup and the stylesheets.
Scrap that, what’s right about it would be a better thing! 🙂
Keep up the good work dude.