Monday, August 27, 2007

The Spock Entity Resolution and Extraction Challenge

Newly launched people search engine Spock is offering $50,000 to the team that can create the best algorithm for identifying unique individuals from the ever-changing morass of personal data on the internet. The team of judges is composed of software engineers, computer science professors, and a venture capitalist.

It’s too bad that no philosophers or sociologists were invited to tag along. They could have advised the Spock team that personal identity poses some of the oldest problems around. A Buddhist or Heraclitean would argue that the entity Spock is chasing is a non-entity, that no enduring personality exists—only temporarily assembled bundles of individual properties.

Without going that far, one still finds it nearly impossible to identify a collection of features, retained by an individual from infancy to old age, that is unique enough to be essential to them. People change party affiliation, citizenship, gender, size, and hair color. People change interests, ideologies, addresses, names. Perhaps your DNA is coincidentally unique to you, but what if you had a twin (or a clone)? What if you changed your genetic makeup through gene therapy?

Of course, one identity does remain constant and trackable, at least for those of us who aren’t secret agents, and I suspect it’s the one Spock is trying to identify us with. It’s the one that cashes the checks: our “official” identity for civic and financial purposes. Unfortunately for Spock, the figures that nail it down (DOB, SSN, etc.) are the ones we are well-advised to keep off the internet. And for those of us unlucky enough to have our identities stolen, even this “official” identity doesn’t always resolve to a unique individual.

Spock shows much promise. But identity as found on the internet is unavoidably a bundle of properties without a substance to adhere to. Insofar as people are multifaceted (one person scattered across several unrelated-looking profiles) or conformist (many people sharing nearly identical profiles), Spock will never be able to resolve them entirely.
