Sunday, March 22, 2009

Facebook vs. Internet: Advantage, Facebook

Jesse Stay's excellent post on the search potential of Facebook's Lexicon has inspired me to put down a few quick thoughts on Facebook's nearly unlimited potential to capture the future of what John Battelle calls the "database of intentions".

Google's extraordinary accomplishment is that they used superb statistical analysis to make some vague sense out of the complete mishmash that makes up the flat-text Web. But while that accomplishment is considerable, at the end of the day, they're still dealing with mush.

Facebook's great opportunity is that everything within Facebook is structured; and increasingly, users express their intentions against this structured data at scale in a way that can be very productively mined -- for product improvement, for user retention, for advertising. For insight.

Riddle me this: You have 200 Facebook friends. They are all pretty active. Does your FB feed actually show every single event from every single one of them? No, it doesn't. FB is algorithmically determining what is most interesting to you - dynamically - based on how much attention you pay to what those users do, and how you interact with them. Facebook knows how much you care about each of your friends. It knows whether you pay more attention to people near or far, to men or to women, to people you work with, went to high school with, or went to college with. It knows because you explicitly describe all those relationships, in a way that Google can never grasp no matter how world-beating its science and how vast its server farms.
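
To make the idea of an attention-weighted feed concrete, here is a toy Python sketch. It is emphatically not Facebook's algorithm, which isn't public; the `rank_feed` function, the data shapes, and the sample names are all made up for illustration. It simply counts your past interactions with each friend and sorts their events by that count.

```python
from collections import Counter

def rank_feed(events, interactions):
    """Order feed events by how often the viewer has interacted with each author.

    events: list of (author, text) tuples, newest first.
    interactions: list of author names, one entry per past like/comment/click.
    Both shapes are hypothetical, chosen only to keep the sketch small.
    """
    affinity = Counter(interactions)
    # sorted() is stable, so authors with equal affinity keep their
    # original (newest-first) order.
    return sorted(events, key=lambda e: affinity[e[0]], reverse=True)

# Made-up sample data.
events = [
    ("alice", "posted a photo"),
    ("bob", "changed jobs"),
    ("carol", "wrote a note"),
]
interactions = ["carol", "bob", "carol", "carol", "bob"]

feed = rank_feed(events, interactions)
for author, text in feed:
    print(author, text)
```

A real system would presumably weigh recency, interaction type, and much more, but even this crude count relies on the friend relationship being an explicit, structured thing.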

Or consider the Lexicon graphs that Jesse highlights in his post. Google Trends can handily generate one of those for you from their painstakingly de-mishmashed dataset. But they can't tell you the demographic breakdown of that interest, because they don't know who's male and who's female. Nor do they know whether that interest is coming from people directly associated with the topic in question: for instance, Ohio State, my alma mater.

Here's the Ohio State Lexicon graph, which I have annotated to show the precision of Facebook's read on the importance of a topic:
[Image: Facebook Lexicon graph for "ohio state", current version]
Here's the term 'Football' as a proxy from the new Lexicon, which doesn't yet allow analysis of arbitrary search terms.
[Image: Facebook Lexicon graph for "football", new version]
As you can see, FB could allow you to slice and dice the 'Ohio State' search by any number of associations -- male vs. female, by age, and whether the person had attended Ohio State. Google can't do that. No one else can do that, because no one else has assembled a gigantic graph of defined and structured entities within which users apply their attention and annotation.
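
Here is a minimal Python sketch of what that kind of slicing looks like once the data is structured. Everything in it is an assumption for illustration: the `Mention` record, its fields, the `slice_mentions` helper, and the sample rows are invented, and none of it reflects how Facebook actually stores or queries Lexicon data.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical record: one mention of a term, joined to the structured
# profile of the user who wrote it.
@dataclass(frozen=True)
class Mention:
    term: str
    gender: str
    age: int
    schools: frozenset

def slice_mentions(mentions, term, key):
    """Count mentions of `term`, grouped by whatever `key(mention)` returns."""
    return Counter(key(m) for m in mentions if m.term == term)

# Made-up sample data, purely for illustration.
MENTIONS = [
    Mention("ohio state", "male", 34, frozenset({"Ohio State"})),
    Mention("ohio state", "female", 22, frozenset({"Ohio State"})),
    Mention("ohio state", "male", 19, frozenset({"Michigan"})),
    Mention("football", "female", 41, frozenset()),
]

# The same term, sliced two different ways.
by_gender = slice_mentions(MENTIONS, "ohio state", lambda m: m.gender)
by_alumni = slice_mentions(MENTIONS, "ohio state", lambda m: "Ohio State" in m.schools)

print(by_gender)
print(by_alumni)
```

The point is that each new slice is just another key function over fields the users filled in themselves. With a flat-text corpus, there are no such fields to group by.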

The implications for local search alone boggle my mind -- that's food for another post.

It's worth noting that Lexicon is really, really slow right now. My hat is off to FB for making it work at all. I assume that some implementation of Cassandra is behind the current Lexicon. One reason they may not be allowing open-text searching in the new Lexicon is that, while they're pushing the envelope developing it, they're crunching big batch jobs in Hadoop on a limited set of terms to produce the more sophisticated analysis presented there. Zvents has developed some pretty sophisticated internal analytics based on Hypertable, so I'm familiar with the challenges that this sort of slice-and-dice presentation poses -- they are considerable.

Google has taken the statistical analysis of flat text about as far as it can go. The question is, what next? Powerset attempted one approach, which was the semantic analysis of that same flat text. We'll see whether Microsoft and Powerset can make a go of that - the jury is definitely out on whether it adds value in a computationally and commercially tractable manner. But in the meantime, my bet is on Facebook -- because the information potential of a structured system is vastly greater than that of a flat corpus, and it is far more tractable to parse.

Internet, watch out. Here comes Facebook.

Monday, March 16, 2009

Reality Check of the Day: China v. USA

One of my great realizations from living in the UK for a couple of years is just how utterly the U.S. media lacks any perspective on American military actions abroad. The 'abroad' is completely redundant, of course: unlike every other country on earth, the U.S., aside from our distant independence and single civil war, has NEVER had a military action that wasn't abroad.

This piece in Newsweek caught my eye:
"The confrontation last week between a U.S. ship and five Chinese naval craft was just the latest of many low-grade military clashes in the South China Sea, the site of numerous territorial disputes. It was eerily similar to the "Hainan Island" incident in 2001..."
But the punch line was the ending quote:
"This confrontation had been preceded by increasingly bold behavior on the part of People's Liberation Army ships and planes. "They seem to be militarily more aggressive," said Obama's new National Intelligence director, Dennis Blair..."
Um. Yeah.

For the reality-based coalition, here's a handy world map showing the location of the United States, China, and the 2001 and 2009 incidents between the U.S. and Chinese military:


Who, exactly, is being "militarily more aggressive"?

Monday, March 02, 2009

Greenspan 2004: Your house will cover your personal debt

I randomly found my grumpy notes on this whistling-past-the-graveyard gem from Sage Alan in February 2004, and thought it was well worth posting:

The finances of American households are in generally good shape even though consumers have increased their debt and bankruptcy filings have surged, the Federal Reserve chairman, Alan Greenspan, said yesterday.

In a speech to the Credit Union National Association in Washington, Mr. Greenspan said that an extended period of low interest rates and extra cash from mortgage refinancing had given borrowers flexibility to better manage their debts...

Consumer debt reached a record $2 trillion in December, according to the most recent figures from the Federal Reserve. That includes credit cards and car loans, but not mortgages...

[Greenspan] said that American households own more than $14 trillion in real estate assets and that mortgage refinancing and the rise in home values have helped to bolster consumer spending in economic hard times as well as better periods.

"Over the past two years," he said, "significant increases in the value of real estate assets have, for some households, mitigated stock market losses and supported consumption."

Boy, he sure got that one right, didn't he?