Friday, December 17, 2004

The three-axis auto-match and Match.com

I met my girlfriend, Andrea, through Match.com, and thus I view the capabilities of such matching services in a fairly positive light. However, once one sets out on the quest of finding that "perfect someone" across distance via multiple search categories, one must commit to the notion that an ever-more-perfect search mechanism will yield ever-more-perfect results.

The previous post on Matt Jones' bookshelf and my forthcoming post on location-based blogging beg the question -- what are fast and easy ways to identify whether someone is likely your type? I would argue that you can analyze three axes and get very good results on the interesting-conversation front, and probably the I-want-to-date-you front as well:

1) Personal: Height/weight/gender/gender preference/availability. Just the basics.
2) Bookshelf: What do you have on yours?
3) Geo-log: Where have you been, and how much time have you spent there?

Voila. A whole new way of doing dating.
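For the curious, here is a rough sketch of how the three axes might be boiled down to a single number. Everything in it is my own assumption for illustration: the `Profile` fields, the idea of using the personal basics as a pass/fail filter, the overlap measures, and the 50/50 weighting. It is a toy, not how Match.com or anyone else actually does it.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    # All field names are hypothetical, invented for this sketch.
    name: str
    gender: str
    seeking: set                                       # genders this person wants to meet
    available: bool
    books: set = field(default_factory=set)            # axis 2: titles on the shelf
    place_hours: dict = field(default_factory=dict)    # axis 3: place -> hours spent there


def basics_match(x, y):
    """Axis 1: a simple yes/no filter on availability and mutual preference."""
    return (x.available and y.available
            and y.gender in x.seeking and x.gender in y.seeking)


def bookshelf_overlap(a, b):
    """Axis 2: Jaccard overlap of two shelves, 0.0 (nothing shared) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def geo_overlap(a, b):
    """Axis 3: time spent in the same places, as a fraction of all time logged."""
    places = a.keys() | b.keys()
    shared = sum(min(a.get(p, 0), b.get(p, 0)) for p in places)
    total = sum(max(a.get(p, 0), b.get(p, 0)) for p in places)
    return shared / total if total else 0.0


def match_score(x, y, w_books=0.5, w_geo=0.5):
    """Combine the axes; the weights are arbitrary placeholders."""
    if not basics_match(x, y):
        return 0.0
    return (w_books * bookshelf_overlap(x.books, y.books)
            + w_geo * geo_overlap(x.place_hours, y.place_hours))
```

Two people who share half their shelves and a quarter of their logged time would score 0.375 with these weights. Whether that number means "great conversation" is, of course, the part no code can tell you.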

More on Amazon - "My Work Bookshelf"


My Work Bookshelf
Originally uploaded by blackbeltjones.

Someone with the screen-name 'blackbeltjones' has uploaded his work bookshelf onto Flickr and annotated the jpg with the actual titles of the books. Can I just say that a) Flickr rocks b) this is a cool idea and c) this is an obvious extension of the content of my previous post on this subject. Prabal, are you taking note?

Click through on this photo to the right, and the full coolness will be revealed.

Tuesday, December 14, 2004

Real search engines and great search engine humor (no, really!)

If you think that Google is the end-all and be-all, read John Battelle on IBM's WebFountain. Wow. That is how a search engine is supposed to work, but the reason you can't access it is that it costs too much per query to give to you for free, and you're too cheap to pay for it.

I can't remember where it was that someone pointed out to me that the true genius of Google wasn't that they created such a great search result, but that they created a pretty decent search result at a computational cost per search low enough to be repaid by advertising. Per might remember, I need to ask him. (task placeholder)

I am going to cross-post on WebFountain over at Onohoku (link placeholder) because this is the scary shit that is destroying the middle class as we know it, in both corporate and governmental applications. Technology can make you free, and it can also make you a robot (which means 'slave labor' in the original Czech)...

The Battelle piece also led me to a truly wonderful self-parody hidden within the Google corporate site. I won't spoil it other than to say fly over there now and check it out!

Monday, December 13, 2004

More on Amazon and the extended conversation

On a day when Google announces that they're digitizing a number of major libraries, Prabal and I happened to be having a conversation about Amazon and where all this virtual/physical stuff is going.

Here are a couple of datapoints.

Firstly, Amazon has this handy "search inside the book" feature which allows full-text searching of books.

Secondly, Amazon now does this very handy "this book references... this book is referenced by..." relational search for references to and from the book.
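The "referenced by" half of that feature is really just the "references" half turned inside out. Here is a minimal sketch of the idea, assuming you already have each book's outgoing references in a dictionary. The function name and data shape are mine, and I have no idea how Amazon actually stores any of this.

```python
from collections import defaultdict


def build_cited_by(references):
    """Invert a {book: set of books it references} mapping.

    Returns {book: set of books that reference it}. Books that nobody
    references simply do not appear in the result.
    """
    cited_by = defaultdict(set)
    for book, cited in references.items():
        for other in cited:
            cited_by[other].add(book)
    return dict(cited_by)
```

Given `references = {"A": {"B", "C"}, "B": {"C"}}`, book C comes back as referenced by both A and B, which is exactly the two-way view described above.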

Now. Most people's libraries are made up of books published in the last 20 years. The vast majority of these books have bar codes on them, which either contain or link directly to the ISBN number of the book in some database online somewhere. Here's a scenario:

January 1, 2005: Amazon places a special offer on its website. By clicking in a special box on their order form, for just $5.99, Amazon will include in your order a USB bar-code reader and a CD with special book-organizing software -- kind of like iTunes for your books. When you receive the software and plug in the reader, you can scan the bar code on every book you own; and for the ones that don't have bar codes, you can type in the ISBN or Library of Congress catalog number. That combination will allow your average book owner to get 90% of their library online at a rate of about five books a minute, or a few hours for a reasonable library of 1000 books. Press a button, and your entire library is uploaded to Amazon.
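To make the scanning step a little more concrete: the bar code on most books is a 13-digit EAN that starts with 978 and wraps the first nine digits of the ISBN, so a scanner's output can be turned back into a ten-digit ISBN with some check-digit arithmetic. The sketch below shows that conversion plus the back-of-the-envelope time estimate. The function names are made up for illustration, and the five-books-a-minute rate is just my guess from above.

```python
def ean13_is_valid(code: str) -> bool:
    """Check the EAN-13 checksum: digits weighted 1,3,1,3,... must sum to a multiple of 10."""
    if len(code) != 13 or not code.isdigit():
        return False
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(code))
    return total % 10 == 0


def isbn10_check_digit(first9: str) -> str:
    """Compute the ISBN-10 check character (weights 10 down to 2, modulo 11)."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def barcode_to_isbn10(code: str) -> str:
    """Turn a scanned 978-prefixed book bar code into a ten-digit ISBN."""
    if not ean13_is_valid(code) or not code.startswith("978"):
        raise ValueError(f"not a valid 978-prefixed book bar code: {code!r}")
    first9 = code[3:12]
    return first9 + isbn10_check_digit(first9)


def scan_hours(books: int, books_per_minute: float = 5) -> float:
    """Rough scanning time in hours at an assumed steady rate."""
    return books / books_per_minute / 60
```

At five books a minute, `scan_hours(1000)` works out to a bit over three hours, which is where the "few hours" figure comes from. The same check-digit routine would also catch most typos when someone keys in an ISBN by hand.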

Some trust and fair-use issues would need to be resolved; Amazon might negotiate with the holders of the copyrights that anyone who physically scanned a bar code would get unrestricted search and access to the text of the books online. Perhaps this would be limited to books you'd bought through Amazon, and they would effectively get into the DRM business. Books for which you'd typed in the LC or ISBN number would need to be managed a bit more closely; perhaps such books would drive a query from the site to look up and enter a quotation from a chapter in the book, or some such method of ensuring that the customer actually held a physical copy.

This would also lead to issues with people walking into bookstores and scanning books. However, this battle will be fought soon over cell phones with good cameras in them. Already in Japan, it is reported as a serious issue that kids will enter bookstores and snap photos of pages in magazines that are of particular interest to them.

What would the result of such an upload be? Suddenly, one could manage and search one's personal library online. Search from a text perspective, anyway. Soon I'm going to get around to writing about the "fruit theory" which causes me to believe that books on bookshelves -- even lots of them, such as my library of about 3000 books -- are a highly efficient mental model by which to keep track of information. What's a great example of this? Prabal takes pictures of his bookshelves. He has shelves and shelves of books in Ohio, whereas he is in California -- and he knows where on the shelf a certain book is (efficient mental model). With the picture, he can a) remind/search to a greater degree of precision and b) tell his wife Wendy using a coordinate system that he might not remember, but can now communicate, e.g. "it's a little to the left of the big red book on the 2nd shelf from the bottom," and FedEx does the rest.

For manifold reasons, people are always, or at least for a very long time, going to want real physical books. This combined system would be a beautiful way to combine the benefits of the virtual and physical, and Amazon would be dumb or nuts (and they are neither) not to pursue it.

And then... Steve Wozniak is doing virtual/physical integration via his "Wheels of Zeus" startup -- which is aimed at the "where's my stuff" personal RFID market. For a deluxe package of $59.95, Amazon could mail out a combined RFID/bar code scanner and a big sticker sheet of WoZ tags... and not only you, but your entire family, could suddenly find and access all those great books on your shelves.

The combination of web-based management, personal possessions which come from a limited dataset, and scanning/tagging technologies is going to be flat-out amazing. Imagine managing your wine collection this way. Your tools. Your shoes. Anything (unlike CDs) where the physical must remain no matter how far digitization goes.

This is gonna be cool.

And the best thing is, besides this great conversation, we had a billion-dollar idea tonight! Now we just need a billion-dollar implementation... ;-)

Friday, December 03, 2004

The Extended Conversation

I stole this citation from John Battelle's Searchblog. But it's cool enough that I want to write about it myself.

"Rageboy" has discovered, and briefly discussed, that Amazon is now providing hypertextual citation in both directions -- showing both the books that are cited in Book A, and all the books that cite Book A. This means that you can now not only find out all about Book A, you can quickly ascertain where Book A sits in the entire pantheon of scholarship in its topical field. The similarity of this to Google's PageRank algorithm will not be lost on any of the search cognoscenti, and the potential of that alone is enormous -- in addition to reading reviews, finding out "people who bought also bought," and looking at best-seller lists, I can now choose my books on the basis of how often they are cited, or how broadly they cite -- powerful applications of the "hubs and authorities" model of link analysis, a close relative of PageRank. When this is suitably combined with a nice visualization and discovery tool like TouchGraph, we might really be on to something. Three cheers for social networks of data!!
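For anyone who wants to see what "how often they are cited, or how broadly they cite" looks like as arithmetic, here is a small sketch of the hubs-and-authorities calculation (usually called HITS) run over a citation graph. The function name, the input format, and the fixed iteration count are my own choices for illustration. Nothing here is Amazon's or Google's actual code.

```python
import math


def hubs_and_authorities(cites, iterations=50):
    """Score books in a citation graph.

    cites maps each book to the set of books it cites. A good authority
    is cited by good hubs; a good hub cites good authorities. Returns
    two dicts: (hub_scores, authority_scores).
    """
    books = set(cites) | {b for cited in cites.values() for b in cited}
    hub = {b: 1.0 for b in books}
    auth = {b: 1.0 for b in books}

    for _ in range(iterations):
        # Authority score: sum of the hub scores of every book citing it.
        auth = {b: 0.0 for b in books}
        for src, cited in cites.items():
            for dst in cited:
                auth[dst] += hub[src]

        # Hub score: sum of the authority scores of every book it cites.
        hub = {b: sum(auth[d] for d in cites.get(b, ())) for b in books}

        # Normalize both so the numbers stay bounded across iterations.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for b in scores:
                scores[b] /= norm

    return hub, auth
```

Feed it something like `{"Survey": {"Classic", "Other"}, "Thesis": {"Classic"}}` and the widely cited "Classic" ends up as the top authority, while the broadly citing "Survey" ends up as the top hub. That is the "how often cited" and "how broadly it cites" distinction in miniature.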