I want my data, not a conversation
At the Ontario Linux Fest I had the wonderful opportunity of having lunch with Jeremy. It was after his keynote talk. I had lots of questions, which he very generously tolerated as we ate our beet sandwiches. We chatted about the future of data and ownership and various things and he recommended Van Jacobson's Google Tech Talk: The Future of the Internet. The video is just over an hour long and I hadn't taken the chance to watch it until today. (I've been recovering from the flu this weekend and decided this video was like watching Q in his lab coming up with the gadgets. There were fewer explosions, but I probably learned more.)
Here are a few of my favourite points (hopefully they're enough to entice you to spend an hour watching, or at least listening to, this talk)... The talk starts out with a great history of communications (including why we started with four-digit phone numbers)... and then it got more interesting...
- People don't want to have conversations. They just want their Web pages. They hand in a URL and they want to get something back. That's not a conversation. It's the computer equivalent of, "does anybody have the time?"
- Our current model doesn't stop the spam. The content is garbage, but it came from my mail server, so it must be a good conversation. Things are currently blind to the data.
- Change your point of view to focus on the data. Not where the data lives, because it doesn't have to live anywhere. Data is named. When you want a chunk of data, you present its name to the network. It doesn't matter what kind of networking technology you've got; use them all. It's like asking, "does anybody know where we are?" Pass this request out to the network and hope that somebody can give you a reply.
- And then he read my mind because what he was describing sounds like caching... but it's not. The effect is very much the same. In caching you say, oh the data lives there, but I've got a copy of it here. In my model the data doesn't live anywhere. Wherever it is, that's great. I can authenticate it and I know whether or not I can trust it... I'm not moving it from its one true location to somewhere else... it doesn't have one true location.
- Integrity and trust are properties of the data, not of the way that you obtain it.
- So this is, by contrast to the Microsoft Authenticode model, where you download this update from Microsoft, then there's a little window that comes up and says, "Do you trust Microsoft?" And the answer is "NO!" and I kept clicking "no" and my machine never got updated. [The audience and I both laughed at this.] I don't! Their model is "Do you trust their delivery mechanism?" They're basically asserting loading this update won't cause financial harm to Microsoft. Great, but that's not what I want. I want my machine to work. I want to know what you're going to do to my machine. I want to know something about the data and the operations you're going to take, but they can't tell me that. So I want properties of the data where I don't have to trust remote agents, that the data itself lets me figure out what it means and who sent it and how it's connected to the world.
This last point reminds me of Laura's recent request to have system updates a little bit more system-aware. (It also reminds me of the first time a friend of mine told me to download some form of BSD from the Intarwebs and I clicked "sure" because it had blowfish and then promptly FREAKEDOUT because the download screen seemed to be taking over my computer and furthermore I was supposed to just trust a whole operating system that just existed somewhere on the internet?! That's CRAZYTALK.) So go watch the video and then go read Laura's blog post. Do I have any solutions to entice you back here? Um, nope! Just some brain fodder. So away you go! It's not as exciting as the new Bond movie, but it is something you can enjoy from the comfort of your own home.
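
P.S. for the code-minded: here is a tiny Python sketch of the "integrity is a property of the data" idea from my notes above. It is my own toy, not anything shown in the talk, and it rests on one big assumption: the name of a chunk of data is simply the SHA-256 hash of its bytes. That only covers integrity (is this the data I asked for?), not the "who sent it" part, which would need signatures. The `Node` class, `name_for`, and `fetch` are all made-up names for illustration.

```python
import hashlib


def name_for(data: bytes) -> str:
    """Assumption for this sketch: a chunk's name is the SHA-256 of its bytes."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """A hypothetical participant on the network that may hold named data."""

    def __init__(self, label: str):
        self.label = label
        self.store = {}  # maps name -> bytes

    def publish(self, data: bytes) -> str:
        """Keep a copy of the data and return the name it can be requested by."""
        name = name_for(data)
        self.store[name] = data
        return name

    def answer(self, name: str):
        """Reply with whatever this node holds under that name, or None."""
        return self.store.get(name)


def fetch(name: str, nodes):
    """Ask everybody; accept the first reply whose bytes match the name.

    The check is on the data itself, so it does not matter which node
    replied or how the reply reached us.
    """
    for node in nodes:
        reply = node.answer(name)
        if reply is not None and name_for(reply) == name:
            return reply, node.label
    return None, None


if __name__ == "__main__":
    origin = Node("origin")
    name = origin.publish(b"hello, named data")

    # A node that replies with garbage under the same name.
    liar = Node("liar")
    liar.store[name] = b"totally legit, trust me"

    # A node that just happens to have a good copy.
    friend = Node("friend")
    friend.store[name] = b"hello, named data"

    # The origin is not even asked; any holder of a valid copy will do.
    data, source = fetch(name, [liar, friend])
    print(source, data)
```

The garbage reply is rejected because of what it contains, not because of where it came from, and the good copy is accepted even though it never touched the original publisher.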

Comments
Talking about names, names can make life much easier. I've been writing a parser for a research language I've been developing, and the data structure I'm using for the syntax tree is composed of a lot of nested named nodes. This way I can get data just by doing a name lookup rather than an endless sequence of child references. Like you said, I don't necessarily care where the information is (though sometimes I do, so I still need access to the underlying tree structure) and a name lookup is intuitive and easy.
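
A rough Python sketch of what such a named-node tree might look like. The commenter's actual parser and language are not shown, so `SyntaxNode`, its fields, and the node names below are all hypothetical. It contrasts a lookup by name with a chain of child references over the same tree.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class SyntaxNode:
    """A hypothetical syntax-tree node that carries a name."""
    name: str
    value: Optional[str] = None
    children: List["SyntaxNode"] = field(default_factory=list)

    def walk(self) -> Iterator["SyntaxNode"]:
        """Yield this node and all descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()

    def find(self, name: str) -> Optional["SyntaxNode"]:
        """Return the first node with the given name, wherever it sits."""
        for node in self.walk():
            if node.name == name:
                return node
        return None


# A made-up tree for the statement: x = 1 + 2
tree = SyntaxNode("assignment", children=[
    SyntaxNode("target", "x"),
    SyntaxNode("expression", children=[
        SyntaxNode("operator", "+"),
        SyntaxNode("left", "1"),
        SyntaxNode("right", "2"),
    ]),
])

# Lookup by name: no need to know where the node lives.
by_name = tree.find("operator")

# The same node via child references, for when position does matter.
by_position = tree.children[1].children[0]
```

Both routes reach the same node; the name lookup just does not break when the tree gets rearranged, while the `children` lists keep the underlying structure available.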