January 30, 2011

"Before I cancel your account..."

"Before I cancel your account, I would be happy to put your account on a pause which would freeze your account for 6 months at $5 a month, Would that work for you until you're ready to utilize the service again?"

No. I want to cancel the account. In fact, I'm pretty offended that I had to chat with you at all. You have a website, where I can review and modify my account. I had to provide several sensitive pieces of information to gain access to said website. I find the lack of a "cancel" button on that site to be offensive. It tells me "We're not letting you go until we have a chance to annoy you with attempts to entice you with new offers or get you to give up one of your friends to take your place as our hostage we mean customer."

This is the kind of thing that so many companies get right, and so many get wrong. You, my friend, got it wrong. I don't feel like a valued customer, I feel like my business is your right and how dare I attempt to deny you said business.

Hey - stop smirking, cell-phone rep. I hate you too -- same reason.

January 16, 2011

Own Your Data, Cont'd

There's been some talk lately in social software circles about how to take back ownership of your data: i.e., how to make sure that the content and data you create and share cannot be lost or destroyed by a particular service shutting its doors.

Dave Winer and Tantek Çelik, two professional nerds and thinkers I admire a lot, are tackling the problem from different directions, and I'm really glad to see it. Variety is the spice of life, and the stuff of systems that survive.

Tantek has been working on a personal publishing platform he calls Falcon. Eventually Falcon will include microblogging, blogging, photo sharing, geo-location data, etc. The core idea is that you create your content locally (on your server), and that's where the "canonical" version lives. Then the software pushes it out (syndicates it) to the web services everyone else is using. Ideally, syndication and messaging technologies will be adopted that allow comments, favorites, and other forms of feedback to be aggregated back to the home server.

Dave is also working on something similar, but based almost entirely on RSS. But the component I'm seeing from Dave that I really like is his EC2 for Poets project. This is a project to define a server instance for EC2 that anyone can purchase and run on Amazon's hosted computing platform, all set up and ready to go. This will be the foundation of Dave's publishing platform, but I love that he's starting from the ground up, knowing that the first step to running your own site/publishing environment is having a place to run it from.

Both lines of exploration are vitally important -- defining the technologies that will allow us to freely create while retaining ownership of our data, and paving the on-ramps that will give us a place to do it.

Helvetica

When I started Gary Hustwit's documentary on the Helvetica typeface, I was all "damn you can reverse Helvetica Bold out of any photo and it will look awesome". By the end I was ready to throw up every time I saw it on screen.

January 12, 2011

The System

Some mornings I wake up dreading the moment my son will get up. He wakes up hard and fast, and it totally breaks The System. This is not my son's fault — he's a 7yo boy. But it is making me look to update The System.

I became a "morning" person because I hate mornings. I don't like having to wake up with people around, in my face, expecting things. So I worked out The System. The System goes like this: I set my alarm to wake me an hour before whoever in the family typically wakes up next. I like it dark. I get up, make coffee, read my Bible, pray, read some news and email. By this point my brain is working, I've decided how I feel about the day, and I can generally be calm, kind, understanding, etc.

In one college course we all had to be up at 6:30am for breakfast and in class at 7:30am, but we were all working until 2 and 3 in the morning. I came to breakfast so I could get coffee, but I was (not so) affectionately known as "the dark presence". People laughed, but it was "haha only serious", and I was generally left alone until after the first session. No one likes "the dark presence".

Breaking The System seems to break my ability to operate with any degree of social skill, and my son breaks The System. Seriously. He wakes up around the same time I do, long before my wife or the other kids, he's bounding with energy, and he has to have my attention. He isn't able to quietly entertain himself; he needs interaction from the get-go. My natural response is GO. AWAY. This only makes him sore at me and can derail his whole day, which makes it a lose-lose.

So I'm working on some tweaks to The System. One is to get up even earlier, but it's hard to do that and still be able to stay up with the wife in the evenings. Another is to plan ahead: make sure the coffee is going to be ready without having to make it. Make a list of things for him to do to start his day. Think about how those first couple of conversations might go so I'm ready. You laugh! -- but I'm a nerd and don't handle change/conflict well without some preparation.

I think I can make this work, but I'm learning that The System needs some adaptive execution paths in order to continue working.

January 01, 2011

Wallrazer

For 2011, I'm going indie. I'm hanging my shingle at http://wallrazer.com, and I'm actually pretty excited about the prospects.


This is a big deal for me -- I haven't been outside a "regular" job for about 7 years. Mostly I'll be doing contract web development in Movable Type, Python, and Django. This will be my bread and butter most of the time. I'm getting some leads from fellow ex-Aparters, and I love that while the brand may be dead, the community isn't.

I also want to put more time into helping to promote Open Web technologies and the indie web. I'm hoping to make it out to IndieWebDay and meet with other folks who care about helping site operators own their data: more homesteading, less sharecropping!

Ultimately, I want to develop some ideas I've got, put them into code, and release as much of that as possible as open source. As principal of my own company, I get to say when I can release code. YES.

November 05, 2010

Freshening Up

As I look forward to what's next, my brain has been nudging me to do some homework. My development time in the last couple of years has been spent deep in Movable Type's guts, making our workhorse do things that aren't... natural.

Now I'm working on getting my brain back into full-stack mode. It seems to me that a competent web developer today should at least understand the full web "stack": be able to develop a useful storage schema, know how that stored data gets to the middle tier, how the results of the middle-tier logic get to the page, and some interesting things that can be done with them once they're there.

In other words, a decent web generalist. Few of us work on the whole stack, but I think we ought to be able to have a useful conversation with someone focusing on any of the areas we're not focused on.

So, I'm in refresh mode. Some things I'm doing to exercise the ol' brain:

Build out a simple service in node.js

I'm building a simple url shortener (the new Hello World?) in node.js. Node isn't brand new, but it's interesting, and does some things in ways I've not tried before. In particular, it is focused on rarely blocking execution, so that requests are processed at ludicrous speed.
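
For flavor, here's a minimal sketch of such a shortener using only node's core http module and an in-memory object -- not the actual node-short code, and with no persistence:

    // A tiny non-blocking URL shortener sketch. Handlers return immediately;
    // node's event loop services other requests while I/O is in flight.
    var http = require("http");
    var links = {};    // short code -> long URL
    var counter = 0;

    http.createServer(function (req, res) {
      if (req.method === "POST" && req.url === "/shorten") {
        var body = "";
        req.on("data", function (chunk) { body += chunk; }); // async reads
        req.on("end", function () {
          var code = (counter++).toString(36); // cheap short code
          links[code] = body.trim();
          res.end("http://localhost:8080/" + code + "\n");
        });
      } else if (links[req.url.slice(1)]) {
        res.writeHead(301, { Location: links[req.url.slice(1)] });
        res.end();
      } else {
        res.writeHead(404);
        res.end("not found\n");
      }
    }).listen(8080);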

I'm also playing with mongodb, a document database in the NoSQL vein which, like node, uses JavaScript and JSON natively. I don't have an opinion on it yet, but it's fast, and the idea of storing whole (denormalized) documents is fascinating.
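
For a taste of the document model, here's the sort of thing you do in the mongo shell; the collection and field names are just an example. The whole link record -- metadata, hit log, and all -- lives in one denormalized document:

    // Insert one self-contained document per short link...
    db.links.insert({
      code: "a1",
      url: "http://www.monkinetic.com/",
      created: new Date(),
      hits: [ { when: new Date(), referrer: "http://twitter.com/" } ]
    });
    // ...and fetch the whole thing back in one query, no joins.
    db.links.find({ code: "a1" });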

Finally start learning HTML5 and its ilk

The front end for node-short (see above) is basic HTML5, and I've finally started reading up on what that actually means and on some of the more geeky bits like Web Sockets and Web Workers. Not much to report as I haven't been playing around with them for long, but I think these two are going to continue to enable even richer "rich UIs" by allowing front-end developers access to more data in a more timely manner.
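
For a taste of Web Sockets: instead of the page polling the server, the server pushes messages down a persistent connection and the page reacts as they arrive. A minimal browser-side sketch (the endpoint URL is made up):

    // Open a socket and append each server-pushed update to the page.
    var ws = new WebSocket("ws://example.com/updates");
    ws.onmessage = function (event) {
      var p = document.createElement("p");
      p.textContent = event.data; // one <p> per server push
      document.body.appendChild(p);
    };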

What Else?

I'm also just reviewing a few non-perl (mostly python) projects I've worked on, just so I can get as many synapses firing as possible; there are code paths in my brain that need refreshing and re-wiring!

October 18, 2010

Yahoo planning Facebook-style Connect single sign-on system

Yahoo "Connect" is first smart strategic move by Yahoo in years. http://read.bi/dn79EC

I agree with Chris Dixon re: this Business Insider article - Yahoo's market is quite diverse and made up of regular people; I think that not being known as a "social network" may make some users more comfortable using them as a sign-in provider.

I wonder if the Facebook Connect UI ever adds friction to the sign-in instead of removing it? "Do I want my friends to all know about this?"

October 11, 2010

A Slightly Snarky Response to Morten Just's iPhone and WinPhone Comparison

Had a bit of fun with the screenshots from Morten Just's "iPhone and Windows Phone 7 series side-by-side: The chrome, the chrome!":

[Screenshots: iPhone and Windows Phone 7 series side-by-side - Morten Just]

WinPhone is purty, no doubt. But the iPhone is all about data density and making actions visible. And yeah, I'll grant that the iOS UI is about due for a bit of refresh, but I'd bet that the same principles will be evident.

October 05, 2010

SAY: Goodbye

Two years to the week since my first day with Six Apart Services, Six Apart announced that we will be merging with VideoEgg to create a new entity, SAY Media.

Sadly, some of us won't be making the transition, including remote employees like myself.

The last two years have been filled with challenges, moments of total WIN mixed with the occasional growth-inducing FAIL. I got to work with some of the smartest people I've ever had the pleasure of working with, on some of the most challenging projects. Whatever happens next, I'll be grateful for the time I got to spend here.

I was proud to work for Six Apart, one of the real game-changers of the blogging revolution, and I'm sad to see Six Apart go away as an entity. But there are a lot of great people here who are going on with SAY Media to impact the web in entirely new ways, and I'm proud of them.

I'm staying on for the foreseeable future, helping complete and transition various projects, but I'm also looking forward to the next opportunity. The most satisfying work I've done over the years has been when I'm making a difference in people's lives - through technology, but also outside of the screen - and that's where I'm setting my sights.

You can hire me!

September 27, 2010

And then there were 5...

The last three weeks have been some of the most intense in recent memory for me, filled with blessings and some sadness.

A month ago (!), Jodi and I learned that a young boy we had heard about back in April (from the foster system here in AZ) was available for adoption. We had sought to adopt him in April, but circumstances were not right and it didn't work out. This time around we were already certified to adopt in Arizona and the process went forward more smoothly - and more quickly - than we could have expected.

V. was placed with us 2 weeks ago as an adoptive placement; we'll be able to officially adopt him 6 months hence. The last 2 weeks have been especially intense emotionally as V. gets to know our family, and we, him! He's a sweet boy with a rambunctious streak who adores swimming more than anything else in the world right now (good thing we have a pool and the water is still around 80!), so we've spent a lot of time in the pool. I've started working with an architect (hi Dad!) to design a phased playground in the backyard. Partly because playgrounds are fun, and also to give V. and me a project to work on together.

Things have not been all wine and roses -- introducing a new permanent member of the family at 7 isn't the easiest thing to do. V. and his new sisters are still getting acclimated, but a lot of fun is being had amidst the familial growing pains (there are 5 of us now!). Ultimately we know V. was meant to be in our family and we're excited to see what God is doing here!

August 03, 2010

Bond. The Princess Bond.

I have a Personal Theory of James Bond. You may not think that James Bond as a character requires anything so vain as a Personal Theory, but nevertheless, I have one.

In my Personal Theory of Bond, there is only one James Bond - Sean Connery (you know it's true). So what about all the other "Bonds"? William Goldman gave us the answer:

"Well, Roberts had grown so rich, he wanted to retire. So he took me to his cabin and told me his secret. "I am not the Dread Pirate Roberts," he said. "My name is Ryan. I inherited this ship from the previous Dread Pirate Roberts, just as you will inherit it from me. The man I inherited it from was not the real Dread Pirate Roberts, either. His name was Cummerbund. The real Roberts has been retired fifteen years and living like a king in Patagonia." Then he explained the name was the important thing for inspiring the necessary fear. You see, no one would surrender to the Dread Pirate Westley."

With Bond, it's the same thing - Sean Connery is retired and has been living in Patagonia (or, in my personal fantasy, a remote cabin in Alaska, in bitter exile, until his country once again must call upon him in its hour of greatest need). Thereafter, his replacement came on, assumed the name Bond, and continued the legend. It's the name, you see, that inspires the necessary fear.


July 18, 2010

Web-based URL Handlers (or, Lessons from the iPhone)

While walking back from the Gilt Club tonight (meet-n-greet before FSW) to the hotel, I was chatting with Brion and Zach from Status.net and started talking about a conceptual UI bug in status.net. The bug to me is that in order to subscribe to a user's personal status.net site (say, I want to subscribe to Evan on his private instance), I need to enter my status.net identity on his site to get the ball rolling. Not that this is difficult, but it seems awkward and annoying.

Solution: Web sites should be able to register URL handlers with the browser. A site could define its own URL protocol string, and provide a callback URL to which links using that protocol string would be sent. This is a direct "port" of the same idea from iPhone interprocess communication, which is what got me thinking about it.

What if my status.net site could tell my browser:

  1. I can handle ostatus: URIs
  2. Send the link to http://me.status.net/subscribe?{url}

The browser could then pop up a dialog:

me.status.net would like to register to handle ostatus: links. Would you like to use me.status.net to handle these links? [Okay] [No Thanks]

I could click to say yes; then, when visiting some other user's ostatus-implementing site, I could click the subscribe link (which looks like ostatus:them.ostatus.com) and the browser would automatically take me to my selected ostatus: handler site and let that site complete the process.

How would the handler site provide this information? One idea could be our friend the <link> tag in the site <head>:

<link rel="url-handler" type="ostatus" href="http://me.status.net/subscribe?{url}" />

Unfortunately, that looks a little janky to me and probably tromps all over the intended semantics of the tag. Perhaps something with <meta>? I'll leave that up to the markup wizards.
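
As it happens, browsers have an API in this general vein: navigator.registerProtocolHandler. Support and allowed schemes vary (current browsers require a "web+" prefix on custom schemes), but a minimal sketch of the registration half of the idea looks like this:

    // Ask the browser to send web+ostatus: links to my site. The "%s"
    // placeholder is replaced with the full URL of the clicked link.
    navigator.registerProtocolHandler(
      "web+ostatus",
      "http://me.status.net/subscribe?url=%s",
      "me.status.net" // human-readable title, required by older browsers
    );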

We could also take this a step further: If my browser knows how to handle ostatus: URLs now, we could improve the UI some, perhaps, by having the browser parse the incoming page for a rel=status link, and give me a nice little [status] icon in the location bar?

[Mockup: a status handler icon in the browser location bar]

Obviously that UI would get crowded kinda fast, but this could be done with a variety of services: like:, follow:, etc. -- as long as the handler can be linked to a particular protocol or service.

Thoughts?

July 01, 2010

The Voice in the Stream

Even though I'm a firm believer in the (distributed) social web, there's something that bugs me about the current implementation of dashboards, news feeds, and activity streams we see now: it's the voices, you see.

The author's voice (also known as writer's voice) is the style in which a story is presented, including, among other things, the syntax, diction, person, and dialogue. -- WikiAnswers

Streams speak to us in the third person. "Your friend posted this, your friend liked that, your friend won this...". It's a fairly objective[1] reportage of events, delivered in the cool tone of a mellow NPR reporter. The stream washes over us, with some number of items (smaller or larger depending on your generation) being interesting enough to inspire us to further action.

[Screenshot: Facebook news feed]

Blogs - with exceptions - speak to us in the first person: "This just happened and I have to tell you about it", "I've been thinking about this for a while, and you should know that...":

[Screenshot: Scripting News blog post]

Back in 2003, Dave Winer wrote:

That is the essential element of weblog writing, and almost all the other elements can be missing, and the rules can be violated, imho, as long as the voice of a person comes through, it's a weblog.

That's a bit out of context for this piece, but it contains the germ of what I was thinking about -- that blogs give you the unfiltered voice of the writer, and we've lost some of that in the rush to streams.

Why is Twitter so compelling? Because it's an unfiltered stream of voices (if you filter out the spam- and ad-bots of course). In many ways, Twitter is a blog aggregator, not a "social network". Twitter doesn't wrap our posts in the third person (imagine every post starting with "your friend tweeted: ...") because there's only one kind of event - a tweet. And Twitter just gives us the tweets [2].

Conversely, I find that the Facebook newsfeed has less impact. There are so many diverse events flowing through Facebook that you need much more context to understand them: "your friend posted this on your other friend's 'wall'...". I think the additional meta processing gets in the way. Often, the automated nature of the system even removes the user's voice completely: "your friend, who plays the game Farmville, achieved a new goal in that game, called 'Tractor!'".

Can this compelling voice be recovered? Is it worth recovering? I think it can, and I think it is. I truly love the web of social activity we've teased from the chaos of the previous decade's web, but there are subtle aspects at play that I don't claim to fully understand yet.

The challenge is to more clearly delineate "events" from "content" -- look at Flickr's new photo page for a good example:

[Screenshot: Flickr photo page with favorites inlined in the comments]

Flickr now inlines favorites in the comment stream for a photo; some users don't like the new feature, but I think they've put a lot of thought into the implementation. Notice that favorites (events) are greyed out and less visually important than the comments (content). This is a simple example, but it helps keep the photo's story (Flickr's term), and the voices of the commenters, intact while providing new information (when a photo was favorited, not just the fact that it was).

Another example, a comparison. Here's Cliqset's "shared item" presentation:

[Screenshot: Cliqset shared-item presentation]

And here's Twitter's famous "retweet":

[Screenshot: a Twitter retweet]

Notice that Cliqset's design puts the action in the third person (due in part to their focus on aggregating Activity Streams, which to date have typically been presented/delivered in the 3rd person). Twitter preserves the original user's voice, then tacks on the meta information (who shared it and when) afterwards. There was some pushback when Twitter first rolled out this feature, but I've come to appreciate and like the way they designed it (and I miss it in Tweetie for Mac!).

As noted above, 99% of the "problem" (if it can be called a problem) is design - find the user's voice and push it to the forefront. The rest is meta clutter.


[1] "Fairly" objective because the motivations of the service do occasionally intrude; Facebook doesn't show us every action that everyone in our network performs on the site, it shows us the bits that their algorithm deems interesting or relevant. Possibly, also, those with a higher chance of getting us to click on ads, though obviously I don't know that.

[2] This may change as Twitter's new annotations make it possible to build apps that post structured non-microblog events to Twitter's event queue. More on that in another post. Maybe. :-)

June 15, 2010

Preserving the Link Economy in the Age of URL Shorteners

Links: The Only Currency on the Web

Adam Curry once said "Links are the only true currency of the web" (hat tip to Dave Winer). When we link to other sites, we're giving value, and our own content may accrue value over time as inbound links increase (or not). Google's famous "PageRank" practically codified the idea by weighting search results based on those inbound links.

These days that value is being made real as the commercialization of the web continues, and the social network services that we pour our hearts (and links!) into are no exception. Google targets ads to us based on our profiles, our email, our searches. Facebook shows us more ads based on our posts, our likes, our friends' posts, ad nauseam. Twitter -- yeah, it's coming.

But this isn't about ads - that's another topic. This is about the fact that generally, we expect that our content - our tweets, our blog posts, our comments, etc - will be passed along as is. They're our words, right? And some of those words may be links. I mean, it is the web here. But that expectation may not be long for this world, based on Twitter's recent announcement that they're going to be wrapping (i.e. re-writing) links in our tweets to route through their own internal URL shortener. This will allow them to find and prevent links to malware and other spam and scams, but it also means that in some ways our content is no longer our own. For example, it means that the version of your tweet in Google's realtime index will be Twitter's link, not yours.

Short Links and Spam

It's true that URL shorteners have been used to hide links to sites containing phishing scams, spam, and actual malware. This works because not only do users not read, they especially don't read that string of mysterious characters in the browser's address bar. But the solution is not to take control of an entire ecosystem of links.

By implementing a system that uses a combination of personal perma-short-links (http://www.monkinetic.com/2010/05/tantek-celik-diso-20-brass-tacks.html) and rel=shortlink, Service Providers have no (malware-related) reason to route every link on the service through their own shortener. The components are already in place to allow publishers to provide their own shortlinks for content and link them to that content, and for Service Providers to validate those links. Consider this flow (a code sketch of the validation step follows the list):

Shortlink Validation

  1. Publisher creates a new page/article, including a <link> element with the rel=shortlink microformat and the page's perma-short-link
  2. Publisher posts a message on Twitter, Buzz, etc. linking to the article with the perma-short-link.
  3. The Service Provider, in the interest of link safety, resolves the shortlink and looks for the rel=shortlink tag in the resolved page, showing that the published shortlink is valid.
  4. If the shortlink is found, then the shortlink can be trusted (assuming the page itself has been checked for malware, etc, and can be vouched for) and the message should be published as-is by the service provider.
  5. If the shortlink cannot be validated, then the link should be wrapped for security reasons and the message published with the wrapped link.
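
Here's a rough sketch of step 3 in JavaScript (node), hand-waving the HTML parsing with a regular expression; a real implementation would parse properly, handle relative redirects, and also check the Link: HTTP header:

    var http = require("http");
    var url = require("url");

    // Resolve the published shortlink and check that the resolved page
    // claims that same shortlink via a rel=shortlink <link> element.
    function validateShortlink(shortUrl, callback) {
      function fetchPage(pageUrl) {
        http.get(url.parse(pageUrl), function (res) {
          if (res.statusCode >= 300 && res.statusCode < 400) {
            return fetchPage(res.headers.location); // follow the redirect
          }
          var html = "";
          res.on("data", function (chunk) { html += chunk; });
          res.on("end", function () {
            var m = html.match(
              /<link[^>]*rel=["']?shortlink["']?[^>]*href=["']([^"']+)["']/i
            );
            callback(m !== null && m[1] === shortUrl);
          });
        });
      }
      fetchPage(shortUrl);
    }

    // true => publish the message as-is; false => wrap the link (steps 4-5)
    validateShortlink("http://ttk.me/t4432", function (ok) { console.log(ok); });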

What I'm describing was hashed out on the microformats wiki, Google Code, and various mailing lists some time ago, and there are even large publishers implementing rel=shortlink. Bringing more publishers and content systems online with these tools is one part of the solution.

In light of the dust-up over Twitter's recent announcements, it's worth pushing publishers to implement perma-short-links and rel=shortlink on their side. And it's important that we press Twitter, the other social messaging systems, and content aggregators in general to respect those links and leave our content alone.

June 14, 2010

Twitter's Track-ware

Go read DeWitt Clinton's piece on URL shorteners, right now. I'll wait.

...

DeWitt makes some of the same points that Tantek made in our recent interview, and that I made in my analysis of Twitter's link-wrapping plans. But there are some aspects that DeWitt touches on that I had not been aware of, and which worry me significantly.

From the introductory post to the developer list regarding the upcoming URL rewriting plans, Twitter notes that they "will be updating the TOS to require you to check t.co and register the click."

The larger context of that quote, from Twitter's Raffi Krikorian, is:

...if you do choose to prefetch all the URLs on a timeline, then, when a user actually clicks on one of the links, please still send him or her through t.co. We will be updating the TOS to require you to check t.co and register the click.

Essentially, Twitter is saying that if you're going to build an app that uses Twitter, you must route the user through their shortener so that they get the Business Intelligence they want. This is their prerogative, but it pains me that neither users nor publishers get a choice. As a publisher, even if I roll my own URL shortener and publish content with my own perma-short-links, Twitter is going to wrap my links in their track-ware, and my readers are subject to it.

More after I calm down, probably.

June 09, 2010

Bleary-Eyed Analysis of Twitter's Link-Wrapping Plans

This is going to be short and only partially thought-out, since I'm currently suffering from deployment-induced sleep deprivation, but that's not important now.

Twitter recently announced that they are going to start wrapping all links in tweets with their own short urls, on the t.co domain. Ostensibly, this is "to detect, intercept, and prevent the spread of malware, phishing, and other dangers." Some initial thoughts:

The Man-in-the-Middle

Twitter is now taking full control, and responsibility, for the content of pages linked in tweets delivered by the service. Today that means malware, phishing scams, etc - all good things to be filtered out - but what will it mean tomorrow?

Better UI for Shortened Links

Twitter intends to provide more information about a shortened link in the UI, a goal I can fully get behind.

...it could be displayed to web or application users as amazon.com/Delivering- or as the whole URL or page title. Ultimately, we want to display links in a way that removes the obscurity of shortened link and lets you know where a link will take you.

But they don't need a url-wrapper to do that. Exhibit A is this example from identi.ca:

[Screenshot: identi.ca's display of a shortened link]

Bit.ly

Bit.ly is clearly in Twitter's sights:

We are also looking to provide services that make use of this data, an example would be analytics within our eventual commercial accounts service.

RT Immunity?

It looks like retweets are immune from the wrapperness? (This is from Ryan Sarver's stream, one of the accounts for which the new t.co shortener is turned on.)

[Screenshot: Twitter / Ryan Sarver: RT @chrismessina: Just ena ...]

Effect on Archives

In general, URL shorteners may be limiting the lifespan of archived microblogging content. I'm thinking about stuff in search engine indexes, the Internet Archive, etc. If those indexers aren't storing the data structure behind the tweet, notice, whatever, and how to interpret it, then swaths of our links may die one day in one fell swoop when short URL services finally succumb to market consolidation. By inserting itself into potentially every tweet, Twitter is (as I've said) taking on a lot of power/responsibility.

Personal Perma-Shortlinks

If you're a publisher (and I count anyone with a web presence, be it casual or commercial), I do think that this is another opportunity to ask yourself "who owns your links?", and to look at building a personal permalink/shortlink system that is under your control. The link landscape is shifting fast, moves like Twitter's are going to shake out some of the players, and you don't want all the links back to your content to die with one of them.

June 08, 2010

The Open Community Can't Beat Facebook. Sort of.

...the open community can't beat Facebook.

But companies using open technologies can - by building better products. Outside the echo chamber of web standards fanatics, the vast majority of web users don’t care about how the web works. They care about their user experience, where their friends are, and when something goes wrong, protecting their privacy.

-- Eran Hammer-Lahav: How The Open Community Can Beat Facebook

Worth a couple of reads. Remember, no matter how Open it is, it's the product that matters to the consumer/end user.

May 28, 2010

TinWhistle (was NewBase60)

Apropos of my recent interview with Tantek Çelik about URL shorteners, I've ported his NewBase60 code (part of his CASSIS project) to perl. It's now found on github at http://github.com/sivy/TinWhistle. This code converts numbers (in this case, dates expressed as days-since-the-epoch) into a base60 (sexagesimal) representation that's really short. For example, '1971-06-29' becomes '94', and '2010-05-26' becomes '45v'.

As described in the interview, this value is used in conjunction with some other bits to create a reversible short URL - meaning that anyone following the algorithm described in that post should be able to derive the URL from the shortlink, and not be dependent on a particular service to resolve them.

This code is pretty complete and converts from DateTime to days-since-epoch, ordinal dates (YYYY-DDD, see the readme), and sexagesimal days.
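
For a taste of the algorithm, here's the core conversion as a quick JavaScript sketch (TinWhistle itself is perl; Tantek's original is cassis.js). It assumes NewBase60's alphabet: digits, uppercase minus I and O, underscore, lowercase minus l:

    // Convert a date to "epoch days" (days since 1970-01-01), then encode
    // the number in NewBase60 using the ambiguity-avoiding alphabet.
    var ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz";

    function toBase60(n) {
      var s = "";
      do {
        s = ALPHABET.charAt(n % 60) + s;
        n = Math.floor(n / 60);
      } while (n > 0);
      return s;
    }

    function epochDays(isoDate) {
      return Math.floor(Date.parse(isoDate + "T00:00:00Z") / 86400000);
    }

    console.log(toBase60(epochDays("1971-06-29"))); // "94"
    console.log(toBase60(epochDays("2010-05-26"))); // "45v"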


Update

I've expanded the code and moved it to a new project name. I've implemented a URL shortener called TinWhistle (roughly based on Tantek's Whistle) that uses NewBase60; NewBase60 is now a sub-component of that project.

May 26, 2010

OpenID Connect: Progress Ensues, Challenges Remain

Earlier this year, Chris Messina started writing about an idea he labeled "OpenID Connect". From Chris's original post:

for the non-tech, uninitiated audiences: OpenID Connect is a technology that lets you use an account that you already have to sign up, sign in, and bring your profile, contacts, data, and activities with you to any compatible site on the web.

OpenID currently allows users to accomplish the first two tasks in Chris's list: sign up and sign in, optionally giving the relying party some basic information about themselves. The idea behind "OpenID Connect" is to make the last group of tasks easier: rework the OpenID sign-in process to build on OAuth and provide a standard API that gives the relying site access to some user-controlled subset of a user's personal information, contacts, and activities.
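
To make that concrete, here's a hypothetical sketch of the relying-party side once the OAuth dance is done. The endpoint path, token style, and response fields are invented for illustration - they're not from any spec:

    // Hypothetical: exchange the OAuth access token for profile data at
    // one standard endpoint. URL and field names are illustrative only.
    var https = require("https");

    function fetchProfile(providerHost, accessToken, callback) {
      https.get({
        host: providerHost,
        path: "/userinfo",
        headers: { "Authorization": "Bearer " + accessToken }
      }, function (res) {
        var body = "";
        res.on("data", function (chunk) { body += chunk; });
        res.on("end", function () {
          callback(JSON.parse(body)); // e.g. { id, name, profile_url, ... }
        });
      });
    }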

Several weeks ago David Recordon (Open Platforms lead for Facebook) released a self-proclaimed "strawman spec" for OpenID Connect. This is an unfinished document designed to get a deeper discussion started about the ideas and problems involved.

The early draft below is meant to inspire and help revitalize the OpenID community. It isn't perfect, but hopefully it's a real starting point.

The spec explains the tech with what seems to be a minimum of handwaving, but there remain conceptual challenges to overcome if OpenID Connect is going to gain traction, and actually solve the issues it's trying to address.

The Challenge: Manage Conflicting Goals

To be successful, OpenID Connect must manage two conflicting goals:

  1. Be technologically and economically attractive to relying parties
  2. Continue to support user privacy and identity controls

The first goal is to be attractive for relying parties (sites that would support OpenID Connect as a sign-in option) to implement. This means that support for the new protocol must make economic sense to the RPs. Consider Chris's comments on the value of user data to the relying parties:

on Facebook, the liberal defaults are meant to make Facebook users’ accounts more valuable to relying parties than other, more privacy-preserving account configurations.

Relying sites need to know that if they spend the time building in support for your auth system, the registered users they get through your system are going to bring value to their site. What's the only real value on the web? Information. Facebook provides the most "valuable" users right now because their defaults provide 3rd-party sites with the most information about the newly signed-in user.

The second goal that OpenID Connect needs to address is to continue the history of user control and privacy that OpenID was built on:

OpenID is a decentralized standard, meaning it is not controlled by any one website or service provider. You control how much personal information you choose to share with websites that accept OpenIDs, and multiple OpenIDs can be used for different websites or purposes. -- Benefits of An OpenID

While Facebook's model is to share the greatest amount of information possible about a user by default, OpenID's strength over the years has been the high level of control and choice offered to users, even if the user experience was never at the level it needed to be. Stand-alone providers like http://myopenid.com offered useful interfaces to select an identity to log in with, and even which fields to pass back to the relying site. This was valuable and necessary functionality that Facebook users concerned with privacy may finally be coming to appreciate.

I sincerely hope that OpenID Connect and related technologies will encourage identity providers and relying parties to give the user the tools to understand and manage their own privacy in this new, wide-open information landscape, or I fear it's going to just become a new front-end to the data-devouring machine that is the current state of the social web.

May 25, 2010

Tantek Celik on DiSo 2.0: Down to Brass Tacks

Several months ago I published the first part of an interview series with (recent Mozilla hire!) Tantek Çelik about his in-progress ideas on what he calls "DiSo 2.0". This is the next part of that series, and the third article in a series on The Future of DiSo.

I think all of these big changes (and several smaller ones) have made it clear that we need to update the notion of what a DiSo implementation should both look like and do. -- Interview: Tantek Celik, Conceptualizing DiSo 2.0

So, onto part 2 of my interview with Tantek, in which we start discussing the "brass tacks" of the technologies we might consider DiSo 2.0:

DiSo 2.0: Technologies

In your initial post/tweet about DiSo 2.0, you tossed out a list of 10 pieces of technology/technique that you saw as foundational to the next iteration of the distributed social environment we've labeled "DiSo". We've talked some about your vision for DiSo -- now I'd like to focus in on some of the technologies you're building on (conceptually and in code).

#1 personal site+shortlink domains

Steve Ivy: The foundation for DiSo continues to be a personal site - whether self-hosted or otherwise. This sort of goes to the core idea of your web presence being your own, and you're not just "share-cropping" (http://en.wikipedia.org/wiki/Sharecropping) for a corporation. Can you expand on this idea and how having your own shortlink domain fits in?

Tantek Çelik: Historically the need for "short" URLs is not new. But what we mean by "short" has certainly changed a lot recently.

Old email systems would wrap text at 80 characters (or a few less) and make it just harder enough to reliably reconstruct or use URLs that many systems adopted a common practice of keeping URLs to 70 characters or less by design.

This was fundamentally usability driven.

Shorter URLs in email are easier to use and more reliable. They're nicer in IM too.

And browser screenshots. I've retyped URLs from screenshots in slides, I'm sure many of you have too.

How about print? Ever typed in a URL from a book?

Or advertising. Magazine spreads or billboards - URLs are ubiquitous.

The easier a URL is to read and type in, the more folks visit it.

But again, this is nothing new. Ever since the dotcom boom URLs have become a part of our visual language (much to the chagrin of linguists I'm sure).

Then Twitter rewrote our brains to think in 140 characters and suddenly every one of them counted.

And two things happened:

  1. URL shortener services showed up which would trim any URL down to a small handful of characters, saving your precious tweetspace for your own words. Everyone started using them. Twitter and clients started auto-shortening URLs.

  2. We started to understand just how fragile these shorteners are, and how they break the web. How many shortener sites have died, taking their links with them to the bit bucket? Even tr.im, which is keeping the lights on longer than others, is set to shut down in 2010. It was frustration with tr.im's downtimes and then end-of-service announcement that led me to this realization: It's not good enough to have your own URL; you need to have your own shortener as well.

This isn't just for independents. Companies and hosting services should have their own too. The first big site to realize and do this right was Flickr, and many have followed.

The key here is that when you own and host your own shortener for links to your content, you're not adding any more fragility to the web. If your shortener goes down, your site probably is down as well. They're tied together. No additional risk. Unless you use a database for the shortenings and you lose your database because you were unwilling (or unable) to pay the DBA tax to maintain it. We'll talk more about the DBA tax problem in due time.

But why is it important to own the shortened links to your content? Why not just always share your full "long" URLs?

In short:

  1. You can't always do so. E.g. Twitter now auto-shortens many URLs.
  2. Shorter URLs tend to be better for sharing (for all the reasons discussed at the top).

And that #2 is where we get to DiSo.

A couple of the key architectural components of DiSo 2.0 are:

  1. Publish on your own site, own your URLs, your permalinks, and
  2. Syndicate out to other sites. Your text updates to Twitter, your checkins to Foursquare, your photos to Flickr etc.

The direction of the content flow is very important here, as it has to do with ownership, and what's the original vs. what's just a copy.

It's ok to sharecrop copies, especially when the copies link back to your original. That's called distribution.

It's not ok to sharecrop the original and aggregate copies on your own site. You're still sharecropping and you're still beholden/vulnerable to those 3rd party sites going down, censoring your content, renaming you, or being blocked by some nationwide internet filtering firewall.

Now some folks think the "aggregate all your stuff out there onto your own site" approach is the way to go, and frankly, this is definitely something worth exploring, because it's probably easier to build.

On the one hand you've got vaporkickstartware like Diaspora (talking about "scraping Twitter and Flickr") on the other hand there are actual shipping implementations, like Movable Type's "Action Streams" plugin. It's actually quite a nice piece of work, and doesn't look half bad at that. You can see it live on the personal sites of Mark Paschal and David Recordon. MT Action Streams explores many of the user interface issues around activity streams - issues that need exploring, regardless of which way the content is flowing.

But it's not DiSo. Any aggregation-based solution is still beholden (vulnerable) to the silos where you sharecrop your content.

In a DiSo solution, when you syndicate your content out to other sites, the key is that those syndicated copies of your content link back to the original. Permalinks serve this role for blog posts. For short text updates that you syndicate to Twitter or Identi.ca etc., you need perma-short-links. And that's where your own shortener is essential.

#2 algorithmic URL shortener

SI: So, when I hear "algorithmic" here I assume you mean "dependably reversible" - that you can get from the shortlink to the full link using an algorithm, not just looking it up in a big key-value map, right? Why is this a foundational part of DiSo?

TÇ: One of the key emphases of the DiSo 2.0 I've outlined is maintainability. Fewer moving parts, fewer magic hidden files, fewer things that can inexplicably fail = more independents successfully running and owning their own sites, identities, and web presences over time.

Nearly all (maybe all?) open source URL shorteners today use a database to store the pairs of "short code" and "actual URL". If you lose that database, forget to back it up, have some bad database code that corrupts it etc., your shortlinks are gone, dead, useless.

If instead you create and use a URL shortener to create shortlinks that are algorithmically reversible, and then document that algorithm, publicly, then anyone can figure out how to expand your shortlinks. If they happen upon them on some random site, they can expand them and look for the original, or at least know that you're linking to the same thing that a normal permalink somewhere else is expressing.

In addition, all manner of browser or aggregator tools and sites that currently have to manually resolve shortlinks by calling the APIs of their services can save the bandwidth and time and simply decode your URLs themselves.

Once again, Flickr set a very good example with their http://flic.kr/ shortener for Flickr photos.

In fact, their doing so inspired me within days to grab http://ttk.me/ and set it up to redirect to my site http://tantek.com/, knowing I would eventually (as I have) add various shortening services to it.

Similarly I encourage every independent out there, everyone who wants to install and/or run their own DiSo implementation (like Falcon), to go ahead and not just grab a domain name for themselves, but also grab a shortener domain too. Set it up to redirect to your primary domain for now.

Two more things that Kellan got right in the Flickr shortener which I've also found inspiration in:

  1. just a "/p/" to indicate "Photo" presumably (clever idea to prefix like that to allow for other prefixes to do other things)
  2. and then a Base58 compressed photo id.

Regarding 1, I've also settled on one-character "spaces" for different types of URLs. "p" for photo makes sense to re-use. After quite a bit of personal research into what types of content are different enough and used often enough to warrant their own short URL spaces, I've come up with about 20 different content types, each with their own letter. I've decided to release them mostly as I implement them.

Here are a few examples from my content-type short codes:

  • b - blog post, article (structured, with headings), essay
  • i - identifier - on another system using subdirectories as system id spaces
    • i/i/ - compressed ISBN numbers
    • i/a/ - compressed ASIN numbers
  • p - photo
  • t - text, (plain) text, tweet, thought, note, unstructured, untitled

I decided to keep my text note "t" shortener as short as possible, which meant dropping a trailing "/".

After that I use a 3-digit sexagesimal (Base60) number to represent the date, in a manner deliberately scaled to human lifetimes. Why Base60? Lots of reasons, including print-safety (as mentioned above). Want to read the entire derivation and reasons why? See http://tantek.pbworks.com/NewBase60 (includes open source CASSIS implementation).

Why 3 sexagesimal digits to represent the date? It turns out that 3 sexagesimal digits are capable of representing over 500 years of days - plenty overengineered for any human lifetime. And if anyone does figure out how to live more than 500 years, I have a feeling that person will not only not much resemble humans as we know them, but will either have bigger problems to deal with than URL shortener limitations, or will be so smart that they will come up with a better solution.

But for now, for our feeble less than 200 year lifetimes, this is good enough. In addition we can even agree on a day zero that computes well with existing platforms. Unix Epoch start: 1970-01-01. Given that no-one published anything to the web before 1990, I think we're ok with that. What happens in a few hundred years? Perhaps people can pick their own day zeroes as they see fit.

Thus the 3 characters after the "t" represent the number of days since 1970-01-01 in sexagesimal - what I'm calling "epoch days".

Finally I allow for 1 (or 2, but haven't needed it yet) more sexagesimal digit to indicate the nth ordinal post of that type for that day. Thus:

http://ttk.me/tSSSn

  • SSS = sexagesimal epoch days
  • n = nth post that day

This is sufficient to expand to:

http://tantek.com/YYYY/DDD/tn/

  • YYYY = year
  • DDD = day of the year (see related: http://en.wikipedia.org/wiki/ISO_8601#Ordinal_dates )
  • n = nth post that day

Which I then redirect server-side to a longer URL with post keywords (AKA "slug") on the end.

E.g.

http://ttk.me/t4432

is

  • t - text note
  • 443 - epoch day 443 in base60, the 34th day of 2010
  • 2 - 2nd text note that day

thus expands to:

http://tantek.com/2010/034/t2/

which is enough for Falcon to retrieve the post from the hAtom store, where it also gets the keyword/slug phrase for the post and uses it to redirect to:

http://tantek.com/2010/034/t2/diso-2-personal-domains-shortener-hatom-push-relmeauth


Thanks, Tantek!

There's more to come as we continue to explore the Future of DiSo. Stay tuned...


Update: I've written a quick perl implementation of the algorithm described in this post:

  • TinWhistle - perl URL shortener library based heavily on Tantek's ideas, and significantly on his cassis.js code.
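
And for the curious, here's the expansion side as a quick JavaScript gloss (mine, not Tantek's cassis.js), reproducing the t4432 example from the interview:

    // Expand a ttk.me-style path ("t" + three base60 epoch-day digits +
    // an nth-post digit) into the ordinal-date long URL.
    var ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz";

    function fromBase60(s) {
      var n = 0;
      for (var i = 0; i < s.length; i++) {
        n = n * 60 + ALPHABET.indexOf(s.charAt(i));
      }
      return n;
    }

    function expand(path) {
      var type = path.charAt(0);                 // e.g. "t" for text note
      var days = fromBase60(path.substr(1, 3));  // days since 1970-01-01
      var nth = path.substr(4);                  // nth post of that day
      var date = new Date(days * 86400000);      // that epoch day, in UTC
      var year = date.getUTCFullYear();
      var ordinal = Math.floor(
        (date.getTime() - Date.UTC(year, 0, 1)) / 86400000) + 1;
      var ddd = ("00" + ordinal).slice(-3);      // zero-pad to 3 digits
      return "http://tantek.com/" + year + "/" + ddd + "/" + type + nth + "/";
    }

    console.log(expand("t4432")); // http://tantek.com/2010/034/t2/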
