My Data Is My Data: Why Facebook, Twitter, and Google+ Are Bad For The Web

We just can’t get along

Facebook doesn’t want to talk to Google. Twitter doesn’t want to talk to Tumblr, Facebook, Google, or, really, anyone. Google only wants to talk to you.

It’s like a bad high school drama. You’ve got jocks, geeks, nerds, goths, whatever, and none of them know how to communicate with each other. I mean, of course they know how, but none of them really wants to make the effort.

I’m not going to extend the simile past its breaking point, but you get the picture.

For all our talk of a new web, of sharing, of the bright and shiny future of all things internet, it seems that we’ve forgotten that these companies and their associated networks are all actively fighting each other. Not only that, but they’re fighting each other with our data.

Everyone understands how networks work, right? The more people on a network, the more useful the network becomes (at least most of the time). Especially if that network is something like Facebook, where identities aren’t as fluid as on something like Reddit or, in the extreme, 4chan.

Even Reddit benefits from network effects (the same way Slashdot and Digg did). It’s not that you can find a lot of people, but that you can find a lot of interesting bits of text, you can interact with users from other cultures and backgrounds and experiences, and you can discover new and funny and significant things.

But on another level a network doesn’t really benefit simply from your presence. If you don’t do anything, what’s the point? You’re just a tiny dot in the database somewhere. If you, for instance, have a Facebook account and never do anything with it, you’re probably actively making the network worse. Facebook’s goal, after all, is to keep you on Facebook so they can sell you ads. And the way they do that is to get you to interact, whether it’s with people, games, or content. Everything that Facebook does points at that single goal. Keep you there and sell you ads.

The thing is, the multi-billion dollar empires of Facebook, Twitter, and to some extent even Google, are built on your data. When you interact, you create data. You post something, that’s data. That’s your creative expression. When you share something, the fact that you shared that is (or at least should be) your data. When you make a friend, that activity is a datapoint about you, and should be your data. No-one should be able to own these bits of data, and no-one should be able to say “you can’t tell anyone about this or that”.

In effect, since they won’t talk to each other, that’s exactly what these social networks are doing. Tried to export your friend list from Facebook lately? It gives you some data, yes, but as little as possible. Just a list of names. It’s essentially so little information as to be completely useless.

Tried to export your follower/following lists from Twitter? No such luck. I mean, you can go copy and paste them if you want, but from a data interchange point of view, that’s like driving from Toronto to Tulsa to buy a fridge. It’s doable, yes, but it’s definitely not worth it. And more to the point, it’s completely non-repeatable and non-automatable.

This makes sense for the companies involved. For Facebook, there’s nothing in the universe more important than Facebook’s data. It defines their network. It makes them powerful. The same thing for Twitter. They don’t really want you to export your tweets to somewhere. They want you to import but not export, because your tweets make them powerful.

Google on the other hand wants to slurp all this data up and give you search results while (you guessed it!) selling you ads. Google is a microcosm for the problem that we want to solve.

Google offers a free service with high utility. It searches the tangled depths of the internet to give you relevant information. It wants to be able to see into your Facebook and your Twitter so it can serve up better and more relevant search results.

This is a service they can’t offer because Facebook and Twitter do not allow Google inside their walled gardens. Sure, Google can index the public bits of Facebook and Twitter, but that’s only the tip of the iceberg of what exists on those networks. Especially Facebook, where you can control privacy settings.

Google would love to return personal search results based on your personal social graph. Their wet dream is unfettered access to the depths of Facebook’s database so they can return information that is relevant not only to everyone, but also relevant to you and only you.

You may not want Google to do this. Or you may. Either way, the point is moot. Google can’t provide this service because they simply can’t get that data, and frankly, there’s no way for you to tell Google “Okay, go in and suck out all my data”. Facebook would shut that down 20 seconds after launch.

Again, you may not want Google doing this. But the point is you don’t have the option.

Or let’s say you want to create some kind of meta-journal of your life. You want to integrate what you email, what you tweet, where you go, what you eat, who you’re with… and dump that all into your own personal database. Then when you’re 80 you can look back at your boring life and feel terrible. Whatever.

You can’t really do that right now. I mean, you kind of can, and some people are trying really hard to make a part of that happen (I’m looking at you, ThinkUp), but it’s difficult. It’s also not becoming easier. Twitter and Facebook and Google aren’t allowing more of your data to escape their walled gardens. It’s less. And from what you can see from Twitter’s shuttering of the various anonymous data access techniques, the amount and type of data is declining.

This is not an optimal state of affairs. It’s not good. It’s not even close to good.

Why this is bad, part 1

The title of this blog post makes it clear: My data is my data. You can build your empire on top of it, you can use it to monetize me, you can use it to sell me things, but none of that changes the fact that it’s my data. It’s my creative work. It describes me. It describes my life. It describes the things that I do. And this data is mine.

How can Facebook, Google, Twitter, or any other social network (but especially the Big 3 social networks where all the utility lies) deny me the right to do what I want with that data? How dare they tell me what I can and cannot export and to whom I can and cannot export it? Who does Facebook think they are that they won’t let Google index at the very least my profile and actions?

Again, I understand why these networks do this. But having a good reason is not the same as doing a good thing.

Twitter is, I think, probably the most noticeably and publicly terrible at this. Not only are they technically incompetent (I can’t access most of the tweets I’ve made), but they’re also technically evil, constantly revising their API to limit what I can access and how I can access it and how often and on and on and on.

All this because Facebook and Twitter have gone the way of traditional media companies and decided that I’m not the customer. They don’t want to have a relationship with me. They want to sell me or at the very least sell my eyeballs. They don’t want me to do this or do that or do that other thing because it dilutes their network’s value. Their display guidelines exist to make sure that I must see their ads (I won’t, but only because I use other means), and their access restrictions exist to prevent 3rd party clients from gaining too much market share and becoming powerful within the ecosystem.

Why this is bad, part 2

Look at the internet. The internet is one of those rare examples of the free market working properly because the oligarchies of the time either didn’t understand what was going on, or didn’t realise its value.

If the internet oligarchies of today’s network space existed when the internet was being built, we would still have AOL, Compuserve, and whatever else. They wouldn’t talk to each other. They’d preserve their own network’s value, but they wouldn’t work together to enhance the value of the network as a whole.

I can imagine a proto-internet that works this way. It’s not pretty. Each side clinging to its various silos, desperately trying to steal network value from one another.

The benefits of our internet and the way it works together are obvious. Anyone trying to create silos out of the internet via hardware (China, Iran) or software (Facebook, Twitter, etc) is doing the network as a whole a disservice. They’re giving the finger to the hard work of all the people who invented this wonderful thing, and all the benefits the internet has given us.

Can you imagine being unable to send someone in Sweden an email because you don’t subscribe to Sweden’s network, or you haven’t bought the “talk to Sweden email upgrade package”?

Ridiculous, right? Who would use email?

Facebook, Twitter, Google+… all these services that capture data and activity into silos and keep it to themselves are doing the same thing. They are diluting the value of the internet as a whole because they keep the data to themselves.

Why this is bad, part 3

This kind of network effect causes splintering effects too. This is counter-intuitive, but let me explain my thinking here.

In the physical world, large-scale networks take a lot of money to develop and maintain. Build-out and operation are capital and labour intensive. This is why new and disruptive physical networking technologies tend to piggyback on existing networks. Take the internet as an example: It resided at first on telephone networks and then cable networks and finally mobile networks not because those were necessarily the best technologies available (with the notable exception of mobile networks, which were literally the only technology available) but because they were there. (As an interesting side-note, eventually phone and cable networks will reverse this and piggyback completely on the internet, but that’s neither here nor there.)

Of course, the rules that make things difficult and costly in the real world don’t apply in the digital world. In the same way that digitally copying music and tv costs pennies on the dollar compared to physical distribution, building a massive network on top of the internet is fairly easy. Except that now instead of being one of the only games in town, you’re one of many games in town. And instead of being vulnerable only to buyouts and bankruptcy, you’re vulnerable to your users simply leaving.

There’s no lock-in anymore. You can try to create lock-in, and trust me, they’re trying, but it’s not the same as being locked into a three year mobile plan on one of three networks, all of which are essentially the same (but in the case of mobile phones, at least they interoperate!).

Because there’s no barrier to entry and no lock-in, creating a new network-on-top-of-the-network Facebook-style has become the latest internet trend. There is a veritable host of up-and-comers. They all want to be the next big thing. And none of them talk to each other.

In retrospect, it’s obvious why Google created Google+. Google may not have even wanted to ever create a social network. Google may have even been averse to it. But they had to. They need that data. They need your interactions. And as always, it’s to serve ads to you.

The unwillingness of Facebook and Twitter to share that data directly led to a massive elephant entering the social networking space. This is splintering in action, a failure of the free market in digital economies, all caused by a massive tragedy of the commons that no-one except some FOSS crackpots cared about for a long time.

Time to start caring

I’ve heard a lot of suggestions about what to do here. Consumer rage. Developer boycotts. User boycotts. Site boycotts.

None of that is going to work. I think we’re going to have to go to that third party that seems to always end up mediating tragedies of the commons: The Government.

Listen, there’s no other way around it. We’ll have to legislate this. Come up with a flexible export format for user data that can be pinged, let’s say X number of times per day or week. Let’s make this part of consumer protection legislation.
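
To make that a little more concrete, here’s a rough sketch of what I have in mind. None of this is a real standard or anyone’s actual API. The names (UserExport, ExportQuota), the fields, and the one-day window are all made up for illustration. The idea is just a plain, network-agnostic record of what you posted, shared, and who you’re connected to, plus a cap on how many times per window it can be pulled.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class UserExport:
    """Hypothetical portable export of one user's data from one network."""
    network: str
    user_id: str
    posts: list = field(default_factory=list)        # things you wrote
    shares: list = field(default_factory=list)       # things you shared
    connections: list = field(default_factory=list)  # friends / followers

    def to_json(self) -> str:
        # Plain JSON so any other service (or your own database) can read it.
        return json.dumps(asdict(self), sort_keys=True)


class ExportQuota:
    """Allows at most `limit` exports per user per window (the "X times per day")."""

    def __init__(self, limit: int, window_seconds: int = 86400):
        self.limit = limit
        self.window_seconds = window_seconds
        self._requests = {}  # user_id -> timestamps of recent allowed exports

    def allow(self, user_id: str, now: float) -> bool:
        # Keep only the requests that still fall inside the current window.
        recent = [t for t in self._requests.get(user_id, [])
                  if now - t < self.window_seconds]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self._requests[user_id] = recent
        return allowed


if __name__ == "__main__":
    quota = ExportQuota(limit=2)  # assumed: two exports per day
    export = UserExport(network="example", user_id="u1",
                        posts=[{"text": "hello"}], connections=["u2"])
    if quota.allow("u1", now=0.0):
        print(export.to_json())
```

The actual format and the value of X would be whatever the legislation (or the threat of it) gets the networks to agree on. The point is only that the record belongs to the user and doesn’t care which network it came from.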

At the very least the threat of a legislated solution would bring the parties to the table and allow us to make a better web.

In the end, I just want to be able to do what I like with the data that I created. I wasn’t paid for it. I did it all for free. There’s no work for hire or anything like that here. I just want my data and I want to do whatever I like with it.

That’s not too much to ask, is it?