# Graphs

It has become customary to use “graph” to refer to the underlying data structures at social networks like Facebook. (Computer scientists call the study of graphs “network theory,” but on the web the word “network” is used to refer to the websites themselves.)

A graph consists of a set of nodes connected by edges. The original internet graph is the web itself, where webpages are nodes and links are edges. In social graphs, the nodes are people and the edges are friendships. Edges are what mathematicians call relations. Two important properties that relations can either have or not have are symmetry (if A ~ B then B ~ A) and transitivity (if A ~ B and B ~ C then A ~ C).
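To make the two properties concrete, here is a minimal sketch that stores a relation as a set of ordered pairs and checks each property directly against its definition. The function names and the sample data are made up for illustration.

```python
def is_symmetric(edges):
    """True if every edge (a, b) has a matching edge (b, a)."""
    return all((b, a) in edges for (a, b) in edges)


def is_transitive(edges):
    """True if (a, b) and (b, c) together always imply (a, c)."""
    return all(
        (a, d) in edges
        for (a, b) in edges
        for (c, d) in edges
        if b == c
    )


# Hypothetical friendship graph: every friendship is mutual,
# but Alice and Carol are not friends with each other.
friends = {("alice", "bob"), ("bob", "alice"), ("bob", "carol"), ("carol", "bob")}

# Hypothetical follow graph: Alice follows Bob, who does not follow back.
follows = {("alice", "bob")}

print(is_symmetric(friends))   # friendship here is mutual
print(is_transitive(friends))  # a friend's friend is not automatically a friend
print(is_symmetric(follows))   # following is one-directional
```

The checks are brute force (the transitivity test compares every pair of edges), which is fine for a toy example but not for a graph of any real size.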

Facebook’s social graph is symmetric (if I am friends with you then you are friends with me) but not transitive (I can be friends with you without being friends with your friend). You could say friendship is probabilistically transitive in the sense that I am more likely to like someone who is a friend’s friend than I am to like a user chosen at random. This is the basis of Facebook’s friend recommendations.
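One simple way to act on probabilistic transitivity is to rank the people you are not yet friends with by how many friends you share with them. The sketch below does that. It is an illustration of the idea, not a description of how Facebook actually computes recommendations, and the graph and the `recommend_friends` helper are hypothetical.

```python
from collections import Counter


def recommend_friends(graph, user, limit=3):
    """Rank non-friends of `user` by the number of mutual friends."""
    friends = graph[user]
    mutual_counts = Counter()
    for friend in friends:
        for candidate in graph[friend]:
            # Skip the user themselves and people they already know.
            if candidate != user and candidate not in friends:
                mutual_counts[candidate] += 1
    return [name for name, _ in mutual_counts.most_common(limit)]


# Hypothetical symmetric graph stored as adjacency sets.
graph = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave", "erin"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol"},
    "erin": {"bob"},
}

print(recommend_friends(graph, "alice"))  # ['dave', 'erin']
```

Dave ranks first for Alice because two of her friends know him, while Erin is reachable through only one.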

Twitter’s graph is probably best thought of as an interest graph. One of Twitter’s central innovations was to discard symmetry: you can follow someone without them following you. This allowed Twitter to evolve into an extremely useful publishing platform, replacing RSS for many people. The Twitter graph isn’t transitive but one of its most powerful uses is retweeting, which gives the Twitter graph what might be called curated transitivity.
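Curated transitivity can be sketched with a directed graph: a tweet reaches me if I follow its author, or if someone I follow chose to retweet it. The data and the `timeline_authors` helper below are hypothetical, and the model ignores everything about real timelines except this one rule.

```python
def timeline_authors(follows, retweets, user):
    """Authors whose tweets can reach `user`: the accounts they follow,
    plus any author retweeted by one of those accounts."""
    followed = follows.get(user, set())
    via_retweet = set()
    for account in followed:
        via_retweet |= retweets.get(account, set())
    return followed | (via_retweet - {user})


# Directed edges: follower -> accounts followed (no symmetry required).
follows = {"alice": {"bob"}, "bob": {"carol", "dave"}}

# Bob retweets Carol but not Dave.
retweets = {"bob": {"carol"}}

print(sorted(timeline_authors(follows, retweets, "alice")))  # ['bob', 'carol']
```

Alice sees Carol without following her, but not Dave: the edge from Alice to Carol exists only because Bob curated it.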

Graphs can be implicitly or explicitly created by users. Facebook’s and Twitter’s graphs were explicitly created by users (although Twitter’s Suggested User List made much of the graph de facto implicit). Google Buzz attempted to create a social graph implicitly from users’ emailing patterns, which didn’t seem to work very well.

Over the next few years we’ll see the rising importance of other types of graphs. Some examples:

Taste: At Hunch we’ve created what we call the taste graph. We created this implicitly from questions answered by users and other data sources. Our thesis is that for many activities – for example deciding what movie to see or blouse to buy – it’s more useful to have the neighbors on your graph be people with similar tastes rather than people who are your friends.

Financial Trust: Social payment startups like Square and Venmo are creating financial graphs – the nodes are people and institutions and the relations are financial trust. These graphs are useful for preventing fraud, streamlining transactions, and lowering the barrier to accepting non-cash payments.

Endorsement: An endorsement graph is one in which people endorse institutions, products, services or other people for a particular skill or activity. LinkedIn created a successful professional graph and a less successful endorsement graph. Facebook seems to be trying to layer an endorsement graph on its social graph with its Like feature. A general endorsement graph could be useful for purchasing decisions and hence highly monetizable.

Local: Location-based startups like Foursquare let users create social graphs (which might evolve into better social graphs than what Facebook has since users seem to be more selective about friending people in local apps). But probably more interesting are the people and venue graphs created by the check-in patterns. These local graphs could be useful for, among other things, recommendations, coupons, and advertising.
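As a rough illustration of how an implicit graph like the taste graph above might be built, the sketch below connects two users whenever their answers overlap enough. Jaccard similarity and the 0.5 threshold are assumptions chosen for simplicity, not Hunch's actual method, and the answer data is invented.

```python
from itertools import combinations


def jaccard(a, b):
    """Fraction of the combined answers that two users share (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def taste_edges(answers, threshold=0.5):
    """Connect every pair of users whose answer overlap meets the threshold."""
    return {
        (u, v)
        for u, v in combinations(sorted(answers), 2)
        if jaccard(answers[u], answers[v]) >= threshold
    }


# Hypothetical answers to taste questions, one set per user.
answers = {
    "alice": {"scifi", "thai", "indie"},
    "bob": {"scifi", "thai", "jazz"},
    "carol": {"romcom", "sushi", "pop"},
}

print(taste_edges(answers))  # {('alice', 'bob')}
```

Nobody in this example declared a relationship. The edge between Alice and Bob falls out of their answers, which is what makes the graph implicit.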

Besides creating graphs, Facebook and Twitter (via Facebook Connect and OAuth) created identity systems that are extremely useful for the creation of 3rd party graphs. I expect we’ll look back on the next few years as the golden age of graph innovation.

I can’t remember the last time the tech world was so interesting. First, innovation is at an all time high.  Apple, Google, Facebook, Twitter and even Microsoft (in the non-monopoly divisions) are making truly exciting products. Second, since the battles are between platforms, the strategic issues are complex, involving complementary network effects.

Twitter’s moves this week were particularly interesting. A lot of third-party developers were unhappy. I think this is mainly a result of Twitter having sent mixed signals over the past few years. Twitter’s move into complementary areas was entirely predictable – it happens with every platform provider. The real problem is that somehow Twitter had convinced the world they were going to “let a thousand flowers bloom” – as if they were a non-profit out to save the world, or that they would invent some fantastic new business model that didn’t encroach on third-party developers. This week Twitter finally started acting like what it is: a well-financed company run by smart capitalists.

This mixed signaling has been exacerbated by the fact that Twitter has yet to figure out a business model (they sold data to Microsoft & Google but these are likely just one-time R&D purchases). Maybe Twitter thinks they know what their business model is and maybe they’ll even announce it soon. But whatever they think or announce will only truly be their business model when and if it delivers on their multi-billion dollar aspirations. It will likely be at least a year or two before that happens.

Normally, when third parties try to predict whether their products will be subsumed by a platform, the question boils down to whether their products will be strategic to the platform. When the platform has an established business model, this analysis is fairly straightforward (for example, here is my strategic analysis of Google’s platform).  If you make games for the iPhone, you are pretty certain Apple will take their 30% cut and leave you alone. Similarly, if you are a content website relying on SEO and Google Adsense you can be pretty confident Google will leave you alone. Until Twitter has a successful business model, they can’t have a consistent strategy and third parties should expect erratic behavior and even complete and sudden shifts in strategy.

Hopefully Twitter “fills holes” through acquisitions instead of internal development. Twitter was a hugely clever invention and has grown its user base at a staggering rate, but on the product development front it has been underwhelming. Buying Tweetie seemed to be a tacit acknowledgement of this weakness and an attempt to rectify it. Acquisitions also have the benefit of sending a positive signal to developers since at least some of them are embraced and not just replaced.

What’s Facebook doing during all of this? Last year, Facebook seemed to be frantically copying Twitter – defaulting a lot of information to public, creating a canonical namespace, etc. Now that Twitter seems to be mimicking Facebook, Facebook’s best move is probably just to sit back and watch the Twitter ecosystem fight amongst itself. As Facebooker Ivan Kirigin tweeted yesterday: “I suppose when your competition is making huge mistakes, you should just stfu.”

Disclosure: As with everything I write, I have a ton of conflicts of interest, some of which are listed here.

# Search and the social graph

Google has created a multibillion-dollar economy based on keywords.  We use keywords to find things and advertisers use keywords to find customers.  As Michael Arrington points out, this is leading to increasing amounts of low quality, keyword-stuffed content. The end result is a very spammy internet. (It was depressing to see Tim Armstrong cite Demand Media, a giant domain-name owner and robotic content factory, as a model for the new AOL.)

Some people hope the social web — link sharing via Twitter, Facebook etc — will save us.  Fred Wilson argues that “social beats search” because it’s harder to game people’s social graph.  Cody Brown tweeted:

On Twitter you have to ‘game’ people, not algorithms. Look how many followers @demandmedia has. A lot less then you guys: @arrington @jason

These are both sound points. Lost amid this discussion, however, is that the links people tend to share on social networks – news, blog posts, videos – are in categories Google barely makes money on. (The same point also seems lost on Rupert Murdoch and news organizations who accuse Google of profiting off their misery.)

Searches related to news, blog posts, funny videos, etc. are mostly loss leaders for Google. Google’s real business is selling ads for plane tickets, DVD players, and malpractice lawyers. (I realize this might be depressing to some internet idealists, but it’s a reality.) Online advertising revenue is directly correlated with finding users who have purchasing intent. Google’s true primary competitive threats are product-related sites, especially Amazon. As it gets harder to find a washing machine on Google, people will skip search and go directly to Amazon and other product-related sites.

This is not to say that the links shared on social networks can’t be extremely valuable.  But most likely they will be valuable as critical inputs to better search-ranking algorithms. Cody’s point that it’s harder to game humans than machines is very true, but remember that Google’s algorithm was always meant to be based on human-created links. As the spammers have become more sophisticated, the good guys have come to need new mechanisms to determine which links are from trustworthy humans. Social networks might be those new mechanisms, but that doesn’t mean they’ll displace search as the primary method for navigating the web.

# Why does it matter that Twitter is supplanting RSS?

The other day I claimed that Twitter is supplanting RSS, and that long term that’s a bad thing.  Andrew Weissman had a very reasonable response:

Twitter is the most open application people are currently using. It’s open on the way in and the way out. The variety of applications using the Twitter api are astounding in that they cover many use cases.

Given that, why will Ashton and Oprah someday care?

The problem is Twitter isn’t really open. For Twitter to be truly open, it would have to be possible to use “Twitter” without in any way involving Twitter the institution. Instead, all data goes through Twitter’s centralized service. Today’s dominant core internet services – the web (HTTP), email (SMTP), and subscription messaging (RSS) – are open protocols that are distributed across millions of institutions. If Twitter supplants RSS, it will be the first core internet service that has a single, for-profit gatekeeper.

Why would this matter to Ashton or Oprah?  Imagine if Microsoft Exchange server wasn’t just an instantiation of SMTP but was a centralized service that all email had to pass through.  A single institution is never as reliable as a system distributed across millions of institutions.   Nor is it as secure – for example, a distributed denial-of-service attack can much more easily bring down one service than the entire internet.

But most importantly, having one company control a core internet service hinders competition and therefore innovation.  To continue the Microsoft Exchange analogy – do you think in that world we would have such a diverse email ecosystem if everyone had to go through Microsoft to build stuff?

And this is all true while we are still living in the fantasy land where everything involving Twitter is free.  At some point Twitter will need to make lots of money to justify their valuation.  Then we can really assess the impact of having a single company control a core internet service.