The interoperability of social networks

Google recently added a caustic warning message when users attempt to export their Google Contacts to Facebook:

Hold on a second. Are you super sure you want to import your contact information for your friends into a service that won’t let you get it out?

Facebook allows users to download their personal information (photos, profile info, etc.) but has been fiercely protective of the social graph (you can’t download friends, etc.). The downloaded data arrives in a .zip file – hardly a serious attempt to interoperate using modern APIs (update: Facebook employee corrects me/clarifies in comments here). In contrast, Google has taken an aggressively open posture with respect to the social graph, calling Facebook’s policy “data protectionism.”

The economic logic behind these positions is a straightforward application of Metcalfe’s law, which states that the value of a network is proportional to the square of the number of nodes in the network*. A corollary to Metcalfe’s law is that when two networks connect or interoperate, the smaller network benefits more than the larger network does. If network A has 10 users, then according to Metcalfe’s law its “value” is 100 (10*10). If network B has 20 users, then its value is 400 (20*20). If they interoperate, network A gains 400 in value but network B only gains 100 in value. Interoperating is generally good for end users, but assuming the two networks are directly competitive – one’s gain is the other’s loss – the larger network loses.
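The arithmetic above can be reproduced in a few lines. This is only a sketch of the simplified accounting used in this post (each network is credited with the other's standalone value when they connect), with the proportionality constant taken as 1; the function names are made up for illustration.

```python
def metcalfe_value(users: int) -> int:
    """Value of a network under Metcalfe's law, with the constant taken as 1."""
    return users * users


def interop_gains(users_a: int, users_b: int) -> tuple:
    """Gain for (A, B) under the post's accounting:
    each network gains the standalone value of the other."""
    return metcalfe_value(users_b), metcalfe_value(users_a)


gain_a, gain_b = interop_gains(10, 20)
print(gain_a, gain_b)  # 400 100

# Relative to what each network was worth before connecting:
print(gain_a / metcalfe_value(10))  # 4.0  (A gains four times its own value)
print(gain_b / metcalfe_value(20))  # 0.25 (B gains a quarter of its own value)
```

The relative figures are what make the corollary bite: the smaller network's gain dwarfs its starting value, while the larger network's gain is marginal.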

A similar network interoperability battle happened last decade among Instant Messaging networks. AIM was the dominant network for many years and refused to interoperate with other networks. Google Chat adopted open standards (Jabber) and MSN and Yahoo were much more open to interoperating. Eventually this battle ended in a whimper — AIM never generated much revenue, and capitulated to aggregators and openness.  (Capitulating was probably a big mistake – they had the opportunity to be as financially successful as Skype or Tencent).

Google might very well genuinely believe in openness. But it is also strategically wise for them to be open in layers that are not strategic (mobile OS, social graph, Google docs) while remaining closed in layers that are strategic (search ranking algorithm, virtually all of their advertising services).

When Google releases their long-awaited new social network, Google Me, expect an emphasis on openness. This could create a rich ecosystem around their social platform that could put pressure on Facebook to interoperate. True interoperability would be great for startups, innovation, and – most importantly – end users.

* Metcalfe’s law assumes that every node is connected to every other node and that each connection is equally valuable. Real-world networks are normally not like this. In particular, social networks are much more clustered and therefore have somewhere between linear and quadratic utility growth with each additional user.

Web services should be both federated and extensible

One of the most important developments of the web 2.0 era is the proliferation of full-featured, bidirectional APIs. APIs provide a way to “federate” web services from a single website to a distributed network of 3rd party sites. Another important web 2.0 development is the proliferation of web Apps (e.g. Facebook Apps). Apps provide a way to make websites “extensible.”

The next step in this evolution is to create web services that are both federated (APIs) and extensible (Apps).

In my ideal world, the social graph would not be controlled by a private company. That said, Facebook, to its credit, has aggressively promoted a fairly open API through Facebook Connect. Facebook has also been a leader in promoting Apps. For Facebook, creating extensible, federated services would mean providing a framework for Facebook Connect Apps – apps that extend Facebook functionality but reside on non-Facebook.com websites.

Consider the following scenario.  Imagine that in the future a geolocation data/algorithm provider like SimpleGeo takes Facebook Places check-in data and, using algorithms and non-Facebook data, produces new data sets, for example: map directions, venue recommendations, and location-based coupons. The combination of Facebook’s data (social graph and check-ins) and SimpleGeo data/algorithms would create much more advanced feature possibilities than either service acting alone.

With today’s APIs, if, say, Gowalla wanted to integrate Facebook plus SimpleGeo into their app*, they would basically have 3 choices:

1) Embed Facebook widgets in Gowalla.  These are simple iframes (effectively separate little websites) that don’t interact with SimpleGeo.  Gowalla would just have to sit and wait and hope that Facebook decided to bake in SimpleGeo-like functionality.

2) Pre-import SimpleGeo data. This significantly limits the size and dynamism of the SimpleGeo data sets and doesn’t incorporate SimpleGeo algorithms, thus severely limiting functionality.

3) Host an instance of SimpleGeo’s servers internally.  This requires heavy technical integration, undermining the main benefit of APIs.

In a world of extensible APIs (or “API Apps”), Gowalla could instead send Facebook data back to SimpleGeo.  The data flow would look something like this:

(Note how there are three parties involved – @peretti calls this a “data threesome”). This configuration is much simpler to integrate – and potentially much more powerful and dynamic – than the other configurations listed above. You could implement this today, but it would create user experience challenges. For example, Gowalla would be sending Facebook data to a 3rd party (step 3), which might (depending on the data sent) require explicit user opt-in. Things become more onerous if SimpleGeo wanted to share its own user data with Gowalla. That would require an additional OAuth authorization with SimpleGeo (authorizing step 4).
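The three-party flow can be sketched in code. Steps 3 and 4 follow the description above; steps 1 and 2 (the app requesting check-in data from Facebook and receiving it) are my assumption about what the earlier steps look like. Every class, method, and token format here is a hypothetical stand-in, not a real Facebook, Gowalla, or SimpleGeo API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FacebookStub:
    """Hypothetical stand-in for Facebook's API."""
    checkins: Dict[str, List[str]]

    def get_checkins(self, user: str, token: str) -> List[str]:
        # Steps 1-2 (assumed): the app requests check-ins the user authorized.
        if token != f"fb-token-for-{user}":
            raise PermissionError("user has not authorized Facebook access")
        return self.checkins.get(user, [])


@dataclass
class SimpleGeoStub:
    """Hypothetical stand-in for a geolocation data/algorithm provider."""
    nearby: Dict[str, List[str]]

    def recommend(self, checkins: List[str], token: Optional[str]) -> List[str]:
        # Step 4: returning the provider's own data needs a second authorization.
        if token is None:
            raise PermissionError("user has not authorized SimpleGeo access")
        seen = set(checkins)
        recs: List[str] = []
        for venue in checkins:
            for candidate in self.nearby.get(venue, []):
                if candidate not in seen and candidate not in recs:
                    recs.append(candidate)
        return recs


def app_recommendations(user, facebook, simplegeo, fb_token, share_opt_in, sg_token):
    checkins = facebook.get_checkins(user, fb_token)
    # Step 3: forwarding Facebook data to a 3rd party may need explicit opt-in.
    if not share_opt_in:
        raise PermissionError("user has not opted in to sharing Facebook data")
    return simplegeo.recommend(checkins, sg_token)


facebook = FacebookStub({"alice": ["cafe", "museum"]})
simplegeo = SimpleGeoStub({"cafe": ["bakery", "museum"], "museum": ["gallery"]})
recs = app_recommendations("alice", facebook, simplegeo,
                           "fb-token-for-alice", True, "sg-token")
print(recs)  # ['bakery', 'gallery']
```

Note that the user has to clear three separate permission checks in this sketch, which is exactly the user experience problem: each one is a potential extra screen.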

Allowing websites to be federated and extensible will open up a whole new wave of innovation. Ideally some spec like OAuth could include the multiple authorizations in a single authorization screen. Facebook could also do this by allowing 3rd parties to be part of the Facebook Connect authorization process. Inasmuch as Facebook seems to be trying to embed their social graph as deeply as possible into the core experiences of other websites, allowing extensible APIs would seem to be a smart move.

* I have no connection to any of these companies (Facebook, Gowalla, SimpleGeo) and have no knowledge of their product plans beyond their public websites.  I am imagining functionality that Gowalla and SimpleGeo might include someday but for all I know they have no interest in these features – I just picked them somewhat arbitrarily as examples.

Graphs

It has become customary to use “graph” to refer to the underlying data structures at social networks like Facebook. (Computer scientists call the study of graphs “network theory,” but on the web the word “network” is used to refer to the websites themselves).

A graph consists of a set of nodes connected by edges. The original internet graph is the web itself, where webpages are nodes and links are edges. In social graphs, the nodes are people and the edges are friendships. Edges are what mathematicians call relations. Two important properties that relations can either have or not have are symmetry (if A ~ B then B ~ A) and transitivity (if A ~ B and B ~ C then A ~ C).

Facebook’s social graph is symmetric (if I am friends with you then you are friends with me) but not transitive (I can be friends with you without being friends with your friend). You could say friendship is probabilistically transitive in the sense that I am more likely to like someone who is a friend’s friend than I am to like a user chosen at random. This is the basis of Facebook’s friend recommendations.
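To make "probabilistically transitive" concrete, here is a minimal friend-of-friend sketch that ranks non-friends by how many mutual friends they share with a user. This is a textbook heuristic I am using for illustration, not a description of how Facebook actually computes recommendations; the function name and sample data are invented.

```python
from collections import Counter


def recommend_friends(friends: dict, user: str, top_n: int = 3) -> list:
    """Rank people the user is not yet friends with by mutual-friend count."""
    mine = friends.get(user, set())
    counts = Counter()
    for friend in mine:
        for fof in friends.get(friend, set()):
            if fof != user and fof not in mine:
                counts[fof] += 1
    return [name for name, _ in counts.most_common(top_n)]


# A symmetric friendship graph stored as adjacency sets.
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan", "eve"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat"},
    "eve": {"bob"},
}

print(recommend_friends(friends, "ann"))  # ['dan', 'eve']
```

Dan ranks first because he shares two friends with Ann (Bob and Cat), while Eve shares only one.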

Twitter’s graph is probably best thought of as an interest graph. One of Twitter’s central innovations was to discard symmetry: you can follow someone without them following you. This allowed Twitter to evolve into an extremely useful publishing platform, replacing RSS for many people. The Twitter graph isn’t transitive but one of its most powerful uses is retweeting, which gives the Twitter graph what might be called curated transitivity.
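The two properties are easy to check mechanically if a graph is stored as a set of directed (from, to) pairs. The sketch below uses tiny made-up edge sets, one shaped like mutual friendships and one shaped like one-way follows; the helper names are my own.

```python
def is_symmetric(edges: set) -> bool:
    """True if every edge (a, b) has a matching edge (b, a)."""
    return all((b, a) in edges for (a, b) in edges)


def is_transitive(edges: set) -> bool:
    """True if a ~ b and b ~ c always implies a ~ c."""
    return all(
        (a, d) in edges
        for (a, b) in edges
        for (c, d) in edges
        if b == c
    )


# Mutual friendships: every edge is stored in both directions.
friendships = {("ann", "bob"), ("bob", "ann"), ("bob", "cat"), ("cat", "bob")}

# One-way follows: ann follows bob, bob follows cat.
follows = {("ann", "bob"), ("bob", "cat")}

print(is_symmetric(friendships), is_transitive(friendships))  # True False
print(is_symmetric(follows), is_transitive(follows))          # False False
```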

Graphs can be implicitly or explicitly created by users. Facebook and Twitter’s graphs were explicitly created by users (although Twitter’s Suggested User List made much of the graph de facto implicit). Google Buzz attempted to create a social graph implicitly from users’ emailing patterns, which didn’t seem to work very well.

Over the next few years we’ll see the rising importance of other types of graphs. Some examples:

Taste: At Hunch we’ve created what we call the taste graph. We created this implicitly from questions answered by users and other data sources. Our thesis is that for many activities – for example deciding what movie to see or blouse to buy – it’s more useful to have the neighbors on your graph be people with similar tastes versus people who are your friends.

Financial Trust: Social payment startups like Square and Venmo are creating financial graphs – the nodes are people and institutions and the relations are financial trust. These graphs are useful for preventing fraud, streamlining transactions, and lowering the barrier to accepting non-cash payments.

Endorsement: An endorsement graph is one in which people endorse institutions, products, services or other people for a particular skill or activity. LinkedIn created a successful professional graph and a less successful endorsement graph. Facebook seems to be trying to layer an endorsement graph on its social graph with its Like feature. A general endorsement graph could be useful for purchasing decisions and hence highly monetizable.

Local: Location-based startups like Foursquare let users create social graphs (which might evolve into better social graphs than what Facebook has since users seem to be more selective about friending people in local apps). But probably more interesting are the people and venue graphs created by the check-in patterns. These local graphs could be useful for, among other things, recommendations, coupons, and advertising.
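As an illustration of how an implicit graph like the taste graph might be built, the sketch below links users whose answers overlap, using Jaccard similarity (shared answers divided by total distinct answers). This is a generic technique chosen for the example, not a description of Hunch's actual method; the names and data are invented.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two answer sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def taste_neighbors(answers: dict, user: str, k: int = 2) -> list:
    """Return the k users whose answers are most similar to the given user's."""
    scores = [
        (jaccard(answers[user], other_answers), other)
        for other, other_answers in answers.items()
        if other != user
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scores[:k]]


answers = {
    "ann": {"sci-fi", "espresso", "hiking"},
    "bob": {"sci-fi", "espresso", "jazz"},
    "cat": {"romcom", "tea"},
    "dan": {"sci-fi", "hiking", "espresso", "jazz"},
}

print(taste_neighbors(answers, "ann"))  # ['dan', 'bob']
```

Nobody declared these edges; they fall out of the answers, and the resulting neighbors need not be friends at all.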

Besides creating graphs, Facebook and Twitter (via Facebook Connect and OAuth) created identity systems that are extremely useful for the creation of 3rd party graphs. I expect we’ll look back on the next few years as the golden age of graph innovation.

Facebook is about to try to dominate display ads the way Google dominates text ads

It is customary to divide online advertising into two categories: direct response and brand advertising. I prefer instead to divide it according to the mindset of users: whether or not they are actively looking to purchase something (i.e. they have purchasing intent).*

When users are actively looking to purchase something, they typically go to search engines or e-commerce sites. Through advertising or direct sales, these sites harvest intent. Google and Amazon are the biggest financial beneficiaries of intent harvesting.

When the user is not actively looking to buy something, the goal of an online ad is to generate intent. The intent generation market is still fairly fragmented and will grow rapidly over the next few years as brand advertising increasingly moves online. P&G – which alone spends almost $4B/year on brand advertising – needs to convince the next generation of consumers that Crest is better than Colgate. This is why Google paid such a premium for DoubleClick, Yahoo for Right Media, and Microsoft for aQuantive (MS’s biggest acquisition ever).

In 2003, Google introduced AdSense, a program to syndicate their intent harvesting text ads beyond Google’s main property Google.com.  The playbook they followed was: use their popular website to build a critical mass of advertisers; then use that critical mass to run an off-property network that offers the highest payouts to publishers. AdSense became so dominant that competitors like Yahoo quit the syndicated ad business altogether. Today, Google has such a powerful position that they don’t disclose percentage revenue splits to publishers and extract the vast majority of the profits.

It is widely believed that Facebook will soon follow the AdSense playbook by introducing an off-property ad network. They’ll try to use their strong base of advertisers to dominate intent generating ads the way AdSense dominated intent harvesting ads.

But to win the intent generation ad battle, data is as important as a critical mass of advertisers. For intent harvesting, users simply type what they are looking for into a search box. For intent generating ads, you need to use data to make inferences about what might influence the user.

This is what the introduction of the Facebook Like button is all about. Intent generating ads – which mostly means display ads – have notoriously low click-through rates (well below 1%). Attempts to improve these numbers through demographics have basically failed. Many startups are having success using social data to target ads today. But the holy grail for targeting intent generating ads is taste data – which basically means what the user likes. Knowing, for example, that a user liked Avatar is an incredibly useful datapoint for targeting an Avatar 2 ad.

Publishers who adopt Facebook’s Like feature may get more traffic and perhaps a better user experience as a result.  But they should hope the intent generation ad market doesn’t end up like the intent harvesting ad market – with one dominant player commanding the lion’s share of the profits.

* Most text ads are about intent harvesting and most display ads are about intent generation, but they are not coreferential distinctions. For example, with techniques like “search retargeting” (you do a Google search for washing machines and then later on another site see a display ad for washing machines), sometimes intent harvesting is delivered through display ads.

Facebook, Zynga, and buyer-supplier hold up

The brewing fight between Facebook and Zynga is what is known in economic strategy circles as “buyer-supplier hold up.” The classic framework for analyzing a firm’s strategic position is Michael Porter’s Five Forces. In Porter’s framework, Zynga’s strategic weakness is extreme supplier concentration – they get almost all their traffic from Facebook.

It is in Facebook’s economic interest to extract most of Zynga’s profits, leaving them just enough to keep investing in games and advertising. Last year’s reduced notification change seemed like one move in this direction as it forced game makers to buy more ads instead of getting traffic organically. This probably hurt Zynga’s profitability but also helped them fend off less well-capitalized rivals. Facebook could also hold up Zynga by entering the games business itself, but this seemed unlikely since thus far Facebook has kept its features limited to things that are “utility like.”

The way Facebook now seems to be holding up Zynga – requiring Zynga to use their payments system –  is particularly clever.  First, payments are still very much a “utility like” feature, and arguably one that benefits the platform, so it doesn’t come across as flagrant hold up. It is also clever because – assuming Facebook has insight into Zynga’s profitability – Facebook can charge whatever percentage gets them an optimal share of Zynga’s profits.

The risk for Zynga is obvious — if they don’t diversify their traffic sources very soon, they are left with a choice between losing profits and losing their entire business.  But there is a risk for Facebook as well. If buyers of traffic (e.g. app makers) fear future hold up, they are less likely to make investments in the platform. The biggest mistake platforms make isn’t charging fees (Facebook) or competing with complements (Twitter), it’s being inconsistent.  Apple also charges 30% fees but they’ve been mostly consistent about it. App makers feel comfortable investing in the Apple platform and even having most of their business depend on them in a way they don’t on Facebook or Twitter.