Chris Dixon

Inferring intent on mobile devices

[Google CEO Eric] Schmidt said that while the Google Instant predictive search technology helps shave an average of 2 seconds off users’ queries, the next step is “autonomous search.” This means Google will conduct searches for users without them having to manually conduct searches. As an example, Schmidt said he could be walking down the streets of San Francisco and receive information about the places around him on his mobile phone without having to click any buttons. “Think of it as a serendipity engine,” Schmidt said. “Think of it as a new way of thinking about traditional text search where you don’t even have to type.”  - eWeek

When users type phrases into Google, they are searching, but also expressing intent. To create the “serendipity engine” that Eric Schmidt envisions would require a system that infers users’ intentions.

Here are some of the input signals a mobile device could use to infer intent.

Context

Location: It is helpful to break location down into layers, from the most concrete to the most abstract:

1) lat / long – raw GPS coordinates

2) venue – mapping of lat / long coordinates to a venue.

3) venue relationship to user – is the user at home, at a friend’s house, at work, in her home city etc.

4) user movement – locations the user has visited recently.

5) inferred user activity – if the user is at work during a weekday, she is more likely in the midst of work. If she is walking around a shopping district on a Sunday away from her home city, she is more likely to want to buy something. If she is outside, close to home, and going to multiple locations, she is more likely to be running errands.
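To make the layering concrete, here is a minimal sketch of how the five layers could be carried in one record and turned into a guess about activity. Everything in it is an assumption for illustration: the `LocationContext` fields, the label strings such as "away_city" and "shopping_district", and the hard if/else rules. A real system would produce probabilities ("more likely") rather than a single label.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class LocationContext:
    """Hypothetical record holding the location layers described above."""
    lat: float                             # layer 1: raw GPS coordinates
    lon: float
    venue: Optional[str] = None            # layer 2: venue resolved from lat/long
    venue_category: Optional[str] = None   # e.g. "shopping_district" (assumed label)
    relationship: str = "unknown"          # layer 3: "home", "work", "near_home", "away_city"
    recent_venues: List[str] = field(default_factory=list)  # layer 4: recent movement
    outdoors: bool = False


def infer_activity(ctx: LocationContext, now: datetime) -> str:
    """Layer 5: a rule-based guess at what the user is doing."""
    is_weekday = now.weekday() < 5   # Monday is 0, Sunday is 6
    is_sunday = now.weekday() == 6

    if ctx.relationship == "work" and is_weekday:
        return "working"
    if (ctx.relationship == "away_city"
            and ctx.venue_category == "shopping_district"
            and is_sunday):
        return "shopping"
    if (ctx.relationship == "near_home"
            and ctx.outdoors
            and len(set(ctx.recent_venues)) >= 2):
        return "running_errands"
    return "unknown"
```

The rules mirror the three examples in layer 5 one for one; anything they do not cover falls through to "unknown" rather than guessing.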

Weather: during inclement weather the user is less likely to want to move far and more likely to prefer indoor activities.

Time of day & date: around mealtimes the user is more likely to be considering what to eat. On weekends the user is more likely to be doing non-work activities. Outside at night, the user is more likely to be looking for a bar, club, movie, etc. Time of day also tells you which venues are open and closed.

News events near the user: the user is at a pro sporting event, an accident happened nearby, etc.

Things around the user: knowing not just venues, but activities (soccer game), inventories (Madden 2011 is in stock at BestBuy across the street), events (concert you might like is nearby), etc.

These are just a few of the contextual signals that could be included as input signals.

Taste

The more you know about users’ tastes, the better you can infer their intent. It is silly to suggest a great sushi restaurant to someone who dislikes sushi. At Hunch we model taste with a giant matrix. One axis is every known user (the system is agnostic about which ID system: it could be Facebook, Twitter, a mobile device, etc.), the other axis is things, defined very broadly: product, person, place, activity, tag, etc. Each cell of the matrix holds either the known or the predicted affinity between the person and the thing. (Hunch’s matrix currently has about 500M people, 700M items, and 50B known affinity points.)
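As a rough illustration of the user-by-thing matrix, the sketch below stores known affinities sparsely and fills in a missing cell from users with similar rows. This is not Hunch's method, which isn't described here; it is plain user-based collaborative filtering swapped in as a stand-in. The -1 to 1 affinity scale, the names `cosine` and `affinity`, and the sample data are all assumptions.

```python
from math import sqrt
from typing import Dict, Optional

# Sparse matrix: user -> thing -> affinity, assumed to lie in [-1, 1].
Affinities = Dict[str, Dict[str, float]]


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rows of the matrix."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def affinity(matrix: Affinities, user: str, thing: str) -> Optional[float]:
    """Return the known affinity, or predict it from similar users."""
    row = matrix.get(user, {})
    if thing in row:
        return row[thing]            # known cell
    weighted_sum = total_weight = 0.0
    for other, other_row in matrix.items():
        if other == user or thing not in other_row:
            continue
        sim = cosine(row, other_row)
        if sim <= 0:                 # only users with positively similar tastes count
            continue
        weighted_sum += sim * other_row[thing]
        total_weight += sim
    return weighted_sum / total_weight if total_weight else None


matrix: Affinities = {
    "alice": {"sushi": -1.0, "ramen": 1.0},
    "bob":   {"sushi": -1.0, "ramen": 1.0, "sashimi_bar": -0.8},
    "carol": {"sushi": 1.0, "ramen": -1.0, "sashimi_bar": 0.9},
}
print(affinity(matrix, "alice", "sashimi_bar"))  # negative: alice's tastes match bob's
```

At the scale quoted above, a dict of dicts and an all-pairs similarity scan would not hold up; the point is only the shape of the data and the known-versus-predicted distinction.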

Past expressed intent

- App actions:  e.g. user just opened Yelp, so is probably looking for a place to go.

- Past search actions: user’s recent (desktop & mobile) web searches could be indications of later intent.

- Past “saved for later” actions:  user explicitly saved something for later e.g. using Foursquare’s “to do” functionality.

Behavior of other people

- Friends:  The fact that a user’s friends are all gathered nearby might make her want to join them.

- Tastemates: That someone with similar tastes just performed some actions suggests the user is more likely to want to perform the same actions.

- Crowds: The user might prefer to go toward or avoid crowds, depending on mood and taste.

How should an algorithm weight all these signals? It is difficult to imagine this being done effectively any way except empirically, through a feedback loop. So the system suggests some intent, the user gives feedback, and then the system learns by adjusting signal weightings and gets smarter.  With a machine learning system like this it is usually impossible to get to 100% accuracy, so the system would need a “fault tolerant” UI.  For example, pushing suggestions through modal dialogs could get very annoying without 100% accuracy, whereas making suggestions when the user opens an application or through subtle push alerts could be non-annoying and useful.
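One simple way to picture that feedback loop is an online logistic model: each signal gets a weight, the system scores a suggestion, and every accept or dismiss nudges the weights. The sketch below is only one possible choice of learner, not a claim about how any real product does it. The class name `IntentScorer`, the signal names, and the learning rate are assumptions.

```python
import math
from typing import Dict


class IntentScorer:
    """Hypothetical online learner that weights input signals from user feedback."""

    def __init__(self, learning_rate: float = 0.1) -> None:
        self.weights: Dict[str, float] = {}   # signal name -> learned weight
        self.learning_rate = learning_rate

    def predict(self, signals: Dict[str, float]) -> float:
        """Estimated probability that the user wants this suggestion."""
        z = sum(self.weights.get(name, 0.0) * value
                for name, value in signals.items())
        return 1.0 / (1.0 + math.exp(-z))

    def feedback(self, signals: Dict[str, float], accepted: bool) -> None:
        """Adjust signal weights after the user accepts or dismisses a suggestion."""
        error = (1.0 if accepted else 0.0) - self.predict(signals)
        for name, value in signals.items():
            self.weights[name] = (self.weights.get(name, 0.0)
                                  + self.learning_rate * error * value)


scorer = IntentScorer()
lunch = {"mealtime": 1.0, "opened_yelp": 1.0}   # assumed signal names
for _ in range(100):
    scorer.feedback(lunch, accepted=True)       # user keeps accepting lunch suggestions
print(scorer.predict(lunch))                    # climbs well above the initial 0.5
```

Because the score is a probability rather than a yes/no, it also fits the fault-tolerant UI idea: a low score can be shown quietly or not at all instead of interrupting the user.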

  • http://twitter.com/ajayjapan Ajay Chainani

    The biggest barrier to this is the battery life of the phone. When you create an iPhone application that monitors your location changes in the background, Apple makes you explicitly state that: “Continued use of GPS running in the background can dramatically decrease battery life.” There are ways around this, like checking at certain time intervals. But this is definitely in our future. If Google doesn’t do it, I will.

    • http://technbiz.blogspot.com paramendra

      “If Google does not do it, I will.” I like that.

    • http://www.cdixon.org chris dixon

      Certainly battery life will get dramatically better and this issue will go away.

      • http://resumecvservice.com/ resume writing service

        you think? i suppose that everything is not that easy… unfortunately

      • http://twitter.com/hajons hajons

        It will? How? Better screens, quadcore cpus, multiple cpus, more sensors etc, all point to the opposite, at least for the next few years. Energy harvesters, nuclear batteries, may become an alternative but that will take time.

      • http://borasky-research.net/2011/01/13/project-kipling-alpha-test-is-now-in-suse-studio-ddj-datajourno/ znmeb

        Battery life, storage size and bandwidth are the issues here. You either compute stuff, store it on the device or you fetch it from a server over expensive wireless infrastructure. Throw in highly regulated carriers that are structured by law as utilities vs. Apple and Google that are “growth companies”, a public that is easily creeped out by “intelligent” devices and being “tracked” and you have a complex ecosystem. It’s a perfect recipe for “life on the edge of chaos”, in other words. ;-)

    • http://twitter.com/DavidYKay David Y. Kay

      There are other techniques, including GeoIP and cellular triangulation that are far less battery-intensive. While they’re not as accurate as GPS, they are much more suitable for background position tracking.

      Moreover, it’s interesting to consider more passive/server-side means such as tweets and check-ins that broadcast your location without a GPS fix. That is, to supplement the primary location data from the device itself, you could monitor other channels for location fixes.

      On a similar note, I believe that in order to help quell the battery problem, there should be a universal repository of one’s location. I believe Latitude has aspirations in this direction. The use case being that whenever you geolocate, rather than each app storing your location, each app uploads your coordinates to the Latitude API, updating your canonical location. Afterwards, other apps can just query the central location data, rather than the current situation where each app is querying the hardware directly.

  • http://technbiz.blogspot.com paramendra

    2015: A Mobile Tech Company Will Storm The Room http://goo.gl/fb/10pOg

  • http://twitter.com/mgershoff Matt Gershoff

    Treat the problem as a POMDP (maybe factorized or hierarchical) learned via some sort of explore/exploit policy augmented with active learning. Not easy, but might be the right way to think of it?

  • http://twitter.com/titanas Stefanos Kofopoulos

    There is a better way to get over all these obstacles and get the system to know faster, efficiently and without waiting for the software to be 100% accurate – neither the batteries to achieve the so called magical fuel cell efficiency. It’s not a software issue at all.

    • http://www.cdixon.org chris dixon

      I agree battery life is important especially for background processes required for all this data collection but still seems like a major software/algorithm issue.

      • http://twitter.com/titanas Stefanos Kofopoulos

        True but i think there’s something everybody keeps missing, a tiny detail that provides an ultra cheap solution to the problem (money/time/battery wise). I keep wondering if this is one of those things that make you wonder “how didn’t i think of that first” or i’m simply day dreaming. IMHO a whole startup can be built on top of this

      • http://twitter.com/mgershoff Matt Gershoff

        Agreed. You need to both predict the ‘state’ of the individual (their intent) as well as the optimal action conditioned on that state. Since ‘state’ will be a latent construct in this situation, it won’t be a trivial task to derive this info. One will probably need to expend resources to probe the environment in order to reduce the uncertainty in the state estimates as well as to determine estimates for the best course of action given a specific user in that state.

        • http://twitter.com/titanas Stefanos Kofopoulos

          …or size down the environment itself

          • http://twitter.com/mgershoff Matt Gershoff

            Sure, By using some sort of function approximation you can reduce or at least generalize over the feature space – if I get your comment.

            • http://twitter.com/titanas Stefanos Kofopoulos

              What i meant was that people move less, shorter distances and limited space. With physical space where people move sizing down and detailed amounts of information about specific areas growing, algorithms won’t need to be that smart on average because uncertainty is limited more and more (on average). Plus, the fact that we experience more and more of the world through screens instead of the actual space around us, makes signals like accurate GPS info not as important as they might seem.

              I guess what i’m trying to say is that modern life is turning into more of a low entropy system on average rather than high.

  • http://inuvi.com Mark Westling

    Google’s worried about user entry speed because that’s the bottleneck in any of their searches. For companies that don’t have Google’s computing resources, though, you could get a speed-up on the result side by predicting intent and then pre-computing and caching results. Computer chess programs work this way: instead of throwing away the search tree between moves, they’ll keep the subtrees that match likely moves by their opponents and save themselves the bother of recomputing if, in fact, the opponent makes a predicted move. This isn’t as sexy as guiding a user’s UI entry but it’s a clever way to save computing time.

    • http://www.cdixon.org chris dixon

      But I think the point is to remove need for typing and also increase “search” frequency by pushing vs pulling searches. Not sure what you describe addresses this.

      • http://inuvi.com Mark Westling

        Yes, from Google’s standpoint, that’s what matters. I was just remarking on how to use intent inference in a useful way that doesn’t depend on a clever UI (or risk annoying a user if not-so-clever). Not useful for Google but possibly useful for others.

  • http://www.facebook.com/profile.php?id=3114404 Ben Fisher

    Wonderful post, but I wonder if this really is “intent” as much as a mere highly targeted suggestion that the user then says “yeah, I like that.” The only way that Google could read intent is by creating it via suggestions. That is, Google, or Hunch for that matter, may know the types of things you want really well in the environment you are in and therefore have a probability calculated that you will say yes. But that is not much different than super duper advertising, compared to, say, someone independent of those influences deciding to undertake an action.

    • http://www.cdixon.org chris dixon

      agree. it’s a lot like super advertising. but i guess advertising at
      its extreme of goodness is like an intelligent bot looking out for
      stuff you like.

  • http://twitter.com/omarelamri Omar El Amri

    Hi Chris,

    Great post. I also attended your NYU Startup panel: great insight!

    I have a request for a post unrelated to this topic. It’d be great if you could elaborate on the node system for the Hunch Taste Graph. I founded a startup called At (my profile picture) and I’m working on a platform called KnolFlow, based on my thesis: “Knowledge Flow: The Multidimensional Representation of Knowledge Modelled after Cerebral Neural Networks.” Let me elaborate on this to tell you why I’m asking for an explanation of the Taste Graph node system: Basically, the KnolFlow graph nodes are divided into three categories: Objects, Methods, and Common Sense. Nodes are connected on multiple dimensions. For example, the object node, Pablo Picasso, is connected on the “Endeavors” dimension to the nodes that represent his artwork, i.e. Guernica, etc… And on the “family tree” dimension, he is connected to his wife, kids, parents, etc… Simple. The Common Sense category deals with the nature of Objects and their Methods. For example, humans are connected to the Animal Kingdom, and to their methods such as Run, Eat, Talk. Each Method has properties, such as Velocity and Acceleration for Run.

    Now I have studied all of the graphs out there and have come to the conclusion that at this point only two graphs are mature enough and sophisticatedly structured enough to be integrated into KnolFlow, if it were to ever be as successful as I would like it to be. They are Facebook’s Social Graph, and Hunch’s Taste Graph. And I’d like to integrate Hunch into KnolFlow on two levels actually: at the Object level and into Common Sense. First, the taste connections that you’re building will add tremendous data on the nature of the most important objects on KnolFlow, humans. This is at the Common Sense level, of course. Moreover — and this is my most immediate concern — if you’ve already mapped out a large amount of objects, it would be better for me to integrate them into KnolFlow at a one-stop-shop from Hunch, rather than to crawl for them on their respective — often greatly inconsistent — graphs. After all, a local business on the Yelp graph is just the same as its counterpart on the Foursquare graph, although the connections are semantically different, but those can come later.

    So what I am asking is, if you guys at Hunch even reveal this type of information, for you to explain how you structure your nodes irrespective of the Taste Graph. Are they in SQL tables? Or in purely XML documents? How do you organize them? Are they connected? For example, is a local McDonald’s a child node of the McDonald’s corporation? Is “The Dark Knight” a child node of Warner Bros.? And finally, do you plan on building an API to open up your nodes, not necessarily your connections, to third parties?

    I don’t know what your plans for Hunch are, but I do know that if you license your graph that would present a tremendous opportunity for a platform like mine. However, because I know you like to reference Metcalfe’s Law and how opening up your graph will benefit the smaller graphs more, I must say that it could open up Hunch to a wide array of very lucrative applications. I’m building KQL, or Knowledge Query Language, that will allow people to query KnolFlow, and I can imagine so many applications of the taste graph within that. For example, in Ad serving, roughly something along the lines of: SELECT advertisement FROM ALL adagencies WHERE product IS NODE OF dixon, chris IN DIMENSION taste OR likes. Something like that would serve an ad of a product that you probably like according to Hunch and Facebook Likes. In the case of Hunch, it would query your data type in the Taste dimension of Common Sense.

    This comment is beginning to look like an essay so I better end here. There’s so much I’d like to get your insight and advice on, but you’re extremely hard to reach. I guess if I could shamelessly ask you for one more thing it would be for you to publish your reading list that everyone talks about. I’m especially interested in what you read in cognitive science.

    Thanks a lot,
    - Omar

    • Anonymous

      Omar,

      Are you mapping info along 3 axes? I wonder if Hunch were to add a 3rd axis, like time, would that increase performance?

      -Michael

      • http://twitter.com/omarelamri Omar El Amri

        Michael,

        In the platform that I’m building, time is a property of methods. For example, if the method BORN is performed on the object PERSON, TIME is a property of BORN, a sub-node. So you can imagine a visualization of the knowledge graph relative to time: basically a timeline. But to make time an axis of the graph: no, that is not what I am aiming to do, it’s too narrow.

        - Omar

        • Anonymous

          Omar,

          Thanks. Do you have some reference sites for this general field of
          data/info theory?

          Best,
          Michael

          • http://twitter.com/omarelamri Omar El Amri

            Michael,

            There aren’t any sites per se that I know of. This field is generally referred to as Knowledge Representation and it’s a subset of AI. There is extensive literature on the subject that spans over 40 years of research. I would try Google Scholar for research papers and theses. Also, “Knowledge Representation and Reasoning” from the Morgan Kaufmann series is a great reference.

            - Omar

  • http://aweber9.tumblr.com Andrew Weber

    Excellent post!

  • http://sorebuttcheeks.blogspot.com/ Anabolic Steroids

    shouldn’t that be called location based instant search?

  • http://brianshall.com Brian S Hall

    I like the large number of datapoints you include to help move us to this serendipity engine (I like that phrase almost as much).

    Instead of Google, might Apple do this?

    “The pieces to achieve this vision, with Apple at the core, are (falling) in place. Apple has the hardware, the smartphone operating system. That giant, mythical cloud city hovering somewhere over North Carolina. And, Siri. Siri is that ‘voice search’ technology start-up Apple bought a year ago. Only, it’s not simply voice search. Siri is a “personal assistant” technology. That has essentially gone dark since Apple acquired it. But, the potential for Siri was then, and is now, the ability to provide real-time, contextually relevant information, regardless of person, place or such mundane functions as notification settings. Siri knows, in theory, that you are with your date, where you are at, and suggests, possibly purchases on its own, two tickets for the next showing of whatever movie Siri knows you *both* will like, and for the appropriate time. There is no app for that.”

    http://brianshall.com/content/iphone-5-revolution

  • http://arnoldwaldstein.com awaldstein

    Interesting.

    In a backwards way, this maps to interests driving context rather than either friendship or pure context itself.

    Services like foodspotting, for example, let you exclaim your love for tuna burgers. By surfacing the interest, as I walk around, context without search could drive information to me without searching. Foods, wines, brands, artists. If I detail these interests, rather than restaurants, liquor stores, Macy’s and museums, the assumption of what I want to search for should get closer to what I actually want and can infer the search if you will.

    I started to get this @ bt.io/GxWC but I’m just starting to think this through. It points to the need for Facebook to get its interest graph figured out, and quickly, as more and more of us search for info and new context through interests in other places, like here.

    Really interested in finding approaches like the food example for a project if anyone has any.

    • http://twitter.com/titanas Stefanos Kofopoulos

      The phone can always capture a plethora of signals like ambient light, time, speed, ambient noise, vibrations etc and figure out with greater detail what the user intends to do, so the gap “of what I want to search for should get closer to what I actually want” is minimized

      • http://arnoldwaldstein.com awaldstein

        True….but to make this work for me, it needs to know me at an intent level. Otherwise it’s just cold, hot, hungry, tired. We want more than primitive signs of basic needs. I do. And I need to provide the personal language of interests to help this happen it seems.

        • http://twitter.com/titanas Stefanos Kofopoulos

          Yeap. That’s where the social graphs (not graph) and serendipity come in. We tend to think the world consists of a few big problems in need of greater CPU power and smarter data mining when our brains are wired and function in response to serendipity. In virtual worlds that won’t be a problem of course.

  • http://alexcalic.com Alex Calic

    Wow- you nailed a lot of what I am writing about right now in a blog post Chris. I think context is the next big thing after search then social, but have thoughts around the additional layers beyond factual/implicit data (lat/long, biz listings, etc.) where Hunch operates with other (explicit data). Will share when it’s done (finally).
