A new way to pay-per-click?

Visual recognition technology offers the potential for an innovative ad market

We talk a lot about clicks, but most “clicks” on an ad don’t involve a mouse anymore. Since the rise of smartphones and tablets, it would be more accurate to say that SearchStar dealt in pay-per-tap advertising. Equally, I suspect that most of the photos taken around the world don’t involve much clicking anymore – how often do you use an actual camera versus taking a snap on your phone? But what if a click on an ad really wasn’t a “click” in anything like the traditional online sense, and was instead a camera “click”? This seems to be where things could be headed, thanks to a raft of new visual recognition tech developments from the likes of Pinterest, Bing and Google.

Back in 2010, Google launched an app called Google Goggles, which allowed smartphone users to take a picture and perform a basic search using that picture as the query. The Goggles app could read barcodes and even printed text. Despite this cleverness, no updates were made to the app after 2012, and the whole project was pretty much shelved in 2014, leaving the future of visual search looking uncertain.

Fast-forward to February this year, and the announcement that Pinterest – a decidedly visual website – would release a tool called Pinterest Lens. This new tool allows Pinterest app users to point their phone’s camera at an object in “the real world”, and to be served related ideas and fresh inspiration based on that object. One suggested use is “point it at broccoli or a pomegranate to see what recipes come up.” Whether the tool would recommend eating something tasty instead is not clear.

More recently still, at the Google I/O conference in Mountain View last month, Google CEO Sundar Pichai announced the imminent arrival of Google Lens, not so much treading on the toes of Pinterest as stamping repeatedly on them until they were no longer identifiable as toes, even by hyper-intelligent visual recognition software. Pichai promised a tool that means your phone “won’t just see what you see, but will also understand what you see to help you take action.” Google Lens looks to blend the basic visual recognition facility of Goggles with AI technology that makes that recognition useful. One example use is to point Lens at the front of a restaurant to pull up its business listing, along with menus, opening times and so on. Lens will also link up with Google Assistant, allowing users to point their phone at some text – in whatever language – ask “What does this say?” and get an automatic translation.

Finally, the team at Bing have just announced Bing Visual Search, an update to their existing ‘Search By Image’ function that allows users to identify individual elements within a photo – a particular lamp, perhaps – and use that as the basis for a search. Of definite interest is that Bing will automatically identify the user’s “shopping intent” and serve matching product ads in the search results.

Put all of the above together and something like a trend begins to emerge. It’s certainly clear that the likes of Google are interested in visual search and how to make it useful. The next natural step is for them to look at how to make money out of it. In a way, this is pretty simple, much more so than with voice search, which has oft been cited as The Next Big Thing. If users are encouraged to use their phones to “query” the world around them and the physical products they’re interested in, without any need to type a query or even know what it is they’re looking at, the volume of search queries stands to rise significantly. Then it’s just a case of the search engine showing relevant product ads.

This easy route to monetisation, along with the massive uptake of Pokémon Go, which demonstrated that mobile users – especially young ones – are very happy to interact with the world via their phones, could make visual recognition the new frontier of search advertising.