The Shrewd AI Strategy behind Google's Kaggle Acquisition | Deep Learning Weekly


The Shrewd AI Strategy behind Google's Kaggle Acquisition

Earlier this month Google announced that it is acquiring the data science platform Kaggle, a move that many commentators and news reports have pegged as a talent acquisition and a way to open up Kaggle's massive community of data scientists and engineers as a recruiting ground.

In a statement following the announcement, Fei-Fei Li, chief scientist of AI and machine learning at Google Cloud, couched the acquisition in tellingly fuzzy terms:

During my keynote talk at Next ‘17, I emphasized the importance of democratizing AI. We must lower the barriers of entry to AI and make it available to the largest community of developers, users and enterprises, so they can apply it to their own unique needs. With Kaggle joining the Google Cloud team, we can accelerate this mission.

Just as politicians substitute 'democratizing' for politicizing (with themselves as chief politicizer), so giant tech companies occasionally employ the term as a touchy-feely smoke screen to gloss over their more sobering strategic, business-driven motivations. I am not insinuating that anything nefarious is going on here, but simply that neither Google's mission to 'democratize AI' nor the talent acquisition rationale fully accounts for this move.

I think this acquisition was a supremely strategic and shrewd business move in Google's quest to own AI in the cloud and to secure a lasting edge against its main competitors, Microsoft and Amazon. Kaggle's CEO Anthony Goldbloom delivers the quote that clues us in:

Kaggle joining Google will allow us to achieve even more. It combines the world’s largest data science community with the world’s most powerful machine learning cloud.

Google did not merely acquire a talented team of engineers or access to a large pool of data scientists to recruit from, but rather the programming habits of half a million machine learning practitioners who will use open source Google technology (e.g. TensorFlow) to build their models in the Google Cloud using Google APIs and products. Considering that Kaggle is the perfect training ground for students and beginners in the field, this will have far-reaching consequences for the choice of technology stack, not only by those engineers personally, but also by any company looking to recruit data scientists and machine learning engineers and by any startup wanting to build upon technology that allows it to draw upon a large talent pool. As an analogy from computing history, consider what Java's adoption as a sort of programming lingua franca in most universities has done for its spread in industry, and vice versa.

But let's rewind a bit to see how this fits into Google's wider strategy. What allowed Google to win in search was the fact that the web abstracted away the underlying platform. Microsoft was not able to capitalize on its immense and absolutely dominant OS platform because there was a new runtime sitting on top of it. This levelled the playing field, removing any artificial lock-in by effectively reducing the switching cost to zero. Google, whose competitive advantage always lay in superior technology, was now able to compete on those terms and far and away outstrip the competition. Ever since, search has formed the backbone of Google's business, accounting for about 75% of Google's revenue. Recently, however, it appears that Google has come to the realization that it will have to rethink its business model, as there is a real possibility that search will decline.

Figure: Google search over time

Sundar Pichai himself asserted during the last I/O keynote that “we are heading from a mobile first to an AI first world”, and that is a world where users will be interacting directly with an intelligent assistant. Whether they are speaking into their AirPods to ask Siri for the nearest flower shop, ordering new detergent off Amazon via Alexa, or accessing one of Google's services via their Pixel's assistant, fewer and fewer queries will go the circuitous route through a search engine. This, in turn, means there will be fewer opportunities to serve ads through AdWords. Google has always implicitly acknowledged this possibility with the iconic 'I'm Feeling Lucky' button. Now that natural language interfaces are becoming viable thanks to breakthroughs in deep learning coupled with context-aware mobile devices, you don't have to feel so lucky anymore to be confident that your AI assistant will accurately gauge the relevance of the top results to your query.

Part and parcel of Google's AI first strategy is to not repeat the mistakes of G Suite, which was never able to gain a footing in the enterprise the way Microsoft's enterprise suite did. This time, Google is getting ahead of the game by releasing a series of products that clearly leverage its core advantage, data:

Whether you are comparing Bing to Google Search or iOS Maps to its Google counterpart, the usability and performance of Google's products are just hard to beat. When your competitive advantage rests on data, which grows along with your market share, your advantage compounds, and what started as a small edge turns into a winner-takes-all situation. That is what happened with search and that is what Google is setting out to repeat with AI.
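To make the compounding argument concrete, here is a minimal toy sketch of the feedback loop. Everything in it is an assumption made for illustration: the function `simulate_share`, the two-firm setup, the `sensitivity` exponent and the rule that users split in proportion to quality are hypothetical, not a model of Google's actual market or data.

```python
def simulate_share(initial_edge=0.02, rounds=50, sensitivity=2.0):
    """Return firm A's market share per round in a two-firm toy model.

    Assumptions (all hypothetical): product quality is data ** sensitivity,
    users split between the firms in proportion to quality, and each firm's
    data grows by the share of users it served in that round.
    """
    data_a = 1.0 + initial_edge  # firm A starts with slightly more data
    data_b = 1.0
    shares = []
    for _ in range(rounds):
        quality_a = data_a ** sensitivity
        quality_b = data_b ** sensitivity
        share_a = quality_a / (quality_a + quality_b)
        shares.append(share_a)
        # More users means more data, which feeds back into quality.
        data_a += share_a
        data_b += 1.0 - share_a
    return shares


if __name__ == "__main__":
    shares = simulate_share()
    print(f"firm A share in first round: {shares[0]:.3f}")
    print(f"firm A share in last round:  {shares[-1]:.3f}")
```

In this sketch the small initial edge only snowballs when `sensitivity` is greater than 1, that is, when quality rewards extra data more than proportionally. With `sensitivity=1.0` the edge merely persists. Whether real AI products behave that way is exactly the bet the argument above rests on.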

The first order of business is to again make its competitors' strong platforms irrelevant. In the case of search the web took care of this by making the OS irrelevant, but now Google has to make a conscious effort to 'choose the battlefield'. Amazon has already built an impressive range of completely modular cloud products on its dominant AWS platform, and this would be very uneven ground for Google to compete on. Enter Kubernetes. Kubernetes is Google's open source container management system, which is one of the fastest growing open source projects on GitHub at the moment and has already been integrated deeply with platforms such as Microsoft's Azure or Red Hat's PaaS offering OpenShift. The central precept underlying containers is to allow developers to build on a standard interface that abstracts away the underlying hardware or operating system, creating nearly full flexibility in where you deploy and run the software inside the container. As is the case with Azure, platform providers have an incentive to enable potential customers to deploy their containerized software on their platform. What that means, though, for product offerings that go beyond mere compute power (such as the Google services listed above) is that they compete on a level playing field. If I want to use Google's new Video Intelligence API, I simply move my containers from AWS to the Google Cloud and am ready to go. The switching cost is close to zero.

Circling back to the Kaggle acquisition: the third leg of the Google strategy stool rests on grooming a community of data scientists and machine learning experts (future and present) who are used to and comfortable inside Google's machine learning ecosystem. Whether it is the extraordinary support Google is giving the TensorFlow project (most recently incorporating the deep learning library Keras), the free education in the form of joint courses on machine learning and deep learning with Udacity, or now the acquisition of half a million machine learning enthusiasts through Kaggle, these moves ensure that the practitioner's toolbox will be based on Google standards and technology.

Figure: TensorFlow trend over time

The first order of business after the acquisition, it seems, is to move Kaggle's Kernels, "a combination of environment, input, code, and output", over to the Google Cloud. This strategy is particularly perspicacious considering that, when it comes to machine learning products and services, you have to rely on sophisticated customers who know what they are doing. There is an entire generation of machine-learning-as-a-service startups that inhabit the ill-fated chasm between developers and users on the business side, being too rigid for the former, who can just use open source libraries such as TensorFlow, and too technical for the latter. Google has realized that the way AI will infiltrate the enterprise is through developers and data scientists, who in turn will choose the tech stack. How convenient, then, that the side effect of 'democratizing AI' is that the demos will be on the Google stack.

TLDR: Google is using Kubernetes to reduce the cost of switching cloud platforms so it can compete on user experience and technology, which, given its data advantage, are ahead of the curve in the AI cloud arena. An additional way to gather momentum for its AI enterprise cloud platform is to groom up-and-coming machine learning professionals to use its platform and associated technologies through platforms such as the recently acquired Kaggle or Udacity and prominent open source projects such as TensorFlow and Keras.

Prediction: Google is poised to own the enterprise AI cloud space and will grow alongside a market of companies that are starting to learn how to harness AI to support their business and gain a competitive edge.