A framework to connect the dots between data collection, machine learning, and value creation
Artificial Intelligence has become increasingly present in our lives in the form of tools like smartphone apps. It can also be found in high-stakes autonomous systems, where it makes decisions that involve human lives (such as Autonomous Vehicles, e.g. the “Google Car”) or large amounts of money (such as automated investment systems). AI can increase our productivity and creativity, or it can replace human intervention altogether by making better decisions, both in everyday life and in business. There is strong potential in AI-powered automation, but also important issues to address such as control, morality, and market uptake. Let’s dive in…
Two years ago, Mike Gualtieri of Forrester Research coined the term “predictive applications” and pitched it as the “next big thing in app development”. Today, some people estimate that more than 50% of the apps on a typical smartphone have predictive features. Predictive apps were defined by Gualtieri as “apps that provide the right functionality and content at the right time, for the right person, by continuously learning about them and predicting what they’ll need.” For that, they use Machine Learning (ML) techniques and data. APIs such as the ones provided by Amazon Machine Learning, BigML, Google Prediction API and PredicSis all promise to make it easy for developers to apply ML to data and thus to add predictive features to their apps, but it’s not obvious how these APIs differ from one another and how to choose the right one based on your apps’ needs...
Amazon Machine Learning made a lot of noise when it came out last month. Shortly afterwards, someone posted a link to Google Prediction API on Hacker News and it quickly became one of the most popular posts. Google’s product is quite similar to Amazon’s, but it’s actually much older since it was introduced in 2011. Anyway, this gave me the idea of comparing the performance of Amazon’s new ML API with that of Google’s. For that, I used the Kaggle “Give Me Some Credit” challenge. But I didn’t stop there: I also included two startups that provide competing APIs in this comparison, namely PredicSis and BigML. In this wave of new ML services, the giant tech companies are getting all the headlines, but bigger companies do not necessarily have better products. Here's how I compared them and what results I got…
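To make "comparing the performance" a little more concrete, here is a minimal sketch of how predictions from several services could be ranked on the same held-out data. The excerpt above does not say which metric was used; AUC is assumed here because it is a common choice for credit-scoring problems. The service names, labels and scores are invented for illustration and are not results from any real API.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive example is scored above a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels (1 = defaulted) and the scores that two
# made-up services returned for the same five borrowers.
labels = [1, 0, 1, 0, 0]
predictions = {
    "service_a": [0.9, 0.2, 0.6, 0.7, 0.1],
    "service_b": [0.8, 0.3, 0.7, 0.4, 0.2],
}

# Rank the services from best to worst AUC.
ranking = sorted(
    predictions,
    key=lambda name: auc(labels, predictions[name]),
    reverse=True,
)
for name in ranking:
    print(name, round(auc(labels, predictions[name]), 3))
```

Accuracy is only one side of such a comparison; training time and prediction latency would need to be measured separately.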
There are a number of ways you could be using Data Science (DS) in your business. To manage your DS projects efficiently and have them deliver real value to your business, you should have a good overview of what DS can help you with and how. I've listed 9 things that I've grouped into 3 areas:
- I. Increasing the number of customers
- II. Serving customers better
- III. Serving customers more efficiently
Data Science can provide help in each area with the use of Machine Learning techniques. The idea is to map situations to outcomes by analyzing data, so we can then predict outcomes in new situations. Let’s see how this is used concretely...
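As a toy illustration of mapping situations to outcomes, the sketch below predicts the outcome of a new situation by looking up the most similar situation seen before (a 1-nearest-neighbour rule). This is just one simple technique among many, and the customer features and outcomes are hypothetical.

```python
import math


def predict(examples, situation):
    """Return the outcome of the recorded situation closest to the new one."""
    nearest = min(examples, key=lambda ex: math.dist(ex[0], situation))
    return nearest[1]


# Hypothetical history: (visits per month, support tickets) -> outcome
history = [
    ((12, 0), "renews"),
    ((10, 1), "renews"),
    ((1, 4), "cancels"),
    ((2, 3), "cancels"),
]

# Predict outcomes for two customers we have not seen before.
print(predict(history, (11, 1)))
print(predict(history, (1, 5)))
```

Real projects would use many more examples and features, but the principle is the same: past (situation, outcome) pairs are the only input the method needs.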
Immediately after PAPIs.io '14 I spent a couple of days at Strata in Barcelona. Strata has several tracks and I ended up going mostly to “business” sessions, but this synthesis of things I heard at the conference will be of interest to technical people as well. Actually there was one business session that had more code in it than another data science session I went to!
Here is my selection of key takeaways from the sessions I attended (Strata is a huge conference, so this is only a partial view of it):
- Make predictions that you can act on
- When hitting a performance plateau, use new data — not fancier algorithms
- Ensuring your work is reproducible has never been easier — now make it your habit
- Getting a business edge with data requires automating decisions
Big data startup Qucit this month released the world’s first bikeshare predictive API, tightly integrated into the popular mobile app for Bordeaux bikes. This is an inspiring example of Machine Learning usage in the real world. One of the value propositions of predictive APIs is better resource management. Here, in the "smart city" context, it impacts our everyday lives.
Last week I visited the Import.io offices in London and did a webinar with them in which I showed:
- how to use their tool to easily scrape real estate data from the web
- how to clean that data with the Pandas library in Python
- how to build a real-estate pricing model by sending the clean data to BigML.
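The cleaning step in the middle of that pipeline might look something like the sketch below. It is not the code from the webinar: the column names, sample rows and the `clean_listings` helper are all made up to show the kind of tidying scraped listings usually need before they can be used to train a model.

```python
import pandas as pd

# Hypothetical scraped listings; column names and values are invented.
raw = pd.DataFrame({
    "price": ["£250,000", "£1,200,000", None, "£480,500"],
    "bedrooms": ["2", "4", "3", "n/a"],
    "area": [" Camden", "Chelsea ", "Hackney", "Camden"],
})


def clean_listings(df):
    """Turn scraped text columns into numbers and drop unusable rows."""
    out = df.copy()
    # Strip currency symbols and thousands separators, then parse as numbers.
    out["price"] = pd.to_numeric(
        out["price"].str.replace(r"[^\d.]", "", regex=True), errors="coerce"
    )
    # Unparseable values such as "n/a" become missing values.
    out["bedrooms"] = pd.to_numeric(out["bedrooms"], errors="coerce")
    # Remove stray whitespace left over from the HTML.
    out["area"] = out["area"].str.strip()
    # A listing without a price cannot serve as a training example.
    return out.dropna(subset=["price"]).reset_index(drop=True)


clean = clean_listings(raw)
print(clean)
```

From here, `clean.to_csv("listings.csv", index=False)` would produce a file that can be uploaded to a prediction service; the file name is arbitrary.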
I will be chairing the PAPIs.io conference taking place on 17-18 November 2014 in Barcelona, right before Strata. It will be the first ever international conference dedicated to Predictive APIs and Predictive Apps. If you're interested in presenting your work in this space, a Call for Proposals is open until 8 October 2014.
As machine learning and predictive analytics services become more widely embraced in the business world, predictive APIs are starting to open up. When evaluating this class of API, it is useful to have a common set of questions — the answers to which will help determine whether prediction APIs are a good fit for your needs and to steer you toward the best product for your organization.
Open data is a way to increase transparency into what happens in our society. When coupled with predictive modelling, it becomes a way to interpret why things happened. Even though this sounds complex, these techniques have become accessible to the masses. Let's see how this works with election data.
Co-authored with Alexandre Vallette
I am proud to announce that I have teamed up with HumanCoders to set up a groundbreaking Machine Learning training program, which is based on Prediction APIs and brings you up to speed in 3 days. On this occasion, they interviewed me and I gave them 3 copies of my book, Bootstrapping Machine Learning, to give away. Here's your chance to snatch one of them!
Churn prediction is one of the most popular Big Data use cases in business. It consists of detecting customers who are likely to cancel a subscription to a service. Although originally a telco giant thing, this concerns businesses of all sizes, including startups. The problem of churn prediction can be tackled with machine learning techniques. Now, thanks to prediction services and APIs, this sort of predictive analytics is no longer exclusive to big players that can afford to hire teams of data scientists.
Unless you have shut yourself away for the past 2 years, you have probably heard what everybody else in the tech/business industry is talking about: "the biggest threat that will redefine markets and crush businesses". See it coming? I am of course talking about all the hype surrounding Big Data and Predictive Analytics!