
Machine learning isn’t as hard as it looks


It’s easy to believe that machine learning is hard: an arcane craft known only to a select few academics.

After all, you’re teaching machines that work in ones and zeros to reach their own conclusions about the world. You’re teaching them how to think!

Indeed, the majority of literature on machine learning is riddled with complex notation, formulae and superfluous language. It puts walls up around fundamentally simple ideas. But like all of the best frameworks we have for understanding our world – Newton’s Laws of Motion, Jobs to be Done, Supply & Demand – the best ideas and concepts at the core of machine learning are simple.

As Intercom’s own machine learning expert, Fergal Reid, puts it, machine learning is basically a branch of applied statistics. He explains, “You have a problem, you’re trying to solve it, and then you have a system where the performance improves when you give it more training data…The more data you get, the better your estimate.”

If you haven’t built anything with machine learning, you should give it a try. I’ll walk through a simple example to demonstrate how you can apply it to common tasks.

Example problem – without using machine learning

Say we wanted to include a “You might also like” section at the bottom of this post. How would we go about that?


One approach is the following – naive – solution:

  1. Split the current post title into its individual words.
  2. Get all other posts.
  3. Sort the other posts by how many words in their body also appear in our title, most shared words first.

Or, in Ruby:
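A minimal sketch of the three steps above, assuming each post is a Hash with hypothetical :title and :body keys. The helper names (words_in, similar_posts) and the sample posts are made up for illustration; this is not the exact code behind the results below.

```ruby
# Lowercase a piece of text and return its distinct words.
def words_in(text)
  text.downcase.scan(/[a-z0-9']+/).uniq
end

# Rank every other post by how many title words appear in its body.
def similar_posts(current_post, all_posts)
  # Step 1: split the current post title into words.
  title_words = words_in(current_post[:title])
  # Step 2: get all other posts.
  others = all_posts.reject { |post| post.equal?(current_post) }
  # Step 3: sort by shared word count, largest first.
  others.sort_by do |post|
    shared = title_words & words_in(post[:body])
    -shared.size
  end
end

# Made-up sample data, only to show the call.
posts = [
  { title: "How the support team improves the product",
    body: "Support conversations feed the roadmap." },
  { title: "Hiring designers",
    body: "How to hire a designer for the product team." },
  { title: "Customer feedback",
    body: "Support conversations reveal what customers need." },
  { title: "Cohorts",
    body: "Retention curves and cohort charts." }
]

similar_posts(posts.first, posts).each { |post| puts post[:title] }
# Hiring designers
# Customer feedback
# Cohorts
```

Even on this toy data the weakness shows: the hiring post ranks first only because it happens to contain common words such as "how" and "the".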

Using this method to find similar posts on this blog to “How the support team improves the product,” you’ll get the following top 10:

  • How to launch with a validated idea
  • Know your customers and how they decide
  • Designing first run experiences to delight users
  • How to hire designers
  • The dribbblisation of design
  • An interview with Ryan Singer
  • Why being first doesn’t matter
  • Proactive support with Intercom
  • An interview with Joshua Porter
  • Retention, cohorts, and visualisations

As you can see, a post about running an effective support process has little in common with posts about cohort analysis or debates around the merits of design. We can do better.


The same example using simple machine learning

Let’s try a real machine learning approach. We’re going to break this into two parts:

  1. Represent posts mathematically.
  2. Cluster these mathematical representations with K-Means.

1. Representing posts mathematically

If we can represent our posts mathematically, we can plot the posts, compare distances between posts, and identify clusters of similar posts.

[Figure: representing posts mathematically]

Mapping each post to a mathematical representation is easy. We can do it in two steps:

  1. Find all words in all posts.
  2. Convert each post into an array. Each element is a 1 or a 0, denoting the presence or absence of a word. The elements are in the same order for every post, because the array is built from the word list in step 1.

Or, in Ruby:
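A minimal sketch of the two steps above. The helper names (tokenize, vocabulary, to_vector) and the two sample bodies are assumptions made for illustration; only @words comes from the post itself.

```ruby
# Hypothetical helper: lowercase a body and split it into words.
def tokenize(text)
  text.downcase.scan(/[a-z0-9']+/)
end

# Step 1: every distinct word across all post bodies, in a fixed order.
def vocabulary(bodies)
  bodies.flat_map { |body| tokenize(body) }.uniq
end

# Step 2: one element per vocabulary word, 1 if the post contains it, else 0.
def to_vector(body, words)
  present = tokenize(body)
  words.map { |word| present.include?(word) ? 1 : 0 }
end

# Made-up bodies, only to show the calls.
bodies = ['hello inside intercom readers', 'blog post']
@words = vocabulary(bodies)
vectors = bodies.map { |body| to_vector(body, @words) }
puts vectors.inspect  # [[1, 1, 1, 1, 0, 0], [0, 0, 0, 0, 1, 1]]
```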

If @words equaled:

['hello', 'inside', 'intercom', 'readers', 'blog', 'post']

A post with the body “hello blog post readers” would be mapped to:

[1,0,0,1,1,1]

We don’t have simple tools for plotting vectors in 6 dimensions like we do for vectors in 2 dimensions, but concepts like distance extend naturally to any number of dimensions. (It’s still useful to picture the 2-dimensional case.)
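To make the idea of distance concrete, here is a small sketch that uses Euclidean (straight-line) distance. The choice of Euclidean distance is an assumption, since the post does not name a specific measure, and the function name is made up. The same function works for 2 dimensions and for 6.

```ruby
# Euclidean distance between two vectors of equal length:
# square the difference in each dimension, add them up, take the square root.
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

puts distance([0, 0], [3, 4])                            # 5.0
puts distance([1, 0, 0, 1, 1, 1], [1, 0, 0, 1, 1, 0])    # 1.0
```

Two posts whose word arrays differ in only one position end up close together, while posts with few words in common end up far apart.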

2. Clustering posts with the K-Means algorithm

Now that we have a mathematical representation of our blog posts, let’s try to find clusters of similar posts. To do this we’re going to use a crazy simple clustering algorithm called K-Means. It can be described in 5 steps:

  1. Set ‘K’ to the number of clusters you want.
  2. Choose ‘K’ random points.
  3. Assign each document to its closest point.
  4. Choose ‘K’ new points, from the ‘average’ of all documents assigned to each point.
  5. Repeat steps 3-4 until documents’ assignments stop changing.

Let’s visualize these steps. First, we choose 2 (i.e. k = 2) random points, in the same space as our posts:

[Figure: clustering posts, step 1]

We assign each document to its closest point:

[Figure: clustering posts, step 2]

We re-evaluate the center of each of these clusters, to be the average of all posts in that cluster:

[Figure: clustering posts, step 3]

That’s the end of our first iteration. Now we re-assign each post to its new closest point:

[Figure: clustering posts, step 4]

We’ve found our clusters! We know this because further iterations would not change any of the assignments.

Or, in Ruby:
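A minimal sketch of the five steps, not the exact code behind the results below. It makes three assumptions the post does not state: distance is Euclidean, the K starting points are sampled from the existing vectors, and a point that ends up with no documents stays where it is. All function names are made up for illustration.

```ruby
# Euclidean distance between two vectors of equal length.
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Element-wise average of a non-empty list of vectors.
def mean(vectors)
  vectors.transpose.map { |column| column.sum.to_f / vectors.size }
end

# Index of the centroid closest to the given vector.
def closest(vector, centroids)
  centroids.each_index.min_by { |i| distance(vector, centroids[i]) }
end

# Returns one cluster index (0...k) per input vector.
def k_means(vectors, k, random: Random.new)
  # Steps 1-2: choose K starting points (assumption: sample existing vectors).
  centroids = vectors.sample(k, random: random)
  assignments = nil
  loop do
    # Step 3: assign each vector to its closest point.
    new_assignments = vectors.map { |vector| closest(vector, centroids) }
    # Step 5: stop once the assignments no longer change.
    return new_assignments if new_assignments == assignments
    assignments = new_assignments
    # Step 4: move each point to the average of the vectors assigned to it.
    centroids = centroids.each_with_index.map do |centroid, i|
      members = vectors.each_index.select { |j| assignments[j] == i }
                       .map { |j| vectors[j] }
      members.empty? ? centroid : mean(members)
    end
  end
end

# Made-up 2-dimensional data: two obvious groups.
vectors = [[0, 0], [0, 1], [10, 10], [10, 11]]
clusters = k_means(vectors, 2)
puts clusters.inspect  # the first two share one index, the last two the other
```

To suggest posts similar to a given one, you could take the other posts that share its cluster index and, if you need a ranking, sort them by distance to that post.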

Here are the top 10 posts similar to “How the support team improves the product” produced with this method:

  • Are you being clear, or clever?
  • 3 rules for customer feedback
  • Asking customers what you want to hear
  • Shipping is the beginning of a process
  • What does feature creep look like?
  • Getting insight into your userbase
  • Converting customers with the right message at the right time
  • Conversations with your customers
  • Does your app have a message schedule?
  • Have you tried talking to your customers?

The results speak for themselves.

We achieved all of this with fewer than 40 lines of code, and some simple algorithms that can be described in a blog post. However, you would never know how simple some of these ideas are from reading academic literature. Here’s an excerpt from the paper introducing K-Means (it’s hard to pinpoint the exact first introduction of K-Means, but this was the first paper to use the term “K-Means”):

[Image: excerpt from the K-Means paper]

Useful resources

Now, don’t get me wrong. Academic literature can often be useful, if you’re willing to work through the notation. However, there are a lot of excellent alternative resources that are more practical and approachable, if you’re just getting started with machine learning.

Give it a try

There’s a lot of buzz around machine learning these days. Improved techniques have helped us make stunning breakthroughs in areas like computer vision, audio recognition, and natural language translation.

But machine learning isn’t just reserved for large abstract problems. Want to suggest tags in your project management app? Or assignees in your customer support tool? Or members of a group on a social network? Chances are that some simple code and an easy algorithm will get you there. So, when faced with a challenge in your product where you believe machine learning can help, don’t be discouraged. Give it a try.

Getting started with machine learning is easier than you might think.

