One algorithm to rule them all: Gradient Descent

9 min read · Aug 3, 2020

Chances are high that you’ve heard about Machine Learning (ML). ML has seen substantial advances over the past decade and has already impacted our lives in many ways. Netflix movie recommendations? ML. Email spam filtering? ML. Alexa’s voice recognition? ML. But what exactly is ML, and what sort of magic happens behind the scenes that lets these products and services perform human-like activities?

I won’t go in depth on ML here; that would require a lot more than one blog post. Instead, I will try to describe one very basic algorithm that is at the heart of many ML models: Gradient Descent. My intent is to give you some basic intuition for how ML works.

One distinction I would like to make is between ML and traditional software. Traditional software relies on hard-coded rules: the code defines the logic that the software will follow. Machine learning doesn’t work that way. Instead, it relies on detecting patterns in lots of data. It builds a mathematical model by looking for patterns in the data you give it (training, in ML parlance). The model is then used to predict outcomes or values for data it hasn’t seen before. As a rule of thumb, the more data an ML model is trained with, the better the model tends to perform.
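To make the contrast concrete, here is a minimal sketch in Python. It is only an illustration: the temperature-conversion task, the function names, and the tiny data set are assumptions chosen for this example, and the "training" step uses a closed-form least-squares line fit rather than Gradient Descent, purely to keep the example short.

```python
# Traditional software: a person writes the rule down explicitly.
def celsius_to_fahrenheit_rule(celsius):
    return celsius * 1.8 + 32.0


# Machine learning: the rule is estimated from example data ("training").
# Illustrative assumption: we fit a straight line y = slope * x + intercept
# with ordinary least squares (closed form), not Gradient Descent.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Training data: pairs of (Celsius, Fahrenheit) readings.
celsius = [-40.0, 0.0, 10.0, 25.0, 37.0, 100.0]
fahrenheit = [-40.0, 32.0, 50.0, 77.0, 98.6, 212.0]

slope, intercept = fit_line(celsius, fahrenheit)


# Prediction: apply the learned model to a value it has not seen before.
def predict(celsius_value):
    return slope * celsius_value + intercept


print(f"learned rule: F = {slope:.3f} * C + {intercept:.3f}")
print(f"hand-written rule for 60 C: {celsius_to_fahrenheit_rule(60.0):.1f}")
print(f"learned model for 60 C:     {predict(60.0):.1f}")
```

Nobody told the second half of the program that the answer involves 1.8 and 32; it recovered those numbers from the examples alone. Real ML models are far larger, but the workflow is the same: train on data, then predict on data the model has not seen.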

Written by Karim Fanous
