Writing a tutorial/crash-course series has always been on my to-do list but I haven't really had the time to do it until now. Since I'm in the interim between undergrad and graduate school, I figured I could start writing a few primers on doing Deep Learning for natural language processing (NLP).

To jump straight to the first tutorial post, click here!

Here's what to expect:

  • I'll be putting up posts on different topics (semi) regularly, each accompanied by a runnable Jupyter Notebook that you can load up on Google Colab.
  • Every piece of the puzzle will be explained in each notebook so don't worry :)
  • We'll be running through different tasks in NLP (like Sentiment Analysis, Language Modeling, Topic Modeling, Machine Translation, etc.).
  • For each task, we'll first start with basic models so we can see how everything fits together, then we'll improve on those models by implementing and reproducing ideas from different state-of-the-art papers.

We'll be using PyTorch for the entirety of this series. It's what my subgroup in our lab uses, plus it's clean to code in and lets you focus on the logic instead of fighting with tensor shapes, bizarre error messages, and whatnot.
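To give you a taste of what I mean by "clean to code in", here's a minimal sketch of a forward and backward pass in PyTorch. The layer sizes and data are made up purely for illustration; nothing here is from an actual tutorial yet.

```python
import torch
import torch.nn as nn

# A toy classifier: one linear layer mapping 4 input features to 2 classes.
# (The dimensions are arbitrary, just for demonstration.)
model = nn.Linear(4, 2)

x = torch.randn(3, 4)                 # a batch of 3 examples, 4 features each
logits = model(x)                     # forward pass -> shape (3, 2)

# Cross-entropy loss against some made-up labels, then one backward pass.
labels = torch.tensor([0, 1, 0])
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()                       # autograd fills in all the gradients
```

That's the whole training step minus the optimizer update: no manual gradient derivations, and shape mismatches surface as readable errors at the offending line.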

Who is this for?

Well, generally, the series is geared towards you if any of the following applies:

  • You know basic machine learning and want to try NLP
  • You do deep learning and want to cross over to NLP
  • You do classic NLP and want to try using deep learning

Whenever I teach these things in class or in workshops, I often get asked "will I be able to understand this with <amount> prior knowledge?" As long as you're not a total beginner, you should be fine. I'd expect you to know how neural networks work at the very least. If you're good with that, then leave the rest to me. More on expectations below.


In terms of equipment, none is really needed since all the notebooks will be runnable on Colab. In terms of libraries and programming skills, you should at least know how NumPy and Pandas work. We'll be using Python 3, so if you use Python 2, please consider upgrading :)

In terms of prior knowledge, I'd expect you to know at least basic machine learning at the math level (which means you also know some statistics, calculus, and linear algebra). I won't explain[1] why cross-entropy looks the way it does, nor walk you through how backprop works[2].

If you're just starting out with machine learning, Andrew Ng's course is a pretty good place to get your feet wet. For a more math-oriented offering, I recommend CMU's 10-701. For deep learning, go read the Deep Learning Book. For NLP, Jurafsky & Martin is regarded as the classic introduction. For an offering more inclined towards deep learning and math, go check out Yoav Goldberg's primer.

For a PyTorch tutorial, the official tutorials site offers a lot of good ones, particularly this one that transitions from pure NumPy to PyTorch. I'm also working on an actual PyTorch quickstart notebook that I might add here so stay tuned for that.

Tutorial List

I'll be updating this directory every time a new post is made.

[1] I'm also planning on starting a series of notebooks on basic neural networks (we go from logistic regression to perceptrons, we deconstruct backprop and prove the equations, etc.), but that won't happen in the very near future. We'll see!

[2] I'm a staunch believer that if you know how the chain rule works and you know how to do matrix multiplication, then you know how to do backprop.