In this blog post, I will show you how to implement `word2vec` using the Python standard library, NumPy and two utility functions from Keras. A more complete codebase can be found on my GitHub page, in a project named word2veclite. This codebase also contains a set of unit tests that compare the solution described in this blog post against the one obtained using `TensorFlow`.

# The backpropagation algorithm for Word2Vec

Since I have struggled to find an explanation of the backpropagation algorithm that I genuinely liked, I have decided to write this blog post on the backpropagation algorithm for `word2vec`. My objective is to explain the essence of the backpropagation algorithm using a simple, yet nontrivial, neural network. Besides, `word2vec` has become so popular in the NLP community that it is quite useful to focus on it.

# Bayesian A/B Testing: a step-by-step guide

This article is aimed at anyone who is interested in understanding the details of A/B testing from a Bayesian perspective. It is accompanied by a Python project on GitHub, which I have named aByes (I know, I could have chosen something different from the anagram of Bayes…), which gives you access to a complete set of tools for Bayesian A/B testing on conversion rate experiments.

# The confusion over information retrieval metrics in Recommender Systems

Recently I have been reading a lot about evaluation metrics in information retrieval for Recommender Systems, and I have discovered (with great surprise) that there is no general consensus on the definition of some of these metrics. I am obviously not the first one to notice this, as demonstrated by a full tutorial at ACM RecSys 2015 discussing this issue.

In the past few weeks I have spent some time going through this maze of definitions. My hope is that you won’t have to do the same after reading this post.

# Changepoint Detection. Part II - A Bayesian Approach

I have recently discussed the problem of changepoint detection from a frequentist point of view. In that framework, changepoints were inferred using a maximum likelihood estimation (MLE) approach. This gave us point estimates for the positions of the changepoints.

In this post I will present the solution to the same problem from a Bayesian perspective, combining theory and practice (with the $\small{\texttt{pymc3}}$ package). The frequentist and Bayesian approaches actually give very similar results, as the maximum *a posteriori* (MAP) value, which maximises the posterior distribution, coincides with the MLE for uniform priors. In general, despite the added complexity in the algorithm, the Bayesian results are rather intuitive to interpret.
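To see why the MAP value and the MLE coincide for uniform priors, here is a short sketch. The notation is my own assumption for illustration: $\theta$ stands for the changepoint parameters and $D$ for the observed data. By Bayes' theorem the posterior is proportional to the likelihood times the prior, so a constant prior drops out of the maximisation:

$$
\hat{\theta}_{\mathrm{MAP}}
= \arg\max_{\theta}\, p(\theta \mid D)
= \arg\max_{\theta}\, p(D \mid \theta)\, p(\theta)
= \arg\max_{\theta}\, p(D \mid \theta)
= \hat{\theta}_{\mathrm{MLE}}
\qquad \text{if } p(\theta) = \text{const.}
$$

The equality only holds over the region where the uniform prior is non-zero, so the MLE has to fall inside the prior's support.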

# Changepoint Detection. Part I - A Frequentist Approach

Changepoint Detection (CPD) refers to the problem of estimating the time at which the statistical properties of a time series… well… change. It originated in the 1950s as a method for automatically detecting failures in industrial processes (quality control), and it is currently an active area of research that can boast of having a website of its own.

# Python implementation of Crank-Nicolson scheme

Since at this point we know everything about the Crank-Nicolson scheme, it is time to get our hands dirty. In this post, the third in the series on how to numerically solve 1D parabolic partial differential equations, I want to show a Python implementation of a Crank-Nicolson scheme for solving a heat diffusion problem.

# Implicit solution of 1D parabolic PDE (Crank-Nicolson scheme)

This post is the second in the series on how to numerically solve 1D parabolic partial differential equations (PDEs). In my previous post I discussed the explicit solution of 1D parabolic PDEs and briefly motivated why it is interesting to study this type of problem.

# The multiple hypothesis testing problem

I must admit that I only learnt about the “multiple testing” problem in statistical inference when I started reading about A/B testing. In many ways I knew about it already, since the essence of it can be captured by a basic example in probability theory: suppose a particular event has a 1% chance of happening. Now, if we make N attempts, what is the probability that this event will have happened at least once among the N attempts?
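As a quick numerical sketch of this example, assuming the N attempts are independent (an assumption the question above does not state explicitly), the probability of at least one occurrence is one minus the probability of no occurrence at all, i.e. $1 - (1 - p)^N$. The function name `prob_at_least_once` is just a hypothetical helper for illustration.

```python
def prob_at_least_once(p: float, n: int) -> float:
    """Probability that an event with per-attempt probability p
    happens at least once in n attempts.

    Assumes the attempts are independent (illustrative assumption).
    """
    # P(at least once) = 1 - P(never) = 1 - (1 - p)^n
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Print the probability for a 1% event and a growing number of attempts
    for n in (1, 10, 100, 500):
        print(f"N = {n:3d}: {prob_at_least_once(0.01, n):.3f}")
```

With a 1% chance per attempt, the probability of seeing the event at least once already reaches about 63% at N = 100, which is why running many tests makes a "rare" false positive quite likely.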