Parallel programming with Julia using MPI

Julia has been around since 2012 and, after more than six years of development, version 1.0 has finally been released. This is a major milestone and one that has inspired me to write a new blogpost (after several months of silence). This time we are going to see how to do parallel programming in Julia using the Message Passing Interface (MPI) paradigm, through the open source library Open MPI. We will do this by solving a real physical problem: heat diffusion across a two-dimensional domain.

Python implementation of Word2Vec

In this blogpost, I will show you how to implement word2vec using the Python standard library, NumPy and two utility functions from Keras. A more complete codebase can be found on my GitHub page, in a project named word2veclite. This codebase also contains a set of unit tests that compare the solution described in this blogpost against the one obtained using TensorFlow.

The backpropagation algorithm for Word2Vec

Since I have been really struggling to find an explanation of the backpropagation algorithm that I genuinely liked, I have decided to write this blogpost on the backpropagation algorithm for word2vec. My objective is to explain the essence of the backpropagation algorithm using a simple - yet nontrivial - neural network. Besides, word2vec has become so popular in the NLP community that it is quite useful to focus on it.

Bayesian A/B Testing: a step-by-step guide

This article is aimed at anyone who is interested in understanding the details of A/B testing from a Bayesian perspective. It is accompanied by a Python project on GitHub, which I have named aByes (I know, I could have chosen something different from the anagram of Bayes…) and which gives you access to a complete set of tools to do Bayesian A/B testing on conversion rate experiments.

The confusion over information retrieval metrics in Recommender Systems

Recently I have been reading a lot about evaluation metrics in information retrieval for Recommender Systems and I have discovered (with great surprise) that there is no general consensus on the definition of some of these metrics. I am obviously not the first one to notice this, as demonstrated by a full tutorial at ACM RecSys 2015 discussing this issue.
In the past few weeks I have spent some time going through this maze of definitions. My hope is that you won’t have to do the same after reading this post.

Visualizing time dependent networks with d3.js

For a change, here is a post about data visualization. The other day I was thinking about a way of visualizing a time-dependent network in d3.js and in this post I will show a prototype solution.

Changepoint Detection. Part II - A Bayesian Approach

I have recently discussed the problem of changepoint detection from a frequentist point of view. In that framework, changepoints were inferred using a maximum likelihood estimation (MLE) approach. This gave us point estimates for the positions of the changepoints.

In this post I will present the solution to the same problem from a Bayesian perspective, using a mix of both theory and practice (using the $\small{\texttt{pymc3}}$ package). The frequentist and Bayesian approaches actually give very similar results, as the maximum a posteriori (MAP) value, which maximises the posterior distribution, coincides with the MLE for uniform priors. In general, despite the added complexity in the algorithm, the Bayesian results are rather intuitive to interpret.
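
To spell out why the two estimates agree, here is a short sketch in generic notation (the symbols $\theta$ for the changepoint parameters and $D$ for the observed data are assumed labels for illustration, not notation taken from the posts). By Bayes' theorem, the posterior is proportional to the likelihood times the prior:

$$
p(\theta \mid D) \propto p(D \mid \theta)\, p(\theta)
$$

If the prior $p(\theta)$ is constant over its support, it does not affect the location of the maximum, so

$$
\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta}\, p(\theta \mid D) = \arg\max_{\theta}\, p(D \mid \theta) = \hat{\theta}_{\mathrm{MLE}},
$$

provided the maximum likelihood estimate lies inside the support of the prior.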

Changepoint Detection. Part I - A Frequentist Approach

Changepoint Detection (CPD) refers to the problem of estimating the time at which the statistical properties of a time series… well… change. It originated in the 1950s, as a method used to automatically detect failures in industrial processes (quality control), and it is currently an active area of research that can boast of having a website of its own.

Python implementation of Crank-Nicolson scheme

Since at this point we know everything about the Crank-Nicolson scheme, it is time to get our hands dirty. In this post, the third in the series on how to numerically solve 1D parabolic partial differential equations, I want to show a Python implementation of a Crank-Nicolson scheme for solving a heat diffusion problem.

Implicit solution of 1D parabolic PDE (Crank-Nicolson scheme)

This post is the second in the series on how to numerically solve 1D parabolic partial differential equations (PDEs). In my previous post I discussed the explicit solution of 1D parabolic PDEs and briefly explained why this type of problem is interesting to study.