sandbox

This is my current blog, where I discuss a variety of computing topics, including Linux, Vim, Python and JavaScript, as well as science and statistics. Many of the posts have accompanying gists that contain the code for the examples discussed.

Using Python to query data from Socrata

Feb 17, 2015

I've started going to Open Oakland meetings on Tuesday nights. The group works on a variety of projects aimed at making city data more accessible and usable for Oakland citizens by building websites or apps.

Installing Node.js and npm on Ubuntu 14.04

Jan 12, 2015

I've decided to start being systematic about learning JavaScript, with a focus on getting good with D3.js. I'll be installing Node.js and npm (the Node package manager) as a way to get access to a JavaScript console and, later, a powerful JavaScript environment.

Inferring probabilities with a Beta prior, a third example of Bayesian calculations

Dec 11, 2014

In this post I will expand on a previous example of inferring probabilities from a data series. In particular, instead of considering a discrete set of candidate probabilities, I'll consider all (continuous) values between \( 0 \) and \( 1 \). This means our prior (and posterior) will now be a probability density function (pdf) instead of a probability mass function (pmf). More specifically, I'll use the Beta distribution for this example.
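As a rough sketch of the idea (not the code from the post itself), the Beta prior is conjugate to a series of 0/1 observations, so the posterior is again a Beta distribution whose parameters are the prior parameters plus the observed counts. The function names, the example data, the uniform Beta(1, 1) prior and the choice to infer the probability of a 1 are all assumptions made for illustration.

```python
def beta_posterior(alpha, beta, ones, zeros):
    """Conjugate update: a Beta(alpha, beta) prior plus 0/1 counts
    gives a Beta(alpha + ones, beta + zeros) posterior."""
    return alpha + ones, beta + zeros


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data series of 0s and 1s (not taken from the post).
data = [1, 0, 1, 1, 0, 1, 1, 1]
ones = sum(data)
zeros = len(data) - ones

# Assumed prior: Beta(1, 1), which is uniform on [0, 1].
post_alpha, post_beta = beta_posterior(1.0, 1.0, ones, zeros)
posterior_mean = beta_mean(post_alpha, post_beta)

print(post_alpha, post_beta, posterior_mean)
```

With a uniform prior, the posterior mean is pulled slightly toward 0.5 relative to the raw fraction of 1s in the data; that pull shrinks as more data arrive.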

Installing essentia for audio feature extraction

Dec 10, 2014

Some notes on installing essentia, a collection of C++ code with Python wrappers for audio feature extraction, following the essentia installation guide.

Getting started with Latent Dirichlet Allocation in Python

Nov 13, 2014

In this post I will go over installation and basic usage of the lda Python package for Latent Dirichlet Allocation (LDA). I will not go through the theoretical foundations of the method. However, the main reference for this model, Blei et al. (2003), is freely available online. I think the main idea, assigning documents in a corpus (set of documents) to latent (hidden) topics based on a vector of words, is fairly simple to understand, and the example will help to solidify our understanding of the LDA model.