The Huffman Coding algorithm is a building block of many compression algorithms, such as DEFLATE — which is used by the PNG image format and GZIP.

(subscribers to my newsletter received this first)

Have you ever wanted to know:

  • How do we compress something, without losing any data?
  • Why do some things compress better than others?
  • How does GZIP work?

Suppose we want to compress a string (Huffman coding can be used with any data, but strings makes good examples).

Inevitably, some characters will appear more often than others in the text to be compressed. Huffman Coding takes advantage of…

We say a data structure is ‘linear’ if the items inside it are stored in a sequence.

Arrays, linked lists, and stacks are all linear data structures.

Why should I care?

We use these data structures every day in programming.

Even if you’re already familiar with them, it’s helpful to recap them occasionally.

In 5 minutes or less

As we said in the introduction, a data structure is ‘linear’ if the elements form a sequence.

That means that the data structure has a first and last element, and each element is connected to its previous and next element.

  • An ‘array’ is a linear data structure; the items are stores…

Jaro-Winkler similarity is a way of measuring how similar two strings are. It is fairly easy to understand and quick to implement.

Why should I care?

String similarity metrics have various uses; from user-facing search functionality to spell checkers.

There are a few common string similarity metrics. Knowing a little about each will help you to choose the right one, should you ever need to implement something like this yourself.

Jaro Similarity, and the modified version — Jaro-Winkler — are two common ones.

In 5 minutes or less:

Imagine we’re building the search functionality for an app store.

If a user misspells their search, we’d like to be able…

Floating-point is a way of storing numbers in binary. It allows us to store a very large range of values using a fixed amount of space.

But, floating-point numbers (a.k.a. ‘floats’) can cause you some problems if you’re not careful.

Have you ever wanted to know:

  • Why 0.1 + 0.3 does not always equal 0.4 in JavaScript?
  • How we store non-integer numbers in binary?

You may be familiar with how we represent a number in binary.

Each bit represents a power of 2, and by combining them we can produce every whole number:

But what happens when we need to…

CAP Theorem describes the decisions we have to make when building a distributed data store.

Why should I care?

Have you ever wanted to know:

  • How distributed databases handle network failures?
  • What tradeoffs we must make when designing distributed data stores?

In 5 minutes or less

Eric Brewer’s CAP Theorem tells us that a distributed data store must choose no more than two of the following:

  • Consistency
  • Availability
  • Partition Tolerance.

What do those definitions mean, and why can’t we have all three?

Let’s imagine we’re designing a distributed database.

For simplicity our database will have just two interconnected instances, or ‘nodes’:

Partition Tolerance

Partition Tolerance means that one or more nodes…

Bubble sort is probably the simplest sorting algorithm. It is inefficient at scale, but quick to write and works fine on a handful of elements.

Bubble sort is an excellent introduction to sorting algorithms.

It is also useful as a reference when we cover more complex search algorithms later.

Let’s sort this array into ascending order:

Step 1: Compare pairs of elements

We’re going to loop through each pair of elements in turn.

If a pair of elements aren’t in the right order, we’ll swap them.

Let’s go…

The first pair is already in the correct order, so we can ignore them.

The Travelling Salesman Problem is a classic computer science problem, known for having no efficient solution.

Why should I care?

There are many real-life problems that are very similar to the Travelling Salesman Problem.

Learning about problems like this will help you to recognize when you’re facing something equally difficult to solve.

In 5 minutes or less

This is the ‘Travelling Salesman Problem’ (or TSP): “Given a list of cities, what is the shortest route that visits each city once and then returns to the origin city?”

It sounds simple, but when we add enough cities it becomes impossible for a computer to solve in a realistic time frame.

A hash table is a data structure that stores a collection of key/value pairs in a way that makes it very efficient to find them again later.

Have you ever wanted to know:

  • How a hash map, associative array or dictionary data structure works in a given language?
  • When is it appropriate to use a hash table to store items?
  • How do we deal with ‘collisions’ in a hash table?

Imagine we want to store a list of users so that we can find them later using their names.

We could simply store our users in an array. When we…

Recently I’ve started to notice that the majority of comments I write are either redundant, or are excuses for other failings in my code.

Since that realisation, I’ve been making a conscious effort to eliminate the majority of comments from the code I write.

I find that comments generally fall into a handful of common categories. Here’s how I’ve been dealing with each:

Comments that should be methods

I often find myself using a comment, when extracting code to a method would be better.

For example:

if (customersFirstOrder) {
// Apply free shipping
orderCost = orderCost - shipping;

I find this much easier to…

As people that build software, we’ve all been in that ‘flow’ state; completely immersed in an activity, hyper-focused, and super-productive.

When we’re designing our software, we need to consider the interactions we ask the user to perform in the context of the wider user experience — maintaining the user’s ‘flow’.

Don’t make me switch pages

Switching from one page to another is one of the worst offenders for breaking flow.

Even if the page loads quickly and the user doesn’t have to wait, they have to re-build their mental model and figure out what they need to do on a whole new page.

This might…

Dave Saunders

Developer. Lapsed blogger. 10 years of half-finished git repos.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store