Interesting articles

Sheesh, it’s been so long since my last post that my blog URL is no longer in my Chrome suggestions list. Haven’t really felt like writing, I guess.

In this post I just want to share a collection of really interesting articles I’ve found over the years. Most are computer/math/science-oriented, but some are more general. Here’s the list, organized roughly by subject:

General:
Is there anything good about men?
Edward R. Murrow 1958 speech
H – Reason.com
The two things

Math:
V.I. Arnold, On teaching mathematics
Algebra vs analysis predicts eating corn?
Commutativity of multiplication

Writing:
George Orwell, “Politics and the English Language,” 1946

Philosophy:
Axioms

Computers:
Regular expression matching can be simple and fast
Fitts’s law
A brief, incomplete, and mostly wrong history of programming languages
What computers can’t do (and why)
Exotic data structures

Science:
Can a biologist fix a radio? — or, What I learned while studying apoptosis
Haldane, On being the right size

I will add to this post as I come across more writings.

Intro to automatic differentiation

Automatic differentiation, or autodiff, is a nifty way to compute numerical derivatives of a function using a computer. Two other approaches are more widely known:

  • Symbolic differentiation
    • The computer takes a function in symbolic form like sin(cos(x)) and applies rules to transform it into another symbolic expression, cos(cos(x)) * (-sin(x)), which can be evaluated for an input x just like the original function.
  • Numerical differentiation
    • Imitating the definition of the derivative as a limit, the computer evaluates the function at values very close to the desired x value and divides the difference in outputs by the difference in inputs (see the sketch after this list).
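
To make that second approach concrete, here’s a minimal Python sketch of numerical differentiation (the helper name and the default step size are my own choices):

    def numerical_derivative(f, x, h=1e-6):
        # central difference: approximate the limit definition of the
        # derivative with a small but finite step h
        return (f(x + h) - f(x - h)) / (2.0 * h)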

Both methods are highly intuitive and mimic how we would calculate a derivative by hand. However, symbolic differentiation can be very slow, and it is impossible for functions that can’t be written as a single expression. Numerical differentiation runs into accuracy issues depending on the step size h you choose: too large a value and it doesn’t represent the limit very well, but too small a value and the calculation loses accuracy due to limited floating-point precision. The h that maximizes accuracy depends on the function as well as the input value.

Autodiff has none of these problems. It calculates exact numerical derivatives (up to the machine’s floating-point precision) with only a small constant-factor overhead over evaluating the function itself. Unlike symbolic differentiation, it even works on crazy functions like this one, which would be a nightmare to expand symbolically:

function y(x):
    a = x
    for i from 0 to 100
        a = a * sin(a * x)
    return a

Sound too good to be true? Read on…

Autodiff with forward accumulation

There are two main approaches to autodiff: forward and reverse accumulation. We’ll only discuss forward accumulation since it’s easier to understand and implement (I don’t really grasp reverse accumulation myself).

Autodiff is based on the chain rule applied to individual operations. Say we wanted to differentiate f(x) = sin(2x) * cos(x) with respect to x. This function can be broken down into single operations with “work” variables, like so:

x1 = 2 * x
x2 = sin(x1)
x3 = cos(x)
f(x) = x2 * x3

Each one of these can be individually differentiated, applying the chain rule, like so:

x1' = 2 * x'
x2' = cos(x1) * x1'
x3' = -sin(x) * x'
f'(x) = x2 * x3' + x2' * x3

(where ' denotes derivative with respect to x). Putting all those together, we have:

f'(x) = x2 * x3' + x2' * x3
      = sin(x1) * x3' + x2' * cos(x)
      = sin(2 * x) * -sin(x) + (cos(x1) * x1') * cos(x)
      = sin(2 * x) * -sin(x) + (cos(2 * x) * 2) * cos(x)

which is indeed the derivative. But here’s the insight: each calculation of a work variable corresponds to a calculation of its derivative. So if we could keep track of the work derivatives alongside the work variables while calculating f(x) for some input x, at the end we would have both the final result f(x) and its derivative f'(x).

With that in mind, here’s an example implementation to return f(x) along with its derivative:

function f(x):
    x1 = 2 * x
    x1Prime = 2 * 1

    x2 = sin(x1)
    x2Prime = cos(x1) * x1Prime

    x3 = cos(x)
    x3Prime = -sin(x) * 1

    fx = x2 * x3
    fxPrime = x2 * x3Prime + x2Prime * x3

    return (fx, fxPrime)

But, of course, performing this transformation manually on each function we wish to differentiate is impractical. In practice you use either automated source code transformation (difficult to implement, but can result in faster execution) or operator overloading (easy to implement, though some languages don’t support it, and it can be slower).

Operator overloading

Let’s discuss operator overloading. Typically in this approach you define a class that holds both a work variable and its derivative (called a dual number in the literature), and overload its operators to perform the normal operations on the work variables and the corresponding derivative calculations on the work derivatives.

For example, the multiplication operator, treating the class as a 2-tuple:

operator*(x1, x2):
    return (x1[0] * x2[0], x1[1] * x2[0] + x1[0] * x2[1])

It’s as easy as that. Now modify your original function to accept and return this new class and you get the derivative along with the output. You must also pass 1 for the derivative of the input to the function, since it can be treated as a work variable itself, and set 0 for the derivative of any constants.
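
To make this concrete, here’s a minimal Python sketch of the dual-number approach, handling only the operations our example f(x) = sin(2x) * cos(x) needs (the Dual class and helper names are my own):

    import math

    class Dual(object):
        # a work variable paired with its work derivative
        def __init__(self, value, deriv):
            self.value = value
            self.deriv = deriv
        def __mul__(self, other):
            # product rule: (uv)' = u'v + uv'
            return Dual(self.value * other.value,
                        self.deriv * other.value + self.value * other.deriv)

    def dualSin(d):
        # chain rule: sin(u)' = cos(u) * u'
        return Dual(math.sin(d.value), math.cos(d.value) * d.deriv)

    def dualCos(d):
        # chain rule: cos(u)' = -sin(u) * u'
        return Dual(math.cos(d.value), -math.sin(d.value) * d.deriv)

    def f(x):
        # the constant 2 gets derivative 0
        return dualSin(Dual(2.0, 0.0) * x) * dualCos(x)

    # the input is a work variable itself, so its derivative is 1
    result = f(Dual(1.5, 1.0))
    print result.value, result.deriv

The printed derivative should match the expanded expression sin(2 * x) * -sin(x) + (cos(2 * x) * 2) * cos(x) we derived above.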

More resources

  • Wikipedia has a great page that discusses both forward and reverse accumulation and generalizations.
  • This set of slides also has great explanations and applications.

Understanding Python Decorators

Decorators are one of those little touches that make Python such a pleasure to program in. They’re also very easy to understand, though it may seem otherwise at first glance. In this post I attempt to explain them as clearly and concisely as possible.


Basic decorators

A decorator is a function that takes a function and returns a new one to replace it with.

For example:

    def decorator(func):
        def newFunc():
            print 'in new func'
            func()
        return newFunc

    def myFunc():
        print 'in old func'

    # replace myFunc
    myFunc = decorator(myFunc)

The reason these kinds of functions have a special name is that they also have special syntax in Python. This next snippet is equivalent to the one above:

    def decorator(func):
        def newFunc():
            print 'in new func'
            func()
        return newFunc

    @decorator
    def myFunc():
        print 'in old func'

In both cases, when executed, this is the output:

    >>> myFunc()
    in new func
    in old func

Class decorators

Notice that in this line of the first snippet:

    myFunc = decorator(myFunc)

decorator could have been a class whose constructor takes a function, and myFunc would have been assigned an object of class decorator. This class would also have to implement the __call__ method, so that the new myFunc could be called, of course. Decorators implemented in this manner allow you to hold state in a more object-oriented way [1]:

    class decorator(object):
        def __init__(self, func):
            # we've got to save the function so we can access it when
            # "myFunc" is called
            self.func = func
        def __call__(self):
            print 'in new func'
            self.func()

    @decorator
    def myFunc():
        print 'in old func'

The behavior is the same as before.

Parameterized decorators

A parameterized decorator is a function that takes some arguments and returns a basic decorator.

Let’s use a practical example to demonstrate. Say we wanted to write a decorator for a function that executes an unreliable I/O request (such as a slow Web connection that often times out), and that raises an exception to signal that the request has failed. On failure, we want this decorator to retry the function a certain number of times before propagating the actual exception. We also want to be able to specify the type(s) of exceptions to watch out for, to distinguish these from other exceptions that shouldn’t trigger a retry. Here is a parameterized decorator to do just that:

    def retry(maxRetries, exceptions):
        def realDecorator(f):
            def newFunc(*args):
                tries = 0
                while True:
                    try:
                        return f(*args)
                    except exceptions, e:
                        # Note that "exceptions" can be either one exception
                        # or a tuple of exceptions - Python's syntax handles
                        # both.
                        tries += 1
                        if tries > maxRetries:
                            raise e
            return newFunc
        return realDecorator

Now we can use it on a function with much the same syntax [2]:

    @retry(5, IOError)
    def func():
        someOperation() # may raise IOError

which corresponds to the un-decorated syntax:

    def func():
        someOperation() # may raise IOError

    # "retry" returns a decorator, which takes a function and returns the
    # new one.
    func = retry(5, IOError)(func)

In both cases, when func is called, it will retry up to 5 times whenever IOError is encountered, before actually propagating the error to the caller. Other exceptions that are not of type IOError will be propagated as normal.

Of course, just as with normal decorators, you can use classes instead of functions here as well.

Chaining decorators

One last thing: you can apply multiple decorators to a function by simply listing them on top of each other:

    @decorator1
    @decorator2
    def func():
        pass

decorator2 is applied first, then decorator1. Very intuitive.
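
In un-decorated form, that snippet corresponds to:

    def func():
        pass

    func = decorator1(decorator2(func))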


Well, that’s about it for the explanation. The next section discusses useful applications, and focuses in detail on one that I particularly like.

Applications

Here is a little litmus test I use to measure the expressiveness of any programming language: how easy is it to implement a memoizing function? [3] That is, a function that takes another function and returns a new version that saves any arguments it is called on, as well as their return value, so that subsequent calls with those same arguments can just return the saved value without executing the function again. If you’re confused, hopefully this simple implementation will help:

    def memoized(func):
        cache = {}
        def newFunc(*args):
            if args in cache:
                return cache[args]
            else:
                value = func(*args)
                cache[args] = value
                return value
        return newFunc

As you can see, it’s fairly easy in Python. So, where is this useful? Anywhere you have a function with an expensive computation, whose results depend only on its arguments, and that doesn’t modify any global state (i.e. a pure function). The textbook example is the exponential-time recursive Fibonacci algorithm, which memoization turns into a linear-time one:
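
    @memoized
    def fib(n):
        # without memoized this is exponential time; with it, each fib(n)
        # is computed only once, making the whole computation linear
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)

But here’s a more interesting example. Behold, the world’s simplest Web cache: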

    import urllib2

    @memoized
    def getContent(url):
        """Return the content of the page specified by url."""
        return urllib2.urlopen(url).read()
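
For example (example.com is just a placeholder):

    getContent('http://example.com')  # first call: fetches the page
    getContent('http://example.com')  # second call: served from the cache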

If you call it on the same URL twice, the second time it will return what is in the cache instead of accessing the site again. Of course, this assumes that the contents at a URL never change (i.e. there’s no expiration time), and it doesn’t take into account different URLs that map to the same page. However, if you only need the data over a short time period and memory usage isn’t a large priority, it can be quite useful (I used it in a simple web crawler once).

Here are some other applications I’ve found for decorators:

  • to retry a function upon failure, as shown above
  • to run a function in a new thread, so it doesn’t block execution
  • to lock a function so that it can only be called by one thread at a time, with the others blocking until the current one is finished (a minimal sketch follows this list) [4]
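
Here’s a minimal sketch of that last one, assuming one lock per decorated function (the name synchronized is borrowed from Java):

    import threading

    def synchronized(func):
        lock = threading.Lock()
        def newFunc(*args):
            # only one thread may run func at a time; the others block
            # here until the lock is released
            with lock:
                return func(*args)
        return newFunc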

And many more examples can be found here.

Hopefully this section has shown you some of the power and versatility of decorators.

Happy coding!


[1] I personally much prefer the functional style, as the class style is more verbose and obscures the fact that a decorator is just a higher order function. Sometimes you do need the class style, however, as Python closures can’t modify variables in their surrounding scope (until Python 3, with the nonlocal keyword).

[2] While writing this, I realized that it looks like you can put any expression that yields a function after the @, and it uses that as the decorator. However, I tried this snippet and it didn’t work:

    @(lambda f: f)
    def this():
        print 'hi'

    this()

Sad. I was totally planning to use lambda decorators everywhere. /sarcasm

[3] In C/C++, damn near impossible. If anyone has done this I’d like to know. EDIT: it’s possible in C++11.

[4] OK, this may have been symptomatic of a bad design…

Python-style dictionary constructors in C++0x

Check this out:

    #include <cstdio>
    #include "dict.hpp"

    int main() {
        auto a = make_dict(1, "one", 6, "six");
        // a has type: unordered_map<int, const char *>
        printf("%d %s\n", 6, a[6]);

        auto b = make_dict( "this", "has",
                            "as", "many",
                            "arguments", "as",
                            "you", "want"
                            );
        printf("%s %s\n", "you", b["you"]);

        return 0;
    }

which, of course, prints:

    6 six
    you want

It’s all done with the magic of variadic templates, a new feature in C++0x. Here’s the full implementation, dict.hpp:

    #include <unordered_map>
    #include <utility>

    using namespace std;

    template <typename KeyType, typename ValType>
    using Dict = unordered_map<KeyType, ValType>;

    template <typename KeyType, typename ValType>
    void _make_dict_ip(Dict<KeyType, ValType> &d, KeyType key, ValType val) {
        d.insert(make_pair(key, val));
    }

    template <typename KeyType, typename ValType, typename... Args>
    void _make_dict_ip(Dict<KeyType, ValType> &d, KeyType key, ValType val, Args... args) {
        d.insert(make_pair(key, val));
        _make_dict_ip(d, args...);
    }

    template <typename KeyType, typename ValType, typename... Args>
    Dict<KeyType, ValType> make_dict(KeyType key, ValType val, Args... args) {
        Dict<KeyType, ValType> d;
        _make_dict_ip(d, key, val, args...);
        return d;
    }

(Note that this also uses another new feature, alias templates, in defining the Dict type.)

It’s not the prettiest code ever, as you’re basically required to use recursive definitions to do anything with variadic templates, but it gets the job done. C++ is like an abusive lover: writing anything elegant with it is usually so painful, with the compiler continually vomiting incomprehensible template error messages all over your screen, that you’re oddly grateful when you can trick it into pretending C++ is a dynamic language, if only with a small, ugly snippet like this.

Ahem. But I digress. Hopefully someone finds this useful.

Why pink is my new favorite color

There’s an idea that’s been going around for a while that pink is not a color. This really intrigued me, and all the “popular science” posts on it gave rather hand-wavy explanations, so I went on a quest for more information. On the way, I also discovered answers to some other puzzling questions:

  • Why are the primary colors red, green, and blue, or as we learned in drawing class: red, blue, and yellow? Is one set more “correct” than the other?
  • How do hue, saturation, and value correspond to physical properties of light?
  • From a pink-related article – red and green light combined make yellow. How many ways, in general, can you create a certain color?

First of all, an explanation of HSV and an answer to the second question, because I think it helps shed light (pun intended) on the others.

HSV

Computers represent color with three parameters: red, green, and blue – every color you see on the screen is a combination of these in different proportions. While this is convenient to implement, it isn’t a very useful description of how humans perceive color (quick, what’s the RGB triplet for orange?). That’s where the hue/saturation/value system (HSV) comes in. HSV also takes three parameters, but these can be directly related to what we perceive, and are also easier to connect to a distribution of light wavelengths.

Hue is the basic color of light (the peak of the wavelength distribution). It runs the gamut of the rainbow, from red to violet. Saturation is how spread out the distribution is: high saturation means that most of the light is concentrated at the peak; low saturation means the graph is flattened out. Value is the intensity of light, or the height of the distribution. For a nice visualization of hue and saturation, see here.
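
Incidentally, Python’s standard colorsys module converts between RGB and HSV (all components are floats in [0, 1]):

    import colorsys

    # pure red: hue 0 (the red end), fully saturated, full value
    print colorsys.rgb_to_hsv(1.0, 0.0, 0.0)    # (0.0, 1.0, 1.0)

    # a darker, washed-out red: same hue, lower saturation and value
    print colorsys.rgb_to_hsv(0.5, 0.25, 0.25)  # (0.0, 0.5, 0.5)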

I mention HSV not only because it’s interesting, but because it gets us thinking in terms of distributions of wavelengths. So with that in mind, here is the graph that explains pretty much everything.

The 1931 CIE color space

This graph is the result of an experiment in the 1920s that attempted to map out human color perception. The axis labels are not important for our purposes; they’re just parameters in the color space. The important part is the curved line with the wavelength labels. Now here’s the cool part of the graph: by combining wavelengths on the curve in relative proportions, you can get any color on the line connecting them. So, for example, you can get white by combining ~610 nm and ~493 nm, or ~580 nm and ~486 nm, or many other combinations. You can also combine that combination with another wavelength to get another color on the line connecting those two points, and so on. This answers our last question: there are, in principle, an infinite number of ways to generate any color by combining light of different wavelengths. Note that the combination is generally non-linear, i.e. combining equal amounts of two wavelengths will not usually yield the color at the midpoint of the line segment. There have been various attempts to remedy this.

Primary colors, then, are just specific colors (points on the graph) chosen as a basis from which all other colors are generated by this line-drawing method. It’s easy to see that any choice of primary colors can only represent the colors within the polygon they bound, which is called the gamut of the color space. In practice we use three primary colors, because of how our vision system works. Note that there are no three points that form a triangle covering the entire color space; this implies that any choice of three primary colors will necessarily be missing some colors. In fact, the most commonly used color space today (the one your monitor uses) is just one of these possible choices, and not a particularly expressive one, at that:

Image from Wikipedia, author: Spigget, licensed under CC BY-SA

It’s technically called sRGB, though it’s usually what we mean when we simply say RGB. Notice how it doesn’t cover a fairly large area of greens, blues, and purples. Remember, too, that since you’re using a monitor with this color gamut to view these images, you can never actually see the true range of perceptible colors on your screen, which is a bit sad when you think about it.

So that answers our first question.

Now, in a roundabout way, we’re back to the original motivation of the post: the mystery of why pink is somehow “less of a color” than the others. Pink is made up of two wavelengths (red and blue/violet) that sit at opposite ends of the spectrum, and as you can see from the graph, our brain does not interpret it as the wavelength in between the two, as it does for many other combinations, but instead as something new. But this is true for any color along the bottom line, including a variety of magenta and reddish-purple colors. You could even say roughly the same thing for white, which can be generated from blue and orange and yet does not appear green. The difference is that white can be created in many other ways, but the general principle is the same. You could say that pink differs from other colors along the bottom line because it is not easily recognized as a mixture of red and violet, but then again, yellow is not easily recognized as a mixture of red and green (to me, at least). So then perhaps pink is the only color that satisfies these two properties:

  • Not interpreted as a wavelength in between its constituents
  • Not visually perceived as a mixture of its constituents

So does this disqualify pink as a color? Well, as usual, it depends on your definition of color. Personally, after learning how seemingly arbitrary the correlation between wavelength distribution and perceived color is, I would say that color is more of a quality of the human visual system than a quality of light, and would gladly welcome pink into our pantheon of hues.

Thanks for reading! Leave any questions/comments/corrections down below.


Miscellaneous, somewhat unimportant, notes:

  • The graphics of the CIE color space used here are technically only chromaticity diagrams, so they don’t include luminosity (brightness), which is why the two images look different and why they don’t include all possible colors. In HSV terms, it is value that is missing. Areas of low saturation are near the center, high saturation on the edges.
  • HSV has a sister color space also designed to be human-friendly, called HSL.
  • This introductory post only scratches the surface of the immensely complex subject of color perception. For more details consult your nearest Wikipedia/textbook.

New music round-up #1

I’m going to start a series of posts in which I briefly describe what music I’m currently listening to. Not that anyone reads this blog, I’m just that bored.

Anyway.

1. Entertainment for the Braindead – Hydrophobia

It’s this girl and her guitar, playing and singing some quite nice folk songs that “mostly circle around the idea of drowning”. I heard the title track on radio404.org (amazing station by the way, check it out on iTunes radio under “Eclectic”) and had it stuck in my head for days, which is unusual for folk music. Anyway, you can download it for free here.

2. RiEN – Il ne peut y avoir de prédiction sans avenir

Translation – “There can be no prediction without a future” (I think; my French is a bit rusty). A very diverse collection of songs, from post-rock to 80s-ish pop, with a great eerie atmosphere throughout. I particularly like Cortez (listen here). This album is also available for free here.

3. Modest Mouse – The Lonesome Crowded West

OK, so finally a fairly well-known band and album. I listened to this about a year ago and hated it, back when I was into super weird music. Out of nowhere I decided to listen to it again recently and it finally clicked. The defiant “wild west” attitude throughout and the strong emphasis on rhythm (which tends to take a backseat to melody in most modern music) make this an instant classic for me. Highlights include: Teeth Like God’s Shoeshine, Heart Cooks Brain, Convenient Parking, Cowboy Dan, Trailer Trash, and Styrofoam Boots/It’s All Nice On Ice, Alright. Which are, like, half the songs. Seriously, there’s a reason it’s so highly regarded.


Well, that’s about it for new music. I’m also still listening to my perennial favorites, of course, which may be the subject of a future post.

Have a nice day.

misc.py

Over the years, I’ve written a lot of random Python utilities that seemed useful in general but didn’t warrant their own module. I’ve kept them in a file called misc.py.

It includes, among other things, a Roman numeral converter, a numerical base converter, A* search, a Huffman table generator, and strfry().