TurnKey v14.0 RC1 is LIVE! (aka we need YOU!)

Update: v14.0 stable is available in all build types: OVA & VMDK, Proxmox, OpenNode & Docker (the Proxmox build is a fairly generic LXC/OpenVZ container), and Xen & OpenStack.

Invalidating the disk cache on Linux

Here's a super easy way to invalidate the disk cache, which is useful for testing IO performance in the real world, where you can't rely on all of your reads being served up from a super-fast RAM cache rather than a vastly slower physical disk drive.

This will free up everything in the disk cache:

echo 3 > /proc/sys/vm/drop_caches
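
If you are scripting an IO benchmark in Python, the same step can be sketched roughly as below. The function name `drop_caches` and its `path` parameter are made up for this illustration (the parameter only exists so the function can be tried against a scratch file), and the call to `os.sync()` is an added assumption: the kernel only drops clean cache entries, so dirty pages are flushed first.

```python
import os

# Kernel control file from the command above.
DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_caches(mode=3, path=DROP_CACHES):
    """Flush dirty pages, then write `mode` to the drop_caches control file.

    mode follows the kernel convention: 1 = pagecache,
    2 = dentries and inodes, 3 = both (same as `echo 3`).
    `path` is a hypothetical knob so the function can be tested
    without root by pointing it at an ordinary file.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2 or 3")
    os.sync()  # dirty pages cannot be dropped, so write them out first
    with open(path, "w") as control:
        control.write(f"{mode}\n")
```

Calling `drop_caches()` with the default path needs root, just like the shell version.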

Or if you want more control over exactly what is being freed...

  1. This frees up the pagecache (e.g., cache of contents of files):

pyproject-pub: A simple Python project template

I hate repeating myself. It's boring. Life is too short. Like any self-respecting hacker, I will go out of my way to avoid it, even when I suspect it would cost me more to automate something away than to just do it by hand.

On the other hand, doing stuff I've done before by hand is no fun, while writing scripts is fun. Even when it does take longer, time is relative, or so Einstein said.

Getting started with Python and Lisp

A few weeks ago I talked with a friend studying computer science who, I discovered, had never experienced the joy of programming with a high-level language. Not only that, but he didn't have the first clue what he was missing. I feared that without my immediate intervention another perfectly good mind would be wasted in programming hell. At his university they were using Java for nearly everything, so he had somehow gotten the terribly mistaken idea that it didn't really matter what programming language one used. I carefully explained that:

Why parallel programming is hard

Implementing Cloudtask took more time than I had planned due mainly to the challenges of parallel programming, which I hadn't done that much of before. Also, parallel programming really is inherently far more difficult than serial programming.

In my mind there are three major challenges:

Parallelize - a simple yet powerful high-level interface to multiprocessing

When I was developing Cloudtask, I discovered that none of the interfaces in the Python multiprocessing module were powerful enough for my needs, so I had to roll my own. The result is the generically useful multiprocessing_utils module in turnkey-pylib, which from my totally subjective perspective provides a far superior interface to parallelization than the built-in multiprocessing interfaces.

Three strikes - time to automate!

I caught myself today repeating a few basic operations by hand what seemed like a zillion times. Over and over again. I didn't really notice it at the time but it was really slowing me down.

For example, after committing to tklbam I would create a tklbam testing package, copy the package to one of my test machines, install it and remove the archive.

My last Perl program - a Perl obfuscater that can eat its own tail

OK, I admit it. I used to program in Perl. And I liked it! My Perl programs were terse. If I could shave a line off, I did. In fact, I spent a non-trivial amount of time figuring out the shortest possible programs that solved various problems. Often that meant resorting to various tricks and arcane features of Perl that nobody other than me would bother to understand. I took pride in that.

Python optimization principles and methodology


The basic methodology for optimization:

  1. Discover where your program is spending its time (hotspots vs coolspots)

    A good way to get an overview is to use the Python profiler. The Python profiler is usually included in Python's standard library:
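
    As a rough illustration of that first step, here is a minimal sketch using the standard `cProfile` and `pstats` modules. The functions `slow_sum`, `fast_noop`, `workload` and `profile_report` are invented for this example; they stand in for whatever your own program does.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive hotspot: sum of squares in a Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_noop():
    """A coolspot: returns immediately."""
    return None


def workload():
    """Toy program that mixes a coolspot with a hotspot."""
    fast_noop()
    return slow_sum(200_000)


def profile_report(func, limit=5):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return result, out.getvalue()


result, report = profile_report(workload)
print(report)
```

    In the printed table, the rows with the largest cumulative time are the hotspot candidates. For a whole script, `python -m cProfile -s cumulative yourscript.py` gives a similar overview without changing any code.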

4 simple software optimization tips

1) Always be experimenting!

Trying to squeeze more performance out of your program? Don't be afraid to experiment!

In practice, that means you set up small, simple throwaway experiments to establish how things work when you're not absolutely sure you fully understand something (e.g., how many times a second a certain function can be invoked, how the profiler measures blocking IO, or the time it takes a sub-program to complete).
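
A throwaway experiment of the first kind (how many times a second can a function be invoked?) might look something like this sketch, built on the standard `timeit` module. The names `calls_per_second` and `candidate` are placeholders made up for the example.

```python
import timeit


def calls_per_second(func, number=100_000, repeat=5):
    """Estimate how many times per second func can be invoked.

    Runs `repeat` batches of `number` calls and uses the fastest batch,
    since slower batches mostly reflect interference from other activity
    on the machine.
    """
    timings = timeit.repeat(func, number=number, repeat=repeat)
    return number / min(timings)


def candidate():
    """Placeholder for the function you are curious about."""
    return sum(range(10))


rate = calls_per_second(candidate)
print(f"candidate(): roughly {rate:,.0f} calls per second")
```

The number itself will vary from machine to machine; the point is that a few lines of code replace guesswork with a measurement.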


