A holiday plea

Shortly before the holidays I made donations to three organizations: the Free Software Foundation, Wikipedia, and Creative Commons. If you'll kindly indulge me for a minute, I'll explain why I think the work of these organizations is so important to an open internet and a free and prosperous society. Consider giving whatever you can spare (whenever you can spare it) to one of these groups.

1. So much of our everyday lives—both work and play—depends on the operation of software that we cannot really claim to be free people unless we are using free software. I'm donating to the Free Software Foundation both as thanks for the GNU operating system and in support of their campaigns. The GNU OS—from ls to emacs, and everything in between, and beyond—eclipses, in terms of power and productivity, pretty much any other OS you can buy. But peace of mind is even more valuable than technical power. As someone whose livelihood directly depends on software, it would be foolish in the extreme for me to compromise my autonomy and financial security by using proprietary software and "giving the keys to someone else."

The FSF also promotes awareness of a number of threats to freedom and innovation, like DRM (vile, vile stuff) and proprietary document formats (which are antithetical to the democratic idea of the free interchange of information).

2. I'd say that being able to learn about anything, anytime, on Wikipedia has been a pretty life-changing experience. I don't need to explain what this is like to most of you. But Wikipedia isn't valuable just because it satisfies my idle curiosities. One of the hats I wear is that of "teacher," and I love sharing knowledge. The dissemination of knowledge is one of the surest ways to produce prosperity. I'm donating to Wikipedia on behalf of children and other curious people everywhere.

3. Creative Commons is creating a body of actually useful creative works, as well as encouraging people to rethink copyright law. I feel like I've discovered a gem every time I find an ebook I can copy for offline reading, or music I can share with friends, or a comic or photo that I can put on my blog. What is perhaps more valuable is that CC is planting the idea in people's heads that maybe we can be more prosperous as a society if authors allow their work to be used in more ways rather than fewer.


Magit is a spectacular Emacs add-on for interacting with git. Magit was designed with git in mind (unlike VC mode, which is a more generic utility), so git commands map quite straightforwardly onto Magit commands. M-x magit-status tells you about the current state of your repo and gives you one-key access to many common git commands. However, what really sold me on Magit was its patch editor, which completely obsoletes my use of git add, git add --interactive, and git add --patch. If Magit had this patch editor and nothing else, I would still use it. That's how great this is.

M-x magit-status (which I've bound to C-c i) tells you about your working tree and the index, kind of like a combination of git diff, git diff --cached, and git status. It shows several sections (Staged changes, Unstaged changes, and so on); within each section you can see which files have been modified; within each file you can view the individual hunks. Within any of these containers you can press TAB to expand or collapse the heading. Moving your cursor into a file header or a diff hunk header selects the changes in that file or hunk, respectively. You can then press s to stage those changes, as shown in these before-and-after pictures:

Once you're satisfied with your staged changes, you can press c to commit, which prompts you for a log message. After you've typed a message, C-c C-c performs the actual commit.

This is already much faster than using git add --interactive or git add --patch to stage parts of a file. You just find the hunk you want rather than having git ask you yes/no about every hunk.

However, Magit also allows staging changes at an even finer granularity. If you highlight some lines in a hunk and then press s, Magit only stages the selected lines, as shown in these before-and-after pictures:

When in doubt, it's a good idea to make small commits rather than large commits. It's easy to revert (cherry-pick, explain, etc.) more than one commit, but hard to revert half a commit. Kudos to Magit for making small commits easier to create.

Finally, Magit comes with a fine manual, which you can read online.

Installing Magit

It doesn't get too much easier than this for external Emacs packages.

Check out Magit:

git clone git://gitorious.org/magit/mainline.git

Make sure that magit.el from that checkout, or a copy, is on your load path. For example:

(add-to-list 'load-path (expand-file-name "~/.emacs.d/lisp"))

Autoload Magit and bind magit-status:

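;; Load magit.el lazily, the first time magit-status is invoked.
(autoload 'magit-status "magit" nil t)
;; Bind magit-status to C-c i.
(global-set-key (kbd "C-c i") 'magit-status)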

Ubuntu Jaunty Jackalope

(I am catching up on all the things that I've been meaning to blog about in the past few months.)

There are already good reasons to upgrade to the Jaunty Jackalope (development release), either for good or just to install the following two packages:

(1) git 1.6.x (release announcement).

(2) Firefox 3.1 beta. I have to say, I wasn't sold on Firefox 3.0. But Firefox 3.1 has convinced me to switch back from Epiphany. First, it is blazing fast. Second, in continual usage for several weeks now, it seems to be pretty crash-proof. Third, it actually has a bookmarks system that I would use. When Google can get you what you want 0.5 seconds after you type it in, you really have to rethink the idea of poking around through menus to find your favorite sites. Anyway, my gratitude goes to everyone involved.

Epiphany had also been starting to get on my nerves lately. It seems to crash at least once every other day. The address bar is really laggy sometimes (if you've ever used SSH over a high-latency connection, you know how irritating this is). And it is not as fast as Firefox 3.1, at least not yet.

Jaunty is interesting for another reason. In this release, Ubuntu is attempting to make bzr repositories available for the packaging+source of every single package. I am looking forward to seeing what people will do with this. If it could make it easier for casual developers to get the source for a package, poke around to fix a bug, isolate their patches and send them to Ubuntu (or upstream), it could be a huge force multiplier.

The Slashdot Top 40

For our machine learning project, we attempted to automatically guess ratings or labels for Slashdot comments based on their content. As a side effect, we generated some data on what words and phrases tend to appear disproportionately often in high-ranked (low-ranked, interesting, uninteresting, funny, unfunny, etc.) comments.

The set of the top 40 "Funny" phrases turns out to be a hodgepodge of cultural references. I am not sure I understand all of them.

1 xkcd.com $
2 xkcd.com
3 nukem forever
4 carrier $
5 slashdot editor
6 skynet
7 clod $
8 woman $
9 grue
10 newt
11 no carrier
12 asparagus
13 nigerian prince
14 porn with
15 grue $
16 an outrage
17 kentucky $
18 eight camera
19 reality distortion
20 god what
21 six video
22 electronic games
23 locally $
24 paperbacks
25 distortion field
26 its belly
27 my underwear
28 am intrigued
29 penny-arcade.com $
30 priceless $
31 lycra
32 emacs $
33 polar bear
34 cried out
35 burma shave
36 an african
37 porn for
38 your grip
39 expects the
40 not talk

("$" means end of comment; "^" means beginning of comment.)

The list of top "Interesting" phrases suggests that workplace stories are interesting:

employees were; what worked; department i; our clients; wap; reviews on; file servers; work etc; could connect; stance that; updates the; those available; hitting my; europe to; i'm seeing; happening with; snuff; time anyone; spam has; to snuff; the bases; thin and; my college; street to; extreme programming; be neutral; late 19th; management they; from game; tenacity; withstanding; own account; right beside; magpies; from intel's; my food; obscure stuff; language when; and trash; been dragging

Meanwhile, the phrases least likely to be found in "Interesting" comments are either insulting or profane:

^ no; insensitive; again $; you insensitive; ^ oh; clod; insensitive clod; ^ you; ^ just; ^ then; ^ well; the hell; ^ or; slashdot; ^ and; you $; post $; ^ yes; ^ why; ^ but; ^ yeah; um; you mean; ^ they; wikipedia.org $; then $; religious; ^ now; clod $; mod; is called; ^ not; right $; ^ he; ^ ah; first post; ^ is; ^ your; ^ it's; fuck

These lists were generated using a corpus of 55,561 comments posted between June and November 2008.
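
For the curious, here is a minimal Python sketch of how lists like these could be produced. It is not the project's actual code: the whitespace tokenizer, the add-one-smoothed log-ratio score, the names phrases and top_phrases, and the toy comments are all assumptions made for illustration.

```python
import math
from collections import Counter


def phrases(comment):
    """Return the set of words and two-word phrases in a comment.

    "^" and "$" mark the beginning and end of the comment, so a phrase
    such as "xkcd.com $" means the comment ends with xkcd.com.
    Tokenizing on whitespace is an assumption made for this sketch.
    """
    words = comment.lower().split()
    tokens = ["^"] + words + ["$"]
    grams = set(words)
    grams.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return grams


def top_phrases(comments, label, k=40, min_count=2):
    """Rank phrases by how disproportionately they appear under `label`.

    `comments` is a list of (text, label) pairs. The score is the log of
    the ratio between the smoothed fraction of labeled comments that
    contain the phrase and the same fraction for all other comments.
    """
    inside, outside = Counter(), Counter()
    n_inside = n_outside = 0
    for text, comment_label in comments:
        if comment_label == label:
            inside.update(phrases(text))
            n_inside += 1
        else:
            outside.update(phrases(text))
            n_outside += 1

    scores = {}
    for phrase in set(inside) | set(outside):
        a, b = inside[phrase], outside[phrase]
        if a + b < min_count:
            continue  # too rare to say anything about
        p_inside = (a + 1) / (n_inside + 2)
        p_outside = (b + 1) / (n_outside + 2)
        scores[phrase] = math.log(p_inside / p_outside)

    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda p: (-scores[p], p))[:k]


# Made-up comments, only to exercise the functions above.
toy = [
    ("Obligatory link: xkcd.com", "Funny"),
    ("I for one welcome skynet", "Funny"),
    ("See xkcd.com", "Funny"),
    ("Our clients use file servers", "Interesting"),
    ("Our file servers run all night", "Interesting"),
]

print(top_phrases(toy, "Funny", k=2))  # ['xkcd.com', 'xkcd.com $']
```

Sorting the same scores in ascending order would give the phrases least likely to appear under a label, which is the shape of the last list above.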