Aug 4, 2009

creating a codeswarm movie

code swarm frame

Download video (3Mb)

A codeswarm is a visualization of the activity within a source code repository. The image and linked video above show the lifetime of one of Verilab's source repositories. You can see code being created, the check-ins as they happen and an indication of which users are doing the work at any given time. It is an example of an 'organic information visualization' and is created using the Processing toolkit. The original visualization tools were developed by Michael Ogawa and the source code is available on Google Code.

In this particular case I created the animation on OS X 10.5, using a combination of codeswarm, ffmpeg and LAME. If you are interested in doing something like this yourself:

First you'll need to make sure you have a recent version of the Java Development Kit installed (JDK 1.5 or later). You'll also need a recent version of Ant installed (I have version 1.7.0, which ships with OS X as default). Download the code_swarm source and install it. Then execute 'ant run'. If all is well, you should get a dialog box prompting you for the source repository, user name and password.

At this point, I put in the svn+ssh URL for the Verilab repository that I wanted to visualize. Everything fell over, with a Java error (NoClassDefFoundError within com/trilead/ssh2). From this I realised I needed to install the SSH libraries for Java, from Trilead. I downloaded those, unpacked them and added the jar file to my CLASSPATH. Along the way I found out the default OS X CLASSPATH definition is in /System/Library/Java/JavaConfig.plist which may be useful as a starting point.

With that fixed, I again ran 'ant run' and put in the relevant information. A bit of time passes as the checkin information is extracted from the repository, then the visualisation runs. The extracted repository information is saved under the ./data directory (look for the latest realtime_sample.*.xml file). This is useful for the next stages, as you don't have to fetch the information again. If you want to create a video of the visualisation, there are a few more hoops to jump through.

You will need to configure codeswarm to save the frames for each stage of the visualisation. You do this by editing the ./data/sample.config file. First off, copy it to a new version for your particular project. Then edit these values:

  • InputFile= [Point it at the new realtime_sample<number>.xml file in the data directory, that contains the checkin information for your project]
  • TakeSnapshots=true

That's all you really need to change. You can also change the other values, to alter the visualisation. The ColorAssignX= statements use regexp values to distinguish between types of checkin and colour code them accordingly. Play around with the other values, with TakeSnapshots set to false, and re-run the visualisation until you get something you are satisfied with. Then run one more time with TakeSnapshots=true to save off the frame images. You can run with the new configuration by running 'ant run data/your_project.config'.
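If you would rather script the copy-and-edit step than do it by hand, something like the following might work. It is only a sketch: it assumes the config file is made of simple Key=value lines, and the function names are my own invention for illustration, not part of code_swarm.

```python
from pathlib import Path


def latest_events_file(data_dir):
    """Return the most recently modified realtime_sample*.xml in data_dir, or None."""
    candidates = sorted(
        Path(data_dir).glob("realtime_sample*.xml"),
        key=lambda p: p.stat().st_mtime,
    )
    return candidates[-1] if candidates else None


def override_settings(config_text, overrides):
    """Return config_text with the given Key=value settings replaced.

    Assumes plain Key=value lines. Lines starting with '#' are left alone,
    and keys that are not already present are appended at the end.
    """
    remaining = dict(overrides)
    out = []
    for line in config_text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_setting = "=" in line and not line.lstrip().startswith("#")
        if is_setting and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


def make_project_config(sample_config, project_config, data_dir):
    """Copy sample_config to project_config, pointing it at the latest events file."""
    events = latest_events_file(data_dir)
    if events is None:
        raise FileNotFoundError(f"no realtime_sample*.xml found in {data_dir}")
    text = Path(sample_config).read_text()
    new_text = override_settings(
        text,
        {"InputFile": events.as_posix(), "TakeSnapshots": "true"},
    )
    Path(project_config).write_text(new_text)
    return events
```

Run from the code_swarm directory, a call such as make_project_config("data/sample.config", "data/your_project.config", "data") would leave the original sample.config untouched and write the two changed values into the new file.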

After running with TakeSnapshots enabled, you'll have a set of images in the ./frames directory (controlled by the SnapshotLocation option in the config file). The final step is to assemble those into a movie. The easiest way I found to do this is to use the command-line utility, ffmpeg. There are a variety of ways to install ffmpeg, but the simplest way seems to be to install ffmpegX and then extract the binary from the application bundle. You can also get it using Fink or MacPorts. If you want to use an audio track with your visualisation, you will also probably require LAME. With ffmpeg working, it is simple to point it towards the image files from codeswarm and produce the final movie. The finishing touch was adding some music from an mp3 file, then limiting the duration via the -t switch, to end when the video frames ran out, rather than playing all of the music.

ffmpeg -f image2 -r 24 -i frames/code_swarm-%05d.png -i 6_sym.mp3 -qmax 15 -t 100 <output_filename>.mpg

You can run 'ffmpeg' without any switches to get help on the options. If all goes well, you should end up with an MPEG format video in the file <output_filename>.mpg.
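The value for -t is just the number of frames divided by the frame rate. For instance, 2400 frames at 24 frames per second would last 100 seconds, which is the -t value used above. Here is a rough sketch of working that out and building the command line in Python. The function names are made up for illustration, and it assumes the frames are named code_swarm-NNNNN.png as in the command above. The image2 format and frame rate are placed before the image input here, since ffmpeg applies each option to the next file named on the command line.

```python
from pathlib import Path


def clip_duration_seconds(frame_count, fps):
    """Length of the video in seconds: number of frames divided by frame rate."""
    return frame_count / fps


def ffmpeg_command(frames_dir, audio_file, output_file, fps=24):
    """Build an ffmpeg argument list whose -t matches the frames on disk."""
    frame_count = len(list(Path(frames_dir).glob("code_swarm-*.png")))
    if frame_count == 0:
        raise ValueError(f"no code_swarm-*.png frames found in {frames_dir}")
    duration = clip_duration_seconds(frame_count, fps)
    return [
        "ffmpeg",
        # Input options for the image sequence come before its -i.
        "-f", "image2",
        "-r", str(fps),
        "-i", f"{frames_dir}/code_swarm-%05d.png",
        "-i", str(audio_file),
        "-qmax", "15",
        # Stop when the frames run out, rather than playing all of the music.
        "-t", f"{duration:.2f}",
        str(output_file),
    ]
```

The list returned by ffmpeg_command("frames", "6_sym.mp3", "movie.mpg") can be handed to subprocess.run() to actually produce the movie.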

There are comments.

iPhone development

iPhone Simulator

Interested in learning about iPhone development? Want to study at Stanford? Don't want to pay the tuition fees? On iTunesU (the lecture streaming part of iTunes) you can follow along with class cs193p from Stanford, on iPhone Application Development. In addition to the good quality video of the lectures, all of the class slides, handouts and assignments are available, for free. If you have an Intel Mac, you can also download the development tools, iPhone SDK and a simulator, again all free. If you do want to actually develop and test applications on an iPhone or iPod Touch, you'll need to pay the $99 developer fee to get the encryption keys that let you run applications on a phone and allow you to submit apps for the App Store. At least for the basics, the simulator is useful as a target platform for testing, although there are differences between it and the final platform (features such as multi-touch and the accelerometer are hard to test, for example, unless you want to start shaking your computer).

Lifehacker recently had an article on all of the educational resources that are becoming available on the web, for free. iTunesU is a good example of the sort of teaching resources that are out there, if you look. The quality is variable, but there are some excellent resources if you are prepared to dig.


Jul 11, 2009

learning guitar with Garageband

I've just recently started trying to learn to play guitar. My early experiences with music would be best described as making a bad noise, on the recorder. Still, I've decided to give it another go. One thing I've found early on is that playing along to a metronome really helps with keeping time, but it isn't the most interesting thing to listen to. To liven things up a little, I've been playing around with GarageBand on my Mac. This is an application that comes with the iLife suite and is preinstalled on most Macs. I suspect many owners don't use it for much, other than maybe making the odd iPhone ringtone (you can make them for free with GarageBand, rather than paying money to get them in iTunes). GarageBand also comes with several free lessons built in that are useful, if a bit limited. I've been using it mostly to record my practices and provide backing rhythm for my practice time.

Instead of playing along to the metronome, I've been making up click tracks to play to. These are simple drum loops that I use as a backing beat. It is easy to change the tempo of the drum beat and make it loop indefinitely. I've found free loops and there are also several vanilla loops that ship with GarageBand that are useful for click tracks (such as Straight Up Beat 01). I just choose the one that has the closest feel to the rhythm I'm trying to play, then drag the loop into the main display. You can extend the loop by dragging the top right hand corner of the track out to the right. After that is done, set a cycle region to loop over the number of bars that you want it to play for and play along.

One of the nicest features of GarageBand is the ability to retime any of the loops to a given tempo, so I can start out slow and build to a faster beat, without having to change the basic arrangements of backing tracks. I can add more software instruments as I go, but for now I've been keeping things quite simple. I have played some of the melodies on the musical score editor, so that I can hear how things should sound and have that playing quietly under the track as I play along. You can also drag in tunes straight from iTunes and have those as backing tracks as well. These can also be retimed and slowed down (control-option-G is the secret undocumented shortcut you'll need).


This has been helpful so far. I've also started recording my attempts on a separate track (I've got an acoustic-electric guitar, so I can hook it straight into the computer). Listening to it I can hear the mistakes, but I can also pull up waveforms and visually see just where and when I was off beat. That's helped me start to tighten things up. I've been impressed by the features available in this free suite of tools. Certainly not as all encompassing as Logic Studio or Pro Tools, but more than I need to be able to help me learn more quickly.

After a few days of this, I realised the biggest problem was having to hit a key to start things recording, then quickly grab the guitar and get ready to play. GarageBand will provide a 4-beat count-in, but that isn't really enough for me to get ready and away from the keyboard. I found an interesting solution, where I worked out how to use my iPhone to remotely control GarageBand, using the Open Sound Control protocol, some free GUI creation tools and a bit of Python scripting. More about that later.


May 8, 2009

a real, working time machine

Was working on some files on my Mac last night, editing photographs from a recent trip. It was getting late and I accidentally deleted a directory of images. Now my images are always backed up anyway, on a separate attached storage device, so this wasn't too big of a mistake. There's another copy of the same images on some DVDs, too. But it still would have taken time to go and find them and recover the images. I'd also have lost the various edits I'd done over the last few hours. Annoying, certainly, but not fatal. However, this is a Mac, and I've been running Time Machine to do automatic backups, to a cheap, 1TB USB drive I have attached to the machine. It took me about 30 seconds to unroll the damage using that, via the nifty graphical interface. I lost about 10 minutes of work in total. I was impressed by how well it works, much like the daily and hourly .snapshots you get with a much more expensive NetApp filer. Seems to have been done right.


Apr 30, 2009

what did you say?

Wordle - Gordon_s tweets-1.jpg

I've been poking around at the Twitter API, in part just out of curiosity about what features are exposed. I have an interest in writing some visualisation widgets based upon it. The iPhone development course is also using a Twitter client as something of a 'hello world' app. Today, Tim O'Reilly pointed to a Wordle visualisation of all the things that he's tweeted and gave a link to some code that could be used to download everything you'd tweeted. I had a look at it and decided to write something similar, using the Twitter API directly, rather than scraping the Twitter site.

The Twitter API library I've been using is the excellent, minimalist Python Twitter Tools by Mike Verdone. The main advantage over other Python Twitter libraries is that ptt doesn't redefine any of the API calls. It does exactly what it says in the published Twitter API. As a result, it is incredibly easy to use. The 100 or so lines it is implemented in are also a very instructive read, to see how it is put together. I think it is a great example of how the attributes in Python can be used.

The code I wrote is available for download. It respects the rate limiting imposed by Twitter and will output all of the tweets for a particular user, to a file called <username>.tweet in the directory it is run from. You can change which users are fetched in the main() function. The resulting text file can be opened up and then copied and pasted over into the Wordle creator.
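To give a flavour of the approach, here is a minimal sketch of the paging and rate-limiting loop. It is not the downloadable script itself. The fetch_page callable is a hypothetical stand-in for whatever Twitter API call returns one page of a user's tweets as a list of strings, and the requests_per_hour default is an assumed figure rather than a documented limit.

```python
import time
from pathlib import Path


def dump_tweets(username, fetch_page, out_dir=".", requests_per_hour=100,
                sleep=time.sleep):
    """Write every tweet for username to <username>.tweet, one per line.

    fetch_page(username, page) is a hypothetical callable returning a list
    of tweet texts for that page, or an empty list once there are no more.
    """
    # Spread requests evenly so we stay under the assumed hourly limit.
    delay = 3600.0 / requests_per_hour
    out_path = Path(out_dir) / f"{username}.tweet"
    count = 0
    page = 1
    with out_path.open("w", encoding="utf-8") as out:
        while True:
            tweets = fetch_page(username, page)
            if not tweets:
                break
            for text in tweets:
                out.write(text + "\n")
                count += 1
            page += 1
            sleep(delay)
    return out_path, count
```

Passing the fetch function and the sleep function in as arguments keeps the loop easy to try out without touching the network.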


Apr 26, 2009



Just installed DD-WRT on my Linksys WRT54G wireless router. I'd been meaning to do this for a while - as an easy way to get a much more functional router than the default firmware shipped by Linksys. What finally motivated me to do it was the recent storm about Time Warner Cable introducing bandwidth caps in Austin. Although TWC seem to have backed down for the moment, they have also recently started disconnecting customers for 'using too much bandwidth' on their infinite bandwidth contracts. The DD-WRT firmware gives me an independent way to monitor my usage and get an idea of how much I typically transmit & receive.

The DD-WRT firmware install wasn't quite as smooth as the documentation might make you believe. The first time I installed and then tried to update the firmware, I got a fairly unhelpful 'Error 2: Access violation' error from the tftp prompt and not much else. I went back through the management mode initial vxkiller upload and things seemed to work better the second time around. For a while I was worried that I had a brick of a router.

Once back up and running, the settings were very similar to the previous Linksys options, so it was quite quick to reconfigure the wireless settings, port forwarding, DMZ etc. that I was using previously. Now I have historical and realtime graphs of bandwidth usage available. Should be interesting to be able to monitor what's going on. If they are cutting people off for using 44GB per week and saying “that is more than most people use in a year” I am a little concerned at my 7.2GB in one day. That was a few iPhone development videos from Stanford and then we watched Quantum of Solace last night on the Xbox. Seems like Time Warner consider that aberrant behaviour.


Feb 23, 2009

the death of books

ghost rider

The reports of my death are greatly exaggerated

- Mark Twain

It seems that some universities are moving away from physical books, switching entirely to electronic textbooks. My initial reaction is that this is just a little bit crazy. Electronic reference materials have a place, but I have real difficulty seeing electronic-only textbooks as the best approach. Certainly there is a financial justification and a reduction in the physical weight the students have to carry. There is no doubt an advantage for the book stores, having to carry less physical inventory and ship it around the country.

But none of this takes into account how you interact with a physical book. It just isn't the same having the material online or in a PDF on a laptop. A screen is harder to read (a good laptop screen is still less than 100dpi, books and print are 300dpi or more) and as it is a lower resolution than printed material, you can only see a small amount of the information at a time. Diagrams and accompanying text are often hard to see all in one place. This is part of the reason why reading on a screen can be so tiring. Also the backlit text is harder on the eyes than reading from a page. The second drawback is how you physically interact with a book - flicking quickly through pages, marking pages with a highlighter, inserting post-it notes, curling up in a chair to read a book, spreading several books and notes out across a table. All of these interactions may eventually be replaced with digital analogues that are as powerful or more so, but it seems we are quite far from that time.

The Amazon Kindle is probably about as good as this gets just now and from what I can tell, it still falls far below a good hunk of printed tree. The Kindle does have a higher resolution screen, which helps with reading for a long time, but the screen is small and the navigation feels clunky. Laptops are worse.

I do find a lot of value in online reference books. I've had a subscription to O'Reilly's Safari for over a year now and have found it to be invaluable, particularly when traveling. I can have access to a variety of reference texts, easily searchable, almost always available (if you have an internet connection). However, I've never been able to read any of the books I have on my Safari subscription, for more than a few pages. It just doesn't seem to work for me. No doubt I'm destined to become a relic in my views on reading, but it seems that we approach reading on a screen differently to a book. I'd love to have some sort of larger Kindle device, linked to a Safari subscription. Some way to really read those books on Safari, rather than just treating them as reference works. It always feels that this is just right around the corner, yet we never quite get there.


Feb 16, 2009

edward tufte and presenting data


I was lucky enough to attend a seminar from Edward Tufte, a couple of weeks ago, on the Presentation of Data and Information. Edward Tufte is probably best known for the book 'The Visual Display of Quantitative Information' and was an engaging and entertaining presenter. He has a very different style from the normal PowerPoint-driven presentation approach. In fact, much of his work is railing against the uses and abuses of PowerPoint and similar slide techniques.

The main take-away I got from the whole day was that if you have to communicate complicated data sets or information, you really need to consider how people will use and interact with the data first. Too often, we go straight to presentation software and start trying to work out how to express the information in slides, rather than taking the time to consider if there are other, better ways to impart the information. Tufte was very keen on the concept of a 'super-graphic' which is a data rich, high resolution physical handout that lets participants see and consider a lot of data at once. A map is a great example of a super-graphic, or the weather page in a typical newspaper. A key part of this is that paper is much higher resolution than a typical computer screen (72dpi to 600dpi means you can show a whole lot more data in the same space). This is why multiple display screens are really useful for serious work. It also means that printing out and sharing data is a great way to get information in front of people in a meeting, rather than drip feeding it from slides.

I compare this idea to another guide I saw in the same week on creating PowerPoint presentations, which admonishes that there should never be more than 8 numbers on any slide or graphic. Tufte's response to this was repeatedly 'when did we become so stupid, just because we walked into a business meeting?' People handle large, complex data displays every day in the real world. People read and study sports scores in a newspaper, or financial reports without any trouble at all.

Let the data drive the presentation format, rather than the presentation software drive how the data is displayed.


stop & search

a cold day in London

One of the least enjoyable experiences on a recent trip to London, last week, happened while I was taking pictures of the London Eye. I was standing a few hundred meters away, shooting with a normal point and shoot camera, just like all the people around me, when a couple of police officers approached me. I'd heard about photographers being hassled in London but was surprised this managed to happen to me within 48 hours of arriving in the city. They started out by saying that 'they didn't really believe I was a terrorist, but were stopping photographers to make people aware that they were watching what was going on'. From there, they handed me a form that listed my rights under Section 44 of the anti-terrorism law then proceeded to question me about what I was doing, where I was from, why I was taking pictures.

As far as I can tell, even though they themselves said they had no reasonable cause, the Terrorism Act says that's fine. We spent about 5 minutes going through where I've lived and having me justify why I take pictures. Then they wanted to see all the pictures I'd been taking (again, as far as I can tell, in contradiction of their own guidelines on collection of evidence). On looking through the images, one of the officers stated that 'those look just like the sorts of pictures a terrorist would take' and then told me to move on. The picture above is what I was taking, when they stopped me. I got a 'stop and search' form listing that the stop was authorised under the anti-terrorism laws and that it was part of a 'pre-planned op'. I can only assume from this that the London police have decided to institutionalise harassing photographers for the sake of security theatre. Particularly if, when they find images that they think would be typical terrorist images, they wave the photographer on.

This is all in a city that seems to have more CCTV cameras everywhere than there are people. I'm not quite sure who, if anyone, is actually watching these camera feeds. The whole thing is quite worrying, for someone who has been out of the UK for a few years. We used to make jokes about books like 1984 or movies like V for Vendetta but it seems that piece by piece typical rights to privacy are being whittled away by a government that is using good intentions to grab as many additional powers as possible. Sure, it is just hassling, for no reason, a photographer in the street taking pictures of a tourist attraction, but each time it has an increasingly chilling effect on what people feel they can do and what government authorities can get away with doing. I didn't argue with the particular officers, mainly as I didn't want to spend half my day discussing it in a police station on my holiday. Maybe that's part of the problem too.

'There's an implicit admission that Section 44 stops and searches do not detect terrorists. This is borne out by the available data. In the financial years 2003/4 to 2006/7, the Met stopped and searched 31,797 pedestrians using the powers of Section 44(2); of these only 79 were arrested in connection with terrorism - less than a quarter of a percent - and even fewer will be convicted. The purpose of deterring is feeble considering the extent to which the Home Office is ready to go to avoid revealing when and where the exceptional powers for Section 44 apply.'

At the end of this five minute waste of time, they started asking me about the number of megapixels my camera had, commented on how impressed they were by the quality of the pictures on the screen and asked where they could buy one and if I'd recommend it.


Jan 8, 2009

maker's schedule, manager's schedule

I've always had a dread of mid-afternoon status meetings. Now maybe I understand it a bit better, because of Paul Graham's excellent essay on the difference between being on a maker's schedule and being on a manager's schedule. Seems to share a lot of ideas with Csikszentmihalyi and his ideas of creativity and flow states.

