Tuesday, January 04, 2011

Work Imitates Work

For those of you who are not familiar with supercomputing, most of your "big name" supercomputers are not made of parts that are appreciably faster than the ones in your laptop. What makes them fast and super is scale. You take the equivalent of tens of thousands (or hundreds of thousands) of laptops and program them in a way that chops up the problem into little pieces. Then you have each laptop-equivalent work on its own piece, and you pool the answers. That's (typically) how supercomputing happens.

And people write papers about how much faster the program runs as you throw more resources at it. If you have it run on two cores instead of one, it runs almost twice as fast. If you have it run on 16 cores instead of one, it might run 14 times faster. If you have it run on 1000 cores instead of one, it might run 600 times faster. Doubling the resources won't necessarily double the speed at which things run. The more pieces that you chop things into and the more cores that you have computing, the more time that is devoted to communication between the cores instead of to computing. When you have a problem that is broken up into a lot of little parts, a decent amount of computing power is devoted to keeping track of the parts and what they're doing and getting them to play together nicely -- and that means that this computing power can't do actual number-crunching because it's busy doing administration.
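To put rough numbers on that, here is a small sketch in Python that takes the made-up figures from the paragraph above (14 times faster on 16 cores, 600 times faster on 1000 cores) and works out how much of the hardware is effectively doing number-crunching. The function names are my own, and "cores lost to overhead" is just an informal way of reading the gap between ideal and actual speedup, not a standard metric.

```python
def efficiency(cores, speedup):
    """Fraction of the ideal (linear) speedup that was actually achieved."""
    return speedup / cores


def cores_lost_to_overhead(cores, speedup):
    """Informal reading of the gap: how many cores' worth of effort
    went to communication and bookkeeping instead of computing."""
    return cores - speedup


# (cores, speedup) pairs: illustrative figures from the post, not measurements
runs = [(16, 14), (1000, 600)]

for cores, speedup in runs:
    print(
        f"{cores} cores: {efficiency(cores, speedup):.1%} efficient, "
        f"roughly {cores_lost_to_overhead(cores, speedup)} cores' worth "
        f"of effort spent on overhead"
    )
```

With these figures, the 16-core run gives up about 2 cores' worth of effort to administration, while the 1000-core run gives up about 400.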

I work with a supercomputing center. Everyone I work with should be familiar with this principle.

And yet: Every single week we all gather in the auditorium for a full-staff meeting with a really vague agenda that is not adhered to.