Friday, June 04, 2010

Today's Adventures in Computer Science

At the moment I'm writing some code that is supposed to run in parallel. It's a "proof of concept" more than anything else; it's supposed to be one example that's part of a larger collection.

For now I am running it on my laptop, which has two cores, instead of the fancy machine at work, which has a lot of cores -- but also the sort of issues that you might expect from an engineering prototype (and which I am not sure exactly how to use). The fancy machine is up a few hours a day -- if we're lucky. Sometime in the next week or so it is being replaced with parts of the real, non-prototype machine, which has even more cores and a bunch of memory. But the hardware swap still won't fix the problem of my continuing ignorance.

So with the fancy machine unavailable I put the laptop to work doing tasks 1 through n in parallel. It spent several hours chugging away (and heating its innards up to 170°F) before I killed the processes. Then I tried doing the tasks in serial: I told the computer to do tasks 1 through n in sequence. That took a few seconds.
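The post doesn't say what language or parallel framework the tasks use, so here is a minimal sketch of that serial-versus-parallel timing comparison, assuming Python's multiprocessing; the `crunch()` work function and the chunk sizes are made-up stand-ins for the real tasks.

```python
import time
from multiprocessing import Pool

def crunch(data):
    # Hypothetical stand-in for one of the n tasks: a sum of squares.
    return sum(x * x for x in data)

def run_serial(chunks):
    # Tasks 1 through n, one after another on a single core.
    return [crunch(c) for c in chunks]

def run_parallel(chunks):
    # Tasks 1 through n spread across the cores (one worker per core
    # by default).
    with Pool() as pool:
        return pool.map(crunch, chunks)

if __name__ == "__main__":
    chunks = [list(range(1000)) for _ in range(8)]

    t0 = time.perf_counter()
    serial = run_serial(chunks)
    t1 = time.perf_counter()
    parallel = run_parallel(chunks)
    t2 = time.perf_counter()

    assert serial == parallel
    print(f"serial:   {t1 - t0:.3f}s")
    print(f"parallel: {t2 - t1:.3f}s")
```

For tiny payloads like these, the parallel version can easily lose to the serial one, since each chunk has to be shipped to a worker process and the result shipped back.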

So next I took each of the tasks and made them trivial. My laptop could do all n of them in parallel in an instant. So I started to make them more and more complex, adding the data back in small doses. What I discovered: if I sent each of the parallel tasks fewer than roughly 8100 numbers to crunch, they could do the task in under five seconds. But if I sent each of the parallel tasks more than that, the machine would run for hours with the CPU use at 100% -- I've never been patient enough to find out whether it ever finishes. There's no in-between state where it takes minutes; it's either pretty quick or crazy-slow. And no, the cutoff between these two behaviors is not at 8192 pieces of data (but it is suspiciously close).
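That ramp-up experiment can be sketched as a search for the cutoff: keep growing the per-task payload and check whether the parallel run beats a timeout. This is a hypothetical reconstruction using Python's multiprocessing -- `crunch()`, the task count, and the timeout are all stand-ins -- and a run that blows the deadline gets its workers killed when the `Pool` is torn down on exit.

```python
from multiprocessing import Pool, TimeoutError

def crunch(data):
    # Hypothetical stand-in for one parallel task: a sum of squares.
    return sum(x * x for x in data)

def finishes_quickly(size, n_tasks=4, timeout=5.0):
    """Run n_tasks parallel tasks of `size` numbers each.

    Returns True if they all finish within `timeout` seconds; on a
    timeout the `with` block terminates the stuck workers.
    """
    chunks = [list(range(size)) for _ in range(n_tasks)]
    with Pool(n_tasks) as pool:
        result = pool.map_async(crunch, chunks)
        try:
            result.get(timeout=timeout)
            return True
        except TimeoutError:
            return False

def find_cutoff(lo, hi):
    """Binary search for the largest payload that still finishes quickly.

    Assumes sizes <= lo are fast and sizes >= hi are slow.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if finishes_quickly(mid):
            lo = mid
        else:
            hi = mid
    return lo

if __name__ == "__main__":
    # With this toy crunch() nothing actually hangs; against the real
    # tasks this would bracket the mysterious ~8100-number cutoff.
    print(find_cutoff(1, 1 << 14))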
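Binary search keeps the experiment cheap: instead of trying every size, each probe halves the remaining interval, so pinning the cutoff inside a range of 2^14 takes about 14 timed runs.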