So I've been putting various bits of caching back into both the C++ and the Java code. With nothing cached, the calculations were taking 1:47. Caching two data items cut the time to 1:25, which was respectable, but still slower than I had in mind.
I have now cached a third bit of interesting data, which cuts the calculation time down to 57 seconds, a little over half the uncached time.
This is good.
Tomorrow I'll finish this up and try caching a fourth and final bit of data, and we'll see where we end up. :)