Oct 18, 2014  03:10 AM | Torsten Rohlfing
RE: Building a warping server
Hi Owen -

Let's start with the simple part first. If it's CMTK you want to use for warping, you can safely forget about the graphics card. CMTK does have a small number of tools using GPU (MR bias correction, levelset segmentation, symmetry computation), but all registration tools are purely CPU.

If you are planning to also use other packages, such as niftyreg, then you may want a GPU, but you'd have to ask someone involved with such other tools to figure out their exact recommendations.

Now as for memory and CPU - in general, you get the best throughput if you run each registration on a single core, but many of them at the same time. For example, with 32 cores you would run single-threaded registrations for 32 different image pairs in parallel. That's best because there is virtually no serial overhead if you are using this kind of "trivial" parallelism.

If you don't have quite enough cases to run that many in parallel, or if every single case runs so long on a single CPU that you spend a lot of time waiting for that very last case at the end of your batch, then you want to crank up the number of cores per registration. When I used to run large batches (several hundred cases), I would usually go for 4 CPUs per job, and 8-24 cores per job when running a smaller, incremental set (this is then basically determined by the number of servers you have - if you have 8 machines with 32 cores each to run 16 cases, then of course you would give each case 16 cores).
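
To make the arithmetic in that last example concrete, here is a rough sketch. The function name and arguments are made up for illustration and have nothing to do with CMTK itself; it also assumes a single registration cannot span more than one machine.

```python
def cores_per_case(machines, cores_per_machine, cases):
    """Split the available cores evenly across the cases in a batch.

    Hypothetical helper for illustration only.
    """
    total_cores = machines * cores_per_machine
    if cases >= total_cores:
        # Enough cases to keep every core busy: one core per registration.
        return 1
    # Assumption: a job runs on one machine, so cap at that machine's cores.
    return min(cores_per_machine, total_cores // cases)


print(cores_per_case(8, 32, 16))   # 8 machines x 32 cores, 16 cases -> 16
print(cores_per_case(1, 32, 300))  # large batch on one machine -> 1
```

Note that this only reproduces the "divide what you have" rule; for large batches I would still cap things at around 4 cores per job, as mentioned above.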

Another consideration is available memory. If your images are very large, you might find that you cannot actually run 32 registrations on a 32-core machine, because you have too little memory to fit them all. Clearly you do not want your machines spending their time swapping memory.

Again, as a ballpark figure from my own experience registering human brain MRI on the order of 256^2 x 150 slices per image, I found that 4GB was usually enough for a nonrigid registration.

Sooo... if you have a 32-core machine with 32GB of memory, you'd want to run 8 jobs (32GB at 4GB per job), i.e., 4 cores per job, based on both memory and cores. If you had 64GB, you could run 16 jobs with 2 cores each, but only if you have enough cases in your batch to keep your machine busy for a while.
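
If it helps, the same reasoning can be written down as a small sketch. Everything here is hypothetical (the function name, the parameters, and the 4GB default, which is just my ballpark figure for human brain MRI and will differ for your images):

```python
def plan_jobs(cores, memory_gb, gb_per_job=4, cases=None):
    """Return (parallel jobs, cores per job) for a single machine.

    Hypothetical helper: memory limits how many registrations fit at once,
    and the cores are then divided evenly among those jobs.
    """
    # Memory-limited job count, never more jobs than cores.
    jobs = min(cores, int(memory_gb // gb_per_job))
    if cases is not None:
        # No point planning more parallel jobs than there are cases.
        jobs = min(jobs, cases)
    jobs = max(jobs, 1)
    return jobs, max(1, cores // jobs)


print(plan_jobs(32, 32))  # (8, 4)
print(plan_jobs(32, 64))  # (16, 2)
```

Passing `cases` covers the small-batch situation: with only 4 cases on the 64GB machine, `plan_jobs(32, 64, cases=4)` gives 4 jobs with 8 cores each.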

I hope this all makes sense. Let me know if you have additional questions.

Cheers!
  Torsten
