nat::xform_brain for Skeletons in Python
Sep 10, 2019  10:09 PM | comic
nat::xform_brain for Skeletons in Python
Hi all,

I am working on an HPC pipeline for processing millions of skeleton (neuron) objects that need to be transformed through a known registration. Our colleagues are working in R / Matlab and were using nat::xform_brain. I dug through the code a bit to find where nat::xformpoints was defined, but I had some trouble understanding how it interacts with the C++ cmtk codebase.

Our pipeline is in Python / C++ and ideally I would wrap the relevant parts of CMTK in Cython or pybind11 and integrate this functionality into our pipeline without the need for expensive IPC / IO.

Would someone be able to point me in the right direction?

Thanks!
Sep 11, 2019  11:09 PM | Greg Jefferis
RE: nat::xform_brain for Skeletons in Python
Dear Comic,

Originally posted by comic:
I am working on an HPC pipeline for processing millions of skeleton (neuron) objects that need to be transformed through a known registration.

I'm guessing you're collaborating with Mala Murthy's group on the flywire project ...
Originally posted by comic:
Our colleagues are working in R / Matlab and were using nat::xform_brain. I dug through the code a bit to find where nat::xformpoints was defined, but I had some trouble understanding how it interacts with the C++ cmtk codebase.

https://github.com/schlegelp/navis/blob/...

The eventual endpoint is that R shells out to call the CMTK streamxform tool. The functions that you would want to inspect are here:

https://github.com/natverse/nat/blob/336...

In particular, there is a private function cmtk.streamxform that prepares the command line and uses nat::cmtk.system2 to run it. In R, you could try setting

debug(nat::cmtk.system2)

and then running a single transform to see what the streamxform command line looks like. But ... see end of my response.
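For what it's worth, once you know what that command line looks like, reproducing the shell-out inside a Python pipeline is only a few lines. A rough, untested sketch (the registration path in the usage comment is a placeholder; copy the exact arguments, including any --inverse flags, from the command line that debug() shows you):

import subprocess
import numpy as np

def streamxform(points, registrations, cmd="streamxform"):
    # points: (N, 3) array-like of x y z coordinates
    # registrations: list of registration arguments exactly as they appear
    # on the command line that nat builds
    stdin = "\n".join(" ".join(str(v) for v in p) for p in points)
    res = subprocess.run([cmd, "--"] + list(registrations),
                         input=stdin, capture_output=True, text=True, check=True)
    out = []
    for line in res.stdout.splitlines():
        if "FAILED" in line:
            # streamxform flags points it cannot map (e.g. outside the field)
            out.append([np.nan] * 3)
        else:
            out.append([float(v) for v in line.split()[:3]])
    return np.array(out)

# hypothetical usage:
# xyz_fcwb = streamxform(xyz, ["--inverse", "/path/to/FCWB_FAFB14.list"])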
Originally posted by comic:
Our pipeline is in Python / C++ and ideally I would wrap the relevant parts of CMTK in Cython or pybind11

You have two options here. One is to call the R xform_brain function from Python using rpy2. See

https://github.com/schlegelp/navis/blob/...
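In rough terms, the rpy2 route looks like the sketch below (untested; it assumes nat, nat.flybrains and the CMTK bridging registrations are already installed and working on the R side, and the helper name is just for illustration):

import numpy as np
import rpy2.robjects as ro
from rpy2.robjects import numpy2ri
from rpy2.robjects.conversion import localconverter

# load the template brains / bridging registrations on the R side
ro.r('suppressMessages(library(nat.flybrains))')
# wrap xform_brain once so it can be called like an ordinary Python function
_xform = ro.r('function(pts) nat::xform_brain(pts, sample = FAFB14, reference = FCWB)')

def fafb14_to_fcwb(xyz):
    # xyz: (N, 3) numpy array of FAFB14 coordinates; returns FCWB coordinates
    with localconverter(ro.default_converter + numpy2ri.converter):
        return np.asarray(_xform(np.asarray(xyz, dtype=float)))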

This may seem inelegant, but xform_brain does look after quite a lot of details under the hood and, more importantly, your bottleneck may not be where you think it is. If you do want an example of wrapping CMTK directly, take a look at:

https://github.com/jefferis/cmtkr/

The key point is that it provides a C++ wrapper of the CMTK streamxform tool:

https://github.com/jefferis/cmtkr/blob/m...

Originally posted by comic:
and integrate this functionality into our pipeline without the need for expensive IPC / IO.

But as you will see if you take a look at the README for cmtkr, I never took this forwards, both because there were some glitches getting this to integrate nicely with R that would have taken time to work out *and* because the speed-ups were relatively modest in normal use: the majority of the time was spent inside CMTK computing the transform rather than in the overhead.

So now I come to what may be my most useful advice. I'm not going to quote Knuth at you, but I would recommend that you do a little experiment: you need to know how much time is spent by CMTK computing (nonrigid) transforms versus all the other stuff. I would suggest that the easiest way to do this is to make CMTK apply only the affine part of the transforms in your full transformation chain. As you can imagine, the affine computation time is negligible compared with the nonrigid component, so the affine-only run essentially measures the overhead (IO, shelling out, reformatting) while the difference between the two runs is roughly the nonrigid compute time. Imagine your call in R looks like this:

 
xform_brain(myneurons, sample=FAFB14, reference=FCWB)

Try doing:


xform_brain(myneurons, sample=FAFB14, reference=FCWB, transformtype="affine")

and timing both with system.time or bench::mark. Here is an example using some published EM data:



# log in to the public Virtual Fly Brain CATMAID server hosting the FAFB EM data
vfbcatmaid=catmaid_login(server='https://fafb.catmaid.virtualflybrain.org')
# fetch some published projection neuron skeletons by name
dapns=read.neurons.catmaid('name:PN glomerulus DA', conn=vfbcatmaid)
length(dapns)
## [1] 16
sum(nvertices(dapns))
## [1] 109498
# full (nonrigid) FAFB14 -> FCWB transform
system.time(dapns.pts.fcwb <- xform_brain(xyzmatrix(dapns), sample = FAFB14, reference = FCWB))
## user system elapsed
## 10.889 0.665 9.030
# affine-only version of the same transform chain
system.time(dapns.pts.fcwb.aff <- xform_brain(xyzmatrix(dapns), sample = FAFB14, reference = FCWB, transformtype="affine"))
## user system elapsed
## 6.515 0.723 4.599


Bottom line: I think you will find that avoiding the shell step gets you something like a 2x speed-up. If you need to make big speed gains (think 100x), you will need to change the way the transformations themselves are represented. I'm happy to advise, but that's a longer conversation and I would need to know the details of the transforms that you want to apply.
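To give you a flavour of what I mean by changing the representation: one option is to evaluate the nonrigid transform once on a regular grid and then move your millions of points by interpolating the cached displacement field. A very rough, untested Python sketch (the helper names are just for illustration, and the grid spacing / accuracy trade-offs are exactly the sort of detail we would need to discuss):

import numpy as np
from scipy.interpolate import RegularGridInterpolator

def build_displacement_field(transform, bbox_min, bbox_max, spacing):
    # transform: any callable mapping an (N, 3) array to an (N, 3) array,
    # e.g. a streamxform wrapper; it is evaluated once on a regular grid
    axes = [np.arange(lo, hi + spacing, spacing) for lo, hi in zip(bbox_min, bbox_max)]
    grid = np.stack(np.meshgrid(*axes, indexing='ij'), axis=-1)   # (nx, ny, nz, 3)
    pts = grid.reshape(-1, 3)
    disp = (transform(pts) - pts).reshape(grid.shape)             # cached displacements
    # trilinear interpolation of the displacements; NaN outside the grid
    return RegularGridInterpolator(axes, disp, bounds_error=False, fill_value=np.nan)

def apply_displacement_field(field, xyz):
    # transform points by adding the interpolated displacement
    return np.asarray(xyz) + field(xyz)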

All the best,

Greg.