Dec 20, 2018  11:12 PM | Haleh Karbasforoushan - Northwestern University
How to make it faster?
Dear Alfonso,

Thanks for your continuing effort in improving the Conn toolbox! It has been a pretty helpful toolbox to us!

I'm going to use the CONN toolbox for a functional connectivity analysis of more than 200 subjects. The data, which is a few terabytes, is stored on an external hard drive connected to my laptop through USB 3.0. I'm going to run the analysis on my own laptop, and I've realized that it could take a VERY long time. Perhaps more than a week... Right?

Would you please give me some tips on how I can make it faster? I can't keep the data on my laptop since it's too large.

- Should I make/save the conn mat file also on the external hard drive, or will it be faster if I make it on my laptop?

- I'm going to use the toolbox pre-processing step on structural data for segmentation and normalization, and let the toolbox transfer the GM, WM, and CSF masks to ROIs. Will it be faster if I run this preprocessing using SPM first and upload the preprocessed data to ROIs directly? Or won't it make any difference?

- Is there anything else I can do to make the analysis faster?

Thanks a lot in advance!
Haleh
Dec 21, 2018  05:12 PM | Alfonso Nieto-Castanon - Boston University
RE: How to make it faster?
Dear Haleh,

While USB 3.0 is relatively fast, depending on the length of your data you are still probably looking at something around 1 hour per subject. You can easily test that by running a single subject first, which is not a bad idea: it will help you debug any issues before committing to waiting a week for your results, and it will give you a good estimate of how long it will take to run all subjects. I do not believe that either storing the CONN project locally or running preprocessing separately in SPM would significantly reduce computation time. The main thing that, in my experience, does significantly speed things up is parallelizing your analyses. Some options would be:

a) HPC / cluster. Many institutions offer simple access to cluster computing resources, and CONN works with many standard cluster configurations right out of the box, so that might be worth investigating (see http://www.conn-toolbox.org/resources/cl... for additional info). If you follow this route, you would connect to your institution's network remotely, run CONN, and then simply select the parallelization option that reads "distributed processing" when running your analyses. CONN will then automatically distribute the analyses across your choice of nodes in the network (e.g. as many as one node per subject).

b) If you have shared remote storage at home/office (e.g. a network drive) and several computers all connected to that same storage, you may also use this setup as a sort of DIY cluster. If you follow this route, you would select in CONN the parallelization option that reads "queue / script it (save as scripts to be run later)" to break down your processing pipeline into N blocks (here N is the number of computers you have set up). That will create N Matlab scripts (or command-line shell scripts) that you can then run manually, one on each of your computers. After all of them have finished, simply opening your CONN project from the GUI will merge the results from all of these computers.

and c) If you do not have access to multiple computers but your computer has a reasonable number of cores, you may also simply parallelize your analyses across the different cores. If you follow this route, you would first go to "Tools. HPC options. Configuration" and make sure that the default profile (e.g. named "background process (Unix, Mac)" if you are on a Mac computer) works fine (e.g. just select 'Test profile'). Then, as in option (a) above, simply select in CONN the parallelization option that reads "distributed processing" when running your analyses to have CONN automatically distribute the analyses across your choice of cores (e.g. one process per core).
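
As a rough back-of-the-envelope check of what parallelization buys for any of the options above, the arithmetic can be sketched in Python. This is not CONN code: the function names are hypothetical, the 1 hour per subject figure is only the estimate from this post, and the even, contiguous split into blocks is an assumption for illustration (the way CONN itself assigns subjects to blocks is not described here).

```python
import math


def estimate_wall_clock_hours(n_subjects, hours_per_subject, n_workers):
    """Rough wall-clock estimate when subjects are spread evenly over workers."""
    if n_subjects < 0 or n_workers < 1:
        raise ValueError("need n_subjects >= 0 and n_workers >= 1")
    # The busiest worker processes ceil(n_subjects / n_workers) subjects in sequence.
    return math.ceil(n_subjects / n_workers) * hours_per_subject


def split_into_blocks(n_subjects, n_blocks):
    """Split subject numbers 1..n_subjects into n_blocks contiguous, near-equal blocks."""
    if n_blocks < 1:
        raise ValueError("need n_blocks >= 1")
    base, extra = divmod(n_subjects, n_blocks)
    blocks = []
    start = 1
    for i in range(n_blocks):
        # The first `extra` blocks take one additional subject each.
        size = base + (1 if i < extra else 0)
        blocks.append(list(range(start, start + size)))
        start += size
    return blocks


if __name__ == "__main__":
    # Assumed inputs: 200 subjects at roughly 1 hour each.
    for workers in (1, 4, 8):
        hours = estimate_wall_clock_hours(200, 1.0, workers)
        print(f"{workers} worker(s): ~{hours:.0f} h (~{hours / 24:.1f} days)")

    # Example split for a DIY cluster of 3 computers, as in option (b).
    for i, block in enumerate(split_into_blocks(200, 3), start=1):
        print(f"block {i}: subjects {block[0]}-{block[-1]} ({len(block)} subjects)")
```

Under these assumptions a single process needs about 200 hours (a bit over 8 days), while 8 parallel processes bring that down to about 25 hours, ignoring any slowdown from all processes reading from the same USB drive.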

Hope this helps and good luck!
Alfonso
Jan 10, 2019  12:01 PM | Davide Fedeli
RE: How to make it faster?
Dear Alfonso,
Thank you for keeping the CONN toolbox updated and helping the CONN community every day!

I'm using a Windows PC for our CONN analyses. Would option (c) from your answer to Haleh also work on a Windows computer with multiple cores?

Thanks for your help and support
Davide

Jan 10, 2019  04:01 PM | Pravesh Parekh - National Institute of Mental Health and Neurosciences
RE: How to make it faster?
Hi Davide,

Option c works on Windows too, if you have multiple cores.
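
To check how many cores a machine reports before choosing the number of parallel processes, a small cross-platform sketch in Python could look like this. It is not part of CONN: `suggested_processes` is a hypothetical helper, and keeping one core free for the operating system is an assumption for illustration, not a CONN recommendation.

```python
import os


def suggested_processes(logical_cores, reserve=1):
    """Number of parallel processes to try, keeping `reserve` cores free."""
    # os.cpu_count() can return None when the count cannot be determined.
    if logical_cores is None or logical_cores < 1:
        return 1
    return max(1, logical_cores - reserve)


if __name__ == "__main__":
    cores = os.cpu_count()  # logical cores, on Windows, Mac and Linux alike
    print(f"logical cores detected: {cores}")
    print(f"parallel processes to try: {suggested_processes(cores)}")
```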


Best
Pravesh

Jan 11, 2019  01:01 PM | Davide Fedeli
RE: How to make it faster?
Thank you Pravesh! I've just noticed that the new version of CONN (18b) has a "background process (Windows)" configuration. I was using 17f, which didn't offer that option. I think the best option would be to update to the newest release. Would my CONN project still be usable?
Many thanks!
Jan 12, 2019  09:01 AM | Pravesh Parekh - National Institute of Mental Health and Neurosciences
RE: How to make it faster?
Yes, indeed. Upgrading from 17f to 18a/18b should be pretty much straightforward and, to the best of my knowledge, should not cause any issues.

Best
Pravesh
Mar 15, 2023  08:03 AM | zj_lee
RE: How to make it faster?
Dear Alfonso,

While I was running CONN's distributed processing, my computer shut down. After restarting, I opened CONN and the GUI shows the jobs as "running", but they do not seem to continue. Can I resume the previous processing? What code do I need to use?

Thanks a lot in advance!
lee
Attachment: run.png
Mar 19, 2023  11:03 PM | Alfonso Nieto-Castanon - Boston University
RE: How to make it faster?
Dear Lee,

If the status of those jobs is "running" then they are (likely) still running (e.g. in the background or in your cluster). To double-check, you may click on the "refresh" button (that will re-check their status, if needed by querying the cluster's job manager process), and/or click on the "log" button (that will allow you to see what each individual job is doing). After those jobs finish running, this GUI will change and offer to import their results into your project automatically.

Hope this helps
Alfonso