NITRC: CONN : functional connectivity toolbox: RE: Changing data's location

RE: Changing data's location

Hi Bob & Jalmar,

In case this helps, and following Jalmar's example, I am also attaching a patch that lets you handle those folder name changes programmatically. The patch is for release 15h; simply copy the attached file to the conn distribution folder, overwriting the file with the same name there. CONN already had some search/replace capability. For example, it let you enter through the GUI just the location of the first missing subject/session data file, and CONN would then automatically generate a search/replace pattern from that information and attempt to apply the same transformation to the rest of the missing subject/session data files. What this patch adds is the ability to: 1) define those search/replace patterns programmatically as well; and 2) have CONN "remember" those search/replace patterns across different projects (e.g. when merging multiple projects).

For example, if your data in the HPC cluster was located in a /tmp/data folder and in your main system it is located in a /projects/myproject/data folder, then after rsyncing your project back to your main system, you could now type (or include in your scripts):

conn_updatefilepaths('init', '/tmp/data','/projects/myproject/data');

right before loading and/or merging your project files in CONN. This tells CONN that, from then on (within the current Matlab session), whenever it loads a project or merges several projects and finds missing files in any /tmp/data[SOMETHINGELSE] location, it should first check whether those files exist in the corresponding /projects/myproject/data[SOMETHINGELSE] location. If they are found there, CONN will automatically fix those references without prompting you to locate the files. If they do not exist in the new location, or if the missing files do not match the /tmp/data* pattern, CONN will still ask you to locate them as it normally does.
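To make the matching rule concrete, here is a minimal sketch of the prefix substitution described above. It is only an illustration of the behavior, not CONN's internal code, and the missing file name is a made-up example:

```matlab
% Illustrative sketch only; this is NOT CONN's internal code.
searchroot  = '/tmp/data';                 % old root (from the example above)
replaceroot = '/projects/myproject/data';  % new root (from the example above)
missingfile = '/tmp/data/sub01/func.nii';  % hypothetical missing file reference

candidate = missingfile;                   % default: leave the reference unchanged
if strncmp(missingfile, searchroot, numel(searchroot))
    % keep everything after the old root ([SOMETHINGELSE]) and prepend the new root
    candidate = [replaceroot, missingfile(numel(searchroot)+1:end)];
end

if ~strcmp(candidate, missingfile) && exist(candidate, 'file') ~= 0
    fprintf('reference would be updated to %s\n', candidate);
else
    fprintf('you would still be asked to locate %s\n', missingfile);
end
```

Note that this is a plain prefix match, so a file under /tmp/data2 would also match the /tmp/data pattern; add a trailing slash to both strings if you want to restrict the match to that exact folder.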

In general, the syntax is:

conn_updatefilepaths('init', root_searchstring, root_replacestring)

where root_searchstring and root_replacestring may be strings (for a single folder name change), or cell arrays of strings (for multiple folder name changes), to define programmatically potential search/replace patterns.
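As a sketch of the cell-array form, the following defines two folder name changes at once (for example, one for the data and one for the ROIs). The folder names are hypothetical, and I am assuming here that the search and replace strings are paired by position:

```matlab
% Hypothetical folder names, for illustration only.
old_roots = {'/tmp/data', '/tmp/rois'};
new_roots = {'/projects/myproject/data', '/projects/myproject/rois'};

% Assumption: the n-th search string is paired with the n-th replace string,
% so both cell arrays should have the same number of elements.
assert(numel(old_roots) == numel(new_roots));

% Only call into CONN if the patched function is on the Matlab path.
if exist('conn_updatefilepaths', 'file')
    conn_updatefilepaths('init', old_roots, new_roots);
end
```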

Just for reference, you may also use either the syntax:

conn_updatefilepaths('init',{},{});

or equivalently the syntax:

conn_updatefilepaths('hold','off');

to "forget" from this point on any potential search/replace patterns that you may have entered before. Or the syntax:

conn_updatefilepaths('hold','on');

to have CONN "remember" from this point on any potential search/replace patterns that you may implicitly define through the GUI (when prompted to select the location of a missing file) without actually suggesting any initial search/replace patterns programmatically to begin with. This is useful, for example, if you load a project and manually fix some folder reference when prompted by the GUI, and after that you load a second project and you do not want to have to enter exactly the same folder-name change that you already fixed in the previous project.
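A possible way to script the 'hold' workflow described above is sketched below. The project file names are hypothetical, and loading projects with conn('load', filename) is an assumption about your CONN version rather than something specific to this patch:

```matlab
% Hypothetical project files, for illustration only.
projects = {'/projects/myproject/conn_first.mat', ...
            '/projects/myproject/conn_second.mat'};

if exist('conn_updatefilepaths', 'file')
    % Start remembering any folder-name fixes entered through the GUI.
    conn_updatefilepaths('hold', 'on');
    for n = 1:numel(projects)
        if exist(projects{n}, 'file')
            % Assumption: conn('load', filename) loads a project file.
            % A fix entered for the first project should be reused for the next one.
            conn('load', projects{n});
        end
    end
    % Forget the remembered patterns once all projects have been loaded.
    conn_updatefilepaths('hold', 'off');
end
```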

Hope this helps and let me know if you run into any issues and/or if you would like me to further clarify any of the above.

Best
Alfonso

Originally posted by Bob Kraft:

I am looking for some advice on how to effectively use our HPC cluster. Our HPC cluster only has temporary storage for data processing. It is intended to be used for batch processing and not for interactive data processing or data visualization. Our data is stored on a separate computer and network disk space. With this setup I was planning to do the following:

1) rsync data from network disk to HPC cluster temporary storage
2) Process each subject individually via conn_batch (preprocessing, setup, denoising, and first level analysis)
3) rsync data from HPC cluster temporary storage back to our permanent network disk
4) merge individual subjects into a single project
5) perform second level analysis.

For 64 subjects with pre and post scans, I am able to do steps 1-3 in about an hour (15 minutes setup, 45 minutes computer time). And although this processing stream may not be ideal, it works for me.

The problem I am having is with merging the files. Since the individual folders created by CONN have moved, CONN needs to confirm the location of each individual's data and the location of the ROIs. Doing this manually takes me about 1-2 minutes per subject. This adds about 1 to 2 hours of just clicking buttons.

Is there any way to automate this process in CONN?

Thanks for your help,

Bob

Attachment: conn_updatefilepaths.m

Threaded View

Title	Author	Date
Changing data's location	Bob Kraft	Feb 15, 2016

RE: Changing data's location	Benson Stevens	Mar 21, 2019
RE: Changing data's location	Alfonso Nieto-Castanon	Feb 17, 2016
RE: Changing data's location	Jalmar Teeuw	Feb 15, 2016

RE: Changing data's location	Bob Kraft	Feb 17, 2016