Jul 28, 2020  04:07 AM | Paul Wright - King's College London
Invalid JSON in sidecars
Dear Chris et al.

I have a few minor problems with BIDS sidecars that may warrant your attention. I am converting a huge number (100,000+) of historical clinical images. So far, I have just done a dry run to get the BIDS sidecars. A small fraction gave JSON that could not be read by Pandas.

Most of these were cases where Patient Sex was an invalid character, hex FE. Looking at several of these with Pydicom, it looks like the sex is initially M or F, but the offending character is in a later sublist of modified entries. I handled this by ignoring errors when reading the file, so the result was an empty string. I'd like the option to recover the original entry for Patient Sex. I realize dcm2niix is behaving correctly when it puts the modified value into the BIDS sidecar, and I'll have to figure out on my own how to identify modified entries in the DICOM header and how to decide whether to accept or revert them, but if you have any tips from your experience I'd welcome them.
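As a rough illustration of the workaround described above, a lenient sidecar reader could look something like the sketch below. The function name load_sidecar_lenient is hypothetical, and the sketch assumes the sidecar is otherwise valid UTF-8 JSON; it drops undecodable bytes (such as hex FE) and reports that it did so, so the affected files can be reviewed later.

```python
import json


def load_sidecar_lenient(path):
    """Read a JSON sidecar, dropping bytes that are not valid UTF-8.

    Returns (data, had_bad_bytes). The flag is True when at least one
    byte had to be discarded, so the caller can log the file for review.
    """
    with open(path, "rb") as handle:
        raw = handle.read()
    try:
        text = raw.decode("utf-8")
        had_bad_bytes = False
    except UnicodeDecodeError:
        # Same effect as ignoring decode errors: a field holding only
        # the invalid byte becomes an empty string.
        text = raw.decode("utf-8", errors="ignore")
        had_bad_bytes = True
    return json.loads(text), had_bad_bytes
```

Keeping the flag alongside the parsed data makes it possible to list which sidecars need their Patient Sex checked against the DICOM header, rather than silently accepting the empty string.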

One scan returned this object:
"PhilipsRescaleSlope": inf
Pandas rejects this, raising "Expected object or value". Looking online, there is debate about using unquoted special values like NaN, Inf, Infinity etc. in JSON: they are not part of the spec, but some argue there's allowance for customization. This is such an odd case that I plan to manually edit any similar cases in future. I thought you'd want to know so you can look into why dcm2niix calculates a value of inf and confirm you want it represented in the JSON in this way.
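For anyone wanting to automate the manual edit, one possible approach is to replace bare special values with null before parsing. This is only a sketch: the name loads_sidecar and the regular expression are assumptions, and the pattern could in principle also match text inside a string value that happens to contain something like ": inf,".

```python
import json
import re

# Matches a bare inf / -inf / Infinity / nan used as a JSON value,
# i.e. directly after a colon and before a comma, brace or line end.
_BARE_SPECIAL = re.compile(
    r"(:\s*)-?(?:inf(?:inity)?|nan)(\s*[,}\r\n])", re.IGNORECASE
)


def loads_sidecar(text):
    """Parse sidecar text, mapping bare inf/nan values to None."""
    cleaned = _BARE_SPECIAL.sub(r"\1null\2", text)
    return json.loads(cleaned)


sidecar_text = '{\n\t"PhilipsRescaleSlope": inf,\n\t"EchoTime": 0.03\n}'
print(loads_sidecar(sidecar_text))
```

Mapping to null keeps the key present in the result, so downstream code can tell "value was unusable" apart from "key was never written".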

At least one sidecar seems to have used a secondary patient ID.
(0010, 0020) Patient ID LO: 'something'
...
(0010, 1002) Other Patient IDs Sequence 1 item(s) ----
   (0010, 0020) Patient ID LO: 'something else'
In this case, the second value for (0010, 0020) is an alternative ID, rather than a modified one, and the first value is the main ID. Again, I think I'll have to do some of my own work to resolve headers with multiple IDs, but welcome your comment.

Finally, I was trying to line up dcm2niix output with the original DICOM files (will script this next time) and noticed that study date, which I use in the filename, often needs cleaning up, for example if there are decimals. I'm curious how dcm2niix does this, so I can produce the same results.
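To make the kind of clean-up concrete, here is a minimal sketch of one way to normalise a study date to YYYYMMDD. It is not dcm2niix's logic, which is exactly what the question above is asking about; the function name normalize_study_date and the rule (strip non-digits, keep the first eight, validate as a calendar date) are assumptions for illustration only.

```python
import re
from datetime import datetime


def normalize_study_date(raw):
    """Return the date as YYYYMMDD, or None if none can be recovered."""
    # Drop separators such as '.' so '2014.03.27' becomes '20140327'.
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) < 8:
        return None
    candidate = digits[:8]
    try:
        # Reject impossible dates such as month 13.
        datetime.strptime(candidate, "%Y%m%d")
    except ValueError:
        return None
    return candidate
```

Returning None for unrecoverable values, instead of a partially cleaned string, makes mismatches with the dcm2niix filenames easy to spot.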

I'm happy to provide example headers and JSON sidecars on request (perhaps best by direct email).

Best wishes
Paul Wright