Improving dataset tags handling
I am trying to build a workflow (WF) in which my datasets (all in a collection) are tagged with what I'd call named tags (tags with a custom name). See picture below.
My collection has 5 datasets named after their SRR accession (downloaded with the SRA tool in Galaxy). I then tag these datasets using a CSV file whose first column holds the SRR accession (so that each row matches a dataset) and, say, 2 further columns holding the PMID and the TF name, i.e.
SRR002003 PMID:3562718 TF:TF2
...
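To make the tagging input concrete, here is a minimal sketch of how I read such a file into named tags per dataset. It is only an illustration, not how Galaxy does it: the function name read_named_tags is hypothetical, and I am assuming tab-separated columns where every column after the first is of the form NAME:VALUE.

```python
import csv
import io


def read_named_tags(handle, delimiter="\t"):
    """Map each accession (first column) to a dict of its named tags.

    Assumption: every column after the first looks like NAME:VALUE.
    """
    tags_by_dataset = {}
    for row in csv.reader(handle, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        accession, *raw_tags = row
        named = {}
        for raw in raw_tags:
            # Split on the first colon only, so values may contain colons.
            name, _, value = raw.partition(":")
            named[name] = value
        tags_by_dataset[accession] = named
    return tags_by_dataset


# The single example row from above, written with tabs (an assumption).
sample = io.StringIO("SRR002003\tPMID:3562718\tTF:TF2\n")
tags = read_named_tags(sample)
print(tags)
```

Each dataset then carries only its own PMID and TF tags, which is what I expected to see reflected in the exported file names.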
The idea is to use the content of these named tags to automatically rename the files, or even to generate a data-warehouse-like structure on the fly.
Here are the issues I ran into and the features I would need:
- When trying to transfer the collection using /scratch/girardot/TEST/{name}_{tags}.{ext}, I got:
SRR002003_3562718tf2_3562718tf2_3562718tf2_3562718tf2_3562718tf1.fastqsanger.gz
This is far too many tags, as I only had PMID:3562718 and TF:TF2 for this dataset: the tag combinations of all 5 datasets were injected into the name of each file. Also, I think a separator character should have been used between tags, i.e. 3562718_tf2 instead of 3562718tf2.
- Only the "value" of each named tag got injected (e.g. 3562718 rather than PMID:3562718); I am not sure why. Do you clip the name?
- It would be great to add a way to select individual named tags, e.g. {tags:PMID} to specify that only the named tag PMID should be used here. Of course, this means {tags} could appear more than once in the pattern. Alternatively, one could introduce a new {tag:<name>} placeholder for this, while {tags} would still represent all tags.
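To illustrate the behaviour I have in mind, here is a minimal sketch of such a pattern expansion. This is not Galaxy's implementation: the function name expand, the sep parameter and the {tag:<name>} syntax are all hypothetical, and I am assuming each dataset carries its own named tags as a simple name-to-value mapping.

```python
import re

# Matches {name}, {ext}, {tags} and the proposed {tag:<name>} placeholder.
PLACEHOLDER = re.compile(r"\{(name|ext|tags|tag:([^{}]+))\}")


def expand(pattern, name, ext, named_tags, sep="_"):
    """Expand an export pattern for ONE dataset using only its own tags."""

    def substitute(match):
        key = match.group(1)
        if key == "name":
            return name
        if key == "ext":
            return ext
        if key == "tags":
            # All tag values of this dataset, joined with a separator.
            return sep.join(named_tags.values())
        # {tag:<name>}: a single named tag, empty if the dataset lacks it.
        return named_tags.get(match.group(2), "")

    return PLACEHOLDER.sub(substitute, pattern)


dataset_tags = {"PMID": "3562718", "TF": "TF2"}

flat = expand("/scratch/girardot/TEST/{name}_{tags}.{ext}",
              "SRR002003", "fastqsanger.gz", dataset_tags)
print(flat)    # /scratch/girardot/TEST/SRR002003_3562718_TF2.fastqsanger.gz

nested = expand("/scratch/girardot/TEST/{tag:TF}/{tag:PMID}/{name}.{ext}",
                "SRR002003", "fastqsanger.gz", dataset_tags)
print(nested)  # /scratch/girardot/TEST/TF2/3562718/SRR002003.fastqsanger.gz
```

The second call shows the data-warehouse-like layout: because a placeholder can select a single named tag and can appear several times, the tags can drive the directory structure as well as the file name.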
Could you look into this?