## [Identification and Functional Characterization of Arylamine N-Acetyltransferases in Eubacteria: Evidence for Highly Selective Acetylation of 5-Aminosalicylic Acid](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC99640/)
To follow the instructions from the ChEMBL team, check this [link](https://chembl.gitbook.io/chembl-data-deposition-guide/untitled-10/field-names-and-data-types-basic-submission).
This table holds the information on the publication. Fill out the compulsory headings listed in the link above, and fill out the optional ones if the information is available. The `RIDX` heading is assigned by the depositor; e.g., it could be "Arylamine_NAT_Eubac".
All the chemicals used in this study are listed in the section "Material and Methods: Chemicals":
* 5-ASA
* 4-ASA
* p-aminobenzoic acid (PABA)
* 2-aminofluorene (2-AF)
* sulfamethazine (SMZ)
* procainamide (PA)
* iodoacetamide <br>
`CIDX`, like `RIDX`, is assigned by the depositor as an ID for the compound/chemical/drug. So you can assign ANF0001 to ANF0007, respectively. `COMPOUND_NAME` and `COMPOUND_KEY` can simply be the names of the compounds used in the study, e.g., 5-ASA or 5-aminosalicylic acid.
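The ID assignment described above can be sketched in a few lines of Python. This is only a minimal illustration: the helper names `build_compound_records` and `to_tsv` are hypothetical, and the sketch writes only the three columns named in this section. Check the ChEMBL guide for any further compulsory columns.

```python
import csv
import io

# Compounds in the order they are listed above ("Material and Methods: Chemicals").
COMPOUNDS = ["5-ASA", "4-ASA", "PABA", "2-AF", "SMZ", "PA", "iodoacetamide"]


def build_compound_records(names, prefix="ANF"):
    """Assign depositor-defined CIDX values (ANF0001, ANF0002, ...) in list order."""
    records = []
    for number, name in enumerate(names, start=1):
        records.append(
            {
                "CIDX": f"{prefix}{number:04d}",
                "COMPOUND_NAME": name,
                "COMPOUND_KEY": name,
            }
        )
    return records


def to_tsv(records):
    """Serialize the records as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0]), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = build_compound_records(COMPOUNDS)
print(to_tsv(records))
```

The prefix `ANF` is just the example used here; any depositor-chosen scheme works as long as each compound gets a unique ID.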
You can create the file `COMPOUND_CTAB.sdf` either manually or automatically. For automatic generation, you need a `SMILES` column in your file `COMPOUND_RECORD.tsv`, which you should remove before submitting the files to ChEMBL. `SMILES` is only needed to generate `COMPOUND_CTAB.sdf` automatically. Once you have a table that looks like the following: <br>
Use the following [code file](https://git.embl.de/grp-zimmermann/ZM000_Pistoia/-/blob/main/Scripts/smiles_to_sdfRDKit.py?ref_type=heads). You can obtain SMILES from PubChem for each compound.
Remember to remove SMILES column before submitting this file to ChEMBL.
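Removing the column can also be scripted. The following is a minimal sketch using only the Python standard library; the function name `drop_column` is hypothetical, and the one-row example table is for illustration only (take the actual SMILES from PubChem).

```python
import csv
import io


def drop_column(tsv_text, column="SMILES"):
    """Return the TSV text with one column removed and all other columns kept in order."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = [name for name in reader.fieldnames if name != column]
    output = io.StringIO()
    writer = csv.DictWriter(
        output,
        fieldnames=kept,
        delimiter="\t",
        lineterminator="\n",
        extrasaction="ignore",  # silently skip the dropped column in each row
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return output.getvalue()


# Illustrative one-row table; a real COMPOUND_RECORD.tsv has more columns and rows.
example = "CIDX\tCOMPOUND_NAME\tSMILES\nANF0001\t5-ASA\tNc1ccc(O)c(C(=O)O)c1\n"
cleaned = drop_column(example)
print(cleaned)
```

To apply this to a file, read `COMPOUND_RECORD.tsv` into a string, pass it to `drop_column`, and write the result to the copy you submit.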
For microbiome drug biotransformation, there could be different kinds of assays. In this particular study, they deal with the following assay types:
* Enzyme Assays for single strain
* Enzyme Assays for Human colonic content
* Inhibition Assays
<br>
Sorting assays into categories can be complicated, and the categories need to be carefully worked out from the paper. The assay description for this paper is in its "Material and Methods" section. <br>
The following are the mandatory headings in the `ASSAY.tsv` file, with examples for this paper.
* AIDX: like CIDX and RIDX, this is assigned by the depositor, e.g., species_strain_compound or Escherichia coli_K-12MG1655_5-ASA. This gives basic information about the assay: the enzyme activity was checked in E. coli K12 for the substrate 5-ASA.
* AIDX_DESCRIPTION: this section can be flexible. In this case we suggest the following; ``
* ASSAY_TYPE: we choose type A here, the ADMET assay type, for biotransformations <br>
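The species_strain_compound naming pattern for `AIDX` can be generated consistently with a small helper. This is a sketch only: `make_aidx` is a hypothetical name, and removing spaces from the strain name is an assumption inferred from the example ID above.

```python
def make_aidx(species, strain, label):
    """Build a depositor-assigned assay ID of the form species_strain_label."""
    # Assumption: spaces in the strain name are removed, as in "K-12MG1655".
    compact_strain = strain.replace(" ", "")
    return f"{species}_{compact_strain}_{label}"


aidx = make_aidx("Escherichia coli", "K-12 MG1655", "5-ASA")
print(aidx)  # Escherichia coli_K-12MG1655_5-ASA
```

Generating the IDs from one function keeps them identical across the assay, assay parameter, and activity files, which all refer to the same `AIDX`.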
#### First category: Enzyme assays for single strain
Other optional headings can also be filled in here. The following example table uses the first assay category, enzyme assays for single strains:
| Escherichia coli_K-12MG1655_biotransformation | test strain: cell lysis to extract NAT from bacterial lysate | A | Escherichia coli | K-12 MG1655 | 511145 | INSERM | NAT | P77567 | Escherichia coli | 511145 | ADMET |
<br>
<br>
`ASSAY_TAX_ID` is the NCBI tax ID. `TARGET_ACCESSION` is the UniProt ID for the [E. coli NAT enzyme](https://www.uniprot.org/uniprotkb/P77567/entry). Note that 6 control strains are mentioned in the section "Material and Methods: Bacterial Strains and growth conditions", while the test strains are listed in Table 2. However, controls are generally not reported in ChEMBL as new activity. Also, the assay description for the two controls DMG100 and DMG200 is given in reference 22 of the article.
#### Second category: Enzyme assays for Human colonic content
NAT activity was also determined in human feces collected from five unrelated healthy Caucasian donors. I couldn't find the number of participants, but from the results section "NAT activity in human colonic content", 5 NAT activities are reported, so we can consider 5 donors here.
| human-donor-1_biotransformation | human fecal content: cell lysis to extract NAT | A | gut metagenome | | 749906 | INSERM | NAT | P77567 | gut metagenome | 749906 | ADMET |
<br>
The "gut metagenome" can be used for human fecal content with the NCBI tax id: 749906.
#### Third category: Inhibition assays
Most information on this assay type is in the Methods section "Inhibition studies" and the Results section "Inhibition of bacterial NAT activity". The inhibition assay was performed on only three strains, which are mentioned in the Results section.
This TSV file describes the assay parameters for the different assays. E.g., if the AIDX Escherichia coli_K-12MG1655_5-ASA is our assay of concern, write all possible parameters for this assay, and repeat for all assays.
| AIDX | TYPE | RELATION | VALUE | UNITS | TEXT_VALUE | COMMENTS |
| --- | --- | --- | --- | --- | --- | --- |
| Escherichia coli_K-12MG1655_biotransformation | CENTRIFUGATION | | | | Bacterial cells were harvested by centrifugation (2,500 g, 20 min, 4°C) and resuspended in 2 ml of Tris (pH 7.4)-EDTA-dithiothreitol-KCl buffer | |
| Escherichia coli_K-12MG1655_biotransformation | CELL LYSIS | | | | sonication on ice twice for 30 s each, with a 30-s interval | |
And so on; repeat this for every assay. Each column is described in detail on the ChEMBL submission portal.
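Writing the parameter rows for many assays by hand is error-prone, so they can be generated instead. The sketch below uses the column headings from the table above and its two example rows; the helper names `param_row` and `params_to_tsv` are hypothetical.

```python
import csv
import io

# Column headings as used in the assay parameter table above.
HEADER = ["AIDX", "TYPE", "RELATION", "VALUE", "UNITS", "TEXT_VALUE", "COMMENTS"]


def param_row(aidx, param_type, text_value="", relation="", value="", units="", comments=""):
    """Create one assay parameter row; unused fields stay empty."""
    return {
        "AIDX": aidx,
        "TYPE": param_type,
        "RELATION": relation,
        "VALUE": value,
        "UNITS": units,
        "TEXT_VALUE": text_value,
        "COMMENTS": comments,
    }


def params_to_tsv(rows):
    """Serialize parameter rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HEADER, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


AIDX = "Escherichia coli_K-12MG1655_biotransformation"
rows = [
    param_row(
        AIDX,
        "CENTRIFUGATION",
        text_value="Bacterial cells were harvested by centrifugation (2,500 g, 20 min, 4°C) "
        "and resuspended in 2 ml of Tris (pH 7.4)-EDTA-dithiothreitol-KCl buffer",
    ),
    param_row(
        AIDX,
        "CELL LYSIS",
        text_value="sonication on ice twice for 30 s each, with a 30-s interval",
    ),
]
param_tsv = params_to_tsv(rows)
print(param_tsv)
```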
`CIDX` (compound id), `AIDX` (assay id), `ACT_ID` (activity id), `CRIDX` (reference), `TYPE` and `ACTIVITY` are mandatory. For biotransformation, we put the `TYPE` as "Biotransformation" and the `ACTIVITY` as "Substrate". <br>
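Before submitting, it can help to check that the header of `ACTIVITY.tsv` contains every mandatory column named above. This is a minimal sketch; `missing_columns` is a hypothetical helper, and the list of mandatory names is taken from this section rather than from the ChEMBL guide directly.

```python
# Mandatory headings as listed in this section.
MANDATORY = ["CIDX", "AIDX", "ACT_ID", "CRIDX", "TYPE", "ACTIVITY"]


def missing_columns(header_line, mandatory=MANDATORY):
    """Return the mandatory column names that are absent from a TSV header line."""
    present = header_line.rstrip("\r\n").split("\t")
    return [name for name in mandatory if name not in present]


complete = "CIDX\tCRIDX\tAIDX\tACT_ID\tTYPE\tACTIVITY\tACTIVITY_COMMENT\n"
incomplete = "CIDX\tAIDX\tTYPE\n"
print(missing_columns(complete))    # []
print(missing_columns(incomplete))  # ['ACT_ID', 'CRIDX', 'ACTIVITY']
```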
#### First assay category activity description
Since there are three different categories of assays, each of them needs its own activity records. For the enzyme assays with single strains, we can follow Table 2 of the article, which reports the activity observed for all substrates in each study strain. In the `ACTIVITY.tsv` table, add both the significant activity (>0.001) and the non-significant activity (<0.001). The values correspond to NAT activity (nmol min<sup>-1</sup> [mg of protein]<sup>-1</sup>).
| ANF0001 | Arylamine_NAT_Eubac | Escherichia coli_K-12MG1655_biotransformation | Biotransformation | Compound not metabolized | The NAT activity (nmol min<sup>-1</sup> [mg of protein]<sup>-1</sup>) of the drug 5-ASA is lower than the threshold value of 0.001. No significant biotransformation is detected. | Substrate |
| ANF0002 | Arylamine_NAT_Eubac | Escherichia coli_K-12MG1655_biotransformation | Biotransformation | Compound metabolized | The NAT activity (nmol min<sup>-1</sup> [mg of protein]<sup>-1</sup>) of the drug 2-AF is 0.02, which is higher than the threshold value of 0.001. Significant biotransformation is detected. | Substrate |
You can gather more information from the Results section "Comparative kinetics of bacterial NATs" and add it to the `ACTIVITY_COMMENT` when applicable.
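The threshold rule used in the rows above can be written down as a small function so that every strain and substrate is classified the same way. This is a sketch under stated assumptions: the names `classify` and `activity_comment` are hypothetical, and treating a value exactly equal to the threshold as not significant is an assumption, since only ">0.001" and "<0.001" are defined here.

```python
THRESHOLD = 0.001  # NAT activity threshold, nmol min^-1 (mg of protein)^-1


def classify(nat_activity, threshold=THRESHOLD):
    """Map a NAT activity value to the text used in the activity table."""
    # Assumption: a value exactly equal to the threshold counts as not significant.
    if nat_activity > threshold:
        return "Compound metabolized"
    return "Compound not metabolized"


def activity_comment(compound, nat_activity, threshold=THRESHOLD):
    """Build a short comment explaining the classification."""
    if nat_activity > threshold:
        return (
            f"The NAT activity of the drug {compound} is {nat_activity}, which is higher "
            f"than the threshold value of {threshold}. Significant biotransformation is detected."
        )
    return (
        f"The NAT activity of the drug {compound} is not higher than the threshold value "
        f"of {threshold}. No significant biotransformation is detected."
    )


print(classify(0.02))
print(activity_comment("2-AF", 0.02))
```

The value 0.02 is the 2-AF example from the table above; all other values should be taken from Table 2 of the article.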
#### Second assay category activity description
The only information on human donor biotransformation is in the Results section "NAT activity in human colonic content". According to that paragraph, there are approximately 5 donors, and the NAT activity was checked for only two substrates, 5-ASA and 2-AF. No explicit threshold is given in this case. However, we can apply the threshold from the previous category (0.001) to decide whether this activity is significant.
| ANF0001 | Arylamine_NAT_Eubac | human-donor-1_biotransformation | Biotransformation | Compound not metabolized | The NAT activity (nmol min<sup>-1</sup> [mg of protein]<sup>-1</sup>) of the drug 5-ASA is lower than the threshold value of 0.001. No significant biotransformation is detected. | Substrate |
| ANF0002 | Arylamine_NAT_Eubac | human-donor-1_biotransformation | Biotransformation | Compound metabolized | The NAT activity (nmol min<sup>-1</sup> [mg of protein]<sup>-1</sup>) of the drug 2-AF is 0.02, which is higher than the threshold value of 0.001. Significant biotransformation is detected. | Substrate |
#### Third assay category activity description
| ANF0003 | Arylamine_NAT_Eubac | Pseudomonas aeruginosa-100720_inhibition | Biotransformation | Compound inhibits activity | The NAT activity in the presence of 2-AF is inhibited by iodoacetamide, with a slope value of 1.07 | INHIBITOR |