"with np.load(\"../data/results/spongilla_plddt.npz\") as spongilla:\n",
"    for gene in spongilla.keys():\n",
"        genes.append(gene)\n",
"        score.append(np.mean(spongilla[gene]))\n",
"\n",
"plddt[\"S_lacustris\"] = np.array(score)"
]
},
{
"cell_type": "markdown",
"id": "9637a995-be1d-467e-8073-4238e8ae080e",
"metadata": {},
"source": [
"Before running goodness-of-fit tests, we need to know what kind of distribution we are analysing. We will use D'Agostino and Pearson's test (`scipy.stats.normaltest`) to check whether the pLDDT score distributions are normal. The null hypothesis is that a sample comes from a normal distribution, so a small p-value means we can reject normality."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "46755534-f0c1-4c1c-8536-857fe54c1850",
"metadata": {},
"outputs": [],
"source": [
"import scipy.stats as stats\n",
"from scipy.stats import normaltest"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "b38831fe-e45e-4e23-b91d-79cfaeae0476",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Two-sided χ-squared probability for the hypothesis test (rounded to 10 decimals)\n",
"A_thaliana: 0.0\n",
"M_musculus: 0.0\n",
"D_rerio: 0.0\n",
"S_cerevisiae: 0.0\n",
"H_sapiens: 0.0\n",
"D_discoideum: 0.0\n",
"C_elegans: 0.0\n",
"D_melanogaster: 0.0\n",
"S_lacustris: 0.0\n"
]
}
],
"source": [
"print(\"Two-sided χ-squared probability for the hypothesis test (rounded to 10 decimals)\")\n",
"This is indeed the case. _Spongilla_ and _Dictyostelium_ differ even more from the other organisms, enough so that the naked eye can spot it. There is no overwhelming similarity in the profiles, however."
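The normality check described above can be sketched on synthetic data as follows. This is only an illustration: the `samples` dictionary and the `normality_pvalues` helper are hypothetical stand-ins, not the notebook's `plddt` data or code. The only library call it relies on is `scipy.stats.normaltest`, which implements D'Agostino and Pearson's test.

```python
import numpy as np
from scipy.stats import normaltest

rng = np.random.default_rng(0)

# Hypothetical stand-ins for per-protein mean pLDDT scores (NOT the real data).
samples = {
    "normal_like": rng.normal(loc=70.0, scale=10.0, size=2000),
    "skewed": 100.0 * rng.beta(a=5.0, b=1.5, size=2000),
}


def normality_pvalues(groups):
    """Return the D'Agostino-Pearson p-value for each named sample."""
    return {name: normaltest(values).pvalue for name, values in groups.items()}


pvalues = normality_pvalues(samples)
for name, p in pvalues.items():
    # A small p-value means normality can be rejected for that sample.
    print(f"{name}: p = {p:.3g}")
```

With thousands of values per sample, even a moderate skew produces a p-value far below what 10 decimals can show, which is why a rounded printout can read `0.0` for every organism.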