Bioinformatics engineer exercise
Welcome to the exercise. You will complete a couple of small tasks.
Time: 45 min from start of the exercise.
Each task is self-contained. If you cannot complete a task, don't worry and just move on to the next.
You may use the internet. You must be able to explain how you obtained your results. Direct help from other people is not allowed.
You may use any tools you like. You can create as many intermediate files as you need, but please do not modify the original files.
There are no "hidden" bugs; the input file formats are consistent.
Task A: Filter a table for gene IDs
In this exercise you will filter an input table "results.txt" for certain gene IDs listed in "ids.txt".
- tab-separated file "results.txt"
- ID file "ids.txt"
- "outputA.txt": filtered "results.txt" table containing only the rows whose IDs are listed in "ids.txt"
Notes: Pretend this is a step you could plug into a pipeline (given results.txt and ids.txt always have the same format).
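One possible way to approach Task A is sketched below in Python. It is only an illustration under assumptions the task text does not state: that "ids.txt" holds one ID per line, that "results.txt" has a header row, and that the ID sits in the first column. The function name filter_table and its parameters are hypothetical and should be adjusted to the real file layout.

```python
def filter_table(results_path, ids_path, output_path, id_column=0, has_header=True):
    """Copy the rows of a tab-separated table whose ID is listed in the ID file.

    Assumptions (not given in the task text): one ID per line in the ID file,
    and the ID in column `id_column` (0-based) of the results table.
    Returns the number of data rows written.
    """
    # Load the wanted IDs into a set for fast membership tests; skip blank lines.
    with open(ids_path) as handle:
        wanted = {line.strip() for line in handle if line.strip()}

    kept = 0
    with open(results_path) as src, open(output_path, "w") as dst:
        if has_header:
            # Pass the header through unchanged so the output keeps the same format.
            dst.write(src.readline())
        for line in src:
            fields = line.rstrip("\n").split("\t")
            if len(fields) > id_column and fields[id_column] in wanted:
                dst.write(line)
                kept += 1
    return kept
```

Calling filter_table("results.txt", "ids.txt", "outputA.txt") writes to a new file and leaves the original files untouched. Rows keep the order they have in "results.txt".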
Task B: Annotate a table
The table "notannotated.txt" needs to be annotated with "mapping.txt". Both files contain the column "ID" for mapping.
- tab-separated table "notannotated.txt" containing columns "ID, p-value, logFC"
- tab-separated table "mapping.txt" containing columns "ID, name"
- "outputB.txt" - annotated results table containing "ID, name, p-value, logFC" (in any order)
Notes: Pretend this is a step you could plug into a pipeline (given notannotated.txt and mapping.txt always have the same format).
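A minimal sketch of one way to do the annotation in Task B, using only the Python standard library. It assumes both files have a header row with the column names given above, keeps every row of "notannotated.txt" (a left join), and fills IDs without a mapping with "NA". The join behaviour, the "NA" placeholder, and the function name annotate_table are assumptions, not requirements from the task.

```python
import csv


def annotate_table(table_path, mapping_path, output_path, key="ID", missing="NA"):
    """Add a "name" column to a tab-separated table by looking up the key column.

    Assumptions (not given in the task text): every input row is kept, and
    IDs without an entry in the mapping file get the `missing` placeholder.
    """
    # Build an ID -> name lookup; if an ID occurs twice, the last entry wins.
    with open(mapping_path, newline="") as handle:
        names = {row[key]: row["name"]
                 for row in csv.DictReader(handle, delimiter="\t")}

    with open(table_path, newline="") as src, \
            open(output_path, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        other_columns = [c for c in reader.fieldnames if c != key]
        writer = csv.DictWriter(dst, fieldnames=[key, "name"] + other_columns,
                                delimiter="\t", lineterminator="\n")
        writer.writeheader()
        for row in reader:
            row["name"] = names.get(row[key], missing)
            writer.writerow(row)
```

Calling annotate_table("notannotated.txt", "mapping.txt", "outputB.txt") produces the columns in the order ID, name, p-value, logFC. If unmapped IDs should be dropped instead, the loop could skip rows whose key is absent from the lookup.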
Task C: Prepare a result table for biologists
The results from our analysis have to be presented to wet-lab biologists who have no experience in bioinformatics.
- tab-delimited "deseq.results.txt"
- Formatted Excel or Open Office table named "outputC"
Note: This is a final preparation of the table and does not have to be done in a "programming" fashion. Edit the table with Excel, Open Office, or any editor you want.
Task D: Comparison of two biological replicates
Table "raw.data.txt" contains two samples, "A" and "B", of unknown properties, along with their "IDs" and "type". We want to check whether the biological replicates "worked", that is, whether the two samples agree with each other. Plot your results.
- tab-separated table "raw.data.txt"
- Any type of output including graphical.
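One common way to compare two replicates is a scatter plot of A against B together with a correlation coefficient. The sketch below is only an illustration and rests on several assumptions the task does not confirm: that the value columns are literally named "A" and "B", that the values are non-negative (count-like) so a log2(x + 1) transform is sensible, and that matplotlib is installed for the plotting step. All function names are hypothetical.

```python
import csv
import math


def load_samples(path, col_a="A", col_b="B"):
    """Read two numeric columns from a tab-separated file with a header row."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    return [float(r[col_a]) for r in rows], [float(r[col_b]) for r in rows]


def log2p1(values):
    """log2(x + 1) transform; assumes non-negative input values."""
    return [math.log2(v + 1.0) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)


def plot_replicates(a, b, out_png="replicates.png"):
    """Scatter plot of log-transformed A vs. B (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    la, lb = log2p1(a), log2p1(b)
    fig, ax = plt.subplots()
    ax.scatter(la, lb, s=5, alpha=0.5)
    ax.set_xlabel("log2(A + 1)")
    ax.set_ylabel("log2(B + 1)")
    ax.set_title("Replicate comparison, Pearson r = %.3f" % pearson(la, lb))
    fig.savefig(out_png, dpi=150)
    plt.close(fig)
```

A typical call would be a, b = load_samples("raw.data.txt") followed by plot_replicates(a, b). Points lying close to the diagonal and a correlation near 1 suggest the replicates agree. Colouring the points by the "type" column would be a natural extension.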
Best of luck.