# Running services in the HPC cluster on demand. A MongoDB example.

In this example we will build a MongoDB Singularity image, run it on the cluster, and do some computation with it.

## Why should we use Singularity?

Singularity is just another container tool that brings containers and reproducibility to scientific computing. Using Singularity containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms.

- Easy way to pack our software stack
- Plenty of Docker/Singularity images already available (https://hub.docker.com/u/biocontainers/)
- Great community support
- Make it reproducible and portable, so other colleagues may use it
- Existing software, such as conda, already supports Singularity
- MPI support, GPU support
- And many more (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0177459)


## Pre-requisites

You will need:

- `sudo` rights for the `singularity` command, to build your own images. You only need them on the machines (physical or VM) where you build images, not on the hosts where you only run containers.

- Access to the EMBL HPC cluster (or any other with singularity installed).
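
A quick way to check both prerequisites on a given host is sketched below. `check_tool` is a helper name made up for this example, not something this repo provides.

```shell
#!/bin/bash
# Report whether a command is available on this host.
# "check_tool" is a hypothetical helper name used only in this sketch.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found"
        return 1
    fi
}

# singularity is needed everywhere; sudo only matters on the build machine.
check_tool singularity || echo "Singularity is not available on this host."
check_tool sudo || echo "No sudo here: fine for running containers, not enough for building them."
```

Run it on the build machine and on the cluster login node; only the build machine needs both tools.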

## Getting started with images

Images can be built from a Singularity recipe, like the file named `Singularity` in this repo, or from images hosted on Singularity Hub or Docker Hub. (Remember that you need sudo rights to build images, **but not to run them**.)

To build an image from Singularity Hub:

      sudo singularity build lolcow.simg shub://GodloveD/lolcow

If you want to build an image from Docker Hub:

      sudo singularity build lolcow.simg docker://godlovedc/lolcow


## Local build from a recipe

**DO NOT BUILD FROM A SHARED FILESYSTEM**

First, clone the repo.

      git clone https://git.embl.de/grp-bio-it/singularity-service-example.git
      cd singularity-service-example
      
Take a look at the `Singularity` recipe:

     Bootstrap: docker
     From: mongo:4.0.6

     %startscript

     /usr/bin/mongod --config /etc/mongo/mongod.conf

     %post

     chmod 777 /data/db


Then, on your local machine, build an image for the MongoDB database from the `Singularity` file (note that you need sudo rights for the `singularity` command):

     sudo singularity build mongo_4.0.6.img Singularity

Then change the ownership of the image to your user:

     sudo chown <user>:<group> mongo_4.0.6.img
    
Copy it to the cluster, into the directory where you have this repo cloned:

     scp mongo_4.0.6.img login:/scratch/moscardo/singularity-service-example

### Sections of a recipe

#### %startscript (only used by Singularity instances)
- Commands that we want our instance to execute 

#### %setup
- Runs commands outside of the container at start of the bootstrap process
- Runs before %post section

#### %post
- Runs once inside the container during bootstrap process
- Software installation (apt-get install mongodb-server)

#### %files
- Copy files from outside of the image to the inside of it
- Pairs of <source> <destination>
- Files are copied before the %post section in recent Singularity versions, so they are available during %post

#### %runscript
- Define custom runscript
- Command line parameter parsing etc...

#### %environment
- Define environment variables inside container

#### %labels
- Define custom labels/metadata
- $ singularity inspect <image>

#### %help
- Add help for the image
- $ singularity help ubuntu.img
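
As a rough illustration of how these sections fit together, the sketch below writes a small recipe to a temporary file without building anything. The file name `Singularity.example`, the label values, the environment variable and the help text are placeholders invented for this example; only the base image, the `chmod` line and the `mongod` command come from the recipe in this repo.

```shell
#!/bin/bash
# Write an illustrative recipe to a scratch directory; nothing is built here.
# File name, labels, environment and help text are placeholders (assumptions).
recipe="$(mktemp -d)/Singularity.example"

cat > "$recipe" <<'EOF'
Bootstrap: docker
From: mongo:4.0.6

%help
    MongoDB 4.0.6 example image. Start it with "singularity instance start".

%labels
    Maintainer your-name-here
    Version 0.1

%environment
    export LC_ALL=C

%post
    # Runs once, inside the container, at build time
    chmod 777 /data/db

%runscript
    # Executed by "singularity run"
    exec /usr/bin/mongod --version

%startscript
    # Executed when an instance of the image is started
    /usr/bin/mongod --config /etc/mongo/mongod.conf
EOF

echo "Recipe written to $recipe"
```

The resulting file could then be built in the same way as the repo's recipe, with `sudo singularity build` on a machine where you have sudo rights.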


## Running the service as instance

This is the recommended way to run a persistent service (one that lives as long as the job runs) detached from the terminal. Singularity provides commands to start and stop the instance, execute commands inside it, and so on.

For this example we are going to run a MongoDB instance, insert some data, query the database, and get the results back.

If the database does not exist yet, we create it first on the login node. Go to `<some-directory>` and `git clone` this project:

    git clone https://git.embl.de/grp-bio-it/singularity-service-example.git

Now we are ready to create the database:

    singularity instance start  -B $PWD/data:/var/lib/mongo -B $PWD/mongoconf:/etc/mongo -B $PWD/log:/var/log/mongodb  /scratch/moscardo/singularity-service-example/mongo_4.0.6.img mongo

If you take a look at the directories `data`, `log` and `mongoconf`, you will see that they now contain files. With `-B` we bind-mount those local directories into the container, so that logs and data end up outside of it.
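
The bind mounts only work if the three host directories exist and the configuration in `mongoconf` points MongoDB at the mounted paths. The repo ships its own `mongoconf`; the sketch below only shows what a minimal equivalent could look like. The file contents and the `WORKDIR` variable are assumptions made for this example, not a copy of the repo's files.

```shell
#!/bin/bash
# Sketch: prepare host directories that can be bind-mounted into the container.
# WORKDIR is a hypothetical variable; it defaults to a scratch directory so that
# trying the sketch never overwrites the repo's own mongoconf.
WORKDIR="${WORKDIR:-$(mktemp -d)}"
mkdir -p "$WORKDIR/data" "$WORKDIR/log" "$WORKDIR/mongoconf"

# Assumed minimal mongod.conf: the paths match the -B targets used in this README.
cat > "$WORKDIR/mongoconf/mongod.conf" <<'EOF'
storage:
  dbPath: /var/lib/mongo
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
  logAppend: true
net:
  bindIp: 127.0.0.1
  port: 27017
EOF

echo "Prepared $WORKDIR"
```

With a layout like this, `data` receives the database files, `log` receives `mongod.log`, and the server only listens on the local interface of the node it runs on.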

To check the instances running:

    [moscardo@login singularity-service-example]$ singularity instance list
    DAEMON NAME      PID      CONTAINER IMAGE
    mongo            35473    /scratch/moscardo/singularity-service-example/mongo_4.0.6.img


We can see that it runs as a normal process in the system:

    [moscardo@login singularity-service-example]$ ps aux | grep mongod
    moscardo 35477  3.8  0.6 1086512 53560 ?       Sl   14:27   0:02 /usr/bin/mongod --config /etc/mongo/mongod.conf


We can gracefully shut down the database:

    15:34 $ singularity instance stop mongo
    Stopping mongo instance of /home/xemacs/sourcecode/mongo/mongo_4.0.6.img (PID=40273)    

The next steps will:

- Copy the database to tmpfs on a cluster node
- Start an instance with all the data mounted as a volume
- Insert some data
- Query the data and write the results to a file
- Copy the results and the database back to the network filesystem

Here is the sbatch script that we will run:

    #!/bin/bash

    #SBATCH --mem 8G
    #SBATCH -t 10:00
    #SBATCH --gres=tmp:1G

    # Make sure this directory exists
    MONGOPATH=/scratch/sing-training/moscardo/singularity-service-example

    echo "Copying database"
    time cp -ra "$MONGOPATH/data" "$TMPDIR"

    # The --verbose flag can be added to singularity for debugging purposes.
    echo "Running database"
    time singularity instance start -B "$MONGOPATH:$MONGOPATH" -B "$TMPDIR/data:/var/lib/mongo" -B "$MONGOPATH/mongoconf:/etc/mongo/" "$MONGOPATH/mongo_4.0.6.img" mongo

    # Give the database some time to start up, useful when it was not shut down properly
    sleep 30

    #
    # Do your computation here
    #

    # Insert results data into the database
    time singularity exec instance://mongo mongo 127.0.0.1/testddbb --eval 'var document = [{name  : "John",position : "Teacher",}, {name  : "Marie",position : "Doctor",}];db.MyCollection.insert(document);'

    # Get results from a query
    echo "Query ...."
    time singularity exec instance://mongo mongoexport -d testddbb --port 27017 -c MyCollection --query '{"position":{"$eq": "Teacher" }}' --out "$TMPDIR/test.out" --type csv --fields name,position

    sleep 5
    echo "Copying in-memory results to FS"
    time cp "$TMPDIR/test.out" "$MONGOPATH"

    sleep 5
    echo "Stopping database instance"
    time singularity instance stop mongo

    echo "Copying database to FS"
    time cp -ra "$TMPDIR/data" "$MONGOPATH/data-mod"
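
The fixed `sleep 30` in the script is a guess at how long MongoDB needs to start. One possible alternative, sketched below, is to poll the port until it accepts connections. `wait_for_port` is a hypothetical helper that relies on bash's `/dev/tcp` redirection, and an open port is only a reasonable proxy for "the database is ready", not a guarantee.

```shell
#!/bin/bash
# Poll a TCP port until it accepts connections or a timeout (in seconds) expires.
# wait_for_port is a hypothetical helper; it relies on bash's /dev/tcp redirection.
wait_for_port() {
    local host=$1 port=$2 timeout=${3:-60} waited=0
    # The subshell opens a connection on file descriptor 3 and closes it on exit.
    until (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "Timed out after ${timeout}s waiting for $host:$port" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "$host:$port is accepting connections"
}
```

In the sbatch script, a call such as `wait_for_port 127.0.0.1 27017 120 || exit 1` could take the place of `sleep 30`, so the job continues as soon as the server is up and fails early if it never starts.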

**Important to know when dealing with MongoDB:** MongoDB needs indexes to perform well, and they can take up a lot of storage. Since we copy the whole database, indexes included, to tmpfs, keep this in mind when requesting resources.
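
To get a rough figure for the tmp space to request, you can measure the data directory before submitting the job. The sketch below assumes the same `MONGOPATH` layout as the sbatch script; `dir_size_mb` is a helper name made up for this example.

```shell
#!/bin/bash
# Print the size of a directory in whole megabytes, as reported by du.
# dir_size_mb is a hypothetical helper name.
dir_size_mb() {
    du -sm "$1" | cut -f1
}

# MONGOPATH mirrors the variable in the sbatch script; point it at your own clone.
data_dir="${MONGOPATH:-.}/data"
if [ -d "$data_dir" ]; then
    size=$(dir_size_mb "$data_dir")
    echo "Database directory uses ${size} MB; request at least that much tmp space plus headroom."
else
    echo "No data directory at $data_dir yet."
fi
```

Compare the reported size with the value passed to `--gres=tmp:` in the script, and leave room for the data inserted during the job.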


## Apps
What if you want to build a single container with two or three different apps that each have their own runscripts and custom environments? In some circumstances it is redundant to build a separate container for each app when their dependencies are almost identical. In those cases we can build a single image and configure it so that each app can be called separately.
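
A minimal sketch of such a recipe is shown below, written to a temporary file without building anything. The app names `server` and `shell`, the file name and the environment setting are hypothetical; the `%apprun`, `%apphelp` and `%appenv` sections are the per-app counterparts of `%runscript`, `%help` and `%environment`.

```shell
#!/bin/bash
# Write an illustrative multi-app recipe to a scratch directory.
# The app names "server" and "shell" are hypothetical choices for this sketch.
recipe="$(mktemp -d)/Singularity.apps"

cat > "$recipe" <<'EOF'
Bootstrap: docker
From: mongo:4.0.6

%apprun server
    exec /usr/bin/mongod --config /etc/mongo/mongod.conf

%apphelp server
    Starts the MongoDB server in the foreground.

%apprun shell
    exec mongo "$@"

%appenv shell
    export LC_ALL=C
EOF

echo "Recipe written to $recipe"
```

Once an image is built from a recipe like this, each app is selected at run time with the `--app` option, for example `singularity run --app shell <image> 127.0.0.1/testddbb`.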

## Documentation
This is just an introduction to Singularity. There is much more to explore, so check the official documentation and the other resources below.

- https://www.sylabs.io/guides/2.6/user-guide/
- http://prace.it4i.cz/sites/prace.it4i.cz/files/files/perftools-hrabal-singularity.pdf
- https://www.westgrid.ca/files/WG%20singularity%20Nov21.pdf
- https://slurm.schedmd.com/SLUG17/SLUG_Bull_Singularity.pdf

## Support from the Singularity developers
Support from the Singularity developers is great: as soon as you hit a bug or have any issue, you can report it to them and they are willing to help and fix it. Interacting with them is very easy.

## Something to add....
We may want to add a user task epilog to make sure that the instance is stopped if the job reaches its time limit.
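
Until such an epilog exists, a similar effect can be approximated inside the job script itself. The sketch below is not an epilog: it is a trap-based alternative that asks Slurm to signal the batch shell shortly before the time limit. `SINGULARITY_BIN`, `INSTANCE_NAME` and `stop_instance` are names invented for this example.

```shell
#!/bin/bash
#SBATCH -t 10:00
#SBATCH --signal=B:USR1@60   # ask Slurm to send USR1 to the batch shell about 60 s before the time limit

# SINGULARITY_BIN and INSTANCE_NAME are hypothetical variables for this sketch.
SINGULARITY_BIN="${SINGULARITY_BIN:-singularity}"
INSTANCE_NAME="${INSTANCE_NAME:-mongo}"
stopped=0

# Stop the instance once; later calls do nothing.
stop_instance() {
    [ "$stopped" -eq 0 ] || return 0
    stopped=1
    if command -v "$SINGULARITY_BIN" >/dev/null 2>&1; then
        echo "Stopping instance $INSTANCE_NAME"
        "$SINGULARITY_BIN" instance stop "$INSTANCE_NAME" || true
    fi
}

trap stop_instance EXIT
trap 'stop_instance; exit 1' USR1 TERM

# ... start the instance and run the workload here ...
# bash runs a trap handler only after the current foreground command returns,
# so run long steps in the background and "wait" for them, for example:
#   long_running_step & wait $!
```

If the modified database must survive a timeout, the copy from `$TMPDIR/data` back to the network filesystem would also belong in the handler, after the instance has been stopped.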