Cleaned benchmark.py and installation

FROM ubuntu:bionic
# You can pick the Ubuntu version that suits you instead, according to the version of the boost libraries
# that you are using to compile biorseo.
# Typically, on the machine where you typed 'make', check:
# ls /usr/lib/libboost_filesystem.so.*
# This gives you the file name of your boost library, including the version number.
# Use the Docker base image of the Ubuntu release that has this version of boost in its apt sources.
FROM ubuntu:focal
# installing dependencies
# compiled biorseo
COPY . /biorseo/
# Install runtime dependencies
RUN apt-get update -yq && \
apt-get upgrade -y && \
apt-get install -y python3-dev python3-pip openjdk-11-jre libgsl23 libgslcblas0 libboost-program-options-dev libboost-filesystem-dev && \
apt-get install -y libboost-program-options-dev libboost-filesystem-dev && \
rm -rf /var/lib/apt/lists/*
# compiled biorseo
COPY . /biorseo
# ViennaRNA installer
ADD "https://www.tbi.univie.ac.at/RNA/download/ubuntu/ubuntu_18_04/viennarna_2.4.14-1_amd64.deb" /
# jar3d archive
ADD http://rna.bgsu.edu/data/jar3d/models/jar3d_2014-12-11.jar /
# install codes
RUN dpkg -i /viennarna_2.4.14-1_amd64.deb && \
apt-get install -f && \
pip3 install networkx numpy regex wrapt biopython /biorseo/BayesPairing && \
cd / && \
rm -rf /biorseo/BayesPairing /ViennaRNA-2.4.13 /ViennaRNA-2.4.13.tar.gz
WORKDIR /biorseo
\ No newline at end of file
......@@ -14,10 +14,9 @@ sudo apt update && sudo apt install docker-ce docker-ce-cli containerd.io
### Download and install the RNA motifs data files:
* Move your JSON-formatted or CSV-formatted files containing motifs in the folder.
* If you use Rna3Dmotifs, you need to get RNA-MoIP's .DESC dataset: download it from [GitHub](https://github.com/McGill-CSB/RNAMoIP/blob/master/CATALOGUE.tgz). Put all the .desc files from the `Non_Redundant_DESC` folder into `./data/modules/DESC`. Alternatively, you can run Rna3Dmotifs' `catalog` program to build your own DESC module collection from updated 3D data (download [Rna3Dmotifs](https://rna3dmotif.lri.fr/Rna3Dmotif.tgz)). In that case, you also need to move the final DESC files into `./data/modules/DESC`.
* Get the latest version of the HL and IL module models from the [BGSU website](http://rna.bgsu.edu/data/jar3d/models/) and extract the Zip files. Put the HL and IL folders from inside the Zip files into `./data/modules/BGSU`. Note that only the latest Zip is required.
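Before launching a job, it can help to confirm that the module folders are laid out as the list above describes. The following is a minimal sketch and not part of BiORSEO: the function name `check_module_layout` is hypothetical, and it only assumes the `./data/modules/DESC` folder (holding `.desc` files) and the `./data/modules/BGSU` folder (holding the `HL` and `IL` folders) named above. If you use only one of the two datasets, the warnings about the other can be ignored.

```python
from pathlib import Path


def check_module_layout(root="./data/modules"):
    """Return a list of problems found under `root` (empty list if none).

    Hypothetical helper: it only checks the folder names given in the README.
    """
    root = Path(root)
    problems = []

    # Rna3Dmotifs / RNA-MoIP modules: .desc files directly in DESC/
    desc = root / "DESC"
    if not desc.is_dir():
        problems.append(f"missing folder: {desc}")
    elif not any(p.suffix.lower() == ".desc" for p in desc.iterdir()):
        problems.append(f"no .desc files in: {desc}")

    # BGSU models: the HL and IL folders extracted from the Zip files
    for sub in ("HL", "IL"):
        folder = root / "BGSU" / sub
        if not folder.is_dir():
            problems.append(f"missing folder: {folder}")

    return problems


if __name__ == "__main__":
    for problem in check_module_layout():
        print("WARNING:", problem)
```

Run it from the biorseo/ root folder; no output means every expected folder was found.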
### Download the docker image from Docker Hub
`docker pull persalteas/biorseo:latest`
......@@ -31,9 +30,9 @@ $ docker run
You can replace \`pwd\` by the full path of the biorseo/ root folder. Here we launch the biorseo image with 4 volumes: a first to give BiORSEO access to the module files, a second to give it access to your input file(s), a third for your trained BayesPairing, and a last for it to output the result files of your job. Assuming you place your input file `myFastaFile.fa` into the `data/fasta` folder, an example job command can be `./biorseo.py -i /biorseo/data/fasta/myFastaFile.fa --rna3dmotifs --patternmatch --func B`, so the full run command would be
You can replace \`pwd\` by the full path of the biorseo/ root folder. Here we launch the biorseo image with 4 volumes: a first to give BiORSEO access to the module files, a second to give it access to your input file(s), a third for your trained BayesPairing, and a last for it to output the result files of your job. Assuming you place your input file `myFastaFile.fa` into the `data/fasta` folder, an example job command can be `./biorseo.py -i /biorseo/data/fasta/myFastaFile.fa --rna3dmotifs --func B`, so the full run command would be
$ docker run -v `pwd`/data/modules:/modules -v `pwd`/data/fasta:/biorseo/data/fasta -v `pwd`/results:/biorseo/results persalteas/biorseo ./biorseo.py -i /biorseo/data/fasta/applications.fa --rna3dmotifs --patternmatch --func B
$ docker run -v `pwd`/data/modules:/modules -v `pwd`/data/fasta:/biorseo/data/fasta -v `pwd`/results:/biorseo/results persalteas/biorseo ./bin/biorseo -s /biorseo/data/fasta/applications.fa --descfolder /biorseo/data/modules/DESC --func B -v
Note that the paths to the input and output files are paths *inside the Docker container*, and those paths are mounted to folders of the host machine with -v options.
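To make the host-to-container path mapping explicit, here is a minimal Python sketch that assembles the same `docker run` command from a biorseo/ root folder. It is an illustration under stated assumptions, not part of BiORSEO: the helper name `docker_run_command` is hypothetical, and the three mounts and the job arguments are copied from the example command above.

```python
import shlex
from pathlib import Path


def docker_run_command(root, job_args, image="persalteas/biorseo"):
    """Build the `docker run` argument list for a job (hypothetical helper).

    `root` is the biorseo/ root folder on the host; `job_args` is the
    command executed inside the container, using container paths.
    """
    root = Path(root).resolve()
    # (host folder, mount point inside the container), as in the README example
    mounts = [
        (root / "data" / "modules", "/modules"),
        (root / "data" / "fasta", "/biorseo/data/fasta"),
        (root / "results", "/biorseo/results"),
    ]
    cmd = ["docker", "run"]
    for host, container in mounts:
        cmd += ["-v", f"{host}:{container}"]
    return cmd + [image] + list(job_args)


if __name__ == "__main__":
    job = ["./bin/biorseo", "-s", "/biorseo/data/fasta/applications.fa",
           "--descfolder", "/biorseo/data/modules/DESC", "--func", "B", "-v"]
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(docker_run_command(".", job)))
```

The returned list can be passed to `subprocess.run` unchanged. Keeping host paths and container paths in separate tuple fields makes it harder to confuse the two when writing the job arguments.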
......@@ -83,12 +82,11 @@ If you use Rna3Dmotifs, you need to get RNA-MoIP's .DESC dataset: download it fr
* Check if the executable file exists: `./bin/biorseo --version`.
Now you can run biorseo.py, but since you are not inside the Docker environment, you MUST provide the options that tell it where jar3d or BayesPairing are located, for example:
Now you can run biorseo, but since you are not inside the Docker environment, you MUST provide the options that tell it where jar3d or BayesPairing are located, for example:
$ ./biorseo.py
-i ./data/fasta/applications.fa
-O ./results/
--rna3dmotifs --patternmatch --func B
--biorseo-dir /FULL/path/to/the/root/biorseo/dir
$ ./bin/biorseo
-s ./data/fasta/applications.fa
-o result.bi
--func B
......@@ -273,15 +273,15 @@ class RNA:
# print(filename, "not found !")
def load_biokop_results(self):
filename = outputDir+"PK/"+basename+".biok"
filename = outputDir+"PK/"+self.basename+".biok"
if path.isfile(filename):
rna = open(filename, "r")
lines = rna.readlines()
for i in range(2, len(lines)):
ss = lines[i].split(' ')[0].split('\t')[0]
ss = lines[i].split('\t')[0]
def load_results(self, include_noPK=False):
if "Biokop-mode" in self.meth_idx.keys():
......@@ -905,28 +905,24 @@ if __name__ == '__main__':
colors = [
'#911eb4', #purple
'#000075', #navy
'#ffe119', '#ffe119', # yellow
'#e6194B', '#e6194B', #red
'#3cb44b', '#3cb44b', #green
'#4363d8', '#4363d8', #blue
'#3cb44b', '#3cb44b', #green
def plot_best_MCCs(x_noPK_fully, x_PK_fully, x_pseudobase_fully):
print("Best MCCs...")
labels = [
"Biokop-mode\n", "RNAsubopt",
"$f_{1A}$", "$f_{1B}$",
"$f_{1A}$", "$f_{1B}$",
"Biokop\nmode", "RNA\nsubopt",
"$f_{1A}$", "$f_{1B}$",
"$f_{1A}$", "$f_{1B}$",
"$f_{1A}$", "$f_{1B}$",
fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(10,5), dpi=150)
fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(7,5), dpi=150)
fig.suptitle(" \n ")
fig.subplots_adjust(left=0.1, right=0.97, top=0.83, bottom=0.05)
fig.subplots_adjust(left=0.18, right=0.97, top=0.83, bottom=0.05)
# Line 1 : no Pseudoknots
......@@ -949,10 +945,10 @@ if __name__ == '__main__':
axes[0].set_ylabel("(A)\nmax MCC\n(%d RNAs)" % (len(x_noPK_fully[0])), fontsize=12)
# Line 2 : Pseudoknots supported
xpos = [ 0 ] + [ i for i in range(4,20) ]
xpos = [ 0 ] + [ 1+i for i in range(1, len(x_PK_fully)) ]
vplot = axes[1].violinplot(x_PK_fully, showmeans=False, showmedians=False, showextrema=False,
points=len(x_PK_fully[0]), positions=xpos)
for patch, color in zip(vplot['bodies'], colors[:1] + colors[4:]):
for patch, color in zip(vplot['bodies'], [colors[0]] + colors[2:]):
......@@ -986,13 +982,13 @@ if __name__ == '__main__':
for ax in axes:
ax.set_ylim((0.0, 1.01))
ax.set_xlim((-1, 20))
ax.set_xlim((-1, 8))
yticks = [ i/10 for i in range(0, 11, 2) ]
for y in yticks:
ax.axhline(y=y, color="grey", linestyle="--", linewidth=1)
ax.tick_params(top=False, bottom=False, labeltop=False, labelbottom=False)
ax.set_xticks([i for i in range(20)])
ax.set_xticks([i for i in range(8)])
axes[0].tick_params(top=True, bottom=False, labeltop=True, labelbottom=False)
for i, tick in enumerate(axes[0].xaxis.get_major_ticks()):
......@@ -1006,9 +1002,9 @@ if __name__ == '__main__':
# Figure : number of solutions
print("Number of solutions...")
plt.figure(figsize=(9,2.5), dpi=80)
plt.figure(figsize=(5,3), dpi=80)
plt.suptitle(" \n ")
plt.subplots_adjust(left=0.05, right=0.97, top=0.6, bottom=0.05)
plt.subplots_adjust(left=0.1, right=0.97, top=0.72, bottom=0.05)
xpos = [ x for x in range(len(n)) ]
for y in [ 10*x for x in range(8) ]:
plt.axhline(y=y, color="grey", linestyle="-", linewidth=0.5)
......@@ -1019,24 +1015,15 @@ if __name__ == '__main__':
labels = [
"RNAsubopt","RNA-MoIP\n1by1", "RNA-MoIP\nchunk",
"Biokop\nmode", "RNA\nsubopt",
"$f_{1A}$", "$f_{1B}$",
"$f_{1A}$", "$f_{1B}$", "$f_{1C}$", "$f_{1D}$",
"$f_{1A}$", "$f_{1B}$", "$f_{1C}$", "$f_{1D}$",
"$f_{1A}$", "$f_{1B}$", "$f_{1C}$", "$f_{1D}$",
"$f_{1A}$", "$f_{1B}$",
"$f_{1A}$", "$f_{1B}$"
plt.tick_params(top=False, bottom=False, labeltop=False, labelbottom=False)
plt.xticks([ i for i in range(len(labels))], labels)
plt.tick_params(top=True, bottom=False, labeltop=True, labelbottom=False)
for i, tick in enumerate(plt.gca().xaxis.get_major_ticks()):
if i<4: # Reduce size of RNA-MoIP labels to stay readable
# tick.label2.set_fontsize(8)
plt.yticks([ 20*x for x in range(3) ])
......@@ -1044,11 +1031,11 @@ if __name__ == '__main__':
# Figure : max number of insertions and ratio
fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(10,4), dpi=80)
fig.suptitle(" \n ")
fig.subplots_adjust(left=0.09, right=0.99, top=0.7, bottom=0.05)
fig.subplots_adjust(left=0.09, right=0.99, top=0.85, bottom=0.05)
# Figure : max inserted
print("Max inserted...")
xpos = [ x for x in range(18) ]
xpos = [ x for x in range(len(max_i)) ]
axes[0].set_yticks([ 5*x for x in range(3) ])
for y in [ 2*x for x in range(7) ]:
axes[0].axhline(y=y, color="grey", linestyle="-", linewidth=0.5)
......@@ -1061,14 +1048,13 @@ if __name__ == '__main__':
# Figure : insertion ratio
print("Ratio of insertions...")
xpos = [ 0 ] + [ x for x in range(2, 1+len(r)) ]
axes[1].set_ylim((-0.01, 1.01))
yticks = [ 0, 0.5, 1.0 ]
for y in yticks:
axes[1].axhline(y=y, color="grey", linestyle="-", linewidth=0.5)
vplot = axes[1].violinplot(r, showmeans=False, showmedians=False, showextrema=False, points=len(r[0]), positions=xpos)
for patch, color in zip(vplot['bodies'], [colors[2]] + colors[4:]):
for patch, color in zip(vplot['bodies'], colors[2:]):
......@@ -1078,21 +1064,15 @@ if __name__ == '__main__':
labels = labels[2:]
for ax in axes:
ax.tick_params(top=False, bottom=False, labeltop=False, labelbottom=False)
ax.set_xticks([ i for i in range(18)])
ax.set_xticks([ i for i in range(6)])
axes[0].tick_params(top=True, bottom=False, labeltop=True, labelbottom=False)
for i, tick in enumerate(axes[0].xaxis.get_major_ticks()):
if i<2: # Reduce size of RNA-MoIP labels to stay readable
# tick.label2.set_fontsize(9)
plot_best_MCCs(x_noPK_fully, x_PK_fully, x_pseudobase_fully)
......@@ -3,16 +3,13 @@
echo "WARNING: The purpose of this file is to document how the docker image was built.";
echo "You cannot execute it directly, for licensing reasons. Please get your own:";
echo "- CPLEX academic version: cplex_installer_12.8_Student.bin";
echo "- Nupack header files: nupack_3.2.2.tar.gz";
exit 0;
cd ../
THISDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
####################################################### Dependencies ##############################################################
sudo apt install -y clang-7 cmake make automake libboost-program-options-dev libboost-filesystem-dev openjdk-11-jre
sudo update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-7 100
sudo update-alternatives --install /usr/bin/clang clang /usr/bin/clang-7 100
sudo apt install -y make automake libgsl-dev libmpfr-dev libeigen3-dev libboost-program-options-dev libboost-filesystem-dev
# CPLEX: only to build biorseo
# HERE YOU SHOULD GET YOUR OWN cplex_installer_12.8_Student.bin ! I am not allowed to share mine anymore.
......@@ -20,39 +17,20 @@ chmod +x cplex_installer_12.8_Student.bin
printf "4\n\n1\n\n\n\n\n" | sudo ./cplex_installer_12.8_Student.bin
rm cplex_installer_12.8_Student.bin
# Eigen: only to build biorseo (no need to give it to the docker image)
wget http://bitbucket.org/eigen/eigen/get/3.3.7.tar.gz -O eigen_src.tar.gz
tar -xf eigen_src.tar.gz
cd eigen-eigen-323c052e1731
mkdir build
cd build
cmake ..
sudo make install
cd ../..
rm -rf eigen_src.tar.gz eigen-eigen-323c052e1731
# Nupack: only to build biorseo (no need to give it to the docker image)
#curl -u yourname@yourUni.com:yourPassword http://www.nupack.org/downloads/serve_file/nupack3.2.2.tar.gz --output nupack3.2.2.tar.gz
tar -xf nupack3.2.2.tar.gz
cd nupack3.2.2
mkdir build
cd build
cmake ..
make -j8
# ViennaRNA (to build Biorseo with libRNA)
wget https://www.tbi.univie.ac.at/RNA/download/sourcecode/2_5_x/ViennaRNA-2.5.0.tar.gz
tar xzf ViennaRNA-2.5.0.tar.gz
cd ViennaRNA-2.5.0
make -j 8
sudo make install
cd ../..
sudo cp nupack3.2.2/src/thermo/*.h /usr/local/include/nupack/thermo/
rm -rf nupack3.2.2.tar.gz nupack3.2.2/
# BayesPairing: install on the docker image (done by the Dockerfile)
git clone http://jwgitlab.cs.mcgill.ca/sarrazin/rnabayespairing.git BayesPairing
######################################################### Build Biorseo ###########################################################
# build here, install later on the docker image (done by the Dockerfile)
mkdir -p results
make -j 8
make clean
rm -rf doc/ obj/
rm -rf obj/ figures/
######################################################## Build Docker container ##################################################
# Execute the Dockerfile and build the image