Louis BECQUEY

Merge branch 'stage_NBernard' into 'master'

Stage n bernard results



See merge request !1
Showing 75 changed files with 6536 additions and 5969 deletions
.vscode/*
.vscode
# LaTeX temporary files
doc/*.toc
doc/*.bbl
doc/*.gz
doc/*.log
doc/*.aux
doc/*.blg
doc/*.fls
doc/*.fdb_latexmk
# Docker installation temporary files
eigen-eigen-323c052e1731
cplex_installer_12.8_Student.bin
......@@ -20,7 +9,6 @@ ViennaRNA-2.4.13
# Compiled Object files
obj/*
doc/*.pdf
data/modules/RIN/__pycache__
# Executables
......@@ -44,4 +32,4 @@ data/modules/RIN
data/modules/ISAURE
data/sec_structs/bpRNA-1m_90.dbn
data/sec_structs/pseudobase++.dbn
data/fasta/contacts
......
......@@ -18,27 +18,6 @@ sudo apt update && sudo apt install docker-ce docker-ce-cli containerd.io
* Get the latest version of the HL and IL module models from the [BGSU website](http://rna.bgsu.edu/data/jar3d/models/) and extract the Zip files. Put the HL and IL folders from inside the Zip files into `./data/modules/BGSU`. Note that only the latest Zip is required.
### Install and train BayesPairing
To use BayesPairing, you need to install it on the host machine to train it (i.e. build Bayesian networks for every RNA motif in your database). It is already installed in the Docker container, but not trained, so you need to train it on your data and tell Docker to mount it (see the docker run command below).
Make sure you have Python 3.5+ with the packages networkx, numpy, regex, wrapt and biopython. You can install them with pip; on Linux you will need the python3-dev package to build them.
On Windows and macOS: script coming soon
On Linux:
```
$ sudo -H pip3 install --upgrade pip
$ sudo -H pip3 install networkx numpy regex wrapt biopython
$ git clone http://jwgitlab.cs.mcgill.ca/sarrazin/rnabayespairing.git BayesPairing
$ cd BayesPairing
$ sudo -H pip3 install .
$ cd bayespairing/src
$ python3 parse_sequences.py -d rna3dmotif -seq ACACGGGGUAAGAGCUGAACGCAUCUAAGCUCGAAACCCACUUGGAAAAGAGACACCGCCGAGGUCCCGCGUACAAGACGCGGUCGAUAGACUCGGGGUGUGCGCGUCGAGGUAACGAGACGUUAAGCCCACGAGCACUAACAGACCAAAGCCAUCAU -ss ".................................................................((...............)xxxx(...................................................)xxx).............."
$ python3 parse_sequences.py -d 3dmotifatlas -seq ACACGGGGUAAGAGCUGAACGCAUCUAAGCUCGAAACCCACUUGGAAAAGAGACACCGCCGAGGUCCCGCGUACAAGACGCGGUCGAUAGACUCGGGGUGUGCGCGUCGAGGUAACGAGACGUUAAGCCCACGAGCACUAACAGACCAAAGCCAUCAU -ss ".................................................................((...............)xxxx(...................................................)xxx).............."
```
Training takes a long time, but it only has to be run once.
### Download the docker image from Docker Hub
`docker pull persalteas/biorseo:latest`
......@@ -48,14 +27,13 @@ Use the following command to run the docker image:
$ docker run
-v `pwd`/data/modules:/modules
-v `pwd`/data/fasta:/biorseo/data/fasta
-v `pwd`/BayesPairing/bayespairing:/byp
-v `pwd`/results:/biorseo/results
persalteas/biorseo
yourexamplejobcommandhere
```
You can replace \`pwd\` with the full path of the biorseo/ root folder. Here we launch the biorseo image with 4 volumes: the first gives BiORSEO access to the module files, the second gives it access to your input file(s), the third holds your trained BayesPairing, and the last receives the result files of your job. Assuming you place your input file `myFastaFile.fa` into the `data/fasta` folder, an example job command is `./biorseo.py -i /biorseo/data/fasta/myFastaFile.fa --rna3dmotifs --patternmatch --func B`, so the full run command would be
```
$ docker run -v `pwd`/data/modules:/modules -v `pwd`/data/fasta:/biorseo/data/fasta -v `pwd`/BayesPairing/bayespairing:/byp -v `pwd`/results:/biorseo/results persalteas/biorseo ./biorseo.py -i /biorseo/data/fasta/applications.fa --rna3dmotifs --patternmatch --func B
$ docker run -v `pwd`/data/modules:/modules -v `pwd`/data/fasta:/biorseo/data/fasta -v `pwd`/results:/biorseo/results persalteas/biorseo ./biorseo.py -i /biorseo/data/fasta/applications.fa --rna3dmotifs --patternmatch --func B
```
Note that the paths to the input and output files are paths *inside the Docker container*, and those paths are mounted to folders of the host machine with -v options.
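The mapping between host folders and container paths described above can be written down explicitly. The following is a minimal sketch, not part of BiORSEO: the helper name `build_docker_run` and the `VOLUMES` table are hypothetical, and the code only assembles the argument list of the `docker run` command shown above without running it.

```python
import os

# Host folder (relative to the biorseo/ root) -> path inside the container,
# copied from the -v options of the docker run command above.
VOLUMES = {
    "data/modules": "/modules",
    "data/fasta": "/biorseo/data/fasta",
    "results": "/biorseo/results",
}


def build_docker_run(root, job_args, image="persalteas/biorseo"):
    """Return the docker run command as a list of arguments (hypothetical helper)."""
    cmd = ["docker", "run"]
    for host_dir, container_dir in VOLUMES.items():
        # Each -v option has the form <host path>:<container path>.
        cmd += ["-v", root.rstrip("/") + "/" + host_dir + ":" + container_dir]
    cmd.append(image)
    return cmd + list(job_args)


if __name__ == "__main__":
    job = ["./biorseo.py", "-i", "/biorseo/data/fasta/applications.fa",
           "--rna3dmotifs", "--patternmatch", "--func", "B"]
    print(" ".join(build_docker_run(os.getcwd(), job)))
```

The `-i` path in the job arguments is a container path, which is why it starts with `/biorseo/data/fasta` rather than with the host folder.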
......@@ -68,16 +46,12 @@ Option 2 : Compile and Install from source (without docker, Linux only)
* Create folders for the modules you will use: `mkdir -p data/modules/`. If you plan to use several module sources, add subdirectories :
```bash
mkdir -p data/modules/DESC
mkdir -p data/modules/BGSU
mkdir -p data/modules/RIN
mkdir -p data/modules/ISAURE/Motifs_derniere_version
mkdir -p data/modules/DESC
mkdir -p data/modules/JSON
```
### RNA3DMOTIFS DATA
If you use Rna3Dmotifs, you need to get RNA-MoIP's .DESC dataset: download it from [GitHub](https://github.com/McGill-CSB/RNAMoIP/blob/master/CATALOGUE.tgz). Put all the .desc from the `Non_Redundant_DESC` folder into `./data/modules/DESC`. Otherwise, you also can run Rna3Dmotifs' `catalog` program to get your own DESC modules collection from updated 3D data (download [Rna3Dmotifs](https://rna3dmotif.lri.fr/Rna3Dmotif.tgz)). You also need to move the final DESC files into `./data/modules/DESC`.
### THE RNA 3D MOTIF ATLAS DATA
Get the latest version of the HL and IL module models from the [BGSU website](http://rna.bgsu.edu/data/jar3d/models/) and extract the Zip files. Put the HL and IL folders from inside the Zip files into `./data/modules/BGSU`. Note that only the latest Zip is required.
......@@ -91,52 +65,23 @@ python3 Install_CaRNAval_RINs.py
```
If you do not have the unzip command, download and extract manually the [CaRNAval dataset](http://carnaval.lri.fr/carnaval_dataset.zip) and place the files `RIN.py` and `CaRNAval_1_as_dictionnary.nxpickled` in the folder `data/modules/RIN/`, and run the python script.
### CONTACTS DATA
### RNA3DMOTIFS DATA (DEPRECATED)
If you use contacts, you need to put the motifs.json of Isaure in `data/modules/ISAURE`.
If you use Rna3Dmotifs, you need to get RNA-MoIP's .DESC dataset: download it from [GitHub](https://github.com/McGill-CSB/RNAMoIP/blob/master/CATALOGUE.tgz). Put all the .desc from the `Non_Redundant_DESC` folder into `./data/modules/DESC`. Otherwise, you also can run Rna3Dmotifs' `catalog` program to get your own DESC modules collection from updated 3D data (download [Rna3Dmotifs](https://rna3dmotif.lri.fr/Rna3Dmotif.tgz)). You also need to move the final DESC files into `./data/modules/DESC`.
### DEPENDENCIES
- Make sure you have Python 3.7+ and a C++ compiler (tested with GCC and clang) installed on your distribution. Use a recent one, since the code relies on the C++17 standard. For example, the compilation will not work with Ubuntu 16's GCC 5.4.
- Install automake, libeigen3-dev, libboost-program-options-dev and libboost-filesystem-dev, or equivalent packages in your distribution (Eigen 3 and Boost headers).
- Download and install the [ViennaRNA package](https://www.tbi.univie.ac.at/RNA/). We are interested in the libRNA library to build Biorseo first, and then, Jar3D and BayesPairing may use some of the executables (RNAsubopt, RNAfold).
- If you use the pre-compiled packages, you need both the "Core" package and the development files.
- If you compile it from source, extract the archive (`tar -xvzf ViennaRNA-2.4.15.tar.gz`), go into the folder (`cd ViennaRNA-2.4.15`), configure and build it (`./configure` and then `make -j 4`), and finally install it with root permissions (`sudo make install`). This takes ~15 min. Optionally, you may want to install the GSL and MPFR libraries beforehand (libgsl-dev and libmpfr-dev on Ubuntu) for better performance.
- Download and install the [ViennaRNA package](https://www.tbi.univie.ac.at/RNA/). We use their libRNA C++ API to build Biorseo.
- If you use the pre-compiled packages, you need the "Core" package and the development files.
- If you compile it from source, extract the archive (`tar -xvzf ViennaRNA-2.5.0.tar.gz`), go into the folder (`cd ViennaRNA-2.5.0`), configure and build it (`./configure` and then `make -j 4`), and finally install it with root permissions (`sudo make install`). This takes ~15 min. Optionally, you may want to install the GSL and MPFR libraries beforehand (libgsl-dev and libmpfr-dev on Ubuntu) for better performance.
- Download and install [IBM ILOG Cplex optimization studio](https://www.ibm.com/analytics/cplex-optimizer), through their [academic initiative](https://www.ibm.com/academic/home). The Student and the Community Edition versions are fine, but the free version is too limited. Registering as academic is free. We actually don't use the Studio, but we use the development files to compile Biorseo.
### OPTIONAL DEPENDENCIES FOR USE OF JAR3D
- Download and install Java runtime (Tested with Java 14)
- Download the latest JAR3D executable "*jar3d_releasedate.jar*" from [the BGSU website](http://rna.bgsu.edu/data/jar3d/models/).
### OPTIONAL DEPENDENCIES FOR USE OF BAYESPAIRING
- Make sure you can `import RNA` in your favorite python3.X. Otherwise, there might be a ViennaRNA installation issue.
- Depending on your distribution, this might require that you add the /usr/local/lib/python3.X/site-packages/ folder, or the libRNA.so library, to your $PYTHONPATH variable. To do so, edit your `~/.bashrc` file and write `export PYTHONPATH="$PYTHONPATH:/usr/local/lib/python3.X/site-packages"` at the end of the file, replacing 3.X by your actual version of Python. Then close and re-open the terminal.
- On Ubuntu/Debian distros, which have stupid conventions and ignore the site-packages/ folder, you need to move the python package to the dist-packages/ folder. E.g., if you compiled ViennaRNA from source, `sudo mv /usr/local/lib/python3.X/site-packages/RNA /usr/local/lib/python3.X/dist-packages/`.
- Make sure you have Python 3.6+ with pip (packages python3-pip and python3-dev on most distros)
- Clone the latest BayesPairing 2 Git repo, and install it :
```
git clone http://jwgitlab.cs.mcgill.ca/sarrazin/rnabayespairing2.git BayesPairing2
cd BayesPairing2
python3 -m pip install .
cd ..
```
### BUILDING
* You might want to edit `Makefile` if you did not install CPLEX or the libRNA in the default location. Please update the top variables $ICONCERT, $ICPLEX, $LCONCERT, and $LCPLEX with the correct locations.
* Build it: `make -j4`
* Check if the executable file exists: `./bin/biorseo --version`.
### BAYESPAIRING USERS: PREPARE BAYESIAN NETWORKS
We run an example job so that BayesPairing builds the Bayesian networks of our modules.
```
cd BayesPairing2/bayespairing/src
python3 parse_sequences.py -seq "UUUUUUAAGGAAGAUCUGGCCUUCCCACAAGGGAAGGCCAAAGAAUUUCCUU" -t 4 -d rna3dmotif -ss "......(((((((.((((((((((((....)))))))))..))).)))))))"
```
Use `-d rna3dmotif` or `-d 3dMotifAtlas_ALL`, depending on the module source you are planning to use.
This step takes a long time, but the Bayesian networks will then be ready for all future uses.
### RUN BIORSEO
Now you can run biorseo.py but, since you are not in the Docker environment, you MUST provide the options that tell it where jar3d or BayesPairing is located, for example:
```
......@@ -146,5 +91,4 @@ $ ./biorseo.py
--rna3dmotifs --patternmatch --func B
--biorseo-dir /FULL/path/to/the/root/biorseo/dir
--modules-path=./data/modules/DESC
--jar3d-exec=./jar3d_releasedate.jar OR --bypdir=./BayesPairing2/bayespairing/src
```
......
import time
import subprocess
import os
import os.path
from math import sqrt, ceil
import numpy as np
import matplotlib.pyplot as plt
log_path = "test.log"
log = open(log_path, 'a')
# Run a command and copy its output (stdout and stderr) to the log file as it is produced.
def run_test(cmd, log):
    log.write(time.asctime(time.localtime(time.time())) + " : Run process \"" + cmd + "\"\n")
    log.flush()
    process = subprocess.Popen(cmd.split(' '), shell=False, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # Poll process.stdout to show stdout live
    while process.poll() is None:
        output = process.stdout.readline()
        if output:
            log.write(output.decode())
            log.flush()
    rc = process.poll()
# Create the command line to run BiORSEO with the module library of CaRNAval
def create_command_rin(path, name, function, estimator):
    cmd = ("python3 " + path + "/biorseo.py -i " +
           path + "/data/fasta/" +
           name + ".fa " +
           "-O results/ " +
           "--carnaval " +
           "--patternmatch " +
           "--func " + function + " --" + estimator + " -v " +
           " --biorseo-dir " + path + " " +
           "--modules-path " + path + "/data/modules/RIN/Subfiles")
    return cmd
# Create the command line to run BiORSEO with the module library of the RNA 3D Motif Atlas
def create_command_bgsu(path, name, function, estimator):
    cmd = ("python3 " + path + "/biorseo.py -i " +
           path + "/data/fasta/" +
           name + ".fa " +
           "-O results/ " +
           "--3dmotifatlas " +
           "--jar3d " +
           "--func " + function + " --" + estimator + " -v " +
           "--jar3d-exec " + " /local/local/localopt/jar3d_2014-12-11.jar" +
           " --biorseo-dir " + path + " " +
           "--modules-path " + path + "/data/modules/BGSU")
    return cmd
# Create the command line to run BiORSEO with the module library of Rna3Dmotifs
def create_command_desc(path, name, function, estimator):
    cmd = ("python3 " + path + "/biorseo.py -i " +
           path + "/data/fasta/" +
           name + ".fa " +
           "-O results/ " +
           "--rna3dmotifs " +
           "--patternmatch " +
           "--func " + function + " --" + estimator + " -v " +
           " --biorseo-dir " + path + " " +
           "--modules-path " + path + "/data/modules/DESC")
    return cmd
# Create the command line to run BiORSEO with Isaure's motif library (.json files)
def create_command_isaure(path, name, function, estimator):
    cmd = ("python3 " + path + "/biorseo.py -i " +
           path + "/data/fasta/" +
           name + ".fa " +
           "-O results/ " +
           "--contacts " +
           "--patternmatch " +
           "--func " + function + " --" + estimator +
           " --biorseo-dir " + path + " " +
           "--modules-path " + path + "/data/modules/ISAURE/bibliotheque_a_lire")
    return cmd
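The four builders above differ only in the module-source flag, the placement method and the module folder. A minimal sketch of a single table-driven builder is given below. The names `SOURCES` and `build_biorseo_command` are hypothetical, the flags and folders are copied from the functions above, and source-specific extras (such as `-v` or `--jar3d-exec` followed by a path) are left to the caller through the `extra` argument. Returning a list of arguments also avoids having to split a command string on spaces later.

```python
# Per-source options copied from the create_command_* functions above:
# (flags selecting the module source and placement method, module folder).
SOURCES = {
    "rin": (["--carnaval", "--patternmatch"], "/data/modules/RIN/Subfiles"),
    "bgsu": (["--3dmotifatlas", "--jar3d"], "/data/modules/BGSU"),
    "desc": (["--rna3dmotifs", "--patternmatch"], "/data/modules/DESC"),
    "isaure": (["--contacts", "--patternmatch"],
               "/data/modules/ISAURE/bibliotheque_a_lire"),
}


def build_biorseo_command(path, name, function, estimator, source, extra=()):
    """Return the biorseo.py command as an argument list (hypothetical helper)."""
    flags, modules_dir = SOURCES[source]
    return (["python3", path + "/biorseo.py",
             "-i", path + "/data/fasta/" + name + ".fa",
             "-O", "results/"]
            + flags
            + ["--func", function, "--" + estimator]
            + list(extra)
            + ["--biorseo-dir", path,
               "--modules-path", path + modules_dir])
```

Such a list can be passed directly to `subprocess.run` or `subprocess.Popen` without `shell=True`.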
# Execute the command line corresponding to the information given in the arguments.
# NOTE: `name` is not a parameter of this function; it must be defined at module level before the call.
def execute_command(path, function, estimator, true_ctc, true_str, list_ctc, list_str, modules):
    if (modules == 'desc'):
        cmd = create_command_desc(path, name, function, estimator)
    elif (modules == 'bgsu'):
        cmd = create_command_bgsu(path, name, function, estimator)
    elif (modules == 'rin'):
        cmd = create_command_rin(path, name, function, estimator)
    elif (modules == 'isaure'):
        cmd = create_command_isaure(path, name, function, estimator)
    os.system(cmd)
    """file_path = "results/test_" + name + ".json_pm" + function + "_" + estimator
    if os.path.isfile(file_path):
        tab = write_mcc_in_file(name, true_ctc, true_str, estimator, function)
        list_ctc.append(tab[0])
        list_str.append(tab[1])"""
# Retrieve the best MCC among the structures predicted by BiORSEO for one sequence of the benchmark.txt file
def get_list_str_by_seq(name, estimator, function, list_str, true_str, modules):
    if modules == 'bgsu':
        extension = ".jar3d"
    elif modules == 'desc':
        extension = ".desc_pm"
    elif modules == 'rin':
        extension = ".rin_pm"
    elif modules == 'json':
        extension = ".json_pm"
    file_path = "results/test_" + name + extension + function + "_" + estimator
    if os.path.isfile(file_path):
        path_benchmark = "data/modules/ISAURE/benchmark.txt"
        max_mcc = get_mcc_structs_max(path_benchmark, name, estimator, function, extension, modules, true_str)
        list_str.append(max_mcc)
# ================== Code from Louis Becquey Benchark.py ==============================
# Convert a dot-bracket string into a list of (closing index, opening index) base pairs.
def dbn_to_basepairs(structure):
    parenthesis = []
    brackets = []
    braces = []
    rafters = []
    basepairs = []
    As = []
    Bs = []
    for i, c in enumerate(structure):
        if c == '(':
            parenthesis.append(i)
        if c == '[':
            brackets.append(i)
        if c == '{':
            braces.append(i)
        if c == '<':
            rafters.append(i)
        if c == 'A':
            As.append(i)
        if c == 'B':
            Bs.append(i)
        if c == '.':
            continue
        if c == ')':
            basepairs.append((i, parenthesis.pop()))
        if c == ']':
            basepairs.append((i, brackets.pop()))
        if c == '}':
            basepairs.append((i, braces.pop()))
        if c == '>':
            basepairs.append((i, rafters.pop()))
        if c == 'a':
            basepairs.append((i, As.pop()))
        if c == 'b':
            basepairs.append((i, Bs.pop()))
    return basepairs
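As a quick illustration of the stack-based parsing above, the sketch below is a compact, table-driven variant written for this note. The names `CLOSERS` and `pairs_from_dbn` are hypothetical and this is not the function used by the script. Each opening symbol pushes its index on a stack of its own type and each closing symbol pops the matching opener, so crossing pairs (pseudoknots) written with different bracket types are handled independently.

```python
# Closing symbol -> matching opening symbol, same alphabet as dbn_to_basepairs.
CLOSERS = {')': '(', ']': '[', '}': '{', '>': '<', 'a': 'A', 'b': 'B'}


def pairs_from_dbn(structure):
    """Return (closing_index, opening_index) tuples, in order of closing position."""
    stacks = {opener: [] for opener in CLOSERS.values()}
    basepairs = []
    for i, c in enumerate(structure):
        if c in stacks:
            # Opening symbol: remember where it was opened.
            stacks[c].append(i)
        elif c in CLOSERS:
            # Closing symbol: pair it with the most recent opener of the same type.
            basepairs.append((i, stacks[CLOSERS[c]].pop()))
    return basepairs


print(pairs_from_dbn("((..))"))  # nested helix -> [(4, 1), (5, 0)]
print(pairs_from_dbn("([)]"))    # crossing pairs -> [(2, 0), (3, 1)]
```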
# Count the positions where the expected ('*' = contact, '.' = no contact) and predicted contacts agree or differ.
def compare_two_contacts(true_ctc, prediction):
    tp = 0
    fp = 0
    tn = 0
    fn = 0
    for i in range(len(true_ctc)):
        if true_ctc[i] == '*' and prediction[i] == '*':
            tp += 1
        elif true_ctc[i] == '.' and prediction[i] == '.':
            tn += 1
        elif true_ctc[i] == '.' and prediction[i] == '*':
            fp += 1
        elif true_ctc[i] == '*' and prediction[i] == '.':
            fn += 1
    return [tp, tn, fp, fn]
# Count the base pairs shared by, or specific to, the expected and predicted structures.
def compare_two_structures(true2d, prediction):
    true_basepairs = dbn_to_basepairs(true2d)
    pred_basepairs = dbn_to_basepairs(prediction)
    tp = 0
    fp = 0
    tn = 0
    fn = 0
    for bp in true_basepairs:
        if bp in pred_basepairs:
            tp += 1
        else:
            fn += 1
    for bp in pred_basepairs:
        if bp not in true_basepairs:
            fp += 1
    # Every other possible pair of positions counts as a true negative.
    tn = len(true2d) * (len(true2d) - 1) * 0.5 - fp - fn - tp
    return [tp, tn, fp, fn]
def mattews_corr_coeff(tp, tn, fp, fn):
    if ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn) == 0):
        # The MCC is undefined here (zero denominator): -1 is returned as a sentinel value.
        #print("warning: division by zero! no contact in the prediction")
        #print("tp: " + str(tp) + " fp: " + str(fp) + " tn: " + str(tn) + " fn: " + str(fn))
        return -1
    elif (tp + fp == 0):
        # Never reached: tp + fp == 0 already makes the product tested above equal to zero.
        print("We have an issue : no positives detected ! (linear structure)")
    return (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
def f1_score(tp, tn, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)
def specificity(tp, tn, fp, fn):
    return tn / (tn + fp)
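To make the counting in `compare_two_structures` concrete, here is a small worked example, written as a self-contained sketch (the helper names `confusion` and `mcc` are hypothetical). For a sequence of length n there are n(n-1)/2 possible pairs, so the true negatives are whatever remains after removing tp, fp and fn. Unlike the function above, this sketch returns `None` when the denominator is zero. That is an assumption made here so that an undefined score cannot be confused with a genuine MCC of -1, not the behavior of the script.

```python
from math import sqrt


def confusion(true_pairs, pred_pairs, n):
    """Return (tp, tn, fp, fn) for two sets of base pairs on a sequence of length n."""
    true_pairs, pred_pairs = set(true_pairs), set(pred_pairs)
    tp = len(true_pairs & pred_pairs)
    fn = len(true_pairs - pred_pairs)
    fp = len(pred_pairs - true_pairs)
    tn = n * (n - 1) // 2 - tp - fp - fn
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, or None when it is undefined."""
    denom = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if denom == 0:
        return None
    return (tp * tn - fp * fn) / sqrt(denom)


# Reference "((..))" versus prediction "(.().)" on a sequence of length 6.
counts = confusion({(5, 0), (4, 1)}, {(5, 0), (3, 2)}, 6)
print(counts, mcc(*counts))
```

Here one pair is shared, one is missed and one is wrongly predicted, so the counts are (1, 12, 1, 1) and the MCC is 11/26, about 0.42.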
# ================== Code from Louis Becquey Benchark.py ==============================
# Get the best MCC value over all the predictions in the results file of the sequence given as argument.
# The French labels written to the report file are kept unchanged ("attendue" = expected, "predite" = predicted).
def get_mcc_structs_max(path_benchmark, sequence_id, estimator, function, extension, modules, true_structure):
    read_prd = open("results/test_" + sequence_id + extension + function + "_" + estimator, "r")
    write = open("results/test_" + sequence_id + ".mcc_" + function + "_" + estimator + "_" + modules, "w")
    max_mcc_str = -1
    title_exp = ">test_" + sequence_id + ": "
    write.write(title_exp)
    structure_exp = true_structure
    write.write("structure 2d attendue:\n" + structure_exp + "\n")
    title_prd = read_prd.readline()
    structure_prd = read_prd.readline()
    sequence_prd = structure_prd
    while structure_prd:
        structure_prd = read_prd.readline()
        if (len(structure_prd) != 0):
            write.write("\nstructure 2d predite:\n" + structure_prd[:len(sequence_prd)] + "\n")
            mcc_tab = compare_two_structures(structure_exp, structure_prd[:len(sequence_prd)])
            mcc_str = mattews_corr_coeff(mcc_tab[0], mcc_tab[1], mcc_tab[2], mcc_tab[3])
            if (max_mcc_str < mcc_str):
                max_mcc_str = mcc_str
            write.write("mcc: " + str(mcc_str) + "\n")
            contacts_prd = read_prd.readline()
    write.write("max mcc 2D:" + str(max_mcc_str))
    read_prd.close()
    write.close()
    return max_mcc_str
# Create a file that stores the MCC values obtained for each prediction of the
# input sequence.
# The French labels written to the report file are kept unchanged ("attendus" = expected, "predits" = predicted).
def write_mcc_in_file(sequence_id, true_contacts, true_structure, estimator, function):
    read_prd = open("results/test_" + sequence_id + ".json_pm" + function + "_" + estimator, "r")
    write = open("results/test_" + sequence_id + ".mcc_" + function + "_" + estimator, "w")
    max_mcc_str = -1
    max_mcc_ctc = -1
    title_exp = ">test_" + sequence_id + ": "
    write.write(title_exp)
    contacts_exp = true_contacts
    structure_exp = true_structure
    write.write("structure 2d attendue:\n" + structure_exp + "\n")
    write.write("contacts attendus:\n" + contacts_exp + "\n" + len(structure_exp) * "-")
    title_prd = read_prd.readline()
    structure_prd = read_prd.readline()
    sequence_prd = structure_prd
    while structure_prd:
        structure_prd = read_prd.readline()
        if (len(structure_prd) != 0):
            write.write("\nstructure 2d predite:\n" + structure_prd[:len(sequence_prd)] + "\n")
            mcc_tab = compare_two_structures(structure_exp, structure_prd[:len(sequence_prd)])
            mcc_str = mattews_corr_coeff(mcc_tab[0], mcc_tab[1], mcc_tab[2], mcc_tab[3])
            if (max_mcc_str < mcc_str):
                max_mcc_str = mcc_str
            write.write("mcc: " + str(mcc_str) + "\n")
            contacts_prd = read_prd.readline()
            write.write("\ncontacts predits:\n" + contacts_prd)
            if (len(contacts_prd) == len(contacts_exp)):
                mcc_tab = compare_two_contacts(contacts_exp, contacts_prd)
                mcc_ctc = mattews_corr_coeff(mcc_tab[0], mcc_tab[1], mcc_tab[2], mcc_tab[3])
                if (max_mcc_ctc < mcc_ctc):
                    max_mcc_ctc = mcc_ctc
                write.write("mcc: " + str(mcc_ctc) + "\n\n")
            else:
                write.write("mcc: no expected contacts sequence or not same length between expected and predicted\n\n")
    write.write("max mcc 2D:" + str(max_mcc_str))
    write.write("max mcc ctc:" + str(max_mcc_ctc))
    read_prd.close()
    write.close()
    return [max_mcc_ctc, max_mcc_str]
def set_axis_style(ax, labels):
    ax.xaxis.set_tick_params(direction='out')
    ax.xaxis.set_ticks_position('bottom')
    ax.set_xticks(np.arange(1, len(labels) + 1))
    ax.set_xticklabels(labels)
    ax.set_xlim(0.25, len(labels) + 0.75)
    ax.set_xlabel('Sample name')
# Create one violin plot showing the distribution of the best MCC (base pairing)
def visualization_best_mcc_str(list_struct2d, estimator, function, modules, color, lines_color):
    print(estimator + " + " + function + ": ")
    np_struct2d = np.array(list_struct2d)
    data_to_plot = np_struct2d
    median_2d = np.median(np_struct2d)
    print("median 2D: " + str(median_2d) + "\n")
    fig = plt.figure()
    ax = fig.add_axes([0, 0, 1, 1])
    labels = ['structure 2D']
    ax.set_xticks(np.arange(1, len(labels) + 1))
    ax.set_xticklabels(labels)
    ax.set_xlabel(function)
    ax.set_ylabel('MCC')
    violins = ax.violinplot(data_to_plot, showmedians=True)
    for partname in ('cbars', 'cmins', 'cmaxes', 'cmedians'):
        vp = violins[partname]
        vp.set_edgecolor(lines_color)
        vp.set_linewidth(1)
    for v in violins['bodies']:
        v.set_facecolor(color)
    plt.savefig('visualisation_28_09' + estimator + '_' + function + '_' + modules + '.png', bbox_inches='tight')
# Create 4 violin plots showing the distribution of the best MCC (base pairing)
def visualization_best_mcc_str_4_figures(list_struct2d, color, lines_color):
    np_struct2d_1 = np.array(list_struct2d[0])
    np_struct2d_2 = np.array(list_struct2d[1])
    np_struct2d_3 = np.array(list_struct2d[2])
    np_struct2d_4 = np.array(list_struct2d[3])
    data_to_plot = [np_struct2d_1, np_struct2d_2, np_struct2d_3, np_struct2d_4]
    fig = plt.figure()
    fig.set_size_inches(6, 3)
    ax = fig.add_axes([0, 0, 1, 1])
    labels = ['MEA + E', 'MEA + F', 'MFE + E', 'MFE + F']
    ax.set_xticks(np.arange(1, len(labels) + 1))
    ax.set_xticklabels(labels)
    #ax.set_xlim(0.25, len(labels) + 0.75)
    ax.set_xlabel("Isaure's motifs")
    ax.set_ylabel('MCC (base pairs)')
    violins = ax.violinplot(data_to_plot, showmedians=True)
    for partname in ('cbars', 'cmins', 'cmaxes', 'cmedians'):
        vp = violins[partname]
        vp.set_edgecolor(lines_color)
        vp.set_linewidth(1)
    for v in violins['bodies']:
        v.set_facecolor(color)
    plt.savefig('visualisation_28_09_Isaure_E_F.png', dpi=200, bbox_inches='tight')
# Create 2 violin plots showing the distribution of the best MCC (one for the MCC of
# the base pairing and one for the contacts)
def visualization_best_mcc(list_struct2d, list_contacts, estimator, function, modules, color, lines_color):
    print(estimator + " + " + function + ": ")
    np_struct2d = np.array(list_struct2d)
    np_contacts = np.array(list_contacts)
    data_to_plot = [np_struct2d, np_contacts]
    median_2d = np.median(np_struct2d)
    median_ctc = np.median(np_contacts)
    print("median 2D: " + str(median_2d) + "\n")
    print("median ctc: " + str(median_ctc) + "\n")
    fig = plt.figure()
    ax = fig.add_axes([0, 0, 1, 1])
    labels = ['structure 2D', 'contacts']
    ax.set_xticks(np.arange(1, len(labels) + 1))
    ax.set_xticklabels(labels)
    ax.set_xlabel(function)
    ax.set_ylabel('MCC')
    violins = ax.violinplot(data_to_plot, showmedians=True)
    for partname in ('cbars', 'cmins', 'cmaxes', 'cmedians'):
        vp = violins[partname]
        vp.set_edgecolor(lines_color)
        vp.set_linewidth(1)
    for v in violins['bodies']:
        v.set_facecolor(color)
    plt.savefig('visualisation_16_06_' + estimator + '_' + function + '_' + modules + '.png', bbox_inches='tight')
# Return the list of names of all the sequences in benchmark.txt, the list of structure MCCs and the list
# of contact MCCs computed from the predictions in the result file of each sequence.
# This function is only used for result files in the .json_pm format.
def get_list_structs_contacts_all(path_benchmark, estimator, function):
    myfile = open(path_benchmark, "r")
    list_name = []
    complete_list_struct2d = []
    complete_list_contacts = []
    name = myfile.readline()
    contacts = myfile.readline()
    seq = myfile.readline()
    structure2d = myfile.readline()
    count = 0
    while seq:
        name = name[6:].strip()
        count = count + 1
        file_path = "results/test_" + name + ".json_pm" + function + "_" + estimator
        if os.path.isfile(file_path):
            file_result = open(file_path, "r")
            list_struct2d = []
            list_contacts = []
            list_name.append(name)
            title_prd = file_result.readline()
            structure_prd = file_result.readline()
            sequence = structure_prd
            while structure_prd:
                structure_prd = file_result.readline()
                if (len(structure_prd) != 0):
                    mcc_tab = compare_two_structures(structure2d, structure_prd[:len(sequence)])
                    mcc_str = mattews_corr_coeff(mcc_tab[0], mcc_tab[1], mcc_tab[2], mcc_tab[3])
                    list_struct2d.append(mcc_str)
                    contacts_prd = file_result.readline()
                    if (len(contacts_prd) == len(contacts)):
                        mcc_tab = compare_two_contacts(contacts, contacts_prd)
                        mcc_ctc = mattews_corr_coeff(mcc_tab[0], mcc_tab[1], mcc_tab[2], mcc_tab[3])
                        list_contacts.append(mcc_ctc)
            complete_list_struct2d.append(list_struct2d)
            complete_list_contacts.append(list_contacts)
            file_result.close()
        name = myfile.readline()
        contacts = myfile.readline()
        seq = myfile.readline()
        structure2d = myfile.readline()
    # Close the benchmark file before returning (the close() call used to come after the return and never ran).
    myfile.close()
    return [list_name, complete_list_struct2d, complete_list_contacts]
# Return the list of names of all the sequences in benchmark.txt and the list of structure MCCs
# computed from the predictions in the result file of each sequence.
def get_list_structs_all(path_benchmark, estimator, function, modules):
    if modules == 'bgsu':
        extension = ".jar3d"
    elif modules == 'desc':
        extension = ".desc_pm"
    elif modules == 'rin':
        extension = ".rin_pm"
    elif modules == 'json':
        extension = ".json_pm"
    myfile = open(path_benchmark, "r")
    list_name = []
    complete_list_struct2d = []
    complete_list_contacts = []
    name = myfile.readline()
    contacts = myfile.readline()
    seq = myfile.readline()
    structure2d = myfile.readline()
    count = 0
    while seq:
        name = name[6:].strip()
        count = count + 1
        file_path = "results/test_" + name + extension + function + "_" + estimator
        if os.path.isfile(file_path):
            file_result = open(file_path, "r")
            list_struct2d = []
            list_name.append(name)
            title_prd = file_result.readline()
            structure_prd = file_result.readline()
            sequence = structure_prd
            while structure_prd:
                structure_prd = file_result.readline()
                if (len(structure_prd) != 0):
                    mcc_tab = compare_two_structures(structure2d, structure_prd[:len(sequence)])
                    mcc_str = mattews_corr_coeff(mcc_tab[0], mcc_tab[1], mcc_tab[2], mcc_tab[3])
                    list_struct2d.append(mcc_str)
                    contacts_prd = file_result.readline()
            complete_list_struct2d.append(list_struct2d)
            file_result.close()
        name = myfile.readline()
        contacts = myfile.readline()
        seq = myfile.readline()
        structure2d = myfile.readline()
    # Close the benchmark file before returning (the close() call used to come after the return and never ran).
    myfile.close()
    return [list_name, complete_list_struct2d]
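Both functions above read the benchmark file four lines at a time (name, contacts, sequence, structure) and drop the first six characters of the name line. A minimal sketch of that record layout as a generator follows. `read_benchmark` is a hypothetical name, the layout is inferred only from the order of the `readline()` calls above and not from a format specification, and, unlike the script, the sketch strips the trailing newline of every field.

```python
import io


def read_benchmark(handle):
    """Yield (name, contacts, sequence, structure) from 4-line records."""
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[2]:  # same stop condition as the `while seq:` loop
            break
        name, contacts, seq, structure = (line.strip() for line in record)
        # The first six characters of the name line are a prefix to drop.
        yield name[6:], contacts, seq, structure


# Tiny in-memory example; the content is made up for illustration.
sample = io.StringIO(">test_demo\n*..*..\nGGAACC\n((..))\n")
for rec in read_benchmark(sample):
    print(rec)
```

Iterating over records this way keeps the file handling in one place, so the caller can wrap it in a `with open(...)` block and the file is closed even if an error occurs.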
# Split the list given as argument into two lists, each containing half of it
# (the first half gets the extra element when the length is odd).
def get_half(list):
    first_half = []
    second_half = []
    if (len(list) % 2 == 0):
        middle = len(list) / 2
    else:
        middle = len(list) / 2 + 0.5
    for i in range (int(middle)):
        first_half.append(list[i])
    for i in range (int(middle)):
        if i + int(middle) < len(list):
            second_half.append(list[i + int(middle)])
    return [first_half, second_half]
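The same split can be expressed with integer arithmetic and slicing. The sketch below uses the hypothetical name `split_in_half` and is meant to be equivalent to `get_half` for any list: the first half receives the extra element when the length is odd.

```python
def split_in_half(items):
    """Return [first_half, second_half]; the first half is the longer one."""
    # (n + 1) // 2 equals n / 2 for even n and n / 2 + 0.5 for odd n.
    middle = (len(items) + 1) // 2
    return [items[:middle], items[middle:]]


print(split_in_half([1, 2, 3, 4, 5]))  # -> [[1, 2, 3], [4, 5]]
```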
# Create box plots of all the MCC values (base pairing) obtained over all the predictions of each sequence,
# split into two figures.
def visualization_all_mcc_str(path_benchmark, estimator, function, modules):
    list_name, tab_struct2d = get_list_structs_all(path_benchmark, estimator, function, modules)
    min_len = 20
    max_len = 0
    max_i = 0
    min_i = 0
    for i in range(len(tab_struct2d)):
        if (len(tab_struct2d[i]) > max_len):
            max_len = len(tab_struct2d[i])
            max_i = i
        if (len(tab_struct2d[i]) < min_len):
            min_len = len(tab_struct2d[i])
            min_i = i
    print("max: " + list_name[max_i] + " " + str(max_len) + " min: " + list_name[min_i] + " " + str(min_len) + "\n")
    # Kept as a list of lists: the sequences do not all have the same number of predictions.
    np_struct2d = tab_struct2d
    size = len(tab_struct2d)
    list_median_str = []
    for i in range(size):
        list_median_str.append(np.median(np_struct2d[i]))
    all_str = []
    for i in range(size):
        for j in range(len(np_struct2d[i])):
            all_str.append(np_struct2d[i][j])
    """print("mediane struct" + estimator + " + " + function + " : " + str(np.median(all_str)))
    print("ecart struct" + estimator + " + " + function + " : " + str(np.std(all_str)) + "\n")"""
    # Sort the sequences by increasing median MCC.
    data = [x for _, x in sorted(zip(list_median_str, tab_struct2d))]
    boxName = [x for _, x in sorted(zip(list_median_str, list_name))]
    if (len(data) % 2 == 0):
        absciss = len(data) / 2
    else:
        absciss = len(data) / 2 + 0.5
    divide_tab_name = get_half(boxName)
    divide_tab_data = get_half(data)
    # First figure: first half of the sequences.
    plt.figure(figsize=(15,4),dpi=200)
    plt.xticks(rotation=90)
    plt.boxplot(divide_tab_data[0], medianprops=dict(color='black'))
    for i in range(int(absciss)):
        y = data[i]
        x = np.random.normal(1 + i, 0.04, size=len(y))
        plt.scatter(x, y)
    plt.xticks(np.arange(1, absciss + 1), divide_tab_name[0])
    plt.xlabel('sequence name')
    plt.ylabel('MCC (base pairs)')
    plt.savefig('visualisation_all_mcc_' + estimator + "_" + function + "_" + modules + '.png', bbox_inches='tight')
    # Second figure: second half of the sequences.
    plt.figure(figsize=(15, 4), dpi=200)
    plt.xticks(rotation=90)
    plt.boxplot(divide_tab_data[1], medianprops=dict(color='black'))
    for i in range(len(data)):
        if i + int(absciss) < len(data):
            y = data[i + int(absciss)]
            x = np.random.normal(1 + i, 0.04, size=len(y))
            plt.scatter(x, y)
    # One tick per box: the second half is one element shorter when the number of sequences is odd.
    plt.xticks(np.arange(1, len(divide_tab_name[1]) + 1), divide_tab_name[1])
    plt.xlabel('sequence name')
    plt.ylabel('MCC')
    plt.savefig('visualisation_all_mcc_' + estimator + "_" + function + "_" + modules + '_2.png', bbox_inches='tight')
# Create a boxplot with all the MCC (for the base pairing and the contacts) obtains with all
# the prediction for each sequence, divide in two figures.
def visualization_all_mcc(path_benchmark, estimator, function, modules):
list_name = get_list_structs_contacts_all(path_benchmark, estimator, function)[0]
tab_struct2d = get_list_structs_contacts_all(path_benchmark, estimator, function)[1]
tab_contacts = get_list_structs_contacts_all(path_benchmark, estimator, function)[2]
min = 20
max = 0
max_i = 0
min_i = 0
for i in range(len(tab_struct2d)):
if (len(tab_struct2d[i]) > max):
max = len(tab_struct2d[i])
max_i = i
if (len(tab_struct2d[i]) < min):
min = len(tab_struct2d[i])
min_i = i
print("max: " + list_name[max_i] + " " + str(max) + " min: " + list_name[min_i] + " " + str(min) + "\n")
np_struct2d = np.array(tab_struct2d)
size = len(tab_struct2d)
list_median_str = []
for i in range(size):
list_median_str.append(np.median(np_struct2d[i]))
all_str = []
for i in range(size):
for j in range(len(np_struct2d[i])):
all_str.append(np_struct2d[i][j])
"""print("mediane struct" + estimator + " + " + function + " : " + str(np.median(all_str)))
print("ecart struct" + estimator + " + " + function + " : " + str(np.std(all_str)) + "\n")"""
data = [x for _, x in sorted(zip(list_median_str, tab_struct2d))]
boxName = [x for _, x in sorted(zip(list_median_str, list_name))]
if (len(data) % 2 == 0):
absciss = len(data) / 2
else:
absciss = len(data) / 2 + 0.5
divide_tab_name = get_half(boxName)
divide_tab_data = get_half(data)
plt.figure(figsize=(15,4),dpi=200)
plt.xticks(rotation=90)
plt.boxplot(divide_tab_data[0], medianprops=dict(color='black'))
for i in range(int(absciss)):
y =data[i]
x = np.random.normal(1 + i, 0.04, size=len(y))
plt.scatter(x, y)
plt.xticks(np.arange(1, absciss + 1), divide_tab_name[0])
plt.xlabel('nom de la séquence')
plt.ylabel('MCC (appariements)')
plt.savefig('visualisation_128arn_structure2d_' + estimator + "_" + function + "_" + modules + '.png', bbox_inches='tight')
plt.figure(figsize=(15, 4), dpi=200)
plt.xticks(rotation=90)
plt.boxplot(divide_tab_data[1], medianprops=dict(color='black'))
for i in range(len(data)):
if i + int(absciss) < len(data):
y = data[i + int(absciss)]
x = np.random.normal(1 + i, 0.04, size=len(y))
plt.scatter(x, y)
plt.xticks(np.arange(1, absciss + 1), divide_tab_name[1])
plt.xlabel('sequence name')
plt.ylabel('MCC')
plt.savefig('visualisation_128arn_structure2d_' + estimator + "_" + function + "_" + modules + '_2.png', bbox_inches='tight')
np_contacts = np.array(tab_contacts)
size = len(tab_contacts)
list_median_ctc = []
for i in range(size):
list_median_ctc.append(np.median(np_contacts[i]))
all_ctc = []
for i in range(size):
for j in range(len(np_contacts[i])):
all_ctc.append(np_contacts[i][j])
"""print("mediane ctc" + estimator + " + " + function + " : " + str(np.median(all_ctc)))
print("ecart ctc" + estimator + " + " + function + " : " + str(np.std(all_ctc)) + "\n")"""
data = [x for _, x in sorted(zip(list_median_ctc, tab_contacts))]
boxName = [x for _, x in sorted(zip(list_median_ctc, list_name))]
if (len(data) % 2 == 0) :
absciss = len(data)/2
else :
absciss = len(data)/2 + 0.5
divide_tab_name = get_half(boxName)
divide_tab_data = get_half(data)
plt.figure(figsize=(15, 4), dpi=200)
plt.xticks(rotation=90)
plt.boxplot(divide_tab_data[0], medianprops=dict(color='black'))
for i in range(int(absciss)):
y = data[i]
x = np.random.normal(1 + i, 0.04, size=len(y))
plt.scatter(x, y)
plt.xticks(np.arange(1, absciss + 1), divide_tab_name[0])
plt.xlabel('sequence name')
plt.ylabel('MCC (contacts)')
plt.savefig('visualisation_128arn_contacts_' + estimator + "_" + function + '.png', bbox_inches='tight')
plt.figure(figsize=(15, 4), dpi=200)
plt.xticks(rotation=90)
plt.boxplot(divide_tab_data[1], medianprops=dict(color='black'))
for i in range(len(data)):
if i + int(absciss) < len(data) :
y = data[i + int(absciss)]
x = np.random.normal(1 + i, 0.04, size=len(y))
plt.scatter(x, y)
plt.xticks(np.arange(1, absciss + 1), divide_tab_name[1])
plt.xlabel('sequence name')
plt.ylabel('MCC')
plt.savefig('visualisation_128arn_contacts_' + estimator + "_" + function + '_2.png', bbox_inches='tight')
# Most of these command lines and this script work only with the benchmark_16-07-2021.json
# file provided by Isaure Chauvot de Beauchene. This script is meant to automate the execution
# of BiORSEO for each sequence and the creation of figures.
# after compiling create_file.cpp in 'scripts' directory (compiling once is enough) :
# clang++ create_files.cpp -o create
#cmd = ("scripts/create")
# after compiling add_delimiter.cpp in 'scripts' directory (compiling once is enough) :
# clang++ add_delimiter.cpp -o addDelimiter
#cmd0 = ("cppsrc/Scripts/addDelimiter")
# after compiling count_pattern.cpp in 'scripts' directory (compiling once is enough) :
# clang++ count_pattern.cpp -o countPattern
#cmd1 = ("cppsrc/Scripts/countPattern")
myfile = open("data/modules/ISAURE/benchmark.txt", "r")
name = myfile.readline()
contacts = myfile.readline()
seq = myfile.readline()
structure2d = myfile.readline()
list_struct2d = [[],[],[],[]]
# source path to the directory (for my computer)
path = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo"
path2 = "/local/local/BiorseoNath"
while seq:
name = name[6:].strip()
print(name)
# after compiling delete_same_pdb.cpp in 'scripts' directory (compiling once is enough) :
# clang++ delete_same_pdb.cpp -o deletePdb
#cmd2 = ("cppsrc/Scripts/deletePdb " + name)
#os.system(cmd2)
"""get_list_str_by_seq(name, 'MEA', 'E', list_struct2d[0], structure2d, 'json')
get_list_str_by_seq(name, 'MEA', 'F', list_struct2d[1], structure2d, 'json')
get_list_str_by_seq(name, 'MFE', 'E', list_struct2d[2], structure2d, 'json')
get_list_str_by_seq(name, 'MFE', 'F', list_struct2d[3], structure2d, 'json')"""
name = myfile.readline()
contacts = myfile.readline()
seq = myfile.readline()
structure2d = myfile.readline()
#visualization_best_mcc_str_4_figures(list_struct2d, 'red', '#900C3F')
"""visualization_best_mcc(list_struct2d_A_MFE, list_contacts_A_MFE, 'MFE', 'A', 'rin', 'red', '#900C3F')
visualization_best_mcc(list_struct2d_B_MFE, list_contacts_B_MFE, 'MFE', 'B', 'rin', 'blue', '#0900FF')
visualization_best_mcc(list_struct2d_A_MEA, list_contacts_A_MEA, 'MEA', 'A', 'rin', 'red', '#900C3F')
visualization_best_mcc(list_struct2d_B_MEA, list_contacts_B_MEA, 'MEA', 'B', 'rin', 'blue', '#0900FF')"""
myfile.close()
path_benchmark = "data/modules/ISAURE/benchmark.txt"
visualization_all_mcc_str(path_benchmark, 'MEA', 'C', 'bgsu')
visualization_all_mcc_str(path_benchmark, 'MEA', 'D', 'bgsu')
visualization_all_mcc_str(path_benchmark, 'MFE', 'C', 'bgsu')
visualization_all_mcc_str(path_benchmark, 'MFE', 'D', 'bgsu')
\ No newline at end of file
......@@ -9,7 +9,7 @@ CC = g++
CFLAGS = -Icppsrc/ -I/usr/local/include -I$(CPLEX)/concert/include -I$(CPLEX)/cplex/include -g -O3
CXXFLAGS = --std=c++17 -Wall -Wpedantic -Wextra -Wno-deprecated-copy -Wno-ignored-attributes
LINKER = g++
LDFLAGS = -L$(CPLEX)/concert/lib/x86-64_linux/static_pic/ -L$(CPLEX)/cplex/lib/x86-64_linux/static_pic/ -lboost_system -lboost_filesystem -lboost_program_options -lgomp -lconcert -lilocplex -lcplex -lpthread -ldl -lRNA -lm
LDFLAGS = -Wno-free-nonheap-object -L$(CPLEX)/concert/lib/x86-64_linux/static_pic/ -L$(CPLEX)/cplex/lib/x86-64_linux/static_pic/ -lboost_system -lboost_filesystem -lboost_program_options -lgomp -lconcert -lilocplex -lcplex -lpthread -ldl -lRNA -lm
# change these to proper directories where each file should be
SRCDIR = cppsrc
......@@ -31,20 +31,8 @@ $(OBJECTS): $(OBJDIR)/%.o : $(SRCDIR)/%.cpp $(INCLUDES)
$(CC) -c $(CFLAGS) $(CXXFLAGS) $< -o $@
@echo -e "\033[00;32mCompiled "$<".\033[00m"
doc: mainpdf supppdf
@echo -e "\033[00;32mLaTeX documentation rendered.\033[00m"
mainpdf: doc/main_bioinformatics.tex doc/references.bib doc/bioinfo.cls doc/natbib.bst
cd doc; pdflatex -synctex=1 -interaction=nonstopmode -file-line-error main_bioinformatics
cd doc; bibtex main_bioinformatics
cd doc; pdflatex -synctex=1 -interaction=nonstopmode -file-line-error main_bioinformatics
cd doc; pdflatex -synctex=1 -interaction=nonstopmode -file-line-error main_bioinformatics
supppdf: doc/supplementary_material.tex
cd doc; pdflatex -synctex=1 -interaction=nonstopmode -file-line-error supplementary_material
.PHONY: all
all: $(BINDIR)/$(TARGET) doc
all: $(BINDIR)/$(TARGET)
.PHONY: re
re: remove clean all
......
......@@ -19,6 +19,7 @@ THEN
OUTPUT:
- A set of secondary structures from the Pareto front,
- The list of known modules inserted inplace in the corresponding structures
- A set of positions of the nucleotides in contact with the protein represented by asterisks (only if the motifs_28-05-2021.json library is used!)
2/ The different models
==================================
......@@ -28,7 +29,8 @@ Biorseo can be used with two modules datasets (yet):
* Rna3Dmotifs (from the work of *Djelloul & Denise, 2008*)
* The RNA 3D Motif Atlas of BGSU's RNA lab (*Petrov et al, 2013*, see http://rna.bgsu.edu/rna3dhub/motifs/)
* CaRNAval 1.0 (*Reinharz et al, 2018*)
* RNA-Bricks 2, RNAMC, CaRNAval 2.0, and others could theoretically be used, but are not supported (yet). You might write your own API.
* /data/modules/ISAURE/motifs_28-05-2021.json, a library of motifs extracted from RNAs bound to proteins, provided by Isaure Chauvot de Beauchêne of the LORIA laboratory
(contact: isaure.chauvot-de-beauchene@loria.fr)
PATTERN MATCHING STEP
- Use **simple pattern matching**. Rna3Dmotifs modules are available with sequence information. We use regular expressions to find those known loops in your query. This is the approach of RNA-MoIP (*Reinharz et al, 2012*), we deal the same way with short components and wildcards.
......@@ -43,6 +45,8 @@ OBJECTIVE FUNCTIONS FOR THE MODULE INSERTION CRITERIA
* **Function B** : weights a module by its number of components (strands) and penalizes it by the log₂ of its nucleotide size.
* **Function C** : weights a module by its insertion site score (JAR3D or BayesPairing score).
* **Function D** : weights a module by its number of components (strands) and insertion site score (JAR3D or BayesPairing score), and penalizes it by the log₂ of its nucleotide size.
* **Function E** : weights a module by its nucleotides in contact with a protein, number of occurrences and number of nucleotides in the module.
* **Function F** : weights a module by its nucleotides in contact with a protein, number of occurrences and number of nucleotides along the entire length of the RNA.
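As an illustration of how such a weighting behaves, here is a minimal sketch of one possible reading of Function B: the number of components divided by the log₂ of the module size. The function name `function_b_weight` and the exact form of the ratio are assumptions made for this example; the README only states that the component count is rewarded and the log₂ of the size is a penalty, so the actual BiORSEO formula may differ.

```python
from math import log2


def function_b_weight(n_components: int, n_nucleotides: int) -> float:
    """Hypothetical Function-B-like weight (assumed form: components / log2(size)).

    n_components  : number of strands of the module
    n_nucleotides : total number of nucleotides in the module
    """
    if n_components < 1:
        raise ValueError("a module needs at least one component")
    if n_nucleotides < 2:
        # log2(1) == 0 would make the penalty undefined
        raise ValueError("a module needs at least two nucleotides")
    return n_components / log2(n_nucleotides)


if __name__ == "__main__":
    # A 3-strand module of 16 nt versus a 1-strand module of 16 nt
    print(function_b_weight(3, 16))
    print(function_b_weight(1, 16))
```

Under this assumed form, a module with more strands gets a higher weight at equal size, and a larger module gets a lower weight at equal strand count, which matches the "light, high-order modules" wording used for function B in the options list.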
3/ Installation
==================================
......@@ -55,22 +59,22 @@ Check the file [INSTALL.md](INSTALL.md) for installation instructions.
- If you **might expect a pseudoknot, or don't know**:
* The most promising method is the use of direct pattern matching with Rna3Dmotifs and function A. But this method is sometimes subject to combinatorial explosion issues. If you have a long RNA or a large number of loops, don't use it. Example:
`./biorseo.py -i PDB_00304.fa -O resultsFolder/ --rna3dmotifs --patternmatch --func A`
`./biorseo.py -i PDB_00304.fa -O resultsFolder/ --rna3dmotifs --patternmatch --func A --MEA`
* The use of the RNA 3D Motif Atlas placed by JAR3D and scored with function A is not subject to combinatorial issues, but performs a bit worse. It also returns fewer solutions. Example:
`./biorseo.py -i PDB_00304.fa -O resultsFolder/ --3dmotifatlas --jar3d --func A`
`./biorseo.py -i PDB_00304.fa -O resultsFolder/ --3dmotifatlas --jar3d --func A --MEA`
5/ List of Options
==================================
```
Usage: You must provide:
1) a FASTA input file with -i,
2) a module type with --rna3dmotifs, --carnaval or --3dmotifatlas
2) a module type with --rna3dmotifs, --carnaval, --3dmotifatlas or --contacts
3) one module placement method in { --patternmatch, --jar3d, --bayespairing }
4) one scoring function with --func A, B, C or D
4) one scoring function with --func A, B, C, D, E or F
5) one estimator between --MEA and --MFE
If you are not using the Docker image:
5) --modules-path, --biorseo-dir and (--jar3d-exec or --bypdir)
6) --modules-path, --biorseo-dir and (--jar3d-exec or --bypdir)
Options:
-h [ --help ] Print this help message
......@@ -79,16 +83,21 @@ Options:
--rna3dmotifs Use DESC modules from Djelloul & Denise, 2008
--carnaval Use RIN modules from Reinharz & al, 2018
--3dmotifatlas Use the HL and IL loops from BGSU's 3D Motif Atlas (updated)
--contacts Use the library of motifs, created from RNA sequences linked to proteins provided by I. Chauvot de Beauchene of LORIA laboratory
-p [ --patternmatch ] Use regular expressions to place modules in the sequence (requires --rna3dmotifs or --carnaval)
-j [ --jar3d ] Use JAR3D to place modules in the sequence (requires --3dmotifatlas)
-b [ --bayespairing ] Use BayesPairing2 to place modules in the sequence (requires --rna3dmotifs or --3dmotifatlas)
-o [ --output=… ] File to summarize the results
-O [ --outputf=… ] Folder where to output result and temp files
-f [ --func=… ] (A, B, C or D, default is B) Objective function to score module insertions:
-f [ --func=… ] (A, B, C, D, E or F, default is B) Objective function to score module insertions:
(A) insert big modules (B) insert light, high-order modules
(c) insert modules which score well with the sequence
(C) insert modules which score well with the sequence
(D) insert light, high-order modules which score well with the sequence.
C and D require cannot be used with --patternmatch.
C and D cannot be used with --patternmatch.
(E) and (F) insert modules with many nucleotides, many nucleotides in contact with a protein, and a large number of occurrences.
(E) maximizes the number of contact nucleotides inside the module, while (F) maximizes the number of contact nucleotides along the entire length of the RNA.
--MEA Use Maximum Expected Accuracy for the second objective
--MFE Use Minimum Free Energy based on the formula of (*Legendre et al., 2018*) for the second objective
-c [ --first-objective=… ] (default 1) Objective to solve in the mono-objective portions of the algorithm.
(1) is the module objective given by --func, (2) is the expected accuracy of the structure.
-l [ --limit=… ] (default 500) Number of solutions in the Pareto set from which
......@@ -113,9 +122,9 @@ Options:
BiORSEO from outside the docker image. Use the FULL path.
Examples:
biorseo.py -i myRNA.fa -O myResultsFolder/ --rna3dmotifs --patternmatch --func B
biorseo.py -i myRNA.fa -O myResultsFolder/ --3dmotifatlas --jar3d --func B -l 800
biorseo.py -i myRNA.fa -v --3dmotifatlas --bayespairing --func D
biorseo.py -i myRNA.fa -O myResultsFolder/ --rna3dmotifs --patternmatch --func B --MEA
biorseo.py -i myRNA.fa -O myResultsFolder/ --3dmotifatlas --jar3d --func B -l 800 --MEA
biorseo.py -i myRNA.fa -v --3dmotifatlas --bayespairing --func D --MEA
The allowed module/placement-method/function combinations are:
......@@ -123,5 +132,6 @@ The allowed module/placement-method/function combinations are:
--rna3dmotifs A. B. A. B. C. D.
--3dmotifatlas A. B. C. D. A. B. C. D.
--carnaval A. B.
--contacts E. F.
```
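The table above can be expressed as a small lookup, sketched below. This is an illustration only: the column assignment is inferred from the README rows (the whitespace of the table is collapsed on this page), the names `ALLOWED` and `is_allowed` are hypothetical, and the help text printed by the script itself lists `A. B. E. F.` for `--contacts`, so the set used for that row is an assumption taken from the README.

```python
# Allowed (module library, placement method) -> scoring functions,
# as read from the README table. Keys use the CLI flag names without "--".
ALLOWED = {
    ("rna3dmotifs", "patternmatch"): {"A", "B"},
    ("rna3dmotifs", "bayespairing"): {"A", "B", "C", "D"},
    ("3dmotifatlas", "bayespairing"): {"A", "B", "C", "D"},
    ("3dmotifatlas", "jar3d"): {"A", "B", "C", "D"},
    ("carnaval", "patternmatch"): {"A", "B"},
    ("contacts", "patternmatch"): {"E", "F"},  # README row; assumption, see above
}


def is_allowed(modules: str, placement: str, func: str) -> bool:
    """Return True if the combination appears in the table above."""
    return func in ALLOWED.get((modules, placement), set())


if __name__ == "__main__":
    print(is_allowed("3dmotifatlas", "jar3d", "D"))
    print(is_allowed("carnaval", "jar3d", "A"))
```

A table-driven check like this keeps the documentation and the validation logic in one place, which makes it easier to notice when the two drift apart.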
......
......@@ -29,11 +29,11 @@ import pickle
# ================== DEFINITION OF THE PATHS ==============================
biorseoDir = path.realpath(".")
jar3dexec = "/home/persalteas/Software/jar3dbin/jar3d_2014-12-11.jar"
jar3dexec = "/local/local/localopt/jar3d_2014-12-11.jar"
bypdir = biorseoDir + "/BayesPairing/bayespairing/src"
byp2dir = biorseoDir + "/BayesPairing2/bayespairing/src"
moipdir = "/home/persalteas/Software/RNAMoIP/Src/RNAMoIP.py"
biokopdir = "/home/persalteas/Software/biokop/biokop"
moipdir = "/local/local/localopt/RNAMoIP/Src/RNAMoIP.py"
biokopdir = "/local/local/localopt/biokop/biokop"
runDir = path.dirname(path.realpath(__file__))
bpRNAFile = argv[1]
PseudobaseFile = argv[2]
......@@ -1109,8 +1109,11 @@ def load_from_dbn(file, header_style=3):
if not '(' in struct:
continue # ignore linear structures
if is_canonical_nts(seq) and is_canonical_bps(struct):
# keeps what's inside brackets at the end as the filename
if header_style == 1: container.append(RNA(header.replace('/', '_').split('(')[-1][:-1], header, seq, struct))
# keeps what's inside square brackets at the end as the filename
if header_style == 2: container.append(RNA(header.replace('/', '_').split('[')[-1][:-41], header, seq, struct))
# keeps all the header as filename
if header_style == 3: container.append(RNA(header[1:], header, seq, struct))
if '[' in struct: counter += 1
db.close()
......@@ -1475,8 +1478,8 @@ def print_StudyCase_results():
if __name__ == '__main__':
print("> Loading files...", flush=True)
bpRNAContainer, bpRNA_pk_counter = load_from_dbn(bpRNAFile)
PseudobaseContainer, Pseudobase_pk_counter = load_from_dbn(PseudobaseFile)
bpRNAContainer, bpRNA_pk_counter = load_from_dbn(bpRNAFile, header_style=1)
PseudobaseContainer, Pseudobase_pk_counter = load_from_dbn(PseudobaseFile, header_style=3)
StudycaseContainer, StudyCase_pk_counter = load_from_dbn(StudyCaseFile, header_style=1)
for nt, number in ignored_nt_dict.items():
......
#!/usr/bin/python3
# coding=utf-8
import getopt, multiprocessing, subprocess, sys
import matplotlib.pyplot as plt
from scipy import stats
from os import path, makedirs, getcwd, chdir, devnull, remove, walk
from matplotlib import colors
from math import sqrt
from multiprocessing import cpu_count, Manager
from shutil import move
from ast import literal_eval
# Parse options
try:
cmd_opts, cmd_args = getopt.getopt( sys.argv[1:],
"bc:f:hi:jl:no:O:pt:v",
[ "verbose","rna3dmotifs","3dmotifatlas","carnaval","contacts","jar3d","bayespairing","patternmatch","func=",
"help","version","seq=","modules-path=", "jar3d-exec=", "bypdir=", "biorseo-dir=", "first-objective=","output=","theta=",
"interrupt-limit=", "outputf="])
except getopt.GetoptError as err:
print(err)
sys.exit(2)
m = Manager()
running_stats = m.list()
running_stats.append(0) # n_launched
running_stats.append(0) # n_finished
running_stats.append(0) # n_skipped
ignored_nt_dict = {}
def is_canonical_nts(seq):
    """Checks if a nucleotide is ACGU, and stores it in a dictionary if it's not,
    with an occurrence count."""
    for c in seq[:-1]:
        if c not in "ACGU":
            if c in ignored_nt_dict.keys():
                ignored_nt_dict[c] += 1
            else:
                ignored_nt_dict[c] = 1
            return False
    return True
def absolutize_path(p, directory=False):
    if p[0] != '/':
        p = getcwd() + '/' + p
    if directory and p[-1] != '/':
        p = p + '/'
    return p
class NoDaemonProcess(multiprocessing.Process):
    @property
    def daemon(self):
        return False
    @daemon.setter
    def daemon(self, value):
        pass
class NoDaemonContext(type(multiprocessing.get_context())):
    Process = NoDaemonProcess
class MyPool(multiprocessing.pool.Pool):
    # We sub-class multiprocessing.pool.Pool instead of multiprocessing.Pool
    # because the latter is only a wrapper function, not a proper class.
    def __init__(self, *args, **kwargs):
        kwargs['context'] = NoDaemonContext()
        super(MyPool, self).__init__(*args, **kwargs)
class Loop:
    """Just a data structure to store module information."""
    def __init__(self, header, subsequence, looptype, position):
        self.header = header
        self.seq = subsequence
        self.type = looptype
        self.position = position
class InsertionSite:
    """An interval of an RNA sequence where a particular module could be inserted.
    The philosophy of Biorseo is : if this portion can fold like this module,
    then it may be a loop of the corresponding type."""
    def __init__(self, loop, csv_line):
        # BEWARE : jar3d csv output is crap because of java's locale settings.
        # On french OSes, it uses commas to delimit the fields AND as floating point delimiters !!
        # Parse with caution, and check what the csv output files look like on your system...
        info = csv_line.split(',')
        self.loop = loop # the Loop object that has been searched with jar3d
        # position of the loop's components, so the motif's ones, in the query sequence.
        self.position = loop.position
        # Motif model identifier of the RNA 3D Motif Atlas
        self.atlas_id = info[2]
        # alignment score of the subsequence to the motif model
        self.score = int(float(info[4]))
        # should the motif model be inverted to fit the sequence ?
        self.rotation = int(info[-2])
    def __lt__(self, other):
        """compare two insertion sites scores"""
        return self.score < other.score
    def __gt__(self, other):
        """compare two insertion sites scores"""
        return self.score > other.score
class Job:
    """A class to store the properties of a tool execution, in order to run similar jobs in parallel."""
    def __init__(self, command=None, function=None, args=None, how_many_in_parallel=0, priority=1, timeout=None):
        self.cmd_ = command
        self.func_ = function
        self.args_ = args
        self.priority_ = priority
        self.timeout_ = timeout
        if not how_many_in_parallel:
            self.nthreads = cpu_count()
        elif how_many_in_parallel == -1:
            self.nthreads = cpu_count() - 1
        else:
            self.nthreads = how_many_in_parallel
class RNA:
    """Just a data structure gathering header, sequence and length."""
    def __init__(self, header, seq):
        self.seq_ = seq
        self.header = header
        self.length = len(seq)
class BiorseoInstance:
"""A run of the biorseo tool, to predict one or several RNA sequences' folding(s),
including all the necessary previous run of other tools."""
def __init__(self, opts):
# set default run type options
self.type = "dpm" # direct pattern matching
self.modules = "desc" # ...with Rna3dMotifs "DESC" modules
self.func = 'B' # ...and function B
self.forward_options = [] # options to pass to the C++ biorseo
self.jobcount = 0
self.joblist = []
# set default file output locations
self.finalname = ""
self.outputf = "" # A folder where to output the computation files
if path.exists("/biorseo/results"): # docker image default
self.outputf = "/biorseo/results"
self.output = "" # A file to store the solutions
# set default data input locations
self.mode = 0 # default is single sequence mode
self.inputfile = ""
self.jar3d_exec = "/jar3d_2014-12-11.jar"
self.bypdir = "/byp/src" # Docker image locations
self.HL_motif_dir = "/modules/BGSU/HL/3.2/lib"
self.IL_motif_dir = "/modules/BGSU/IL/3.2/lib"
self.desc_folder = "/modules/DESC"
self.rin_folder = "/modules/RIN/Subfiles"
self.json_folder = "/modules/ISAURE"
self.biorseo_dir = "/biorseo"
self.run_dir = path.dirname(path.realpath(__file__))
self.temp_dir = "temp/"
for opt, arg in opts:
if opt == "-h" or opt == "--help":
print( "Biorseo, Bi-Objective RNA Structure Efficient Optimizer\n"
"Bi-objective integer linear programming framework to predict RNA secondary structures by including known RNA modules.\n"
"Developed by Louis Becquey (louis.becquey@univ-evry.fr), 2018-2020\n\n")
print("Usage:\tYou must provide:\n\t1) a FASTA input file with -i,\n\t2) a module type with --rna3dmotifs, --carnaval, --contacts or --3dmotifatlas"
"\n\t3) one module placement method in { --patternmatch, --jar3d, --bayespairing }\n\t4) one scoring function with --func A, B, C, D, E or F"
"\n\n\tIf you are not using the Docker image: \n\t5) --modules-path, --biorseo-dir and (--jar3d-exec or --bypdir)")
print()
print("Options:")
print("-h [ --help ]\t\t\tPrint this help message")
print("--version\t\t\tPrint the program version")
print("-i [ --seq=… ]\t\t\tFASTA file with the query RNA sequence")
print("--rna3dmotifs\t\t\tUse DESC modules from Djelloul & Denise, 2008")
print("--carnaval\t\t\tUse RIN modules from Reinharz & al, 2018")
print("--3dmotifatlas\t\t\tUse the HL and IL loops from BGSU's 3D Motif Atlas (updated)")
print("--contacts\t\t\tUse .json motif from Isaure")
print("-p [ --patternmatch ]\t\tUse regular expressions to place modules in the sequence (requires --rna3dmotifs or --carnaval)")
print("-j [ --jar3d ]\t\t\tUse JAR3D to place modules in the sequence (requires --3dmotifatlas)")
print("-b [ --bayespairing ]\t\tUse BayesPairing2 to place modules in the sequence (requires --rna3dmotifs or --3dmotifatlas)")
print("-o [ --output=… ]\t\tFile to summarize the results")
print("-O [ --outputf=… ]\t\tFolder where to output result and temp files")
print("-f [ --func=… ]\t\t\t(A, B, C, D, E or F, default is B)"
" Objective function to score module insertions:\n\t\t\t\t (A) insert big modules (B) insert light, high-order modules"
"\n\t\t\t\t (C) insert modules which score well with the sequence\n\t\t\t\t (D) insert light, high-order modules which score well with the sequence."
"\n\t\t\t\t C and D cannot be used with --patternmatch.")
print("-c [ --first-objective=… ]\t(default 1) Objective to solve in the mono-objective portions of the algorithm."
"\n\t\t\t\t (1) is the module objective given by --func, (2) is the expected accuracy of the structure.")
print("-l [ --limit=… ]\t\t(default 500) Number of solutions in the Pareto set from which"
"\n\t\t\t\t we give up the computation, before eliminating duplicate secondary structures.")
print("-t [ --theta=… ]\t\t(default 0.001) Pairing-probability threshold to consider or not the possibility of pairing")
print("-n [ --disable-pseudoknots ]\tAdd constraints to explicitly forbid the formation of pseudoknots")
print("-v [ --verbose ]\t\tPrint what is happening to stdout")
print("--modules-path=…\t\tPath to the modules data.\n\t\t\t\t The folder should contain modules in the DESC format as output by Djelloul & Denise's"
"\n\t\t\t\t 'catalog' program for use with --rna3dmotifs, or the IL/ and HL/ folders\n\t\t\t\t from a release of the RNA 3D Motif Atlas "
"for use with --3dmotifatlas, or the\n\t\t\t\t data/modules/RIN/Subfiles/ folder for use with --carnaval.\n\t\t\t\t Consider placing these files on a fast I/O device (NVMe SSD, ...)")
print("--jar3d-exec=…\t\t\tPath to the jar3d executable.\n\t\t\t\t Default is /jar3d_2014-12-11.jar, you should use this option if you run"
"\n\t\t\t\t BiORSEO from outside the docker image.")
print("--bypdir=…\t\t\tPath to the BayesPairing src folder.\n\t\t\t\t Default is /byp/src, you should use this option if you run"
"\n\t\t\t\t BiORSEO from outside the docker image.")
print("--biorseo-dir=…\t\t\tPath to the BiORSEO root directory.\n\t\t\t\t Default is /biorseo, you should use this option if you run"
"\n\t\t\t\t BiORSEO from outside the docker image. Use the FULL path.")
print("\nExamples:")
print("biorseo.py -i myRNA.fa -O myResultsFolder/ --rna3dmotifs --patternmatch --func B")
print("biorseo.py -i myRNA.fa -O myResultsFolder/ --3dmotifatlas --jar3d --func B -l 800")
print("biorseo.py -i myRNA.fa -v --3dmotifatlas --bayespairing --func D")
print("\nThe allowed module/placement-method/function combinations are:\n")
print(" --patternmatch --bayespairing --jar3d")
print("--rna3dmotifs A. B. A. B. C. D.")
print("--3dmotifatlas A. B. C. D. A. B. C. D.")
print("--carnaval A. B.")
print("--contacts A. B. E. F.\n")
sys.exit()
elif opt == "-i" or opt == "--seq":
self.inputfile = arg
elif opt == "-O" or opt == "--outputf":
self.outputf = absolutize_path(arg, directory=True) # output folder
elif opt == "-o" or opt == "--output":
self.output = absolutize_path(arg) # output file
elif opt == "-f" or opt == "--func":
if arg in ['A', 'B', 'C', 'D', 'E', 'F']:
self.func = arg
else:
raise ValueError("Unknown scoring function " + arg)  # a bare string cannot be raised in Python 3
elif opt == "-p" or opt == "--patternmatch":
self.type = "dpm"
elif opt == "-j" or opt == "--jar3d":
self.type = "jar3d"
elif opt == "-b" or opt == "--bayespairing":
self.type = "byp"
elif opt == "--carnaval":
self.modules = "rin"
elif opt == "--contacts":
self.modules = "json"
elif opt == "--rna3dmotifs":
self.modules = "desc"
elif opt == "--3dmotifatlas":
self.modules = "bgsu"
elif opt == "--modules-path":
self.HL_motif_dir = absolutize_path(arg, directory=True) + "HL/3.2/lib"
self.IL_motif_dir = absolutize_path(arg, directory=True) + "IL/3.2/lib"
self.desc_folder = absolutize_path(arg, directory=True)
self.rin_folder = absolutize_path(arg, directory=True)
self.json_folder = absolutize_path(arg, directory=True)
print("Looking for modules in", arg)
elif opt == "--jar3d-exec":
self.jar3d_exec = absolutize_path(arg)
print("Using ", arg)
elif opt == "--bypdir":
self.bypdir = absolutize_path(arg, directory=True)
print("Using trained BayesPairing in", arg)
elif opt == "--biorseo-dir":
self.biorseo_dir = absolutize_path(arg, directory=True)
elif opt == "--version":
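# path.join works whether or not biorseo_dir ends with a slash (the default "/biorseo" does not)
subprocess.run([path.join(self.biorseo_dir, "bin/biorseo"), "--version"])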
exit(0)
elif opt == "-l" or opt == "--interrupt-limit":
self.forward_options.append("-l")
self.forward_options.append(arg)
elif opt == "-v" or opt == "--verbose":
self.forward_options.append("-v")
elif opt == "-n" or opt == "--disable-pseudoknots":
self.forward_options.append("-n")
elif opt == "-t" or opt == "--theta":
self.forward_options.append("-t")
self.forward_options.append(arg)
elif opt == "-c" or opt == "--first-objective":
self.forward_options.append("-c")
self.forward_options.append(arg)
# Check the argument combination is OK
self.check_args()
if self.outputf != "":
print("saving files to", self.outputf)
# create jobs
self.list_jobs()
# run them
self.execute_jobs()
# locate the results at the right place
if self.output != "":
subprocess.run(["mv", self.temp_dir+self.finalname.split('/')[-1], self.output], check=True)
if self.outputf != "":
for src_dir, _, files in walk(self.temp_dir):
dst_dir = src_dir.replace(self.temp_dir, self.outputf, 1)
if not path.exists(dst_dir):
makedirs(dst_dir)
for file_ in files:
src_file = path.join(src_dir, file_)
dst_file = path.join(dst_dir, file_)
if path.exists(dst_file):
# in case of the src and dst are the same file
if path.samefile(src_file, dst_file):
continue
remove(dst_file)
move(src_file, dst_dir)
subprocess.run(["rm", "-rf", self.temp_dir], check=True) # remove the temp folder
def check_args(self):
"""Checks that the argument combination passed by user is a realistic project"""
warning = "ERROR: The argument list you passed contains errors:"
issues = False
if self.modules == "desc" and self.type == "jar3d":
issues = True
print(warning)
print("/!\\ Using jar3d requires the 3D Motif Atlas modules. Use --3dmotifatlas instead of --rna3dmotifs or --carnaval.")
if (self.modules == "desc" or self.modules == "rin" or self.modules == "bgsu") and (self.func == 'E' or self.func == 'F'):
issues = True
print(warning)
print("/!\\ Functions E and F are only compatible with the contacts library from Isaure.")
if self.modules == "rin" and self.type != "dpm":
issues = True
print(warning)
print("/!\\ CaRNAval does not support placement tools (yet), or scoring tools. Please use it with --patternmatch, not --jar3d nor --bayespairing.")
if self.modules == "json" and (self.type != "dpm" or self.func == 'C' or self.func == 'D'):
issues = True
print(warning)
print("/!\\ Contacts does not support placement tools (yet). Please use it with --patternmatch, not --jar3d nor --bayespairing.")
if self.modules == "bgsu" and self.type == "dpm":
issues = True
print(warning)
print("/!\\ Cannot place the Atlas loops by direct pattern matching. Please use a dedicated tool --jar3d or --bayespairing to do so.")
if issues:
print("\nUsage:\tYou must provide:\n\t1) a FASTA input file with -i,\n\t2) one module type in { --rna3dmotifs, --carnaval, --3dmotifatlas, --contacts }"
"\n\t3) one module placement method in { --patternmatch, --jar3d, --bayespairing }\n\t4) one scoring function with --func A, B, C, D, E or F"
"\n\n\tIf you are not using the Docker image: \n\t5) --modules-path, --biorseo-dir and (--jar3d-exec or --bypdir)")
print("\nThe allowed module/placement-method/function combinations are:\n")
print(" --patternmatch --bayespairing --jar3d")
print("--rna3dmotifs A. B. A. B. C. D.")
print("--3dmotifatlas A. B. C. D. A. B. C. D.")
print("--carnaval A. B.")
print("--contacts A. B. E. F.\n")
exit(1)
def enumerate_loops(self, s):
"""Lists all the loop positions in a dot-bracket notation"""
def resort(unclosedLoops):
loops.insert(len(loops)-1-unclosedLoops, loops[-1])
loops.pop(-1)
opened = []
openingStart = []
closingStart = []
loops = []
loopsUnclosed = 0
consecutiveOpenings = []
if s[0] == '(':
consecutiveOpenings.append(1)
consecutiveClosings = 0
lastclosed = -1
previous = ''
for i in range(len(s)):
# If we arrive on an unpaired segment
if s[i] == '.':
if previous == '(':
openingStart.append(i-1)
if previous == ')':
closingStart.append(i-1)
# Opening basepair
if s[i] == '(':
if previous == '(':
consecutiveOpenings[-1] += 1
else:
consecutiveOpenings.append(1)
if previous == ')':
closingStart.append(i-1)
# We have something like (...(
if len(openingStart) and openingStart[-1] == opened[-1]:
# Create a new loop starting with this component.
loops.append([(openingStart[-1], i)])
openingStart.pop(-1)
loopsUnclosed += 1
# We have something like )...( or even )(
if len(closingStart) and closingStart[-1] == lastclosed:
# Append a component to existing multiloop
loops[-1].append((closingStart[-1], i))
closingStart.pop(-1)
opened.append(i)
# Closing basepair
if s[i] == ')':
if previous == ')':
consecutiveClosings += 1
else:
consecutiveClosings = 1
# This is not supposed to happen in real data, but whatever.
if previous == '(':
openingStart.append(i-1)
# We have something like (...) or ()
if len(openingStart) and openingStart[-1] == opened[-1]:
# Create a new loop, and save it as already closed (HL)
loops.append([(openingStart[-1], i)])
openingStart.pop(-1)
resort(loopsUnclosed)
# We have something like )...)
if len(closingStart) and closingStart[-1] == lastclosed:
# Append a component to existing multiloop and close it.
loops[-1].append((closingStart[-1], i))
closingStart.pop(-1)
loopsUnclosed -= 1
resort(loopsUnclosed)
if i+1 < len(s):
if s[i+1] != ')': # We are on something like: ).
# an openingStart has not been correctly detected, like in ...((((((...)))...)))
if consecutiveClosings < consecutiveOpenings[-1]:
# Create a new loop (uncompleted)
loops.append([(opened[-2], opened[-1])])
loopsUnclosed += 1
# We just completed an HL+stem, like ...(((...))).., we can forget its info
if consecutiveClosings == consecutiveOpenings[-1]:
consecutiveClosings = 0
consecutiveOpenings.pop(-1)
else: # There are still several basepairs to remember, forget only the processed ones, keep the others
consecutiveOpenings[-1] -= consecutiveClosings
consecutiveClosings = 0
else: # We are on something like: ))
# we are on a closingStart that cannot be correctly detected, like in ...(((...(((...))))))
if consecutiveClosings == consecutiveOpenings[-1]:
# Append a component to the incomplete loop and close it.
loops[-1].append((i, i+1))
loopsUnclosed -= 1
resort(loopsUnclosed)
# Forget the info about the processed stem.
consecutiveClosings = 0
consecutiveOpenings.pop(-1)
opened.pop(-1)
lastclosed = i
previous = s[i]
# print(i,"=",s[i],"\t", "consec. Op=", consecutiveOpenings,"Cl=",consecutiveClosings)
return loops
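Review note: the loop enumeration above tracks several counters at once, which makes it hard to follow. As a point of comparison, here is a minimal sketch of the simplest case only (hairpin loops), using a plain stack. The function name `hairpin_loops` and the 1-based positions are assumptions made for illustration; this is not the method used in the file.

```python
def hairpin_loops(dot_bracket):
    """Return the closing pairs (i, j) of hairpin loops, using 1-based positions."""
    stack = []          # positions of currently unmatched '('
    hairpins = []
    last_event = None   # 'open' or 'close', ignoring unpaired '.'
    for pos, char in enumerate(dot_bracket, start=1):
        if char == '(':
            stack.append(pos)
            last_event = 'open'
        elif char == ')':
            opening = stack.pop()
            # A ')' that directly follows an opening (with only dots in between)
            # closes a hairpin loop.
            if last_event == 'open':
                hairpins.append((opening, pos))
            last_event = 'close'
    return hairpins
```

Internal loops and multiloops need the extra bookkeeping done above; the sketch only shows where the hairpin case comes from.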
def launch_JAR3D_worker(self, loop):
"""Launches a jar3d search of specific loop types (IL or HL) on a RNA subsequence (fasta file)"""
# write motif to a file
modulefolder = self.temp_dir + loop.header[1:] + '/'
if not path.exists(modulefolder):
makedirs(modulefolder)
filename = modulefolder + loop.header[1:]+".fasta"
fasta = open(filename, 'w')
fasta.write('>'+loop.header+'\n'+loop.seq+'\n')
fasta.close()
# Launch Jar3D on it
if loop.type == 'h':
cmd = ["java", "-jar", self.jar3d_exec, loop.header[1:]+".fasta", self.HL_motif_dir+"/all.txt",
loop.header[1:]+".HLloop.csv", loop.header[1:]+".HLseq.csv"]
else:
cmd = ["java", "-jar", self.jar3d_exec, loop.header[1:]+".fasta", self.IL_motif_dir+"/all.txt",
loop.header[1:]+".ILloop.csv", loop.header[1:]+".ILseq.csv"]
with open(self.temp_dir + "log_of_the_run.sh", 'a') as logfile:
logfile.write(' '.join(cmd) + '\n')
chdir(modulefolder)
with open(devnull, 'w') as nowhere:
subprocess.run(cmd, stdout=nowhere)
chdir(self.biorseo_dir)
# Retrieve results
insertion_sites = []
if loop.type == 'h':
capstype = "HL"
else:
capstype = "IL"
csv = open(modulefolder + loop.header[1:] +".%sseq.csv" % capstype, 'r')
l = csv.readline()
while l:
if "true" in l:
insertion_sites.append(InsertionSite(loop, l))
l = csv.readline()
csv.close()
return insertion_sites
def launch_JAR3D(self, seq_, basename):
"""Identify loops in a RNA sequence from RNAsubopt results and search
for 3D motif Atlas modules to place on it using jar3d."""
rnasubopt_preds = []
# Extracting probable loops from RNA-subopt structures
rna = open(self.temp_dir + basename + ".subopt", "r")
lines = rna.readlines()
rna.close()
for i in range(2, len(lines)):
ss = lines[i].split(' ')[0]
if ss not in rnasubopt_preds:
rnasubopt_preds.append(ss)
HLs = []
ILs = []
for ss in rnasubopt_preds:
loop_candidates = self.enumerate_loops(ss)
for loop_candidate in loop_candidates:
if len(loop_candidate) == 1 and loop_candidate not in HLs:
HLs.append(loop_candidate)
if len(loop_candidate) == 2 and loop_candidate not in ILs:
ILs.append(loop_candidate)
# Retrieve subsequences corresponding to the possible loops
loops = []
for i, l in enumerate(HLs):
loops.append(Loop(">HL%d" % (i+1), seq_[l[0][0]-1:l[0][1]], "h", l))
for i, l in enumerate(ILs):
loops.append(Loop(">IL%d" % (i+1), seq_[l[0][0]-1:l[0][1]]+'*'+seq_[l[1][0]-1:l[1][1]], "i", l))
# Scanning loop subsequences against motif database
pool = MyPool(processes=cpu_count())
insertion_sites = [ x for y in pool.map(self.launch_JAR3D_worker, loops) for x in y ]
pool.close()
pool.join()
insertion_sites.sort(reverse=True)
# Writing results to CSV file
c = 0
resultsfile = open(self.biorseo_dir + "/" + self.temp_dir+basename+".sites.csv", "w")
resultsfile.write("Motif,Rotation,Score,Start1,End1,Start2,End2\n")
for site in insertion_sites:
if site.score > 10:
c += 1
string = "FOUND with score %d:\t\t possible insertion of motif " % site.score + site.atlas_id
if site.rotation:
string += " (reversed)"
string += (" on " + site.loop.header + " at positions")
resultsfile.write(site.atlas_id+',' + str(bool(site.rotation))+",%d" % site.score+',')
positions = [','.join([str(y) for y in x]) for x in site.position]
if len(positions) == 1:
positions.append("-,-")
resultsfile.write(','.join(positions)+'\n')
resultsfile.close()
def launch_BayesPairing(self, module_type, seq_, header_):
"""DEPRECATED: now we are using Bayespairing 2.
Searches for module occurrences in the provided RNA sequence using BayesPairing 1."""
# Run BayesPairing
cmd = ["python3", "parse_sequences.py", "-seq", self.biorseo_dir + '/' + self.temp_dir + header_ + ".fa", "-d", module_type, "-interm", "1"]
logfile = open(self.biorseo_dir + "/" + self.temp_dir + "log_of_the_run.sh", 'a')
logfile.write(" ".join(cmd))
logfile.write("\n")
logfile.close()
chdir(self.bypdir)
out = subprocess.check_output(cmd).decode('utf-8')
BypLog = out.split('\n')
# Extract results from command output to file
idx = 0
l = BypLog[idx]
while l[:3] != "PUR":
idx += 1
l = BypLog[idx]
insertion_sites = [x for x in literal_eval(l.split(":")[1][1:])]
if module_type == "rna3dmotif":
rna = open(self.biorseo_dir + "/" + self.temp_dir + header_ + ".byp.csv", "w")
else:
rna = open(self.biorseo_dir + "/" + self.temp_dir + header_ + ".bgsubyp.csv", "w")
rna.write("Motif,Score,Start1,End1,Start2,End2...\n")
for i, module in enumerate(insertion_sites):
if len(module):
for (score, positions, _) in zip(*[iter(module)]*3):
pos = []
q = -2
for p in positions:
if p-q > 1:
pos.append(q)
pos.append(p)
q = p
pos.append(q)
rna.write(module_type+str(i)+','+str(int(score)))
for (p, q) in zip(*[iter(pos[1:])]*2):
if q > p:
rna.write(','+str(p)+','+str(q))
rna.write('\n')
rna.close()
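Review note: the function above relies on two compact idioms, `zip(*[iter(x)]*n)` to read a flat list in groups of `n`, and a sentinel value (`q = -2`) to collapse a list of positions into start/end ranges. A more explicit sketch of both is given here; the helper names `chunks` and `positions_to_ranges` are hypothetical, and unlike the code above this version keeps single-nucleotide ranges.

```python
def chunks(items, size):
    """Group a flat list into tuples of `size` items (same idiom as zip(*[iter(x)]*n))."""
    return list(zip(*[iter(items)] * size))


def positions_to_ranges(positions):
    """Collapse sorted positions into (start, end) ranges of consecutive values."""
    ranges = []
    start = prev = None
    for p in positions:
        if prev is None or p - prev > 1:
            # A gap was found: close the previous range, if any, and start a new one.
            if prev is not None:
                ranges.append((start, prev))
            start = p
        prev = p
    if prev is not None:
        ranges.append((start, prev))
    return ranges
```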
def launch_BayesPairing2(self, module_type, seq_, header_):
"""Search for module occurrences in the provided RNA sequence using BayesPairing 2"""
# Run BayesPairing 2
if module_type=="rna3dmotif":
BP2_type = "rna3dmotif"
else:
BP2_type = "3dMotifAtlas_ALL"
cmd = ["python3", "parse_sequences.py", "-seq", self.biorseo_dir + '/' + self.temp_dir + header_ + ".fa", "-samplesize", "1000", "-d", BP2_type ]
logfile = open(self.temp_dir + "log_of_the_run.sh", 'a')
logfile.write(" ".join(cmd))
logfile.write("\n")
logfile.close()
chdir(self.bypdir)
out = subprocess.check_output(cmd).decode('utf-8')
Byp2Log = out.splitlines()
# remove what is not in the original input
Byp2Log.pop(0)
Byp2Log.pop(0)
Byp2Log.pop()
Byp2Log.pop()
# remove the 2 first lines of output
Byp2Log.pop(0)
Byp2Log.pop(0)
# Extract results from command line output to file
lines = []
for i in range(len(Byp2Log)):
line = Byp2Log[i].replace("|", ' ').replace(",", ' ').split()
if line != []:
if "=" in line[0]: #skip the "| MODULE N HITS PERCENTAGE |" part
break
module_sequence = line.pop().split("&") #remove the sequence
if line != []:
new_line = [line[0], line[1]]
num_comp = 0
position_index = 2
while num_comp < len(module_sequence):
len_comp = 0
comp = module_sequence[num_comp]
if position_index >= len(line):
print("!!! Skipped one BP2 result : positions not matching sequence length\n")
new_line = []
break
element = line[position_index].split("-")
new_line.append(element[0])
while len_comp < len(comp):
element = line[position_index].split("-")
if len(element)==1:
len_comp += 1
else:
len_comp += int(element[1]) - int(element[0]) + 1
position_index += 1
new_line.append(element[-1])
num_comp += 1
if new_line != [] :
lines.append(new_line)
if module_type=="rna3dmotif":
rna = open(self.biorseo_dir + "/" + self.temp_dir + header_ + ".byp2.csv", "w")
else:
rna = open(self.biorseo_dir + "/" + self.temp_dir + header_ + ".bgsubyp2.csv", "w")
rna.write("Motif,Score,Start1,End1,Start2,End2...\n")
for line in lines:
rna.write(module_type)
for i in range(len(line)-1):
rna.write(line[i] + ",")
rna.write(line[-1] + "\n")
rna.close()
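Review note: the position parsing above can be read as "consume position tokens until each module component is fully covered". The sketch below isolates that step. It assumes, based only on how the code above splits the output, that each token is either a single position (`"30"`) or a range (`"12-15"`); the function name `component_bounds` is hypothetical.

```python
def component_bounds(tokens, component_lengths):
    """Return one (start, end) per component, or None if the tokens run out early."""
    bounds = []
    idx = 0
    for length in component_lengths:
        covered = 0
        start = end = None
        while covered < length:
            if idx >= len(tokens):
                return None  # positions do not match the sequence length
            parts = tokens[idx].split("-")
            first, last = int(parts[0]), int(parts[-1])
            if start is None:
                start = first
            end = last
            covered += last - first + 1
            idx += 1
        bounds.append((start, end))
    return bounds
```

Checking the index before every token access avoids an `IndexError` when the reported positions are shorter than the module sequence.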
def execute_job(self, j):
"""Execute the command or function stored in a Job class object."""
running_stats[0] += 1
if j.cmd_ is not None:
logfile = open(self.biorseo_dir + "/" + self.temp_dir + "log_of_the_run.sh", 'a')
logfile.write(" ".join(j.cmd_))
logfile.write("\n")
logfile.close()
print("["+str(running_stats[0]+running_stats[2]) +
'/'+str(self.jobcount)+"]\t"+" ".join(j.cmd_))
r = subprocess.run(j.cmd_, timeout=j.timeout_)
elif j.func_ is not None:
print("["+str(running_stats[0]+running_stats[2])+'/'+str(self.jobcount) +
"]\t"+j.func_.__name__+'('+", ".join([a for a in j.args_])+')')
try:
if j.args_ is not None:
r = j.func_(*j.args_)
else:
r = j.func_()
except Exception as e:
r = 1
print("\033[31m", e, "\033[0m")
pass
running_stats[1] += 1
return r
def execute_jobs(self):
"""Groups job by priority and ability to be run in parallel,
and runs them."""
jobs = {}
self.jobcount = len(self.joblist)
for job in self.joblist:
if job.priority_ not in jobs.keys():
jobs[job.priority_] = {}
if job.nthreads not in jobs[job.priority_].keys():
jobs[job.priority_][job.nthreads] = []
jobs[job.priority_][job.nthreads].append(job)
nprio = max(jobs.keys())
if len(jobs) > 1 :
for i in range(1,nprio+1):
if not len(jobs[i].keys()): continue
# check the thread numbers
different_thread_numbers = [ n for n in jobs[i].keys() ]
different_thread_numbers.sort()
for n in different_thread_numbers:
bunch = jobs[i][n]
if not len(bunch): continue
pool = MyPool(processes=n)
pool.map(self.execute_job, bunch)
pool.close()
pool.join()
else:
for job in self.joblist:
self.execute_job(job)
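Review note: the grouping above iterates over `range(1, nprio+1)`, which appears to assume that every priority level between 1 and the maximum is present. A sketch of the same grouping that iterates over the existing keys instead is given below. `DemoJob` is a hypothetical stand-in for the `Job` class, and the function only returns the batches rather than running them in a pool.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for the Job class used in the file.
DemoJob = namedtuple("DemoJob", ["name", "priority", "nthreads"])


def schedule(joblist):
    """Group jobs by priority, then by thread count, in execution order."""
    groups = defaultdict(lambda: defaultdict(list))
    for job in joblist:
        groups[job.priority][job.nthreads].append(job)
    batches = []
    for priority in sorted(groups):              # only priorities that exist
        for nthreads in sorted(groups[priority]):
            names = [j.name for j in groups[priority][nthreads]]
            batches.append((priority, nthreads, names))
    return batches
```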
def list_jobs(self):
"""Determines the required tool runs we need before running the C++ Biorseo,
and creates a job list to run."""
if self.outputf != "":
subprocess.run(["mkdir", "-p", self.outputf]) # Create the output folder
subprocess.run(["mkdir", "-p", self.temp_dir]) # Create the temp folder
# Read fasta file, which can contain one or several RNAs
RNAcontainer = []
print("loading file %s..." % self.inputfile)
db = open(self.inputfile, "r")
c = 0
header = ""
seq = ""
while True:
l = db.readline()
if l == "":
break
c += 1
c = c % 2
if c == 1:
if header != "": # This is our second RNA in the fasta file
self.mode = 1 # we switch to batch mode
header = l[1:-1]
if c == 0:
seq = l.upper()
if is_canonical_nts(seq):
header = header.replace('/', '_').replace('\'','').replace('(','').replace(')','').replace(' ','_').replace('>','')
if (header != ""):
RNAcontainer.append(RNA(header, seq))
if not path.isfile(self.temp_dir + header + ".fa"):
rna = open(self.temp_dir + header + ".fa", "w")
rna.write(">" + header +'\n')
rna.write(seq +'\n')
rna.close()
db.close()
for nt, number in ignored_nt_dict.items():
print("ignored %d sequences because of char %c" % (number, nt))
tot = len(RNAcontainer)
print("Loaded %d RNAs." % (tot))
# define job list
for instance in RNAcontainer:
executable = self.biorseo_dir + "bin/biorseo"
fastafile = self.temp_dir+instance.header+".fa"
method_type = ""
priority = 1
if self.type == "jar3d":
ext = ".jar3d"
method_type = "--jar3dcsv"
csv = self.temp_dir + instance.header + ".sites.csv"
# RNAsubopt
self.joblist.append(Job(command=["RNAsubopt", "-i", fastafile, "--outfile="+ instance.header + ".subopt"], priority=1))
self.joblist.append(Job(command=["mv", instance.header + ".subopt", self.temp_dir], priority=2))
# JAR3D
self.joblist.append(Job(function=self.launch_JAR3D, args=[instance.seq_, instance.header], priority=3, how_many_in_parallel=1))
priority = 4
if self.type == "byp":
method_type = "--bayespaircsv"
if self.modules == "desc":
ext = ".byp2"
csv = self.temp_dir + instance.header + ".byp2.csv"
self.joblist.append(Job(function=self.launch_BayesPairing2, args=["rna3dmotif", instance.seq_, instance.header], how_many_in_parallel=1, priority=1))
elif self.modules == "bgsu":
ext = ".bgsubyp2"
csv = self.temp_dir + instance.header + ".bgsubyp2.csv"
self.joblist.append(Job(function=self.launch_BayesPairing2, args=["3dmotifatlas", instance.seq_, instance.header], how_many_in_parallel=1, priority=1))
priority = 2
if self.type == "dpm":
if self.modules == "desc":
method_type = "--descfolder"
csv = self.desc_folder
ext = ".desc_pm"
elif self.modules == "rin":
method_type = "--rinfolder"
csv = self.rin_folder
ext = ".rin_pm"
elif self.modules == "json":
method_type = "--jsonfolder"
csv = self.json_folder
ext = ".json_pm"
command = [ executable, "-s", fastafile ]
if method_type:
command += [ method_type, csv ]
self.finalname = self.temp_dir + instance.header + ext + self.func
command += [ "-o", self.finalname, "--function", self.func ]
command += self.forward_options
self.joblist.append(Job(command=command, priority=priority, timeout=3600, how_many_in_parallel=3))
BiorseoInstance(cmd_opts)
......@@ -24,6 +24,7 @@ using namespace std;
using json = nlohmann::json;
char MOIP::obj_function_nbr_ = 'A';
char MOIP::obj_function2_nbr_ = 'b';
uint MOIP::obj_to_solve_ = 1;
double MOIP::precision_ = 1e-5;
bool MOIP::allow_pk_ = true;
......@@ -68,6 +69,7 @@ MOIP::MOIP(const RNA& rna, string source, string source_path, float theta, bool
if (verbose_) cout << "Defining problem decision variables..." << endl;
basepair_dv_ = IloNumVarArray(env_);
insertion_dv_ = IloNumVarArray(env_);
stacks_dv_ = IloNumVarArray(env_);
// Add the y^u_v decision variables
if (verbose_) cout << "\t> Legal basepairs : ";
......@@ -87,6 +89,24 @@ MOIP::MOIP(const RNA& rna, string source, string source_path, float theta, bool
}
if (verbose_) cout << endl;
// Add the x_i,j decision variables
if (verbose_) cout << "\t> The possible stacks of two base pairs (i,j) and (i+1,j-1) : ";
c = 0;
index_of_xij_ = vector<vector<size_t>>(rna_.get_RNA_length() - 6, vector<size_t>(0));
for (u = 0; u < rna_.get_RNA_length() - 6; u++)
for (v = u + 4; v < rna_.get_RNA_length(); v++) // A basepair is possible if v > u+3
if (rna_.get_pij(u, v) > theta and rna_.get_pij(u + 1, v - 1) > theta) { // or u-1 v+1 ??
if (verbose_) cout << u << '-' << v << " ";
index_of_xij_[u].push_back(c);
c++;
char name[15];
sprintf(name, "x%d,%d", u, v);
stacks_dv_.add(IloNumVar(env_, 0, 1, IloNumVar::Bool, name)); // A boolean whether (u,v) and (u+1,v-1) are a stack
} else {
index_of_xij_[u].push_back(rna_.get_RNA_length() * rna_.get_RNA_length() + 1);
}
if (verbose_) cout << endl;
// Look for insertions sites, then create the appropriate Cxip variables
insertion_sites_ = vector<Motif>();
......@@ -146,7 +166,7 @@ MOIP::MOIP(const RNA& rna, string source, string source_path, float theta, bool
{
if (verbose)
{
cerr << "\t>Ignoring motif " << it.path().stem();
cerr << "\t> Ignoring motif " << it.path().stem();
switch (error)
{
case '-': cerr << ", some nucleotides have a negative number..."; break;
......@@ -191,25 +211,25 @@ MOIP::MOIP(const RNA& rna, string source, string source_path, float theta, bool
thread_pool.push_back(thread(&Pool::infinite_loop_func, &pool));
// Read every RIN file and add it to the queue (iff valid)
char error;
char error;
for (auto it : recursive_directory_range(source_path))
{
if ((error = Motif::is_valid_RIN(it.path().string()))) // Returns error if RIN file is incorrect
{
if (verbose)
if ((error = Motif::is_valid_RIN(it.path().string()))) // Returns error if RIN file is incorrect
{
if (verbose)
{
cerr << "\t> Ignoring RIN " << it.path().stem();
switch (error)
{
case 'l': cerr << ", too short to be considered."; break;
case 'x': cerr << ", because not constraining the secondary structure."; break;
default: cerr << ", unknown reason";
default: cerr << ", unknown reason";
}
cerr << endl;
}
errors++;
errors++;
continue;
}
}
accepted++;
args_of_parallel_func args(it.path(), posInsertionSites_access);
inserted++;
......@@ -234,32 +254,36 @@ MOIP::MOIP(const RNA& rna, string source, string source_path, float theta, bool
size_t errors = 0;
// Read every JSON files
vector<pair<uint,char>> errors_id;
vector<pair<uint,char>> errors_id;
for (auto it : recursive_directory_range(source_path))
{
errors_id = Motif::is_valid_JSON(it.path().string());
if (!(errors_id.empty())) // Returns error if JSON file is incorrect
{
if (!(errors_id.empty())) // Returns error if JSON file is incorrect
{
for(uint j = 0; j < errors_id.size(); j++)
{
cerr << "\t> Ignoring JSON " << errors_id[j].first;
uint error = errors_id[j].second;
switch (error)
{
case 'l': cerr << ", too short to be considered."; break;
case 'x': cerr << ", sequence and secondary structure are of different size."; break;
case 'd' : cerr << ", missing header."; break;
case 'e' : cerr << ", sequence is empty."; break;
case 'f' : cerr << ", 2D is empty."; break;
case 'n' : cerr << ", brackets are not balanced."; break;
case 'k' : cerr << ", a component is too small and got removed."; break;
default: cerr << ", unknown reason";
if(verbose) {
cerr << "\t> Ignoring JSON " << errors_id[j].first;
uint error = errors_id[j].second;
switch (error)
{
case 'l': cerr << ", too short to be considered."; break;
case 'x': cerr << ", sequence and secondary structure are of different size."; break;
case 'd' : cerr << ", missing header."; break;
case 'e' : cerr << ", sequence is empty."; break;
case 'f' : cerr << ", 2D is empty."; break;
case 'n' : cerr << ", brackets are not balanced."; break;
case 'k' : cerr << ", a component is too small and got removed."; break;
case 'a' : cerr << ", the number of components is different between contacts and sequence"; break;
case 'b' : cerr << ", the number of nucleotides is different between contacts and sequence"; break;
default: cerr << ", unknown reason";
}
cerr << endl;
}
cerr << endl;
}
errors++;
}
}
accepted++;
args_of_parallel_func args(it.path(), posInsertionSites_access);
inserted++;
......@@ -272,7 +296,7 @@ MOIP::MOIP(const RNA& rna, string source, string source_path, float theta, bool
}
else
{
cout << "!!! Unknown module source" << endl;
cout << "Err: Unknown module source." << endl;
}
// Add the Cx,i,p decision variables
......@@ -347,7 +371,7 @@ MOIP::MOIP(const RNA& rna, string source, string source_path, float theta, bool
break;
case 'E':
// Function f1E
// Function f1E
for (const Component& c : insertion_sites_[i].comp) sum_k += c.k;
obj1 += IloNum(sum_k * insertion_sites_[i].contact_ * insertion_sites_[i].tx_occurrences_) * insertion_dv_[index_of_first_components[i]] ;
break;
......@@ -361,13 +385,40 @@ MOIP::MOIP(const RNA& rna, string source, string source_path, float theta, bool
}
}
// Define the expected accuracy objective function:
obj2 = IloExpr(env_);
for (size_t u = 0; u < rna_.get_RNA_length() - 6; u++) {
for (size_t v = u + 4; v < rna_.get_RNA_length(); v++) {
if (allowed_basepair(u, v)) obj2 += (IloNum(rna_.get_pij(u, v)) * y(u, v));
}
//Stacking energy parameter matrix
double energy[7][7] = {
{0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0},
{0.0, 1.1, 2.1, 2.2, 1.4, 0.9, 0.6},
{0.0, 2.1, 2.4, 3.3, 2.1, 2.1, 1.4},
{0.0, 2.2, 3.3, 3.4, 2.5, 2.4, 1.5},
{0.0, 1.4, 2.1, 2.5, 1.3, 1.3, 0.5},
{0.0, 0.9, 2.1, 2.4, 1.3, 1.3, 1.0},
{0.0, 0.6, 1.4, 1.5, 0.5, 1.0, 0.3}
};
obj2 = IloExpr(env_);
switch (obj_function2_nbr_) {
case 'a':
// Define the MFE (Minimum Free Energy):
for (size_t u = 0; u < rna_.get_RNA_length() - 6; u++) {
for (size_t v = u + 4; v < rna_.get_RNA_length(); v++) {
if (get_xij_index(u, v) != rna_.get_RNA_length() * rna_.get_RNA_length() + 1) {
uint type1 = rna_.get_type()[u][v];
uint type2 = rna_.get_type()[u + 1][v - 1];
obj2 += IloNum(energy[type1][type2]) * x(u, v);
}
}
}
break;
case 'b':
// Define the expected accuracy objective function:
//MEA:
for (size_t u = 0; u < rna_.get_RNA_length() - 6; u++) {
for (size_t v = u + 4; v < rna_.get_RNA_length(); v++) {
if (allowed_basepair(u, v)) obj2 += (IloNum(rna_.get_pij(u, v)) * y(u, v));
}
}
break;
}
}
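Review note: in case 'a' above, the second objective sums `energy[type1][type2]` over every selected stack `x(u, v)`, where `type1` and `type2` are the pair types of `(u, v)` and `(u+1, v-1)`. The sketch below reproduces that sum outside of CPLEX. The matrix is copied from the hunk; the mapping from nucleotide pairs to type numbers is not shown in this diff, so the `pair_type` dictionary is a hypothetical input.

```python
# Stacking parameters copied from the matrix above (row and column 0 are unused).
ENERGY = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.1, 2.1, 2.2, 1.4, 0.9, 0.6],
    [0.0, 2.1, 2.4, 3.3, 2.1, 2.1, 1.4],
    [0.0, 2.2, 3.3, 3.4, 2.5, 2.4, 1.5],
    [0.0, 1.4, 2.1, 2.5, 1.3, 1.3, 0.5],
    [0.0, 0.9, 2.1, 2.4, 1.3, 1.3, 1.0],
    [0.0, 0.6, 1.4, 1.5, 0.5, 1.0, 0.3],
]


def stacking_score(pair_type, stacks):
    """Sum the stacking parameter of each stack (u, v) with its inner pair (u+1, v-1)."""
    return sum(ENERGY[pair_type[(u, v)]][pair_type[(u + 1, v - 1)]] for (u, v) in stacks)
```

The matrix is symmetric, so the order of the two pair types does not change the score.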
......@@ -407,6 +458,25 @@ void MOIP::define_problem_constraints(string& source)
}
}
// Ensure that the stacking of (i,j) and (i+1,j-1) exists if and only if the pairing (i,j) and (i+1, j-1) exist
if (verbose_) cout << "\t> ensuring that the stacks are possible..." << endl;
for (u = 0; u < n - 5; u++) {
for (v = u + 4; v < n; v++) {
if (allowed_basepair(u, v) and allowed_basepair(u + 1, v - 1)) {
IloExpr c7_1(env_);
IloExpr c7_2(env_);
c7_1 += y(u, v) + y(u + 1, v - 1);
c7_2 += y(u, v) + y(u + 1, v - 1) - IloNum(1);
model_.add(IloNum(2) * x(u, v) <= c7_1);
if (verbose_) cout << "\t\t" << (2 * x(u,v) <= c7_1) << endl;
model_.add(x(u, v) >= c7_2);
if (verbose_) cout << "\t\t" << (x(u, v) >= c7_2) << endl << endl;
}
}
}
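Review note: the two inequalities added above, `2*x(u,v) <= y(u,v) + y(u+1,v-1)` and `x(u,v) >= y(u,v) + y(u+1,v-1) - 1`, are the usual linearization of a logical AND between two binary variables. A brute-force check over all binary assignments is sketched below; the function name `feasible_x` is hypothetical.

```python
from itertools import product


def feasible_x(y_outer, y_inner):
    """Values of the binary x allowed by both stacking constraints."""
    return [x for x in (0, 1)
            if 2 * x <= y_outer + y_inner and x >= y_outer + y_inner - 1]


# For every assignment of the two basepair variables, exactly one value of x
# remains feasible, and it equals (y_outer AND y_inner).
for y_outer, y_inner in product((0, 1), repeat=2):
    assert feasible_x(y_outer, y_inner) == [y_outer & y_inner]
```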
// forbid lonely basepairs if databases other than CaRNAval are being used
if (source != "rinfolder" and source != "jsonfolder")
{
......@@ -625,7 +695,6 @@ void MOIP::define_problem_constraints(string& source)
SecondaryStructure MOIP::solve_objective(int o, double min, double max)
{
//cout << endl << "BEGIN" << endl;
// Solves one of the objectives, under constraint that the other should be in [min, max]
if (min > max) {
......@@ -675,17 +744,11 @@ SecondaryStructure MOIP::solve_objective(int o, double min, double max)
}
// if (verbose_) cout << "\t\t>retrieving basepairs of the result secondary structure..." << endl;
//cout << "y(2,80): " << cplex_.getValue(y(u, v)) << endl;
for (size_t u = 0; u < rna_.get_RNA_length() - 6; u++)
for (size_t v = u + 4; v < rna_.get_RNA_length(); v++)
if (allowed_basepair(u, v))
if (cplex_.getValue(y(u, v)) > 0.5) {
best_ss.set_basepair(u, v);
/*if (u == 5 && v == 26) {
cout << endl << "(" << u << "," << v << "): " << endl;
cout << best_ss.to_string() << endl;
cout << "(((...((((((((....))))))))(((.....((((((((....)))))))))))...((((((((....)))))))))))" << endl;
}*/
}
best_ss.sort(); // order the basepairs in the vector
......@@ -714,7 +777,9 @@ SecondaryStructure MOIP::solve_objective(int o, double min, double max)
void MOIP::search_between(double lambdaMin, double lambdaMax)
{
SecondaryStructure s = solve_objective(obj_to_solve_, lambdaMin, lambdaMax);
//if (fabs(lambdaMin - lambdaMax) < MOIP::precision_) return;
if (lambdaMin - lambdaMax > 0.0) return;
SecondaryStructure s = solve_objective(obj_to_solve_, lambdaMin + MOIP::precision_, lambdaMax);
//cout << "min: " << lambdaMin << " max: " << lambdaMax << endl;
if (!s.is_empty_structure) { // A solution has been found
......@@ -743,10 +808,10 @@ void MOIP::search_between(double lambdaMin, double lambdaMax)
double max = lambdaMax;
if (verbose_)
cout << std::setprecision(-log10(precision_) + 4) << "\nSolving objective function " << obj_to_solve_
cout << std::setprecision(-log10(precision_) + 7) << "\nSolving objective function " << obj_to_solve_
<< ", on top of " << s.get_objective_score(3 - obj_to_solve_) << ": Obj" << 3 - obj_to_solve_
<< " being in [" << std::setprecision(-log10(precision_) + 4) << min << ", "
<< std::setprecision(-log10(precision_) + 4) << max << "]..." << endl;
<< " being in [" << std::setprecision(-log10(precision_) + 7) << min << ", "
<< std::setprecision(-log10(precision_) + 7) << max << "]..." << endl;
search_between(min, max);
......@@ -756,10 +821,10 @@ void MOIP::search_between(double lambdaMin, double lambdaMax)
min = lambdaMin;
max = s.get_objective_score(3 - obj_to_solve_);
if (verbose_)
cout << std::setprecision(-log10(precision_) + 4) << "\nSolving objective function " << obj_to_solve_
cout << std::setprecision(-log10(precision_) + 7) << "\nSolving objective function " << obj_to_solve_
<< ", below (or eq. to) " << max << ": Obj" << 3 - obj_to_solve_ << " being in ["
<< std::setprecision(-log10(precision_) + 4) << min << ", "
<< std::setprecision(-log10(precision_) + 4) << max << "]..." << endl;
<< std::setprecision(-log10(precision_) + 7) << min << ", "
<< std::setprecision(-log10(precision_) + 7) << max << "]..." << endl;
search_between(min, max);
}
......@@ -818,6 +883,14 @@ size_t MOIP::get_yuv_index(size_t u, size_t v) const
size_t MOIP::get_Cpxi_index(size_t x_i, size_t i_on_j) const { return index_of_Cxip_[x_i][i_on_j]; }
size_t MOIP::get_xij_index(size_t u, size_t v) const
{
size_t a, b;
a = (u < v) ? u : v;
b = (u > v) ? u : v;
return index_of_xij_[a][b - 4 - a];
}
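Review note: `get_xij_index` reads a jagged table in which row `u` holds one entry per `v` from `u+4` to `n-1`, hence the column offset `b - 4 - a`. Pairs that cannot stack hold the sentinel `n*n + 1`. The sketch below rebuilds that layout in Python to make the indexing explicit; `is_stackable` is a hypothetical predicate standing in for the `get_pij(...) > theta` tests of the constructor.

```python
def build_stack_index(n, is_stackable):
    """Build the jagged index table and return it with its sentinel value."""
    sentinel = n * n + 1
    table = []
    counter = 0
    for u in range(n - 6):
        row = []
        for v in range(u + 4, n):
            if is_stackable(u, v):
                row.append(counter)   # index of x(u, v) in the variable array
                counter += 1
            else:
                row.append(sentinel)  # no variable exists for this pair
        table.append(row)
    return table, sentinel


def stack_index(table, u, v):
    """Look up the variable index of the stack on (u, v), in either order."""
    a, b = min(u, v), max(u, v)
    return table[a][b - 4 - a]
```

Because the table only has `n - 6` rows, a lookup is only valid when the smaller of the two positions is below `n - 6`.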
void MOIP::remove_solution(uint i) { pareto_.erase(pareto_.begin() + i); }
bool MOIP::allowed_basepair(size_t u, size_t v) const
......@@ -1020,7 +1093,7 @@ void MOIP::allowed_motifs_from_rin(args_of_parallel_func arg_struct)
mutex& posInsertionSites_access = arg_struct.posInsertionSites_mutex;
std::ifstream motif;
string filepath = rinfile.string();
string filepath = rinfile.string();
vector<vector<Component>> vresults, r_vresults;
vector<string> component_sequences;
uint carnaval_id;
......@@ -1029,7 +1102,7 @@ void MOIP::allowed_motifs_from_rin(args_of_parallel_func arg_struct)
string reversed_rna = rna_.get_seq();
std::reverse(reversed_rna.begin(), reversed_rna.end());
filenumber = filepath.substr(filepath.find("Subfiles/")+9, filepath.find(".txt"));
filenumber = filepath.substr(filepath.find("Subfiles/")+9, filepath.find(".txt"));
carnaval_id = 1 + stoi(filenumber); // Start counting at 1 to be consistent with the website numbering
motif = std::ifstream(rinfile.string());
......@@ -1054,13 +1127,13 @@ void MOIP::allowed_motifs_from_rin(args_of_parallel_func arg_struct)
{
Motif temp_motif = Motif(v, rinfile, carnaval_id, false);
bool unprobable = false;
for (const Link& l : temp_motif.links_)
{
if (!allowed_basepair(l.nts.first,l.nts.second))
unprobable = true;
}
if (unprobable) continue;
bool unprobable = false;
for (const Link& l : temp_motif.links_)
{
if (!allowed_basepair(l.nts.first,l.nts.second))
unprobable = true;
}
if (unprobable) continue;
// Add it to the results vector
unique_lock<mutex> lock(posInsertionSites_access);
......@@ -1069,182 +1142,18 @@ void MOIP::allowed_motifs_from_rin(args_of_parallel_func arg_struct)
}
}
//Temporary--------------------------------------
// Check that the sequence is an RNA sequence: replace T by U and remove modified nucleotides if necessary
string check_motif_sequence(string seq) {
std::transform(seq.begin(), seq.end(), seq.begin(), ::toupper);
for (int i = seq.size(); i >= 0; i--) {
if(seq[i] == 'T') {
seq[i] = 'U';
} else if (!(seq [i] == 'A' || seq [i] == 'U' || seq [i] == '&'
|| seq [i] == 'G' || seq [i] == 'C')) {
seq = seq.erase(i,1);
}
}
return seq;
}
// Based on the 2d structure find all positions of the pairings.
vector<Link> search_pairing(string& struc, vector<Component>& v) {
vector<Link> vec;
stack<uint> parentheses;
stack<uint> crochets;
stack<uint> accolades;
stack<uint> chevrons;
/*for(uint j = 0; j < v.size(); j++) {
cout << "composante: (" << v[j].pos.first << "," << v[j].pos.second << ")" << endl << endl;
}*/
uint count = 0;
uint debut = v[count].pos.first;
uint gap = 0;
for (uint i = 0; i < struc.size(); i++) {
if (struc[i] == '(') {
parentheses.push(i + debut + gap - count);
//cout << "i: " << i << " pos :" << parentheses.top() << endl;
} else if (struc[i] == ')') {
Link l;
l.nts.first = parentheses.top();
//cout << "top :" << parentheses.top() << endl;
l.nts.second = i + debut + gap - count;
vec.push_back(l);
parentheses.pop();
} else if (struc[i] == '[') {
crochets.push(i + debut + gap - count);
} else if (struc[i] == ']') {
Link l;
l.nts.first = crochets.top();
l.nts.second = i + debut + gap - count;
vec.push_back(l);
crochets.pop();
} else if (struc[i] == '{') {
accolades.push(i + debut + gap - count);
} else if (struc[i] == '}') {
Link l;
l.nts.first = accolades.top();
l.nts.second = i + debut + gap - count;
vec.push_back(l);
accolades.pop();
} else if (struc[i] == '<') {
chevrons.push(i + debut + gap - count);
} else if (struc[i] == '>') {
Link l;
l.nts.first = chevrons.top();
l.nts.second = i + debut + gap - count;
vec.push_back(l);
chevrons.pop();
} else if (struc[i] == '&') {
count ++;
gap += v[count].pos.first - v[count - 1].pos.second - 1;
//cout << "count: " << count << endl;
//cout << "gap : " << gap << endl;
}
}
return vec;
}
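Review note: `search_pairing` above keeps one stack per bracket type and converts each index of the 2D string into an RNA position with `i + debut + gap - count`. A sketch of the same idea is given below, with the position computed from the start of the current component instead. The function name `motif_pairs` and the `(first, last)` tuples used for components are assumptions made for illustration.

```python
def motif_pairs(struct2d, components):
    """Return basepairs as RNA positions for an '&'-separated 2D structure.

    components: list of (first, last) RNA positions, one per '&'-separated part.
    """
    closing = {')': '(', ']': '[', '}': '{', '>': '<'}
    stacks = {opener: [] for opener in closing.values()}  # one stack per bracket type
    pairs = []
    comp = 0     # index of the current component
    offset = 0   # offset inside the current component
    for char in struct2d:
        if char == '&':
            comp += 1
            offset = 0
            continue
        pos = components[comp][0] + offset
        if char in stacks:
            stacks[char].append(pos)
        elif char in closing:
            pairs.append((stacks[closing[char]].pop(), pos))
        offset += 1
    return pairs
```

Using separate stacks is what allows crossing brackets such as `([.&.)]` to be read as a pseudoknot.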
size_t count_contacts(string contacts) {
size_t count = 0;
for (uint i = 0; i < contacts.size(); i++) {
if (contacts[i] == '*') {
count++;
}
}
return count;
}
uint find_max_occurrences (string filepath) {
uint max = 0;
std::ifstream in = std::ifstream(filepath);
json js = json::parse(in);
string contacts_id;
for(auto it = js.begin(); it != js.end(); ++it) {
contacts_id = it.key();
for(auto it2 = js[contacts_id].begin(); it2 != js[contacts_id].end(); ++it2) {
string test = it2.key();
if (!test.compare("occurences")) {
uint occ = it2.value();
if (occ > max) {
max = occ;
}
}
}
}
return max;
}
uint find_max_sequence (string filepath) {
uint max = 0;
std::ifstream in = std::ifstream(filepath);
json js = json::parse(in);
string contacts_id;
string seq;
for(auto it = js.begin(); it != js.end(); ++it) {
contacts_id = it.key();
for(auto it2 = js[contacts_id].begin(); it2 != js[contacts_id].end(); ++it2) {
string test = it2.key();
if (!test.compare("sequence")) {
seq = it2.value();
uint size = seq.size();
if (size > max) {
max = size;
}
}
}
}
return max;
}
vector<string> find_components(string sequence, string delimiter) {
vector<string> list;
string seq = sequence;
string subseq;
uint fin = 0;
while(seq.find(delimiter) != string::npos) {
fin = seq.find(delimiter);
subseq = seq.substr(0, fin);
seq = seq.substr(fin + 1);
list.push_back(subseq); // new component sequence
//std::cout << "subseq: " << subseq << endl;
}
if (!seq.empty()) {
list.push_back(seq);
//std::cout << "subseq: " << seq << endl;
}
return list;
}
//Temporary--------------------------------------
void MOIP::allowed_motifs_from_json(args_of_parallel_func arg_struct, vector<pair<uint, char>> errors_id)
{
/*
Searches where to place some JSONs in the RNA
*/
// Searches where to place some JSON motifs in the RNA
path jsonfile = arg_struct.motif_file;
mutex& posInsertionSites_access = arg_struct.posInsertionSites_mutex;
std::ifstream motif;
string filepath = jsonfile.string();
string filepath = jsonfile.string();
vector<vector<Component>> vresults, r_vresults;
vector<string> component_sequences;
vector<string> component_strucs;
vector<string> component_contacts;
vector<string> pdbs;
string contacts, field, seq, struct2d;
string contacts_id;
string line, filenumber;
......@@ -1253,30 +1162,23 @@ void MOIP::allowed_motifs_from_json(args_of_parallel_func arg_struct, vector<pai
size_t nb_contacts = 0;
double tx_occurrences = 0;
// cout << filepath << endl;
std::reverse(reversed_rna.begin(), reversed_rna.end());
motif = std::ifstream(filepath);
json js = json::parse(motif);
string keys[4] = {"contacts", "occurences", "sequence", "struct2d"};
string keys[5] = {"contacts", "occurences", "pdb", "sequence", "struct2d"};
uint it_errors = 0;
uint comp;
//uint max_occ = 0;
//uint max_n = 0;
uint occ = 0;
for(auto it = js.begin(); it != js.end(); ++it) {
contacts_id = it.key();
comp = stoi(contacts_id);
// Check for known errors to ignore correspopnding motifs
// Check for known errors to ignore corresponding motifs
if (comp == errors_id[it_errors].first) {
while (comp == errors_id[it_errors].first) {
//cout << "id erreur: " << errors_id[it_errors].first << endl;
/*if (contacts_id.compare("974") == 0) {
cout << "id erreur: " << errors_id[it_errors].second << endl;
}*/
it_errors ++;
}
continue;
......@@ -1288,48 +1190,56 @@ void MOIP::allowed_motifs_from_json(args_of_parallel_func arg_struct, vector<pai
{
contacts = it2.value();
nb_contacts = count_contacts(contacts);
component_contacts = find_components(contacts, "&");
}
else if (!field.compare(keys[1])) // This is the occurences field
{
occ = it2.value();
//max_occ = find_max_occurrences(filepath);
tx_occurrences = (double)occ; // / (double)max_occ;
//cout << "occ: " << tx_occurrences << endl;
tx_occurrences = (double)occ; // (double)max_occ;
}
else if (!field.compare(keys[2])) // This is the sequence field
else if (!field.compare(keys[2])) // This is the pdb field
{
vector<string> tab = it2.value();
pdbs = tab;
}
else if (!field.compare(keys[3])) // This is the sequence field
{
seq = check_motif_sequence(it2.value());
/*max_n = find_max_sequence(filepath);
tx_occurrences = (double)occ / (double)max_n - seq.size() + 1 ;*/
component_sequences = find_components(seq, "&");
}
else if (!field.compare(keys[3])) // This is the struct2D field
else if (!field.compare(keys[4])) // This is the struct2D field
{
struct2d = it2.value();
component_strucs = find_components(struct2d, "&");
struct2d = it2.value();
}
}
vresults = json_find_next_ones_in(rna, 0, component_sequences, component_strucs);
vresults = json_find_next_ones_in(rna, 0, component_sequences);
for (vector<Component>& v : vresults)
{
if (verbose_) cout << "\t> Considering motif JSON " << contacts_id << "\t" << seq << ", " << struct2d << ", ";
if (verbose_) cout << "\t> Considering motif JSON " << contacts_id << "\t" << seq << ", " << struct2d << " ";
Motif temp_motif = Motif(v, contacts_id, nb_contacts, tx_occurrences);
temp_motif.links_ = search_pairing(struct2d, v);
temp_motif.links_ = build_motif_pairs(struct2d, v);
temp_motif.pos_contacts = find_contacts(component_contacts, v);
// Check if the motif can be inserted, checking the basepairs probabilities and theta
bool unprobable = false;
if (!temp_motif.links_.size()) {
if (verbose_) cout << "discarded, no constraints on the secondary structure, it is a useless motif." << endl;
continue;
}
if (verbose_) cout << "at position ";
for (const Link& l : temp_motif.links_)
{
if (verbose_) cout << l.nts.first << ',' << l.nts.second << ' ';
if (!allowed_basepair(l.nts.first,l.nts.second)) {
if (!allowed_basepair(l.nts.first, l.nts.second)) {
if (verbose_) cout << "(unlikely) ";
unprobable = true;
}
}
if (unprobable) {
if (verbose_) cout << "discarded because of unlikely or impossible basepairs" << endl;
if (verbose_) cout << ", discarded because of unlikely or impossible basepairs" << endl;
continue;
}
if (verbose_) cout << endl;
......@@ -1341,4 +1251,5 @@ void MOIP::allowed_motifs_from_json(args_of_parallel_func arg_struct, vector<pai
}
component_sequences.clear();
}
}
}
\ No newline at end of file
......
......@@ -37,6 +37,7 @@ class MOIP
void forbid_solutions_between(double min, double max);
IloEnv& get_env(void);
static char obj_function_nbr_; // On what criteria do you want to insert motifs ?
static char obj_function2_nbr_; // Do you want to use MEA or MFE to determine the best energy score ?
static uint obj_to_solve_; // What objective do you prefer to solve in mono-objective portions of the algorithm ?
static double precision_; // decimals to keep in objective values, to avoid numerical issues. otherwise, solution with objective 5.0000000009 dominates solution with 5.0 =(
static bool allow_pk_; // Whether we forbid pseudoknots (false) or allow them (true)
......@@ -47,8 +48,12 @@ class MOIP
void define_problem_constraints(string& source);
size_t get_yuv_index(size_t u, size_t v) const;
size_t get_Cpxi_index(size_t x_i, size_t i_on_j) const;
size_t get_xij_index(size_t u, size_t v) const;
IloNumExprArg& y(size_t u, size_t v); // Direct reference to y^u_v in basepair_dv_
IloNumExprArg& C(size_t x, size_t i); // Direct reference to C_p^xi in insertion_dv_
IloNumExprArg& x(size_t u, size_t v); // Direct reference to x_i,j in stacks_dv_
bool exists_vertical_outdated_labels(const SecondaryStructure& s) const;
bool exists_horizontal_outdated_labels(const SecondaryStructure& s) const;
void allowed_motifs_from_desc(args_of_parallel_func arg_struct);
......@@ -66,12 +71,16 @@ class MOIP
IloEnv env_; // environment CPLEX object
IloNumVarArray basepair_dv_; // Decision variables
IloNumVarArray insertion_dv_; // Decision variables
IloNumVarArray stacks_dv_; // Decision variables
IloModel model_; // Solver for objective 1
IloExpr obj1; // Objective function that counts inserted motifs
IloExpr obj2; // Objective function of expected accuracy
vector<vector<size_t>> index_of_Cxip_; // Stores the indexes of the Cxip in insertion_dv_
vector<size_t> index_of_first_components; // Stores the indexes of Cx1p in insertion_dv_
vector<vector<size_t>> index_of_yuv_; // Stores the indexes of the y^u_v in basepair_dv_
vector<vector<size_t>> index_of_xij_; //Stores the indexes of the xij variables (BioKop) in stacks_dv_
};
inline uint MOIP::get_n_solutions(void) const { return pareto_.size(); }
......@@ -79,6 +88,8 @@ inline uint MOIP::get_n_candidates(void) const { return ins
inline const SecondaryStructure& MOIP::solution(uint i) const { return pareto_[i]; }
inline IloNumExprArg& MOIP::y(size_t u, size_t v) { return basepair_dv_[get_yuv_index(u, v)]; }
inline IloNumExprArg& MOIP::C(size_t x, size_t i) { return insertion_dv_[get_Cpxi_index(x, i)]; }
inline IloNumExprArg& MOIP::x(size_t u, size_t v) { return stacks_dv_[get_xij_index(u, v)]; }
inline SecondaryStructure MOIP::solve_objective(int o) { return solve_objective(o, 0, rna_.get_RNA_length()); }
inline IloEnv& MOIP::get_env(void) { return env_; }
......
......@@ -11,16 +11,6 @@ using namespace boost::filesystem;
using namespace std;
using json = nlohmann::json;
struct recursive_directory_range {
typedef recursive_directory_iterator iterator;
recursive_directory_range(path p) : p_(p) {}
iterator begin() { return recursive_directory_iterator(p_); }
iterator end() { return recursive_directory_iterator(); }
path p_;
};
Motif::Motif(void) {}
Motif::Motif(const vector<Component>& v, string PDB) : comp(v), PDBID(PDB)
......@@ -275,67 +265,9 @@ char Motif::is_valid_RIN(const string& rinfile)
return (char) 0;
}
//temporary---------------------------------------------------
bool checkSecondaryStructure(string struc)
{
stack<uint> parentheses;
stack<uint> crochets;
stack<uint> accolades;
stack<uint> chevrons;
for (uint i = 0; i < struc.length(); i++)
{
if (struc[i] != '(' && struc[i] != ')'
&& struc[i] != '.' && struc[i] != '&'
&& struc[i] != '[' && struc[i] != ']'
&& struc[i] != '{' && struc[i] != '}'
&& struc[i] != '<' && struc[i] != '>') {
return false;
} else {
for (uint i = 0; i < struc.size(); i++) {
if (struc[i] == '(') {
parentheses.push(i);
} else if (struc[i] == ')') {
if (!parentheses.empty())
parentheses.pop();
else return false;
} else if (struc[i] == '[') {
crochets.push(i);
} else if (struc[i] == ']') {
if (!crochets.empty())
crochets.pop();
else return false;
} else if (struc[i] == '{') {
accolades.push(i);
} else if (struc[i] == '}') {
if (!accolades.empty())
accolades.pop();
else return false;
} else if (struc[i] == '<') {
chevrons.push(i);
} else if (struc[i] == '>') {
if (!chevrons.empty())
chevrons.pop();
else return false;
}
}
}
}
return (parentheses.empty() && crochets.empty() && accolades.empty() && chevrons.empty());
}
//--------------------------------------------------------------
vector<pair<uint,char>> Motif::is_valid_JSON(const string& jsonfile)
{
// /!\ returns 0 if no errors
// returns an empty vector if no errors
std::ifstream motif;
motif = std::ifstream(jsonfile);
json js = json::parse(motif);
......@@ -344,26 +276,28 @@ vector<pair<uint,char>> Motif::is_valid_JSON(const string& jsonfile)
uint fin = 0;
std::string keys[6] = {"contacts", "occurences", "pdb", "pfam", "sequence", "struct2d"};
// Iterating over Motifs
for (auto i = js.begin(); i != js.end(); ++i) {
int j = 0;
string id = i.key();
size_t size;
string complete_seq;
//cout << id << ": " << endl;
string contacts;
// Iterating over json keys
for (auto it = js[id].begin(); it != js[id].end(); ++it) {
string test = it.key();
//std::cout << "test: " << test << endl;
if (test.compare(keys[j]))
// it.key() contains the key, and it.value() the field's value.
string curr_key = it.key();
if (curr_key.compare(keys[j]))
{
//std::cout << "error header : keys[" << j << "]: " << keys[j] << " vs test: " << test << endl;
//std::cout << "error header : keys[" << j << "]: " << keys[j] << " vs curr_key: " << curr_key << endl;
errors_id.push_back(make_pair(stoi(id), 'd'));
//return 'd';
}
else if(!test.compare(keys[5])) // This is the secondary structure field
else if(!curr_key.compare(keys[5])) // This is the secondary structure field
{
//std::cout << "struct2d: " << it.value() << endl;
string ss = it.value();
if (ss.empty()) {
errors_id.push_back(make_pair(stoi(id), 'f'));
......@@ -376,19 +310,33 @@ vector<pair<uint,char>> Motif::is_valid_JSON(const string& jsonfile)
break;
}
}
else if (!test.compare(keys[4])) // This is the sequence field
else if(!curr_key.compare(keys[0])) // This is the contacts field
{
contacts = it.value();
}
else if (!curr_key.compare(keys[4])) // This is the sequence field
{
//std::cout << "sequence: " << it.value() << "\n";
string seq = it.value();
size = count_nucleotide(seq);
complete_seq = seq;
if (seq.empty()) {
errors_id.push_back(make_pair(stoi(id), 'e'));
break;
}
if (seq.size() < 4) {
if (size < 4) {
errors_id.push_back(make_pair(stoi(id), 'l'));
break;
}
size_t count_contact = count_delimiter(contacts);
size_t count_seq = count_delimiter(seq);
if (count_contact != count_seq) {
errors_id.push_back(make_pair(stoi(id), 'a'));
break;
}
if (contacts.size() != seq.size()) {
errors_id.push_back(make_pair(stoi(id), 'b'));
break;
}
// Iterate on components to check their length
string subseq;
......@@ -399,13 +347,13 @@ vector<pair<uint,char>> Motif::is_valid_JSON(const string& jsonfile)
if (subseq.size() >= 2) {
components.push_back(subseq);
} else {
errors_id.push_back(make_pair(stoi(id), 'k'));
//errors_id.push_back(make_pair(stoi(id), 'k'));
}
}
if (seq.size() >= 2) { // Last component after the last &
components.push_back(seq);
} else {
errors_id.push_back(make_pair(stoi(id), 'k'));
//errors_id.push_back(make_pair(stoi(id), 'k'));
}
size_t n = 0;
......@@ -418,11 +366,249 @@ vector<pair<uint,char>> Motif::is_valid_JSON(const string& jsonfile)
}
j++;
}
//std::cout << "no error!\n" << endl;
}
return errors_id;
}
//count the number of nucleotides in the motif sequence (every character except '&')
size_t count_nucleotide(string& seq) {
size_t count = 0;
for(uint i = 0; i < seq.size(); i++) {
char c = seq.at(i);
if (c != '&') {
count++;
}
}
return count;
}
//count the number of '&' in the motif sequence
size_t count_delimiter(string& seq) {
size_t count = 0;
for(uint i = 0; i < seq.size(); i++) {
char c = seq.at(i);
if (c == '&') {
count++;
}
}
return count;
}
//count the number of '*' in the motif
size_t count_contacts(string& contacts) {
size_t count = 0;
for (uint i = 0; i < contacts.size(); i++) {
if (contacts[i] == '*') {
count++;
}
}
return count;
}
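//Normalize a motif sequence to the RNA alphabet (A, C, G, U, plus the '&' separator): replace T by U and remove any other (e.g. modified) nucleotide
string check_motif_sequence(string seq) {
std::transform(seq.begin(), seq.end(), seq.begin(), ::toupper);
// Start at the last valid index (size - 1), going backwards so that erasing does not shift unvisited characters
for (int i = (int)seq.size() - 1; i >= 0; i--) {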
if(seq[i] == 'T') {
seq[i] = 'U';
} else if (!(seq [i] == 'A' || seq [i] == 'U' || seq [i] == '&'
|| seq [i] == 'G' || seq [i] == 'C')) {
seq = seq.erase(i,1);
}
}
return seq;
}
//check that the dot-bracket string only uses known symbols and that every bracket is properly matched
bool checkSecondaryStructure(string struc)
{
stack<uint> parentheses;
stack<uint> crochets;
stack<uint> accolades;
stack<uint> chevrons;
// Single pass: reject unknown characters and closing brackets that have no matching opening one
for (uint i = 0; i < struc.length(); i++)
{
if (struc[i] == '(') {
parentheses.push(i);
} else if (struc[i] == ')') {
if (parentheses.empty()) return false;
parentheses.pop();
} else if (struc[i] == '[') {
crochets.push(i);
} else if (struc[i] == ']') {
if (crochets.empty()) return false;
crochets.pop();
} else if (struc[i] == '{') {
accolades.push(i);
} else if (struc[i] == '}') {
if (accolades.empty()) return false;
accolades.pop();
} else if (struc[i] == '<') {
chevrons.push(i);
} else if (struc[i] == '>') {
if (chevrons.empty()) return false;
chevrons.pop();
} else if (struc[i] != '.' && struc[i] != '&') {
return false;
}
}
// Every opening bracket must have been closed
return (parentheses.empty() && crochets.empty() && accolades.empty() && chevrons.empty());
}
// Converts a dot-bracket motif secondary structure to vector of Link
vector<Link> build_motif_pairs(string& struc, vector<Component>& v) {
vector<Link> vec;
stack<uint> parentheses;
stack<uint> crochets;
stack<uint> accolades;
stack<uint> chevrons;
uint count = 0;
uint debut = v[count].pos.first;
uint gap = 0;
for (uint i = 0; i < struc.size(); i++) {
if (struc[i] == '(') {
parentheses.push(i + debut + gap - count);
} else if (struc[i] == ')') {
Link l;
l.nts.first = parentheses.top();
l.nts.second = i + debut + gap - count;
vec.push_back(l);
parentheses.pop();
} else if (struc[i] == '[') {
crochets.push(i + debut + gap - count);
} else if (struc[i] == ']') {
Link l;
l.nts.first = crochets.top();
l.nts.second = i + debut + gap - count;
vec.push_back(l);
crochets.pop();
} else if (struc[i] == '{') {
accolades.push(i + debut + gap - count);
} else if (struc[i] == '}') {
Link l;
l.nts.first = accolades.top();
l.nts.second = i + debut + gap - count;
vec.push_back(l);
accolades.pop();
} else if (struc[i] == '<') {
chevrons.push(i + debut + gap - count);
} else if (struc[i] == '>') {
Link l;
l.nts.first = chevrons.top();
l.nts.second = i + debut + gap - count;
vec.push_back(l);
chevrons.pop();
} else if (struc[i] == '&') {
count ++;
gap += v[count].pos.first - v[count - 1].pos.second - 1;
}
}
return vec;
}
uint find_max_occurrences(string& filepath) {
uint max = 0;
std::ifstream in = std::ifstream(filepath);
json js = json::parse(in);
string contacts_id;
for(auto it = js.begin(); it != js.end(); ++it) {
contacts_id = it.key();
for(auto it2 = js[contacts_id].begin(); it2 != js[contacts_id].end(); ++it2) {
string test = it2.key();
if (!test.compare("occurences")) {
uint occ = it2.value();
if (occ > max) {
max = occ;
}
}
}
}
return max;
}
uint find_max_sequence(string& filepath) {
uint max = 0;
std::ifstream in = std::ifstream(filepath);
json js = json::parse(in);
string contacts_id;
string seq;
for(auto it = js.begin(); it != js.end(); ++it) {
contacts_id = it.key();
for(auto it2 = js[contacts_id].begin(); it2 != js[contacts_id].end(); ++it2) {
string test = it2.key();
if (!test.compare("sequence")) {
seq = it2.value();
uint size = seq.size();
if (size > max) {
max = size;
}
}
}
}
return max;
}
vector<string> find_components(string& sequence, string delimiter) {
vector<string> list;
string seq = sequence;
string subseq;
uint fin = 0;
while(seq.find(delimiter) != string::npos) {
fin = seq.find(delimiter);
subseq = seq.substr(0, fin);
seq = seq.substr(fin + 1);
list.push_back(subseq); // new component sequence
}
if (!seq.empty()) {
list.push_back(seq);
}
return list;
}
vector<uint> find_contacts(vector<string>& struc2d, vector<Component>& v) {
vector<uint> positions;
string delimiter = "*";
uint debut;
for (uint i = 0; i < v.size(); i++) {
debut = v[i].pos.first;
size_t pos = struc2d[i].find(delimiter, 0); // size_t, so that the comparison with string::npos is meaningful
while(pos != string::npos)
{
positions.push_back(pos + debut);
pos = struc2d[i].find(delimiter, pos+1);
}
}
return positions;
}
bool is_desc_insertible(const string& descfile, const string& rna)
{
std::ifstream motif;
......@@ -484,17 +670,9 @@ vector<vector<Component>> find_next_ones_in(string rna, uint offset, vector<stri
if (regex_search(rna, c)) {
if (vc.size() > 2) {
next_seqs = vector<string>(&vc[1], &vc[vc.size()]);
/*for (uint i = 0; i < next_seqs.size(); i++) {
std::cout << "next seq: " << next_seqs[i] << endl;
}
std::cout << endl;*/
}
else {
next_seqs = vector<string>(1, vc.back());
/*for (uint i = 0; i < next_seqs.size(); i++) {
std::cout << "next seq: " << next_seqs[i] << endl;
}
std::cout << endl;*/
}
uint j = 0;
// For every regexp match
......@@ -551,25 +729,12 @@ vector<vector<Component>> find_next_ones_in(string rna, uint offset, vector<stri
return results;
}
uint count_pairing(string struc2d) {
uint count = 0;
for(uint i = 0; i < struc2d.size(); i++) {
if (struc2d[i] == '(' || struc2d[i] == '[' || struc2d[i] == '<' || struc2d[i] == '{') {
count++;
}
}
//cout << struc2d << ": " << count << endl;
return count;
}
vector<vector<Component>> json_find_next_ones_in(string rna, uint offset, vector<string>& vc, vector<string>& vs)
vector<vector<Component>> json_find_next_ones_in(string rna, uint offset, vector<string>& vc)
{
pair<uint, uint> pos;
uint nb_pairing;
vector<vector<Component>> results;
vector<vector<Component>> next_ones;
vector<string> next_seqs;
vector<string> next_strucs;
regex c(vc[0]);
//cout << "\t\t>Searching " << vc[0] << " in " << rna << endl;
......@@ -579,19 +744,9 @@ vector<vector<Component>> json_find_next_ones_in(string rna, uint offset, vector
if (regex_search(rna, c)) {
if (vc.size() > 2) {
next_seqs = vector<string>(&vc[1], &vc[vc.size()]);
next_strucs = vector<string>(&vs[1], &vs[vs.size()]);
/*for (uint i = 0; i < next_seqs.size(); i++) {
std::cout << "next seq: " << next_seqs[i] << endl;
}
std::cout << endl;*/
}
else {
next_seqs = vector<string>(1, vc.back());
next_strucs = vector<string>(1, vs.back());
/*for (uint i = 0; i < next_seqs.size(); i++) {
std::cout << "next seq: " << next_seqs[i] << endl;
}
std::cout << endl;*/
}
uint j = 0;
// For every regexp match
......@@ -599,7 +754,6 @@ vector<vector<Component>> json_find_next_ones_in(string rna, uint offset, vector
smatch match = *i;
pos.first = match.position() + offset;
pos.second = pos.first + match.length() - 1;
nb_pairing = count_pairing(vs[0]);
//cout << "\t\t>Inserting " << vc[j] << " in [" << pos.first << ',' << pos.second << "]" << endl;
// +5 because HL < 3 pbs but not for CaRNAval or Contacts
......@@ -609,7 +763,7 @@ vector<vector<Component>> json_find_next_ones_in(string rna, uint offset, vector
continue;
}
next_ones = json_find_next_ones_in(rna.substr(pos.second - offset + 2), pos.second + 2, next_seqs, next_strucs);
next_ones = json_find_next_ones_in(rna.substr(pos.second - offset + 2), pos.second + 2, next_seqs);
if (!next_ones.size()) {
// cout << "\t\t... but we cannot place the next components : Ignored.2" << endl;
continue;
......@@ -620,7 +774,7 @@ vector<vector<Component>> json_find_next_ones_in(string rna, uint offset, vector
// Combine the match for this component pos with the combination
// of next_ones as a whole solution
vector<Component> r;
r.push_back(Component(pos, nb_pairing));
r.push_back(Component(pos));
for (Component& c : v) r.push_back(c);
results.push_back(r);
}
......@@ -635,13 +789,12 @@ vector<vector<Component>> json_find_next_ones_in(string rna, uint offset, vector
smatch match = *i;
pos.first = match.position() + offset;
pos.second = pos.first + match.length() - 1;
nb_pairing = count_pairing(vs[0]);
//cout << "\t\t>Inserting " << vc[0] << " in [" << pos.first << ',' << pos.second << "]" << endl;
// Create a vector of component with one component for that match
vector<Component> r;
r.push_back(Component(pos, nb_pairing));
r.push_back(Component(pos));
results.push_back(r);
}
}
......
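Review note: `build_motif_pairs` above calls `top()` on its stacks without checking that they are non-empty, so it relies on the structure having been validated beforehand. The sketch below shows the same stack-based conversion with that guard built in. It is only an illustration: `dot_bracket_pairs` is a hypothetical helper that is not part of this merge request, and it returns positions local to the motif (ignoring `&`) instead of positions shifted onto the RNA as the real function does.

```cpp
#include <stack>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the merge request.
// Fills `pairs` with (opening, closing) positions; '&' does not count as a position.
// Returns false if a closing bracket has no partner or an opening one is never closed.
bool dot_bracket_pairs(const std::string& struc,
                       std::vector<std::pair<unsigned, unsigned>>& pairs)
{
    const std::string openers = "([{<";
    const std::string closers = ")]}>";
    std::stack<unsigned> open[4];   // one stack per bracket family
    unsigned pos = 0;               // position in the motif, separators excluded
    pairs.clear();
    for (char c : struc) {
        if (c == '&') continue;     // component separator: skip without advancing
        std::size_t k = openers.find(c);
        if (k != std::string::npos) {
            open[k].push(pos);
        } else if ((k = closers.find(c)) != std::string::npos) {
            if (open[k].empty()) return false;   // guard before top()
            pairs.emplace_back(open[k].top(), pos);
            open[k].pop();
        }
        ++pos;                      // any other symbol (e.g. '.') is unpaired
    }
    for (const auto& s : open)
        if (!s.empty()) return false;
    return true;
}
```

For `"((.&.))"` this yields the pairs (1,4) and (0,5), in the order the closing brackets are read.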
......@@ -20,13 +20,7 @@ typedef struct Comp_ {
pair<uint, uint> pos;
size_t k;
string seq_;
uint nb_pairing;
Comp_(pair<int, int> p) : pos(p) { k = 1 + pos.second - pos.first; }
Comp_(pair<int, int> p, uint nb_pair) : pos(p)
{
k = 1 + pos.second - pos.first;
nb_pairing = nb_pair;
}
Comp_(uint start, uint length) : k(length)
{
pos.first = start;
......@@ -64,6 +58,7 @@ class Motif
string get_identifier(void) const;
vector<Component> comp;
vector<Link> links_;
vector<uint> pos_contacts;
size_t contact_;
double tx_occurrences_;
......@@ -89,7 +84,19 @@ vector<Motif> load_csv(const string& path);
vector<Motif> load_json_folder(const string& path, const string& rna, bool verbose);
vector<vector<Component>> find_next_ones_in(string rna, uint offset, vector<string>& vc);
vector<vector<Component>> json_find_next_ones_in(string rna, uint offset, vector<string>& vc, vector<string>& vs);
vector<vector<Component>> json_find_next_ones_in(string rna, uint offset, vector<string>& vc);
// utilities for Json motifs
size_t count_nucleotide(string&);
size_t count_delimiter(string&);
size_t count_contacts(string&);
string check_motif_sequence(string);
bool checkSecondaryStructure(string);
vector<Link> build_motif_pairs(string&, vector<Component>&);
uint find_max_occurrences(string&);
uint find_max_sequence(string&);
vector<string> find_components(string&, string);
vector<uint> find_contacts(vector<string>&, vector<Component>&);
// utilities to compare secondary structures:
bool operator==(const Motif& m1, const Motif& m2);
......
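Review note: the two string utilities declared above (`find_components` and `check_motif_sequence`) can be sketched with standard algorithms only. The names `split_components` and `normalize_rna` are hypothetical and used for illustration; the behaviour is intended to mirror the functions in this merge request (split on `&`, uppercase, T replaced by U, anything outside `ACGU&` dropped), but this is a sketch, not the project's code.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical equivalent of find_components: split on a single-character delimiter.
// A trailing empty component is dropped, as in the original.
std::vector<std::string> split_components(const std::string& s, char delim = '&')
{
    std::vector<std::string> parts;
    std::size_t start = 0, end;
    while ((end = s.find(delim, start)) != std::string::npos) {
        parts.push_back(s.substr(start, end - start));
        start = end + 1;
    }
    if (start < s.size()) parts.push_back(s.substr(start));
    return parts;
}

// Hypothetical equivalent of check_motif_sequence: uppercase, T -> U,
// then remove every character that is not A, C, G, U or '&'.
std::string normalize_rna(std::string seq)
{
    for (char& c : seq) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        if (c == 'T') c = 'U';
    }
    auto is_invalid = [](char c) { return std::string("ACGU&").find(c) == std::string::npos; };
    seq.erase(std::remove_if(seq.begin(), seq.end(), is_invalid), seq.end());
    return seq;
}
```

Using the erase-remove idiom avoids the index arithmetic of a backwards loop with `erase(i, 1)`.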
#include <iostream>
#include <sstream>
#include <fstream>
#include "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/cppsrc/json.hpp"
#include <typeinfo>
#include <set>
#include <algorithm>
#include <cstdio>
#include <vector>
using namespace std;
using json = nlohmann::json;
//Return the index of the first occurrence of seq1 in seq2,
//or -1 if seq1 is not contained in seq2
int is_contains(string& seq1, string& seq2) {
size_t index = seq2.find(seq1);
if (index == string::npos) {
return -1;
}
return (int)index;
}
//If we find the sequence and structure of pattern A in pattern B, we have to concatenate the pfam lists of A and B,
//remove the duplicates, assign this new list of pfam lists to A, and assign as occurrence to A the size of this list.
void counting_occurences(const string& jsonfile, const string& jsonoutfile) {
std::ifstream lib(jsonfile);
std::ifstream lib2(jsonfile);
std::ofstream outfile (jsonoutfile);
json new_motif;
json new_id;
string delimiter = "&";
json js = json::parse(lib);
json js2 = json::parse(lib2);
//for each motif, collect its pfam lists, then look for the other motifs that include it
for (auto it = js.begin(); it != js.end(); ++it) {
string id = it.key();
string test;
uint occurrences = 0;
int fin;
string sequence;
string struc;
vector<string> composantes;
vector<string> tab_struc;
vector<vector<string>> list_pfams;
vector<vector<string>> list_pfams2;
vector<vector<string>> union_pfams;
bool is_change = false;
//cout << "id: " << id << endl;
for (auto it2 = js[id].begin(); it2 != js[id].end(); ++it2) {
string test = it2.key();
if (!test.compare("pfam")) {
vector<vector<string>> tab = it2.value();
list_pfams = tab;
/*set<set<string>>::iterator iit;
set<string>::iterator iit2;
for(iit = list_pfams.begin(); iit != list_pfams.end(); iit++) {
for (iit2 = iit->begin(); iit2 != iit->end(); ++iit2) {
cout << *iit2 << endl;
}
cout << endl << endl;
}*/
} else if (!test.compare("sequence")) {
//cout << "sequence: " << it2.value() << endl;
sequence = it2.value();
new_id[test] = it2.value();
string subseq;
while(sequence.find(delimiter) != string::npos) {
fin = sequence.find(delimiter);
subseq = sequence.substr(0, fin);
sequence = sequence.substr(fin + 1);
composantes.push_back(subseq); // new component sequence
//std::cout << "subseq: " << subseq << endl;
}
if (!sequence.empty()) {
composantes.push_back(sequence);
//std::cout << "subseq: " << seq << endl;
}
} else if (!test.compare("struct2d")) {
//cout << "struct2d: " << it2.value() << endl;
struc = it2.value();
new_id[test] = it2.value();
string subseq;
while(struc.find(delimiter) != string::npos) {
fin = struc.find(delimiter);
subseq = struc.substr(0, fin);
struc = struc.substr(fin + 1);
tab_struc.push_back(subseq); // new component sequence
//std::cout << "subseq: " << subseq << endl;
}
if (!struc.empty()) {
tab_struc.push_back(struc);
//std::cout << "subseq: " << seq << endl;
}
}
else if (!test.compare("occurences") ) {
occurrences = it2.value();
} else {
new_id[test] = it2.value();
}
}
//cout << "-------begin---------" << endl;
for (auto it3 = js2.begin(); it3 != js2.end(); ++it3) {
string id2 = it3.key();
string sequence2, struc2;
vector<string> composantes2;
vector<string> tab_struc2;
int occurences2;
int fin;
//cout << "id: " << id << " / id2: " << id2 << endl;
for (auto it4 = js2[id2].begin(); it4 != js2[id2].end(); ++it4) {
string test = it4.key();
if (id != id2) {
if (!test.compare("pfam")) {
vector<vector<string>> tab = it4.value();
list_pfams2 = tab;
/*for (uint k = 0; k < tab2.size(); k++) {
for (uint l = 0; l < tab2[k].size(); l++) {
pfams2.insert(tab2[k][l]);
}
list_pfams2.insert(pfams);
pfams2.clear();
}*/
/*set<set<string>>::iterator iit;
set<string>::iterator iit2;
for(iit = list_pfams.begin(); iit != list_pfams.end(); iit++) {
for (iit2 = iit->begin(); iit2 != iit->end(); ++iit2) {
cout << *iit2 << endl;
}
cout << endl << endl;
}*/
} else if (!test.compare("occurences")) {
occurences2 = it4.value();
//cout << "occurences2: "<< occurences2 << endl;
} else if (!test.compare("sequence")) {
sequence2 = it4.value();
string subseq;
while(sequence2.find(delimiter) != string::npos) {
fin = sequence2.find(delimiter);
subseq = sequence2.substr(0, fin);
sequence2 = sequence2.substr(fin + 1);
composantes2.push_back(subseq); // new component sequence
//std::cout << "subseq: " << subseq << endl;
}
if (!sequence2.empty()) {
composantes2.push_back(sequence2);
//std::cout << "subseq: " << seq << endl;
}
} else if (!test.compare("struct2d")) {
struc2 = it4.value();
string subseq;
while(struc2.find(delimiter) != string::npos) {
fin = struc2.find(delimiter);
subseq = struc2.substr(0, fin);
struc2 = struc2.substr(fin + 1);
tab_struc2.push_back(subseq); // new component sequence
//std::cout << "subseq: " << subseq << endl;
}
if (!struc2.empty()) {
tab_struc2.push_back(struc2);
//std::cout << "subseq: " << seq << endl;
}
uint number = 0;
int tab[composantes.size()];
for (uint ii = 0; ii < composantes.size(); ii++) {
//cout << "tab[" << ii << "]: " << tab[ii] << endl;
tab[ii] = 0;
}
//flag is true if the first component is found or if the k component is indeed placed after the k-1 component
//It checks if the found components are in the correct order
for (uint k = 0; k < composantes.size() ; k++) {
bool flag = false;
for (uint l = 0; l < composantes2.size(); l++) {
int test1 = is_contains(composantes[k], composantes2[l]);
int test2 = is_contains(tab_struc[k], tab_struc2[l]);
if (test1 == test2 && test1 != -1 && test2 != -1) {
if(!flag) {
if (k == 0 || test1 + composantes[k].size() > tab[k-1]) {
tab[k] = test1 + composantes[k].size();
flag = true;
}
}
}
//cout << "----end----" << endl;
//}
}
if(flag) {
number++;
}
}
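// If number equals the number of components of the motif, the motif is included in the other one.
// So we add to its pfam lists those of the other motif that it does not already have (union of the two lists).
// Note: std::set_difference expects both input ranges to be sorted.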
if(number == composantes.size()) {
cout << "id: " << id << " / id2: " << id2 << endl;
vector<vector<string>> add_pfams;
std::set_difference(list_pfams2.begin(), list_pfams2.end(), list_pfams.begin(), list_pfams.end(),
std::inserter(add_pfams, add_pfams.begin()));
list_pfams.insert(list_pfams.begin(), add_pfams.begin(), add_pfams.end());
cout << "size: " << list_pfams.size() << endl;
add_pfams.clear();
is_change = true;
}
}
}
}
//cout << endl;*/
}
/*for(uint ii = 0; ii < list_pfams.size(); ii++) {
for (uint jj = 0; jj < list_pfams[ii].size(); jj++) {
cout << "[" << ii << "][" << jj << "]: " << list_pfams[ii][jj] << endl;
}
}*/
new_id["occurences"] = list_pfams.size();
new_id["pfam"] = list_pfams;
//cout << "-------ending---------" << endl;
new_motif[id] = new_id;
new_id.clear();
//cout << "valeur: " << ite << endl;
/*for (uint i = 0; i < tab_struc.size() ; i++) {
cout << "tab_struc[" << i << "]: " << tab_struc[i] << endl << endl;
} */
}
outfile << new_motif.dump(4) << endl;
outfile.close();
}
int main()
{
//183
//cout << "------------------BEGIN-----------------" << endl;
string jsonfile = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/Motifs_version_initiale/motifs_06-06-2021.json";
string out = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/Motifs_derniere_version/motifs_final.json";
counting_occurences(jsonfile, out);
//cout << "------------------END-----------------" << endl;
return 0;
}
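Review note: the merge step in `counting_occurences` combines `std::set_difference` with `insert`, and `std::set_difference` requires sorted input ranges. A minimal sketch of the same union with explicit sorting is given below. `PfamList` and `merge_pfam_lists` are hypothetical names, and the identifiers used in the usage note are placeholders, not real Pfam accessions.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

using PfamList = std::vector<std::string>;   // hypothetical alias for one pfam list

// Hypothetical helper: union of two collections of pfam lists, without duplicates.
// Both arguments are taken by value so they can be sorted locally.
std::vector<PfamList> merge_pfam_lists(std::vector<PfamList> a, std::vector<PfamList> b)
{
    std::sort(a.begin(), a.end());           // vectors of strings compare lexicographically
    std::sort(b.begin(), b.end());
    std::vector<PfamList> merged;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(merged));
    // set_union keeps repeated elements of a single input, so deduplicate afterwards
    merged.erase(std::unique(merged.begin(), merged.end()), merged.end());
    return merged;
}
```

With `a = {{"FAM_B"}, {"FAM_C"}}` and `b = {{"FAM_C"}, {"FAM_A"}}`, the result holds three lists, and its size would be the new occurrence count.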
#include <iostream>
#include <sstream>
#include <fstream>
#include "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/cppsrc/json.hpp"
#include <typeinfo>
#include <set>
#include <algorithm>
#include <cstdio>
#include <vector>
using namespace std;
using json = nlohmann::json;
void delete_redundant_pdb(const string& jsonfile, const string& jsontest, const string& jsonoutfile) {
std::ifstream lib(jsonfile);
std::ifstream lib2(jsontest);
std::ofstream outfile (jsonoutfile);
json new_motif;
json new_id;
json js = json::parse(lib);
json js2 = json::parse(lib2);
//for each motif of the library, check whether it shares a PDB with a motif of the test set
for (auto it = js.begin(); it != js.end(); ++it) {
string id = it.key();
vector<string> list_pdbs;
vector<string> list_pdbs2;
bool is_added = true;
//cout << "id: " << id << endl;
for (auto it2 = js[id].begin(); it2 != js[id].end(); ++it2) {
string test = it2.key();
if (!test.compare("pdb")) {
vector<string> tab = it2.value();
list_pdbs = tab;
/*set<set<string>>::iterator iit;
set<string>::iterator iit2;
for(iit = list_pfams.begin(); iit != list_pfams.end(); iit++) {
for (iit2 = iit->begin(); iit2 != iit->end(); ++iit2) {
cout << *iit2 << endl;
}
cout << endl << endl;
}*/
} else {
new_id[test] = it2.value();
}
}
//cout << "-------begin---------" << endl;
for (auto it3 = js2.begin(); it3 != js2.end(); ++it3) {
string id2 = it3.key();
//cout << "id: " << id << " / id2: " << id2 << endl;
for (auto it4 = js2[id2].begin(); it4 != js2[id2].end(); ++it4) {
string test = it4.key();
if (!test.compare("pdb")) {
vector<string> tab = it4.value();
list_pdbs2 = tab;
//cout << id << " / " << id2 << endl;
for (uint k = 0; k < list_pdbs2.size(); k++) {
if (count(list_pdbs.begin(), list_pdbs.end(), list_pdbs2[k])) {
is_added = false;
}
//cout << list_pdbs2[k] << endl;
}
}
}
//cout << endl;*/
}
/*for(uint ii = 0; ii < list_pfams.size(); ii++) {
for (uint jj = 0; jj < list_pfams[ii].size(); jj++) {
cout << "[" << ii << "][" << jj << "]: " << list_pfams[ii][jj] << endl;
}
}*/
if (is_added) {
new_id["pdb"] = list_pdbs;
new_motif[id] = new_id;
}
new_id.clear();
//cout << "valeur: " << ite << endl;
/*for (uint i = 0; i < tab_struc.size() ; i++) {
cout << "tab_struc[" << i << "]: " << tab_struc[i] << endl << endl;
} */
}
outfile << new_motif.dump(4) << endl;
outfile.close();
}
int main()
{
string jsonfile = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/Motifs_version_initiale/bibli_test2.json";
string jsontest = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/Motifs_version_initiale/benchmark_test.json";
string out = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/Motifs_derniere_version/motifs_final_test.json";
delete_redundant_pdb(jsonfile, jsontest, out);
return 0;
}
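Review note: the redundancy test in `delete_redundant_pdb` scans the library motif's PDB list once per benchmark PDB. If the benchmark PDB identifiers are gathered once into a set, the check becomes a single lookup per identifier. The sketch below assumes that set has already been built; `shares_pdb` is a hypothetical name, and the identifiers in the usage note are placeholders.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: true if at least one PDB of the motif also appears
// in the benchmark, in which case the motif would be left out of the output.
bool shares_pdb(const std::vector<std::string>& motif_pdbs,
                const std::unordered_set<std::string>& benchmark_pdbs)
{
    for (const std::string& id : motif_pdbs)
        if (benchmark_pdbs.count(id)) return true;
    return false;
}
```

For example, with a benchmark set `{"PDB_1", "PDB_2"}`, a motif listing `{"PDB_9", "PDB_2"}` is redundant, while one listing only `{"PDB_9"}` is kept.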
......@@ -3,11 +3,13 @@
#include <algorithm>
#include <boost/format.hpp>
#define RESET "\033[0m"
#define RED "\033[31m" /* Red */
using std::abs;
using std::cout;
using std::endl;
SecondaryStructure::SecondaryStructure() {}
......@@ -98,6 +100,26 @@ string SecondaryStructure::to_DBN(void) const
return res;
}
string structure_with_contacts(const SecondaryStructure& ss) {
string sequence = ss.rna_.get_seq();
string construct = "";
bool flag;
for (uint i = 0; i < sequence.size(); i++) {
flag = false;
for (const Motif& m : ss.motif_info_) {
for (uint j = 0; j < m.pos_contacts.size(); j++) {
if (m.pos_contacts[j] == i) flag = true;
}
}
if (flag) {
construct += "*";
} else {
construct += ".";
}
}
return construct;
}
string SecondaryStructure::to_string(void) const
{
string s;
......@@ -119,13 +141,35 @@ void SecondaryStructure::set_basepair(uint i, uint j)
void SecondaryStructure::insert_motif(const Motif& m) { motif_info_.push_back(m); }
void colored_contacts(string sequence, vector<Motif> motif_info_) {
bool flag;
for (uint i = 0; i < sequence.size(); i++) {
flag = false;
for (const Motif& m : motif_info_) {
for (uint j = 0; j < m.pos_contacts.size(); j++) {
if (m.pos_contacts[j] == i) flag = true;
}
}
if (flag) {
cout << RED << sequence[i] << RESET;
} else {
cout << sequence[i];
}
}
}
void SecondaryStructure::print(void) const
{
cout << endl;
cout << '\t' << rna_.get_seq() << endl;
cout << '\t' << to_string() << endl;
cout << '\t';
colored_contacts(rna_.get_seq(), motif_info_);
//rna_.get_seq()
cout << endl;
string ss = to_string();
cout << '\t';
colored_contacts(ss, motif_info_);
//cout << ss;
cout << endl;
for (const Motif& m : motif_info_) {
uint i = 0;
cout << '\t';
......
......@@ -30,7 +30,6 @@ class SecondaryStructure
string to_DBN() const;
string to_string() const;
vector<double> objective_scores_; // values of the different objective functions for that SecondaryStructure
vector<pair<uint, uint>> basepairs_; // values of the decision variable of the integer program
vector<Motif> motif_info_; // information about known motives in this secondary structure and their positions
......@@ -58,5 +57,7 @@ inline void SecondaryStructure::set_objective_score(int i, double s) { objecti
inline uint SecondaryStructure::get_n_motifs(void) const { return motif_info_.size(); }
inline uint SecondaryStructure::get_n_bp(void) const { return nBP_; }
string structure_with_contacts(const SecondaryStructure& ss);
#endif // SECONDARY_STRUCTURE_
\ No newline at end of file
......
/***
Biorseo, Louis Becquey, nov 2018-August 2020
Special thanks to Lénaic Durand for working a lot on version 2.0
louis.becquey@univ-evry.fr
Biorseo, Louis Becquey, nov 2018-nov 2021
Special thanks to Lénaic Durand and Nathalie Bernard for working a lot on version 2.0
louis.becquey@univ-evry.fr (likely expired email account, find me on the internet)
***/
#include <cmath>
......@@ -22,236 +22,236 @@ namespace po = boost::program_options;
string remove_ext(const char* mystr, char dot, char sep)
{
// Adapted from a Stack Overflow snippet; rewritten with std::string so that
// nothing is leaked and a null input yields an empty string.
if (mystr == nullptr) return string();
string ret(mystr);
size_t lastdot = ret.rfind(dot);
size_t lastsep = (sep == 0) ? string::npos : ret.rfind(sep);
// Strip the extension only when the last dot comes after the last path separator.
if (lastdot != string::npos && (lastsep == string::npos || lastsep < lastdot)) ret.erase(lastdot);
return ret;
}
int main(int argc, char* argv[])
{
/* VARIABLE DECLARATIONS */
string inputName, outputName, motifs_path_name, basename;
bool verbose = false;
float theta_p_threshold;
char obj_function_nbr = 'B';
list<Fasta> f;
ofstream outfile;
SecondaryStructure bestSSO1, bestSSO2;
RNA myRNA;
/* ARGUMENT CHECKING */
po::options_description desc("Options");
desc.add_options()
("help,h", "Print the help message")
("version", "Print the program version")
("seq,s", po::value<string>(&inputName)->required(), "Fasta file containing the RNA sequence")
("descfolder,d", po::value<string>(&motifs_path_name), "A folder containing modules in .desc format, as produced by Djelloul & Denise's catalog program")
("rinfolder,x", po::value<string>(&motifs_path_name), "A folder containing CaRNAval's RINs in .txt format, as produced by script transform_caRNAval_pickle.py")
("jsonfolder,a", po::value<string>(&motifs_path_name), "A folder containing the motif library of Isaure in .json format")
("jar3dcsv,j", po::value<string>(&motifs_path_name), "A file containing the output of JAR3D's search for motifs in the sequence, as produced by biorseo.py")
("bayespaircsv,b", po::value<string>(&motifs_path_name), "A file containing the output of BayesPairing's search for motifs in the sequence, as produced by biorseo.py")
("first-objective,c", po::value<unsigned int>(&MOIP::obj_to_solve_)->default_value(1), "Objective to solve in the mono-objective portions of the algorithm")
("output,o", po::value<string>(&outputName), "A file to summarize the computation results")
("theta,t", po::value<float>(&theta_p_threshold)->default_value(0.001), "Pairing probability threshold to consider or not the possibility of pairing")
("function,f", po::value<char>(&obj_function_nbr)->default_value('B'), "What objective function to use to include motifs: square of motif size in nucleotides like "
"RNA-MoIP (A), light motif size + high number of components (B), site score (C), light motif size + site score + high number of components (D)")
("disable-pseudoknots,n", "Add constraints forbidding the formation of pseudoknots")
("limit,l", po::value<unsigned int>(&MOIP::max_sol_nbr_)->default_value(500), "Intermediate number of solutions in the Pareto set above which we give up the calculation.")
("verbose,v", "Print what is happening to stdout");
po::variables_map vm;
po::store(po::parse_command_line(argc, argv, desc), vm);
basename = remove_ext(inputName.c_str(), '.', '/');
//theta_p_threshold = 0.01;
try {
po::store(po::parse_command_line(argc, argv, desc), vm); // can throw
if (vm.count("help") or vm.count("-h")) {
cout << "Biorseo, bi-objective integer linear programming framework to predict RNA secondary "
"structures by including known RNA modules."
<< endl
<< "developed by Louis Becquey (louis.becquey@univ-evry.fr), 2018-2020" << endl
<< endl
<< desc << endl;
return EXIT_SUCCESS;
}
if (vm.count("version")) {
cout << "Biorseo v2.0, dockerized, August 2020" << endl;
return EXIT_SUCCESS;
}
if (vm.count("verbose")) verbose = true;
if (vm.count("disable-pseudoknots")) MOIP::allow_pk_ = false;
if (!vm.count("jar3dcsv") and !vm.count("bayespaircsv") and !vm.count("descfolder") and !vm.count("rinfolder") and !vm.count("jsonfolder")) {
cerr << "\033[31mYou must provide at least one of --descfolder, --rinfolder, --jar3dcsv or --bayespaircsv.\033[0m See --help "
"for more information."
<< endl;
return EXIT_FAILURE;
}
if ((vm.count("-d") or vm.count("-x")) and (obj_function_nbr == 'C' or obj_function_nbr == 'D')) {
cerr << "\033[31mYou must provide --jar3dcsv or --bayespaircsv to use --function C or --function D.\033[0m See "
"--help for more information."
<< endl;
return EXIT_FAILURE;
}
po::notify(vm); // throws on error, so do after help in case there are any problems
} catch (po::error& e) {
cerr << "ERROR: \033[31m" << e.what() << "\033[0m" << endl;
cerr << desc << endl;
return EXIT_FAILURE;
}
MOIP::obj_function_nbr_ = obj_function_nbr;
/* FILE PARSING */
// load fasta file
if (verbose) cout << "Reading input files..." << endl;
if (access(inputName.c_str(), F_OK) == -1) {
cerr << "\033[31m" << inputName << " not found\033[0m" << endl;
return EXIT_FAILURE;
}
Fasta::load(f, inputName.c_str());
list<Fasta>::iterator fa = f.begin();
if (verbose) cout << "loading " << fa->name() << "..." << endl;
myRNA = RNA(fa->name(), fa->seq(), verbose);
if (verbose) cout << "\t> " << inputName << " successfully loaded (" << myRNA.get_RNA_length() << " nt)" << endl;
// load CSV file
if (access(motifs_path_name.c_str(), F_OK) == -1) {
cerr << "\033[31m" << motifs_path_name << " not found\033[0m" << endl;
return EXIT_FAILURE;
}
/* FIND PARETO SET */
string source;
if (vm.count("jar3dcsv"))
source = "jar3dcsv";
else if (vm.count("bayespaircsv"))
source = "bayespaircsv";
else if (vm.count("rinfolder"))
source = "rinfolder";
else if (vm.count("descfolder"))
source = "descfolder";
else
source = "jsonfolder";
MOIP myMOIP = MOIP(myRNA, source, motifs_path_name.c_str(), theta_p_threshold, verbose);
double min, max;
IloConstraintArray F(myMOIP.get_env());
// return 0;
if (verbose)
cout << "Solving..." << endl;
try {
bestSSO1 = myMOIP.solve_objective(1, -__DBL_MAX__, __DBL_MAX__);
if (verbose) cout << endl;
bestSSO2 = myMOIP.solve_objective(2, -__DBL_MAX__, __DBL_MAX__);
if (verbose) {
cout << endl << "Best solution according to objective 1 :" << bestSSO1.to_string() << endl;
cout << "Best solution according to objective 2 :" << bestSSO2.to_string() << endl;
}
// extend the Pareto set on top
if (MOIP::obj_to_solve_ == 1) {
myMOIP.add_solution(bestSSO1);
min = bestSSO1.get_objective_score(2) + MOIP::precision_;
max = bestSSO2.get_objective_score(2);
if (verbose) cout << endl << "Solving obj1 on top of best solution 1." << endl;
} else {
myMOIP.add_solution(bestSSO2);
min = bestSSO2.get_objective_score(1) + MOIP::precision_;
max = bestSSO1.get_objective_score(1);
if (verbose) cout << endl << "Solving obj2 on top of best solution 2." << endl;
}
if (verbose)
cout << setprecision(-log10(MOIP::precision_) + 4) << "\nSolving objective function " << MOIP::obj_to_solve_ << ", on top of "
<< min << ": Obj" << 3 - MOIP::obj_to_solve_ << " being in [" << min << ", " << max << "]..." << endl;
myMOIP.search_between(min, max);
// extend the Pareto set below
if (MOIP::obj_to_solve_ == 1) {
if (verbose) cout << endl << "Solving obj1 below best solution 1." << endl;
min = -__DBL_MAX__;
max = bestSSO1.get_objective_score(2);
} else {
if (verbose) cout << endl << "Solving obj2 below best solution 2." << endl;
min = -__DBL_MAX__;
max = bestSSO2.get_objective_score(1);
}
if (verbose)
cout << setprecision(-log10(MOIP::precision_) + 4) << "\nSolving objective function " << MOIP::obj_to_solve_
<< ", below (or eq. to) " << max << ": Obj" << 3 - MOIP::obj_to_solve_ << " being in [" << min << ", "
<< max << "]..." << endl
<< "\t> Forbidding " << F.getSize() << " solutions found in [" << setprecision(10)
<< min - MOIP::precision_ << ", " << max + MOIP::precision_ << ']' << endl;
myMOIP.search_between(min, max);
} catch (IloCplex::Exception& e) {
cerr << "\033[31mCplex Exception: " << e.getMessage() << "\033[0m" << endl;
exit(EXIT_FAILURE);
}
/* DISPLAY RESULTS */
// print the pareto set
if (verbose) {
cout << endl << endl << "---------------------------------------------------------------" << endl;
cout << "Whole Pareto Set:" << endl;
for (uint i = 0; i < myMOIP.get_n_solutions(); i++) myMOIP.solution(i).print();
cout << endl;
cout << myMOIP.get_n_candidates() << " candidate insertion sites, " << myMOIP.get_n_solutions() << " solutions kept." << endl;
cout << "Best value for Motif insertion objective: " << bestSSO1.get_objective_score(1) << endl;
cout << "Best value for structure expected accuracy: " << bestSSO2.get_objective_score(2) << endl;
}
// Save it to file
if (vm.count("output")) {
if (verbose) cout << "Saving structures to " << outputName << "..." << endl;
outfile.open(outputName);
outfile << fa->name() << endl << fa->seq() << endl;
//cout << "----struc----" << endl << myMOIP.solution(0).to_string() << endl;
for (uint i = 0; i < myMOIP.get_n_solutions(); i++) outfile << myMOIP.solution(i).to_string() << endl;
outfile.close();
}
/* QUIT */
return EXIT_SUCCESS;
/* VARIABLE DECLARATIONS */
string inputName, outputName, motifs_path_name, basename;
bool verbose = false;
float theta_p_threshold;
char obj_function_nbr;
char mea_or_mfe = 'b'; // a for MFE, b for MEA
list<Fasta> f;
ofstream outfile;
SecondaryStructure bestSSO1, bestSSO2;
RNA myRNA;
/* ARGUMENT CHECKING */
po::options_description desc("Options");
desc.add_options()
("help,h", "Print the help message")
("version", "Print the program version")
("seq,s", po::value<string>(&inputName)->required(), "Fasta file containing the RNA sequence")
("descfolder,d", po::value<string>(&motifs_path_name), "A folder containing modules in .desc format, as produced by Djelloul & Denise's catalog program (deprecated)")
("rinfolder,r", po::value<string>(&motifs_path_name), "A folder containing CaRNAval's RINs in .txt format, as produced by script transform_caRNAval_pickle.py")
("jsonfolder,j", po::value<string>(&motifs_path_name), "A folder containing a custom motif library in .json format")
("pre-placed,x", po::value<string>(&motifs_path_name), "A CSV file providing motif insertion sites obtained with another tool.")
("function,f", po::value<char>(&obj_function_nbr)->default_value('B'),
"(A, B, C, D, E or F) Objective function to score module insertions:\n"
" (A) insert big modules\n (B) light, high-order modules\n"
" (C) well-scored modules\n (D) light, high-order, well-scored\n modules\n"
" (E, F) insert big modules with many\n contacts with proteins, different\n ponderations.\n"
" C and D require position scores\n provided by --pre-placed.\n"
" E and F require protein-contact\n information and should be\n used only with --jsonfolder.")
("mfe,E", "Minimize stacking energies\n (leads to MFE estimator)")
("mea,A", "(default) Maximize expected accuracy\n (leads to MEA estimator)")
("first-objective,c", po::value<unsigned int>(&MOIP::obj_to_solve_)->default_value(2),
"(1 or 2) Objective to solve in the mono-objective portions of the algorithm.\n"
" (1) is the module objective,\n given by --function\n"
" (2) is energy-based objective,\n either MFE or MEA")
("output,o", po::value<string>(&outputName), "A file to summarize the computation results")
("theta,t", po::value<float>(&theta_p_threshold)->default_value(1e-3, "0.001"), "Pairing probability threshold to consider or not the possibility of pairing")
("disable-pseudoknots,n", "Add constraints forbidding the formation of pseudoknots")
("limit,l", po::value<unsigned int>(&MOIP::max_sol_nbr_)->default_value(500), "Intermediate number of solutions in the Pareto set above which we give up the calculation.")
("verbose,v", "Print what is happening to console");
po::variables_map vm;
basename = remove_ext(inputName.c_str(), '.', '/');
try {
po::store(po::parse_command_line(argc, argv, desc), vm); // can throw
if (vm.count("help") or vm.count("-h")) {
cout << "Biorseo, Bi-Objective RNA Structure Efficient Optimizer" << endl
<< "Bi-objective integer linear programming framework to predict RNA secondary structures by including known RNA modules." << endl
<< "Developed by Louis Becquey, 2018-2021\nLénaïc Durand, 2019\nNathalie Bernard, 2021" << endl << endl
<< "Usage:\tYou must provide:\n\t1) a FASTA input file with -s," << endl
<< "\t2) a module type with --descfolder, --rinfolder, --jsonfolder or --pre-placed," << endl
<< "\t3) one module-based scoring function with --function A, B, C, D, E or F," << endl
<< "\t4) one energy-based scoring function with --mfe or --mea," << endl
<< "\t5) how to display results: in console (-v), or in a result file (-o)." << endl
<< endl
<< desc << endl;
return EXIT_SUCCESS;
}
if (vm.count("version")) {
cout << "Biorseo v2.1, dockerized, November 2021" << endl;
return EXIT_SUCCESS;
}
po::notify(vm); // throws on error, so do after help in case there are any problems
if (vm.count("mfe")) mea_or_mfe = 'a';
if (vm.count("mea")) mea_or_mfe = 'b';
if (vm.count("verbose")) verbose = true;
if (vm.count("disable-pseudoknots")) MOIP::allow_pk_ = false;
} catch (po::error& e) {
cerr << "ERROR: \033[31m" << e.what() << "\033[0m" << endl;
cerr << desc << endl;
return EXIT_FAILURE;
}
MOIP::obj_function_nbr_ = obj_function_nbr;
MOIP::obj_function2_nbr_ = mea_or_mfe;
/* FILE PARSING */
// load fasta file
if (verbose) cout << "Reading input files..." << endl;
if (access(inputName.c_str(), F_OK) == -1) {
cerr << "\033[31m" << inputName << " not found\033[0m" << endl;
return EXIT_FAILURE;
}
Fasta::load(f, inputName.c_str());
list<Fasta>::iterator fa = f.begin();
if (verbose) cout << "loading " << fa->name() << "..." << endl;
myRNA = RNA(fa->name(), fa->seq(), verbose);
if (verbose) cout << "\t> " << inputName << " successfully loaded (" << myRNA.get_RNA_length() << " nt)" << endl;
// check motif folder exists
if (access(motifs_path_name.c_str(), F_OK) == -1) {
cerr << "\033[31m" << motifs_path_name << " not found\033[0m" << endl;
return EXIT_FAILURE;
}
/* FIND PARETO SET */
string source;
if (vm.count("rinfolder"))
source = "rinfolder";
else if (vm.count("descfolder"))
source = "descfolder";
else if (vm.count("jsonfolder"))
source = "jsonfolder";
else if (vm.count("pre-placed"))
source = "csvfile";
else {
cerr << "ERR: no source of modules provided !" << endl;
return EXIT_FAILURE;
}
MOIP myMOIP = MOIP(myRNA, source, motifs_path_name.c_str(), theta_p_threshold, verbose);
double min, max;
IloConstraintArray F(myMOIP.get_env());
if (verbose)
cout << "Solving..." << endl;
try {
bestSSO1 = myMOIP.solve_objective(1, -__DBL_MAX__, __DBL_MAX__);
if (verbose) cout << endl;
bestSSO2 = myMOIP.solve_objective(2, -__DBL_MAX__, __DBL_MAX__);
if (verbose) {
cout << endl << "Best solution according to objective 1 :" << bestSSO1.to_string() << endl;
cout << "Best solution according to objective 2 :" << bestSSO2.to_string() << endl;
}
// extend the Pareto set on top
if (MOIP::obj_to_solve_ == 1) {
myMOIP.add_solution(bestSSO1);
min = bestSSO1.get_objective_score(2) + MOIP::precision_;
max = bestSSO2.get_objective_score(2) + MOIP::precision_;
if (verbose) cout << endl << "Solving obj1 on top of best solution 1." << endl;
} else {
myMOIP.add_solution(bestSSO2);
min = bestSSO2.get_objective_score(1) + MOIP::precision_;
max = bestSSO1.get_objective_score(1) + MOIP::precision_;
if (verbose) cout << endl << "Solving obj2 on top of best solution 2." << endl;
}
if (verbose)
cout << setprecision(-log10(MOIP::precision_) + 4) << "\nSolving objective function " << MOIP::obj_to_solve_ << ", on top of "
<< min << ": Obj" << 3 - MOIP::obj_to_solve_ << " being in [" << min << ", " << max << "]..." << endl;
myMOIP.search_between(min, max);
// extend the Pareto set below
if (MOIP::obj_to_solve_ == 1) {
if (verbose) cout << endl << "Solving obj1 below best solution 1." << endl;
min = -__DBL_MAX__;
max = bestSSO1.get_objective_score(2);
} else {
if (verbose) cout << endl << "Solving obj2 below best solution 2." << endl;
min = -__DBL_MAX__;
max = bestSSO2.get_objective_score(1);
}
if (verbose)
cout << setprecision(-log10(MOIP::precision_) + 4) << "\nSolving objective function " << MOIP::obj_to_solve_
<< ", below (or eq. to) " << max << ": Obj" << 3 - MOIP::obj_to_solve_ << " being in [" << min << ", "
<< max << "]..." << endl
<< "\t> Forbidding " << F.getSize() << " solutions found in [" << setprecision(10)
<< min - MOIP::precision_ << ", " << max + MOIP::precision_ << ']' << endl;
myMOIP.search_between(min, max);
} catch (IloCplex::Exception& e) {
cerr << "\033[31mCplex Exception: " << e.getMessage() << "\033[0m" << endl;
exit(EXIT_FAILURE);
}
/* DISPLAY RESULTS */
// print the pareto set
if (verbose) {
cout << endl << endl << "---------------------------------------------------------------" << endl;
cout << "Whole Pareto Set:" << endl;
for (uint i = 0; i < myMOIP.get_n_solutions(); i++) myMOIP.solution(i).print();
cout << endl;
cout << myMOIP.get_n_candidates() << " candidate insertion sites, " << myMOIP.get_n_solutions() << " solutions kept." << endl;
cout << "Best value for Motif insertion objective: " << bestSSO1.get_objective_score(1) << endl;
cout << "Best value for structure expected accuracy: " << bestSSO2.get_objective_score(2) << endl;
}
// Save it to file
if (vm.count("output")) {
if (verbose) cout << "Saving structures to " << outputName << "..." << endl;
outfile.open(outputName);
outfile << fa->name() << endl << fa->seq() << endl;
for (uint i = 0; i < myMOIP.get_n_solutions(); i++) {
outfile << myMOIP.solution(i).to_string() << endl << structure_with_contacts(myMOIP.solution(i)) << endl;
}
outfile.close();
}
/* QUIT */
return EXIT_SUCCESS;
}
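Review note: `main` collects a Pareto set over two objectives. As a reminder of what "Pareto set" means, the sketch below filters a list of score pairs down to the non-dominated ones. It is only an illustration and not the `search_between` procedure used by `MOIP`. The names `Scores` and `pareto_front` are hypothetical, and it assumes both objectives are to be maximised.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// (objective 1, objective 2); assumption: both are maximised.
using Scores = std::pair<double, double>;

// Return the points that no other point dominates. Exact duplicates are collapsed.
std::vector<Scores> pareto_front(std::vector<Scores> pts)
{
    // Sort by objective 1 descending, ties broken by objective 2 descending.
    std::sort(pts.begin(), pts.end(), [](const Scores& a, const Scores& b) {
        return a.first != b.first ? a.first > b.first : a.second > b.second;
    });
    std::vector<Scores> front;
    for (const Scores& p : pts) {
        // Every point already kept has objective 1 >= p.first, so p survives
        // only if it strictly improves objective 2 over the last kept point.
        if (front.empty() || p.second > front.back().second) front.push_back(p);
    }
    return front;
}
```

The result is ordered by decreasing objective 1 and increasing objective 2, which is the usual shape of a two-objective trade-off curve.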
......
No preview for this file type
......@@ -58,12 +58,49 @@ RNA::RNA(string name, string seq, bool verbose)
pij_(results->i-1,results->j-1) = results->p;
results++;
}
/*define type_*/
type_ = vector<vector<int>>(n_, vector<int>(n_));
for(uint i = 0; i < n_; i++){
for(uint j = 0; j < n_; j++){
if (i < j){
std::stringstream ss;
ss << seq_[i] << seq_[j];
std::string str = ss.str();
if(str.compare("AU") == 0 ){
type_[i][j] = 1;
}
else if(str.compare("CG") == 0 ){
type_[i][j] = 2;
}
else if(str.compare("GC") == 0 ){
type_[i][j] = 3;
}
else if(str.compare("GU") == 0 ){
type_[i][j] = 4;
}
else if(str.compare("UG") == 0 ){
type_[i][j] = 5;
}
else if(str.compare("UA") == 0 ){
type_[i][j] = 6;
}
else{
type_[i][j] = 0;
}
}
else{
type_[i][j] = 0;
}
}
}
}
else cerr << "NULL result returned by vrna_pfl_fold" << endl;
}
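Review note: the chain of string comparisons that fills `type_` can be expressed as a small lookup. The sketch below is a standalone illustration, not a drop-in patch: `pair_type` and `pair_type_matrix` are hypothetical names, and the numeric codes are copied from the hunk above (AU=1, CG=2, GC=3, GU=4, UG=5, UA=6, anything else 0).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Map an ordered pair of nucleotides to the codes used in the hunk above.
int pair_type(char a, char b)
{
    static const std::string codes[] = {"AU", "CG", "GC", "GU", "UG", "UA"};
    for (int k = 0; k < 6; k++)
        if (codes[k][0] == a && codes[k][1] == b) return k + 1;
    return 0;  // not a canonical or wobble pair
}

// Fill the upper triangle (i < j) with pair types; everything else stays 0.
std::vector<std::vector<int>> pair_type_matrix(const std::string& seq)
{
    const std::size_t n = seq.size();
    std::vector<std::vector<int>> type(n, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < n; i++)
        for (std::size_t j = i + 1; j < n; j++) type[i][j] = pair_type(seq[i], seq[j]);
    return type;
}
```

Comparing two characters avoids building a `stringstream` for each of the n² cells.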
void RNA::print_basepair_p_matrix(float theta) const
{
cout << endl;
......
......@@ -32,6 +32,8 @@ class RNA
uint get_RNA_length(void) const;
void print_basepair_p_matrix(float theta) const;
vector<vector<int>> get_type();
bool verbose_; // Should we print things ?
private:
......@@ -41,10 +43,15 @@ class RNA
string seq_; // sequence of the rna with chars
uint n_; // length of the rna
MatrixXf pij_; // matrix of basepair probabilities
vector<vector<int>> type_; //vector of base pair types
};
inline float RNA::get_pij(int i, int j) { return pij_(i, j); }
inline uint RNA::get_RNA_length() const { return n_; }
inline string RNA::get_seq(void) const { return seq_; }
inline vector<vector<int>> RNA::get_type() { return type_; }
#endif
......
>__'CRYSTAL_STRUCTURE_OF_A_TIGHT-BINDING_GLUTAMINE_TRNA_BOUND_TO_GLUTAMINE_AMINOACYL_TRNA_SYNTHETASE_'_(PDB_00376)
GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGAGGUCGAGGUUCGAAUCCUCGUACCCCAGCCA
>__'GUANINE_RIBOSWITCH_U22C,_A52G_MUTANT_BOUND_TO_HYPOXANTHINE_'_(PDB_01023)
GGACAUACAAUCGCGUGGAUAUGGCACGCAAGUUUCUGCCGGGCACCGUAAAUGUCCGACUAUGUCCa
>__'SOLUTION_STRUCTURE_OF_THE_P2B-P3_PSEUDOKNOT_FROM_HUMAN_TELOMERASE_RNA_'_(PDB_00857)
GGGCUGUUUUUCUCGCUGACUUUCAGCCCCAAACAAAAAAGUCAGCA
>test_CRYSTAL_STRUCTURE_OF_A_TIGHT-BINDING_GLUTAMINE_TRNA_BOUND_TO_GLUTAMINE_AMINOACYL_TRNA_SYNTHETASE__PDB_00376
GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGAGGUCGAGGUUCGAAUCCUCGUACCCCAGCCA
>test_GUANINE_RIBOSWITCH_U22C,_A52G_MUTANT_BOUND_TO_HYPOXANTHINE__PDB_01023
GGACAUACAAUCGCGUGGAUAUGGCACGCAAGUUUCUGCCGGGCACCGUAAAUGUCCGACUAUGUCCa
>test_SOLUTION_STRUCTURE_OF_THE_P2B-P3_PSEUDOKNOT_FROM_HUMAN_TELOMERASE_RNA__PDB_00857
GGGCUGUUUUUCUCGCUGACUUUCAGCCCCAAACAAAAAAGUCAGCA
\ No newline at end of file
......
File mode changed
> JSON1000_extended
AAUAUCCGGGCGUUUAAUCCCGGGAUAAA
\ No newline at end of file
The motif library used with --contacts is particular. It was provided by Isaure Chauvot de Beauchêne from the LORIA
laboratory. These motifs are made up of RNA fragments linked to proteins.
==================================================================================================================
Several versions of this library have been provided; the most complete is the latest, 'motifs_06-06-2021.json'.
The current scripts were written for this file and do not work with the older libraries.
There are also two benchmark files in JSON format: 'benchmark_16-06-2021.json' and 'benchmark_16-07-2021.json'.
Each contains complete RNA sequences that bind to a protein: the first contains only 33 RNAs, and the second
contains 130 RNAs.
The benchmark.dbn and benchmark.txt files were created from 'benchmark_16-07-2021.json'.
They are mostly used by the Isaure_benchmark.py script and by scripts from the 'scripts' directory.
The motifs_final.json file is obtained by running the count_pattern.cpp script from the 'script' directory on
the 'motifs_06-06-2021.json' motifs file.
This script counts the number of "occurrences" of each motif. If the sequence of motif A
is included in motif B, then each inclusion of B also counts as an inclusion of A, and vice versa.
The motif library used by BiORSEO is the one in the 'bibliotheque_a_lire' directory. It should contain only
the JSON file that BiORSEO is to use for its prediction, so do not put any other type of file there.
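
The inclusion rule described above can be sketched as follows. This is one possible reading of that description, not the actual count_pattern.cpp script: the function name `count_inclusions` is hypothetical, and motifs are reduced to plain sequence strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// For each motif sequence, count how many sequences in the library contain it
// as a substring. A sequence always contains itself, so every count is >= 1.
std::vector<unsigned> count_inclusions(const std::vector<std::string>& seqs)
{
    std::vector<unsigned> counts(seqs.size(), 0);
    for (std::size_t a = 0; a < seqs.size(); a++)
        for (std::size_t b = 0; b < seqs.size(); b++)
            if (seqs[b].find(seqs[a]) != std::string::npos) counts[a]++;
    return counts;
}
```

With the library {"GGA", "UGGAC", "CC"}, "GGA" is found in itself and in "UGGAC", so it is counted twice, while the other two motifs are counted once each.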
test_1ASY_R.
UCCGUGAUAGUUUAAUGGUCAGAAUGGGCGCUUGUCGCGUGCCAGAUCGGGGUUCAAUUCCCCGUCGCGGAGCCA
(((((((..((((........)))).(((((.......)))))....(((((.......))))))))))))....
*........****..........*******.********..........................**********
test_1B23_R.
GGCGCGUUAACAAAGCGGUUAUGUAGCGGAUUGCAAAUCCGUCUAGUCCGGUUCGACUCCGGAACGCGCCUCCA
(((((((..(((.........))).(((((.......)))))....(((((.......))))))))))))....
***..........................................*****........*****....*****..
test_1C0A_B.
GGAGCGGUAGUUCAGUCGGUUAGAAUACCUGCCUGUCACGCAGGGGGUCGCGGGUUCGAGUCCCGUCCGUUCCGCCA
(((((((..((((.........)))).(((((.......))))).....(((((.......))))))))))))....
*....*...****...........******.*********...........................**********
test_1DRZ_B.
GGCCGGCAUGGUCCCAGCCUCCUCGCUGGCGCCGGCUGGGCAACACCAUUGCACUCCGGUGGCGAAUGGGAC
(((((((.........(((........))))))))))......((((..........))))...........
..........................................****.*******.***..............
test_1EFW_C.
GGAGCGGUAGUUCAGUCGGUUAGAAUACCUGCCUGUCACGCAGGGGGUCGCGGGUUCGAGUCCCGUCCGUUCC
(((((((..((((.........)))).(((((.......))))).....(((((.......))))))))))))
.........***..............****..********............................*****
test_1EIY_C.
GCCGAGGUAGCUCAGUUGGUAGAGCAUGCGACUGAAAAUCGCAGUGUCCGCGGUUCGAUUCCGCGCCUCGGCACCA
(((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))....
.........**..............**..................*...................*....******
test_1EUQ_B.
GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAGCGAGGUUCGAAUCCUCGUACCCCAGCCA
((((((..(((.........))).(((((.......)))))...(((((.......))))))))))).....
*******.*******.....******.....******..........................*********
test_1EUY_B.
GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAAGCGAGGUUCGAAUCCUCGUACCCCAGCCA
((((((..(((.........))).(((((.......)))))....(((((.......))))))))))).....
*******.*******.....******.....******............................********
test_1F7U_B.
UUCCUCGUGGCCCAAUGGUCACGGCGUCUGGCUGCGAACCAGAAGAUUCCAGGUUCAAGUCCUGGCGGGGAAGCCA
(((((((..(((..........))).(((((.......))))).....(((((.......))))))))))))....
..****.....*****.**.*****.....*..********..............*...........*********
test_1F7V_B.
UUCCUCGUGGCCCAAUGGUCACGGCGUCUGGCUGCGAACCAGAAGAUUCCAGGUUCAAGUCCUGGCGGGGAAG
(((((((..(((..........))).(((((.......))))).....(((((.......)))))))))))).
..****.....*****.********.....*..********..............*...........*****.
test_1FFY_T.
GGGCUUGUAGCUCAGGUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGGUGGUUCAAGUCCACUCAGGCCCAC
(((((((..((((.........)))).(((((.......))))).....(((((.......))))))))))))..
..****...*****.........****...**************........................*******
test_1GAX_C.
GGGCGGCUAGCUCAGCGGAAGAGCGCUCGCCUCACACGCGAGAGGUCGUAGGUUCAAGUCCUACGCCGCCCACCA
(((((((..((((.......)))).(((((.......))))).....(((((.......))))))))))))....
...***...******..***.***....**.****.********..........*..............******
test_1GTR_B.
GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAUUCCGAGGUUCGAAUCCUCGUACCCCAGCCA
((((((..(((.........))).(((((.......))))).....(((((.......))))))))))).....
*******.*******.....*****......*****..............................********
test_1H3E_B.
GGGCAGGUUCCCGAGCGGCCAAAGGGGACGGUCUGUAAAACCGUUGGCGUAUGCCUUCGCUGGUUCGAAUCCAGCCCUGCCCA
(((((((..(((...........)))((((((.......))))))(((....)))...(((((.......)))))))))))).
..........**...*...........****..****.....******.............................***...
test_1H4Q_T.
GGAGUAGCGCAGCCCGGUAGCGCACCUCGUUCGGGACGAGGGGGGCGCUGGUUCAGAUCCAGUCUCC
((((..((((.........)))).((((((.....)))))).....(((((.......)))))))))
...*..................***..*******...***...........................
test_1HC8_C.
CCAGGAUGUAGGCUUAGAAGCAGCCAUCAUUUAAAGAAAGCGUAAUAGCUCACUGGU
((((.(((..(((.........)))..))).....(..(((......)))).)))).
.....****.****.*.......********.....*....................
test_1IL2_D.
CCGUGAUAGUUUAAUGGUCAGAAUGGGCGCUUGUCGCGUGCCAGAUCGGGGUUCAAUUCCCCGUCGCGGA
((((((..((((........)))).(((((.......)))))....(((((.......))))))))))).
........****..........******.*********...........................***..
test_1J1U_B.
CCGGCGGUAGUUCAGCCUGGUAGAACGGCGGACUGUAGAUCCGCAUGUCGCUGGUUCAAAUCCGGCCCGCCGGA
(((((((..((((.........)))).(((((.......))))).....(((((.......)))))))))))).
...........................***....****..............................**....
test_1J2B_C.
GGGCCCGUGGUCUAGUUGGUCAUGACGCCGCCCUUACGAGGCGGAGGUCCGGGGUUCAAGUCCCCGCGGGCCCACCA
(((((((................(((..((((.......))))...)))(((((.......))))))))))))....
*.....**************...****....***..*****......***.........*......******.****
test_1JJ2_9.
UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCGGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAACCCGGUUCGCCGCCACC
...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...(((((.....((((((.((....))))))))....)))))...))))))...
...************.**.....*.*******.****..***.****************.......****.............*****..***...*****............*******..
test_1KUQA_B.
GGGCGGCCUUCGGGCUAGACGGUGGGAGAGGCUUCGGCUGGUCCACCCGUGACGCUC
((((.(((....)))..(((((((((...(((....)))..))))).)))..)))))
...................*****.....*****..*****......********..
test_1L9A_B.
GACACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAUUUGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGUC
((((((..(.(((((((((((.(((((..((((((....))))))..))))).)))).........(((((.....(.((....(((....)))....)).)...))))).))))))).)...))))))
.............................************.........................................*****....****..................................
test_1LNG_B.
UCGGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUC
..(.((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).)).)
................************.....................................*****....***....................
test_1MFQ_A.
GACACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGUC
((((((..(.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)...))))))
............................*****.******................................**......*******....*******..............................
test_1MMS_C.
GCUGGGAUGUUGGCUUAGAAGCAGCCAUCAUUUAAAGAGUGCGUAACAGCUCACCAGC
(((((.(((..(((.........)))..))).....(...((......)).).)))))
......*********....*....*********....*......***...........
test_1MZP_B.
GGGAUGCGUAGGAUAGGUGGGAGCCGCAAGGCGCCGGUGAAAUACCACCCUUCCC
((((.(.........(((((..(((....)))............)))))).))))
...............**********..........********.******.....
test_1QA6_C.
GCCAGGAUGUAGGCUUAGAAGCAGCCAUCAUUUAAAGAAAGCGUAAUAGCUCACUGGU
(((((.(((..(((.........)))..))).....(..(((......)))).)))))
......*********..........*******.....*....................
test_1QF6_B.
GCCGAUAUAGCUCAGUUGGUAGAGCAGCGCAUUCGUAAUGCGAAGGUCGUAGGUUCGACUCCUAUUAUCGGCACCA
(((((((..((((........))))..((((.......))))......(((((.......))))))))))))....
****......**.............*.....********.........*................***********
test_1SJ3_R.
UGGCCGGCAUGGUCCCAGCCUCCUCGCUGGCGCCGGCUGGGCAACACCAUUGCACUCCGGUGGUGAAUGGGAC
.(((((((.........(((........))))))))))......((((..........))))...........
...........................................**...*******..**..............
test_1SJF_B.
AUGGCCGGCAUGGUCCCAGCCUCCUCGCUGGCGCCGGCUGGGCAACACCAUUGCACUCCGGUGGUGAAUGGGAC
..(((((((.........(((........))))))))))......((((..........))))...........
............................................**...*******.***..............
test_1U6P_B.
GGCGGUACUAGUUGAGAAACUAGCUCUGUAUCUGGCGGACCCGUGGUGGAACUGUGAAGUUCGGAACACCCGGCCGCAACCCUGGGAGAGGUCCCAGGGUU
.((((..((((((....))))))..)))).....((((..(((.(((((((((....)))))....)))))))))))((((((((((....))))))))))
.............................*****..............................................**..................*
test_1UN6_E.
GCCGGCCACACCUACGGGGCCUGGUUAGUACCUGGGAAACCUGGGAAUACCAGGUGCCGGC
((((((....((....))(((((((.....((..(....)..))....)))))))))))))
...*.******........****.****..............*****..............
test_1WZ2_C.
GCGGGGGUUGCCGAGCCUGGUCAAAGGCGGGGGACUCAAGAUCCCCUCCCGUAGGGGUUCCGGGGUUCGAAUCCCCGCCCCCGCACCA
(((((((..(((.............))).(((((.......))))).(((....)))...(((((.......))))))))))))....
*****.......***....**...****............*****.....******..........................*.****
test_1Y39_C.
CCAGGAUGUAUGCUUAGAAGCAGCAAUCAUUUAAAGAGUGCGUAAUAGCUCACUGGU
((((.(((..(((.........)))..))).....(...((......)).).)))).
.....*********.........********.....*....................
test_1Y69_9.
CCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGG
(((((((((.....((.(((((....((((((...............)))..)))...)))))..))(((.......((.(((((....))))).)).......)))..)))))))))
.........***................................................................**.........****...........................
test_1YHQ_9.
UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCAGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAAUCCGGUUCGCCGCCACC
...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...((.((.....((((((.((....))))))))....)).))...))))))...
...***************......********.****..***.****************.......****.............*****..***...****.............*******..
test_2AKE_B.
GACCUCGUGGCGCAAUGGUAGCGCGUCUGACUCCAGAUCAGAAGGUUGCGUGUUCGAAUCACGUCGGGGUCA
(((((((..((((.......)))).(((((.......))))).....(((((.......)))))))))))).
.........................*....*****...............................**....
test_2CSX_C.
GGCGGCGUAGCUCAGCUGGUCAGAGCGGGGAUCUCAUAAGUCCCAGGUCGGAGGUUCGAGUCCUCCCGCCGCCAC
(((.(((..((((.........)))).(((((.......))))).....(((((.......)))))))).)))..
..**.........*............**.....**********............................****
test_2CZJ_B.
GGGGGUGAAACGGUCUCGACAGGGGUUCGCCUUUGGACGUGGGUUCGACUCCCACCACCUCC
(((((((............((((.(....).))))...(((((.......))))))))))))
...********..*******...............*****................*****.
test_2D6F_E.
AGUCCCGUGGGGUAGUGGUAAUCCUGCUGGGCUUUGGACCCGGCGACAGCGGUUCGACUCCGCUCGGGACUA
(((((((...(........{...).((((((.......))))))...(((((......})))))))))))).
****..............................................****.....***.........*
test_2D6F_F.
AGUCCCGUGGGGUAGUGGUAAUCCUGCUGGGCUUUGGACCCGGCGACAGCGGUUCGACUCCGCUCGGGACUACC
(((((((..(((..........)))((((((.......))))))...(((((.......))))))))))))...
*****.............................................*****....***.........***
test_2DER_C.
GUCCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGGACGC
....(.(..((((.........))))((((((.......))))))...(((((.......)))))).)......
.........****...........*******..*********................................
test_2DER_D.
CCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGGACG
.((((..((((.........))))((((((.......))))))...(((((.......)))))))))....
.....***............*****....********..................................
test_2DET_C.
GUCCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGG
..((.....((((.........))))((((((.......))))))....((((.......))))...)).
.........***............*******.**********............................
test_2DR2_B.
GACCUCGUGGCGCAAUGGUAGCGCGUCUGACUCCAGAUCAGAAGGUUGCGUGUUCGAAUCACGUCGGGGUCACCA
(((((((..(.((.......)).).(((((.......))))).....(((((.......))))))))))))....
.........................*....*****...............................**.......
test_2DU3_D.
GCCAGGGUGGCAGAGGGGCUUUGCGGCGGACUGCAGAUCCGCUUUACCCCGGUUCGAAUCCGGGCCCUGGC
(((((((..(((.........)))..((((.......))))......(((((.......))))))))))))
.........**...............*..********............................*.....
test_2DU5_D.
GCCAGGGUGGCAGAGGGGCUUUGCGGCGGACUUCAGAUCCGCUUUACCCCGGUUCGAAUCCGGGCCCUGGC
(((((((..(((.........))).(((((.......))))).....(((((.......))))))))))))
.........**..............*....*.******..........................**.....
test_2FMT_C.
CGCGGGGUGGAGCAGCCUGGUAGCUCGUCGGGCUCAUAACCCGAAGAUCGUCGGUUCAAAUCCGGCCCCCGCAACCA
.((((((..((((.........)))).(((((.......))))).....(((((.......))))))))))).....
*******..*****..........****........................................*********
test_2HGH_B.
GGGCCAUACCUCUUGGGCCUGGUUAGUACCUCUUCGGUGGGAAUACCAGGUGCCC
((((....((....))(((((((.....((.(....).))....)))))))))))
*..******........*********............****.............
test_2IHX_B.
CUGCCCUCAUCCGUCUCGCUUAUUCGGGGAGCGGACGAUGACCCUAGUAGAGGGGGCUGCGGCUUAGGAGGGCAG
((((((((...((((.(((((.......)))))))))....((((.....))))(((....)))...))))))))
......***.**......................******................******..*******....
test_2L3J_B.
GGCAUUAAGGUGGGUGGAAUAGUAUAACAAUAUGCUAAAUGUUGUUAUAGUAUCCCACCUACCCUGAUGCC
(((((((.(((((((((.(((.(((((((((((.....))))))))))).))).))))))))).)))))))
........****.....********......**.......***......***.****.....***......
test_2MF0_G.
UGUCGACGGAUAGACACAGCCAUCAAGGACGAUGGUCAGGACAUCGCAGGAAGCGAUUCAUCAGGACGAUGA
((((........))))..((((((......)))))).......((((.....))))..((((.....)))).
.************....**.************.*...******************..*************..
test_2MQV_B.
GGGCGAGGGUCUCCUCUGAGUGAUUGACUACCCGUCAGCGGGGGUCUUUCAUUUGGGGGCUCGUGCCC
(((((.((((((((.....((((..((((.((((....))))))))..))))..)))))))).)))))
..............*****.............................********............
test_2MS0_B.
GGCUCGUUGGUCUAGGGGUAUGAUUCUCGCUUAGGGUGCGAGAGGUCCCGGGUUCAAAUCCCGGACGAGCC
(((((....(((.........)))((((((.......))))))....(((((.......)))))..)))))
**...*****............****************..........................****.**
test_2V3C_M.
GGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCG
((((..(((((.(((((((((....)))))))))..))))).....(((((.....(((.....(((....))).....)))..)))))..)))).
..............************...........******.****.....**....*********...**********........***....
test_2ZJQ_Y.
CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU
.((((((((((.....((.(((((....(((((((...(.....)...))))..)))...)))))..))(((.......((.(((((....))))).)).......)))..)))))))))).
......****.**..............********..**.******.******.****..*............*******..***.....******.......*****.....******...
test_2ZM5_C.
GCCCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGGGCAC
(((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))..
.........*..............*****************...........................*****.
test_2ZM5_D.
GCCCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCG
....(((..((((........)))).(((((.......))))).....(((((.......)))))))).
........................*****************........................***.
test_2ZNI_C.
GGGGGGUGGAUCGAAUAGAUCACACGGACUCUAAAUUCGUGCAGGCGGGUGAAACUCCCGUACUCCCCGCCA
((((.((.((((.....)))).((((.(.......).))))...(((((.......))))))).))))....
**.***.******......****......................................***********
test_2ZUEA_B.
GGACCGGUAGCCUAGCCAGGACAGGGCGGCGGCCUCCUAAGCCGCAGGUCCGGGGUUCAAAUCCCCGCCGGUCCG
(((((((..((((..........)))).(((((.......))))).....(((((.......)))))))))))).
.*******...****.*.***..*****....*...*******.............***...........*****
test_2ZUEB_B.
GGACCGGUAGCCUAGCCAGGACAGGGCGGCGGCCUCCUAAGCCGCAGGUCCGGGGUUCAAAUCCCCGCCGGUCCGC
(((((((..((((..........)))).(((((.......))))).....(((((.......))))))))))))..
.*******...****.*.***..*****....*...*******.............***...........******
test_2ZZM_B.
GCAGGGGUCGCCAAGCCUGGCCAAAGGCGCUGGGCCUAGGACCCAGUCCCGUAGGGGUUCCAGGGUUCAAAUCCCUGCCCCUGC
....(((..(((.............)))((((((.......))))))(((....)))...(.(((.......))).))))....
..........****.....**...******...***********.....*.................**...........*...
test_2ZZN_C.
GCCGGGGUAGUCUAGGGGCUAGGCAGCGGACUGCAGAUCCGCCUUACGUGGGUUCAAAUCCCACCCCCGGC
(((((((..((((.......)))).(((((.......))))).....(((((.......))))))))))))
........*******..**********..***********.******.**...***.*.........**..
test_3ADB_C.
GGCCGCCGCCACCGGGGUGGUCCCCGGGCCGGACUUCAGAUCCGGCGCGCCCCGAGUGGGGCGCGGGGUUCAAUUCCCCGCGGCGGCCGCCA
(((((((((..((((((....))))))((((((.......))))))(((((((....)))))))((((.......)))))))))))))....
*............********.****...............****...*.....................**.**.....*..**....***
test_3AKZ_F.
UGGGAGGUCGUCUAACGGUAGGACGGCGGACUCUGGAUCCGCUGGUGGAGGUUCGAGUCCUCCCCUCCCAGCCA
(((((((..((((.......)))).(((((.......)))))....(((((.......))))))))))))....
.*******.*****.......***.**...*******..........................*********..
test_3AM1_B.
GGCGCGGGGUACCGGGCUUGGUAGCCCGGGGCUUCGGCCGAGGGCGAGAGCCCUCGGGGUUCGAUUCCCCCCCUGCGCCGC
(((((((((..(((((((....)))))))(((....)))((((((....))))))((((.......)))))))))))))..
***.............*********..........***.......................*..**........*******
test_3AMT_B.
GGGCCCGUAGCUUAGCCAGGUCAGAGCGCCCGGCUCAUAACCGGGCGGUCGAGGGUUCGAAUCCCUCCGGGCCCACCA
(((((((..((((..........)))).((((.........)))).....(((((.......))))))))))))....
***........................****..**.********..........................********
test_3CUL_C.
GAUGGCGAAAGCCAUUUCCGCAGGCCCCAUUGCACUCCGGGGUAUUGGCGUUAGGUGGUGGUACGAGGUUCGAAUCCUCGUACCGCAGCCA
((((((....))))))..(((..(((((..........)))))....)))...(((.(.(((((((((.......))))))))).).))).
............................***********...................*................................
test_3CUL_D.
GGAUGGCGAAAGCCAUUUCCGCAGGCCCCAUUGCACUCCGGGGUAUUGGCGUUAGGUGGUGGUACGAGGUUCGAAUCCUCGUACCGCAGCCA
(((((((....)))))).)(((..(((((..........)))))....)))...(((.(((((((((((.......))))))))))).))).
.............................********.**................................................*...
test_3EGZ_B.
GAGGGAGAGGUGAAGAAUACGACCACCUAGGUACCAUUGCACUCCGGUACCUAAAACAUACCCUC
(((((...((((...........))))((((((((..........)))))))).......)))))
..............................****.********.**...................
test_3EPH_E.
CUCGUAUGGCGCAGUGGUAGCGCAGCAGAUUGCAAAUCUGUUGGUCCUUAGUUCGAUCCUGAGUGCGAG
((((.(..((((.......))))..((((.......))))......((.((.......)).))).))))
.......*........****..**********************.........................
test_3FOZ_C.
GCCCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCUGGGCAC
(((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))..
.........*..............*****************...........................*****.
test_3FOZ_D.
CGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGGGC
((((..((((........)))).(((((.......))))).....(.(((.......))).)))))...
...*..............*****************..................................
test_3HHN_C.
UCCAGUAGGAACACUAUACUACUGGAUAAUCAAAGACAAAUCUGCCCGAAGGGCUUGAGAACAUACCCAUUGCACUCCGGGUAUGCAGAGGUGGCAGCCUCCGGUGGGUUAAAACCCAACGUUCUCAACAAUAGUGA
((((((((..........))))))))....................(...).(.((((((((((((((..........)))))))..((((......))))((.((((......)))).))))))))))........
...............................................................**...********.**..........................................................
test_3HJW_D.
GGGCCACGGAAACCGCGCGCGGUGAUCAAUGAGCCGCGUUCGCUCCCGUGGCCCACAA
(((((((((.......(((((((.........)))))))......)))))))))....
..****........*****************.......*******.....********
test_3IRW_R.
GUCACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUACCGAUG
..((((......((...((((((....))))))...))...(((.(((((((...((..........)).))))..))))))...)).))
....................................................****.********.**......................
test_3IVKA_M.
UCCAGUAGGAACACUAUACUACUGGAUAAUCAAAGACAAAUCUGCCCGAAGGGCUUGAGAACAUCGAAACACGAUGCAGAGGUGGCAGCCUCCGGUGGGUUAAAACCCAACGUUCUCAACAAUAGUGA
((((((((..........))))))))....................(...).(.((((((((((((.....)))))..((((......))))((.((((......)))).))))))))))........
.....................................................................*.*******........................****.........***..........
test_3IWN_A.
CACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACGGCAUUGCACUCCGCCGUAGGUAGCGGGGUUACCGAUGG
((((......((...((((((....))))))...))...(((.((((((((..((((..........)))))))))..))))))...)).)).
....................................................************.*******.....................
test_3K0J_E.
GCGACUCGGGGUGCCCUCCAUUGCACUCCGGAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUCGC
((((((((((.((((.(((..........))))))......)..)))).....(((...((((......))))...)))..))))))
.............**....***********............*........*.....***.****.**...................
test_3KTW_C.
AGAUAGUCGUGGGUUCCCUUUCUGGAGGGAGAGGGAAUUCCACGUUGACCGGGGGAACCGGCCAGGCCCGGAAGGGAGCAACCGUGCCCGGCUAUC
.(((...((((((((((((((((....))))))))))).))))).....((((.....(((.....(((....))).....)))..))))...)))
................************...................................*..*****..****...***.............
test_3KTW_D.
AGAUAGUCGUGGGUUCCCUUUCUGGAGGGAGAGGGAAUUCCACGUUGACCGGGGGAACCGGCCAGGCCCGGAAGGGAGCAACCGUGCCCGGCUAU
..(((..((.(((((((((((((....))))))))))).)).))....(.(((.....(((.....(((....))).....)))..))).).)))
...............*************............................*.........*****...****.................
test_3MUM_R.
GUCACGCACAGAGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUACCGAUGG
..(.((......((...((((((....))))))...))...(((.((((((((..((..........)))))))..))))))...))...)
....................................................***..*******..**.......................
test_3MUR_R.
GUCACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUAUCGAUGG
..(.((......((...((((((....))))))...))...(((.((((((((...(..........).)))))..))))))...))...)
....................................................****.********.**.......................
test_3NDB_M.
GUCUCGUCCCGUGGGGCUCGGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCGGCGCUCACGGGGGUGCGGGAC
((((((..(((((.(((.(((((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).))))))).).))))).....))))))
.................................*************..........................**....**********..*********.....................................
test_3PIO_Y.
ACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGU
((((((((((.....((.(((((....((((((...............)))..)))...)))))..)).((.......((.(((((....))))).)).......))...))))))))))
.....****.***.............********..**.******.*****..****..*............*******..***....*****.*.......****......******..
test_3RW6_H.
GCACUAACCUAAGACAGGAGGGCCGGGAAACCUGCCUAAUCCAAUGACGGGUAAUAGUGUC
((((((((((......(((((((.((....)).))))..)))......))))..)))))).
****....********.***.........*****..............***.**.......
test_3TUP_T.
GCCGAGGUAGCUCAGUUGGUAGAGCAUGCGACUGAAAAUCGCAGUGUCGGCGGUUCGAUUCUGCUCCUCGGCAC
(((((((..((((........)))).(((((.......)))))......((.(.......).)).)))))))..
**.......****..........***......*******..........................*********
test_3U4M_B.
GGGAUGCGUAGGAUAGGUGGGAGCCUGUGAACCCCCGCCUCCGGGUGGGGGGGAGGCGCCGGUGAAAUACCACCCUUCCC
((((.(.........(((((..(((......((((((((....))))))))...)))............)))))).))))
...............**************..................................****.*******.....
test_3V7E_C.
GGCUUAUCAAGAGAGGUGGAGGGACUGGCCCGAUGAAACCCGGCAACCACUAGUCUAGCGUCAGCUUCGGCUGACGCUAGGCUAGUGGUGCCAAUUCCUGCAGCGGAAACGUUGAAAGAUGAGCCA
((((((((....(.(((...(((.....)))......))))(((..(((((((((((((((((((....))))))))))))))))))).)))...(....(((((....)))))..))))))))).
........*........**............****...................................................*......................................*
test_3W3S_B.
GGGAGAGGUUGGCCGGCUGGUGCCGCCCCGGGACUUCAAAUCCCGUGGGAGGUCCCGCAAGGGAGCUCCGGAGGGUUCGAUUCCCUCCCUCUCCCGCC
((((((((..((.((((....))))))((((((.......))))).)((((.((((....)))).)))).(((((.......)))))))))))))...
...................**............................*...........................*....................
test_3WFRA_A.
GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCA
(((((((....(...........)..(((((.......))))).....(((((.......)))))))))))).
*****..........**.*....................................*.............****
test_3WFRA_C.
GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCAC
(((((((....((........).)..(((((.......))))).....(((((.......))))))))))))..
****...........**.*....................................*.............*****
test_3WFRB_C.
GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCACC
(((((((....((........).)..(((((.......))))).....(((((.......))))))))))))...
****...........**.*....................................*.............******
test_3WQY_C.
GGGCUCGUAGCUCAGCGGGAGAGCGCCGCCUUUGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCACCA
(((((((..((((.......)))).((((((.....)))))).....(((((.......))))))))))))....
*****......****..***......................****..**....**.........**********
test_3WQZ_C.
GGACUCGUAGCUCAGCGGGAGAGCGCCGCCUUUGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCACCA
(((((((..((((.......)))).(((((((...))))))).....(((((.......))))))))))))....
*****......****..***......................********....**........***********
test_4BY9_A.
GCGAGCAAUGAUGAGUGAUGGGCGAACUGAGCUCGAAAGAGCAAUGAUGAGUGAUGGGCGAACUGAGCUCGC
((((((......(.............(...((((....))))......).............)...))))))
.....*****************************......***************.....******.**...
test_4KR2_C.
CGCCGCUGGUGUAGUGGUAUCAUGCAAGAUUCCCAUUCUUGCGACCCGGGUUCGAUUCCCGGGCGGCG
((((((..(((.........)))((((((.......))))))....((((.......)))).))))))
**......***...........*****...******..........................******
test_4KR3_C.
CGCCGCUGGUGUAGUGGUAUCAUGCAAGAUUCCCAUUCUUGCGACCCGGGUUCGAUUCCCGGGCGGCGC
((((((..(((.........)))((((((.......))))))...(((((.......))))))))))).
*.......***...........**************..........................*******
test_4KZD_R.
GACGCGACCGAAAUGGUGAAGGACGGGUCCAGUGCGAAACACGCACUGUUGAGUAGAGUGUGAGCUCCGUAACUGGUCGCGUC
((((((((((..((((.(.....(...(((((((((.....)))))))..)).........)..).))))..).)))))))))
...............................*.*******...........................................
test_4LCK_C.
GGGUGCGAUGAGAAGAAGAGUAUUAAGGAUUUACUAUGAUUAGCGACUCUAGGAUAGUGAAAGCUAGAGGAUAGUAACCUUAAGAAGGCACUUCGAGCACCC
((((((.....((((.......((((((..(((((((.........((((((...........)))))).))))))))))))).......))))..))))))
......****...................................................................................**.......
test_4M4O_B.
GGGUUCAUCAGGGCUAAAGAGUGCAGAGUUACUUAGUUCACUGCAGACUUGACGAACCC
((((((.(((((.((.......((((.(...(...)..).)))))).))))).))))))
...........................*.****.***.*....................
test_4P3EA_A.
GACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUU
(.((..(.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)...)).)
..******..................*****..*****.............***..*******..................****....***................*******.........
test_4PKD_V.
GAUCCAUUGCACUCCGGAUCCAGGAGAUACCAUGAUCACGAAGGUGGUUUUCCU
(((((..........))))).((((((..((((..........)))).))))))
*....***********........****....*************.........
test_4U7U_L.
AUAAACCGGGCUCCCUGUCGGUUGUAAUUGAUAAUGUUGAGAGUUCCCCGCGCCAGCGGGG
.............................................((((((....))))))
***************************************************..********
test_4UYJ_R.
GGGGCUAGGCCGGGGGGUUCGGCGUCCCCUGUAACCGGAAACCGCCGAUAUGCCGGGGCCGAAGCCCGAGGGGCGGUUCCCGUAAGGGUUCCCACCCUCGGGCGUGCCUC
(((((..(((((((((.........)))))....((((..............))))))))...((((((((((.((..(((....)))..))).)))))))))..)))))
......*.............************.....................**...........*****.................***...................
test_4UYK_R.
GGGGCUAGGCCGGGGGGUUCGGCGUCCCCUGUAACCGGAAACCGCCGAUAUGCCGGGGCCGAAGCCCGAGGGGCGGUUCCCGAAGCCGCCUCUGUAAGGAGGCGGUGGAGGGUUCCCACCCUCGGGCGUGCCUC
(((((..(((((((((.........)))))....((((..............))))))))...(((((((((..((..(((...(((((((((....)))))))))...)))..))..)))))))))..)))))
......*.............************.....................**...........****..........................................***...................
test_4W90_C.
GCGCGCUUAAUCUGAAAUCAGAGCGGGGGACCCAUUGCACUCCGGGUUUUUCCCGUAAGGGGUGAAUCCUUUUUAGGUAGGGCGAAAGCCCGAAUCCGUCAGCUAACCUCGUAAGCGCGC
(((((((...((((....))))((((((....(.........................(((((....(((....)))..((((....))))..))))..).)....)))))).)))))))
..............................*...******.......................................**.......................................
test_4WC3_B.
GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCACCA
((((.((..((((........)))).(((((.......))))).....(((((.......))))))).))))....
*****............***..................................**.*...**........*****
test_4WF9_Y.
UCUGGUGACUAUAGCAAGGAGGUCACACCUGUUCCCAUGCCGAACACAGAAGUUAAGGUCUUUAGCGACGAUGGUAGCCAACUUACGUUCCGCUAGAGUAGAACGUUGCCAGGC
.(..(..(.....((((.((......((((((...(.....)...))))..)).....)).)).))............(............)..............)..)..).
...****..**.............*****.....*...***...******.****.*.............*****.*..***....**..***.....***.......****..
test_4X0B_B.
GGGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCCACC
.((((.((..((((........)))).(.((.........)).).....(((((.......))))))).))))....
******............***..................................**.*...**........*****
test_4YB1_R.
GGGCACGCACAGAGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACCUCGGUAGGUAGCGGGGUUACCGAUG
...((((......((...((((((....))))))...))...(((.(((((((...((..........)).))))..))))))...)).))
****.............................................................................*.........
test_4YCO_D.
GCGCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGCGCAC
(((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))..
............*****.******...***...........****.........**..................
test_4YCO_E.
GCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCG
..(((..((((........)))).(((((.......))))).....(((((.......)))))))).
........*****.******...................*..........**...............
test_4YCO_F.
GCGCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGCG
..(((((..((((........)))).(((((.......))))).....(((((.......)))))))))).
............*****.******..............................**...............
test_4YVI_C.
UGGGAGGUCGUCUAACGGUAGGACGGCGGACUCUGGAUCCGCUGGUGGAGGUUCGAGUCCUCCCCUCCCAG
.((((((..((((.......))))((((((.......))))))...(((((.......)))))))))))..
.........***.........*****..**.******..................................
test_4YYE_C.
GUUAUAUUAGCUUAAUUGGUAGAGCAUUCGUUUUGUAAUCGAAAGGUUUGGGGUUCAAAUCCCUAAUAUAACAC
(((((((..((((........)))).((((.........)))).....(((((.......))))))))))))..
**.......**.....*........****.**********..***.*..................********.
test_4ZT0_B.
GAUGAGACGCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGU
..........((((((..((((....))))....))))))..(((..).)).......((((....))))..
******************...........***************.**************.........****
test_5AXM_P.
GGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGAUCCACAGAAUCCCCAC
(((((..((((........)))).(((((.......))))).....(((((.......))))))))))....
***.............*..............................********.....*****..*****
test_5CCBA_N.
GGCCCGGAUAGCUCAGUCGGUAGAGCAUCAGACUUUUAAUCUGAGGGUCCAGGGUUCAAGUCCCUGUUCGGGCGCCA
((((((((..((((........)))).(((((.(...).))))).....(((((.......))))))))))))..).
..........**..........**.......***.........******************....**..........
test_5D6G_0.
GCCUAAGACAGCGGGGAGGUUGGCUUAGAAGCAGCCAUCCUUUAAAGAGUGCGUAACAGCUCACCCGUCGAGGC
(((.......(((((.(((..(((.........)))..))).....(...((......)).).)))))...)))
...*****....******.......................*****...................***......
test_5DDP_A.
CGUUGACCCAGGAAACUGGGCGGAAGUAAGGUCCAUUGCACUCCGGGCCUGAAGCAACGCG
(((((.(((((....)))))........((((((..........))))))....)))))..
............................***...********.**................
test_5E6M_C.
CGCCGCUGGUGUAGUGGUAUCAUGCAAGAUUCCCAUUCUUGCGACCCGGGUUCGAUUCCCGGGCGGCGCACCA
((((((..(((.........)))((((((.......))))))...(((((.......))))))))))).....
*.......***...........*****...******...****...................***********
test_5HR6_C.
CCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGG
(((((..((((.........))))((((((.......))))))...(((((.......))))))))))
......****.******....******************.............*..........****.
test_5HR6_D.
CCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGG
((((..((((.........))))((((((.......))))))...(((((.......)))))))))
....****...........******************........................***..
test_5HR7_D.
CCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGGAC
(((((..((((.........))))((((((.......))))))...(((((.......))))))))))..
.....****.............*****************........................*****..
test_5M73_A.
GGUGUCCGCACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGGGAUCGCGCCUA
(((((((.(((((.(.(((((((((((.(((((..((((((....))))))..))))).))))......(.((((......((((....(((....)))....))))....)))))))))))).)..)))))))...)))))..
....****...*****..***...*.........*****..*****............*.***.*******...................***....***................**********........****......
test_5TF6_B.
GGUCAAUUUGAAACAAUACAGAGAUGAUCAGCAGUUCCCCUGCAUAAGGAUGAACCGUUUUACAAAGAGAC
.(((..((((...................(((.((((.(((.....)))..)))).)))...))))..)))
......**.********************.**................***.**.......**........
test_5V6X_C.
GGAAACCUGAUCAUGUAGAUCGAAUGGACUCUAAAUCCGUUCAGCCGGGUUAGAUUCCCGGGGUUUCCGC
(((((((.((((.....))))(((((((.......)))))))..(((((.......))))))))))))..
........*.......***....................**********....***...****.......
test_5V6X_D.
GGAAACCUGAUCAUGUAGAUCGAAUGGACUCUAAAUCCGUUCAGCCGGGUUAGAUUCCCGGGGUUUCCGCCA
(((((((.((((.....))))(((((((.......)))))))..(((((.......))))))))))))....
........*......****....................**********...****...****.........
test_5VW1_C.
UUGUAAAAAAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGUC
..........((((((..((((....))))....))))))..(((..).)).......((((....))))...
******************...........***************.**************.......*.****.
test_5XBL_B.
UGCGCUUGGCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUU
..........(((((...(((......))).....)))))..(.(..)..).......((((....)))).....(.....)......
******************..........****************.**************..........*************.*....
{
"1JJ2": {
"ctc": "...************.**....*..*******.****..***.****************.......****.............*****..***...*****............*******..",
"seq": "UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCGGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAACCCGGUUCGCCGCCACC",
"str": "...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...(((((.....((((((.((....))))))))....)))))...))))))..."
},
"1L9A": {
"ctc": "............................************..........................................*****....****..................................",
"seq": "GACACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAUUUGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGUC",
"str": "((((((..(.(((((((((((.(((((..((((((....))))))..))))).)))).........(((((.....(.((....(((....)))....)).)...))))).))))))).)...))))))"
},
"1LNG": {
"ctc": "................************.....................................*****....***....................",
"seq": "UCGGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUC",
"str": "..(.((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).)).)"
},
"1MFQ": {
"ctc": "............................*****.******........**..............................*******....*******..............................",
"seq": "GACACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGUC",
"str": "((((((..(.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)...))))))"
},
"1SM1": {
"seq": "CCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGG",
"str": "(((((((((.....((.(((((.....(((((...............)))..))....)))))..)).((.......((.(((((....))))).)).......))...)))))))))"
},
"1U6P": {
"ctc": ".............................*****.....**.....*......................................................",
"seq": "GGCGGUACUAGUUGAGAAACUAGCUCUGUAUCUGGCGGACCCGUGGUGGAACUGUGAAGUUCGGAACACCCGGCCGCAACCCUGGGAGAGGUCCCAGGGUU",
"str": ".((((..((((((....))))))..)))).....((((..(((.(((((((((....)))))....)))))))))))((((((((((....))))))))))"
},
"1Y69": {
"ctc": ".........***................................**.........................................****...........................",
"seq": "CCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGG",
"str": "(((((((((.....((.(((((....((((((...............)))..)))...)))))..))(((.......((.(((((....))))).)).......)))..)))))))))"
},
"1YHQ": {
"ctc": "...***************......********.****..***.****************.......****............******..***...****.............*******..",
"seq": "UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCAGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAAUCCGGUUCGCCGCCACC",
"str": "...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...((.((.....((((((.((....))))))))....)).))...))))))..."
},
"1YI2": {
"ctc": "...************.**......********.****..***.****************.......****............******..***...****.............*******..",
"seq": "UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCAGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAAUCCGGUUCGCCGCCACC",
"str": "...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...((.(......((((((.((....)))))))).....).))...))))))..."
},
"2V3C": {
"ctc": "..............*************...................****.*.**......*******....********.......*****....",
"seq": "GGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCG",
"str": "(.((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).)).)."
},
"2ZJQ": {
"ctc": "......****.**..............********..**.******.******.****..*............*******..***.....******.......*****.....******...",
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": ".((((((((((.....((.(((((....(((((((...(.....)...))))..)))...)))))..))(((.......((.(((((....))))).)).......)))..))))))))))."
},
"2ZJR": {
"ctc": "......****.**..............********..**..*****.******.****...............*******..***....*******.......*****.....******...",
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": ".((((((((((.....((.(((((....(((((((...(.....)...))))..)))...)))))..)).((.......((.(((((....))))).)).......))...))))))))))."
},
"3ADB": {
"ctc": "***............******.***.................**..........................**.**.*......*********",
"seq": "GGCCGCCGCCACCGGGGUGGUCCCCGGGCCGGACUUCAGAUCCGGCGCGCCCCGAGUGGGGCGCGGGGUUCAAUUCCCCGCGGCGGCCGCCA",
"str": "(((((((((..((((((..[.))))))((((((.......))))))(((((((....)))))))((((..]....)))))))))))))...."
},
"3CUL": {
"ctc": ".............................********.**.......*............................................",
"seq": "GGAUGGCGAAAGCCAUUUCCGCAGGCCCCAUUGCACUCCGGGGUAUUGGCGUUAGGUGGUGGUACGAGGUUCGAAUCCUCGUACCGCAGCCA",
"str": "(((((((....)))))).)(((..(((((..........)))))....)))...(((.(((((((((((.......))))))))))).)))."
},
"3CUN": {
"ctc": "............................********.**.........*..........................................",
"seq": "GAUGGCGAAAGCCAUUUCCGCAGGCCCCAUUGCACUCCGGGGUAUUGGCGUUAGGUGGUGGUACGAGGUUCGAAUCCUCGUACCGCAGCCA",
"str": "((((((....))))))..(.(...((((..........)))).....).)...((..(((((((((((.......)))))))))))..))."
},
"3DLL": {
"ctc": "......****.**..............********..**.******.******.*****..............*******..***.....***..*......******.....*****....",
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": ".((((((((((.....((.(((((.....((((((...(.....)...))))..))....)))))..)).((.......((.(((((....))))).)).......))...))))))))))."
},
"3HHN": {
"ctc": ".....**.............................................................********.**..........................................................",
"seq": "UCCAGUAGGAACACUAUACUACUGGAUAAUCAAAGACAAAUCUGCCCGAAGGGCUUGAGAACAUACCCAUUGCACUCCGGGUAUGCAGAGGUGGCAGCCUCCGGUGGGUUAAAACCCAACGUUCUCAACAAUAGUGA",
"str": "((((((((...[[[[[[.))))))))...............[[[[[(...).(.((((((((((((((..........)))))))..((((.]]]]]))))((.((((......)))).)))))))))).]]]]]]."
},
"3IVKA": {
"ctc": ".....................................................................*.*******........................****.........***..........",
"seq": "UCCAGUAGGAACACUAUACUACUGGAUAAUCAAAGACAAAUCUGCCCGAAGGGCUUGAGAACAUCGAAACACGAUGCAGAGGUGGCAGCCUCCGGUGGGUUAAAACCCAACGUUCUCAACAAUAGUGA",
"str": "((((((((...[[[[[[.))))))))...............[[[[[(...).(.((((((((((((.....)))))..((((.]]]]]))))((.((((......)))).)))))))))).]]]]]]."
},
"3IWN": {
"ctc": ".**.....................................................********.***.........................",
"seq": "CACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACGGCAUUGCACUCCGCCGUAGGUAGCGGGGUUACCGAUGG",
"str": "((((......((...(((.((....)).)))..[))...(((.((.(((((..((((..........))))))))).].)))))...)).))."
},
"3KTW": {
"ctc": "...............*************.......*..............................*****...****.................",
"seq": "AGAUAGUCGUGGGUUCCCUUUCUGGAGGGAGAGGGAAUUCCACGUUGACCGGGGGAACCGGCCAGGCCCGGAAGGGAGCAACCGUGCCCGGCUAU",
"str": "..(((..((.(((((((((((((....))))))))))).)).))....(.(((.....(((.....(((....))).....)))..))).).)))"
},
"3MUM": {
"ctc": "..............***........................................*******..**.......................",
"seq": "GUCACGCACAGAGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUACCGAUGG",
"str": "..(.((......((...((((((....))))))..[))...(((.((((((((..((..........))))))).]))))))...))...)"
},
"3MUR": {
"ctc": "....................................................****.********.**.......................",
"seq": "GUCACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUAUCGAUGG",
"str": "..(.((......((...((((((....))))))..[))...(((.((((((((...(..........).))))).]))))))...))...)"
},
"3NDB": {
"ctc": ".................................*************........**......................**********..*********.....................................",
"seq": "GUCUCGUCCCGUGGGGCUCGGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCGGCGCUCACGGGGGUGCGGGAC",
"str": "((((((..(((((.(((.(((((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).))))))).).))))).....))))))"
},
"3PIO": {
"ctc": ".....****.***.............********..**.******.*****..****..*............*******..***....*****.*.......****......******..",
"seq": "ACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGU",
"str": "((((((((((.....((.(((((....((((((...............)))..)))...)))))..)).((.......((.(((((....))))).)).......))...))))))))))"
},
"3PIP": {
"ctc": ".....****.***.............********..*********.*****..****..*............*******..***....*****.*......*****......*****...",
"seq": "ACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGU",
"str": "((((((((((.....(..(((((....(((((((.............))))..)))...)))))...).((.......((.((.((....)).)).)).......))...))))))))))"
},
"3V7E": {
"ctc": "........*....*.................****.................................**....................................*...................",
"seq": "GGCUUAUCAAGAGAGGUGGAGGGACUGGCCCGAUGAAACCCGGCAACCACUAGUCUAGCGUCAGCUUCGGCUGACGCUAGGCUAGUGGUGCCAAUUCCUGCAGCGGAAACGUUGAAAGAUGAGCCA",
"str": "((((((((....(.(((...(((.[.[[)))......))))(((..(((((((((((((((((.(....).))))))))))))))))).)))...(]].](((((....)))))..)))))))))."
},
"3W3S": {
"ctc": "...................**...........*............................................*....................",
"seq": "GGGAGAGGUUGGCCGGCUGGUGCCGCCCCGGGACUUCAAAUCCCGUGGGAGGUCCCGCAAGGGAGCUCCGGAGGGUUCGAUUCCCUCCCUCUCCCGCC",
"str": "((((((((..((.((((..[.))))))((((((.......))))).)((((.((((....)))).)))).(((((..]....)))))))))))))..."
},
"4IO9": {
"ctc": ".....*****.**..............********..**..*****.******.****...............****.**..***.....******.......*****.....******...",
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": ".((((((((((.....((.(((((....(((.((....(.....)....))...)))...)))))..))(((.......((.(((((....))))).)).......)))..))))))))))."
},
"4IOA": {
"ctc": "......****.**..............*****.**..**..*****.******.****...............*******..***.....*******.....******.....******...",
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": ".((((((((((.....((.(((((....((((((....(.....)....)))..)))...)))))..))(((.......((.(((((....))))).)).......)))..))))))))))."
},
"4P3EA": {
"ctc": "..******..................*****..*****.............***..*******..................****....***................*******.........",
"seq": "GACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUU",
"str": "(.((..(.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)...)).)"
},
"4P3EB": {
"ctc": "..******..................*****..*****.............***..*******..................****....***................*******.........",
"seq": "GACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUU",
"str": "(.((.((.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)..))).)"
},
"4UYJ": {
"ctc": "......*....*.***....***********...........*..........**...........****.****.........***.***...................",
"seq": "GGGGCUAGGCCGGGGGGUUCGGCGUCCCCUGUAACCGGAAACCGCCGAUAUGCCGGGGCCGAAGCCCGAGGGGCGGUUCCCGUAAGGGUUCCCACCCUCGGGCGUGCCUC",
"str": "(((((..(((((((((...[[[[[.)))))....((((....]]]]].....))))))))...(((((((((..((..(((....)))..))..)))))))))..)))))"
},
"4UYK": {
"ctc": "......*.............************.....................**...........****..........................................***...................",
"seq": "GGGGCUAGGCCGGGGGGUUCGGCGUCCCCUGUAACCGGAAACCGCCGAUAUGCCGGGGCCGAAGCCCGAGGGGCGGUUCCCGAAGCCGCCUCUGUAAGGAGGCGGUGGAGGGUUCCCACCCUCGGGCGUGCCUC",
"str": "(((((..(((((((((...[[[[[.)))))....((((....]]]]].....))))))))...(((((((((..((..(((...(((((((((....)))))))))...)))..))..)))))))))..)))))"
},
"4W90": {
"seq": "GCGCGCUUAAUCUGAAAUCAGAGCGGGGGACCCAUUGCACUCCGGGUUUUUCCCGUAAGGGGUGAAUCCUUUUUAGGUAGGGCGAAAGCCCGAAUCCGUCAGCUAACCUCGUAAGCGCGC",
"str": "(((((((...((((....))))((((((....(.........................(((((....(((....)))..((((....))))..))))..).)....)))))).)))))))"
},
"4WF9": {
"ctc": "...****..**.............*****.....*...***...******.****.*.............*****.*..***....**..***.....***.......****..",
"seq": "UCUGGUGACUAUAGCAAGGAGGUCACACCUGUUCCCAUGCCGAACACAGAAGUUAAGGUCUUUAGCGACGAUGGUAGCCAACUUACGUUCCGCUAGAGUAGAACGUUGCCAGGC",
"str": ".(..(..(.....((((.((......((((((...(.....)...))))..)).....)).)).))............(............)..............)..)..)."
},
"4XCO": {
"ctc": "..............*************........**........................*******....********................",
"seq": "GGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCG",
"str": "((((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).))))."
},
"4YB1": {
"ctc": "****..*....................................................................................",
"seq": "GGGCACGCACAGAGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACCUCGGUAGGUAGCGGGGUUACCGAUG",
"str": "...((((......((...((((((....))))))..[))...(((.(((((((...((..........)).)))).]))))))...)).))"
},
"5DM7": {
"ctc": "......*******..............*****..*..***.*****.******.*****..............*****....***.....****.***.....****......******...",
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": "..(((((((((.....((.(((((....((((((....(.....)....)))..)))...)))))..))(((.......((.(((((....))))).)).......)))..))))))))).."
},
"5JVGA": {
"ctc": ".....****..**.............*****.**..**..*****.******.*******............*****....***.....*******......*****.....*****...",
"seq": "ACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGU",
"str": "(((.((((((.....((.(.(((....(((((((...(.....)...))))..)))...))).)..)).((.......((.(((((....))))).)).......))...)))))).)))"
},
"5M73": {
"seq": "GGUGUCCGCACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGGGAUCGCGCCUA",
"str": "(((((((.((((..(.((((((.((((.(((((..((((((....))))))..))))).))))......(.((((......((((.....((....)).....))))....))))).)))))).)...))))))...))))).."
},
"5NRGA": {
"ctc": "...****..**.............******....**..*****..*****.****.*.............********.***....**.******..****.......****..",
"seq": "UCUGGUGACUAUAGCAAGGAGGUCACACCUGUUCCCAUGCCGAACACAGAAGUUAAGGUCUUUAGCGACGAUGGUAGCCAACUUACGUUCCGCUAGAGUAGAACGUUGCCAGGC",
"str": ".((.(..(.....((((.((......((((((...(.....)...))))..)).....)).)).))..........(.(............).)............)..).))."
}
}
>test_1ASY_R
*........****..........*******.********..........................**********
UCCGUGAUAGUUUAAUGGUCAGAAUGGGCGCUUGUCGCGUGCCAGAUCGGGGUUCAAUUCCCCGUCGCGGAGCCA
(((((((..((((........)))).(((((.......)))))....(((((.......))))))))))))....
>test_1B23_R
***..........................................*****........*****....*****..
GGCGCGUUAACAAAGCGGUUAUGUAGCGGAUUGCAAAUCCGUCUAGUCCGGUUCGACUCCGGAACGCGCCUCCA
(((((((..(((.........))).(((((.......)))))....(((((.......))))))))))))....
>test_1C0A_B
*....*...****...........******.*********...........................**********
GGAGCGGUAGUUCAGUCGGUUAGAAUACCUGCCUGUCACGCAGGGGGUCGCGGGUUCGAGUCCCGUCCGUUCCGCCA
(((((((..((((.........)))).(((((.......))))).....(((((.......))))))))))))....
>test_1DRZ_B
..........................................****.*******.***..............
GGCCGGCAUGGUCCCAGCCUCCUCGCUGGCGCCGGCUGGGCAACACCAUUGCACUCCGGUGGCGAAUGGGAC
(((((((.........(((........))))))))))......((((..........))))...........
>test_1EFW_C
.........***..............****..********............................*****
GGAGCGGUAGUUCAGUCGGUUAGAAUACCUGCCUGUCACGCAGGGGGUCGCGGGUUCGAGUCCCGUCCGUUCC
(((((((..((((.........)))).(((((.......))))).....(((((.......))))))))))))
>test_1EIY_C
.........**..............**..................*...................*....******
GCCGAGGUAGCUCAGUUGGUAGAGCAUGCGACUGAAAAUCGCAGUGUCCGCGGUUCGAUUCCGCGCCUCGGCACCA
(((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))....
>test_1EUQ_B
*******.*******.....******.....******..........................*********
GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAGCGAGGUUCGAAUCCUCGUACCCCAGCCA
((((((..(((.........))).(((((.......)))))...(((((.......))))))))))).....
>test_1EUY_B
*******.*******.....******.....******............................********
GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAAGCGAGGUUCGAAUCCUCGUACCCCAGCCA
((((((..(((.........))).(((((.......)))))....(((((.......))))))))))).....
>test_1F7U_B
..****.....*****.**.*****.....*..********..............*...........*********
UUCCUCGUGGCCCAAUGGUCACGGCGUCUGGCUGCGAACCAGAAGAUUCCAGGUUCAAGUCCUGGCGGGGAAGCCA
(((((((..(((..........))).(((((.......))))).....(((((.......))))))))))))....
>test_1F7V_B
..****.....*****.********.....*..********..............*...........*****.
UUCCUCGUGGCCCAAUGGUCACGGCGUCUGGCUGCGAACCAGAAGAUUCCAGGUUCAAGUCCUGGCGGGGAAG
(((((((..(((..........))).(((((.......))))).....(((((.......)))))))))))).
>test_1FFY_T
..****...*****.........****...**************........................*******
GGGCUUGUAGCUCAGGUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGGUGGUUCAAGUCCACUCAGGCCCAC
(((((((..((((.........)))).(((((.......))))).....(((((.......))))))))))))..
>test_1GAX_C
...***...******..***.***....**.****.********..........*..............******
GGGCGGCUAGCUCAGCGGAAGAGCGCUCGCCUCACACGCGAGAGGUCGUAGGUUCAAGUCCUACGCCGCCCACCA
(((((((..((((.......)))).(((((.......))))).....(((((.......))))))))))))....
>test_1GTR_B
*******.*******.....*****......*****..............................********
GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAUUCCGAGGUUCGAAUCCUCGUACCCCAGCCA
((((((..(((.........))).(((((.......))))).....(((((.......))))))))))).....
>test_1H3E_B
..........**...*...........****..****.....******.............................***...
GGGCAGGUUCCCGAGCGGCCAAAGGGGACGGUCUGUAAAACCGUUGGCGUAUGCCUUCGCUGGUUCGAAUCCAGCCCUGCCCA
(((((((..(((...........)))((((((.......))))))(((....)))...(((((.......)))))))))))).
>test_1H4Q_T
...*..................***..*******...***...........................
GGAGUAGCGCAGCCCGGUAGCGCACCUCGUUCGGGACGAGGGGGGCGCUGGUUCAGAUCCAGUCUCC
((((..((((.........)))).((((((.....)))))).....(((((.......)))))))))
>test_1HC8_C
.....****.****.*.......********.....*....................
CCAGGAUGUAGGCUUAGAAGCAGCCAUCAUUUAAAGAAAGCGUAAUAGCUCACUGGU
((((.(((..(((.........)))..))).....(..(((......)))).)))).
>test_1IL2_D
........****..........******.*********...........................***..
CCGUGAUAGUUUAAUGGUCAGAAUGGGCGCUUGUCGCGUGCCAGAUCGGGGUUCAAUUCCCCGUCGCGGA
((((((..((((........)))).(((((.......)))))....(((((.......))))))))))).
>test_1J1U_B
...........................***....****..............................**....
CCGGCGGUAGUUCAGCCUGGUAGAACGGCGGACUGUAGAUCCGCAUGUCGCUGGUUCAAAUCCGGCCCGCCGGA
(((((((..((((.........)))).(((((.......))))).....(((((.......)))))))))))).
>test_1J2B_C
*.....**************...****....***..*****......***.........*......******.****
GGGCCCGUGGUCUAGUUGGUCAUGACGCCGCCCUUACGAGGCGGAGGUCCGGGGUUCAAGUCCCCGCGGGCCCACCA
(((((((................(((..((((.......))))...)))(((((.......))))))))))))....
>test_1JJ2_9
...************.**.....*.*******.****..***.****************.......****.............*****..***...*****............*******..
UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCGGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAACCCGGUUCGCCGCCACC
...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...(((((.....((((((.((....))))))))....)))))...))))))...
>test_1KUQA_B
...................*****.....*****..*****......********..
GGGCGGCCUUCGGGCUAGACGGUGGGAGAGGCUUCGGCUGGUCCACCCGUGACGCUC
((((.(((....)))..(((((((((...(((....)))..))))).)))..)))))
>test_1L9A_B
.............................************.........................................*****....****..................................
GACACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAUUUGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGUC
((((((..(.(((((((((((.(((((..((((((....))))))..))))).)))).........(((((.....(.((....(((....)))....)).)...))))).))))))).)...))))))
>test_1LNG_B
................************.....................................*****....***....................
UCGGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUC
..(.((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).)).)
>test_1MFQ_A
............................*****.******................................**......*******....*******..............................
GACACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGUC
((((((..(.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)...))))))
>test_1MMS_C
......*********....*....*********....*......***...........
GCUGGGAUGUUGGCUUAGAAGCAGCCAUCAUUUAAAGAGUGCGUAACAGCUCACCAGC
(((((.(((..(((.........)))..))).....(...((......)).).)))))
>test_1MZP_B
...............**********..........********.******.....
GGGAUGCGUAGGAUAGGUGGGAGCCGCAAGGCGCCGGUGAAAUACCACCCUUCCC
((((.(.........(((((..(((....)))............)))))).))))
>test_1QA6_C
......*********..........*******.....*....................
GCCAGGAUGUAGGCUUAGAAGCAGCCAUCAUUUAAAGAAAGCGUAAUAGCUCACUGGU
(((((.(((..(((.........)))..))).....(..(((......)))).)))))
>test_1QF6_B
****......**.............*.....********.........*................***********
GCCGAUAUAGCUCAGUUGGUAGAGCAGCGCAUUCGUAAUGCGAAGGUCGUAGGUUCGACUCCUAUUAUCGGCACCA
(((((((..((((........))))..((((.......))))......(((((.......))))))))))))....
>test_1SJ3_R
...........................................**...*******..**..............
UGGCCGGCAUGGUCCCAGCCUCCUCGCUGGCGCCGGCUGGGCAACACCAUUGCACUCCGGUGGUGAAUGGGAC
.(((((((.........(((........))))))))))......((((..........))))...........
>test_1SJF_B
............................................**...*******.***..............
AUGGCCGGCAUGGUCCCAGCCUCCUCGCUGGCGCCGGCUGGGCAACACCAUUGCACUCCGGUGGUGAAUGGGAC
..(((((((.........(((........))))))))))......((((..........))))...........
>test_1U6P_B
.............................*****..............................................**..................*
GGCGGUACUAGUUGAGAAACUAGCUCUGUAUCUGGCGGACCCGUGGUGGAACUGUGAAGUUCGGAACACCCGGCCGCAACCCUGGGAGAGGUCCCAGGGUU
.((((..((((((....))))))..)))).....((((..(((.(((((((((....)))))....)))))))))))((((((((((....))))))))))
>test_1UN6_E
...*.******........****.****..............*****..............
GCCGGCCACACCUACGGGGCCUGGUUAGUACCUGGGAAACCUGGGAAUACCAGGUGCCGGC
((((((....((....))(((((((.....((..(....)..))....)))))))))))))
>test_1WZ2_C
*****.......***....**...****............*****.....******..........................*.****
GCGGGGGUUGCCGAGCCUGGUCAAAGGCGGGGGACUCAAGAUCCCCUCCCGUAGGGGUUCCGGGGUUCGAAUCCCCGCCCCCGCACCA
(((((((..(((.............))).(((((.......))))).(((....)))...(((((.......))))))))))))....
>test_1Y39_C
.....*********.........********.....*....................
CCAGGAUGUAUGCUUAGAAGCAGCAAUCAUUUAAAGAGUGCGUAAUAGCUCACUGGU
((((.(((..(((.........)))..))).....(...((......)).).)))).
>test_1Y69_9
.........***................................................................**.........****...........................
CCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGG
(((((((((.....((.(((((....((((((...............)))..)))...)))))..))(((.......((.(((((....))))).)).......)))..)))))))))
>test_1YHQ_9
...***************......********.****..***.****************.......****.............*****..***...****.............*******..
UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCAGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAAUCCGGUUCGCCGCCACC
...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...((.((.....((((((.((....))))))))....)).))...))))))...
>test_2AKE_B
.........................*....*****...............................**....
GACCUCGUGGCGCAAUGGUAGCGCGUCUGACUCCAGAUCAGAAGGUUGCGUGUUCGAAUCACGUCGGGGUCA
(((((((..((((.......)))).(((((.......))))).....(((((.......)))))))))))).
>test_2CSX_C
..**.........*............**.....**********............................****
GGCGGCGUAGCUCAGCUGGUCAGAGCGGGGAUCUCAUAAGUCCCAGGUCGGAGGUUCGAGUCCUCCCGCCGCCAC
(((.(((..((((.........)))).(((((.......))))).....(((((.......)))))))).)))..
>test_2CZJ_B
...********..*******...............*****................*****.
GGGGGUGAAACGGUCUCGACAGGGGUUCGCCUUUGGACGUGGGUUCGACUCCCACCACCUCC
(((((((............((((.(....).))))...(((((.......))))))))))))
>test_2D6F_E
****..............................................****.....***.........*
AGUCCCGUGGGGUAGUGGUAAUCCUGCUGGGCUUUGGACCCGGCGACAGCGGUUCGACUCCGCUCGGGACUA
(((((((...(........{...).((((((.......))))))...(((((......})))))))))))).
>test_2D6F_F
*****.............................................*****....***.........***
AGUCCCGUGGGGUAGUGGUAAUCCUGCUGGGCUUUGGACCCGGCGACAGCGGUUCGACUCCGCUCGGGACUACC
(((((((..(((..........)))((((((.......))))))...(((((.......))))))))))))...
>test_2DER_C
.........****...........*******..*********................................
GUCCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGGACGC
....(.(..((((.........))))((((((.......))))))...(((((.......)))))).)......
>test_2DER_D
.....***............*****....********..................................
CCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGGACG
.((((..((((.........))))((((((.......))))))...(((((.......)))))))))....
>test_2DET_C
.........***............*******.**********............................
GUCCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGG
..((.....((((.........))))((((((.......))))))....((((.......))))...)).
>test_2DR2_B
.........................*....*****...............................**.......
GACCUCGUGGCGCAAUGGUAGCGCGUCUGACUCCAGAUCAGAAGGUUGCGUGUUCGAAUCACGUCGGGGUCACCA
(((((((..(.((.......)).).(((((.......))))).....(((((.......))))))))))))....
>test_2DU3_D
.........**...............*..********............................*.....
GCCAGGGUGGCAGAGGGGCUUUGCGGCGGACUGCAGAUCCGCUUUACCCCGGUUCGAAUCCGGGCCCUGGC
(((((((..(((.........)))..((((.......))))......(((((.......))))))))))))
>test_2DU5_D
.........**..............*....*.******..........................**.....
GCCAGGGUGGCAGAGGGGCUUUGCGGCGGACUUCAGAUCCGCUUUACCCCGGUUCGAAUCCGGGCCCUGGC
(((((((..(((.........))).(((((.......))))).....(((((.......))))))))))))
>test_2FMT_C
*******..*****..........****........................................*********
CGCGGGGUGGAGCAGCCUGGUAGCUCGUCGGGCUCAUAACCCGAAGAUCGUCGGUUCAAAUCCGGCCCCCGCAACCA
.((((((..((((.........)))).(((((.......))))).....(((((.......))))))))))).....
>test_2HGH_B
*..******........*********............****.............
GGGCCAUACCUCUUGGGCCUGGUUAGUACCUCUUCGGUGGGAAUACCAGGUGCCC
((((....((....))(((((((.....((.(....).))....)))))))))))
>test_2IHX_B
......***.**......................******................******..*******....
CUGCCCUCAUCCGUCUCGCUUAUUCGGGGAGCGGACGAUGACCCUAGUAGAGGGGGCUGCGGCUUAGGAGGGCAG
((((((((...((((.(((((.......)))))))))....((((.....))))(((....)))...))))))))
>test_2L3J_B
........****.....********......**.......***......***.****.....***......
GGCAUUAAGGUGGGUGGAAUAGUAUAACAAUAUGCUAAAUGUUGUUAUAGUAUCCCACCUACCCUGAUGCC
(((((((.(((((((((.(((.(((((((((((.....))))))))))).))).))))))))).)))))))
>test_2MF0_G
.************....**.************.*...******************..*************..
UGUCGACGGAUAGACACAGCCAUCAAGGACGAUGGUCAGGACAUCGCAGGAAGCGAUUCAUCAGGACGAUGA
((((........))))..((((((......)))))).......((((.....))))..((((.....)))).
>test_2MQV_B
..............*****.............................********............
GGGCGAGGGUCUCCUCUGAGUGAUUGACUACCCGUCAGCGGGGGUCUUUCAUUUGGGGGCUCGUGCCC
(((((.((((((((.....((((..((((.((((....))))))))..))))..)))))))).)))))
>test_2MS0_B
**...*****............****************..........................****.**
GGCUCGUUGGUCUAGGGGUAUGAUUCUCGCUUAGGGUGCGAGAGGUCCCGGGUUCAAAUCCCGGACGAGCC
(((((....(((.........)))((((((.......))))))....(((((.......)))))..)))))
>test_2V3C_M
..............************...........******.****.....**....*********...**********........***....
GGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCG
((((..(((((.(((((((((....)))))))))..))))).....(((((.....(((.....(((....))).....)))..)))))..)))).
>test_2ZJQ_Y
......****.**..............********..**.******.******.****..*............*******..***.....******.......*****.....******...
CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU
.((((((((((.....((.(((((....(((((((...(.....)...))))..)))...)))))..))(((.......((.(((((....))))).)).......)))..)))))))))).
>test_2ZM5_C
.........*..............*****************...........................*****.
GCCCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGGGCAC
(((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))..
>test_2ZM5_D
........................*****************........................***.
GCCCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCG
....(((..((((........)))).(((((.......))))).....(((((.......)))))))).
>test_2ZNI_C
**.***.******......****......................................***********
GGGGGGUGGAUCGAAUAGAUCACACGGACUCUAAAUUCGUGCAGGCGGGUGAAACUCCCGUACUCCCCGCCA
((((.((.((((.....)))).((((.(.......).))))...(((((.......))))))).))))....
>test_2ZUEA_B
.*******...****.*.***..*****....*...*******.............***...........*****
GGACCGGUAGCCUAGCCAGGACAGGGCGGCGGCCUCCUAAGCCGCAGGUCCGGGGUUCAAAUCCCCGCCGGUCCG
(((((((..((((..........)))).(((((.......))))).....(((((.......)))))))))))).
>test_2ZUEB_B
.*******...****.*.***..*****....*...*******.............***...........******
GGACCGGUAGCCUAGCCAGGACAGGGCGGCGGCCUCCUAAGCCGCAGGUCCGGGGUUCAAAUCCCCGCCGGUCCGC
(((((((..((((..........)))).(((((.......))))).....(((((.......))))))))))))..
>test_2ZZM_B
..........****.....**...******...***********.....*.................**...........*...
GCAGGGGUCGCCAAGCCUGGCCAAAGGCGCUGGGCCUAGGACCCAGUCCCGUAGGGGUUCCAGGGUUCAAAUCCCUGCCCCUGC
....(((..(((.............)))((((((.......))))))(((....)))...(.(((.......))).))))....
>test_2ZZN_C
........*******..**********..***********.******.**...***.*.........**..
GCCGGGGUAGUCUAGGGGCUAGGCAGCGGACUGCAGAUCCGCCUUACGUGGGUUCAAAUCCCACCCCCGGC
(((((((..((((.......)))).(((((.......))))).....(((((.......))))))))))))
>test_3ADB_C
*............********.****...............****...*.....................**.**.....*..**....***
GGCCGCCGCCACCGGGGUGGUCCCCGGGCCGGACUUCAGAUCCGGCGCGCCCCGAGUGGGGCGCGGGGUUCAAUUCCCCGCGGCGGCCGCCA
(((((((((..((((((....))))))((((((.......))))))(((((((....)))))))((((.......)))))))))))))....
>test_3AKZ_F
.*******.*****.......***.**...*******..........................*********..
UGGGAGGUCGUCUAACGGUAGGACGGCGGACUCUGGAUCCGCUGGUGGAGGUUCGAGUCCUCCCCUCCCAGCCA
(((((((..((((.......)))).(((((.......)))))....(((((.......))))))))))))....
>test_3AM1_B
***.............*********..........***.......................*..**........*******
GGCGCGGGGUACCGGGCUUGGUAGCCCGGGGCUUCGGCCGAGGGCGAGAGCCCUCGGGGUUCGAUUCCCCCCCUGCGCCGC
(((((((((..(((((((....)))))))(((....)))((((((....))))))((((.......)))))))))))))..
>test_3AMT_B
***........................****..**.********..........................********
GGGCCCGUAGCUUAGCCAGGUCAGAGCGCCCGGCUCAUAACCGGGCGGUCGAGGGUUCGAAUCCCUCCGGGCCCACCA
(((((((..((((..........)))).((((.........)))).....(((((.......))))))))))))....
>test_3CUL_C
............................***********...................*................................
GAUGGCGAAAGCCAUUUCCGCAGGCCCCAUUGCACUCCGGGGUAUUGGCGUUAGGUGGUGGUACGAGGUUCGAAUCCUCGUACCGCAGCCA
((((((....))))))..(((..(((((..........)))))....)))...(((.(.(((((((((.......))))))))).).))).
>test_3CUL_D
.............................********.**................................................*...
GGAUGGCGAAAGCCAUUUCCGCAGGCCCCAUUGCACUCCGGGGUAUUGGCGUUAGGUGGUGGUACGAGGUUCGAAUCCUCGUACCGCAGCCA
(((((((....)))))).)(((..(((((..........)))))....)))...(((.(((((((((((.......))))))))))).))).
>test_3EGZ_B
..............................****.********.**...................
GAGGGAGAGGUGAAGAAUACGACCACCUAGGUACCAUUGCACUCCGGUACCUAAAACAUACCCUC
(((((...((((...........))))((((((((..........)))))))).......)))))
>test_3EPH_E
.......*........****..**********************.........................
CUCGUAUGGCGCAGUGGUAGCGCAGCAGAUUGCAAAUCUGUUGGUCCUUAGUUCGAUCCUGAGUGCGAG
((((.(..((((.......))))..((((.......))))......((.((.......)).))).))))
>test_3FOZ_C
.........*..............*****************...........................*****.
GCCCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCUGGGCAC
(((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))..
>test_3FOZ_D
...*..............*****************..................................
CGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGGGC
((((..((((........)))).(((((.......))))).....(.(((.......))).)))))...
>test_3HHN_C
...............................................................**...********.**..........................................................
UCCAGUAGGAACACUAUACUACUGGAUAAUCAAAGACAAAUCUGCCCGAAGGGCUUGAGAACAUACCCAUUGCACUCCGGGUAUGCAGAGGUGGCAGCCUCCGGUGGGUUAAAACCCAACGUUCUCAACAAUAGUGA
((((((((..........))))))))....................(...).(.((((((((((((((..........)))))))..((((......))))((.((((......)))).))))))))))........
>test_3HJW_D
..****........*****************.......*******.....********
GGGCCACGGAAACCGCGCGCGGUGAUCAAUGAGCCGCGUUCGCUCCCGUGGCCCACAA
(((((((((.......(((((((.........)))))))......)))))))))....
>test_3IRW_R
....................................................****.********.**......................
GUCACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUACCGAUG
..((((......((...((((((....))))))...))...(((.(((((((...((..........)).))))..))))))...)).))
>test_3IVKA_M
.....................................................................*.*******........................****.........***..........
UCCAGUAGGAACACUAUACUACUGGAUAAUCAAAGACAAAUCUGCCCGAAGGGCUUGAGAACAUCGAAACACGAUGCAGAGGUGGCAGCCUCCGGUGGGUUAAAACCCAACGUUCUCAACAAUAGUGA
((((((((..........))))))))....................(...).(.((((((((((((.....)))))..((((......))))((.((((......)))).))))))))))........
>test_3IWN_A
....................................................************.*******.....................
CACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACGGCAUUGCACUCCGCCGUAGGUAGCGGGGUUACCGAUGG
((((......((...((((((....))))))...))...(((.((((((((..((((..........)))))))))..))))))...)).)).
>test_3K0J_E
.............**....***********............*........*.....***.****.**...................
GCGACUCGGGGUGCCCUCCAUUGCACUCCGGAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUCGC
((((((((((.((((.(((..........))))))......)..)))).....(((...((((......))))...)))..))))))
>test_3KTW_C
................************...................................*..*****..****...***.............
AGAUAGUCGUGGGUUCCCUUUCUGGAGGGAGAGGGAAUUCCACGUUGACCGGGGGAACCGGCCAGGCCCGGAAGGGAGCAACCGUGCCCGGCUAUC
.(((...((((((((((((((((....))))))))))).))))).....((((.....(((.....(((....))).....)))..))))...)))
>test_3KTW_D
...............*************............................*.........*****...****.................
AGAUAGUCGUGGGUUCCCUUUCUGGAGGGAGAGGGAAUUCCACGUUGACCGGGGGAACCGGCCAGGCCCGGAAGGGAGCAACCGUGCCCGGCUAU
..(((..((.(((((((((((((....))))))))))).)).))....(.(((.....(((.....(((....))).....)))..))).).)))
>test_3MUM_R
....................................................***..*******..**.......................
GUCACGCACAGAGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUACCGAUGG
..(.((......((...((((((....))))))...))...(((.((((((((..((..........)))))))..))))))...))...)
>test_3MUR_R
....................................................****.********.**.......................
GUCACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUAUCGAUGG
..(.((......((...((((((....))))))...))...(((.((((((((...(..........).)))))..))))))...))...)
>test_3NDB_M
.................................*************..........................**....**********..*********.....................................
GUCUCGUCCCGUGGGGCUCGGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCGGCGCUCACGGGGGUGCGGGAC
((((((..(((((.(((.(((((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).))))))).).))))).....))))))
>test_3PIO_Y
.....****.***.............********..**.******.*****..****..*............*******..***....*****.*.......****......******..
ACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGU
((((((((((.....((.(((((....((((((...............)))..)))...)))))..)).((.......((.(((((....))))).)).......))...))))))))))
>test_3RW6_H
****....********.***.........*****..............***.**.......
GCACUAACCUAAGACAGGAGGGCCGGGAAACCUGCCUAAUCCAAUGACGGGUAAUAGUGUC
((((((((((......(((((((.((....)).))))..)))......))))..)))))).
>test_3TUP_T
**.......****..........***......*******..........................*********
GCCGAGGUAGCUCAGUUGGUAGAGCAUGCGACUGAAAAUCGCAGUGUCGGCGGUUCGAUUCUGCUCCUCGGCAC
(((((((..((((........)))).(((((.......)))))......((.(.......).)).)))))))..
>test_3U4M_B
...............**************..................................****.*******.....
GGGAUGCGUAGGAUAGGUGGGAGCCUGUGAACCCCCGCCUCCGGGUGGGGGGGAGGCGCCGGUGAAAUACCACCCUUCCC
((((.(.........(((((..(((......((((((((....))))))))...)))............)))))).))))
>test_3V7E_C
........*........**............****...................................................*......................................*
GGCUUAUCAAGAGAGGUGGAGGGACUGGCCCGAUGAAACCCGGCAACCACUAGUCUAGCGUCAGCUUCGGCUGACGCUAGGCUAGUGGUGCCAAUUCCUGCAGCGGAAACGUUGAAAGAUGAGCCA
((((((((....(.(((...(((.....)))......))))(((..(((((((((((((((((((....))))))))))))))))))).)))...(....(((((....)))))..))))))))).
>test_3W3S_B
...................**............................*...........................*....................
GGGAGAGGUUGGCCGGCUGGUGCCGCCCCGGGACUUCAAAUCCCGUGGGAGGUCCCGCAAGGGAGCUCCGGAGGGUUCGAUUCCCUCCCUCUCCCGCC
((((((((..((.((((....))))))((((((.......))))).)((((.((((....)))).)))).(((((.......)))))))))))))...
>test_3WFRA_A
*****..........**.*....................................*.............****
GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCA
(((((((....(...........)..(((((.......))))).....(((((.......)))))))))))).
>test_3WFRA_C
****...........**.*....................................*.............*****
GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCAC
(((((((....((........).)..(((((.......))))).....(((((.......))))))))))))..
>test_3WFRB_C
****...........**.*....................................*.............******
GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCACC
(((((((....((........).)..(((((.......))))).....(((((.......))))))))))))...
>test_3WQY_C
*****......****..***......................****..**....**.........**********
GGGCUCGUAGCUCAGCGGGAGAGCGCCGCCUUUGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCACCA
(((((((..((((.......)))).((((((.....)))))).....(((((.......))))))))))))....
>test_3WQZ_C
*****......****..***......................********....**........***********
GGACUCGUAGCUCAGCGGGAGAGCGCCGCCUUUGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCACCA
(((((((..((((.......)))).(((((((...))))))).....(((((.......))))))))))))....
>test_4BY9_A
.....*****************************......***************.....******.**...
GCGAGCAAUGAUGAGUGAUGGGCGAACUGAGCUCGAAAGAGCAAUGAUGAGUGAUGGGCGAACUGAGCUCGC
((((((......(.............(...((((....))))......).............)...))))))
>test_4KR2_C
**......***...........*****...******..........................******
CGCCGCUGGUGUAGUGGUAUCAUGCAAGAUUCCCAUUCUUGCGACCCGGGUUCGAUUCCCGGGCGGCG
((((((..(((.........)))((((((.......))))))....((((.......)))).))))))
>test_4KR3_C
*.......***...........**************..........................*******
CGCCGCUGGUGUAGUGGUAUCAUGCAAGAUUCCCAUUCUUGCGACCCGGGUUCGAUUCCCGGGCGGCGC
((((((..(((.........)))((((((.......))))))...(((((.......))))))))))).
>test_4KZD_R
...............................*.*******...........................................
GACGCGACCGAAAUGGUGAAGGACGGGUCCAGUGCGAAACACGCACUGUUGAGUAGAGUGUGAGCUCCGUAACUGGUCGCGUC
((((((((((..((((.(.....(...(((((((((.....)))))))..)).........)..).))))..).)))))))))
>test_4LCK_C
......****...................................................................................**.......
GGGUGCGAUGAGAAGAAGAGUAUUAAGGAUUUACUAUGAUUAGCGACUCUAGGAUAGUGAAAGCUAGAGGAUAGUAACCUUAAGAAGGCACUUCGAGCACCC
((((((.....((((.......((((((..(((((((.........((((((...........)))))).))))))))))))).......))))..))))))
>test_4M4O_B
...........................*.****.***.*....................
GGGUUCAUCAGGGCUAAAGAGUGCAGAGUUACUUAGUUCACUGCAGACUUGACGAACCC
((((((.(((((.((.......((((.(...(...)..).)))))).))))).))))))
>test_4P3EA_A
..******..................*****..*****.............***..*******..................****....***................*******.........
GACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUU
(.((..(.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)...)).)
>test_4PKD_V
*....***********........****....*************.........
GAUCCAUUGCACUCCGGAUCCAGGAGAUACCAUGAUCACGAAGGUGGUUUUCCU
(((((..........))))).((((((..((((..........)))).))))))
>test_4U7U_L
***************************************************..********
AUAAACCGGGCUCCCUGUCGGUUGUAAUUGAUAAUGUUGAGAGUUCCCCGCGCCAGCGGGG
.............................................((((((....))))))
>test_4UYJ_R
......*.............************.....................**...........*****.................***...................
GGGGCUAGGCCGGGGGGUUCGGCGUCCCCUGUAACCGGAAACCGCCGAUAUGCCGGGGCCGAAGCCCGAGGGGCGGUUCCCGUAAGGGUUCCCACCCUCGGGCGUGCCUC
(((((..(((((((((.........)))))....((((..............))))))))...((((((((((.((..(((....)))..))).)))))))))..)))))
>test_4UYK_R
......*.............************.....................**...........****..........................................***...................
GGGGCUAGGCCGGGGGGUUCGGCGUCCCCUGUAACCGGAAACCGCCGAUAUGCCGGGGCCGAAGCCCGAGGGGCGGUUCCCGAAGCCGCCUCUGUAAGGAGGCGGUGGAGGGUUCCCACCCUCGGGCGUGCCUC
(((((..(((((((((.........)))))....((((..............))))))))...(((((((((..((..(((...(((((((((....)))))))))...)))..))..)))))))))..)))))
>test_4W90_C
..............................*...******.......................................**.......................................
GCGCGCUUAAUCUGAAAUCAGAGCGGGGGACCCAUUGCACUCCGGGUUUUUCCCGUAAGGGGUGAAUCCUUUUUAGGUAGGGCGAAAGCCCGAAUCCGUCAGCUAACCUCGUAAGCGCGC
(((((((...((((....))))((((((....(.........................(((((....(((....)))..((((....))))..))))..).)....)))))).)))))))
>test_4WC3_B
*****............***..................................**.*...**........*****
GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCACCA
((((.((..((((........)))).(((((.......))))).....(((((.......))))))).))))....
>test_4WF9_Y
...****..**.............*****.....*...***...******.****.*.............*****.*..***....**..***.....***.......****..
UCUGGUGACUAUAGCAAGGAGGUCACACCUGUUCCCAUGCCGAACACAGAAGUUAAGGUCUUUAGCGACGAUGGUAGCCAACUUACGUUCCGCUAGAGUAGAACGUUGCCAGGC
.(..(..(.....((((.((......((((((...(.....)...))))..)).....)).)).))............(............)..............)..)..).
>test_4X0B_B
******............***..................................**.*...**........*****
GGGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCCACC
.((((.((..((((........)))).(.((.........)).).....(((((.......))))))).))))....
>test_4YB1_R
****.............................................................................*.........
GGGCACGCACAGAGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACCUCGGUAGGUAGCGGGGUUACCGAUG
...((((......((...((((((....))))))...))...(((.(((((((...((..........)).))))..))))))...)).))
>test_4YCO_D
............*****.******...***...........****.........**..................
GCGCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGCGCAC
(((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))..
>test_4YCO_E
........*****.******...................*..........**...............
GCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCG
..(((..((((........)))).(((((.......))))).....(((((.......)))))))).
>test_4YCO_F
............*****.******..............................**...............
GCGCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGCG
..(((((..((((........)))).(((((.......))))).....(((((.......)))))))))).
>test_4YVI_C
.........***.........*****..**.******..................................
UGGGAGGUCGUCUAACGGUAGGACGGCGGACUCUGGAUCCGCUGGUGGAGGUUCGAGUCCUCCCCUCCCAG
.((((((..((((.......))))((((((.......))))))...(((((.......)))))))))))..
>test_4YYE_C
**.......**.....*........****.**********..***.*..................********.
GUUAUAUUAGCUUAAUUGGUAGAGCAUUCGUUUUGUAAUCGAAAGGUUUGGGGUUCAAAUCCCUAAUAUAACAC
(((((((..((((........)))).((((.........)))).....(((((.......))))))))))))..
>test_4ZT0_B
******************...........***************.**************.........****
GAUGAGACGCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGU
..........((((((..((((....))))....))))))..(((..).)).......((((....))))..
>test_5AXM_P
***.............*..............................********.....*****..*****
GGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGAUCCACAGAAUCCCCAC
(((((..((((........)))).(((((.......))))).....(((((.......))))))))))....
>test_5CCBA_N
..........**..........**.......***.........******************....**..........
GGCCCGGAUAGCUCAGUCGGUAGAGCAUCAGACUUUUAAUCUGAGGGUCCAGGGUUCAAGUCCCUGUUCGGGCGCCA
((((((((..((((........)))).(((((.(...).))))).....(((((.......))))))))))))..).
>test_5D6G_0
...*****....******.......................*****...................***......
GCCUAAGACAGCGGGGAGGUUGGCUUAGAAGCAGCCAUCCUUUAAAGAGUGCGUAACAGCUCACCCGUCGAGGC
(((.......(((((.(((..(((.........)))..))).....(...((......)).).)))))...)))
>test_5DDP_A
............................***...********.**................
CGUUGACCCAGGAAACUGGGCGGAAGUAAGGUCCAUUGCACUCCGGGCCUGAAGCAACGCG
(((((.(((((....)))))........((((((..........))))))....)))))..
>test_5E6M_C
*.......***...........*****...******...****...................***********
CGCCGCUGGUGUAGUGGUAUCAUGCAAGAUUCCCAUUCUUGCGACCCGGGUUCGAUUCCCGGGCGGCGCACCA
((((((..(((.........)))((((((.......))))))...(((((.......))))))))))).....
>test_5HR6_C
......****.******....******************.............*..........****.
CCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGG
(((((..((((.........))))((((((.......))))))...(((((.......))))))))))
>test_5HR6_D
....****...........******************........................***..
CCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGG
((((..((((.........))))((((((.......))))))...(((((.......)))))))))
>test_5HR7_D
.....****.............*****************........................*****..
CCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGGAC
(((((..((((.........))))((((((.......))))))...(((((.......))))))))))..
>test_5M73_A
....****...*****..***...*.........*****..*****............*.***.*******...................***....***................**********........****......
GGUGUCCGCACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGGGAUCGCGCCUA
(((((((.(((((.(.(((((((((((.(((((..((((((....))))))..))))).))))......(.((((......((((....(((....)))....))))....)))))))))))).)..)))))))...)))))..
>test_5TF6_B
......**.********************.**................***.**.......**........
GGUCAAUUUGAAACAAUACAGAGAUGAUCAGCAGUUCCCCUGCAUAAGGAUGAACCGUUUUACAAAGAGAC
.(((..((((...................(((.((((.(((.....)))..)))).)))...))))..)))
>test_5V6X_C
........*.......***....................**********....***...****.......
GGAAACCUGAUCAUGUAGAUCGAAUGGACUCUAAAUCCGUUCAGCCGGGUUAGAUUCCCGGGGUUUCCGC
(((((((.((((.....))))(((((((.......)))))))..(((((.......))))))))))))..
>test_5V6X_D
........*......****....................**********...****...****.........
GGAAACCUGAUCAUGUAGAUCGAAUGGACUCUAAAUCCGUUCAGCCGGGUUAGAUUCCCGGGGUUUCCGCCA
(((((((.((((.....))))(((((((.......)))))))..(((((.......))))))))))))....
>test_5VW1_C
******************...........***************.**************.......*.****.
UUGUAAAAAAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGUC
..........((((((..((((....))))....))))))..(((..).)).......((((....))))...
>test_5XBL_B
******************..........****************.**************..........*************.*....
UGCGCUUGGCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUU
..........(((((...(((......))).....)))))..(.(..)..).......((((....)))).....(.....)......
{
"1JJ2": {
"seq": "UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCGGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAACCCGGUUCGCCGCCACC",
"str": "...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...(((((.....((((((.((....))))))))....)))))...))))))..."
},
"1L9A": {
"seq": "GACACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAUUUGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGUC",
"str": "((((((..(.(((((((((((.(((((..((((((....))))))..))))).)))).........(((((.....(.((....(((....)))....)).)...))))).))))))).)...))))))"
},
"1LNG": {
"seq": "UCGGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUC",
"str": "..(.((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).)).)"
},
"1MFQ": {
"seq": "GACACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGUC",
"str": "((((((..(.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)...))))))"
},
"1SM1": {
"seq": "CCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGG",
"str": "(((((((((.....((.(((((.....(((((...............)))..))....)))))..)).((.......((.(((((....))))).)).......))...)))))))))"
},
"1U6P": {
"seq": "GGCGGUACUAGUUGAGAAACUAGCUCUGUAUCUGGCGGACCCGUGGUGGAACUGUGAAGUUCGGAACACCCGGCCGCAACCCUGGGAGAGGUCCCAGGGUU",
"str": ".((((..((((((....))))))..)))).....((((..(((.(((((((((....)))))....)))))))))))((((((((((....))))))))))"
},
"1Y69": {
"seq": "CCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGG",
"str": "(((((((((.....((.(((((....((((((...............)))..)))...)))))..))(((.......((.(((((....))))).)).......)))..)))))))))"
},
"1YHQ": {
"seq": "UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCAGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAAUCCGGUUCGCCGCCACC",
"str": "...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...((.((.....((((((.((....))))))))....)).))...))))))..."
},
"1YI2": {
"seq": "UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCAGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAAUCCGGUUCGCCGCCACC",
"str": "...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...((.(......((((((.((....)))))))).....).))...))))))..."
},
"2V3C": {
"seq": "GGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCG",
"str": "(.((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).)).)."
},
"2ZJQ": {
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": ".((((((((((.....((.(((((....(((((((...(.....)...))))..)))...)))))..))(((.......((.(((((....))))).)).......)))..))))))))))."
},
"2ZJR": {
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": ".((((((((((.....((.(((((....(((((((...(.....)...))))..)))...)))))..)).((.......((.(((((....))))).)).......))...))))))))))."
},
"3ADB": {
"seq": "GGCCGCCGCCACCGGGGUGGUCCCCGGGCCGGACUUCAGAUCCGGCGCGCCCCGAGUGGGGCGCGGGGUUCAAUUCCCCGCGGCGGCCGCCA",
"str": "(((((((((..((((((..[.))))))((((((.......))))))(((((((....)))))))((((..]....)))))))))))))...."
},
"3CUL": {
"seq": "GGAUGGCGAAAGCCAUUUCCGCAGGCCCCAUUGCACUCCGGGGUAUUGGCGUUAGGUGGUGGUACGAGGUUCGAAUCCUCGUACCGCAGCCA",
"str": "(((((((....)))))).)(((..(((((..........)))))....)))...(((.(((((((((((.......))))))))))).)))."
},
"3CUN": {
"seq": "GAUGGCGAAAGCCAUUUCCGCAGGCCCCAUUGCACUCCGGGGUAUUGGCGUUAGGUGGUGGUACGAGGUUCGAAUCCUCGUACCGCAGCCA",
"str": "((((((....))))))..(.(...((((..........)))).....).)...((..(((((((((((.......)))))))))))..))."
},
"3DLL": {
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": ".((((((((((.....((.(((((.....((((((...(.....)...))))..))....)))))..)).((.......((.(((((....))))).)).......))...))))))))))."
},
"3HHN": {
"seq": "UCCAGUAGGAACACUAUACUACUGGAUAAUCAAAGACAAAUCUGCCCGAAGGGCUUGAGAACAUACCCAUUGCACUCCGGGUAUGCAGAGGUGGCAGCCUCCGGUGGGUUAAAACCCAACGUUCUCAACAAUAGUGA",
"str": "((((((((...[[[[[[.))))))))...............[[[[[(...).(.((((((((((((((..........)))))))..((((.]]]]]))))((.((((......)))).)))))))))).]]]]]]."
},
"3IVKA": {
"seq": "UCCAGUAGGAACACUAUACUACUGGAUAAUCAAAGACAAAUCUGCCCGAAGGGCUUGAGAACAUCGAAACACGAUGCAGAGGUGGCAGCCUCCGGUGGGUUAAAACCCAACGUUCUCAACAAUAGUGA",
"str": "((((((((...[[[[[[.))))))))...............[[[[[(...).(.((((((((((((.....)))))..((((.]]]]]))))((.((((......)))).)))))))))).]]]]]]."
},
"3IWN": {
"seq": "CACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACGGCAUUGCACUCCGCCGUAGGUAGCGGGGUUACCGAUGG",
"str": "((((......((...(((.((....)).)))..[))...(((.((.(((((..((((..........))))))))).].)))))...)).))."
},
"3KTW": {
"seq": "AGAUAGUCGUGGGUUCCCUUUCUGGAGGGAGAGGGAAUUCCACGUUGACCGGGGGAACCGGCCAGGCCCGGAAGGGAGCAACCGUGCCCGGCUAU",
"str": "..(((..((.(((((((((((((....))))))))))).)).))....(.(((.....(((.....(((....))).....)))..))).).)))"
},
"3MUM": {
"seq": "GUCACGCACAGAGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUACCGAUGG",
"str": "..(.((......((...((((((....))))))..[))...(((.((((((((..((..........))))))).]))))))...))...)"
},
"3MUR": {
"seq": "GUCACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUAUCGAUGG",
"str": "..(.((......((...((((((....))))))..[))...(((.((((((((...(..........).))))).]))))))...))...)"
},
"3NDB": {
"seq": "GUCUCGUCCCGUGGGGCUCGGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCGGCGCUCACGGGGGUGCGGGAC",
"str": "((((((..(((((.(((.(((((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).))))))).).))))).....))))))"
},
"3PIO": {
"seq": "ACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGU",
"str": "((((((((((.....((.(((((....((((((...............)))..)))...)))))..)).((.......((.(((((....))))).)).......))...))))))))))"
},
"3PIP": {
"seq": "ACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGU",
"str": "((((((((((.....(..(((((....(((((((.............))))..)))...)))))...).((.......((.((.((....)).)).)).......))...))))))))))"
},
"3V7E": {
"seq": "GGCUUAUCAAGAGAGGUGGAGGGACUGGCCCGAUGAAACCCGGCAACCACUAGUCUAGCGUCAGCUUCGGCUGACGCUAGGCUAGUGGUGCCAAUUCCUGCAGCGGAAACGUUGAAAGAUGAGCCA",
"str": "((((((((....(.(((...(((.[.[[)))......))))(((..(((((((((((((((((.(....).))))))))))))))))).)))...(]].](((((....)))))..)))))))))."
},
"3W3S": {
"seq": "GGGAGAGGUUGGCCGGCUGGUGCCGCCCCGGGACUUCAAAUCCCGUGGGAGGUCCCGCAAGGGAGCUCCGGAGGGUUCGAUUCCCUCCCUCUCCCGCC",
"str": "((((((((..((.((((..[.))))))((((((.......))))).)((((.((((....)))).)))).(((((..]....)))))))))))))..."
},
"4IO9": {
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": ".((((((((((.....((.(((((....(((.((....(.....)....))...)))...)))))..))(((.......((.(((((....))))).)).......)))..))))))))))."
},
"4IOA": {
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": ".((((((((((.....((.(((((....((((((....(.....)....)))..)))...)))))..))(((.......((.(((((....))))).)).......)))..))))))))))."
},
"4P3EA": {
"seq": "GACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUU",
"str": "(.((..(.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)...)).)"
},
"4P3EB": {
"seq": "GACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUU",
"str": "(.((.((.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)..))).)"
},
"4UYJ": {
"seq": "GGGGCUAGGCCGGGGGGUUCGGCGUCCCCUGUAACCGGAAACCGCCGAUAUGCCGGGGCCGAAGCCCGAGGGGCGGUUCCCGUAAGGGUUCCCACCCUCGGGCGUGCCUC",
"str": "(((((..(((((((((...[[[[[.)))))....((((....]]]]].....))))))))...(((((((((..((..(((....)))..))..)))))))))..)))))"
},
"4UYK": {
"seq": "GGGGCUAGGCCGGGGGGUUCGGCGUCCCCUGUAACCGGAAACCGCCGAUAUGCCGGGGCCGAAGCCCGAGGGGCGGUUCCCGAAGCCGCCUCUGUAAGGAGGCGGUGGAGGGUUCCCACCCUCGGGCGUGCCUC",
"str": "(((((..(((((((((...[[[[[.)))))....((((....]]]]].....))))))))...(((((((((..((..(((...(((((((((....)))))))))...)))..))..)))))))))..)))))"
},
"4W90": {
"seq": "GCGCGCUUAAUCUGAAAUCAGAGCGGGGGACCCAUUGCACUCCGGGUUUUUCCCGUAAGGGGUGAAUCCUUUUUAGGUAGGGCGAAAGCCCGAAUCCGUCAGCUAACCUCGUAAGCGCGC",
"str": "(((((((...((((....))))((((((....(.........................(((((....(((....)))..((((....))))..))))..).)....)))))).)))))))"
},
"4WF9": {
"seq": "UCUGGUGACUAUAGCAAGGAGGUCACACCUGUUCCCAUGCCGAACACAGAAGUUAAGGUCUUUAGCGACGAUGGUAGCCAACUUACGUUCCGCUAGAGUAGAACGUUGCCAGGC",
"str": ".(..(..(.....((((.((......((((((...(.....)...))))..)).....)).)).))............(............)..............)..)..)."
},
"4XCO": {
"seq": "GGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCG",
"str": "((((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).))))."
},
"4YB1": {
"seq": "GGGCACGCACAGAGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACCUCGGUAGGUAGCGGGGUUACCGAUG",
"str": "...((((......((...((((((....))))))..[))...(((.(((((((...((..........)).)))).]))))))...)).))"
},
"5DM7": {
"seq": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"str": "..(((((((((.....((.(((((....((((((....(.....)....)))..)))...)))))..))(((.......((.(((((....))))).)).......)))..))))))))).."
},
"5JVGA": {
"seq": "ACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGU",
"str": "(((.((((((.....((.(.(((....(((((((...(.....)...))))..)))...))).)..)).((.......((.(((((....))))).)).......))...)))))).)))"
},
"5M73": {
"seq": "GGUGUCCGCACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGGGAUCGCGCCUA",
"str": "(((((((.((((..(.((((((.((((.(((((..((((((....))))))..))))).))))......(.((((......((((.....((....)).....))))....))))).)))))).)...))))))...))))).."
},
"5NRGA": {
"seq": "UCUGGUGACUAUAGCAAGGAGGUCACACCUGUUCCCAUGCCGAACACAGAAGUUAAGGUCUUUAGCGACGAUGGUAGCCAACUUACGUUCCGCUAGAGUAGAACGUUGCCAGGC",
"str": ".((.(..(.....((((.((......((((((...(.....)...))))..)).....)).)).))..........(.(............).)............)..).))."
}
}
\ No newline at end of file
{
"1ASY": {
"R": {
"contacts": "*........****..........*******.********..........................**********",
"sequence": "UCCGUGAUAGUUUAAUGGUCAGAAUGGGCGCUUGUCGCGUGCCAGAUCGGGGUUCAAUUCCCCGUCGCGGAGCCA",
"struct2d": "(((((((..((((........)))).(((((.......)))))....(((((.......))))))))))))...."
},
"pfams": [
"PF01336",
"PF00152"
]
},
"1B23": {
"R": {
"contacts": "***..........................................*****........*****....*****..",
"sequence": "GGCGCGUUAACAAAGCGGUUAUGUAGCGGAUUGCAAAUCCGUCUAGUCCGGUUCGACUCCGGAACGCGCCUCCA",
"struct2d": "(((((((..(((.........))).(((((.......)))))....(((((.......))))))))))))...."
},
"pfams": [
"PF00009",
"PF03144",
"PF03143"
]
},
"1C0A": {
"B": {
"contacts": "*....*...****...........******.*********...........................**********",
"sequence": "GGAGCGGUAGUUCAGUCGGUUAGAAUACCUGCCUGUCACGCAGGGGGUCGCGGGUUCGAGUCCCGUCCGUUCCGCCA",
"struct2d": "(((((((..((((.........)))).(((((.......))))).....(((((.......))))))))))))...."
},
"pfams": [
"PF01336",
"PF02938",
"PF00152"
]
},
"1DRZ": {
"B": {
"contacts": "..........................................****.*******.***..............",
"sequence": "GGCCGGCAUGGUCCCAGCCUCCUCGCUGGCGCCGGCUGGGCAACACCAUUGCACUCCGGUGGCGAAUGGGAC",
"struct2d": "(((((((.........(((........))))))))))......((((..........))))..........."
},
"pfams": [
"PF00076"
]
},
"1EFW": {
"C": {
"contacts": ".........***..............****..********............................*****",
"sequence": "GGAGCGGUAGUUCAGUCGGUUAGAAUACCUGCCUGUCACGCAGGGGGUCGCGGGUUCGAGUCCCGUCCGUUCC",
"struct2d": "(((((((..((((.........)))).(((((.......))))).....(((((.......))))))))))))"
},
"D": {
"contacts": ".........**...............****..********.............................*.**",
"sequence": "GGAGCGGUAGUUCAGUCGGUUAGAAUACCUGCCUGUCACGCAGGGGGUCGCGGG&UCGAGUCCCGUCCGUUCC",
"struct2d": "(((((((..((((.........)))).(((((.......))))).....(((((&......))))))))))))"
},
"pfams": [
"PF01336",
"PF02938",
"PF00152"
]
},
"1EIY": {
"C": {
"contacts": ".........**..............**..................*...................*....******",
"sequence": "GCCGAGGUAGCUCAGUUGGUAGAGCAUGCGACUGAAAAUCGCAGUGUCCGCGGUUCGAUUCCGCGCCUCGGCACCA",
"struct2d": "(((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))...."
},
"pfams": [
"PF17759",
"PF01409",
"PF03484",
"PF03483",
"PF03147",
"PF02912",
"PF01588"
]
},
"1EUQ": {
"B": {
"contacts": "*******.*******.....******.....******..........................*********",
"sequence": "GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAGCGAGGUUCGAAUCCUCGUACCCCAGCCA",
"struct2d": "((((((..(((.........))).(((((.......)))))...(((((.......)))))))))))....."
},
"pfams": [
"PF00749",
"PF03950"
]
},
"1EUY": {
"B": {
"contacts": "*******.*******.....******.....******............................********",
"sequence": "GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAAGCGAGGUUCGAAUCCUCGUACCCCAGCCA",
"struct2d": "((((((..(((.........))).(((((.......)))))....(((((.......)))))))))))....."
},
"pfams": [
"PF00749",
"PF03950"
]
},
"1F7U": {
"B": {
"contacts": "..****.....*****.**.*****.....*..********..............*...........*********",
"sequence": "UUCCUCGUGGCCCAAUGGUCACGGCGUCUGGCUGCGAACCAGAAGAUUCCAGGUUCAAGUCCUGGCGGGGAAGCCA",
"struct2d": "(((((((..(((..........))).(((((.......))))).....(((((.......))))))))))))...."
},
"pfams": [
"PF00750",
"PF05746",
"PF03485"
]
},
"1F7V": {
"B": {
"contacts": "..****.....*****.********.....*..********..............*...........*****.",
"sequence": "UUCCUCGUGGCCCAAUGGUCACGGCGUCUGGCUGCGAACCAGAAGAUUCCAGGUUCAAGUCCUGGCGGGGAAG",
"struct2d": "(((((((..(((..........))).(((((.......))))).....(((((.......))))))))))))."
},
"pfams": [
"PF00750",
"PF05746",
"PF03485"
]
},
"1FFY": {
"T": {
"contacts": "..****...*****.........****...**************........................*******",
"sequence": "GGGCUUGUAGCUCAGGUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGGUGGUUCAAGUCCACUCAGGCCCAC",
"struct2d": "(((((((..((((.........)))).(((((.......))))).....(((((.......)))))))))))).."
},
"pfams": [
"PF06827",
"PF08264",
"PF00133"
]
},
"1GAX": {
"C": {
"contacts": "...***...******..***.***....**.****.********..........*..............******",
"sequence": "GGGCGGCUAGCUCAGCGGAAGAGCGCUCGCCUCACACGCGAGAGGUCGUAGGUUCAAGUCCUACGCCGCCCACCA",
"struct2d": "(((((((..((((.......)))).(((((.......))))).....(((((.......))))))))))))...."
},
"pfams": [
"PF00133",
"PF08264",
"PF10458"
]
},
"1GTR": {
"B": {
"contacts": "*******.*******.....*****......*****..............................********",
"sequence": "GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAUUCCGAGGUUCGAAUCCUCGUACCCCAGCCA",
"struct2d": "((((((..(((.........))).(((((.......))))).....(((((.......)))))))))))....."
},
"pfams": [
"PF00749",
"PF03950"
]
},
"1H3E": {
"B": {
"contacts": "..........**...*...........****..****.....******.............................***...",
"sequence": "GGGCAGGUUCCCGAGCGGCCAAAGGGGACGGUCUGUAAAACCGUUGGCGUAUGCCUUCGCUGGUUCGAAUCCAGCCCUGCCCA",
"struct2d": "(((((((..(((...........)))((((((.......))))))(((....)))...(((((.......))))))))))))."
},
"pfams": [
"PF00579"
]
},
"1H4Q": {
"T": {
"contacts": "...*..................***..*******...***...........................",
"sequence": "GGAGUAGCGCAGCCCGGUAGCGCACCUCGUUCGGGACGAGGGGGGCGCUGGUUCAGAUCCAGUCUCC",
"struct2d": "((((..((((.........)))).((((((.....)))))).....(((((.......)))))))))"
},
"pfams": [
"PF03129",
"PF00587",
"PF09180"
]
},
"1HC8": {
"C": {
"contacts": ".....****.****.*.......********.....*....................",
"sequence": "CCAGGAUGUAGGCUUAGAAGCAGCCAUCAUUUAAAGAAAGCGUAAUAGCUCACUGGU",
"struct2d": "((((.(((..(((.........)))..))).....(..(((......)))).))))."
},
"pfams": [
"PF00298"
]
},
"1IL2": {
"D": {
"contacts": "........****..........******.*********...........................***..",
"sequence": "CCGUGAUAGUUUAAUGGUCAGAAUGGGCGCUUGUCGCGUGCCAGAUCGGGGUUCAAUUCCCCGUCGCGGA",
"struct2d": "((((((..((((........)))).(((((.......)))))....(((((.......)))))))))))."
},
"pfams": [
"PF01336",
"PF02938",
"PF00152"
]
},
"1J1U": {
"B": {
"contacts": "...........................***....****..............................**....",
"sequence": "CCGGCGGUAGUUCAGCCUGGUAGAACGGCGGACUGUAGAUCCGCAUGUCGCUGGUUCAAAUCCGGCCCGCCGGA",
"struct2d": "(((((((..((((.........)))).(((((.......))))).....(((((.......))))))))))))."
},
"pfams": [
"PF00579"
]
},
"1J2B": {
"C": {
"contacts": "*.....**************...****....***..*****......***.........*......******.****",
"sequence": "GGGCCCGUGGUCUAGUUGGUCAUGACGCCGCCCUUACGAGGCGGAGGUCCGGGGUUCAAGUCCCCGCGGGCCCACCA",
"struct2d": "(((((((................(((..((((.......))))...)))(((((.......))))))))))))...."
},
"pfams": [
"PF01702",
"PF01472",
"PF14810",
"PF14809"
]
},
"1JJ2": {
"9": {
"contacts": "...************.**.....*.*******.****..***.****************.......****.............*****..***...*****............*******..",
"sequence": "UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCGGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAACCCGGUUCGCCGCCACC",
"struct2d": "...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...(((((.....((((((.((....))))))))....)))))...))))))..."
},
"pfams": [
"PF00572",
"PF17144",
"PF01198",
"PF01248",
"PF01780",
"PF00237",
"PF00347",
"PF03947",
"PF00673",
"PF00827",
"PF01907",
"PF00828",
"PF01280",
"PF00297",
"PF00467",
"PF16906",
"PF00832",
"PF00935",
"PF00573",
"PF01655",
"PF00276",
"PF01246",
"PF00181",
"PF01157",
"PF00831",
"PF00238",
"PF00252",
"PF00327",
"PF00281",
"PF00466"
]
},
"1KUQA": {
"B": {
"contacts": "...................*****.....*****..*****......********..",
"sequence": "GGGCGGCCUUCGGGCUAGACGGUGGGAGAGGCUUCGGCUGGUCCACCCGUGACGCUC",
"struct2d": "((((.(((....)))..(((((((((...(((....)))..))))).)))..)))))"
},
"pfams": [
"PF00312"
]
},
"1L9A": {
"B": {
"contacts": ".............................************.........................................*****....****..................................",
"sequence": "GACACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAUUUGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGUC",
"struct2d": "((((((..(.(((((((((((.(((((..((((((....))))))..))))).)))).........(((((.....(.((....(((....)))....)).)...))))).))))))).)...))))))"
},
"pfams": [
"PF01922"
]
},
"1LNG": {
"B": {
"contacts": "................************.....................................*****....***....................",
"sequence": "UCGGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUC",
"struct2d": "..(.((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).)).)"
},
"pfams": [
"PF01922"
]
},
"1MFQ": {
"A": {
"contacts": "............................*****.******................................**......*******....*******..............................",
"sequence": "GACACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGUC",
"struct2d": "((((((..(.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)...))))))"
},
"pfams": [
"PF02978",
"PF01922"
]
},
"1MMS": {
"C": {
"contacts": "......*********....*....*********....*......***...........",
"sequence": "GCUGGGAUGUUGGCUUAGAAGCAGCCAUCAUUUAAAGAGUGCGUAACAGCUCACCAGC",
"struct2d": "(((((.(((..(((.........)))..))).....(...((......)).).)))))"
},
"pfams": [
"PF00298",
"PF03946"
]
},
"1MZP": {
"B": {
"contacts": "...............**********..........********.******.....",
"sequence": "GGGAUGCGUAGGAUAGGUGGGAGCCGCAAGGCGCCGGUGAAAUACCACCCUUCCC",
"struct2d": "((((.(.........(((((..(((....)))............)))))).))))"
},
"pfams": [
"PF00687"
]
},
"1OB5": {
"B": {
"contacts": "***........*...........***..........****.........*****........*****.....***.",
"sequence": "GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCAC&A",
"struct2d": "(((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))..&."
},
"pfams": [
"PF00009",
"PF03144",
"PF03143"
]
},
"1QA6": {
"C": {
"contacts": "......*********..........*******.....*....................",
"sequence": "GCCAGGAUGUAGGCUUAGAAGCAGCCAUCAUUUAAAGAAAGCGUAAUAGCUCACUGGU",
"struct2d": "(((((.(((..(((.........)))..))).....(..(((......)))).)))))"
},
"pfams": [
"PF00298"
]
},
"1QF6": {
"B": {
"contacts": "****......**.............*.....********.........*................***********",
"sequence": "GCCGAUAUAGCUCAGUUGGUAGAGCAGCGCAUUCGUAAUGCGAAGGUCGUAGGUUCGACUCCUAUUAUCGGCACCA",
"struct2d": "(((((((..((((........))))..((((.......))))......(((((.......))))))))))))...."
},
"pfams": [
"PF02824",
"PF07973",
"PF00587",
"PF03129"
]
},
"1SER": {
"T": {
"contacts": "..............*...........***...............*******.......****.*",
"sequence": "GAGGUGCCCGAGUGGCUGAAGGG&GGUAGGGGGG&CCUCCCUCGCGGGUUCGAAUCCCGCCCUC",
"struct2d": "((((..(((...........)))&...(((((((&))))))).(((((.......)))))))))"
},
"pfams": [
"PF02403",
"PF00587"
]
},
"1SJ3": {
"R": {
"contacts": "...........................................**...*******..**..............",
"sequence": "UGGCCGGCAUGGUCCCAGCCUCCUCGCUGGCGCCGGCUGGGCAACACCAUUGCACUCCGGUGGUGAAUGGGAC",
"struct2d": ".(((((((.........(((........))))))))))......((((..........))))..........."
},
"pfams": [
"PF00076"
]
},
"1SJF": {
"B": {
"contacts": "............................................**...*******.***..............",
"sequence": "AUGGCCGGCAUGGUCCCAGCCUCCUCGCUGGCGCCGGCUGGGCAACACCAUUGCACUCCGGUGGUGAAUGGGAC",
"struct2d": "..(((((((.........(((........))))))))))......((((..........))))..........."
},
"pfams": [
"PF00076"
]
},
"1U6P": {
"B": {
"contacts": ".............................*****..............................................**..................*",
"sequence": "GGCGGUACUAGUUGAGAAACUAGCUCUGUAUCUGGCGGACCCGUGGUGGAACUGUGAAGUUCGGAACACCCGGCCGCAACCCUGGGAGAGGUCCCAGGGUU",
"struct2d": ".((((..((((((....))))))..)))).....((((..(((.(((((((((....)))))....)))))))))))((((((((((....))))))))))"
},
"pfams": [
"PF00098"
]
},
"1UN6": {
"E": {
"contacts": "...*.******........****.****..............*****..............",
"sequence": "GCCGGCCACACCUACGGGGCCUGGUUAGUACCUGGGAAACCUGGGAAUACCAGGUGCCGGC",
"struct2d": "((((((....((....))(((((((.....((..(....)..))....)))))))))))))"
},
"pfams": [
"PF00096"
]
},
"1VC5": {
"B": {
"contacts": "........................................**...***********..............",
"sequence": "GGCCGGCAUGGUCCCAGCCUCCUC&UGGCGCCGGCUGGGCAACACCAUUGCACUCCGGUGGCGAAUGGGA",
"struct2d": "(((((((.........(((.....&.))))))))))......((((..........)))).........."
},
"pfams": [
"PF00076"
]
},
"1WZ2": {
"C": {
"contacts": "*****.......***....**...****............*****.....******..........................*.****",
"sequence": "GCGGGGGUUGCCGAGCCUGGUCAAAGGCGGGGGACUCAAGAUCCCCUCCCGUAGGGGUUCCGGGGUUCGAAUCCCCGCCCCCGCACCA",
"struct2d": "(((((((..(((.............))).(((((.......))))).(((....)))...(((((.......))))))))))))...."
},
"pfams": [
"PF08264",
"PF00133"
]
},
"1Y39": {
"C": {
"contacts": ".....*********.........********.....*....................",
"sequence": "CCAGGAUGUAUGCUUAGAAGCAGCAAUCAUUUAAAGAGUGCGUAAUAGCUCACUGGU",
"struct2d": "((((.(((..(((.........)))..))).....(...((......)).).))))."
},
"pfams": [
"PF00298"
]
},
"1Y69": {
"9": {
"contacts": ".........***................................................................**.........****...........................",
"sequence": "CCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGG",
"struct2d": "(((((((((.....((.(((((....((((((...............)))..)))...)))))..))(((.......((.(((((....))))).)).......)))..)))))))))"
},
"pfams": [
"PF01765",
"PF01016",
"PF00252"
]
},
"1YHQ": {
"9": {
"contacts": "...***************......********.****..***.****************.......****.............*****..***...****.............*******..",
"sequence": "UUAGGCGGCCACAGCGGUGGGGUUGCCUCCCGUACCCAUCCCGAACACGGAAGAUAAGCCCACCAGCGUUCCAGGGAGUACUGGAGUGCGCGAGCCUCUGGGAAAUCCGGUUCGCCGCCACC",
"struct2d": "...((((((....((((((((......((((((...(.....)...))))..))....)))))).))...((.((.....((((((.((....))))))))....)).))...))))))..."
},
"pfams": [
"PF00572",
"PF17144",
"PF01198",
"PF01248",
"PF01780",
"PF00237",
"PF00347",
"PF03947",
"PF00673",
"PF00827",
"PF01907",
"PF00828",
"PF01280",
"PF00297",
"PF00467",
"PF16906",
"PF00832",
"PF00935",
"PF00573",
"PF01655",
"PF00276",
"PF01246",
"PF00181",
"PF00298",
"PF01157",
"PF00831",
"PF00238",
"PF00252",
"PF00327",
"PF00281",
"PF00466"
]
},
"2AKE": {
"B": {
"contacts": ".........................*....*****...............................**....",
"sequence": "GACCUCGUGGCGCAAUGGUAGCGCGUCUGACUCCAGAUCAGAAGGUUGCGUGUUCGAAUCACGUCGGGGUCA",
"struct2d": "(((((((..((((.......)))).(((((.......))))).....(((((.......))))))))))))."
},
"pfams": [
"PF00579"
]
},
"2BTE": {
"B": {
"contacts": "..**.......****...**..*****.**....****....*..............**............*******",
"sequence": "GCCGGGGUGGCGGAAUGGGUAGACGCGCAUGAC&AUCAUGUGCGCAAGCGUGCGGGUUCAAGUCCCGCCCCCGGCACC",
"struct2d": "(((((((..(((...........)))((((((.&.))))))((....))..(((((.......))))))))))))..."
},
"pfams": [
"PF14795",
"PF08264",
"PF13603",
"PF00133"
]
},
"2CSX": {
"C": {
"contacts": "..**.........*............**.....**********............................****",
"sequence": "GGCGGCGUAGCUCAGCUGGUCAGAGCGGGGAUCUCAUAAGUCCCAGGUCGGAGGUUCGAGUCCUCCCGCCGCCAC",
"struct2d": "(((.(((..((((.........)))).(((((.......))))).....(((((.......)))))))).))).."
},
"pfams": [
"PF08264",
"PF09334"
]
},
"2CZJ": {
"B": {
"contacts": "...********..*******...............*****................*****.",
"sequence": "GGGGGUGAAACGGUCUCGACAGGGGUUCGCCUUUGGACGUGGGUUCGACUCCCACCACCUCC",
"struct2d": "(((((((............((((.(....).))))...(((((.......))))))))))))"
},
"pfams": [
"PF01668"
]
},
"2D6F": {
"E": {
"contacts": "****..............................................****.....***.........*",
"sequence": "AGUCCCGUGGGGUAGUGGUAAUCCUGCUGGGCUUUGGACCCGGCGACAGCGGUUCGACUCCGCUCGGGACUA",
"struct2d": "(((((((...(........{...).((((((.......))))))...(((((......}))))))))))))."
},
"F": {
"contacts": "*****.............................................*****....***.........***",
"sequence": "AGUCCCGUGGGGUAGUGGUAAUCCUGCUGGGCUUUGGACCCGGCGACAGCGGUUCGACUCCGCUCGGGACUACC",
"struct2d": "(((((((..(((..........)))((((((.......))))))...(((((.......))))))))))))..."
},
"pfams": [
"PF17763",
"PF02637",
"PF02934",
"PF00710",
"PF18195",
"PF02938"
]
},
"2DER": {
"C": {
"contacts": ".........****...........*******..*********................................",
"sequence": "GUCCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGGACGC",
"struct2d": "....(.(..((((.........))))((((((.......))))))...(((((.......)))))).)......"
},
"D": {
"contacts": ".....***............*****....********..................................",
"sequence": "CCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGGACG",
"struct2d": ".((((..((((.........))))((((((.......))))))...(((((.......)))))))))...."
},
"pfams": [
"PF03054"
]
},
"2DET": {
"C": {
"contacts": ".........***............*******.**********............................",
"sequence": "GUCCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGG",
"struct2d": "..((.....((((.........))))((((((.......))))))....((((.......))))...))."
},
"pfams": [
"PF03054"
]
},
"2DLC": {
"Y": {
"contacts": "...........................****.***.........................***......",
"sequence": "CUCUCGGUAGCCAAG&GG&AAGGCGCAAG&GUAAAUCUUGAG&UCGGGCGUUCGACUCGCCCCCGGGAG",
"struct2d": "(((((((..(((...&..&..))).(...&.........)..&..(((((.......))))))))))))"
},
"pfams": [
"PF00579"
]
},
"2DR2": {
"B": {
"contacts": ".........................*....*****...............................**.......",
"sequence": "GACCUCGUGGCGCAAUGGUAGCGCGUCUGACUCCAGAUCAGAAGGUUGCGUGUUCGAAUCACGUCGGGGUCACCA",
"struct2d": "(((((((..(.((.......)).).(((((.......))))).....(((((.......))))))))))))...."
},
"pfams": [
"PF00579"
]
},
"2DU3": {
"D": {
"contacts": ".........**...............*..********............................*.....",
"sequence": "GCCAGGGUGGCAGAGGGGCUUUGCGGCGGACUGCAGAUCCGCUUUACCCCGGUUCGAAUCCGGGCCCUGGC",
"struct2d": "(((((((..(((.........)))..((((.......))))......(((((.......))))))))))))"
},
"pfams": [
"PF18006",
"PF01409"
]
},
"2DU5": {
"D": {
"contacts": ".........**..............*....*.******..........................**.....",
"sequence": "GCCAGGGUGGCAGAGGGGCUUUGCGGCGGACUUCAGAUCCGCUUUACCCCGGUUCGAAUCCGGGCCCUGGC",
"struct2d": "(((((((..(((.........))).(((((.......))))).....(((((.......))))))))))))"
},
"pfams": [
"PF18006",
"PF01409"
]
},
"2FK6": {
"R": {
"contacts": ".***..............*..............***....****.......**",
"sequence": "GCUUCCAUAGCUCAGCAGGUAGAGC&GUCAGCGGUUCGAGCCCGCUUGGAAGC",
"struct2d": "(((((((..((((........))))&...(((((.......))))))))))))"
},
"pfams": [
"PF12706"
]
},
"2FMT": {
"C": {
"contacts": "*******..*****..........****........................................*********",
"sequence": "CGCGGGGUGGAGCAGCCUGGUAGCUCGUCGGGCUCAUAACCCGAAGAUCGUCGGUUCAAAUCCGGCCCCCGCAACCA",
"struct2d": ".((((((..((((.........)))).(((((.......))))).....(((((.......)))))))))))....."
},
"pfams": [
"PF00551",
"PF02911"
]
},
"2HGH": {
"B": {
"contacts": "*..******........*********............****.............",
"sequence": "GGGCCAUACCUCUUGGGCCUGGUUAGUACCUCUUCGGUGGGAAUACCAGGUGCCC",
"struct2d": "((((....((....))(((((((.....((.(....).))....)))))))))))"
},
"pfams": [
"PF00096"
]
},
"2HVY": {
"E": {
"contacts": "..****........*********..****.....*.*****......*******...",
"sequence": "GGGUCCGCCUUGA&UGCCCGGGUGA&AAGCAUGAUCCCGGGUAAU&AUGGCGGACCC",
"struct2d": "(((((((((....&.(((((((...&.........)))))))...&..)))))))))"
},
"pfams": [
"PF01248",
"PF01472",
"PF08068",
"PF04410",
"PF04135",
"PF01509",
"PF16198"
]
},
"2IHX": {
"B": {
"contacts": "......***.**......................******................******..*******....",
"sequence": "CUGCCCUCAUCCGUCUCGCUUAUUCGGGGAGCGGACGAUGACCCUAGUAGAGGGGGCUGCGGCUUAGGAGGGCAG",
"struct2d": "((((((((...((((.(((((.......)))))))))....((((.....))))(((....)))...))))))))"
},
"pfams": [
"PF00098"
]
},
"2L3J": {
"B": {
"contacts": "........****.....********......**.......***......***.****.....***......",
"sequence": "GGCAUUAAGGUGGGUGGAAUAGUAUAACAAUAUGCUAAAUGUUGUUAUAGUAUCCCACCUACCCUGAUGCC",
"struct2d": "(((((((.(((((((((.(((.(((((((((((.....))))))))))).))).))))))))).)))))))"
},
"pfams": [
"PF00035"
]
},
"2MF0": {
"G": {
"contacts": ".************....**.************.*...******************..*************..",
"sequence": "UGUCGACGGAUAGACACAGCCAUCAAGGACGAUGGUCAGGACAUCGCAGGAAGCGAUUCAUCAGGACGAUGA",
"struct2d": "((((........))))..((((((......)))))).......((((.....))))..((((.....))))."
},
"pfams": [
"PF02599"
]
},
"2MQV": {
"B": {
"contacts": "..............*****.............................********............",
"sequence": "GGGCGAGGGUCUCCUCUGAGUGAUUGACUACCCGUCAGCGGGGGUCUUUCAUUUGGGGGCUCGUGCCC",
"struct2d": "(((((.((((((((.....((((..((((.((((....))))))))..))))..)))))))).)))))"
},
"pfams": [
"PF00098"
]
},
"2MS0": {
"B": {
"contacts": "**...*****............****************..........................****.**",
"sequence": "GGCUCGUUGGUCUAGGGGUAUGAUUCUCGCUUAGGGUGCGAGAGGUCCCGGGUUCAAAUCCCGGACGAGCC",
"struct2d": "(((((....(((.........)))((((((.......))))))....(((((.......)))))..)))))"
},
"pfams": [
"PF00098"
]
},
"2V0G": {
"B": {
"contacts": "..***......****.**..*****.......****....*.........**...**............*******",
"sequence": "GCCGGGGUGGCGGAA&GGUAGACGCGCAUGAC&AUCAUGUGCGCAAGCGUGCGGGUUCAAGUCCCGCCCCCGGCAC",
"struct2d": "(.(((((..(((...&......)))(((((..&..)))))((....))..(((((.......)))))))))).).."
},
"pfams": [
"PF14795",
"PF08264",
"PF13603",
"PF00133"
]
},
"2V3C": {
"M": {
"contacts": "..............************...........******.****.....**....*********...**********........***....",
"sequence": "GGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCG",
"struct2d": "((((..(((((.(((((((((....)))))))))..))))).....(((((.....(((.....(((....))).....)))..)))))..))))."
},
"pfams": [
"PF02978",
"PF01922",
"PF00448",
"PF02881"
]
},
"2ZJQ": {
"Y": {
"contacts": "......****.**..............********..**.******.******.****..*............*******..***.....******.......*****.....******...",
"sequence": "CACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGUU",
"struct2d": ".((((((((((.....((.(((((....(((((((...(.....)...))))..)))...)))))..))(((.......((.(((((....))))).)).......)))..))))))))))."
},
"pfams": [
"PF00572",
"PF16320",
"PF17136",
"PF01632",
"PF00237",
"PF00830",
"PF00347",
"PF03947",
"PF00673",
"PF00828",
"PF00829",
"PF00471",
"PF01245",
"PF00297",
"PF00467",
"PF01783",
"PF00444",
"PF00861",
"PF01016",
"PF00542",
"PF00573",
"PF00453",
"PF01386",
"PF00276",
"PF01196",
"PF00181",
"PF00468",
"PF00298",
"PF14693",
"PF00831",
"PF00238",
"PF00252",
"PF00327",
"PF03946",
"PF00281"
]
},
"2ZM5": {
"C": {
"contacts": ".........*..............*****************...........................*****.",
"sequence": "GCCCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGGGCAC",
"struct2d": "(((((((..((((........)))).(((((.......))))).....(((((.......)))))))))))).."
},
"D": {
"contacts": "........................*****************........................***.",
"sequence": "GCCCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCG",
"struct2d": "....(((..((((........)))).(((((.......))))).....(((((.......))))))))."
},
"pfams": [
"PF01715"
]
},
"2ZNI": {
"C": {
"contacts": "**.***.******......****......................................***********",
"sequence": "GGGGGGUGGAUCGAAUAGAUCACACGGACUCUAAAUUCGUGCAGGCGGGUGAAACUCCCGUACUCCCCGCCA",
"struct2d": "((((.((.((((.....)))).((((.(.......).))))...(((((.......))))))).))))...."
},
"pfams": [
"PF01409"
]
},
"2ZUEA": {
"B": {
"contacts": ".*******...****.*.***..*****....*...*******.............***...........*****",
"sequence": "GGACCGGUAGCCUAGCCAGGACAGGGCGGCGGCCUCCUAAGCCGCAGGUCCGGGGUUCAAAUCCCCGCCGGUCCG",
"struct2d": "(((((((..((((..........)))).(((((.......))))).....(((((.......))))))))))))."
},
"pfams": [
"PF00750",
"PF05746",
"PF03485"
]
},
"2ZUEB": {
"B": {
"contacts": ".*******...****.*.***..*****....*...*******.............***...........******",
"sequence": "GGACCGGUAGCCUAGCCAGGACAGGGCGGCGGCCUCCUAAGCCGCAGGUCCGGGGUUCAAAUCCCCGCCGGUCCGC",
"struct2d": "(((((((..((((..........)))).(((((.......))))).....(((((.......)))))))))))).."
},
"pfams": [
"PF00750",
"PF05746",
"PF03485"
]
},
"2ZZM": {
"B": {
"contacts": "..........****.....**...******...***********.....*.................**...........*...",
"sequence": "GCAGGGGUCGCCAAGCCUGGCCAAAGGCGCUGGGCCUAGGACCCAGUCCCGUAGGGGUUCCAGGGUUCAAAUCCCUGCCCCUGC",
"struct2d": "....(((..(((.............)))((((((.......))))))(((....)))...(.(((.......))).))))...."
},
"pfams": [
"PF18093",
"PF02475"
]
},
"2ZZN": {
"C": {
"contacts": "........*******..**********..***********.******.**...***.*.........**..",
"sequence": "GCCGGGGUAGUCUAGGGGCUAGGCAGCGGACUGCAGAUCCGCCUUACGUGGGUUCAAAUCCCACCCCCGGC",
"struct2d": "(((((((..((((.......)))).(((((.......))))).....(((((.......))))))))))))"
},
"pfams": [
"PF18093",
"PF02475"
]
},
"3ADB": {
"C": {
"contacts": "*............********.****...............****...*.....................**.**.....*..**....***",
"sequence": "GGCCGCCGCCACCGGGGUGGUCCCCGGGCCGGACUUCAGAUCCGGCGCGCCCCGAGUGGGGCGCGGGGUUCAAUUCCCCGCGGCGGCCGCCA",
"struct2d": "(((((((((..((((((....))))))((((((.......))))))(((((((....)))))))((((.......)))))))))))))...."
},
"pfams": [
"PF08433"
]
},
"3ADC": {
"C": {
"contacts": "***.............*****.***............****...*.....................**.**.....*..*********",
"sequence": "GGCCGCCGCCACCGGGGUGGUCCCCGGGCCGGAC&GAUCCGGCGCGCCCCGAGUGGGGCGCGGGGUUCAAUUCCCCGCGGCGGCCGCC",
"struct2d": "(((((((((..((((((....))))))((((((.&..))))))(((((((....)))))))((((.......)))))))))))))..."
},
"pfams": [
"PF08433"
]
},
"3ADD": {
"D": {
"contacts": "***............******.**.................*..........................*..**........*******",
"sequence": "GGCCGCCGCCACCGGGGUGGUCCCCGGGCCGGACU&AGAUCCGGCGCGCCCCGAGUGGGGCGCGGGGUUCAAUUCCCCGCGGCGGCCG",
"struct2d": "(((((((((..((((((....))))))((((((..&...))))))(((((((....)))))))((((.......)))))))))))))."
},
"pfams": [
"PF08433"
]
},
"3AKZ": {
"F": {
"contacts": ".*******.*****.......***.**...*******..........................*********..",
"sequence": "UGGGAGGUCGUCUAACGGUAGGACGGCGGACUCUGGAUCCGCUGGUGGAGGUUCGAGUCCUCCCCUCCCAGCCA",
"struct2d": "(((((((..((((.......)))).(((((.......)))))....(((((.......))))))))))))...."
},
"pfams": [
"PF00749"
]
},
"3AM1": {
"B": {
"contacts": "***.............*********..........***.......................*..**........*******",
"sequence": "GGCGCGGGGUACCGGGCUUGGUAGCCCGGGGCUUCGGCCGAGGGCGAGAGCCCUCGGGGUUCGAUUCCCCCCCUGCGCCGC",
"struct2d": "(((((((((..(((((((....)))))))(((....)))((((((....))))))((((.......))))))))))))).."
},
"pfams": [
"PF08433"
]
},
"3AMT": {
"B": {
"contacts": "***........................****..**.********..........................********",
"sequence": "GGGCCCGUAGCUUAGCCAGGUCAGAGCGCCCGGCUCAUAACCGGGCGGUCGAGGGUUCGAAUCCCUCCGGGCCCACCA",
"struct2d": "(((((((..((((..........)))).((((.........)))).....(((((.......))))))))))))...."
},
"pfams": [
"PF01336",
"PF08489"
]
},
"3CIY": {
"CD": {
"contacts": "...****..........*******.***............*****.&...*****..........*******.***............*****",
"sequence": "AUUCUGCGGAUUAUUUGGCAAAGGAAGCAUUGACACAUGCGCCAAU&AUUGGCGCAUGUGUCAAUGCUUCCUUUGCCAAAUAAUCCGCAGAAU",
"struct2d": "((((((((((((((((((((((((((((((((((((((((((((((&))))))))))))))))))))))))))))))))))))))))))))))"
},
"pfams": [
"PF00560",
"PF13855",
"PF13516"
]
},
"3CUL": {
"C": {
"contacts": "............................***********...................*................................",
"sequence": "GAUGGCGAAAGCCAUUUCCGCAGGCCCCAUUGCACUCCGGGGUAUUGGCGUUAGGUGGUGGUACGAGGUUCGAAUCCUCGUACCGCAGCCA",
"struct2d": "((((((....))))))..(((..(((((..........)))))....)))...(((.(.(((((((((.......))))))))).).)))."
},
"D": {
"contacts": ".............................********.**................................................*...",
"sequence": "GGAUGGCGAAAGCCAUUUCCGCAGGCCCCAUUGCACUCCGGGGUAUUGGCGUUAGGUGGUGGUACGAGGUUCGAAUCCUCGUACCGCAGCCA",
"struct2d": "(((((((....)))))).)(((..(((((..........)))))....)))...(((.(((((((((((.......))))))))))).)))."
},
"pfams": [
"PF00076"
]
},
"3EGZ": {
"B": {
"contacts": "..............................****.********.**...................",
"sequence": "GAGGGAGAGGUGAAGAAUACGACCACCUAGGUACCAUUGCACUCCGGUACCUAAAACAUACCCUC",
"struct2d": "(((((...((((...........))))((((((((..........)))))))).......)))))"
},
"pfams": [
"PF00076"
]
},
"3EPH": {
"E": {
"contacts": ".......*........****..**********************.........................",
"sequence": "CUCGUAUGGCGCAGUGGUAGCGCAGCAGAUUGCAAAUCUGUUGGUCCUUAGUUCGAUCCUGAGUGCGAG",
"struct2d": "((((.(..((((.......))))..((((.......))))......((.((.......)).))).))))"
},
"pfams": [
"PF01715"
]
},
"3FOZ": {
"C": {
"contacts": ".........*..............*****************...........................*****.",
"sequence": "GCCCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCUGGGCAC",
"struct2d": "(((((((..((((........)))).(((((.......))))).....(((((.......)))))))))))).."
},
"D": {
"contacts": "...*..............*****************..................................",
"sequence": "CGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGGGC",
"struct2d": "((((..((((........)))).(((((.......))))).....(.(((.......))).)))))..."
},
"pfams": [
"PF01715"
]
},
"3HHN": {
"C": {
"contacts": "...............................................................**...********.**..........................................................",
"sequence": "UCCAGUAGGAACACUAUACUACUGGAUAAUCAAAGACAAAUCUGCCCGAAGGGCUUGAGAACAUACCCAUUGCACUCCGGGUAUGCAGAGGUGGCAGCCUCCGGUGGGUUAAAACCCAACGUUCUCAACAAUAGUGA",
"struct2d": "((((((((..........))))))))....................(...).(.((((((((((((((..........)))))))..((((......))))((.((((......)))).))))))))))........"
},
"pfams": [
"PF00076"
]
},
"3HJW": {
"D": {
"contacts": "..****........*****************.......*******.....********",
"sequence": "GGGCCACGGAAACCGCGCGCGGUGAUCAAUGAGCCGCGUUCGCUCCCGUGGCCCACAA",
"struct2d": "(((((((((.......(((((((.........)))))))......)))))))))...."
},
"pfams": [
"PF01248",
"PF01472",
"PF08068",
"PF04135",
"PF01509",
"PF16198"
]
},
"3HL2A": {
"E": {
"contacts": "***................**.****......*****...............*..***..**......****...**.***.",
"sequence": "GCCCGGAUGAUCCUCAGUGGUCUGGGGUGCAGG&CCUGUAGCUGUCUAGCGACAGAGUGGUUCAAUUCCACCUUUCGGGCGC",
"struct2d": "(.(((((.(..((((((....))))))(.((((&)))).).(((((....))))).((((.......))))).))))).).."
},
"pfams": [
"PF05889"
]
},
"3ICQ": {
"D": {
"contacts": "*****..........****.............*..*****.**....*******....****",
"sequence": "GCGGAUUUAACUCAGUUGGGAGAGCGC&GGAGGUCCUGUGUUCGAUCCACAGAAUUCGCACC",
"struct2d": "(((.(((...(((........)))..(&.).....(((((.......)))))))).)))..."
},
"E": {
"contacts": "*****...........****............*..*****.**....*******....****",
"sequence": "GCGGAUUUAACUCAGUUGGGAGAGCGCC&GAGGUCCUGUGUUCGAUCCACAGAAUUCGCACC",
"struct2d": "(((((((...(((........)))..(.&).....(((((.......))))))))))))..."
},
"pfams": [
"PF00071",
"PF08389"
]
},
"3IRW": {
"R": {
"contacts": "....................................................****.********.**......................",
"sequence": "GUCACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUACCGAUG",
"struct2d": "..((((......((...((((((....))))))...))...(((.(((((((...((..........)).))))..))))))...)).))"
},
"pfams": [
"PF00076"
]
},
"3IVKA": {
"M": {
"contacts": ".....................................................................*.*******........................****.........***..........",
"sequence": "UCCAGUAGGAACACUAUACUACUGGAUAAUCAAAGACAAAUCUGCCCGAAGGGCUUGAGAACAUCGAAACACGAUGCAGAGGUGGCAGCCUCCGGUGGGUUAAAACCCAACGUUCUCAACAAUAGUGA",
"struct2d": "((((((((..........))))))))....................(...).(.((((((((((((.....)))))..((((......))))((.((((......)))).))))))))))........"
},
"pfams": [
"UNK30"
]
},
"3IWN": {
"A": {
"contacts": "....................................................************.*******.....................",
"sequence": "CACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACGGCAUUGCACUCCGCCGUAGGUAGCGGGGUUACCGAUGG",
"struct2d": "((((......((...((((((....))))))...))...(((.((((((((..((((..........)))))))))..))))))...)).))."
},
"pfams": [
"PF00076"
]
},
"3K0J": {
"E": {
"contacts": ".............**....***********............*........*.....***.****.**...................",
"sequence": "GCGACUCGGGGUGCCCUCCAUUGCACUCCGGAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUCGC",
"struct2d": "((((((((((.((((.(((..........))))))......)..)))).....(((...((((......))))...)))..))))))"
},
"pfams": [
"PF00076"
]
},
"3KFU": {
"K": {
"contacts": "**.......****..***...*****..*********............******...********...*****",
"sequence": "UCCGCGGUAGCUCAG&GGUAGAGCAGCCGGCUGUUAACCGGUAGGUCGCAGGUUCGAGUCCUGCCCGCGGAGCC",
"struct2d": "(((((((..((((..&....)))).(((((.......))))).....(((((.......))))))))))))..."
},
"M": {
"contacts": "..*......****........*****...********............................**..**",
"sequence": "UCCGCGGUAGCUCAG&GGUAGAGCAGCCGGCUGUUAACCGGUAGGUCGCAGGUUCGAGUCCUGCCCGCGGA",
"struct2d": "(((((((..((((..&....)))).(((((.......))))).....(((((.......))))))))))))"
},
"pfams": [
"PF01425",
"PF00152",
"PF02686",
"PF02637",
"PF02934",
"PF01336"
]
},
"3KTW": {
"C": {
"contacts": "................************...................................*..*****..****...***.............",
"sequence": "AGAUAGUCGUGGGUUCCCUUUCUGGAGGGAGAGGGAAUUCCACGUUGACCGGGGGAACCGGCCAGGCCCGGAAGGGAGCAACCGUGCCCGGCUAUC",
"struct2d": ".(((...((((((((((((((((....))))))))))).))))).....((((.....(((.....(((....))).....)))..))))...)))"
},
"D": {
"contacts": "...............*************............................*.........*****...****.................",
"sequence": "AGAUAGUCGUGGGUUCCCUUUCUGGAGGGAGAGGGAAUUCCACGUUGACCGGGGGAACCGGCCAGGCCCGGAAGGGAGCAACCGUGCCCGGCUAU",
"struct2d": "..(((..((.(((((((((((((....))))))))))).)).))....(.(((.....(((.....(((....))).....)))..))).).)))"
},
"pfams": [
"PF01922"
]
},
"3MOJ": {
"A": {
"contacts": "........................................**.*.*****...........***.....",
"sequence": "GGCUCAUCACAUCCUGGGGCUGGAAACGGUCCCAAGGGUAUGGCUGUUCGCCAUUUAAAGUGGU&AGCU",
"struct2d": ".((..(((((((((((((((((....))))))).)))))(((((.....))))).....)))))&.))."
},
"pfams": [
"PF03880"
]
},
"3MUM": {
"R": {
"contacts": "....................................................***..*******..**.......................",
"sequence": "GUCACGCACAGAGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUACCGAUGG",
"struct2d": "..(.((......((...((((((....))))))...))...(((.((((((((..((..........)))))))..))))))...))...)"
},
"pfams": [
"PF00076"
]
},
"3MUR": {
"R": {
"contacts": "....................................................****.********.**.......................",
"sequence": "GUCACGCACAGGGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACUCCGGUAGGUAGCGGGGUUAUCGAUGG",
"struct2d": "..(.((......((...((((((....))))))...))...(((.((((((((...(..........).)))))..))))))...))...)"
},
"pfams": [
"PF00076"
]
},
"3NDB": {
"M": {
"contacts": ".................................*************..........................**....**********..*********.....................................",
"sequence": "GUCUCGUCCCGUGGGGCUCGGCGGUGGGGGAGCAUCUCCUGUAGGGGAGAUGUAACCCCCUUUACCUGCCGAACCCCGCCAGGCCCGGAAGGGAGCAACGGUAGGCAGGACGUCGGCGCUCACGGGGGUGCGGGAC",
"struct2d": "((((((..(((((.(((.(((((..(((((.(((((((((....)))))))))..)))))....((((((.....(((.....(((....))).....)))..)))))).))))))).).))))).....))))))"
},
"pfams": [
"PF02978",
"PF01922",
"PF00448",
"PF02881"
]
},
"3PIO": {
"Y": {
"contacts": ".....****.***.............********..**.******.*****..****..*............*******..***....*****.*.......****......******..",
"sequence": "ACCCCCGUGCCCAUAGCACUGUGGAACCACCCCACCCCAUGCCGAACUGGGUCGUGAAACACAGCAGCGCCAAUGAUACUCGGACCGCAGGGUCCCGGAAAAGUCGGUCAGCGCGGGGGU",
"struct2d": "((((((((((.....((.(((((....((((((...............)))..)))...)))))..)).((.......((.(((((....))))).)).......))...))))))))))"
},
"pfams": [
"PF00572",
"PF17136",
"PF01632",
"PF00237",
"PF00830",
"PF00347",
"PF03947",
"PF00673",
"PF00828",
"PF00829",
"PF00471",
"PF01245",
"PF00297",
"PF00467",
"PF01783",
"PF00444",
"PF00861",
"PF01016",
"PF00573",
"PF00453",
"PF01386",
"PF00276",
"PF01196",
"PF00181",
"PF00468",
"PF00298",
"PF14693",
"PF00831",
"PF00238",
"PF00252",
"PF00327",
"PF00281"
]
},
"3RW6": {
"H": {
"contacts": "****....********.***.........*****..............***.**.......",
"sequence": "GCACUAACCUAAGACAGGAGGGCCGGGAAACCUGCCUAAUCCAAUGACGGGUAAUAGUGUC",
"struct2d": "((((((((((......(((((((.((....)).))))..)))......))))..))))))."
},
"pfams": [
"PF09162"
]
},
"3TUP": {
"T": {
"contacts": "**.......****..........***......*******..........................*********",
"sequence": "GCCGAGGUAGCUCAGUUGGUAGAGCAUGCGACUGAAAAUCGCAGUGUCGGCGGUUCGAUUCUGCUCCUCGGCAC",
"struct2d": "(((((((..((((........)))).(((((.......)))))......((.(.......).)).))))))).."
},
"pfams": [
"PF01409",
"PF03147"
]
},
"3U4M": {
"B": {
"contacts": "...............**************..................................****.*******.....",
"sequence": "GGGAUGCGUAGGAUAGGUGGGAGCCUGUGAACCCCCGCCUCCGGGUGGGGGGGAGGCGCCGGUGAAAUACCACCCUUCCC",
"struct2d": "((((.(.........(((((..(((......((((((((....))))))))...)))............)))))).))))"
},
"pfams": [
"PF00687"
]
},
"3V7E": {
"C": {
"contacts": "........*........**............****...................................................*......................................*",
"sequence": "GGCUUAUCAAGAGAGGUGGAGGGACUGGCCCGAUGAAACCCGGCAACCACUAGUCUAGCGUCAGCUUCGGCUGACGCUAGGCUAGUGGUGCCAAUUCCUGCAGCGGAAACGUUGAAAGAUGAGCCA",
"struct2d": "((((((((....(.(((...(((.....)))......))))(((..(((((((((((((((((((....))))))))))))))))))).)))...(....(((((....)))))..)))))))))."
},
"pfams": [
"PF01248"
]
},
"3W3S": {
"B": {
"contacts": "...................**............................*...........................*....................",
"sequence": "GGGAGAGGUUGGCCGGCUGGUGCCGCCCCGGGACUUCAAAUCCCGUGGGAGGUCCCGCAAGGGAGCUCCGGAGGGUUCGAUUCCCUCCCUCUCCCGCC",
"struct2d": "((((((((..((.((((....))))))((((((.......))))).)((((.((((....)))).)))).(((((.......)))))))))))))..."
},
"pfams": [
"PF00587",
"PF18490"
]
},
"3WFRA": {
"A": {
"contacts": "*****..........**.*....................................*.............****",
"sequence": "GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCA",
"struct2d": "(((((((....(...........)..(((((.......))))).....(((((.......))))))))))))."
},
"C": {
"contacts": "****...........**.*....................................*.............*****",
"sequence": "GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCAC",
"struct2d": "(((((((....((........).)..(((((.......))))).....(((((.......)))))))))))).."
},
"pfams": [
"PF01743",
"PF12627",
"PF01966"
]
},
"3WFRB": {
"C": {
"contacts": "****...........**.*....................................*.............******",
"sequence": "GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCACC",
"struct2d": "(((((((....((........).)..(((((.......))))).....(((((.......))))))))))))..."
},
"pfams": [
"PF01743",
"PF12627",
"PF01966"
]
},
"3WQY": {
"C": {
"contacts": "*****......****..***......................****..**....**.........**********",
"sequence": "GGGCUCGUAGCUCAGCGGGAGAGCGCCGCCUUUGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCACCA",
"struct2d": "(((((((..((((.......)))).((((((.....)))))).....(((((.......))))))))))))...."
},
"pfams": [
"PF07973",
"PF01411",
"PF02272"
]
},
"3WQZ": {
"C": {
"contacts": "*****......****..***......................********....**........***********",
"sequence": "GGACUCGUAGCUCAGCGGGAGAGCGCCGCCUUUGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCACCA",
"struct2d": "(((((((..((((.......)))).(((((((...))))))).....(((((.......))))))))))))...."
},
"pfams": [
"PF07973",
"PF01411",
"PF02272"
]
},
"3ZGZ": {
"B": {
"contacts": "*****.....******..**..*****.....***.....*.................*..........**********",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGA&UCCCUCGGCG&UCGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGCAC",
"struct2d": "(((((((..(((...........))).(((((&))))).((((&...))))..(((((.......)))))))))))).."
},
"E": {
"contacts": "*.***.*...*******.**..*****......**.....*................**...........********",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGA&UCCCUCGGCG&CGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGCAC",
"struct2d": "(((((((..(((...........))).(((((&))))).((((&..))))..(((((.......)))))))))))).."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"3ZJT": {
"B": {
"contacts": "..**.....*******..**..******.....*****....*..................**.............*.***",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGG&AAUCCCUCGGCGUUCGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUAC",
"struct2d": "(((((((..(((...........))).(..(.&...)..).((((....))))..(((((.......)))))))))))).."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"3ZJU": {
"B": {
"contacts": ".****....********.**..*******...****....*..................**.............*.***",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGA&UCCCUCGGCGUUCGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUAC",
"struct2d": "(((((((..(((...........))).((.(.&.).)).((((....))))..(((((.......)))))))))))).."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"3ZJV": {
"B": {
"contacts": "..**.....********.**..******.........**.*....*..................**..............****",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGAUU&AAAUCCCUCGGCGUUCGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUAC",
"struct2d": "(((((((..(((...........))).(((((..&...))))).((((....))))..(((((.......)))))))))))).."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"4AQ7": {
"B": {
"contacts": "*****.*...******..**..*****.....**......*................**..........*********",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGA&UCCCUCGGCG&CGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUAC",
"struct2d": "(((((((..(((...........))).(((((&))))).((((&..))))..(((((.......)))))))))))).."
},
"E": {
"contacts": "*******...*******.**..*****.....***.....*..............**...........********",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGA&UCCCUCGGC&GCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUAC",
"struct2d": "(((((((..(((...........))).(((((&))))).(((&..)))..(((((.......)))))))))))).."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"4ARC": {
"B": {
"contacts": ".*...*******..**..*******...****..***..................**...........***",
"sequence": "GGAUGGUGGAAUCGGUAGACACAAGGGA&UCCCUCGGCGUUCGCGCUGUGCGGGUUCAAGUCCCGCUCC&C",
"struct2d": "(((..(((...........))).((.((&)).)).((((....))))..(((((.......))))))))&."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"4ARI": {
"B": {
"contacts": "..****...*.******.**..*******...****....*..................**.............*.****",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGA&UCCCUCGGCGUUCGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUACC",
"struct2d": "(((((((..(((...........))).((.((&)).)).((((....))))..(((((.......))))))))))))..."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"4AS1": {
"B": {
"contacts": "..**.....********.**..******.......****....*..................*..............******",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGA&AAAUCCCUCGGCGUUCGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUACC",
"struct2d": ".((((((..(((...........))).((((.&....)))).((((....))))..(((((.......)))))))))))...."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"4BY9": {
"A": {
"contacts": ".....*****************************......***************.....******.**...",
"sequence": "GCGAGCAAUGAUGAGUGAUGGGCGAACUGAGCUCGAAAGAGCAAUGAUGAGUGAUGGGCGAACUGAGCUCGC",
"struct2d": "((((((......(.............(...((((....))))......).............)...))))))"
},
"pfams": [
"PF01248",
"PF01269",
"PF01798"
]
},
"4CQN": {
"B": {
"contacts": "*******...******..**..*****.......*****.....*................**..........*********",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGAUU&AAUCCCUCGGCG&CGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUAC",
"struct2d": "(((((((..(((...........))).(((((..&..))))).((((&..))))..(((((.......)))))))))))).."
},
"E": {
"contacts": "*******...*******.**..*****......***.....*................**..........*********",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGAU&UCCCUCGGCG&CGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUAC",
"struct2d": "(((((((..(((...........))).((((..&.)))).((((&..))))..(((((.......)))))))))))).."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"4GCW": {
"B": {
"contacts": ".***............*.**.............*****...****.......***",
"sequence": "GCUUCCAUAGCUCAGCAGGUAGAGCA&GUCAGCGGUUCGAGCCCGCUUGGAAGCU",
"struct2d": "(((((((...(............)..&...(((((.......))))))))))))."
},
"pfams": [
"PF12706"
]
},
"4JXX": {
"B": {
"contacts": "******..*******.....******.....******..........................********",
"sequence": "GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGC&CCGAGGUUCGAAUCCUCGUACCCCAGCC",
"struct2d": "((((((..(((.........))).(((((.......))))).&.(((((.......)))))))))))...."
},
"pfams": [
"PF00749",
"PF03950"
]
},
"4JYZ": {
"B": {
"contacts": "*******.*******.....******.....******...........................********",
"sequence": "GGGGUAUCGCCAAGCGGUAAGGCACCGGUUUUUGAUACCGGCA&CGCAGGUUCGAAUCCUGCUACCCCAGCC",
"struct2d": "((((((..(((.........))).(((((.......)))))..&.(((((.......)))))))))))...."
},
"pfams": [
"PF00749",
"PF03950"
]
},
"4KR2": {
"C": {
"contacts": "**......***...........*****...******..........................******",
"sequence": "CGCCGCUGGUGUAGUGGUAUCAUGCAAGAUUCCCAUUCUUGCGACCCGGGUUCGAUUCCCGGGCGGCG",
"struct2d": "((((((..(((.........)))((((((.......))))))....((((.......)))).))))))"
},
"pfams": [
"PF00587",
"PF03129"
]
},
"4KR3": {
"C": {
"contacts": "*.......***...........**************..........................*******",
"sequence": "CGCCGCUGGUGUAGUGGUAUCAUGCAAGAUUCCCAUUCUUGCGACCCGGGUUCGAUUCCCGGGCGGCGC",
"struct2d": "((((((..(((.........)))((((((.......))))))...(((((.......)))))))))))."
},
"pfams": [
"PF00587",
"PF03129"
]
},
"4KZD": {
"R": {
"contacts": "...............................*.*******...........................................",
"sequence": "GACGCGACCGAAAUGGUGAAGGACGGGUCCAGUGCGAAACACGCACUGUUGAGUAGAGUGUGAGCUCCGUAACUGGUCGCGUC",
"struct2d": "((((((((((..((((.(.....(...(((((((((.....)))))))..)).........)..).))))..).)))))))))"
},
"pfams": [
"UNK38"
]
},
"4LCK": {
"C": {
"contacts": "......****...................................................................................**.......",
"sequence": "GGGUGCGAUGAGAAGAAGAGUAUUAAGGAUUUACUAUGAUUAGCGACUCUAGGAUAGUGAAAGCUAGAGGAUAGUAACCUUAAGAAGGCACUUCGAGCACCC",
"struct2d": "((((((.....((((.......((((((..(((((((.........((((((...........)))))).))))))))))))).......))))..))))))"
},
"pfams": [
"PF01248"
]
},
"4M4O": {
"B": {
"contacts": "...........................*.****.***.*....................",
"sequence": "GGGUUCAUCAGGGCUAAAGAGUGCAGAGUUACUUAGUUCACUGCAGACUUGACGAACCC",
"struct2d": "((((((.(((((.((.......((((.(...(...)..).)))))).))))).))))))"
},
"pfams": [
"PF00062"
]
},
"4P3EA": {
"A": {
"contacts": "..******..................*****..*****.............***..*******..................****....***................*******.........",
"sequence": "GACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUU",
"struct2d": "(.((..(.(((((((((((.(((((..((((((....))))))..))))).))))......(.(((((.....((((....(((....)))....))))...))))))))))))).)...)).)"
},
"pfams": [
"PF16969",
"PF01922"
]
},
"4PKD": {
"V": {
"contacts": "*....***********........****....*************.........",
"sequence": "GAUCCAUUGCACUCCGGAUCCAGGAGAUACCAUGAUCACGAAGGUGGUUUUCCU",
"struct2d": "(((((..........))))).((((((..((((..........)))).))))))"
},
"pfams": [
"PF12220",
"PF00076"
]
},
"4RDX": {
"C": {
"contacts": "**........................******.*****............................**.*******",
"sequence": "GUGAGCGUAGCUCAGCUGGUUAGAGCACCGGACUGUGG&UCCGGGGGUCGUGGGUUCAAGUCCCAUCGCUCACCCC",
"struct2d": "(((((((..((((.........)))).((((.......&.)))).....(((((.......))))))))))))..."
},
"pfams": [
"PF13393",
"PF03129"
]
},
"4U7U": {
"L": {
"contacts": "***************************************************..********",
"sequence": "AUAAACCGGGCUCCCUGUCGGUUGUAAUUGAUAAUGUUGAGAGUUCCCCGCGCCAGCGGGG",
"struct2d": ".............................................((((((....))))))"
},
"pfams": [
"PF09485",
"PF08798",
"PF09481",
"PF09704",
"PF09344"
]
},
"4UYJ": {
"R": {
"contacts": "......*.............************.....................**...........*****.................***...................",
"sequence": "GGGGCUAGGCCGGGGGGUUCGGCGUCCCCUGUAACCGGAAACCGCCGAUAUGCCGGGGCCGAAGCCCGAGGGGCGGUUCCCGUAAGGGUUCCCACCCUCGGGCGUGCCUC",
"struct2d": "(((((..(((((((((.........)))))....((((..............))))))))...((((((((((.((..(((....)))..))).)))))))))..)))))"
},
"pfams": [
"PF05486",
"PF02290"
]
},
"4UYK": {
"R": {
"contacts": "......*.............************.....................**...........****..........................................***...................",
"sequence": "GGGGCUAGGCCGGGGGGUUCGGCGUCCCCUGUAACCGGAAACCGCCGAUAUGCCGGGGCCGAAGCCCGAGGGGCGGUUCCCGAAGCCGCCUCUGUAAGGAGGCGGUGGAGGGUUCCCACCCUCGGGCGUGCCUC",
"struct2d": "(((((..(((((((((.........)))))....((((..............))))))))...(((((((((..((..(((...(((((((((....)))))))))...)))..))..)))))))))..)))))"
},
"pfams": [
"PF05486",
"PF02290"
]
},
"4V2S": {
"Q": {
"contacts": ".*...........******......................**......********",
"sequence": "AUGUAGACCCGUC&CUUCGCCUGCGUCACGGGUCCUGGUUAGACGCAGGCGUUUUCU",
"struct2d": ".............&...((((((((((..............))))))))))......"
},
"pfams": [
"PF17209"
]
},
"4W90": {
"C": {
"contacts": "..............................*...******.......................................**.......................................",
"sequence": "GCGCGCUUAAUCUGAAAUCAGAGCGGGGGACCCAUUGCACUCCGGGUUUUUCCCGUAAGGGGUGAAUCCUUUUUAGGUAGGGCGAAAGCCCGAAUCCGUCAGCUAACCUCGUAAGCGCGC",
"struct2d": "(((((((...((((....))))((((((....(.........................(((((....(((....)))..((((....))))..))))..).)....)))))).)))))))"
},
"pfams": [
"PF00076"
]
},
"4WC3": {
"B": {
"contacts": "*****............***..................................**.*...**........*****",
"sequence": "GGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCACCA",
"struct2d": "((((.((..((((........)))).(((((.......))))).....(((((.......))))))).))))...."
},
"pfams": [
"PF01743"
]
},
"4WF9": {
"Y": {
"contacts": "...****..**.............*****.....*...***...******.****.*.............*****.*..***....**..***.....***.......****..",
"sequence": "UCUGGUGACUAUAGCAAGGAGGUCACACCUGUUCCCAUGCCGAACACAGAAGUUAAGGUCUUUAGCGACGAUGGUAGCCAACUUACGUUCCGCUAGAGUAGAACGUUGCCAGGC",
"struct2d": ".(..(..(.....((((.((......((((((...(.....)...))))..)).....)).)).))............(............)..............)..)..)."
},
"pfams": [
"PF00572",
"PF17136",
"PF01632",
"PF00237",
"PF00347",
"PF03947",
"PF00673",
"PF00828",
"PF00829",
"PF01245",
"PF00297",
"PF00467",
"PF01783",
"PF00861",
"PF01016",
"PF00573",
"PF00453",
"PF01386",
"PF00276",
"PF01196",
"PF00181",
"PF00468",
"PF14693",
"PF00831",
"PF00238",
"PF00252",
"PF00327",
"PF00281"
]
},
"4X0B": {
"B": {
"contacts": "******............***..................................**.*...**........*****",
"sequence": "GGGCCAGGUAGCUCAGUUGGUAGAGCACUGGACUGAAAAUCCAGGUGUCGGCGGUUCGAUUCCGCCCCUGGCCCACC",
"struct2d": ".((((.((..((((........)))).(.((.........)).).....(((((.......))))))).))))...."
},
"pfams": [
"PF01743"
]
},
"4YB1": {
"R": {
"contacts": "****.............................................................................*.........",
"sequence": "GGGCACGCACAGAGCAAACCAUUCGAAAGAGUGGGACGCAAAGCCUCCGGCCUAAACCAUUGCACCUCGGUAGGUAGCGGGGUUACCGAUG",
"struct2d": "...((((......((...((((((....))))))...))...(((.(((((((...((..........)).))))..))))))...)).))"
},
"pfams": [
"PF00076"
]
},
"4YCO": {
"D": {
"contacts": "............*****.******...***...........****.........**..................",
"sequence": "GCGCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGCGCAC",
"struct2d": "(((((((..((((........)))).(((((.......))))).....(((((.......)))))))))))).."
},
"E": {
"contacts": "........*****.******...................*..........**...............",
"sequence": "GCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCG",
"struct2d": "..(((..((((........)))).(((((.......))))).....(((((.......))))))))."
},
"F": {
"contacts": "............*****.******..............................**...............",
"sequence": "GCGCGGAUAGCUCAGUCGGUAGAGCAGGGGAUUGAAAAUCCCCGUGUCCUUGGUUCGAUUCCGAGUCCGCG",
"struct2d": "..(((((..((((........)))).(((((.......))))).....(((((.......))))))))))."
},
"pfams": [
"PF01207"
]
},
"4YVI": {
"C": {
"contacts": ".........***.........*****..**.******..................................",
"sequence": "UGGGAGGUCGUCUAACGGUAGGACGGCGGACUCUGGAUCCGCUGGUGGAGGUUCGAGUCCUCCCCUCCCAG",
"struct2d": ".((((((..((((.......))))((((((.......))))))...(((((.......))))))))))).."
},
"pfams": [
"PF01746"
]
},
"4YYE": {
"C": {
"contacts": "**.......**.....*........****.**********..***.*..................********.",
"sequence": "GUUAUAUUAGCUUAAUUGGUAGAGCAUUCGUUUUGUAAUCGAAAGGUUUGGGGUUCAAAUCCCUAAUAUAACAC",
"struct2d": "(((((((..((((........)))).((((.........)))).....(((((.......)))))))))))).."
},
"pfams": [
"PF00587",
"PF03129"
]
},
"4ZT0": {
"B": {
"contacts": "******************...........***************.**************.........****",
"sequence": "GAUGAGACGCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGU",
"struct2d": "..........((((((..((((....))))....))))))..(((..).)).......((((....)))).."
},
"pfams": [
"PF16592",
"PF16593",
"PF16595",
"PF13395"
]
},
"5AH5": {
"C": {
"contacts": "..***......******.**..****.........***....*................**............*******",
"sequence": "GCCCGCAUGGUGAAAUCGGUAAACACAUCGCAC&AAUGCGCCGCCUCUGGCUUGCCGGUUCAAGUCCGGCUGCGGGCACC",
"struct2d": "(((((((..(((...........)))..((((.&..))))..(((...)))..(((((.......))))))))))))..."
},
"D": {
"contacts": "..***......******.**..****.......*..***....*................**............*******",
"sequence": "GCCCGCAUGGUGAAAUCGGUAAACACAUCGCACU&AAUGCGCCGCCUCUGGCUUGCCGGUUCAAGUCCGGCUGCGGGCACC",
"struct2d": "(((((((..(((...........)))..((((..&..))))..(((...)))..(((((.......))))))))))))..."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"5AOX": {
"C": {
"contacts": "...*..............***********................****.......****............***.........",
"sequence": "GCCGGGCGCGGUGGCUCACGCCUGUAAUCCCAGCACUUUGGGAGGCCGAGGCGGGAGGAUCGCGAA&CGCGAGACCCCGUCUCU",
"struct2d": "((((((((.(......).)))))....((((........))))))).((((((((.(..(((((..&)))))..)))))))))."
},
"F": {
"contacts": "..................***********................****......****.............***.........",
"sequence": "GCCGGGCGCGGUGGCUCACGCCUGUAAUCCCAGCACUUUGGGAGGCCGAGGCGGG&GGAUCGCGAAC&CGCGAGACCCCGUCUC",
"struct2d": "((((((((.(......).)))))....((((........))))))).((((((((&(..(((((...&)))))..)))))))))"
},
"pfams": [
"PF05486",
"PF02290"
]
},
"5AXM": {
"P": {
"contacts": "***.............*..............................********.....*****..*****",
"sequence": "GGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGAUCCACAGAAUCCCCAC",
"struct2d": "(((((..((((........)))).(((((.......))))).....(((((.......))))))))))...."
},
"pfams": [
"UNK48"
]
},
"5AXN": {
"P": {
"contacts": "***.............*.......................********.....*****..*****",
"sequence": "GGAUUUAGCUCAGUUGGGAGAGCGCCAG&AUCUGGAGGUCCUGUGUUCGAUCCACAGAAUCCCCA",
"struct2d": "(((((..((((........)))).((..&....)).....(((((.......))))))))))..."
},
"pfams": [
"UNK49"
]
},
"5CCBA": {
"N": {
"contacts": "..........**..........**.......***.........******************....**..........",
"sequence": "GGCCCGGAUAGCUCAGUCGGUAGAGCAUCAGACUUUUAAUCUGAGGGUCCAGGGUUCAAGUCCCUGUUCGGGCGCCA",
"struct2d": "((((((((..((((........)))).(((((.(...).))))).....(((((.......))))))))))))..)."
},
"pfams": [
"PF04189",
"PF08704"
]
},
"5CCXB": {
"N": {
"contacts": ".........*...........**.......***.........******************.**...........",
"sequence": "GGCCCGGAUAGCUCAGUCGGUAGAGCAUCAGACUUUUAAUCUGAGGGUCCAGGGUUCAA&CCUGUUCGGGCGCC",
"struct2d": ".(((((((..((((........)))).(((((.(...).))))).....((((......&)))))))))))..."
},
"pfams": [
"PF04189",
"PF08704"
]
},
"5D6G": {
"0": {
"contacts": "...*****....******.......................*****...................***......",
"sequence": "GCCUAAGACAGCGGGGAGGUUGGCUUAGAAGCAGCCAUCCUUUAAAGAGUGCGUAACAGCUCACCCGUCGAGGC",
"struct2d": "(((.......(((((.(((..(((.........)))..))).....(...((......)).).)))))...)))"
},
"pfams": [
"PF17777",
"PF00466"
]
},
"5DDO": {
"A": {
"contacts": ".........**......*....**....*******..**...................",
"sequence": "CGUUGGCCCAGGAAACUGGGU&AGUAAGGUCCAUUGCACUCCGGGCCUGAAGCAACGC",
"struct2d": "(((((((((((....))))))&....((((((..........))))))....)))))."
},
"pfams": [
"PF00076"
]
},
"5DDP": {
"A": {
"contacts": "............................***...********.**................",
"sequence": "CGUUGACCCAGGAAACUGGGCGGAAGUAAGGUCCAUUGCACUCCGGGCCUGAAGCAACGCG",
"struct2d": "(((((.(((((....)))))........((((((..........))))))....))))).."
},
"pfams": [
"PF00076"
]
},
"5E6M": {
"C": {
"contacts": "*.......***...........*****...******...****...................***********",
"sequence": "CGCCGCUGGUGUAGUGGUAUCAUGCAAGAUUCCCAUUCUUGCGACCCGGGUUCGAUUCCCGGGCGGCGCACCA",
"struct2d": "((((((..(((.........)))((((((.......))))))...(((((.......)))))))))))....."
},
"pfams": [
"PF00587",
"PF03129",
"PF00458"
]
},
"5HR6": {
"C": {
"contacts": "......****.******....******************.............*..........****.",
"sequence": "CCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGG",
"struct2d": "(((((..((((.........))))((((((.......))))))...(((((.......))))))))))"
},
"D": {
"contacts": "....****...........******************........................***..",
"sequence": "CCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGG",
"struct2d": "((((..((((.........))))((((((.......))))))...(((((.......)))))))))"
},
"pfams": [
"PF04055"
]
},
"5HR7": {
"D": {
"contacts": ".....****.............*****************........................*****..",
"sequence": "CCCCUUCGUCUAGAGGCCCAGGACACCGCCCUUUCACGGCGGUAACAGGGGUUCGAAUCCCCUAGGGGAC",
"struct2d": "(((((..((((.........))))((((((.......))))))...(((((.......)))))))))).."
},
"pfams": [
"PF04055"
]
},
"5M73": {
"A": {
"contacts": "....****...*****..***...*.........*****..*****............*.***.*******...................***....***................**********........****......",
"sequence": "GGUGUCCGCACUAAGUUCGGCAUCAAUAUGGUGACCUCCCGGGAGCGGGGGACCACCAGGUUGCCUAAGGAGGGGUGAACCGGCCCAGGUCGGAAACGGAGCAGGUCAAAACUCCCGUGCUGAUCAGUAGUGGGAUCGCGCCUA",
"struct2d": "(((((((.(((((.(.(((((((((((.(((((..((((((....))))))..))))).))))......(.((((......((((....(((....)))....))))....)))))))))))).)..)))))))...))))).."
},
"pfams": [
"PF16969",
"PF08492",
"PF01922"
]
},
"5OMW": {
"B": {
"contacts": "*******...******..**..*****.......*****......*................**..........*********",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGAUU&AAAUCCCUCGGCG&CGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUAC",
"struct2d": "(((((((..(((...........))).(((((..&...))))).((((&..))))..(((((.......)))))))))))).."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"5ON2": {
"B": {
"contacts": "*.*****...******..**..*****.......*******......*................**..........*********",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGAUUUAAAAUCCCUCGGCG&CGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUACC",
"struct2d": "(((((((..(((...........))).((((((.....)))))).((((&..))))..(((((.......))))))))))))..."
},
"E": {
"contacts": "*.*****...*******.**..*****.......***.....*................**..........*********",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGA&AAUCCCUCGGCG&CGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUAC",
"struct2d": "(((((((..(((...........))).((((.&...)))).((((&..))))..(((((.......)))))))))))).."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"5ONH": {
"E": {
"contacts": "*.*****...*******.**..*****.......****....*..................**..........*********",
"sequence": "GCCCGGAUGGUGGAAUCGGUAGACACAAGGGA&AAUCCCUCGGCGUUCGCGCUGUGCGGGUUCAAGUCCCGCUCCGGGUACC",
"struct2d": "(((((((..(((...........))).(((((&..))))).((((....))))..(((((.......))))))))))))..."
},
"pfams": [
"PF08264",
"PF13603",
"PF00133"
]
},
"5TF6": {
"B": {
"contacts": "......**.********************.**................***.**.......**........",
"sequence": "GGUCAAUUUGAAACAAUACAGAGAUGAUCAGCAGUUCCCCUGCAUAAGGAUGAACCGUUUUACAAAGAGAC",
"struct2d": ".(((..((((...................(((.((((.(((.....)))..)))).)))...))))..)))"
},
"pfams": [
"PF16842",
"PF00076"
]
},
"5U34": {
"B": {
"contacts": ".***************...**************......******......................***********",
"sequence": "GGUCUAGAGGACAGAA&ACGGGUGUGCCAAUGGCCACUUUCCAGGUGGCAAAGCCCGUUGAGCU&GAAGUGGCACCAG",
"struct2d": ".((((..).)))....&......(((((....((((((.....))))))...............&.....)))))..."
},
"pfams": [
"UNK63"
]
},
"5V6X": {
"C": {
"contacts": "........*.......***....................**********....***...****.......",
"sequence": "GGAAACCUGAUCAUGUAGAUCGAAUGGACUCUAAAUCCGUUCAGCCGGGUUAGAUUCCCGGGGUUUCCGC",
"struct2d": "(((((((.((((.....))))(((((((.......)))))))..(((((.......)))))))))))).."
},
"D": {
"contacts": "........*......****....................**********...****...****.........",
"sequence": "GGAAACCUGAUCAUGUAGAUCGAAUGGACUCUAAAUCCGUUCAGCCGGGUUAGAUUCCCGGGGUUUCCGCCA",
"struct2d": "(((((((.((((.....))))(((((((.......)))))))..(((((.......))))))))))))...."
},
"pfams": [
"UNK70"
]
},
"5VW1": {
"C": {
"contacts": "******************...........***************.**************.......*.****.",
"sequence": "UUGUAAAAAAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGUC",
"struct2d": "..........((((((..((((....))))....))))))..(((..).)).......((((....))))..."
},
"pfams": [
"PF16592",
"PF16593",
"PF16595",
"PF13395"
]
},
"5WQE": {
"B": {
"contacts": "..*********.*...*************.....*****.....**...**********",
"sequence": "GUCUAGAGGACAG&ACGGGUGUGCCAAUGGCCACU&CAGGUGGCA&GCCCG&UGGCACG",
"struct2d": "((((..).)))..&.(((((.........((((((&..)))))).&)))))&......."
},
"pfams": [
"UNK82"
]
},
"5WWR": {
"C": {
"contacts": "****....****........*****.......***.............................********",
"sequence": "AGGGUAUAGCUCAGGGGUAGAGCAUUUGACU&AGAUCAAGAGGUCCCUGGUUCAAAUCCAGGUGCCCUCUCC",
"struct2d": ".(((((..((((.......)))).(((((((&)).)))).)....(((((.......))))))))))....."
},
"pfams": [
"PF01189"
]
},
"5WWS": {
"D": {
"contacts": "****....****.......******......***.............................********",
"sequence": "AGGGUAUAGCUCAGGGGUAGAGCAUUUGAC&AGAUCAAGAGGUCCCUGGUUCAAAUCCAGGUGCCCUCUCC",
"struct2d": ".(((((..((((.......)))).((((((&.).)))).)....(((((.......))))))))))....."
},
"pfams": [
"PF01189"
]
},
"5XBL": {
"B": {
"contacts": "******************..........****************.**************..........*************.*....",
"sequence": "UGCGCUUGGCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUU",
"struct2d": "..........(((((...(((......))).....)))))..(.(..)..).......((((....)))).....(.....)......"
},
"pfams": [
"PF16592",
"PF16593",
"PF16595",
"PF13395"
]
},
"5XWP": {
"CD": {
"contacts": "*******************************....****************.******&*****..***********************",
"sequence": "GACCACCCCAAAAAUGAAGGGGACUAAAACACAAAUCUAUCUGAAUAAACUCUUCUUC&GGAAGAAGAGUUUAUUCAGAUAGAUUUGUC",
"struct2d": "....(((((.........))))..).....(((((((((((((((((((((((.((((&.)))).)))))))))))))))))))))))."
},
"pfams": [
"UNK89"
]
},
"5ZAL": {
"C": {
"contacts": "****.......****.......******...........****...............",
"sequence": "UGAGGUAGUAGGUUGUAUAGUUUUAGGG&GGAGAUAACUAUACAAUCUACUGUCUUAC",
"struct2d": "..((...(((((((((((((.(.(....&....).).)))))))))))))...))..."
},
"pfams": [
"PF00636",
"PF03368",
"PF04851",
"PF00271",
"PF02170"
]
}
}
\ No newline at end of file
This diff could not be displayed because it is too large.
File mode changed
No preview for this file type
% algorithm2e.sty --- style file for algorithms
%% Copyright 1996-2005 Christophe Fiorio
%
% This program may be distributed and/or modified under the
% conditions of the LaTeX Project Public License, either version 1.2
% of this license or (at your option) any later version.
% The latest version of this license is in
% http://www.latex-project.org/lppl.txt
% and version 1.2 or later is part of all distributions of LaTeX
% version 1999/12/01 or later.
%
% This program consists of the files algorithm2e.sty and algorithm2e.tex
%
% Report bugs and comments to:
% fiorio@lirmm.fr
%
% $Id: algorithm2e.sty,v 3.9 2005/10/04 12:34:52 fiorio Exp $
%
% PACKAGES REQUIRED:
%
% - float (in contrib/supported/float)
% - ifthen (in base)
% - xspace (in packages/tools)
%
%%%%%%%%%%%%%%% Release 3.9
%
% History:
%
% - October 04 2005 - revision 3.9 -
% * ADD: - \setalcaphskip command which set the horizontal skip before Algorithm: in caption when
% used in ruled algorithm.
% * ADD: - SetAlgoInsideSkip command which allows to add an extra vertical space before and after
% the core of the algorithm (ie: \SetAlgoInsideSkip{bigskip})
% * CHANGE: - caption, when used with figure option, is no more controlled by algorithm2e package
% and so follows the exact behaviour of figures. The drawback is that you cannot change
% the typo with AlTitleFnt or CapFnt. The advantage is that if you use caption package,
% it works.
% * FIX: - problem with numbering line and pdflatex
% * FIX: - error when algorithm2e package was used with beamer and listings together
% - February 12 2005 - revision 3.8 -
% * FIX: - extra line with noend option.
% - February 10 2005 - revision 3.7 -
% * ADD: - sidecomment: different macros allowing to put text right after code
% on the same line. They are defined in the same time comment macros
% are defined with a star after the macro name. By default comments
% are right justified but this can be changed with appropriate option
% in the macro. Ex:
% . default: \tcc*{side comment}
% . same as previous: \tcc*[r]{side comment}
% . left justify: \tcc*[l]{side comment}
% . here: \tcc*[h]{side comment} don't put the end of line mark before
% comment (; by default) and don't end the line.
% . flushed: \tcc*[f]{side comment} same as the precedent but right
% justified
% * ADD: - scright OPTION (default): right justified side comments (side comments
% are flushed to the right)
% * ADD: - scleft OPTION: left justified side comments (side comments are
% put right after the code line)
% * ADD: - \SetSideCommentLeft acts as scleft option
% * ADD: - \SetSideCommentRight acts as scright option
% * ADD: - block like macro side text: all macro defining a block allows now
% to put text right after key words by putting text into (). Done to
% be used with sidecomment macros, but all text can be used. Ex:
% \eIf(\tcc*[f]{then comment}){test}{then text}(else side text){else text}
% * ADD: - fillcomment OPTION (default): end mark of comment is flushed to the
% right so comments fill all the width of text
% * ADD: - nofillcomment OPTION: end mark of comment is put right after the
% comment
% * ADD: - \SetNoFillComment acts as nofillcomment option.
% * ADD: - \SetFillComment acts as fillcomment option.
% * ADD: - dotocloa OPTION which adds an entry in the toc for the list of
% algorithms. This option load package tocbibind if not already done
% and so list of figures and list of tables are also added in the toc.
% If you want to control which ones of the lists will be added in the
% toc, please load package tocbibind before package algorithm and give
% it the options you want.
% * FIX: - vertical spacing for uif macro with noend option
% * FIX: - all the compatibility problems between caption and other packages
% * FIX: - typographical differences between list of algorithms and other lists
% when in report or book
%
% - January 24 2005 - revision 3.6 -
% * FIX: - vertical spacing and space characters at the beginning or end of
% comments.
% line numbers of comments not in the nlsty.
% Thanks to Arnaud Giersch for his comments and suggestions.
% * FIX: - Set*Sty macro: the styles defined was not protected and was modified
% by surrounding context. For example KwTo in a \For{}{} was in bold AND
% italic instead of just in bold.
% * FIX: - line number misplacement after \Indp
%
% - January 21 2005 - revision 3.5 -
% * ADD: - hidden numbering of the lines. Lines are auto-numbered but numbers
% are shown only on lines you specify:
% * linesnumberedhidden option or \linesnumberedhidden macro activate
% this functionality.
% * \showln and \showlnlabel{lab} macros make the number visible on
% the line. \showlnlabel{lab} allows to set a label for this line.
% Thanks to Samson de Jager who makes this suggestion and provides the
% macros.
% * ADD: - \AlCapFnt and \SetAlCapFnt which allow to have a different font for
% caption. Works like \AlFnt and \SetAlFnt and by default is the same.
% * ADD: - \AlCapSkip skip length. This vertical space is added before caption
% in plain or boxed mode. It allows to change distance between text
% and caption.
% * FIX: - caption compatible with IEEEtran class.
% * FIX: - some vertical spacing error with \uIf macros (Thanks to Arnaud Giersch)
% * FIX: - Procedure and Function: lines are also numbered like algorithms
% * FIX: - CommentSty was not used for Comments
%
% - January 10 2005 - revision 3.4 -
% * FIX: - caption compatible with new release of Beamer class.
%
% - June 16 2004 - revision 3.3 -
% * FIX: - Hyperlink references of Hyperref package works now if compiled with pdflatex
% and [naturalnames] option of hyperref package is used.
% * FIX: - algorithm[H] had problem in an list environment - corrected
% * FIX: - interline was not so regular in nested blocks - corrected
% * ADD - \Setvlineskip macro which set the vertical skip after the little horizontal
% rule which closes a block in Vlined mode. By default 0.8ex
%
% - June 11 2004 - revision 3.2 - AUTO NUMBERING LINES !!!
% * ADD: auto numbering of the lines (the so asked and so long awaiting feature)
% this feature is managed by 3 options and 3 commands:
% - linesnumbered option: lines of the algo are numbered except for comments and
% input/output (KwInput and KwInOut)
% - commentsnumbered option: makes comments be numbered
% - inoutnumbered option: makes data input/output be numbered
% - \nllabel{lab} labels the line so you can cite with \ref{lab}
% - \linesnumbered make the following algorithms having auto-numbered lines
% - \linesnotnumbered make the following algorithms having no auto-numbered lines
% * Change: algo2e option renames listofalgorithm in listofalgorithme
% * FIX: new solution for compatibility with color package, more robust and not tricky.
% Many thanks to David Carlisle for his advices
%
% - June 09 2004 - revision 3.1 -
% * Change: \SetKwSwitch command defines an additional
% macro \uCase and \Case prints end
% * Change: now macros SetKw* do a renewcommand if the
% keyword is already defined. So you can redefine
% default definition at your own convenience or
% change your definition without introducing a
% new macro and changing your text.
% * ADD: new macro \SetKwIF which do \SetKwIf and
% \SetKwIfElseIf.The following default definition has been added:
% \SetKwIF{If}{ElseIf}{Else}{if}{then}{else if}{else}{endif}
% and so you get the macros;
% \If \eIf \lIf \uIf \ElseIf \uElseIf \lElseIf \Else
% \uElse \lElse
% * ADD: new macro \SetAlgoSkip which allow to fix the
% vertical skip before and after the algorithms.
% Default is smallskip, do \SetAlgoSkip{} if you
% don't want an extra space or \SetAlgoSkip{medskip}
% or \SetAlgoSkip{bigskip} if you want bigger space.
% * ADD: macro \SetKwIf defines in addition a new macro
% \uElse (depending on what name you
% have given in #2 arg).
% * ADD: macro \SetKwIfElseIf defines in addition a new macro
% \uElse and \ugElseIf (depending on what name you
% have given in #2 and #3 arg).
% * Change: baseline of algorithm is now top, so two
% algorithms can be put side by side.
% * FIX: Compatibility with color package solved. The problem
% was due to a redefinition of standard macros by color package
% This solves compatibility problem with other packages
% as pstcol or colortbl.
% (notified by Dirk Fressmann, Antti Tarvainen and Koby Crammer)
% * Fix: extra little shift to the right with boxed style
% algorithm removed (notified by P. Tanovski)
% * Fix: algoln option was buggy (notified by Jiaying Shen)
% * Fix: german and portuges option didn't work due to bad
% typo (notified by Martin Sievers, Thorsten Vitt
% and Jeronimo Pellegrini)
%
% - February 13 2004 - revision 3.0 -
% * Major revision which makes the package independent from
% float.sty, so now
% - algorithm* works better, in particular can be used in
% multicols environments
% - (known bug corrected)
% [H] works now for all sort of environment but is
% handled differently for classic environment and star
% environment (algorithm, figure, procedure and
% function). For star environment, H acts like for
% classical figure environment, so it doesn't stay here
% absolutely.
% - (known bug corrected)
% you can use now floatflt package with algorithm
% package and even with figure option. Beware that if
% you want to put an algorithm inside a floatingfigure,
% it cannot be floating, so [H] is required and then
% figure option should not be used, since standard
% figure[H] are still floating with LaTeX.
% * boxruled: a new style added. Possible now since no
% style no more defined by the float package.
% * nocaptionofalgo: doesn't print Algorithm #: in the
% caption for algorithm in ruled or algoruled style.
% note: this is just documentation of a macro which was
% already in the package.
% - December 14 2003 - revision 2.52 -
% * output message shorter
% * french keyword macro \PourTous was missing for
% longend option, it has been added.
% * TitleofAlgo prints Function or Procedure in
% corresponding environments.
%
% - October 27 2003 - revision 2.51 - Revision submitted to CTAN archive
% * correction of a minor bug which made captions in procedure
% and function blank with pdfscreen package
% (thanks to Joel Gossens for the notification)
% * add two internal definition to avoid some errors when
% used with Hyperref package (Hyperref package need to
% define new counter macro from existing ones, and
% don't do it for algorithm2e package, so we do it)
%
% - October 17 2003 - revision 2.50 - first revision for CTAN archive
%
% * add \AlFnt and \SetAlFnt{font} macros:
% \AlFnt is used at the beginning of the caption and the
% body of algorithm in order to define the fonts used
% for typesetting algorithms. You can use it elsewhere
% you want to typeset text as algorithm. For example
% you can do \SetAlFnt{\small\sf} to have algorithms
% typeset in small sf font. Default is nothing so
% algorithm is typeset as the text of the document.
% * add \AlTitleFnt{text} and \SetAlTitleFnt{font} macros:
% The {Algorithm: } in the caption is typeset with
% \AlTitleFnt{Algorithm:}. You can use it to have text
% typeset as {Algorithm:} of captions. Default is
% textbf.
% Default can be redefined by \SetAlTitleFnt{font}.
% * add CommentSty typo for text comment.
% * add some compatibility with hyperref package (still
% an error on multiply defined refs but pdf correctly
% generated)
% * flush text to left in order to have correct
% indentation even with classes such as amsart which center
% all figures
% * add german, portugues and czech options for title of
% algorithms and typo.
% * add portuguese translation of predefined keywords
% * add czech translation of some predefined keywords
%
% - December 23 2002 - revision 2.40
% * add some missing french keywords
% * add function* and procedure* environments like the
% algorithm* environment: print in one column even
% if the twocolumn option is specified for the document.
% * add a new macro \SetKwComment to define macros which
% write comments in the text. First argument is the
% name of the macro, second is the text put before the
% comment, third is the text put at the end of the
% comment. Defaults are \tcc and \tcp
% * add new options to change the way algo are numbered:
% [algopart] algo are numbered within part (counter must exist)
% [algochapter] algo are numbered within chapter
% [algosection] algo are numbered within section
%
% - March 27 2002 - revision 2.39
% * Gilles Geeraerts: added the \SetKwIfElseIf to manage
% if (c)
% i;
% else if (c)
% i;
% ...
% else
% i;
% end
% * Also added \gIf \gElsIf \gElse.
%
% - January 02 2001 - revision 2.38
% * bugs related to the caption in procedure and function
% environment are corrected.
% * bug related to option noend (extra vertical space added
% after block command as If or For) is corrected.
% * czech option language added (thanks to Libor Bus: l.bus@sh.cvut.cz).
%
% - October 16 2000 - revision 2.37
% * option algo2e added: changes the name of environment
% algorithm into algorithm2e. This allows using the package
% with journal styles which already define an algorithm
% environment.
%
% - September 13 2000 - revision 2.36
% * option slide added: require package color
% * Hack for slide class in order to have correct
% margins
%
% - November 25 1999 - revision 2.35
% * revision number match RCS number
% * Thanks to David A. Bader, a new option is added:
% noend: no end keywords are printed.
%
% - November 19 1999 - revision 2.32
% * minor bug on longend option corrected.
%
% - August 26 1999 - revision 2.31
% * add an option : figure
% this option turns algorithms into figures, so they are numbered
% as figures, have Figure as caption and are put in
% the \listoffigures
%
% - January 21 1999 - revision 2.3 beta
% add 2 new environments: procedure and function.
% These environments work like the algorithm environment but:
% - the ruled (or algoruled) style is mandatory.
% - the caption now writes Procedure name....
% - the syntax of the \caption command is restricted as
% follows: you MUST put a name followed by a pair of parentheses like
% this ``()''. You can put arguments inside the parentheses and
% text after. If no argument is given, the parentheses will be
% removed in the title.
% - label now puts the name (the text before the parentheses in the
% caption) of the procedure or function as reference (not
% the number like a classic algorithm environment).
% There are also two new styles : ProcNameSty and
% ProcArgSty. These styles are by default the same as FuncSty
% and ArgSty but are used in the caption of a procedure or a
% function.
%
% - November 28 1996 - revision 2.22
% add a new macro \SetKwInParam{arg1}{arg2}{arg3}:
% it defines a macro \arg1{name}{arg} which prints name in keyword
% style followed by arg surrounded by arg2 and arg3. The main
% application is to define a function working like \SetKwInput to be used
% in the head of the algorithm. For example
% \SetKwInParam{Func}{(}{)} allows
% \Func{functionname}{list of arguments} which prints:
% \KwSty{functionname(}list of arguments\KwSty{)}
%
%
% - November 27 1996 - revision 2.21 :
% minor bug in length of InOut boxes fixed.
% add algorithm* environment.
%
% - July 12 1996 - revision 2.2 : \SetArg and \SetKwArg macros removed.
%
% \SetArg has been removed since it has never been
% documented.
% \SetKwArg has been removed since \SetKw can now
% take an argument in order to be consistent with
% \SetKwData and \SetKwFunction macros.
%
% - July 04 1996 - revision 2.1 : still more LaTeX2e! Minor compatibility break
%
% Macros now use \newcommand instead of \def, use of \setlength,
% \newsavebox, ... and other LaTeX2e specific stuff.
% The compatibility break:
% - \SetData becomes \SetKwData to be more consistent. So the old
% \SetKwData becomes \SetKwInput
% - old macros \titleofalgo, \Freetitleofalgo and \freetitleofalgo
% from the LaTeX209 version, which printed a warning message and called
% \Titleofalgo in version 2.0, are now removed!
%
% - March 13 1996 - revision 2.0: first official major revision.
%
%
%%%%%%%%%%%%%%
%
% Known bugs:
% -----------
% - no more known bugs... all are corrected!
%
%%%%%%%%%%%%%%
%
% Package options:
% ---------------
% - french, english, german, portugues, czech : language used for the algorithm title (e.g. Algorithm, Algorithme)
% - boxed, boxruled, ruled, algoruled, plain : layout of the algorithm
% - algo2e : environment is algorithm2e instead of algorithm
% and \listofalgorithmes instead of \listofalgorithms
% - slide : to use when making slides
% - noline,lined,vlined : how blocks are drawn.
% - linesnumbered : auto numbering of the algorithm's lines
% - algopart,algochapter,algosection : algo numbering within part, chapter or section
% - titlenumbered,titlenotnumbered : numbering of title set by \Titleofalgo
% - figure : algorithms are figures, numbered as figures, and put in the list of figures.
% - resetcount, noresetcount : start value of line numbers.
% - algonl : line numbers preceded by algo number
% - shortend, longend, noend : short or long end keywords (e.g. endif), or no end keyword at all.
%
% defaults are: english,plain,resetcount,titlenotnumbered
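%
% Example (an illustrative sketch; the option combination is arbitrary):
%
%   \usepackage[french,ruled,vlined,linesnumbered]{algorithm2e}
%
% loads the package with French titles, the ruled layout, vertical block
% lines and automatic line numbering; options that are not listed keep the
% defaults given above.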
%
%%%%%%%%%%%%%%
%
% Short summary
% -------------
%
% algorithm is an environment for writing algorithms in LaTeX2e.
% It provides macros that allow you to create different
% sorts of keywords; a set of predefined keywords
% is also given.
%
% It should be used as follows
%
% \begin{algorithm}
% ...
% ...
% \end{algorithm}
%
%
% IMPORTANT : each line MUST end with \;
%
% Note that if you define macros outside the algorithm environment they
% are available in the whole document; in particular you can use them
% inside all algorithms without redefining them.
%
% an example:
%
% \begin{algorithm}[H]
% \SetLine
% \AlgData{this text}
% \AlgResult{how to write algorithm with \LaTeX2e }
%
% initialization\;
% \While{not at end of this document}{
% read current section\;
% \eIf{understand}{
% go to next section\;
% current section becomes this one\;
% }{
% go back to the beginning of current section\;
% }
% }
% \caption{How to write algorithm}
% \end{algorithm}
%
%
%%%%%%%%%%%%%% predefined english keywords
%
% \AlgData{input}
% \AlgResult{output}
% \KwIn{input}
% \KwOut{output}
% \KwData{input}
% \KwResult{output}
% \Ret{[value]}
% \KwTo % a simple keyword
% \Begin{block inside}
% \If{condition}{Then block} % in a block
% \uIf{condition}{Then block} % in a block unended
% \Else{inside Else} % in a block
% \eIf{condition}{Then Block}{Else block} % in blocks
% \lIf{condition}{Then text} % on the same line
% \lElse{Else text} % on the same line
% \Switch{Condition}{Switch block}
% \Case{a case}{case block} % in a block
% \lCase{a case}{case text} % on the same line
% \Other{otherwise block} % in a block
% \lOther{otherwise block} % on the same line
% \For{condition}{text loop} % in a block
% \lFor{condition}{text} % on the same line
% \ForEach{condition}{text loop} % in a block
% \lForEach{condition}{text} % on the same line
% \Repeat{End condition}{text loop} % in a block
% \lRepeat{condition}{text} % on the same line
% \While{condition}{text loop} % in a block
% \lWhile{condition}{text loop} % on the same line
%
%
%%%%%%%%%%%%%% predefined french keywords
%
% \AlgDonnees{input}
% \AlgRes{output}
% \Donnees{input}
% \Res{output}
% \Retour{[value]}
% \Deb{block inside}
% \KwA % a simple keyword
% \Si{condition}{Then block} % in a block
% \uSi{condition}{Then block} % in a block unended
% \eSi{condition}{Then block}{Else block} % in blocks
% \lSi{condition}{Then text} % on the same line
% \lSinon{Else text} % on the same line
% \Suivant{Condition}{Switch block}
% \Cas{a case}{case block} % in a block
% \lCas{a case}{case text} % on the same line
% \Autres{otherwise block} % in a block
% \lAutres{otherwise block} % on the same line
% \Pour{condition}{text loop} % in a block
% \lPour{condition}{text} % on the same line
% \PourCh{condition}{text loop} % in a block
% \lPourCh{condition}{text} % on the same line
% \Repeter{End condition}{text loop} % in a block
% \lRepeter{condition}{text} % on the same line
% \Tq{condition}{text loop} % in a block
% \lTq{condition}{text loop} % on the same line
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% for more complete information, see algorithm2e.tex
%
%
%%%%%%%%%%%%%%%%%%%%%%%% Identification Part %%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
\NeedsTeXFormat{LaTeX2e}[1994/12/01]
%
\ProvidesPackage{algorithm2e}[2005/10/04 v3.9 algorithms environments]
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%% Initial Code %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
\@makeother\*% some packages (e.g. color.sty) redefine it as a letter
%
% definition of commands which can be redefined in options of the package.
%
\newcounter{AlgoLine}
\setcounter{AlgoLine}{0}
%
\newcommand{\listalgorithmcfname}{}
\newcommand{\algorithmcfname}{}
\newcommand{\algocf@typo}{}
\newcommand{\@algocf@procname}{}
\newcommand{\@algocf@funcname}{}
\newcommand{\@algocf@titleofalgoname}{\algorithmcfname}
\newcommand{\@algocf@algotitleofalgo}{%
\renewcommand{\@algocf@titleofalgoname}{\algorithmcfname}}
\newcommand{\@algocf@proctitleofalgo}{%
\renewcommand{\@algocf@titleofalgoname}{\algocf@procname}}
%
\newcommand{\algocf@style}{plain}
\newcommand{\@ResetCounterIfNeeded}{}
\newcommand{\@titleprefix}{}
%
\newcommand{\algocf@numbering}[1]{\newcommand{\algocf@within}{#1}}
%
\newcommand{\defaultsmacros@algo}{\algocf@defaults@shortend}
%
\newcommand{\algocf@list}{loa}
\newcommand{\algocf@float}{algocf}
%
\newcommand{\algocf@envname}{algorithm}
\newcommand{\algocf@listofalgorithms}{listofalgorithms}
%
%
%%%%%%%%%%%%%%%%%%%%%% Declaration of Options %%%%%%%%%%%%%%%%%%%%%%%%%%%
%
\RequirePackage{ifthen}
%
\DeclareOption{algo2e}{%
\renewcommand{\algocf@envname}{algorithm2e}
\renewcommand{\algocf@listofalgorithms}{listofalgorithmes}
}
%
\newboolean{algocf@slide}\setboolean{algocf@slide}{false}
\DeclareOption{slide}{%
\setboolean{algocf@slide}{true}%
}
%
\DeclareOption{figure}{
\renewcommand{\algocf@list}{lof}
\renewcommand{\algocf@float}{figure}
}
%
\DeclareOption{english}{%
\renewcommand{\listalgorithmcfname}{List of Algorithms}%
\renewcommand{\algorithmcfname}{Algorithm}%
\renewcommand{\algocf@typo}{}%
\renewcommand{\@algocf@procname}{Procedure}
\renewcommand{\@algocf@funcname}{Function}
}
%
\DeclareOption{french}{%
\renewcommand{\listalgorithmcfname}{Liste des Algorithmes}%
\renewcommand{\algorithmcfname}{Algorithme}%
\renewcommand{\algocf@typo}{\ }%
\renewcommand{\@algocf@procname}{Procédure}
\renewcommand{\@algocf@funcname}{Fonction}
}
%
\DeclareOption{czech}{%
\renewcommand{\listalgorithmcfname}{Seznam algoritm\v{u}}%
\renewcommand{\algorithmcfname}{Algoritmus}%
\renewcommand{\algocf@typo}{}%
\renewcommand{\@algocf@procname}{Procedura}
\renewcommand{\@algocf@funcname}{Funkce}
}
%
\DeclareOption{german}{%
\renewcommand{\listalgorithmcfname}{Liste der Algorithmen}%
\renewcommand{\algorithmcfname}{Algorithmus}%
\renewcommand{\algocf@typo}{\ }%
\renewcommand{\@algocf@procname}{Prozedur}%
\renewcommand{\@algocf@funcname}{Funktion}%
}
%
\DeclareOption{portugues}{%
\renewcommand{\listalgorithmcfname}{Lista de Algoritmos}%
\renewcommand{\algorithmcfname}{Algoritmo}%
\renewcommand{\algocf@typo}{}%
\renewcommand{\@algocf@procname}{Procedimento}
\renewcommand{\@algocf@funcname}{Fun\c{c}\~{a}o}
}
%
% OPTIONs plain, boxed, ruled, algoruled & boxruled
%
\newcommand{\algocf@style@plain}{\renewcommand{\algocf@style}{plain}}
\newcommand{\algocf@style@boxed}{\renewcommand{\algocf@style}{boxed}}
\newcommand{\algocf@style@ruled}{\renewcommand{\algocf@style}{ruled}}
\newcommand{\algocf@style@algoruled}{\renewcommand{\algocf@style}{algoruled}}
\newcommand{\algocf@style@boxruled}{\renewcommand{\algocf@style}{boxruled}}
\newcommand{\restylealgo}[1]{\csname algocf@style@#1\endcsname}
\DeclareOption{plain}{\algocf@style@plain}
\DeclareOption{boxed}{\algocf@style@boxed}
\DeclareOption{ruled}{\algocf@style@ruled}
\DeclareOption{algoruled}{\algocf@style@algoruled}
\DeclareOption{boxruled}{\algocf@style@boxruled}
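%
% Usage sketch for \restylealgo, which only calls \algocf@style@<name>
% (this assumes that \algocf@style is read when an algorithm environment
% starts, which happens outside this part of the file):
%
%   \restylealgo{boxed}   % algorithms that follow are boxed
%   \begin{algorithm}[H] ... \end{algorithm}
%   \restylealgo{ruled}   % algorithms that follow are ruled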
%
% OPTIONs algopart,algochapter & algosection
%
\DeclareOption{algopart}{\algocf@numbering{part}} %algo part numbered
\DeclareOption{algochapter}{\algocf@numbering{chapter}} %algo chapter numbered
\DeclareOption{algosection}{\algocf@numbering{section}} %algo section numbered
%
% OPTIONs resetcount & noresetcount
%
\DeclareOption{resetcount}{\renewcommand{\@ResetCounterIfNeeded}{\setcounter{AlgoLine}{0}}}
\DeclareOption{noresetcount}{\renewcommand{\@ResetCounterIfNeeded}{}}
%
% OPTION linesnumbered
%
\newboolean{algocf@linesnumbered}\setboolean{algocf@linesnumbered}{false}
\newcommand{\algocf@linesnumbered}{\relax}
\DeclareOption{linesnumbered}{%
\setboolean{algocf@linesnumbered}{true}
\renewcommand{\algocf@linesnumbered}{\everypar={\nl}}
}
%
% OPTION linesnumberedhidden
%
\DeclareOption{linesnumberedhidden}{%
\setboolean{algocf@linesnumbered}{true}
\renewcommand{\algocf@linesnumbered}{\everypar{\stepcounter{AlgoLine}}}
}
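%
% Usage sketch for linesnumberedhidden: every line is counted but no number
% is printed unless \showln (or \showlnlabel{label}) is put on the line.
% The algorithm text and the label name below are only illustrative:
%
%   \usepackage[linesnumberedhidden]{algorithm2e}
%   ...
%   \showln initialization\;                    % the number of this line is printed
%   read current section\;                      % counted, number hidden
%   \showlnlabel{ln:next} go to next section\;  % number printed, \ref{ln:next} gives it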
%
% OPTION commentsnumbered inoutnumbered
%
\newboolean{algocf@commentsnumbered}\setboolean{algocf@commentsnumbered}{false}
\DeclareOption{commentsnumbered}{\setboolean{algocf@commentsnumbered}{true}}
\newboolean{algocf@inoutnumbered}\setboolean{algocf@inoutnumbered}{false}
\DeclareOption{inoutnumbered}{\setboolean{algocf@inoutnumbered}{true}}
%
% OPTIONs titlenumbered & titlenotnumbered
%
\DeclareOption{titlenumbered}{%
\renewcommand{\@titleprefix}{%
\refstepcounter{algocf@float}%
\AlTitleFnt{\@algocf@titleofalgoname\
\expandafter\csname the\algocf@float\endcsname\algocf@typo : }}%
}
%
\DeclareOption{titlenotnumbered}{\renewcommand{\@titleprefix}{%
\AlTitleFnt{\@algocf@titleofalgoname\algocf@typo : }}%
}
%
% OPTIONs lined, vlined & noline
%
\DeclareOption{lined}{\AtBeginDocument{\SetLine}} % \SetLine
\DeclareOption{vlined}{\AtBeginDocument{\SetVline}} % \SetVline
\DeclareOption{noline}{\AtBeginDocument{\SetNoline}} % \SetNoline (default)
%
% OPTIONs algonl
% line numbered with the counter of the algorithm
%
\DeclareOption{algonl}{\renewcommand{\theAlgoLine}{\expandafter\csname the\algocf@float\endcsname.\arabic{AlgoLine}}}
%
% OPTIONs longend, shortend & noend
%
\DeclareOption{longend}{%
\renewcommand{\defaultsmacros@algo}{\algocf@defaults@longend}}
\DeclareOption{shortend}{%
\renewcommand{\defaultsmacros@algo}{\algocf@defaults@shortend}}
\newboolean{algocf@optnoend}\setboolean{algocf@optnoend}{false}
\DeclareOption{noend}{%
\setboolean{algocf@optnoend}{true}%
\renewcommand{\defaultsmacros@algo}{\algocf@defaults@noend}}
%
% OPTION dotocloa
%
\newboolean{algocf@dotocloa}\setboolean{algocf@dotocloa}{false}
\DeclareOption{dotocloa}{%
\setboolean{algocf@dotocloa}{true}
}
%
% OPTION comments
%
\newboolean{algocf@optfillcomment}\setboolean{algocf@optfillcomment}{true}
\DeclareOption{nofillcomment}{%
\setboolean{algocf@optfillcomment}{false}%
}
\DeclareOption{fillcomment}{%
\setboolean{algocf@optfillcomment}{true}%
}
%
% OPTIONs scleft & scright (side comments)
%
\newboolean{algocf@scleft}\setboolean{algocf@scleft}{false}
\DeclareOption{scleft}{%
\setboolean{algocf@scleft}{true}%
}
\DeclareOption{scright}{% default
\setboolean{algocf@scleft}{false}%
}
%
%
%%%%%%%%%%%%%%%%%%%%%%% Execution of Options %%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
\ExecuteOptions{english,plain,resetcount,titlenotnumbered}
%
\ProcessOptions
%
\@algocf@algotitleofalgo % fix name for \Titleofalgo to \algorithmcfname by default
%
%%%%%%%%%%%%%%%%%%%%%%%%%% Package Loading %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\RequirePackage{float}[2001/11/08]
%
\RequirePackage{xspace}
%
\ifthenelse{\boolean{algocf@slide}}{\RequirePackage{color}}{}
%
\AtEndOfPackage{%
\ifthenelse{\boolean{algocf@dotocloa}}{%
\renewcommand{\listofalgorithmes}{\tocfile{\listalgorithmcfname}{loa}}%
}{\relax}
}
% if loa in toc required, load tocbibind package if not already done.
\ifthenelse{\boolean{algocf@dotocloa}}{%
\ifx\@tocextra\undefined%
\RequirePackage{tocbibind}
\fi%
}{\relax}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Main Part %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
\newcommand{\algocf@name}{algorithm2e}
\newcommand{\algocf@date}{october 04 2005}
\newcommand{\algocf@version}{Release 3.9}
\newcommand{\algocf@id}{\algocf@version\space -- \algocf@date\space --}
\typeout{********************************************************^^JPackage `\algocf@name'\space\algocf@id^^J%
- algorithm2e-announce@lirmm.fr mailing list for announcement about releases^^J%
- algorithm2e-discussion@lirmm.fr mailing list for discussion about package^^J%
subscribe by emailing sympa@lirmm.fr with 'subscribe <list> <firstname name>'^^J%
- Author: Christophe Fiorio (fiorio@lirmm.fr)^^J********************************************************}
%%
%%
%%
%%
%%
%%
%%%% hyperref compatibility tricks: the Hyperref package defines H counters from
% standard counters (e.g. \theHpage from \thepage) and checks some particular
% counters of some packages. Unfortunately it doesn't do the same for the
% algorithm2e package but acts as if the H counter were defined. To avoid errors we
% define \theHalgocf ourselves
%%%%
% \@ifundefined{theHalgocf}{\def\theHalgocf{\thealgocf}}{}%
% \@ifundefined{theHAlgoLine}{\def\theHAlgoLine{\theAlgoLine}}{}%
% \@ifundefined{theHalgocf}{\def\theHalgocf{\thealgocf}}{}%
% \@ifundefined{theHAlgoLine}{\def\theHAlgoLine{\thealgocf}}{}%
% \@ifundefined{toclevel@algocf}{\def\toclevel@algocf{0}}{}%
%%
%%
%%
\newcommand{\@defaultskiptotal}{0.5em}%\Setnlskip{0.5em}
\newskip\skiptotal\skiptotal=0.5em%\Setnlskip{0.5em}
\newskip\skiprule
\newskip\skiphlne
\newskip\skiptext
\newskip\skiplength
\newskip\algomargin
\newskip\skipalgocfslide\skipalgocfslide=1em
\newdimen\algowidth
\newdimen\inoutsize
\newdimen\inoutline
%
\newcommand{\@algoskip}{\smallskip}%
\newcommand{\SetAlgoSkip}[1]{\renewcommand{\@algoskip}{\csname#1\endcsname}}%
\newcommand{\@algoinsideskip}{\relax}%
\newcommand{\SetAlgoInsideSkip}[1]{\renewcommand{\@algoinsideskip}{\csname#1\endcsname}}%
%
\newsavebox{\algocf@inoutbox}
\newsavebox{\algocf@inputbox}
%%
%%
\newcommand{\arg@e}{}
\newcommand{\arg@space}{\ }
\newcommand{\BlankLine}{\vskip 1ex}
%%
\newcommand{\vespace}{1ex}
\newcommand{\SetInd}[2]{%
\skiprule=#1%
\skiptext=#2%
\skiplength=\skiptext\advance\skiplength by \skiprule\advance\skiplength by 0.4pt}
\SetInd{0.5em}{1em}
\algomargin=\leftskip\advance\algomargin by \parindent
\newcommand{\incmargin}[1]{\advance\algomargin by #1}
\newcommand{\decmargin}[1]{\advance\algomargin by -#1}
\newcommand{\Setnlskip}[1]{%
\renewcommand{\@defaultskiptotal}{#1}%
\setlength{\skiptotal}{#1}}
\newcommand{\setnlskip}[1]{\Setnlskip{#1}}%kept for compatibility issue
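%
% Usage sketch for the indentation macros above (the lengths are arbitrary
% examples, not recommended values):
%
%   \SetInd{0.25em}{1.5em}   % 0.25em before the vertical rule of a block, 1.5em after it
%   \Setnlskip{1em}          % distance between the line numbers and the text
%   \incmargin{1em}          % enlarge the left margin of the algorithms that follow
%   \begin{algorithm} ... \end{algorithm}
%   \decmargin{1em}          % restore the previous margin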
%%
\newskip\AlCapSkip\AlCapSkip=0ex
\newskip\AlCapHSkip\AlCapHSkip=0ex
\newcommand{\setalcapskip}[1]{\setlength{\AlCapSkip}{#1}}
\newcommand{\setalcaphskip}[1]{\setlength{\AlCapHSkip}{#1}}
\setalcaphskip{.5\algomargin}
%%
%%
\newcommand{\Indentp}[1]{\advance\leftskip by #1}
\newcommand{\Indp}{\advance\leftskip by 1em}
\newcommand{\Indpp}{\advance\leftskip by 0.5em}
\newcommand{\Indm}{\advance\leftskip by -1em}
\newcommand{\Indmm}{\advance\leftskip by -0.5em}
%%
%%
%% Line Numbering
%%
%%
% number line style
\newcommand{\nlSty}[1]{\textnormal{\textbf{#1}}}% default definition
\newcommand{\Setnlsty}[3]{\renewcommand{\nlSty}[1]{\textnormal{\csname#1\endcsname{#2##1#3}}}}
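%
% Usage sketch: the first argument of \Setnlsty is a font command name given
% without backslash, the other two are printed before and after the number.
%
%   \Setnlsty{texttt}{(}{)}   % line numbers are printed as (1), (2), ... in typewriter font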
%
%
\newcommand{\algocf@nlhlabel}[2]{%
\immediate\write\@auxout{%
\string\newlabel{#1}{%
{#2}% current label
{\thepage}% page
{}% current label string
% {AlgoLine\thealgocfline.\theAlgoLine}% current Href
{AlgoLine\thealgocfline.\theAlgoLine}% current Href
{}%
}%
}%
}
%
% nl definitions
%
\newcommand{\nl}{%
\@ifundefined{href}{% if not hyperref then do a simple refstepcounter
\refstepcounter{AlgoLine}%
}{% else if hyperref, do the anchor so 2 lines in two different algorithms cannot have the same href
% \stepcounter{AlgoLine}\Hy@raisedlink{\hyper@anchorstart{AlgoLine\thealgocfline.\theAlgoLine}\hyper@anchorend}%
\stepcounter{AlgoLine}\Hy@raisedlink{\hyper@anchorstart{AlgoLine\thealgocfline.\theAlgoLine}\hyper@anchorend}%
}% now we can do the line numbering
\strut\vadjust{\kern-\dp\strutbox\vtop to \dp\strutbox{%
\baselineskip\dp\strutbox\vss\llap{\scriptsize{\nlSty{\theAlgoLine}\hskip\skiptotal}}\null}}%
}%
\newcommand{\nllabel}[1]{%
\@ifundefined{href}{\label{#1}}{\algocf@nlhlabel{#1}{\theAlgoLine}}}%
%
\newcommand{\enl}{;%
\@ifundefined{href}{% if not hyperref then do a simple refstepcounter
\refstepcounter{AlgoLine}%
}{% else if hyperref, do the anchor so 2 lines in two different algorithms cannot have the same href
% \stepcounter{AlgoLine}\Hy@raisedlink{\hyper@anchorstart{AlgoLine\thealgocfline.\theAlgoLine}\hyper@anchorend}%
\stepcounter{AlgoLine}\Hy@raisedlink{\hyper@anchorstart{AlgoLine\thealgocfline.\theAlgoLine}\hyper@anchorend}%
}% now we can do the line numbering
\hfill\rlap{%
\scriptsize{\nlSty{\theAlgoLine}}}\par}
\newcommand{\nlset}[1]{%
\hskip 0pt\llap{%
\scriptsize{\nlSty{#1}}\hskip\skiptotal}\ignorespaces}
%
% lnl definitions
%
\@ifundefined{href}{% if not hyperref
\newcommand{\lnl}[1]{\nl\label{#1}\ignorespaces}%
}{% else hyperref
\newcommand{\lnl}[1]{\nl\algocf@nlhlabel{#1}{\theAlgoLine}\ignorespaces}%
}
%
% lnlset definitions
%
\@ifundefined{href}{%
\newcommand{\lnlset}[2]{\nlset{#2}\protected@edef\@currentlabel{#2}\label{#1}}%
}{%else hyperref
\newcommand{\lnlset}[2]{\nlset{#2}%
\Hy@raisedlink{\hyper@anchorstart{AlgoLine.#2}\hyper@anchorend}\algocf@nlhlabel{#1}{#2}%
\ignorespaces%
}%
}
%
% set char put at end of each line
%
\newcommand{\algocf@endline}{\string;}
\newcommand{\SetEndCharOfAlgoLine}[1]{\renewcommand{\algocf@endline}{#1}}
%
% end of line definition
%
\newcommand{\@endalgoln}{\algocf@endline\par}% default definition: printsemicolon
\newcommand{\dontprintsemicolon}{\renewcommand{\@endalgoln}{\par}}
\newcommand{\printsemicolon}{\renewcommand{\@endalgoln}{\algocf@endline\par}}
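%
% Usage sketch for the end-of-line macros above (it assumes that \; ends a
% line through \@endalgoln, which is set up outside this part of the file):
%
%   \SetEndCharOfAlgoLine{.}   % lines now end with a full stop instead of ;
%   \dontprintsemicolon        % lines end without any character
%   \printsemicolon            % the current end character is printed again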
%
% line numbering
%
\newcommand{\linesnumbered}{\setboolean{algocf@linesnumbered}{true}\renewcommand{\algocf@linesnumbered}{\everypar={\nl}}}
\newcommand{\linesnotnumbered}{%
\setboolean{algocf@linesnumbered}{false}%
\renewcommand{\algocf@linesnumbered}{\relax}%
}
%
\newcommand{\linesnumberedhidden}{%
\setboolean{algocf@linesnumbered}{true}\renewcommand{\algocf@linesnumbered}{\everypar{\stepcounter{AlgoLine}}}}
\newcommand{\showln}{\nlset{\theAlgoLine}\ignorespaces} % display the line number on this line (without labelling)
\newcommand{\showlnlabel}[1]{\lnlset{#1}{\theAlgoLine}\ignorespaces} % display the line number and label this line
%
%%
%
%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Styling text commands
%
\newcommand{\AlTitleFnt}[1]{\textbf{#1}\unskip}% default definition
\newcommand{\SetAlTitleFnt}[1]{\renewcommand{\AlTitleFnt}[1]{\csname#1\endcsname{##1}\unskip}}%
\newcommand{\AlFnt}{\relax}% default definition
\newcommand{\SetAlFnt}[1]{\renewcommand{\AlFnt}{#1}}%
\newcommand{\AlCapFnt}{\AlFnt{}}% default definition
\newcommand{\SetAlCapFnt}[1]{\renewcommand{\AlCapFnt}{#1}}%
\newcommand{\KwSty}[1]{\textnormal{\textbf{#1}}\unskip}% default definition
\newcommand{\SetKwSty}[1]{\renewcommand{\KwSty}[1]{\textnormal{\csname#1\endcsname{##1}}\unskip}}%
\newcommand{\ArgSty}[1]{\textnormal{\emph{#1}}\unskip}%\SetArgSty{emph}
\newcommand{\SetArgSty}[1]{\renewcommand{\ArgSty}[1]{\textnormal{\csname#1\endcsname{##1}}\unskip}}%
\newcommand{\FuncSty}[1]{\textnormal{\texttt{#1}}\unskip}%\SetFuncSty{texttt}
\newcommand{\SetFuncSty}[1]{\renewcommand{\FuncSty}[1]{\textnormal{\csname#1\endcsname{##1}}\unskip}}%
\newcommand{\DataSty}[1]{\textnormal{\textsf{#1}}\unskip}%%\SetDataSty{textsf}
\newcommand{\SetDataSty}[1]{\renewcommand{\DataSty}[1]{\textnormal{\csname#1\endcsname{##1}}\unskip}}%
\newcommand{\CommentSty}[1]{\textnormal{\texttt{#1}}\unskip}%%\SetCommentSty{texttt}
\newcommand{\SetCommentSty}[1]{\renewcommand{\CommentSty}[1]{\textnormal{\csname#1\endcsname{##1}}\unskip}}%
\newcommand{\TitleSty}[1]{#1\unskip}%\SetTitleSty{}{}
\newcommand{\SetTitleSty}[2]{\renewcommand{\TitleSty}[1]{%
\csname#1\endcsname{\csname#2\endcsname##1}\unskip}}
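%
% Usage sketch for the style setters above. Apart from \SetAlFnt and
% \SetAlCapFnt, which take font declarations, the arguments are command
% names given without backslash (the chosen fonts are arbitrary examples):
%
%   \SetAlFnt{\small\sffamily}    % body of the algorithms in small sans serif
%   \SetKwSty{texttt}             % keywords in typewriter instead of bold
%   \SetArgSty{textrm}            % arguments upright instead of emphasized
%   \SetTitleSty{textbf}{large}   % \TitleSty prints its text in large bold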
%
%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Block basic commands
%
\newcommand{\al@push}[1]{\advance\skiptotal by #1\moveright #1}
\newcommand{\al@pop}[1]{\advance\skiptotal by -#1}
\newcommand{\al@addskiptotal}{\advance\skiptotal by 0.4pt\advance\hsize by -0.4pt} % 0.4 pt=width of \vrule
\newcommand{\al@subskiptotal}{\advance\skiptotal by -0.4pt\advance\hsize by 0.4pt} % 0.4 pt=width of \vrule
%
\skiphlne=.8ex%
\newcommand{\Setvlineskip}[1]{\skiphlne=#1}
\newcommand{\V@line}[1]{% no vskip in between boxes but a strut to separate them,
\strut\par\nointerlineskip% so the interblock space stays the same whatever is inside it
\al@push{\skiprule}% move to the right before the vertical rule
\hbox{\vrule%
\vtop{\al@push{\skiptext}%move the right after the rule
\vtop{\al@addskiptotal\advance\hsize by -\skiplength #1}\Hlne}}\vskip\skiphlne% inside the block
\al@pop{\skiprule}%\al@subskiptotal% restore indentation
\nointerlineskip}% no vskip after
%
\newcommand{\V@sline}[1]{% no vskip in between boxes but a strut to separate them,
\strut\par\nointerlineskip% so the interblock space stays the same whatever is inside it
\al@push{\skiprule}% move to the right before the vertical rule
\hbox{\vrule% the vertical rule
\vtop{\al@push{\skiptext}%move the right after the rule
\vtop{\al@addskiptotal\advance\hsize by -\skiplength #1}}}% inside the block
\al@pop{\skiprule}}% restore indentation
%\nointerlineskip}% no vskip after
%
\newcommand{\H@lne}{\hrule height 0.4pt depth 0pt width .5em}
%
\newcommand{\No@line}[1]{% no vskip in between boxes but a strut to separate them,
\strut\par\nointerlineskip% so the interblock space stays the same whatever is inside it
\al@push{\skiprule}%
\hbox{%
\vtop{\al@push{\skiptext}%
\vtop{\advance\hsize by -\skiplength #1}}}% inside the block
\al@pop{\skiprule}}%
%\nointerlineskip}% no vskip after
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%% default=NoLine
%
\newcommand{\a@@block}[2]{\No@line{#1}\KwSty{#2}\par}
\newcommand{\a@block}[2]{\a@@block{#1}{#2}} % this is to be redefined as a@group in
% case of noend option
\newcommand{\a@group}[1]{\No@line{#1}}
\newcommand{\Hlne}{}
%
%
\newcommand{\SetNoline}{%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Noline
\renewcommand{\a@@block}[2]{\No@line{##1}\KwSty{##2}\strut\par}%
\renewcommand{\a@group}[1]{\No@line{##1}}
\renewcommand{\Hlne}{}}
%
\newcommand{\SetVline}{%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Vline
\renewcommand{\a@@block}[2]{\V@line{##1}}%
\renewcommand{\a@group}[1]{\V@sline{##1}\strut\ignorespaces}
\renewcommand{\Hlne}{\H@lne}}
%
\newcommand{\SetLine}{%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Line
\renewcommand{\a@@block}[2]{\strut\V@sline{##1}\KwSty{##2}\strut\par}% no skip after a block, so guarantee at least a line
\renewcommand{\a@group}[1]{\V@sline{##1}\strut\ignorespaces}
\renewcommand{\Hlne}{}}
%
\newcommand{\SetNothing}{%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Nothing
\renewcommand{\a@@block}[2]{\No@line{##1}\par}%
%\long
\renewcommand{\a@group}[1]{\No@line{##1}}
\renewcommand{\Hlne}{}}
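%
% Usage sketch: the layout of blocks can be switched in the document, before
% or inside an algorithm environment, with one of:
%
%   \SetLine     % vertical line along each block, end keyword printed
%   \SetVline    % vertical line closed by a small horizontal rule, no end keyword
%   \SetNoline   % no line, end keyword printed (default)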
%
%%
%%
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% ``Input :'''s like command
%
%%%
% text staying at the right of the longest keyword of KwInOut commands
% (texts of KwInOut commands are all vertically aligned)
%
\newcommand{\algocf@newinout}{\par\parindent=\wd\algocf@inoutbox}% to put right indentation after a \\ in the KwInOut
\newcommand{\SetKwInOut}[2]{%
\sbox\algocf@inoutbox{\hbox{\KwSty{#2}\algocf@typo:\ }}%
\expandafter\ifx\csname InOutSizeDefined\endcsname\relax% if first time used
\newcommand\InOutSizeDefined{}\setlength{\inoutsize}{\wd\algocf@inoutbox}%
\else% else keep the larger dimension
\ifdim\wd\algocf@inoutbox>\inoutsize\setlength{\inoutsize}{\wd\algocf@inoutbox}\fi%
\fi% the dimension of the box is now defined.
\@ifundefined{#1}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
\expandafter\algocf@mkcmd\csname#1\endcsname[1]{%
\ifthenelse{\boolean{algocf@inoutnumbered}}{\relax}{\everypar={\relax}}
{\let\\\algocf@newinout\hangindent=\wd\algocf@inoutbox\hangafter=1\parbox[t]{\inoutsize}{\KwSty{#2}\hfill:\mbox{\ }}##1\par}
\algocf@linesnumbered% reset the numbering of the lines
}}%
%
%% allows adjusting the skip size of InOut
%%
\newcommand{\ResetInOut}[1]{%
\sbox\algocf@inoutbox{\hbox{\KwSty{#1}\algocf@typo:\ }}%
\setlength{\inoutsize}{\wd\algocf@inoutbox}%
}
%
%
%%%
% text staying at the right of the keyword.
%
\newcommand{\algocf@newinput}{\par\parindent=\wd\algocf@inputbox}% to put right indentation after a \\ in the KwInput
\newcommand{\SetKwInput}[2]{%
\@ifundefined{#1}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
\expandafter\algocf@mkcmd\csname#1\endcsname[1]{%
\sbox\algocf@inputbox{\hbox{\KwSty{#2}\algocf@typo: }}%
\ifthenelse{\boolean{algocf@inoutnumbered}}{\relax}{\everypar={\relax}}%
{\let\\\algocf@newinput\hangindent=\wd\algocf@inputbox\hangafter=1\unhbox\algocf@inputbox##1\par}%
\algocf@linesnumbered% reset the numbering of the lines
}}%
\newcommand{\SetKwData}[2]{%
\@ifundefined{#1}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
\expandafter\algocf@mkcmd\csname @#1\endcsname[1]{\DataSty{#2(}\ArgSty{##1}\DataSty{)}}%
\expandafter\algocf@mkcmd\csname#1\endcsname{%
\@ifnextchar\bgroup{\csname @#1\endcsname}{\DataSty{#2}\xspace}}%
}
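%
% Usage sketch for the three macros above (the names Input, Output and Up
% are illustrative choices, not names defined by this part of the file):
%
%   \SetKwInOut{Input}{input}\SetKwInOut{Output}{output}
%   \SetKwData{Up}{up}
%   ...
%   \Input{a sorted array}   % the labels of Input and Output are set in boxes of equal width
%   \Output{an index}
%   \Up                      % prints up in data style
%   \Up{i}                   % prints up(i), with i in argument style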
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Comments macros
%
%%%%
% comment in the text, first argument is the name of the macro, second is
% the text put before the comment, third is the text put at the end of the
% comment.
%
% first side comment justification
\newcommand{\SetSideCommentLeft}{\setboolean{algocf@scleft}{true}}
\newcommand{\SetSideCommentRight}{\setboolean{algocf@scleft}{false}}
\newcommand{\SetNoFillComment}{\setboolean{algocf@optfillcomment}{false}}
\newcommand{\SetFillComment}{\setboolean{algocf@optfillcomment}{true}}
%
% next comment and side comment
%
\newcommand{\algocf@endmarkcomment}{\relax}%
\newcommand{\algocf@fillcomment}{%
\ifthenelse{\boolean{algocf@optfillcomment}}{\hfill}{\relax}}%
%
\newcommand{\algocf@startcomment}{%
\hangindent=\wd\algocf@inputbox\hangafter=1\usebox\algocf@inputbox}%
\newcommand{\algocf@endcomment}{\algocf@fillcomment\algocf@endmarkcomment\ignorespaces\par}%
\newcommand{\algocf@endstartcomment}{\algocf@endcomment\algocf@startcomment\ignorespaces}%
%
\newboolean{algocf@sidecomment}%
\newboolean{algocf@altsidecomment}\setboolean{algocf@altsidecomment}{false}%
\newcommand{\algocf@scpar}{\ifthenelse{\boolean{algocf@altsidecomment}}{\relax}{\par}}%
\newcommand{\algocf@sclfill}{\ifthenelse{\boolean{algocf@scleft}}{\algocf@fillcomment}{\relax}}%
\newcommand{\algocf@scrfill}{\ifthenelse{\boolean{algocf@scleft}}{\relax}{\hfill}}
\newcommand{\algocf@startsidecomment}{\usebox\algocf@inputbox}%
\newcommand{\algocf@endsidecomment}{\algocf@endmarkcomment\algocf@scpar}%
\newcommand{\algocf@endstartsidecomment}{%
\algocf@sclfill\algocf@endsidecomment%
\algocf@scrfill\algocf@startsidecomment\ignorespaces}%
%
\newcommand{\SetKwComment}[3]{%
% newcommand or renewcommand ?
\@ifundefined{#1}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
%%% comment definition
\expandafter\algocf@mkcmd\csname algocf@#1\endcsname[1]{%
\sbox\algocf@inputbox{\CommentSty{\hbox{#2}}}%
\ifthenelse{\boolean{algocf@commentsnumbered}}{\relax}{\everypar={\relax}}%
{\renewcommand{\algocf@endmarkcomment}{#3}%
\let\\\algocf@endstartcomment%
\algocf@startcomment\CommentSty{%
\strut\ignorespaces##1\strut\algocf@fillcomment#3}\par}%
\algocf@linesnumbered% reset the numbering of the lines
}%
%%% side comment definitions
% option or not?
\expandafter\algocf@mkcmd\csname algocf@#1@star\endcsname{%
\@ifnextchar [{\csname algocf@#1@staropt\endcsname}{\csname algocf@#1@sidecomment\endcsname}%
}%
% manage option
\expandafter\def\csname algocf@#1@staropt\endcsname[##1]##2{%
\ifthenelse{\boolean{algocf@scleft}}{\setboolean{algocf@sidecomment}{true}}{\setboolean{algocf@sidecomment}{false}}%
\ifx##1h\setboolean{algocf@altsidecomment}{true}\SetSideCommentLeft\fi%
\ifx##1f\setboolean{algocf@altsidecomment}{true}\SetSideCommentRight\fi%
\ifx##1l\setboolean{algocf@altsidecomment}{false}\SetSideCommentLeft\fi%
\ifx##1r\setboolean{algocf@altsidecomment}{false}\SetSideCommentRight\fi%
\csname algocf@#1@sidecomment\endcsname{##2}% call sidecomment
\ifthenelse{\boolean{algocf@sidecomment}}{\setboolean{algocf@scleft}{true}}{\setboolean{algocf@scleft}{false}}%
\setboolean{algocf@altsidecomment}{false}%
}%
% side comment
\expandafter\algocf@mkcmd\csname algocf@#1@sidecomment\endcsname[1]{%
\sbox\algocf@inputbox{\CommentSty{\hbox{#2}}}%
\ifthenelse{\boolean{algocf@commentsnumbered}}{\relax}{\everypar={\relax}}%
{%
\renewcommand{\algocf@endmarkcomment}{#3}%
\let\\\algocf@endstartsidecomment%
% here is the comment
\ifthenelse{\boolean{algocf@altsidecomment}}{\relax}{\algocf@endline\ }%
\algocf@scrfill\algocf@startsidecomment\CommentSty{%
\strut\ignorespaces##1\strut\algocf@sclfill#3}\algocf@scpar%
}%
\algocf@linesnumbered% reset the numbering of the lines
}
\expandafter\algocf@mkcmd\csname#1\endcsname{\@ifstar{\csname algocf@#1@star\endcsname}{\csname algocf@#1\endcsname}}
}%
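%
% Usage sketch for \SetKwComment (illustrative only, written as comments so
% that the package code is unchanged). It assumes the default definitions
% \SetKwComment{tcc}{/* }{ */} and \SetKwComment{tcp}{// }{} are in effect:
%
%   \tcc{a comment on its own line}
%   $x \leftarrow 0$ \tcp*[l]{ends the line, comment follows the code}
%   $y \leftarrow 1$ \tcp*[r]{ends the line, comment flushed right}
%   \If(\tcp*[h]{in place, line not ended}){$x = y$}{ ... }
%   \If(\tcp*[f]{flushed right, line not ended}){$x = y$}{ ... }
%
% With l and r the side-comment macro emits the end-of-line mark itself, so
% no \; is written before it; h and f leave the current line open.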
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Kw
%
\newcommand{\SetKw}[2]{%
\@ifundefined{#1}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
\expandafter\algocf@mkcmd\csname @#1\endcsname[1]{\KwSty{#2} \ArgSty{##1}}%
\expandafter\algocf@mkcmd\csname#1\endcsname{%
\@ifnextchar\bgroup{\csname @#1\endcsname}{\KwSty{#2}\xspace}}%
}
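%
% Usage sketch for \SetKw (illustrative only; the macro name KwDownTo is a
% hypothetical example, not something this package defines):
%
%   \SetKw{KwDownTo}{downto}
%   \For{$i \leftarrow n$ \KwDownTo $1$}{ ... }  % bare keyword in \KwSty
%   \KwDownTo{$1$}                               % keyword, then argument in \ArgSty
%
% The test on \bgroup above selects between the two forms: a following
% brace group is taken as the argument, otherwise only the keyword is set.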
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% KwFunction
%
\newcommand{\SetKwFunction}[2]{%
\@ifundefined{#1}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
\expandafter\algocf@mkcmd\csname @#1\endcsname[1]{\FuncSty{#2(}\ArgSty{##1}\FuncSty{)}}%
\expandafter\algocf@mkcmd\csname#1\endcsname{%
\@ifnextchar\bgroup{\csname @#1\endcsname}{\FuncSty{#2}\xspace}}%
}
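%
% Usage sketch for \SetKwFunction (illustrative only; Merge is a
% hypothetical name chosen for the example):
%
%   \SetKwFunction{Merge}{merge}
%   $C \leftarrow$ \Merge{$A$, $B$}\;   % typesets merge(A, B)
%   call \Merge on both halves\;        % no brace group: bare name merge
%
% The parentheses are set in \FuncSty and the argument in \ArgSty, as in
% the definition of \@#1 above.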
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% KwBlock
%
\newcommand{\SetKwBlock}[3]{%
\@ifundefined{algocf@#1}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
% side text or not?
\expandafter\def\csname#1\endcsname{ %Begin
\@ifnextchar({\csname algocf@#1opt\endcsname}{\csname algocf@#1\endcsname}}
% with side text
\expandafter\def\csname algocf@#1opt\endcsname(##1)##2{% \Begin(){}
\KwSty{#2} ##1\a@group{##2}\KwSty{#3}%
\@ifnextchar({\csname algocf@#1end\endcsname}{\par}}%
% without side text at the beginning
\expandafter\algocf@mkcmd\csname algocf@#1\endcsname[1]{% \Begin{}
\KwSty{#2}\a@group{##1}\KwSty{#3}\@ifnextchar({\csname algocf@#1end\endcsname}{\par}}%
% side text at the end
\expandafter\def\csname algocf@#1end\endcsname(##1){% \Begin{}
\ ##1\par}%
}
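%
% Usage sketch for \SetKwBlock (illustrative only; Init is a hypothetical
% name chosen for the example):
%
%   \SetKwBlock{Init}{initialisation}{end}
%   \Init{$x \leftarrow 0$\;}                 % plain block
%   \Init(side text){$x \leftarrow 0$\;}      % text after the opening keyword
%   \Init{$x \leftarrow 0$\;}(closing text)   % text after the closing keyword
%
% The parenthesised forms are detected with \@ifnextchar( exactly as in the
% three helper macros defined above.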
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% For Switch
%
\newcommand{\SetKwSwitch}[8]{% #1=\Switch #2=\Case #3=\Other #4=switch #5=do #6=case #7=otherwise #8=endsw
%%%% Switch
\@ifundefined{algocf@#1}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
% side text or not?
\expandafter\def\csname#1\endcsname{ %Switch
\@ifnextchar({\csname algocf@#1opt\endcsname}{\csname algocf@#1\endcsname}}
% with side text
\expandafter\def\csname algocf@#1opt\endcsname(##1)##2##3{% \Switch(){}{}
\KwSty{#4} \ArgSty{##2} \KwSty{#5} ##1\a@block{##3}{#8}}%
% without side text
\expandafter\algocf@mkcmd\csname algocf@#1\endcsname[2]{% \Switch{}{}
\KwSty{#4} \ArgSty{##1} \KwSty{#5}\a@block{##2}{#8}}%
% side text at the end
\expandafter\def\csname algocf@#1end\endcsname(##1){% \Switch{}{}()
}
%%%% Case
\@ifundefined{algocf@#2}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
% side text or not?
\expandafter\def\csname#2\endcsname{ %Case
\@ifnextchar({\csname algocf@#2opt\endcsname}{\csname algocf@#2\endcsname}}
\expandafter\def\csname u#2\endcsname{ %uCase
\@ifnextchar({\csname algocf@u#2opt\endcsname}{\csname algocf@u#2\endcsname}}
\expandafter\def\csname l#2\endcsname{ %lCase
\@ifnextchar({\csname algocf@l#2opt\endcsname}{\csname algocf@l#2\endcsname}}
% with side text
\expandafter\def\csname algocf@#2opt\endcsname(##1)##2##3{% \Case(){}{}
\KwSty{#6} \ArgSty{##2} ##1\a@block{##3}{#8}}%
\expandafter\def\csname algocf@u#2opt\endcsname(##1)##2##3{% \uCase(){}{}
\KwSty{#6} \ArgSty{##2} ##1\a@group{##3}}%
\expandafter\def\csname algocf@l#2opt\endcsname(##1)##2##3{% \lCase(){}{}
\KwSty{#6} \ArgSty{##2} ##3\algocf@endline\ ##1\par}%
% without side text
\expandafter\algocf@mkcmd\csname algocf@#2\endcsname[2]{% \Case{}{}
\KwSty{#6} \ArgSty{##1}\a@block{##2}{#8}}%
\expandafter\algocf@mkcmd\csname algocf@u#2\endcsname[2]{% \uCase{}{}
\KwSty{#6} \ArgSty{##1}\a@group{##2}}%
\expandafter\algocf@mkcmd\csname algocf@l#2\endcsname[2]{% \lCase{}{}
\KwSty{#6} \ArgSty{##1} ##2}%
%%%% Other
\@ifundefined{algocf@#3}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
% side text or not?
\expandafter\def\csname#3\endcsname{ %Other
\@ifnextchar({\csname algocf@#3opt\endcsname}{\csname algocf@#3\endcsname}}
\expandafter\def\csname l#3\endcsname{ %lOther
\@ifnextchar({\csname algocf@l#3opt\endcsname}{\csname algocf@l#3\endcsname}}
% with side text
\expandafter\def\csname algocf@#3opt\endcsname(##1)##2{% \Other(){}
\KwSty{#7} ##1\a@block{##2}{#8}}%
\expandafter\def\csname algocf@l#3opt\endcsname(##1)##2{% \lOther(){}
\KwSty{#7} ##2\algocf@endline\ ##1\par}%
% without side text
\expandafter\algocf@mkcmd\csname algocf@#3\endcsname[1]{% default
\KwSty{#7}\a@block{##1}{#8}}%
\expandafter\algocf@mkcmd\csname algocf@l#3\endcsname[1]{% ldefault
\KwSty{#7} ##1}%
}
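%
% Usage sketch for \SetKwSwitch (illustrative only). The declaration below
% mirrors the English long-end defaults set later in this file:
%
%   \SetKwSwitch{Switch}{Case}{Other}{switch}{do}{case}{otherwise}{endsw}
%   \Switch{$x$}{
%     \Case{$1$}{ ... }              % block closed by the end keyword
%     \uCase{$2$}{ ... }             % block without end keyword
%     \lCase{$3$}{$y \leftarrow 0$\;}  % statement on the same line
%     \Other{ ... }                  % default branch
%   }
%
% Each of these macros also accepts a leading (side text) argument.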
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% If macros
%
\newcommand{\SetKwIF}[8]{% #1=\If #2=\ElseIf #3=\Else #4=if #5=then #6=elseif #7=else #8=endif
%
% common text
\@ifundefined{#1@ifthen}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
\expandafter\algocf@mkcmd\csname #1@ifthen\endcsname[1]{%
\KwSty{#4} \ArgSty{##1} \KwSty{#5}}%
\expandafter\algocf@mkcmd\csname #1@endif\endcsname[1]{\a@block{##1}{#8}}%
\expandafter\algocf@mkcmd\csname #1@noend\endcsname[1]{\a@group{##1}}%
\expandafter\algocf@mkcmd\csname #1@else\endcsname[1]{\a@group{##1}\KwSty{#7}}%
\@ifundefined{#2@elseif}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
\expandafter\algocf@mkcmd\csname #2@elseif\endcsname[1]{%
\KwSty{#6} \ArgSty{##1} \KwSty{#5}}%
\@ifundefined{#3@else}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
\expandafter\algocf@mkcmd\csname #3@else\endcsname{\KwSty{#7}}%
%%%% If then { } endif
%
\@ifundefined{algocf@#1}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
% side text or not?
\expandafter\def\csname#1\endcsname{%
\@ifnextchar({\csname algocf@#1opt\endcsname}{\csname algocf@#1\endcsname}}
% with side text
\expandafter\def\csname algocf@#1opt\endcsname(##1)##2##3{% \If(){}{}
\csname #1@ifthen\endcsname{##2} ##1\csname #1@endif\endcsname{##3}}%
% without side text
\expandafter\algocf@mkcmd\csname algocf@#1\endcsname[2]{% \If{}{}
\csname #1@ifthen\endcsname{##1}\csname #1@endif\endcsname{##2}}%
%
%%%% If then {} else {} endif
%
% side text or not?
\expandafter\def\csname e#1\endcsname{%
\@ifnextchar({\csname algocf@e#1opt\endcsname}{\csname algocf@e#1optif\endcsname}}
% with side text after if
\expandafter\def\csname algocf@e#1opt\endcsname(##1)##2##3{% \eIf(){}{}
\csname #1@ifthen\endcsname{##2} ##1\csname #1@else\endcsname{##3}%
\csname algocf@e#1opte\endcsname}
% without side text after if
\expandafter\def\csname algocf@e#1optif\endcsname##1##2{% \eIf{}{}
\csname #1@ifthen\endcsname{##1}\csname #1@else\endcsname{##2}%
\csname algocf@e#1opte\endcsname}%
% side text after else or not ?
\expandafter\def\csname algocf@e#1opte\endcsname{%
\@ifnextchar({\csname algocf@e#1optopt\endcsname}{\csname algocf@e#1\endcsname}}
% else with a side text
\expandafter\def\csname algocf@e#1optopt\endcsname(##1)##2{%
##1\csname #1@endif\endcsname{##2}}
% else without side text
\expandafter\algocf@mkcmd\csname algocf@e#1\endcsname[1]{%
\csname #1@endif\endcsname{##1}}
%
%%%% If then
%
% side text or not?
\expandafter\def\csname l#1\endcsname{% lif
\@ifnextchar({\csname algocf@l#1opt\endcsname}{\csname algocf@l#1\endcsname}}
\expandafter\def\csname u#1\endcsname{% uif
\@ifnextchar({\csname algocf@u#1opt\endcsname}{\csname algocf@u#1\endcsname}}
% with side text
\expandafter\def\csname algocf@l#1opt\endcsname(##1)##2##3{% \lIf(){}{}
\csname #1@ifthen\endcsname{##2} ##3\algocf@endline\ ##1\par}%
\expandafter\def\csname algocf@u#1opt\endcsname(##1)##2##3{% \uIf(){}{}
\csname #1@ifthen\endcsname{##2} ##1\csname#1@noend\endcsname{##3}}%
% without side text
\expandafter\algocf@mkcmd\csname algocf@l#1\endcsname[2]{% \lIf{}{}
\csname #1@ifthen\endcsname{##1} ##2}%
\expandafter\algocf@mkcmd\csname algocf@u#1\endcsname[2]{% \uIf{}{}
\csname #1@ifthen\endcsname{##1}\csname#1@noend\endcsname{##2}}%
%
%%%% ElseIf {} endif
%
\@ifundefined{algocf@#2}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
% side text or not?
\expandafter\def\csname#2\endcsname{% ElseIf
\@ifnextchar({\csname algocf@#2opt\endcsname}{\csname algocf@#2\endcsname}}
% with side text
\expandafter\def\csname algocf@#2opt\endcsname(##1)##2##3{% \ElseIf(){}{}
\csname #2@elseif\endcsname{##2} ##1\csname #1@endif\endcsname{##3}}
% without side text
\expandafter\algocf@mkcmd\csname algocf@#2\endcsname[2]{% \ElseIf{}{}
\csname #2@elseif\endcsname{##1}\csname #1@endif\endcsname{##2}}
%
%%%% ElseIf
%
% side text or not?
\expandafter\def\csname l#2\endcsname{% lElseIf
\@ifnextchar({\csname algocf@l#2opt\endcsname}{\csname algocf@l#2\endcsname}}
\expandafter\def\csname u#2\endcsname{% uElseIf
\@ifnextchar({\csname algocf@u#2opt\endcsname}{\csname algocf@u#2\endcsname}}
% with side text
\expandafter\def\csname algocf@l#2opt\endcsname(##1)##2##3{% \lElseIf(){}{}
\csname #2@elseif\endcsname{##2} ##3\algocf@endline\ ##1\par}
\expandafter\def\csname algocf@u#2opt\endcsname(##1)##2##3{% \uElseIf(){}{}
\csname #2@elseif\endcsname{##2} ##1\csname #1@noend\endcsname{##3}}
% without side text
\expandafter\algocf@mkcmd\csname algocf@l#2\endcsname[2]{% \lElseIf{}{}
\csname #2@elseif\endcsname{##1} ##2}%
\expandafter\algocf@mkcmd\csname algocf@u#2\endcsname[2]{% \uElseIf{}{}
\csname #2@elseif\endcsname{##1}\csname #1@noend\endcsname{##2}}
%
%%%% Else {} endif
%
\@ifundefined{algocf@#3}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
% side text or not?
\expandafter\def\csname#3\endcsname{% Else
\@ifnextchar({\csname algocf@#3opt\endcsname}{\csname algocf@#3\endcsname}}
% with side text
\expandafter\def\csname algocf@#3opt\endcsname(##1)##2{% \Else(){}
\csname #3@else\endcsname\ ##1\csname #1@endif\endcsname{##2}}
% without side text
\expandafter\algocf@mkcmd\csname algocf@#3\endcsname[1]{% \Else{}
\csname #3@else\endcsname\csname #1@endif\endcsname{##1}}%
%
%%%% Else
%
% side text or not?
\expandafter\def\csname l#3\endcsname{% lElse
\@ifnextchar({\csname algocf@l#3opt\endcsname}{\csname algocf@l#3\endcsname}}
\expandafter\def\csname u#3\endcsname{% uElse
\@ifnextchar({\csname algocf@u#3opt\endcsname}{\csname algocf@u#3\endcsname}}
% with side text
\expandafter\def\csname algocf@l#3opt\endcsname(##1)##2{% \lElse(){}
\csname #3@else\endcsname\ ##2\algocf@endline\ ##1\par}
\expandafter\def\csname algocf@u#3opt\endcsname(##1)##2{% \uElse(){}
\csname #3@else\endcsname\ ##1\csname #1@noend\endcsname{##2}}
% without side text
\expandafter\algocf@mkcmd\csname algocf@l#3\endcsname[1]{% \lElse{}
\csname #3@else\endcsname\ ##1}%
\expandafter\algocf@mkcmd\csname algocf@u#3\endcsname[1]{% \uElse{}
\csname #3@else\endcsname\csname #1@noend\endcsname{##1}}%
}
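%
% Usage sketch for \SetKwIF (illustrative only). A single declaration such as
%
%   \SetKwIF{If}{ElseIf}{Else}{if}{then}{else if}{else}{endif}
%
% defines \If, \eIf, \lIf, \uIf, \ElseIf, \lElseIf, \uElseIf, \Else, \lElse
% and \uElse. Typical combinations:
%
%   \eIf{$a < b$}{ then-block }{ else-block }
%   \uIf{$a < b$}{ ... } \uElseIf{$a = b$}{ ... } \Else{ ... }
%   \lIf{$a < b$}{$m \leftarrow a$\;}
%   \If(side text){$a < b$}{ ... }
%
% The u-variants omit the end keyword so that a chain closes only once,
% with the final \Else or \ElseIf.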
%
% old for backward compatibility
\newcommand{\SetKwIf}[6]{%
\SetKwIF{#1}{cf@dumb}{#2}{#3}{#4}{cf@dumb}{#5}{#6}%
\typeout{**** WARNING: SetKwIf deprecated: use SetKwIF instead*****^^J}%
}%
\newcommand{\SetKwIfElseIf}[8]{%
\SetKwIF{#1}{#2}{#3}{#4}{#5}{#6}{#7}{#8}%
\typeout{**** WARNING: SetKwIfElseIf deprecated: use SetKwIF instead*****^^J}%
}%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% For macros
%
\newcommand{\SetKwFor}[4]{%
\@ifundefined{algocf@#1}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
% side text or not?
\expandafter\def\csname#1\endcsname{ %For
\@ifnextchar({\csname algocf@#1opt\endcsname}{\csname algocf@#1\endcsname}}
\expandafter\def\csname l#1\endcsname{ %lFor
\@ifnextchar({\csname algocf@l#1opt\endcsname}{\csname algocf@l#1\endcsname}}
% with side text
\expandafter\def\csname algocf@#1opt\endcsname(##1)##2##3{% \For(){}{}
\KwSty{#2} \ArgSty{##2} \KwSty{#3} ##1\a@block{##3}{#4}}%
\expandafter\def\csname algocf@l#1opt\endcsname(##1)##2##3{% \lFor(){}{}
\KwSty{#2} \ArgSty{##2} \KwSty{#3} ##3\algocf@endline\ ##1\par}
% without side text
\expandafter\algocf@mkcmd\csname algocf@#1\endcsname[2]{% \For{}{}
\KwSty{#2} \ArgSty{##1} \KwSty{#3}\a@block{##2}{#4}}%
\expandafter\algocf@mkcmd\csname algocf@l#1\endcsname[2]{% \lFor{}{}
\KwSty{#2} \ArgSty{##1} \KwSty{#3} ##2}%
}
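%
% Usage sketch for \SetKwFor (illustrative only; ForIndex and its keywords
% are hypothetical names chosen for the example):
%
%   \SetKwFor{ForIndex}{for each index}{do}{endfor}
%   \ForIndex{$i \in \{1,\dots,n\}$}{ ... }            % block form
%   \lForIndex{$i \in \{1,\dots,n\}$}{$s \leftarrow s + i$\;}  % one-line form
%   \ForIndex(side text){$i \in \{1,\dots,n\}$}{ ... }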
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Repeat macros
%
\newcommand{\SetKwRepeat}[3]{%
\@ifundefined{algocf@#1}{\let\algocf@mkcmd=\newcommand}{\let\algocf@mkcmd=\renewcommand}%
% side text or not?
\expandafter\def\csname#1\endcsname{ %Repeat
\@ifnextchar({\csname algocf@#1opt\endcsname}{\csname algocf@#1\endcsname}}
\expandafter\def\csname l#1\endcsname{ %lRepeat
\@ifnextchar({\csname algocf@l#1opt\endcsname}{\csname algocf@l#1\endcsname}}
% with side text
\expandafter\def\csname algocf@#1opt\endcsname(##1)##2##3{% \Repeat(){}{}
\KwSty{#2} ##1\a@group{##3}\KwSty{#3} \ArgSty{##2}%
\@ifnextchar({\csname algocf@#1optopt\endcsname}{\@endalgoln}%
}%
\expandafter\def\csname algocf@#1optopt\endcsname(##1){% \Repeat(){}{}()
##1\@endalgoln}%
\expandafter\def\csname algocf@l#1opt\endcsname(##1)##2##3{% \lRepeat(){}{}
\KwSty{#2} ##3 \KwSty{#3} \ArgSty{##2}\algocf@endline\ ##1\par}%
% without side text
\expandafter\algocf@mkcmd\csname algocf@#1\endcsname[2]{% \Repeat{}{}
\KwSty{#2}\a@group{##2}\KwSty{#3} \ArgSty{##1}
\@ifnextchar({\csname algocf@#1optopt\endcsname}{\@endalgoln}%
}%
\expandafter\algocf@mkcmd\csname algocf@l#1\endcsname[2]{% \lRepeat{}{}
\KwSty{#2} ##2 \KwSty{#3} \ArgSty{##1}}%
}
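%
% Usage sketch for \SetKwRepeat (illustrative only; Do is a hypothetical
% name chosen for the example):
%
%   \SetKwRepeat{Do}{do}{while}
%   \Do{$i < n$}{$i \leftarrow i + 1$\;}   % condition first, body second
%   \lDo{$i < n$}{$i \leftarrow i + 1$}    % one-line form
%
% Note the argument order: the stop condition is the first argument even
% though it is printed after the body.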
%
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%% Environments definitions %%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%%
%% Caption management
%%
% for the following macros:
% #1 is given by caption and is equal to fnum@algocf
% #2 is the text given in argument by the user in the \caption macro
%
%%%%% text of caption
\newcommand{\algocf@captiontext}[2]{#1\algocf@typo: \AlCapFnt{}#2} % text of caption
%
%%%%% default caption of algorithm: used if no specific style caption is defined
\newcommand{\algocf@makecaption}[2]{%
\addtolength{\hsize}{\algomargin}%
\sbox\@tempboxa{\algocf@captiontext{#1}{#2}}%
\ifdim\wd\@tempboxa >\hsize% % if caption is longer than a line
\hskip .5\algomargin%
\parbox[t]{\hsize}{\algocf@captiontext{#1}{#2}}% then caption is not centered
\else%
\global\@minipagefalse%
\hbox to\hsize{\hfil\box\@tempboxa\hfil}% else caption is centered
\fi%
\addtolength{\hsize}{-\algomargin}%
}
%
\newsavebox\algocf@capbox
\newcommand{\algocf@makecaption@plain}[2]{%
\global\sbox\algocf@capbox{\algocf@makecaption{#1}{#2}}}%
\newcommand{\algocf@makecaption@boxed}[2]{%
\addtolength{\hsize}{-\algomargin}%
\global\sbox\algocf@capbox{\algocf@makecaption{#1}{#2}}
\addtolength{\hsize}{\algomargin}%
}%
%
\newcommand{\algocf@makecaption@algoruled}[2]{\algocf@makecaption@ruled{#1}{#2}}%
\newcommand{\algocf@makecaption@boxruled}[2]{\algocf@makecaption@ruled{#1}{#2}}%
\newcommand{\algocf@makecaption@ruled}[2]{%
\global\sbox\algocf@capbox{\hskip\AlCapHSkip% .5\algomargin%
\parbox[t]{\hsize}{\algocf@captiontext{#1}{#2}}}% then caption is not centered
}
%
\newcommand{\algocf@caption@plain}{\vskip\AlCapSkip\box\algocf@capbox}%
\newcommand{\algocf@caption@boxed}{\vskip\AlCapSkip\box\algocf@capbox}%
\newcommand{\algocf@caption@ruled}{\box\algocf@capbox\kern2pt\hrule height.8pt depth0pt\kern2pt}%
\newcommand{\algocf@caption@algoruled}{\algocf@caption@ruled}%
\newcommand{\algocf@caption@boxruled}{%
\addtolength{\hsize}{-0.8pt}%
\hbox to\hsize{%
\vrule%\hskip-0.35pt%
\vbox{%
\hrule\vskip2\lineskip%
\hbox to\hsize{\unhbox\algocf@capbox\hfill}\vskip2\lineskip%
}%
%\hskip-0.35pt%
\vrule%
}\vskip-2\lineskip\nointerlineskip%
\addtolength{\hsize}{0.8pt}%
}
%
%
%%%% set caption for the environment
% beamer defines its own caption, overriding the LaTeX caption!
% As we need the original, we reproduce its definition here.
\long\def\algocf@latexcaption#1[#2]#3{% original definition of caption
\par
\addcontentsline{\csname ext@#1\endcsname}{#1}%
{\protect\numberline{\csname the#1\endcsname}{\ignorespaces #2}}%
\begingroup
\@parboxrestore
\if@minipage
\@setminipage
\fi
\normalsize
\@makecaption{\csname fnum@#1\endcsname}{\ignorespaces #3}\par
\endgroup%
}
\ifx\beamer@makecaption\undefined%
\else% beamer detected
\ifx\@makecaption\undefined%
\newcommand{\@makecaption}[2]{\relax}%
\fi%
\fi
%
% More and more packages redefine \@caption instead of just \@makecaption, which breaks the
% algorithm2e caption since it is based on the standard \@caption. So we force \@caption to be
% the standard one (the one from LaTeX) inside the algorithm environment.
%
\newcommand{\algocf@setcaption}{%
\let\algocf@savecaption=\@caption%
\let\@caption=\algocf@latexcaption%
\let\algocf@oldmakecaption=\@makecaption%
\renewcommand{\@makecaption}[2]{%
\expandafter\csname algocf@makecaption@\algocf@style\endcsname{##1}{##2}}%
}
%
%%%%% reset caption
%
% Since we have forced the LaTeX caption inside the algorithm environment, we must restore the
% caption used in the rest of the text.
\newcommand{\algocf@resetcaption}{%
\let\@caption=\algocf@savecaption%
\let\@makecaption=\algocf@oldmakecaption%
}
%
%%%%% nocaptionofalgo and restorecaptionofalgo --
\newcommand{\nocaptionofalgo}{%
\let\@old@algocf@captiontext=\algocf@captiontext%
\renewcommand{\algocf@captiontext}[2]{\AlCapFnt{}##2}%
}
\newcommand{\restorecaptionofalgo}{%
\let\algocf@captiontext=\@old@algocf@captiontext%
}
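%
% Usage sketch for \nocaptionofalgo and \restorecaptionofalgo (illustrative
% only). Both are called at the same grouping level, outside the
% environment, because \restorecaptionofalgo relies on the copy saved by
% \nocaptionofalgo still being in scope. The environment name algorithm is
% an assumption (it is whatever \algocf@envname expands to):
%
%   \nocaptionofalgo
%   \begin{algorithm}
%     \caption{Caption text printed without the name and number prefix}
%     ...
%   \end{algorithm}
%   \restorecaptionofalgo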
%
% ---------------------- algocf environment
%
\newcounter{algocfline} % counter used to keep line numbers internally
\setcounter{algocfline}{0} % distinct from one algorithm to the next
%
\expandafter\ifx\csname algocf@within\endcsname\relax% if \algocf@within doesn't exist
\newcounter{algocf} % just define a new counter
\renewcommand\thealgocf{\@arabic\c@algocf} % and the way it is printed
\else% else
\newcounter{algocf}[\algocf@within] % counter is numbered within \algocf@within
\renewcommand\thealgocf{\csname the\algocf@within\endcsname.\@arabic\c@algocf}
\fi
%
\def\fps@algocf{htbp} % default
\def\ftype@algocf{10} % float type
\def\ext@algocf{\algocf@list} % loa by default, lof if figure option used
\def\fnum@algocf{{\AlCapFnt\AlTitleFnt{\algorithmcfname\nobreakspace\thealgocf}}}
\newenvironment{algocf}% % float environment for algorithms
{\@float{algocf}}%
{\end@float}
\newenvironment{algocf*}% % float* environment for algorithms
{\@dblfloat{algocf}}
{\end@dblfloat}
\ifx\l@chapter\undefined%
\newcommand\listofalgocfs{ % list of algorithms
\section*{\listalgorithmcfname}%
\@mkboth{\MakeUppercase\listalgorithmcfname}%
{\MakeUppercase\listalgorithmcfname}%
\@starttoc{loa}%
}
\else%
%\newcommand\listofalgocfs{%
% \if@twocolumn
% \@restonecoltrue\onecolumn
% \else
% \@restonecolfalse
% \fi
% \chapter*{\listalgorithmcfname}%
% \@mkboth{\MakeUppercase\listalgorithmcfname}%
% {\MakeUppercase\listalgorithmcfname}%
% \@starttoc{loa}%
% \if@restonecol\twocolumn\fi
% }
\fi
\newcommand*\l@algocf{\@dottedtocline{1}{1em}{2.3em}}% line of the list
%
% ---------------------- algorithm environment
%
%%%%%%%
%%
%% Algorithm environment definition
%%
%%%%%%%
%%
%
\newsavebox\algocf@algoframe
\def\@algocf@pre@plain{\relax}% action to be done before printing the algo.
\def\@algocf@post@plain{\relax}% action to be done after printing the algo.
\def\@algocf@capt@plain{bottom}% where the caption should be placed.
\def\@algocf@pre@boxed{\noindent\begin{lrbox}{\algocf@algoframe}}
\def\@algocf@post@boxed{\end{lrbox}\framebox[\hsize]{\box\algocf@algoframe}\par}%
\def\@algocf@capt@boxed{under}%
\def\@algocf@pre@ruled{\hrule height.8pt depth0pt\kern2pt}%
\def\@algocf@post@ruled{\kern2pt\hrule\relax}%
\def\@algocf@capt@ruled{top}%
\def\@algocf@pre@algoruled{\hrule height.8pt depth0pt\kern2pt}%
\def\@algocf@post@algoruled{\kern2pt\hrule\relax}%
\def\@algocf@capt@algoruled{top}%
\def\@algocf@pre@boxruled{\noindent\begin{lrbox}{\algocf@algoframe}}%
\def\@algocf@post@boxruled{\end{lrbox}\framebox[\hsize]{\box\algocf@algoframe}\par}%
\def\@algocf@capt@boxruled{above}%
%
%% before algocf or figure environment
\newcommand{\@algocf@init@caption}{%
\@algocf@algotitleofalgo% fix name for \Titleofalgo to \algorithmcfname
\algocf@setcaption% set caption to our caption style
}%
\newcommand{\@algocf@init}{%
\refstepcounter{algocfline}%
\ifthenelse{\boolean{algocf@optnoend}}{%
\renewcommand{\a@block}[2]{\a@group{##1}}%
}{%
\renewcommand{\a@block}[2]{\a@@block{##1}{##2}}%
}%
}
%% after the end of algocf or figure environment
\newcommand{\@algocf@term@caption}{%
\algocf@resetcaption% restore original caption
}%
\newcommand{\@algocf@term}{%
\setboolean{algocf@algoH}{false}% no H by default
\ifthenelse{\boolean{algocf@optnoend}}{%
\renewcommand{\a@block}[2]{\a@@block{##1}{##2}}
}{%
\renewcommand{\a@block}[2]{\a@group{##1}}%
}%
}
%
%%%%%%%%%%%%%%%%%
%% makethealgo: macro which actually prints the algo in its box
%%
\newsavebox\algocf@algobox
\newcommand{\algocf@makethealgo}{%
\vtop{%
% place caption above if needed by the style
\ifthenelse{\equal{\csname @algocf@capt@\algocf@style\endcsname}{above}}%
{\csname algocf@caption@\algocf@style\endcsname}{}%
%
% precommand according to the style
\csname @algocf@pre@\algocf@style\endcsname%
% place caption at top if needed by the style
\ifthenelse{\equal{\csname @algocf@capt@\algocf@style\endcsname}{top}}%
{\csname algocf@caption@\algocf@style\endcsname}{}%
%
\box\algocf@algobox% the algo
% place caption at bottom if needed by the style
\ifthenelse{\equal{\csname @algocf@capt@\algocf@style\endcsname}{bottom}}%
{\csname algocf@caption@\algocf@style\endcsname}{}%
% postcommand according to the style
\csname @algocf@post@\algocf@style\endcsname%
% place caption under if needed by the style
\ifthenelse{\equal{\csname @algocf@capt@\algocf@style\endcsname}{under}}
{\csname algocf@caption@\algocf@style\endcsname}{}%
}%
}
%%%%%%%%%%%%%%%%%%%
%
%% at the beginning of algocf or figure environment
\newcommand{\@algocf@start}{%
\@algoskip%
\begin{lrbox}{\algocf@algobox}%
\setlength{\algowidth}{\hsize}%
\vbox\bgroup% save all the algo in a box
\hbox to\algowidth\bgroup\hbox to \algomargin{\hfill}\vtop\bgroup%
\ifthenelse{\boolean{algocf@slide}}{\parskip 0.5ex\color{black}}{}%
% initialization
\addtolength{\hsize}{-1.5\algomargin}%
\let\@mathsemicolon=\;\def\;{\ifmmode\@mathsemicolon\else\@endalgoln\fi}%
\raggedright\AlFnt{}%
\ifthenelse{\boolean{algocf@slide}}{\incmargin{\skipalgocfslide}}{}%
\@algoinsideskip%
%
}
%
%% at the end of algocf or figure environment
\newcommand{\@algocf@finish}{%
\@algoinsideskip%
\egroup%end of vtop which contains all the text
\egroup%end of hbox which contains [margin][vtop]
\ifthenelse{\boolean{algocf@slide}}{\decmargin{\skipalgocfslide}}{}%
%
\egroup%end of main vbox
\end{lrbox}%
%\egroup% end of algo box
\algocf@makethealgo% print the algo
\@algoskip%
% restore dimension and macros
\setlength{\hsize}{\algowidth}%
\lineskip\normallineskip\setlength{\skiptotal}{\@defaultskiptotal}%
\let\;=\@mathsemicolon%
%
}
%%%%%%%%%%%%%%%%%%%%
%% basic definition of the environment algorithm
%%
\newboolean{algocf@algoH}\setboolean{algocf@algoH}{false}
\newenvironment{algocf@Here}{\noindent%
\def\@captype{algocf}% if not defined, caption exits with an error
% \hbox\bgroup%
\begin{minipage}{\hsize}
}{%
\end{minipage}
% \egroup%
}%
\newenvironment{\algocf@envname}[1][htbp]{%
\@algocf@init%
\ifthenelse{\equal{\algocf@float}{figure}}%
{\begin{figure}[#1]}%
{\@algocf@init@caption\ifthenelse{\equal{#1}{H}}%
{\setboolean{algocf@algoH}{true}\begin{algocf@Here}}%
{\begin{algocf}[#1]}%
}%
\@algocf@start%
\@ResetCounterIfNeeded%
\algocf@linesnumbered%
}{%
\@algocf@finish%
\ifthenelse{\equal{\algocf@float}{figure}}%
{\end{figure}}%
{\@algocf@term@caption\ifthenelse{\boolean{algocf@algoH}}%
{\end{algocf@Here}}%
{\end{algocf}}%
}%
\@algocf@term
}
%%%
%%% algorithm*
%%%
\newenvironment{\algocf@envname*}[1][htbp]{%
\@algocf@init%
\ifthenelse{\equal{\algocf@float}{figure}}%
{\begin{figure*}[#1]}%
{\begin{algocf*}[#1]}%
\@algocf@start%
\@ResetCounterIfNeeded%
\algocf@linesnumbered%
}{
\@algocf@finish%
\ifthenelse{\equal{\algocf@float}{figure}}%
{\end{figure*}}%
{\end{algocf*}}%
\@algocf@term%
}
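%
% Usage sketch for the environments defined above (illustrative only). It
% assumes that \algocf@envname expands to algorithm and that the default
% \KwIn and \KwOut keywords are available:
%
%   \begin{algorithm}[H]   % H: set in place through algocf@Here, not floated
%     \KwIn{a list $L$}
%     \KwOut{the largest element of $L$}
%     $m \leftarrow L[1]$\;
%     \caption{Maximum of a list}
%   \end{algorithm}
%
% Any other placement argument (for example htbp) is passed on to the
% algocf float; the starred environment uses the two-column float instead.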
%
%%%%%%%%%%%%%%%%%%%%%%%
%%%
%
\expandafter\newcommand\csname\algocf@listofalgorithms\endcsname{%
\ifthenelse{\equal{\algocf@float}{figure}}{\listoffigures}{\listofalgocfs}
}
%%%
%%%
%
% ---------------------- procedure and function environments
%
%
% -- new style (used in particular in the caption of function and procedure environments)
%
\newcommand{\ProcNameSty}[1]{\FuncSty{#1}}%
\newcommand{\SetProcNameSty}[1]{\renewcommand{\ProcNameSty}[1]{\textnormal{\csname#1\endcsname{##1}}}}
\newcommand{\ProcArgSty}[1]{\ArgSty{#1}}%
\newcommand{\SetProcArgSty}[1]{\renewcommand{\ProcArgSty}[1]{\textnormal{\csname#1\endcsname{##1}}}}
% three macros to extract parts of the caption
\gdef\algocf@captname#1(#2)#3@{#1} % keep characters before the first parenthesis
\gdef\algocf@captparam#1(#2)#3@{#2} % keep characters between the parentheses
\gdef\algocf@captother#1(#2)#3@{#3} % keep characters after the parentheses
%
%%% Text of caption for Procedure or Function
\newcommand{\algocf@captionproctext}[2]{%
{\AlCapFnt{}\AlTitleFnt{\algocf@procname} %
\ProcNameSty{\algocf@captname #2@}% Name of the procedure in ProcName Style.
\ifthenelse{\equal{\algocf@captparam #2@}{\arg@e}}{}% if no argument, write nothing
{% else put arguments in ProcArgSty:
\ProcNameSty{(}\ProcArgSty{\algocf@captparam #2@}\ProcNameSty{)}%
}% endif
\algocf@captother #2@%
}
}
%%%% set caption for the environment
% unfortunately, \@makecaption is called with \ignorespaces #3, so
% we can't do the \@currentlabel definition inside \algocf@captionproctext
\long\def\algocf@caption@proc#1[#2]#3{%
\gdef\@currentlabel{\algocf@captname #3@}%
\algocf@old@caption{#1}[\algocf@procname\nobreakspace #2]{\ #3}%
}%
\newcommand{\algocf@setcaptionproc}{%
\let\algocf@oldcaptiontext=\algocf@captiontext%
\renewcommand{\algocf@captiontext}[2]{%
\algocf@captionproctext{##1}{##2}}%
\let\algocf@old@caption=\@caption%
\let\@caption=\algocf@caption@proc%
}
%%%%% reset caption
\newcommand{\algocf@resetcaptionproc}{%
\let\algocf@captiontext=\algocf@oldcaptiontext%
\let\@caption=\algocf@old@caption%
}
%
%
%%%%% algocf@proc is the generic environment for procedure and function environment.
%
\newboolean{algocf@procstar}\setboolean{algocf@procstar}{false}
\newenvironment{algocf@proc}[1][htbp]{%
\@algocf@proctitleofalgo% set Titleofalgo to Procedure: or Function:
% according to the environment
\let\old@thealgocf=\thealgocf%\renewcommand{\thealgocf}{--}%
\algocf@setcaptionproc% set the text of caption to proc
\algocf@setcaption% set caption to our caption style
\refstepcounter{algocfline}%
\ifthenelse{\equal{\algocf@float}{figure}}{%
\ifthenelse{\boolean{algocf@procstar}}{\begin{figure*}[#1]}{\begin{figure}[#1]}%
}{%
\ifthenelse{\boolean{algocf@procstar}}%
{\begin{algocf*}[#1]}%
{\ifthenelse{\equal{#1}{H}}%
{\setboolean{algocf@algoH}{true}\begin{algocf@Here}}%
{\begin{algocf}[#1]}%
}%
}%
\@algocf@start%
\@ResetCounterIfNeeded%
\algocf@linesnumbered%
}{%
\@algocf@finish%
\ifthenelse{\equal{\algocf@float}{figure}}{%
\ifthenelse{\boolean{algocf@procstar}}{\end{figure*}}{\end{figure}}%
}{%
\ifthenelse{\boolean{algocf@procstar}}
{\end{algocf*}}
{\ifthenelse{\boolean{algocf@algoH}}
{\end{algocf@Here}}%
{\end{algocf}}%
}%
}%
\let\thealgocf=\old@thealgocf%
\@algocf@term% restore original caption and H boolean
\algocf@resetcaptionproc%
}
%
% -- procedure and function environments are defined from algocf@proc environment
%
\newenvironment{procedure}[1][htbp]%
{\setboolean{algocf@procstar}{false}%
\newcommand{\algocf@procname}{\@algocf@procname}\begin{algocf@proc}[#1]}%
{\end{algocf@proc}}
\newenvironment{function}[1][htbp]%
{\setboolean{algocf@procstar}{false}%
\newcommand{\algocf@procname}{\@algocf@funcname}\begin{algocf@proc}[#1]}%
{\end{algocf@proc}}
%
\newenvironment{procedure*}[1][htbp]%
{\setboolean{algocf@procstar}{true}%
\newcommand{\algocf@procname}{\@algocf@procname}\begin{algocf@proc}[#1]}%
{\end{algocf@proc}}
\newenvironment{function*}[1][htbp]%
{\setboolean{algocf@procstar}{true}%
\newcommand{\algocf@procname}{\@algocf@funcname}\begin{algocf@proc}[#1]}%
{\end{algocf@proc}}
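%
% Usage sketch for the procedure and function environments (illustrative
% only; the names Relax and Weight are hypothetical). The caption is split
% by the (...)-delimited macros \algocf@captname, \algocf@captparam and
% \algocf@captother, so it must contain a pair of parentheses, even an
% empty one:
%
%   \begin{procedure}
%     \caption{Relax($u$, $v$)}
%     ...
%   \end{procedure}
%
%   \begin{function}[H]
%     \caption{Weight()}
%     ...
%   \end{function}
%
% The text before the parenthesis is set in \ProcNameSty and also becomes
% \@currentlabel; the text inside is set in \ProcArgSty.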
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
\newcommand{\Titleofalgo}[1]{\@titleprefix\TitleSty{#1}\par\smallskip}
%
%
% ------------------------- Default Definitions
%
%%
%%
%
\newcommand{\algocf@defaults@common}{
%\SetKwInOut{AlgDonnees}{Donn\'ees}\SetKwInOut{AlgRes}{R\'esultat}
\SetKwInput{Donnees}{Donn\'ees}%
\SetKwInput{Res}{R\'esultat}%
\SetKwInput{Entree}{Entr\'ees}%
\SetKwInput{Sortie}{Sorties}%
\SetKw{KwA}{\`a}%
\SetKw{Retour}{retourner}%
\SetKwBlock{Deb}{d\'ebut}{fin}%
\SetKwRepeat{Repeter}{r\'ep\'eter}{jusqu'\`a}%
%
\SetKwComment{tcc}{/* }{ */}
\SetKwComment{tcp}{// }{}
%
%\SetKwInOut{AlgData}{Data}\SetKwInOut{AlgResult}{Result}
\SetKwInput{KwIn}{Input}%
\SetKwInput{KwOut}{Output}%
\SetKwInput{KwData}{Data}%
\SetKwInput{KwResult}{Result}%
\SetKw{KwTo}{to}
\SetKw{KwRet}{return}%
\SetKw{Return}{return}%
\SetKwBlock{Begin}{begin}{end}%
\SetKwRepeat{Repeat}{repeat}{until}%
%
% --- German keywords
%
% \SetKwInOut{AlgDaten}{Daten}%AlgData
% \SetKwInOut{AlgErgebnis}{Ergebnis}%AlgResult
\SetKwInput{Ein}{Eingabe}%KwIn
\SetKwInput{Aus}{Ausgabe}%KwOut
\SetKwInput{Daten}{Daten}%KwData
\SetKwInput{Ergebnis}{Ergebnis}%KwResult
\SetKw{Bis}{bis}%KwTo
\SetKw{KwZurueck}{zur\"uck}%KwRet
\SetKw{Zurueck}{zur\"uck}%Return
\SetKwBlock{Beginn}{Beginn}{Ende}%Begin
\SetKwRepeat{Wiederh}{wiederhole}{bis}%Repeat
%
% --- Czech keywords
%
% \SetKwInOut{AlgVst}{Vstup}\SetKwInOut{AlgVyst}{V\'{y}stup}
\SetKwInput{Vst}{Vstup}%
\SetKwInput{Vyst}{V\'{y}stup}%
\SetKwInput{Vysl}{V\'{y}sledek}%
%
% --- Portuguese keywords
%
% \SetKwInOut{AlgDados}{Dados}\SetKwInOut{AlgResultado}{Result.}
\SetKwInput{Entrada}{Entrada}%
\SetKwInput{Saida}{Sa\'{i}da}%
\SetKwInput{Dados}{Dados}%
\SetKwInput{Resultado}{Resultado}%
\SetKw{Ate}{at\'{e}}
\SetKw{KwRetorna}{retorna}%
\SetKw{Retorna}{retorna}%
\SetKwBlock{Inicio}{in\'{i}cio}{fim}%
\SetKwRepeat{Repita}{repita}{at\'{e}}%
% --- End
}
%
%
\newcommand{\algocf@defaults@longend}{%
\algocf@defaults@common
\SetKwIF{gSi}{gSinonSi}{gSinon}{si}{alors}{sinon si}{sinon}{finsi}%
\SetKwIF{Si}{SinonSi}{Sinon}{si}{alors}{sinon si}{sinon}{finsi}%
\SetKwSwitch{Suivant}{Cas}{Autre}{suivant}{faire}{cas o\`u}{autres cas}{fin d'alternative}%
\SetKwFor{Pour}{pour}{faire}{finpour}%
\SetKwFor{PourPar}{pour}{faire en parall\`ele}{finpour}%
\SetKwFor{PourCh}{pour chaque}{faire}{finprch}%
\SetKwFor{PourTous}{pour tous les}{faire}{finprts}%
\SetKwFor{Tq}{tant que}{faire}{fintq}%
%
\SetKwIF{gIf}{gElsIf}{gElse}{if}{then}{else if}{else}{endif}%
\SetKwIF{If}{ElseIf}{Else}{if}{then}{else if}{else}{endif}%
\SetKwSwitch{Switch}{Case}{Other}{switch}{do}{case}{otherwise}{endsw}%
\SetKwFor{For}{for}{do}{endfor}%
\SetKwFor{ForPar}{for}{do in parallel}{endfpar}
\SetKwFor{ForEach}{foreach}{do}{endfch}%
\SetKwFor{ForAll}{forall the}{do}{endfall}%
\SetKwFor{While}{while}{do}{endw}%
%
% --- German for longend
%
\SetKwIF{gWenn}{gSonstWenn}{gSonst}{wenn}{dann}{sonst wenn}{sonst}{Ende-wenn}%gIf
\SetKwIF{Wenn}{SonstWenn}{Sonst}{wenn}{dann}{sonst wenn}{sonst}{Ende-wenn}%If
\SetKwSwitch{Unterscheide}{Fall}{Anderes}{unterscheide}{tue}{Fall}{sonst}{Ende-Unt.}%Switch
\SetKwFor{Fuer}{f\"ur}{tue}{Ende-f\"ur}%For
\SetKwFor{FuerPar}{f\"ur}{tue gleichzeitig}{Ende-gleichzeitig}%ForPar
\SetKwFor{FuerJedes}{f\"ur jedes}{tue}{Ende-f\"ur}%ForEach
\SetKwFor{FuerAlle}{f\"ur alle}{tue}{Ende-f\"ur}%ForAll
\SetKwFor{Solange}{solange}{tue}{Ende-solange}%While
%
% --- Portuguese
%
\SetKwIF{gSe}{gSenaoSe}{gSenao}{se}{ent\~{a}o}{sen\~{a}o se}{sen\~{a}o}{fim se}%
\SetKwIF{Se}{SenaoSe}{Senao}{se}{ent\~{a}o}{sen\~{a}o se}{sen\~{a}o}{fim se}%
\SetKwSwitch{Selec}{Caso}{Outro}{selecione}{fa\c{c}a}{caso}{sen\~{a}o}{fim selec}%
\SetKwFor{Para}{para}{fa\c{c}a}{fim para}%
\SetKwFor{ParaPar}{para}{fa\c{c}a em paralelo}{fim para}
\SetKwFor{ParaCada}{para cada}{fa\c{c}a}{fim para cada}%
\SetKwFor{ParaTodo}{para todo}{fa\c{c}a}{fim para todo}%
\SetKwFor{Enqto}{enquanto}{fa\c{c}a}{fim enqto}%
}
%
%
\newcommand{\algocf@defaults@shortend}{%
\algocf@defaults@common
\SetKwIF{gSi}{gSinonSi}{gSinon}{si}{alors}{sinon si}{sinon}{fin}%
\SetKwIF{Si}{SinonSi}{Sinon}{si}{alors}{sinon si}{sinon}{fin}%
\SetKwSwitch{Suivant}{Cas}{Autre}{suivant}{faire}{cas o\`u}{autres cas}{fin}%
\SetKwFor{Pour}{pour}{faire}{fin}%
\SetKwFor{PourPar}{pour}{faire en parall\`ele}{fin}%
\SetKwFor{PourCh}{pour chaque}{faire}{fin}%
\SetKwFor{PourTous}{pour tous les}{faire}{fin}%
\SetKwFor{Tq}{tant que}{faire}{fin}%
%
%
\SetKwIF{gIf}{gElsIf}{gElse}{if}{then}{else if}{else}{end}%
\SetKwIF{If}{ElseIf}{Else}{if}{then}{else if}{else}{end}%
\SetKwSwitch{Switch}{Case}{Other}{switch}{do}{case}{otherwise}{end}%
\SetKwFor{For}{for}{do}{end}%
\SetKwFor{ForPar}{for}{do in parallel}{end}
\SetKwFor{ForEach}{foreach}{do}{end}%
\SetKwFor{ForAll}{forall}{do}{end}%
\SetKwFor{While}{while}{do}{end}%
%
% --- German for shortend
%
\SetKwIF{gWenn}{gSonstWenn}{gSonst}{wenn}{dann}{sonst wenn}{sonst}{Ende}%gIf
\SetKwIF{Wenn}{SonstWenn}{Sonst}{wenn}{dann}{sonst wenn}{sonst}{Ende}%If
\SetKwSwitch{Unterscheide}{Fall}{Anderes}{unterscheide}{tue}{Fall}{sonst}{Ende}%Switch
\SetKwFor{Fuer}{f\"ur}{tue}{Ende}%For
\SetKwFor{FuerPar}{f\"ur}{tue gleichzeitig}{Ende}%ForPar
\SetKwFor{FuerJedes}{f\"ur jedes}{tue}{Ende}%ForEach
\SetKwFor{FuerAlle}{f\"ur alle}{tue}{Ende}%ForAll
\SetKwFor{Solange}{solange}{tue}{Ende}%While
%
% --- Portuguese
%
\SetKwIF{gSe}{gSenaoSe}{gSenao}{se}{ent\~{a}o}{sen\~{a}o se}{sen\~{a}o}{fim}%
\SetKwIF{Se}{SenaoSe}{Senao}{se}{ent\~{a}o}{sen\~{a}o se}{sen\~{a}o}{fim}%
\SetKwSwitch{Selec}{Caso}{Outro}{selecione}{fa\c{c}a}{caso}{sen\~{a}o}{fim}%
\SetKwFor{Para}{para}{fa\c{c}a}{fim}%
\SetKwFor{ParaPar}{para}{fa\c{c}a em paralelo}{fim}
\SetKwFor{ParaCada}{para cada}{fa\c{c}a}{fim}%
\SetKwFor{ParaTodo}{para todo}{fa\c{c}a}{fim}%
\SetKwFor{Enqto}{enquanto}{fa\c{c}a}{fim}%
}
%
%
\newcommand{\algocf@defaults@noend}{%
\renewcommand{\a@block}[2]{\a@group{##1}}
\algocf@defaults@common
\SetKwIF{gSi}{gSinonSi}{gSinon}{si}{alors}{sinon si}{sinon}{}%
\SetKwIF{Si}{SinonSi}{Sinon}{si}{alors}{sinon si}{sinon}{}%
\SetKwSwitch{Suivant}{Cas}{Autre}{suivant}{faire}{cas o\`u}{autres cas}{}%
\SetKwFor{Pour}{pour}{faire}{}%
\SetKwFor{PourPar}{pour}{faire en parall\`ele}{}%
\SetKwFor{PourCh}{pour chaque}{faire}{}%
\SetKwFor{PourTous}{pour tous les}{faire}{}%
\SetKwFor{Tq}{tant que}{faire}{}%
%
\SetKwIF{gIf}{gElsIf}{gElse}{if}{then}{else if}{else}{}%
\SetKwIF{If}{ElseIf}{Else}{if}{then}{else if}{else}{}%
\SetKwSwitch{Switch}{Case}{Other}{switch}{do}{case}{otherwise}{}%
\SetKwFor{For}{for}{do}{}%
\SetKwFor{ForPar}{for}{do in parallel}{}
\SetKwFor{ForEach}{foreach}{do}{}%
\SetKwFor{ForAll}{forall}{do}{}%
\SetKwFor{While}{while}{do}{}%
% --- German for noend
\SetKwIF{gWenn}{gSonstWenn}{gSonst}{wenn}{dann}{sonst wenn}{sonst}{}%gIf
\SetKwIF{Wenn}{SonstWenn}{Sonst}{wenn}{dann}{sonst wenn}{sonst}{}%If
\SetKwSwitch{Unterscheide}{Fall}{Anderes}{unterscheide}{tue}{Fall}{sonst}{}%Switch
\SetKwFor{Fuer}{f\"ur}{tue}{}%For
\SetKwFor{FuerPar}{f\"ur}{tue gleichzeitig}{}%ForPar
\SetKwFor{FuerJedes}{f\"ur jedes}{tue}{}%ForEach
\SetKwFor{FuerAlle}{f\"ur alle}{tue}{}%ForAll
\SetKwFor{Solange}{solange}{tue}{}%While
% --- Portuguese
\SetKwIF{gSe}{gSenaoSe}{gSenao}{se}{ent\~{a}o}{sen\~{a}o se}{sen\~{a}o}{}%
\SetKwIF{Se}{SenaoSe}{Senao}{se}{ent\~{a}o}{sen\~{a}o se}{sen\~{a}o}{}%
\SetKwSwitch{Selec}{Caso}{Outro}{selecione}{fa\c{c}a}{caso}{sen\~{a}o}{}%
\SetKwFor{Para}{para}{fa\c{c}a}{}%
\SetKwFor{ParaPar}{para}{fa\c{c}a em paralelo}{}
\SetKwFor{ParaCada}{para cada}{fa\c{c}a}{}%
\SetKwFor{ParaTodo}{para todo}{fa\c{c}a}{}%
\SetKwFor{Enqto}{enquanto}{fa\c{c}a}{}%
}
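%
% --- Illustrative usage (not part of the package code) ---
% A minimal sketch, assuming the usual algorithm2e interface in which
% \SetKwFor{For}{for}{do}{endfor} yields \For{<condition>}{<body>} and
% \SetKwIF{If}{ElseIf}{Else}... yields \If{<condition>}{<body>}.
% The variables n, m and a_i are placeholders chosen for the example.
%
%   \begin{algorithm}
%     \KwData{values $a_1,\dots,a_n$}
%     \KwResult{their maximum $m$}
%     $m \leftarrow a_1$\;
%     \For{$i \leftarrow 2$ \KwTo $n$}{
%       \If{$a_i > m$}{$m \leftarrow a_i$\;}
%     }
%     \Return $m$\;
%   \end{algorithm}
%
% Under the longend defaults the loop is closed by "endfor" and the test
% by "endif"; under shortend both are closed by "end"; under noend the
% closing keyword is empty, so nothing is printed.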
%
%%
%%
%%
%
% default macros are:
\defaultsmacros@algo
\SetNoline
%
%
%
%%
%%%
%%%% END
\newcommand\classname{bioinfo}
\newcommand\lastmodifieddate{2003/02/08}
\newcommand\versionnumber{0.1}
% Are we printing crop marks?
\newif\if@cropmarkson \@cropmarksontrue
\NeedsTeXFormat{LaTeX2e}[2001/06/01]
\ProvidesClass{\classname}[\lastmodifieddate\space\versionnumber]
\setlength{\paperheight}{11truein}
\setlength{\paperwidth}{8.5truein}
\newif\if@final
\DeclareOption{draft}{\PassOptionsToPackage{draft}{graphicx}}
\DeclareOption{b4paper}{\PassOptionsToPackage{b4}{crop}}
\DeclareOption{centre}{\PassOptionsToPackage{center}{crop}}
\DeclareOption{crop}{\PassOptionsToPackage{cam}{crop}\global\@cropmarksontrue}
\DeclareOption{nocrop}{\PassOptionsToPackage{off}{crop}\global\@cropmarksonfalse}
\DeclareOption{info}{\PassOptionsToPackage{info}{crop}}
\DeclareOption{noinfo}{\PassOptionsToPackage{noinfo}{crop}}
\DeclareOption{final}{\global\@finaltrue}
\ExecuteOptions{b4paper,crop,centre,info}
\ProcessOptions
% Load all necessary packages
\RequirePackage{inputenc,crop,graphicx,amsmath,array,color,amssymb,flushend,stfloats,amsthm,chngpage,times}
\RequirePackage[LY1]{fontenc}
\renewcommand{\rmdefault}{ptm}
\renewcommand{\sfdefault}{phb}
\renewcommand{\ttdefault}{pcr}
%\RequirePackage[LY1,mtbold]{mathtime}
\def\helvetica{\fontfamily{phv}\selectfont}
\def\helveticaitalic{\fontfamily{phv}\itshape\selectfont}
\def\helveticabold{\fontfamily{phv}\bfseries\selectfont}
\def\helveticabolditalic{\fontfamily{phv}\bfseries\itshape\selectfont}
\def\helveticacn{\fontfamily{phv}\fontseries{mc}\fontshape{n}\selectfont}
\def\helveticacnitalic{\fontfamily{phv}\fontseries{mc}\fontshape{sl}\selectfont}
\def\helveticacnbold{\fontfamily{phv}\fontseries{bc}\fontshape{n}\selectfont}
\def\helveticacnbolditalic{\fontfamily{phv}\fontseries{bc}\fontshape{sl}\selectfont}
% Not sure if needed.
\newcommand\@ptsize{0}
% Set twoside printing
\@twosidetrue
% Marginal notes are on the outside edge
\@mparswitchfalse
\reversemarginpar
\renewcommand\normalsize{%
\@setfontsize\normalsize{8}{11}%
\abovedisplayskip 11\p@ \@plus2\p@ \@minus5\p@
\abovedisplayshortskip \z@ \@plus3\p@
\belowdisplayshortskip 6\p@ \@plus3\p@ \@minus3\p@
\belowdisplayskip \abovedisplayskip
\let\@listi\@listI}
\normalsize
\let\@bls\baselineskip
\newcommand\small{%
\@setfontsize\small{7}{10}%
\abovedisplayskip 10\p@ minus 3\p@
\belowdisplayskip \abovedisplayskip
\abovedisplayshortskip \z@ plus 2\p@
\belowdisplayshortskip 4\p@ plus 2\p@ minus2\p@
\def\@listi{\topsep 4.5\p@ plus 2\p@ minus 1\p@
\itemsep \parsep
\topsep 4\p@ plus 2\p@ minus 2\p@}}
\newcommand\footnotesize{%
\@setfontsize\footnotesize{8}{10}%
\abovedisplayskip 6\p@ minus 3\p@
\belowdisplayskip\abovedisplayskip
\abovedisplayshortskip \z@ plus 3\p@
\belowdisplayshortskip 6\p@ plus 3\p@ minus 3\p@
\def\@listi{\topsep 3\p@ plus 1\p@ minus 1\p@
\parsep 2\p@ plus 1\p@ minus 1\p@\itemsep \parsep}}
\def\scriptsize{\@setfontsize\scriptsize{6.5pt}{9.5pt}}
\def\tiny{\@setfontsize\tiny{5pt}{7pt}}
\def\large{\@setfontsize\large{11.5pt}{12pt}}
\def\Large{\@setfontsize\Large{14pt}{16}}
\def\LARGE{\@setfontsize\LARGE{15pt}{17pt}}
\def\huge{\@setfontsize\huge{22pt}{22pt}}
\def\Huge{\@setfontsize\Huge{30pt}{30pt}}
\DeclareOldFontCommand{\rm}{\normalfont\rmfamily}{\mathrm}
\DeclareOldFontCommand{\sf}{\normalfont\helvetica}{\mathsf}
\DeclareOldFontCommand{\sfit}{\normalfont\sffamily\itshape}{\mathsf}
\DeclareOldFontCommand{\sfb}{\normalfont\helveticabold}{\mathsf}
\DeclareOldFontCommand{\sfbi}{\normalfont\sffamily\bfseries\itshape}{\mathsf}
\DeclareOldFontCommand{\tt}{\normalfont\ttfamily}{\mathtt}
\DeclareOldFontCommand{\bf}{\normalfont\bfseries}{\mathbf}
\DeclareOldFontCommand{\it}{\normalfont\itshape}{\mathit}
\DeclareOldFontCommand{\sl}{\normalfont\slshape}{\@nomath\sl}
\DeclareOldFontCommand{\sc}{\normalfont\scshape}{\@nomath\sc}
% Line spacing
\setlength\lineskip{1\p@}
\setlength\normallineskip{1\p@}
\renewcommand\baselinestretch{}
% Paragraph dimensions and inter-para spacing
\setlength\parskip{0\p@}
\setlength\parindent{12pt}
% Set inter-para skips
\setlength\smallskipamount{3\p@ \@plus 1\p@ \@minus 1\p@}
\setlength\medskipamount{6\p@ \@plus 2\p@}
\setlength\bigskipamount{12\p@ \@plus 4\p@ \@minus 4\p@}
% Page break penalties
\@lowpenalty 51
\@medpenalty 151
\@highpenalty 301
% Disallow widows and orphans
\clubpenalty 10000
\widowpenalty 10000
% Disable page breaks before equations, allow pagebreaks after
% equations and discourage widow lines before equations.
\displaywidowpenalty 100
\predisplaypenalty 10000
\postdisplaypenalty 2500
% Allow breaking the page in the middle of a paragraph
\interlinepenalty 0
% Disallow breaking the page after a hyphenated line
\brokenpenalty 10000
% Hyphenation; don't split words into less than three characters
\lefthyphenmin=3
\righthyphenmin=3
%
% Set page layout dimensions
%
\setlength\headheight{16\p@} % height of running head
\setlength\topmargin{2.5pc} % head margin
\addtolength\topmargin{-1in} % subtract out the 1 inch driver margin
\setlength\topskip{10\p@} % height of first line of text
\setlength\headsep{8\p@} % space below running head --
\setlength\footskip{10\p@} % space above footer line
\setlength\maxdepth{.5\topskip} % pages can be short or deep by half a line?
\setlength\textwidth{40.5pc} % text measure excluding margins
\setlength\textheight{61\baselineskip} % 62 lines on a full page,
\addtolength\textheight{\topskip} % including the first
% line on the page
% Set the margins
\setlength\marginparsep{3\p@}
\setlength\marginparpush{3\p@}
\setlength\marginparwidth{35\p@}
\setlength\oddsidemargin{5.25pc}
\addtolength\oddsidemargin{-1in} % subtract out the 1 inch driver margin
\setlength\@tempdima{\paperwidth}
\addtolength\@tempdima{-\textwidth}
\addtolength\@tempdima{-5.25pc}
\setlength\evensidemargin{\@tempdima}
\addtolength\evensidemargin{-1in}
\setlength\columnsep{1.5pc} % space between columns for double-column text
\setlength\columnseprule{0\p@} % width of rule between two columns
% Footnotes
\setlength\footnotesep{9\p@} % space between footnotes
% space between text and footnote
\setlength{\skip\footins}{12\p@ \@plus 6\p@ \@minus 1\p@}
% Float placement parameters
% The total number of floats that can be allowed on a page.
\setcounter{totalnumber}{10}
% The maximum number of floats at the top and bottom of a page.
\setcounter{topnumber}{5}
\setcounter{bottomnumber}{5}
% The maximum part of the top or bottom of a text page that can be
% occupied by floats. This is set so that at least four lines of text
% fit on the page.
\renewcommand\topfraction{.9}
\renewcommand\bottomfraction{.9}
% The minimum amount of a text page that must be occupied by text.
% This should accommodate four lines of text.
\renewcommand\textfraction{.06}
% The minimum amount of a float page that must be occupied by floats.
\renewcommand\floatpagefraction{.94}
% The same parameters repeated for double column output
\renewcommand\dbltopfraction{.9}
\renewcommand\dblfloatpagefraction{.9}
% Space between floats
\setlength\floatsep {12\p@ \@plus 2\p@ \@minus 2\p@}
% Space between floats and text
\setlength\textfloatsep{20\p@ \@plus 2\p@ \@minus 4\p@}
% Space above and below an inline figure
\setlength\intextsep {18\p@ \@plus 2\p@ \@minus 2\p@}
% For double column floats
\setlength\dblfloatsep {12\p@ \@plus 2\p@ \@minus 2\p@}
\setlength\dbltextfloatsep{20\p@ \@plus 2\p@ \@minus 4\p@}
% Space left at top, bottom and in between floats on a float page.
\setlength\@fptop{0\p@} % no space above float page figures
\setlength\@fpsep{12\p@ \@plus 1fil}
\setlength\@fpbot{0\p@}
% The same for double column
\setlength\@dblfptop{0\p@}
\setlength\@dblfpsep{12\p@ \@plus 1fil}
\setlength\@dblfpbot{0\p@}
% Override settings in mathtime back to TeX defaults
\DeclareMathSizes{5} {5} {5} {5}
\DeclareMathSizes{6} {6} {5} {5}
\DeclareMathSizes{7} {7} {5} {5}
\DeclareMathSizes{8} {8} {6} {5}
\DeclareMathSizes{9} {9} {6.5} {5}
\DeclareMathSizes{10} {10} {7.5} {5}
\DeclareMathSizes{12} {12} {9} {7}
% Page styles
\def\ps@headings
{%
% \def\@oddfoot{\vbox to 12.5\p@{\hbox{\rule{\textwidth}{1\p@}}\vss
% \hbox to \textwidth{\hfill\helveticabold\small\thepage}%
% }}%
% \def\@evenfoot{\vbox to 12.5\p@{\rule{\textwidth}{1\p@}\vss
% \hbox to \textwidth{\helveticabold\small\thepage\hfill}%
% }}%
\let\@oddfoot\@empty%
\let\@evenfoot\@empty%
\def\@evenhead{\vbox{\hbox to \textwidth{\fontsize{8}{10}\selectfont
{\helveticabold{\selectfont\thepage}}\hfill\helveticaitalic{\fontshape{sl}\selectfont
\strut\leftmark}}\vspace{5\p@}\rule{\textwidth}{1\p@}}}%
\def\@oddhead{\vbox{\hbox to \textwidth{\fontsize{8}{10}\selectfont
{\helveticaitalic{\fontshape{it}\selectfont\strut\rightmark}}\hfill{\helveticabold{\thepage}}}%
\vspace{5\p@}\rule{\textwidth}{1\p@}}}%
\def\titlemark##1{\markboth{##1}{##1}}%
\def\authormark##1{\gdef\leftmark{##1}}%
}
\def\ps@opening
{%
\def\@oddfoot{{
\hbox to \textwidth{\helvetica
\fontsize{7}{9}\fontshape{n}\selectfont \copyright\space The Author \@copyrightyear. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com%
\hfill\small\helveticabold\thepage}%
}}%
\def\@evenfoot{{
\hbox to \textwidth{\helvetica\thepage\hfill
\fontsize{7}{9}\fontshape{n}\selectfont The Author \@copyrightyear. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com}%
}}%
\let\@evenhead\relax
\let\@oddhead\relax}
% Page range
\newif\iflastpagegiven \lastpagegivenfalse
\newcommand\firstpage[1]{%
\gdef\@firstpage{#1}%
\ifnum\@firstpage>\c@page
\setcounter{page}{#1}%
\ClassWarning{BIO}{Increasing pagenumber to \@firstpage}%
\else \ifnum\@firstpage<\c@page
\ClassWarning{BIO}{Firstpage lower than pagenumber}\fi\fi
\xdef\@firstpage{\the\c@page}%
}
\def\@firstpage{1}
\def\pagenumbering#1{%
\global\c@page \@ne
\gdef\thepage{\csname @#1\endcsname \c@page}%
\gdef\thefirstpage{%
\csname @#1\endcsname \@firstpage}%
\gdef\thelastpage{%
\csname @#1\endcsname \@lastpage}%
}
\newcommand\lastpage[1]{\xdef\@lastpage{#1}%
\global\lastpagegiventrue}
\def\@lastpage{0}
\def\setlastpage{\iflastpagegiven\else
\edef\@tempa{@lastpage@}%
\expandafter
\ifx \csname \@tempa \endcsname \relax
\gdef\@lastpage{0}%
\else
\xdef\@lastpage{\@nameuse{@lastpage@}}%
\fi
\fi }
\def\writelastpage{%
\iflastpagegiven \else
\immediate\write\@auxout%
{\string\global\string\@namedef{@lastpage@}{\the\c@page}}%
\fi
}
\def\thepagerange{%
\ifnum\@lastpage =0 {\ \bf ???} \else
\ifnum\@lastpage = \@firstpage \ \thefirstpage\else
\thefirstpage--\thelastpage \fi\fi}
\AtBeginDocument{\setlastpage
\pagenumbering{arabic}%
}
\AtEndDocument{%
\writelastpage
\if@final
\clearemptydoublepage
\else
\clearpage
\fi}
%
% Sectional units
%
% Counters
\newcounter{section}
\newcounter{subsection}[section]
\newcounter{subsubsection}[subsection]
\newcounter{paragraph}[subsubsection]
\newcounter{subparagraph}[paragraph]
\newcounter{figure}
\newcounter{table}
% Form of the numbers
\newcommand\thepage{\arabic{page}}
\renewcommand\thesection{\arabic{section}}
\renewcommand\thesubsection{{\thesection.\arabic{subsection}}}
\renewcommand\thesubsubsection{{\thesubsection.\arabic{subsubsection}}}
\renewcommand\theparagraph{\thesubsubsection.\arabic{paragraph}}
\renewcommand\thesubparagraph{\theparagraph.\arabic{subparagraph}}
\renewcommand\theequation{\arabic{equation}}
% Form of the words
\newcommand\contentsname{Contents}
\newcommand\listfigurename{List of Figures}
\newcommand\listtablename{List of Tables}
\newcommand\partname{Part}
\newcommand\appendixname{Appendix}
\newcommand\abstractname{Abstract}
\newcommand\refname{References}
\newcommand\bibname{References}
\newcommand\indexname{Index}
\newcommand\figurename{Fig.}
\newcommand\tablename{Table}
% Clearemptydoublepage should really clear the running heads too
\newcommand{\clearemptydoublepage}{\newpage{\pagestyle{empty}\cleardoublepage}}
% Frontmatter, mainmatter and backmatter
\newif\if@mainmatter \@mainmattertrue
\newcommand\frontmatter{%
\clearpage
\@mainmatterfalse
\pagenumbering{roman}}
\newcommand\mainmatter{%
\clearpage
\@mainmattertrue
\pagenumbering{arabic}}
\newcommand\backmatter{%
\clearpage
\@mainmatterfalse}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% TITLE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newlength{\dropfromtop}
\setlength{\dropfromtop}{\z@}
% Application Notes
\newif\if@appnotes
\newcommand{\application}{%
% \setlength{\dropfromtop}{-2.25pc}%
\global\@appnotestrue}
\long\def\title{\@ifnextchar[{\short@title}{\@@title}}
\def\short@title[#1]{\titlemark{#1}\@@@title}
\def\@@title#1{\authormark{#1}\@@@title{#1}}
\long\def\@@@title#1{\gdef\@title{#1}}
\long\def\subtitle#1{\gdef\@subtitle{#1}}
\subtitle{Genome analysis}
\long\def\author{\@ifnextchar[{\short@uthor}{\@uthor}}
\def\short@uthor[#1]{\authormark{#1}\@@author}
\def\@uthor#1{\authormark{#1}\@@author{#1}}
\long\def\@@author#1{\gdef\@author{#1}}
\def\vol#1{\global\def\@vol{#1}}
\def\issue#1{\global\def\@issue{#1}}
\def\history#1{\global\def\@history{#1}}
\def\abstract#1{\global\def\@abstract{#1}}
\def\editor#1{\global\def\@editor{#1}}
\def\pubyear#1{\global\def\@pubyear{#1}}
\def\copyrightyear#1{\global\def\@copyrightyear{#1}}
\def\address#1{\global\def\@address{#1}}
\def\corresp#1{\global\def\@corresp{#1}}
\def\DOI#1{\global\def\@DOI{#1}}
\definecolor{gray}{cmyk}{0, 0, 0, 0.15}
\definecolor{grayfifty}{cmyk}{0, 0, 0, 0.5}
\definecolor{graysixtyfive}{cmyk}{0, 0, 0, 0.65}
\newlength{\extraspace}
\setlength{\extraspace}{\z@}
\newcommand\maketitle{\par
\begingroup
\renewcommand\thefootnote{\@fnsymbol\c@footnote}%
\def\@makefnmark{\rlap{\@textsuperscript{\normalfont\@thefnmark}}}%
\long\def\@makefntext##1{\parindent 3mm\noindent
% \@textsuperscript{\normalfont\@thefnmark}\raggedright##1}%
\@textsuperscript{\normalfont\@thefnmark}##1}%
\if@twocolumn
\ifnum \col@number=\@ne
\@maketitle
\else
\twocolumn[\@maketitle]%
\fi
\else
\newpage\enlargethispage{-23pt}
\global\@topnum\z@ % Prevents figures from going at top of page.
\@maketitle
\fi
\thispagestyle{opening}\@thanks
\endgroup
\setcounter{footnote}{0}%
\global\let\thanks\relax
\global\let\maketitle\relax
\global\let\@maketitle\relax
\global\let\@address\@empty
\global\let\@corresp\@empty
\global\let\@history\@empty
\global\let\@editor\@empty
\global\let\@thanks\@empty
\global\let\@author\@empty
\global\let\@date\@empty
\global\let\@subtitle\@empty
\global\let\@title\@empty
\global\let\@pubyear\@empty
\global\let\address\relax
\global\let\history\relax
\global\let\editor\relax
\global\let\title\relax
\global\let\author\relax
\global\let\date\relax
\global\let\pubyear\relax
\global\let\@copyrightline\@empty
\global\let\and\relax
\@afterindentfalse\@afterheading
\enlargethispage{-23pt}}
\newlength{\aboveskipchk}%for checking oddpage or evenpage top skip
\setlength{\aboveskipchk}{\z@}%
\def\access#1{\gdef\@access{#1}}
\access{Advance Access Publication Date: 2 April 2015}
\def\appnotes#1{\gdef\@appnotes{#1}}
\appnotes{Applications Note}
\def\@maketitle{%
\let\footnote\thanks
\clearemptydoublepage
\checkoddpage\ifcpoddpage\setlength{\aboveskipchk}{-16pt}\else\setlength{\aboveskipchk}{-16pt}\fi%for checking oddpage or evenpage top skip%%
\vspace*{\aboveskipchk}%
\vspace{\dropfromtop}%
\hbox to \textwidth{\raisebox{5pt}[0pt]{%
\parbox[b]{415pt}{\raggedleft{\helveticacnitalic\fontsize{8}{12}\selectfont {Bioinformatics}}\\[1pt]
{\helveticacn doi.10.1093/bioinformatics/xxxxxx}\\[1pt]
{\ifx\@access\@empty
\else
{\helveticacn \@access}\fi}
\vskip1pt
{\ifx\@appnotes\@empty
\else
{\helveticacn \@appnotes}\fi}
}}%
%\enskip \parbox[b]{11.3pc}{%
% \helvetica
% \flushright\fontsize{8}{10}\fontshape{it}\selectfont
% Vol. 00\ no. 00 \@pubyear\\
% \hfill Pages \thepagerange
% }
\hfill\includegraphics{OUP_First_SBk_Bot_8401.eps}}
\vskip2pt
\rule{415pt}{2\p@}\par%
\helvetica
\hbox to \textwidth{%
\parbox[t]{36.5pc}{%
\vspace*{3pt}
\ifx\@subtitle\@empty
\else
{\helveticacn\fontsize{14}{21}\selectfont\raggedright \@subtitle \par}%
\vspace{7.5\p@}
\fi
{\helveticabold\fontsize{18}{23}\selectfont\raggedright \@title \par}%
\vspace{8.8\p@}
{\helveticabold\boldmath\fontsize{12}{15}\selectfont\raggedright \@author \par}%
\vspace{9\p@}
{\helveticacn\fontsize{9}{12}\selectfont\raggedright \@address \par}%
\vspace{6\p@}
{\helveticacn\fontsize{8.5}{12}\selectfont\raggedright \@corresp \par}%
\vspace{2\p@}
{\helvetica\fontsize{8.5}{12}\selectfont\raggedright \@editor \par}
\vspace{4\p@}
{\helvetica\fontsize{7}{12}\selectfont\raggedright \@history \par}
\vspace{14\p@}
{
\let\section\absection
{\helvetica\fontsize{10}{12}\bfseries\selectfont Abstract}\par}
\vskip5pt
\begingroup\begin{minipage}[t]{415pt}\parindent=0pt
{\helvetica\fontsize{9}{12}\selectfont \@abstract\par}
\end{minipage}
\endgroup
%\vspace{20\p@}
}%
}
\vspace{13.5\p@}%
\rule{415pt}{2\p@}%
\vspace{12\p@ plus 6\p@ minus 6\p@}%
\vspace{\extraspace}
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%% Abstract %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newcommand{\absection}[1]{%
\par\noindent{\bfseries #1}\space\ignorespaces}
%\newenvironment{abstract}{%
% \begingroup
% \let\section\absection
% \fontfamily{\sfdefault}\fontsize{10}{12}\sffamily\selectfont
% {\fontseries{b}\selectfont Abstract}\par}
%{\endgroup\bigskip\@afterheading\@afterindentfalse\vskip 12pt plus 3pt minus 1pt}
% Section macros
% Lowest level heading that takes a number by default
\setcounter{secnumdepth}{3}
\renewcommand{\@seccntformat}[1]{\csname the#1\endcsname\space}
\def\section{%
\@startsection{section}{1}{\z@}
{-22\p@ plus -3\p@}{4\p@}
{\reset@font\raggedright\helveticabold\fontsize{10}{12}\selectfont}}
\def\subsection{%
\@startsection{subsection}{2}{\z@}
{-11\p@ plus -2\p@}{4\p@}
{\reset@font\raggedright\helvetica\fontsize{9}{12}\selectfont}}
\def\subsubsection{%
\@startsection{subsubsection}{3}{\z@}
{-11\p@ plus -1\p@}{0.001em}
{\reset@font\normalfont\mathversion{bold}\normalsize\bfseries}}
\def\textcolon{\text{\rm :}}
\def\paragraph{%
\@startsection{paragraph}{4}{\z@}
{-6\p@}
{-.4em}
{\reset@font\itshape}}
% ********************
% Figures and tables *
% ********************
% Table and array parameters
\setlength\arraycolsep{.5em}
\setlength\tabcolsep{.5em}
\setlength\arrayrulewidth{.5pt}
\setlength\doublerulesep{2.5pt}
\setlength\extrarowheight{\z@}
\renewcommand\arraystretch{1}
\newlength{\abovecaptionskip}
\newlength{\belowcaptionskip}
\setlength{\abovecaptionskip}{13pt}
\setlength{\belowcaptionskip}{2pt}
\long\def\@makecaption#1#2{\vspace{\abovecaptionskip}%
\begingroup
\scriptsize\sffamily
\text{\sfb #1.}\space{#2}\par
\endgroup}
\long\def\@tablecaption#1#2{%
\begingroup
\fontsize{7.5pt}{10.5pt}\sffamily\selectfont
\textbf{#1.}\space{#2\strut\par}
\endgroup\vspace{\belowcaptionskip}}
% Table rules
\def\toprule{\noalign{\ifnum0=`}\fi\hrule \@height 0.5pt \hrule \@height 4pt \@width 0pt \futurelet
\@tempa\@xhline}
\def\midrule{\noalign{\ifnum0=`}\fi \hrule \@height 3pt \@width 0pt \hrule \@height 0.5pt
\hrule \@height 4pt \@width 0pt \futurelet \@tempa\@xhline}
\def\botrule{\noalign{\ifnum0=`}\fi \hrule \@height 3.75pt \@width 0pt \hrule \@height 0.5pt \futurelet
\@tempa\@xhline}
\def\hrulefill{\leavevmode\leaders\hrule height .5pt\hfill\kern\z@}
\def\thefigure{\@arabic\c@figure}
\def\fps@figure{tbp}
\def\ftype@figure{1}
\def\ext@figure{lof}
\def\fnum@figure{\figurename~\thefigure}
\def\figure{\@float{figure}}
\let\endfigure\end@float
\@namedef{figure*}{\@dblfloat{figure}}
\@namedef{endfigure*}{\end@dblfloat}
\def\thetable{\@arabic\c@table}
\def\fps@table{tbp}
\def\ftype@table{2}
\def\ext@table{lot}
\def\fnum@table{Table~\thetable}
\def\table{\let\@makecaption\@tablecaption\let\source\tablesource\@float{table}}
\def\endtable{\end@float}
\@namedef{table*}{\let\@makecaption\@tablecaption\@dblfloat{table}}
\@namedef{endtable*}{\end@dblfloat}
\newif\if@rotate \@rotatefalse
\newif\if@rotatecenter \@rotatecenterfalse
\def\rotatecenter{\global\@rotatecentertrue}
\def\rotateendcenter{\global\@rotatecenterfalse}
\def\rotate{\global\@rotatetrue}
\def\endrotate{\global\@rotatefalse}
\newdimen\rotdimen
\def\rotstart#1{\special{ps: gsave currentpoint currentpoint translate
#1 neg exch neg exch translate}}
\def\rotfinish{\special{ps: currentpoint grestore moveto}}
\def\rotl#1{\rotdimen=\ht#1\advance\rotdimen by \dp#1
\hbox to \rotdimen{\vbox to\wd#1{\vskip \wd#1
\rotstart{270 rotate}\box #1\vss}\hss}\rotfinish}
\def\rotr#1{\rotdimen=\ht #1\advance\rotdimen by \dp#1
\hbox to \rotdimen{\vbox to \wd#1{\vskip \wd#1
\rotstart{90 rotate}\box #1\vss}\hss}\rotfinish}
\newdimen\tempdime
\newbox\temptbox
% From ifmtarg.sty
% Copyright Peter Wilson and Donald Arseneau, 2000
\begingroup
\catcode`\Q=3
\long\gdef\@ifmtarg#1{\@xifmtarg#1QQ\@secondoftwo\@firstoftwo\@nil}
\long\gdef\@xifmtarg#1#2Q#3#4#5\@nil{#4}
\long\gdef\@ifnotmtarg#1{\@xifmtarg#1QQ\@firstofone\@gobble\@nil}
\endgroup
\def\tablesize{\@setfontsize\tablesize{7.5\p@}{10\p@}}
\newenvironment{processtable}[3]{\setbox\temptbox=\hbox{{\tablesize #2}}%
\tempdime\wd\temptbox\@processtable{#1}{#2}{#3}{\tempdime}}
{\relax}
\newcommand{\@processtable}[4]{%
\if@rotate
\setbox4=\vbox to \hsize{\vss\hbox to \textheight{%
\begin{minipage}{#4}%
\@ifmtarg{#1}{}{\caption{#1}}{\tablesize #2}%
\vskip7\p@\noindent
\parbox{#4}{\fontsize{7}{9}\selectfont #3\par}%
\end{minipage}}\vss}%
\rotr{4}
\else
\hbox to \hsize{\hss\begin{minipage}[t]{#4}%
\vskip2.9pt
\@ifmtarg{#1}{}{\caption{#1}}{\tablesize #2}%
\vskip6\p@\parindent=12pt
\parbox{#4}{\fontsize{7}{9}\selectfont #3\par}%
\end{minipage}\hss}\fi}%
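% Illustrative usage of processtable (a sketch, not part of the class).
% The three arguments are, in order: the caption (may be empty), the
% tabular material, and the note set below the table. The label, column
% heads and cell contents below are placeholders.
%
%   \begin{table}[!t]
%   \processtable{Example caption\label{Tab:example}}
%   {\begin{tabular}{@{}ll@{}}\toprule
%     Head 1 & Head 2\\\midrule
%     cell 1 & cell 2\\\botrule
%    \end{tabular}}
%   {Note text printed under the table.}
%   \end{table}
%
% The table is first measured in \temptbox, so the caption and the note
% are set to the natural width of the tabular.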
\newcolumntype{P}[1]{>{\raggedright\let\\\@arraycr\hangindent1em}p{#1}}
% ******************************
% List numbering and lettering *
% ******************************
\def\labelenumi{{\rm\arabic{enumi}.}}
\def\theenumi{\arabic{enumi}}
\def\labelenumii{{\rm\alph{enumii}.}}
\def\theenumii{\alph{enumii}}
\def\p@enumii{\theenumi}
\def\labelenumiii{{\rm(\arabic{enumiii})}}
\def\theenumiii{\roman{enumiii}}
\def\p@enumiii{\theenumi(\theenumii)}
\def\labelenumiv{{\rm(\arabic{enumiv})}}
\def\theenumiv{\Alph{enumiv}}
\def\p@enumiv{\p@enumiii\theenumiii}
\def\labelitemi{{\small$\bullet$}}
\def\labelitemii{{\small$\bullet$}}
\def\labelitemiii{{\small$\bullet$}}
\def\labelitemiv{{\small$\bullet$}}
\def\@listI{\leftmargin\leftmargini \topsep\medskipamount}
\let\@listi\@listI
\@listi
\def\@listii{\topsep\z@\leftmargin\leftmarginii}
\def\@listiii{\leftmargin\leftmarginiii \topsep\z@}
\def\@listiv{\leftmargin\leftmarginiv \topsep\z@}
\def\@listv{\leftmargin\leftmarginv \topsep\z@}
\def\@listvi{\leftmargin\leftmarginvi \topsep\z@}
\setlength{\leftmargini}{3mm}
\setlength{\leftmarginii}{\z@}
\setlength{\leftmarginiii}{\z@}
\setlength{\leftmarginiv}{\z@}
% Changes to the list parameters for enumerate
\def\enumargs{%
\partopsep \z@
\itemsep \z@
\parsep \z@
\labelsep 1em
\listparindent \parindent
\itemindent \z@
\topsep 7\p@
}
\def\enumerate{%
\@ifnextchar[{\@numerate}{\@numerate[0]}}
\def\@numerate[#1]{%
\ifnum \@enumdepth >3 \@toodeep\else
\advance\@enumdepth \@ne
\edef\@enumctr{enum\romannumeral\the\@enumdepth}
\list{\csname label\@enumctr\endcsname}{%
\enumargs
\setlength{\leftmargin}{\csname leftmargin\romannumeral\the\@enumdepth\endcsname}
\usecounter{\@enumctr}
\settowidth\labelwidth{#1}
\addtolength{\leftmargin}{\labelwidth}
\addtolength{\leftmargin}{2pt}
\def\makelabel##1{\hss \llap{##1}}}%
\fi
}
\let\endenumerate\endlist
% Changes to the list parameters for itemize
\def\itemargs{%
\partopsep \z@
\itemsep 0\p@
\parsep \z@
\labelsep 1em
\rightmargin \z@
\listparindent \parindent
\itemindent \z@
\topsep7\p@
}
\def\itemize{%
\@ifnextchar[{\@itemize}{\@itemize[$\bullet$]}}
\def\@itemize[#1]{%
\ifnum \@itemdepth >3 \@toodeep\else
\advance\@itemdepth \@ne
\edef\@itemctr{item\romannumeral\the\@itemdepth}
\list{\csname label\@itemctr\endcsname}{%
\itemargs
\setlength{\leftmargin}{\csname leftmargin\romannumeral\the\@itemdepth\endcsname}
\settowidth\labelwidth{#1}
\addtolength{\leftmargin}{\labelwidth}
%\addtolength{\leftmargin}{\labelsep}
\def\makelabel##1{\hss \llap{##1}}}%
\fi
}
\let\enditemize\endlist
\newenvironment{unlist}{%
\begin{list}{}%
{\setlength{\labelwidth}{\z@}%
\setlength{\labelsep}{\z@}%
\setlength{\topsep}{\medskipamount}%
\setlength{\itemsep}{3\p@}%
\setlength{\leftmargin}{2em}%
\setlength{\itemindent}{-2em}}}
{\end{list}}
% ***********************
% Quotes and Quotations *
% ***********************
\def\quotation{\par\begin{list}{}{
\setlength{\topsep}{\medskipamount}
\setlength{\leftmargin}{2em}%
\setlength{\rightmargin}{\z@}%
\setlength\labelwidth{0pt}%
\setlength\labelsep{0pt}%
\listparindent\parindent}%
\item[]}
\def\endquotation{\end{list}}
\let\quote\quotation
\let\endquote\endquotation
\skip\@mpfootins = \skip\footins
\fboxsep=6\p@
\fboxrule=1\p@
% *******************
% Table of contents *
% *******************
\newcommand\@pnumwidth{4em}
\newcommand\@tocrmarg{2.55em plus 1fil}
\newcommand\@dotsep{1000}
\setcounter{tocdepth}{4}
\def\numberline#1{\hbox to \@tempdima{{#1}}}
\def\@authortocline#1#2#3#4#5{%
\vskip 1.5\p@
\ifnum #1>\c@tocdepth \else
{\leftskip #2\relax \rightskip \@tocrmarg \parfillskip -\rightskip
\parindent #2\relax\@afterindenttrue
\interlinepenalty\@M
\leavevmode
\@tempdima #3\relax
\advance\leftskip \@tempdima \null\nobreak\hskip -\leftskip
{\itshape #4}\nobreak
\leaders\hbox{$\m@th
\mkern \@dotsep mu\hbox{.}\mkern \@dotsep
mu$}\hfill
\nobreak
\hb@xt@\@pnumwidth{\hfil}%
\par}%
\fi}
\newcommand*\l@author{\@authortocline{2}{0pt}{30pt}}
\newcommand*\l@section{\@dottedtocline{3}{11pt}{20pt}}
\newcommand*\l@subsection{\@dottedtocline{4}{31pt}{29pt}}
\newcommand*\l@subsubsection[2]{}
% ***********
% Footnotes *
% ***********
\def\footnoterule{\noindent\rule{\columnwidth}{0.5pt}}
\def\@makefnmark{\@textsuperscript{\normalfont\@thefnmark}}%
\newcommand\@makefntext[1]{\noindent{\@makefnmark}\enskip#1}
% ***********
% References *
% ***********
\providecommand{\newblock}{}
\newenvironment{thebibliography}{%
\section{\bibname}%
\begingroup
\small
\begin{list}{}{%
\setlength{\topsep}{\z@}%
\setlength{\labelsep}{\z@}%
\settowidth{\labelwidth}{\z@}%
\setlength{\leftmargin}{4mm}%
\setlength{\itemindent}{-4mm}}\small}
{\end{list}\endgroup}
\RequirePackage{natbib}
% **********
% Appendix *
% **********
\newif\ifappend % Are we in the Appendix?
\def\appendix{\par
\setcounter{section}{0}
\setcounter{subsection}{0}
\appendtrue
}
%Math parameters
\setlength{\jot}{5\p@}
\mathchardef\@m=1500 % adapted value
\def\frenchspacing{\sfcode`\.\@m \sfcode`\?\@m \sfcode`\!\@m
\sfcode`\:\@m \sfcode`\;\@m \sfcode`\,\@m}
% Theorems
\def\th@plain{%
%% \let\thm@indent\noindent % no indent
\thm@headfont{\quad\scshape}% heading font is small caps
\thm@notefont{\upshape\mdseries}% note font is upright, medium weight
\thm@headpunct{.}% period after heading
\thm@headsep 5\p@ plus\p@ minus\p@\relax
%% \let\thm@swap\@gobble
%% \thm@preskip\topsep
%% \thm@postskip\theorempreskipamount
\itshape % body font
}
\vbadness=9999
\tolerance=9999
\doublehyphendemerits 640000 % corresponds to badness 800
\finalhyphendemerits 1000000 % corresponds to badness 1000
\flushbottom
\frenchspacing
\ps@headings
\twocolumn
% Screen PDF compatibility
\newcommand{\medline}[1]{%
\unskip\unskip\ignorespaces}
%%%%for smaller size text
\newenvironment{methods}{%
\begingroup
\def\section{%
\@startsection{section}{1}{\z@}
{-24\p@ plus -3\p@}{4\p@}
{\reset@font\raggedright\helveticabold\fontsize{10}{12}\selectfont}}
\def\subsection{%
\@startsection{subsection}{2}{\z@}
{-11\p@ plus -2\p@}{4\p@}
{\reset@font\raggedright\helvetica\fontsize{9}{12}\selectfont}}
\def\subsubsection{%
\@startsection{subsubsection}{3}{\z@}
{-11\p@ plus -1\p@}{0.001em}
{\reset@font\normalfont\mathversion{bold}\normalsize\bfseries}}
\normalsize
\par}
{\par\endgroup\bigskip\@afterheading\@afterindentfalse}
\graphicspath{{g:/artwork/oup/bioinfo/}}
\language=2
\hyphenation{Figure Table Figures Tables}
\newcommand{\href}[2]{#2}
\renewenvironment{proof}[1][\proofname]{\par
\normalfont \topsep6\p@\@plus6\p@\relax
\labelsep 0.5em
\trivlist
\item[\hskip\labelsep\hskip1em\textsc{#1}.]\ignorespaces
}{\endtrivlist\@endpefalse}
%%Different Bonds
\def\sbond{\ensuremath{\raise.25ex\hbox{${-}\!\!\!\!{-}$}}\kern -.9pt}
\def\dbond{\ensuremath{\raise.25ex\hbox{=$\!$=}}}
\def\tbond{\ensuremath{\raise.20ex\hbox{${\equiv}\!\!\!{\equiv}$}}}
% Author queries
%\fboxsep=4\p@
%\fboxrule=0.5\p@
\newcommand{\query}[2][0pt]{}%
% \marginpar{\vspace*{#1}%
% {\parbox{\marginparwidth}{%
% \raggedright\fontsize{6}{8}\selectfont
% #2}}}}
\renewcommand{\dag}{{\mathversion{normal}$^{\dagger}$}}
\endinput
\ No newline at end of file
\documentclass{bioinfo}
\copyrightyear{2019} \pubyear{2019}
\access{Advance Access Publication Date: Day Month Year}
\appnotes{Manuscript Category}
\usepackage[ruled,vlined]{algorithm2e}
\usepackage{xcolor}
\begin{document}
\firstpage{1}
\subtitle{Subject Section}
\title[BiORSEO]{BiORSEO: A bi-objective method to predict RNA secondary structures with pseudoknots using RNA 3D modules}
\author[Becquey \textit{et~al}.]{Louis Becquey\,$^{\text{\sfb 1,}*}$, Eric Angel\,$^{\text{\sfb 1}}$ and Fariza Tahi\,$^{\text{\sfb 1}*}$}
\address{$^{\text{\sf 1}}$IBISC, Univ Evry, Universite Paris-Saclay, 91025, Evry, France}
\corresp{$^\ast$To whom correspondence should be addressed.}
\history{Received on XXXXX; revised on XXXXX; accepted on XXXXX}
\editor{Associate Editor: XXXXXXX}
\abstract{\textbf{Motivation:} RNA loops have been modelled and clustered from solved 3D structures into ordered collections of recurrent non-canonical interactions called ``RNA modules'', available in databases. This work explores what information from such modules can be used to improve secondary structure prediction.
We propose a bi-objective method for predicting RNA secondary structures by minimizing both an energy-based and a knowledge-based potential. The tool, called \textsc{BiORSEO}, outputs secondary structures corresponding to the optimal solutions from the Pareto set.\\
\textbf{Results:} We compare several approaches to predict secondary structures using information from inserted RNA modules: two module data sources, Rna3Dmotif and The RNA 3D Motif Atlas, and different ways to score the module insertions: module size, module complexity, or module probability according to models like JAR3D and BayesPairing. We benchmark them against a large set of known secondary structures, \textcolor{red}{together with some state-of-the-art tools, and comment on the usefulness of the half physics-based, half data-based approach}.\\ % Some of the tested methods present a good performance, especially on structures containing pseudoknots. They are compared to state of the art tools for secondary structure prediction.\\
\textbf{Availability:} The software \textcolor{red}{is} available for download on the \href{https://evryrna.ibisc.univ-evry.fr/evryrna/BiORSEO/}{EvryRNA website}, as well as the datasets.\\
\textbf{Contact:} \href{louis.becquey@univ-evry.fr}{louis.becquey@univ-evry.fr}, \href{fariza.tahi@univ-evry.fr}{fariza.tahi@univ-evry.fr}\\
\textbf{Supplementary information:} Supplementary sections are available at \textit{Bioinformatics}
online.}
\maketitle
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}
Ribonucleic acid (RNA) is a macromolecule which is often single-stranded. Therefore, the strand has the ability to fold in space in more complex ways than DNA, which is mostly known to form double-stranded stems. A stem is a succession of basepairs called Watson-Crick basepairs, or ``canonical'', stacked on top of each other. While this can still happen with RNA, we also observe several other ways for a nucleotide to interact with another. For example, Leontis and Westhof proposed a classification of 12 non-canonical basepairs~(\citealp{leontis2001geometric}). Some of the nucleotides can also interact with the 2'-OH of a ribose, or with a phosphate, or even not interact at all and bulge out of the RNA structure.
\paragraph{Modelling RNA structures as graphs} ~
For modelling purposes, researchers working on computational problems involving RNA represent them with graphs. A recent article~(\citealp{schlick2018adventures}) details the different graph models of RNAs and their respective advantages.
We are particularly interested in the secondary structure graph of the RNA, i.e. a graph where the nucleotides are nodes, and backbone bonds and canonical basepairs are edges. In this kind of graph, the non-canonical interactions do not appear.
As the problem of predicting the 3D structure of an RNA from sequence has been too computationally expensive for years, and is still difficult, a common first step has been to predict this secondary structure (2D) graph, by computing what regions will form stems and what regions will \textcolor{red}{not be Watson-Crick basepaired}, forming so-called loops.
In many cases, the solution to the 2D folding problem is not unique, and RNAs have the ability to switch between several metastable conformations. Most approaches use dynamic programming schemes to compute the RNA partition function and/or canonical pairing probabilities, i.e. the probability for each nucleotide to form a canonical base-pair with every other nucleotide, or to remain unpaired~(\citealp{mccaskill1990equilibrium}). Once the partition function is computed, several models exist to rebuild one or several best structure(s): we can choose the Minimum Free Energy (MFE) structure, the one that maximizes expected accuracy (MEA), or the centroid of the ensemble. The most widely used are low-complexity implementations such as \texttt{RNAfold} and \texttt{Fold}~(\citealp{lorenz2011viennarna}, \citealp{mathews2004using}). An important limitation of these algorithms is their inability to model so-called \textit{pseudoknotted} structures, i.e. structures with basepairs $(i,j)$ and $(k,l)$ such that $i<k<j<l$.
Later, some variants taking pseudoknots into account were developed, e.g. in the NUPACK package~(\citealp{dirksAlgorithmComputingNucleic2004}) or \texttt{ProbKnot} from the RNAstructure package~(\citealp{bellaousov2010probknot}).
We can also cite Biokop (\citealp{legendre_bi-objective_2018}), a recent tool that uses both MFE and MEA criteria in a bi-objective framework, and returns optimal and suboptimal structures including pseudoknots.
RNAs can be seen as an assembly of stems and loops. To move from the planar 2D graph to 3D, one usually predicts the 3D structure of stems and loops separately. Stems are relatively easy to tackle because of the isostericity of the Watson-Crick basepairs; their structure has been widely observed and features low variability. On the other hand, to accurately model loops in 3D, one needs to take the non-canonical interactions into account. In many cases, two hairpin loops can even form new canonical basepairs to form \textit{kissing hairpins}, a particular pseudoknot type (called HHH) which is hard to predict because it involves a distant basepair between two unpaired loop regions. More nomenclature details about loops and pseudoknots are provided in Supplementary Section~A.
\paragraph{\textcolor{red}{Modelling loops as modules}} ~ Several works have gathered 3D crystal structures involving RNA chains, extracted the loops from those RNA chains and annotated the base contacts using MC-Annotate (\citealp{gendron2001quantitative}), FR3D (\citealp{sarver_fr3d:_2008}) or DSSR (\citealp{lu_dssr:_2015}).
They model the loops with more detailed graphs describing non-canonical contacts on their edges.
The graphs can then be clustered with respect to a similarity or isomorphism measure, and the sequence variations over the nucleotides of the loop can be modeled.
Those models are called RNA \textit{modules}, i.e. an ordered collection of non-canonical basepairs or stacking interactions, leading to a conserved 3D shape in different RNA molecules.
We can cite the work from (\citealp{djelloul_automated_2008}) with Rna3Dmotif, a pipeline that extracts terminal hairpin loops (HL), internal loops (IL), and multiple loops (ML) from structures annotated by FR3D, and can cluster them using a graph similarity metric.
Another one is the RNA 3D Motif Atlas~(\citealp{petrov_automated_2013}), which does not support multiple loops, but clusters the loops using all sequence information, nucleotide contacts and shape information, which leads to loop module models with tolerance in sequence and length variations.
\textcolor{red}{More recent ones are RNAMotifClusters (\citealp{ge2018novo}), and also} CaRNAval (\citealp{reinharz2018mining}), an approach that can model a wide variety of structural features such as multipairs, multi-stranded loops, and pseudoknots.
To be exhaustive, we can also cite RNA Bricks 2 (\citealp{chojnowski2014rna}), which additionally studies contacts with protein chains.
\textcolor{red}{Further work has been done to help researchers detect whether an RNA sequence folds according to a known module model, either user-provided or retrieved from a database.}
For example, JAR3D (\citealp{zirbel_identifying_2015}) can score the modules from the RNA 3D Motif Atlas against a query \textcolor{red}{loop} sequence.
\textcolor{red}{With RMDetect~(\citealp{cruz2011sequence}), the authors proposed to build Bayesian networks from any RNA module's graph to summarize all the observed sequence variants of the module.
The metaRNAmodules pipeline (\citealp{theis2013automated}) can then be used to build Bayesian networks for Djelloul \& Denise's modules and detect them in aligned RNA sequences.
The recent BayesPairing tool (\citealp{sarrazin2019automated}) can do the same for modules from several databases, and can detect them in single RNA sequences.}
\paragraph{\textcolor{red}{Secondary structure prediction using RNA modules}} ~ In this paper, \textcolor{red}{we test whether knowing that a subsequence matches a module model well helps to identify a loop in the RNA when predicting the secondary structure.}
In other words, we do not use the 3D data to \textcolor{red}{propose 3D conformations for the loops}, but we use it to score 2D conformations.
A first attempt \textcolor{red}{to use modules to guide RNA 2D structure predictions} is RNA-MoIP~(\citealp{reinharz_towards_2012}).
The tool \textcolor{red}{inserts modules from Rna3Dmotif in a set of candidate solutions} (often given by a simple tool like RNAsubopt~(\citealp{lorenz2011viennarna})) to \textcolor{red}{return a single} 2D structure with non-canonical base-pairs included, \textcolor{red}{which is then a better input} to give to the 3D reconstruction tool MC-Sym~(\citealp{parisien2008mc}), resulting in better prediction of 3D structures.
\textcolor{red}{Another example is the use of JAR3D and metaRNAmodules by (\citealp{theis2015rna}) to predict secondary structures at genome-wide scale, using an alignment of 13 genomes.
This last study shows that the use of modules lowers the false discovery rate of secondary structure elements.
This is encouraging, but as we focus more on short, unaligned single RNA sequences, we compare ourselves to RNA-MoIP in this study.
Going further than metaRNAmodules, BayesPairing can detect modules from more databases and without sequence alignments, so we consider it instead in this work.}
\textcolor{red}{When inserting modules into secondary structures, RNA-MoIP sometimes has to break basepairs.
It tries to find a compromise between staying close to the initial structure and inserting more modules. Unfortunately, }it cannot distinguish important base-pairs from less important ones, and might break some of the ones stabilizing a whole stem while inserting a module, resulting in less probable structures as output. \textcolor{red}{Evidence is provided in Supplementary Section B.}
To \textcolor{red}{overcome this problem}, we design a method which builds a 2D structure by simultaneously placing base-pairs and modules in a single step, taking into account two objectives: the expected accuracy of the structure in the equilibrium ensemble fold, and a custom function that reflects the number and quality of inserted modules (several models are studied).
This method leads to our new tool BiORSEO (Bi-Objective RNA Structure Efficient Optimizer).
Our approach avoids using a weighted linear combination of the objectives as done in RNA-MoIP (which can miss interesting \textcolor{red}{so-called \textit{non-supported} solutions}). In this paper, we use a bi-objective Pareto-based approach, i.e. we identify all the non-dominated structures (the structures for which no other structure scores at least as well on both objectives and strictly better on at least one).\\
The paper is organized as follows. In the next section, we present the sources of module models, the insertion models and some objective functions, and the procedure to compare them. Then we present a benchmark of all those variants against reference tools in Section \ref{sec:results}, using two reference datasets. \textcolor{red}{We also discuss the usefulness of the module criteria.}
\begin{figure*}[t]
\includegraphics[width=\textwidth]{graph_abstract.jpg}
\caption{The information flow in our method. The inputs are the RNA sequence and module models from a database. First, we compute the pairing probabilities based on~(\citealp{dirksAlgorithmComputingNucleic2004}) and the probable insertion sites using 3 different methods listed in Section \ref{sec:models}. Then, we can define two objectives and linear constraints to compute the Pareto set of secondary structures for the input sequence, by solving a bi-objective integer linear program (ILP).}
\label{fig:pipeline}
\end{figure*}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{methods}
\section{Methods}\label{sec:methods}
We compare two databases of modules: (1) the module dataset used by RNA-MoIP~(\citealp{reinharz_towards_2012}), in the DESC file format of Rna3Dmotif~(\citealp{djelloul_automated_2008}); (2) modules from the RNA 3D Motif Atlas v3.2, as provided on the BGSU website, see~(\citealp{petrov_automated_2013}) for more information. Our main procedure is the following:
\begin{itemize}
\item \textbf{Pattern-matching step:} Find all possible occurrences of known RNA modules in the query sequence, by finding subsequences of the query that score well with the probabilistic models of the modules (several models are compared).
\item \textbf{Constraints definition step:} Define constraints on the secondary structure imposed by modules if they would be included (in this case, \textcolor{red}{the closing basepairs of the module are mandatory}).
\item \textbf{Optimization step:} Find a secondary structure that satisfies as much as possible both the expected accuracy of the structure and a criterion taking into account module inclusions, by solving a bi-objective integer linear programming (ILP) problem, using the constraints defined in the previous step.
\end{itemize}
The ILP framework used to define the constraints and solve the resulting optimization problem is similar to previous works like IPknot (\citealp{sato_ipknot:_2011}), RNA-MoIP~(\citealp{reinharz_towards_2012}) or Biokop (\citealp{legendre_bi-objective_2018}).
We chose the ILP approach because it has proven effective for pseudoknot prediction in IPknot and Biokop, see (\citealp{legendre_bi-objective_2018}). In particular, any type of pseudoknot can be predicted.
Figure \ref{fig:pipeline} summarizes the \textcolor{red}{whole} procedure as a graphical pipeline.
\subsection{Pattern matching step}\label{sec:models}
Several methods have been proposed to determine if a sequence (or a part of it) is likely to fold following a given module. \textcolor{red}{We benchmarked the following ones:}
\paragraph{Direct pattern matching} ~ The simplest approach when no statistical model is available is to use a regular expression and direct pattern matching against the input sequence. This is the approach used by RNA-MoIP. We used it with the Rna3Dmotif data as presented in RNA-MoIP's article ~(\citealp{reinharz_towards_2012}), dealing with special cases in the same way (very short components, wildcards).
\paragraph{JAR3D} ~ For each motif group in the RNA 3D Motif Atlas, \textcolor{red}{~(\citealp{zirbel_identifying_2015}) built} a probabilistic model for sequence variability.
Their implementation, called JAR3D, takes user-provided loop sub-sequences as input and outputs a score for every motif of the Atlas on every provided loop. This method has the advantage of allowing variations in sequence length compared to the module model. Unfortunately, it can only be used for hairpin and internal loops. Another major drawback is that it requires the computation of the pairing probabilities to first locate \textit{where} the most probable loops are, to give them as input to JAR3D. We therefore use RNAsubopt first to get the positions of the probable loops. JAR3D has been developed for modules from the RNA 3D Motif Atlas and can only be tested with them.
\paragraph{\textcolor{red}{BayesPairing's sequence probability distributions}} ~ When we have data for several instances of a module, we can estimate a probabilistic distribution of the nucleotides over the module nodes. A first intuitive approach is to use the base frequencies. But as paired nucleotides are not independent at all, it is more rigorous to model those dependencies. An approach proposed in~(\citealp{cruz2011sequence}) is to transform the module's graph into a Bayesian network, which models the dependencies between nucleotide probabilities at every node of the graph.
\textcolor{red}{BayesPairing~(\citealp{sarrazin2019automated}) automates the building of Bayesian networks for every module.}
A large number of sequences are sampled using the Bayesian network, and they are pattern-matched against the query to find occurrences.
An additional step compares the free energy of the structure with and without the constraint of each matched module, and selects only the candidate sites that do not deteriorate the energy too much.
This last step is \textcolor{red}{not required here}, first because it \textcolor{red}{repeats the calculation} of the partition function, and second because we would try to insert modules \textcolor{red}{with BiORSEO} that were pre-selected \textcolor{red}{by BayesPairing} to be appropriate.
Therefore we chose to ignore this last step and let our optimizer select the pertinent modules among the candidates. BayesPairing can be used for both data sources.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Constraints definition step and IP model} \label{sec:ip}
The full list of variables we used to model the problem in an integer linear program and the linear formulation of each constraint are detailed in Supplementary Section C. Here we propose different objective functions to score the candidate module insertions. %, whose performances are compared in section \ref{sec:results}.
\paragraph{Notations} ~ We call \textit{component} a piece of strand which forms an unpaired portion of a module. An HL has one, an IL or bulge has two, and an ML has more (see Supplementary Section A). Components of a module are linked together by canonical base-pairs at their extremities to form a loop. Let $x$ be a module which could be inserted at some defined position in the sequence. Let $\|x\|$ be the number of components of this module, and $k_{x,i}$ the nucleotide count of the $i$th component of $x$. When a scoring model is used (JAR3D or BayesPairing), we denote $p(x)$ the score value of $x$ inserted at the defined position. Let $p_{uv}$ be the probability for nucleotides $u$ and $v$ (with $v>u+3$) to form a canonical base-pair. We use NUPACK's dynamic programming scheme~(\citealp{dirksAlgorithmComputingNucleic2004}), which supports pseudoknots, to compute such probabilities. We denote $y^u_v$ the binary decision variable indicating that these nucleotides do form a canonical base pair, and $C^x_1$ the binary decision variable indicating whether the module $x$ is inserted or not. Solving the ILP yields solutions by fixing the values of the different $y^u_v$ and $C^x_1$.
\paragraph{Objective functions} ~ The more modules are included, the more information we get about set and unset base-pairs, and potentially about the tertiary folds of the loops in space.
So maximizing the number of modules could be a valid criterion. However, a disadvantage of such a criterion is that it penalizes MLs with large $\|x\|$, because the insertion of an ML forbids at the same time the insertion of several ILs or bulges in place.
RNA-MoIP uses the sum of the squared nucleotide count over the components to try to encourage large modules~(\citealp{reinharz_towards_2012}). This is our first benchmarked criterion $f_{1A}$.
Conversely, we could also try to maximize the number of components $\|x\|$ in the module.
Then, we \textcolor{red}{could also} penalize a module insertion by the logarithm of the number of nucleotides involved in the looped zone (sum of the $k_{x,i}$) to avoid very long unpaired zones.
We introduce such a penalty in criterion $f_{1B}$. We also define two more criteria, $f_{1C}$ which uses only the score returned by JAR3D or BayesPairing, and $f_{1D}$ which includes all the presented terms.
Let $X$ be the set of all our decision variables, then the different objective functions to maximize are:
\begin{equation}f_{1A}(X) = \sum_{x} \sum_{i=1}^{\|x\|} k_{x,i}^2 \times C^x_1\label{eq:A}\end{equation}
\begin{equation}f_{1B}(X) = \sum_{x} \left[ \frac{\|x\|}{\log_2(\sum_{i=1}^{\|x\|}k_{x,i})} \times C^x_1 \right] \label{eq:B}\end{equation}
\begin{equation}f_{1C}(X) = \sum_{x} p(x) \times C^x_1 \label{eq:C}\end{equation}
\begin{equation}f_{1D}(X) = \sum_{x} \left[ \frac{\|x\|}{\log_2(\sum_{i=1}^{\|x\|}k_{x,i})} \times p(x) \times C^x_1 \right]\label{eq:D}\end{equation} \\
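As an illustration with made-up values (not taken from our datasets), consider a hypothetical internal loop $x$ with $\|x\| = 2$ components of $k_{x,1} = 3$ and $k_{x,2} = 4$ nucleotides, and an assumed score $p(x) = 0.5$. If it is inserted ($C^x_1 = 1$), its contributions to the four criteria are
\[ f_{1A}: \; 3^2 + 4^2 = 25, \qquad f_{1B}: \; \frac{2}{\log_2(3+4)} \approx 0.71, \]
\[ f_{1C}: \; 0.5, \qquad f_{1D}: \; \frac{2}{\log_2(3+4)} \times 0.5 \approx 0.36. \]
By comparison, a hypothetical hairpin loop with a single component of 7 nucleotides would contribute $7^2 = 49$ to $f_{1A}$ but only $1/\log_2(7) \approx 0.36$ to $f_{1B}$: $f_{1A}$ rewards long components, whereas $f_{1B}$ rewards modules with more components for the same number of nucleotides.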
Regarding the second objective, aimed at maximizing the expected accuracy of the structures, we use $f_2(X) = \sum y^u_v \times p_{uv} \times I[p_{uv}>\theta]$. As first proposed by~(\citealp{sato_ipknot:_2011}), $f_2$ uses a parameter $\theta = 0.001$ to ignore very unlikely base-pairs. This prevents the explosion of the number of variables and allows a fast resolution of the IP problem.
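For instance, with purely illustrative probabilities, suppose that three candidate base-pairs have $p_{1,10} = 0.9$, $p_{2,9} = 0.4$ and $p_{3,8} = 0.0005$. With $\theta = 0.001$, the indicator $I[p_{3,8}>\theta]$ is zero, so this pair cannot contribute to $f_2$ and the corresponding variable can be omitted. A structure containing the first two pairs then scores
\[ f_2 = 0.9 \times 1 + 0.4 \times 1 = 1.3. \]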
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Optimization step}
We use a simple dichotomic search algorithm (presented in Figure \ref{fig:findP}) to find the Pareto set of the bi-objective problem. On the first pass of the dichotomy, it iteratively solves a mono-objective problem with a constraint on the second objective, requiring it to be in an interval $[\lambda_{min}, \lambda_{max}]$. For example, if we decide to maximize objective 1, every time a new non-dominated solution is found, $\lambda_{min}$ is set just above the new solution's objective 2 value. We then search and find another solution with a worse objective 1 value but a higher objective 2 value than previously found solutions. The second pass of the dichotomy searches below newly found solutions. This second pass is needed to find solutions that are superposed to Pareto-optimal ones, which matters when the criterion used to rank inserted modules is not able to separate them very well; many solutions then get the same $f_1$ score. This algorithm is implemented in C++ using the Concert Technology interface of the CPLEX solver (\href{https://www.ibm.com/analytics/optimization-modeling-interfaces}{ILOG CPLEX Optimizer 12.8}).
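To illustrate the search on a toy case with made-up objective values, assume that only three structures $a$, $b$ and $c$ are feasible, with
\[ (f_1, f_2)(a) = (3, 10), \qquad (f_1, f_2)(b) = (2, 12), \qquad (f_1, f_2)(c) = (2, 9). \]
Neither $a$ nor $b$ dominates the other, since $a$ is better on $f_1$ and $b$ is better on $f_2$, while both dominate $c$. The Pareto set is therefore $\{a, b\}$. Maximizing $f_1$ alone returns $a$; searching with $f_2 \geq 10 + \epsilon$ then returns $b$; finally, searching with $f_2 \leq 10$ and $a$ forbidden returns $c$, which is discarded because $a$ dominates it.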
\begin{figure}[!tbp]
\begin{algorithm}[H]
F:= $\emptyset$\;
\tcp{find the extrema of the Pareto front:}
L1:= maximize($f_1$, $-\infty$, $+\infty$, F)\;
L2:= maximize($f_2$, $-\infty$, $+\infty$, F)\;
\tcp{Add L1 to the results:}
\textit{R} := $\{$L1$\}$\;
\tcp{search on top of L1:}
search\_between($f_2(\text{L1}) + \epsilon$, $f_2(\text{L2})$)\;
\tcp{search if solutions superposed to L1 exist:}
search\_between($-\infty$, $f_2(\text{L1})$)\;
\Return{R}\;
\caption{FindParetoSet()}
\end{algorithm}
\begin{algorithm}[H]
$s$:= maximize($f_1$, $\lambda_{min}$, $\lambda_{max}$, F)\;
\If{$s \neq \emptyset$}{
F:= F $\cup \{s\}$\;
\If{$\nexists x \in R$ \textcolor{red}{such that} $x>s$}{
\tcp{solution is undominated, add it to \textit{R}}
\textit{R} := \textit{R} $\cup \{s\}$\;
\While{$\exists x \in R$ \textcolor{red}{such that} $s>x$}{
\tcp{remove dominated solutions}
\textit{R} := \textit{R}$\setminus \{x\}$\;
}
\tcp{search on top of $s$}
search\_between($f_2(s) + \epsilon$, $\lambda_{max}$)\;
\If{$\lambda_{max} - \lambda_{min} > \epsilon$}{
\tcp{search if another solution superposed to $s$ exists}
search\_between($\lambda_{min}$, $f_2(s)$)\;
}
}
}
\caption{search\_between($\lambda_{min}$, $\lambda_{max}$)}
\end{algorithm}
\caption{The dichotomic search algorithm to find the Pareto set. F is the set of already-found structures, which grows over time and which we forbid the solver to find again. R is the set of Pareto-optimal solutions. L1 and L2 are the best solutions to the mono-objective problems regarding $f_1$ and $f_2$. $\mathbf{maximize}$($f$, $\lambda_{min}$, $\lambda_{max}$, F) is a procedure that \textcolor{red}{maximizes} the function $f$ (mono-objective IP problem) under the constraint that the other one has to be in interval $[\lambda_{min}$, $\lambda_{max}]$, and with the solutions in F forbidden. The inequality sign $a>b$ between two solutions denotes that solution $a$ dominates solution $b$.}\label{fig:findP}
\end{figure}
\end{methods}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Results}\label{sec:results}
\begin{figure*}[!tbp]
\includegraphics[width=\textwidth]{Benchmark_unconstrained.jpg}
\caption{\textcolor{red}{First, second and third quartiles of the max MCC for each RNA, for all tools and BiORSEO variants.}
(A) Results of the methods that cannot find pseudoknots on the RNA-Strand dataset: RNAsubopt, RNA-MoIP, and the 14 variants of BiORSEO's bi-objective methods with a constraint that explicitly forbids pseudoknots.
(B) Results of the methods which allow pseudoknot predictions on the RNA-Strand dataset: Biokop, and the 14 BiORSEO variants without the no-pseudoknot constraint.
(C) Results of every method on the Pseudobase dataset.
\textcolor{red}{Due to the combinatorial issues discussed in Section \ref{sec:mat}, we counted only RNAs that could be predicted with every method: 287/344 in (A), 281/344 in (B), and 236/264 in (C).}}
\label{fig:benchmark}
\end{figure*}
All the method \textcolor{red}{variants} introduced return an ensemble of possible secondary structures \textcolor{red}{(the Pareto set)} for a given input sequence.
We compare them in a benchmark. \textcolor{red}{A small case study on three well known RNAs can be found in Supplementary Section D.}
\subsection{Benchmark protocol} \label{sec:bench}
\paragraph{Benchmark data sources} \label{sec:data}
A first dataset of RNA secondary structures was extracted from the RNA-Strand database~(\citealp{andronescu2008rna}). We selected the RNAs for which experimental proof of the structure exists, with size varying between 10 and 100 nucleotides. Sequences containing modified nucleotides were discarded. The resulting set contains 344 secondary structures of various RNA families, 74 of them containing pseudoknots. We ran the experiments twice: first, explicitly forbidding the formation of pseudoknots with additional constraints (for fair comparison with RNA-MoIP);
second, without this limitation, to reach maximum performance. In addition, to explicitly assess the performance on pseudoknotted RNAs, methods were tested on a second collection of 264 pseudoknotted-only RNAs from the Pseudobase database~(\citealp{van2000pseudobase}), covering all pseudoknot families, and of the same length range.
\paragraph{Reference comparison methods}
To study the usefulness of the module data sources, objective functions, and module placement methods, we added state-of-the-art tools to the comparison.
The same RNA sequences were submitted to RNAsubopt + RNA-MoIP for direct performance comparison.
\textcolor{red}{In 'one by one' mode, RNA-MoIP inserts modules in every solution from RNAsubopt one by one, and we select the best solution ourselves like we do with BiORSEO (see the following paragraph \textit{Metrics}). In 'chunk' mode, RNA-MoIP takes all the RNAsubopt predictions at once and returns only one solution, supposed to be the best in the input set according to its own objective function.}
We used RNAsubopt as a reference method without pseudoknot support, because it is fast, widely used, easy to understand and returns several solutions \textcolor{red}{that are in an energy range from the MFE (default parameter used)}.
We used Biokop, \textcolor{red}{another} bi-objective ILP framework, as a reference method for prediction of secondary structures with pseudoknots.
Both tools, \textcolor{red}{RNAsubopt and Biokop, outperform} other state-of-the-art tools in their respective categories \textcolor{red}{(see (\citealp{lorenz2011viennarna}) and (\citealp{legendre_bi-objective_2018})} for more benchmarks against other tools\textcolor{red}{)}.
\paragraph{Metrics} ~ We compute the Matthews correlation coefficient (MCC) between the real secondary structure and every proposed structure.
The coefficient is defined as:
{\small
\begin{equation}
MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}. \label{eq:MCC}
\end{equation}
}\noindent
where TP, TN, FP and FN are the number of true/false positive and negative predicted base pairs.
\textcolor{red}{The choice of MCC over accuracy or F1 score is justified by the large difference between the size of the classes: in any secondary structure, there are many more pairs of nucleotides that do not interact than pairs that do.
Here, we ask whether the ``true'' structure (which is only a possible state that has been observed and reported in a database) can be found in the Pareto set.
Therefore, we chose to keep the maximum MCC value found over the set of proposed structures as a metric of the method's performance, showing whether a solution is close to this ``true'' structure. This is, to our knowledge, the best way to assess whether the reference structure is part of our ensemble of solutions.} For comprehensiveness, results with average MCC are also provided in Supplementary Section E.
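As a worked example with illustrative counts (not taken from our benchmark), a prediction with $TP = 8$, $FP = 2$, $FN = 2$ and $TN = 88$ gives
\[ MCC = \frac{8 \times 88 - 2 \times 2}{\sqrt{(8+2)(8+2)(88+2)(88+2)}} = \frac{700}{900} \approx 0.78, \]
whereas the accuracy of the same prediction is $(8 + 88)/100 = 0.96$, a value inflated by the large number of true negatives.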
\subsection{Benchmark results}
Performance results in terms of \textcolor{red}{maximum} MCC are summarized in Figure \ref{fig:benchmark}. No data source or objective function, taken alone, performs significantly better than the others.
Without pseudoknots, RNAsubopt outperforms the other tools (see Figure \ref{fig:benchmark}(A)). The second best-performing model is \textcolor{red}{BiORSEO, using Rna3Dmotif modules, placed by regular expression searches only, and scored with module size ($f_{1A}$). } With pseudoknot support, most of the RNAs are predicted with small pseudoknots, since the method allows them. As Figure \ref{fig:benchmark}(B) shows, the methods \textcolor{red}{are still all very close in terms of performance, including Biokop} which performs \textcolor{red}{as well} without module information. \textcolor{red}{The best BiORSEO variant is the same, and it outperforms Biokop for 60 RNAs, which is only 17\% of the cases. }
The results on the dataset of pseudoknotted-only structures are presented in Figure~\ref{fig:benchmark}(C).
\textcolor{red}{Again}, the 14 variants are very close and no module source, pattern-matching method or objective function stands out. \textcolor{red}{But this time, Biokop performs better on average.}
\paragraph{\textcolor{red}{Number of solutions}} ~ \textcolor{red}{As shown in Figure~\ref{fig:nsol}}, \textcolor{red}{all the BiORSEO models} return very small sets of unique secondary structures, some of them being one optimal solution sometimes for example when using JAR3D. Meanwhile, RNAsubopt returns from one to \textcolor{red}{tens} of solutions (with \textcolor{red}{default energy gap} settings)\textcolor{red}{, and Biokop even more. The RNA-MoIP goal is conserved here with our approach: identify a secondary structure among the MEA suboptimals (or a few, but rarely more than 5) based on its compatibility with known modules. Note that the selection of the max MCC solution in the results set naturally favors RNAsubopt and Biokop in our benchmark since they return way more solutions.}
\begin{figure}[!tbp]
\includegraphics[width=\linewidth]{Nsol.jpg}
\caption{\textcolor{red}{Number of unique secondary structures in the Pareto set or returned ensemble. Data is from the RNA-Strand dataset. Pseudoknots are allowed. Note that this is not the size of the Pareto set: one unique secondary structure can often be found several times in the Pareto set, with different module combinations inserted. The default RNA-MoIP usage, labelled ``chunk'' here, always returns only one solution, which it selects itself.}}
\label{fig:nsol}
\end{figure}
\begin{figure}[!tbp]
\includegraphics[width=\linewidth]{Nmotifs.jpg}
\caption{\textcolor{red}{(A) Maximum number of modules inserted in a solution of the set. (B) Ratio between the number of modules inserted in the solution closest to the true structure (i.e. the max MCC solution) and the maximum number of modules inserted in a solution. Only RNAs for which the Pareto set contains more than one solution are counted (the count is given under the distribution). Data is from the RNA-Strand dataset. Pseudoknots are allowed.}}
\label{fig:info}
\end{figure}
\begin{figure*}[!tbp]
\includegraphics[width=\linewidth]{kernels_A.jpg}
\caption{\textcolor{red}{Position of the best solution in the Pareto set for 4 variants using objective $f_{1A}$, after normalization on both axes. Computed on the RNA-Strand dataset with pseudoknots allowed. RNAs for which the Pareto set is a single optimal solution, or a set of solutions with the same secondary structure but different modules inserted into it, are ignored. The final number of considered RNAs is given on the figure.}}
\label{fig:pareto}
\end{figure*}
\textcolor{red}{
\subsection{Usefulness of the two objectives}
To further study the usefulness of the RNA module criteria, we looked at the maximum number of inserted modules over the Pareto solutions proposed for each RNA (Figure~\ref{fig:info}(A)), and at the ratio between the number of inserted modules in the best prediction in the Pareto set (the max MCC solution) and this maximum number (Figure~\ref{fig:info}(B)).
When this ratio is 1, the best solution is the one with the most inserted modules (or one of the solutions with the most modules). The results show that, in the large majority of cases, this ratio is 1.
In particular, the variants which use direct pattern matching can insert more modules on average (up to 12), and this ratio is always 1. This is encouraging, showing that the best solutions do contain many inserted modules.\\
Then, we display a solution space plot where we report the position of the best solution in the Pareto set after a normalization step, to directly observe where the best solutions are located in the bi-objective plane formed by MEA and the RNA module criterion. Variants which use $f_{1A}$ are presented in Figure \ref{fig:pareto}; plots for the other criteria are available in Supplementary Section F and are close to what happens with $f_{1A}$. We only consider Pareto sets that contain more than one secondary structure.
First, we observe that the more Pareto sets we eliminate for this reason, the more the criteria are correlated, since correlated criteria result in a single optimal solution. For example, we can say that $f_{1A}$ when used with JAR3D is correlated to MEA because in 312 cases out of 344 a single optimal secondary structure is found (rightmost plot in Figure~\ref{fig:pareto}). The eight variants which use BayesPairing are the least correlated with MEA, resulting in well-spread positions of the best solution across the solution space.
But the most interesting variant is Rna3Dmotifs + direct pattern matching + $f_{1A}$, which performs the best. When several secondary structures are proposed, the best one is more often on the module criterion side than on the MEA side (leftmost plot in Figure~\ref{fig:pareto}; most of the best solutions have an abscissa of 1.0 or close). Therefore, in this variant's case, when the criteria are not perfectly correlated, $f_{1A}$ is able to select good solutions which are not the best on MEA.
}
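As a reminder of the notion underlying this analysis, and assuming that both criteria are to be maximized, a solution $a$ dominates a solution $b$ when
\[
f_1(a) \geq f_1(b) \quad \mbox{and} \quad f_2(a) \geq f_2(b), \quad \mbox{with at least one strict inequality,}
\]
where $f_1$ stands for the module criterion and $f_2$ is a notation we assume here for the MEA criterion. As a purely hypothetical example (the values are ours, not benchmark results), take three candidate structures with $(f_1, f_2)$ equal to $a = (3, 0.90)$, $b = (5, 0.80)$ and $c = (2, 0.85)$. Solution $a$ dominates $c$, while $a$ and $b$ are incomparable, so the Pareto set is $\{a, b\}$: $b$ lies on the module side of the plane and $a$ on the MEA side.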
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Discussion}
\paragraph{Comparison to RNA-MoIP} ~ An interesting point is the improvement between RNA-MoIP and our bi-objective variant which uses direct pattern matching to spot insertion sites and $f_{1A}$ to score the insertions. This variant differs from RNA-MoIP only in that it is bi-objective.
Therefore, the Pareto approach by itself improves the structure prediction. This result supports our hypothesis that RNA-MoIP breaks important base pairs.
\paragraph{\textcolor{red}{About pseudoknot predictions}} ~
The support of pseudoknots \textcolor{red}{and the lower number of returned solutions are the truly interesting features} of BiORSEO. \textcolor{red}{It returns several} solutions, some with pseudoknots, and some without. As we are looking at the max MCC here, the appropriate solution is selected for each RNA. \textcolor{red}{Fortunately, the number of solutions returned is always smaller than with the state-of-the-art tools. This makes the use of the maximum MCC acceptable as a metric.}
However, pseudoknot prediction quality is still difficult to assess with a metric like MCC, because a pseudoknot may involve only a few base pairs. Finding them or not does not alter the MCC much, even if the structure is much more right or wrong from a biological point of view. Unfortunately, no automated verification method exists yet to our knowledge. \textcolor{red}{We illustrate this issue with the G riboswitch pseudoknot example in Supplementary Sections D2 and D3. The pseudoknot is sometimes found, but not with the exact same list of base pairs, which is penalized by the MCC.}
\paragraph{\textcolor{red}{About RNA modules}}
\textcolor{red}{
On the one hand, the state of the art~(\citealp{reinharz_towards_2012, theis2015rna}), the number of inserted modules in the best solutions (Figure \ref{fig:info}), and the position of the best solutions in the Pareto sets (Figure \ref{fig:pareto}) argue that criteria related to known modules are relevant. They should bring information from data to assist the theoretical model. On the other hand, Biokop ultimately performs as well as or better than BiORSEO, while Biokop does not use modules. The simplest explanation is that the MFE criterion it uses is important, and that we lost information when we replaced it with a module criterion in BiORSEO.
}
\paragraph{On the objective functions} ~
Regarding the objective functions used to include modules, the different criteria proposed seem, at first sight, to give comparable results in terms of average performance and dispersion. However, an important difference between $f_{1A}$, $f_{1B}$ on one side, and $f_{1C}$, $f_{1D}$ on the other side, lies in the computation time. As $f_{1A}$ and $f_{1B}$ do not use a score to rank potential module insertion sites, all modules of the same size are equally likely to be inserted. When the RNA presents several loops, the combinatorial possibilities grow fast with the number of modules in the dataset. Therefore, the number of undominated solutions can reach several hundreds or thousands even for short sequences. Such large Pareto sets are not informative for our application, because they consist \textcolor{red}{of} very redundant secondary structures with different module references, which count as a single secondary structure solution in the end. On the other hand, $f_{1C}$ and $f_{1D}$ require running an additional tool (JAR3D or BayesPairing) to score the insertion sites. Given an RNA, a compromise must be found according to its length and number of loops.
\paragraph{The bias with JAR3D} ~ One should keep in mind that JAR3D takes as input the sequences of RNA loops to score modules against them. We detect the loops in the RNA sequence with RNAsubopt. This use of JAR3D is biased, since we score modules on sequence portions that we already know are likely to form loops and unlikely to form stems, so the information brought by a module insertion is low. \textcolor{red}{This is why we find it correlated to MEA.}
\paragraph{About BayesPairing} ~ Recall that we skipped BayesPairing's last step, which checks whether the insertion of a module would deteriorate the energy too much. We did so because our bi-objective framework was supposed to be able to do it itself. The low performance of BayesPairing in the benchmark is not an argument against it when used for its intended purpose.
\paragraph{Computation times and material limits} \label{sec:mat}~
As mentioned in Figure \ref{fig:benchmark}, not all the methods were able to fold all the sequences. The missing ones often come from a combination of direct pattern matching or JAR3D with $f_{1A}$ or $f_{1B}$. \textcolor{red}{On the other hand, direct pattern matching with $f_{1B}$ and in particular $f_{1A}$ are also the fastest methods. Computation time examples are provided in the case study in Supplementary Section D.}
The RAM, not the time, typically limits the size of the RNAs the methods can process. RNAs up to 230 bases are fine \textcolor{red}{on our workstation with 16GB of RAM}. A prediction typically takes a few seconds, sometimes minutes. The time required grows with the nucleotide count, the number of similar loops, and the number of similar modules insertable on those similar loops. The objective functions $f_{1A}$ and $f_{1B}$, which are not sequence-dependent, were sometimes not discriminative enough and equally ranked a large number of combinatorial ordering variations of the same modules on the same loops. For that reason, we had to arbitrarily stop the jobs exceeding 500 structures in the Pareto set, because they would require several hours to complete, leading to incomplete dataset predictions.
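To give an order of magnitude for this combinatorial effect, here is a deliberately simplified count (our own illustration, assuming that each loop receives exactly one module and that the choices made on different loops are independent). If an RNA has $\ell$ loops and each loop accepts $m$ modules of identical size, then a size-based score cannot tell apart the
\[
m^{\ell}
\]
resulting module combinations. With the hypothetical values $m = 5$ and $\ell = 4$, this already gives $5^4 = 625$ equally ranked combinations for a single secondary structure, which is above the limit of 500 structures mentioned above.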
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\vspace{-0.5cm}
\section{Conclusion}
We developed a general bi-objective method to benchmark different sources of RNA module models (the RNA 3D Motif Atlas and Rna3Dmotifs), different methods to place them in sequences (direct pattern matching, BayesPairing, and JAR3D), and different scoring functions. The bi-objective method uses \textcolor{red}{a theoretical criterion,} the expected accuracy of the structure, and \textcolor{red}{a data-based criterion, one of} the previous scoring functions, to select relevant secondary structures.
Our models \textcolor{red}{outperform} RNA-MoIP, a previous attempt to predict better secondary structures using module information and a linear combination of two objectives into a scoring function. Our simplest best-performing new method \textcolor{red}{uses Rna3Dmotifs placed in sequences by regular expression searches and scored by module size. It} could be interpreted as an upgraded RNA-MoIP with a real bi-objective framework, which predicts the base pairs and the module insertions simultaneously, preventing insertions from breaking important base pairs. \textcolor{red}{However, no module model stands out from the others. Regarding module detection methods, JAR3D seems biased and results in correlated criteria. The use of direct pattern matching is simple and efficient but leads to combinatorial issues on some RNAs, because we cannot score the insertion sites so that one beats the other candidates. Even if no variant of the method has a better predictive performance than Biokop on average (which performs the same or better using MFE but no module information), the module criterion improves the prediction in 17\% of the cases.} \textcolor{red}{In future work, we should therefore develop a tri-objective optimization program including MFE as another criterion.}
Prospects for improvement rely on the hope that newer databases such as CaRNAval~(\citealp{reinharz2018mining}), which contain more recent and more diverse module models \textcolor{red}{including real pseudoknotted interaction networks}, will bring more information to assist the energy criteria.
\begin{thebibliography}{25}
\bibitem[Andronescu {\em et~al.}(2008)Andronescu, Bereg, Hoos, and
Condon]{andronescu2008rna}
Andronescu, M., Bereg, V., Hoos, H.~H., and Condon, A. (2008).
\newblock Rna strand: the {RNA} secondary structure and statistical analysis
database.
\newblock {\em BMC bioinformatics\/}, {\bf 9}(1), 340.
\bibitem[Bellaousov and Mathews(2010)Bellaousov and
Mathews]{bellaousov2010probknot}
Bellaousov, S. and Mathews, D.~H. (2010).
\newblock Probknot: fast prediction of {RNA} secondary structure including
pseudoknots.
\newblock {\em RNA\/}, {\bf 16}(10), 1870--1880.
\bibitem[Chojnowski {\em et~al.}(2014)Chojnowski, Wale{\'n}, and
Bujnicki]{chojnowski2014rna}
Chojnowski, G., Wale{\'n}, T., and Bujnicki, J.~M. (2014).
\newblock Rna bricks-a database of {RNA} 3d motifs and their interactions.
\newblock {\em Nucleic acids research\/}, {\bf 42}(D1), D123--D131.
\bibitem[Cruz and Westhof(2011)Cruz and Westhof]{cruz2011sequence}
Cruz, J.~A. and Westhof, E. (2011).
\newblock Sequence-based identification of 3d structural modules in {RNA} with
rmdetect.
\newblock {\em Nature methods\/}, {\bf 8}(6), 513.
\bibitem[Dirks and Pierce(2004)Dirks and
Pierce]{dirksAlgorithmComputingNucleic2004}
Dirks, R.~M. and Pierce, N.~A. (2004).
\newblock An algorithm for computing nucleic acid base-pairing probabilities
including pseudoknots.
\newblock {\em Journal of Computational Chemistry\/}, {\bf 25}(10), 1295--1304.
\bibitem[Djelloul and Denise(2008)Djelloul and Denise]{djelloul_automated_2008}
Djelloul, M. and Denise, A. (2008).
\newblock Automated motif extraction and classification in {RNA} tertiary
structures.
\newblock {\em RNA\/}, {\bf 14}(12), 2489--2497.
\bibitem[Ge {\em et~al.}(2018)Ge, Islam, Zhong and Zhang]{ge2018novo}
Ge, P., Islam, S., Zhong, C., and Zhang, S. (2018).
\newblock De novo discovery of structural motifs in RNA 3D structures through clustering.
\newblock {\em Nucleic acids research\/}, {\bf 46}(9), 4783--4793.
\bibitem[Gendron {\em et~al.}(2001)Gendron, Lemieux, and
Major]{gendron2001quantitative}
Gendron, P., Lemieux, S., and Major, F. (2001).
\newblock Quantitative analysis of nucleic acid three-dimensional structures.
\newblock {\em Journal of molecular biology\/}, {\bf 308}(5), 919--936.
\bibitem[Legendre {\em et~al.}(2018)Legendre, Angel, and
Tahi]{legendre_bi-objective_2018}
Legendre, A., Angel, E., and Tahi, F. (2018).
\newblock Bi-objective integer programming for {RNA} secondary structure
prediction with pseudoknots.
\newblock {\em BMC Bioinformatics\/}, {\bf 19}(1), 13.
\bibitem[Leontis and Westhof(2001)Leontis and Westhof]{leontis2001geometric}
Leontis, N.~B. and Westhof, E. (2001).
\newblock Geometric nomenclature and classification of {RNA} base pairs.
\newblock {\em RNA\/}, {\bf 7}(4), 499--512.
\bibitem[Lorenz {\em et~al.}(2011b)Lorenz, Bernhart, Zu~Siederdissen, Tafer,
Flamm, Stadler, and Hofacker]{lorenz2011viennarna}
Lorenz, R., Bernhart, S.~H., Zu~Siederdissen, C.~H., Tafer, H., Flamm, C.,
Stadler, P.~F., and Hofacker, I.~L. (2011b).
\newblock ViennaRNA Package 2.0.
\newblock {\em Algorithms for Molecular Biology\/}, {\bf 6}(1), 26.
\bibitem[Lu {\em et~al.}(2015)Lu, Bussemaker, and Olson]{lu_dssr:_2015}
Lu, X.-J., Bussemaker, H.~J., and Olson, W.~K. (2015).
\newblock {DSSR}: an integrated software tool for dissecting the spatial
structure of {RNA}.
\newblock {\em Nucleic Acids Research\/}, {\bf 43}(21), e142--e142.
\bibitem[Mathews(2004)Mathews]{mathews2004using}
Mathews, D.~H. (2004).
\newblock Using an {RNA} secondary structure partition function to determine
confidence in base pairs predicted by free energy minimization.
\newblock {\em RNA\/}, {\bf 10}(8), 1178--1190.
\bibitem[McCaskill(1990)McCaskill]{mccaskill1990equilibrium}
McCaskill, J.~S. (1990).
\newblock The equilibrium partition function and base pair binding
probabilities for {RNA} secondary structure.
\newblock {\em Biopolymers: Original Research on Biomolecules\/}, {\bf
29}(6-7), 1105--1119.
\bibitem[Parisien and Major(2008)Parisien and Major]{parisien2008mc}
Parisien, M. and Major, F. (2008).
\newblock The mc-fold and mc-sym pipeline infers {RNA} structure from sequence
data.
\newblock {\em Nature\/}, {\bf 452}(7183), 51.
\bibitem[Petrov {\em et~al.}(2013)Petrov, Zirbel, and
Leontis]{petrov_automated_2013}
Petrov, A.~I., Zirbel, C.~L., and Leontis, N.~B. (2013).
\newblock Automated classification of {RNA} 3d motifs and the {RNA} 3d {Motif}
{Atlas}.
\newblock {\em RNA\/}, {\bf 19}(10), 1327--1340.
\bibitem[Reinharz {\em et~al.}(2012)Reinharz, Major, and
Waldisp{\"u}hl]{reinharz_towards_2012}
Reinharz, V., Major, F., and Waldisp{\"u}hl, J. (2012).
\newblock Towards 3d structure prediction of large {RNA} molecules: an integer
programming framework to insert local 3d motifs in {RNA} secondary structure.
\newblock {\em Bioinformatics\/}, {\bf 28}(12), i207--i214.
\bibitem[Reinharz {\em et~al.}(2018)Reinharz, Soul{\'e}, Westhof,
Waldisp{\"u}hl, and Denise]{reinharz2018mining}
Reinharz, V., Soul{\'e}, A., Westhof, E., Waldisp{\"u}hl, J., and Denise, A.
(2018).
\newblock Mining for recurrent long-range interactions in {RNA} structures
reveals embedded hierarchies in network families.
\newblock {\em Nucleic Acids Research\/}, {\bf 46}(8), 3841--3851.
\bibitem[Sarrazin-Gendron {\em et~al.}(2019)Sarrazin-Gendron, Reinharz, Oliver,
Moitessier, and Waldisp{\"u}hl]{sarrazin2019automated}
Sarrazin-Gendron, R., Reinharz, V., Oliver, C.~G., Moitessier, N., and
Waldisp{\"u}hl, J. (2019).
\newblock Automated, customizable and efficient identification of 3d base pair
modules with bayespairing.
\newblock {\em Nucleic acids research\/}.
\bibitem[Sarver {\em et~al.}(2008)Sarver, Zirbel, Stombaugh, Mokdad, and
Leontis]{sarver_fr3d:_2008}
Sarver, M., Zirbel, C.~L., Stombaugh, J., Mokdad, A., and Leontis, N.~B.
(2008).
\newblock {FR}3d: finding local and composite recurrent structural motifs in
{RNA} 3d structures.
\newblock {\em Journal of Mathematical Biology\/}, {\bf 56}(1), 215--252.
\bibitem[Sato {\em et~al.}(2011)Sato, Kato, Hamada, Akutsu, and
Asai]{sato_ipknot:_2011}
Sato, K., Kato, Y., Hamada, M., Akutsu, T., and Asai, K. (2011).
\newblock {IPknot}: fast and accurate prediction of {RNA} secondary structures
with pseudoknots using integer programming.
\newblock {\em Bioinformatics\/}, {\bf 27}(13), i85--i93.
\bibitem[Schlick(2018)Schlick]{schlick2018adventures}
Schlick, T. (2018).
\newblock Adventures with {RNA} graphs.
\newblock {\em Methods\/}, {\bf 143}, 16--33.
\bibitem[Theis {\em et~al.}(2013)Theis, Zu Siederdissen, Hofacker and Gorodkin]{theis2013automated}
Theis, C., Zu Siederdissen, C., Hofacker, I. L., and Gorodkin, J. (2013).
\newblock Automated identification of RNA 3D modules with discriminative power in RNA structural alignments.
\newblock {\em Nucleic acids research\/}, {\bf 41}(22), 9999--10009.
\bibitem[Theis {\em et~al.}(2015)Theis, Zirbel, Zu Siederdissen, Anthon, Hofacker, Nielsen and Gorodkin]{theis2015rna}
Theis, C., Zirbel, C. L., Zu Siederdissen, C. H., Anthon, C., Hofacker, I. L., Nielsen, H., and Gorodkin, J. (2015).
\newblock RNA 3D modules in genome-wide predictions of RNA 2D structure.
\newblock {\em PloS One}, {\bf 10}(10), e0139900.
\bibitem[Van~Batenburg {\em et~al.}(2000)Van~Batenburg, Gultyaev, Pleij, Ng,
and Oliehoek]{van2000pseudobase}
Van~Batenburg, F., Gultyaev, A.~P., Pleij, C., Ng, J., and Oliehoek, J. (2000).
\newblock Pseudobase: a database with rna pseudoknots.
\newblock {\em Nucleic Acids Research\/}, {\bf 28}(1), 201--204.
\bibitem[Zirbel {\em et~al.}(2015)Zirbel, Roll, Sweeney, Petrov, Pirrung, and
Leontis]{zirbel_identifying_2015}
Zirbel, C.~L., Roll, J., Sweeney, B.~A., Petrov, A.~I., Pirrung, M., and
Leontis, N.~B. (2015).
\newblock Identifying novel sequence variants of {RNA} 3d motifs.
\newblock {\em Nucleic Acids Research\/}, {\bf 43}(15), 7504--7520.
\end{thebibliography}
\end{document}
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{stmaryrd} % llbracket, rrbracket
\usepackage{siunitx} % SI units
\usepackage{geometry}
\usepackage{charter} % betterfont
\geometry{top=1.5cm,bottom=1.5cm, left=2cm, right=2cm}
\begin{document}
\appendix
\section{Secondary structure elements of an RNA}
RNA secondary structure elements can be classified into stems and loops. Stems are stacks of canonical interactions. They consist of double-stranded regions which stabilize the molecule. \\
Loops are unpaired portions (unpaired in terms of \textit{canonical} pairings) which are often more flexible, and can interact with distant portions of the molecule (forming canonical or non-canonical pseudoknots) or with another molecule, providing a function to the RNA.
\begin{center}\includegraphics[width=0.5\linewidth]{fig/RNA_SSE.png}\end{center}
\textit{\scriptsize Figure extracted from Dr. Xiang-Jun Lu's 3DNA website (\texttt{https://x3dna.org/articles/exterior-loop-in-rna-secondary-structure}), June 2019}\\
Loops can be further classified according to their number of strands, sometimes called ``components'' in this article. A \textit{Hairpin Loop} (HL) is composed of only one strand joining the two strands of a single stem. An \textit{Internal Loop} (IL) is composed of two strands linking two stems together. Note that a particular case of IL, called the \textit{bulge}, has one strand of length zero. Loops with more than 2 stems are called \textit{Multibranch Loops} (ML), and are sometimes referred to as $k$-way junctions for $k$ stems.
The different strands of the loops which link stems together do not necessarily belong to the same RNA molecule, so the same vocabulary can be used to describe RNA complexes. For example, if one cuts the top Hairpin Loop in this figure, the bottom Internal Loop remains an Internal Loop, but involves two different RNA molecules. There is no difference from a structural point of view.
\begin{center}
\includegraphics[width=0.8\linewidth]{fig/pseudoknots.png}
\end{center}
Pseudoknot families describe how pseudoknots can be formed. The HHH type happens when 2 hairpin loops form a new stem together (squared in blue), forming a third hairpin loop. The H type is a single strand forming a new stem with an HL, forming a new HL. If an IL or ML is formed instead of an HL, because of a stem+loop already present, we call it an HLin (Hairpin Loop inside the loop). If a strand forms a new stem not with an HL, but with an IL, forming a new HL, we call it an HLout (Hairpin Loop pointing out). If an IL or ML is formed instead of an HL, it is called an LL pseudoknot.
\newpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{RNA-MoIP can break important base-pairs}
We justify our Pareto-based bi-objective approach by the fact that a linear combination of two objectives (what RNA-MoIP does) has several drawbacks:
\begin{itemize}
\item It can miss solutions when the Pareto set is not perfectly convex,
\item It requires weights on every term of the sum. If these weights are not finely tuned, the compromise between including modules and not breaking important basepairs is missed. We believe this is the case and illustrate this with the following plots:
\end{itemize}
\begin{figure}[h]
\includegraphics[width=\linewidth]{fig/MOIP_subopt.jpg}
\end{figure}
Here we compare the prediction performance of:
\begin{itemize}
\item RNAsubopt (which predicts a list of sub-optimal secondary structures without pseudoknots, which are then fed into RNA-MoIP),
\item RNA-MoIP ``chunk'', the default running mode of RNA-MoIP, which selects only one solution from the list of sub-optimals based on a linear combination of two objectives: the possibility to insert modules with $f_{1A}$ on one side, and not breaking too many base pairs from the input structure on the other side,
\item RNA-MoIP ``one by one'', a different use of RNA-MoIP where we give it every sub-optimal solution one by one and let it modify each of them to insert modules. Then we manually select the best solution according to max MCC (as we do for RNAsubopt).
\end{itemize}
First, in the left panel (a), the max MCC found across the set of solutions is reported for each of the 344 RNAs of the RNA-Strand dataset. RNAs are sorted by RNA-MoIP ``chunk'' performance. We can see that the two RNA-MoIP series do not differ much, which means RNA-MoIP efficiently selects the best solution in the set after the solutions have been modified to insert modules. On the other hand, we can see that one of the input RNAsubopt solutions was often better than the solutions transformed by RNA-MoIP: it breaks important base pairs in the inputs.\\
The right panel (b) quantifies the differences. The performance difference between the two RNA-MoIP usages is almost always zero, meaning RNA-MoIP is good at selecting its best solution in the set. But when compared to RNAsubopt, the performance decreases more often than it increases.
We think that ``breaking'' base pairs in an input is not the best approach, so we chose to build the solutions taking the two criteria into account simultaneously in a bi-objective optimisation program.
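
The first drawback listed above can be illustrated with a small hypothetical example (the numbers are ours and only serve the argument). Consider two objectives $f_1$ and $f_2$ to maximize and three solutions with $(f_1, f_2)$ equal to $A = (1, 0)$, $B = (0.4, 0.4)$ and $C = (0, 1)$. None of them dominates another, so all three belong to the Pareto set. A linear combination with weight $\lambda \in [0,1]$ scores them as
\[
\lambda f_1 + (1-\lambda) f_2 : \qquad A \mapsto \lambda, \qquad B \mapsto 0.4, \qquad C \mapsto 1 - \lambda .
\]
Since $\max(\lambda, 1-\lambda) \geq 0.5 > 0.4$ for every $\lambda$, the compromise solution $B$ is never selected, whatever the weights.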
\newpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Linear constraints to model RNA Structures in an integer linear program}
We present here the linear constraints we used to model our problem and solve it with a regular integer linear programming solver. We wrote the constraints ourselves, but they are inspired by works such as (Sato \textit{et al.}, 2011), (Reinharz \textit{et al.}, 2012) and (Legendre \textit{et al.}, 2018).
\paragraph{Extended notations} ~ \\
Let $n$ be the number of nucleotides in the query RNA sequence $s$.\\
Let $M$ be the set of modules that could be inserted in $s$.\\
Let $x$ be a module of $M$, $\|x\|$ be the number of distinct components of $x$, and $p(x)$ the associated score of insertion given by JAR3D or BayesPairing for that motif inserted at a particular position.\\
Let $P_{x,i}$ be the position in $s$ where we can insert the $i$th component of module $x$.\\
As the same module model can be inserted several times in $s$, several different $x$ modules in $M$ may refer to the same theoretical module, but inserted at different positions.\\
Let $k_{x,i}$ be the size in nucleotides of that $i$th component of $x$.\\
Let $y^u_v$ be the \textbf{decision boolean variable} indicating that $s[u]$ and $s[v]$ form a canonical base pairing. According to the standard loop model, we always have $v > u + 3$.\\
Let $C^x_i$ be the \textbf{decision boolean variable} indicating that we do insert the $i$th component of module $x$ at position $P_{x,i}$.
Note that a base pair $y^u_v$ is possible if and only if $v>u+3$, and that we do not need to use two variables $y^u_v$ and $y^v_u$ for the same pair.
Then, we have $\sum_{i=4}^n (n-i)$ decision variables ($\approx \frac{1}{2}n^2$ decision variables) of the form $y^u_v$.
Regarding the $C^x_i$, if we insert on average $\nu$ motifs per RNA sequence, each motif having on average $\mu$ components, and each component being insertable on average at $\pi$ different positions in $s$,
then we need to add, on average, $\nu \times \mu \times \pi$ decision variables $C^x_i$.
We therefore expect around $\frac{1}{2}n^2+\nu \mu \pi$ decision variables.
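
As a quick check of this order of magnitude (a side computation of ours, before any filtering of the variables), the sum giving the number of $y^u_v$ variables has the closed form
\[
\sum_{i=4}^{n} (n-i) = \sum_{j=0}^{n-4} j = \frac{(n-3)(n-4)}{2},
\]
which gives $\frac{97 \times 96}{2} = 4656$ variables $y^u_v$ for $n = 100$, to be compared with $\frac{1}{2}n^2 = 5000$.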
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Constraint to ensure that each nucleotide forms at most one canonical pairing} ~
\begin{equation} \label{constraint:1}
\sum_{v<u} y^v_u + \sum_{v>u} y^u_v \leq 1 \qquad\qquad \forall u \in \llbracket 1,n \rrbracket
\end{equation}
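For instance, with a hypothetical sequence of $n = 10$ nucleotides and $u = 5$, the only partners $v$ satisfying $v > u + 3$ or $u > v + 3$ are $v = 1$, $v = 9$ and $v = 10$, so constraint (\ref{constraint:1}) reduces to
\[
y^1_5 + y^5_9 + y^5_{10} \leq 1 .
\]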
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Constraints to forbid lonely base pairs} ~
% \begin{equation} \label{constraint:2}
% \sum_{v=u}^n y^{u-1}_v - \sum_{v=u+1}^n y^u_v + \sum_{v=u+2}^n y^{u+1}_v \geq 0 \qquad \qquad \forall u \in \llbracket 1,n\rrbracket
% \end{equation}
% \begin{equation} \label{constraint:3}
% \sum_{u=1}^{v-2} y^u_{v-1} - \sum_{u=1}^{v-1} y^u_v + \sum_{u=1}^{v} y^u_{v+1} \geq 0 \qquad \qquad \forall v \in \llbracket 1,n\rrbracket
% \end{equation}
% These conditions ensure that if a base pair exists with $s[i]$,
% one of the adjacent bases is paired too.
% Equation \ref{constraint:2} is useful if $s[u]$ is paired with $s[v>u]$ (a nucleotide later in the sequence),
% and equation \ref{constraint:3} if $s[v]$ is paired with $s[u<v]$ (a nucleotide earlier in the sequence).
\begin{equation} \label{constraint:2}
y^{u-1}_{v+1} - y^u_v + y^{u+1}_{v-1} \geq 0 \qquad \qquad \forall (u,v) \in \{ (u,v) \in \llbracket 1,n\rrbracket^2 \; | \; u + 3 <v \}
\end{equation}
A base pair should be accompanied by one of its neighbours, forming a structure stabilized by stacking energies. In theory, this might add up to \( \frac{1}{2}n^2\) constraints, but in practice, this number is very reasonable as
the only decision variables kept are those with probability above a threshold $\theta$.
Then, this condition sets to zero the ``lonely decision variables'' which have no allowed neighbouring base pair variable.
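
As an example with arbitrary positions, for $(u,v) = (5,20)$ constraint (\ref{constraint:2}) reads
\[
y^4_{21} - y^5_{20} + y^6_{19} \geq 0 ,
\]
so $y^5_{20} = 1$ is only feasible if the pair $(4,21)$ or the pair $(6,19)$ is also formed. If neither neighbouring variable was kept after the probability filtering (we assume here that a discarded variable counts as 0), the constraint becomes $-y^5_{20} \geq 0$ and forces $y^5_{20} = 0$.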
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \paragraph{Constraint to forbid pairings inside a module component} ~
% \begin{equation} \label{constraint:4}
% (k_{x,i}-2) \; C^x_i + \sum_{u=P_{x,i}+1}^{P_{x,i}+k_{x,i}-2}\left[ \sum_{v>u} y^u_v + \sum_{v<u} y^v_u \right] \leq (k_{x,i} - 2)
% \qquad \qquad \forall x \in M, i \in \llbracket 1,\|x\| \rrbracket
% \end{equation}
% If $C^x_i$ is set to 1, then the sum has to be zero. Obviously, this constraint prevents the program to correctly detect pseudoknots of HHH (kissing hairpins) and LL types (kissing higher-order loops), which is a limit of the approach.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Constraints to forbid components to overlap} ~
\begin{equation} \label{constraint:5}
\sum_{x \in M} \sum_{i=1}^{\|x\|} C^x_i \times I(P_{x,i}<u<P_{x,i}+k_{x,i}-1) \leq 1 \qquad \qquad \forall u \in \llbracket 1,n \rrbracket
\end{equation}
$I(P_{x,i}<u<P_{x,i}+k_{x,i}-1)$ is an indicator equal to 1 if the condition holds and 0 otherwise. Thus, any nucleotide $u$ can be part of at most one module component.
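
To make the indicator concrete, take two hypothetical components: the first component of a module $x$ with $P_{x,1} = 10$ and $k_{x,1} = 6$, and the first component of a module $x'$ with $P_{x',1} = 13$ and $k_{x',1} = 5$. The indicator is 1 for $u \in \{11, \dots, 14\}$ in the first case and for $u \in \{14, 15, 16\}$ in the second one, so for $u = 14$ constraint (\ref{constraint:5}) reads
\[
C^x_1 + C^{x'}_1 \leq 1 ,
\]
and the two components cannot be inserted together. Note that, because both inequalities in the condition are strict, the first and last nucleotides of a component are not counted.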
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Constraints to respect the structure of large motifs ($\{ x\in M \; | \; \|x\| \geq 2\}$)} ~
These constraints ensure that either none or all of the components of a motif are inserted.
\begin{equation}\label{constraint:6}
\sum_{i=2}^{\|x\|} C^x_i = (\|x\| - 1) \times C^{x}_{1} \qquad \qquad \forall x \in \{ x\in M \; | \; \|x\| \geq 2\}
\end{equation}
And then, we force the base pairs between the end of a component and the beginning of the next one, so that the module $x$ has all its closing basepairs:
\begin{equation}\label{constraint:7}
C^x_1 \leq y^{P_{x,1}}_{P_{x,\|x\|}+k_{x,\|x\|}-1} \qquad \qquad \forall x \in \{ x\in M \; | \; \|x\| \geq 2\}
\end{equation}
\begin{equation}\label{constraint:8}
C^x_j \leq y^{P_{x,j}+k_{x,j}-1}_{P_{x,j+1}} \qquad \qquad \forall x \in \{ x\in M \; | \; \|x\| \geq 2\}, \forall j \in \llbracket 1,\|x\|-1 \rrbracket
\end{equation}
Constraint (\ref{constraint:7}) pairs the first nucleotide of the first component with the last nucleotide of the last component.
Constraint (\ref{constraint:8}) pairs the last nucleotide of component $j$ with the first nucleotide of component $j+1$.
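As an illustration with made-up positions, consider a two-component module $x$ (for example an internal loop) with $P_{x,1}=5$, $k_{x,1}=3$ and $P_{x,2}=20$, $k_{x,2}=4$, i.e. components covering nucleotides 5 to 7 and 20 to 23. Constraints (\ref{constraint:7}) and (\ref{constraint:8}) then read
\[
C^x_1 \leq y^{5}_{23} \qquad \text{and} \qquad C^x_1 \leq y^{7}_{20} ,
\]
so inserting the module requires both closing basepairs $(5,23)$ and $(7,20)$.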
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Optional constraints to forbid pseudoknots} ~
\begin{equation}\label{constraint:9}
y^u_v + y^k_l \leq 1 \qquad \qquad \forall u,v,k,l \text{ such that } 1\leq u<k<v<l\leq n
\end{equation}
To limit the number of constraints added, we only define this condition for allowed basepairs ($u + 3 <v$, $k + 3 <l$, $p_{uv} > \theta$, $p_{kl} > \theta$).
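For example, with illustrative positions, the basepairs $(2,10)$ and $(6,15)$ satisfy $2<6<10<15$ and therefore cross, so constraint (\ref{constraint:9}) gives
\[
y^{2}_{10} + y^{6}_{15} \leq 1 .
\]
In contrast, the nested pairs $(2,15)$ and $(6,10)$ do not match the pattern $u<k<v<l$ under either ordering, so no constraint is generated and both can be selected together.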
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Constraints to forbid previously found solutions} ~
Several solutions may share the same values for both objectives, and we want to retrieve all of them, so we cannot simply forbid the algorithm from exploring the same region of the objective space twice.
Instead, we explicitly forbid every solution that has already been found.\\
We do so by iteratively adding, for every structure $s^*$ found, the following condition:
\begin{equation}\label{constraint:10}
\sum_{y^u_v \in \{ y^u_v | y^u_v = 1 \text{ in } s^* \}} (1 - y^u_v) + \sum_{y^u_v \in \{ y^u_v | y^u_v = 0 \text{ in } s^* \}} y^u_v +
\sum_{C^x_i \in \{ C^x_i | C^x_i = 1 \text{ in } s^* \}} (1 - C^x_i) + \sum_{C^x_i \in \{ C^x_i |C^x_i = 0 \text{ in } s^* \}} C^x_i \geq 1
\end{equation}
It ensures that at least one decision variable takes a value different from the one it has in $s^*$.
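To make this concrete, consider a toy problem (an assumption made for illustration only) with just three decision variables, $y^1_8$, $y^2_7$ and $C^a_1$, and a solution $s^*$ in which $y^1_8=1$, $y^2_7=0$ and $C^a_1=1$. Constraint (\ref{constraint:10}) becomes
\[
(1 - y^1_8) + y^2_7 + (1 - C^a_1) \geq 1 .
\]
The left-hand side counts the variables whose value differs from $s^*$ (a Hamming distance): it is 0 only for $s^*$ itself, which becomes infeasible, while every other assignment remains allowed.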
\newpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Case study}
To complement the large benchmark, we take a closer look at well-known structures to check whether some methods are able to predict them correctly. We used a Gln tRNA from \textit{E. coli} (RNA-Strand code PDB\_00376), a Guanine riboswitch (RNA-Strand code PDB\_01023), and the pseudoknot of the human telomerase (PDB\_00857). The tRNA is unpseudoknotted, the G riboswitch contains a hard-to-predict HHH type pseudoknot, and the telomerase pseudoknot is a simple H type pseudoknot.
%\begin{table}[h]
%\footnotesize{
% \begin{tabular*}{\textwidth}{l@{\extracolsep{\fill}}lllllll}
% \hline
% & RNAsubopt & RNA-MoIP & BiokoP & \multicolumn{4}{c}{BiORSEO}\\
% & & & & Rna3Dmotif & Rna3Dmotif & 3D Motif Atlas & 3D Motif Atlas\\
% & & & & + Direct P.M. & + BayesPairing & + JAR3D & + BayesPairing \\
% \hline
% tRNA Gln & 0.68 & 0.67 & 0.67 & 0.64 (A,B) & 0.64 (B,C,D), 0.60 (A) & 0.64 (B,C,D), 0.63 (B) & 0.64 %(\textit{all}) \\
% G riboswitch & 0.86 & 0.84, 0.68 & 0.76 & 0.72 (A), 0.15(B) & 0.39 (C,D), 0.15 (A,B) & 0.28 %(\textit{all}) & 0.63 (C,D), 0.57 (A), 0.14 (B)\\
% Telomerase PK & 0.77 & 0.77, 0.7 & 1.0 & 1.0 & 1.0 (B,C,D), 0.66 (A) & 0.97 (\textit{all}), & 1.0 %(\textit{all})\\
% \hline
% \end{tabular*}
% }
% \end{table}
%The table just above reports the value of the max MCC value found across the Pareto set. Pseudoknots %are allowed. The best structure is often the same across the different objective functions $f_{1A}, %f_{1B}, f_{1C}, f_{1D}$, but the rest of the sets can still differ in number of solutions and diversity. %Detailed results including structures, number of solutions and computation times are provided below.
The results are consistent with the general benchmark: BiORSEO variants perform slightly worse than Biokop or RNAsubopt.
The tRNA is an example of a structure in which all methods that support pseudoknots predict spurious ones in the loops. On the other hand, the telomerase pseudoknot is correctly predicted by both BiORSEO and Biokop, which both support pseudoknots.
Detailed results are given below for each RNA, along with the number of unique solutions and the computation times. Note that these are small RNAs, which yield few solutions and short computation times. The times reported are wall-clock ("real") times; since several steps are multi-threaded, a 4-thread CPU is needed to reproduce them. When several tools are required, the times are split by tool (for example, BayesPairing + BiORSEO, or RNAsubopt + JAR3D + BiORSEO). They also depend heavily on I/O delays: for methods that read modules from disk in particular, a fast storage device (e.g. an NVMe SSD) helps.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{E. coli's Gln tRNA}
\paragraph{Referenced "true" structure in RNA-Strand (PDB\_00376)} ~
\texttt{GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGAGGUCGAGGUUCGAAUCCUCGUACCCCAGCCA}
\texttt{((((((..(((.........)))((((((((...))))))))...(((((.......))))))))))).....}
\paragraph{Best prediction results} ~
{\scriptsize
\begin{tabular}{rlccl}
Method & Best secondary structure & max MCC & N solutions & time (s)\\
\hline
Reference & \texttt{((((((..(((.........)))((((((((...))))))))...(((((.......))))))))))).....} & & & \\
RNAsubopt & \texttt{(((((((.(((....)))..(((.(((((.......)))))..)))((((.......))))))))))).....} & 0.68 & 4 & 0.0\\
Biokop & \texttt{[[[[[[((((...))))...(((.((((([[[....)))))....(((((...]]].)))))]]]]]].))).} & 0.67 & 30 & 634.6\\
RNA-MoIP (1by1) & \texttt{((((((..((......))...((.(((((.......)))))..))..((.........))..)))))).....} & 0.67 & 4 & 0.0+9.7\\
RNA-MoIP (chunk) & \texttt{((((((..((......))...((.(((((.......)))))..))..((.........))..)))))).....} & 0.67 & 1 & 0.0+7.8\\
DESC-Direct P.M.-A & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 1 & 9.1\\
DESC-Direct P.M.-B & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 3 & 30.1\\
DESC-BPairing-A & \texttt{((((((((((...)))....((..((((([[[.[[.))))).{{))((((]].]]].)))))))))))..}}.} & 0.60 & 1 & 103.0+8.9\\
DESC-BPairing-B & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 12 & 103.0+18.6\\
DESC-BPairing-C & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 4 & 103.0+11.1\\
DESC-BPairing-D & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 4 & 103.0+10.6\\
BGSU-BPairing-A & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 5 & 110.9+10.9\\
BGSU-BPairing-B & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 7 & 110.9+10.4\\
BGSU-BPairing-C & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 2 & 110.9+9.5\\
BGSU-BPairing-D & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 2 & 110.9+9.3\\
BGSU-Jar3d-A & \texttt{(((((((((((....)))..[[[.((((([[[.[[.))))).)).(((((]].]]].))))))))))).]]].} & 0.63 & 1 & 0.0+1.9+10.3\\
BGSU-Jar3d-B & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 1 & 0.0+1.9+9.8\\
BGSU-Jar3d-C & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 2 & 0.0+1.9+10.4\\
BGSU-Jar3d-D & \texttt{((((((((((...))))...[[[.((((([[[.[[.)))))....(((((]].]]].))))))))))).]]].} & 0.64 & 2 & 0.0+1.9+10.5\\
\end{tabular}}
\paragraph{Notes} ~
Note that both BiORSEO and BiokoP insert a false-positive pseudoknot. Our recommended method, Rna3Dmotifs + Direct Pattern-matching + $f_{1A}$, illustrates the difference in the number of solutions: BiORSEO returns a single solution in 9.1s, while the other BiORSEO variants generally return more solutions and take longer, without doing better.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{G Riboswitch}
\paragraph{Referenced "true" structure in RNA-Strand (PDB\_01023)} ~
\texttt{GGACAUACAAUCGCGUGGAUAUGGCACGCAAGUUUCUGCCGGGCACCGUAAAUGUCCGACUAUGUCCA}
\texttt{(((((((...(((((((.[[..[[)))))))........((((((]]...]]))))))..))))))).}
\paragraph{Best prediction results} ~
{\scriptsize
\begin{tabular}{rlccl}
Method & Best secondary structure & max MCC & N solutions & time (s)\\
\hline
Reference &\texttt{(((((((...(((((((.[[..[[)))))))........((((((]]...]]))))))..))))))).} & & & \\
RNAsubopt &\texttt{(((((((.....(((((.......)))))..........((((((.......))))))..))))))).} & 0.86 & 3 & 0.0 \\
Biokop &\texttt{(((((((.[[(([[[[[))(((((]]]]]..((]]..))[[[[[[)))))..]]]]]]..))))))).} & 0.76 & 14 & 330.9\\
RNA-MoIP (1by1) &\texttt{(((((((.....(((((.......)))))..........(((((.........)))))..))))))).} & 0.84 & 3& 0.0+4.6\\
RNA-MoIP (chunk) &\texttt{(((((((........((.......)).....((....)).((((.........))))...))))))).} & 0.68 & 1& 0.0+3.3\\
DESC-Direct P.M.-A &\texttt{((((((((....(((((.....[[)))))..((....))[[[[[[..))...]]]]]].]])))))).} & 0.72 & 1 & 6.5\\
DESC-Direct P.M.-B &\texttt{((((((......(((([[[[[[[[.))))..((((.[[[.))))...]]].))))))..]]]]]]]].} & 0.15 & 7 & 8.4\\
DESC-BPairing-A &\texttt{((((((((.((.(((([[[[[[[[.))))..))((.[[[.))]]]..))..))))))..]]]]]]]].} & 0.15 & 2 & 91.2+5.6\\
DESC-BPairing-B &\texttt{((((((((.((.(((([[[[[[[[.))))..))((.[[[.))]]]..))..))))))..]]]]]]]].} & 0.15 & 2 & 91.2+5.8\\
DESC-BPairing-C &\texttt{(((([[((....((....{{..[[.[[))..]]...((([[[)))..))....))]]].]]]]}})).} & 0.39 & 7 & 91.2+8.6\\
DESC-BPairing-D &\texttt{(((([[((....((....{{..[[.[[))..]]...((([[[)))..))....))]]].]]]]}})).} & 0.39 & 7 & 91.2+8.5\\
BGSU-BPairing-A &\texttt{((..(((((((..((((.[[..[[))))[[.)))..]].[[[[[[..))...]]]]]].]]))]])).} & 0.57 & 1 & 102.3+6.6\\
BGSU-BPairing-B &\texttt{(((((((((((.(((([[[[[[[[.))))..)))..(((.[[)))]]))..))))))..]]]]]]]].} & 0.14 & 2 & 102.3+6.2\\
BGSU-BPairing-C &\texttt{((..((((....(((((.[[..[[)))))..((....))[[[[[[..))...]]]]]].]]))]])).} & 0.63 & 3 & 102.3+14.0\\
BGSU-BPairing-D &\texttt{((..((((....(((((.[[..[[)))))..((....))[[[[[[..))...]]]]]].]]))]])).} & 0.63 & 4 & 102.3+7.3\\
BGSU-Jar3d-A &\texttt{(((..((.....((((([[[[[[[)))))..((....))(((.[[)))))..]])))..]]]]]]]..} & 0.28 & 5 & 0.0+1.5+25.6\\
BGSU-Jar3d-B &\texttt{(((..((.....((((([[[[[[[)))))..((....))(((.[[)))))..]])))..]]]]]]]..} & 0.28 & 5 & 0.0+1.5+30.6\\
BGSU-Jar3d-C &\texttt{(((..((.....((((([[[[[[[)))))..((....))(((.[[)))))..]])))..]]]]]]]..} & 0.28 & 6 & 0.0+1.5+7.4\\
BGSU-Jar3d-D &\texttt{(((..((.....((((([[[[[[[)))))..((....))(((.[[)))))..]])))..]]]]]]]..} & 0.28 & 6 & 0.0+1.5+7.3\\
\end{tabular}}
\paragraph{Notes} ~
This is a good example showing that the MCC does not reflect whether pseudoknots are correctly predicted. The reference structure contains a small HHH-type pseudoknot between the two main hairpin loops, with an additional stem that itself contains an internal loop. Biokop finds it. The BiORSEO variants do not: most of them find other H-type pseudoknots, which are all wrong from an RNA function point of view. Their MCC scores nevertheless vary widely, because the basepair lists can share a varying number of positives with the reference even when these are not located in the same stems. For example, Rna3Dmotifs + Direct Pattern-matching + $f_{1A}$ at least identifies the two hairpin loops, so its score (0.72) is much higher than those of the other variants, but its pseudoknot is still wrong.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Human telomerase's RNA pseudoknot}
\paragraph{Referenced "true" structure in RNA-Strand (PDB\_00857)} ~
\texttt{GGGCUGUUUUUCUCGCUGACUUUCAGCCCCAAACAAAAAAGUCAGCA}
\texttt{[[[[[[........(((((((((]]]]]]........))))))))).}
\paragraph{Best prediction results} ~
{\scriptsize
\begin{tabular}{rlccl}
Method & Best secondary structure & max MCC & N solutions & time (s)\\
\hline
Reference & \texttt{[[[[[[........(((((((((]]]]]]........))))))))).} & & & \\
RNAsubopt & \texttt{..............(((((((((..............))))))))).} & 0.77 & 3 & 0.0 \\
Biokop & \texttt{[[[[[[........(((((((((]]]]]]........))))))))).} & 1.00 & 1 & 2.4\\
RNA-MoIP (1by1) & \texttt{..............(((((((((..............))))))))).} & 0.77 & 3 & 0.0 + 2.1\\
RNA-MoIP (chunk)& \texttt{((..........))(((((((((..............))))))))).} & 0.70 & 1 & 0.0 + 1.0\\
DESC-Direct P.M.-A & \texttt{[[[[[[........(((((((((]]]]]]........))))))))).} & 1.00 & 1 & 0.8\\
DESC-Direct P.M.-B & \texttt{[[[[[[........(((((((((]]]]]]........))))))))).} & 1.00 & 1 & 0.9\\
DESC-BPairing-A & \texttt{[[[[[[........(((((((((]]].]]]......)).))))))).} & 0.66 & 1 & 63.9+0.7\\
DESC-BPairing-B & \texttt{[[[[[[........(((((((((]]]]]]........))))))))).} & 1.00 & 2 & 63.9+0.8\\
DESC-BPairing-C & \texttt{[[[[[[........(((((((((]]]]]]........))))))))).} & 1.00 & 2 & 63.9+0.7\\
DESC-BPairing-D & \texttt{[[[[[[........(((((((((]]]]]]........))))))))).} & 1.00 & 2 & 63.9+0.7\\
BGSU-BPairing-A & \texttt{[[[[[[........(((((((((]]]]]]........))))))))).} & 1.00 & 2 & 73.1+0.7\\
BGSU-BPairing-B & \texttt{[[[[[[........(((((((((]]]]]]........))))))))).} & 1.00 & 2 & 73.1+0.7\\
BGSU-BPairing-C & \texttt{[[[[[[........(((((((((]]]]]]........))))))))).} & 1.00 & 2 & 73.1+0.7\\
BGSU-BPairing-D & \texttt{[[[[[[........(((((((((]]]]]]........))))))))).} & 1.00 & 2 & 73.1+0.7\\
BGSU-Jar3d-A & \texttt{[[[[[[........((((((((.]]]]]].........)))))))).} & 0.97 & 1 & 0.0+1.7+0.7\\
BGSU-Jar3d-B & \texttt{[[[[[[........((((((((.]]]]]].........)))))))).} & 0.97 & 1 & 0.0+1.7+0.8\\
BGSU-Jar3d-C & \texttt{[[[[[[........((((((((.]]]]]].........)))))))).} & 0.97 & 1 & 0.0+1.7+0.7\\
BGSU-Jar3d-D & \texttt{[[[[[[........((((((((.]]]]]].........)))))))).} & 0.97 & 1 & 0.0+1.7+0.7\\
\end{tabular}}
\paragraph{Notes} ~
The methods which support pseudoknots are able to predict it correctly.
\newpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Average MCC of the method's variants}
Instead of looking at the max MCC to see if the reference structure has been found in the Pareto set, one can look at the average MCC over the Pareto set.
We provide such results to satisfy the reader's curiosity, but this average is hard to interpret.
The Pareto set is supposed to propose several solutions that could correspond to several meta-stable states, but there is no reason for these states to be close to one another, nor to the "true" structure that has been observed and saved in the database.
A possible interpretation is the average distance between the meta-stable states and the "true" structure, but only if we assume that the predictions are correct.
It also gives users an idea of how close to a real conformation they would be if they picked a structure at random from the Pareto set.
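To illustrate the difference with a hypothetical Pareto set (the values are made up for the example), suppose it contains three structures whose MCC values with the reference are $1.0$, $0.5$ and $0.3$. Then
\[
\max(1.0, 0.5, 0.3) = 1.0 \qquad \text{while} \qquad \frac{1.0 + 0.5 + 0.3}{3} = 0.6 ,
\]
so the reference structure is found exactly (max MCC of 1), yet a structure picked at random from the set is, on average, noticeably further from it.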
\vspace{0.5cm}
\includegraphics[width=0.95\textwidth]{fig/Benchmark_avg.jpg}
\vspace{0.5cm}
Average MCC obtained by the different methods considered in our benchmark. (A) is the RNA-Strand dataset for methods which do not support pseudoknots (computations succeeded for all methods for 287 RNAs), (B) is the same dataset with pseudoknot support (281 RNAs), and (C) is the Pseudobase dataset, with 236 RNAs.
\newpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Positions of the best solutions in Pareto sets}
Here we report the position of the best solution (the one with the highest MCC against the true structure) in the normalized Pareto set. Normalization consists in dividing each objective value of a solution by the maximum value observed in the Pareto set on that axis. Thus, on each axis, 0 corresponds to an objective value of 0, and 1 to the maximum value observed.
The first line is identical to Figure 6. Direct P.M.-$f_{1A}$, Direct P.M.-$f_{1B}$, Jar3d-$f_{1A}$ and Jar3d-$f_{1B}$ are the variants that sometimes result in combinatorial issues.
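As a worked example with made-up values, suppose a Pareto set contains three solutions with objective values $(2, 10)$, $(4, 8)$ and $(5, 3)$, and that the second one has the highest MCC. The maxima observed are $5$ on the first axis and $10$ on the second, so the best solution is reported at
\[
\left( \frac{4}{5}, \frac{8}{10} \right) = (0.8, 0.8)
\]
in the normalized objective space.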
\begin{figure}[h!]
\includegraphics[width=\textwidth]{kernels_A.jpg}
\includegraphics[width=\textwidth]{fig/kernels_B.png}
\includegraphics[width=\textwidth]{fig/kernels_C.png}
\includegraphics[width=\textwidth]{fig/kernels_D.png}
\end{figure}
\end{document}
\ No newline at end of file
from math import sqrt, ceil
import numpy as np
import matplotlib.pyplot as plt
import re
import seaborn as sns
import pandas as pd
import matplotlib.pylab as plt
# For each RNA, retrieve the best MEA value and compare this energy value with the ones obtained with
# RNAeval and RNAfold from the ViennaRNA Package 2.0 (Ronny Lorenz et al., 2011).
# Once those values are collected, the script creates a figure.
def get_result_MEA(filename):
    ext = "json_pmE"
    file2 = open( "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/results/" + filename + ext, "r")
    # The first two lines are the header (name and sequence)
    name = file2.readline()
    rna = file2.readline()
    twod = file2.readline()
    # The score is the last whitespace-separated token of the structure line; it is negated
    pred = re.findall(r'\S+', twod)
    score = '-' + pred[len(pred)-1]
    min = float(score)
    contacts = file2.readline()
    # Scan the remaining structures and keep the lowest score
    while twod:
        twod = file2.readline()
        pred = re.findall(r'\S+', twod)
        if len(pred) > 0:
            score = '-' + pred[len(pred) - 1]
            if float(score) < min:
                min = float(score)
        contacts = file2.readline()
    file2.close()
    return min
fileMFE = open( "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/RNAfold_bm.log", "r")
lineRna = fileMFE.readline()
lineStruct = fileMFE.readline()
fileEval = open( "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/RNAeval_bm.log", "r")
lineRna2 = fileEval.readline()
lineStruct2 = fileEval.readline()
file = open("/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/Motifs_version_initiale/benchmark.dbn", "r")
name = file.readline().strip()
rna = file.readline()
twod = file.readline()
contacts = file.readline()
list_name = []
list_score = []
list_type = []
print(np)
while name:
    #print(name)
    # Advance in the RNAfold log until the record of the current sequence is reached
    if lineRna != rna:
        while lineRna != rna:
            lineRna = fileMFE.readline()
            lineStruct = fileMFE.readline()
    # The energy is the 6-character field just before the last two characters of the line
    MFE = float(lineStruct[len(lineStruct)-8:len(lineStruct)-2])
    # Strip the first five characters and the last one from the name
    list_name.append(name[5:len(name)-1])
    list_score.append(MFE)
    list_type.append('MFE')
    #print("MFE:" + str(MFE))
    lineRna = fileMFE.readline()
    lineStruct = fileMFE.readline()
    # Same procedure for the RNAeval log
    if lineRna2 != rna:
        while lineRna2 != rna:
            lineRna2 = fileEval.readline()
            lineStruct2 = fileEval.readline()
    eval = float(lineStruct2[len(lineStruct2)-8:len(lineStruct2)-2])
    list_name.append(name[5:len(name) - 1])
    list_score.append(eval)
    list_type.append('eval')
    #print("Eval:" + str(eval))
    lineRna2 = fileEval.readline()
    lineStruct2 = fileEval.readline()
    # Best MEA value found in the BiORSEO results for this RNA
    best_mea = get_result_MEA(name)
    #print("MEA: " + str(best_mea) + "\n")
    list_name.append(name[5:len(name) - 1])
    list_score.append(best_mea)
    list_type.append('MEA')
    # Read the next 4-line record of the benchmark file
    name = file.readline().strip()
    rna = file.readline()
    twod = file.readline()
    contacts = file.readline()
file.close()
fileMFE.close()
fileEval.close()
'''print(list_MFE)
print(list_MEA)
print(list_eval)'''
#np = [["rna", "type_score", "score"]]
d = {'rna':list_name,'score':list_score, 'type_score':list_type}
df = pd.DataFrame(d, columns=['rna','type_score','score'])
sns.stripplot(x="rna",y="score",data=df,jitter=True,hue='type_score',palette='Set1')
plt.xticks(rotation=90)
plt.savefig("compare_BiORSEOMEA_RNAeval_RNAfold.png")
#include <iostream>
#include <sstream>
#include <fstream>
#include "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/cppsrc/json.hpp"
#include <typeinfo>
#include <set>
#include <algorithm>
#include <cstdio>
#include <vector>
using namespace std;
using json = nlohmann::json;
//Count the number of '&' in the motif sequence
size_t count_delimiter(string& seq) {
size_t count = 0;
for(uint i = 0; i < seq.size(); i++) {
char c = seq.at(i);
if (c == '&') {
count++;
}
}
return count;
}
/*
If there is a '&' in the motif sequence in the field 'sequence' but not in the field 'contacts',
the script puts a '&' in the field 'contacts' at the same position as in the field 'sequence'.
*/
void add_delimiter(const string& jsonfile, const string& jsonoutfile) {
std::ifstream lib(jsonfile);
std::ofstream outfile (jsonoutfile);
json new_motif;
json new_id;
json js = json::parse(lib);
//the list of pfam lists of the motif we want to count the inclusion in other motif
for (auto it = js.begin(); it != js.end(); ++it) {
string id = it.key();
string test;
string sequence;
string contacts;
bool is_change = false;
//cout << "id: " << id << endl;
for (auto it2 = js[id].begin(); it2 != js[id].end(); ++it2) {
test = it2.key();
if (!test.compare("sequence")) {
//cout << "sequence: " << it2.value() << endl;
sequence = it2.value();
new_id[test] = it2.value();
} else if (!test.compare("contacts") ) {
contacts = it2.value();
} else {
new_id[test] = it2.value();
}
}
string tmp = "";
if (count_delimiter(contacts) != count_delimiter(sequence) && contacts.size() == sequence.size()) {
for (uint i = 0; i < sequence.size(); i++) {
if (sequence.at(i) == '&') {
tmp += "&";
} else {
tmp += contacts.at(i);
}
}
} else {
tmp = contacts;
}
new_id["contacts"] = tmp;
new_motif[id] = new_id;
new_id.clear();
}
outfile << new_motif.dump(4) << endl;
outfile.close();
}
int main()
{
string jsonfile = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/motifs_06-06-2021.json";
string out = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/motifs_tmp.json";
add_delimiter(jsonfile, out);
return 0;
}
......@@ -29,7 +29,7 @@ import pickle
# ================== DEFINITION OF THE PATHS ==============================
biorseoDir = path.realpath(".")
jar3dexec = "/home/persalteas/Software/jar3dbin/jar3d_2014-12-11.jar"
jar3dexec = "/local/local/localopt/jar3d_2014-12-11.jar"
bypdir = biorseoDir + "/BayesPairing/bayespairing/src"
byp2dir = biorseoDir + "/BayesPairing2/bayespairing/src"
moipdir = "/home/persalteas/Software/RNAMoIP/Src/RNAMoIP.py"
......@@ -803,7 +803,7 @@ class Method:
else:
results_file = outputDir+f"{'' if self.allow_pk else 'no'}PK/"+basename+f".biorseo_{self.data_source.lower()}_{self.placement_method.lower()}_{self.func}"
c += ["--bayespaircsv", outputDir+basename+f".{self.data_source.lower()}_{self.placement_method.lower()}.csv"]
c += ["-o", results_file, "--func", self.func]
c += ["-o", results_file, "--func", self.func, "--MFE"]
if not self.allow_pk:
c += ["-n"]
self.joblist.append(Job(command=c, priority=4, timeout=3600,
......
......@@ -11,6 +11,12 @@
using namespace std;
using json = nlohmann::json;
/*
This script counts the number of "occurrences" of each motif.
So we consider that if the sequence of pattern A is included in pattern B,
then for each inclusion of B we also have an inclusion of A. And vice versa.
*/
//Return true if the first sequence seq1 is included in the second sequence seq2
//if not return false
int is_contains(string& seq1, string& seq2) {
......@@ -38,6 +44,8 @@ int is_contains(string& seq1, string& seq2) {
//If we find the sequence and structure of pattern A in pattern B, we have to concatenate the pfam lists of A and B,
//remove the duplicates, assign this new list of pfam lists to A, and assign as occurrence to A the size of this list.
//The pattern A is counted only once in every other pattern, i.e. even if the sequence of A is found several times in B,
// it will be added only once in the occurrences of A.
void counting_occurences(const string& jsonfile, const string& jsonoutfile) {
std::ifstream lib(jsonfile);
std::ifstream lib2(jsonfile);
......@@ -73,14 +81,6 @@ void counting_occurences(const string& jsonfile, const string& jsonoutfile) {
if (!test.compare("pfam")) {
vector<vector<string>> tab = it2.value();
list_pfams = tab;
/*set<set<string>>::iterator iit;
set<string>::iterator iit2;
for(iit = list_pfams.begin(); iit != list_pfams.end(); iit++) {
for (iit2 = iit->begin(); iit2 != iit->end(); ++iit2) {
cout << *iit2 << endl;
}
cout << endl << endl;
}*/
} else if (!test.compare("sequence")) {
//cout << "sequence: " << it2.value() << endl;
sequence = it2.value();
......@@ -124,7 +124,6 @@ void counting_occurences(const string& jsonfile, const string& jsonoutfile) {
new_id[test] = it2.value();
}
}
//cout << "-------begin---------" << endl;
for (auto it3 = js2.begin(); it3 != js2.end(); ++it3) {
string id2 = it3.key();
......@@ -142,22 +141,6 @@ void counting_occurences(const string& jsonfile, const string& jsonoutfile) {
if (!test.compare("pfam")) {
vector<vector<string>> tab = it4.value();
list_pfams2 = tab;
/*for (uint k = 0; k < tab2.size(); k++) {
for (uint l = 0; l < tab2[k].size(); l++) {
pfams2.insert(tab2[k][l]);
}
list_pfams2.insert(pfams);
pfams2.clear();
}*/
/*set<set<string>>::iterator iit;
set<string>::iterator iit2;
for(iit = list_pfams.begin(); iit != list_pfams.end(); iit++) {
for (iit2 = iit->begin(); iit2 != iit->end(); ++iit2) {
cout << *iit2 << endl;
}
cout << endl << endl;
}*/
} else if (!test.compare("occurences")) {
occurences2 = it4.value();
//cout << "occurences2: "<< occurences2 << endl;
......@@ -216,7 +199,6 @@ void counting_occurences(const string& jsonfile, const string& jsonoutfile) {
}
}
//cout << "----end----" << endl;
//}
}
if(flag) {
......@@ -242,23 +224,12 @@ void counting_occurences(const string& jsonfile, const string& jsonoutfile) {
//cout << endl;*/
}
/*for(uint ii = 0; ii < list_pfams.size(); ii++) {
for (uint jj = 0; jj < list_pfams[ii].size(); jj++) {
cout << "[" << ii << "][" << jj << "]: " << list_pfams[ii][jj] << endl;
}
}*/
new_id["occurences"] = list_pfams.size();
new_id["pfam"] = list_pfams;
//cout << "-------ending---------" << endl;
new_id["pfam"] = list_pfams;
new_motif[id] = new_id;
new_id.clear();
//cout << "valeur: " << ite << endl;
/*for (uint i = 0; i < tab_struc.size() ; i++) {
cout << "tab_struc[" << i << "]: " << tab_struc[i] << endl << endl;
} */
}
outfile << new_motif.dump(4) << endl;
outfile.close();
......@@ -267,13 +238,11 @@ void counting_occurences(const string& jsonfile, const string& jsonoutfile) {
int main()
{
//183
//cout << "------------------BEGIN-----------------" << endl;
string jsonfile = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/Motifs_version_initiale/motifs_06-06-2021.json";
string out = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/Motifs_derniere_version/motifs_final.json";
string jsonfile = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/motifs_06-06-2021.json";
string out = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/motifs_final.json";
counting_occurences(jsonfile, out);
//cout << "------------------END-----------------" << endl;
return 0;
}
......
#include <iostream>
#include <sstream>
#include <fstream>
#include "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/cppsrc/json.hpp"
#include <typeinfo>
#include <set>
#include <algorithm>
#include <cstdio>
#include <vector>
using namespace std;
using json = nlohmann::json;
/*
Create a .fasta file for each of the sequences in the JSON benchmark file.
Also create a .dbn and a .txt file that list the name, sequence, 2D structure and contacts of every sequence in the benchmark file.
Those files are useful for the Isaure_benchmark.py script.
*/
void create_files(const string& jsonmotifs) {
std::ifstream lib(jsonmotifs);
string fasta = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/fasta/";
string list = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/Motifs_version_initiale/benchmark.txt";
string dbn = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/Motifs_version_initiale/benchmark.dbn";
std::ofstream outlist (list);
std::ofstream outdbn (dbn);
json js = json::parse(lib);
uint count = 0;
for (auto it = js.begin(); it != js.end(); ++it) {
string id = it.key();
string name, seq, contacts, structure;
for (auto it2 = js[id].begin(); it2 != js[id].end(); ++it2) {
string chain = it2.key();
if (chain.compare("pfams") != 0) {
string name = id + "_" + chain;
string filename = fasta + name + ".fa";
std::ofstream outfasta (filename);
outfasta << ">test_" << name << endl;
for (auto it3 = js[id][chain].begin(); it3 != js[id][chain].end(); ++it3) {
string field = it3.key();
if (!field.compare("sequence")) {
seq = it3.value();
outfasta << seq.substr(0,seq.size()) << endl;
outfasta.close();
} else if (!field.compare("contacts")) {
contacts = it3.value();
} else if (!field.compare("struct2d")) {
structure = it3.value();
}
}
if(seq.find('&') == string::npos) {
outlist << ">test_" << name << endl;
outdbn << "test_" << name << "." << endl;
outlist << contacts << endl;
outdbn << seq << endl;
outdbn << structure << endl;
outdbn << contacts << endl;
outlist << seq << endl;
outlist << structure << endl;
count++;
}
}
}
}
cout << count << " sequences in total" << endl;
lib.close();
outlist.close();
outdbn.close();
}
int main()
{
string path = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/";
string jsonbm = path + "modules/ISAURE/benchmark_16-07-2021.json";
create_files(jsonbm);
return 0;
}
#include <iostream>
#include <sstream>
#include <fstream>
#include "/local/local/BiorseoNath/cppsrc/json.hpp"
#include <typeinfo>
#include <set>
#include <algorithm>
#include <cstdio>
#include <vector>
#include <string>
using namespace std;
using json = nlohmann::json;
/*
This script is used to create a new motif library that excludes every motif containing the same PDB as the sequence given as input
for prediction with BiORSEO.
*/
void delete_redundant_pdb(const string& jsonlibrary, const string& name, const string& jsonoutfile) {
std::ifstream lib(jsonlibrary);
std::ofstream outfile (jsonoutfile);
json new_motif;
json new_id;
json js = json::parse(lib);
for (auto it = js.begin(); it != js.end(); ++it) {
string id = it.key();
vector<string> list_pdbs;
bool is_added = true;
for (auto it2 = js[id].begin(); it2 != js[id].end(); ++it2) {
string field = it2.key();
if (!field.compare("pdb")) {
vector<string> tab = it2.value();
list_pdbs = tab;
} else {
new_id[field] = it2.value();
}
}
if (count(list_pdbs.begin(), list_pdbs.end(), name.substr(0, name.size()-2))) {
is_added = false;
}
if (is_added) {
new_id["pdb"] = list_pdbs;
new_motif[id] = new_id;
}
new_id.clear();
}
outfile << new_motif.dump(4) << endl;
outfile.close();
}
int main(int argc, char** argv)
{
string jsonlibrary = "/local/local/BiorseoNath/data/modules/ISAURE/motifs_final.json";
string out = "/local/local/BiorseoNath/data/modules/ISAURE/bibliotheque_a_lire/motifs_final.json";
string name = argv[1];
delete_redundant_pdb(jsonlibrary, name, out);
return 0;
}
......@@ -28,17 +28,18 @@
from math import sqrt
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from matplotlib import cm
import scipy.stats as st
import sys
import os
import subprocess
import getopt
class SecStruct:
def __init__(self, dot_bracket, obj1_value, obj2_value):
self.dbn = dot_bracket
self.objectives = [ obj1_value, obj2_value ]
self.objectives = [obj1_value, obj2_value]
self.basepair_list = self.get_basepairs()
self.length = len(dot_bracket)
......@@ -96,9 +97,9 @@ class SecStruct:
tn = reference_structure.length * (reference_structure.length - 1) * 0.5 - fp - fn - tp
# Compute MCC
if (tp+fp == 0):
if (tp + fp == 0):
print("We have an issue : no positives detected ! (linear structure)")
return (tp*tn-fp*fn) / sqrt((tp+fp)*(tp+fn)*(tn+fp)*(tn+fn))
return (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
class Pareto:
......@@ -106,16 +107,16 @@ class Pareto:
self.predictions = list_of_structs
self.true_structure = reference
self.n_pred = len(list_of_structs)
self.max_obj1 = max([ s.objectives[0] for s in self.predictions ])
self.max_obj2 = max([ s.objectives[1] for s in self.predictions ])
self.max_obj1 = max([s.objectives[0] for s in self.predictions])
self.max_obj2 = max([s.objectives[1] for s in self.predictions])
self.index_of_best = self.find_best_solution()
def find_best_solution(self):
# returns the index of the solution of the Pareto set which is the closest
# to the real 2D structure (the one with the max MCC)
max_i = -1
max_mcc = -1
for i,s in enumerate(self.predictions):
for i, s in enumerate(self.predictions):
mcc = s.get_MCC_with(self.true_structure)
if mcc > max_mcc:
max_mcc = mcc
......@@ -125,15 +126,15 @@ class Pareto:
def get_normalized_coords(self):
# retrieves the objective values of the best solution and normalizes them
coords = self.predictions[self.index_of_best].objectives
if self.max_obj1: # avoid divide by zero if all solutions are 0
x = coords[0]/self.max_obj1
if self.max_obj1: # avoid divide by zero if all solutions are 0
x = coords[0] / self.max_obj1
else:
x = 0.5
if self.max_obj2: # avoid divide by zero if all solutions are 0
y = coords[1]/self.max_obj2
if self.max_obj2: # avoid divide by zero if all solutions are 0
y = coords[1] / self.max_obj2
else:
y = 0.5
return ( x, y )
return (x, y)
class RNA:
......@@ -145,6 +146,8 @@ class RNA:
ignored_nt_dict = {}
def is_canonical_nts(seq):
for c in seq[:-1]:
if c not in "ACGU":
......@@ -155,6 +158,7 @@ def is_canonical_nts(seq):
return False
return True
def is_canonical_bps(struct):
if "()" in struct:
return False
......@@ -203,6 +207,7 @@ def load_from_dbn(file, header_style=3):
db.close()
return container, pkcounter
def parse_biokop(folder, basename, ext=".biok"):
solutions = []
err = 0
......@@ -243,6 +248,7 @@ def parse_biokop(folder, basename, ext=".biok"):
err = 1
return None, err
def parse_biorseo(folder, basename, ext):
solutions = []
err = 0
......@@ -266,6 +272,7 @@ def parse_biorseo(folder, basename, ext):
err = 1
return None, err
def prettify_biorseo(code):
name = ""
if "bgsu" in code:
......@@ -301,8 +308,8 @@ def process_extension(ax, pos, ext, nsolutions=False, xlabel="Best solution perf
print("[%s] Loaded %d solutions in a Pareto set, max(obj1)=%f, max(obj2)=%f" % (rna.basename_, pset.n_pred, pset.max_obj1, pset.max_obj2))
print("Loaded %d points on %d." % (len(points), len(RNAcontainer)-skipped))
x = np.array([ p[0] for p in points ])
y = np.array([ p[1] for p in points ])
x = np.array([p[0] for p in points])
y = np.array([p[1] for p in points])
xmin, xmax = 0, 1
ymin, ymax = 0, 1
xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
......@@ -316,19 +323,21 @@ def process_extension(ax, pos, ext, nsolutions=False, xlabel="Best solution perf
ax[pos].axvline(x=1, alpha=0.2, color='black')
ax[pos].contourf(xx, yy, f, cmap=cm.Blues, alpha=0.5)
ax[pos].scatter(x, y, s=25, alpha=0.1)
ax[pos].set_xlim((-0.1,1.1))
ax[pos].set_ylim((-0.1,1.1))
ax[pos].annotate("("+str(len(points))+'/'+str(len(RNAcontainer)-skipped)+" RNAs)", (0.08,0.15))
ax[pos].set_xlim((-0.1, 1.1))
ax[pos].set_ylim((-0.1, 1.1))
ax[pos].set_title(prettify_biorseo(ext[1:]), fontsize=10)
ax[pos].annotate("(" + str(len(points)) + '/' + str(len(RNAcontainer)-skipped) + " RNAs)", (0.08, 0.15))
ax[pos].set_xlabel(xlabel)
ax[pos].set_ylabel(ylabel)
if nsolutions:
ax[pos+1].hist(sizes, bins=range(0, max(sizes)+1, 2), histtype='bar')
ax[pos+1].set_xlim((0,max(sizes)+2))
ax[pos+1].set_xticks(range(0, max(sizes), 10))
ax[pos+1].set_xticklabels(range(0, max(sizes), 10), rotation=90)
ax[pos+1].set_xlabel("# solutions")
ax[pos+1].set_ylabel("# RNAs")
ax[pos + 1].hist(sizes, bins=range(0, max(sizes) + 1, 2), histtype='bar')
ax[pos + 1].set_xlim((0, max(sizes) + 2))
ax[pos + 1].set_xticks(range(0, max(sizes), 10))
ax[pos + 1].set_xticklabels(range(0, max(sizes), 10), rotation=90)
ax[pos + 1].set_xlabel("# solutions")
ax[pos + 1].set_ylabel("# RNAs")
if __name__ == "__main__":
try:
......
#!/usr/bin/python3
# Created by Louis Becquey, louis.becquey@univ-evry.fr, Oct 2019
# This script processes files containing RNA structures obtained from bi-objective
# optimization programs, and a dot-bracket database of reference structures, to plot
# where the best solutions lie in the Pareto set.
#
# The result files should follow this kind of format:
# for Biokop: (option --biokop)
# Structure Free energy score Expected accuracy score
# (((...(((...)))))) <tab> obj1_value <tab> obj2_value
# (((............))) <tab> obj1_value <tab> obj2_value
# ((((((...)))...))) <tab> obj1_value <tab> obj2_value
# ...
#
# for BiORSEO: (options --biorseo_**stuff**)
# >Header of the sequence
# GGCACAGAGUUAUGUGCC
# (((...(((...)))))) + Motif1 + Motif2 <tab> obj1_value <tab> obj2_value
# (((............))) <tab> obj1_value <tab> obj2_value
# ((((((...)))...))) + Motif1 <tab> obj1_value <tab> obj2_value
#
# typical Biokop usage:
# python3 pareto_visualizer.py --biokop --folder path/to/your/results/folder --database path/to/the/database_file.dbn
# typical Biorseo usage:
# python3 pareto_visualizer_json.py --folder path/to/your/results/folder (pmE and pmF) --database path/to/the/database_file.dbn (name, sequence, structure)
#
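To make the BiORSEO result format described above concrete, here is a minimal sketch of how one structure line could be split. It assumes the fields are tab-separated and that motif names follow the structure, each introduced by ` + `. `parse_result_line` is a hypothetical helper used only for illustration; it is not part of this script.

```python
def parse_result_line(line):
    """Split one result line into (structure, motifs, obj1, obj2)."""
    fields = line.rstrip("\n").split("\t")
    # The first field holds the dot-bracket string, optionally followed by
    # " + MotifName" for every inserted motif.
    parts = [p.strip() for p in fields[0].split("+")]
    structure, motifs = parts[0], parts[1:]
    return structure, motifs, float(fields[1]), float(fields[2])


example = "(((...(((...)))))) + Motif1 + Motif2\t12.5\t3.25\n"
print(parse_result_line(example))
```

The number of inserted motifs for a solution is then simply `len(motifs)`.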
from math import sqrt
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
import scipy.stats as st
import sys
import os
import subprocess
import getopt
class SecStruct:
def __init__(self, name, dot_bracket, contacts, obj1_value, obj2_value):
self.name = name
self.dbn = dot_bracket
self.ctc = contacts
self.objectives = [ obj1_value, obj2_value ]
self.basepair_list = self.get_basepairs()
self.length = len(dot_bracket)
def get_basepairs(self):
parenthesis = []
brackets = []
braces = []
rafters = []
basepairs = []
As = []
Bs = []
for i, c in enumerate(self.dbn):
if c == '(':
parenthesis.append(i)
if c == '[':
brackets.append(i)
if c == '{':
braces.append(i)
if c == '<':
rafters.append(i)
if c == 'A':
As.append(i)
if c == 'B':
Bs.append(i)
if c == '.':
continue
if c == ')':
basepairs.append((i, parenthesis.pop()))
if c == ']':
basepairs.append((i, brackets.pop()))
if c == '}':
basepairs.append((i, braces.pop()))
if c == '>':
basepairs.append((i, rafters.pop()))
if c == 'a':
basepairs.append((i, As.pop()))
if c == 'b':
basepairs.append((i, Bs.pop()))
return basepairs
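The stack-based parsing done by `get_basepairs` can be sketched more compactly with one stack per bracket family. This is an illustrative, standalone variant (the name `dot_bracket_pairs` is hypothetical), assuming the same six bracket families and the same `(closing_index, opening_index)` tuple order as the method above.

```python
def dot_bracket_pairs(dbn):
    """Return (closing_index, opening_index) pairs of a dot-bracket string."""
    closers = {")": "(", "]": "[", "}": "{", ">": "<", "a": "A", "b": "B"}
    stacks = {opener: [] for opener in closers.values()}
    pairs = []
    for i, c in enumerate(dbn):
        if c in stacks:
            # opening symbol: remember its position
            stacks[c].append(i)
        elif c in closers:
            # closing symbol: pair it with the most recent matching opener
            pairs.append((i, stacks[closers[c]].pop()))
    return pairs


print(dot_bracket_pairs("((.[.)).]"))
```

As in the original method, an unbalanced string raises `IndexError` when a closing symbol has no opener left on its stack.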
    def get_MCC_with(self, reference_structure):
        # Get true and false positives and negatives
        tp = 0
        fp = 0
        tn = 0
        fn = 0
        for bp in reference_structure.basepair_list:
            if bp in self.basepair_list:
                tp += 1
            else:
                fn += 1
        for bp in self.basepair_list:
            if bp not in reference_structure.basepair_list:
                fp += 1
        tn = reference_structure.length * (reference_structure.length - 1) * 0.5 - fp - fn - tp
        # Compute MCC, guarding against a zero denominator
        denominator = (tp+fp)*(tp+fn)*(tn+fp)*(tn+fn)
        if (tp+fp == 0):
            print("We have an issue : no positives detected ! (linear structure)")
        if denominator == 0:
            return 0.0
        return (tp*tn-fp*fn) / sqrt(denominator)
    def get_MCC_ctc_with(self, reference_structure):
        # Get true and false positives and negatives
        tp = 0
        fp = 0
        tn = 0
        fn = 0
        prediction = self.ctc
        true_ctc = reference_structure.ctc
        for i in range(len(true_ctc)):
            if true_ctc[i] == '*' and prediction[i] == '*':
                tp += 1
            elif true_ctc[i] == '.' and prediction[i] == '.':
                tn += 1
            elif true_ctc[i] == '.' and prediction[i] == '*':
                fp += 1
            elif true_ctc[i] == '*' and prediction[i] == '.':
                fn += 1
        # Compute MCC, checking the denominator before dividing
        denominator = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
        if denominator == 0:
            if (tp + fp == 0):
                print("We have an issue : no positives detected ! (linear structure)")
            print("warning: division by zero!")
            return None
        return (tp * tn - fp * fn) / sqrt(denominator)
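The two MCC methods above share the same formula, MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). A minimal standalone sketch for contact strings follows. The helper names are hypothetical, and returning 0.0 when the denominator vanishes is an assumption made here (a frequently used convention), not necessarily what this script intends.

```python
from math import sqrt


def contact_counts(truth, pred):
    """Count (tp, tn, fp, fn) between two '*'/'.' contact strings."""
    tp = tn = fp = fn = 0
    for t, p in zip(truth, pred):
        if t == "*" and p == "*":
            tp += 1
        elif t == "." and p == ".":
            tn += 1
        elif t == "." and p == "*":
            fp += 1
        elif t == "*" and p == ".":
            fn += 1
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 (assumed) if the denominator is zero."""
    denominator = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / sqrt(denominator)


print(mcc(*contact_counts("**..", "*.*.")))
```

Checking the denominator before dividing is what keeps the computation from raising `ZeroDivisionError` on predictions with no positives.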
class Pareto:
def __init__(self, list_of_structs, reference):
self.predictions = list_of_structs
self.true_structure = reference
self.n_pred = len(list_of_structs)
self.max_obj1 = max([s.objectives[0] for s in self.predictions ])
self.max_obj2 = max([s.objectives[1] for s in self.predictions ])
self.index_of_best = self.find_best_solution()
self.index_of_best_ctc = self.find_best_solution_ctc()
def find_best_solution(self):
# returns the index of the solution of the Pareto set which is the closest
# to the real 2D structure (the one with the max MCC)
max_i = -1
max_mcc = -1
for i,s in enumerate(self.predictions):
mcc = s.get_MCC_with(self.true_structure)
if mcc > max_mcc:
max_mcc = mcc
max_i = i
print("\n" + "max mcc str: " + str(max_mcc))
return max_i
def find_best_solution_ctc(self):
# returns the index of the solution of the Pareto set which is the closest
# to the real contacts area (the one with the max MCC)
max_i = -1
max_mcc = -1
for i,s in enumerate(self.predictions):
mcc = s.get_MCC_ctc_with(self.true_structure)
if mcc is None:
continue
elif mcc > max_mcc:
max_mcc = mcc
max_i = i
return max_i
def get_normalized_coords(self):
# retrieves the objective values of the best solution and normalizes them
coords = self.predictions[self.index_of_best].objectives
if self.max_obj1: # avoid divide by zero if all solutions are 0
x = coords[0]/self.max_obj1
else:
x = 0.5
if self.max_obj2: # avoid divide by zero if all solutions are 0
y = coords[1]/self.max_obj2
else:
y = 0.5
return ( x,y )
def get_normalized_coords_ctc(self):
CRED = '\033[91m'
CEND = '\033[0m'
CGREEN = '\33[32m'
CBLUE = '\33[34m'
# retrieves the objective values of the best solution and normalizes them
coords = self.predictions[self.index_of_best_ctc].objectives
if self.max_obj1: # avoid divide by zero if all solutions are 0
x = coords[0]/self.max_obj1
else:
x = 0.5
"""if(x < 0.5):
print("\n" + CRED + self.predictions[self.index_of_best_ctc].name + CEND)
print(CRED + self.predictions[self.index_of_best_ctc].ctc + CEND)
print("count: " + str(self.predictions[self.index_of_best_ctc].ctc.count("*")))
print(CRED + self.true_structure.ctc + CEND)
print("count: " + str(self.true_structure.ctc.count("*")) + "\n")
elif(x >= 0.5 and type(self.predictions[self.index_of_best_ctc].ctc)) is str:
print("\n" + CGREEN + self.predictions[self.index_of_best_ctc].name + CEND)
print(CGREEN + self.predictions[self.index_of_best_ctc].ctc + CEND)
print("count: " + str(self.predictions[self.index_of_best_ctc].ctc.count("*")))
print(CGREEN + self.true_structure.ctc + CEND)
print("count: " + str(self.true_structure.ctc.count("*")) + "\n")"""
if self.max_obj2: # avoid divide by zero if all solutions are 0
y = coords[1]/self.max_obj2
else:
y = 0.5
return ( x,y )
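The logic of the `Pareto` class (pick the solution with the highest MCC, then normalize its objectives by the maxima of the set) can be summarized in a few lines. This is a sketch under simplifying assumptions: `best_normalized` is a hypothetical function, solutions are plain `(obj1, obj2)` tuples, and the MCC of each solution is supplied directly.

```python
def best_normalized(objectives, scores):
    """objectives: list of (obj1, obj2); scores: one MCC per solution."""
    # index of the first solution with the highest score
    best = max(range(len(scores)), key=scores.__getitem__)
    max1 = max(o[0] for o in objectives)
    max2 = max(o[1] for o in objectives)
    # fall back to 0.5 when every solution has a zero objective
    x = objectives[best][0] / max1 if max1 else 0.5
    y = objectives[best][1] / max2 if max2 else 0.5
    return best, (x, y)


print(best_normalized([(10.0, 2.0), (8.0, 4.0), (5.0, 5.0)], [0.3, 0.9, 0.5]))
```

A point close to (1, 1) means the solution closest to the reference structure is also near the optimum of both objectives.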
class RNA:
def __init__(self, filename, header, seq, struct, contacts):
self.seq_ = seq
self.header_ = header
self.struct_ = struct
self.contacts_ = contacts
self.basename_ = filename
ignored_nt_dict = {}
def is_canonical_nts(seq):
for c in seq[:-1]:
if c not in "ACGU":
if c in ignored_nt_dict.keys():
ignored_nt_dict[c] += 1
else:
ignored_nt_dict[c] = 1
return False
return True
def is_canonical_bps(struct):
if "()" in struct:
return False
if "(.)" in struct:
return False
if "(..)" in struct:
return False
if "[]" in struct:
return False
if "[.]" in struct:
return False
if "[..]" in struct:
return False
return True
def load_from_dbn(file, header_style=1):
container = []
counter = 0
db = open(file, "r")
c = 0
header = ""
seq = ""
struct = ""
while True:
l = db.readline()
if l == "":
break
c += 1
c = c % 4
if c == 1:
header = l[:-1]
if c == 2:
seq = l[:-1].upper()
if c == 3:
struct = l[:-1]
n = len(seq)
if c == 0:
contacts = l[:-1]
if is_canonical_nts(seq) and is_canonical_bps(struct):
if header_style == 1: container.append(RNA(header.replace('/', '_').split('(')[-1][:-1], header, seq, struct, contacts))
if header_style == 2: container.append(RNA(header.replace('/', '_').split('[')[-1][:-41], header, seq, struct, contacts))
if '[' in struct: counter += 1
db.close()
return container, counter
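`load_from_dbn` reads the database four lines at a time (header, sequence, structure, contacts). The same reading scheme can be sketched as a small generator. `read_records` is a hypothetical name, and the sample entry below is made up for illustration; trailing lines that do not form a complete record are ignored in this sketch.

```python
import io


def read_records(handle):
    """Yield (header, sequence, structure, contacts) from a 4-line-per-entry file."""
    lines = [line.rstrip("\n") for line in handle]
    # only iterate over complete groups of four lines
    for k in range(0, len(lines) - len(lines) % 4, 4):
        header, seq, struct, contacts = lines[k:k + 4]
        yield header, seq.upper(), struct, contacts


sample = io.StringIO(
    ">test(1ABC_A)\nggcacagaguuaugugcc\n(((...(((...))))))\n..**..........**..\n"
)
for record in read_records(sample):
    print(record)
```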
def parse_biokop(folder, basename, ext=".biok"):
solutions = []
if os.path.isfile(os.path.join(folder, basename + ext)):
rna = open(os.path.join(folder, basename + ext), "r")
lines = rna.readlines()
rna.close()
different_2ds = []
for s in lines[1:]:
if s == '\n':
continue
splitted = s.split('\t')
db2d = splitted[0]
if db2d not in different_2ds:
different_2ds.append(db2d)
# here is a negative sign because Biokop actually minimizes -MEA instead
# of maximizing MEA : we switch back to MEA
solutions.append(SecStruct(basename, db2d, -float(splitted[1]), -float(splitted[2][:-1])))
# check the range of MEA in this pareto set
min_mea = solutions[0].objectives[1]
max_mea = min_mea
for s in solutions:
mea = s.objectives[1]
if mea < min_mea:
min_mea = mea
if mea > max_mea:
max_mea = mea
# normalize so the minimum MEA of the set is 0
for i in range(len(solutions)):
solutions[i].objectives[1] -= min_mea
if len(different_2ds) > 1:
return solutions
else:
print("[%s] \033[36mWARNING: ignoring this RNA, only one 2D solution is found.\033[0m" % (basename))
else:
print("[%s] \033[36mWARNING: file not found !\033[0m" % (basename))
def parse_biorseo(folder, basename, ext):
solutions = []
print(basename + ext)
if os.path.isfile(os.path.join(folder, basename + ext)):
rna = open(os.path.join(folder, basename + ext), "r")
lines = rna.readlines()
rna.close()
different_2ds = []
contacts = []
str2d = []
count = 0
for s in lines[2:]:
count = count + 1
if s == '\n':
continue
splitted = s.split('\t')
if(count % 2 == 1):
obj1 = float(splitted[1])
obj2 = float(splitted[2][:-1])
db2d = splitted[0].split(' ')[0]
if db2d not in different_2ds:
if(s.find('(') != -1):
different_2ds.append(db2d)
if(s.find('*') != -1):
contacts = db2d
solutions.append(SecStruct(basename, str2d, contacts, obj1, obj2))
elif(s.find('(') != -1):
str2d = db2d
if len(different_2ds) > 1:
return solutions
else:
print("[%s] \033[36mWARNING: ignoring this RNA, only one 2D or contacts solution is found.\033[0m" % (basename))
else:
print("[%s] \033[36mWARNING: file not found !\033[0m" % (basename))
return None
def prettify_biorseo(code):
name = ""
if "bgsu" in code:
name += "RNA 3D Motif Atlas + "
elif "json" in code:
name += "Isaure's motifs + Direct P.M"
else:
name += "Rna3Dmotifs + "
if "raw" in code:
name += "Direct P.M."
if "byp" in code:
name += "BPairing"
if "jar3d" in code:
name += "Jar3d"
# name += " + $f_{1" + code[-1] + "}$"
return name
# Parse options
try:
opts, args = getopt.getopt( sys.argv[1:], "",
[ "json_pmE",
"json_pmF",
"folder=",
"database=",
"output="
])
except getopt.GetoptError as err:
print(err)
sys.exit(2)
results_folder = "."
extension = "all"
outputf = ""
for opt, arg in opts:
if opt == "--biokop":
extension = ".biok"
parse = parse_biokop
elif opt == "--folder":
results_folder = arg
elif opt == "--database":
database = arg
elif opt == "--output":
outputf = arg
else:
extension = '.' + opt[2:]
parse = parse_biorseo
RNAcontainer, _ = load_from_dbn(database)
if results_folder[-1] != '/':
results_folder = results_folder + '/'
if outputf == "":
outputf = results_folder
if outputf[-1] != '/':
outputf = outputf + '/'
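The option handling above relies on `getopt` with long options only. A minimal sketch of the same idea, wrapped in a function so it can be tried on an explicit argument list, is shown here. `parse_args` and the returned dictionary layout are assumptions for illustration; the sketch also fails early when `--database` is missing, which the script itself does not do.

```python
import getopt


def parse_args(argv):
    """Parse --folder, --database and --output from a list of arguments."""
    opts, _ = getopt.getopt(argv, "", ["folder=", "database=", "output="])
    settings = {"folder": ".", "database": None, "output": ""}
    for opt, arg in opts:
        settings[opt[2:]] = arg  # strip the leading "--"
    if settings["database"] is None:
        raise SystemExit("--database is required")
    if not settings["output"]:
        # by default, write the outputs next to the results
        settings["output"] = settings["folder"]
    return settings


print(parse_args(["--folder", "results", "--database", "benchmark.dbn"]))
```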
def process_extension(ax, pos, ext, nsolutions=False, xlabel="Best solution performs\nwell on obj1", ylabel="Best solution performs\n well on obj2"):
points = []
sizes = []
for rna in RNAcontainer:
# Extracting the predictions from the results file
solutions = parse(results_folder, rna.basename_, ext)
reference = SecStruct(rna.basename_, rna.struct_, rna.contacts_, float("inf"), float("inf"))
if solutions is None:
continue
pset = Pareto(solutions, reference)
points.append(pset.get_normalized_coords())
sizes.append(pset.n_pred)
print("[%s] Loaded %d solutions in a Pareto set, max(obj1)=%f, max(obj2)=%f" % (rna.basename_, pset.n_pred, pset.max_obj1, pset.max_obj2))
print("Loaded %d points on %d." % (len(points), len(RNAcontainer)))
x = np.array([ p[0] for p in points ])
y = np.array([ p[1] for p in points ])
xmin, xmax = 0, 1
ymin, ymax = 0, 1
xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
positions = np.vstack([xx.ravel(), yy.ravel()])
values = np.vstack([x, y])
kernel = st.gaussian_kde(values)
f = np.reshape(kernel(positions).T, xx.shape)
ax[pos].axhline(y=0, alpha=0.2, color='black')
ax[pos].axhline(y=1, alpha=0.2, color='black')
ax[pos].axvline(x=0, alpha=0.2, color='black')
ax[pos].axvline(x=1, alpha=0.2, color='black')
ax[pos].contourf(xx, yy, f, cmap=cm.Blues, alpha=0.5)
ax[pos].scatter(x, y, s=25, alpha=0.1)
ax[pos].set_xlim((-0.1,1.1))
ax[pos].set_ylim((-0.1,1.1))
ax[pos].set_title(prettify_biorseo(ext[1:]), fontsize=10)
ax[pos].annotate("("+str(len(points))+'/'+str(len(RNAcontainer))+" RNAs)", (0.08, 0.15))
ax[pos].set_xlabel(xlabel)
ax[pos].set_ylabel(ylabel)
if nsolutions:
ax[pos+1].hist(sizes, bins=range(0, max(sizes)+1, 2), histtype='bar')
ax[pos+1].set_xlim((0,max(sizes)+2))
ax[pos+1].set_xticks(range(0, max(sizes), 10))
ax[pos+1].set_xticklabels(range(0, max(sizes), 10), rotation=90)
ax[pos+1].set_xlabel("# solutions")
ax[pos+1].set_ylabel("# RNAs")
def process_extension_ctc(ax, pos, ext, nsolutions=False, xlabel="Best solution performs\nwell on obj1", ylabel="Best solution performs\n well on obj2"):
points = []
sizes = []
for rna in RNAcontainer:
# Extracting the predictions from the results file
solutions = parse(results_folder, rna.basename_, ext)
reference = SecStruct(rna.basename_, rna.struct_, rna.contacts_, float("inf"), float("inf"))
if solutions is None:
continue
pset = Pareto(solutions, reference)
points.append(pset.get_normalized_coords_ctc())
sizes.append(pset.n_pred)
print("[%s] Loaded %d solutions in a Pareto set, max(obj1)=%f, max(obj2)=%f" % (rna.basename_, pset.n_pred, pset.max_obj1, pset.max_obj2))
print("Loaded %d points on %d." % (len(points), len(RNAcontainer)))
x = np.array([ p[0] for p in points ])
y = np.array([ p[1] for p in points ])
xmin, xmax = 0, 1
ymin, ymax = 0, 1
xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
positions = np.vstack([xx.ravel(), yy.ravel()])
values = np.vstack([x, y])
kernel = st.gaussian_kde(values)
f = np.reshape(kernel(positions).T, xx.shape)
ax[pos].axhline(y=0, alpha=0.2, color='black')
ax[pos].axhline(y=1, alpha=0.2, color='black')
ax[pos].axvline(x=0, alpha=0.2, color='black')
ax[pos].axvline(x=1, alpha=0.2, color='black')
ax[pos].contourf(xx, yy, f, cmap=cm.Blues, alpha=0.5)
ax[pos].scatter(x, y, s=25, alpha=0.1)
ax[pos].set_xlim((-0.1,1.1))
ax[pos].set_ylim((-0.1,1.1))
ax[pos].set_title(prettify_biorseo(ext[1:]), fontsize=10)
ax[pos].annotate("("+str(len(points))+'/'+str(len(RNAcontainer))+" RNAs)", (0.08,0.15))
ax[pos].set_xlabel(xlabel)
ax[pos].set_ylabel(ylabel)
if nsolutions:
ax[pos+1].hist(sizes, bins=range(0, max(sizes)+1, 2), histtype='bar')
ax[pos+1].set_xlim((0,max(sizes)+2))
ax[pos+1].set_xticks(range(0, max(sizes), 10))
ax[pos+1].set_xticklabels(range(0, max(sizes), 10), rotation=90)
ax[pos+1].set_xlabel("# solutions")
ax[pos+1].set_ylabel("# RNAs")
if extension == "all":
parse = parse_biorseo
fig, ax = plt.subplots(1, 2, figsize=(10, 5), sharey=True)
ax = ax.flatten()
process_extension(ax, 0, ".json_pmF_MEA", xlabel="Normalized $f_{1E}$", ylabel="Normalized MEA")
print("--------------------------------------------------------------------------------------------")
process_extension_ctc(ax, 1, ".json_pmF_MEA", xlabel="Normalized $f_{1E}$", ylabel="Normalized MEA")
print("--------------------------------------------------------------------------------------------")
for a in ax:
a.label_outer()
plt.subplots_adjust(bottom=0.2, top=0.9, left=0.07, right=0.98, hspace=0.05, wspace=0.05)
plt.savefig("pareto_visualizer_json_MEA_functionF.png")
else:
fig, ax = plt.subplots(2,1, figsize=(6,5))
plt.subplots_adjust(bottom=0.12, top=0.9, left=0.15, right=0.9, hspace=0.4)
if extension == ".biok":
process_extension(ax, 0, extension, nsolutions=True, xlabel="Normalized MFE", ylabel="Normalized MEA")
else:
process_extension(ax, 0, extension, nsolutions=False)
plt.savefig("pareto_visualizer_ext.png")
\ No newline at end of file
#include <iostream>
#include <sstream>
#include <fstream>
#include "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/cppsrc/json.hpp"
#include <typeinfo>
#include <set>
#include <algorithm>
#include <cstdio>
#include <vector>
using namespace std;
using json = nlohmann::json;
/*
This script removes from the library every pattern that matches ONLY the sequence it comes from (with the same pdb).
*/
//Stores the pdb code and the sequence from the benchmark file, along with the id and the components of the motif derived from this sequence.
struct data {
//the pdb code (in the name of the sequence)
string pdb;
//the complete sequence with this pdb code
string seq_pdb;
//the id of the motif corresponding to this pdb in the library
string id;
//the module sequence with the components of this motif with the above id
string cmp;
};
typedef struct data data;
//returns the list of pdb codes and the corresponding information from the benchmark file.
vector<data> get_list_pdb_benchmark(const string& benchmark) {
fstream bm(benchmark);
vector<data> list_pdb_seq;
if (bm.is_open()) {
string name;
string sequence;
string structure;
string contacts;
while (getline(bm, name)) {
data d;
int size = name.size();
name = name.substr(5,size-6);
getline(bm, sequence);
d.pdb = name;
d.seq_pdb = sequence;
list_pdb_seq.push_back(d);
getline(bm, structure);
getline(bm, contacts);
}
bm.close();
}
return list_pdb_seq;
}
string trim(string str) {
int size = str.size();
str = str.substr(1, size-2);
return str;
}
//returns the benchmark entry whose pdb code (without its last two characters) equals pdb_pattern, or an empty data if none matches
data find_id_pattern(string& pdb_pattern, const string& benchmark) {
vector<data> l = get_list_pdb_benchmark(benchmark);
int size = l.size();
for (data d : l) {
string cmp = d.pdb;
cmp = cmp.substr(0, d.pdb.size()-2);
if (!cmp.compare(pdb_pattern)) {
return d;
}
}
return data();
}
//Create an array of data ('association'), which consists of each pdb of the benchmark file
// with the associated pattern from this sequence.
vector<data> find_id(const string& bibli, const string& benchmark) {
ifstream lib(bibli);
json js = json::parse(lib);
//name, seq_bm and id, seq_id
vector<data> association;
for (auto it = js.begin(); it != js.end(); ++it) {
string id = it.key();
data d;
for (auto it2 = js[id].begin(); it2 != js[id].end(); ++it2) {
string field = it2.key();
string seq;
if (!field.compare("pdb")) {
int n = js[id][field].size();
for (int i = 0; i < n ; i++) {
ostringstream stream;
stream << js[id][field][i];
string pdb = trim(stream.str());
d = find_id_pattern(pdb, benchmark);
}
}
if (!field.compare("sequence")) {
seq = it2.value();
if (!(d.pdb.empty())) {
d.id = id;
d.cmp = seq;
association.push_back(d);
}
}
}
}
lib.close();
cout << association.size() << endl;
return association;
}
//checks whether every component of the motif is found, in order, in a complete sequence from the benchmark file.
bool does_it_match(const string& seq, const string& seq_motif) {
    size_t found = seq_motif.find("&");
    vector<string> list_cmp;
    if (found != std::string::npos) {
        //split the motif into its '&'-separated components
        int count = 1;
        string cmp = seq_motif.substr(0, found);
        list_cmp.push_back(cmp);
        while (found != std::string::npos) {
            size_t begin = found;
            found = seq_motif.find("&", found + 1);
            cmp = seq_motif.substr(begin+1, found-begin-1);
            list_cmp.push_back(cmp);
            count++;
        }
        //search each component after the position of the previous one
        found = seq.find(list_cmp[0]);
        int count2 = 1;
        while ((found != std::string::npos) && (count2 < count)) {
            found = seq.find(list_cmp[count2], found + 1);
            count2++;
        }
        //the last search must have succeeded too, otherwise a missing last component would count as a match
        if ((found != std::string::npos) && (count == count2)) {
            return true;
        }
    } else {
        found = seq.find(seq_motif);
        if (found != std::string::npos) {
            return true;
        }
    }
    return false;
}
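The matching rule used by `does_it_match` (every `&`-separated component must occur in the sequence, in order) can be sketched briefly in Python, the language of the other scripts in this merge request. `components_match` is a hypothetical name. Unlike the C++ function, which resumes the search one character after the previous hit, this sketch resumes after the end of the previous component, so components cannot overlap; that stricter rule is an assumption of the sketch.

```python
def components_match(seq, seq_motif):
    """True if every '&'-separated component occurs in seq, in order, without overlap."""
    position = 0
    for component in seq_motif.split("&"):
        found = seq.find(component, position)
        if found == -1:
            return False
        # resume the search right after this component
        position = found + len(component)
    return True


print(components_match("AAGGCCUU", "AG&CU"))
print(components_match("AAGGCCUU", "CU&AG"))
```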
//returns the ids of the motifs that did not match any complete sequence other than the one they came from.
vector<string> select_not_motif(const string& bibli, const string& benchmark) {
vector<string> selection;
vector<data> association = find_id(bibli, benchmark);
for (data d : association) {
selection.push_back(d.id);
}
for (data d : association) {
for (data d2 : association) {
string seq = d.seq_pdb;
string seq2 = d2.cmp;
bool test = false;
if(d.pdb.substr(0, d.pdb.size()-2) != d2.pdb.substr(0, d2.pdb.size()-2)) {
test = does_it_match(seq, seq2);
if (test) {
cout << "pdb: " << d.pdb << " vs " << d2.pdb << " " << d2.cmp << " " << d2.id << endl;
auto position = find(selection.begin(), selection.end(), d.id);
if (position != selection.end()) {
int index = position - selection.begin();
selection.erase(selection.begin() + index);
}
}
}
}
}
sort(selection.begin(), selection.end() );
selection.erase(unique(selection.begin(), selection.end() ), selection.end() );
cout << "size: " << selection.size() << endl;
return selection;
}
int main()
{
string bibli = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/motifs_final.json";
string benchmark = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/benchmark.dbn";
/*vector<data> v = get_list_pdb_benchmark(benchmark);
for (data d : v) {
cout << d.pdb << ", " << d.seq_pdb << endl;
}*/
/*string name = "1U6P_B";
data d = find_id_pattern(name, benchmark);
cout << "name: " << d.pdb << ", seq: " << d.seq_pdb << endl;*/
/*vector<data> association = find_id(bibli, benchmark);
for (data d : association) {
cout << "<" << d.pdb << ", " << d.seq_pdb << ">, " << "<" << d.id << ", " << d.cmp << ">" << endl;
}*/
/*string seq = "UGCGCUUGGCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUU";
string seq_motif = "UGCGCUUGGCGUUUUAGAGC&GCAAGUUAAAAUAAGGCUAGUCCGUUAUCAA&UGGCACCGAGUCG&U";
bool test = does_it_match(seq, seq_motif);
cout << test << endl;*/
vector<string> selection = select_not_motif(bibli, benchmark);
for (string str : selection) {
cout << str << ", ";
}
cout << endl;
return 0;
}
\ No newline at end of file
import json
import numpy as np
import matplotlib.pyplot as plt
import os.path
# Creates a violin plot of the distribution of the number of 'motifs' in the Isaure pattern library
def stats_library():
with open('/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/motifs_28-05-2021.json') as f:
data = json.load(f)
nb_motifs = json.dumps(data).count("sequence")
print("number of motifs: " + str(nb_motifs))
tab_seq_length = []
tab_seq_length_with_delimiter = []
for i in range(1007):
test = str(i) in data
if test:
sequence = data[str(i)]["sequence"]
count_delimiter = sequence.count('&')
tab_seq_length.append(len(sequence) - count_delimiter)
tab_seq_length_with_delimiter.append(len(sequence))
data_to_plot = [np.array(tab_seq_length), np.array(tab_seq_length_with_delimiter)]
min1 = np.amin(data_to_plot[0])
max1 = np.amax(data_to_plot[0])
median1 = np.median(data_to_plot[0])
min2 = np.amin(data_to_plot[1])
max2 = np.amax(data_to_plot[1])
median2 = np.median(data_to_plot[1])
fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])
label1 = "number of nucleotides" + "\n minimum: " + str(min1) + " median: " + str(
median1) + " maximum: " + str(max1)
label2 = "number of nucleotides + number of &" + "\n minimum: " + str(min2) + " median: " + str(
median2) + " maximum: " + str(max2)
labels = [label1, label2]
ax.set_xticks(np.arange(1, len(labels) + 1))
ax.set_xticklabels(labels)
ax.set_ylabel('motif lengths')
ax.set_xlabel('motifs')
violins = ax.violinplot(data_to_plot, showmedians=True)
for partname in ('cbars', 'cmins', 'cmaxes', 'cmedians'):
vp = violins[partname]
vp.set_edgecolor('black')
vp.set_linewidth(1)
for v in violins['bodies']:
v.set_facecolor('red')
plt.title("Distribution of motif lengths in the library (" + str(nb_motifs) + " motifs)")
plt.savefig("statistiques_motifs_Isaure.png", bbox_inches='tight')
# Returns the list split into two halves (the first half gets the extra element when the length is odd)
def get_half(list_name):
first_half = []
second_half = []
if len(list_name) % 2 == 0:
middle = len(list_name) / 2
else:
middle = len(list_name) / 2 + 0.5
for i in range(int(middle)):
first_half.append(list_name[i])
for i in range(int(middle)):
if i + int(middle) < len(list_name):
second_half.append(list_name[i + int(middle)])
return [first_half, second_half]
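The same split can be expressed with slicing. This is an equivalent sketch rather than a replacement; the name `split_in_half` is hypothetical, and like `get_half` it puts the extra element in the first half when the length is odd.

```python
def split_in_half(items):
    """Return [first_half, second_half]; the first half is longer for odd lengths."""
    middle = (len(items) + 1) // 2
    return [items[:middle], items[middle:]]


print(split_in_half(["a", "b", "c", "d", "e"]))
```

Integer division avoids the float arithmetic and the repeated `int(middle)` conversions.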
# Returns the list of sequence names found in benchmark.dbn
def get_list_name_bm():
path_file = '/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/data/modules/ISAURE/benchmark.dbn'
my_file = open(path_file, "r")
list_name = []
count = 0
for line in my_file:
if count % 4 == 0:
list_name.append(line[5:len(line) - 2])
count = count + 1
my_file.close()
return list_name
# Returns a 2D array containing, for each sequence of the benchmark, the number of 'motifs' inserted in each solution
def get_nb_motifs_by_seq(type_file):
list_name = get_list_name_bm()
tab = []
for name in list_name:
path_file = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/results/test_" + name + type_file
if os.path.exists(path_file):
list_nb_motifs = []
my_file = open(path_file, "r")
name = my_file.readline()
seq = my_file.readline()
count = 0
for line in my_file:
if count % 2 == 0:
tab_split = line.split("+")
list_nb_motifs.append(len(tab_split) - 1)
count = count + 1
my_file.close()
tab.append(list_nb_motifs)
return tab
# Creates a violin plot that shows the distribution of the number of patterns per solution for each sequence of the benchmark
def stats_nb_motifs_in_result(type_file):
list_name = get_list_name_bm()
tab = get_nb_motifs_by_seq(type_file)
list_median_str = []
for i in range(len(tab)):
list_median_str.append(np.median(tab[i]))
tab = [x for _, x in sorted(zip(list_median_str, tab))]
list_name = [x for _, x in sorted(zip(list_median_str, list_name))]
if (len(tab) % 2 == 0):
absciss = len(tab) / 2
else:
absciss = len(tab) / 2 + 0.5
divide_name = get_half(list_name)
divide_tab = get_half(tab)
arr = np.array(tab)
fig, ax = plt.subplots()
plt.figure(figsize=(15, 4), dpi=200)
plt.xticks(rotation=90)
violins = plt.violinplot(divide_tab[0], showmedians=True)
for i in range(int(len(divide_tab[0]))):
y = divide_tab[0][i]
x = np.random.normal(1 + i, 0.04, size=len(y))
plt.scatter(x, y)
plt.xticks(np.arange(1, len(divide_tab[0]) + 1), divide_name[0])
plt.xlabel('sequence name')
plt.ylabel('number of motifs inserted in the predicted structure')
plt.title("Distribution of the number of motifs inserted per result for each sequence")
for part in ('cbars', 'cmins', 'cmaxes', 'cmedians'):
vp = violins[part]
vp.set_edgecolor('black')
vp.set_linewidth(1)
for v in violins['bodies']:
v.set_facecolor('black')
plt.savefig("statistiques_nb_motifs_inseres_Isaure_" + type_file[8:] + "_1.png", bbox_inches='tight')
plt.figure(figsize=(15, 4), dpi=200)
plt.xticks(rotation=90)
violins = plt.violinplot(divide_tab[1], showmedians=True)
for i in range(int(len(divide_tab[1]))):
y = divide_tab[1][i]
x = np.random.normal(1 + i, 0.04, size=len(y))
plt.scatter(x, y)
plt.xticks(np.arange(1, len(divide_tab[1]) + 1), divide_name[1])
plt.xlabel('sequence name')
plt.ylabel('number of motifs inserted in the predicted structure')
plt.title("Distribution of the number of motifs inserted per result for each sequence (part 2)")
for part in ('cbars', 'cmins', 'cmaxes', 'cmedians'):
vp = violins[part]
vp.set_edgecolor('black')
vp.set_linewidth(1)
for v in violins['bodies']:
v.set_facecolor('black')
plt.savefig("statistiques_nb_motifs_inseres_Isaure_" + type_file[8:] + "_2.png", bbox_inches='tight')
# Returns the grouping of the number of inserted 'motif' for all solutions of all sequences of the benchmark
# according to the extension in argument
def get_all_nb_motifs_by_type(type_file):
tab = get_nb_motifs_by_seq(type_file)
tab_all = []
for i in range(len(tab)):
for j in range(len(tab[i])):
tab_all.append(tab[i][j])
return tab_all
# Creates a figure containing the violin plots for MEA + function E, MEA + function F, MFE + function E and MFE + function F
# Each violin plot shows the distribution of the number of inserted 'motifs' per solution
def stats_nb_motifs_all():
list_name = get_list_name_bm()
tab_all_E_MEA = get_all_nb_motifs_by_type(".json_pmE_MEA")
tab_all_F_MEA = get_all_nb_motifs_by_type(".json_pmF_MEA")
tab_all_E_MFE = get_all_nb_motifs_by_type(".json_pmE_MFE")
tab_all_F_MFE = get_all_nb_motifs_by_type(".json_pmF_MFE")
data_to_plot = [tab_all_E_MEA, tab_all_F_MEA, tab_all_E_MFE, tab_all_F_MFE]
fig = plt.figure()
fig.set_size_inches(6, 3)
ax = fig.add_axes([0, 0, 1, 1])
labels = ['MEA + E', 'MEA + F', 'MFE + E', 'MFE + F']
ax.set_xticks(np.arange(1, len(labels) + 1))
ax.set_xticklabels(labels)
ax.set_xlabel("Distribution of the number of motifs inserted per result for each sequence")
ax.set_ylabel("number of motifs inserted in the predicted structure")
violins = ax.violinplot(data_to_plot, showmedians=True)
for partname in ('cbars', 'cmins', 'cmaxes', 'cmedians'):
vp = violins[partname]
vp.set_edgecolor('black')
vp.set_linewidth(1)
for v in violins['bodies']:
v.set_facecolor("black")
plt.savefig('repartition_nb_motifs.png', dpi=200, bbox_inches='tight')
# Returns two lists: the number of solutions and the sequence length for each sequence of the benchmark
def get_nb_solutions_and_sizes_by_seq(type_file):
    list_name = get_list_name_bm()
    list_nb_solutions = []
    list_size = []
    for name in list_name:
        path_file = "/mnt/c/Users/natha/Documents/IBISC/biorseo2/biorseo/results/test_" + name + type_file
        if os.path.exists(path_file):
            with open(path_file, "r") as my_file:
                my_file.readline()  # skip the header line (sequence name)
                seq = my_file.readline().strip()  # drop the trailing newline before measuring the length
                list_size.append(len(seq))
                # Count every other remaining line, starting with the first one
                nb = 0
                for count, line in enumerate(my_file):
                    if count % 2 == 0:
                        nb = nb + 1
                list_nb_solutions.append(nb)
    return [list_nb_solutions, list_size]
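# Illustrative sketch (not part of the original script): the per-file logic of
# get_nb_solutions_and_sizes_by_seq() isolated so that it can be checked without
# touching the disk. It assumes the layout that the function's loop implies: a header
# line, the sequence, then solution lines of which every other one (starting with the
# first) counts as one solution. The helper name `summarize_result_lines` and the
# sample lines in the usage comment are hypothetical.
def summarize_result_lines(lines):
    """Return (sequence length, number of solutions) for the lines of one result file."""
    if len(lines) < 2:
        raise ValueError("expected at least a header line and a sequence line")
    seq = lines[1].strip()  # the trailing newline is not part of the sequence
    nb_solutions = len(lines[2::2])  # lines 3, 5, 7, ... of the file
    return len(seq), nb_solutions


# Usage with made-up content:
# summarize_result_lines([">test\n", "ACGU\n", "(..)\n", "m1\n", "....\n", "m2\n"]) returns (4, 2)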
# Creates 4 violin plots showing the distribution of the number of solutions per sequence of the benchmark
def stats_nb_solutions():
    tab_all_E_MEA = get_nb_solutions_and_sizes_by_seq(".json_pmE_MEA")[0]
    tab_all_F_MEA = get_nb_solutions_and_sizes_by_seq(".json_pmF_MEA")[0]
    tab_all_E_MFE = get_nb_solutions_and_sizes_by_seq(".json_pmE_MFE")[0]
    tab_all_F_MFE = get_nb_solutions_and_sizes_by_seq(".json_pmF_MFE")[0]
    data_to_plot = [tab_all_E_MEA, tab_all_F_MEA, tab_all_E_MFE, tab_all_F_MFE]
    # Pool the four lists to compute global summary statistics
    all_data = []
    for data in data_to_plot:
        all_data.extend(data)
    # Names chosen so that the built-in min() and max() are not shadowed
    min_nb = np.amin(all_data)
    max_nb = np.amax(all_data)
    median_nb = np.median(all_data)
    fig = plt.figure()
    fig.set_size_inches(6, 3)
    ax = fig.add_axes([0, 0, 1, 1])
    labels = ['MEA + E', 'MEA + F', 'MFE + E', 'MFE + F']
    ax.set_xticks(np.arange(1, len(labels) + 1))
    ax.set_xticklabels(labels)
    ax.set_xlabel("Distribution of the number of solutions for each sequence of the benchmark"
                  + "\n minimum: " + str(min_nb) + " median: " + str(median_nb)
                  + " maximum: " + str(max_nb) + " (over all solutions)")
    ax.set_ylabel("number of solutions (predicted secondary structures)")
    violins = ax.violinplot(data_to_plot, showmedians=True)
    # Draw the violin bars, extrema and medians in blue
    for partname in ('cbars', 'cmins', 'cmaxes', 'cmedians'):
        vp = violins[partname]
        vp.set_edgecolor('blue')
        vp.set_linewidth(1)
    for v in violins['bodies']:
        v.set_facecolor("blue")
    plt.savefig('repartition_nb_solutions.png', dpi=200, bbox_inches='tight')
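# Illustrative sketch (not part of the original script): how the minimum, median and
# maximum shown in the x-axis label of stats_nb_solutions() are obtained by pooling the
# four per-method lists. The helper name `pooled_summary` and the numbers in the usage
# comment are hypothetical; the standard `statistics` module stands in for NumPy here.
from itertools import chain
from statistics import median


def pooled_summary(groups):
    """Return (minimum, median, maximum) over all values of all groups."""
    pooled = list(chain.from_iterable(groups))
    if not pooled:
        raise ValueError("no solution counts to summarize")
    return min(pooled), median(pooled), max(pooled)


# Usage with made-up counts: pooled_summary([[1, 5], [2], [9, 3]]) returns (1, 3, 9)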
# Creates a scatter plot of the number of solutions against the length of each sequence of the benchmark
def stats_nb_solutions_by_seq_length():
    x = []
    y = []
    # Pool the (sequence length, number of solutions) pairs of the four result types
    for type_file in (".json_pmE_MEA", ".json_pmF_MEA", ".json_pmE_MFE", ".json_pmF_MFE"):
        list_nb_solutions, list_size = get_nb_solutions_and_sizes_by_seq(type_file)
        x.extend(list_size)
        y.extend(list_nb_solutions)
    plt.scatter(x, y, s=50, c='blue', marker='o', edgecolors='black')
    plt.ylabel("number of solutions")
    plt.xlabel("sequence length")
    plt.title('number of predicted structures as a function of sequence length')
    plt.savefig('nb_solutions_en_fonction_seq.png', dpi=200, bbox_inches='tight')
# Disabled calls; uncomment them to regenerate the other figures
# stats_nb_motifs_in_result(".json_pmE_MEA")
# stats_nb_motifs_in_result(".json_pmF_MEA")
# stats_nb_motifs_in_result(".json_pmE_MFE")
# stats_nb_motifs_in_result(".json_pmF_MFE")
# stats_nb_motifs_all()
# stats_nb_solutions()
stats_nb_solutions_by_seq_length()
>test
CCGGGACCUCUAACCGGGUUCCCGGGCAGUCACUG