Showing 8 changed files with 119 additions and 79 deletions
... | @@ -18,9 +18,7 @@ Dockerfile | ... | @@ -18,9 +18,7 @@ Dockerfile |
18 | LICENSE | 18 | LICENSE |
19 | CHANGELOG | 19 | CHANGELOG |
20 | *.md | 20 | *.md |
21 | -scripts/automate.sh | 21 | +scripts/*.sh |
22 | -scripts/kill_rnanet.sh | ||
23 | -scripts/build_docker_image.sh | ||
24 | scripts/*.tar | 22 | scripts/*.tar |
25 | scripts/measure.py | 23 | scripts/measure.py |
26 | scripts/recompute_some_chains.py | 24 | scripts/recompute_some_chains.py | ... | ... |
... | @@ -25,13 +25,13 @@ RUN apk update && apk add --no-cache \ | ... | @@ -25,13 +25,13 @@ RUN apk update && apk add --no-cache \ |
25 | \ | 25 | \ |
26 | mv /RNANet/scripts/x3dna-dssr /usr/local/bin/x3dna-dssr && chmod +x /usr/local/bin/x3dna-dssr && \ | 26 | mv /RNANet/scripts/x3dna-dssr /usr/local/bin/x3dna-dssr && chmod +x /usr/local/bin/x3dna-dssr && \ |
27 | \ | 27 | \ |
28 | - curl -SL http://eddylab.org/infernal/infernal-1.1.3.tar.gz | tar xz && cd infernal-1.1.3 && \ | 28 | + curl -SL http://eddylab.org/infernal/infernal-1.1.4.tar.gz | tar xz && cd infernal-1.1.4 && \ |
29 | ./configure && make -j 16 && make install && cd easel && make install && cd / && \ | 29 | ./configure && make -j 16 && make install && cd easel && make install && cd / && \ |
30 | \ | 30 | \ |
31 | curl -SL https://github.com/epruesse/SINA/releases/download/v1.7.1/sina-1.7.1-linux.tar.gz | tar xz && mv sina-1.7.1-linux /sina && \ | 31 | curl -SL https://github.com/epruesse/SINA/releases/download/v1.7.1/sina-1.7.1-linux.tar.gz | tar xz && mv sina-1.7.1-linux /sina && \ |
32 | ln -s /sina/bin/sina /usr/local/bin/sina && \ | 32 | ln -s /sina/bin/sina /usr/local/bin/sina && \ |
33 | \ | 33 | \ |
34 | - rm -rf /infernal-1.1.3 && \ | 34 | + rm -rf /infernal-1.1.4 && \ |
35 | \ | 35 | \ |
36 | apk del openblas-dev gcc g++ gfortran binutils \ | 36 | apk del openblas-dev gcc g++ gfortran binutils \ |
37 | curl \ | 37 | curl \ | ... | ... |
... | @@ -10,16 +10,16 @@ | ... | @@ -10,16 +10,16 @@ |
10 | # Required computational resources | 10 | # Required computational resources |
11 | - CPU: no requirements. The program is optimized for multi-core CPUs, you might want to use Intel Xeons, AMD Ryzens, etc. | 11 | - CPU: no requirements. The program is optimized for multi-core CPUs, you might want to use Intel Xeons, AMD Ryzens, etc. |
12 | - GPU: not required | 12 | - GPU: not required |
13 | -- RAM: 16 GB with a large swap partition is okay. 32 GB is recommended (usage peaks at ~27 GB) | 13 | +- RAM: 16 GB with a large swap partition is okay. 32 GB is recommended (usage peaks at ~27 GB, but this number depends on your number of CPU cores) |
14 | - Storage: to date, it takes 60 GB for the 3D data (36 GB if you don't use the --extract option), 11 GB for the sequence data, and 7GB for the outputs (5.6 GB database, 1 GB archive of CSV files). You need to add a few more for the dependencies. Pick a 100GB partition and you are good to go. The computation speed is way better if you use a fast storage device (e.g. SSD instead of hard drive, or even better, a NVMe SSD) because of constant I/O with the SQlite database. | 14 | - Storage: to date, it takes 60 GB for the 3D data (36 GB if you don't use the --extract option), 11 GB for the sequence data, and 7GB for the outputs (5.6 GB database, 1 GB archive of CSV files). You need to add a few more for the dependencies. Pick a 100GB partition and you are good to go. The computation speed is way better if you use a fast storage device (e.g. SSD instead of hard drive, or even better, a NVMe SSD) because of constant I/O with the SQlite database. |
15 | - Network : We query the Rfam public MySQL server on port 4497. Make sure your network enables communication (there should not be any issue on private networks, but maybe your company/university closes ports by default). You will get an error message if the port is not open. Around 30 GB of data is downloaded. | 15 | - Network : We query the Rfam public MySQL server on port 4497. Make sure your network enables communication (there should not be any issue on private networks, but maybe your company/university closes ports by default). You will get an error message if the port is not open. Around 30 GB of data is downloaded. |
16 | 16 | ||
17 | # Method 1 : Installation using Docker | 17 | # Method 1 : Installation using Docker |
18 | 18 | ||
19 | -* Step 1 : Download the [Docker container](https://entrepot.ibisc.univ-evry.fr/d/1aff90a9ef214a19b848/files/?p=/rnanet_v1.3_docker.tar&dl=1). Open a terminal and move to the appropriate directory. | 19 | +* Step 1 : Download the [Docker container](https://entrepot.ibisc.univ-evry.fr/d/1aff90a9ef214a19b848/files/?p=/rnanet_v1.5b_docker.tar&dl=1). Open a terminal and move to the appropriate directory. |
20 | * Step 2 : Extract the archive to a Docker image named *rnanet* in your local installation | 20 | * Step 2 : Extract the archive to a Docker image named *rnanet* in your local installation |
21 | ``` | 21 | ``` |
22 | -$ docker load -i rnanet_v1.3_docker.tar | 22 | +$ docker load -i rnanet_v1.5b_docker.tar |
23 | ``` | 23 | ``` |
24 | * Step 3 : Run the container, giving it 3 folders to mount as volumes: a first to store the 3D data, a second to store the sequence data and alignments, and a third to output the results, data and logs: | 24 | * Step 3 : Run the container, giving it 3 folders to mount as volumes: a first to store the 3D data, a second to store the sequence data and alignments, and a third to output the results, data and logs: |
25 | ``` | 25 | ``` |
... | @@ -36,7 +36,7 @@ nohup bash -c 'time docker run --rm -v /path/to/3D/data/folder:/3D -v /path/to/s | ... | @@ -36,7 +36,7 @@ nohup bash -c 'time docker run --rm -v /path/to/3D/data/folder:/3D -v /path/to/s |
36 | 36 | ||
37 | You need to install the dependencies: | 37 | You need to install the dependencies: |
38 | - DSSR, you need to register to the X3DNA forum [here](http://forum.x3dna.org/site-announcements/download-instructions/) and then download the DSSR binary [on that page](http://forum.x3dna.org/downloads/3dna-download/). Make sure to have the `x3dna-dssr` binary in your $PATH variable so that RNANet.py finds it. | 38 | - DSSR, you need to register to the X3DNA forum [here](http://forum.x3dna.org/site-announcements/download-instructions/) and then download the DSSR binary [on that page](http://forum.x3dna.org/downloads/3dna-download/). Make sure to have the `x3dna-dssr` binary in your $PATH variable so that RNANet.py finds it. |
39 | -- Infernal, to download at [Eddylab](http://eddylab.org/infernal/), several options are available depending on your preferences. Make sure to have the `cmalign`, `esl-alimanip`, `esl-alipid` and `esl-reformat` binaries in your $PATH variable, so that RNANet.py can find them. | 39 | +- Infernal, to download at [Eddylab](http://eddylab.org/infernal/), several options are available depending on your preferences. Make sure to have the `cmalign`, `cmfetch`, `cmbuild`, `esl-alimanip`, `esl-alipid` and `esl-reformat` binaries in your $PATH variable, so that RNANet.py can find them. |
40 | - SINA, follow [these instructions](https://sina.readthedocs.io/en/latest/install.html) for example. Make sure to have the `sina` binary in your $PATH. | 40 | - SINA, follow [these instructions](https://sina.readthedocs.io/en/latest/install.html) for example. Make sure to have the `sina` binary in your $PATH. |
41 | - Sqlite 3, available under the name *sqlite* in every distro's package manager, | 41 | - Sqlite 3, available under the name *sqlite* in every distro's package manager, |
42 | - Python >= 3.8, (Unfortunately, python3.6 is no longer supported, because of changes in the multiprocessing and Threading packages. Untested with Python 3.7.\*) | 42 | - Python >= 3.8, (Unfortunately, python3.6 is no longer supported, because of changes in the multiprocessing and Threading packages. Untested with Python 3.7.\*) |
... | @@ -112,13 +112,14 @@ The most useful options in that list are | ... | @@ -112,13 +112,14 @@ The most useful options in that list are |
112 | * Computation of sequence identity matrices | 112 | * Computation of sequence identity matrices |
113 | * Statistics over the sequence lengths, nucleotide frequencies, and basepair types by RNA family | 113 | * Statistics over the sequence lengths, nucleotide frequencies, and basepair types by RNA family |
114 | * Overall database content statistics | 114 | * Overall database content statistics |
115 | - * Detailed analysis of the eta-theta pseudotorsion angles (use `--stats-opts "--wadley"` after `-s`) or 3D distance matrices and their averages per family (use `--stats-opts "--distance-matrices"`) | 115 | + * Detailed analysis of the eta-theta pseudotorsion angles (use `--stats-opts="--wadley"` after `-s`) or 3D distance matrices and their averages per family (use `--stats-opts="--distance-matrices"`) |
116 | * `--redundant`, to yield all the available data and not only the BGSU NR-List representatives | 116 | * `--redundant`, to yield all the available data and not only the BGSU NR-List representatives |
117 | 117 | ||
118 | # Computation time | 118 | # Computation time |
119 | 119 | ||
120 | To give you an estimation, our last full run took exactly 12h, excluding the time to download the MMCIF files containing RNA (around 25GB to download) and the time to compute statistics. | 120 | To give you an estimation, our last full run took exactly 12h, excluding the time to download the MMCIF files containing RNA (around 25GB to download) and the time to compute statistics. |
121 | Measured on the 23rd of June 2020 on a 16-core AMD Ryzen 7 3700X CPU @3.60GHz, with 32 GB of RAM and a 7200rpm hard drive. Total CPU time spent: 135 hours (user+kernel modes), corresponding to 12h of real time on the 16-core CPU. | 121 | Measured on the 23rd of June 2020 on a 16-core AMD Ryzen 7 3700X CPU @3.60GHz, with 32 GB of RAM and a 7200rpm hard drive. Total CPU time spent: 135 hours (user+kernel modes), corresponding to 12h of real time on the 16-core CPU. |
122 | +Another recent full run, including the MMCIF downloads and the computation of heavy statistics (`--wadley --distance-matrices`), lasted 13h (real time) on a 60-core Xeon E7-4850v4@2.10GHz with 120 GB of RAM. The user+kernel time was about 300h. | ||
122 | 123 | ||
123 | Update runs are much quicker, around 3 hours. It depends mostly on what RNA families are concerned by the update. | 124 | Update runs are much quicker, around 3 hours. It depends mostly on what RNA families are concerned by the update. |
124 | 125 | ||
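The two timing reports in the hunk above give both CPU time (user+kernel) and real time, so the average number of busy cores can be estimated by dividing one by the other. The following is only a back-of-the-envelope sketch: the function name `average_parallelism` is hypothetical, and the figures are the rounded ones quoted in the README text.

```python
def average_parallelism(cpu_hours, wall_hours):
    """Average number of cores kept busy: total CPU time divided by real time."""
    return cpu_hours / wall_hours

# 16-core Ryzen run: 135 h of CPU time over 12 h of real time
print(round(average_parallelism(135, 12), 2))  # 11.25

# 60-core Xeon run: about 300 h of CPU time over 13 h of real time
print(round(average_parallelism(300, 13), 2))  # 23.08
```

Under these numbers, the larger machine keeps roughly twice as many cores busy on average, but far fewer than the 60 available. That would be consistent with parts of the run (downloads, database I/O) not scaling with the core count, although the README does not state this.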
... | @@ -135,9 +136,11 @@ By default, this computes: | ... | @@ -135,9 +136,11 @@ By default, this computes: |
135 | * Statistics over the sequence lengths, nucleotide frequencies, and basepair types by RNA family | 136 | * Statistics over the sequence lengths, nucleotide frequencies, and basepair types by RNA family |
136 | * Overall database content statistics | 137 | * Overall database content statistics |
137 | 138 | ||
139 | +If you have run RNANet once with option `--extract`, you can compute additional statistics by passing the following option: | ||
140 | +* `--distance-matrices`, to compute pairwise residue distances within each chain, then the average and standard deviation by RNA family. This is meant to capture the average shape of an RNA family. The distance matrices have the size of the family's covariance model (match states). Unresolved nucleotides and deletions relative to the covariance model are NaNs. | ||
141 | + | ||
138 | If you have run RNANet once with options `--no-homology` and `--extract`, you unlock new statistics over unmapped chains. | 142 | If you have run RNANet once with options `--no-homology` and `--extract`, you unlock new statistics over unmapped chains. |
139 | * You will be allowed to use option `--wadley` to reproduce Wadley & al. (2007) results automatically. These are clustering results of the pseudotorsions angles of the backbone. | 143 | * You will be allowed to use option `--wadley` to reproduce Wadley & al. (2007) results automatically. These are clustering results of the pseudotorsions angles of the backbone. |
140 | -* (experimental) You will be allowed to use option `--distance-matrices` to compute pairwise residue distances within the chain for every chain, and compute average and standard deviations by RNA families. This is supposed to capture the average shape of an RNA family. | ||
141 | 144 | ||
142 | # Output files | 145 | # Output files |
143 | 146 | ... | ... |
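The `--distance-matrices` statistic described in the hunk above (per-chain pairwise distances, then a per-family average and standard deviation that tolerate NaNs) can be illustrated with a minimal NumPy sketch. This is not RNANet's implementation: the function names, the choice of one 3D point per covariance-model position, and the toy coordinates are all assumptions made for illustration.

```python
import numpy as np

def distance_matrix(coords):
    """Pairwise Euclidean distances between positions.

    coords has shape (n_positions, 3); a row of NaNs stands for an
    unresolved nucleotide or a deletion, and yields NaN distances.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def family_statistics(matrices):
    """Element-wise mean and standard deviation over chains, ignoring NaNs."""
    stack = np.stack(matrices)
    return np.nanmean(stack, axis=0), np.nanstd(stack, axis=0)

# Toy family of two chains aligned to a 3-position model (hypothetical data).
chain_a = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [6.0, 8.0, 0.0]])
chain_b = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [np.nan] * 3])

mean, std = family_statistics([distance_matrix(chain_a), distance_matrix(chain_b)])
print(mean[0, 1], std[0, 1])  # averaged over both chains
print(mean[0, 2], std[0, 2])  # only chain_a contributes here
```

Because every chain of a family is mapped onto the same model positions, all matrices share one shape and can be stacked directly. A cell that is NaN in every chain would stay NaN in the average (NumPy emits a `RuntimeWarning` in that case).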
... | @@ -969,6 +969,7 @@ class Pipeline: | ... | @@ -969,6 +969,7 @@ class Pipeline: |
969 | self.REUSE_ALL = False | 969 | self.REUSE_ALL = False |
970 | self.REDUNDANT = False | 970 | self.REDUNDANT = False |
971 | self.ALIGNOPTS = None | 971 | self.ALIGNOPTS = None |
972 | + self.STATSOPTS = None | ||
972 | self.USESINA = False | 973 | self.USESINA = False |
973 | self.SELECT_ONLY = None | 974 | self.SELECT_ONLY = None |
974 | self.ARCHIVE = False | 975 | self.ARCHIVE = False |
... | @@ -1102,6 +1103,8 @@ class Pipeline: | ... | @@ -1102,6 +1103,8 @@ class Pipeline: |
1102 | self.REUSE_ALL = True | 1103 | self.REUSE_ALL = True |
1103 | elif opt == "cmalign-opts": | 1104 | elif opt == "cmalign-opts": |
1104 | self.ALIGNOPTS = arg | 1105 | self.ALIGNOPTS = arg |
1106 | + elif opt == "stats-opts": | ||
1107 | + self.STATSOPTS = arg.split(" ") | |
1105 | elif opt == "--all": | 1108 | elif opt == "--all": |
1106 | self.REUSE_ALL = True | 1109 | self.REUSE_ALL = True |
1107 | self.USE_KNOWN_ISSUES = False | 1110 | self.USE_KNOWN_ISSUES = False |
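The value passed to `--stats-opts` is a single string such as `"--wadley --distance-matrices"`, which has to become a list of arguments before it can be appended to a `subprocess` command. A minimal sketch of that step follows; the helper name `parse_stats_opts` is hypothetical, and `shlex.split` is swapped in here as one possible choice because it also handles quoted values.

```python
import shlex

def parse_stats_opts(arg):
    """Split the string given to --stats-opts into a list of arguments."""
    return shlex.split(arg)

print(parse_stats_opts("--wadley --distance-matrices"))
# ['--wadley', '--distance-matrices']

# Reminder on str.split: the method is called on the string to split,
# and the separator is the argument. Swapping the two splits the separator.
print("--wadley --distance-matrices".split(" "))  # ['--wadley', '--distance-matrices']
print(" ".split("--wadley --distance-matrices"))  # [' ']
```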
... | @@ -1545,9 +1548,12 @@ class Pipeline: | ... | @@ -1545,9 +1548,12 @@ class Pipeline: |
1545 | 1548 | ||
1546 | # Run statistics files | 1549 | # Run statistics files |
1547 | subprocess.run([python_executable, fileDir+"/scripts/regression.py", runDir + "/results/RNANet.db"]) | 1550 | subprocess.run([python_executable, fileDir+"/scripts/regression.py", runDir + "/results/RNANet.db"]) |
1551 | + if self.STATSOPTS is None: | ||
1548 | subprocess.run([python_executable, fileDir+"/statistics.py", "--3d-folder", path_to_3D_data, | 1552 | subprocess.run([python_executable, fileDir+"/statistics.py", "--3d-folder", path_to_3D_data, |
1549 | "--seq-folder", path_to_seq_data, "-r", str(self.CRYSTAL_RES)]) | 1553 | "--seq-folder", path_to_seq_data, "-r", str(self.CRYSTAL_RES)]) |
1550 | - | 1554 | + else: |
1555 | + subprocess.run([python_executable, fileDir+"/statistics.py", "--3d-folder", path_to_3D_data, | ||
1556 | + "--seq-folder", path_to_seq_data, "-r", str(self.CRYSTAL_RES)] + self.STATSOPTS) | ||
1551 | # Save additional informations | 1557 | # Save additional informations |
1552 | with sqlite3.connect(runDir+"/results/RNANet.db") as conn: | 1558 | with sqlite3.connect(runDir+"/results/RNANet.db") as conn: |
1553 | conn.execute('pragma journal_mode=wal') | 1559 | conn.execute('pragma journal_mode=wal') | ... | ... |
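The two `subprocess.run` calls in the hunk above differ only by the extra options appended to the argument list. As a sketch of an alternative, the command could be built once and extended when options are present. The function name `build_statistics_command` and its parameters are hypothetical, not part of RNANet.

```python
def build_statistics_command(python_executable, file_dir, path_to_3d_data,
                             path_to_seq_data, resolution, stats_opts=None):
    """Return the argument list for statistics.py, with optional extra options."""
    cmd = [python_executable, file_dir + "/statistics.py",
           "--3d-folder", path_to_3d_data,
           "--seq-folder", path_to_seq_data,
           "-r", str(resolution)]
    # A None or empty stats_opts leaves the base command unchanged.
    return cmd + list(stats_opts or [])

cmd = build_statistics_command("python3.8", "/RNANet", "/3D", "/sequences",
                               20.0, ["--wadley", "--distance-matrices"])
print(cmd)
```

The resulting list can be passed directly to `subprocess.run(cmd)`, which avoids maintaining two near-identical calls.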
1 | +6ydp_1_AA_1176-2737 | ||
2 | +6ydw_1_AA_1176-2737 | ||
1 | 2z9q_1_A_1-72 | 3 | 2z9q_1_A_1-72 |
2 | 1ml5_1_b_5-121 | 4 | 1ml5_1_b_5-121 |
3 | 1ml5_1_a_1-2914 | 5 | 1ml5_1_a_1-2914 |
... | @@ -9,6 +11,9 @@ | ... | @@ -9,6 +11,9 @@ |
9 | 1qza_1_B_1-73 | 11 | 1qza_1_B_1-73 |
10 | 1ls2_1_B_1-73 | 12 | 1ls2_1_B_1-73 |
11 | 1gsg_1_T_1-72 | 13 | 1gsg_1_T_1-72 |
14 | +7d1a_1_A_805-902 | ||
15 | +7d0g_1_A_805-913 | ||
16 | +7d0f_1_A_817-913 | ||
12 | 3jcr_1_H_1-115 | 17 | 3jcr_1_H_1-115 |
13 | 1vy7_1_AY_1-73 | 18 | 1vy7_1_AY_1-73 |
14 | 1vy7_1_CY_1-73 | 19 | 1vy7_1_CY_1-73 |
... | @@ -18,15 +23,21 @@ | ... | @@ -18,15 +23,21 @@ |
18 | 4v48_1_A9_3-118 | 23 | 4v48_1_A9_3-118 |
19 | 4v47_1_A9_3-118 | 24 | 4v47_1_A9_3-118 |
20 | 2ob7_1_A_10-319 | 25 | 2ob7_1_A_10-319 |
21 | -1x1l_1_A_1-132 | 26 | +1x1l_1_A_1-130 |
22 | -1zc8_1_Z_1-93 | 27 | +1zc8_1_Z_1-91 |
23 | -2ob7_1_D_1-132 | 28 | +2ob7_1_D_1-130 |
24 | -4v42_1_BB_5-121 | ||
25 | 4v42_1_BA_1-2914 | 29 | 4v42_1_BA_1-2914 |
30 | +4v42_1_BB_5-121 | ||
26 | 1r2x_1_C_1-58 | 31 | 1r2x_1_C_1-58 |
27 | 1r2w_1_C_1-58 | 32 | 1r2w_1_C_1-58 |
28 | 1eg0_1_L_1-56 | 33 | 1eg0_1_L_1-56 |
29 | -5zzm_1_N_1-2904 | 34 | +3dg2_1_A_1-1542 |
35 | +3dg0_1_A_1-1542 | ||
36 | +4v48_1_BA_1-1543 | ||
37 | +4v47_1_BA_1-1542 | ||
38 | +3dg4_1_A_1-1542 | ||
39 | +3dg5_1_A_1-1542 | ||
40 | +5zzm_1_N_1-2903 | ||
30 | 2rdo_1_B_1-2904 | 41 | 2rdo_1_B_1-2904 |
31 | 3dg2_1_B_1-2904 | 42 | 3dg2_1_B_1-2904 |
32 | 3dg0_1_B_1-2904 | 43 | 3dg0_1_B_1-2904 |
... | @@ -34,21 +45,17 @@ | ... | @@ -34,21 +45,17 @@ |
34 | 4v47_1_A0_1-2904 | 45 | 4v47_1_A0_1-2904 |
35 | 3dg4_1_B_1-2904 | 46 | 3dg4_1_B_1-2904 |
36 | 3dg5_1_B_1-2904 | 47 | 3dg5_1_B_1-2904 |
37 | -3dg2_1_A_1-1542 | ||
38 | -3dg0_1_A_1-1542 | ||
39 | -4v48_1_BA_1-1543 | ||
40 | -4v47_1_BA_1-1542 | ||
41 | -3dg4_1_A_1-1542 | ||
42 | -3dg5_1_A_1-1542 | ||
43 | 1eg0_1_O_1-73 | 48 | 1eg0_1_O_1-73 |
44 | 1zc8_1_A_1-59 | 49 | 1zc8_1_A_1-59 |
45 | -1mvr_1_D_1-61 | ||
46 | -4adx_1_9_1-123 | ||
47 | -1zn1_1_B_1-59 | ||
48 | 1jgq_1_A_2-1520 | 50 | 1jgq_1_A_2-1520 |
49 | 4v42_1_AA_2-1520 | 51 | 4v42_1_AA_2-1520 |
50 | 1jgo_1_A_2-1520 | 52 | 1jgo_1_A_2-1520 |
51 | 1jgp_1_A_2-1520 | 53 | 1jgp_1_A_2-1520 |
54 | +1mvr_1_D_1-59 | ||
55 | +4c9d_1_D_29-1 | ||
56 | +4c9d_1_C_29-1 | ||
57 | +4adx_1_9_1-121 | ||
58 | +1zn1_1_B_1-59 | ||
52 | 1emi_1_B_1-108 | 59 | 1emi_1_B_1-108 |
53 | 3iy9_1_A_498-1027 | 60 | 3iy9_1_A_498-1027 |
54 | 3ep2_1_B_1-50 | 61 | 3ep2_1_B_1-50 |
... | @@ -61,7 +68,7 @@ | ... | @@ -61,7 +68,7 @@ |
61 | 3cw1_1_V_1-138 | 68 | 3cw1_1_V_1-138 |
62 | 3cw1_1_v_1-138 | 69 | 3cw1_1_v_1-138 |
63 | 2iy3_1_B_9-105 | 70 | 2iy3_1_B_9-105 |
64 | -3jcr_1_N_1-107 | 71 | +3jcr_1_N_1-106 |
65 | 2vaz_1_A_64-177 | 72 | 2vaz_1_A_64-177 |
66 | 2ftc_1_R_81-1466 | 73 | 2ftc_1_R_81-1466 |
67 | 3jcr_1_M_1-141 | 74 | 3jcr_1_M_1-141 |
... | @@ -70,9 +77,10 @@ | ... | @@ -70,9 +77,10 @@ |
70 | 3iy8_1_A_1-540 | 77 | 3iy8_1_A_1-540 |
71 | 4v5z_1_BY_2-113 | 78 | 4v5z_1_BY_2-113 |
72 | 4v5z_1_BZ_1-70 | 79 | 4v5z_1_BZ_1-70 |
73 | -4v5z_1_B1_2-125 | 80 | +4v5z_1_B1_2-123 |
74 | -4adx_1_0_1-2925 | 81 | +1mvr_1_B_1-96 |
75 | -1mvr_1_B_3-96 | 82 | +4adx_1_0_1-2923 |
76 | 3eq4_1_Y_1-69 | 83 | 3eq4_1_Y_1-69 |
77 | -6uz7_1_8_2140-2827 | 84 | +7a5p_1_2_259-449 |
85 | +6uz7_1_8_2140-2825 | ||
78 | 4v5z_1_AA_1-1563 | 86 | 4v5z_1_AA_1-1563 | ... | ... |
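The entries in the list above appear to follow the pattern `<PDB id>_<model>_<chain>_<start>-<end>`. This is inferred from the examples only, so the regular expression and field names below are assumptions, not a documented format. A small parser along these lines can also flag reversed mappings such as `4c9d_1_D_29-1`.

```python
import re
from typing import NamedTuple

class ChainLabel(NamedTuple):
    pdb_id: str
    model: int
    chain: str
    start: int
    end: int

    @property
    def reversed_mapping(self):
        """True when the mapped interval runs backwards (start > end)."""
        return self.start > self.end

# Assumed layout: 4-character PDB id, model number, chain name, start-end.
LABEL_RE = re.compile(r"^([0-9A-Za-z]{4})_(\d+)_([^_]+)_(-?\d+)-(-?\d+)$")

def parse_chain_label(label):
    """Parse one label into a ChainLabel, or raise ValueError if it does not match."""
    match = LABEL_RE.match(label.strip())
    if match is None:
        raise ValueError(f"Unrecognized chain label: {label!r}")
    pdb_id, model, chain, start, end = match.groups()
    return ChainLabel(pdb_id, int(model), chain, int(start), int(end))

print(parse_chain_label("6ydp_1_AA_1176-2737"))
print(parse_chain_label("4c9d_1_D_29-1").reversed_mapping)  # True
```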
1 | +6ydp_1_AA_1176-2737 | ||
2 | +Could not find nucleotides of chain AA in annotation 6ydp.json. Either there is a problem with 6ydp mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | ||
3 | + | ||
4 | +6ydw_1_AA_1176-2737 | ||
5 | +Could not find nucleotides of chain AA in annotation 6ydw.json. Either there is a problem with 6ydw mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | ||
6 | + | ||
1 | 2z9q_1_A_1-72 | 7 | 2z9q_1_A_1-72 |
2 | DSSR warning 2z9q.json: no nucleotides found. Ignoring 2z9q_1_A_1-72. | 8 | DSSR warning 2z9q.json: no nucleotides found. Ignoring 2z9q_1_A_1-72. |
3 | 9 | ||
... | @@ -31,6 +37,15 @@ DSSR warning 1ls2.json: no nucleotides found. Ignoring 1ls2_1_B_1-73. | ... | @@ -31,6 +37,15 @@ DSSR warning 1ls2.json: no nucleotides found. Ignoring 1ls2_1_B_1-73. |
31 | 1gsg_1_T_1-72 | 37 | 1gsg_1_T_1-72 |
32 | DSSR warning 1gsg.json: no nucleotides found. Ignoring 1gsg_1_T_1-72. | 38 | DSSR warning 1gsg.json: no nucleotides found. Ignoring 1gsg_1_T_1-72. |
33 | 39 | ||
40 | +7d1a_1_A_805-902 | ||
41 | +Could not find nucleotides of chain A in annotation 7d1a.json. Either there is a problem with 7d1a mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | ||
42 | + | ||
43 | +7d0g_1_A_805-913 | ||
44 | +Could not find nucleotides of chain A in annotation 7d0g.json. Either there is a problem with 7d0g mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | ||
45 | + | ||
46 | +7d0f_1_A_817-913 | ||
47 | +Could not find nucleotides of chain A in annotation 7d0f.json. Either there is a problem with 7d0f mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | ||
48 | + | ||
34 | 3jcr_1_H_1-115 | 49 | 3jcr_1_H_1-115 |
35 | DSSR warning 3jcr.json: no nucleotides found. Ignoring 3jcr_1_H_1-115. | 50 | DSSR warning 3jcr.json: no nucleotides found. Ignoring 3jcr_1_H_1-115. |
36 | 51 | ||
... | @@ -58,21 +73,21 @@ DSSR warning 4v47.json: no nucleotides found. Ignoring 4v47_1_A9_3-118. | ... | @@ -58,21 +73,21 @@ DSSR warning 4v47.json: no nucleotides found. Ignoring 4v47_1_A9_3-118. |
58 | 2ob7_1_A_10-319 | 73 | 2ob7_1_A_10-319 |
59 | DSSR warning 2ob7.json: no nucleotides found. Ignoring 2ob7_1_A_10-319. | 74 | DSSR warning 2ob7.json: no nucleotides found. Ignoring 2ob7_1_A_10-319. |
60 | 75 | ||
61 | -1x1l_1_A_1-132 | 76 | +1x1l_1_A_1-130 |
62 | -DSSR warning 1x1l.json: no nucleotides found. Ignoring 1x1l_1_A_1-132. | 77 | +DSSR warning 1x1l.json: no nucleotides found. Ignoring 1x1l_1_A_1-130. |
63 | - | ||
64 | -1zc8_1_Z_1-93 | ||
65 | -DSSR warning 1zc8.json: no nucleotides found. Ignoring 1zc8_1_Z_1-93. | ||
66 | 78 | ||
67 | -2ob7_1_D_1-132 | 79 | +1zc8_1_Z_1-91 |
68 | -DSSR warning 2ob7.json: no nucleotides found. Ignoring 2ob7_1_D_1-132. | 80 | +DSSR warning 1zc8.json: no nucleotides found. Ignoring 1zc8_1_Z_1-91. |
69 | 81 | ||
70 | -4v42_1_BB_5-121 | 82 | +2ob7_1_D_1-130 |
71 | -Could not find nucleotides of chain BB in annotation 4v42.json. Either there is a problem with 4v42 mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | 83 | +DSSR warning 2ob7.json: no nucleotides found. Ignoring 2ob7_1_D_1-130. |
72 | 84 | ||
73 | 4v42_1_BA_1-2914 | 85 | 4v42_1_BA_1-2914 |
74 | Could not find nucleotides of chain BA in annotation 4v42.json. Either there is a problem with 4v42 mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | 86 | Could not find nucleotides of chain BA in annotation 4v42.json. Either there is a problem with 4v42 mmCIF download, or the bases are not resolved in the structure. Delete it and retry. |
75 | 87 | ||
88 | +4v42_1_BB_5-121 | ||
89 | +Could not find nucleotides of chain BB in annotation 4v42.json. Either there is a problem with 4v42 mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | ||
90 | + | ||
76 | 1r2x_1_C_1-58 | 91 | 1r2x_1_C_1-58 |
77 | DSSR warning 1r2x.json: no nucleotides found. Ignoring 1r2x_1_C_1-58. | 92 | DSSR warning 1r2x.json: no nucleotides found. Ignoring 1r2x_1_C_1-58. |
78 | 93 | ||
... | @@ -82,8 +97,26 @@ DSSR warning 1r2w.json: no nucleotides found. Ignoring 1r2w_1_C_1-58. | ... | @@ -82,8 +97,26 @@ DSSR warning 1r2w.json: no nucleotides found. Ignoring 1r2w_1_C_1-58. |
82 | 1eg0_1_L_1-56 | 97 | 1eg0_1_L_1-56 |
83 | DSSR warning 1eg0.json: no nucleotides found. Ignoring 1eg0_1_L_1-56. | 98 | DSSR warning 1eg0.json: no nucleotides found. Ignoring 1eg0_1_L_1-56. |
84 | 99 | ||
85 | -5zzm_1_N_1-2904 | 100 | +3dg2_1_A_1-1542 |
86 | -DSSR warning 5zzm.json: no nucleotides found. Ignoring 5zzm_1_N_1-2904. | 101 | +DSSR warning 3dg2.json: no nucleotides found. Ignoring 3dg2_1_A_1-1542. |
102 | + | ||
103 | +3dg0_1_A_1-1542 | ||
104 | +DSSR warning 3dg0.json: no nucleotides found. Ignoring 3dg0_1_A_1-1542. | ||
105 | + | ||
106 | +4v48_1_BA_1-1543 | ||
107 | +DSSR warning 4v48.json: no nucleotides found. Ignoring 4v48_1_BA_1-1543. | ||
108 | + | ||
109 | +4v47_1_BA_1-1542 | ||
110 | +DSSR warning 4v47.json: no nucleotides found. Ignoring 4v47_1_BA_1-1542. | ||
111 | + | ||
112 | +3dg4_1_A_1-1542 | ||
113 | +DSSR warning 3dg4.json: no nucleotides found. Ignoring 3dg4_1_A_1-1542. | ||
114 | + | ||
115 | +3dg5_1_A_1-1542 | ||
116 | +DSSR warning 3dg5.json: no nucleotides found. Ignoring 3dg5_1_A_1-1542. | ||
117 | + | ||
118 | +5zzm_1_N_1-2903 | ||
119 | +DSSR warning 5zzm.json: no nucleotides found. Ignoring 5zzm_1_N_1-2903. | ||
87 | 120 | ||
88 | 2rdo_1_B_1-2904 | 121 | 2rdo_1_B_1-2904 |
89 | DSSR warning 2rdo.json: no nucleotides found. Ignoring 2rdo_1_B_1-2904. | 122 | DSSR warning 2rdo.json: no nucleotides found. Ignoring 2rdo_1_B_1-2904. |
... | @@ -106,39 +139,12 @@ DSSR warning 3dg4.json: no nucleotides found. Ignoring 3dg4_1_B_1-2904. | ... | @@ -106,39 +139,12 @@ DSSR warning 3dg4.json: no nucleotides found. Ignoring 3dg4_1_B_1-2904. |
106 | 3dg5_1_B_1-2904 | 139 | 3dg5_1_B_1-2904 |
107 | DSSR warning 3dg5.json: no nucleotides found. Ignoring 3dg5_1_B_1-2904. | 140 | DSSR warning 3dg5.json: no nucleotides found. Ignoring 3dg5_1_B_1-2904. |
108 | 141 | ||
109 | -3dg2_1_A_1-1542 | ||
110 | -DSSR warning 3dg2.json: no nucleotides found. Ignoring 3dg2_1_A_1-1542. | ||
111 | - | ||
112 | -3dg0_1_A_1-1542 | ||
113 | -DSSR warning 3dg0.json: no nucleotides found. Ignoring 3dg0_1_A_1-1542. | ||
114 | - | ||
115 | -4v48_1_BA_1-1543 | ||
116 | -DSSR warning 4v48.json: no nucleotides found. Ignoring 4v48_1_BA_1-1543. | ||
117 | - | ||
118 | -4v47_1_BA_1-1542 | ||
119 | -DSSR warning 4v47.json: no nucleotides found. Ignoring 4v47_1_BA_1-1542. | ||
120 | - | ||
121 | -3dg4_1_A_1-1542 | ||
122 | -DSSR warning 3dg4.json: no nucleotides found. Ignoring 3dg4_1_A_1-1542. | ||
123 | - | ||
124 | -3dg5_1_A_1-1542 | ||
125 | -DSSR warning 3dg5.json: no nucleotides found. Ignoring 3dg5_1_A_1-1542. | ||
126 | - | ||
127 | 1eg0_1_O_1-73 | 142 | 1eg0_1_O_1-73 |
128 | DSSR warning 1eg0.json: no nucleotides found. Ignoring 1eg0_1_O_1-73. | 143 | DSSR warning 1eg0.json: no nucleotides found. Ignoring 1eg0_1_O_1-73. |
129 | 144 | ||
130 | 1zc8_1_A_1-59 | 145 | 1zc8_1_A_1-59 |
131 | DSSR warning 1zc8.json: no nucleotides found. Ignoring 1zc8_1_A_1-59. | 146 | DSSR warning 1zc8.json: no nucleotides found. Ignoring 1zc8_1_A_1-59. |
132 | 147 | ||
133 | -1mvr_1_D_1-61 | ||
134 | -DSSR warning 1mvr.json: no nucleotides found. Ignoring 1mvr_1_D_1-61. | ||
135 | - | ||
136 | -4adx_1_9_1-123 | ||
137 | -DSSR warning 4adx.json: no nucleotides found. Ignoring 4adx_1_9_1-123. | ||
138 | - | ||
139 | -1zn1_1_B_1-59 | ||
140 | -DSSR warning 1zn1.json: no nucleotides found. Ignoring 1zn1_1_B_1-59. | ||
141 | - | ||
142 | 1jgq_1_A_2-1520 | 148 | 1jgq_1_A_2-1520 |
143 | Could not find nucleotides of chain A in annotation 1jgq.json. Either there is a problem with 1jgq mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | 149 | Could not find nucleotides of chain A in annotation 1jgq.json. Either there is a problem with 1jgq mmCIF download, or the bases are not resolved in the structure. Delete it and retry. |
144 | 150 | ||
... | @@ -151,6 +157,21 @@ Could not find nucleotides of chain A in annotation 1jgo.json. Either there is a | ... | @@ -151,6 +157,21 @@ Could not find nucleotides of chain A in annotation 1jgo.json. Either there is a |
151 | 1jgp_1_A_2-1520 | 157 | 1jgp_1_A_2-1520 |
152 | Could not find nucleotides of chain A in annotation 1jgp.json. Either there is a problem with 1jgp mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | 158 | Could not find nucleotides of chain A in annotation 1jgp.json. Either there is a problem with 1jgp mmCIF download, or the bases are not resolved in the structure. Delete it and retry. |
153 | 159 | ||
160 | +1mvr_1_D_1-59 | ||
161 | +DSSR warning 1mvr.json: no nucleotides found. Ignoring 1mvr_1_D_1-59. | ||
162 | + | ||
163 | +4c9d_1_D_29-1 | ||
164 | +Mapping is reversed, this case is not supported (yet). | ||
165 | + | ||
166 | +4c9d_1_C_29-1 | ||
167 | +Mapping is reversed, this case is not supported (yet). | ||
168 | + | ||
169 | +4adx_1_9_1-121 | ||
170 | +DSSR warning 4adx.json: no nucleotides found. Ignoring 4adx_1_9_1-121. | ||
171 | + | ||
172 | +1zn1_1_B_1-59 | ||
173 | +DSSR warning 1zn1.json: no nucleotides found. Ignoring 1zn1_1_B_1-59. | ||
174 | + | ||
154 | 1emi_1_B_1-108 | 175 | 1emi_1_B_1-108 |
155 | DSSR warning 1emi.json: no nucleotides found. Ignoring 1emi_1_B_1-108. | 176 | DSSR warning 1emi.json: no nucleotides found. Ignoring 1emi_1_B_1-108. |
156 | 177 | ||
... | @@ -187,8 +208,8 @@ DSSR warning 3cw1.json: no nucleotides found. Ignoring 3cw1_1_v_1-138. | ... | @@ -187,8 +208,8 @@ DSSR warning 3cw1.json: no nucleotides found. Ignoring 3cw1_1_v_1-138. |
187 | 2iy3_1_B_9-105 | 208 | 2iy3_1_B_9-105 |
188 | DSSR warning 2iy3.json: no nucleotides found. Ignoring 2iy3_1_B_9-105. | 209 | DSSR warning 2iy3.json: no nucleotides found. Ignoring 2iy3_1_B_9-105. |
189 | 210 | ||
190 | -3jcr_1_N_1-107 | 211 | +3jcr_1_N_1-106 |
191 | -DSSR warning 3jcr.json: no nucleotides found. Ignoring 3jcr_1_N_1-107. | 212 | +DSSR warning 3jcr.json: no nucleotides found. Ignoring 3jcr_1_N_1-106. |
192 | 213 | ||
193 | 2vaz_1_A_64-177 | 214 | 2vaz_1_A_64-177 |
194 | DSSR warning 2vaz.json: no nucleotides found. Ignoring 2vaz_1_A_64-177. | 215 | DSSR warning 2vaz.json: no nucleotides found. Ignoring 2vaz_1_A_64-177. |
... | @@ -214,19 +235,22 @@ DSSR warning 4v5z.json: no nucleotides found. Ignoring 4v5z_1_BY_2-113. | ... | @@ -214,19 +235,22 @@ DSSR warning 4v5z.json: no nucleotides found. Ignoring 4v5z_1_BY_2-113. |
214 | 4v5z_1_BZ_1-70 | 235 | 4v5z_1_BZ_1-70 |
215 | DSSR warning 4v5z.json: no nucleotides found. Ignoring 4v5z_1_BZ_1-70. | 236 | DSSR warning 4v5z.json: no nucleotides found. Ignoring 4v5z_1_BZ_1-70. |
216 | 237 | ||
217 | -4v5z_1_B1_2-125 | 238 | +4v5z_1_B1_2-123 |
218 | -DSSR warning 4v5z.json: no nucleotides found. Ignoring 4v5z_1_B1_2-125. | 239 | +DSSR warning 4v5z.json: no nucleotides found. Ignoring 4v5z_1_B1_2-123. |
219 | 240 | ||
220 | -4adx_1_0_1-2925 | 241 | +1mvr_1_B_1-96 |
221 | -DSSR warning 4adx.json: no nucleotides found. Ignoring 4adx_1_0_1-2925. | 242 | +DSSR warning 1mvr.json: no nucleotides found. Ignoring 1mvr_1_B_1-96. |
222 | 243 | ||
223 | -1mvr_1_B_3-96 | 244 | +4adx_1_0_1-2923 |
224 | -DSSR warning 1mvr.json: no nucleotides found. Ignoring 1mvr_1_B_3-96. | 245 | +DSSR warning 4adx.json: no nucleotides found. Ignoring 4adx_1_0_1-2923. |
225 | 246 | ||
226 | 3eq4_1_Y_1-69 | 247 | 3eq4_1_Y_1-69 |
227 | DSSR warning 3eq4.json: no nucleotides found. Ignoring 3eq4_1_Y_1-69. | 248 | DSSR warning 3eq4.json: no nucleotides found. Ignoring 3eq4_1_Y_1-69. |
228 | 249 | ||
229 | -6uz7_1_8_2140-2827 | 250 | +7a5p_1_2_259-449 |
251 | +Could not find nucleotides of chain 2 in annotation 7a5p.json. Either there is a problem with 7a5p mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | ||
252 | + | ||
253 | +6uz7_1_8_2140-2825 | ||
230 | Could not find nucleotides of chain 8 in annotation 6uz7.json. Either there is a problem with 6uz7 mmCIF download, or the bases are not resolved in the structure. Delete it and retry. | 254 | Could not find nucleotides of chain 8 in annotation 6uz7.json. Either there is a problem with 6uz7 mmCIF download, or the bases are not resolved in the structure. Delete it and retry. |
231 | 255 | ||
232 | 4v5z_1_AA_1-1563 | 256 | 4v5z_1_AA_1-1563 | ... | ... |
... | @@ -4,7 +4,7 @@ cd /home/lbecquey/Projects/RNANet | ... | @@ -4,7 +4,7 @@ cd /home/lbecquey/Projects/RNANet |
4 | rm -rf latest_run.log errors.txt | 4 | rm -rf latest_run.log errors.txt |
5 | 5 | ||
6 | # Run RNANet | 6 | # Run RNANet |
7 | -bash -c 'time python3.8 ./RNAnet.py --3d-folder /home/lbecquey/Data/RNA/3D/ --seq-folder /home/lbecquey/Data/RNA/sequences/ -r 20.0 --extract -s --archive' > latest_run.log 2>&1 | 7 | +bash -c 'time python3.8 ./RNAnet.py --3d-folder /home/lbecquey/Data/RNA/3D/ --seq-folder /home/lbecquey/Data/RNA/sequences/ --sina -r 20.0 --extract -s --archive' > latest_run.log 2>&1 |
8 | echo 'Compressing RNANet.db.gz...' >> latest_run.log | 8 | echo 'Compressing RNANet.db.gz...' >> latest_run.log |
9 | touch results/RNANet.db # update last modification date | 9 | touch results/RNANet.db # update last modification date |
10 | gzip -k /home/lbecquey/Projects/RNANet/results/RNANet.db # compress it | 10 | gzip -k /home/lbecquey/Projects/RNANet/results/RNANet.db # compress it | ... | ... |