Showing 1 changed file with 3 additions and 3 deletions
... | @@ -6,13 +6,13 @@ We use the Rfam mappings between 3D structures and known Rfam families, using th | ... | @@ -6,13 +6,13 @@ We use the Rfam mappings between 3D structures and known Rfam families, using th |
6 | Future versions might compute a real MSA-based clustering directly with Rfamseq ncRNA sequences, like ProteinNet does with protein sequences, but this requires a tool similar to jackHMMER in the Infernal software suite, which is not available yet. | 6 | Future versions might compute a real MSA-based clustering directly with Rfamseq ncRNA sequences, like ProteinNet does with protein sequences, but this requires a tool similar to jackHMMER in the Infernal software suite, which is not available yet. |
7 | 7 | ||
8 | This script prepares the dataset from available public data in PDB and Rfam. | 8 | This script prepares the dataset from available public data in PDB and Rfam. |
9 | -It requires solid hardware to run. (Tested on a server with 24 cores and 48GB of RAM.) | 9 | +It requires solid hardware to run. (Tested on a server with 32 cores and 48GB of RAM.) |
10 | 10 | ||
11 | # Dependencies | 11 | # Dependencies |
12 | You need to install Infernal, DSSR, and SINA before running this. | 12 | You need to install Infernal, DSSR, and SINA before running this. |
13 | I moved to python3.8.1. Unfortunately, python3.6 is no longer supported, because of changes in the multiprocessing and Threading packages. Untested with Python 3.7.*. | 13 | I moved to python3.8.1. Unfortunately, python3.6 is no longer supported, because of changes in the multiprocessing and Threading packages. Untested with Python 3.7.*. |
14 | 14 | ||
15 | -Packages numpy, pandas, matplotlib, requests, psutil, biopython, and sqlalchemy are required. | 15 | +Packages numpy, pandas, matplotlib, requests, psutil, biopython, sqlalchemy and tqdm are required. |
16 | `python3.8 -m pip install numpy pandas matplotlib pymysql requests psutil biopython sqlalchemy tqdm` | 16 | `python3.8 -m pip install numpy pandas matplotlib pymysql requests psutil biopython sqlalchemy tqdm` |
17 | 17 | ||
18 | Before use, please set the two variables `path_to_3D_data` and `path_to_seq_data` (around line 30 of RNAnet.py) to two folders where you want to store RNA 3D structures and RNA sequences. A few gigabytes will be produced. | 18 | Before use, please set the two variables `path_to_3D_data` and `path_to_seq_data` (around line 30 of RNAnet.py) to two folders where you want to store RNA 3D structures and RNA sequences. A few gigabytes will be produced. |
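Note on the Dependencies hunk: a quick pre-flight check can confirm that the packages from the `pip install` line and the three external tools are reachable before launching a long run. The sketch below is not part of RNAnet.py. The function names are hypothetical, and the executable names in the usage note are assumptions about how Infernal, DSSR, and SINA are typically installed.

```python
import importlib.util
import shutil

# Import names for the packages in the README's pip command.
# biopython is imported as "Bio"; the other names match their pip names.
REQUIRED_MODULES = ["numpy", "pandas", "matplotlib", "pymysql", "requests",
                    "psutil", "Bio", "sqlalchemy", "tqdm"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_executables(names):
    """Return the executable names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]
```

For example, `missing_executables(["cmalign", "x3dna-dssr", "sina"])` would list whichever of those binaries are absent from PATH, assuming those are the names used by your installation.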
... | @@ -34,7 +34,7 @@ Now, compute the features: | ... | @@ -34,7 +34,7 @@ Now, compute the features: |
34 | 34 | ||
35 | Then, compute the labels: | 35 | Then, compute the labels: |
36 | 36 | ||
37 | -* Run DSSR on every chain to get eta' and theta' pseudotorsions | 37 | +* Run DSSR on every chain to get a variety of descriptors per position, describing secondary and tertiary structure |
38 | * This also makes it possible to identify missing residues and compute a mask for every chain. | 38 | * This also makes it possible to identify missing residues and compute a mask for every chain. |
39 | 39 | ||
40 | Finally, store this data into files. | 40 | Finally, store this data into files. | ... | ... |
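Note on the labels hunk: the per-chain mask mentioned there can be pictured as a list of 0/1 flags, one per sequence position, marking which residues were actually resolved in the 3D structure. The following is a minimal sketch under that assumption, not the code used in RNAnet.py. The function name and its arguments are hypothetical.

```python
def residue_mask(first, last, observed):
    """Build a 0/1 mask for positions first..last (inclusive).

    A 1 means the residue number appears in `observed` (resolved in the
    structure); a 0 means it is missing.
    """
    seen = set(observed)
    return [1 if pos in seen else 0 for pos in range(first, last + 1)]
```

With a chain numbered 1 to 6 in which residues 3 and 5 are unresolved, `residue_mask(1, 6, [1, 2, 4, 6])` returns `[1, 1, 0, 1, 0, 1]`.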