Louis BECQUEY
Committed by GitHub

Update Readme.md

@@ -6,13 +6,13 @@ We use the Rfam mappings between 3D structures and known Rfam families, using th
 Future versions might compute a real MSA-based clustering directly with Rfamseq ncRNA sequences, like ProteinNet does with protein sequences, but this requires a tool similar to jackHMMER in the Infernal software suite, which is not available yet.
 
 This script prepares the dataset from available public data in PDB and Rfam.
-It requires solid hardware to run. (Tested on a server with 24 cores and 48GB of RAM.)
+It requires solid hardware to run. (Tested on a server with 32 cores and 48GB of RAM.)
 
 # Dependencies
 You need to install Infernal, DSSR, and SINA before running this.
 I moved to python3.8.1. Unfortunately, python3.6 is no longer supported, because of changes in the multiprocessing and threading packages. Untested with Python 3.7.*.
 
-Packages numpy, pandas, matplotlib, requests, psutil, biopython, and sqlalchemy are required.
+Packages numpy, pandas, matplotlib, requests, psutil, biopython, sqlalchemy and tqdm are required.
 `python3.8 -m pip install numpy pandas matplotlib pymysql requests psutil biopython sqlalchemy tqdm`
 
 Before use, please set the two variables `path_to_3D_data` and `path_to_seq_data` (around line 30 of RNAnet.py) to two folders where you want to store RNA 3D structures and RNA sequences. A few gigabytes will be produced.
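
Since the README now lists several Python packages and three external tools, a small pre-flight check can catch a missing dependency before a long run starts. The following is only a sketch and is not part of RNAnet.py: the helper names are hypothetical, and the executable names (`cmalign` for Infernal, `x3dna-dssr` for DSSR, `sina` for SINA) are assumptions that may differ on your installation.

```python
import importlib.util
import shutil
import sys

# Import names for the packages in the pip command above
# (biopython is imported as "Bio").
REQUIRED_MODULES = ["numpy", "pandas", "matplotlib", "pymysql", "requests",
                    "psutil", "Bio", "sqlalchemy", "tqdm"]

# Assumed executable names for Infernal, DSSR and SINA; adjust as needed.
REQUIRED_TOOLS = ["cmalign", "x3dna-dssr", "sina"]


def python_is_supported(version_info=sys.version_info):
    """Return True for Python 3.8 or newer (3.6 is no longer supported)."""
    return tuple(version_info[:2]) >= (3, 8)


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def missing_tools(names):
    """Return the executables that are not found on the PATH."""
    return [n for n in names if shutil.which(n) is None]


if __name__ == "__main__":
    print("Python version supported:", python_is_supported())
    print("Missing Python packages:", missing_modules(REQUIRED_MODULES))
    print("Missing external tools:", missing_tools(REQUIRED_TOOLS))
```

The check only reports what it finds; it does not verify tool versions or that the two data folders are writable.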
@@ -34,7 +34,7 @@ Now, compute the features:
 
 Then, compute the labels:
 
-* Run DSSR on every chain to get eta' and theta' pseudotorsions
+* Run DSSR on every chain to get a variety of descriptors per position, describing secondary and tertiary structure
 * This also makes it possible to identify missing residues and compute a mask for every chain.
 
 Finally, store this data into files.
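
The per-chain mask mentioned in the label step can be illustrated with a minimal sketch. This is not the RNAnet.py implementation: the function name `residue_mask` is hypothetical, positions are assumed to be numbered from 1 to the chain length, and the list of observed positions is assumed to have already been extracted from the DSSR output (that parsing step is not shown).

```python
def residue_mask(chain_length, observed_positions):
    """Return a list with 1 where a residue was resolved and 0 where it is missing.

    chain_length: number of residues expected in the chain sequence.
    observed_positions: 1-based positions for which descriptors are available.
    """
    observed = set(observed_positions)
    # Reject positions that cannot belong to this chain.
    out_of_range = sorted(p for p in observed if not 1 <= p <= chain_length)
    if out_of_range:
        raise ValueError(f"positions outside 1..{chain_length}: {out_of_range}")
    return [1 if pos in observed else 0 for pos in range(1, chain_length + 1)]


# Example: a 6-residue chain where positions 3 and 4 are unresolved.
mask = residue_mask(6, [1, 2, 5, 6])
print(mask)  # [1, 1, 0, 0, 1, 1]
```

A mask like this lets downstream code ignore positions without structural labels instead of dropping the whole chain.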