Louis BECQUEY

New documentation

......@@ -9,12 +9,18 @@ esl*
.vscode/
__pycache__/
.git/
.gitignore
.dockerignore
errors.txt
known_issues.txt
known_issues_reasons.txt
Dockerfile
LICENSE
README.md
CHANGELOG
*.md
scripts/automate.sh
scripts/kill_rnanet.sh
scripts/build_docker_image.sh
scripts/*.tar
scripts/measure.py
scripts/recompute_some_chains.py
......
......@@ -27,9 +27,3 @@ BUG CORRECTIONS
- Modified nucleotides were not always correctly transformed to N in the alignments (and nucleotide.nt_align_code fields).
Now, the alignments and nt_align_code (and consensus) only contain "ACGUN-" chars.
Now, 'N' means 'other', while '-' means 'nothing' or 'unknown'.
COMING SOON
- Automated annotation of detected Recurrent Interaction Networks (RINs), see http://carnaval.lri.fr/ .
- Possibly, automated detection of HLs and ILs from the 3D Motif Atlas (BGSU). Maybe. Their own website already does the job.
- A field estimating the quality of the sequence alignment in table family.
- Possibly, more metrics about the alignments coming from Infernal.
\ No newline at end of file
......
# More about the database structure
To help you design your own SQL queries, we provide a description of the database tables and fields.
## Table `family`, for Rfam families and their properties
* `rfam_acc`: The family codename, from Rfam's numbering (Rfam accession number)
* `description`: What RNAs fit in this family
* `nb_homologs`: The number of hits known to be homologous downloaded from Rfam to compute nucleotide frequencies
* `nb_3d_chains`: The number of 3D RNA chains mapped to the family (from Rfam-PDB mappings, or inferred using the redundancy list)
* `nb_total_homol`: Sum of the two previous fields, the number of sequences in the multiple sequence alignment, used to compute nucleotide frequencies
* `max_len`: The longest RNA sequence among the homologs (in bases, unaligned)
* `ali_len`: The aligned sequences length (in bases, aligned)
* `ali_filtered_len`: The aligned sequences length when we filter the alignment to keep only the RNANet chains (which have a 3D structure) and some gap-only columns.
* `comput_time`: Time required to compute the family's multiple sequence alignment, in seconds
* `comput_peak_mem`: RAM (or swap) required to compute the family's multiple sequence alignment, in megabytes
* `idty_percent`: Average identity percentage over pairs of the 3D chains' sequences from the family
## Table `structure`, for 3D structures of the PDB
* `pdb_id`: The 4-char PDB identifier
* `pdb_model`: The model used in the PDB file
* `date`: The first submission date of the 3D structure to a public database
* `exp_method`: A string indicating whether the structure has been obtained by X-ray crystallography ('X-RAY DIFFRACTION'), electron microscopy ('ELECTRON MICROSCOPY'), or NMR (not seen yet)
* `resolution`: Resolution of the structure, in Ångströms
## Table `chain`, for the datapoints: one chain mapped to one Rfam family
* `chain_id`: A unique identifier
* `structure_id`: The `pdb_id` where the chain comes from
* `chain_name`: The chain label, extracted from the 3D file
* `eq_class`: The BGSU equivalence class label containing this chain
* `rfam_acc`: The family which the chain is mapped to (if not mapped, value is *unmappd*)
* `pdb_start`: Position in the chain where the mapping to Rfam begins (absolute position, not residue number)
* `pdb_end`: Position in the chain where the mapping to Rfam ends (absolute position, not residue number)
* `reversed`: Whether the mapping numbering order differs from the residue numbering order in the mmCIF file (e.g. 4c9d, chains C and D)
* `issue`: Whether an issue occurred with this structure while downloading, extracting, annotating or parsing the annotation. See the file known_issues_reasons.txt for more information about why your chain is marked as an issue.
* `inferred`: Whether the mapping has been inferred using the redundancy list (value is 1) or just known from Rfam-PDB mappings (value is 0)
* `chain_freq_A`, `chain_freq_C`, `chain_freq_G`, `chain_freq_U`, `chain_freq_other`: Nucleotide frequencies in the chain
* `pair_count_cWW`, `pair_count_cWH`, ... `pair_count_tSS`: Counts of the non-canonical base-pair types in the chain (intra-chain counts only)
## Table `nucleotide`, for individual nucleotide descriptors
* `nt_id`: A unique identifier
* `chain_id`: The chain the nucleotide belongs to
* `index_chain`: Its absolute position within the portion of chain mapped to Rfam, from 1 to X. This is completely unrelated to any gene start or 3D chain residue numbers.
* `nt_position`: Relative position within the portion of chain mapped to Rfam, from 0 to 1
* `old_nt_resnum`: The residue number in the 3D mmCIF file (it's a string actually, some contain a letter like '37A')
* `nt_name`: The residue type. This includes modified nucleotide names (e.g. 5MC for 5-methylcytosine)
* `nt_code`: One-letter name. Lowercase "acgu" letters are used for modified "ACGU" bases.
* `nt_align_code`: One-letter name used for sequence alignment. Initially contains only "ACGUN-" characters; gaps may then be replaced by the most common letter at this position (default behavior)
* `is_A`, `is_C`, `is_G`, `is_U`, `is_other`: One-hot encoding of the nucleotide base
* `dbn`: character used at this position if we look at the dot-bracket encoding of the secondary structure. Includes inter-chain (RNA complexes) contacts.
* `paired`: empty, or comma separated list of `index_chain` values referring to nucleotides the base is interacting with. Up to 3 values. Inter-chain interactions are marked paired to '0'.
* `nb_interact`: number of interactions with other nucleotides. Up to 3 values. Includes inter-chain interactions.
* `pair_type_LW`: The Leontis-Westhof nomenclature codes of the interactions. The first letter concerns cis/trans orientation, the second this base's side interacting, and the third the other base's side.
* `pair_type_DSSR`: Same but using the DSSR nomenclature (Hoogsteen edge approximately corresponds to Major-groove and Sugar edge to minor-groove)
* `alpha`, `beta`, `gamma`, `delta`, `epsilon`, `zeta`: The 6 torsion angles of the RNA backbone for this nucleotide
* `epsilon_zeta`: Difference between epsilon and zeta angles
* `bb_type`: conformation of the backbone (BI, BII or ..)
* `chi`: torsion angle between the sugar and base (O-C1'-N-C4)
* `glyco_bond`: syn or anti configuration of the sugar-base bond
* `v0`, `v1`, `v2`, `v3`, `v4`: 5 torsion angles of the ribose cycle
* `form`: if the nucleotide is involved in a stem, the stem type (A, B or Z)
* `ssZp`: Z-coordinate of the 3’ phosphorus atom with reference to the 5’ base plane
* `Dp`: Perpendicular distance of the 3’ P atom to the glycosidic bond
* `eta`, `theta`: Pseudotorsions of the backbone, using phosphorus and carbon 4'
* `eta_prime`, `theta_prime`: Pseudotorsions of the backbone, using phosphorus and carbon 1'
* `eta_base`, `theta_base`: Pseudotorsions of the backbone, using phosphorus and the base center
* `phase_angle`: Conformation of the ribose cycle
* `amplitude`: Amplitude of the sugar puckering
* `puckering`: Conformation of the ribose cycle (10 classes depending on the phase_angle value)
## Table `align_column`, for positions in multiple sequence alignments
* `column_id`: A unique identifier
* `rfam_acc`: The family's MSA the column belongs to
* `index_ali`: Position of the column in the alignment (starts at 1)
* `freq_A`, `freq_C`, `freq_G`, `freq_U`, `freq_other`: Nucleotide frequencies in the alignment at this position
* `gap_percent`: The frequency of gaps at this position in the alignment (between 0.0 and 1.0)
* `consensus`: A consensus character (ACGUN or '-') summarizing the column, if we can. If >75% of the sequences are gaps at this position, the gap is picked as consensus. Otherwise, A/C/G/U is chosen if >50% of the non-gap positions are A/C/G/U. Otherwise, N is the consensus.
There always is an entry, for each family (rfam_acc), with index_ali = 0; gap_percent = 1.0; and nucleotide frequencies set to 0.0. This entry is used when the nucleotide frequencies cannot be determined because of local alignment issues.
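
The consensus rule described above can be sketched in a few lines of Python. This is only an illustration of one reading of the rule (a single letter among A/C/G/U must exceed 50% of the non-gap characters), not RNANet's actual code; the function name `column_consensus` is hypothetical.

```python
from collections import Counter


def column_consensus(column: str) -> str:
    """Return a consensus character for one alignment column.

    `column` holds one character per aligned sequence, taken from "ACGUN-".
    Hypothetical helper, written only to illustrate the documented rule.
    """
    if not column:
        return "-"
    # Rule 1: a gap is picked if more than 75% of the sequences have a gap here.
    if column.count("-") / len(column) > 0.75:
        return "-"
    # Rule 2: otherwise, pick A/C/G/U if it covers more than 50% of non-gap characters.
    non_gap = [c for c in column if c != "-"]
    counts = Counter(non_gap)
    for base in "ACGU":
        if counts[base] / len(non_gap) > 0.5:
            return base
    # Rule 3: otherwise, no letter dominates and the consensus is N.
    return "N"
```

For instance, a column `"AAAC"` gives `A`, a column `"----A"` gives `-`, and a column `"AC--"` gives `N` because neither letter exceeds half of the non-gap characters.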
## Table `re_mapping`, to map a nucleotide to an alignment column
* `remapping_id`: A unique identifier
* `chain_id`: The chain which is mapped to an alignment
* `index_chain`: The absolute position of the nucleotide in the chain (from 1 to X)
* `index_ali`: The position of that nucleotide in its family alignment
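
As an illustration of how these tables can be combined, here is a minimal Python sketch that lists the chains mapped to a given family, filtered by structure resolution. It relies only on the table and field names documented above; the function name `chains_for_family` and the example values are assumptions, and the column types in your database may differ.

```python
import sqlite3


def chains_for_family(conn, rfam_acc, max_resolution):
    """List (chain_id, pdb_id, chain_name, resolution) for one Rfam family.

    Joins `chain` to `structure` through chain.structure_id = structure.pdb_id,
    as described in the table documentation. Hypothetical helper.
    """
    query = """
        SELECT chain.chain_id, structure.pdb_id, chain.chain_name, structure.resolution
        FROM chain
        JOIN structure ON chain.structure_id = structure.pdb_id
        WHERE chain.rfam_acc = ? AND structure.resolution <= ?
        ORDER BY chain.chain_id
    """
    return conn.execute(query, (rfam_acc, max_resolution)).fetchall()


if __name__ == "__main__":
    # Assumes the database sits at results/RNANet.db; adapt the path to your run folder.
    with sqlite3.connect("results/RNANet.db") as connection:
        for row in chains_for_family(connection, "RF00005", 4.0):
            print(row)
```

The query uses bound parameters (`?`) rather than string formatting, so the family accession and threshold can be changed safely.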
......@@ -40,6 +40,13 @@ RUN apk update && apk add --no-cache \
musl-dev \
py3-pip py3-wheel \
freetype-dev zlib-dev
RUN addgroup -S appgroup -g 1000 && \
adduser -S appuser -u 1000 -G appgroup && \
chown -R appuser:appgroup /3D && \
chown -R appuser:appgroup /sequences && \
mkdir /runDir && \
chown -R appuser:appgroup /runDir
USER appuser
VOLUME ["/3D", "/sequences", "/runDir"]
WORKDIR /runDir
ENTRYPOINT ["/RNANet/RNAnet.py", "--3d-folder", "/3D", "--seq-folder", "/sequences" ]
\ No newline at end of file
......
# Warnings and errors in RNANet
Use Ctrl + F on this page to look for your error message in the list.
* **Could not load X.json with JSON package** :
The JSON format produced as DSSR output could not be loaded by Python. Try deleting the file and re-running DSSR (through RNANet).
* **Found DSSR warning in annotation X.json: no nucleotides found. Ignoring X.** :
DSSR complains because the CIF structure does not seem to contain nucleotides. This can happen on low resolution structures where only P atoms are solved, you should ignore them. This also can happen if the .cif file is corrupted (failed download, etc). Check with a 3D visualization software if your chain contains well-defined nucleotides. Try deleting the .cif and retry. If the problem persists, just ignore the chain.
* **Could not find nucleotides of chain X in annotation X.json. Ignoring chain X.** : Basically the same as above, but some nucleotides have been observed in another chain of the same structure.
* **Could not find real nucleotides of chain X between START and STOP. Ignoring chain X.** : Same as the two above, but nucleotides can be found outside of the mapping interval. This can happen if there is a mapping problem, e.g., the interval was treated as absolute positions when it should not have been.
* **Error while parsing DSSR X.json output: {custom-error}** : The DSSR annotations lack some of our required fields. It is likely that DSSR changed something in their fields names. Contact us so that we fix the problem with the latest DSSR version.
* **Mapping is reversed, this case is not supported (yet). Ignoring chain X.** : The mapping coordinates, as obtained from Rfam, have an end position coming before the start position (meaning, the sequence has to be reversed to map the RNA covariance model). We do not support this yet, we ignore this chain.
* **Error with parsing of X duplicate residue numbers. Ignoring it.** : This 3D chain contains new kind(s) of issue(s) in the residue numberings that are not part of the issues we already know how to tackle. Contact us, so that we add support for this entry.
* **Found duplicated index_chain N in X. Keeping only the first.** : This RNA 3D chain contains two (or more) residues with the same numbering N. This often happens when a nucleic-like ligand is annotated as part of the RNA chain, and DSSR considers it a nucleotide. By default, RNANet keeps only the first of the multiple residues with the same number. You may want to check that the produced 3D structure contains the appropriate nucleotide and no ligand.
* **Missing index_chain N in X !** : DSSR annotations for chain X are discontinuous, position N is missing. This means residue N has not been recognized as a nucleotide by DSSR. Is the .cif structure file corrupted ? Delete it and retry.
* **X sequence is too short, let's ignore it.** : We discard very short RNA chains.
* **Error downloading and/or extracting Rfam.cm !** : We cannot retrieve the Rfam covariance models file. RNANet tries to find it at ftp://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/Rfam.cm.gz so, check that your network is not blocking the FTP protocol (port 21 is open on your network), and check that the address has not changed. If so, contact us so that we update RNANet with the correct address.
* **Something's wrong with the SQL database. Check mysql-rfam-public.ebi.ac.uk status and try again later. Not printing statistics.** : We cannot retrieve family statistics from Rfam public server. Check if you can connect to it by hand : `mysql -u rfamro -P 4497 -D Rfam -h mysql-rfam-public.ebi.ac.uk`. If not, check that the port 4497 is open on your network.
* **Error downloading RFXXXXX.fa.gz: {custom-error}** : We cannot reach the Rfam FTP server to download homologous sequences. We look in ftp://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/fasta_files/ so, check if you can access it from your network (check that port 21 is opened on your network). Check if the address has changed and notify us.
* **Error downloading NR list !** : We cannot download BGSU's equivalence classes from their website. Check if you can access http://rna.bgsu.edu/rna3dhub/nrlist/download/current/20.0A/csv from a web browser. Their website is sometimes unresponsive; in that case, the previous download is re-used.
* **Error downloading the LSU/SSU database from SILVA** : We cannot reach SILVA's arb files. We are looking for http://www.arb-silva.de/fileadmin/arb_web_db/release_132/ARB_files/SILVA_132_LSURef_07_12_17_opt.arb.gz and http://www.arb-silva.de/fileadmin/silva_databases/release_138/ARB_files/SILVA_138_SSURef_05_01_20_opt.arb.gz , can you download and extract them from your web browser and place them in the realigned/ subfolder ?
* **Assuming mapping to RFXXXXX is an absolute position interval.** : The mapping provided by Rfam concerns a nucleotide interval START-END, but no nucleotides are defined in 3D in that interval. When this happens, we assume that the numbering is not relative to the residue numbers in the 3D file, but to the absolute position in the chain, starting at 1. We tried applying this behavior to all mappings, but this yields the opposite issue, where some mappings fall outside the available nucleotides. To be solved the day Rfam explains how they build the mappings.
* **Added newly discovered issues to known issues** : You discovered new chains that cannot be perfectly understood as they actually are, congrats. For each chain of the list, another warning has been raised, refer to them.
* **Structures without referenced chains have been detected.** : Something went wrong, because the database contains references to 3D structures that are not used by any entry in the `chain` table. You should rerun RNANet. The option `--only` may help to rerun it just for one chain.
* **Chains without referenced structures have been detected** :
Something went wrong, because the database contains references to 3D chains that are not used by any entry in the `structure` table. You should rerun RNANet. The option `--only` may help to rerun it just for one chain.
* **Chains were not remapped** : Something went wrong, because the database contains references to 3D chains that are not used by any entry in the `re_mapping` table, assuming you were interested in homology data. You should rerun RNANet. The option `--only` may help to rerun it just for one chain. If you are not interested in homology data, use option `--no-homology` to skip alignment and remapping steps.
* **Operational Error: database is locked, retrying in 0.2s** : Too many workers are trying to access the database at the same time. Do not try to run several instances of RNANet in parallel. Even with only one instance, this might still happen if your storage device has slow I/O. Try running RNANet from an SSD.
* **Tried to reach database 100 times and failed. Aborting.** : Same as above, but in a more serious way.
* **Nothing to do !** : RNANet is up-to-date, or did not detect any modification to do, so nothing changed in your database.
* **KeyboardInterrupt, terminating workers.** : You interrupted the computation by pressing Ctrl+C. The database may be in an unstable state, rerun RNANet to solve the problem.
* **Found mappings to RFXXXXX in both directions on the same interval, keeping only the 5'->3' one.** : A chain has been mapped to family RFXXXXX, but the mapping has been found twice, with the limits inverted. We only keep one (in 5'->3' sense).
* **There are mappings for RFXXXXX in both directions** : A chain has been mapped to family RFXXXXX several times, and the mappings are not in the same sequence direction (some are reversed, with END < START). In that case, we cannot decide which mapping to use for this chain, and we abort.
* **Unable to download XXXX.cif. Ignoring it.** : We cannot access a certain 3D structure from RCSB's download site, can you access it from your web browser and put it in the RNAcifs/ folder ? We look at http://files.rcsb.org/download/XXXX.cif , replacing XXXX by the right PDB code.
* **Wtf, structure XXXX has no resolution ? Check https://files.rcsb.org/header/XXXX.cif to figure it out.** : We cannot find the resolution of structure XXXX from the .cif file. We are looking for it in the fields `_refine.ls_d_res_high`, `_refine.ls_d_res_low`, and `_em_3d_reconstruction.resolution`. Maybe the information is stored in another field ? If you find it, contact us so that we support this new CIF field.
* **Could not find annotations for X, ignoring it.** : It seems that DSSR has not been run for structure X, or failed. Rerun RNANet.
* **Nucleotides not inserted: {custom-error}** : For some reason, no nucleotides were saved to the database for this chain. Contact us.
* **Removing N doublons from existing RFXXXXX++.fa and using their newest version** : You are trying to re-compute sequence alignments of 3D structures that had already been computed in the past. The duplicates will be removed from the alignment and recomputed, in case the sequences have changed.
* **Removing N doublons from existing RFXXXXX++.stk and using their newest version** : Same as above.
* **Error during sequence alignment: {custom-error}** : Something went wrong during sequence alignment. Recompute the alignments using the `--update-homologous` option.
* **Failed to realign RFXXXXX (killed)** : You ran out of memory while computing multiple sequence alignments. Try to run RNANet on a machine with at least 32 GB of RAM.
* **RFXXXXX's alignment is wrong. Recompute it and retry.** : We could not load RFXXXXX's multiple sequence alignment. It may have failed to compute, or be corrupted. Recompute the alignments using the `--update-homologous` option.
\ No newline at end of file
# FAQ
* **What is the difference between . and - in alignments ?**
In `cmalign` alignments, - means a nucleotide is missing compared to the covariance model. It represents a deletion. The dot '.' indicates that another chain has an insertion compared to the covariance model. The current chain does not lack anything; it is another chain that has more.
In the final filtered alignment that we provide for download, the same rule applies, but on top of that, some '.' are replaced by '-' when a gap in the 3D structure (a missing, unresolved nucleotide) is mapped to an insertion gap.
* **Why are there some gap-only columns in the alignment ?**
These columns are not completely gap-only, they contain at least one dash-gap '-'. This means an actual, physical nucleotide which should exist in the 3D structure should be located there. The previous and following nucleotides are **not** contiguous in space in 3D.
* **Why is the numbering of residues in my 3D chain weird ?**
Probably because the numbering in the original chain already was a mess, and the RNANet re-numbering process failed to understand it correctly. If you ran RNANet yourself, check the `logs/` folder and find your chain's log. It explains how the chain was re-numbered.
* **What is your standardized way to re-number residues ?**
We first remove the nucleotides whose number is outside the family mapping (if any). Then, we renumber the following way:
0) For truncated chains, we shift the numbering of every nucleotide so that the first nucleotide is 1.
1) We identify duplicate residue numbers and increase by 1 the numbering of all nucleotides starting at the duplicate, recursively, until we find a gap in the numbering sequence. If no gap is found, residue numbers are shifted until the end of the chain.
2) We proceed similarly for nucleotides with letter numbering (e.g. 17, 17A and 17B will be renumbered to 17, 18 and 19, and the following nucleotides in the chain are also shifted).
3) Nucleotides with partial numbering and a letter are hopefully detected and processed with their correct numbering (e.g. in ...1629, 1630, 163B, 1631, ... the residue 163B has nothing to do with number 163 or 164, the series will be renumbered 1629, 1630, 1631, 1632 and the following will be shifted).
4) Nucleotides numbered -1 at the beginning of a chain are shifted (with the following ones) to 1.
5) Ligands at the end of the chain are removed. Any residue which is not A/C/G/U and has no defined puckering or no defined torsion angles is detected as a ligand. Residues are also considered to be ligands if they are at the end of the chain with a residue number more than 50 above that of the previous residue (ligands are sometimes numbered 1000 or 9999). Finally, residues "GNG", "E2C", "OHX", "IRI", "MPD", "8UZ" at the end of a chain are removed.
6) Ligands at the beginning of a chain are removed. DSSR annotates them with index_chain 1, 2, 3..., so we can detect that there is a redundancy with the real nucleotides 1, 2, 3. We keep only the first, which hopefully is the real nucleotide. We also remove the ones that have a negative number (since we renumbered the truncated chain to 1, some became negative).
7) We attempt to detect and renumber nucleotides with creative, disruptive numbering, even if the numbers fall outside the family mapping interval. For example, the series ... 1003, 2003, 3003, 1004... will be renumbered ...1003, 1004, 1005, 1006 ... and the following accordingly.
8) Nucleotides missing from portions not resolved in 3D are created as gaps, with correct numbering, to fill the portion between the previous and the following resolved ones.
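
To make steps 1 and 2 more concrete, here is a simplified Python sketch of the shifting logic. It is not RNANet's implementation: the function name `renumber` is hypothetical, and it assumes every label starts with an integer optionally followed by a letter, so it does not handle the partial numbering of step 3, the ligands of steps 5 and 6, or the disruptive numbering of step 7.

```python
import re


def renumber(resnums):
    """Turn residue labels such as "17", "17A", "17B" into increasing integers.

    A residue whose number is not greater than the previous one is shifted to
    previous + 1. The shift propagates to the following residues until a gap in
    the original numbering absorbs it. Hypothetical, simplified helper.
    """
    new_numbers = []
    previous = None
    for label in resnums:
        match = re.match(r"-?\d+", label)
        if match is None:
            raise ValueError(f"cannot parse residue label {label!r}")
        number = int(match.group())
        # Duplicate or letter-suffixed residue: push it right after the previous one.
        if previous is not None and number <= previous:
            number = previous + 1
        new_numbers.append(number)
        previous = number
    return new_numbers
```

With this sketch, `["17", "17A", "17B", "18", "19", "25"]` becomes `[17, 18, 19, 20, 21, 25]`: the two inserted residues shift their followers, and the gap before 25 absorbs the shift.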
* **What are the versions of the dependencies you use ?**
`cmalign` is v1.1.3, `sina` is v1.6.0, `x3dna-dssr` is v1.9.9, Biopython is v1.78.
\ No newline at end of file
* [Required computational resources](#required-computational-resources)
* [Method 1 : Using Docker](#method-1-:-installation-using-docker)
* [Method 2 : Classical command-line installation](#method-2-:-classical-command-line-installation-linux-only)
* [Command options](#command-options)
* [Computation time](#computation-time)
* [Post-computation tasks](#post-computation-tasks-estimate-quality)
* [Output files](#output-files)
# Required computational resources
- CPU: no requirements. The program is optimized for multi-core CPUs, you might want to use Intel Xeons, AMD Ryzens, etc.
- GPU: not required
- RAM: 16 GB with a large swap partition is okay. 32 GB is recommended (usage peaks at ~27 GB)
- Storage: to date, it takes 60 GB for the 3D data (36 GB if you don't use the --extract option), 11 GB for the sequence data, and 7 GB for the outputs (5.6 GB database, 1 GB archive of CSV files). You need to add a few more for the dependencies. Pick a 100 GB partition and you are good to go. The computation speed is way better if you use a fast storage device (e.g. SSD instead of hard drive, or even better, an NVMe SSD) because of constant I/O with the SQLite database.
- Network : We query the Rfam public MySQL server on port 4497. Make sure your network enables communication (there should not be any issue on private networks, but maybe your company/university closes ports by default). You will get an error message if the port is not open. Around 30 GB of data is downloaded.
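
If you are unsure whether your network allows this connection, a quick check can be sketched in Python with the standard `socket` module. This is only a convenience sketch: the function name `port_is_open` is hypothetical, and a successful TCP connection does not guarantee that the MySQL login itself will succeed.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Host and port are the ones documented above for the Rfam public MySQL server.
    if port_is_open("mysql-rfam-public.ebi.ac.uk", 4497):
        print("Port 4497 is reachable.")
    else:
        print("Port 4497 seems blocked or the server is unreachable.")
```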
# Method 1 : Installation using Docker
* Step 1 : Download the [Docker container](https://entrepot.ibisc.univ-evry.fr/d/1aff90a9ef214a19b848/files/?p=/rnanet_v1.3_docker.tar&dl=1). Open a terminal and move to the appropriate directory.
* Step 2 : Extract the archive to a Docker image named *rnanet* in your local installation
```
$ docker load -i rnanet_v1.3_docker.tar
```
* Step 3 : Run the container, giving it 3 folders to mount as volumes: a first to store the 3D data, a second to store the sequence data and alignments, and a third to output the results, data and logs:
```
$ docker run --rm -v path/to/3D/data/folder:/3D -v path/to/sequence/data/folder:/sequences -v path/to/experiment/results/folder:/runDir rnanet [ - other options ]
```
Typical usage:
```
nohup bash -c 'time docker run --rm -v /path/to/3D/data/folder:/3D -v /path/to/sequence/data/folder:/sequences -v /path/to/experiment/folder:/runDir rnanet -s --no-logs ' &
```
# Method 2 : Classical command line installation (Linux only)
You need to install the dependencies:
- DSSR, you need to register to the X3DNA forum [here](http://forum.x3dna.org/site-announcements/download-instructions/) and then download the DSSR binary [on that page](http://forum.x3dna.org/downloads/3dna-download/). Make sure to have the `x3dna-dssr` binary in your $PATH variable so that RNANet.py finds it.
- Infernal, to download at [Eddylab](http://eddylab.org/infernal/), several options are available depending on your preferences. Make sure to have the `cmalign`, `esl-alimanip`, `esl-alipid` and `esl-reformat` binaries in your $PATH variable, so that RNANet.py can find them.
- SINA, follow [these instructions](https://sina.readthedocs.io/en/latest/install.html) for example. Make sure to have the `sina` binary in your $PATH.
- Sqlite 3, available under the name *sqlite* in every distro's package manager,
- Python >= 3.8, (Unfortunately, python3.6 is no longer supported, because of changes in the multiprocessing and Threading packages. Untested with Python 3.7.\*)
- The following Python packages: `python3.8 -m pip install biopython matplotlib pandas psutil pymysql requests scipy setproctitle sqlalchemy tqdm`.
Then, run it from the command line, preferably using nohup if your shell will be interrupted:
```
./RNANet.py --3d-folder path/to/3D/data/folder --seq-folder path/to/sequence/data/folder [ - other options ]
```
Typical usage:
```
nohup bash -c 'time ~/Projects/RNANet/RNAnet.py --3d-folder ~/Data/RNA/3D/ --seq-folder ~/Data/RNA/sequences -s --no-logs' &
```
# Command options
The detailed list of options is below:
```
-h [ --help ] Print this help message
--version Print the program version
-f [ --full-inference ] Infer new mappings even if Rfam already provides some. Yields more copies of chains
mapped to different families.
-r 4.0 [ --resolution=4.0 ] Maximum 3D structure resolution to consider a RNA chain.
-s Run statistics computations after completion
--extract Extract the portions of 3D RNA chains to individual mmCIF files.
--keep-hetatm=False (True | False) Keep ions, waters and ligands in produced mmCIF files.
Does not affect the descriptors.
--3d-folder=… Path to a folder to store the 3D data files. Subfolders will contain:
RNAcifs/ Full structures containing RNA, in mmCIF format
rna_mapped_to_Rfam/ Extracted 'pure' RNA chains
datapoints/ Final results in CSV file format.
--seq-folder=… Path to a folder to store the sequence and alignment files. Subfolders will be:
rfam_sequences/fasta/ Compressed hits to Rfam families
realigned/ Sequences, covariance models, and alignments by family
--no-homology Do not try to compute PSSMs and do not align sequences.
Allows to yield more 3D data (consider chains without a Rfam mapping).
--all Build chains even if they already are in the database.
--only Ask to process a specific chain label only
--ignore-issues Do not ignore already known issues and attempt to compute them
--update-homologous Re-download Rfam and SILVA databases, realign all families, and recompute all CSV files
--from-scratch Delete database, local 3D and sequence files, and known issues, and recompute.
--archive Create a tar.gz archive of the datapoints text files, and update the link to the latest archive
--no-logs Do not save per-chain logs of the numbering modifications
```
Options --3d-folder and --seq-folder are mandatory for command-line installations, but should not be used for installations with Docker. In the Docker container, they are set by default to the paths you provide with the -v options.
The most useful options in that list are
* ` --extract`, to actually produce some re-numbered 3D mmCIF files of the RNA chains individually,
* ` --no-homology`, to ignore the family mapping and sequence alignment parts and only focus on 3D data download and annotation. This would yield more data since many RNAs are not mapped to any Rfam family.
* ` -s`, to run the "statistics" which are a few useful post-computation tasks such as:
* Computation of sequence identity matrices
* Statistics over the sequence lengths, nucleotide frequencies, and basepair types by RNA family
* Overall database content statistics
# Computation time
To give you an estimate, our last full run took 12 hours, excluding the time to download the mmCIF files containing RNA (around 25 GB to download) and the time to compute statistics.
Measured on the 23rd of June 2020 on a 16-core AMD Ryzen 7 3700X CPU @3.60GHz, plus 32 GB RAM, and a 7200rpm hard drive. Total CPU time spent: 135 hours (user+kernel modes), corresponding to 12h (actual time spent with the 16-core CPU).
Update runs are much quicker, around 3 hours. It depends mostly on what RNA families are concerned by the update.
# Post-computation tasks (estimate quality)
If you did not ask for an automatic run of statistics over the produced dataset with the `-s` option, you can run them later using the file statistics.py.
```
python3.8 statistics.py --3d-folder path/to/3D/data/folder --seq-folder path/to/sequence/data/folder -r 20.0
```
/!\ Beware, if no threshold is specified with option `-r`, no resolution threshold is applied and all the data in RNANet.db is used.
By default, this computes:
* Computation of sequence identity matrices
* Statistics over the sequence lengths, nucleotide frequencies, and basepair types by RNA family
* Overall database content statistics
If you have run RNANet once with options `--no-homology` and `--extract`, you unlock new statistics over unmapped chains.
* You will be allowed to use option `--wadley` to reproduce Wadley et al. (2007) results automatically. These are clustering results of the pseudotorsion angles of the backbone.
* (experimental) You will be allowed to use option `--distance-matrices` to compute pairwise residue distances within the chain for every chain, and compute average and standard deviations by RNA families. This is supposed to capture the average shape of an RNA family.
# Output files
* `results/RNANet.db` is a SQLite database file containing several tables with all the information, which you can query yourself with your custom requests,
* `3D-folder-you-passed-in-option/datapoints/*` are flat text CSV files, one for one RNA chain mapped to one RNA family, gathering the per-position nucleotide descriptors,
* `archive/RNANET_datapoints_{DATE}.tar.gz` is a compressed archive of the above CSV files (only if you passed the --archive option)
* `archive/RNANET_alignments_latest.tar.gz` is a compressed archive of multiple sequence alignments in FASTA format, one per RNA family, including only the portions of chains with a 3D structure which are mapped to a family. The alignment has been computed with all the Rfam sequences of that family, which were then removed.
* `path-to-3D-folder-you-passed-in-option/rna_mapped_to_Rfam` If you used the `--extract` option, this folder contains one mmCIF file per RNA chain mapped to one RNA family, without other chains, proteins (nor ions and ligands by default). If you used both `--extract` and `--no-homology`, this folder is called `rna_only`.
* `results/summary.csv` summarizes information about the RNA chains
* `results/families.csv` summarizes information about the RNA families
* `results/pair_types.csv` summarizes statistics about base-pair types in every family.
* `results/frequencies.csv` summarizes statistics about nucleotide frequencies in every family (including all known modified bases)
Other folders are created and not deleted, which you might want to keep to avoid re-computations in later runs:
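
Before writing custom requests against `results/RNANet.db`, it can help to check which tables the file actually contains. The following is a minimal sketch using Python's standard `sqlite3` module. The helper name `list_tables` is ours, and the demo runs on a throwaway in-memory database so that it works anywhere.

```python
import sqlite3


def list_tables(connection):
    """Return the names of the tables stored in an SQLite database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name;"
    ).fetchall()
    return [name for (name,) in rows]


# Demo on an in-memory database with two empty placeholder tables.
# With real data, open the file instead: sqlite3.connect("results/RNANet.db")
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE family (rfam_acc TEXT PRIMARY KEY);")
demo.execute("CREATE TABLE chain (chain_id INTEGER PRIMARY KEY);")
print(list_tables(demo))
demo.close()
```
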
* `path-to-sequence-folder-you-passed-in-option/rfam_sequences/fasta/` contains compressed FASTA files of the homologous sequences used, by Rfam family.
* `path-to-sequence-folder-you-passed-in-option/realigned/` contains the families' covariance models (\*.cm), unaligned lists of sequences (\*.fa), and multiple sequence alignments in both Stockholm and Aligned-FASTA formats (\*.stk and \*.afa). It also contains the SINA homologous sequence databases LSU.arb and SSU.arb, and their index files (\*.sidx).
* `path-to-3D-folder-you-passed-in-option/RNAcifs/` contains mmCIF structures directly downloaded from the PDB, which contain RNA chains,
* `path-to-3D-folder-you-passed-in-option/annotations/` contains the raw JSON annotation files of the previous mmCIF structures. You may find additional information in them which is not properly supported by RNANet yet.
\ No newline at end of file
# Known Issues
## Annotation and numbering issues
* Some GDPs that are listed as HETATMs in the mmCIF files are not detected correctly to be real nucleotides. (e.g. 1e8o-E)
* Some chains are truncated in different pieces with different chain names. Reason unknown (e.g. 6ztp-AX)
* Some chains are not correctly renamed A in the produced separate files (e.g. 1d4r-B)
## Alignment issues
* [SOLVED] Filtered alignments are shorter than the number of alignment columns saved to the SQL table `align_column`
* Chain names appear in triple in the FASTA header (e.g. 1d4r[1]-B 1d4r[1]-B 1d4r[1]-B)
## Technical running issues
* [SOLVED] Files produced by Docker containers are owned by root and require root permissions to be read
* [SOLVED] SQLite WAL files are not deleted properly
# Known feature requests
* [DONE] Get filtered versions of the sequence alignments containing the 3D chains, publicly available for download
* [DONE] Get a consensus residue for each alignment column
* [DONE] Get an option to limit the number of cores
* [UPCOMING] Automated annotation of detected Recurrent Interaction Networks (RINs), see http://carnaval.lri.fr/ .
* [UPCOMING] Possibly, automated detection of HLs and ILs from the 3D Motif Atlas (BGSU). Maybe. Their own website already does the job.
* A field estimating the quality of the sequence alignment in table family.
* Possibly, more metrics about the alignments coming from Infernal.
\ No newline at end of file
# RNANet
Building a dataset following the ProteinNet philosophy, but for RNA.
We use the Rfam mappings between 3D structures and known Rfam families, using the sequences that are known to belong to an Rfam family (hits provided in RF0XXXX.fasta files from Rfam).
Future versions might compute a real MSA-based clustering directly with Rfamseq ncRNA sequences, like ProteinNet does with protein sequences, but this requires a tool similar to jackHMMER in the Infernal software suite, which is not available yet.
This script prepares the dataset from available public data in PDB and Rfam.
Contents:
* [What it does](#what-it-does)
* [Output files](#output-files)
* [How to run](#how-to-run)
* [Required computational resources](#required-computational-resources)
* [Using Docker](#using-docker)
* [Using classical command line installation](#using-classical-command-line-installation)
* [Post-computation task: estimate quality](#post-computation-task:-estimate-quality)
* [What is RNANet ?](#what-is-rnanet)
* [Install and run RNANet](INSTALL.md)
* [How to further filter the dataset](#how-to-further-filter-the-dataset)
* [Filter on 3D structure resolution](#filter-on-3d-structure-resolution)
* [Filter on 3D structure publication date](#filter-on-3d-structure-publication-date)
* [Filter to avoid chain redundancy when several mappings are available](#filter-to-avoid-chain-redundancy-when-several-mappings-are-available)
* [More about the database structure](#more-about-the-database-structure)
* [Database tables documentation](Database.md)
* [FAQ](FAQ.md)
* [Troubleshooting](#troubleshooting)
* [Contact](#contact)
**Please cite**: *Coming soon, expect it in 2021*
## Cite us
# What it does
The script follows these steps:
* Gets a list of 3D structures containing RNA from BGSU's non-redundant list (but keeps the redundant structures /!\\),
* Asks Rfam for mappings of these structures onto Rfam families (~50% of structures have a direct mapping, some more are inferred using the redundancy list)
* Downloads the corresponding 3D structures (mmCIFs)
* If desired, extracts the right chain portions that map onto an Rfam family
* Louis Becquey, Eric Angel, and Fariza Tahi, (2020) **RNANet: an automatically built dual-source dataset integrating homologous sequences and RNA structures**, *Bioinformatics*, 2020, btaa944, [DOI](https://doi.org/10.1093/bioinformatics/btaa944), [Read the OpenAccess paper here](https://doi.org/10.1093/bioinformatics/btaa944)
Now, compute the features:
Additional relevant references:
* Extract the sequence for every 3D chain
* Downloads Rfamseq ncRNA sequence hits for the concerned Rfam families
* Realigns Rfamseq hits and sequences from the 3D structures together to obtain a multiple sequence alignment for each Rfam family (using `cmalign --cyk`, except for ribosomal LSU and SSU, where SINA is used)
* Computes nucleotide frequencies at every position for each alignment
* For each aligned 3D chain, get the nucleotide frequencies in the corresponding RNA family for each residue
The "ProteinNet" philosophy which inspired this work:
* AlQuraishi, M. (2019b). **ProteinNet: A standardized data set for machine learning of protein structure.** *BMC Bioinformatics*, 20(1), 311
Then, compute the labels:
If you use our annotations by DSSR, you might want to cite:
* Lu, X.-J. et al. (2015). **DSSR: An integrated software tool for dissecting the spatial structure of RNA.** *Nucleic Acids Research*, 43(21), e142–e142.
* Run DSSR on every RNA structure to get a variety of descriptors per position, describing secondary and tertiary structure. Basepair types annotations include intra-chain and inter-chain interactions.
If you use our multiple sequence alignments and homology data, you might want to cite:
* Pruesse, E. et al. (2012). **SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.** *Bioinformatics*, 28(14), 1823–1829
* Nawrocki, E. P. and Eddy, S. R. (2013). **Infernal 1.1: 100-fold faster RNA homology searches.** *Bioinformatics*, 29(22), 2933–2935.
Finally, export this data from the SQLite database into flat CSV files.
# Output files
# What is RNANet ?
RNANet is a multiscale dataset of non-coding RNA structures, including sequences, secondary structures, non-canonical interactions, 3D geometrical descriptors, and sequence homology.
* `results/RNANet.db` is a SQLite database file containing several tables with all the information, which you can query yourself with your custom requests,
* `3D-folder-you-passed-in-option/datapoints/*` are flat text CSV files, one per RNA chain mapped to an RNA family, gathering the per-position nucleotide descriptors,
* `archive/RNANET_datapoints_{DATE}.tar.gz` is a compressed archive of the above CSV files (only if you passed the --archive option)
* `path-to-3D-folder-you-passed-in-option/rna_mapped_to_Rfam` If you used the `--extract` option, this folder contains one mmCIF file per RNA chain mapped to one RNA family, without other chains, proteins (nor ions and ligands by default). If you used both `--extract` and `--no-homology`, this folder is called `rnaonly`.
* `results/summary.csv` summarizes information about the RNA chains
* `results/families.csv` summarizes information about the RNA families
It is available in machine-learning ready formats like CSV files per chain or an SQL database.
Other folders are created and not deleted, which you might want to keep to avoid re-computations in later runs:
Most interestingly, nucleotides have been renumbered in a standardized way, and the 3D chains have been re-aligned with homologous sequences from the [Rfam](https://rfam.org/) database.
* `path-to-sequence-folder-you-passed-in-option/rfam_sequences/fasta/` contains compressed FASTA files of the homologous sequences used, by Rfam family.
* `path-to-sequence-folder-you-passed-in-option/realigned/` contains the families' covariance models (\*.cm), unaligned lists of sequences (\*.fa), and multiple sequence alignments in both Stockholm and Aligned-FASTA formats (\*.stk and \*.afa). It also contains the SINA homologous sequence databases LSU.arb and SSU.arb, and their index files (\*.sidx).
* `path-to-3D-folder-you-passed-in-option/RNAcifs/` contains mmCIF structures directly downloaded from the PDB, which contain RNA chains,
* `path-to-3D-folder-you-passed-in-option/annotations/` contains the raw JSON annotation files of the previous mmCIF structures. You may find additional information in them which is not properly supported by RNANet yet.
# How to run
RNANet is available on Linux (x86-64) only. It could theoretically work on Mac using command line installation (*untested*).
## Methodology
We use the Rfam mappings between 3D structures and known Rfam families, using the sequences that are known to belong to an Rfam family (hits provided in RF0XXXX.fasta files from Rfam).
Future versions might compute a real MSA-based clustering directly with Rfamseq ncRNA sequences, like ProteinNet does with protein sequences, but this requires a tool similar to jackHMMER in the Infernal software suite, which is not available yet.
## Required computational resources
- CPU: no requirements. The program is optimized for multi-core CPUs, you might want to use Intel Xeons, AMD Ryzens, etc.
- GPU: not required
- RAM: 16 GB with a large swap partition is okay. 32 GB is recommended (usage peaks at ~27 GB)
- Storage: to date, it takes 60 GB for the 3D data (36 GB if you don't use the --extract option), 11 GB for the sequence data, and 7 GB for the outputs (5.6 GB database, 1 GB archive of CSV files). You need to add a few more GB for the dependencies. Pick a 100 GB partition and you are good to go. Computation is much faster on a fast storage device (e.g. an SSD instead of a hard drive, or even better, an NVMe SSD) because of constant I/O with the SQLite database.
- Network: We query the Rfam public MySQL server on port 4497. Make sure your network allows this communication (there should not be any issue on private networks, but maybe your company/university closes ports by default). You will get an error message if the port is not open. Around 30 GB of data is downloaded.
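
To check the network requirement before launching a long run, a quick TCP connection test is usually enough. The sketch below uses only Python's standard `socket` module. The helper name `port_is_open` is ours, and the host name is the Rfam public MySQL server quoted in the troubleshooting notes of this documentation. A successful connection only shows that the port is reachable, not that the MySQL service accepts your queries.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    host = "mysql-rfam-public.ebi.ac.uk"
    if port_is_open(host, 4497):
        print("Port 4497 is reachable on", host)
    else:
        print("Cannot reach port 4497 on", host, "(check your firewall or proxy)")
```
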
This script prepares the dataset from available public data in PDB, RNA 3D Hub, Rfam and SILVA.
To give you an estimate, our last full run took 12h, excluding the time to download the mmCIF files containing RNA (around 25 GB to download) and the time to compute statistics.
Measured on the 23rd of June 2020 on a 16-core AMD Ryzen 7 3700X CPU @3.60GHz, with 32 GB of RAM and a 7200 rpm hard drive. Total CPU time spent: 135 hours (user+kernel modes), corresponding to 12h of actual time with the 16-core CPU.
Update runs are much quicker, around 3 hours. It depends mostly on what RNA families are concerned by the update.
## Pipeline
The script follows these steps:
## Using Docker
To gather structures:
* Gets a list of 3D structures containing RNA from BGSU's non-redundant list (but keeps the redundant structures /!\\),
* Asks Rfam for mappings of these structures onto Rfam families (~50% of structures have a direct mapping, some more are inferred using the redundancy list)
* Downloads the corresponding 3D structures (mmCIFs)
* If desired, extracts the right chain portions that map onto an Rfam family to a separate mmCIF file
* Step 1 : Download the [Docker container](https://entrepot.ibisc.univ-evry.fr/f/e5edece989884a7294a6/?dl=1). Open a terminal and move to the appropriate directory.
* Step 2 : Extract the archive to a Docker image named *rnanet* in your local installation
```
$ docker load -i rnanet_v1.2_docker.tar
```
* Step 3 : Run the container, giving it 3 folders to mount as volumes: a first to store the 3D data, a second to store the sequence data and alignments, and a third to output the results, data and logs:
```
$ docker run --rm -v path/to/3D/data/folder:/3D -v path/to/sequence/data/folder:/sequences -v path/to/experiment/results/folder:/runDir rnanet [ - other options ]
```
To compute homology information:
* Extract the sequence for every 3D chain
* Downloads Rfamseq ncRNA sequence hits for the concerned Rfam families (or ARB databases of SSU or LSU sequences from SILVA for rRNAs)
* Realigns Rfamseq hits and sequences from the 3D structures together to obtain a multiple sequence alignment for each Rfam family (using `cmalign --cyk`, except for ribosomal LSU and SSU, where SINA is used)
* Computes nucleotide frequencies at every position for each alignment
* Map each nucleotide of a 3D chain to its position in the corresponding family sequence alignment
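
To make the "nucleotide frequencies at every position" step concrete, here is a minimal sketch of a per-column frequency computation on a toy alignment. It is an illustration, not RNANet's actual implementation: the function name `column_frequencies` is ours, and skipping gaps when counting is an assumption of this sketch.

```python
from collections import Counter

ALPHABET = "ACGU"


def column_frequencies(aligned_sequences):
    """Return, for each alignment column, the frequencies of A, C, G, U and 'other'.

    Gaps ('-') are not counted (an assumption of this sketch).
    """
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")

    frequencies = []
    for col in range(length):
        counts = Counter(
            seq[col].upper() for seq in aligned_sequences if seq[col] != "-"
        )
        total = sum(counts.values())
        if total == 0:
            # Gap-only column: nothing to count.
            frequencies.append({key: 0.0 for key in list(ALPHABET) + ["other"]})
            continue
        column = {nt: counts[nt] / total for nt in ALPHABET}
        # Anything that is not A, C, G or U (e.g. 'N') goes to 'other'.
        column["other"] = (total - sum(counts[nt] for nt in ALPHABET)) / total
        frequencies.append(column)
    return frequencies


# Toy alignment of three sequences and four columns.
for position, column in enumerate(column_frequencies(["ACG-", "AC-U", "AUGN"]), start=1):
    print(position, column)
```
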
The detailed list of options is below:
To compute 3D annotations:
* Run DSSR on every RNA structure to get a variety of descriptors per position, describing secondary and tertiary structure. Basepair types annotations include intra-chain and inter-chain interactions.
```
-h [ --help ] Print this help message
--version Print the program version
-f [ --full-inference ] Infer new 3D->family mappings even if Rfam already provides some. Yields more copies of chains
mapped to different families.
-r 4.0 [ --resolution=4.0 ] Maximum 3D structure resolution to consider a RNA chain.
-s Run statistics computations after completion
--extract Extract the portions of 3D RNA chains to individual mmCIF files.
--keep-hetatm=False (True | False) Keep ions, waters and ligands in produced mmCIF files.
Does not affect the descriptors.
--fill-gaps=True (True | False) Replace gaps in nt_align_code field due to unresolved residues
by the most common nucleotide at this position in the alignment.
--3d-folder=… Path to a folder to store the 3D data files. Subfolders will contain:
RNAcifs/ Full structures containing RNA, in mmCIF format
rna_mapped_to_Rfam/ Extracted 'pure' RNA chains
datapoints/ Final results in CSV file format.
--seq-folder=… Path to a folder to store the sequence and alignment files. Subfolders will be:
rfam_sequences/fasta/ Compressed hits to Rfam families
realigned/ Sequences, covariance models, and alignments by family
--no-homology Do not try to compute PSSMs and do not align sequences.
Allows to yield more 3D data (consider chains without a Rfam mapping).
--all Build chains even if they already are in the database.
--only Ask to process a specific chain label only
--ignore-issues Do not ignore already known issues and attempt to compute them
--update-homologous Re-download Rfam and SILVA databases, realign all families, and recompute all CSV files
--from-scratch Delete database, local 3D and sequence files, and known issues, and recompute.
--archive Create a tar.gz archive of the datapoints text files, and update the link to the latest archive
--no-logs Do not save per-chain logs of the numbering modifications
```
You do not need the --3d-folder and --seq-folder options: they are set by default to the paths you provide with the -v options when running Docker.
Finally, export this data from the SQLite database into flat CSV files.
Typical usage:
```
nohup bash -c 'time docker run --rm -v /path/to/3D/data/folder:/3D -v /path/to/sequence/data/folder:/sequences -v /path/to/experiment/folder:/runDir rnanet -s --no-logs ' &
```
## Data provided
## Using classical command line installation
We provide a couple of resources to exploit this dataset. You can download them on [EvryRNA](https://evryrna.ibisc.univ-evry.fr/evryrna/rnanet/rnanet_home).
* A series of tables in the SQLite3 database, see [the database documentation](Database.md) and [examples of useful queries](#how-to-further-filter-the-dataset),
* One CSV file per RNA chain, summarizing all the relevant information about it,
* Filtered alignment files in FASTA format containing only the sequences with a 3D structure available in RNANet, but which have been aligned using all the homologous sequences of this family from Rfam or SILVA,
* Additional statistics files about nucleotide frequencies, modified bases, basepair types within each chain or by RNA family.
You need to install the dependencies:
- DSSR, you need to register to the X3DNA forum [here](http://forum.x3dna.org/site-announcements/download-instructions/) and then download the DSSR binary [on that page](http://forum.x3dna.org/downloads/3dna-download/). Make sure to have the `x3dna-dssr` binary in your $PATH variable so that RNANet.py finds it.
- Infernal, to download at [Eddylab](http://eddylab.org/infernal/), several options are available depending on your preferences. Make sure to have the `cmalign`, `esl-alimanip`, `esl-alipid` and `esl-reformat` binaries in your $PATH variable, so that RNANet.py can find them.
- SINA, follow [these instructions](https://sina.readthedocs.io/en/latest/install.html) for example. Make sure to have the `sina` binary in your $PATH.
- Sqlite 3, available under the name *sqlite* in every distro's package manager,
- Python >= 3.8, (Unfortunately, python3.6 is no longer supported, because of changes in the multiprocessing and Threading packages. Untested with Python 3.7.\*)
- The following Python packages: `python3.8 -m pip install biopython==1.76 matplotlib pandas psutil pymysql requests scipy setproctitle sqlalchemy tqdm`. Note that Biopython versions 1.77 or later do not work (yet) since they removed the alphabet system.
For now, we do not provide as public downloads the set of cleaned 3D structures nor the full alignments with Rfam sequences. If you need them, [recompute them](INSTALL.md) or ask us.
Then, run it from the command line, preferably using nohup if your shell will be interrupted:
```
./RNANet.py --3d-folder path/to/3D/data/folder --seq-folder path/to/sequence/data/folder [ - other options ]
```
See the list of possible options just above in the [Using Docker](#using-docker) section. Expect hours (maybe days) of computation.
## Updates
RNANet is updated monthly to take into account new structures proposed in the [BGSU Non-redundant lists](http://rna.bgsu.edu/rna3dhub/nrlist/). The monthly runs realign previous alignments with the new sequences using `esl-alimerge` from Infernal.
Typical usage:
```
nohup bash -c 'time ~/Projects/RNANet/RNAnet.py --3d-folder ~/Data/RNA/3D/ --seq-folder ~/Data/RNA/sequences --no-logs -s' &
```
It is updated yearly from scratch to take into account new Rfam sequences or updates in the covariance models, and updates in the PDB 3D files.
## Post-computation task: estimate quality
If you did not ask for an automatic run of statistics over the produced dataset with the `-s` option, you can run them later using the file statistics.py.
```
python3.8 statistics.py --3d-folder path/to/3D/data/folder --seq-folder path/to/sequence/data/folder -r 20.0
```
/!\ Beware, if no threshold is specified with option `-r`, no resolution threshold is applied and all the data in RNANet.db is used.
For now, the SILVA releases used are fixed (LSU release 132 and SSU release 138) and not automatically updated. SILVA authors if you reach this : please provide a "latest" download link to ease automatic retrieval of the latest version.
If you have run RNANet twice, once with option `--no-homology`, and once without, you unlock new statistics over unmapped chains. You will also be allowed to use option `--wadley` to reproduce the results of Wadley et al. (2007) automatically.
See what's new in the latest version of RNANet [in the CHANGELOG](CHANGELOG).
# How to further filter the dataset
You may want to build your own sub-dataset by querying the results/RNANet.db file. Here are quick examples using Python3 and its sqlite3 package.
......@@ -240,133 +166,21 @@ with sqlite3.connect("results/RNANet.db) as connection:
```
Then proceed to steps 2 and 3.
# More about the database structure
To help you design your own requests, here follows a description of the database tables and fields.
## Table `family`, for Rfam families and their properties
* `rfam_acc`: The family codename, from Rfam's numbering (Rfam accession number)
* `description`: What RNAs fit in this family
* `nb_homologs`: The number of hits known to be homologous downloaded from Rfam to compute nucleotide frequencies
* `nb_3d_chains`: The number of 3D RNA chains mapped to the family (from Rfam-PDB mappings, or inferred using the redundancy list)
* `nb_total_homol`: Sum of the two previous fields, the number of sequences in the multiple sequence alignment, used to compute nucleotide frequencies
* `max_len`: The longest RNA sequence among the homologs (in bases, unaligned)
* `ali_len`: The aligned sequences length (in bases, aligned)
* `ali_filtered_len`: The aligned sequences length when we filter the alignment to keep only the RNANet chains (which have a 3D structure) and remove the gap-only columns.
* `comput_time`: Time required to compute the family's multiple sequence alignment in seconds,
* `comput_peak_mem`: RAM (or swap) required to compute the family's multiple sequence alignment in megabytes,
* `idty_percent`: Average identity percentage over pairs of the 3D chains' sequences from the family
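
As an illustration of how the `family` fields can be used, the sketch below lists the families with the most 3D chains. The helper name `largest_families` is ours, and the demo runs on an in-memory table filled with made-up rows (the `RF_TOY_*` accessions are not real Rfam families). With real data, connect to `results/RNANet.db` instead.

```python
import sqlite3


def largest_families(connection, min_3d_chains=1, limit=10):
    """Return the families with at least `min_3d_chains` 3D chains, largest first."""
    query = """
        SELECT rfam_acc, description, nb_3d_chains, nb_total_homol, ali_len
        FROM family
        WHERE nb_3d_chains >= ?
        ORDER BY nb_3d_chains DESC
        LIMIT ?;
    """
    return connection.execute(query, (min_3d_chains, limit)).fetchall()


# Demo on made-up rows; only the columns used by the query are created.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE family (rfam_acc TEXT PRIMARY KEY, description TEXT, "
    "nb_3d_chains INTEGER, nb_total_homol INTEGER, ali_len INTEGER);"
)
demo.executemany(
    "INSERT INTO family VALUES (?, ?, ?, ?, ?);",
    [
        ("RF_TOY_1", "toy family one", 12, 112, 150),
        ("RF_TOY_2", "toy family two", 3, 43, 90),
        ("RF_TOY_3", "toy family three", 0, 20, 75),
    ],
)
for row in largest_families(demo, min_3d_chains=1):
    print(row)
```
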
## Table `structure`, for 3D structures of the PDB
* `pdb_id`: The 4-char PDB identifier
* `pdb_model`: The model used in the PDB file
* `date`: The first submission date of the 3D structure to a public database
* `exp_method`: A string to know whether the structure has been obtained by X-ray crystallography ('X-RAY DIFFRACTION'), electron microscopy ('ELECTRON MICROSCOPY'), or NMR (not seen yet)
* `resolution`: Resolution of the structure, in Angströms
## Table `chain`, for the datapoints: one chain mapped to one Rfam family
* `chain_id`: A unique identifier
* `structure_id`: The `pdb_id` where the chain comes from
* `chain_name`: The chain label, extracted from the 3D file
* `eq_class`: The BGSU equivalence class label containing this chain
* `rfam_acc`: The family which the chain is mapped to (if not mapped, value is *unmappd*)
* `pdb_start`: Position in the chain where the mapping to Rfam begins (absolute position, not residue number)
* `pdb_end`: Position in the chain where the mapping to Rfam ends (absolute position, not residue number)
* `reversed`: Whether the mapping numbering order differs from the residue numbering order in the mmCIF file (e.g. 4c9d, chains C and D)
* `issue`: Whether an issue occurred with this structure while downloading, extracting, annotating or parsing the annotation. See the file known_issues_reasons.txt for more information about why your chain is marked as an issue.
* `inferred`: Whether the mapping has been inferred using the redundancy list (value is 1) or just known from Rfam-PDB mappings (value is 0)
* `chain_freq_A`, `chain_freq_C`, `chain_freq_G`, `chain_freq_U`, `chain_freq_other`: Nucleotide frequencies in the chain
* `pair_count_cWW`, `pair_count_cWH`, ... `pair_count_tSS`: Counts of the non-canonical base-pair types in the chain (intra-chain counts only)
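
The `chain` and `structure` tables are linked through `chain.structure_id` and `structure.pdb_id`, so they can be joined to select chains by structure quality. The sketch below is one possible way to do it. The helper name is ours, the demo rows are made up, and storing `issue` as 0/1 (like `inferred`) is an assumption.

```python
import sqlite3


def chains_below_resolution(connection, max_resolution):
    """Return issue-free chains whose structure resolution is at most `max_resolution`."""
    query = """
        SELECT c.chain_id, c.structure_id, c.chain_name, c.rfam_acc, s.resolution
        FROM chain AS c
        JOIN structure AS s ON s.pdb_id = c.structure_id
        WHERE s.resolution <= ? AND c.issue = 0   -- assumes issue is stored as 0/1
        ORDER BY s.resolution;
    """
    return connection.execute(query, (max_resolution,)).fetchall()


# Demo on made-up rows; only the columns used by the query are created.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE structure (pdb_id TEXT PRIMARY KEY, resolution REAL);
    CREATE TABLE chain (chain_id INTEGER PRIMARY KEY, structure_id TEXT,
                        chain_name TEXT, rfam_acc TEXT, issue INTEGER);
    INSERT INTO structure VALUES ('toy1', 2.1), ('toy2', 6.5);
    INSERT INTO chain VALUES (1, 'toy1', 'A', 'RF_TOY_1', 0),
                             (2, 'toy1', 'B', 'RF_TOY_1', 1),
                             (3, 'toy2', 'A', 'RF_TOY_2', 0);
""")
for row in chains_below_resolution(demo, 4.0):
    print(row)
```
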
## Table `nucleotide`, for individual nucleotide descriptors
* `nt_id`: A unique identifier
* `chain_id`: The chain the nucleotide belongs to
* `index_chain`: its absolute position within the portion of chain mapped to Rfam, from 1 to X. This is completely uncorrelated to any gene start or 3D chain residue numbers.
* `nt_position`: relative position within the portion of chain mapped to Rfam, from 0 to 1
* `old_nt_resnum`: The residue number in the 3D mmCIF file (it's a string actually, some contain a letter like '37A')
* `nt_name`: The residue type. This includes modified nucleotide names (e.g. 5MC for 5-methylcytosine)
* `nt_code`: One-letter name. Lowercase "acgu" letters are used for modified "ACGU" bases.
* `nt_align_code`: One-letter name used for sequence alignment. Contains "ACGUN-" only first, and then, gaps may be replaced by the most common letter at this position (default)
* `is_A`, `is_C`, `is_G`, `is_U`, `is_other`: One-hot encoding of the nucleotide base
* `dbn`: character used at this position if we look at the dot-bracket encoding of the secondary structure. Includes inter-chain (RNA complexes) contacts.
* `paired`: empty, or comma separated list of `index_chain` values referring to nucleotides the base is interacting with. Up to 3 values. Inter-chain interactions are marked paired to '0'.
* `nb_interact`: number of interactions with other nucleotides. Up to 3 values. Includes inter-chain interactions.
* `pair_type_LW`: The Leontis-Westhof nomenclature codes of the interactions. The first letter concerns cis/trans orientation, the second this base's side interacting, and the third the other base's side.
* `pair_type_DSSR`: Same but using the DSSR nomenclature (Hoogsteen edge approximately corresponds to Major-groove and Sugar edge to minor-groove)
* `alpha`, `beta`, `gamma`, `delta`, `epsilon`, `zeta`: The 6 torsion angles of the RNA backbone for this nucleotide
* `epsilon_zeta`: Difference between epsilon and zeta angles
* `bb_type`: conformation of the backbone (BI, BII or ..)
* `chi`: torsion angle between the sugar and base (O-C1'-N-C4)
* `glyco_bond`: syn or anti configuration of the sugar-base bond
* `v0`, `v1`, `v2`, `v3`, `v4`: 5 torsion angles of the ribose cycle
* `form`: if the nucleotide is involved in a stem, the stem type (A, B or Z)
* `ssZp`: Z-coordinate of the 3’ phosphorus atom with reference to the 5’ base plane
* `Dp`: Perpendicular distance of the 3’ P atom to the glycosidic bond
* `eta`, `theta`: Pseudotorsions of the backbone, using phosphorus and carbon 4'
* `eta_prime`, `theta_prime`: Pseudotorsions of the backbone, using phosphorus and carbon 1'
* `eta_base`, `theta_base`: Pseudotorsions of the backbone, using phosphorus and the base center
* `phase_angle`: Conformation of the ribose cycle
* `amplitude`: Amplitude of the sugar puckering
* `puckering`: Conformation of the ribose cycle (10 classes depending on the phase_angle value)
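
The `paired` field packs several partners into one value, so it usually needs a small parsing step before use. The sketch below follows the description above (empty, or a comma-separated list of `index_chain` values, with '0' marking an inter-chain partner). The helper names are ours, and handling `None` for an empty database value is an assumption.

```python
def parse_paired(paired):
    """Turn the `paired` field into a list of integer `index_chain` values."""
    if paired is None:
        return []
    text = str(paired).strip()
    if not text:
        return []
    return [int(item) for item in text.split(",") if item.strip()]


def split_partners(paired):
    """Separate intra-chain partners from the number of inter-chain contacts (marked 0)."""
    partners = parse_paired(paired)
    intra_chain = [index for index in partners if index != 0]
    inter_chain_count = sum(1 for index in partners if index == 0)
    return intra_chain, inter_chain_count


print(parse_paired("12,0,45"))
print(split_partners("12,0,45"))
```
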
## Table `align_column`, for positions in multiple sequence alignments
* `column_id`: A unique identifier
* `rfam_acc`: The family's MSA the column belongs to
* `index_ali`: Position of the column in the alignment (starts at 1)
* `freq_A`, `freq_C`, `freq_G`, `freq_U`, `freq_other`: Nucleotide frequencies in the alignment at this position
There always is an entry, for each family (rfam_acc), with index_ali = zero and nucleotide frequencies set to freq_other = 1.0. This entry is used when the nucleotide frequencies cannot be determined because of local alignment issues.
## Table `re_mapping`, to map a nucleotide to an alignment column
* `remapping_id`: A unique identifier
* `chain_id`: The chain which is mapped to an alignment
* `index_chain`: The absolute position of the nucleotide in the chain (from 1 to X)
* `index_ali`: The position of that nucleotide in its family alignment
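
Putting the tables together, the nucleotide frequencies attached to each residue of a chain can be retrieved by going from `nucleotide` to `re_mapping` to `align_column`, with the family taken from `chain`. The following sketch shows one possible join. The function name is ours and the demo database contains made-up rows with only the columns the query needs.

```python
import sqlite3

QUERY = """
    SELECT n.index_chain, n.nt_code, r.index_ali,
           a.freq_A, a.freq_C, a.freq_G, a.freq_U, a.freq_other
    FROM nucleotide AS n
    JOIN chain AS c ON c.chain_id = n.chain_id
    JOIN re_mapping AS r ON r.chain_id = n.chain_id AND r.index_chain = n.index_chain
    JOIN align_column AS a ON a.rfam_acc = c.rfam_acc AND a.index_ali = r.index_ali
    WHERE n.chain_id = ?
    ORDER BY n.index_chain;
"""


def frequencies_for_chain(connection, chain_id):
    """Return one row per nucleotide of the chain, with its alignment column frequencies."""
    return connection.execute(QUERY, (chain_id,)).fetchall()


def build_toy_database():
    """Create an in-memory database with made-up rows, for demonstration only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE chain (chain_id INTEGER PRIMARY KEY, rfam_acc TEXT);
        CREATE TABLE nucleotide (nt_id INTEGER PRIMARY KEY, chain_id INTEGER,
                                 index_chain INTEGER, nt_code TEXT);
        CREATE TABLE re_mapping (remapping_id INTEGER PRIMARY KEY, chain_id INTEGER,
                                 index_chain INTEGER, index_ali INTEGER);
        CREATE TABLE align_column (column_id INTEGER PRIMARY KEY, rfam_acc TEXT,
                                   index_ali INTEGER, freq_A REAL, freq_C REAL,
                                   freq_G REAL, freq_U REAL, freq_other REAL);
        INSERT INTO chain VALUES (1, 'RF_TOY');
        INSERT INTO nucleotide VALUES (1, 1, 1, 'A'), (2, 1, 2, 'G');
        -- The second nucleotide points to the special index_ali = 0 entry.
        INSERT INTO re_mapping VALUES (1, 1, 1, 3), (2, 1, 2, 0);
        INSERT INTO align_column VALUES (1, 'RF_TOY', 0, 0.0, 0.0, 0.0, 0.0, 1.0),
                                        (2, 'RF_TOY', 3, 0.5, 0.25, 0.25, 0.0, 0.0);
    """)
    return conn


toy = build_toy_database()
for row in frequencies_for_chain(toy, 1):
    print(row)
```
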
# Troubleshooting
## Understanding the warnings and errors
* **Could not load X.json with JSON package** :
The JSON format produced as DSSR output could not be loaded by Python. Try deleting the file and re-running DSSR (through RNANet).
* **Found DSSR warning in annotation X.json: no nucleotides found. Ignoring X.** :
DSSR complains because the CIF structure does not seem to contain nucleotides. This can happen on low resolution structures where only P atoms are solved, you should ignore them. This also can happen if the .cif file is corrupted (failed download, etc). Check with a 3D visualization software if your chain contains well-defined nucleotides. Try deleting the .cif and retry. If the problem persists, just ignore the chain.
* **Could not find nucleotides of chain X in annotation X.json. Ignoring chain X.** : Basically the same as above, but some nucleotides have been observed in another chain of the same structure.
* **Could not find real nucleotides of chain X between START and STOP. Ignoring chain X.** : Same as the two above, but nucleotides can be found outside of the mapping interval. This can happen if there is a mapping problem, e.g., considered absolute interval when it should not.
* **Error while parsing DSSR X.json output: {custom-error}** : The DSSR annotations lack some of our required fields. It is likely that DSSR changed something in their fields names. Contact us so that we fix the problem with the latest DSSR version.
* **Mapping is reversed, this case is not supported (yet). Ignoring chain X.** : The mapping coordinates, as obtained from Rfam, have an end position coming before the start position (meaning, the sequence has to be reversed to map the RNA covariance model). We do not support this yet, we ignore this chain.
* **Error with parsing of X duplicate residue numbers. Ignoring it.** : This 3D chain contains new kind(s) of issue(s) in the residue numberings that are not part of the issues we already know how to tackle. Contact us, so that we add support for this entry.
* **Found duplicated index_chain N in X. Keeping only the first.** : This RNA 3D chain contains two (or more) residues with the same numbering N. This often happens when a nucleic-like ligand is annotated as part of the RNA chain, and DSSR considers it a nucleotide. By default, RNANet keeps only the first of the multiple residues with the same number. You may want to check that the produced 3D structure contains the appropriate nucleotide and no ligand.
* **Missing index_chain N in X !** : DSSR annotations for chain X are discontinuous, position N is missing. This means residue N has not been recognized as a nucleotide by DSSR. Is the .cif structure file corrupted ? Delete it and retry.
* **X sequence is too short, let's ignore it.** : We discard very short RNA chains.
* **Error downloading and/or extracting Rfam.cm !** : We cannot retrieve the Rfam covariance models file. RNANet tries to find it at ftp://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/Rfam.cm.gz so, check that your network is not blocking the FTP protocol (port 21 is open on your network), and check that the address has not changed. If so, contact us so that we update RNANet with the correct address.
* **Something's wrong with the SQL database. Check mysql-rfam-public.ebi.ac.uk status and try again later. Not printing statistics.** : We cannot retrieve family statistics from the Rfam public server. Check if you can connect to it by hand: `mysql -u rfamro -P 4497 -D Rfam -h mysql-rfam-public.ebi.ac.uk`. If not, check that port 4497 is open on your network.
* **Error downloading RFXXXXX.fa.gz: {custom-error}** : We cannot reach the Rfam FTP server to download homologous sequences. We look in ftp://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/fasta_files/ so, check if you can access it from your network (check that port 21 is opened on your network). Check if the address has changed and notify us.
* **Error downloading NR list !** : We cannot download BGSU's equivalence classes from their website. Check if you can access http://rna.bgsu.edu/rna3dhub/nrlist/download/current/20.0A/csv from a web browser. Their website sometimes does not respond; in that case, the previous download is re-used.
* **Error downloading the LSU/SSU database from SILVA** : We cannot reach SILVA's arb files. We are looking for http://www.arb-silva.de/fileadmin/arb_web_db/release_132/ARB_files/SILVA_132_LSURef_07_12_17_opt.arb.gz and http://www.arb-silva.de/fileadmin/silva_databases/release_138/ARB_files/SILVA_138_SSURef_05_01_20_opt.arb.gz , can you download and extract them from your web browser and place them in the realigned/ subfolder ?
* **Assuming mapping to RFXXXXX is an absolute position interval.** : The mapping provided by Rfam concerns a nucleotide interval START-END, but no nucleotides are defined in 3D in that interval. When this happens, we assume that the numbering is not relative to the residue numbers in the 3D file, but to the absolute position in the chain, starting at 1. And yes, we tried to apply this behavior to all mappings, but this yields the opposite issue where some mappings get outside the available nucleotides. To be solved the day Rfam explains how they build the mappings.
* **Added newly discovered issues to known issues** : You discovered new chains that cannot be perfectly understood as they actually are, congrats. For each chain of the list, another warning has been raised, refer to them.
* **Structures without referenced chains have been detected.** : Something went wrong, because the database contains references to 3D structures that are not used by any entry in the `chain` table. You should rerun RNANet. The option `--only` may help to rerun it just for one chain.
* **Chains without referenced structures have been detected** : Something went wrong, because the database contains references to 3D chains that are not used by any entry in the `structure` table. You should rerun RNANet. The option `--only` may help to rerun it just for one chain.
* **Chains were not remapped** : Something went wrong, because the database contains references to 3D chains that are not used by any entry in the `re_mapping` table, assuming you were interested in homology data. You should rerun RNANet. The option `--only` may help to rerun it just for one chain. If you are not interested in homology data, use option `--no-homology` to skip alignment and remapping steps.
* **Operational Error: database is locked, retrying in 0.2s** : Too many workers are trying to access the database at the same time. Do not try to run several instances of RNANet in parallel. Even with only one instance, this might still happen if your device has slow I/O. Try to run RNANet from an SSD.
* **Tried to reach database 100 times and failed. Aborting.** : Same as above, but in a more serious way.
* **Nothing to do !** : RNANet is up-to-date, or did not detect any modification to do, so nothing changed in your database.
* **KeyboardInterrupt, terminating workers.** : You interrupted the computation by pressing Ctrl+C. The database may be in an unstable state, rerun RNANet to solve the problem.
* **Found mappings to RFXXXXX in both directions on the same interval, keeping only the 5'->3' one.** : A chain has been mapped to family RFXXXXX, but the mapping has been found twice, with the limits inverted. We only keep one (in 5'->3' sense).
* **There are mappings for RFXXXXX in both directions** : A chain has been mapped to family RFXXXXX several times, and the mappings are not all in the same sequence sense (some are reversed, with END < START). In that case, we cannot decide which one to keep for this chain, and we abort.
* **Unable to download XXXX.cif. Ignoring it.** : We cannot access a certain 3D structure from RCSB's download site, can you access it from your web browser and put it in the RNAcifs/ folder ? We look at http://files.rcsb.org/download/XXXX.cif , replacing XXXX by the right PDB code.
* **Wtf, structure XXXX has no resolution ? Check https://files.rcsb.org/header/XXXX.cif to figure it out.** : We cannot find the resolution of structure XXXX from the .cif file. We are looking for it in the fields `_refine.ls_d_res_high`, `_refine.ls_d_res_low`, and `_em_3d_reconstruction.resolution`. Maybe the information is stored in another field ? If you find it, contact us so that we support this new CIF field.
* **Could not find annotations for X, ignoring it.** : It seems that DSSR has not been run for structure X, or failed. Rerun RNANet.
* **Nucleotides not inserted: {custom-error}** : For some reason, no nucleotides were saved to the database for this chain. Contact us.
* **Removing N doublons from existing RFXXXXX++.fa and using their newest version** : You are trying to re-compute sequence alignments of 3D structures that had already been computed in the past. The existing duplicates will be removed from the alignment and recomputed, in case the sequences have changed.
* **Removing N doublons from existing RFXXXXX++.stk and using their newest version** : Same as above.
* **Error during sequence alignment: {custom-error}** : Something went wrong during sequence alignment. Recompute the alignments using the `--update-homologous` option.
* **Failed to realign RFXXXXX (killed)** : You ran out of memory while computing multiple sequence alignments. Try to run RNANet on a machine with at least 32 GB of RAM.
* **RFXXXXX's alignment is wrong. Recompute it and retry.** : We could not load RFXXXXX's multiple sequence alignment. It may have failed to compute, or be corrupted. Recompute the alignments using the `--update-homologous` option.
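
Several of the messages above come down to a host or port being unreachable from your network. As a rough aid, here is a minimal Python sketch (not part of RNANet) that tests whether a TCP connection can be opened to the Rfam MySQL and FTP servers quoted above. The names `port_is_reachable`, `SERVICES` and `report` are hypothetical helpers introduced for this example only.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False


# Hosts and ports quoted in the error messages above
SERVICES = {
    "Rfam MySQL": ("mysql-rfam-public.ebi.ac.uk", 4497),
    "Rfam FTP": ("ftp.ebi.ac.uk", 21),
}


def report(services=SERVICES):
    """Return a {name: bool} dict telling which services are reachable."""
    return {name: port_is_reachable(host, port)
            for name, (host, port) in services.items()}
```

Calling `print(report())` from the machine that runs RNANet shows which services answer. A `False` value only tells you that the TCP connection failed; it does not distinguish a closed port on your side from a server that is down.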
## Not enough memory
If you run out of memory, you may want to reduce the number of jobs run in parallel. #TODO: explain how
Check if your problem is listed in the [known issues](KnownIssues.md).
### Warning and Errors
If you ran RNANet and got an error or a warning that you do not fully understand, check the [Error documentation](Errors.md).
### Not enough memory
If you run out of memory (job killed), you may want to reduce the number of jobs run in parallel. Use the `--maxcores` option with a small number to ask RNANet to limit the concurrency and the simultaneous need for a lot of RAM. The computation time will increase accordingly.
### Not enough memory/too slow (developer trick)
If `--maxcores` is not enough, and you have identified the step which fails, you can try to edit the Python code. Look for the `coeff_ncores` argument of some function calls. This is the coefficient applied to `--maxcores` for different steps of the pipeline. You can change it according to your needs to reduce or increase concurrency (to use less memory, or to compute faster, respectively).
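
To get a feeling for how these two settings interact, here is a minimal sketch. It is not the actual RNANet code: the function `effective_workers` is hypothetical, and the rounding rule (truncate, but keep at least one worker) is an assumption. Only the `min(ncores, maxcores)` cap is taken from the option handling in `RNAnet.py`.

```python
import os


def effective_workers(maxcores=None, coeff_ncores=1.0, cpu_count=None):
    """Estimate how many parallel workers a pipeline step would use."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # --maxcores can only lower the number of cores: min(ncores, int(arg))
    ncores = cpu_count if maxcores is None else min(cpu_count, int(maxcores))
    # Assumption: the product is truncated and never drops below one worker
    return max(1, int(coeff_ncores * ncores))
```

For example, under these assumptions, `effective_workers(maxcores=4, coeff_ncores=0.5, cpu_count=16)` gives 2 workers for a step that uses half the cores, while a `--maxcores` value larger than the number of CPUs has no effect.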
# Contact
louis.becquey@univ-evry.fr
RNANet is still in beta, which means we truly welcome (and enjoy) all the feedback we can get from interested users.
Please send all your questions, feature requests, bug reports or angry reacts to
louis.becquey@univ-evry.fr .
......
......@@ -979,9 +979,9 @@ class Pipeline:
setproctitle("RNANet.py process_options()")
try:
opts, _ = getopt.getopt(sys.argv[1:], "r:fhs", ["help", "resolution=", "3d-folder=", "seq-folder=", "keep-hetatm=", "only=",
opts, _ = getopt.getopt(sys.argv[1:], "r:fhs", ["help", "resolution=", "3d-folder=", "seq-folder=", "keep-hetatm=", "only=", "maxcores=",
"from-scratch", "full-inference", "no-homology", "ignore-issues", "extract",
"all", "no-logs", "archive", "update-homologous"])
"all", "no-logs", "archive", "update-homologous", "version"])
except getopt.GetoptError as err:
print(err)
sys.exit(2)
......@@ -1000,13 +1000,19 @@ class Pipeline:
print("-h [ --help ]\t\t\tPrint this help message")
print("--version\t\t\tPrint the program version")
print()
print("-f [ --full-inference ]\t\tInfer new mappings even if Rfam already provides some. Yields more copies of chains"
"\n\t\t\t\tmapped to different families.")
print("-r 4.0 [ --resolution=4.0 ]\tMaximum 3D structure resolution to consider a RNA chain.")
print("Select what to do:")
print("--------------------------------------------------------------------------------------------------------------")
print("-f [ --full-inference ]\t\tInfer new mappings even if Rfam already provides some. Yields more copies of"
"\n\t\t\t\t chains mapped to different families.")
print("-s\t\t\t\tRun statistics computations after completion")
print("--extract\t\t\tExtract the portions of 3D RNA chains to individual mmCIF files.")
print("--keep-hetatm=False\t\t(True | False) Keep ions, waters and ligands in produced mmCIF files. "
"\n\t\t\t\tDoes not affect the descriptors.")
"\n\t\t\t\t Does not affect the descriptors.")
print("--no-homology\t\t\tDo not try to compute PSSMs and do not align sequences."
"\n\t\t\t\t Allows to yield more 3D data (consider chains without a Rfam mapping).")
print()
print("Select how to do it:")
print("--------------------------------------------------------------------------------------------------------------")
print("--3d-folder=…\t\t\tPath to a folder to store the 3D data files. Subfolders will contain:"
"\n\t\t\t\t\tRNAcifs/\t\tFull structures containing RNA, in mmCIF format"
"\n\t\t\t\t\trna_mapped_to_Rfam/\tExtracted 'pure' RNA chains"
......@@ -1014,22 +1020,28 @@ class Pipeline:
print("--seq-folder=…\t\t\tPath to a folder to store the sequence and alignment files. Subfolders will be:"
"\n\t\t\t\t\trfam_sequences/fasta/\tCompressed hits to Rfam families"
"\n\t\t\t\t\trealigned/\t\tSequences, covariance models, and alignments by family")
print("--no-homology\t\t\tDo not try to compute PSSMs and do not align sequences."
"\n\t\t\t\tAllows to yield more 3D data (consider chains without a Rfam mapping).")
print("--maxcores=…\t\t\tLimit the number of cores to use in parallel portions to reduce the simultaneous"
"\n\t\t\t\t need of RAM. Should be a number between 1 and your number of CPUs. Note that portions"
"\n\t\t\t\t of the pipeline already limit themselves to 50% or 70% of that number by default.")
print("--archive\t\t\tCreate tar.gz archives of the datapoints text files and the alignments,"
"\n\t\t\t\t and update the link to the latest archive. ")
print("--no-logs\t\t\tDo not save per-chain logs of the numbering modifications")
print()
print("Select which data we are interested in:")
print("--------------------------------------------------------------------------------------------------------------")
print("-r 4.0 [ --resolution=4.0 ]\tMaximum 3D structure resolution to consider a RNA chain.")
print("--all\t\t\t\tBuild chains even if they already are in the database.")
print("--only\t\t\t\tAsk to process a specific chain label only")
print("--ignore-issues\t\t\tDo not ignore already known issues and attempt to compute them")
print("--update-homologous\t\tRe-download Rfam and SILVA databases, realign all families, and recompute all CSV files")
print("--from-scratch\t\t\tDelete database, local 3D and sequence files, and known issues, and recompute.")
print("--archive\t\t\tCreate a tar.gz archive of the datapoints text files, and update the link to the latest archive")
print("--no-logs\t\t\tDo not save per-chain logs of the numbering modifications")
print()
print("Typical usage:")
print(f"nohup bash -c 'time {fileDir}/RNAnet.py --3d-folder ~/Data/RNA/3D/ --seq-folder ~/Data/RNA/sequences -s' &")
print(f"nohup bash -c 'time {fileDir}/RNAnet.py --3d-folder ~/Data/RNA/3D/ --seq-folder ~/Data/RNA/sequences -s --no-logs' &")
sys.exit()
elif opt == '--version':
print("RNANet 1.3 beta, parallelized, Dockerized")
print("RNANet v1.3 beta, parallelized, Dockerized")
print("Last revision : Jan 2021")
sys.exit()
elif opt == "-r" or opt == "--resolution":
assert float(arg) > 0.0 and float(arg) <= 20.0
......@@ -1084,6 +1096,9 @@ class Pipeline:
self.ARCHIVE = True
elif opt == "--no-logs":
self.SAVELOGS = False
elif opt == "--maxcores":
global ncores
ncores = min(ncores, int(arg))
elif opt == "-f" or opt == "--full-inference":
self.FULLINFERENCE = True
......@@ -2614,9 +2629,9 @@ if __name__ == "__main__":
runDir = os.getcwd()
fileDir = os.path.dirname(os.path.realpath(__file__))
ncores = read_cpu_number()
print(f"> Running {python_executable} on {ncores} CPU cores in folder {runDir}.")
pp = Pipeline()
pp.process_options()
print(f"> Running {python_executable} on {ncores} CPU cores in folder {runDir}.")
# Prepare folders
os.makedirs(runDir + "/results", exist_ok=True)
......@@ -2639,8 +2654,7 @@ if __name__ == "__main__":
# Download and annotate new RNA 3D chains (Chain objects in pp.update)
# If the original cif file and/or the Json DSSR annotation file already exist, they are not redownloaded/recomputed.
# pp.dl_and_annotate(coeff_ncores=0.5)
pp.dl_and_annotate(coeff_ncores=1.0)
pp.dl_and_annotate(coeff_ncores=0.5)
print("Here we go.")
# At this point, the structure table is up to date.
......@@ -2652,7 +2666,7 @@ if __name__ == "__main__":
# Redownload and re-annotate
print("> Retrying to annotate some structures which just failed.", flush=True)
pp.dl_and_annotate(retry=True, coeff_ncores=0.3) #
pp.build_chains(retry=True, coeff_ncores=1.0) # Use half the cores to reduce required amount of memory
pp.build_chains(retry=True, coeff_ncores=0.5) # Use half the cores to reduce required amount of memory
print(f"> Loaded {len(pp.loaded_chains)} RNA chains ({len(pp.update) - len(pp.loaded_chains)} ignored/errors).")
if len(no_nts_set):
print(f"Among errors, {len(no_nts_set)} structures seem to contain RNA chains without defined nucleotides:", no_nts_set, flush=True)
......
1apg_1_D
1b2m_1_C
1b2m_1_D
1b2m_1_E
1cgm_1_I
1cwp_1_D
1cwp_1_E
1cwp_1_F
1ddl_1_E
1e8s_1_C
1eg0_1_L
1eg0_1_L_1-56
1eg0_1_M
1eg0_1_O
1eg0_1_O_1-73
1emi_1_B
1emi_1_B_1-108
1gsg_1_T
1gsg_1_T_1-72
1h2c_1_R
1h2d_1_R
1h2d_1_S
1i5l_1_U
1i5l_1_Y
1ibl_1_Z
1ibm_1_Z
1jgo_1_A
1jgo_1_A_2-1520
1jgp_1_A
1jgp_1_A_2-1520
1jgq_1_A
1jgq_1_A_2-1520
1laj_1_R
1ls2_1_B
1ls2_1_B_1-73
1m8w_1_E
1m8w_1_F
1mj1_1_Q
1mj1_1_R
1ml5_1_A
1ml5_1_a_1-2914
1ml5_1_A_2-1520
1ml5_1_b_5-121
1mvr_1_1
1mvr_1_A
1mvr_1_B
1mvr_1_B_3-96
1mvr_1_C
1mvr_1_D
1mvr_1_D_1-61
1mvr_1_E
1n1h_1_B
1n32_1_Z
1n33_1_Z
1n34_1_Z
1n38_1_B
1nb7_1_E
1nb7_1_F
1pn7_1_C
1pn8_1_D
1pvo_1_G
1pvo_1_H
1pvo_1_J
1pvo_1_K
1pvo_1_L
1qln_1_R
1qvg_1_3
1qzc_1_A
1qzc_1_B
1qzc_1_C
1r2w_1_C
1r2w_1_C_1-58
1r2x_1_C
1r2x_1_C_1-58
1rmv_1_B
1t1m_1_A
1t1m_1_B
1trj_1_B
1trj_1_C
1utd_1_1
1utd_1_2
1utd_1_3
1utd_1_4
1utd_1_5
1utd_1_6
1utd_1_7
1utd_1_8
1utd_1_9
1utd_1_Z
1uvi_1_D
1uvi_1_E
1uvi_1_F
1uvj_1_D
1uvj_1_E
1uvj_1_F
1uvn_1_B
1uvn_1_D
1uvn_1_F
1vq6_1_4
1vqn_1_4
1vqo_1_4
1vtm_1_R
1vy7_1_AY_1-73
1vy7_1_CY_1-73
1x18_1_A
1x18_1_B
1x18_1_C
1x18_1_D
1x1l_1_A
1x1l_1_A_1-132
1xmo_1_W
1xmq_1_W
1xnq_1_W
1xnr_1_W
1xpo_1_G
1xpo_1_H
1xpo_1_J
1xpo_1_K
1xpo_1_L
1xpo_1_M
1xpr_1_G
1xpr_1_H
1xpr_1_J
1xpr_1_K
1xpr_1_L
1xpr_1_M
1xpu_1_G
1xpu_1_H
1xpu_1_J
1xpu_1_K
1xpu_1_L
1xpu_1_M
1y1y_1_P
1ytu_1_D
1ytu_1_F
1zc8_1_A
1zc8_1_A_1-59
1zc8_1_B
1zc8_1_C
1zc8_1_F
1zc8_1_G
1zc8_1_H
1zc8_1_I
1zc8_1_J
1zc8_1_Z
1zc8_1_Z_1-93
1zn0_1_C
1zn1_1_B
1zn1_1_B_1-59
1zn1_1_C
2a1r_1_C
2a1r_1_D
2a8v_1_D
2atw_1_B
2atw_1_D
2az0_1_C
2az0_1_D
2az2_1_C
2az2_1_D
2b2d_1_S
2f4v_1_Z
2ftc_1_R
2ftc_1_R_81-1466
2fz2_1_D
2ht1_1_J
2ht1_1_K
2iy3_1_B
2iy3_1_B_9-105
2ob7_1_A
2ob7_1_A_10-319
2ob7_1_D
2ob7_1_D_1-132
2om3_1_R
2qqp_1_R
2r1g_1_A
2r1g_1_B
2r1g_1_C
2r1g_1_D
2r1g_1_E
2r1g_1_F
2r1g_1_X
2rdo_1_A
2rdo_1_A_3-118
2rdo_1_B
2rdo_1_B_1-2904
2tmv_1_R
2uxb_1_X
2uxc_1_Y
2uxd_1_X
2vaz_1_A
2vaz_1_A_64-177
2voo_1_C
2voo_1_D
2vrt_1_E
2vrt_1_F
2vrt_1_G
2vrt_1_H
2wj8_1_A
2wj8_1_B
2wj8_1_C
2wj8_1_D
2wj8_1_E
2wj8_1_F
2wj8_1_G
2wj8_1_H
2wj8_1_I
2wj8_1_J
2wj8_1_K
2wj8_1_L
2wj8_1_M
2wj8_1_N
2wj8_1_O
2wj8_1_P
2wj8_1_Q
2wj8_1_R
2wj8_1_S
2wj8_1_T
2x1a_1_B
2x1f_1_B
2xea_1_R
2xnr_1_C
2xpj_1_D
2xs5_1_D
2xs7_1_B
2z9q_1_A
2z9q_1_A_1-72
2zde_1_E
2zde_1_F
2zde_1_G
2zde_1_H
3avt_1_T
3b0u_1_A
3b0u_1_B
3bbv_1_Z
3cd6_1_4
3cma_1_5
3cme_1_5
3cw1_1_V
3cw1_1_v_1-138
3cw1_1_V_1-138
3cw1_1_W
3cw1_1_w_1-138
3cw1_1_X
3cw1_1_x_1-138
3d2s_1_F
3d2s_1_H
3ep2_1_A
3ep2_1_B
3ep2_1_B_1-50
3ep2_1_C
3ep2_1_D
3ep2_1_E
3ep2_1_Y
3ep2_1_Y_1-72
3eq3_1_A
3eq3_1_B
3eq3_1_B_1-50
3eq3_1_C
3eq3_1_D
3eq3_1_E
3eq3_1_Y
3eq3_1_Y_1-72
3eq4_1_A
3eq4_1_B
3eq4_1_B_1-50
3eq4_1_C
3eq4_1_D
3eq4_1_E
3eq4_1_Y
3eq4_1_Y_1-69
3er8_1_F
3er8_1_G
3er8_1_H
3er9_1_D
3erc_1_G
3gpq_1_E
3gpq_1_F
3ie1_1_E
3ie1_1_F
3ie1_1_G
3ie1_1_H
3iy8_1_A
3iy8_1_A_1-540
3iy9_1_A
3iy9_1_A_498-1027
3j06_1_R
3j0l_1_A
3j0l_1_B
3j0l_1_C
3j0l_1_D
3j0l_1_F
3j0l_1_H
3j0o_1_A
3j0o_1_B
3j0o_1_C
3j0o_1_D
3j0o_1_F
3j0o_1_H
3j0p_1_A
3j0p_1_C
3j0p_1_D
3j0p_1_F
3j0p_1_H
3j0q_1_A
3j0q_1_C
3j0q_1_D
3j0q_1_F
3j0q_1_H
3j2k_1_0
3j2k_1_1
3j2k_1_2
3j2k_1_3
3j2k_1_4
3j46_1_A
3j46_1_P
3j6b_1_E
3j6x_1_IR
3j6y_1_IR
3j9m_1_U
3j9y_1_V
3jb7_1_M
3jb7_1_T
3jbu_1_B
3jbu_1_V
3jbv_1_B
3jcj_1_G
3jcj_1_V
3jcn_1_V
3jcr_1_H
3jcr_1_H_1-115
3jcr_1_M
3jcr_1_M_1-141
3jcr_1_N
3jcr_1_N_1-107
3koa_1_C
3m7n_1_Z
3m85_1_X
3m85_1_Y
3m85_1_Z
3nma_1_B
3nma_1_C
3nvk_1_G
3nvk_1_S
3ok4_1_2
3ok4_1_4
3ok4_1_H
3ok4_1_J
3ok4_1_L
3ok4_1_N
3ok4_1_P
3ok4_1_R
3ok4_1_T
3ok4_1_V
3ok4_1_X
3ok4_1_Z
3ol6_1_D
3ol6_1_H
3ol6_1_L
3ol6_1_P
3ol7_1_D
3ol7_1_H
3ol7_1_L
3ol7_1_P
3ol8_1_D
3ol8_1_H
3ol8_1_L
3ol8_1_P
3ol9_1_D
3ol9_1_H
3ol9_1_L
3ol9_1_P
3olb_1_D
3olb_1_H
3olb_1_L
3olb_1_P
3p6y_1_Q
3p6y_1_T
3p6y_1_U
3p6y_1_V
3p6y_1_W
3pdm_1_R
3pf5_1_S
3pgw_1_N
3pgw_1_N_1-164
3pgw_1_R
3pgw_1_R_1-164
3qsu_1_P
3qsu_1_R
3rtj_1_D
3rzo_1_R
3s4g_1_B
3s4g_1_C
3t1h_1_W
3t1y_1_W
3u2e_1_C
3u2e_1_D
3wzi_1_C
486d_1_F
486d_1_G
4a3b_1_P
4a3c_1_P
4a3e_1_P
4a3g_1_P
4a3j_1_P
4a3m_1_P
4adx_1_0
4adx_1_0_1-2925
4adx_1_8
4adx_1_9
4adx_1_9_1-123
4afy_1_C
4afy_1_D
4am3_1_D
4am3_1_H
4am3_1_I
4b3r_1_W
4b3s_1_W
4b3t_1_W
4ba2_1_R
4bbl_1_Y
4bbl_1_Z
4csf_1_A
4csf_1_C
4csf_1_E
4csf_1_G
4csf_1_I
4csf_1_K
4csf_1_M
4csf_1_O
4csf_1_Q
4csf_1_S
4csf_1_U
4csf_1_W
4cxg_1_A
4cxg_1_B
4cxg_1_C
4cxh_1_A
4cxh_1_B
4cxh_1_C
4cxh_1_X
4d61_1_J
4dr4_1_V
4dr5_1_V
4dr6_1_B
4dr6_1_V
4dr7_1_B
4dr7_1_V
4dwa_1_D
4e6b_1_A
4e6b_1_B
4e6b_1_E
4e6b_1_F
4ejt_1_G
4eya_1_A
4eya_1_B
4eya_1_C
4eya_1_D
4eya_1_E
4eya_1_F
4eya_1_G
4eya_1_H
4eya_1_I
4eya_1_J
4eya_1_K
4eya_1_L
4eya_1_M
4eya_1_N
4eya_1_O
4eya_1_P
4eya_1_Q
4eya_1_R
4eya_1_S
4eya_1_T
4g0a_1_E
4g0a_1_F
4g0a_1_G
4g0a_1_H
4g7o_1_I
4g7o_1_S
4g9z_1_E
4g9z_1_F
4gkj_1_W
4gkk_1_W
4gv3_1_B
4gv3_1_C
4gv6_1_B
4gv6_1_C
4gv9_1_E
4hor_1_X
4hos_1_X
4hot_1_X
4ht9_1_E
4i67_1_B
4ii9_1_C
4j7m_1_B
4jzu_1_C
4jzv_1_C
4k4s_1_D
4k4s_1_H
4k4t_1_D
4k4t_1_H
4k4u_1_D
4k4u_1_H
4k4x_1_D
4k4x_1_H
4k4x_1_L
4k4x_1_P
4k4z_1_D
4k4z_1_H
4k4z_1_L
4k4z_1_P
4kzx_1_I
4kzy_1_I
4kzz_1_I
4kzz_1_J
4lj0_1_C
4lj0_1_D
4lj0_1_E
4lq3_1_R
4m7d_1_P
4n2s_1_B
4n48_1_D
4n48_1_G
4nia_1_1
4nia_1_2
4nia_1_3
4nia_1_4
4nia_1_5
4nia_1_6
4nia_1_7
4nia_1_8
4nia_1_A
4nia_1_B
4nia_1_C
4nia_1_D
4nia_1_E
4nia_1_F
4nia_1_G
4nia_1_H
4nia_1_I
4nia_1_J
4nia_1_K
4nia_1_L
4nia_1_M
4nia_1_N
4nia_1_O
4nia_1_U
4nia_1_W
4nia_1_Z
4nku_1_D
4nku_1_H
4oau_1_A
4oav_1_A
4oav_1_C
4ohy_1_B
4ohz_1_B
4oi0_1_B
4oi1_1_B
4oq8_1_D
4oq9_1_1
4oq9_1_2
4oq9_1_3
4oq9_1_4
4oq9_1_5
4oq9_1_6
4oq9_1_7
4oq9_1_8
4oq9_1_A
4oq9_1_B
4oq9_1_C
4oq9_1_D
4oq9_1_E
4oq9_1_F
4oq9_1_G
4oq9_1_H
4oq9_1_I
4oq9_1_J
4oq9_1_K
4oq9_1_L
4oq9_1_M
4oq9_1_N
4oq9_1_O
4oq9_1_U
4oq9_1_W
4oq9_1_Z
4peh_1_V
4peh_1_W
4peh_1_X
4peh_1_Y
4peh_1_Z
4pei_1_V
4pei_1_W
4pei_1_X
4pei_1_Y
4pei_1_Z
4qm6_1_C
4qm6_1_D
4qu6_1_B
4qu7_1_U
4qu7_1_V
4qu7_1_X
4qvc_1_G
4qvd_1_H
4rcj_1_B
4s2x_1_B
4s2y_1_B
4tu0_1_F
4tu0_1_G
4udv_1_R
4v42_1_AA
4v42_1_AA_2-1520
4v42_1_BA
4v42_1_BA_1-2914
4v42_1_BB
4v42_1_BB_5-121
4v47_1_A0
4v47_1_A0_1-2904
4v47_1_A9
4v47_1_A9_3-118
4v47_1_BA
4v47_1_BA_1-1542
4v48_1_A0
4v48_1_A0_1-2904
4v48_1_A6
4v48_1_A6_1-73
4v48_1_A9
4v48_1_A9_3-118
4v48_1_BA
4v48_1_BA_1-1543
4v4f_1_A0
4v4f_1_A1
4v4f_1_A2
4v4f_1_A3
4v4f_1_A4
4v4f_1_A5
4v4f_1_A6
4v4f_1_A7
4v4f_1_A8
4v4f_1_A9
4v4f_1_AZ
4v4f_1_B0
4v4f_1_B1
4v4f_1_B2
4v4f_1_B3
4v4f_1_B4
4v4f_1_B5
4v4f_1_B6
4v4f_1_B7
4v4f_1_B8
4v4f_1_B9
4v4f_1_BZ
4v4i_1_W
4v4i_1_X
4v4i_1_Y
4v4i_1_Z
4v4j_1_W
4v4j_1_X
4v4j_1_Y
4v4j_1_Z
4v5z_1_AA
4v5z_1_AA_1-1563
4v5z_1_AB
4v5z_1_AC
4v5z_1_AD
4v5z_1_AE
4v5z_1_AF
4v5z_1_AG
4v5z_1_AH
4v5z_1_B0
4v5z_1_B0_1-2902
4v5z_1_B1
4v5z_1_B1_2-125
4v5z_1_BA
4v5z_1_BB
4v5z_1_BC
4v5z_1_BD
4v5z_1_BE
4v5z_1_BF
4v5z_1_BG
4v5z_1_BH
4v5z_1_BI
4v5z_1_BJ
4v5z_1_BK
4v5z_1_BL
4v5z_1_BM
4v5z_1_BN
4v5z_1_BO
4v5z_1_BP
4v5z_1_BQ
4v5z_1_BR
4v5z_1_BS
4v5z_1_BT
4v5z_1_BU
4v5z_1_BV
4v5z_1_BW
4v5z_1_BX
4v5z_1_BY
4v5z_1_BY_2-113
4v5z_1_BZ
4v5z_1_BZ_1-70
4v68_1_A0
4v7e_1_AA
4v7e_1_AB
4v7e_1_AC
4v7e_1_AD
4v7e_1_AE
4v7j_1_AV
4v7j_1_AW
4v7j_1_BV
4v7j_1_BW
4v7k_1_AV
4v7k_1_AW
4v7k_1_BV
4v7k_1_BW
4v8t_1_1
4v8z_1_CX
4v99_1_AC
4v99_1_AH
4v99_1_AM
4v99_1_AR
4v99_1_AW
4v99_1_BC
4v99_1_BH
4v99_1_BM
4v99_1_BR
4v99_1_BW
4v99_1_CC
4v99_1_CH
4v99_1_CM
4v99_1_CR
4v99_1_CW
4v99_1_DC
4v99_1_DH
4v99_1_DM
4v99_1_DR
4v99_1_DW
4v99_1_EC
4v99_1_EH
4v99_1_EM
4v99_1_ER
4v99_1_EW
4v99_1_FC
4v99_1_FH
4v99_1_FM
4v99_1_FR
4v99_1_FW
4v99_1_GC
4v99_1_GH
4v99_1_GM
4v99_1_GR
4v99_1_GW
4v99_1_HC
4v99_1_HH
4v99_1_HM
4v99_1_HR
4v99_1_HW
4v99_1_IC
4v99_1_IH
4v99_1_IM
4v99_1_IR
4v99_1_IW
4v99_1_JC
4v99_1_JH
4v99_1_JM
4v99_1_JR
4v99_1_JW
4v9e_1_AA
4v9e_1_AG
4v9e_1_AM
4v9e_1_BA
4v9e_1_BG
4v9e_1_BM
4w2e_1_W
4w2e_1_X
4w2h_1_CY_1-73
4wkr_1_C
4wt8_1_AB
4wt8_1_BB
4wt8_1_CS
4wt8_1_DS
4wti_1_P
4wti_1_T
4wtj_1_P
4wtj_1_T
4wtk_1_P
4wtk_1_T
4wtl_1_P
4wtl_1_T
4wtm_1_P
4wtm_1_T
4x4u_1_H
4x62_1_B
4x64_1_B
4x65_1_B
4x66_1_B
4x9e_1_G
4x9e_1_H
4xbf_1_D
4xln_1_Q
4xln_1_T
4xlr_1_Q
4xlr_1_T
4y4p_1_1W
4y4p_1_1X
4y4p_1_1Y
4y4p_1_2W
4y4p_1_2X
4y4p_1_2Y
4yln_1_3
4yln_1_6
4yln_1_9
4ylo_1_3
4ylo_1_6
4ylo_1_9
4yoe_1_E
4z3s_1_1W
4z3s_1_1X
4z3s_1_1Y
4z3s_1_2W
4z3s_1_2X
4z3s_1_2Y
4z8c_1_1X
4z8c_1_2X
4zer_1_1X
4zer_1_2X
5a0v_1_F
5a79_1_R
5a7a_1_R
5afi_1_V
5afi_1_W
5afi_1_Y
5aj0_1_BV
5aj0_1_BW
5bud_1_D
5bud_1_E
5c0y_1_C
5ceu_1_C
5ceu_1_D
5det_1_P
5doy_1_1W
5doy_1_1X
5doy_1_1Y
5doy_1_2W
5doy_1_2X
5doy_1_2Y
5dto_1_B
5e02_1_C
5elk_1_R
5els_1_I
5elt_1_E
5elt_1_F
5f6c_1_C
5f6c_1_E
5f8k_1_1X
5f8k_1_2X
5fl8_1_X
5fl8_1_Y
5fl8_1_Z
5flx_1_Z
5g2x_1_A_595-692
5gmf_1_E
5gmf_1_F
5gmf_1_G
5gmf_1_H
5gmg_1_C
5gmg_1_D
5gxi_1_B
5h5u_1_H
5hau_1_1W
5hau_1_2W
5hcp_1_1X
5hcp_1_2X
5hcq_1_1X
5hcq_1_2X
5hcr_1_1X
5hcr_1_2X
5hd1_1_1X
5hd1_1_2X
5hjz_1_C
5hk0_1_F
5hkc_1_C
5i2d_1_K
5i2d_1_V
5ipl_1_3
5ipm_1_3
5ipn_1_3
5it9_1_I
5j4b_1_1W
5j4b_1_1X
5j4b_1_1Y
5j4b_1_2W
5j4b_1_2X
5j4b_1_2Y
5j4c_1_1W
5j4c_1_1X
5j4c_1_1Y
5j4c_1_2W
5j4c_1_2X
5j4c_1_2Y
5j8b_1_W
5j8b_1_X
5j8b_1_Y
5jcs_1_X
5jcs_1_Y
5jcs_1_Z
5jju_1_C
5k77_1_V
5k77_1_W
5k77_1_X
5k77_1_Y
5k77_1_Z
5k78_1_X
5k78_1_Y
5k8h_1_A
5kal_1_Y
5kal_1_Z
5kcr_1_1X
5kcs_1_1X
5l3p_1_X
5l3p_1_Y
5lza_1_V
5lzb_1_V
5lzb_1_W
5lzb_1_X
5lzb_1_Y
5lzc_1_V
5lzc_1_W
5lzc_1_X
5lzc_1_Y
5lzd_1_V
5lzd_1_W
5lzd_1_X
5lzd_1_Y
5lze_1_V
5lze_1_W
5lze_1_X
5lze_1_Y
5lzf_1_V
5lzf_1_Y
5lzs_1_II
5lzy_1_HH
5mc6_1_M
5mc6_1_N
5mfx_1_B
5mgp_1_X
5mmi_1_Z
5mmj_1_A
5mmm_1_Z
5mq0_1_3
5mrc_1_AA
5mrc_1_BB
5mre_1_AA
5mre_1_BB
5mrf_1_AA
5mrf_1_BB
5new_1_C
5o1y_1_B
5o2r_1_X
5o3j_1_B
5odv_1_A
5odv_1_B
5odv_1_C
5odv_1_D
5odv_1_E
5odv_1_F
5odv_1_G
5odv_1_H
5odv_1_I
5odv_1_J
5odv_1_K
5odv_1_L
5odv_1_M
5odv_1_N
5odv_1_O
5odv_1_P
5odv_1_Q
5odv_1_R
5odv_1_S
5odv_1_T
5odv_1_U
5odv_1_V
5odv_1_W
5odv_1_X
5sze_1_C
5t2c_1_AN
5tbw_1_SR
5u4i_1_X
5u4i_1_Y
5u4j_1_X
5u4j_1_Z
5udi_1_B
5udj_1_B
5udk_1_B
5udl_1_B
5uef_1_C
5uef_1_D
5uh5_1_I
5uh6_1_I
5uh8_1_I
5uh9_1_I
5uhc_1_I
5uk4_1_U
5uk4_1_V
5uk4_1_W
5uk4_1_X
5uq7_1_X
5uq7_1_Y
5uq7_1_Z
5uq8_1_X
5uq8_1_Y
5uq8_1_Z
5vi5_1_Q
5vyc_1_I1
5vyc_1_I2
5vyc_1_I3
5vyc_1_I4
5vyc_1_I5
5vyc_1_I6
5w0m_1_H
5w0m_1_I
5w0m_1_J
5w4k_1_1W
5w4k_1_1X
5w4k_1_1Y
5w4k_1_2W
5w4k_1_2X
5w4k_1_2Y
5w5h_1_B
5w5h_1_D
5w5i_1_B
5w5i_1_D
5wdt_1_V
5wdt_1_W
5wdt_1_Y
5we4_1_V
5we4_1_W
5we4_1_Y
5we6_1_V
5we6_1_W
5we6_1_Y
5wf0_1_V
5wf0_1_W
5wf0_1_Y
5wfk_1_V
5wfk_1_W
5wfk_1_Y
5wfs_1_V
5wfs_1_W
5wfs_1_Y
5wis_1_1W
5wis_1_1X
5wis_1_1Y
5wis_1_2W
5wis_1_2X
5wis_1_2Y
5wit_1_1W
5wit_1_1X
5wit_1_1Y
5wit_1_2W
5wit_1_2X
5wit_1_2Y
5wnp_1_B
5wnt_1_B
5wnu_1_B
5wnv_1_B
5x21_1_I
5x22_1_I
5x22_1_S
5x70_1_E
5x70_1_G
5x8r_1_A
5y88_1_X
5yts_1_B
5ytv_1_B
5ytx_1_B
5z4a_1_B
5z4d_1_B
5z4j_1_B
5zeb_1_V
5zep_1_W
5zeu_1_A
5zeu_1_V
5zsa_1_C
5zsa_1_D
5zsb_1_C
5zsb_1_D
5zsc_1_C
5zsc_1_D
5zsd_1_C
5zsd_1_D
5zsl_1_D
5zsl_1_E
5zsn_1_D
5zsn_1_E
5zuu_1_G
5zuu_1_I
6a4e_1_B
6a4e_1_D
6a6l_1_D
6b6h_1_3
6bk8_1_I
6c4i_1_X
6c4i_1_Y
6cae_1_1W
6cae_1_1X
6cae_1_1Y
6cae_1_2W
6cae_1_2X
6cae_1_2Y
6cfj_1_1W
6cfj_1_1X
6cfj_1_1Y
6cfj_1_2W
6cfj_1_2X
6cfj_1_2Y
6d1v_1_C
6d2z_1_C
6d30_1_C
6dmn_1_B
6dmv_1_B
6do8_1_B
6do9_1_B
6doa_1_B
6dob_1_B
6doc_1_B
6dod_1_B
6doe_1_B
6dof_1_B
6dog_1_B
6doh_1_B
6doi_1_B
6doj_1_B
6dok_1_B
6dol_1_B
6dom_1_B
6don_1_B
6doo_1_B
6dop_1_B
6doq_1_B
6dor_1_B
6dos_1_B
6dot_1_B
6dou_1_B
6dov_1_B
6dow_1_B
6dox_1_B
6doz_1_B
6dp0_1_B
6dp1_1_B
6dp2_1_B
6dp3_1_B
6dp4_1_B
6dp5_1_B
6dp6_1_B
6dp7_1_B
6dp8_1_B
6dp9_1_B
6dpa_1_B
6dpb_1_B
6dpc_1_B
6dpd_1_B
6dpe_1_B
6dpf_1_B
6dpg_1_B
6dph_1_B
6dpi_1_B
6dpj_1_B
6dpk_1_B
6dpl_1_B
6dpm_1_B
6dpn_1_B
6dpo_1_B
6dpp_1_B
6dti_1_W
6dzi_1_H
6e0o_1_B
6e0o_1_C
6e4p_1_J
6e4p_1_K
6een_1_G
6een_1_H
6een_1_I
6enf_1_X
6enj_1_X
6enu_1_X
6eri_1_AX
6evj_1_M
6evj_1_N
6fqr_1_C
6ftg_1_U
6ftg_1_V
6ftg_1_W
6fti_1_Q
6fti_1_U
6fti_1_V
6fti_1_W
6ftj_1_U
6ftj_1_V
6ftj_1_W
6gc5_1_F
6gc5_1_G
6gc5_1_H
6gfw_1_R
6gwt_1_X
6gx6_1_B
6gxm_1_X
6gxn_1_X
6gxo_1_X
6gz3_1_BV
6gz3_1_BW
6gz4_1_BV
6gz4_1_BW
6gz5_1_BV
6gz5_1_BW
6h4n_1_W
6h58_1_W
6h58_1_WW
6ha1_1_X
6ha8_1_X
6hcj_1_Q3
6hcq_1_Q3
6hhq_1_SR
6htq_1_U
6htq_1_V
6htq_1_W
6hxx_1_AA
6hxx_1_AB
6hxx_1_AC
6hxx_1_AD
6hxx_1_AE
6hxx_1_AF
6hxx_1_AG
6hxx_1_AH
6hxx_1_AI
6hxx_1_AJ
6hxx_1_AK
6hxx_1_AL
6hxx_1_AM
6hxx_1_AN
6hxx_1_AO
6hxx_1_AP
6hxx_1_AQ
6hxx_1_AR
6hxx_1_AS
6hxx_1_AT
6hxx_1_AU
6hxx_1_AV
6hxx_1_AW
6hxx_1_AX
6hxx_1_AY
6hxx_1_AZ
6hxx_1_BA
6hxx_1_BB
6hxx_1_BC
6hxx_1_BD
6hxx_1_BE
6hxx_1_BF
6hxx_1_BG
6hxx_1_BH
6hxx_1_BI
6hyu_1_D
6i0t_1_B
6i0u_1_B
6i0v_1_B
6i2n_1_U
6i7o_1_2B
6i7o_1_L
6i7o_1_M
6i7o_1_MB
6i7o_1_N
6i7o_1_NB
6ij2_1_E
6ij2_1_F
6ij2_1_G
6ij2_1_H
6ip5_1_2M
6ip5_1_ZU
6ip5_1_ZY
6ip6_1_2M
6ip6_1_ZY
6ip6_1_ZZ
6ip8_1_2M
6ip8_1_ZY
6ip8_1_ZZ
6is0_1_C
6j7z_1_C
6k32_1_P
6k32_1_T
6kqd_1_I
6kqd_1_S
6kqe_1_I
6kql_1_I
6kr6_1_B
6ktc_1_V
6kug_1_B
6l74_1_I
6lkq_1_S
6lkq_1_T
6lkq_1_U
6lkq_1_W
6m6v_1_E
6m6v_1_F
6m6v_1_G
6m7k_1_B
6mkn_1_W
6mpf_1_W
6mpi_1_W
6n6a_1_D
6n6c_1_D
6n6d_1_D
6n6e_1_D
6n6f_1_D
6n6g_1_D
6n6h_1_D
6n6i_1_C
6n6i_1_D
6n6j_1_C
6n6j_1_D
6n6k_1_C
6n6k_1_D
6n9e_1_1X
6n9e_1_2W
6n9e_1_2X
6n9f_1_1X
6n9f_1_2X
6nd5_1_1W
6nd5_1_1X
6nd5_1_1Y
6nd5_1_2W
6nd5_1_2X
6nd5_1_2Y
6nd6_1_1W
6nd6_1_1X
6nd6_1_1Y
6nd6_1_2W
6nd6_1_2X
6nd6_1_2Y
6nu2_1_U
6nu3_1_U
6o6v_1_C
6o6v_1_D
6o6x_1_C
6o6x_1_D
6o75_1_C
6o75_1_D
6o78_1_E
6o79_1_C
6o7b_1_C
6o7b_1_D
6o7h_1_K
6o7i_1_I
6o7k_1_G
6o7k_1_V
6o8w_1_U
6o97_1_1W
6o97_1_1X
6o97_1_1Y
6o97_1_2W
6o97_1_2X
6o97_1_2Y
6o9j_1_V
6o9k_1_Y
6of1_1_1W
6of1_1_1X
6of1_1_1Y
6of1_1_2W
6of1_1_2X
6of1_1_2Y
6ogy_1_M
6ogy_1_N
6okk_1_G
6ole_1_T
6ole_1_U
6ole_1_V
6olf_1_T
6olf_1_U
6olf_1_V
6olg_1_BV
6oli_1_T
6oli_1_U
6oli_1_V
6olz_1_BV
6om0_1_T
6om0_1_U
6om0_1_V
6om7_1_T
6om7_1_U
6om7_1_V
6ov0_1_E
6ov0_1_F
6ov0_1_G
6ov0_1_H
6ovy_1_I
6ow3_1_I
6owl_1_B
6owl_1_C
6oy5_1_I
6oy6_1_I
6p71_1_I
6p7p_1_D
6p7p_1_E
6p7p_1_F
6p7q_1_D
6p7q_1_E
6p7q_1_F
6pb4_1_3
6pmi_1_3
6pmj_1_3
6ppn_1_A
6ppn_1_I
6q1h_1_D
6q1h_1_H
6q8y_1_M
6q8y_1_N
6qcs_1_M
6qdw_1_A
6qdw_1_B
6qdw_1_V
6qik_1_X
6qik_1_Y
6qt0_1_X
6qt0_1_Y
6qtz_1_X
6qtz_1_Y
6qx3_1_G
6r7b_1_D
6r7b_1_E
6r9m_1_B
6r9o_1_B
6r9p_1_B
6r9q_1_B
6r9r_1_D
6r9r_1_E
6raz_1_Y
6rcl_1_C
6ri5_1_X
6ri5_1_Y
6rt4_1_C
6rt4_1_D
6rt5_1_A
6rt5_1_E
6rt6_1_A
6rt6_1_E
6rt7_1_A
6rt7_1_E
6rzz_1_X
6rzz_1_Y
6s05_1_X
6s05_1_Y
6s0m_1_C
6sag_1_R
6sce_1_B
6scf_1_I
6scf_1_K
6scf_1_L
6scf_1_M
6skf_1_AA
6skg_1_AA
6spc_1_A
6spe_1_A
6sty_1_C
6sty_1_F
6sv4_1_2B
6sv4_1_2C
6sv4_1_MB
6sv4_1_MC
6sv4_1_N
6sv4_1_NB
6sv4_1_NC
6swa_1_Q
6swa_1_R
6swa_1_S
6szs_1_X
6t34_1_A
6t34_1_B
6t34_1_C
6t34_1_D
6t34_1_E
6t34_1_F
6t34_1_G
6t34_1_H
6t34_1_I
6t34_1_J
6t34_1_K
6t34_1_L
6t34_1_M
6t34_1_N
6t34_1_O
6t34_1_P
6t34_1_Q
6t34_1_R
6t34_1_S
6t83_1_1B
6t83_1_2B
6t83_1_3B
6t83_1_4B
6t83_1_6B
6t83_1_A
6t83_1_AA
6t83_1_BB
6t83_1_CA
6tb3_1_N
6th6_1_AA
6tnu_1_M
6tnu_1_N
6ty9_1_M
6tz1_1_N
6u6y_1_E
6u6y_1_F
6u6y_1_G
6u6y_1_H
6u9x_1_H
6u9x_1_K
6ucq_1_1X
6ucq_1_1Y
6ucq_1_2X
6ucq_1_2Y
6uej_1_B
6uo1_1_1W
6uo1_1_1X
6uo1_1_1Y
6uo1_1_2W
6uo1_1_2X
6uo1_1_2Y
6utw_1_333
6uu0_1_333
6uu1_1_333
6uu2_1_333
6uu3_1_333
6uu4_1_333
6uu6_1_333
6uuc_1_333
6uz7_1_8_2140-2827
6v39_1_SN1
6v39_1_V
6v3a_1_SN1
6v3a_1_V
6v3b_1_SN1
6v3e_1_SN1
6vm6_1_G
6vm6_1_H
6vm6_1_I
6vm6_1_J
6vm6_1_K
6vyt_1_Y
6vyu_1_Y
6vyw_1_Y
6vyx_1_Y
6vyy_1_Y
6vyz_1_Y
6vz2_1_Y
6vz3_1_Y
6vz5_1_Y
6vz7_1_Y
6w6l_1_T
6w6l_1_U
6w6l_1_V
6wan_1_G
6wan_1_H
6wan_1_I
6wan_1_J
6wan_1_K
6wan_1_L
6wox_1_I
6woy_1_I
6wre_1_D
6x1b_1_D
6x1b_1_F
6xqd_1_1X
6xqd_1_2X
6xqe_1_1X
6xqe_1_2X
6xz7_1_F
6xz7_1_G
6y69_1_W
6ybv_1_K
6ybv_1_W
6ys3_1_A
6ys3_1_B
6ys3_1_V
6ysr_1_W
6yss_1_W
6yst_1_W
6ysu_1_W
6yud_1_K
6yud_1_M
6yud_1_O
6yud_1_P
6yud_1_Q
6ywo_1_E
6ywo_1_F
6ywo_1_I
6ywo_1_K
6z1p_1_AA
6z1p_1_AB
6z1p_1_BA
6z1p_1_BB
6z8k_1_X
6zmw_1_W
6zvh_1_X
6zvi_1_D
6zvi_1_E
6zvi_1_H
7jql_1_1X
7jql_1_2X
7jqm_1_1X
7jqm_1_2X
7jyy_1_E
7jyy_1_F
7jz0_1_E
7jz0_1_F
7k00_1_5
7k00_1_B
1qzb_1_B_1-73
1qza_1_B_1-73
5zzm_1_M_3-118
5zzm_1_N_1-2904
3dg2_1_B_1-2904
3dg0_1_B_1-2904
3dg4_1_B_1-2904
3dg5_1_B_1-2904
3dg2_1_A_1-1542
3dg0_1_A_1-1542
3dg4_1_A_1-1542
3dg5_1_A_1-1542
This diff could not be displayed because it is too large.