CHANGELOG
3.89 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
############################################################################################
v 1.5 beta, April 2021
FEATURE CHANGES
- New option --stats-opts="..." allows to pass options to the automatic run of statistics.py (when -s is used)
- Removed support for 3'->5' Rfam hits, they are now completely ignored. They concern the opposite strand, which is unresolved in 3D.
- Removed PyDCA, which is outdated and introduces dependencies conflicts. Code may be adapted later.
- A new column in align_column table, 'index_small_ali', gives the index of the nucléotide in the "3d only" alignment.
- 3D distance matrices are now computed only for match positions (match to the covariance model)
BUG CORRECTIONS
- Corrected a bug which skipped angle conversions from degrees (DSSR) to radians if nucleotides where renumbered.
############################################################################################
v 1.4 beta, March 2021
Khodor Hannoush joins the development of RNANet.
FEATURE CHANGES
- SINA is now used only if you pass the option --sina, Infernal is used by default even for rRNAs.
- A new option --cmalign-opts="..." allows to customize your cmalign runs, e.g. with --cyk. The default is no option.
- RNANet makes use of PyDCA to compute DCA-related features on the alignments (descriptions to come in the Database.md)
- statistics.py now fully supports the computation of 3D distance matrices, with average and standard deviation by RNA family
- Now RNANet considers only the equivalence class representative structure by default. To consider all members of an equivalence
class (like before), use the --redundant option.
TECHNICAL CHANGES
- cmalign is not run with --cyk anymore by default, and now requires huge amounts of RAM if launched with the default options.
- Moving to a 60-core/128GB server for our internal runs.
############################################################################################
v 1.3 beta, January 2021
The first uses of RNAnet by people from outside the development team happened between this December.
A few feedback allowed to identify issues and useful information to add.
FEATURE CHANGES
- Sequence alignments of the 3D structures mapped to a family are now provided.
- Full alignements with Rfam sequences are not provided, but you can ask us for the files.
- Two new fields in table 'family': ali_length and ali_filtered_length.
They are the MSA lengths of the alignment with and without the Rfam sequences.
- Gap replacement by consensus (--fill-gaps) has been removed. Now, the gap percentage and consensus are saved
in the align_column table and the datapoints in CSV format, in separate columns.
Consensus is one of ACGUN-, the gap being chosen if >75% of the sequences are gaps at this position.
Otherwise, A/C/G/U is chosen if >50% of the non-gap positions are A/C/G/U. Otherwise, N is the consensus.
TECHNICAL CHANGES
- SQLite connexions are now all in WAL mode by default (previously, only the writers used WAL mode, but this is useless)
- Moved to Python3.9 for internal testing.
- Latest version of BioPython is now supported (1.78)
BUG CORRECTIONS
- When an alignment file is updated in a newer run of RNANet, all the re_mappings are now re-computed
for this family. Previously, the remappings were computed only for the newly added sequences,
while the alignment actually changed even for chains added in past runs.
- Changed the ownership and permissions of files produced by the Docker container.
They were previously owned by root and the user could not get access to them.
- Modified nucleotides were not always correctly transformed to N in the alignments (and nucleotide.nt_align_code fields).
Now, the alignments and nt_align_code (and consensus) only contain "ACGUN-" chars.
Now, 'N' means 'other', while '-' means 'nothing' or 'unknown'.