GraphGONet

This repository accompanies the article "GraphGONet: a self-explaining neural network encapsulating the Gene Ontology graph for phenotype prediction on gene expression" (submitted to Bioinformatics) by Victoria Bourgeais, Farida Zehraoui, and Blaise Hanczar.


Description

GraphGONet is a self-explaining neural network integrating the Gene Ontology into its hidden layers.


Get started

The code is implemented in Python using PyTorch v1.7.1 (see requirements.txt for the full list of dependencies).


Dataset

The full microarray dataset can be downloaded from the ArrayExpress database under accession number E-MTAB-3732. The pre-processed training and test sets are available here:

training set

test set 


The TCGA dataset can be downloaded from the GDC portal.

 
Usage


1) Train

On the microarray dataset:

python GraphGONet.py --n_inputs=36834 --n_nodes=10663 --n_nodes_annotated=8249 --n_classes=1 --mask="top" --selection_ratio=0.01 --n_epochs=50 --es --patience=5 --class_weight 


On the TCGA dataset:

python GraphGONet.py --n_inputs=18427 --n_nodes=10636 --n_nodes_annotated=8288 --n_classes=12 --mask="top" --selection_ratio=0.01 --n_epochs=50 --es --patience=5 --class_weight 
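The two training commands above differ only in the dataset-dependent sizes. As a minimal sketch, the flags can be collected in dictionaries and turned into an argument list from Python. The names DATASET_FLAGS, COMMON_FLAGS, SWITCHES and build_command are illustrative and not part of the repository; the flag values are copied from the commands above.

```python
import sys

# Dataset-dependent sizes, copied from the two training commands.
DATASET_FLAGS = {
    "microarray": {"n_inputs": 36834, "n_nodes": 10663,
                   "n_nodes_annotated": 8249, "n_classes": 1},
    "tcga": {"n_inputs": 18427, "n_nodes": 10636,
             "n_nodes_annotated": 8288, "n_classes": 12},
}

# Flags shared by both commands.
COMMON_FLAGS = {"mask": "top", "selection_ratio": 0.01,
                "n_epochs": 50, "patience": 5}

# Flags passed without a value.
SWITCHES = ["es", "class_weight"]


def build_command(dataset, script="GraphGONet.py"):
    """Return the argument list for a training run on the given dataset."""
    options = {**DATASET_FLAGS[dataset], **COMMON_FLAGS}
    command = [sys.executable, script]
    command += [f"--{name}={value}" for name, value in options.items()]
    command += [f"--{name}" for name in SWITCHES]
    return command
```

Passing the result to subprocess.run(build_command("tcga"), check=True) from the repository root should launch the same run as the TCGA command above.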


Help

Details on all command-line flags are available with the following command:

python GraphGONet.py --help


For most flags, the default values can be used. log_dir and save_dir can be set to your own directories. Only the flags shown in the command lines of this README need to be adjusted for a given experiment.
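As a small sketch of redirecting the outputs, the helper below creates the two directories and returns the matching flags. It assumes the flags are spelled --log_dir and --save_dir, in the same style as the other flags (python GraphGONet.py --help is the authoritative source), and the function name output_dir_flags is hypothetical.

```python
from pathlib import Path


def output_dir_flags(log_dir, save_dir):
    """Create both directories if needed and return the matching CLI flags.

    Assumption: the script accepts --log_dir=... and --save_dir=...
    """
    flags = []
    for name, path in (("log_dir", log_dir), ("save_dir", save_dir)):
        directory = Path(path)
        # Make sure the directory exists before the training script writes to it.
        directory.mkdir(parents=True, exist_ok=True)
        flags.append(f"--{name}={directory}")
    return flags
```

The returned list can be appended to any of the command lines in this README.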


Comparison with random selection

On the microarray dataset:

python GraphGONet.py --n_inputs=36834 --n_nodes=10663 --n_nodes_annotated=8249 --n_classes=1 --mask="random" --selection_ratio=0.01 --n_epochs=50 --es --patience=5 --class_weight 


Comparison with no selection

On the microarray dataset:

python GraphGONet.py --n_inputs=36834 --n_nodes=10663 --n_nodes_annotated=8249 --n_classes=1 --n_epochs=50 --es --patience=5 --class_weight
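The three microarray runs differ only in the --mask and --selection_ratio flags. As a rough illustration of what "top", "random" and no selection could mean, the sketch below masks a toy list of values. It is not the code in GraphGONet.py: the function select_nodes, the ranking by absolute value, and the rounding rule are all assumptions made for the example.

```python
import random


def select_nodes(values, mask=None, selection_ratio=0.01, seed=0):
    """Zero out all but a fraction of the entries (toy illustration only).

    mask="top"    keeps the entries with the largest absolute value,
    mask="random" keeps the same number of entries chosen at random,
    mask=None     keeps every entry.
    """
    if mask is None:
        return list(values)
    # Assumed rounding rule: floor, but always keep at least one entry.
    n_keep = max(1, int(len(values) * selection_ratio))
    indices = range(len(values))
    if mask == "top":
        kept = sorted(indices, key=lambda i: abs(values[i]), reverse=True)[:n_keep]
    elif mask == "random":
        kept = random.Random(seed).sample(indices, n_keep)
    else:
        raise ValueError(f"unknown mask: {mask!r}")
    kept = set(kept)
    return [v if i in kept else 0.0 for i, v in enumerate(values)]
```

With the random mask, the same number of entries survives as with the top mask, which is what makes the two runs comparable.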