
Deep GONet

Original code from the article entitled Deep GONet: Self-explainable deep neural network based on Gene Ontology for phenotype prediction from gene expression data (accepted at APBC 2021 and published in BMC Bioinformatics) by Victoria Bourgeais, Farida Zehraoui, Mohamed Ben Hamdoune, and Blaise Hanczar.


Deep GONet is a self-explainable neural network integrating the Gene Ontology into its hierarchical architecture.
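The core idea can be illustrated with a minimal NumPy sketch. The names (`go_mask`, `lgo_penalty`) and the exact form of the penalty are illustrative assumptions, not the paper's implementation: a binary mask derived from GO parent-child relations marks the allowed connections between consecutive layers, and connections outside the ontology are penalized.

```python
import numpy as np

# Hypothetical binary mask: go_mask[i, j] = 1 if GO term j (next layer)
# is a parent of GO term i (current layer); 0 otherwise.
go_mask = np.array([[1, 0],
                    [1, 1],
                    [0, 1]])  # 3 GO terms -> 2 GO terms

def lgo_penalty(weights, mask, alpha=1e-2):
    """Sketch: penalize weights that fall outside the GO structure."""
    outside = weights * (1 - mask)       # keep only the non-GO connections
    return alpha * np.sum(outside ** 2)  # squared penalty on those entries

rng = np.random.default_rng(0)
w = rng.normal(size=go_mask.shape)
print(lgo_penalty(w, go_mask))  # penalizes only the non-GO entries of w
```

A weight matrix that exactly respects the mask incurs zero penalty, which is what lets the trained layers be read as GO terms.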

Get started

The code is implemented in Python using the TensorFlow framework v1.12 (see requirements.txt for more details).


The full microarray dataset can be downloaded from the ArrayExpress database under the accession E-MTAB-3732. The pre-processed training and test sets are available here:

training set

test set

Additional files for NN architecture: filesforNNarch

The TCGA dataset can be downloaded from the GDC portal.


The following shows how to train and evaluate the neural network. Deep GONet was trained with the $L_{GO}$ regularization and the hyperparameter $\alpha=1e{-2}$ on the microarray dataset. To replicate this setting, set the command-line flag type_training to LGO (default value) and the flag alpha to 1e-2 (default value).

There are three processing modes (flag processing): train trains the model, evaluate evaluates it on the test set, and predict outputs the predicted outcomes for the test-set samples.

1) Train

python DeepGONet.py --type_training="LGO" --alpha=1e-2 --EPOCHS=600 --is_training=True --display_step=10 --save=True --processing="train"

2) Evaluate

python DeepGONet.py --type_training="LGO" --alpha=1e-2 --EPOCHS=600 --is_training=False --restore=True --processing="evaluate"

3) Predict

python DeepGONet.py --type_training="LGO" --alpha=1e-2 --EPOCHS=600 --is_training=False --restore=True --processing="predict"

The predicted outcomes are saved as a NumPy array.
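The saved array can be reloaded with NumPy for downstream analysis. This is an illustrative round trip, not the script's actual code; the file path is an assumption, so check save_dir for the real output location.

```python
import os
import tempfile
import numpy as np

# Stand-in for the predicted labels that DeepGONet.py would save.
dummy = np.array([0, 1, 1, 0])

# Hypothetical path; the real file lives under --save_dir.
path = os.path.join(tempfile.gettempdir(), "predictions.npy")
np.save(path, dummy)              # what the script does, in spirit
predictions = np.load(path)       # reload for downstream analysis
print(predictions)                # -> [0 1 1 0]
```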


Details about all the command-line flags are available via:

python DeepGONet.py --help

For most of the flags, the default values can be used. log_dir and save_dir can be changed to point to your own directories. Only the flags shown in the command lines above need to be adjusted for each objective.

Comparison with classical fully-connected network using L2 or L1 regularization terms

The model can be compared with a classical fully-connected network trained with L2 or L1 regularization instead of $L_{GO}$:

python DeepGONet.py --type_training="L2" --alpha=1e-2 --EPOCHS=600 --is_training=True --display_step=10 --save=True --processing="train"
python DeepGONet.py --type_training="L1" --alpha=1e-2 --EPOCHS=600 --is_training=True --display_step=10 --save=True --processing="train"

Without regularization:

python DeepGONet.py --alpha=0 --EPOCHS=100 --is_training=True --display_step=5 --save=True --processing="train"
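For reference, the two classical penalties differ only in the norm applied to the weights. A minimal NumPy sketch (function names are illustrative, not from the repository):

```python
import numpy as np

def l1_penalty(weights, alpha=1e-2):
    """L1 term: sum of absolute weights; promotes sparsity."""
    return alpha * np.sum(np.abs(weights))

def l2_penalty(weights, alpha=1e-2):
    """L2 term: sum of squared weights; shrinks all weights smoothly."""
    return alpha * np.sum(weights ** 2)

w = np.array([-2.0, 0.0, 1.0])
print(l1_penalty(w))  # 1e-2 * (2 + 0 + 1) = 0.03
print(l2_penalty(w))  # 1e-2 * (4 + 0 + 1) = 0.05
```

Unlike $L_{GO}$, both penalties treat all connections uniformly and carry no biological structure, which is what the comparison in the paper isolates.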

Interpretation tool

Please see the notebook entitled Interpretation_tool.ipynb to perform the biological interpretation of the results.

How to cite this work?

Bourgeais, V., Zehraoui, F., Ben Hamdoune, M., & Hanczar, B. (2021). Deep GONet: Self-explainable deep neural network based on Gene Ontology for phenotype prediction from gene expression data. BMC Bioinformatics, 22(10), 455. https://doi.org/10.1186/s12859-021-04370-7