SurvOmics: a multi-omics database for cancer survival analysis


About SurvOmics


What is SurvOmics?

SurvOmics is a multi-omics database that incorporates somatic mutation, RNA expression, DNA methylation, copy number variation, and clinical variables into multiple survival analyses across 33 cancer types from the TCGA dataset. The database provides two main functions, ‘Cancer’ and ‘Gene’, to help researchers visualize the relationships between cancer survival and genes across multi-omics layers.


The 'Cancer' function summarizes the prognostic genes at each omics level for a specific cancer type or dataset, as calculated with published bioinformatics algorithms or tools. Important molecular pathways related to survival are also presented.


The 'Gene' function provides multi-omics and multiple survival analyses of a user-selected gene within cancer types or datasets. Selected integrative multi-omics analyses are also presented.


The 'Customized analysis' function allows users to construct their own gene-based signature or multivariate regression model by supplying a candidate gene list or the clinical factors they are interested in.

The TCGA RNA sequencing data and mutation data were collected from the GDC data portal [1] and processed with an in-house script [2]. Level 3 copy number variation data were downloaded with the TCGA2BED tool [3]. Level 3 methylation data were collected from the Broad GDAC Firehose [4]. TCGA clinical data were downloaded with the R package ‘TCGAbiolinks’ [5]. Survival endpoints were retrieved from the TCGA Clinical Data Resource (CDR) Outcome [6].

References

[1] “GDC data portal.” https://portal.gdc.cancer.gov/

[2] Liu, S.H., Shen, P.C., Chen, C.Y., Hsu, A.N., Cho, Y.C., Lai, Y.L., Chen, F.H., Li, C.Y., Wang, S.C., Chen, M. et al. (2020) DriverDBv3: a multi-omics database for cancer driver gene research. Nucleic Acids Res, 48, D863-D870.

[3] Cumbo F, Fiscon G, Ceri S, Masseroli M, Weitschek E. TCGA2BED: extracting, extending, integrating, and querying The Cancer Genome Atlas. BMC bioinformatics. 2017;18(1):6

[4] “Broad GDAC firehose.” https://gdac.broadinstitute.org/

[5] Colaprico A, Silva TC, Olsen C, Garofano L, Cava C, Garolini D, et al. TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic acids research. 2015;44(8):e71-e

[6] Liu J, Lichtenberg T, Hoadley KA, Poisson LM, Lazar AJ, Cherniack AD, Kovatich AJ, Benz CC, Levine DA, Lee AV, Omberg L, Wolf DM, Shriver CD, Thorsson V; Cancer Genome Atlas Research Network, Hu H. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell. 2018 Apr 5;173(2):400-416.e11

Cancer

The 'Cancer' function summarizes the prognostic genes at each omics level for a specific cancer type or dataset, as calculated with published bioinformatics algorithms or tools. Important molecular pathways related to survival are also presented.

The following sections explain how to use each part of the Cancer page. Each part, represented by a number in the screenshot below, has a corresponding page in the sidebar. You can open the tutorial you need by selecting it in the sidebar.


1. Data source and dataset

The first step of the analysis is to select your data. The process is divided into three steps, marked in the screenshot below.



  1. Select the tissue type of the dataset.
  2. Select a dataset for the previously selected tissue type.
  3. Press the ‘Submit’ button to view prognostic gene information for multiple features in the selected cancer type.

Gene

The 'Gene' function provides multi-omics and multiple survival analyses of a user-selected gene within cancer types or datasets. Selected integrative multi-omics analyses are also presented.

The following sections explain how to use each part of the Gene page. Each part, represented by a number in the screenshot below, has a corresponding page in the sidebar. You can open the tutorial you need by selecting it in the sidebar.


1. Data source and dataset

The first step of the analysis is to enter your query. The process is divided into three steps, marked in the screenshot below.



  1. Select whether to search by gene name or Ensembl ID.
  2. Enter the gene name or Ensembl ID.
  3. Press the ‘Submit’ button to view the analysis of the selected gene across multiple cancer types.
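
The choice between a gene name and an Ensembl ID in the steps above can be checked before submitting. The sketch below is only an illustration: the helper name `classify_gene_query` is hypothetical, and it assumes human Ensembl gene IDs of the form `ENSG` followed by 11 digits, optionally with a version suffix. Whether SurvOmics accepts versioned IDs is not stated in this tutorial.

```python
import re

# Assumption: human Ensembl gene IDs look like "ENSG" + 11 digits,
# optionally followed by a version suffix such as ".5".
ENSEMBL_GENE_ID = re.compile(r"^ENSG\d{11}(\.\d+)?$")


def classify_gene_query(query: str) -> str:
    """Return 'ensembl_id' or 'gene_name' for a search term (hypothetical helper)."""
    term = query.strip()
    if not term:
        raise ValueError("empty query")
    if ENSEMBL_GENE_ID.match(term.upper()):
        return "ensembl_id"
    # Anything that is not shaped like an Ensembl gene ID is treated as a gene name.
    return "gene_name"


print(classify_gene_query("TP53"))
print(classify_gene_query("ENSG00000141510"))
```

Anything that does not match the Ensembl pattern is treated as a gene name, so a mistyped ID falls through to the gene-name search rather than raising an error.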

Customized analysis

Step 1. Begin your analysis.



Follow the numbered order marked in the screenshot.

  1. Go to the Customized analysis page.

  2. Choose the analysis type.
    1. Signature identification analysis identifies survival-related genes among the user-selected genes using LASSO and random forest methods. Select this analysis type if you want to inspect specific candidate gene(s) and construct a signature.

    2. Multivariate survival analysis constructs and analyzes a multivariate model with customized clinical factors; over one hundred clinical factors are provided for selection. Select this analysis type to compare specific candidate gene(s) with well-known clinical prognostic biomarker(s).
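
For orientation, LASSO-based selection of survival-related genes is commonly formulated as an L1-penalized Cox proportional hazards model. This tutorial does not state the exact formulation SurvOmics uses, so the expression below is a textbook sketch under that assumption. All symbols are introduced here for illustration: $x_i$ is the feature vector of sample $i$, $\delta_i$ its event indicator, $t_i$ its observed time, $R(t_i)$ the set of samples still at risk at $t_i$, and $\lambda$ the penalty strength.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
\left\{
  -\frac{1}{n} \sum_{i:\,\delta_i = 1}
  \left[
    x_i^{\top}\beta \;-\; \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right)
  \right]
  \;+\; \lambda \,\lVert \beta \rVert_1
\right\}
```

Under this formulation, genes whose coefficient in $\hat{\beta}$ is nonzero would form the signature, and a per-sample risk score could be computed as $x_i^{\top}\hat{\beta}$.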

Step 2. Upload a gene list.



Follow the numbered order marked in the screenshot.

  1. Provide the gene list by typing the gene symbols or uploading a txt file.
    If you upload a txt file, please ensure that it follows the specified format:
    1. Each line of the text file contains only one gene symbol.
    2. Gene symbols are listed one after another, each on a separate line.

    Here is an example of a correctly formatted input file.



  2. When the gene list is ready, click the “Next” button to move on to the next step. (If you want to reset your settings, click the “Reset” button.)
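
The file format described in this step can be produced and checked locally before upload. The following is a minimal sketch, not part of SurvOmics: the helper names `write_gene_list` and `read_gene_list` and the file name `gene_list.txt` are hypothetical. Skipping blank lines and rejecting comma or semicolon separators are assumptions about what counts as well formed.

```python
from pathlib import Path
import tempfile


def write_gene_list(symbols, path):
    """Write one gene symbol per line (hypothetical helper)."""
    Path(path).write_text("\n".join(symbols) + "\n", encoding="utf-8")


def read_gene_list(path):
    """Read a one-symbol-per-line file; raise ValueError on malformed lines."""
    symbols = []
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    for line_no, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        # A valid line holds exactly one token and no list separators.
        if len(line.split()) != 1 or "," in line or ";" in line:
            raise ValueError(
                f"line {line_no}: expected exactly one gene symbol, got {raw!r}"
            )
        symbols.append(line)
    return symbols


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "gene_list.txt"
    write_gene_list(["TP53", "BRCA1", "EGFR"], demo)
    print(demo.read_text(encoding="utf-8"), end="")
    print(read_gene_list(demo))
```

The check only looks at the shape of each line. It does not verify that a symbol is a recognized gene name.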

Step 3. Select cancer type



Follow the numbered order marked in the screenshot.

  1. Select the cancer type through the “Search by tissue type” and “Related dataset” drop-down menus.

  2. After making your selection, click the “Next” button to move on to the next step. (If you want to reset your settings, click the “Reset” button.)

Step 4. Select data type



Follow the numbered order marked in the screenshot.

  1. Choose the data type (multiple choices are allowed).

  2. After making your selection, click the “Next” button to move on to the next step. (If you want to reset your settings, click the “Reset” button.)

Step 5. Select Endpoint



Follow the numbered order marked in the screenshot.

  1. Select a survival endpoint.

  2. Click the “Next” button to move on to the next step. (If you want to reset your setting, click the “Reset” button.)

Step 6. Select samples by clinical criteria



Follow the numbered order marked in the screenshot.

  1. Click the drop-down menu “Select samples by clinical criteria” to choose the clinical criterion of interest.

  2. Set further conditions with the drop-down menu next to it. (Note: If you want to set more criteria, click the “More Criteria” button to add a new one.)

  3. After all settings are complete, click the “Next” button to move on to the next step. (If you want to reset your settings, click the “Reset” button.)
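
Conceptually, each criterion added in this step narrows the set of samples that enter the analysis. The sketch below illustrates that idea only. The sample records, field names, and the helper `select_samples` are hypothetical, and combining criteria with a logical AND is an assumption, since the tutorial does not state how multiple criteria are combined.

```python
# Hypothetical sample records; field names and values are illustrative only.
samples = [
    {"sample_id": "S1", "gender": "female", "stage": "II"},
    {"sample_id": "S2", "gender": "male", "stage": "III"},
    {"sample_id": "S3", "gender": "female", "stage": "III"},
]


def select_samples(records, criteria):
    """Keep records that satisfy every criterion (AND combination is an assumption)."""
    return [
        record
        for record in records
        if all(record.get(field) in allowed for field, allowed in criteria.items())
    ]


# Two criteria: one clinical field each, with the set of accepted values.
criteria = {"gender": {"female"}, "stage": {"II", "III"}}
selected = select_samples(samples, criteria)
print([record["sample_id"] for record in selected])  # ['S1', 'S3']
```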

Step 7. Check the selection information and submit



Follow the numbered order marked in the screenshot.

  1. The selection information lists all the conditions you have set. Please check the analysis conditions and, if they are correct, enter your email address and click the "Submit" button. If the information is incorrect, click the "Reset analysis condition" button to reset your settings.

  2. After submission, the analysis report will be sent directly to the provided email address within 5 to 60 minutes.