DiseaseConnect (http://disease-connect.org) is a web server for the analysis and visualization of comprehensive knowledge on mechanism-based disease connectivity. The traditional disease classification system groups diseases by similar clinical symptoms and phenotypic traits. As a result, diseases with entirely different pathologies can be grouped together and end up with similar treatment designs. Such problems could be avoided if diseases were classified by their molecular mechanisms. Connecting diseases with similar pathological mechanisms could also inspire novel strategies for the effective repositioning of existing drugs and therapies. Although several studies have attempted to generate disease connectivity networks, they have not yet drawn on the enormous and rapidly growing public repositories of disease-related omics data and literature data, two primary resources capable of providing insights into disease connections at an unprecedented level of detail. DiseaseConnect is the first public web server to integrate comprehensive omics and literature data, including a large amount of Genome-Wide Association Studies (GWAS) catalog data, gene expression data, and text-mined knowledge, to discover disease-disease connectivity via common molecular mechanisms. Moreover, clinical co-morbidity data and a comprehensive compilation of known drug-disease relationships are included to advance the understanding of the disease landscape and to facilitate the mechanism-based development of new drug treatments.
The portal provides three types of queries: gene query, disease query, and disease connection query. In the gene query, users enter a gene of interest, and the web server returns a set of diseases that are associated with the queried gene and share similar molecular mechanisms. The diseases are represented as a network in which an edge connects two diseases with a shared molecular basis. The strength of each connection is quantified as the P-value of a hypergeometric enrichment test on the number of shared genes; the server sets the P-value threshold so that the network contains at most 150 nodes. Users can also enter a disease (in the disease query) or a disease pair (in the disease connection query). In both cases, the web server returns all diseases connected to the queried disease(s) in a network representation. As in the gene query, the edges in these networks indicate a strong connection in terms of shared genes.
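To make the edge-scoring step concrete, the following is a minimal sketch of a hypergeometric enrichment test on the genes shared by two diseases, followed by one possible way to cap a network at a fixed number of nodes. It is an illustration under stated assumptions, not the server's actual implementation: the function names `shared_gene_pvalue` and `limit_network`, the use of an upper-tail (at least k shared genes) P-value, the `universe_size` background parameter, and the greedy "add the strongest edges first" rule are all hypothetical choices made here.

```python
from math import comb


def shared_gene_pvalue(genes_a, genes_b, universe_size):
    """Upper-tail hypergeometric P-value for the overlap of two gene sets.

    Assumption: the background is a gene universe of `universe_size` genes,
    and the P-value is P(X >= k), where k is the observed overlap.
    """
    genes_a, genes_b = set(genes_a), set(genes_b)
    k = len(genes_a & genes_b)          # observed number of shared genes
    K, n = len(genes_a), len(genes_b)   # sizes of the two disease gene sets
    if max(K, n) > universe_size:
        raise ValueError("a gene set is larger than the gene universe")

    # Sum the hypergeometric probability mass from k up to the largest
    # possible overlap, using exact integer arithmetic until the final division.
    total = comb(universe_size, n)
    tail = sum(
        comb(K, i) * comb(universe_size - K, n - i)
        for i in range(k, min(K, n) + 1)
    )
    return tail / total


def limit_network(edges, max_nodes=150):
    """Keep the most significant edges while the node count stays <= max_nodes.

    Assumption: `edges` is an iterable of (disease_a, disease_b, p_value)
    tuples; this greedy rule is one hypothetical way to derive a threshold.
    """
    kept, nodes = [], set()
    for a, b, p in sorted(edges, key=lambda e: e[2]):
        if len(nodes | {a, b}) > max_nodes:
            break  # the next edge would push the network past the cap
        nodes.update((a, b))
        kept.append((a, b, p))
    return kept
```

In this sketch, the P-value of the last edge kept by `limit_network` plays the role of the effective threshold: every edge at or below it is drawn, and every weaker connection is omitted.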