Abstract

Most multi-omics research initiatives identify correlations between features but leave causation unclear. In our proposal, the depth and breadth of the ORCADES data will enable us to infer causation and identify the underlying molecular pathways. For our exemplar - cardiovascular disease - we aim to define the pathways leading from genes to phenotypes and the molecular hallmarks of its development. This could lead to the development of novel biomarkers and patient stratification methods, and to the identification of new drug targets. The methodology could readily be applied to other diseases.

Background

Cardiovascular disease (CVD) is the leading cause of mortality in Europe. There is clear evidence that cardiovascular mortality has decreased substantially over the last 5-10 years, but at differing rates. There are thus e[...]daemia, obesity, diet, alcohol, smoking and lack of physical activity. Some limited research into metabolomic and proteomic associations with CVD has already taken place (e.g. Fan et al 2016, J Am Coll Cardiol 68:1281-93; Ganz et al 2016, JAMA 315:2532-41), but to date no study has combined data across multiple omics platforms.

Such an integrative approach holds the promise of better understanding the molecular mechanisms and interplay between environmental and genetic risk factors leading to CVD and will facilitate development of methods for patient stratification. Only recently have large cohorts been characterised with multiple omic technologies: these cohorts provide a unique and powerful resource for dissection of molecular pathways involved in the development of complex diseases.

Unfortunately, no large multi-omics cohort resources exist in Russia yet. In this project, we will use the Orkney Complex Disease Study (ORCADES) platform resource, with a uniquely rich dataset of omics and organism-level phenotypes. Many of the CVD risk factors (alcohol, smoking, diet and lifestyle) overlap (albeit with different characteristics) between Orkney and Russia, as will, we expect, the underlying biological pathways. Thus ORCADES provides a valuable resource to study the molecular basis of cardiovascular disease, well beyond Scotland.

Approach

  1. We will use summary-level data available from previous large GWAS of vascular diseases and their risk factors (collected by UoE), and omics traits (expression and protein QTLs, metabolomics and glycomics; collected by NSU) to derive genetically-anchored omics signatures of CVD (NSU). The components of omics signatures will be selected using novel methods, e.g. SMR/HEIDI (Zhu et al 2016, Nature Genet 48:481-7).
  2. These signatures will be verified in the ORCADES population (UoE).
  3. The omics signatures will be clustered using methods from multivariate statistics in order to define a limited number of molecular pathways on which the effects of CVD loci converge (NSU). Consistency of omics signatures between multiple CVD loci may be interpreted as a strong indication of potential causality.
  4. We will use ORCADES data to define the omics signatures of external CVD risk factors, such as smoking and alcohol consumption (NSU+UoE). This will be done using machine learning methods (e.g. shrinkage models).
  5. We will look for the molecular phenotypes at which the action of genetic and environmental risk factors converges. We anticipate that these may be critical molecular species whose changes indicate the hallmarks of CVD development.
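
As an illustration of step 1, the SMR test of Zhu et al 2016 combines the summary statistics of a single top variant for an omics trait (e.g. an eQTL) and for the disease outcome. The sketch below is a minimal, single-variant version under stated assumptions: the function and field names (smr_estimate, SmrResult) are hypothetical, the standard error uses the first-order delta-method approximation, and the HEIDI heterogeneity test, which requires multiple variants and LD information, is not covered. It is not the implementation that would be used in the project.

```python
import math
from dataclasses import dataclass


@dataclass
class SmrResult:
    b_xy: float     # estimated effect of the omics trait (x) on the outcome (y)
    se_xy: float    # approximate standard error (first-order delta method)
    chi2: float     # approximate 1-df chi-square statistic
    p_value: float  # upper-tail p-value for chi2 with 1 degree of freedom


def smr_estimate(b_zx: float, se_zx: float, b_zy: float, se_zy: float) -> SmrResult:
    """Single-variant SMR-style estimate from summary statistics.

    b_zx, se_zx: effect of the variant (z) on the omics trait and its SE.
    b_zy, se_zy: effect of the same variant on the outcome and its SE.
    Both effects are assumed to refer to the same effect allele.
    """
    if b_zx == 0 or se_zx <= 0 or se_zy <= 0:
        raise ValueError("b_zx must be non-zero and standard errors positive")

    # Ratio estimate of the trait-to-outcome effect.
    b_xy = b_zy / b_zx

    # Delta-method variance of the ratio, ignoring covariance between
    # b_zx and b_zy (reasonable when the two studies are independent).
    var_xy = (b_zy ** 2) * (se_zx ** 2) / (b_zx ** 4) + (se_zy ** 2) / (b_zx ** 2)

    # Equivalent to z_zx^2 * z_zy^2 / (z_zx^2 + z_zy^2).
    chi2 = b_xy ** 2 / var_xy

    # For 1 df, P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))

    return SmrResult(b_xy=b_xy, se_xy=math.sqrt(var_xy), chi2=chi2, p_value=p_value)


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    print(smr_estimate(b_zx=0.5, se_zx=0.05, b_zy=0.1, se_zy=0.02))
```

In practice such a test would be run for every omics trait with a sufficiently strong instrument, and the significant traits that also pass a heterogeneity filter would form the candidate signature of a locus.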

Future work (outside the scope of this project) may extend to testing the predictive and stratification potential of these biomarkers, studying possible lifestyle modifications that lead to favourable changes in molecular profiles, and validating these molecules as potential drug targets. Moreover, the prototype methodology developed in this project may be applied to other diseases of interest. With this proposal, we attach letters from associated partners interested in following up the results of this project in the future.