Bugbank

First pass analysis of human genetic susceptibility to severe COVID-19

We have performed a preliminary analysis of UK Biobank COVID-19 data to test for genes or genetic variants that increase the risk of severe COVID-19. We have run several analyses, but the conclusion at this stage is the same for all of them: the sample size is currently too small to distinguish true signals from noise. This may change over the next few weeks as (i) the number of cases, unfortunately, rises and (ii) the results from the UK Biobank cohort are combined with those from other cohorts by meta-analysis through the COVID-19 Host Genetics Initiative.
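As a rough illustration of how combining cohorts can help, here is a minimal sketch of a fixed-effect, inverse-variance-weighted meta-analysis of one variant. This is an assumption for illustration: the post does not say which meta-analysis method will be used, and the function name and input values are hypothetical.

```python
import math

def inverse_variance_meta(betas, standard_errors):
    """Combine per-cohort log odds ratios with inverse-variance weights.

    Returns the pooled effect, its standard error, the z score and a
    two-sided p-value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value

# Two hypothetical cohorts reporting the same effect with the same precision.
beta, se, z, p = inverse_variance_meta([0.2, 0.2], [0.1, 0.1])
print(beta, se, z, p)
```

With two equally precise cohorts the pooled standard error shrinks by a factor of the square root of two, so a signal that is unconvincing in each cohort alone can become clearer once the cohorts are combined.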

I will summarize just one of the analyses for brevity. The first analysis I tried was restricted to white European participants. Restricting the analysis this way is one approach to account for possible correlations between susceptibility to severe COVID-19 and genetic ancestry. I started with white Europeans because this is the largest group in the UK Biobank. I compared 330 cases of severe COVID-19 (identified as hospital inpatients with positive tests) to a control population of 283,722 participants with no known COVID-19, as described here. I excluded individuals who did not live in England at the time of recruitment and individuals no longer followed up by UK Biobank. I also excluded individuals with genetic data that did not pass standard quality controls.
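To see why 330 cases is a small sample even with several hundred thousand controls, one can compute the effective sample size of the case-control comparison. The sketch below uses the commonly quoted formula 4 / (1/cases + 1/controls); applying it here is an illustration, not part of the original analysis.

```python
def effective_sample_size(n_cases, n_controls):
    """Effective sample size of a case-control study: 4 / (1/cases + 1/controls)."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)

# Case and control counts taken from the analysis described above.
n_eff = effective_sample_size(330, 283_722)
print(round(n_eff))  # prints 1318
```

However many controls are added, this quantity cannot exceed four times the number of cases (1,320 here), which is why the analysis gains power mainly as the number of cases rises.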

In the analyses so far, I have not yet accounted for most of the important epidemiological variables. Instead, I have followed the COVID-19 Host Genetics Initiative standard analysis plan, which adjusts only for age and sex. We will address this limitation in future iterations.
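Adjusting for age and sex can be pictured as a logistic regression of case status on genotype with age and sex as covariates. The sketch below fits such a model by Newton-Raphson on simulated data. Everything in it is an assumption made for illustration: the cohort is simulated, the effect sizes are invented, and the function names are hypothetical. Real analyses use dedicated GWAS software rather than code like this.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_logistic(rows, y, iterations=15):
    """Newton-Raphson logistic regression; each row starts with 1.0 (intercept)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    info[i][j] += p * (1.0 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    return beta

def simulate_cohort(n, seed, beta_genotype):
    """Simulated cohort; columns are intercept, genotype (0/1/2), age, sex."""
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        genotype = (rng.random() < 0.3) + (rng.random() < 0.3)
        age = rng.uniform(-1.5, 1.5)  # decades from the cohort mean
        sex = 1.0 if rng.random() < 0.5 else 0.0
        logit = -1.0 + beta_genotype * genotype + 0.4 * age + 0.3 * sex
        rows.append([1.0, float(genotype), age, sex])
        y.append(1.0 if rng.random() < 1.0 / (1.0 + math.exp(-logit)) else 0.0)
    return rows, y

rows, y = simulate_cohort(2000, seed=1, beta_genotype=0.5)
beta = fit_logistic(rows, y)
print("age- and sex-adjusted odds ratio per allele:", math.exp(beta[1]))
```

The coefficient on genotype is the log odds ratio per copy of the allele after adjustment for age and sex. Further covariates, such as obesity, would enter the model as additional columns.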

The figure shows a Manhattan plot summarizing the location of signals of susceptibility to severe COVID-19 in the human genome. The taller the peak, the stronger the evidence that genes in that region of the genome are associated with severe disease. No signal yet meets the stringent threshold for being deemed statistically interesting. The strongest signal so far is on chromosome 2. The closest gene is KLHL29, which has been associated with a wide variety of traits, including obesity, according to various studies in the GWAS catalog. The next strongest signal is on chromosome 14, near a gene called NRXN3, which has also been associated with diverse traits, again including obesity, according to the GWAS catalog. If these turn out to be statistically interesting signals, this would support predictions that the analysis is likely to pick up genetic susceptibility to predisposing factors, of which obesity is one. This underlines the importance of controlling for such mediating risk factors in future analyses.
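For readers unfamiliar with Manhattan plots, the height of each point is the negative base-10 logarithm of the p-value for that variant. The sketch below assumes the conventional genome-wide significance threshold of 5e-8; the post does not state which threshold it uses, and the variant names and p-values are made up for illustration.

```python
import math

# Conventional genome-wide threshold (an assumption; not stated in the post).
GENOME_WIDE_P = 5e-8

def manhattan_height(p_value):
    """Height of a variant on a Manhattan plot: -log10 of its p-value."""
    return -math.log10(p_value)

def passes_threshold(p_value, threshold=GENOME_WIDE_P):
    """True if the p-value is below the significance threshold."""
    return p_value < threshold

# Made-up p-values, not results from the analysis.
made_up = {"variant_a": 3e-7, "variant_b": 1e-9, "variant_c": 0.02}
for name, p in sorted(made_up.items(), key=lambda item: item[1]):
    print(name, round(manhattan_height(p), 2), passes_threshold(p))
```

On this scale the conventional threshold sits at a height of about 7.3, so a peak can look prominent on the plot and still fall short of it.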


  1. NRXN3 - associated with learning, adult and social behaviour
    KLHL29 - associated with protein degradation through the ubiquitination pathway and heat stress intolerance in catfish https://www.ncbi.nlm.nih.gov/pubmed/27476875

  2. It would be nice to compare asymptomatic positive cases (rather than healthy controls) with severe cases for the ACE2 gene on chromosome X, and in different ancestries.