Commentary: Realising the Potential of Genomics for M. tuberculosis: A Silver Lining to the Pandemic?

  • 1. Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
  • 2. Nuffield Department of Medicine University of Oxford, Oxford, UK
  • Online Date: March 24 2022
    doi: 10.46234/ccdcw2022.063
    The impact of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic on tuberculosis has been profound. Globally, there were 1.4 million fewer patients receiving tuberculosis treatment in 2020, representing not a fall in the global case load but a failure to serve millions of patients with the disease (1-2). Morbidity, mortality, and onward transmission are inevitable outcomes of such failure, setting the world back a decade of progress.

    Even before the pandemic, there was a severe diagnostic deficit, with only 61% of patients with bacteriologically confirmed tuberculosis tested for rifampicin resistance, let alone for resistance to other drugs relevant to their disease (3). Culture-based susceptibility testing is too slow, expensive, and demanding of infrastructure and skills to provide a global solution, not least because the global burden of disease is predominantly found in low- and middle-income countries. Interim molecular solutions have come in cartridge form, but even the widely available Xpert® MTB/RIF and its descendants cannot provide the full drug susceptibility testing results that clinicians would optimally want when designing personalised regimens for their patients.

    Attention has therefore rightly turned to the potential of genomic solutions. Whole genome sequencing (WGS) has already been adopted as a routine test in a number of high-income settings such as the UK (4). It offers not only genome-based drug susceptibility testing for many drugs where we understand the molecular mechanisms of resistance, but also vital information on transmission to inform public health interventions. A remaining disadvantage of this technology is that good quality sequence data generally still requires DNA extraction from culture. Targeted Next Generation Sequencing (tNGS) offers a solution here by amplifying a fraction of the genome directly from the clinical sample and sequencing that to great depth. Clinically useful coverage of the “resistome” is thus generated, but at the cost of losing the potential for identifying genomic relatedness, inferring transmission, and informing public health action focussed on outbreaks.
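
The idea of genomic relatedness mentioned above can be made concrete with a minimal sketch. It assumes each isolate has already been reduced to a consensus sequence aligned to the same reference, and counts the positions at which two isolates differ (a pairwise SNP distance). The function names, the toy sequences, and the `max_snps` cut-off are illustrative assumptions, not values taken from the text or from any specific pipeline.

```python
from itertools import combinations
from typing import Dict, List, Tuple

UNAMBIGUOUS = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two reference-aligned sequences differ.

    Positions where either sequence holds an ambiguous call (for example 'N')
    are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in UNAMBIGUOUS and b in UNAMBIGUOUS and a != b
    )


def related_pairs(samples: Dict[str, str], max_snps: int) -> List[Tuple[str, str, int]]:
    """Return all sample pairs whose SNP distance is at most max_snps.

    max_snps is a caller-supplied, hypothetical threshold; appropriate values
    depend on the organism and the epidemiological question.
    """
    pairs = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(sorted(samples.items()), 2):
        distance = snp_distance(seq_a, seq_b)
        if distance <= max_snps:
            pairs.append((name_a, name_b, distance))
    return pairs


# Toy sequences, invented purely for illustration.
toy_samples = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TTGTACCA"}
close_pairs = related_pairs(toy_samples, max_snps=1)
```

A targeted assay that sequences only resistance-associated regions would not supply enough of the genome for a comparison like this to be informative, which is the trade-off the paragraph above describes.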

    All molecular assays, whether targeted or not, require interpretation. Whilst there are some canonical mutations conferring resistance to key drugs, there are many more mutations with less supporting data (5). The challenge for targeted molecular assays based purely around polymerase chain reaction (PCR) technology has been to identify a small number of high-value mutations restricted to the short target DNA sequence available to the assay. As many mutations outside the target sequence are therefore missed, results need to be interpreted with caution and with regard to the circumstances (6). The challenge for sequencing solutions continues to be selecting the infrequent, often more poorly supported, but collectively important mutations to include in an interpretative catalogue of resistance mutations. Catalogue-based approaches to interpretation gained a boost in 2021, when the World Health Organization (WHO) formally endorsed a catalogue of mutations associated with resistance in M. tuberculosis (7). Although this catalogue may not provide more accurate results than other existing lists of mutations, it sets a standard in a field where until now none has been universally agreed.

    The challenge with catalogue-based interpretation is that for most anti-tuberculosis drugs a small number of common mutations account for most global resistance, after which a potentially much larger number of rare mutations might collectively explain the remaining 5%–10% of drug resistance. Even in the dataset approaching 40,000 Mycobacterium tuberculosis (M. tuberculosis) isolates from which the WHO catalogue was generated, many of these mutations were seen too infrequently to confidently grade. Moreover, as some of these mutations may have only an incremental impact on the minimum inhibitory concentration (MIC), the resulting binary phenotype against which they are assessed is so inconsistent that just accumulating ever more data is unlikely to solve the problem. Recognizing that mutations do not all act in isolation but that some combine to have additive, synergistic, or epistatic effects is part of the solution (8).
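
As a rough illustration of catalogue-based interpretation, the sketch below looks up each observed mutation in a small table and reports, per drug, "R" (a catalogued resistance mutation is present), "U" (a mutation is present in a candidate gene but is not in the catalogue, so it cannot be graded), or "S" (no mutation in any candidate gene). The three catalogue entries are well-known canonical mutations, but the table, the candidate-gene lists, and the three-way output are simplifying assumptions for illustration and do not reproduce the WHO catalogue or its grading scheme.

```python
from typing import Dict, List, Set, Tuple

Mutation = Tuple[str, str]  # (gene, amino-acid change)

# Illustrative mini-catalogue; NOT the WHO catalogue.
CATALOGUE: Dict[Mutation, str] = {
    ("rpoB", "S450L"): "rifampicin",
    ("katG", "S315T"): "isoniazid",
    ("embB", "M306V"): "ethambutol",
}

# Hypothetical, heavily abbreviated list of genes inspected for each drug.
CANDIDATE_GENES: Dict[str, Set[str]] = {
    "rifampicin": {"rpoB"},
    "isoniazid": {"katG", "inhA"},
    "ethambutol": {"embB"},
}


def predict(mutations: List[Mutation]) -> Dict[str, str]:
    """Return a per-drug call of 'R', 'U', or 'S' from observed mutations."""
    calls: Dict[str, str] = {}
    for drug, genes in CANDIDATE_GENES.items():
        in_candidate_genes = [m for m in mutations if m[0] in genes]
        if any(CATALOGUE.get(m) == drug for m in in_candidate_genes):
            calls[drug] = "R"  # a catalogued resistance mutation was found
        elif in_candidate_genes:
            calls[drug] = "U"  # uncatalogued mutation: cannot be graded
        else:
            calls[drug] = "S"  # nothing found in the candidate genes
    return calls


# ("katG", "A110V") is a made-up, uncatalogued mutation used only as an example.
example = predict([("rpoB", "S450L"), ("katG", "A110V")])
```

The "U" outcome is where the rare, infrequently observed mutations described above end up. A lookup of this kind also treats every mutation independently, so it cannot by itself capture incremental MIC shifts or interactions between mutations.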

    A number of machine learning techniques have been applied to the problem of drug susceptibility testing in M. tuberculosis. These avoid the need for a catalogue but do rely on training datasets. They have the potential to exploit features that have no mechanistic role in drug resistance, such as common drug resistance patterns (for example isoniazid resistance being common in the context of rifampicin resistance). The performance of some machine learning models has been excellent, and open-source tools exist for others to implement (9-10).

    Although some machine learning approaches avoid the need for a bioinformatics pipeline, instead operating more or less directly from the raw sequence data (11), a pipeline that maps sequencing reads to a reference genome and filters for variants remains a prerequisite for most of the above approaches. Again, although many different tools are used across academia, health service providers, and industry (12), no single global standard exists even though attempts have been made to outline what it may look like (13). Moreover, were opinions to coalesce around any one of the existing solutions, none would likely be capable of scaling to process tens of thousands of genomes a day to serve potential global demand. In the absence of a standardised solution, data comparability and data sharing are more burdensome, and progress is stalled.
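
The variant-filtering step of such a pipeline might look something like the sketch below, which assumes that reads have already been mapped and that candidate variants have been summarised as simple records. The `VariantCall` fields, the two thresholds, and the toy positions are hypothetical choices for illustration; real pipelines apply more criteria, and differences in exactly these choices are one reason results from different pipelines are hard to compare.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VariantCall:
    position: int   # 1-based coordinate on the reference genome
    ref: str        # reference base
    alt: str        # alternative base supported by the reads
    depth: int      # total reads covering the position
    alt_reads: int  # reads supporting the alternative base


def filter_variants(
    calls: List[VariantCall], min_depth: int, min_alt_fraction: float
) -> List[VariantCall]:
    """Keep calls with enough coverage and enough support for the alt base.

    Both thresholds are supplied by the caller; no default is implied here.
    """
    kept = []
    for call in calls:
        if call.depth <= 0 or call.depth < min_depth:
            continue  # too little coverage to trust the call
        if call.alt_reads / call.depth < min_alt_fraction:
            continue  # alt base supported by too small a share of reads
        kept.append(call)
    return kept


# Toy calls with made-up positions and read counts.
toy_calls = [
    VariantCall(position=1001, ref="C", alt="T", depth=50, alt_reads=49),
    VariantCall(position=2002, ref="C", alt="G", depth=4, alt_reads=4),
    VariantCall(position=3003, ref="A", alt="G", depth=40, alt_reads=10),
]
passing = filter_variants(toy_calls, min_depth=10, min_alt_fraction=0.9)
```

With these example thresholds, only the first toy call passes: the second has too little coverage and the third too little support for the alternative base.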

    The SARS-CoV-2 pandemic has led to global interest in genome sequencing as it has been the tool to identify the emergence of new variants and track their transmission locally, regionally, and globally (14). Huge efforts have gone into this work in countries around the world, with scientists facing many of the same problems described here: a need for bioinformatics pipelines; a need for global comparability and data sharing; and a need for scalability as samples are sequenced in their many thousands (15). The challenge has triggered a large philanthropic donation of cloud computing services and software engineering support from the tech giant ORACLE to the University of Oxford (16). The resulting Global Pathogen Analysis System (GPAS), delivered through the University of Oxford, has been built in the first instance as an automated “turn-key” analysis service for SARS-CoV-2 genome sequence data that is agnostic to nucleic acid preparative methods and sequencing technology. Work is already in progress for an automated M. tuberculosis bioinformatics solution.

    With the support of up to 100 ORACLE and University bioinformaticians and software engineers, GPAS is now a cloud-based, industry-standard, secure solution that can process around 1 million SARS-CoV-2 genomes per day. The solution provides the user with sovereign control of their own data, but also with the opportunity to share it with the global community (this is not a vehicle for either ORACLE or the University of Oxford to access data for themselves). GPAS will be accessible from anywhere in the world, by anyone with an internet connection, and will be sequencing platform agnostic and able to process both WGS and tNGS data, with the ability to accommodate bespoke primer sets. For users who wish to share data, options to upload data to existing global repositories will be offered. Drug susceptibility predictions will be catalogue based, but machine learning approaches are a future possibility. For WGS data, a user-friendly graphical interface will summarize the genomic relatedness to all other samples that have been globally shared to date.

    A 10-year agreement with ORACLE provides GPAS free of charge for low- and middle-income countries, with services provided on a not-for-profit basis for high-income countries. GPAS is available for SARS-CoV-2, with a plan to release a parallel solution for M. tuberculosis in the third quarter of 2022, and thereafter for other microbial pathogens of importance to public health, guided by a user group advising the development group led by the University of Oxford.

    Amongst the many losses the fight against tuberculosis has suffered due to the SARS-CoV-2 pandemic, the promise of a secure, industry standard, scalable pipeline solution for M. tuberculosis diagnosis and control, accessible by anyone around the world at low or no cost for the global good is a definite silver lining. It opens up the potential for WGS or tNGS technology to be adopted more widely where it has hitherto been held back by the absence of sequencing analytic capability. Many challenges remain, but this unprecedented opportunity for data sharing in the interest of public health may play a role in the solution for some.

Associate Professor Timothy M Walker
Wellcome Trust Clinical Career Development Fellow
Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam

 

Professor Derrick W Crook
Professor of Microbiology, Nuffield Department of Medicine
University of Oxford, Oxford, UK
