Whole genome sequencing

The genome is the complete genetic information of an organism. Most organisms have DNA genomes, although some viruses have RNA genomes. The genome consists of coding and noncoding DNA, as well as the DNA of organelles (mitochondria and chloroplasts). Whole genome sequencing determines the sequence of the entire genome (the order of all the bases, A, T, G and C) in a single process. In this blog you will learn about the history, techniques and applications of this advanced method, as well as a little about the genome of your favourite feline friend.

History

The very first method for DNA sequencing was the Maxam-Gilbert method, developed in 1976 by Allan Maxam and Walter Gilbert. In 1977, Frederick Sanger and colleagues developed the Sanger sequencing method, which remained the most widely used sequencing method for about 40 years. These early methods were manual and slow, and they could only sequence short DNA segments. Today, automated and faster techniques allow the sequencing of longer DNA strands.

The first complete genome ever sequenced was that of Haemophilus influenzae (a Gram-negative, pathogenic bacterium) in 1995. Many other microorganisms followed shortly afterward. The first eukaryotic genome to be sequenced was that of the yeast Saccharomyces cerevisiae in 1996. The first multicellular eukaryote to be sequenced was the nematode worm Caenorhabditis elegans in 1998.

The human genome was sequenced as part of the Human Genome Project. Started in 1990, the project was declared complete in 2003. That is, as complete as it could be given the available technology. "The human genome has not been completely sequenced and neither has any other mammalian genome as far as I’m aware," said Harvard Medical School bioengineer George Church.

The first canine genome was sequenced and published in 2004, followed by the feline genome in 2007.

Techniques used for whole genome sequencing

Because mammalian genomes are so large, long DNA strands are first broken down into smaller portions, sequenced, and then put back together using bioinformatics methods. One of the most commonly used approaches is the whole-genome shotgun method, which was also partially used for the Human Genome Project. Other techniques, such as capillary sequencing, Illumina dye sequencing, pyrosequencing, SMRT sequencing and nanopore technology, continue to emerge and develop as well.

The shotgun method is used specifically for long strands of DNA. The DNA is randomly cut into numerous shorter fragments for sequencing (usually 2-150 kb long). These fragments are then cloned into vectors (a vector is a DNA molecule used as a "vehicle" to artificially carry foreign genetic material into another cell, where it can be replicated). The cloned fragments are then sequenced from both ends using the chain termination method. Each resulting sequence is called a "read". The original sequence is afterward reconstructed from the reads using sequence assembly software.
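The fragmenting and end-reading steps can be illustrated with a toy simulation. This is only a minimal sketch: the function names (shotgun_fragments, paired_end_reads) and all parameters are made up for illustration, the "genome" is a plain string, and real fragments and reads are far longer than the ones a short example can show.

```python
import random

# Lookup table for complementary bases (A pairs with T, C pairs with G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, read in its own 5'->3' direction."""
    return seq.translate(COMPLEMENT)[::-1]

def shotgun_fragments(genome, n_fragments, min_len, max_len, seed=0):
    """Cut n_fragments random pieces out of the genome string.

    Assumes min_len <= len(genome). Returns (start, sequence) pairs.
    The seed only makes the toy example reproducible.
    """
    rng = random.Random(seed)
    fragments = []
    for _ in range(n_fragments):
        length = rng.randint(min_len, min(max_len, len(genome)))
        start = rng.randint(0, len(genome) - length)
        fragments.append((start, genome[start:start + length]))
    return fragments

def paired_end_reads(fragment, read_len):
    """Read a fragment from both ends.

    The forward read comes from the left end; the reverse read comes
    from the right end on the opposite strand.
    """
    forward = fragment[:read_len]
    reverse = reverse_complement(fragment[-read_len:])
    return forward, reverse

if __name__ == "__main__":
    toy_genome = "ATGCGTACGTTAGCCGATAGGCTTACGATCG"
    for start, frag in shotgun_fragments(toy_genome, 4, 10, 15):
        print(start, frag, paired_end_reads(frag, 5))
```

Reading from both ends is useful because the two reads of one fragment are known to lie a roughly known distance apart, which helps the assembly step later on.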

In hierarchical shotgun sequencing, a physical map of the genome is made before the actual sequencing begins. This makes it possible to plan a minimal set of fragments that covers the entire chromosome. Reducing the number of fragments reduces the amount of sequencing and assembly the process requires.
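Choosing a minimal set of mapped fragments is essentially an interval-covering problem. The sketch below is one simple way to think about it, not the procedure any particular sequencing project used: it assumes each clone is described by a hypothetical (name, start, end) tuple taken from the physical map, and it ignores the fact that real neighbouring clones need some overlap with each other.

```python
def minimal_tiling_path(clones, region_end):
    """Pick as few clones as possible to cover positions 0..region_end.

    clones: list of (name, start, end) with half-open intervals [start, end).
    Greedy rule: among all clones that start inside the part already
    covered, always take the one that reaches furthest to the right.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path = []
    covered = 0   # everything left of this position is already covered
    i = 0
    while covered < region_end:
        best = None
        # Look at every clone that starts at or before the covered edge.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in clone coverage at position {covered}")
        path.append(best)
        covered = best[2]
    return path

if __name__ == "__main__":
    clones = [("A", 0, 100), ("B", 50, 180), ("C", 90, 150),
              ("D", 170, 300), ("E", 120, 260), ("F", 250, 300)]
    print([name for name, _, _ in minimal_tiling_path(clones, 300)])
```

In this toy map, three of the six clones are enough to cover the whole region, so only those three would need to be sequenced.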

To assemble the fragments, it is first determined how the clones overlap. Once the overlaps have been identified, the sequences can be combined to construct the genome consensus (Figure 1).
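The idea of finding overlaps and merging can be sketched with a tiny greedy assembler. This is a simplified illustration under strong assumptions (error-free reads, all from the same strand, no long repeats); the function names are invented for this example, and real assembly software uses far more sophisticated methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b.

    Overlaps shorter than min_len are ignored and reported as 0.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; several contigs remain
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads

if __name__ == "__main__":
    reads = ["ATGCGTA", "CGTACGT", "ACGTTAGC"]
    print(greedy_assemble(reads))
```

Here the three short reads overlap by four bases each, so they merge into a single sequence. If some region is not covered by any read, the function returns several separate pieces instead of one.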

Figure 1: In whole genome shotgun sequencing (a), the entire genome is randomly cut into small fragments and then reassembled. In hierarchical shotgun sequencing (b), the genome is first broken into larger segments. After the order of these segments is deduced, they are further cut into small fragments.

Sequencing the genome makes it possible to identify mutations and to estimate mutation frequencies. Thousands of pathogenic mutations have been identified through WGS in recent years. Identifying mutations is very important because it reveals the genetics behind many inherited conditions and disorders. Estimating mutation frequencies can also be very informative. For example, the mutation rate in cancer is known to be significantly higher than in healthy tissues due to genome instability. We have also learned that the mutation rate isn’t even across the genome. Gene-rich regions undergo fewer mutations than noncoding regions, possibly because DNA repair activity is higher and more precise in these regions. Knowing which mutations occur and how often is powerful knowledge.
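At its simplest, spotting mutations means comparing a sequenced sample against a reference, position by position. The sketch below assumes the two sequences are already aligned and of equal length and only looks at single-base substitutions; the function names are illustrative, and real variant calling also has to deal with insertions, deletions and sequencing errors.

```python
def find_substitutions(reference, sample):
    """List every position where the sample differs from the reference.

    Returns (position, reference_base, sample_base) tuples, 0-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [(pos, ref, alt)
            for pos, (ref, alt) in enumerate(zip(reference, sample))
            if ref != alt]

def substitution_rate(reference, sample):
    """Substitutions per base across the compared region."""
    return len(find_substitutions(reference, sample)) / len(reference)

if __name__ == "__main__":
    reference = "ACGTACGTAC"
    sample = "ACGTTCGTAA"
    print(find_substitutions(reference, sample))
    print(substitution_rate(reference, sample))
```

Running the same comparison separately on gene-rich and noncoding regions is one way to see that the rate is not the same everywhere.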

WGS is a cutting-edge technology that is still developing and improving, and so are its possible applications. The potential is enormous. Improvements in this technology will open doors in many scientific fields, such as personalized medicine, nutrition, pathology, microbiology, evolution, pharmacy and agrigenomics. We are only beginning to realize the full potential of WGS!

Feline genome

The feline genome was first sequenced in 2007. The DNA came from Cinnamon, a 4-year-old Abyssinian.

The feline genome analysis aims to contribute to health benefits for domestic cats, 90 million of which are owned by Americans alone, according to The Humane Society. Cats have 38 chromosomes and roughly 20,000 genes. About 250 heritable genetic disorders have been identified in cats.

When the cat genome was finally mapped, scientists were surprised to learn that humans share about 90% of their DNA with felines! This makes cats genetically closer to humans than dogs are. The genome sequence analysis also showed that the human and cat lineages diverged about 100 million years ago.

Our felines can also serve as excellent models for human disease, which is partly why the National Human Genome Research Institute (NHGRI) authorized sequencing of the feline genome three years ago. There are over 200 hereditary human diseases that closely correlate with conditions in cats. Among the most serious diseases common to cats and humans are leukemia, Alzheimer’s, HCM (hypertrophic cardiomyopathy) and HIV (cats have a closely related virus, FIV).

We hope we have helped you get a better understanding of what whole genome sequencing is and how it’s conducted. There is so much more we still have to learn about our own genome, as well as the feline genome, and every piece of information we collect today will be highly valuable tomorrow. WGS helps us understand life at the molecular level. We can’t even begin to imagine the power of this knowledge.