Saturday 5 October 2013

Human genetics, part I


Human genetics is the study of the human genome and of individual genes. It covers many fields, such as pharmacogenetics and pharmacogenomics (the study of interactions between genes and drugs), gene therapy and the mapping of disease susceptibility genes. In this post I summarize some topics of human genetics.

The human genome

Humans have genetic material, DNA and RNA, in two places: in the nucleus and mitochondria of cells.

The mitochondrial genome is inherited from the mother. It is 16,6 kb long and consists of a single circular, plasmid-like DNA molecule, so it has no chromosomes in the nuclear sense. Its two strands are called the light and the heavy strand, as opposed to the leading and lagging strands of the nuclear genome. Each cell has several, even thousands of, copies of the molecule, and the copy number varies between cell types. The mitochondrial genome has only 37 genes, 13 of which code for proteins; it is mostly free of binding proteins and has very few repetitive sequences. It is thus very dense in genes: ~1 gene per 0,45 kb of DNA. It also has no introns, and over 66 % of the DNA codes for protein. There is no recombination in the mitochondrial genome.

The mitochondrial genome even has a short triple-stranded region, formed by a third strand called the 7S DNA. Like all plasmids, it has an origin of replication (ORI), and it also has promoters for both the light and the heavy strand. Over 98 % of the mitochondrial genome is highly conserved. It also uses a different number of stop codons and amino-acid-coding codons than the nuclear genome.

Human mitochondrial genome and genes. (c) NCBI

The nuclear genome is what is traditionally understood by a "genome". It is located in the nucleus as chromosomes, of which humans have 23 (the haploid number). The nuclear genome is 3,1 Gb in size - almost 190 000 times larger than the mitochondrial genome! The nuclear genome is packaged around histone proteins. It has plenty of repetitive sequences, untranslated regions and introns. Approximately half of the nuclear genome consists of repetitive sequences such as microsatellites. Recombination is common in the nuclear genome, and the genome is inherited chromosomally: Mendelian inheritance for the X chromosome and the autosomes, paternal inheritance for the Y chromosome.

The nuclear genome has over 20 000 protein-coding genes at a density of 1 gene / 120 kb. Only about 1 % of the genome codes for proteins. Single genes vary in size, the smallest being a few hundred base pairs and the longest 2,4 Mb. A nuclear gene is transcribed from one strand only, whereas in the mitochondrial genome both the light and the heavy strand carry genes and are transcribed. Circa 6 % of the genome is highly conserved, ~45 % consists of transposons, and the rest is poorly conserved for other reasons.
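
The gene densities quoted for the two genomes can be checked with a little arithmetic. The sketch below only reuses the sizes and densities given in the text (with decimal commas written as decimal points); the function name is made up for this illustration.

```python
# Figures quoted in the text above.
MITO_GENOME_KB = 16.6        # mitochondrial genome size in kb
MITO_KB_PER_GENE = 0.45      # ~1 gene per 0.45 kb
NUCLEAR_GENOME_KB = 3.1e6    # 3.1 Gb expressed in kb
NUCLEAR_KB_PER_GENE = 120    # ~1 gene per 120 kb


def implied_gene_count(genome_kb: float, kb_per_gene: float) -> float:
    """Number of genes implied by a genome size and an average gene spacing."""
    return genome_kb / kb_per_gene


mito_genes = implied_gene_count(MITO_GENOME_KB, MITO_KB_PER_GENE)
nuclear_genes = implied_gene_count(NUCLEAR_GENOME_KB, NUCLEAR_KB_PER_GENE)
density_ratio = NUCLEAR_KB_PER_GENE / MITO_KB_PER_GENE

print(f"mitochondrial genes implied: {mito_genes:.1f}")
print(f"nuclear genes implied: {nuclear_genes:.0f}")
print(f"mitochondrial genome is ~{density_ratio:.0f}x denser in genes")
```

The mitochondrial figure comes out at about 37, which matches the usual annotation of 13 protein-coding genes plus 22 tRNA and 2 rRNA genes. The nuclear figure comes out at about 26 000, in line with "over 20 000" genes.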

Both genomes participate in some important pathways in the human body. Both code for subunits of the oxidative phosphorylation system, with the nuclear genome providing 80 and the mitochondrial genome 13 subunits. Both also take part in building the mitochondrial protein synthesis machinery: the mitochondrial genome produces all of its rRNA and tRNA, while the nuclear genome produces the ribosomal proteins.

# of genes and sizes of human chromosomes.
(c) Wikipedia

Repetitive sequences and variation in the human genome

Repetitive sequences are common in the human genome: nearly 50 % of the nuclear genome is repetitive. Microsatellites, minisatellites and the telomeres at the ends of chromosomes are examples of repetitive sequences that are often used in genetic studies. There are also repetitive sequences within genes. The number of repeats can cause disease, either through a malformed protein or through the accumulation of long mRNA chains in the cell. Diseases caused by the expansion of a repetitive sequence include Huntington disease, fragile X syndrome and myotonic dystrophy.
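
Counting the repeats in a sequence is simple to sketch in code. The example below is a minimal illustration, not a real genotyping method: the function name and the toy sequences are invented, and CAG is used as the repeat unit because it is the triplet expanded in Huntington disease.

```python
import re


def longest_repeat_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Hypothetical toy sequences, not real patient data.
short_allele = "ATG" + "CAG" * 5 + "CCGCCA"
expanded_allele = "ATG" + "CAG" * 12 + "CCGCCA"

print(longest_repeat_run(short_allele, "CAG"))
print(longest_repeat_run(expanded_allele, "CAG"))
```

The text gives no repeat-number thresholds for disease, so the sketch only counts repeats and does not try to decide whether an allele is pathogenic.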

There are several factors causing variation in the human genome: mutations, genetic drift, selection and migration. Mutations can be classified into substitutions, deletions and insertions.


  • Substitutions are either synonymous (or silent), where the final amino acid chain of the protein does not change, or nonsynonymous, where the amino acid chain is changed. Nonsynonymous mutations can alter the amino acid at the point of the mutation or create a stop codon. Substitutions can also lead to aberrant splicing or to altered gene expression (mutations in promoter sequences).
  • Deletions are either multiples of three or not. Multiples of three lead to deletions of whole amino acids. Other deletions disrupt the reading frame, causing premature termination or loss of function of the gene product.
  • Insertions are like deletions: multiples of three add amino acids, while non-multiples disrupt the reading frame. Insertions can also be insertions of entire genes or amplifications, where the stability, function or expression of the gene is altered.
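
The classification above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard nuclear genetic code (the mitochondrial code differs slightly), it looks at a single codon, and the function names are invented for this example. Effects on splicing or promoters are not modelled.

```python
from itertools import product

# Standard (nuclear) genetic code, codons enumerated in TCAG order; "*" marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base substitution at `position` (0-2) of one codon."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == before:
        return "synonymous"
    if after == "*":
        return "nonsense"   # nonsynonymous: creates a stop codon
    return "missense"       # nonsynonymous: changes the amino acid


def classify_indel(length: int) -> str:
    """Classify a deletion or insertion by its length in nucleotides."""
    return "in-frame" if length % 3 == 0 else "frameshift"


print(classify_substitution("GAA", 2, "G"))  # GAA -> GAG, both code for glutamate
print(classify_substitution("GAA", 1, "T"))  # GAA -> GTA, glutamate to valine
print(classify_substitution("TAC", 2, "A"))  # TAC -> TAA, tyrosine to stop
print(classify_indel(3), classify_indel(4))
```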

Cancer: inheritance and proto-oncogenes

Cancer is a genetic disease of somatic cells. All forms of cancer are therefore genetic, some more than others. Environmental factors, such as lifestyle, are also important in the development of cancer. An estimated 37 % of cancers in the USA are caused by a diet high in fat, while ~30 % are caused by tobacco.

Cancer itself is not inherited. Only genes that may increase susceptibility to certain types of cancer are hereditary. Cancer development depends on a few types of genes:
  • proto-oncogenes: normal genes, which may develop into oncogenes if activated
  • oncogenes: mutated forms of proto-oncogenes. Important in cell growth and differentiation. Can be classified into viral oncogenes and cellular oncogenes. Both types can code for
    • growth factors
    • growth factor receptors
    • intracellular signal transduction factors
    • DNA-binding nuclear proteins
    • cell cycle factors
  • tumor suppressor genes: inhibit and control cell proliferation
  • DNA repair genes: repair faulty and broken DNA, and can fix mutations. In cancer, DNA repair often does not work, for example because the repair gene itself is mutated. Mutations in repair genes can be inherited, which may cause leukemia, solid tumors and other types of cancer.
Proto-oncogenes can be activated in many ways. Duplication is common, and many cancer cells carry multiple copies of an otherwise normal proto-oncogene. Point mutations can activate oncogenes of the RAS family, which mediate cell signaling. Translocations can create novel chimeric genes. Chronic myeloid leukemia is most often caused by such a chimera, created by the translocation t(9;22) known as the Philadelphia chromosome (Ph1). The chimeric gene does not respond to normal cell function controls. Chimeras are also common in hematologic (blood-related) tumors and sarcomas. An oncogene can also be translocated to a transcriptionally active region, where it becomes overexpressed.


For example, colorectal cancer can be caused by hereditary conditions such as familial adenomatous polyposis (FAP) and hereditary non-polyposis colorectal cancer (HNPCC). FAP is an autosomal dominant disorder in which patients develop numerous polyps in their large intestine. The patients have germline mutations in a tumor suppressor gene on chromosome 5, which cause the adenomas. HNPCC, also known as Lynch syndrome, is also an autosomal dominant disorder. It is caused by mutations in DNA repair genes.

Breast cancer can also be due to hereditary conditions. 15-20 % of breast cancers in Western Europe occur in women who have a family history of breast cancer. The susceptibility genes are BRCA1 and BRCA2, mutations of which are found in 30-50 % of all breast cancer families. Both are DNA repair genes. A heterozygous carrier can have up to an 85 % chance of developing breast cancer.

Methylation of DNA is the body's normal way of increasing or decreasing the activity of a single gene or of the whole genome. Methylation silences a gene, because a methylated gene cannot be transcribed. In hypomethylation the gene is inadequately methylated, and normally silent genes are expressed. Imprinting is also lost if an imprinted gene is hypomethylated. Conversely, hypermethylation silences a normally active gene. Cancer cell genomes are often hypomethylated. Genes can be silenced by methylation, mutation or both.



Chemical carcinogenesis

Chemical carcinogenesis is the process by which chemicals cause cancer. It was discovered in the 1770s, when chimney sweeps were often diagnosed with scrotal cancer. Some key terms in chemical carcinogenesis are
  • Genotoxicity: toxicity to the genome (a substance can be genotoxic)
  • Mutagen: chemical or physical agent inducing alterations in DNA
  • Carcinogen: cancer-causing agent
  • Co-carcinogen: chemical that causes cancer only in specific combinations
  • Genotoxic carcinogen: DNA-reactive agent (mutagen) or agent acting on the chromosome level (causing aneuploidy)
  • Non-genotoxic carcinogen: does not affect DNA but promotes growth in other ways (hormones, some organic compounds)
Chemical carcinogenesis is a multistep process involving initiation, promotion, malignant conversion and progression. In initiation, a carcinogen causes an irreversible mutation. In promotion, non-genotoxic agents cause the expansion of the initiated, mutated cells. For this step to be effective, several control mechanisms must fail: cell-to-cell communication must be hindered, apoptosis must be blocked and other forms of cell proliferation inhibition must fail too. Promotion leads to conversion, where the cells become a malignant cell population. This step requires tumor suppressor genes to be inactivated and oncogenes to be amplified. The last step is progression, where the cells continue to divide, eventually leading to metastasis.
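
The four steps can be read as an ordered checklist in which each stage is reached only if all earlier stages have been reached. The sketch below is a bookkeeping illustration of that idea, not a biological model: the stage names and requirements come from the paragraph above, while the event labels and the function name are hypothetical.

```python
from typing import Optional, Set

# Stage names and requirements follow the paragraph above;
# the event labels themselves are hypothetical identifiers.
STAGES = [
    ("initiation", {"irreversible_mutation"}),
    ("promotion", {"cell_communication_hindered", "apoptosis_blocked",
                   "proliferation_inhibition_failed"}),
    ("malignant conversion", {"tumor_suppressors_inactivated",
                              "oncogenes_amplified"}),
    ("progression", {"continued_division"}),
]


def furthest_stage(events: Set[str]) -> Optional[str]:
    """Return the last stage whose requirements, and all earlier ones, are met."""
    reached = None
    for name, required in STAGES:
        if not required <= events:
            break  # a later stage cannot be reached if this one is not
        reached = name
    return reached


# Apoptosis is blocked, but the other promotion requirements are not met.
initiated_only = {"irreversible_mutation", "apoptosis_blocked"}
print(furthest_stage(initiated_only))
```

With these events the function reports "initiation", because promotion needs all three of its requirements at once.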

Well-known and unfortunately widely used chemical carcinogens include tobacco, alcohol, pesticides and certain types of fibers and dusts (e.g. asbestos and coal). Many substances become carcinogenic only after being metabolized in human cells. For example, some smokers have overactive activating enzymes in their lungs, so that most if not all precarcinogens in tobacco are metabolized to actual carcinogens. Smokers whose activating enzymes are less active accumulate far fewer carcinogens from the same amount of precarcinogens. A similar mechanism exists in the kidneys, where a carcinogen-eliminating enzyme can be either very effective or very passive. Examples of such enzymes are N-acetyltransferase (NAT) and glutathione S-transferases (GSTs).
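
The difference between fast and slow activators can be illustrated with a toy calculation. Everything in this sketch is an assumption made for illustration: the function name, the simple multiplicative model and the fractions are invented and are not measured enzyme activities.

```python
def active_carcinogen_load(precarcinogen_dose: float,
                           activation_fraction: float,
                           detox_fraction: float) -> float:
    """Toy estimate of the carcinogen left after activation and elimination.

    activation_fraction: share of precarcinogen turned into carcinogen (0-1).
    detox_fraction: share of that carcinogen removed by eliminating enzymes (0-1).
    """
    for name, value in (("activation_fraction", activation_fraction),
                        ("detox_fraction", detox_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    activated = precarcinogen_dose * activation_fraction
    return activated * (1.0 - detox_fraction)


# Same exposure, different (hypothetical) activating enzyme activity.
same_dose = 100.0
fast_activator = active_carcinogen_load(same_dose, activation_fraction=0.9, detox_fraction=0.2)
slow_activator = active_carcinogen_load(same_dose, activation_fraction=0.3, detox_fraction=0.2)

print(fast_activator, slow_activator)
```

With the same dose, the person with the more active activating enzyme ends up with three times the carcinogen load in this toy model. A more effective eliminating enzyme (a larger `detox_fraction`) lowers the load for both.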

Cytochrome P450 enzymes, encoded by the CYP genes, catalyze reactions in which polycyclic aromatic hydrocarbons (PAH compounds) are metabolized to carcinogens. Humans have 57 CYP genes. Some are expressed in the lungs, some in the kidneys. Several CYP genes have been linked to lung cancer.

Example: asbestos
- Asbestos is inhaled when working with asbestos without adequate protective gear
- Asbestos fibers penetrate lung epithelium, causing inflammation and tissue damage
- Free radicals are produced due to prolonged contact between asbestos and mesothelial cells, and due to prolonged exposure to inflammatory cells.
- Reactive oxygen causes DNA damage. Antioxidants can hinder this reaction, but cannot entirely prevent it.
- Asbestos fibers in the lung become covered in iron and calcium. Macrophages ingest the fibers, whereupon growth factors are released, and fibroblasts coat the macrophages with collagen.


Molecular genetics of Hutchinson-Gilford progeria syndrome

(c) Georgia Gwinnett College
Hutchinson-Gilford progeria syndrome, or HGPS for short, is a disease causing premature ageing in children. The children are born healthy, but begin to age rapidly when they reach 18-24 months of age. The disease causes growth failure, heart problems, loss of hair, loss of body fat, aged-looking skin and stiffness of joints. The children die of atherosclerosis at around 13 years of age. Progeria is very rare: about 100 patients have been diagnosed in as many years.

The disease is caused by a mutation in LMNA, the gene that encodes lamin A. Most cases are new mutations, meaning that the parents are not carriers and the risk cannot be estimated in advance. 90 % of the patients carry the same point mutation, which leads to aberrant splicing and the loss of 50 amino acids from lamin A.

Due to the aberrant splicing, the protein lamin A is not post-processed correctly. In the cell, a farnesyl group is added to the end of the lamin A precursor. In a normal cell this farnesyl group is soon removed, and the final protein, called lamin A, is formed. In progeria patients the farnesyl group remains, leading to an unstable protein (which is called progerin). The instability is what causes the premature ageing, since lamin A is a very important factor in maintaining cell integrity and structure.

Progeria is related to normal ageing. Each chromosome has long telomeres at its ends. Each division of a normal cell shortens the telomeres, until they begin to "wear out". Shortened telomeres induce the production of progerin - the very same protein that induces premature ageing in progeria patients, and through the same mechanism (aberrant splicing). In other words, progeria patients experience the symptoms of old age and die of them when ~13 years old.

Progeria cannot be cured, but the progress of the disease has been slowed with farnesyltransferase inhibitors (FTIs). The current status of the research is available at http://www.progeriaresearch.org/.


