Molecular Pathology

Molecular Genetic Pathology

MR Seashore, MD

REFERENCES:

Strachan and Read: Chapter 15

Collins, Gelehrter, Ginsburg: Chapters 6, 7

Scriver CR, Beaudet AL, Sly WS, Valle D, eds. Metabolic Basis of Inherited Disease, Sixth Edition. McGraw-Hill, New York: Chapters 17, 18

Have a look at the DOE Primer on Molecular Genetics

SUMMARY:

As human genes are being mapped, cloned, and sequenced, it has become possible to understand the relationship between mutations and changes in structure and function of important proteins. Often, these are mutations which cause disease. The correlations which can be made therefore allow the construction of an anatomy of the human genome and the understanding of genetic pathophysiology. From this kind of study, general principles evolve of how genes work and contribute to human difference. In addition, the study of mutations elucidates the normal.

A number of classes of mutations have been described, and their effects on phenotype are being correlated with both the class of mutation and the specific mutation. Specific molecular pathology is increasingly being identified.

Classes of mutations are being delineated

Single base changes: missense, nonsense, splice sites; Insertions, deletions, duplications; Triplet repeat expansions; Fusion genes

Consequences of mutations are being defined

No protein

Abnormal protein with changed or decreased function (gain or loss of function)

Allelic series of mutations are being recognized

Size of human genes is being determined

Gene size ranges from about 0.8 kb with 2 introns (alpha-globin) to more than 2000 kb with 78 introns (dystrophin, the gene mutated in Duchenne muscular dystrophy).

Understanding of gene structure and function is advancing

Molecular genetic heterogeneity plays an important role in variability in phenotype. This may involve allelic heterogeneity (allelic series), tissue-specific mRNA splicing, multimeric proteins whose subunits are encoded by different genes, or multiple steps in a pathway. You should try to identify some examples of each.

Evolution will be better understood

Functional and pathological anatomy of the human genome will be clarified, just as gross and microscopic anatomy are now

Clinical implications are many

Markers linked to inherited disease are useful for diagnosis

Better estimation of risk to relatives

Understanding of disease heterogeneity

Understanding of multifactorial traits

Eventually hope for prevention and treatment

McKusick has begun to call this knowledge "the morbid anatomy of the human genome", an analogy to anatomic pathology. One could think of this as the finest level of resolution of anatomic pathology.

Classes of mutations

Single base substitutions are among the most common changes to human DNA. These base changes can occur in the coding or the non-coding regions of the DNA. If they occur in the coding region, they can be:

synonymous: no amino acid change occurs

non-synonymous: amino acid change does occur

conservative: new amino acid has similar properties

non-conservative: new amino acid has different properties

As a paper exercise, you can make codon changes, see what amino acid changes you get, and try to predict the effect on the protein coded for by a gene in which those changes occurred. Kinds of single base substitutions include missense, nonsense, and splice site mutations.
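
The paper exercise above can also be tried in code. The following is a minimal sketch in Python, assuming the standard genetic code and DNA (not RNA) codons; the function name classify_substitution and the "stop lost" label are illustrative choices, not terms from the readings.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code: amino acids for TTT, TTC, TTA, TTG, TCT, ... GGG
# (third base varies fastest, bases ordered T, C, A, G). "*" is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, position, new_base):
    """Classify a single base change within one codon.

    position is 0, 1, or 2 (first, second, or third base of the codon).
    Returns (mutant codon, original amino acid, new amino acid, class).
    """
    codon = codon.upper()
    mutant = codon[:position] + new_base.upper() + codon[position + 1:]
    old_aa, new_aa = CODON_TABLE[codon], CODON_TABLE[mutant]
    if old_aa == new_aa:
        kind = "synonymous"
    elif new_aa == "*":
        kind = "nonsense"
    elif old_aa == "*":
        kind = "stop lost"  # a stop codon becomes an amino acid codon
    else:
        kind = "missense"
    return mutant, old_aa, new_aa, kind


if __name__ == "__main__":
    print(classify_substitution("AAA", 0, "G"))  # lysine codon, first base A to G
    print(classify_substitution("GGA", 0, "T"))  # glycine codon, first base G to T
    print(classify_substitution("CTT", 2, "C"))  # leucine codon, third base T to C
```

The sketch only separates synonymous, missense, and nonsense changes. Deciding whether a missense change is conservative or non-conservative still requires comparing the properties of the two amino acids, and splice site mutations cannot be recognized from a single codon.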

Mutations that change the total base numbers in the gene include the following:

insertions

single base insertion

triplet insertion

many-base insertion

deletions

single base deletion

triplet deletion

many-base deletion (up to megabases)

duplications

Mutations in non-coding DNA can be at splice sites or regulatory sites. Splice-site mutations can result in:

exon skipping

cryptic splice site activation (much more on this in hemoglobins)

The impact of mutations depends on the effect of these mutations on protein function. The class of the mutation is one of the determining factors. The role of the protein is also a factor: is it structural, regulatory, part of a metabolic pathway, a membrane protein? Whether the phenotype that results from the mutation is dominant or recessive depends on the consequences of the protein change on the heterozygote for the mutation. The effect of the mutation also depends on whether it causes a loss of function of the protein (decreased amount or complete loss of activity, for example) or gain of function (a new or abnormal function, for example).

While there are exceptions, loss of function mutations tend to be recessive when 50% of the normal amount of protein is enough to sustain function. Gain of function mutations are often dominant; the protein may act in the wrong cell or at the wrong time, or its activity may be too high. Dominant negative mutations result when a mutant protein not only loses its own function but also interferes with the function of protein that is being normally made, such as the product of the normal allele; an example is a mutation in one subunit of a multimeric protein.
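
The reason a mutation in one subunit of a multimer can be dominant is easiest to see with a little arithmetic. The sketch below rests on simplifying assumptions that are not from the readings: the heterozygote makes equal amounts of normal and mutant subunit, subunits assemble at random, and a single mutant subunit inactivates the whole multimer. The function name is hypothetical.

```python
def fraction_normal_multimers(copies, mutant_fraction=0.5):
    """Fraction of multimers built entirely from normal subunits.

    copies: how many copies of the affected subunit each multimer contains.
    mutant_fraction: share of that subunit pool which is mutant
                     (0.5 for a heterozygote with equal expression).
    Assumes random assembly; real proteins may not behave this simply.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    if not 0.0 <= mutant_fraction <= 1.0:
        raise ValueError("mutant_fraction must be between 0 and 1")
    return (1.0 - mutant_fraction) ** copies


if __name__ == "__main__":
    for copies in (1, 2, 3, 4):
        share = fraction_normal_multimers(copies)
        print(f"{copies} copies per multimer: {share:.4f} fully normal")
```

Under these assumptions a monomer retains half of its normal function in the heterozygote, but a multimer with two copies of the subunit retains only one quarter and one with four copies only one sixteenth, which is well below the 50% that a simple loss of function would leave.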

The molecular pathology of triplet repeat expansion is not well understood. Expansions can result in gain or loss of function. In some cases, the mechanism may involve methylation of the gene, so that the gene is not transcribed and no protein is made. In others, proteins with long polyglutamine runs are synthesized, with gain of function. Neighboring genes may also be affected. The mechanism that alters phenotype when the 3'-untranslated region of a gene undergoes repeat expansion is not known.
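
To make the polyglutamine case concrete: CAG is a codon for glutamine, so an expanded run of CAG triplets in a coding region is translated into a longer run of glutamines. The following is a small sketch that measures the longest uninterrupted CAG run in a DNA string. The sequence used is invented for illustration, and the function ignores reading frame, which a real analysis would need to check.

```python
import re


def longest_cag_run(sequence):
    """Return the number of repeats in the longest uninterrupted CAG run.

    The sequence is scanned as plain text; reading frame is not considered.
    """
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


if __name__ == "__main__":
    # Hypothetical coding fragment, not taken from any real gene.
    fragment = "ATG" + "CAG" * 5 + "GGC" + "CAG" * 2
    repeats = longest_cag_run(fragment)
    print(repeats, "repeats, polyglutamine run:", "Q" * repeats)
```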

Ways to lose a gene product:

gene deletion

partial gene deletion

frame shift insertions or deletions

stop codon base change

altered splice site

activated cryptic splice site

abnormal mRNA stability

change in regulatory sequences

abnormal intracellular transport or localization

A useful exercise may be to identify examples of these mechanisms in known human mutations.

The nomenclature of mutations is important to know in order to read the literature. For single base changes, mutations may be described by the base change or the protein change. Insertions and deletions are usually described at the level of the gene. When the base change is described, the position of the change and the exact base change are noted: A985G, for example, designates an A-to-G transition at position 985 of the coding region. When the protein is described, the one-letter amino acid abbreviation is usually used: a change from lysine to glutamic acid at amino acid position 304 of the peptide would be K304E. A stop codon is designated by X; for example, G356X means a change from glycine to a stop codon at amino acid position 356 of the peptide. Deletions and insertions are described by the effect at the DNA level: ΔF508 is a deletion of the phenylalanine codon at position 508 of the cystic fibrosis gene; nt1160(del 6bp) designates a deletion of 6 base pairs at nucleotide 1160 (similarly for insertions). See Trends in Genetics, March, 1995, for more explanation.
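
As a reading aid, the substitution shorthand described above can be expanded mechanically. The following is a minimal sketch covering only the letter-number-letter form; the function name describe and its level argument are hypothetical. Note that A, C, G, and T are also one-letter amino acid codes, so a name such as A985G does not by itself say whether a base or an amino acid is meant, and the caller has to supply the level.

```python
import re

ONE_LETTER = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "Q": "glutamine", "E": "glutamic acid", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
    "X": "a stop codon",
}
PURINES = {"A", "G"}


def describe(mutation, level):
    """Expand shorthand such as A985G (level='dna') or K304E (level='protein')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"not a simple substitution name: {mutation!r}")
    old, position, new = match.group(1), int(match.group(2)), match.group(3)
    if old == new:
        raise ValueError("original and new residue are identical")
    if level == "dna":
        if old not in "ACGT" or new not in "ACGT":
            raise ValueError("DNA changes must use the bases A, C, G, T")
        # Purine to purine or pyrimidine to pyrimidine is a transition.
        same_class = (old in PURINES) == (new in PURINES)
        kind = "transition" if same_class else "transversion"
        return f"{old}-to-{new} {kind} at nucleotide {position}"
    if level == "protein":
        if old not in ONE_LETTER or new not in ONE_LETTER:
            raise ValueError("unknown one-letter amino acid code")
        return f"{ONE_LETTER[old]} to {ONE_LETTER[new]} at amino acid {position}"
    raise ValueError("level must be 'dna' or 'protein'")


if __name__ == "__main__":
    print(describe("A985G", "dna"))
    print(describe("K304E", "protein"))
    print(describe("G356X", "protein"))
```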

Size and Scope of mutations

Examples of Some Specific mutations

Membrane proteins

Dystrophin (Duchenne muscular dystrophy)

Duchenne muscular dystrophy has been recognized for many years as a devastating X-linked disorder of muscle, but its pathogenesis and molecular basis were unknown, and no satisfactory therapy has been devised. Localization of the gene to Xp21 was first suspected in 1985, when a patient with Duchenne muscular dystrophy and no family history was found to have a deletion at Xp21, as seen in Fig. 1. Further studies in a number of laboratories led to the cloning and sequencing of the gene that is mutant in DMD. The gene codes for a 400 kD protein, dystrophin, which resides in the muscle membrane. The molecular pathology described includes deletions, some of which involve more than one exon, as well as point mutations. About 60-65% of cases can be accounted for by deletions, ranging from 1 exon to the entire gene. Smaller deletions may result in less severe phenotypes, but a large in-frame deletion may be less severe than a smaller deletion that shifts the reading frame, an example of heterogeneity due to different mutations at the same locus (allelic heterogeneity).
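
Whether a deletion is in-frame or a frame shift follows from simple arithmetic: if the number of coding bases removed is a multiple of three, the codons downstream are still read in their original frame. The sketch below illustrates that rule; the function name is hypothetical and the exon lengths are invented, not actual dystrophin exon sizes.

```python
def deletion_frame_effect(deleted_coding_bases):
    """Return 'in-frame' if the deletion removes a whole number of codons."""
    if deleted_coding_bases <= 0:
        raise ValueError("a deletion must remove at least one base")
    return "in-frame" if deleted_coding_bases % 3 == 0 else "frameshift"


if __name__ == "__main__":
    # Hypothetical exon lengths in base pairs, for illustration only.
    large_deletion = [150, 180, 96]   # three whole exons
    small_deletion = [100]            # one whole exon
    print(sum(large_deletion), "bp:", deletion_frame_effect(sum(large_deletion)))
    print(sum(small_deletion), "bp:", deletion_frame_effect(sum(small_deletion)))
```

In this invented case the three-exon deletion removes 426 bp and keeps the reading frame, while the single-exon deletion removes 100 bp and shifts it, so the smaller deletion would be expected to be the more disruptive of the two.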

CFTR (cystic fibrosis)

Linkage analysis led to localization of the cystic fibrosis gene to chromosome 7. Using chromosome jumping, several groups succeeded in cloning the gene. It codes for a membrane-spanning protein of 1480 amino acids which looks like a transport protein, a conclusion which fits the clinical presentation of the disorder. This protein is represented in Figure 2, taken from Science, 253:202 (1991). About 80% of the cases are the result of a deletion in this gene of the phenylalanine codon at position 508 (ΔF508), resulting in a structural change in a nucleotide-binding fold of the protein. More than 500 different mutations have been identified. The delineation of the molecular pathophysiology may lead to effective therapy. These attempts will be discussed in the combined Biochemistry/Genetics Gene Therapy Clinical Correlation. Genotype and phenotype do not correlate completely. One explanation is modulation of the disease by other loci. One such locus, recently identified in the transgenic CF mouse, is on mouse chromosome 7.

Major structural proteins

Collagen

Mutations in a collagen gene cause phenotypes that depend on the function of the affected collagen subunit and on the organs in which it is most important. Genes for collagen subunits reside on autosomes as well as on the X chromosome.

Osteogenesis imperfecta is a name given to a number of disorders which all share the phenotype of easily fractured bones. They have been shown to represent defects in one of the collagens, a family of proteins which are the most abundant proteins in man and other mammals, comprising about 25% of total protein. Their role in providing structural integrity depends on their fibrous, stiff, triple-stranded helical structures. Figures 3 and 4 demonstrate the biosynthesis of collagen. The biosynthesis of this family of proteins depends on the gene products of more than 30 different loci. The 15 types of collagen are made of subunits whose synthesis is directed by several loci; the remainder of the loci concerned with collagen biosynthesis encode a variety of enzymes and proteases responsible for posttranslational modification of the collagen chains. Defects in type I collagen result in osteogenesis imperfecta, but clinical and biochemical heterogeneity is striking. Linkage with RFLPs for the COL1A1 gene has been established, and affected cells show decreased production of type I collagen. More than 70 molecular defects in the COL1A1 and COL1A2 genes have been described, including deletions, substitutions, and frame shifts. These lead to abnormal subunits which cannot assemble properly into the collagen triple helix, resulting in "protein suicide". Thus, individuals who are heterozygous for these mutations can express the condition, and many of the forms of OI are inherited in an autosomal dominant pattern.

Ehlers-Danlos syndrome type IV is caused by a variety of mutations in COL3A1, which codes for Type III collagen, leading to a very different phenotype because of the body distribution of Type III collagen in blood vessels, intestine, and skin. These individuals suffer from catastrophic rupture of blood vessels, intestine, and uterus. Splicing defects, deletions, and substitutions that replace glycine with bulkier amino acids result in the synthesis of abnormal Type III collagen. Other mutations in COL3A1 cause isolated familial aortic aneurysm. Type VII Ehlers-Danlos syndrome is also caused by mutations in COL1A1 and COL1A2, but has a phenotype quite different from osteogenesis imperfecta. Reviewing these mutations provides an understanding of the correlation between mutation and phenotype.

Another example of a collagen mutation is Alport syndrome, a condition characterized by progressive glomerular damage, with ultrastructural abnormalities of the glomerular basement membrane, leading to renal failure. Hearing loss, the result of cochlear malfunction, and corneal abnormalities complete the picture. Genetic heterogeneity within Alport syndrome has been recognized for many years and has generated considerable debate about autosomal versus X-linked inheritance. Pedigrees consistent with X-linked (85% of families), autosomal recessive, and autosomal dominant inheritance have all been observed, and considerable clinical heterogeneity has been reported. The molecular identification and characterization of the genes associated with Alport syndrome, and of their mutations, have clarified much of the confusion about this condition. All of the mutations in Alport syndrome have been in the gene for one of the α-chain subunits of type IV collagen. Type IV collagen is a major component of basement membranes and is expressed in the glomerulus, the cornea, and the cochlea, consistent with the findings of nephritis, corneal abnormalities, and sensorineural hearing loss. Type IV collagen has 6 α-chain subunits, 4 of which map to autosomes. Two subunits, α5 and α6, map to the X chromosome at Xq22, the location to which the X-linked and most common form of Alport syndrome has been mapped. Mutations in COL4A5, the gene for the α5 subunit of type IV collagen, account for the majority of cases of this form. Deletions, point mutations, exon skipping, and nonsense mutations resulting in no protein synthesis have all been observed. In a considerably rarer X-linked variant, Alport syndrome associated with diffuse esophageal leiomyomatosis, there is a contiguous deletion of portions of the genes for the α5 and α6 subunits. The autosomal recessive form has been associated with mutations in the genes for the α3 and α4 subunits, both of which map to chromosome 2q35-37.

Fibrillin is another structural protein which is very important in the formation of the microfibrillar structures which hold the ocular lens in place and are an important component of arterial walls. It is not surprising, therefore, that a mutation in fibrillin should result in a condition which is characterized by dislocation of the ocular lens and aneurysm and rupture of the aorta (Marfan syndrome). The road to this conclusion was paved by both linkage studies and histochemical analysis. The presence of the condition segregated with the markers D15S25 and D15S1, located at 15q15-21. The LOD score for D15S25 was 9.2 at θ = 0.0 ± 0.09. Later evidence demonstrated that the gene for fibrillin maps to that position, and abnormalities in fibrillin in the microfibrillar structures were demonstrated histochemically in tissue from individuals with the Marfan syndrome. The mutations are still being characterized.
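
For those who want to see what a LOD score measures: it is the base-10 logarithm of the odds that the observed pattern of inheritance arose with the marker and disease locus linked at a given recombination fraction, rather than unlinked (recombination fraction 0.5). The following is a sketch for the simplest textbook situation only, in which every meiosis is fully informative and phase is known; the published Marfan linkage analysis may well not fit that situation, and the counts in the example are invented.

```python
import math


def lod_score(meioses, recombinants, theta):
    """LOD score for counted recombinants among phase-known informative meioses.

    theta is the recombination fraction being tested (0 to 0.5).
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must be between 0 and 0.5")
    likelihood_linked = (theta ** recombinants) * ((1.0 - theta) ** (meioses - recombinants))
    likelihood_unlinked = 0.5 ** meioses
    if likelihood_linked == 0.0:
        # Any recombinant rules out theta = 0 completely.
        return float("-inf")
    return math.log10(likelihood_linked / likelihood_unlinked)


if __name__ == "__main__":
    # Hypothetical family data: 10 informative meioses, no recombinants.
    for theta in (0.0, 0.1, 0.2, 0.5):
        print(f"theta = {theta:.1f}: LOD = {lod_score(10, 0, theta):.2f}")
```

In this idealized setting each non-recombinant meiosis adds log10(2), about 0.3, to the LOD score at a recombination fraction of zero, which gives a feel for how much family data lies behind a score above 9.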

Efforts at therapy based on an understanding of the pathogenesis of these disorders are the obvious next step. Already the elucidation of specific mutations has increased diagnostic precision. Interesting new mechanisms are being recognized, such as mosaicism in phenotypically normal parents of affected children, in cases where the child was thought to represent a new mutation. The basic principles will become increasingly important, as it will be impossible to keep track of each individual case.

What are the key points in molecular and biochemical genetics?