Computation of various physical and chemical properties of Proteins and DNA from their primary sequence


Experimental Biochemistry Virtual Laboratory

Theory

Introduction to proteins and amino acids:

Proteins are among the most significant and abundant organic molecules in living systems, and they exhibit more diversity in structure and function than any other class of macromolecules. They mediate virtually every cellular process and perform numerous functions. The diverse functions of proteins are determined by their structure and chemical composition. Although their structures, like their functions, vary greatly, all proteins are made up of one or more chains of amino acids.

Proteins are polymers of amino acids, and each amino acid is joined to its neighbour through a covalent amide linkage known as a peptide bond. All amino acids share a basic structure, which consists of a central carbon atom, also known as the alpha (α) carbon, bonded to an amino group (NH2), a carboxyl group (COOH), and a hydrogen atom. They differ from each other in their side chains, or R groups, which vary in structure, size, and electric charge and also determine the solubility of the amino acids in water.

Figure 1: General structure of an amino acid

Figure 2: Twenty different amino acids commonly found in proteins, each with a different R group that determines its chemical nature.

The properties of the side chain determine an amino acid’s chemical behavior (that is, whether it is considered acidic, basic, polar, or nonpolar). For example, amino acids such as valine, isoleucine, and leucine are nonpolar and hydrophobic, while amino acids like serine and glutamine have hydrophilic side chains and are polar. Some amino acids, such as lysine and arginine, have side chains that carry a positive charge at physiological pH and are considered basic, whereas aspartate and glutamate are negatively charged at physiological pH and are considered acidic.

Peptide bonds:

The amino acids of a polypeptide are attached to their neighbours by covalent bonds known as peptide bonds. Such a linkage is formed in a condensation reaction, by the removal of the elements of water (dehydration) from the α-carboxyl group of one amino acid and the α-amino group of another. In a peptide, the amino acid residue at the end with a free α-amino group is the amino-terminal (or N-terminal) residue; the residue at the other end, which has a free carboxyl group, is the carboxyl-terminal (or C-terminal) residue.

Figure 3: Peptide bond formed between two amino acids
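Because each peptide bond releases one molecule of water, the molar mass of a linear chain can be estimated as the sum of the residue masses (amino acid minus water) plus one water for the free N- and C-termini. The sketch below illustrates this in Python. The residue masses are approximate average values supplied for illustration and are not taken from this text; the function name `protein_molar_mass` is hypothetical, and modifications such as disulfide bonds or glycosylation are ignored.

```python
# Approximate average residue masses in g/mol (amino acid minus one water).
# Assumed reference values for illustration; not taken from the text.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # g/mol, added once for the free termini


def protein_molar_mass(sequence):
    """Estimate the average molar mass (g/mol) of a linear, unmodified chain."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    unknown = set(sequence) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    # Sum of residues plus one water for the terminal H and OH.
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


# Glycylglycine: two glycine residues plus one water.
print(round(protein_molar_mass("GG"), 2))  # 132.12
```

The sequence is given in one-letter code, read from the N terminal to the C terminal.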

Physical and chemical properties of proteins that can be derived from their primary sequence

  1. Molar mass of protein
  2. Molar extinction coefficient: Proteins absorb UV light in proportion to their concentration, which is the basis for the spectrophotometric determination of protein concentration. The relationship is described by the Beer–Lambert law (or Beer’s law), which relates a protein’s absorbance to its molar extinction coefficient, its concentration, and the pathlength of the light through the sample. The protein concentration can be derived from the absorbance measured at 280 nm using the equation

    A = εcL

    A: absorbance of the protein (unitless)
    ε: molar extinction coefficient of the protein (M⁻¹ cm⁻¹)
    c: concentration of the protein (molar units, M)
    L: light pathlength (cm)

    At a wavelength of 280 nm, the aromatic amino acids tryptophan (Trp) and tyrosine (Tyr) absorb strongly, while cysteine residues forming disulfide bonds (Cys–Cys) absorb to a much lesser extent. Phenylalanine (Phe) absorbs mainly at shorter wavelengths (around 257 nm) and contributes very little at 280 nm. Consequently, the absorption of proteins and peptides at 280 nm is proportional to their content of Trp, Tyr, and disulfide bonds.

  3. Isoelectric point (pI) of protein: The pI is defined as the pH at which the net charge of a protein molecule is zero. Proteins are positively charged at a pH below their pI and negatively charged at a pH above their pI. Protein pI values vary greatly, from strongly acidic to highly alkaline, ranging from about 4.0 to 12.0. Hence, pI values are used to choose methods and buffer compositions for the isolation, separation, purification, and crystallization of proteins. The pI of a protein depends primarily on its amino acid composition, through the dissociation constants (pKa values) of its ionizable groups. Of the twenty common amino acids, two (aspartic acid and glutamic acid) have side chains that are negatively charged at neutral pH, and three (lysine, arginine, and histidine) have basic side chains. Lysine and arginine are positively charged at neutral pH, whereas histidine, with a side-chain pKa near 6, is only partially protonated.
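The extinction coefficient and pI described above can both be estimated directly from a sequence. The following Python sketch is one possible approach, not a definitive implementation. It assumes the widely quoted per-residue contributions at 280 nm (Trp 5500, Tyr 1490, and 125 M⁻¹ cm⁻¹ per disulfide bond) and an illustrative set of pKa values; neither comes from this text, and different tools use different pKa sets, so predicted pI values vary. All function names are hypothetical.

```python
# Assumed contributions to the extinction coefficient at 280 nm (M^-1 cm^-1).
EPS_TRP, EPS_TYR, EPS_DISULFIDE = 5500, 1490, 125

# Assumed, illustrative pKa values; other published sets differ slightly.
PKA_N_TERM, PKA_C_TERM = 8.6, 3.6
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}             # +1 when protonated
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}    # -1 when deprotonated


def extinction_coefficient_280(sequence, disulfides=0):
    """Estimate the molar extinction coefficient at 280 nm from composition."""
    sequence = sequence.upper()
    return (sequence.count("W") * EPS_TRP
            + sequence.count("Y") * EPS_TYR
            + disulfides * EPS_DISULFIDE)


def concentration_from_a280(absorbance, epsilon, path_cm=1.0):
    """Rearranged Beer-Lambert law: c = A / (epsilon * L), in mol/L."""
    return absorbance / (epsilon * path_cm)


def net_charge(sequence, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch relation."""
    sequence = sequence.upper()
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))    # free alpha-amino group
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))   # free alpha-carboxyl group
    for aa, pka in PKA_POSITIVE.items():
        charge += sequence.count(aa) / (1.0 + 10 ** (ph - pka))
    for aa, pka in PKA_NEGATIVE.items():
        charge -= sequence.count(aa) / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence, low=0.0, high=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection (charge falls as pH rises)."""
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid    # still positive, so the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


eps = extinction_coefficient_280("WYY")
print(eps)                                   # 8480
print(concentration_from_a280(0.848, eps))   # about 1e-4 M for a 1 cm path
print(isoelectric_point("DDDD") < 7 < isoelectric_point("KKKK"))  # True
```

Bisection works here because the net charge decreases monotonically as the pH increases, so there is exactly one zero crossing between pH 0 and pH 14.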

Introduction to DNA

DNA (deoxyribonucleic acid) is the molecular repository of genetic information. The structure and function of every protein, biomolecule, and cellular component depend on the information programmed in the nucleotide sequence of DNA.

A nucleotide has three characteristic components:

  1. Nitrogenous base (pyrimidines: cytosine and thymine; purines: adenine and guanine)
  2. Pentose sugar (2′-deoxy-D-ribose)
  3. Phosphate group

The 5′-phosphate group of one nucleotide is linked to the 3′-hydroxyl group of the next nucleotide, creating a phosphodiester linkage. Each strand of DNA has a backbone made of alternating sugar (deoxyribose) and phosphate groups. Attached to each sugar is one of four bases: adenine (A), cytosine (C), guanine (G), or thymine (T). The two strands are held together by hydrogen bonds between pairs of bases: adenine pairs with thymine, and cytosine pairs with guanine.

Figure 4: Schematic representation of the structure of a nucleotide and DNA
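The base-pairing rules mean that the sequence of one strand fixes the sequence of the other. As a minimal sketch, assuming the sequence is written 5′ to 3′ and contains only A, C, G, and T, the partner strand can be derived as follows (the function name `reverse_complement` is illustrative):

```python
# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence):
    """Return the partner strand, also written 5' to 3'."""
    # The two strands run antiparallel, so the complement is read in reverse.
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))


print(reverse_complement("ATGC"))  # GCAT
```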

Properties of DNA that can be derived from its nucleotide sequence

  1. Molar mass
  2. Melting temperature (Tm): The temperature at which half of the DNA is present as separated single strands. The higher the content of G≡C base pairs, the higher the melting temperature of the DNA. DNA regions that are rich in A=T pairs denature more easily. The melting temperature of DNA, determined under fixed conditions of pH and ionic strength, can therefore give an approximate estimate of the base composition of the DNA.
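Both properties can be approximated from the base counts alone. The Python sketch below is illustrative rather than authoritative, and none of its constants come from this text. For Tm it uses the Wallace rule (2 °C per A or T, 4 °C per G or C), a rough rule of thumb intended for short oligonucleotides of roughly 14 nucleotides or fewer. For molar mass it uses commonly quoted average nucleotide masses for a single-stranded oligonucleotide without a 5′ phosphate. Function names are hypothetical.

```python
# Assumed average masses (g/mol) of each nucleotide unit within a DNA strand.
NUCLEOTIDE_MASS = {"A": 313.21, "T": 304.20, "C": 289.18, "G": 329.21}
END_CORRECTION = -61.96  # assumed correction for a strand with no 5' phosphate


def base_counts(sequence):
    """Count A, C, G, and T, rejecting any other symbol."""
    sequence = sequence.upper()
    if not sequence or set(sequence) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, and T")
    return {base: sequence.count(base) for base in "ACGT"}


def gc_content(sequence):
    """Fraction of bases that are G or C (between 0 and 1)."""
    counts = base_counts(sequence)
    return (counts["G"] + counts["C"]) / sum(counts.values())


def wallace_tm(sequence):
    """Rough Tm in degrees Celsius for a short oligonucleotide."""
    counts = base_counts(sequence)
    return 2 * (counts["A"] + counts["T"]) + 4 * (counts["G"] + counts["C"])


def ssdna_molar_mass(sequence):
    """Approximate molar mass (g/mol) of a single DNA strand."""
    counts = base_counts(sequence)
    total = sum(NUCLEOTIDE_MASS[base] * n for base, n in counts.items())
    return total + END_CORRECTION


print(gc_content("ATGCATGC"))   # 0.5
print(wallace_tm("ATGCATGC"))   # 24
print(wallace_tm("GCGCGCGC"))   # 32
```

The two Tm values show the trend described above: at equal length, the G≡C-rich sequence has the higher estimated melting temperature.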