Structure and Function

The word protein was coined in 1838 to emphasize the importance of this class of molecules. It is derived from the Greek word proteios, which means "of the first rank". This chapter provides a brief background on the structure of proteins and on how that structure determines their function and activity. It is not intended to substitute for the more detailed information provided in a biochemistry or cell biology course.

Proteins are the major components of living organisms and perform a wide range of essential functions in cells. While DNA is the information molecule, it is proteins that do the work of all cells: microbial, plant, and animal. Proteins regulate metabolic activity, catalyze biochemical reactions, and maintain the structural integrity of cells and organisms. Proteins can be classified in a variety of ways, including by their biological function (Table 2.1).
How does one group of molecules perform such a diverse set of functions? The answer is found in the wide variety of possible structures for proteins. In the English language, an enormous number of words with varied meanings can be formed using only 26 letters as building blocks. A similar situation exists for proteins, where an incredible variety of proteins can be formed using 20 different building blocks called amino acids. Each of these amino acid building blocks has a different chemical structure and different properties (Figure 2.2). Each protein has a unique amino acid sequence that is genetically determined by the order of nucleotide bases in the DNA. Since each protein has different numbers and kinds of the twenty available amino acids, each protein has a unique chemical composition and structure. For example, two proteins may each have 37 amino acids, but if the sequence of the amino acids is different, then the proteins will be different.

How many different proteins can be formed from the twenty different amino acids? Consider a protein containing 100 amino acids linked into one chain. Since each of the 100 positions of this chain could be filled with any one of the 20 amino acids, there are 20^100 (roughly 1.3 x 10^130) possible sequences, far more than enough to account for the 90 to 100 million different proteins that may be found in higher organisms.

A change in just one amino acid can change the structure and function of a protein. For example, sickle cell anemia is a disease that results from an altered structure of the protein hemoglobin, caused by a change of the sixth amino acid of the hemoglobin beta chain from glutamic acid to valine. (This is the result of a single base pair change at the DNA level.) This single amino acid change is enough to change the conformation of hemoglobin so that the protein clumps at lower oxygen concentrations and causes the characteristic sickle-shaped red blood cells of the disease.
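The counting argument above can be checked directly. The following is a minimal sketch in Python; the function name `possible_sequences` is made up for illustration and is not part of the chapter.

```python
# Count the possible amino acid sequences for a chain of a given length.
# Each position can hold any one of the 20 amino acids, so the count is
# 20 multiplied by itself once per position.

NUM_AMINO_ACIDS = 20  # the 20 building blocks described in the text


def possible_sequences(chain_length: int) -> int:
    """Return the number of distinct sequences for a chain of this length."""
    return NUM_AMINO_ACIDS ** chain_length


if __name__ == "__main__":
    count = possible_sequences(100)
    # Python integers have arbitrary precision, so the exact value is available.
    print(f"A 100-residue chain has {len(str(count))}-digit many possible sequences")
    print(f"Approximate value: {count:.3e}")
    # Compare with the upper estimate of protein diversity quoted in the text.
    print(f"Exceeds 100 million: {count > 100_000_000}")
```

Even a dipeptide already has 20 x 20 = 400 possible sequences, which shows how quickly the number grows with chain length.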
The unique structure and chemical composition of each protein is important for its function; it is also important for separating proteins in a protein purification strategy. Each difference in properties between proteins can be used as a basis for the separation methods that are used to purify them. Because these differences in protein properties originate from differences in the chemical structure of the amino acids that make up the protein, we need to explore the structure of amino acids and their contribution to protein properties in more detail.

Chemical Composition of Proteins: (Protein Structure)

Amino acid structure:

Amino acids are composed of carbon, hydrogen, oxygen, and nitrogen. Two amino acids, cysteine and methionine, also contain sulfur. Atoms of these elements are arranged into the 20 kinds of amino acids that are commonly found in proteins. All proteins in all species, from bacteria to humans, are constructed from the same set of twenty amino acids. The generic form of an amino acid is shown in Figure 2.1. All amino acids have an amino group (NH2) and a carboxyl group (COOH) bonded to the same carbon atom, known as the alpha carbon. Amino acids differ in the side chain, or R group, that is bonded to the alpha carbon (Figure 2.2). Glycine, the simplest amino acid, has a single hydrogen atom as its R group; alanine has a methyl (-CH3) group.
The chemical composition of the unique R groups is responsible for the important characteristics of amino acids, such as chemical reactivity, ionic charge, and relative hydrophobicity. In Figure 2.2, the amino acids are grouped according to their polarity and charge. They are divided into four categories: those with polar uncharged R groups, those with apolar (nonpolar) R groups, those with acidic (charged) R groups, and those with basic (charged) R groups.
A protein is formed by amino acid subunits linked together in a chain. The bond between two amino acids is formed by the removal of an H2O molecule from the two amino acids, forming a dipeptide (Figure 2.3). The bond between two amino acids is called a peptide bond, and the chain of amino acids is called a peptide (20 amino acids or fewer) or a polypeptide. Each protein consists of one or more unique polypeptide chains. Many proteins, such as myoglobin, consist of a single polypeptide chain; others contain two or more chains. For example, hemoglobin is made up of two chains of one type (amino acid sequence) and two of another type.

Most proteins do not remain as linear sequences of amino acids; rather, the polypeptide chain undergoes a folding process. The process of protein folding is driven by thermodynamic considerations. This means that each protein folds into the configuration that is the most stable for its particular chemical structure and its particular environment. The final shape will vary, but the majority of proteins assume a globular configuration. Although the primary amino acid sequence determines how the protein folds, this process is not completely understood. Certain amino acid sequences can be identified as more likely to form a particular conformation, but it is still not possible to completely predict how a protein will fold based on its amino acid sequence alone, and this is an active area of biochemical research.

The final folded 3-D arrangement of the protein is referred to as its conformation. In order to maintain their function, proteins must maintain this conformation. To describe this complex conformation, scientists use four levels of organization: primary, secondary, tertiary, and quaternary (Figure 2.4). The overall conformation of a protein is the combination of its primary, secondary, tertiary, and quaternary elements.

Four levels of Organization of Protein Structure:
Conjugated Proteins

Some proteins combine with other kinds of molecules such as carbohydrates, lipids, iron and other metals, or nucleic acids, to form glycoproteins, lipoproteins, hemoproteins, metalloproteins, and nucleoproteins, respectively. The presence of these other biomolecules affects the protein properties. For example, a protein that is conjugated to carbohydrate, called a glycoprotein, would be more hydrophilic in character, while a protein conjugated to a lipid would be more hydrophobic in character.

Protein Properties and Separation

Proteins are typically characterized by their size (molecular weight) and shape, amino acid composition and sequence, isoelectric point (pI), hydrophobicity, and biological affinity. Differences in these properties can be used as the basis for separation methods in a purification strategy (Chapter 4). The chemical composition of the unique R groups is responsible for the important characteristics of amino acids: chemical reactivity, ionic charge, and relative hydrophobicity. Therefore, protein properties relate back to the number and type of amino acids that make up the protein.

Size:

The size of proteins is usually measured as molecular weight (mass), although occasionally the length or diameter of a protein is given in Angstroms. The molecular weight of a protein is usually expressed in units called daltons. One dalton is approximately the mass of one proton or neutron. The molecular weight can be estimated by a number of different methods, including electrophoresis, gel filtration, and, more recently, mass spectrometry. The molecular weight of proteins varies over a wide range. For example, insulin is 5,700 daltons while snail hemocyanin is 6,700,000 daltons. The average molecular weight of a protein is between 40,000 and 50,000 daltons. Molecular weights are commonly reported in kilodaltons (kD), a unit of mass equal to 1,000 daltons. Most proteins have a mass between 10 and 100 kD.
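As a rough illustration of these units, the sketch below converts the masses quoted above from daltons to kilodaltons and estimates how many amino acids a protein of that mass might contain. The figure of about 110 daltons per amino acid residue is a commonly used rule of thumb and is an assumption here, not a value from this chapter; the function names are likewise made up for illustration.

```python
# Convert protein masses between daltons and kilodaltons, and estimate the
# number of amino acid residues from the mass.

DALTONS_PER_KILODALTON = 1000

# Assumption (not from the text): rule-of-thumb average mass of one amino
# acid residue in a protein, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110


def to_kilodaltons(mass_da: float) -> float:
    """Convert a mass in daltons to kilodaltons (kD)."""
    return mass_da / DALTONS_PER_KILODALTON


def estimate_residue_count(mass_da: float) -> int:
    """Rough estimate of the number of residues in a protein of this mass."""
    return round(mass_da / AVERAGE_RESIDUE_MASS_DA)


if __name__ == "__main__":
    # Masses in daltons as quoted in the text.
    examples = [("insulin", 5_700), ("snail hemocyanin", 6_700_000)]
    for name, mass_da in examples:
        print(
            f"{name}: {to_kilodaltons(mass_da):,.1f} kD, "
            f"roughly {estimate_residue_count(mass_da):,} residues"
        )
```

The estimate is only approximate, because the true average depends on the amino acid composition of each protein, and a very large protein may be an assembly of many separate polypeptide chains rather than a single chain.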
A small protein consists of about 50 amino acids, while larger proteins may contain 3,000 amino acids or more. One of the larger amino acid chains is myosin, found in muscles, which has 1,750 amino acids. Separation methods that are based on size and shape include gel filtration chromatography (size exclusion chromatography) and polyacrylamide gel electrophoresis.

Amino Acid Composition and Sequence

The amino acid composition is the percentage of each constituent amino acid in a particular protein, while the sequence is the order in which the amino acids are arranged.

Charge:

Each protein has an amino group at one end and a carboxyl group at the other end, as well as numerous amino acid side chains, some of which are charged. Therefore each protein carries a net charge. The net protein charge is strongly influenced by the pH of the solution. To explain this phenomenon, consider the hypothetical protein in Figure 2.5. At pH 6.8, this protein has an equal number of positive and negative charges, and so there is no net charge on the protein. As the pH drops, more H+ ions are available in the solution. These hydrogen ions bind to negative sites on the amino acids and neutralize them. Therefore, as the pH drops, the protein as a whole becomes positively charged. Conversely, at a basic pH, the protein becomes negatively charged. pH 6.8 is called the pI, or isoelectric point, for this protein; that is, the pH at which there are an equal number of positive and negative charges. Different proteins have different numbers of each of the amino acid side chains and therefore have different isoelectric points. So, in a buffer solution at a particular pH, some proteins will be positively charged, some will be negatively charged, and some will have no net charge.
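The relationship between pH and net charge can be sketched numerically. The example below is a simplified model, not the method behind Figure 2.5: it treats every ionizable group as independent, uses the Henderson-Hasselbalch relationship to estimate what fraction of each group is charged at a given pH, and then searches for the pH at which the net charge is zero. The pKa values are approximate textbook figures that differ between published tables, and the group counts describe a made-up protein; both are assumptions for illustration only.

```python
# Estimate the net charge of a protein as a function of pH, and find its
# isoelectric point (pI) by bisection.

# Assumption: approximate pKa values; published tables differ slightly.
# Groups that carry a +1 charge when protonated:
POSITIVE_PKA = {"N_term": 8.0, "K": 10.5, "R": 12.5, "H": 6.0}
# Groups that carry a -1 charge when deprotonated:
NEGATIVE_PKA = {"C_term": 3.1, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(group_counts: dict, ph: float) -> float:
    """Sum the average charge contributed by each ionizable group at this pH."""
    charge = 0.0
    for group, count in group_counts.items():
        if group in POSITIVE_PKA:
            # Fraction of the group that is protonated (charged +1).
            charge += count / (1 + 10 ** (ph - POSITIVE_PKA[group]))
        elif group in NEGATIVE_PKA:
            # Fraction of the group that is deprotonated (charged -1).
            charge -= count / (1 + 10 ** (NEGATIVE_PKA[group] - ph))
    return charge


def isoelectric_point(group_counts: dict, tolerance: float = 1e-4) -> float:
    """Find the pH where the net charge is zero.

    The net charge only decreases as the pH rises, so bisection between
    pH 0 and pH 14 converges on the single crossing point.
    """
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(group_counts, mid) > 0:
            low = mid  # still positive, so the pI lies at a higher pH
        else:
            high = mid
    return (low + high) / 2


if __name__ == "__main__":
    # Hypothetical protein: counts of each ionizable group (one-letter codes).
    protein = {"N_term": 1, "C_term": 1, "K": 6, "R": 4, "H": 3, "D": 5, "E": 7}
    pi = isoelectric_point(protein)
    print(f"Estimated pI: {pi:.2f}")
    for ph in (pi - 2, pi, pi + 2):
        print(f"pH {ph:.2f}: net charge {net_charge(protein, ph):+.2f}")
```

Below the estimated pI the model gives a positive net charge and above it a negative one, which mirrors the behavior described for the hypothetical protein above. Real proteins deviate from this model because neighboring groups and folding shift the individual pKa values.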
Hydrophobicity:

Literally, hydrophobic means water fearing. In aqueous solutions, proteins tend to fold so that their hydrophobic regions are packed next to each other in the interior of the protein, away from the polar water molecules of the solution. Polar groups on the amino acids are called hydrophilic (water loving) because they will form hydrogen bonds with water molecules. The number, type, and distribution of nonpolar amino acid residues within the protein determine its hydrophobic character. (Chart of hydrophobicity or hydropathy) A separation method that is based on the hydrophobic character of proteins is hydrophobic interaction chromatography.

Solubility:

As the name implies, solubility is the amount of a solute that can be dissolved in a solvent. The 3-D structure of a protein affects its solubility properties. Cytoplasmic proteins have mostly hydrophilic (polar) amino acids on their surface and are therefore water soluble, with the more hydrophobic groups located in the interior of the protein, sheltered from the aqueous environment. In contrast, proteins that reside in the lipid environment of the cell membrane have mostly hydrophobic (nonpolar) amino acids on their exterior surface and are not readily soluble in aqueous solutions. Each protein has a distinct and characteristic solubility in a defined environment, and any change to those conditions (buffer or solvent type, pH, ionic strength, temperature, etc.) can cause proteins to lose solubility and precipitate out of solution. The environment can be manipulated to bring about a separation of proteins. For example, the ionic strength of the solution can be increased or decreased, which will change the solubility of some proteins.
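The hydrophobic character described above is often summarized with a hydropathy scale, which assigns each amino acid a number that is higher for more hydrophobic side chains. The sketch below uses the Kyte-Doolittle scale as one widely used example; whether this is the scale intended for the chart in this chapter is an assumption. It averages the values over a sequence, and the two sequences are made up for illustration.

```python
# Average hydropathy of a protein sequence using the Kyte-Doolittle scale.
# Positive averages indicate an overall hydrophobic sequence; negative
# averages indicate an overall hydrophilic one.

# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def average_hydropathy(sequence: str) -> float:
    """Return the mean hydropathy value over all residues in the sequence."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must contain at least one residue")
    # A KeyError here means the sequence contains a non-standard letter.
    return sum(KYTE_DOOLITTLE[residue] for residue in residues) / len(residues)


if __name__ == "__main__":
    # Hypothetical sequences, chosen only to contrast the two extremes.
    membrane_like = "LIVFAVLLIA"  # rich in nonpolar residues
    soluble_like = "DEKRSNQDEK"   # rich in charged and polar residues
    for label, seq in [("nonpolar-rich", membrane_like), ("polar-rich", soluble_like)]:
        print(f"{label} {seq}: {average_hydropathy(seq):+.2f}")
```

A single average hides where the hydrophobic residues sit along the chain, so in practice the same values are often averaged over a sliding window to locate hydrophobic stretches.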
Biological Affinity (Function):

Proteins often interact with other molecules in vivo in a specific way; in other words, they have a biological affinity for that molecule. These molecular counterparts, termed ligands, can be used as “bait” to “fish” out the target protein that you want to purify. For example, one such molecular pair is insulin and the insulin receptor. If you want to purify (or catch) the insulin receptor, you could couple many insulin molecules to a solid support, pack that support into a column, and then run an extract containing the receptor over the column. The receptor would be “caught” by the insulin bait. These specific interactions are often exploited in protein purification procedures. Affinity chromatography is a very common method for purifying recombinant proteins (proteins produced by genetic engineering). Several histidine residues can be engineered at the end of a polypeptide chain. Since repeated histidines have an affinity for metals, a column containing the immobilized metal can be used as bait to “catch” the recombinant protein.
Working with proteins

How proteins lose their structure and function.

Although DNA can be isolated and amplified from thousand-year-old mummies, most proteins are more fragile biomolecules. Therefore, laboratory reagents and storage solutions must provide suitable conditions so that the normal structure and function of the protein are maintained. To understand how the structure of proteins is protected in laboratory solutions, it is necessary to understand how that structure can be destroyed.
The composition of the extraction buffer is important for maintaining the structure and function of the target protein. To prevent denaturation, the pH of the buffer is chosen based on the pH stability range of the protein. Other factors, such as ionic strength, divalent cations (Ca++ and Mg++), or reducing agents (dithiothreitol or β-mercaptoethanol), may need to be adjusted or added to maintain activity. In making the extract, cells are lysed and proteases (enzymes that degrade proteins) are released from their intracellular compartments. To prevent proteases from digesting the target protein, two strategies are commonly followed:

1) The extract is kept cold. The activity of proteolytic enzymes is greatly reduced by cold temperatures. For this reason, the protein purification process is often conducted in cold rooms. At the very least, an effort is made to keep the extract at 4°C.

2) Protease inhibitors are sometimes added to the mixture to prevent degradation by proteases. The drawback to this strategy is that the inhibitors must eventually be removed, along with other contaminant proteins.







