W.M. Keck Facility
 Yale University
 300 George Street

Yale University School of Medicine

More Information on Protein Analysis Procedures

Amino Acid Analysis

Amino acid analysis is carried out on a Beckman Model 7300 ion-exchange instrument following a 16 hr hydrolysis at 115 degrees C in 100 µl of 6 N HCl, 0.2% phenol that also contains 2 nmol norleucine. The latter serves as an internal standard to correct for losses that may occur during sample transfers, drying etc. After hydrolysis, the HCl is dried in a Speedvac and the resulting amino acids dissolved in 100 µl Beckman sample buffer that contains 2 nmol homoserine with the latter acting as a second internal standard to independently monitor transfer of the sample onto the analyzer. The instrument is calibrated with a 2 nmol mixture of amino acids and it is operated via the manufacturer's programs and with the use of their buffers. Data analysis is carried out on an external computer using Perkin Elmer/Nelson data acquisition software.
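The norleucine correction described above can be sketched as follows. This is only an illustration, not the facility's data-analysis software: the function name and the example amounts are hypothetical, and it assumes every amino acid is lost in the same proportion as norleucine.

```python
NORLEUCINE_ADDED_NMOL = 2.0  # amount present in the hydrolysis mixture, per the text


def correct_for_recovery(measured_nmol, norleucine_measured_nmol,
                         norleucine_added_nmol=NORLEUCINE_ADDED_NMOL):
    """Scale measured amino acid amounts by the norleucine recovery.

    measured_nmol: dict mapping amino acid name -> nmol found on the analyzer.
    Assumption (not from the source): losses affect all residues equally.
    """
    if norleucine_measured_nmol <= 0:
        raise ValueError("norleucine must be detected to compute recovery")
    recovery = norleucine_measured_nmol / norleucine_added_nmol
    return {aa: amount / recovery for aa, amount in measured_nmol.items()}


# Hypothetical run in which 80% of the norleucine was recovered.
example = {"Ala": 1.6, "Gly": 0.8}
corrected = correct_for_recovery(example, norleucine_measured_nmol=1.6)
print(corrected)
```

The homoserine standard could be treated the same way to check the transfer onto the analyzer separately from the hydrolysis and drying steps.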

During acid hydrolysis, asparagine is converted to aspartic acid and glutamine to glutamic acid. During the HPLC analysis that follows, cysteine co-elutes with proline, and methionine sulfoxide, which is a common oxidation product found in peptides/proteins, co-elutes with aspartic acid. Hence, following normal acid hydrolysis, glutamine and asparagine are not individually quantified, and it is possible that the methionine value will be low and (generally to a lesser extent) that the aspartic acid and proline values will be somewhat high. Improved quantitation of cysteine and methionine can be obtained by requesting prior oxidation with performic acid, which converts both methionine and methionine sulfoxide to methionine sulfone, and cysteine and cystine to cysteic acid. Generally, however, performic acid oxidation destroys tyrosine. Best quantitation of tryptophan is generally obtained by requesting hydrolysis with methanesulfonic acid (MSA) instead of hydrochloric acid. The procedure used in this instance is to carry out the hydrolysis with 20 µl MSA for 16 hrs at 115 degrees C. After hydrolysis, the sample is neutralized with approximately 200 µl 0.35 M NaOH and 100 µl (about 50% of the sample) is then analyzed on the Beckman 7300. Please keep in mind that since we believe the overall extent of hydrolysis with MSA is less than with HCl, we do not recommend MSA hydrolysis for use in quantifying the concentration of protein stock solutions.
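When comparing an analysis with a known sequence, it can help to compute the composition as standard HCl hydrolysis would report it. The sketch below is hypothetical (the function name and the "Asx"/"Glx" labels are our own choices) and models only the Asn/Asp and Gln/Glu merging described above; it does not model the co-elution of cysteine with proline or of methionine sulfoxide with aspartic acid.

```python
from collections import Counter

# Residues that 6 N HCl hydrolysis cannot distinguish, per the text:
# Asn is converted to Asp, and Gln is converted to Glu.
HYDROLYSIS_MERGE = {"N": "Asx", "D": "Asx", "Q": "Glx", "E": "Glx"}


def expected_hcl_composition(sequence):
    """Count residues in a one-letter sequence, merging Asn/Asp and Gln/Glu."""
    return Counter(HYDROLYSIS_MERGE.get(res, res) for res in sequence.upper())


# Hypothetical peptide used only to show the merging.
print(expected_hcl_composition("NDQEAG"))
```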

Internal Protein Sequencing

General Information

Protein digests that cannot be identified by mass spectrometric approaches (i.e., peptide mass searching or database searching of MS/MS fragmentation data) are subjected to preparative reverse phase HPLC. Individual, peak-detected fractions are then "screened" by MALDI-MS, followed by N-terminal (Edman degradation) sequencing of one or more peptide peaks that appear to be nearly homogeneous based on absorbance profile and MALDI-MS spectrum. The Keck Laboratory has made a major and continuing effort to implement and improve more sensitive procedures for isolating and sequencing tryptic peptides from SDS PAGE-separated proteins that generally are submitted as Coomassie Blue stained gel bands. Some of these studies are described in Keck Laboratory publications (17, 18, 20-22, 23, 26). Our overall success rate, as measured by the fraction of proteins submitted for which internal peptide sequences were obtained that either result in identifying the protein (via the database searches that are included with this service) or that are suitable for generating cDNA probes or primers, is nearly 97%.

Nearly 75% of the proteins submitted for internal sequencing are identified via database searching of the first peptide sequence obtained by the Keck Laboratory. This percentage will inevitably increase as the various genome projects are completed. For this reason the Keck Laboratory strongly encourages that aliquots of all enzymatic digests destined for HPLC and peptide sequencing be subjected first to MS or MS/MS protein identification to allow many known proteins to be identified prior to embarking on the more time consuming and expensive HPLC/peptide sequencing approach.

Quantitation of SDS PAGE Samples for Internal Sequencing

The most critical determinant of success of internal sequencing is that a sufficient amount of protein be submitted. Currently, the minimum recommended amount is 5 pmol while the optimal amount is about 25 pmol. If the sample will be submitted in the form of a Coomassie Blue stained gel band, it should be in a single band, as the second most critical determinant of success is that the protein be contained within the minimum possible volume of polyacrylamide gel. Two approaches may be taken to quantify the amount of protein prior to in gel enzymatic cleavage. The first is simply to run several concentrations of a mixture of known proteins on the same gel as the sample and then estimate the amount of protein in the sample by comparison to these standards. Since proteins vary by at least twofold in their relative Coomassie Blue staining intensity, it is important that more than one standard protein be run and that an "average" Coomassie Blue staining intensity for a given amount of standard protein be used to estimate the amount of protein in the unknown sample. The second approach is to estimate the amount of protein based on the average MALDI-MS response, relative to internal standards, obtained on an aliquot of the resulting in gel trypsin or lysyl endopeptidase digest. The Keck Laboratory routinely estimates the amount of protein digest remaining in samples submitted for MALDI-MS based protein identification.
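The first approach (averaging the staining response of several standard proteins) can be sketched as below. This is a rough illustration under stated assumptions: band intensities are taken to be background-subtracted densitometry values in arbitrary units, the response is assumed to be linear with the amount loaded, and all names and numbers are hypothetical.

```python
MINIMUM_PMOL = 5.0   # minimum recommended amount, per the text
OPTIMAL_PMOL = 25.0  # approximate optimal amount, per the text


def average_response(standards):
    """Mean staining intensity per pmol across several standard proteins.

    standards: list of (pmol_loaded, band_intensity) pairs.
    """
    ratios = [intensity / pmol for pmol, intensity in standards]
    return sum(ratios) / len(ratios)


def estimate_pmol(sample_intensity, standards):
    """Estimate pmol in the unknown band from the averaged standard response."""
    return sample_intensity / average_response(standards)


def assess(pmol):
    """Compare an estimate with the recommended amounts."""
    if pmol < MINIMUM_PMOL:
        return "below recommended minimum"
    if pmol < OPTIMAL_PMOL:
        return "acceptable, below optimum"
    return "at or above optimum"


# Hypothetical standards: three proteins that stain differently per pmol.
standards = [(10, 1000.0), (10, 2000.0), (20, 3000.0)]
amount = estimate_pmol(1500.0, standards)
print(amount, assess(amount))
```

Averaging over several standards reflects the point above that individual proteins differ by at least twofold in staining intensity, so a single standard could bias the estimate.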
 

Preparation of SDS PAGE Samples for Internal Sequencing

The procedure we recommend for obtaining internal amino acid sequences from SDS PAGE-separated proteins is in situ tryptic or lysyl endopeptidase digestion in the gel matrix, followed by elution of the resulting peptides and HPLC separation. Although we also carry out in situ tryptic or lysyl endopeptidase digests on PVDF membranes (followed again by elution and HPLC separation), we strongly recommend the in gel approach, as it avoids the large losses that are sometimes associated with blotting onto PVDF membranes and it is also more compatible with subsequent mass spectrometry. It is important that in gel samples are stained and shipped according to these instructions and that every effort is made to maximize the ratio of protein to total gel volume. In general, we recommend the gel band contain at least 0.05 µg of the desired protein per cubic mm gel volume. Although we recommend an absolute minimum of 5 pmol protein, the quality of the resulting peptide sequence data is improved by going to larger amounts, with the optimum level of protein being about 25 pmol. If there are technical problems that prevent you from reaching the recommended protein/gel volume ratio, you should email a brief description of the problem (e.g., how to concentrate a ml of sample so it can be loaded into a single lane on an SDS polyacrylamide gel) to the Protein Chemistry Section, which may be able to help devise an alternative protocol. If your protein contains significantly more than 10% (w/w) carbohydrate, we recommend the carbohydrate be removed prior to SDS PAGE; otherwise it is likely to hinder enzymatic cleavage. In addition to your sample, you should also submit an approximately equal size piece of gel from a region of the gel that does not contain protein. The latter will be "digested" and subjected to analytical HPLC at no charge and will serve as an important control that will help to quickly identify artifact and trypsin autolysis peaks in the final HPLC chromatogram. In addition to this "negative" control, the Keck Laboratory will also digest (again at no additional charge) a similar amount of transferrin in parallel with your sample to serve as a positive control.
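A quick way to check a band against the 0.05 µg per cubic mm and 5 pmol recommendations above is sketched below. The function names, the band dimensions, and the example protein (50 kDa) are hypothetical, and the band is treated as a simple rectangular block.

```python
MIN_UG_PER_MM3 = 0.05  # recommended minimum protein density in the gel band
MIN_PMOL = 5.0         # absolute minimum amount of protein


def ug_to_pmol(protein_ug, mw_kda):
    """Convert micrograms to picomoles: pmol = (ug / kDa) * 1000."""
    return protein_ug / mw_kda * 1000.0


def protein_density(protein_ug, length_mm, width_mm, thickness_mm):
    """Micrograms of protein per cubic mm of excised gel (rectangular band)."""
    return protein_ug / (length_mm * width_mm * thickness_mm)


def band_is_suitable(protein_ug, mw_kda, length_mm, width_mm, thickness_mm):
    """True if both the amount and the density recommendations are met."""
    enough_protein = ug_to_pmol(protein_ug, mw_kda) >= MIN_PMOL
    dense_enough = protein_density(
        protein_ug, length_mm, width_mm, thickness_mm) >= MIN_UG_PER_MM3
    return enough_protein and dense_enough


# Hypothetical band: 1.25 ug of a 50 kDa protein in a 5 x 1 x 1 mm slice.
print(ug_to_pmol(1.25, 50.0))
print(protein_density(1.25, 5.0, 1.0, 1.0))
print(band_is_suitable(1.25, 50.0, 5.0, 1.0, 1.0))
```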

Estimated Turn-around Time and Cost of Internal Edman Sequencing

Typically, approximately five weeks are required to carry out an in gel tryptic digest, fractionate the resulting peptides by preparative reverse phase HPLC, "screen" about six of the peptide peaks that have the most symmetrical absorbance profile by MALDI-MS and to then subject the first peptide to Edman sequencing.

The following table provides an estimate of the minimum charges for an "average" internal Edman sequencing project. To generate these estimates we have relied on the data in Table I, which summarizes results obtained from more than 200 in gel digests/internal Edman sequencing projects, to estimate variables such as the average number of peptides sequenced/protein and the length of each sequence. As noted in this table, our overall success rate at completing these projects is above 96%. One factor that might result in the project exceeding the charges for conventional internal sequencing would be an unusually complex HPLC profile caused by digesting an extremely large protein (i.e., >100 kD). In this instance, additional MALDI-MS analyses and/or HPLC repurification of individual peaks might well be required to identify and isolate peptides suitable for sequencing.

Estimated Cost for an "Average" Internal Protein Sequencing Project

                                     Service Charge
Description                          Yale      Non-Yale/Non-Profit
In Gel Digest (1)                    $250      $287
Preparative HPLC                     $375      $430
MALDI-MS on 6 Peptides               $420      $486
Edman Sequencing of 2 Peptides       $1285     $1470
  (assuming 25 total residues
  identified)

(1) The in gel digest charge would not apply to those samples that have already been subjected to protein identification as these samples would have been digested already as part of this latter service.
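For budgeting, the line items can be added up as in the sketch below. It assumes the four listed charges are independent and simply additive, which the table does not state explicitly, and the function and dictionary names are hypothetical. The in gel digest charge is dropped for samples already digested as part of a protein identification request, following the footnote.

```python
# (Yale, Non-Yale/Non-Profit) charges in dollars, copied from the table.
CHARGES = {
    "In Gel Digest": (250, 287),
    "Preparative HPLC": (375, 430),
    "MALDI-MS on 6 Peptides": (420, 486),
    "Edman Sequencing of 2 Peptides": (1285, 1470),
}


def estimated_total(yale=True, already_digested=False):
    """Sum the listed charges, assuming they are additive.

    already_digested: True if the sample was digested earlier as part of
    protein identification, so the in gel digest charge does not apply.
    """
    column = 0 if yale else 1
    return sum(
        rates[column]
        for item, rates in CHARGES.items()
        if not (already_digested and item == "In Gel Digest")
    )


print(estimated_total(yale=True))
print(estimated_total(yale=False, already_digested=True))
```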

In Gel Enzymatic Digestion in the Keck Facility

In gel enzymatic digestion is carried out as described generally in Williams and Stone (1997) and Williams et al (1997). Basically, this procedure involves diffusing in modified trypsin (from Promega) or lysyl endopeptidase (Wako), digesting for 24 hrs at 37 degrees C and then extracting the resulting peptides. (See digest procedure for more information).

Preparative Reverse Phase HPLC Fractionation of Enzymatic Digests

All enzymatic digests that are destined for internal Edman sequencing are fractionated on a Hewlett Packard 1090 HPLC system equipped with an Isco Model 2150 Peak Separator and a 1 mm x 25 cm Vydac C-18 (5 micron particle size, 300 Å pore size) reverse phase column equilibrated with 98% buffer A (0.06% TFA) and 2% buffer B (0.052% TFA, 80% acetonitrile) as described in Williams and Stone (1997) and Williams et al (1997). Peptides are eluted at 50 µl/min with the following gradient program: 0-60 min (2-37% B), 60-90 min (37-75% B) and 90-105 min (75-98% B), and are detected by their absorbance at 210 nm. Fractions are collected in capless Eppendorf tubes that are positioned on the tops of 13 x 100 mm test tubes and that are capped within approximately 1 hour of their collection (to prevent evaporation of the acetonitrile). Under these conditions, several fractions have been successfully sequenced even after being stored at 4 degrees C for as long as two years. After loading selected peptides onto our Applied Biosystems sequencer, the original sample tube is rinsed with neat trifluoroacetic acid (to recover peptides that may have adsorbed onto the Eppendorf tube), and the rinse is then overlaid onto the sequencing filter.
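The gradient program above can be written as a small lookup, which is handy for estimating the solvent composition at which a given peak eluted. This sketch assumes each segment is linear (the text gives only the segment endpoints) and ignores any system delay volume; the function names are hypothetical.

```python
# (time in min, % buffer B) breakpoints taken from the gradient program.
GRADIENT = [(0.0, 2.0), (60.0, 37.0), (90.0, 75.0), (105.0, 98.0)]
ACETONITRILE_FRACTION_IN_B = 0.80  # buffer B is 80% acetonitrile
FLOW_UL_PER_MIN = 50.0


def percent_b(t_min):
    """Percent buffer B at time t_min, assuming linear segments."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-105 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_acetonitrile(t_min):
    """Approximate percent acetonitrile delivered at time t_min."""
    return percent_b(t_min) * ACETONITRILE_FRACTION_IN_B


def volume_delivered_ul(t_min):
    """Total mobile phase delivered by time t_min at 50 ul/min."""
    return FLOW_UL_PER_MIN * t_min


print(percent_b(30.0), percent_acetonitrile(30.0), volume_delivered_ul(30.0))
```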

N-Terminal Protein/Peptide Sequencing

N-terminal protein/peptide sequencing is carried out on two Applied Biosystems Procise 494 cLC instruments that are equipped with on-line HPLC for the identification of the resulting phenylthiohydantoin (Pth) amino acid derivatives. Since greater than 80% of higher eukaryotic proteins have been reported to have blocked amino-termini that preclude direct amino acid sequencing, the Keck Facility does not recommend this approach for intact eukaryotic proteins unless sufficient protein is available to first try direct N-terminal sequencing and then, if that fails, to submit an absolute minimum of 5 pmol protein for internal sequencing (preceded by mass spectrometric screening of the digest for "known" proteins). It is important to note that if final purification involves SDS-PAGE, two separate samples should be prepared if both direct N-terminal sequencing of the intact protein and "internal sequencing" of tryptic peptides derived from that protein may be requested. That is, SDS-PAGE samples destined for direct N-terminal sequencing must be electroblotted onto PVDF-type membranes, while the Keck Laboratory recommends that SDS-PAGE purified samples destined for enzymatic cleavage and internal sequencing be submitted in the form of Coomassie Blue stained gel bands.


Before applying the sample, 2.5 pmol of a 17 residue internal sequencing standard peptide which has the formula:

[norleucine-(succinyl-lysine)4]3-norleucine - succinyl-lysine

is first spotted onto the sequencing filter. Since this internal sequencing standard is composed of non-naturally occurring amino acids, it does not interfere with sequencing the unknown peptide/protein. On the contrary, this internal standard provides the on-line monitoring of sequencer function that is so critical to being able to keep these instruments operating at the peak performance that is necessary to be able to routinely sequence at the <pmol level. In addition, when a blocked eukaryotic protein/peptide is encountered, the presence of the sequence of the internal standard assures that the instrument was operating well and that the failure to obtain a sequence was not the result of an instrument malfunction. The use of a 41-mer version of this standard is described in Elliott et al (1993). In general, the instrument is operated based on the manufacturer's recommendations and 1.0 pmol Pth-standards are routinely used for calibration. In addition, the S4 solvent that transfers the Pth-derivative to the HPLC contains 1.2 pmol Pth-norvaline which acts as an internal calibrant to independently monitor transfer to the HPLC.
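Expanding the formula of the internal standard shows where norleucine should appear during a run, which is what makes it usable as a cycle-by-cycle check. The sketch below only expands the formula as written; the abbreviations "Nle" and "SucLys" and the function names are our own.

```python
def internal_standard_sequence():
    """Expand [norleucine-(succinyl-lysine)4]3-norleucine-succinyl-lysine."""
    repeat_unit = ["Nle"] + ["SucLys"] * 4
    return repeat_unit * 3 + ["Nle", "SucLys"]


def cycles_with(residue, sequence):
    """1-based Edman cycle numbers at which the given residue is expected."""
    return [i for i, res in enumerate(sequence, start=1) if res == residue]


sequence = internal_standard_sequence()
print(len(sequence))                 # 17, matching the 17 residue description
print(cycles_with("Nle", sequence))  # [1, 6, 11, 16]
```

A norleucine signal at cycles 1, 6, 11 and 16 with succinyl-lysine in between would therefore indicate that the sequencer is cycling properly even when the sample itself yields no sequence.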


All sequences obtained in the Keck Facility are searched via the BLAST Network Service operated by the National Center for Biotechnology Information. This server accesses the Brookhaven, Swiss, PIR and GenBank databases, is updated daily, and may be accessed via the Web. The E value in the last column on the right of the BLAST search results pages provides a useful criterion to judge the significance of the search. The Expect (E) value is the number of matches one can "expect" to see simply by chance in a database of the current size. The E value decreases exponentially with the Score that is assigned to a match and that is reported in the next to last column on the search page. The lower the E value, the more significant the match. Another indication of a significant homology would (in the case of proteolytic digests) be the presence of a preceding cleavage site. Additional information on interpreting BLAST search results may be found in Altschul et al (1990). Although sequences obtained by the Keck Facility will be accompanied by a Pth Tabulation Table summarizing the approximate yields of Pth-amino acids detected at each cycle, the Keck Facility does not use this data for sequence calling. Hence, unless you specifically request that the data contained in these tables be verified for accuracy, these tables may contain one or more errors. Protein/peptide sequences are determined by overlaying successive Pth chromatograms on a light box, and we strongly recommend that users go through this exercise to better understand the data.
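The exponential relationship between score and E value can be illustrated with the standard BLAST bit-score form, E = m x n x 2^(-S'), where m is the query length, n is the total database length and S' is the bit score. That formula comes from the general BLAST statistics literature rather than from this page, and the lengths and scores below are hypothetical.

```python
def expect_value(bit_score, query_length, database_length):
    """Expected number of chance matches scoring at least bit_score.

    Uses the standard bit-score relationship E = m * n * 2**(-S').
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical 15-residue peptide searched against 1e9 database residues.
for score in (30, 40, 50):
    print(score, expect_value(score, query_length=15, database_length=1e9))
```

Each additional bit of score halves the E value, and a larger database raises the E value for the same score, which is why the text defines E relative to a database "of the current size".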

Copyright © 2003, Yale University, New Haven, Connecticut, USA. All rights reserved.

Last modified: 23-Oct-2006 (GB)