Protein Expression and Purification 46 (2006) 166–171

www.elsevier.com/locate/yprep

REFOLD: An analytical database of protein refolding methods

Michelle K.M. Chow a, Abdullah A. Amin a,b, Kate F. Fulton a, James C. Whisstock a,b,c, Ashley M. Buckle a,b,*, Stephen P. Bottomley a,*

a Department of Biochemistry and Molecular Biology, P.O. Box 13D, Monash University,1 Vic. 3800, Australia
b Victorian Bioinformatics Consortium,2 P.O. Box 53, Monash University, Clayton, Vic. 3800, Australia
c ARC Centre for Structure and Functional Microbial Genomics, Monash University,1 Clayton, Vic. 3800, Australia

Received 14 June 2005, and in revised form 19 July 2005
Available online 15 August 2005

Abstract

The expression and harvesting of proteins from insoluble inclusion bodies by solubilization and refolding is a technique commonly used in the production of recombinant proteins. To bring clarity to the large and widespread quantity of published protein refolding data, we have recently established the REFOLD database (http://refold.med.monash.edu.au), which is a freely available, open repository for protocols describing the refolding and purification of recombinant proteins. Refolding methods are currently published in many different formats and resources; REFOLD provides a standardized system for the structured reporting and presentation of these data. Furthermore, data in REFOLD are readily accessible using a simple search function, and the database also enables analyses which identify and highlight particular trends between suitable refolding and purification conditions and specific protein properties. This information may in turn serve to facilitate the rational design and development of new refolding protocols for novel proteins. There are approximately 200 proteins currently listed in REFOLD, and it is anticipated that with the continued contribution of data by researchers this number will grow significantly, thus strengthening the emerging trends and patterns and making this database a valuable tool for the scientific community.
© 2005 Elsevier Inc. All rights reserved.

Keywords: Inclusion bodies; Protein refolding; Renaturation; Database

* Corresponding authors. Fax: +61 3 99053703 (S.P. Bottomley).
E-mail addresses: ashley.buckle@med.monash.edu.au (A.M. Buckle), steve.bottomley@med.monash.edu.au (S.P. Bottomley).
1 Web: www.med.monash.edu.au.
2 Web: www.vicbioinformatics.com.

1046-5928/$ - see front matter © 2005 Elsevier Inc. All rights reserved.
doi:10.1016/j.pep.2005.07.022

Biomedical and biotechnical research often involves the need to purify recombinant proteins in the simplest and most efficient manner possible, whilst maximizing both the yield and quality of protein purified. The use of recombinant techniques and bacterial systems facilitates the expression of proteins on a large scale; however, a key limitation of such systems is often the insolubility of the target protein, which may be expressed largely as non-functional aggregates in inclusion bodies [1,2]. Despite the development of various growth conditions, bacterial strains, expression systems, and solubilizing fusion partners to increase and maximize protein solubility [1,3–5], for some proteins these strategies still prove to be ineffective or highly inefficient. On the other hand, the overexpression of insoluble proteins can be exploited, because proteins produced in inclusion bodies are often very pure. As such, the solubilization and unfolding of aggregated proteins, followed by refolding and a simple one-step purification, either sequentially or concurrently, in many cases proves to be the most direct and effective method of producing highly purified protein.
In the realm of biochemical research there is a plethora of documented and anecdotal data regarding the techniques of refolding proteins in vitro. The various procedures and methods involved have been extensively reviewed [6–11]; however, until recently there has been no central repository for the actual experimental data, nor any logical process by which optimal conditions may be gleaned for proteins with specific characteristics. Thus, for a researcher working with a novel protein, finding the most suitable conditions for expression, solubilization, and refolding a priori can be a relatively random process. To facilitate this process, we have recently developed the REFOLD database (http://refold.med.monash.edu.au), which has been designed to add structure to the deposition and retrieval of refolding data [12]. REFOLD is intended to provide a valuable resource for researchers in developing new protocols for the purification of proteins. To date (May 2005), we have collated the details of approximately 200 published protocols involving the overexpression, solubilization, and refolding of recombinant proteins. We have also annotated entries with data relating to the properties of the protein, such as structural data, isoelectric point, molecular weight, oligomeric state, and the presence of disulfide bonds, as well as references and links to other knowledge databases.
REFOLD provides a wealth of information in different ways. At its most basic level, it provides a detailed catalogue of successful refolding and purification methods for a wide range of proteins in a readily accessible and easy-to-read format. Beyond this, detailed annotation allows the relationships between protein characteristics and refolding protocols to be delineated. Despite the youth of this resource, certain trends are already becoming evident. As REFOLD continues to grow, these emerging patterns will become statistically stronger and may be useful in the rational design of new protocols.
We would like to advocate a standardized data entry and reporting system for refolding data as demonstrated in REFOLD, such that this information may be readily accessible and available for all researchers in a streamlined format. Here, we describe the implementation and details of this system and examine some of the early trends emerging from the data in REFOLD.

Data entry

REFOLD is freely available (http://refold.med.monash.edu.au), and free registration on the website allows users to enter their own protein refolding protocols into the database. Data are entered using a simple 1-page form (Fig. 1), entailing details about the protein of interest as well as the refolding and purification procedures. This form is logically structured, such that properties of the protein are entered first, followed by details of expression, and finally the refolding methodology. This allows for a standardized format, and thus provides a streamlined reference catalogue.
Upon entry of a new protein into the database, basic details regarding properties of the PROTEIN, such as chain length, pI, molecularity, disulfide bonds, molecular weight, and species are recorded. This part of the form also provides a cross-reference to the UniProt [13] and SCOP [14] databases and the SCOP family to which the protein belongs (if known). The entry of protein traits is then followed by details of the paper in which the protocol was originally published, with the journal name, paper title, publication details, and PubMed cross-reference.
Details of protein EXPRESSION comprise information such as the cell type (bacterial, yeast, insect, etc.) and strain in which the protein is expressed, as well as the expression vector used to encode the protein. The cell density at which protein expression is induced, as measured by optical density at 600 nm (OD600), is also entered into the database, as well as the time and temperature of expression.
The entry of data concerning REFOLDING procedures is one of the central aspects of this database. The form provides for details of the refolding method, that is, the technique used to refold the protein, as well as the various buffer conditions used in the protocol. These include the solubilization buffer in which the protein is unfolded, the wash buffer which is used to wash inclusion bodies and remove cellular debris and loosely bound proteins, and finally the refolding buffer in which the protein is refolded. Details regarding refolding conditions, including time, temperature, pH, redox reagents, and chaperones (if used), are also specified, as well as other variables such as pre-purification steps prior to refolding, refolding yield, and purity. There is also an entry point for a comprehensive description of the protocol for expression, refolding, and purification as would be detailed in a paper.
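The three-part structure of the entry form (protein, then expression, then refolding) can be pictured as a nested record. The following is only a minimal sketch in Python: the class names, field names, and units (hours, degrees Celsius, daltons) are assumptions made for illustration and are not taken from the actual REFOLD schema, and the example values are placeholders rather than a real database entry.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProteinInfo:
    """Protein properties, entered first on the form."""
    name: str
    species: str
    chain_length: int                     # residues
    molecular_weight_da: float            # assumed unit: daltons
    isoelectric_point: float              # pI
    molecularity: str                     # e.g. "monomer"
    disulfide_bonds: int
    uniprot_id: Optional[str] = None      # cross-reference, if known
    scop_family: Optional[str] = None     # cross-reference, if known
    pubmed_id: Optional[str] = None       # publication cross-reference


@dataclass
class ExpressionInfo:
    """Expression details, entered second."""
    cell_type: str                        # bacterial, yeast, insect, ...
    strain: str
    vector: str
    induction_od600: float
    time_h: float                         # assumed unit: hours
    temperature_c: float                  # assumed unit: degrees Celsius


@dataclass
class RefoldingInfo:
    """Refolding methodology, entered last."""
    method: str
    solubilization_buffer: str
    wash_buffer: str
    refolding_buffer: str
    time_h: float
    temperature_c: float
    ph: float
    redox_reagents: List[str] = field(default_factory=list)
    chaperones: List[str] = field(default_factory=list)
    pre_purification: Optional[str] = None
    yield_percent: Optional[float] = None
    purity_percent: Optional[float] = None
    protocol_text: str = ""               # free-text protocol description


@dataclass
class RefoldRecord:
    protein: ProteinInfo
    expression: ExpressionInfo
    refolding: RefoldingInfo


# Placeholder values for illustration only; not a real REFOLD entry.
example = RefoldRecord(
    protein=ProteinInfo("Example protein", "Homo sapiens", 250, 28000.0, 6.5, "monomer", 2),
    expression=ExpressionInfo("bacterial", "BL21", "example-vector", 0.6, 4.0, 37.0),
    refolding=RefoldingInfo("dilution", "urea buffer", "wash buffer", "Tris buffer", 16.0, 4.0, 8.0),
)
```

Keeping optional items (cross-references, yield, purity, chaperones) as defaults mirrors the "if known" and "if used" wording of the form description.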

Data retrieval and analysis

Users can easily access data in the REFOLD database by executing a simple search on any chosen term, or alternatively, an advanced search can be performed according to more specific parameters. Search results can be represented either in a table format or in a drop-down tree-view mode with the records sorted according to structural classification. The tabulated results provide details of a number of sortable parameters, such as various protein properties, SCOP family, and refolding method and conditions. Specified links provide access to full refolding records, whilst selecting the name of a protein leads to more detailed information about the protein itself. Additionally, following links from the search results page to other parameters will lead to refolding records sharing that property. A PubMed cross-reference also allows users access to the original article.
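The retrieval modes described above (a simple term search, an advanced search on specific parameters, and a tree view grouped by structural classification) can be sketched as follows. This is an illustrative Python sketch only, not the REFOLD implementation: the function names, the dictionary field names, and the four sample records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records; field names and values are illustrative only.
RECORDS = [
    {"protein": "Protein A", "scop_family": "Serpins", "method": "dilution", "ph": 7.5},
    {"protein": "Protein B", "scop_family": "Kringle modules", "method": "dilution", "ph": 8.0},
    {"protein": "Protein C", "scop_family": "Serpins", "method": "dialysis", "ph": 7.8},
    {"protein": "Protein D", "scop_family": None, "method": "dialysis", "ph": 8.5},
]


def simple_search(records, term):
    """Return records in which any field contains the term (case-insensitive)."""
    needle = term.lower()
    return [r for r in records
            if any(needle in str(value).lower() for value in r.values())]


def advanced_search(records, **criteria):
    """Return records whose named fields equal the given values exactly."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


def tree_view(records):
    """Group protein names under their SCOP family, as in a tree display."""
    tree = defaultdict(list)
    for r in records:
        tree[r["scop_family"] or "Unclassified"].append(r["protein"])
    return {family: sorted(names) for family, names in sorted(tree.items())}
```

For instance, `advanced_search(RECORDS, scop_family="Serpins", method="dialysis")` returns only the record for "Protein C", while `tree_view(RECORDS)` lists unclassified proteins under a separate heading.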

Fig. 1. REFOLD data entry form. An example of an existing record is shown. Entry of information into the database takes place via this 1-page form, detailing information about the protein, expression conditions, and the refolding protocol. When previously unentered data are added, drop-down fields allow for the inclusion of pertinent information, as shown under “Protein” details in this example.

With the collation of many protocols in REFOLD, the database provides an excellent opportunity to examine the assembled data and delineate any trends that may be instructive to the design of new protocols. Although REFOLD is a relatively new resource, some early patterns can already be observed in both the expression and refolding data.
To date, all of the proteins entered into the database have been expressed in bacterial (Escherichia coli) cells. This is not surprising, given that bacterial expression is generally considered to be the simplest and most economical method of producing recombinant protein. Furthermore, when purifying proteins from inclusion bodies, optimization of protein solubility becomes irrelevant, thus reducing the need to adopt more complex expression systems. The most commonly employed E. coli strain for protein expression is BL21, which has been used in approximately 68% of the protocols entered so far; this, again, is consistent with the view of the BL21 strain as a robust standard expression system [2]. Expression of most proteins is induced in the mid-log phase with OD600 ranging between 0.4 and 0.8, which is in accord with standard expression protocols, although in some large-scale fermentations the OD600 at induction has been considerably greater than this, even reaching up to 50 in one case [15].
As was evident early in the life of REFOLD [12], by far the most frequent method of refolding proteins from solubilized inclusion bodies is simple dilution, which accounts for 40% of entries in the database (Fig. 2A). The second-most common technique is dialysis, and taken together these methods account for three-quarters of the protocols recorded. This suggests that in most cases the simplest methods may be sufficient to yield adequate quantities of protein without the need to complicate the protocol further. Beyond this, column-assisted refolding using a nickel-chelating resin and gel filtration chromatography are the next most commonly used methods. It is also interesting to note that the majority (68%) of proteins are expressed without a fusion tag (Fig. 2B). Of the proteins which are expressed with fusion tags, the most common tag is a C- or N-terminal hexahistidine (his6) tag, consistent with the fact that column-assisted refolding on a nickel-chelating resin is the third-most common refolding method.
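A percentage breakdown of this kind is a simple tally over the method recorded for each entry. The sketch below shows one way to compute it in Python; the function name is hypothetical and the input list is a made-up toy example of ten entries, not the REFOLD data behind Fig. 2.

```python
from collections import Counter


def method_shares(methods):
    """Map each refolding method to its percentage share of all entries."""
    counts = Counter(methods)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders methods from most to least frequent.
    return {method: 100.0 * n / total for method, n in counts.most_common()}


# Toy input, not REFOLD data: ten made-up entries.
toy_methods = ["dilution"] * 4 + ["dialysis"] * 3 + ["column"] * 2 + ["high pressure"]
shares = method_shares(toy_methods)
```

The same function applies unchanged to any other categorical field, such as the fusion tag construct.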

Fig. 2. Analysis of data in REFOLD. Pie charts showing percentage breakdown of proteins entered in the database by (A) refolding technique and (B) fusion tag construct.

A number of proteins are refolded in the presence of additives such as arginine (39 entries) and glycerol (27 entries), which are both compounds commonly used to aid the folding of proteins [6,7,11,16]. In cases where arginine is used, it is generally present in the refolding buffer at concentrations ranging from 0.25 to 1.0 M, and the buffer pH is always above 7.4. In contrast, glycerol, which is generally used at concentrations between 1 and 30% (v/v), has been included in buffers with pH values as low as 5.5. Other additives that are known to assist protein refolding have been used in a few instances, for example, ethylene glycol (50% v/v), magnesium chloride (5–200 mM), and glycine (0.5–1.0 M). Molecular chaperones have been employed in only a few cases, and in these situations GroEL or its apical domain (GroEL minichaperone) has been used.
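The concentration ranges quoted above lend themselves to a quick sanity check when drafting a refolding buffer. The Python sketch below encodes only the ranges stated in the text; the function name and dictionary layout are assumptions for illustration, and a value outside a range simply means it falls outside what has been recorded, not that it would fail.

```python
# Ranges transcribed from the text: (low, high, unit).
REPORTED_RANGES = {
    "arginine": (0.25, 1.0, "M"),
    "glycerol": (1.0, 30.0, "% v/v"),
    "magnesium chloride": (5.0, 200.0, "mM"),
    "glycine": (0.5, 1.0, "M"),
}


def within_reported_range(additive, amount):
    """True or False for a listed additive; None if the additive is not listed."""
    entry = REPORTED_RANGES.get(additive.lower())
    if entry is None:
        return None
    low, high, _unit = entry
    return low <= amount <= high
```

The amount is assumed to be given in the unit stored alongside each range; no unit conversion is attempted.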
It is in the provision of data about protein properties in conjunction with detailed refolding procedures that REFOLD has its greatest potential as a predictive knowledge database. The combination of structural and technical information allows for the delineation of relationships between successful refolding methodologies and particular protein traits. Such trends may then be instructive to the design of new protocols for proteins of similar structure. For example, a number of proteins in the database cluster in structural families according to their SCOP classification (Table 1). It can be seen that the refolding of E-set domains of sugar-utilizing enzymes has been undertaken within a relatively narrow range of pH values, as is the case for eukaryotic proteases and transforming growth factor-β proteins. Therefore, when developing refolding procedures for other proteins belonging to these families, it would be logical to be guided by the respective pH ranges for suitable buffer conditions. Similarly, when considering the refolding method, all of the Kringle module proteins entered so far have been refolded by dilution, while a number of other protein families are refolded by either dilution or dialysis only. Most compellingly, of the 12 V-set domain proteins, all have been successfully refolded by dilution and/or dialysis (Table 1). Therefore, for these groups of proteins, these simple methods would be appropriate starting points for the design of refolding protocols.
Thus, even with the moderate amount of data currently available in the database, some early patterns can already be identified. As the amount of data entered into REFOLD increases, the number and statistical robustness of such trends will continue to grow, thus strengthening the ability to define and predict appropriate conditions for new protocols.

Table 1
Refolding data for SCOP family clusters in REFOLD

SCOP family                                   | No. entries | Refolding pH | Refolding method (a)
Caspase-like proteases                        | 3           | 7.8–10.0     | Dialysis (2); Dilution/dialysis (1)
E-set domains of sugar-utilizing enzymes      | 3           | 7.8–8.5      | Dialysis (2); Dilution (1)
Eukaryotic proteases                          | 4           | 8.5–8.8      | Dilution (2); Diafiltration (1); Size exclusion chromatography (1)
Kringle modules                               | 3           | 7.6–9.0      | Dilution (3)
Long-chain cytokines                          | 7           | 5.0–8.8      | Dialysis (3); Dilution (1); Diafiltration (1); High pressure (1); Size exclusion chromatography (1)
MHC antigen-recognition domain                | 3           | 6.6–8.0      | Dilution (1); Oxidative chromatography (2)
Papain-like cysteine proteinases              | 5           | 7.0–10.7     | Dialysis (3); Dilution (2)
Pepsin-like proteases                         | 3           | 7.0–10.5     | Dialysis (1); Dilution (2)
Ribonuclease A-like                           | 5           | 7.5–8.6      | Dialysis (2); Dilution (2); Size exclusion chromatography (1)
Serpins                                       | 5           | 5.6–7.8      | Dilution (2); Dilution/dialysis (2); Size exclusion chromatography (1)
Serum albumin-like                            | 4           | 8.5–10       | Dialysis (3); Dilution (1)
Short chain cytokines                         | 6           | 7.5–9.5      | Dialysis (3); Dilution/dialysis (2); Dilution/column chromatography (1)
Transforming growth factor (TGF)-β            | 3           | 8.0–8.5      | Dilution (2); Nickel chelating chromatography (1)
V-set domains (antibody variable domain-like) | 12          | 7.5–9.5      | Dilution (6); Dialysis (3); Dilution/dialysis (3)
(a) Number in brackets indicates the number of protein entries for which the specified refolding method has been used.
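The way Table 1 is meant to guide protocol design, by looking up a protein's SCOP family and starting from the pH range and methods already used for it, can be sketched as a small lookup. The Python code below is an illustrative sketch: the function name and dictionary layout are hypothetical, and only the five families singled out in the text are transcribed from Table 1 (the remaining rows would be added in the same way).

```python
# Subset of Table 1: pH range and number of entries per refolding method.
TABLE1 = {
    "E-set domains of sugar-utilizing enzymes": {
        "ph": (7.8, 8.5),
        "methods": {"Dialysis": 2, "Dilution": 1},
    },
    "Eukaryotic proteases": {
        "ph": (8.5, 8.8),
        "methods": {"Dilution": 2, "Diafiltration": 1,
                    "Size exclusion chromatography": 1},
    },
    "Kringle modules": {
        "ph": (7.6, 9.0),
        "methods": {"Dilution": 3},
    },
    "Transforming growth factor (TGF)-beta": {
        "ph": (8.0, 8.5),
        "methods": {"Dilution": 2, "Nickel chelating chromatography": 1},
    },
    "V-set domains (antibody variable domain-like)": {
        "ph": (7.5, 9.5),
        "methods": {"Dilution": 6, "Dialysis": 3, "Dilution/dialysis": 3},
    },
}


def starting_point(family):
    """Suggest a pH range and methods (most used first) for a SCOP family."""
    row = TABLE1.get(family)
    if row is None:
        return None  # family not in this transcribed subset
    # Rank by descending entry count, breaking ties alphabetically.
    ranked = sorted(row["methods"].items(), key=lambda kv: (-kv[1], kv[0]))
    return {
        "ph_range": row["ph"],
        "methods": [name for name, _count in ranked],
        "n_entries": sum(row["methods"].values()),
    }
```

With so few entries per family, the output is best read as a starting point for screening rather than a prediction, in keeping with the caution expressed in the text.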

Additional features in REFOLD

Aside from the data entry and retrieval functions, the REFOLD website contains extra features which enhance its value as a resource. For example, the provision of a graphical breakdown of various parameters for refolding techniques and protein properties allows particular trends and patterns to be observed at a glance, providing information about the most common features and methods used.
REFOLD also offers an extra opportunity for user input by providing a space for commentary on existing data at the end of each record. This presents an opportunity for scientific discourse and exchange between researchers, as users can comment on particular protocols or proteins which they may have employed themselves based on the data supplied in REFOLD. Such remarks could address issues such as the usefulness of a given protocol, its application and/or adaptation to a different protein, comments on the protein itself, or other related topics. As such, users can contribute feedback on existing data and add further information which may be useful to other researchers. Hence, through this commentary capacity, REFOLD offers an extra level of data annotation, as well as providing an open forum for discussion and dialogue between scientists.

Future directions

It is anticipated that REFOLD will become an invaluable resource for protein researchers. In order for the database to flourish and maximize its usefulness, the entry of new data is required. We strongly encourage and welcome the deposition of published data by all researchers in the standardized format as described in this paper. With the contribution of more records, it is envisaged that the accumulated knowledge in the database will combine to produce a comprehensive picture of the most appropriate refolding techniques for different proteins, and definitive methods and conditions will emerge as being appropriate for polypeptides with specific properties. This will enable a strong predictive capacity for REFOLD to evolve, whereby the rational design of refolding protocols may be informed and facilitated by the knowledge of a protein’s characteristics and suitable methodologies appropriate for those properties.
The benefits of entering protocols into REFOLD are multi-fold, from a number of different perspectives. For a researcher developing a refolding procedure for their protein, the database can impart valuable and relevant information which is easily accessible, not only based on protein properties but also through direct links to already proven and published protocols. For scientists depositing records in the database, REFOLD provides an opportunity to disseminate one’s published work to a wider audience. For REFOLD itself, the contribution of more records increases the amount of data available and thus strengthens the analytical capacity of the database. And finally, for the research community in general, the continued expansion of REFOLD will see it become an invaluable tool, providing a vast repository of collective knowledge with a standardized format in a centralized and interactive resource.

Acknowledgments

The authors acknowledge the contribution of all the researchers whose published data have been entered into REFOLD. This work was supported by grants from the National Health and Medical Research Council, the Victorian State Government, and the Victorian Partnership for Advanced Computing. S.P.B. is a Monash University Senior Logan Fellow and NHMRC R.D. Wright Fellow. J.C.W. is a Monash University Logan Fellow and NHMRC Senior Research Fellow. K.F.F. is an NHMRC Peter Doherty Fellow.

References

[1] H.P. Sorensen, K.K. Mortensen, Soluble expression of recombinant proteins in the cytoplasm of Escherichia coli, Microb. Cell Fact. 4 (2005) 1–8.
[2] H.P. Sorensen, K.K. Mortensen, Advanced genetic strategies for recombinant protein expression in Escherichia coli, J. Biotechnol. 115 (2005) 113–128.
[3] V. De Marco, G. Stier, S. Blandin, A. de Marco, The solubility and stability of recombinant proteins are increased by their fusion to NusA, Biochem. Biophys. Res. Commun. 322 (2004) 766–771.
[4] M.R. Dyson, S.P. Shadbolt, K.J. Vincent, R.L. Perera, J. McCafferty, Production of soluble mammalian proteins in Escherichia coli: identification of protein features that correlate with successful expression, BMC Biotechnol. 4 (2004) 32–48.
[5] R.B. Kapust, D.S. Waugh, Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused, Protein Sci. 8 (1999) 1668–1674.
[6] L.D. Cabrita, S.P. Bottomley, Protein expression and refolding: a practical guide to getting the most out of inclusion bodies, Biotechnol. Annu. Rev. 10 (2004) 31–50.
[7] B. Fahnert, H. Lilie, P. Neubauer, Inclusion bodies: formation and utilisation, Adv. Biochem. Eng. Biotechnol. 89 (2004) 93–142.
[8] A.P. Middelberg, Preparative protein refolding, Trends Biotechnol. 20 (2002) 437–443.
[9] A.K. Panda, Bioprocessing of therapeutic proteins from the inclusion bodies of Escherichia coli, Adv. Biochem. Eng. Biotechnol. 85 (2003) 43–93.
[10] K. Tsumoto, D. Ejima, I. Kumagai, T. Arakawa, Practical considerations in refolding proteins from inclusion bodies, Protein Expr. Purif. 28 (2003) 1–8.
[11] E.D.B. Clark, Refolding of recombinant proteins, Curr. Opin. Biotechnol. 9 (1998) 157–163.
[12] A.M. Buckle, G.L. Devlin, R.A. Jodun, K.F. Fulton, N. Faux, J.C. Whisstock, S.P. Bottomley, The matrix refolded, Nat. Methods 2 (2005) 3.
[13] R. Apweiler, A. Bairoch, C.H. Wu, W.C. Barker, B. Boeckmann, S. Ferro, E. Gasteiger, H. Huang, R. Lopez, M. Magrane, M.J. Martin, D.A. Natale, C. O’Donovan, N. Redaschi, L.S. Yeh, UniProt: the Universal Protein knowledgebase, Nucleic Acids Res. 32 (2004) D115–D119.
[14] A.G. Murzin, S.E. Brenner, T. Hubbard, C. Chothia, SCOP: a structural classification of proteins database for the investigation of sequences and structures, J. Mol. Biol. 247 (1995) 536–540.
[15] A. Bazarsuren, U. Grauschopf, M. Wozny, D. Reusch, E. Hoffmann, W. Schaefer, S. Panzner, R. Rudolph, In vitro folding, functional characterization, and disulfide pattern of the extracellular domain of human GLP-1 receptor, Biophys. Chem. 96 (2002) 305–318.
[16] K. Tsumoto, M. Umetsu, I. Kumagai, D. Ejima, J.S. Philo, T. Arakawa, Role of arginine in protein refolding, solubilization, and purification, Biotechnol. Prog. 20 (2004) 1301–1308.