Protein Science (1993), 2, 41-54. Cambridge University Press. Printed in the USA.
Copyright © 1993 The Protein Society
`
`Disulfide bonding patterns and protein topologies
`
`CRAIG J. BENHAM AND M. SALEET JAFRI
`Department of Biomathematical Sciences, Mount Sinai School of Medicine, New York, New York 10029
`(RECEIVED August 4, 1992; REVISED MANUSCRIPT RECEIVED September 10, 1992)
`
`Abstract
`This paper examines the topological properties of protein disulfide bonding patterns. First, a description of these
`patterns in terms of partially directed graphs is developed. The topologically distinct disulfide bonding patterns
`available to a polypeptide chain containing n disulfide bonds are enumerated, and their symmetry and reducibil-
`ity properties are examined. The theoretical probabilities are calculated that a randomly chosen pattern of n bonds
`will have any combination of symmetry and reducibility properties, given that all patterns have equal probability
`of being chosen. Next, the National Biomedical Research Foundation protein sequence and Brookhaven National
`Laboratories protein structure (PDB) databases are examined, and the occurrences of disulfide bonding patterns
`in them are determined. The frequencies of symmetric and/or reducible patterns are found to exceed theoretical
`predictions based on equiprobable pattern selection. Kauzmann’s model, in which disulfide bonds form during
`random encounters as the chain assumes random coil conformations, finds that bonds are more likely to form
`with near neighbor cysteines than with remote cysteines. The observed frequencies of occurrence of disulfide pat-
`terns are found here to be virtually uncorrelated with the predictions of this alternative random bonding model.
`These results strongly suggest that disulfide bond pattern formation is not the result of random factors, but in-
`stead is a directed process.
Finally, the PDB structure database is examined to determine the extrinsic topologies of polypeptides containing
disulfide bonds. A complete survey of all structures in the database found no instances in which two loops
`formed by disulfide bonds within the same polypeptide chain are topologically linked. Similarly, no instances are
`found in which two loops present on different polypeptide chains in a structure are catenated. Further, no exam-
`ples of topologically knotted loops occur. In contrast, pseudolinking has been found to be a relatively frequent
`event. These results show a complete avoidance of nontrivial topological entanglements that is unlikely to be the
`result of chance events. A hypothesis is presented to account for some of these observations.
`Keywords: covalent bond topology; entanglements; knots; protein structure
`
`Topology is the branch of mathematics that studies those
`properties of shape that remain invariant under continu-
`ous deformations. Topological properties naturally sub-
`divide into two types - those that derive from the intrinsic
`structure of the object under study, and those that relate
`to how that structure is embedded in space. For example,
`a closed circle has a different intrinsic topological struc-
`ture than a finite line segment. One can convert a circle
`into a line segment only by introducing a cut, which is a
`discontinuous deformation. As these two structures have
`different intrinsic topologies, one naturally might expect
them also to have different ranges of possible realizations
`in space. All embeddings of a finite linear segment in
`three-dimensional space are topologically equivalent in
`
`Reprint requests to: Craig J. Benham, Department of Biomathemat-
`ical Sciences, Box 1023, Mount Sinai School of Medicine, 1 Gustave
`Levy Place, New York, New York 10029.
`
`the sense that any one can be converted to any other by
`a continuous deformation. In particular, a segment can-
`not be topologically knotted, because any candidate knot
`can be undone without recourse to cutting. One need only
`pass the ends of the segment back through whatever loops
`have been formed, which is a continuous deformation. It
`follows that all geometric shapes having the topological
`structure of finite line segments are topologically equiv-
`alent, both intrinsically and in all spatial embeddings. In
`contrast, a closed circular curve can be knotted. Differ-
`ent knot types cannot be interconverted without introduc-
`ing transient cuts. Two circular curves having distinct
`knot types differ only in the way they are embedded in
`space. Both have the same intrinsic topology, that of a
`closed circle.
`The pattern of covalent connections among amino acid
`residues imparts topological structure to a polypeptide
`chain. (Small loops, such as those occurring in aromatic
`
`Amgen Exhibit 2052
`Apotex Inc. et al. v. Amgen Inc. et al., IPR2016-01542
`rings, fused rings, and similar local structures, commonly
`are disregarded because their topologies show no variabil-
`ity.) Although a polypeptide chain is synthesized as a lin-
`ear polymer, it need not always have the trivial intrinsic
`topology of a line segment. The formation of covalent di-
`sulfide bonds between cysteine residues within a polypep-
`tide chain produces circular loops
`of covalent bonds
`(Thornton, 1981). These covalent self-associations impart
`nontrivial intrinsic topology to the polypeptide. Molecules
`containing such covalent loops also may have nontrivial
`embedded topologies. Possible examples include knotted
`loops, interlinked pairs of loops on the same polymeric
`backbone, catenanes between loops on different back-
`bones, as well as other forms of entanglement (Crippen,
`1974, 1975). As this paper treats only topological prop-
`erties, loop penetrations that are not topological in char-
`acter are not considered, although
`these also may be
`important in practice (Connolly et al., 1980; Klapper &
`Klapper, 1980).
`The topological state of a molecule constrains its geom-
`etry in specific and potentially important ways (Meiro-
`vitch & Scheraga, 1981a,b; Kikuchi et al., 1986, 1989). A
`protein can fold only into those conformations that are
`consistent with its topology. This limits the portion
`of
`conformation space that a molecule containing disulfide
`bonds may sample. The change in entropy consequent on
this restriction can stabilize the conformation, as demonstrated
by the increase in denaturation temperature observed
when a disulfide bond is introduced (Johnson
`et al., 1978). Moreover, the folding pathway of a protein
`may involve the transient or permanent formation of spe-
`cific disulfide bonds that constrain the molecule in a way
`that directs it toward its correct final conformation
`(Creighton & Goldenberg, 1984; Scheraga et al., 1984;
`Weissman & Kim, 1991).
`
`Disulfide bonding patterns and intrinsic topologies
`Consider the distinct disulfide bonding patterns (i.e., states
`of connectivity) available to a polypeptide containing M
`cysteine residues. The backbone of this polymeric chain
`consists of the sequence of residues covalently connected
through peptide bonds, which are oriented in the N → C
`direction. Covalent disulfide bonds may form
`between
`pairs of cysteines, with any single cysteine residue partic-
`ipating in at most one such bond. These disulfide bonds
`possess a chemical symmetry that does not endow them
`with a natural orientation.
`A disulfide bonding pattern has the mathematical struc-
`ture of a partially directed graph. The vertices of this
`graph are the C- and N-termini of the chain, plus each of
`the cysteine residues that participates in a disulfide bond.
`The edges of this graph are the covalent connections be-
`tween these vertices. The polypeptide backbone of the
molecule is comprised of directed edges, each oriented according
to its N → C chemical direction, forming a unique, directed,
unbranched tree that spans every vertex. Because disulfide bonds are unoriented, the edges
`corresponding to them are undirected. The end vertices
`have order one, and all others have order three. (The or-
`der of a vertex, also called its valence by graph theorists,
`is the number of edges that are connected to it.) The three
`edges impinging on an interior vertex have distinct prop-
`erties: one edge is directed into the vertex, one is directed
`away from the vertex, and the edge corresponding to the
`disulfide bond
`is undirected. This formulation differs
`from earlier graph-theoretic treatments of disulfide bond-
`ing patterns in that here the direction corresponding to the
`chemical orientation of the polymeric backbone
`is in-
`cluded. Earlier approaches used undirected graphs only
`(Walba, 1985; Mao, 1989).
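As a minimal sketch of the graph just described, the Python below builds the directed backbone edges and the undirected disulfide edges for a pattern, then counts the order of each vertex. The function names and the vertex numbering (0 for the N-terminus, 1 to 2n for the bonded cysteines, 2n + 1 for the C-terminus) are assumptions made for illustration and are not taken from the paper.

```python
def pattern_graph(bonds):
    """Build the partially directed graph of a disulfide bonding pattern.

    bonds: list of (i, j) pairs over the bonded cysteines, numbered 1..2n
    along the chain. Vertex 0 is the N-terminus and vertex 2n + 1 is the
    C-terminus (a hypothetical numbering chosen for this sketch).
    """
    n = len(bonds)
    cysteines = sorted(c for pair in bonds for c in pair)
    if cysteines != list(range(1, 2 * n + 1)):
        raise ValueError("each cysteine 1..2n must take part in exactly one bond")
    # Backbone: directed edges k -> k + 1, oriented N to C.
    directed = [(k, k + 1) for k in range(2 * n + 1)]
    # Disulfide bonds: undirected edges, stored with the smaller end first.
    undirected = [tuple(sorted(pair)) for pair in bonds]
    return directed, undirected


def vertex_orders(directed, undirected):
    """Count the edges meeting each vertex, ignoring direction."""
    orders = {}
    for a, b in list(directed) + list(undirected):
        orders[a] = orders.get(a, 0) + 1
        orders[b] = orders.get(b, 0) + 1
    return orders


# Two-bond pattern in which cysteine 1 bonds to 3 and cysteine 2 bonds to 4.
directed, undirected = pattern_graph([(1, 3), (2, 4)])
orders = vertex_orders(directed, undirected)
print(orders)
```

For this pattern the two end vertices have order one and the four cysteine vertices have order three, as stated in the text.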
`Disulfide bonding patterns may be depicted by draw-
`ing the polymer backbone as a straight line, oriented left
to right in the N → C direction, with the disulfide bonds
`shown as interconnections between the vertices corre-
`sponding to the pairs of cysteine residues involved. When
`not indicated by arrows, the backbone orientation always
`is chosen to be left to right as described. When necessary
`the vertices may be numbered in the order they are en-
`countered as the backbone is traversed in the direction as-
`signed by its orientation. For example, the three different
`patterns containing two disulfide bonds are shown in Fig-
`ure 1. Because we are concerned with topological prop-
`erties relating to connectivity, not at present with metric
`properties, the numbers of residues in each part
`of the
`polymer chain are not relevant.
`An alternative representation of a pattern labels the di-
`sulfide bonds alphabetically in the order they are first en-
`countered, starting from the N-terminus. The pair of
`cysteines connected by a particular bond are given its al-
`phabetic label. An n-bond pattern is specified by giving
the sequence of letters associated with the bonded cysteines, as they are encountered when the chain is
traversed starting from the N-terminus. Thus, each pattern containing n disulfide bonds determines a
sequence of length 2n whose entries are the first n letters of the alphabet, each of which appears twice,
with new letters appearing in alphabetic order. In this notation the three two-bond disulfide patterns
are aabb, abab, and abba.

Fig. 1. The three different disulfide bond patterns (panels A, B, and C) in polypeptides containing
two such bonds, each drawn on a backbone with vertices numbered 1 to 6. All three patterns are
symmetric, whereas only pattern A is reducible.

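The alphabetic notation can be sketched in Python as follows. The helper names `pattern_word` and `all_patterns`, and the choice to number the bonded cysteines 1 to 2n, are assumptions for this illustration. The enumeration is a plain recursive pairing and is not claimed to be the authors' procedure.

```python
def pattern_word(bonds):
    """Return the alphabetic label of a pattern given as pairs of cysteine positions."""
    partner = {}
    for i, j in bonds:
        partner[i], partner[j] = j, i
    labels, word = {}, []
    for pos in sorted(partner):
        if pos not in labels:
            # The first cysteine of a new bond receives the next unused letter.
            letter = chr(ord("a") + len(labels) // 2)
            labels[pos] = labels[partner[pos]] = letter
        word.append(labels[pos])
    return "".join(word)


def all_patterns(n):
    """Yield every n-bond pattern as a list of (i, j) pairs with i < j."""
    def pair_up(free):
        if not free:
            yield []
            return
        first, rest = free[0], free[1:]
        for k, other in enumerate(rest):
            for tail in pair_up(rest[:k] + rest[k + 1:]):
                yield [(first, other)] + tail
    yield from pair_up(list(range(1, 2 * n + 1)))


two_bond_words = sorted(pattern_word(p) for p in all_patterns(2))
print(two_bond_words)
```

For n = 2 this lists exactly the three words named in the text.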
`In this paper the pattern associated with a state of disul-
`fide bonding of a polypeptide chain is a partially directed
`graph of the type shown in Figure 1, having a unique di-
`rected spanning tree corresponding to the backbone. The
`graph associated with the pattern is the simple collection
`of edges and vertices shown, with no orientation and no
`distinction between different types of edges.
`Two patterns have the same topological structure if one
`can be transformed into the other by a continuous defor-
`mation. This transition must preserve the directed nature
`of the polypeptide chain connections. Therefore its action
`on the directed backbone spanning tree is unique. In par-
`ticular, it associates corresponding vertices in the order
`they are encountered along the chain. It maps directed
`edges to their corresponding directed edges, and disulfide
`bonds to disulfide bonds. It follows that two patterns are
`topologically equivalent exactly when all their disulfide
`bonds connect corresponding pairs of vertices. That is,
`only identical patterns are topologically equivalent. Two
`patterns are topologically distinct if no continuous trans-
`formation between them exists. This means that their in-
`terconversion requires the
`formation, disruption or
`rearrangement of disulfide bonds. Distinct patterns are
`always topologically nonequivalent.
`It is important to note that the topological properties
`of patterns are not the same as the topological properties
`of their underlying graphs. Two graphs have the same in-
`trinsic topology (i.e., are isomorphic) when there is a way
`of numbering the vertices of each so that corresponding
`edges join pairs of vertices having the same numbers in
`both graphs (Roberts, 1984). In graphs the numbering of
`vertices may be chosen arbitrarily and is not determined
`by a directed spanning tree (i.e., polypeptide backbone),
`as was the case for patterns. Thus, two topologically dis-
`tinct patterns may have isomorphic underlying graphs. For
`example, two graphs that are mirror images are isomor-
`phic, although asymmetric patterns in which the disulfide
`bonds occur in mirror image order are not topologically
`equivalent because the mirror image mapping does not
`preserve the backbone orientation. Another example of
`distinct patterns having isomorphic graphs is shown in
Figure 2.
`Disulfide bonding patterns have specific attributes that
`could be important for protein structure. One such prop-
`erty is symmetry. A pattern
`is symmetric if it and its
`mirror image both have the same disulfide bonding con-
`nections. Alternatively, the pattern is symmetric if its al-
`phabetic representation reads the same when labels are
assigned in the N → C direction as when they are assigned
in the opposite direction. For example, all of the two-bond
patterns are symmetric, although patterns with three
`or more disulfide bonds may be asymmetric, as is the case
`for both patterns shown in Figure 2. The second impor-
`tant property is reducibility. A reducible pattern is one in
`which a single cut somewhere along the backbone can sep-
`arate the pattern into two nontrivial subpatterns. That
`is, some disulfide bonds occur entirely to the left of the
`cut point and others entirely to the right, but no disul-
`fide bonds span the cut point. The pattern in Figure 1A
`comprised of two disjoint loops is reducible, whereas both
`of the other patterns are
`irreducible. A third intrinsic
`topological property of a disulfide bonding pattern is non-
`planarity. A pattern is nonplanar if its graph cannot be
`drawn in a plane in a way in which no edges cross (Crip-
`pen, 1974). A pattern is nonplanar exactly when it contains
`the (sub-)pattern abcdbcda. (This topological definition
`of nonplanarity differs from that used by Kikuchi et al.
[1986, 1989].)
`In the following sections formulas are derived express-
`ing the numbers of distinct (hence topologically nonequiv-
`alent) disulfide bond patterns, as well as the numbers of
`these that have all combinations of symmetry and reduc-
`ibility properties. Intrinsic nonplanarity will not be con-
`sidered in detail here, as it is less likely to be of practical
`importance in protein structure.
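The symmetry and reducibility tests defined above can be expressed directly on the alphabetic representation. The sketch below is one possible reading of those definitions. The function names are hypothetical, and patterns are assumed to be given as words such as "abba".

```python
def relabel(seq):
    """Rename letters so that new letters appear in alphabetic order."""
    mapping, out = {}, []
    for ch in seq:
        if ch not in mapping:
            mapping[ch] = chr(ord("a") + len(mapping))
        out.append(mapping[ch])
    return "".join(out)


def is_symmetric(word):
    """True if labelling from the C-terminus gives the same word as from the N-terminus."""
    return relabel(word[::-1]) == word


def is_reducible(word):
    """True if some interior cut point is spanned by no disulfide bond."""
    open_bonds = set()
    for ch in word[:-1]:
        # The first occurrence of a letter opens a bond; the second closes it.
        open_bonds ^= {ch}
        if not open_bonds:
            return True  # every bond seen so far is closed: a valid cut point
    return False


for w in ("aabb", "abab", "abba", "aabcbc"):
    print(w, is_symmetric(w), is_reducible(w))
```

All three two-bond words come out symmetric and only aabb is reducible, in agreement with the Figure 1 caption. The three-bond word aabcbc is an example of an asymmetric, reducible pattern.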
`
Fig. 2. An example of two different patterns whose underlying graphs are isomorphic. The top graph
is the original pattern, where all edges now are regarded as undirected. If the vertices of this graph
are visited along the path shown in the middle graph, and then this path is drawn as a straight line,
the graph at the bottom results. Here the vertices retain their original numbering for clarity; read
left to right, the bottom graph lists them in the order 1, 2, 6, 5, 7, 8, 4, 3, 9, 10.
`
The number of disulfide bonding patterns
Consider a polypeptide chain containing M cysteine residues in which n disulfide bonds are formed,
so M ≥ 2n. The number of ways of choosing the 2n cysteines participating in the disulfide bonding is
${}_{M}C_{2n} = M!/[(2n)!\,(M - 2n)!]$. Now suppose that the participating cysteines have been
specified. The number of distinct patterns containing n disulfide bonds may be found by the following
procedure (Cantor & Schimmel, 1980). Consider the participating cysteine nearest the N-terminus.
There are 2n - 1 other cysteines to which it may be attached by a disulfide bond. Specify to which of
these that bond is made. This leaves 2n - 2 cysteines whose disulfide connections remain to be
determined. Of these, choose the unattached cysteine closest to the N-terminus. There are 2n - 3
possible choices for which other cysteine forms the disulfide bond with this one. Specify to which of
these candidates that bond is to be made. Continue this process until all 2n cysteines have been
connected. At the first step there were 2n - 1 choices, at the second 2n - 3, at the third 2n - 5, etc.
The total number of choices is the product of all the odd numbers from 1 to 2n:

\[ P(n) = \prod_{i=1}^{n} (2i - 1) = \frac{(2n)!}{2^{n}\, n!}. \tag{1} \]

These equations give the number of different patterns containing n disulfide bonds. The factorial
form of this expression was first presented by Kauzmann (1959). As noted above, all of these
possibilities are topologically distinct as patterns, although some of their underlying graphs may be
isomorphic.
The number of arrangements of n disulfide bonds among M cysteine residues on a polypeptide chain
therefore is

\[ a(M, n) = {}_{M}C_{2n}\, P(n) = \frac{M!}{2^{n}\, n!\, (M - 2n)!}, \qquad M \ge 2n. \tag{2} \]

This expression was derived by Sela and Lifson (1959). Hereafter we will not consider cysteines that
do not participate in disulfide bonds.
(In mathematics, an algebraic structure can be given to the set of patterns by defining a
multiplication operation on them. However, it is not known whether the resulting construct, called the
full connection monoid on 2n points [Kaufmann & Vogel, 1992], is relevant to protein structure.)

The number of symmetric patterns
The patterns involving n disulfide bonds may be classified according to whether or not they possess
symmetry. This attribute may reflect (or dictate) a folding pattern having approximately symmetric
regions or other regularities. Numbering the 2n bonded cysteines starting at the N-terminus, a pattern
is symmetric if, whenever cysteines i and j are bonded, then so are cysteines 2n - i + 1 and
2n - j + 1. We note that this symmetry relates only to the topological pattern of disulfide bonding,
not to metric properties such as the lengths of the polypeptide chain spanned by the bonds.

Fig. 3. The two cases encountered in the derivation of the recursion relation for S(n), as described
in the text. In the first case (A) a disulfide bond joins the first and last (2nth) cysteines, whereas
in the second case (B) the first cysteine bonds to some cysteine other than the last. The disulfide
bond shown in the first case is symmetric. However, in the second case the symmetry condition requires
the presence of a mirror image disulfide bond as shown.

The number S(n) of symmetric disulfide bonding patterns may be found as follows. All patterns
containing either one or two disulfide bonds are symmetric, so S(1) = 1 and S(2) = 3. For the general
case, we first enumerate those symmetric patterns in which a disulfide bond connects the first
cysteine to the last (i.e., 2nth) cysteine, as shown in Figure 3A. This is a symmetric arrangement of
that bond. There remain n - 1 other bonds to specify. For the entire pattern to be symmetric, these
other bonds must be arranged in a symmetric manner. As there are S(n - 1) ways in which this can be
done, this gives the number of symmetric patterns of this first type. Alternatively, suppose the
pattern has a disulfide bond connecting the first cysteine to the jth cysteine, j ≠ 2n. There are
(2n - 2) choices for the cysteine to which this connection is made: only 1 and 2n are excluded. For
the entire pattern to be symmetric, the 2nth cysteine must be connected to the (2n - j + 1)st
cysteine, as shown in Figure 3B. Also, the remaining (n - 2) disulfide bonds must be arranged in a
symmetric manner, which can be done in S(n - 2) ways. Hence the total number of symmetric patterns of
this type is (2n - 2)S(n - 2). Putting these results together, the total number of symmetric disulfide
bonding patterns is shown to obey the following recursion relation:

\[ S(1) = 1, \qquad S(2) = 3, \qquad S(n) = S(n - 1) + 2(n - 1)\,S(n - 2), \quad n \ge 3. \tag{3} \]
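The product and factorial forms of P(n), the count a(M, n), and the recursion for S(n) can be checked numerically. The Python sketch below uses function names chosen for this illustration (`P`, `P_factorial`, `arrangements`, `S`), and `math.comb` requires Python 3.8 or later.

```python
from math import comb, factorial


def P(n):
    """Number of n-bond patterns: the product of the odd numbers 1, 3, ..., 2n - 1."""
    result = 1
    for i in range(1, n + 1):
        result *= 2 * i - 1
    return result


def P_factorial(n):
    """The same count in factorial form, (2n)! / (2^n n!)."""
    return factorial(2 * n) // (2 ** n * factorial(n))


def arrangements(M, n):
    """a(M, n): ways to place n disulfide bonds among M cysteines."""
    if M < 2 * n:
        raise ValueError("need M >= 2n")
    return comb(M, 2 * n) * P(n)


def S(n):
    """Symmetric n-bond patterns from S(n) = S(n - 1) + 2(n - 1) S(n - 2)."""
    if n == 1:
        return 1
    prev2, prev1 = 1, 3  # S(1), S(2)
    for k in range(3, n + 1):
        prev2, prev1 = prev1, prev1 + 2 * (k - 1) * prev2
    return prev1


for n in range(1, 7):
    print(n, P(n), S(n))
```

The printed values for n = 1 to 6 can be compared with the P(n) and S(n) columns of Table 1.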
`
`This recursion relation may be solved explicitly, yielding
`the following closed form expression:
`
(Here ${}_{i}P_{j} = i!/(i - j)!$ is the permutation of i objects
taken j at a time, which is the number of different ways
of choosing j objects, in order and without replacement,
from a collection of i objects. Throughout this paper
square brackets in equations denote the greatest integer
function.)
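The closed-form expression itself is not legible in this copy. One closed form that satisfies the recursion, and that uses both the permutation symbol and the greatest integer bracket explained in the text, is S(n) = sum over j from 0 to [n/2] of nP2j / j!. This is a reconstruction, so the printed form may differ. The Python sketch below checks the reconstructed sum against the recursion. The function names are hypothetical, and `math.perm` requires Python 3.8 or later.

```python
from math import factorial, perm


def S_recursive(n):
    """S(n) from the recursion S(n) = S(n - 1) + 2(n - 1) S(n - 2)."""
    if n == 1:
        return 1
    prev2, prev1 = 1, 3  # S(1), S(2)
    for k in range(3, n + 1):
        prev2, prev1 = prev1, prev1 + 2 * (k - 1) * prev2
    return prev1


def S_closed(n):
    """Reconstructed closed form: sum of nP(2j) / j! for j = 0 .. [n/2].

    Each term counts symmetric patterns with j mirror-image bond pairs
    and n - 2j bonds that are their own mirror images.
    """
    return sum(perm(n, 2 * j) // factorial(j) for j in range(n // 2 + 1))


agree = all(S_closed(n) == S_recursive(n) for n in range(1, 13))
print(agree)
```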
`
`The number of reducible patterns
`
`A disulfide bonding pattern is reducible if it consists of
`two nonoverlapping, nontrivial subpatterns. In other
`words, if there is a site on the polypeptide backbone
`where a single cut will decompose the pattern into two
`subpatterns, then the pattern is reducible.
`Recursion relations enumerating the reducible and ir-
`reducible patterns are derived as follows. A pattern con-
`taining n disulfide bonds is reducible exactly when it has
at least one interior cut point, as described above. Traversing the sequence starting from the
N-terminus, suppose the first such cut point that is encountered has i disulfide bonds on its
N-terminal side and n - i bonds on its C-terminal side, 1 ≤ i < n. Then the subpattern consisting of
the i bonds on the N-terminal side must be irreducible, because this is the first cut site
encountered. The subpattern comprised of the n - i bonds on the C-terminal side of the cut can have
any form, reducible or irreducible. So there are P(n - i) choices for this pattern. Therefore the
number of ways in which an n bond pattern can be chosen whose first cut site occurs as stated is the
product I(i)P(n - i), where I(i) denotes the number of irreducible patterns with i bonds. For a
pattern to be reducible it must have a cut point of this type at some position for which
1 ≤ i ≤ n - 1, so the total number R(n) of reducible patterns is the sum

\[ R(n) = \sum_{i=1}^{n-1} I(i)\, P(n - i), \qquad I(n) = P(n) - R(n). \tag{5} \]
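The coupled recursion for the reducible and irreducible counts can be sketched as below, with R(n) the sum of I(i)P(n - i) over 1 ≤ i ≤ n - 1 as stated in the text. The function name `reducible_counts` and the list-based layout are assumptions for this illustration.

```python
def P(n):
    """Number of n-bond patterns: the product of the odd numbers up to 2n - 1."""
    result = 1
    for i in range(1, n + 1):
        result *= 2 * i - 1
    return result


def reducible_counts(n_max):
    """Return lists R and I indexed by the number of bonds (index 0 is unused)."""
    R = [0] * (n_max + 1)
    I = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        # The first cut point leaves an irreducible block of i bonds on the
        # N-terminal side and an arbitrary pattern of n - i bonds after it.
        R[n] = sum(I[i] * P(n - i) for i in range(1, n))
        I[n] = P(n) - R[n]
    return R, I


R, I = reducible_counts(8)
print(R[1:])
print(I[1:])
```

The values of R(n) for n = 1 to 8 agree with the R(n) column of Table 1.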
`
A similar calculation derives the recursion relation giving the number S_r(n) of n-bond patterns that
are both symmetric and reducible. Again, suppose the first cut point occurs after i bonds, so the
number of choices for the subpattern of these initial bonds is I(i). Because the complete pattern is
symmetric as well as reducible, the last i bonds must be the mirror images of the first ones. It
follows that n ≥ 2i, and that the subpattern of the middle n - 2i bonds, if any, is all that remains
to be determined. For the entire pattern to be symmetric, the subpattern of the middle n - 2i bonds
must be symmetric. Hence there are S(n - 2i) choices for this structure. It follows that the number of
symmetric, reducible patterns in which the first cut occurs after i bonds is the product
I(i)S(n - 2i), so the total number of patterns that are both symmetric and reducible is

\[ S_r(n) = \sum_{i=1}^{[n/2]} I(i)\, S(n - 2i). \tag{6} \]
The above results determine the number A(n) of nonsymmetric patterns on n disulfide bonds to be

\[ A(n) = P(n) - S(n). \tag{7} \]

Similarly, the number of patterns that are symmetric and irreducible is

\[ S_i(n) = S(n) - S_r(n). \tag{8} \]

The number of nonsymmetric, reducible patterns is

\[ A_r(n) = R(n) - S_r(n), \tag{9} \]

and the number of patterns that are both nonsymmetric and irreducible is

\[ A_i(n) = A(n) - A_r(n). \tag{10} \]
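The four classes (symmetric or asymmetric, reducible or irreducible) can be cross-checked by brute force for small n. The sketch below enumerates every pattern, classifies it from the definitions, and compares the tallies with values obtained from the recursions for P, S, R, I, and S_r. All names are hypothetical. The conventions P(0) = 1 and S(0) = 1 are assumptions made here so that the sums cover the case with no middle bonds.

```python
def all_patterns(n):
    """Yield each n-bond pattern as a list of (i, j) pairs, i < j, cysteines 1..2n."""
    def pair_up(free):
        if not free:
            yield []
            return
        first, rest = free[0], free[1:]
        for k, other in enumerate(rest):
            for tail in pair_up(rest[:k] + rest[k + 1:]):
                yield [(first, other)] + tail
    yield from pair_up(list(range(1, 2 * n + 1)))


def classify(bonds):
    """Return 'sr', 'si', 'ar', or 'ai' for one pattern."""
    n = len(bonds)
    bond_set = set(bonds)
    # Symmetric: the mirror image of every bond is also a bond.
    symmetric = all((2 * n + 1 - j, 2 * n + 1 - i) in bond_set for i, j in bonds)
    # Reducible: some cut between cysteines c and c + 1 is spanned by no bond.
    reducible = any(all(not (i <= c < j) for i, j in bonds) for c in range(1, 2 * n))
    return ("s" if symmetric else "a") + ("r" if reducible else "i")


def brute_force(n):
    counts = {"sr": 0, "si": 0, "ar": 0, "ai": 0}
    for bonds in all_patterns(n):
        counts[classify(bonds)] += 1
    return counts


def from_recursions(n_max):
    P, S = [1], [1, 1]  # assumed conventions: P(0) = 1, S(0) = 1; S(1) = 1
    R, I = [0], [0]
    rows = {}
    for n in range(1, n_max + 1):
        P.append(P[-1] * (2 * n - 1))
        if n >= 2:
            S.append(S[n - 1] + 2 * (n - 1) * S[n - 2])
        R.append(sum(I[i] * P[n - i] for i in range(1, n)))
        I.append(P[n] - R[n])
        s_r = sum(I[i] * S[n - 2 * i] for i in range(1, n // 2 + 1))
        rows[n] = {"sr": s_r, "si": S[n] - s_r, "ar": R[n] - s_r,
                   "ai": (P[n] - S[n]) - (R[n] - s_r)}
    return rows


rows = from_recursions(8)
checks = [brute_force(n) == rows[n] for n in range(1, 6)]
print(checks)
```

The exhaustive check is limited to n ≤ 5 (945 patterns) only to keep the run short.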
`
Table 1 displays the numbers of patterns P(n) containing n disulfide bonds, 1 ≤ n ≤ 12, together
with the numbers of these patterns that are symmetric, reducible, or both. From these values the
numbers of patterns with all other combinations of symmetry and reducibility properties may be
calculated according to the above equations.
Table 2 shows the fractions of patterns with given symmetry and reducibility properties for the
cases 1 ≤ n ≤ 12.
`These are the probabilities that a randomly chosen pat-
`tern of n disulfide bonds has the given attribute(s), pro-
`vided every pattern is equally likely to be chosen. One sees
`that the fractions of patterns that are asymmetric or ir-
`reducible or both grow with n, while the fractions with
`all other combinations of attributes decrease. The prob-
`ability of symmetry decreases rapidly as n grows, while
`the probability of reducibility decreases more slowly.
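As an illustration of how the fractions follow from the counts, the sketch below takes the Table 1 values for n = 1 to 6 as input data and forms the five ratios defined in the Table 2 footnote. The dictionary layout and the name `fractions` are assumptions for this example.

```python
# Counts copied from Table 1 for n = 1..6, in the order (P, S, R, S_r).
TABLE_1 = {
    1: (1, 1, 0, 0),
    2: (3, 3, 1, 1),
    3: (15, 7, 5, 1),
    4: (105, 25, 31, 5),
    5: (945, 81, 239, 9),
    6: (10395, 331, 2233, 41),
}


def fractions(n):
    """Fractions of n-bond patterns in each symmetry and reducibility class."""
    P, S, R, S_r = TABLE_1[n]
    A_r = R - S_r          # asymmetric and reducible
    A_i = (P - S) - A_r    # asymmetric and irreducible
    return {"p_s": S / P, "p_r": R / P, "p_sr": S_r / P,
            "p_ar": A_r / P, "p_ai": A_i / P}


for n in sorted(TABLE_1):
    row = fractions(n)
    print(n, " ".join(f"{row[k]:.6f}" for k in ("p_s", "p_r", "p_sr", "p_ar", "p_ai")))
```

Each printed row can be compared with the corresponding row of Table 2.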
`
Table 1. Number P(n) of patterns of n disulfide bonds, together with the numbers of these patterns
possessing specific symmetry and reducibility properties (a)

 n             P(n)       S(n)            R(n)   S_r(n)
 1                1          1               0        0
 2                3          3               1        1
 3               15          7               5        1
 4              105         25              31        5
 5              945         81             239        9
 6           10,395        331           2,233       41
 7          135,135      1,303          24,725      105
 8        2,027,025      5,937         318,631      485
 9       34,459,425     26,785       4,707,359    1,609
10      654,729,075    133,651      78,691,633    7,777
11   13,749,310,575    669,351   1,471,482,725   31,425
12  316,234,143,225  3,609,673  30,469,552,111  160,965

(a) These quantities were calculated using the methods described in the text.
`
`Observed protein topologies
`In this section we describe the results of database surveys
`evaluating the intrinsic and embedded topological prop-
`erties of known polypeptide disulfide bonding patterns.
`The intrinsic topologies are given by the corresponding di-
`sulfide bonding patterns, whereas the embedded topolog-
`ical properties considered include knotting of loops and
`interlinking of pairs of loops. Intrinsic topologies are de-
`termined by disulfide bond connections alone, whereas
`the evaluation of embedded topologies requires knowl-
`edge of the structure of the protein.
`
Table 2. Fractions of n-bond patterns having specific symmetry and reducibility properties (a)

 n     p_s(n)     p_r(n)    p_sr(n)    p_ar(n)    p_ai(n)
 1   1.000000   0.000000   0.000000   0.000000   0.000000
 2   1.000000   0.333333   0.333333   0.000000   0.000000
 3   0.466667   0.333333   0.066667   0.266667   0.266667
 4   0.238095   0.295238   0.047619   0.247619   0.514286
 5   0.085714   0.252910   0.009524   0.243386   0.670899
 6   0.031842   0.214815   0.003944   0.210871   0.757287
 7   0.009642   0.182965   0.000777   0.182188   0.808170
 8   0.002929   0.157191   0.000239   0.156952   0.840119
 9   0.000777   0.136606   0.000047   0.136559   0.862664
10   0.000204   0.120190   0.000012   0.120178   0.879618
11   0.000049   0.107022   0.000002   0.107020   0.892931
12   0.000011   0.096351   0.000001   0.096351   0.903638

(a) In terms of the quantities calculated in Equations 1-10, these fractions are:
p_s(n) = S(n)/P(n), p_r(n) = R(n)/P(n), p_sr(n) = S_r(n)/P(n), p_ar(n) = A_r(n)/P(n), and
p_ai(n) = A_i(n)/P(n). These fractions also give the probability that a randomly selected pattern has
the corresponding set of attributes, provided all patterns have equal probabilities of selection.
Here the subscript s stands for symmetric, a for asymmetric, r for reducible, and i for irreducible.
`
`Intrinsic topologies- Disulfide bond patterns
`
`Information regarding known disulfide bond patterns in
`proteins has been culled from two databases. The Brook-
`haven National Laboratories protein structural database
`(PDB) contains atomic coordinates for the structures of
approximately 600 molecules (Bernstein et al., 1977). Most
`of these structures have been found by crystallography,
`although some are theoretical predictions. In several cases
`a single database entry contains information on multiple
`subunits of the molecule, or on an additional molecule
`such as a bound inhibitor. A total of 259 protein molecules
`in the structural database were found to have disulfide
`bonds. This total includes duplicate entries, successive re-
`finements of the same molecule, and entries for identical
`molecules from closely related species. Some structures
`are reported only for fragments of molecules or for mol-
`ecules that have been altered by mutations affecting the
`number of cysteines present. In developing the population
`of observed structures examined here, theoretically pre-
`dicted structures, mutated molecules, and fragments were
`removed from further consideration, as the information
`in the database does not specify the disulfide bonding pat-
`tern of the actual complete molecule in these cases. When
`duplicate and closely related entries also are deleted, a
population of 62 distinct, complete polypeptide molecules
containing disulfide bonds remains (listed in the kinemage
`file). The numbers of occurrences in this database of each
`type of observed disulfide bonding pattern are given in the
`fourth column of Table 3 below.
`The National Biomedical Research Foundation (NBRF)
`protein sequence database (Barker et al., 1986) contains
`many thousands of entries, only some of which report di-
`sulfide bonding information. However, the absence of
`this information for a given molecule does not necessar-
`ily imply that it lacks disulfide bonds. In the small num-
`ber of cases where
`disulfide bonding is reported, the
`accuracy of the pattern is not always known. Some en-
`tries rate bonds as certain, probable, or possible, whereas
`others give alternative possible disulfide bonding patterns.
`In some cases bonding patterns have been inferred by ho-
`mology with other molecules. The disulfide bonding in-
`formation derived from this database, although more
`plentiful than that found from the PDB structure data-
`base, must be regarded as being less reliable.
`A total of 455 complete polypeptide chains in the
`NBRF sequence database were found to have intrachain
`disulfide bonds. This figure excludes fragmentary mole-
`cules and cases where considerable uncertainty regarding
`the disulfide connections was reported. Deletion of repeat
`entries and closely related molecules resulted in a popu-
`lation of 186 distinct polypeptides containing disulfide
`bonds. Column 3 of Table 3 reports the occurrences of
`each type of observed pattern in this population.
`When the populations culled from the structure and se-
`quence databases were amalgamated and duplicate entries
`
`were deleted, an aggregate population of 208 distinct
`polypeptides containing disulfide bonds resulted. All oc-
`currences of each type of disulfide bonding pattern in this
`aggregate population were determined. The results are
`given in column 5 of Table 3. Column 2 in this table gives
`the reducibility and symmetry properties of each observed
`pattern.
`Table 4 shows the observed frequencies of disulfide
`bonding patterns having specific reducibility and symme-
`try attributes. The number of distinct occurrences of a
`given pattern is evaluated from the data of Table 3, sep-
`arately for each database and also for the aggregate pop-
`ulation. Also shown is the theoretical probability of each
`type of attribute, calculated using the expressions in the
`previous sections, assuming that each pattern of n bonds
`has equal probability of forming. These data show that,
`in cases where more than three disulfide bonds are
`present, symmetric patterns occur with frequencies that
`greatly exceed what would be predicted from random,
equiprobable bonding. When n ≥ 6, this frequency is an
order of magnitude greater than random. This disparity
is greatest for patterns that are both symmetric and reducible,
which are overrepresented for all values of n. When
n ≥ 6, the prevalence of this type of pattern is two orders
`of magnitude greater than would arise with random bond-
`ing. In contrast, patterns that are irreducible are under-
`rep
