IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-16, NO. 6, NOVEMBER 1970

Convolutional Codes I: Algebraic Structure

G. DAVID FORNEY, JR., MEMBER, IEEE
`
Abstract—A convolutional encoder is defined as any constant linear sequential circuit. The associated code is the set of all output sequences resulting from any set of input sequences beginning at any time. Encoders are called equivalent if they generate the same code. The invariant factor theorem is used to determine when a convolutional encoder has a feedback-free inverse, and the minimum delay of any inverse. All encoders are shown to be equivalent to minimal encoders, which are feedback-free encoders with feedback-free delay-free inverses, and which can be realized in the conventional manner with as few memory elements as any equivalent encoder. Minimal encoders are shown to be immune to catastrophic error propagation and, in fact, to lead in a certain sense to the shortest decoded error sequences possible per error event. In two appendices, we introduce dual codes and syndromes, and show that a minimal encoder for a dual code has exactly the complexity of the original encoder; we show that systematic encoders with feedback form a canonical class, and compare this class to the minimal class.
`
I. INTRODUCTION

BLOCK CODES were the earliest type of codes to be investigated, and remain the subject of the overwhelming bulk of the coding literature. On the other hand, convolutional codes have proved to be equal or superior to block codes in performance in nearly every type of practical application, and are generally simpler than comparable block codes in implementation. This anomaly is due largely to the difficulty of analyzing convolutional codes, as compared to block codes. It is the intent of this series to stimulate increased theoretical interest in convolutional codes by review and clarification of known results and introduction of new ones. We hope, first, to advance the understanding of convolutional codes and tools useful in their analysis; second, to motivate further work by showing that in every case in which block codes and convolutional codes can be directly compared theoretically, the convolutional are as good or better.
Two converging lines of development have generated interest in an algebraic approach to convolutional codes. On the one hand, the success of algebraic methods in generating classes of good block codes suggests that constructive methods of generating good convolutional codes might be developed through use of algebraic structures. Correspondingly, one might expect that powerful decoding techniques based on such structures might be discovered. (Fortunately, good codes and decoding methods not relying on such constructions are already known.) On the other hand, the usefulness of regarding convolutional encoders as linear sequential circuits has begun to become evident, as in the observation of Omura [1] and others that the Viterbi [2] maximum-likelihood decoding algorithm is the dynamic programming solution to a certain control problem, and in the observation of Massey and his colleagues [3]-[8] that certain questions concerning error propagation are related to questions concerning the invertibility of linear systems. As the theory of finite-dimensional linear systems is seen increasingly as essentially algebraic, we have another motive for examining convolutional encoders in an algebraic context.
Our result is a series of structure theorems that dissect the structure of convolutional codes rather completely, mainly through use of the invariant factor theorem. We arrive at a class of canonical encoders capable of generating any convolutional code, and endowed with all the desirable properties one might wish, except that in general they are not systematic. (The alternate canonical class of systematic encoders with feedback is discussed in Appendix II.) The results do not seem to suggest any constructive methods of generating good codes, and say little new in particular about the important class of rate-1/n codes, except for putting known results in a more general context. It appears that the results obtained here for convolutional codes correspond to block-code results ([9], ch. 3).
`
II. PROBLEM FORMULATION

We are purposefully going to take a rather long time getting to our main results. Most of this time will be spent in definitions and statements of fundamental results in convolutional coding theory, linear system theory, and algebra. It is a truism that when dealing with fundamentals, once the problem is stated correctly, the results are easy. We feel it is important that the right formulation of the problem (like justice) not only be done, but be seen to be done, in the eyes of readers who may have backgrounds in any of the three areas noted.

After exhibiting a simple convolutional encoder for motivation, we move to a general definition of convolutional encoders, which we see amount to general finite-state time-invariant linear circuits. We discuss the decoding problem, which leads to a definition of convolutional encoder equivalence and to certain desirable code properties. Along the way certain algebraic artifacts will intrude into the discussion; in the final introductory section we collect the algebra we need, which is centered on the invariant factor theorem.
`
Convolutional Encoder

Manuscript received December 18, 1969. Part of this paper was presented at the Internatl. Information Theory Symposium, Ellenville, N.Y., January 27-31, 1969.
The author is with Codex Corporation, Newton, Mass. 02158.
`
Fig. 1(a) shows a simple binary systematic rate-1/2 convolutional encoder of constraint length 2. The input to this encoder is a binary sequence

    x = (..., x_{-1}, x_0, x_1, ...).

The outputs are two binary sequences y_1 and y_2 (hence the rate is 1/2). The first output sequence y_1 is simply equal to the input x (hence the code is systematic). The elements y_{2i} of the second sequence y_2 are given by

    y_{2i} = x_i ⊕ x_{i-1} ⊕ x_{i-2},

where ⊕ denotes modulo-2 addition. Therefore, the encoder must save 2 past information bits, so we say the constraint length is 2.

Sometimes it is convenient to draw the encoder as in Fig. 1(b), with the output a function of the memory contents only. (This corresponds to the distinction in automata theory between Moore and Mealy machines.) Some authors would therefore say this code had constraint length 3, since the outputs are a function of a span of 3 input bits. Others measure constraint length in terms of output bits and would assign this code a constraint length of 6. Our definition of constraint length is chosen to coincide with the number of memory elements in a minimal realization.

Fig. 1. (a) A rate-1/2 systematic convolutional encoder. (b) Alternate representation.

The term "convolutional" comes from the observation that the output sequences can be regarded as a convolution of the input sequence with certain generator sequences. With the input and output sequences, we associate sequences in the delay operator D (D transforms):

    x(D) = ... + x_{-1}D^{-1} + x_0 + x_1 D + x_2 D^2 + ...

    y_1(D) = ... + y_{1,-1}D^{-1} + y_{10} + y_{11}D + y_{12}D^2 + ...

    y_2(D) = ... + y_{2,-1}D^{-1} + y_{20} + y_{21}D + y_{22}D^2 + ....

(The delay operator D corresponds to z^{-1} in sampled-data theory, but is purely an indeterminate or place-holder, whereas z is a complex variable.) Now the input/output relationships are expressed concisely as

    y_1(D) = g_1(D)x(D)

    y_2(D) = g_2(D)x(D),

where the generator polynomials g_1(D) and g_2(D) are

    g_1(D) = 1

    g_2(D) = 1 + D + D^2,

and ordinary sequence multiplication with coefficient operations modulo 2 and collection of like powers of D is implied.

Similarly, we can define a general (n, k) conventional convolutional encoder by a matrix of generator polynomials g_{ij}(D), 1 ≤ i ≤ k, 1 ≤ j ≤ n, with coefficients in some finite field F. There are k input sequences x_i(D) and n output sequences y_j(D), each a sequence of symbols from F, with input/output relations given by

    y_j(D) = Σ_{i=1}^{k} x_i(D) g_{ij}(D),

again with all operations in F. If we define the constraint length for input i as

    ν_i = max_{1≤j≤n} [deg g_{ij}(D)],

then the general conventional encoder can be realized by k shift registers, the ith of length ν_i, with the outputs formed as linear combinations in F on the appropriate shift register contents. We call this the obvious realization, and note that the number of memory elements required is equal to the overall constraint length, defined as the sum of constraint lengths

    ν = Σ_{i=1}^{k} ν_i.

For notational convenience we shall generally suppress the parenthetical D in our subsequent references to sequences; thus x_i means x_i(D), y_j = y_j(D), and so forth, where the fact that a letter represents a sequence (transform) should be clear from the context.
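The encoding rule above is directly executable as polynomial multiplication. A minimal sketch (ours, not the paper's; GF(2) coefficients are stored low-order-first in Python lists) of the Fig. 1 encoder with g_1(D) = 1 and g_2(D) = 1 + D + D^2:

```python
# Rate-1/2 encoder of Fig. 1 as convolution over GF(2).
# Coefficient lists are low-order-first: [1, 0, 1] means 1 + D^2.

def poly_mul_gf2(a, b):
    """Multiply two GF(2) polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj   # coefficient arithmetic modulo 2
    return out

g1 = [1]          # g1(D) = 1
g2 = [1, 1, 1]    # g2(D) = 1 + D + D^2

x = [1, 0, 1, 1]  # example input x(D) = 1 + D^2 + D^3

y1 = poly_mul_gf2(x, g1)  # systematic output: y1 = x
y2 = poly_mul_gf2(x, g2)  # parity output: y2 = (1 + D + D^2) x
```

Note that y2's ith coefficient is x_i ⊕ x_{i-1} ⊕ x_{i-2}, exactly the memory-2 rule stated for Fig. 1(a).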
`
Convolutional Encoder—General Definition

The encoders of the previous section are linear sequential circuits. We now consider all finite-state time-invariant linear sequential circuits as candidates for convolutional encoders.

Definition 1: An (n, k) convolutional encoder over a finite field F is a k-input n-output constant linear causal finite-state sequential circuit.

Let us dissect this definition.

k-input: There are k discrete-time input sequences x_i, each with elements from F. We write the inputs as the row vector x. Sequences must start at some finite time and may or may not end. (Then we can represent a sequence such as 1 + D + D^2 + ... by a ratio of polynomials such as 1/(1 + D), without encountering the ambiguity 1/(1 + D) = D^{-1} + D^{-2} + ....) If a sequence x_i "starts" at time d (if the first nonzero element is x_{id}), we say it has delay d, del x_i = d, and if it ends at time d', we say it has degree d', deg x_i = d', in analogy to the degree of a polynomial. Similarly, we define the delay and degree of a vector of sequences as the minimum delay or maximum degree of the component sequences: del x = min del x_i, deg x = max deg x_i. A finite sequence has both a beginning and an end. (Note that most authors consider only sequences starting at time 0 or later. It turns out that this assumption clutters up the analysis without compensating benefits.)
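The delay/degree definitions can be illustrated with a small sketch (our own; a sequence is represented as a dict mapping time indices to its nonzero coefficients):

```python
# del and deg of sequences and of vectors of sequences, per the definitions above.

def seq_delay(seq):
    return min(seq)   # del: time of the first nonzero element

def seq_degree(seq):
    return max(seq)   # deg: time of the last nonzero element

def vec_delay(vec):
    return min(seq_delay(s) for s in vec)    # del x = min_i del x_i

def vec_degree(vec):
    return max(seq_degree(s) for s in vec)   # deg x = max_i deg x_i

x1 = {-1: 1, 0: 1, 2: 1}    # x1 = D^-1 + 1 + D^2: del = -1, deg = 2
x2 = {0: 1, 3: 1}           # x2 = 1 + D^3:        del = 0,  deg = 3

x_del = vec_delay([x1, x2])    # minimum component delay
x_deg = vec_degree([x1, x2])   # maximum component degree
```

Both sequences here are finite: each has a beginning and an end.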
n-output: There are n output sequences y_j, each with elements from F, which we write as the row vector y. The encoder is characterized by the map G, which maps any vector of input sequences x into some output vector y, which we can write in the functional form

    y = G(x).

Constant (Time-Invariant): If all input sequences are shifted in time, all output sequences are correspondingly shifted. In delay-operator notation,

    G(D^n x) = D^n G(x),    n any integer.

(Note that most probabilistic analyses of convolutional codes have been forced to assume nonconstancy to obtain ensembles of encoders with enough randomness to prove theorems.)

Linear: The output resulting from the superposition of two inputs is the superposition of the two outputs that would result from the inputs separately, and the output "scales" with the input. That is,

    G(x_1 + x_2) = G(x_1) + G(x_2)

    G(αx_1) = αG(x_1),    α ∈ F.
`
It is easy to see that constancy and linearity together give the broader linearity condition

    G(ax_1) = aG(x_1),

where a is any sequence of elements of F in the delay operator D. Furthermore, they imply a transfer-function representation for the encoder. For let e_i, 1 ≤ i ≤ k, be the unit inputs in which the ith input at time 0 is 1, and all other inputs are 0. Let the generators g_i be defined as the corresponding outputs (impulse responses):

    g_i = G(e_i),    1 ≤ i ≤ k.

Then since any input x can be written

    x = Σ_{i=1}^{k} x_i e_i,

we have by linearity

    y = G(x) = Σ_{i=1}^{k} x_i g_i.

Thus we can define a k × n transfer-function matrix G, whose rows are the generators g_i, such that the input/output relationship is

    y = xG.

Therefore from this point we use the matrix notation y = xG in preference to the functional notation y = G(x). Finally, this definition implies that a zero input gives a zero output, so that the encoder may not have any transient response (nonzero starting state).
Causal: If the nonzero inputs start at time t, then the nonzero outputs start at time t' ≥ t. Since the unit inputs start at time 0, this implies that all generators start at time 0 or later. As sequences, all generators must therefore have all negative coefficients equal to 0, or del g_i ≥ 0. Conversely, the condition that all generators satisfy del g_i ≥ 0 implies

    del y = del Σ_{i=1}^{k} x_i g_i ≥ min_{1≤i≤k} [del x_i + del g_i] ≥ del x,

i.e., causality.

Finite-state: The encoder shall have only a finite number of memory elements, each capable of assuming a finite number of values. The physical state of an encoder at any time is the contents of its memory elements; thus there are only a finite number of physical states. A more abstract definition of the state of an encoder at any time is the following: the state s of an encoder at time t is the sequence of outputs at time t and later if there are no nonzero inputs at time t or later. Clearly the number of states so defined for any fixed t is less than or equal to the number of physical states, since causality implies that each physical state gives some definite sequence, perhaps not unique. Thus an encoder with a finite number of physical states must have a finite number of abstract states as well.
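The abstract states of the Fig. 1 encoder can be enumerated directly from this definition. A small sketch (our own construction; sequences as dicts from time to bit), taking inputs that stop at time 0 and collecting the outputs at times 1 and later, finds 4 = 2^2 abstract states:

```python
# Abstract states of the encoder with g1 = 1, g2 = 1 + D + D^2 over GF(2).
from itertools import product

def convolve(x, g):
    """x: dict time -> bit, g: dict power -> bit; GF(2) convolution."""
    y = {}
    for t, xt in x.items():
        for d, gd in g.items():
            if xt and gd:
                y[t + d] = y.get(t + d, 0) ^ 1
    return {t: v for t, v in y.items() if v}

g2 = {0: 1, 1: 1, 2: 1}

states = set()
for bits in product([0, 1], repeat=3):       # inputs supported on times -2, -1, 0
    x = {t: b for t, b in zip([-2, -1, 0], bits) if b}
    y2 = convolve(x, g2)
    # The state: the part of the output at times >= 1 (the systematic output
    # y1 = x is already zero there, so only y2 contributes).
    tail = frozenset(t for t in y2 if t >= 1)
    states.add(tail)

num_states = len(states)   # 4 abstract states, i.e., dimension 2 over GF(2)
```

Only x_{-1} and x_0 influence the tail, which matches the constraint length 2 of Fig. 1.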
`
By studying the abstract states, we develop further restrictions on the generators g_i. Let us examine the set of possible states at time 1, which we call the state space Σ. (By constancy the state spaces at all times are isomorphic.) Let P be the projection operator that truncates sequences to end at time 0, and Q the complementary projection operator 1 − P that truncates sequences to start at time 1:

    xP = ... + x_{-1}D^{-1} + x_0

    xQ = x_1 D + x_2 D^2 + ....

Then any input x is associated with a state at time 1 given by

    s = xPGQ;

conversely, any state in Σ can be so expressed using any input giving that state. Now the state space Σ is seen to satisfy the conditions to be a vector space over the field F, for P, G, and Q are all linear over F; that is, if
`
`

`

    s_1 = x_1 PGQ

and

    s_2 = x_2 PGQ,

then the combinations

    s_1 + s_2 = (x_1 + x_2)PGQ

    αs_1 = (αx_1)PGQ,    α ∈ F,

are also states. As a vector space, Σ therefore has a finite dimension dim Σ, or else the number of states q^{dim Σ}, where q is the number of elements in the finite field, would be infinite.
`
Consider now for simplicity a single-input single-output linear sequential circuit with transfer function g, so y = xg. Let s_d be the state obtained from a unit input at time −d:

    s_d = D^{-d} gQ,    d ≥ 0.

If the state space has finite dimension dim Σ, then at most dim Σ of these states can be linearly independent, and in particular there is some linear dependence relation (with coefficients ψ_d in F) between the first dim Σ + 1 of these states:

    0 = Σ_{d=0}^{dim Σ} ψ_d s_d = ψ(D^{-1})gQ,

where

    ψ(D^{-1}) = Σ_{d=0}^{dim Σ} ψ_d D^{-d}

is some nonzero polynomial over F of degree dim Σ or less in the indeterminate D^{-1}. In order for ψ(D^{-1})g to be 0 at time 1 and later, g itself must be equal to a ratio of polynomials h(D^{-1})/ψ(D^{-1}), with the degree of the numerator h(D^{-1}) not greater than that of the denominator ψ(D^{-1}) in order that g be causal, del g ≥ 0. Any sequence g that can be so expressed will be called a realizable function, realizable meaning both causal and finite. Clearly a realizable function can also be expressed (by multiplying numerator and denominator by D^{deg ψ}) as a ratio of polynomials in D rather than D^{-1}, in which case the condition deg h ≤ deg ψ is transformed to the condition that D not be a factor of the denominator after reduction to lowest terms.
`
We shall always assume that the expression h(D^{-1})/ψ(D^{-1}) of a realizable function g has been reduced to lowest terms and has a monic denominator (ψ_{deg ψ} = 1). A canonical realization of such a realizable function is shown in Fig. 2. The realization involves a feedback shift register with deg ψ memory elements storing elements of F, which essentially performs long division; if h/ψ is in lowest terms then the dimension of the state space dim Σ is also equal to deg ψ.

A convolutional encoder is then any linear sequential circuit whose transfer-function matrix G is realizable, that is, has components that are realizable functions.
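The long division performed by the Fig. 2 register can be sketched directly. In this sketch (our code, not the paper's circuit), coefficient lists are low-order-first and psi[0] = 1, i.e., D is not a factor of the denominator, so the expansion is causal:

```python
# Power-series expansion of a realizable function h(D)/psi(D) over GF(2)
# by long division from the low-order end, one coefficient per shift.

def expand(h, psi, n):
    """First n power-series coefficients of h(D)/psi(D) over GF(2).
    h, psi: coefficient lists, low-order-first; requires psi[0] == 1."""
    h = h + [0] * n          # working remainder, padded
    out = []
    for _ in range(n):
        c = h[0]             # next quotient coefficient (since psi[0] = 1)
        out.append(c)
        if c:                # subtract c * psi (same as add, modulo 2)
            for i, p in enumerate(psi):
                h[i] ^= p
        h.pop(0)             # shift the register
    return out

# 1/(1 + D) = 1 + D + D^2 + ..., the infinite sequence mentioned earlier
coeffs = expand([1], [1, 1], 6)
```

The feedback taps correspond to the coefficients of psi, and the register length is deg psi, matching the dimension of the state space.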
`
It is clear that a brute-force finite-state realization of such an encoder always exists, since one can simply realize the kn components and sum their outputs; in Appendix II we discuss the result of realization theory that shows how to construct a canonical realization, namely, one with a number of memory elements equal to the dimension of the abstract state space. We see that the only respect in which this definition is more general than that of a conventional convolutional encoder is that realizable generators in general involve feedback and, as sequences, have infinite length. It might seem that getting an infinite-length generator sequence from a finite-state encoder is a good bargain, but we shall see in the sequel that in fact feedback buys nothing.
`
Communications Context

So far we have defined a general convolutional encoder merely as a general realizable linear sequential circuit, and developed basic concepts in linear system theory. It is only when we put the encoder in a communications context that questions unlike those posed in linear system theory arise. The most important new concept involves a definition of encoder equivalence very different from the linear-system-theoretic one.

Consider then the use of a convolutional encoder in a communications system, shown in Fig. 3. From the k input sequences x, called information sequences, the encoder G generates a set of n output sequences y, called a codeword, which is transmitted over some noisy channel. The received data, whatever their form, are denoted by r; a decoder operates on r in some way to produce k decoded sequences x̂, preferably not too different from x.
We see immediately that all is lost if the encoder map G is not one-to-one, for then even if there were no noise in the channel the decoder would make mistakes. In fact, by linearity, if y = x_1 G = x_2 G for two different inputs, then for any output y' there are at least two inputs x' and x' + x_1 − x_2 such that

    y' = x'G = (x' + x_1 − x_2)G;

and by constancy the difference between the inputs may be made to extend over all time by concatenating them if they are not infinite already, so that the probability of decoding error would be at least 1/2. We therefore require that the encoder map be one-to-one in any useful convolutional encoder. There is therefore some inverse map G^{-1}, obviously linear and constant, which takes outputs y back to the unique corresponding input x:

    xGG^{-1} = x    for all x;

    GG^{-1} = I_k,

where I_k is the identity transformation. (Here we have used an n × k transfer-function matrix representation for G^{-1}, as we may, since G^{-1} is constant and linear; in matrix terminology G^{-1} is a right inverse.) Of course this inverse map may not be realizable, but we shall show below that when there is any inverse there is always a
`

`

`
`
`
`
Fig. 2. Canonical realization of the transfer function h(D^{-1})/ψ(D^{-1}).

Fig. 3. Communications system using convolutional coding.
`
realizable pseudo-inverse Ĝ^{-1} such that GĜ^{-1} = I_k D^d for some d ≥ 0. Ĝ^{-1} is also called an inverse with delay d, and G is sometimes called information-lossless of order d.
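As a concrete check (our illustration), the Fig. 1 encoder G = [1, 1 + D + D^2] has the zero-delay feedback-free right inverse G^{-1} = (1, 0)^T, since GG^{-1} = 1·1 + (1 + D + D^2)·0 = 1: the input is simply read off the systematic output. A sketch over GF(2):

```python
# Verify that (1, 0)^T is a zero-delay right inverse of G = [1, 1 + D + D^2].

def poly_mul_gf2(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj
    return out

def poly_add_gf2(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) ^ (b[i] if i < len(b) else 0)
            for i in range(n)]

def trim(p):
    while p and p[-1] == 0:
        p = p[:-1]
    return p

x = [0, 1, 1, 0, 1]                  # x(D) = D + D^2 + D^4
y1 = poly_mul_gf2(x, [1])            # codeword y = x G
y2 = poly_mul_gf2(x, [1, 1, 1])

# x_hat = y G^{-1} = y1 * 1 + y2 * 0
x_hat = poly_add_gf2(poly_mul_gf2(y1, [1]), poly_mul_gf2(y2, [0]))
recovered = trim(x_hat) == trim(x)   # the input is recovered with no delay
```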
`
Returning to Fig. 3, we now split the decoder conceptually into two parts, a codeword estimator that produces from the received data r some codeword estimate ŷ, followed by a realizable pseudo-inverse Ĝ^{-1} that assigns to the codeword estimate the appropriate information sequences x̂ (see Fig. 4). In practice a decoder is usually not realized in these two pieces, but it is clear that since all the information about the information sequences x comes through y, the decoder can do no better estimating x directly than by estimating y and making the one-to-one correspondence to x. (In fact, any decoder can be made into a codeword estimator by appending an encoder G.) When G is one-to-one, as long as the codeword estimator makes no errors, there will be no error in the decoded sequences. However, even in the best-regulated communications systems, decoding errors will sometimes occur. We define the error sequences e as the difference between the estimated codeword ŷ and the codeword y actually sent:

    e = ŷ − y.

Correspondingly the information errors e_x are defined as the difference between the decoded sequences x̂ and the information sequences x, with allowance for the pseudo-inverse delay d:

    e_x = x̂D^{-d} − x = ŷĜ^{-1}D^{-d} − yĜ^{-1}D^{-d} = eĜ^{-1}D^{-d}.

Fig. 4. Same system with decoder in two parts.
`
As is implicit in our terminology, we consider the error sequences e as the more basic of the two. (The one-to-one correspondence between these two definitions is exhibited explicitly through the inverse Ĝ^{-1}.)

Since the error sequences are the difference between two codewords, e is itself a codeword, in fact the codeword generated by e_x:

    e = e_x G.

We make a decomposition of e into short codewords, called error events, as follows. Start at some time when the codeword estimator has been decoding correctly. An error event starts at the first subsequent time at which e is nonzero. The error event stops at the first subsequent time when the error sequences in the event form a finite codeword, after which the decoder will be decoding correctly again. Thus we express e as a sum of nonoverlapping finite codewords, the error events.

Implicit in the above analysis is the assumption that infinite error events do not occur. Such a possibility is not excluded in principle, but can generally be disregarded, on the basis of the following plausibility argument. If two codewords differ in an infinite number of places, then as time goes on, we can expect the evidence in the received data r to build up in favor of the correct word, with probability approaching 1. A well-designed codeword estimator will use this information efficiently enough that very long error events have very small probabilities and infinite error events have probability 0. More precisely, the average error-event length should be small, at least for channel noise which is in some sense small, so that most of the time the decoder is decoding correctly. We say that any codeword estimator or decoder not satisfying this assumption is subject to ordinary error propagation, and exclude it from the class of useful decoders. Note that any decoder that is capable of starting to decode correctly sooner or later, regardless of what time it is
`
`

`

turned on, can be made immune to ordinary error propagation simply by restarting whenever it detects it is in trouble, or even periodically.

We owe to Massey and Sain [4] the observation that if there is any infinite information sequence x_0 such that the corresponding codeword y_0 = x_0 G is finite, then even in the absence of ordinary error propagation decoding catastrophes can occur. For there will generally be a nonzero probability for the finite error event e = y_0 to occur, which will lead to infinite information errors e_x = x_0. Thus x̂ will differ from the original information sequences x in an infinite number of places, even if no further decoding errors are made by the codeword estimator. (In fact, the only chance to stop this propagation is to make another error.) Massey and Sain call this catastrophic error propagation. Since G^{-1} must supply the infinite output x_0 in response to the finite input y_0, it must have internal feedback for the above situation to occur. We therefore require that any useful encoder must not only have a realizable pseudo-inverse Ĝ^{-1}, but one that is feedback-free. (We see later that if G has no such inverse, then there is indeed an infinite input leading to a finite output.)
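The Massey-Sain situation can be exhibited concretely. A standard catastrophic example (ours to illustrate; it is not one of this paper's figures) is the rate-1/2 feedback-free encoder with g_1 = 1 + D and g_2 = 1 + D^2, whose generators share the common factor 1 + D: the infinite input x = 1/(1 + D) = 1 + D + D^2 + ... produces the finite codeword y_1 = 1, y_2 = 1 + D. A truncated check:

```python
# Catastrophic encoder check: an all-ones input (a truncation of the infinite
# sequence 1 + D + D^2 + ...) produces outputs that are zero except near
# time 0 and the truncation edge, i.e., an essentially finite codeword.

def poly_mul_gf2(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj
    return out

N = 64
x = [1] * N                       # truncated infinite input
y1 = poly_mul_gf2(x, [1, 1])      # g1(D) = 1 + D    -> 1 + D^64
y2 = poly_mul_gf2(x, [1, 0, 1])   # g2(D) = 1 + D^2  -> 1 + D + D^64 + D^65

# Away from the truncation edge the outputs vanish:
y1_core = y1[:N - 2]              # only the constant term survives
y2_core = y2[:N - 2]              # only 1 + D survives
```

A finite error event e = (1, 1 + D) would thus be decoded back into infinitely many information errors, which is exactly the catastrophe described above.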
We come at last to our most important observations (see also [3]). The codeword estimator dominates both complexity and performance of the system: complexity, because both G and G^{-1} represent simple one-to-one (and in fact linear) maps, while the codeword estimator map from received data r to codeword ŷ is many-to-one; performance, because in the absence of catastrophic error propagation the probability of decoding error is equal to the probability of an error event multiplied by the average decoding errors per error event, with the former generally dominant and the latter only changed by small factors for reasonable pseudo-inverses Ĝ^{-1} (in fact, in many practical applications the probability of error event is the significant quantity, rather than the probability of decoding error). But the performance and complexity of the codeword estimator depend only on the set of codewords y, not on the input/output relationship specified by G (assuming that all x and hence y are equally likely, or at least that the codeword estimator does not use codeword probabilities, as is true, for example, in maximum-likelihood estimation). Therefore it is natural to assert that two encoders generating the same set of codewords are essentially equivalent in a communications context. So we give the following definitions.

Definition 2: The code generated by a convolutional encoder G is the set of all codewords y = xG, where the k inputs x are any sequences.

Definition 3: Two encoders are equivalent if they generate the same code.

These definitions free us to seek out the encoder in any equivalence class of encoders that has the most desirable properties. We have already seen the desirability of having a feedback-free realizable pseudo-inverse Ĝ^{-1}. Our main result is that any code can be generated by an encoder G with such a Ĝ^{-1}, and in fact with the following properties.
`
1) G has a realizable feedback-free zero-delay inverse G^{-1}.
2) G is itself conventional (feedback-free).
3) The obvious realization of G requires as few memory elements as any equivalent encoder.
4) Short codewords are associated with short information sequences, in a sense to be made precise later.

In the context of linear system theory, the study of convolutional encoders under equivalence can be viewed as the study of those properties of linear systems that belong to the output space alone, or of the invariants over the class of all systems with the same output spaces.
`
Algebra

We assume that the reader has an algebraic background roughly at the level of the introductory chapters of Peterson [9]. He will therefore be familiar with the notion of a field as a collection of objects that can be added, subtracted, multiplied, and divided with the usual associative, distributive, and commutative rules; he will also know what is meant by a vector space over a field. Further, he will understand that a (commutative) ring is a collection of objects with all the properties of fields except division. He should also recall that the set of all polynomials in D with coefficients in a field F, written conventionally as F[D], is a ring.

The polynomial ring F[D] is actually an example of the very best kind of ring, a principal ideal domain. The set of integers is another such example. Without giving the technical definition of such a ring, we can describe some of its more convenient properties. In a principal ideal domain, certain elements r, including 1, have inverses r^{-1} such that rr^{-1} = 1; such elements are called units. The unit integers are ±1, and the unit polynomials are those of degree 0, namely, the nonzero elements of F. Those elements that are not units can be uniquely factored into products of primes, up to units; a prime is an element that has no factor but itself, up to units. (The ambiguity induced by the units is eliminated by some convention: the prime integers are taken to be the positive primes, while the prime polynomials are taken to be monic irreducible polynomials, where monic means having highest-order coefficient equal to 1.) It follows that we can cancel: if ab = ac, then b = c; this is almost as good as division. Further, we have the notion of the greatest common divisor of a set of elements as the greatest product of primes that divides all elements of the set, again made unique by the conventional designation of primes.

Other principal ideal domains have already occurred in our discussions. In general, if R is any principal ideal domain and S is any multiplicative subset—namely, a set of elements containing 1 but not 0 such that if a ∈ S, b ∈ S, then ab ∈ S—then the ring of fractions S^{-1}R consisting of the elements r/s where r ∈ R and s ∈ S is a principal ideal domain. (All elements of R that are in S are thereby given inverses and become units.) Letting R be the polynomials F[D], we have the following examples.
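Greatest common divisors in F[D] can be computed by the Euclidean algorithm, just as for the integers. A minimal sketch (ours) over F = GF(2), where every nonzero polynomial already has leading coefficient 1, so the monic convention is automatic:

```python
# Euclidean algorithm for polynomial gcd over GF(2).
# Coefficient lists are low-order-first: [1, 0, 1] means 1 + D^2.

def trim(p):
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def poly_mod_gf2(a, b):
    """Remainder of a divided by b over GF(2); b must be nonzero."""
    a, b = trim(a[:]), trim(b)
    while len(a) >= len(b):
        shift = len(a) - len(b)          # cancel the leading term of a
        for i, bi in enumerate(b):
            a[shift + i] ^= bi
        a = trim(a)
        if not a:
            break
    return a

def poly_gcd_gf2(a, b):
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod_gf2(a, b)
    return a

# gcd(1 + D^2, 1 + D): over GF(2), 1 + D^2 = (1 + D)^2, so the gcd is 1 + D.
gcd_poly = poly_gcd_gf2([1, 0, 1], [1, 1])
```

This is the same computation that detects the common factor in the catastrophic example earlier: a gcd that is not a unit signals trouble.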
`
`

`

`man TRANSACTIONS ON INFORMATION THEORY, NOVEMBER 1970
`
`if there is any decomposition G = AI‘B such that A and
`B are invertible R-matrices and P is a diagonal matrix
`with 7.-
`I 7;“ or 7H1 = 0, then the 7,- are the invariant
`factors of G with respect to R.
Sketch of Proof [10]: G is said (in this context only)
to be equivalent to G′ if G = AG′B, where A and B are
square k × k and n × n R-matrices with unit deter-
minants; the assertion of the theorem is that G is equiva-
lent to a diagonal matrix Γ that is unique under the
specified conditions on the γ_i. Since any such A can
be represented as the product of elementary row operations,
and B of elementary column operations (interchange
of rows (columns), multiplication of any row (column)
by a unit in R, addition of any R-multiple of any row
(column) to another), it can be shown that the determinantal
divisors Δ_i are preserved under equivalence. In particular,
therefore, Δ_1 divides all elements of all equivalent matrices. We
will now show that there exists an equivalent matrix
in which some element divides all other elements, hence
is equal to Δ_1 up to units. Let G not be already of this
form, and let α and β be two nonzero elements in the same
row or column such that α does not divide β. (If there is
no element in the same row or column as α not divisible
by α, there is some such β in some other column, and this
column can be added to the column containing α to give
an equivalent matrix for which the prescription above
can be satisfied.) By row or column permutations α may
be placed in the upper-left corner and β in the second
entry of the first row or column; we assume column for
definiteness. Now there exist x and y such that αx + βy = δ,
where δ is the greatest common divisor of α and β, and δ
has fewer prime factors than α since α ∤ β, so δ ≠ α. The
row transformation below then preserves equivalence
while replacing α by δ:

    [   x      y   ]
    [ -β/δ    α/δ  ]

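Over the integers (for illustration; the same computation works in any Euclidean domain such as F[D]), the x and y above come from the extended Euclidean algorithm, and the resulting 2 × 2 matrix has determinant (αx + βy)/δ = 1, hence unit determinant. A numeric sketch:

```python
def ext_gcd(a, b):
    """Extended Euclid: returns (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x1, y1 = ext_gcd(b, a % b)
    return g, y1, x1 - (a // b) * y1

alpha, beta = 6, 4                      # alpha does not divide beta
g, x, y = ext_gcd(alpha, beta)          # g = 2, with alpha*x + beta*y == 2
T = [[x, y], [-beta // g, alpha // g]]  # the row transformation from the text
det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
print(det_T)                            # 1: unit determinant, equivalence preserved
new_col = [T[0][0] * alpha + T[0][1] * beta,
           T[1][0] * alpha + T[1][1] * beta]
print(new_col)                          # [2, 0]: alpha replaced by delta = gcd
```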
If δ does not now divide all elements of the equivalent
matrix, the construction can be repeated and δ replaced
by some δ′ with fewer prime factors. This descending
chain can therefore terminate only with a δ that does
divide all elements of the equivalent matrix. Since δ =
Δ_1 = γ_1 up to units, multiplication of the top row by a
unit puts γ_1 in the upper-left corner, and the whole first
row and column can be cleared to zero by transformations
of the above type (with x = 1, y = 0), giving the equiv-
alent matrix

    [ γ_1 |  0  ]
    [  0  | G_1 ]

where γ_1 divides every element of G_1. Similarly, G_1 is
equivalent to a matrix G_2 of the same form, so
1) Let S consist of all nonzero polynomials. Then S⁻¹R
consists of all ratios of polynomials with the denominator
nonzero, which are called the rational functions, written
conventionally F(D). Obviously, in F(D) all nonzero
elements are invertible, so F(D) is actually a field, called
the field of quotients of F[D].
2) Let S consist of all nonnegative powers of D, in-
cluding D⁰ = 1. Then S⁻¹R consists of elements D^(−l) f(D),
where f(D) is a polynomial; in other words, S⁻¹R is the
set of finite sequences. Clearly all the irreducible
polynomials except D remain as primes in this ring.
3) Let S consist of all polynomials with a nonzero constant
term; that is, not divisible by D. Then S⁻¹R consists of
ratios of polynomials in D with a nonzero constant term
in the denominator. We saw earlier that these are precisely
the generators realizable by causal finite-state systems;
these are
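The realizability claim in example 3 can be illustrated by formal long division (a sketch of ours, assuming the binary field F = GF(2) and coefficient lists in ascending powers of D; the helper `series` is a name we introduce): a ratio whose denominator has a nonzero constant term expands into a well-defined causal power series, i.e., an impulse response that a sequential circuit can produce term by term.

```python
def series(num, den, n):
    """First n coefficients of num(D)/den(D) over GF(2), where den[0] == 1
    (nonzero constant term), computed by formal long division."""
    work = num + [0] * (n + len(den))
    out = []
    for i in range(n):
        c = work[i] % 2
        out.append(c)
        if c:  # cancel the leading term: subtract (== add, mod 2) den(D) * D^i
            for j, d in enumerate(den):
                work[i + j] = (work[i + j] + d) % 2
    return out

print(series([1], [1, 1], 6))      # 1/(1+D) = 1 + D + D^2 + ... -> [1, 1, 1, 1, 1, 1]
print(series([1], [1, 1, 1], 6))   # 1/(1+D+D^2) -> [1, 1, 0, 1, 1, 0]
```

Had the denominator been divisible by D (constant term zero), the leading coefficient of the quotient could not be determined causally, which is why such ratios are excluded from S.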
