`
`James L. Massey
`Freimann Professor of Electrical Engineering
`University of Notre Dame
`Notre Dame, Indiana, h6556, U.S.A.
`
`ABSTRACT
`
`The use of coding in digital communications,to
`be effectiVe,requires that the modulation system be
`designed on an unconventional basis. Rather than
`using "error probability" as the modulation criter-
`ion, this paper argues that the appropriate modula-
`tion criterion is the "cut-off rate" R of the dis-
`crete channel which the modulation system creates.
`It is then shown that the R criterion leads to a
`rich "communication theory"°of its own in which the
`optimality of the simplex signal set can be proved
`(rather than conjectured) and in which "soft deci-
`sion" demodulators can be systematically designed.
`At the same time, the R0 criterion leads to a modula-
`tion system compatible with the use of effective cod-
`ing techniques in an overall efficient digital com-
`munication system.
`
`I.
`
`Introduction
`
`Information theory has been around for a suffi-
`ciently long time that its model of a digital com-
`munication system, given in Fig. l, no longer star-
`tles the communications engineer. Most communica-
`tions engineers would now agree that encoding and
`modulation are both-aspects of the signal design
`problem, and that demodulation and decoding are like-
`wise both aspects of the signal detection problem.
`Yet coding techniques,
`in spite of the great re-
`search effort and resultant large literature thereon,
`are seldom used in practical systems. Ten years ago,
`the "experts" held that coding was an interesting
`intellectual game that lacked practical relevance;
`their motto was then "Coding is dead."
`In the mean-
`time, coding has been used with spectacular success
`in deep-space communications.
`Now the "experts" are
`saying that the deep-space channel, i.e. the additive
`white Gaussian noise channel constrained in energy
`but not
`in bandwidth,
`is the only channel where
`coding techniques will ever be practical. The motto
`is now "Coding is dead except for the deep-space
`channel."
`
`In this paper, we shall argue that this pessi-
`mism about coding stems from the failure of communi-
`cations engineers truly to accept the basic model of
`Fig. 1.
`
`In Section III, we define certain concepts im-
`plicit in Fig. 1. We then argue in Section IV that
`acceptance of the Fig. 1 model forces the communi-
`cations engineer to discard "error probability" as a
`modulation design criterion in favor of what we call
`the "R0 criterion" where R0 is the cut-off rate of
`the discrete memoryless channel createdaby the modu-
`lation system. Although the R criterion is intend-
`ed to lead to design of modulation systems which are
`compatible with effective coding systems, we show in
`the remainder of the paper that the R0 criterion
`
`leads to a "communication theory" in many ways
`stronger and more profound than that based on "error
`probability."
`To illustrate this point, we prove in
`Section V that the simplex signal set is optimum
`for the R criterion on the additive White Gaussian
`noise chagnel.
`From our proof, we are able to make
`some fundamental observations that apply to all
`Signal sets. Finally, in Section VI we show how
`the R criterion leads to the systematic design of
`"softodecision" demodulators for which an "error
`probability" criterion would be meaningless.
`
`II. Basic Definitions
`
`We shall suppose the encoded digits X in Fig. l
`are q—ary letters, i.e. that the modulator can
`generate any of q signals as the transmitted wave-
`form s(t) in each signalling interval.
`The rate R
`0f the coding systems is the number of binary digits
`U per encoded digit X (on a long-time basis), i.e.
`the number of source bits sent per use of the chan-
`nel. We shall suppose that the demodulaped digits
`Y are q’-ary letters where in general q'=q.
`The
`decoder output digits U are the estimates of the
`source delivered to the sink.
`
`From the coding viewpoint, the modulator, wave-
`form channel, and demodulator together constitute a
`discrete channel with q input letters and q' output
`letters. We shall assume this channel is memoryless,
`i.e. that the channel acts independently in each
`signalling interval, an assumption which is not so
`restrictive as might first appear since nothing pre-
`vents the signalling intervals from being very long
`(and in fact comprising many intervals from a
`"smaller" modulator.)
`In information theory parl-
`ance, X and Y constitute the input and output digits
`respectively of a discrete memoryless channel or
`(DMC). We shall let p(y1x) denote the probability
`of receiving the output letter y when the input
`letter x is sent, and these transition probabilities
`completely define the DMC.
`
`III.
`
`A Sensible Modulation Criterion
`
`The design criterion for modulation systems in
`almost universal use is "error probability," i.e.
`P(y+x). For this criterion to be meaningful, it is
`necessary to choose q'=q (usually both are 2) so
`that the input and output letters can share the
`same alphabet—-a seemingly innocuous step that is
`fatal to the effective use of coding. The communi-
`cations engineer who uses this criterion presumes
`that the purpose of coding is to "correct errors"
`made by the demodulator, and he comonly refers to
`codes in general as "error-correcting codes."
`Fe
`somehow believes that if the "error probability" is
`unacceptably large, then coding should be able to
`rescue his design and he is inevitably disappointed
`to learn that the necessary code redundancy is so
`large that his overall system is quite inefficient.
`
`1974 ZURICH SEMINAR
`
`E2
`
`(1)
`
`ERICSSON EXHIBIT 1010
`
`
`
`‘“'r“-‘r'~“"“"w'r‘***='-wrr-=~“-2-~“—":v~«:-we-a—-w-w—~w-—-<s-s~x--N-nu‘;-.\--.-..a—=...—_....=....._....._...,._.‘._._._...__..__:__,,4
`
`
`
`that for any practical demodulator. Demodulator
`design then reduces to selecting a finite q' which
`does not materially reduce the unquantized R .
`It
`is easily seen fro (1) that the unquantizedofi
`is
`given by
`Q
`Q *°
`
`O
`
`X I Vpi(§}pJ(gjdgQ(i)Q(J)} (3)
`R0 = -logzfmin Z
`Q(x)i=l J=l-=
`
`) is the density function for §_given
`where p (
`that the i-th signal vector, gi, is transmitted.
`
`The surprising aspect of (3), which seems not
`to have been previously noticed, is that the for-
`midable-looking integral which it contains can often
`be easily evaluated. For instance, when the channel
`has additive white Gaussian noise (AWGN) so that
`£y§fg_where the components of 3 are statistically
`independent Gaussian random variables with means 0
`and variances N /2, then (3) reduces to the remark-
`ably simple
`°
`
`q
`q
`X e
`Ro = -log2{min X
`q(i)i-l 3-1
`
`-Is -3 I2/hN
`'1 ‘fl
`
`°a.(1)o.(3)}
`
`(h)
`
`[M student, R. Johannesscn, has recently obtained a
`similarly simple evaluation of (3) when 3 is a
`general Gaussian variate and is presently extending
`the results of this paper to that case, i.e. to the
`non-white additive Gaussian noise channel.]
`
`With (h), we now have a closed—form expression
`for the unquantized (q'=¢) R of the AWGN channel
`which is quite amenable to agalysis.
`
`‘ V.
`
`Simplex Optimality
`
`It has long been conjectured, but never proved,
`that for the "error probability" criterion the best
`set of q signal vectors with average energy E on the
`AWGN channel is the simplex set. we now prove that
`for the R0 criterion the simplex is indeed optimal.
`
`We see from (R) that R0 depends only on the
`differences between signal vectors, so that for a
`given average energy
`cilia
`s
`i=l '1
`
`Q(i)
`
`E =
`
`(S)
`
`an optimal signal set will have its centroid at the
`origin, i.e. we may suppose without loss of
`optimality that
`
`Q X
`
`s q(1) = Q,
`1=1"
`
`with some slight manipulation, we find that
`(6) imply
`2
`Q
`Q
`X I21-gjl 0.(i)Q(J) = as
`X
`1=1 3:1
`
`[We
`which has a surprising simplicity of its own.
`also note that (7) is unchanged if we exclude the
`term for j=i in the second sumation.]
`
`Now letting
`
`b =
`
`§
`i=l
`
`q2(1)
`
`(6)
`
`(5) and
`
`(7)
`
`(8)
`
`The next step is to redesign the modulation system
`to give an acceptable "error probability" without
`coding.
`
`The crux of the problem is that "error probabil-
`ity"is a modulation design criterion which is sensi-
`ble only if coding is not used.
`It should then not
`be surprising that modulation systems designed on
`this basis prove to be ill-matched to coding systems.
`
`Our discussion in the previous section suggests
`that the real goal of the modulation system is to
`create the "best" DMC as seen by the coding system.
`We now argue that the sensible design criterion for
`a digital modulation system is the cut-off rate R
`°
`of this DMC, the bigger R
`for a given average
`energy E in the signalling interval, the better the
`modulation system. Mathematically, R0 is given by
`R0 = -log2{min X[:V§(yIx) Q(x)]2}
`(l)
`Q(x):r X
`where the minimization is over all probability dis-
`tributions Q(x) on the channel input letters.
`
`The true error probability of interest in Fig. l
`is Pe=P(U+U), i.e. the probability that the bit
`delivered to the sink differs from that which came
`from the source. Since the work of Viterbi1, it has
`been known that if convolutional coding techniques
`are used on the DMC, then one can achieve
`-nR
`
`(2)
`
`Pe ; cR2
`
`o
`
`if Rgho
`
`where cR is a small constant best found by simulation
`and where n is the constraint length (in channel
`letters) of the convolutional code. The one nuber
`
`R0 then gives both a region of rates where it is
`possible to operate with arbitrarily small probabil-
`ity of error 55; an exponent of error probability
`(which Viterbi has shown is the best possible expo-
`nent for rates R near R .) No other single number
`gives such useful inforgation, not even "channel
`capacity" which gives only a range of rates where
`operation is possible. Moreover, R
`is also the
`"R
`" of sequential decoding’, 1.3. the rate above
`whiggpthe average number of decoding steps per de-
`coded digit becomes infinite.
`In practice, sequen-
`tial decoders can comfortably operate at rates R
`near R0 (or even slightly greater, but only slightly).
`The thesis that R
`is the sensible criterion for
`modulation system desg
`is not new. Soe years ago,
`Wozencraft and Kennedy made this same proposal but
`it fell then on deaf ears. The intervening years
`have supplied results that now let us resurrect
`their proposal and to demonstrate that not only is
`it the logically right criterion but also that it
`can be effectively employed in modulation system
`design.
`
`IV. Siggal Space Considerations
`
`We now suppose that the signal s(t) can be
`written as a linear combination of N orthonormal
`functions and hence representable by the vector 5
`of its coefficients in this expansion. We presume
`that r(t) is reduced to its projection ; on these
`a
`same functions.
`4
`
`If one first considers a demodulator which
`
`we can rewrite (h) as
`
`merely passes along 3 to the decoder (requiring of
`course q'=~), one obtains an R0 which upper bounds
`
`E2
`
`(2)
`
`
`
`In practice,
`the performance for any demodulator.
`one would examine the R attainable with q'=2,h,8,
`16,... until one had obtained an R
`sufficiently
`close to the unquantized R
`to Justify a practical
`choice at this smaller number of quantization levels.
`To carry out such a design procedure, one needs only
`to know how, for a given q', to set the quantization
`levels so as to optimize the resulting R . Again
`this problem is nicely amenable to solation.
`
`For a given q‘, the modulation system reduces
`to a DMC with q=2 input letters (which we call 0 and
`1) and q' output letters (which we call yl,y2,...
`y ,). For a given output letter yi, we define its
`likelihood ratio as
`
`P(yiIO)
`(lh)
`.
`X(Yi) = §?§:TI)
`For q=2, it is easily seen that the optimizing G(x)
`in (1) is always Q(O)=Q(l)=l/2 so that (1) reduces
`to
`
`ql
`
`R0 = 1 -log2[l+Z/§(y1lO)P(yi(l)].
`i=l
`
`(15)
`
`For a given received vector g, we define its likeli-
`hood ratio as
`
`90(2).
`>.(g_) = 31-53
`where p (g) is the density function in effect for 3
`when 0 is sent. Without loss of optimality in R ,
`one can always first convert g_to 1(3) and hence°con—
`sider the demodulator design problem as that of
`choosing quantization thresholds for the scalar .
`quantity X(g). Let T be a quantization threshold
`between regions of A(q) where yi and y
`respectively
`will be the demodulator output.
`If T is optimally
`chosen, one must have
`
`8R0
`n~"=°-
`
`Applying this condition in (15), one can readily
`show that
`
`(16)
`T = />.(bi)>.(bJ)
`It
`is the necessary condition for T to be optimal.
`follows that the quantization of A(g) is optimal if
`and only if each quantization threshold is the
`geometric mean of the likelihood ratios for the re-
`sulting demodulator output letters whose decision
`regions the threshold separates.
`
`This result suggests the following algorithm to
`find an optimal set of thresholds for a given desired
`number q’ of output letters. Let yi be the output
`letter if T
`<X(g)§T where of course T = -w and
`T ,3 +~. One chooses T arbitrarily whigh then
`dgtermines X(yl). One then finds T2 so that T
`satisfies 16. The choice of T
`then determines
`A(y2) and one then finds T
`so that T
`satisfies (16).
`Etc.
`If this procedure ca; be completed but T , 1
`does not satisfy (16), then T must be increasgd'
`and the procedure repeated.
`if the procedure can-
`not be completed,
`then T1 must be decreased and the
`procedure repeated.
`'
`
`Q
`
`Q
`
`2 e
`R0 = -logefmin [b+E
`Q(i)
`i=1 j=l
`J+i
`
`_ l§i—:5_J |2/hm°Q(1)Q(J)]}
`
`Now using the convexity of the exponential (and not-
`ing that the factors Q(i)Q(J)
`sum to l-b) and
`making use of (7). we obtain
`-E/2No(l-b)
`
`]}
`
`(9)
`
`R
`
`< -log {min [b+(l-b)e
`o -
`2
`.
`Q(1)
`
`is independent
`with equality if and only it Is -5 I
`of i and J when i+J, i.e. if and o ly if the signal
`set is a simplex.
`The minimizing Q(i) in (9) is
`easily seen to be Q(i)=l/q for all i which gives
`-qE/(Q-1)2N°
`
`1
`R0 ,<_ +1°s2q -l0s2[l+(q-l)e
`with equality if and only if the signal set is a
`simplex whose signal vectors are equally likely.
`
`(10)
`
`Inequality (1) has some profound implications
`for digital communications. For the simplex set,
`(10) hold with equality so we see further that
`
`<1E/(q—l)2N
`
`° + (q-1)]
`
`(11)
`
`dR°
`-5- = 0.72 NO q/[e
`which decreases monotonically with E. Hence signal
`energy is most effectively used when E/No is small
`(and hence when the "error probability" would be
`large if we were designing a conventional modulation
`system.)
`On the other hand, the derivat ve in (11)
`is nearly equal to its maximum value at F =0 in the
`range where the exponential term is domingted by
`(q—l) and in this range we have
`
`1
`0 ~
`-up
`cm - .72 N .
`0
`Hence
`
`(12)
`
`E
`(13)
`.72 No
`R0 _<_
`is an upper bound applying to any modulation scheme
`(regardless of q) for the AWGN channel and holding
`with near equality for the q-ary simplex in the
`region
`
`3 logaq.
`
`E E
`
`o
`
`This shows that the only reason one might choose a
`large q (rather than q=2) would be if system con-
`straints required a large value of E/N since then
`a large q is required for operation inothe energy-
`efficient range where (13) holds with near equality.
`
`VI. Demodulator Considerations
`
`If one thing were ever clear, it is certainly
`that the "error probability" criterion is useless
`when designing a demodulator to make "soft decisions"
`(q'>q) rather than "hard decisions" (q'=q).
`On the
`other hand, the R0 criterion is natural for guiding
`the communications engineer in this task as we now
`proceed to show.
`
`Suppose for simplicity of discussion that binary
`modulation, i.e. q=2, is to be useda We have al-
`ready seen how the unquantized (i.e. q'=m) R0 bounds
`
`1974 ZURICH SEMINAR
`
`E2
`
`(3)
`
`
`
`VII. References
`
`1. A. J. Viterbi, "Error Bounds for Convolutional
`Codes and an Asymptotically Optimum Decoding
`Algorithm," IEEE Trans. Info. ’1'h., IT-13.
`pp. 260-269. April 19 7.
`
`2.
`
`3.
`
`J. M. Wozencraft and I. M. Jacobs, Principles of
`Communication gggineering, John Wiley and Sons,
`New York, 19 7.
`
`J. M. Wozencraft and R. S. Kennedy, "Modulation
`and Demodulation for Probabilistic Coding,"
`IEEE Trans. Info. Th., IT-12, pp. 291-297.
`Jul-Y 19
`.
`
`lb.
`
`J. L. Massey, Threshold Decoding, M.I.T. Press,
`Cambridge, Mass., 19 3.
`
`Information
`Source
`
`
`
`Modulator
`
`s t
`
`
`
`waveform
`, Channel
`
`
`
`Demodulator '
`
`Information
`Sink
`
`r(t)
`
`Figure l. The Basic Information-Theoretic
`Model of a Digital Communication System
`
`W student, L. Lee, has carried out this algo-
`rithm to find the optimum quantization rules on the
`scalar (AWGN) channel. The following table shows
`his results for E/No-l
`(OdB):
`
`
`
`Table I:
`Optimum Quantization Thresholds on the
`Scalar AWGN Channel with Binary Antipodal Signals
`for E/Nosl (dB), assuming /E = 1.
`In the example of Table I, the conclusion is
`that q'=8 signal levels is practically optimum, but
`using
`"hard decisions" (q'=2) results in serious
`loss of optimality (about 2dIB).
`
`our main interest in this section has been to
`show that the R criterion provides the communica-
`tions engineer 31th a useful way to approach the
`elusive problem of designing "soft decision" de-
`modulators, but our example is illustrative of the
`general fact that rarely will a "hard decision"
`demodulator be a practically optimum choice. The
`use of a. hard decision demodulator will often cause
`
`a loss of Re which translated into dB may well be
`on the same order as the gain which a sensible cod-
`ing system could provide.
`
`VI . Remarks
`
`In the above,_we‘_have tried to show that the R
`criterion is the sensible one for design of modula9
`tion systems in coded digital communications systems.
`We have also attempted to demonstrate that this
`criterion is at least as simple to handle in dealing
`with signal design questions as is demodulator "error
`probability" and that, moreover, the R criterion
`offers guidance in such matters as theodesign of
`"soft decision" demodulators where the "error
`probability" criterion is mute.
`
`It may seem curious that in a paper intended to
`deal with coding we have dealt mainly with modula-
`tion philosophy. But it is our belief that until
`modulation systems are designed on an R basis the
`coding engineer will inevitably be faced with a
`discrete channel for which coding techniques are
`largely ineffective.
`It is paramount to the design
`of efficient coded systems that the modulation
`engineer aim at maximizing B using as simple a
`signal set (small q) as poss
`1e and a consistently
`larger q'.
`The net result will be a system where
`the "error probability" would be large if the de-
`modulator were then compelled to make "hard deci-
`sions." Coding techniques such as sequential de-
`coding3, Viterbi decoding2 and threshold decoding“,
`all of which employ convolutional codes and make use
`of the "soft decision" information provided by the
`demodulator, will then naturally suggest themselves
`as solutions to the design of the coded portion of
`the over-all digital communications system.
`4
`
`E2
`
`(4)