IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 41, NO. 2, FEBRUARY 1993
A Soft Decision-Directed LMS Algorithm for Blind Equalization

Steven J. Nowlan and Geoffrey E. Hinton
Abstract—We propose a new adaptation algorithm for equalizers operating on very distorted channels. The algorithm is based on the idea of adjusting the equalizer tap gains to maximize the likelihood that the equalizer outputs would be generated by a mixture of two gaussians with known means. The familiar decision-directed least mean square (LMS) algorithm is shown to be an approximation to maximizing the likelihood that the equalizer outputs come from such an i.i.d. source. The algorithm is developed in the context of a binary PAM channel, and simulations demonstrate that the new algorithm converges in channels for which the decision-directed LMS algorithm does not converge.
I. INTRODUCTION

The system which we investigate in this paper is illustrated in Fig. 1. A sequence of data (a_t)¹ is sent through a (linear) channel with unknown impulse response S. The data may take on one of a small number of discrete values (in the sequel we assume that a_t may take on one of two values ±a). The output of the unknown channel (x_t) is then passed through a linear transversal filter W whose impulse response is approximately S⁻¹. The objective of W is to cancel out most of the distorting effects of the channel S so that the sequence (y_t) output by W is a near copy of the original input sequence (a_t). Finally, a zero-order (memoryless) nonlinear decision process (ZNL) examines each output y_t and replaces it with the closest value from the set of discrete input values, producing the sequence (â_t). We wish to consider the general case in which S may be noncausal, so W must introduce some delay into the system. As a result, we desire in general that y_t ≈ a_{t−N} or â_t = a_{t−N}, where N represents the (known) delay of W.
Systems of the type just described occur frequently in digital communications, where data must be converted into analog form for transmission and then converted back into digital form at the receiver (see [11], [14] for a general introduction to this problem). W is generally a filter with adjustable tap weights, and the problem we explore is how to adjust these tap weights so that W becomes a good approximation of S⁻¹. If the sequence (a_t) is known, the classical approach is to use an LMS or stochastic gradient descent procedure [12] to minimize E[(y_t − a_{t−N})²].²

Fig. 1. Block diagram of system of interest.
`In practical situations the sequence
`( a t ) is not known. Instead, a two-step procedure is used
`to adjust the equalizer: 1) an initial settling phase in which
`the transmitter sends a known initialization sequence and the
`receiver performs LMS adjustment of the equalizer and 2) a
`phase in which the receiver uses the output of the ZNL (at)
`as an estimate for ( a t ) in performing LMS adjustment of the
`equalizer.
`The second step of the procedure is usually referred to
`as operating the equalizer in decision-directed mode since
`updates to the taps of the equalizer are controlled by the
`decisions made by the ZNL.
In the classical decision-directed LMS (DDLMS) algorithm [14], [13], the ZNL is a simple threshold device. For the binary channel we are considering, the output of this ZNL can be represented as a sgn(y_t), and the DDLMS algorithm can be regarded as minimizing E[(y_t − a sgn(y_t))²]. In this paper we derive a different form of ZNL for operating the equalizer in decision-directed mode. The modified ZNL is derived by proposing that (y_t) be modeled as the output of an independent, identically distributed (i.i.d.) random process with a gaussian mixture density. The ZNL then corresponds to the maximum likelihood estimate of a_t given y_t and the assumed model. The next section of the paper derives the basic algorithm, and Section III presents some simulation results on a simulated binary PAM channel which demonstrate that the modified equalization algorithm can converge reliably with initial channel error rates considerably greater than those allowed by the classical DDLMS procedure.
Paper approved by the Editor for Channel Equalization of the IEEE Communications Society. Manuscript received April 24, 1990; revised May 23, 1991 and August 12, 1991. This work was supported by a grant from the Ontario Information Technology Research Center.
S. J. Nowlan is with the Computational Neuroscience Laboratory, The Salk Institute, San Diego, CA 92186.
G. E. Hinton is with the Department of Computer Science, University of Toronto, Toronto, Ont. M5S 1A4, Canada.
IEEE Log Number 9207349.
¹The notation (a_t) is used to denote a sequence of real values, while a_t will refer to one element of that sequence.
II. DEVELOPMENT OF THE ALGORITHM

We begin by introducing some mathematical notation, and then outline the basic method for adjusting the tap weights of the equalizer via a stochastic gradient descent procedure.

²Technically, this expectation is evaluated with respect to ensemble averages of the signal. However, usual practice is to assume stationarity of systems and processes and to substitute time averages for ensemble averages. In the sequel, except as noted, one may assume that all expectations are evaluated as time rather than ensemble averages.
0090-6778/93$03.00 © 1993 IEEE
We then develop our algorithm by introducing a particular functional to be optimized and relating this functional to the functional used in the classical DDLMS procedure.

In the system depicted in Fig. 1, we assume that W represents an adjustable filter with 2N + 1 taps and a delay of N, so

    y_t = Σ_{k=−N}^{N} w_k^t x_{t−k}    (1)

where the superscript on the term w_k^t indicates that the tap weights change over time. Define the vector w_t = (w_{−N}^t, …, w_N^t) and the vector x_t = (x_{t+N}, …, x_{t−N}). Then we may express (1) as

    y_t = w_t^T x_t    (2)

where the superscript T denotes the vector transpose.
As mentioned in the introduction, the objective of adjusting the tap weights in the equalizer is to make y_t ≈ a_{t−N}. The approach we shall use is to treat (y_t) as a stochastic process and to minimize a functional

    min_{w ∈ R^{2N+1}} J(w) = E[Q(y_t)].    (3)

Following the classical approach (assuming stationary ergodic processes), we replace ensemble averages in (3) with time averages, and we get the following stochastic gradient update procedure:

    w_{t+1} = w_t − ε e_t x_t    (4)

where

    e_t = ∂Q(y_t)/∂y_t

and ε is a (fixed) step size. General convergence results for this class of stochastic approximation algorithms may be found in [3]. To complete the algorithmic specification, we simply need to define Q(y_t).
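The update loop in (3)–(4) is easy to state in code. The sketch below is ours, not the paper's; it assumes the error function e_t = ∂Q(y_t)/∂y_t is supplied by the caller, so the same step serves for the classical DDLMS error or for the soft-decision error derived later.

```python
import numpy as np

def lms_step(w, x_window, error_fn, step_size):
    """One stochastic-gradient update of the equalizer taps, as in (4).

    w         -- current tap vector, shape (2N+1,)
    x_window  -- channel outputs (x_{t+N}, ..., x_{t-N}), same shape as w
    error_fn  -- maps the equalizer output y_t to e_t = dQ(y_t)/dy_t
    step_size -- the fixed step size epsilon
    """
    y = w @ x_window                        # equalizer output, eq. (2)
    e = error_fn(y)                         # error signal e_t
    return w - step_size * e * x_window, y
```

For example, passing error_fn = lambda y: y - np.sign(y) (unit signal levels) gives one step of the classical DDLMS algorithm.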
Since we do not have direct access to (a_t) when updating our tap weights, Q(y_t) must contain some assumptions about a_{t−N}. In the classical DDLMS algorithm for a binary PAM channel with input values ±a, Q(y_t) = (y_t − a sgn(y_t))² and e_t = y_t − a sgn(y_t). Informally, the assumption in the classical algorithm is that if y_t ≥ 0 then â_{t−N} = a; otherwise â_{t−N} = −a.

We propose instead that we define

    Q(y_t) = −log f_Y(y_t)    (5)

where

    f_Y(y_t) = (p_1/√(2πσ²)) exp(−(y_t − a)²/2σ²) + (p_2/√(2πσ²)) exp(−(y_t + a)²/2σ²).    (6)

f_Y(y_t) is the density function of a gaussian mixture with two components centered at ±a,³ p_1 and p_2 are the proportions of the two gaussians in the mixture, and σ is their standard deviation. In the sequel, we will assume that p_1 = p_2 = 0.5,

³Upper case letters denote random variables (RV's) and corresponding lower case letters denote values of the RV's.
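Concretely, (5) and the two-component mixture can be evaluated as below; this is our own sketch, assuming the equal proportions p_1 = p_2 = 0.5 adopted in the sequel.

```python
import numpy as np

def mixture_density(y, a, sigma):
    """Two-gaussian mixture density f_Y(y): components centered at +a and
    -a, equal mixing proportions of 0.5, common standard deviation sigma."""
    norm = 1.0 / (np.sqrt(2.0 * np.pi) * sigma)
    return 0.5 * norm * (np.exp(-(y - a) ** 2 / (2.0 * sigma ** 2)) +
                         np.exp(-(y + a) ** 2 / (2.0 * sigma ** 2)))

def Q(y, a, sigma):
    """Cost of one equalizer output: the negative log likelihood of (5)."""
    return -np.log(mixture_density(y, a, sigma))
```

Note that Q is symmetric in y and is smallest when y_t sits near one of the two signal levels ±a.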
Fig. 2. Soft decision error function for different values of σ. (σ = 0 corresponds to a hard decision.)
which corresponds approximately to the assumption that the values ±a are equally likely in the data sequence (a_t). We assume for now that σ is also fixed.

Minimizing Q(y_t) as defined in (5) corresponds to maximizing the (log) likelihood that the sequence (y_t) is generated by an i.i.d. source with density f_Y(·). One can justify this model for (y_t) by assuming that (a_t) is generated by an i.i.d. source with density f_A(a_t) = 0.5 δ_{a_t+a} + 0.5 δ_{a_t−a} and that the combined effect of S followed by W on (a_t) is simply to add zero-mean gaussian noise (and a delay of N) to (a_t). The added noise is composed of channel noise and an intersymbol interference term. Since the intersymbol interference term contains contributions from a large number of random symbols, the characterization of the overall noise contribution as gaussian is not unreasonable.
To evaluate e_t based on (5) we must deal with the sum of exponentials in (6). One simple approximation is to ignore the smaller of the two exponential terms. Under this assumption we find

    e_t ≈ (1/σ²)(y_t + a)   if y_t < 0
          (1/σ²)(y_t − a)   otherwise.    (7)

We can simply include the 1/σ² term in our step size ε, and we find that this approximate expression for e_t is identical to the expression for e_t in the classical DDLMS algorithm.

If we do not make any approximations, the expression for e_t based on (5) (with p_1 = p_2 = 0.5) is

    e_t = (1/σ²)[y_t − a tanh(a y_t/σ²)].    (8)

Comparing (8) to the classical DDLMS error, we see that we have replaced the sign function with a more complicated nonlinearity. We refer to this particular nonlinearity as a "soft" decision, as compared to the "hard" decision represented by the sign function in the DDLMS algorithm. The choice of these terms is apparent from Fig. 2, where we have plotted (8) as a function of y for several values of σ. As σ → 0, (8) becomes identical to the error in the DDLMS algorithm. This error has a sharp discontinuity at y = 0, and as σ increases this discontinuity is smoothed away.
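The two nonlinearities can be compared numerically. In the sketch below (our code, not the paper's), the soft error uses the tanh form that the derivative of (5) takes when p_1 = p_2 = 0.5; for small σ, σ² times the soft error approaches the hard DDLMS error, while near y = 0 the soft error stays small instead of jumping.

```python
import numpy as np

def hard_error(y, a):
    """Classical DDLMS error: a hard decision via the sign function."""
    return y - a * np.sign(y)

def soft_error(y, a, sigma):
    """Soft-decision error dQ(y)/dy for the equal-proportion two-gaussian
    mixture; the algebra reduces to a tanh nonlinearity."""
    return (y - a * np.tanh(a * y / sigma ** 2)) / sigma ** 2
```

As σ → 0 the tanh sharpens into a sign function, recovering the hard decision (with the 1/σ² factor absorbed into the step size).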
The weakness of the hard decision used in the classical DDLMS algorithm is apparent when we realize that when y_t is very near 0, a decision based on the sign of y_t is most likely to be incorrect, due to the effects of random noise. Yet it is in these cases that the tap weights are changed the most. The "soft" decision nonlinearity has a much smaller magnitude when y_t is near 0, so it makes only small weight changes in these highly ambiguous cases.

Finally, we turn to the parameters of the gaussian mixture density f_Y(·). There are three parameters for each component in the mixture: a mixing proportion p, a mean μ, and a variance σ². In the development so far we have assumed that the means have been fixed at ±a and the mixing proportions have been fixed at p_1 = p_2 = 0.5. These values come directly from our assumption that (a_t) is generated from a binary i.i.d. source with density

    f_A(A) = 0.5 δ_{A+a} + 0.5 δ_{A−a}    (9)

where a is a constant and δ_x is the usual Kronecker delta function

    δ_x = 1 if x = 0, and 0 otherwise.

Knowledge of the source characteristics and transmission scheme will usually suggest reasonable choices for the values of p and μ for each component.
We have also assumed that the variance of the two components was equal and fixed at some unspecified value σ². To understand why both components would have the same variance, it is helpful to consider more carefully the nature of the "noise" which results when W is not an ideal inverse filter for S. In Fig. 1, x_t and y_t may be expressed as the following convolution sums:

    x_t = Σ_{k=−∞}^{∞} s_k a_{t−k}    (10)

    y_t = Σ_{k=−∞}^{∞} w_k x_{t−k}    (11)

where we have let w_k = 0 for |k| > N and dropped the superscript on w_k for notational convenience. Let W* = (w*_{−∞}, …, w*_{∞}) denote the impulse response of an ideal inverse filter for S, so

    Σ_k w*_k s_{j−k} = δ_{j−N}.    (12)

If we replace W with W* in Fig. 1 we would have

    y*_t = Σ_k w*_k x_{t−k} = a_{t−N}.    (13)

We can regard W as an approximation to the ideal equalizer W* and rewrite (11) as

    y_t = a_{t−N} + n^c_t    (14)

where (n^c_t) is a convolutional noise sequence and we have used (13). The variance of each component in our mixture density is largely determined by the variance of the convolutional noise sequence (n^c_t). From (14) it is apparent that n^c_t contains contributions from a large number of transmitted symbols, so it is reasonable to assume that the properties of (n^c_t) are not dramatically different for different symbol values (i.e., samples representing the symbol a and samples representing the symbol −a will have similar distributions for n^c_t). This suggests that the variance of each component in the mixture density f_Y(·) is the same.⁴
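The convolutional-noise argument is easy to check numerically. The toy channel, the deliberately imperfect equalizer, and all names below are our own choices, not from the paper; the point is only that the residual n_t = y_t − a_t has a similar spread whichever symbol was sent.

```python
import numpy as np

rng = np.random.default_rng(0)
a_seq = rng.choice([-1.0, 1.0], size=20000)   # binary i.i.d. source
s = np.array([0.2, 1.0, 0.2])                 # toy channel impulse response
w = np.array([-0.2, 1.0, -0.2])               # rough, imperfect inverse

x = np.convolve(a_seq, s, mode="same")        # channel output, cf. (10)
y = np.convolve(x, w, mode="same")            # equalizer output, cf. (11)

n = y - a_seq                                 # convolutional noise, cf. (14)
var_pos = n[a_seq > 0].var()                  # noise variance when a_t = +1
var_neg = n[a_seq < 0].var()                  # noise variance when a_t = -1
```

With these symmetric filters the overall delay is zero, so no shift by N is needed; the two conditional variances come out nearly equal, supporting the tied-variance assumption.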
Equation (14) also suggests that the characteristics of (n^c_t) will change as W changes, so the assumption that σ² should be fixed does not seem reasonable. Simulation studies have shown that better convergence may be obtained by adjusting σ in parallel with adjusting the equalizer tap weights [10], [7], [9]. The algorithm used to update σ is an incremental version of the standard maximum likelihood estimator for mixture components with tied variance [5], [8]. The actual update rule is

    σ²(t + 1) = κ σ²(t) + (1 − κ)[λ_t (y_t + a)² + (1 − λ_t)(y_t − a)²]    (15)

where

    λ_t = exp(−(y_t + a)²/2σ²) / [exp(−(y_t + a)²/2σ²) + exp(−(y_t − a)²/2σ²)]    (16)

and κ is a decay rate slightly less than 1 for discounting past data. The decay rate for the exponential averaging of past data is based on the degree of stationarity of the data. A more detailed development of this expression may be found in [9].
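The update (15)–(16) amounts to an exponentially discounted, responsibility-weighted squared deviation. A sketch (our code, assuming the equal mixing proportions used throughout):

```python
import numpy as np

def update_variance(var, y, a, kappa):
    """One incremental tied-variance update, as in (15)-(16).

    var   -- current estimate sigma^2(t)
    y     -- current equalizer output y_t
    a     -- component means are +a and -a
    kappa -- decay rate slightly less than 1, discounting past data
    """
    # lambda_t: posterior probability that y_t came from the -a component,
    # assuming equal priors and the shared variance var.
    g_neg = np.exp(-(y + a) ** 2 / (2.0 * var))
    g_pos = np.exp(-(y - a) ** 2 / (2.0 * var))
    lam = g_neg / (g_neg + g_pos)
    return kappa * var + (1.0 - kappa) * (lam * (y + a) ** 2 +
                                          (1.0 - lam) * (y - a) ** 2)
```

When y_t lands exactly on a component mean, the innovation term is nearly zero, so the variance estimate simply decays; when y_t is ambiguous, both deviations contribute in proportion to their posteriors.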
Finally, we note that the model for the data given by (9) may be generalized to allow a larger class of discrete signal distributions while still allowing the distribution of the sequence (y_t) to be modeled as a mixture of gaussians. If we let A denote a random process for which (a_t) is a realization, the pdf of this i.i.d. source may be written in the general form

    f_A(A) = Σ_i p_i δ_{A−μ_i}.    (17)

This source will be zero mean and uncorrelated⁵ as long as

    Σ_i p_i μ_i = 0.    (18)

An i.i.d. data source of the form (17) will lead to an i.i.d. model for (y_t) with a gaussian mixture density in which the mixing proportions p_i and means μ_i are as specified in (17) and the variance of all components is the same.
III. SIMULATION RESULTS

The discussion in the previous section comparing the "hard" and "soft" decision nonlinearities suggested that the difference between these two nonlinearities will be most apparent when a channel is poorly equalized and there are numerous incorrect decisions (i.e., cases where a_{t−N} and y_t have opposite signs). Simulations reported elsewhere [10], [7] have shown that the

⁴This argument is developed in greater detail in [9] and also in [6] in the context of nonlinear deconvolution.
⁵E[a_t a_{t′}] = a² if t = t′, and is 0 otherwise.
use of the soft decision nonlinearity can lead to more rapid initial convergence than the classical DDLMS algorithm in channels with moderate to severe noise and distortion. The simulations reported in this section focus on comparing the convergence limits of the DDLMS and soft decision-directed algorithms.

Simulations were carried out using a synthetic binary PAM channel. The output of an ideal nonredundant binary signal source with signal levels ±1 was sent through a raised cosine pulse generator and a noncausal finite impulse response (FIR) distortion filter. White zero-mean noise was then added to the output of the FIR filter to produce the input to the equalizer. Fifty different channels with different bit error rates were created by combining different distortion filters with different amounts of white noise. The distortion filters had a gain of 1 and phase distortion ranging from ±10° to ±60°. The signal-to-noise ratio varied from 30 dB to 0 dB.
The DDLMS and the soft-decision algorithms were used to equalize each of the 50 channels. In each simulation, the equalizer was initialized with a center tap weight of 1.0 and all other tap weights equal to 0.0. An initial bit error rate for each channel was computed by comparing the sign of y_t to the sign of a_{t−N} for 500 channel outputs.⁶ 1000 weight updates were then performed (using either the DDLMS or soft-decision algorithm) and the tap weights frozen. A final bit error rate was computed based on an additional 500 channel outputs. A figure of merit γ was then computed for each simulation:

    γ = 1.0 − (Final bit error rate) / (Initial bit error rate).

A value of 1.0 for γ indicates that the adjustment algorithm was able to reduce the bit error rate to 0 within 1000 updates,⁷ γ = 0.5 indicates that the bit error rate was cut in half by the adjustment algorithm, and values of γ near 0 indicate that the adjustment algorithm failed to improve the error rate at all. A range for ε between 0.1 and … was explored for each algorithm on each channel, and results are reported for the value of ε which gave the best value for γ.⁸
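The bit-error-rate measurement and the figure of merit are straightforward to reproduce; the helper names below are ours:

```python
import numpy as np

def bit_error_rate(y, a_delayed):
    """Fraction of equalizer outputs whose sign disagrees with the
    corresponding (delayed) transmitted symbol."""
    return float(np.mean(np.sign(y) != np.sign(a_delayed)))

def figure_of_merit(initial_ber, final_ber):
    """gamma = 1 - final/initial: 1.0 means all errors removed,
    0.5 means errors halved, values near 0 mean no improvement."""
    return 1.0 - final_ber / initial_ber
```

The definition presumes a nonzero initial bit error rate, which footnote 6 guarantees for the channels simulated here.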
The simulation results are summarized in Fig. 3, where γ has been plotted on the vertical axis and the initial bit error rate as a percentage on the horizontal axis. The shaded region bounded by the lines marked with + indicates the performance range of the DDLMS algorithm over the 50 channels. The shaded region bounded by the lines marked with * indicates the performance range of the soft-decision algorithm. For channels with initial bit error rates up to about 5% the two algorithms are indistinguishable; however, there are significant differences for larger initial bit error rates. The absolute convergence limit of each algorithm is indicated approximately by the bit error rate at which γ = 0. For the DDLMS algorithm this limit is between 16% and 18%,⁹ while the soft decision limit is between

⁶In all cases, the initial bit error rate is > 0, indicating that the "eye" of the equalizer output is initially partially closed.
⁷This does not mean the channel was perfectly equalized, since it is possible to make no incorrect decisions and still have a high MSE.
⁸This optimal value of ε was not necessarily the same for both algorithms operating on the same channel.
⁹This figure is in agreement with other reported convergence limits for the DDLMS algorithm [4].
Fig. 3. Graph of γ versus initial bit error rate (%) for the DDLMS (+) and soft-decision (*) algorithms.
32% and 35%. From a practical standpoint, the region of interest is primarily for values of γ > 0.5. Even in this region, the soft decision algorithm maintains a significant advantage (10% to 12% for DDLMS at γ = 0.5, compared to 20% to 23% for the soft-decision algorithm).
IV. CONCLUSION

We have proposed a new algorithm for blind equalization of very distorted channels. The algorithm is based on the idea of adjusting the equalizer tap gains to maximize the likelihood that the equalizer outputs come from an i.i.d. source that is a mixture of two gaussians with known means. The approach is most similar to earlier work on blind deconvolution by Godfrey and Rocca [6] and blind equalization by Bellini [1], [2]. Our approach differs from these other approaches in that the algorithm is developed based on an assumed distribution for the equalizer output rather than assumptions about the distribution of signals at the channel input. This leads to an algorithm with an adaptive nonlinearity for estimating the channel input given the equalizer output. This "soft" decision algorithm was compared to the classical decision-directed LMS algorithm on a binary PAM channel, and simulations suggest that when operating in the blind mode the new algorithm can converge in channels with twice the initial bit error rate for which the DDLMS algorithm is convergent.

ACKNOWLEDGMENT

We thank Y. Le Cun and J. Bridle for helpful discussions.
REFERENCES

[1] S. Bellini, "Bussgang techniques for blind equalization," in IEEE GLOBECOM Conf. Rec., Dec. 1986, pp. 1634–1640.
[2] S. Bellini and F. Rocca, "Near optimal blind deconvolution," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, vol. 4, 1988, pp. 2236–2239.
[3] A. Benveniste, M. Goursat, and G. Ruget, "Robust identification of a nonminimum phase system: Blind adjustment of linear equalizer in data communications," IEEE Trans. Automat. Contr., vol. AC-25, pp. 385–399, 1980.
[4] J. A. C. Bingham, The Theory and Practice of Modem Design. New York: Wiley, 1988.
[5] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Roy. Statist. Soc. Ser. B, vol. 39, pp. 1–38, 1977.
[6] R. Godfrey and F. Rocca, "Zero memory non-linear deconvolution," Geophys. Prospecting, vol. 29, pp. 189–228, 1981.
[7] G. E. Hinton and S. J. Nowlan, "The bootstrap Widrow–Hoff rule as a cluster-formation algorithm," Neural Comput., vol. 2, pp. 355–362, 1990.
[8] G. J. McLachlan and K. E. Basford, Mixture Models: Inference and Applications to Clustering. New York: Marcel Dekker, 1988.
[9] S. J. Nowlan, "Soft competitive adaptation: Neural network learning algorithms based on fitting statistical mixtures," Ph.D. dissertation, School of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, Apr. 1991.
[10] S. J. Nowlan and G. E. Hinton, "Maximum likelihood decision directed adaptive equalization," Univ. of Toronto, Tech. Rep. CRG-TR-89-8, 1989.
[11] S. U. H. Qureshi, "Adaptive equalization," Proc. IEEE, vol. 73, pp. 1349–1387, Sept. 1985.
[12] B. Widrow and M. E. Hoff, Jr., "Adaptive switching circuits," in IRE WESCON Convention Rec., 1960, pp. 96–104.
[13] B. Widrow, J. M. McCool, M. G. Larimore, and C. R. Johnson, Jr., "Stationary and nonstationary learning characteristics of the LMS adaptive filter," Proc. IEEE, vol. 64, pp. 1151–1162, Aug. 1976.
[14] B. Widrow and S. D. Stearns, Adaptive Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1985.