`
`
`
`
`
`
`BEFORE THE PATENT TRIAL AND APPEAL BOARD
`
`
`
`
`
`
`
`
`Case IPR2022-01059
`U.S. Patent No. 10,779,080
`______________________
`
`
`GOOGLE LLC,
`
`Petitioner
`
`v.
`
`JAWBONE INNOVATIONS, LLC,
`
`Patent Owner
`
`
`
`
`
`
`
`Declaration of Shauna L. Wiest Regarding McCowan
`
`
`
`
`
`
`
`
`
`
`
`Page 1 of 78
`
`Amazon v. Jawbone
`U.S. Patent 10,779,080
`Amazon Ex. 1024
`
`
`
`Declaration of Shauna L. Wiest
`
`
`I, Shauna L. Wiest, state and declare as follows:
`
I. Introduction
`
`1.
`
`I have prepared this Declaration in connection with Google LLC’s
`
`(“Petitioner”) Petition for Inter Partes Review of U.S. Patent No. 10,779,080,
`
`which I understand will be filed concurrently with this Declaration.
`
`2.
`
`I am currently a senior research analyst with the Research &
`
`Information Services team at Finnegan, Henderson, Farabow, Garrett & Dunner,
`
`LLP, located at 901 New York Avenue, NW, Washington, DC 20001-4413.
`
`3.
`
`I am over eighteen years of age, and I am competent to make this
`
`Declaration. I make this Declaration based on my own personal knowledge, and
`
`my knowledge of library science practices.
`
`4.
`
`I earned a Master of Science in Library Science degree from the
`
`University of North Carolina at Chapel Hill in 1999, and a Bachelor of Arts in
`
`Political Science degree from the University of California at San Diego in 1989.
`
`I have worked as a law librarian for over twenty years. I have been employed in
`
`the Research & Information Services Department at Finnegan, Henderson,
`
`Farabow, Garrett & Dunner, LLP since 2021. Before that, from 2000-2015, I
`
`was employed as a Law Librarian at Stoel Rives LLP, and from 2015-2016, I
`
`was employed as a Competitive Intelligence Specialist for Nossaman LLP.
`
`
`
`
`
`
`
II. Standard Library Practice for Receiving, Cataloging, and Making
Materials, Including Serial Publications, Publicly Available
`
`
`5.
`
`I have knowledge of and experience with standard library practices
`
regarding receiving, cataloging, shelving, and making materials, including
`
`serial publications, available to the public. I am fully familiar with and have
`
`knowledge of and experience with the Machine-Readable Cataloging (MARC)
`
`system, an industry-wide standard that libraries use to catalog materials.
`
`6.
`
`The MARC system was developed during the 1960s to standardize
`
`bibliographic records so they could be read by computers and shared among
`
`libraries. By the mid-1970s, MARC had become the international standard for the
`
`storage of bibliographic data and cataloguing. It is still used today. Many libraries
`
`provide public access to their MARC records via the Internet and/or their electronic
`
`cataloging systems at the library. In a MARC record, each field provides specific
`
`information about the cataloged item, including how materials are held and made
`
`available to the public.
`
`III. Serial Publications
`
`7.
`
`A serial publication, often known as a “journal,” is a resource that
`
`is issued in successive parts and has no predetermined conclusion. These
`
`successive parts are commonly referred to as “issues,” and each issue is usually
`
`chronologically numbered and dated. The presence of enumeration, years of
`
`
`
`
`
`coverage, and/or other chronological information also indicates a serial
`
`publication.
`
`8.
`
`There are significant differences between cataloging finite resources
`
`(books/monographs) and continuing resources (serials). For serials, the catalog
`
`record provides information about the serial as a whole, including the first or earliest
`
`available issue. It also provides information as to holdings – the volumes and issues,
`
`with dates, received by the library and made available to the public. In serials
`
`cataloging, there are identifying characteristics unique to serials that are slightly
`
`different from monographs (books). The issue date for a print serial publication, for
`
`example, generally appears on the cover (front or back), the masthead page, the title
`
`page (if any), the table of contents page(s), or on the pages of the individual articles
`
`contained in the issue. More information regarding the unique aspects of cataloguing
`
serials can be found at this link:
https://www.loc.gov/aba/pcc/conser/scctppt/Basic-2014/Basic-Trainee-Mannual.pdf
`
`IV. MARC Records
`
`9.
`
`The MARC record system uses a specific three-digit numeric code
`
`(from 001-999) to identify each field in a catalog record. For example, field tag 008
`
`provides the six-digit date the item was catalogued (Date entered on file). The first
`
`six characters of field tag 008 are always in the “YYMMDD” format. Field tag 022
`
`provides the International Standard Serial Number (ISSN), a unique identification
`
`
`
`
`
`number assigned to serial publications (continuing resource). Field tag 245 identifies
`
`the full title statement for the work and field tag 260 identifies the place of
`
`publication, name of publisher, and copyright date of the publication. Field tag 362
`
`identifies the numbering to be used for chronological cataloguing of individual
`
`issues of continuing resources (serials). The designations within field tag 362
`
`determine at what point in time the serial began, which guides how issues are
`
`checked in, processed, and added to the library’s main collection. Finally, the 9XX
`
`field tags denote local holdings information for the resource.
`
`10.
`
` Based on standard library practice, when a library receives an
`
`item, it stamps (or labels) the item with the library name and often with a date
`
`that is within a few days or weeks of receipt. Next, the library will catalog the
`
`item within a matter of a few days or weeks of receiving it.
`
`11. Generally, after an item is cataloged, the public may access the
`
`item by searching a catalog, browsing the library shelves, and either requesting
`
`or electronically accessing the item from the library. Standard library practice is
`
`to make the item available to the public within a few days or weeks of cataloging
`
`it.
`
V. Print Holdings and MARC Records for McCowan
`
`12. As detailed below, I have reviewed the print public holdings
`
`information and Library of Congress and Iowa State University of Science and
`
`
`
`
`
`Technology MARC records for Iain A. McCowan, Darren C. Moore, and S.
`
`Sridharan, “Near-field Adaptive Beamformer for Robust Speech Recognition,”
`
`Digital Signal Processing, January 2002, Vol. 12, No. 1, pages 87-106 (ISSN
`
`1051-2004) (“McCowan”). Exhibit 1007 to the concurrently filed Petition is a
`
`true and accurate copy of McCowan.
`
`13.
`
`Appendix A to this declaration is a true and accurate copy of the print
`
`journal cover, title pages, table of contents, and library date stamp for the issue of
`
`Digital Signal Processing containing McCowan held by the Library of Congress.
`
`Appendix A also includes pages 87 to 106 of this issue, which is the article titled
`
`“Near-field Adaptive Beamformer for Robust Speech Recognition” (McCowan). The
`
`date stamp for this issue indicates that McCowan was received by the Library of
`
`Congress on February 19, 2002. The print journal cover provides directional
`
`information so that members of the interested public could access the print issue
`
`containing McCowan at Call Number TK 5102.5 .D4463 Set 1 with Bar Code 0 020
`
`814 014 0 within a matter of a few days or weeks of February 19, 2002.
`
`14. Based on the information in Appendix A, it is clear that the issue of
`
`Digital Signal Processing containing McCowan was received by the Library of
`
`Congress on or before February 19, 2002. Based on standard library practice,
`
`McCowan would have been processed and catalogued by the Library of Congress
`
`within a matter of a few days or weeks of February 19, 2002.
`
`
`
`
`15. Accordingly, McCowan would have been made available to the public
`
`within a few days or weeks of being checked-in and catalogued. Members of the
`
`interested public could have accessed McCowan by browsing the Library of
`
`Congress shelves or by searching the Library’s catalog within a few days or weeks
`
`of February 19, 2002.
`
`16.
`
` Appendix B to this declaration is a true and accurate copy of the
`
`Library of Congress MARC record for its holdings of the serial publication Digital
`
Signal Processing containing McCowan, which was downloaded from
https://lccn.loc.gov/91650983/marcxml on April 5, 2022.
`
`17. Appendix C to this declaration is a true and correct copy of the Library
`
`of Congress public catalog record for its copy of Digital Signal Processing
`
containing McCowan, including holdings and location information, which was
downloaded from
https://catalog.loc.gov/vwebv/holdingsInfo?searchId=17671&recPointer=0&recCount=25&searchType=2&bibId=11390503
on April 5, 2022.
`
`18. Appendix D to this declaration is a true and accurate copy of the print
`
`journal cover, title pages, table of contents, and library date stamp for the issue of
`
`Digital Signal Processing containing McCowan held by the Iowa State University of
`
`Science and Technology Library. Appendix D also includes pages 87 to 106 of this
`
`issue, which is the article titled “Near-field Adaptive Beamformer for Robust Speech
`
`
`
`
`
`Recognition” (McCowan). The date stamp for this issue indicates that McCowan was
`
`received by the Iowa State University of Science and Technology Library on
`
`February 18, 2002. The print journal cover provides directional information so that
`
`members of the interested public could access the print issue containing McCowan
`
`at Call Number TK 5102.5 .D4463 within a matter of a few days or weeks of
`
`February 18, 2002.
`
`19. Based on the information in Appendix D, it is clear that the issue of
`
`Digital Signal Processing containing McCowan was received by the Iowa State
`
`University of Science and Technology Library on or before February 18, 2002.
`
`Based on standard library practice, McCowan would have been processed and
`
`catalogued by the Iowa State University of Science and Technology Library within
`
`a matter of a few days or weeks of February 18, 2002.
`
`20. Accordingly, McCowan would have been made available to the public
`
`within a few days or weeks of being checked-in and catalogued. Members of the
`
`interested public could have accessed McCowan by browsing the Iowa State
`
`University of Science and Technology Library shelves or by searching the Library’s
`
`catalog within a few days or weeks of February 18, 2002.
`
`21. Appendix E to this declaration is a true and accurate copy of the Iowa
`
`State University of Science and Technology Library MARC record for its holdings
`
of the serial publication Digital Signal Processing containing McCowan, which was
downloaded from
https://iowa-primo.hosted.exlibrisgroup.com/primo-explore/sourceRecord?vid=01IASU&docId=01IASU_ALMA21233591260002756
on May 10, 2022.
`
`22. Appendix F to this declaration is a true and correct copy of the Iowa
`
`State University of Science and Technology Library public catalog record for its
`
copy of Digital Signal Processing containing McCowan, including holdings and
location information, which was downloaded from
https://iowa-primo.hosted.exlibrisgroup.com/permalink/f/12tutg/01IASU_ALMA21233591260002756
on April 20, 2022.
`
`23. The Library of Congress MARC record (Appendix B) for the serial
`
`publication Digital Signal Processing containing McCowan confirms the fixed data
`
`elements of MARC field tag 008 as 900620c19919999mnubrp0a0engc. As
`
`discussed above, the first six characters “900620” are in typical “YYMMDD”
`
`format and indicate that the serial publication Digital Signal Processing was
`
`catalogued by the Library of Congress on June 20, 1990. The publication status code
`
`“c” appearing in MARC field tag 008 refers to “continuing resource currently
`
`published” indicating that Digital Signal Processing began publication in 1991 and
`
`is a continuing published resource.
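The reading of field tag 008 described above can be illustrated with a short sketch. It is offered only as an illustration, not as any library's actual software: the function name `parse_marc_008_prefix` is hypothetical, and the rule used to expand the two-digit year into a four-digit year is an assumption made for this example. The character positions (00-05 date entered on file, 06 publication status, 07-10 and 11-14 the two dates) follow the MARC 21 layout for field 008.

```python
from datetime import date


def parse_marc_008_prefix(field_008: str) -> dict:
    """Decode character positions 00-14 of a MARC 008 field (hypothetical helper)."""
    if len(field_008) < 15:
        raise ValueError("need at least the first 15 characters of field 008")
    yy = int(field_008[0:2])
    mm = int(field_008[2:4])
    dd = int(field_008[4:6])
    # Assumption for this sketch: two-digit years 69-99 mean 19xx, 00-68 mean 20xx.
    year = 1900 + yy if yy >= 69 else 2000 + yy
    return {
        "date_entered": date(year, mm, dd),  # positions 00-05, YYMMDD
        "status": field_008[6],              # position 06, publication status
        "date1": field_008[7:11],            # positions 07-10, beginning date
        "date2": field_008[11:15],           # positions 11-14, ending date (9999 = ongoing)
    }


# The 008 string quoted above from Appendix B.
record = parse_marc_008_prefix("900620c19919999mnubrp0a0engc")
print(record["date_entered"].isoformat(), record["status"], record["date1"], record["date2"])
# prints: 1990-06-20 c 1991 9999
```

Only the first fifteen characters are decoded, because the remaining positions are not relied on in this Declaration.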
`
`24. The Library of Congress MARC record field tag 022 denotes the unique
`
International Standard Serial Number (ISSN) for Digital Signal Processing as
1051-2004. The Library of Congress MARC record field tag 245 denotes the title and
`
`statement of responsibility for the work as “Digital Signal Processing.”
`
`25. The Library of Congress MARC record field tag 362 provides a
`
`chronological designation of a continuing (serial) resource. In this MARC record for
`
`Digital Signal Processing the sequential serial designation begins with Volume 1,
`
`no. 1 (Jan. 1991)- with no end date noted.
`
`26. Finally, the Library of Congress MARC record field tags 984 and 991
`
`identify the holdings, location, and call number for its copy of Digital Signal
`
`Processing containing McCowan. The MARC record lists this information as:
`
`General Collection, Call Number TK5102.5 .D4463, Serials Location, with holdings
`
`beginning with the issue dated January 9, 2001. This information confirms that the
`
`journal issue of Digital Signal Processing containing McCowan is publicly available
`
`and held in the General Collection at Call Number TK5102.5 .D4463.
`
`27. The Library of Congress’s public catalog record for its copy of Digital
`
`Signal Processing (Appendix C) sets forth the holdings and onsite location
`
`information for members of the public seeking the print issue containing McCowan.
`
`The public catalog record indicates that the print issue containing McCowan should
`
`be requested in the Jefferson or Adams Building Reading Rooms and is contained
`
`within holdings v.3-v.12:no.1 (1993-2002:Jan.) at Call Number TK5102.5 .D4463.
`
`
`
`
`28. The Iowa State University of Science and Technology Library MARC
`
`record (Appendix E) for the serial publication Digital Signal Processing containing
`
`McCowan confirms the fixed data elements of MARC field tag 008 as
`
`900620c19919999mnubr1p0a0engd. As discussed above, the first six characters
`
`“900620” are in typical “YYMMDD” format and indicate that the serial publication
`
`Digital Signal Processing was catalogued by the Iowa State University of Science
`
`and Technology Library on June 20, 1990. The publication status code “c” appearing
`
`in MARC field tag 008 refers to “continuing resource currently published”
`
`indicating that Digital Signal Processing began publication in 1991 and is a
`
`continuing published resource.
`
`29. The Iowa State University of Science and Technology Library MARC
`
`record field tag 999 (including the AVA field tag) identifies the local holdings,
`
`location, and call number for its copy of McCowan. The AVA field tag in this MARC
`
`record lists the holdings information as: ##$0990008738310102756$822233
`
`591250002756$a01IASU_INST$bPARKS$cStorage Building$dTK5102.5 D4463
`
`$echeck_holdings$jSGEN$k0$p1$qPARKS$tv. 4 (1994)-v. 15 (2005). This
`
`information confirms that the print issue of Digital Signal Processing containing
`
McCowan is available and held in the PARKS Storage Building at Call Number
`
`TK5102.5 .D4463, with holdings beginning with Volume 4 (1994) through Volume
`
`15 (2005).
`
`
`
`
`30. The Iowa State University of Science and Technology Library’s public
`
`catalog record for its copy of Digital Signal Processing (Appendix F) sets forth the
`
`public holdings and onsite location information for members of the public seeking
`
`the print issue containing McCowan. The public catalog record indicates that the
`
2002 Bound Issue (volume 12) of Digital Signal Processing containing McCowan
`
`is publicly available and held in the PARKS Storage Building at Bar Code
`
`32792018902886 at Call Number TK5102.5 .D4463.
`
`31. Based on this evidence, it is my opinion that Exhibit 1007 is an
`
`authentic document, which would have been made publicly available and publicly
`
`accessible within a few days or weeks of February 19, 2002.
`
`VI. Conclusion
`
`32.
`
`In signing this Declaration, I understand it will be filed as evidence
`
`in a contested case before the Patent Trial and Appeal Board of the United States
`
`Patent and Trademark Office. I understand I may be subject to cross-
`
`examination in this case and that cross-examination will take place within the
`
`United States. If cross-examination is required of me, I will appear for cross-
`
`examination within the United States during the time allotted for cross-
`
`examination.
`
`33.
`
`I declare that all statements made herein of my knowledge are true, that
`
`all statements made on information and belief are believed to be true, and that these
`
`
`
`
`
`statements were made with the knowledge that willful false statements and the like
`
`so made are punishable by fine or imprisonment, or both, under Section 1001 of Title
`
`18 of the United States Code.
`
`
`
`
`
`
`
`Executed on May 24, 2022.
`
`
`
`Shauna L. Wiest
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Appendix A
[Scanned spine and front cover of the journal issue: Digital Signal Processing, Volume 12, Number 1, January 2002; shelf label "Set 1"; Academic Press; ISSN 1051-2004; http://www.idealibrary.com; "Articles published online first"]
`
`
`
`Digital
`Signal
` Processing
`
`A Review Journal
`
`Editors
`
`Jim Schroeder
`SPRI/CSSIP
`Adelaide, SA, Australia
`E-mail: schroeder@cssip.edu.au
`
`Joe Campbell
M.I.T. Lincoln Laboratory
`Lexington, Massachusetts
`E-mail: j.campbell@ieee.org
`
`Editorial Board
`
`Maurice Bellanger
`CNAM
`Paris, France
`Robert E. Bogner
`University of Adelaide
`Adelaide, SA, Australia
`Johann F. Bohme
`Ruhr-Universitat Bochum
`Bochum, Germany
`James A. Cadzow
`Vanderbilt University
`Nashville, Tennessee
`G. Clifford Carter
`NUWC
Newport, Rhode Island
A. G. Constantinides
Imperial College
London, England
Petar M. Djuric
State University of New York
Stony Brook, New York
Anthony D. Fagan
University College Dublin
Dublin, Ireland
Sadaoki Furui
Tokyo Institute of Technology
`Tokyo, Japan
`
`John E. Hershey
`General Electric Company
`Schenectady, New York
`B. R. Hunt
`University of Arizona
`Tucson, Arizona
James F. Kaiser
Duke University
Durham, North Carolina
R. Lynn Kirlin
University of Victoria
Victoria, British Columbia, Canada
Ercan Kuruoglu
Istituto di Elaborazione della Informazione
Ghezzano, Italy
Meemong Lee
Jet Propulsion Laboratory
Pasadena, California
Petre Stoica
Uppsala University
Uppsala, Sweden
Mati Wax
Wavion, Ltd
Yoqneam, Israel
`Rao Yarlagadda
`Oklahoma State University
`Stillwater, Oklahoma
`
`Cover photo. Lower path directivity pattern at 5000 Hz. See the article by McCowan, Moore, and Sridharan in
`this issue
`
`
`
`
`
`
`
[Library of Congress receipt stamp; the date in this scan is extracted as "FEB 14 2002"]
`
`
`Digital Signal Processing
`
`Volume 12, Number 1, January 2002
`
`© 2002 Elsevier Science (USA)
`
`All Rights Reserved
`
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy,
recording, or any information storage and retrieval system, without permission in writing from the Publisher. Exceptions: Explicit permission
from Academic Press is not required to reproduce a maximum of two figures or tables from an Academic Press article in another scientific or
research publication provided that the material has not been credited to another source and that full credit to the Academic Press article is
given. In addition, authors of work contained herein need not obtain permission in the following cases only: (1) to use their original figures or
tables in their future works; (2) to make copies of their papers for use in their classroom teaching; and (3) to include their papers as part of their
dissertations.
The appearance of the code at the bottom of the first page of an article in this journal indicates the Publisher's consent that copies of the
article may be made for personal or internal use, or for the personal or internal use of specific clients. This consent is given on the condition,
however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers,
Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to
other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for
resale. Copy fees for pre-2002 articles are as shown on the article title pages; if no fee code appears on the title page, the copy fee is the same
as those for current articles.
`
`1051-2004/02 $35.00
`MADE IN THE UNITED STATES OF AMERICA
`This journal is printed on acid-free paper
`
`©
`
DIGITAL SIGNAL PROCESSING (ISSN 1051-2004)
Published quarterly by Elsevier Science.
Editorial and Production Offices: 525 B Street, Suite 1900, San Diego, CA 92101-4495
Accounting and Circulation Offices: 6277 Sea Harbor Drive, Orlando, FL 32887-4900
2002: Volume 12. Price $343.00 U.S.A. and Canada; $374.00 all other countries
All prices include postage and handling
Information concerning personal subscription rates may be obtained by writing to the Publishers. All correspondence, permission requests, and subscription orders
should be addressed to the office of the Publishers at 6277 Sea Harbor Drive, Orlando, FL 32887-4900 (telephone: 407-345-2000). Send notices of change of address
to the office of the Publishers at least 6 to 8 weeks in advance. Please include both old and new addresses. POSTMASTER: Send changes of address to Digital Signal
Processing, 6277 Sea Harbor Drive, Orlando, FL 32887-4900.
`
`
`
Digital Signal Processing 12, 87-106 (2002)
doi:10.1006/dspr.2001.0414, available online at http://www.idealibrary.com on IDEAL®

Near-field Adaptive Beamformer for Robust
Speech Recognition

Iain A. McCowan, Darren C. Moore, and S. Sridharan

Speech Research Laboratory, RCSAVT, School of EESE, Queensland University
of Technology, GPO Box 2434, Brisbane QLD 4001, Australia
E-mail: iain@ieee.org; moore@idiap.ch; s.sridharan@qut.edu.au

McCowan, I. A., Moore, D. C., and Sridharan, S., Near-field Adaptive
Beamformer for Robust Speech Recognition, Digital Signal Processing 12
(2002) 87-106.

This paper investigates a new microphone array processing technique
specifically for the purpose of speech enhancement and recognition. The
main objective of the proposed technique is to improve the low frequency
directivity of a conventional adaptive beamformer, as low frequency
performance is critical in speech processing applications. The proposed
technique, termed near-field adaptive beamforming (NFAB), is implemented
using the standard generalized sidelobe canceler (GSC) system structure,
where a near-field superdirective (NFSD) beamformer is used as the fixed
upper-path beamformer to improve the low frequency performance. In
addition, to minimize signal leakage into the adaptive noise canceling path for
near-field sources, a compensation unit is introduced prior to the blocking
matrix. The advantage of the technique is verified by comparing the
directivity patterns with those of conventional filter-sum, NFSD, and GSC
systems. In speech enhancement and recognition experiments, the proposed
technique outperforms the standard techniques for a near-field source in
adverse noise conditions.
© 2002 Elsevier Science (USA)
Key Words: microphone array; beamforming; near-field; adaptive;
superdirectivity; speech recognition.

1. INTRODUCTION
`
Currently, much research is being undertaken to improve the robustness of
speech recognition systems in real environments. This paper focuses on the
use of a microphone array to enhance the noisy input speech signal prior to
recognition. While the use of microphone arrays for speech recognition has been
studied for some time by a number of researchers, a persistent problem has been
the poor low frequency directivity of conventional beamforming techniques with
practical array dimensions. Low frequency performance is critical for speech
processing applications, as significant speech energy is located below 1 kHz.
By explicitly maximizing the array gain, superdirective beamforming
techniques are able to achieve greater directivity than conventional techniques with
closely spaced sensor arrays [1]. This directivity generally comes at the expense
of a controlled reduction in the white noise gain of the array. Recent work has
demonstrated the suitability of superdirective beamforming for speech
enhancement and recognition tasks [2, 3]. By employing a spherical propagation model
in its formulation, rather than assuming a far-field model, near-field
superdirectivity (NFSD) succeeds in achieving high directivity at low frequencies for
near-field speech sources in diffuse noise conditions [4]. In previous work, near-field
superdirectivity has been shown to lead to good speech recognition performance
in high noise conditions for a near-field speaker [5].
`Superdirective techniques are typically formulated assuming a diffuse noise
`field. While this is a good approximation to many practical noise conditions,
`further noise reduction would result from a more accurate model of the
`actual noise conditions during operation. Adaptive array processing techniques
`continually update their parameters based on the statistics of the measured
`input noise. The generalized sidelobe canceler (GSC) [6] presents a structure
`that can be used to implement a variety of adaptive beamformers. A block
`diagram of the basic GSC system is shown in Fig. 1. The GSC separates
`the adaptive beamformer into two main processing paths—a standard fixed
`beamformer, w, with L constraints on the desired signal response, and an
adaptive path, consisting of a blocking matrix, B, and a set of adaptive filters, a.
As the desired signal has been constrained in the upper path, the lower path
filters can be updated using an unconstrained adaptive algorithm, such as the
least-mean-square (LMS) algorithm.
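The two-path structure described above can be illustrated with a minimal time-domain sketch. It is not the system evaluated in this paper: the function name `gsc_lms`, the plain channel average used as the fixed beamformer, the adjacent-channel differences used as the blocking matrix, and the tap count and step size are all assumptions chosen for brevity, and the channels are assumed to be already time-aligned on the desired source.

```python
import math


def gsc_lms(channels, num_taps=4, step=0.01):
    """Sketch of a generalized sidelobe canceler with LMS adaptation.

    channels: list of equal-length sample lists, assumed pre-aligned
    on the desired source (an assumption of this sketch).
    """
    num_ch = len(channels)
    # One adaptive FIR filter and one delay line per blocking-matrix output.
    weights = [[0.0] * num_taps for _ in range(num_ch - 1)]
    history = [[0.0] * num_taps for _ in range(num_ch - 1)]
    output = []
    for t in range(len(channels[0])):
        frame = [ch[t] for ch in channels]
        # Upper path: fixed beamformer, here a plain average of the channels.
        fixed = sum(frame) / num_ch
        # Lower path: adjacent differences block any component common to all channels.
        noise_estimate = 0.0
        for k in range(num_ch - 1):
            blocked = frame[k] - frame[k + 1]
            history[k] = [blocked] + history[k][:-1]
            noise_estimate += sum(w * x for w, x in zip(weights[k], history[k]))
        y = fixed - noise_estimate
        # Unconstrained LMS update that reduces the output power.
        for k in range(num_ch - 1):
            weights[k] = [w + step * y * x for w, x in zip(weights[k], history[k])]
        output.append(y)
    return output


# With an identical signal on every channel the blocking matrix outputs zero,
# so the adaptive path contributes nothing and the signal passes unchanged.
signal = [math.sin(0.1 * t) for t in range(200)]
out = gsc_lms([signal, signal, signal])
```

Because the desired signal never reaches the adaptive filters in this idealized case, the LMS update cannot cancel it; any mismatch between channels is what allows the desired signal to leak into the lower path.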
`While the theory of adaptive techniques promises greater signal enhance-
`ment, this is not always the case in real situations. A common problem with
`the GSC system is leakage of the desired signal through the blocking matrix,
`resulting in signal degradation at the beamformer output. This is particularly
`problematic for broadband signals, such as speech, and especially for speech
recognition applications where signal distortion is critical.
In this paper we propose a system that is suited to speech enhancement in a
practical near-field situation, having both the good low frequency performance
of near-field superdirectivity and the adaptability of a GSC system, while taking
care to minimize the problem of signal degradation for near-field sources.
We begin by formulating a concise model for near-field sound propagation in
Section 2. This model is then used in Section 3 to develop the proposed
near-field adaptive beamforming (NFAB) technique. To demonstrate the benefit of
the technique over existing methods, an experimental evaluation assessing
directivity patterns, speech enhancement performance, and speech recognition
performance is detailed in Sections 4 and 5.

FIG. 1. Generalized sidelobe canceler structure.
`
`2. NEAR-FIELD SOUND PROPAGATION MODEL
`
In sensor array applications, a succinct means of characterizing both the
array geometry and the location of a signal source is via the propagation vector.
The propagation vector concisely describes the theoretical propagation of the
signal from its source to each sensor in the array. In this section, we develop an
expression for the propagation vector of a sound source located in the near-field
of a microphone array using a spherical propagation model. This expression is
then used in the formulation of the proposed near-field adaptive beamformer in
the following sections.
Many microphone array processing techniques assume a planar signal
wavefront. This is reasonable for a far-field source, but when the desired
source is close to the array a more accurate spherical wavefront model must
be employed. For a microphone array of length $L$, a source is considered to be
in the near-field if $r < 2L^2/\lambda$, where $r$ is the distance to the source and $\lambda$ is the
wavelength.
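As a quick numerical illustration of this criterion, the sketch below computes the near-field boundary for a given array length and frequency. The function names and the example array length are hypothetical; the speed of sound of 340 m/s is the value used in this paper.

```python
def near_field_limit(array_length_m, frequency_hz, speed_of_sound=340.0):
    """Distance 2 L^2 / lambda below which a source counts as near-field."""
    wavelength = speed_of_sound / frequency_hz
    return 2.0 * array_length_m ** 2 / wavelength


def is_near_field(distance_m, array_length_m, frequency_hz, speed_of_sound=340.0):
    """True if a source at distance_m satisfies r < 2 L^2 / lambda."""
    return distance_m < near_field_limit(array_length_m, frequency_hz, speed_of_sound)


# Hypothetical 0.5 m array with a talker 0.6 m away.
print(is_near_field(0.6, 0.5, 1000.0))  # True: the limit is about 1.47 m at 1 kHz
print(is_near_field(0.6, 0.5, 300.0))   # False: the limit is about 0.44 m at 300 Hz
```

The boundary grows with frequency, so the same talker can be in the near-field at high frequencies and outside it at low frequencies.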
We define the reference microphone as the origin of a 3-dimensional vector
space, as shown in Fig. 2. The position vector for a source in direction $(\theta_s, \phi_s)$,
at distance $r_s$ from the reference microphone, is denoted $\mathbf{p}_s$ and is given by:

$$\mathbf{p}_s = r_s \begin{bmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \end{bmatrix} \begin{bmatrix} \cos\theta_s \sin\phi_s \\ \sin\theta_s \sin\phi_s \\ \cos\phi_s \end{bmatrix}. \tag{1}$$

The microphone position vectors, denoted as $\mathbf{p}_i$ $(i = 1, \ldots, N)$, are similarly
defined. The distance from the source to microphone $i$ is thus

$$d_i = \|\mathbf{p}_s - \mathbf{p}_i\|, \tag{2}$$

where $\|\cdot\|$ is the Euclidean vector norm.

FIG. 2. Near-field propagation model. [Diagram labels: source, microphone i.]

In such a model, the differences in distance to each sensor can be significant
for a near-field source, resulting in phase misalignment across sensors. The
difference in propagation time to each microphone with respect to the reference
microphone $(i = 1)$ is given by

$$\tau_i = \frac{d_i - d_1}{c}, \tag{3}$$

where $c = 340\ \mathrm{m\,s^{-1}}$ for sound. In addition, the wavefront amplitude decays at a
rate proportional to the distance traveled. The resulting amplitude differences
across sensors are negligible for far-field sources, but can be significant in
the near-field case. The microphone attenuation factors, with respect to the
amplitude on the reference microphone, are given by

$$\alpha_i = \frac{d_1}{d_i}. \tag{4}$$

Thus, if $x_1(f)$ is the desired source at the reference microphone, the signal on
the $i$th microphone is given by

$$x_i(f) = \alpha_i x_1(f) e^{-j 2\pi f \tau_i}. \tag{5}$$

Consequently, we define the near-field propagation vector for a source at
distance $r$ and direction $(\theta, \phi)$ as

$$\mathbf{d}(f, r, \theta, \phi) = \begin{bmatrix} \alpha_1 e^{-j 2\pi f \tau_1} & \cdots & \alpha_i e^{-j 2\pi f \tau_i} & \cdots & \alpha_N e^{-j 2\pi f \tau_N} \end{bmatrix}^T. \tag{6}$$
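The chain from source position to propagation vector can be traced numerically with the short sketch below. It is an illustration under stated assumptions rather than the authors' implementation: the function names, the three-microphone geometry, and the 1 kHz test frequency are hypothetical, and the first microphone in the list is taken as the reference at the origin.

```python
import cmath
import math

SPEED_OF_SOUND = 340.0  # m/s, the value used in the text


def source_position(r, theta, phi):
    """Cartesian source position from distance and direction, as in Eq. (1)."""
    return (
        r * math.cos(theta) * math.sin(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(phi),
    )


def propagation_vector(freq_hz, source, mics):
    """Near-field propagation vector; mics[0] is the reference microphone."""
    dists = [math.dist(source, m) for m in mics]        # Eq. (2)
    d_ref = dists[0]
    vector = []
    for d in dists:
        tau = (d - d_ref) / SPEED_OF_SOUND              # Eq. (3), relative delay
        alpha = d_ref / d                               # Eq. (4), relative attenuation
        vector.append(alpha * cmath.exp(-2j * math.pi * freq_hz * tau))  # Eq. (6)
    return vector


# Hypothetical geometry: three microphones 10 cm apart on the x axis,
# source 0.6 m from the reference microphone along the y axis.
mics = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
source = source_position(0.6, math.pi / 2, math.pi / 2)
vec = propagation_vector(1000.0, source, mics)
```

The reference element is unity by construction, and the magnitudes of the other elements fall below one as the microphones get farther from the source, which is the amplitude difference a far-field model ignores.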
`
3. NEAR-FIELD ADAPTIVE BEAMFORMING

The proposed system structure is shown in Fig. 3. The objective of the
proposed technique is to add the benefit of good low frequency directivity
to a standard adaptive beamformer, as low frequency performance is critical
in speech processing applications. The upper path consists of a fixed
near-field superdirective beamformer, while the lower path contains a near-field
compensation unit, a blocking matrix and an adaptive noise canceling filter.
The principal components of the system are discussed in the following sections.
Section 3.1 gives an explanation of the near-field superdirective beamformer.
Section 3.2 proposes the inclusion of a near-field compensation unit in the
adaptive sidelobe canceling path and examines its effect on reducing signal
distortion at the output. Once this near-field compensation has been performed,
a standard generalized sidelobe canceling blocking matrix and adaptive filters
can be applied to reduce the output noise power, as discussed in Section 3.3.

FIG. 3. Near-field adaptive beamformer. [Block diagram: fixed NFSD beamformer in the upper path; near-field compensation, blocking matrix, and adaptive filters in the lower path; output y(f).]
`
`3.1. Near-field Superdirective Beamformer
`Superdirective beamforming techniques are based upon the maximization of
`the array gain, or directivity index. The array gain is defined as the ratio of
`output signal-to-noise ratio to input signal-to-noise ratio and for the general
`case can be expressed in matrix notation as [1]
`
$$G(f) = \frac{\mathbf{w}(f)^H \mathbf{P}(f) \mathbf{w}(f)}{\mathbf{w}(f)^H \mathbf{Q}(f) \mathbf{w}(f)}, \tag{7}$$

where $\mathbf{w}(f)$ is a column vector of channel gains,

$$\mathbf{w}(f) = \begin{bmatrix} w_1(f) & \cdots & w_i(f) & \cdots & w_N(f) \end{bmatrix}^T, \tag{8}$$

$(\cdot)^H$ is the complex conjugate transpose operator, and $\mathbf{P}(f)$ and $\mathbf{Q}(f)$ are
the cross-spectral density matrices of the signal and noise respectively. In
practical speech processing applications the form of the signal and noise
cross-spectral density matrices is generally unknown and must be estimated, either
from mathematical models (fixed beamformers) o