Naganuma

(10) Patent No.: US 11,069,337 B2
(45) Date of Patent: Jul. 20, 2021

(54) VOICE-CONTENT CONTROL DEVICE, VOICE-CONTENT CONTROL METHOD, AND NON-TRANSITORY STORAGE MEDIUM

(71) Applicant: JVC KENWOOD Corporation, Yokohama (JP)

(72) Inventor: Tatsumi Naganuma, Yokohama (JP)

(73) Assignee: JVC KENWOOD Corporation, Yokohama (JP)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 108 days.

(21) Appl. No.: 16/290,983

(22) Filed: Mar. 4, 2019

(65) Prior Publication Data: US 2019/0279611 A1, Sep. 12, 2019

(30) Foreign Application Priority Data: Mar. 6, 2018 (JP) JP2018-039754

(51) Int. Cl.:
    G10L 13/08 (2013.01)
    G10L 17/26 (2013.01)
    (Continued)

(52) U.S. Cl.:
    CPC: G10L 13/08 (2013.01); G06F 3/167 (2013.01); G10L 15/08 (2013.01); G10L 15/16 (2013.01); G10L 15/18 (2013.01); G10L 15/1807 (2013.01); G10L 15/22 (2013.01); G10L 17/26 (2013.01); G10L 15/183 (2013.01); (Continued)

(58) Field of Classification Search:
    CPC: G06N 3/08; G06N 20/00; G06N 3/02; G10L 15/22; G10L 15/16; G10L 15/1815; G10L 2015/223; G10L 2015/227; G10L 15/24; G06F 16/90332; G06F 40/205; G06F 40/30; G06F 16/285; G06F 3/167; G06F 16/3344; G06F 40/268; G06F 40/284; (Continued)

(56) References Cited

    U.S. PATENT DOCUMENTS
    5,852,804 A      12/1998  Sako
    2013/0325471 A1* 12/2013  Rachevsky ....... G06N 20/00; 704/244
    (Continued)

    FOREIGN PATENT DOCUMENTS
    JP  04-204700  7/1992

Primary Examiner: Linda Wong
(74) Attorney, Agent, or Firm: Amin, Turocy & Watson, LLP
`
`( 57 )
`ABSTRACT
`A voice - content control device includes a voice classifying
`unit configured to analyze a voice spoken by a user and
`acquired by a voice acquiring unit to classify the voice as
`either one of a first voice or a second voice , a process
`executing unit configured to analyze the acquired voice to
`execute processing required by the user , and a voice - content
`generating unit configured to generate , based on content of
`the executed processing , output sentence that is text data for
`a voice to be output to the user , wherein the voice - content
`generating unit is further configured to generate a first output
`sentence as the output sentence when the analyzed voice has
`been classified as the first voice , and generate a second
`output sentence in which information is omitted as com
`pared to the first output sentence as the output sentence when
`the analyzed voice has been classified as the second voice .
`5 Claims , 6 Drawing Sheets
`
`16
`
`30
`
`VOICE ACQUIRING UNIT
`
`410
`VOICE
`DETECTING
`UNIT
`
`( 32
`VOICE ANALYZYING UNIT
`
`138
`VOICE CLASSIFYING
`UNIT
`
`12
`VOICE OUTPUT
`UNIT
`
`$ 14
`LIGHTING UNIT
`
`18
`COMMUNICA
`TION UNIT
`
`20
`STORAGE
`
`( 50
`INTENTION ANALYZYING
`UNIT
`( 52
`ACQUISITION CONTENT
`INFORMATION ACQUIRING
`UNIT
`PROCESS EXECUTING UNIT
`
`( 36
`160
`FIRST OUTPUT SENTENCE
`GENERATING UNIT
`62
`SECOND OUTPUT SENTENCE
`GENERATING UNIT
`VOICE - CONTENT
`GENERATING UNIT
`
`< 40
`OUTPUT CONTROLLER
`
`CONTROLLER
`
`VOICE - CONTENT CONTROLLER
`
`-1-
`
`Amazon v. SoundClear
`US Patent 11,069,337
`Amazon Ex. 1001
`
`
`
(51) Int. Cl.:
    G10L 15/22 (2006.01)
    G06F 3/16 (2006.01)
    G10L 25/51 (2013.01)
    G10L 15/18 (2013.01)
    G10L 15/16 (2006.01)
    G10L 15/08 (2006.01)
    G10L 15/183 (2013.01)

(52) U.S. Cl.:
    CPC: G10L 25/51 (2013.01); G10L 2015/227 (2013.01); G10L 2015/228 (2013.01)

(58) Field of Classification Search:
    CPC: G06F 40/289; G06F 16/3329; G06F 16/337; G06F 40/00; G06F 40/56
    See application file for complete search history.

(56) References Cited

    U.S. PATENT DOCUMENTS
    2016/0379638 A1* 12/2016  Basye ............. G10L 15/18; 704/235
    2017/0083281 A1*  3/2017  Shin .............. G10L 13/033
    2017/0110129 A1*  4/2017  Gelfenbeyn ....... G06F 3/167
    2017/0160813 A1*  6/2017  Divakaran ........ G06K 9/00221
    2019/0130900 A1*  5/2019  Tsai .............. G10L 15/22
    2019/0139541 A1*  5/2019  Andersen ......... G10L 15/16
    2019/0164554 A1*  5/2019  Huang ............. G06F 16/3329
    2019/0180740 A1*  6/2019  Nandy ............. G10L 15/30
    2019/0266999 A1*  8/2019  Chandrasekaran ... G09B 5/00

    * cited by examiner
`
`
`
[FIG. 1: Schematic diagram of the voice-content control device 1 according to the first embodiment, showing the voice detecting unit 10, the voice output unit 12, and the lighting unit 14, with the user H.]
`
`
`
[FIG. 2: Schematic block diagram of the voice-content control device 1 according to the first embodiment. The voice-content controller comprises the voice detecting unit 10, the voice output unit 12, the lighting unit 14, the communication unit 18, the storage 20, and the controller 16. The controller 16 contains the voice acquiring unit 30; the voice analyzing unit 32 (with the voice classifying unit 38); the process executing unit 34 (with the intention analyzing unit 50 and the acquisition content information acquiring unit 52); the voice-content generating unit 36 (with the first output sentence generating unit 60 and the second output sentence generating unit 62); and the output controller 40.]
`
`
`
FIG. 3

    INTENTION INFORMATION I: WEATHER
        ATTRIBUTE PARAMETER E0 / ATTRIBUTE CONTENT E1:
            DATE / DAY Z OF MONTH Y OF YEAR X
            LOCATION / TOKYO

FIG. 4

    INTENTION INFORMATION I: WEATHER
        ACQUISITION PARAMETER A0 / ACQUISITION CONTENT INFORMATION A1:
            WEATHER / PARTLY CLOUDY
            AIR TEMPERATURE / HIGHEST AIR TEMPERATURE: 25 DEGREES; LOWEST AIR TEMPERATURE: 15 DEGREES
            CHANCE OF RAINFALL / 20%
`
`
`
[FIG. 5: Flow chart of the output processing for the output sentence according to the first embodiment. START; S10: acquire input voice; S12: analyze input voice and generate text data; S14: extract intention information from text data; S16: execute processing for intention information, or acquire acquisition information; S18: classify input voice; S20: first voice? If YES, S22: generate first output sentence; if NO, S24: generate second output sentence; S26: output the output sentence; END.]
`
`
`
[FIG. 6: Flow chart of another example of the output processing for the output sentence. START; S10: acquire input voice; S12: analyze input voice and generate text data; S14: extract intention information from text data; S16: execute processing for intention information, or acquire acquisition information; S17: generate first output sentence; S18: classify input voice; S20A: first voice? If NO, S24A: generate second output sentence; S26: output the output sentence; END.]
`
`
`
[FIG. 7: Schematic block diagram of the voice processing system 100 according to the second embodiment. A response device 2A comprises the voice detecting unit 10, the voice output unit 12, the lighting unit 14, and a communication unit 15A. The voice-content control device 1A comprises a communication unit 18A, the storage 20, and the controller 16, which contains the voice acquiring unit 30; the voice analyzing unit 32 (with the voice classifying unit 38); the process executing unit 34 (with the intention analyzing unit 50, the acquisition content information acquiring unit 52, and the information acquiring unit 54); the voice-content generating unit 36 (with the first output sentence generating unit 60 and the second output sentence generating unit 62); and the output controller 40.]
`
`
`
VOICE-CONTENT CONTROL DEVICE, VOICE-CONTENT CONTROL METHOD, AND NON-TRANSITORY STORAGE MEDIUM

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority from Japanese Application No. 2018-039754, filed on Mar. 6, 2018, the contents of which are incorporated by reference herein in its entirety.

FIELD

The present application relates to a voice-content control device, a voice-content control method, and a non-transitory storage medium.

BACKGROUND

As disclosed in Japanese Examined Patent Publication No. H07-109560, for example, a voice control device that analyzes a detected voice of a user and performs processing according to the user's intention has been proposed. Furthermore, a voice control device which outputs, via voice, that processing intended by a user has been performed, or which outputs, via voice, content of a user's inquiry, has also been proposed.

However, when a voice processing device that outputs voice is used, the output voice may be heard by a person who is not the user of the voice processing device and is around the voice processing device. For example, if a person around the voice processing device is asleep, the output voice may be an annoyance to that person. In that case, the output voice itself may be decreased in sound volume, but if the output voice is decreased in sound volume too much, the output voice may be hard for the user himself to hear, and the user may be unable to understand the content of the voice. Therefore, when outputting the voice to the user, it is desired that the influence of the output voice on people other than the user be suppressed, and that the content of the output voice be made adequately understandable to the user.

SUMMARY

A voice-content control device, a voice-content control method, and a non-transitory storage medium are disclosed.

According to one aspect, there is provided a voice-content control device, comprising: a voice classifying unit configured to analyze a voice spoken by a user and acquired by a voice acquiring unit to classify the voice as either one of a first voice or a second voice; a process executing unit configured to analyze the voice acquired by the voice acquiring unit to execute processing required by the user; and a voice-content generating unit configured to generate, based on content of the processing executed by the process executing unit, an output sentence that is text data for a voice to be output to the user, wherein the voice-content generating unit is further configured to generate a first output sentence as the output sentence when the acquired voice has been classified as the first voice, and to generate a second output sentence, in which information is omitted as compared to the first output sentence, as the output sentence when the acquired voice has been classified as the second voice.

According to one aspect, there is provided a voice-content control method, comprising: acquiring a voice spoken by a user; analyzing the acquired voice to classify the acquired voice as either one of a first voice and a second voice; analyzing the acquired voice to execute processing intended by the user; and generating, based on content of the executed processing, an output sentence that is text data for a voice to be output to the user, wherein at the generating, a first output sentence is generated as the output sentence when the acquired voice has been classified as the first voice, and a second output sentence, in which a part of the information included in the first output sentence is omitted, is generated as the output sentence when the acquired voice has been classified as the second voice.

According to one aspect, there is provided a non-transitory storage medium that stores a voice-content control program that causes a computer to execute: acquiring a voice spoken by a user; analyzing the acquired voice to classify the acquired voice as either one of a first voice and a second voice; analyzing the acquired voice to execute processing intended by the user; and generating, based on content of the executed processing, an output sentence that is text data for a voice to be output to the user, wherein at the generating, a first output sentence is generated as the output sentence when the acquired voice has been classified as the first voice, and a second output sentence, in which a part of the information included in the first output sentence is omitted, is generated as the output sentence when the acquired voice has been classified as the second voice.

The above and other objects, features, advantages and technical and industrial significance of this application will be better understood by reading the following detailed description of presently preferred embodiments of the application, when considered in connection with the accompanying drawings.
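Before turning to the drawings, the core idea of these aspects, generating a second output sentence that omits part of the information in the first output sentence, can be illustrated with a minimal Python sketch. The data fields and function names below are hypothetical illustrations and are not taken from the claims.

```python
from dataclasses import dataclass

@dataclass
class WeatherInfo:
    """Hypothetical acquisition information A for a weather request (cf. FIG. 4)."""
    weather: str          # e.g. "partly cloudy"
    high_c: int           # highest air temperature
    low_c: int            # lowest air temperature
    rain_chance_pct: int  # chance of rainfall

def generate_output_sentence(info: WeatherInfo, is_first_voice: bool) -> str:
    """Return the first output sentence for a first (normal) voice, or a
    second output sentence with part of the information omitted for a
    second (e.g. whispered) voice."""
    if is_first_voice:
        # First output sentence: full content of the executed processing.
        return (f"Today's weather is {info.weather}, with a high of "
                f"{info.high_c} degrees, a low of {info.low_c} degrees, "
                f"and a {info.rain_chance_pct}% chance of rain.")
    # Second output sentence: shorter, so it stays intelligible at low volume.
    return f"Today's weather is {info.weather}."
```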
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of a voice-content control device according to a first embodiment;
FIG. 2 is a schematic block diagram of the voice-content control device according to the first embodiment;
FIG. 3 is a table illustrating an example of attribute information;
FIG. 4 is a table illustrating acquisition information;
FIG. 5 is a flow chart illustrating a flow of output processing for an output sentence, according to the first embodiment;
FIG. 6 is a flow chart illustrating another example of the flow of the output processing for the output sentence; and
FIG. 7 is a schematic block diagram of a voice processing system according to a second embodiment.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Embodiments of the present application are explained in detail below with reference to the drawings. The embodiments explained below are not intended to limit the present application.

First Embodiment

First, a first embodiment is explained. FIG. 1 is a schematic diagram of a voice-content control device according to the first embodiment.
As shown in FIG. 1, a voice-content control device 1 according to the first embodiment detects a voice V1 spoken by a user H by a voice detecting unit 10, analyzes the detected voice V1 to perform predetermined processing, and outputs a voice V2 by a voice output unit 12. Although the voice V2 is output toward the user H, when other people are present around the voice-content control device 1, the voice V2 can be heard by those people. If, for example, a person around the voice-content control device 1 is asleep, the voice V2 may be an annoyance to that person. The voice-content control device 1 according to this embodiment analyzes the voice V1 and adjusts the text to be output as the voice V2, thereby suppressing the influence of the voice V2 on people other than the user H while allowing the user H to adequately understand the content of the voice V2.

FIG. 2 is a schematic block diagram of the voice-content control device according to the first embodiment. As shown in FIG. 2, the voice-content control device 1 includes the voice detecting unit 10, the voice output unit 12, a lighting unit 14, a controller 16, a communication unit 18, and a storage 20. The voice-content control device 1 is a so-called smart speaker (artificial intelligence (AI) speaker), but is not limited thereto as long as the device has the functions described later. The voice-content control device 1 can be, for example, a smart phone, a tablet, and the like.

The voice detecting unit 10 is a microphone and detects the voice V1 spoken by the user H. The user H speaks the voice V1 toward the voice detecting unit 10 so as to include information for a processing wished to be performed by the voice-content control device 1. The voice detecting unit 10 can be regarded as an input unit that accepts information input externally. An input unit may be provided in addition to the voice detecting unit 10; for example, a switch to adjust the volume of the voice V2 by operation performed by the user H, and the like, may be provided. The voice output unit 12 is a speaker, and outputs sentences (output sentences described later) generated by the controller 16 as the voice V2. The lighting unit 14 is a light source, such as a light emitting diode (LED), and is turned on under control of the controller 16. The communication unit 18 is a mechanism to communicate with external servers, such as a Wi-Fi (registered trademark) module and an antenna, and communicates information with an external server, not shown, under control of the controller 16. The communication unit 18 performs communication of information with the external servers by wireless communication such as Wi-Fi, but the communication of information with the external servers may also be performed by wired communication via connected cables. The storage 20 is a memory that stores information on arithmetic calculation of the controller 16 or programs, and includes, for example, at least one of a random access memory (RAM), a read-only memory (ROM), and an external storage device, such as a flash memory.

The controller 16 is an arithmetic unit, namely, a central processing unit (CPU). The controller 16 includes a voice acquiring unit 30, a voice analyzing unit 32, a process executing unit 34, a voice-content generating unit 36, a voice classifying unit 38, and an output controller 40. The voice acquiring unit 30, the voice analyzing unit 32, the process executing unit 34, the voice-content generating unit 36, the voice classifying unit 38, and the output controller 40 perform the processes described later by reading software/programs stored in the storage 20.

The voice acquiring unit 30 acquires the voice V1 that is detected by the voice detecting unit 10. The voice analyzing unit 32 performs voice analysis of the voice V1 acquired by the voice acquiring unit 30 to convert the voice V1 into text data. The text data is character data/text data that includes the sentence spoken as the voice V1. The voice analyzing unit 32 detects, for example, a voice waveform comprising amplitude and wavelength per time from the voice V1. The voice analyzing unit 32 then replaces the voice waveform per time with characters based on a table in which a relationship between the voice waveforms and the characters is stored, thereby converting the voice V1 into the text data. Note that the converting method can be arbitrarily chosen as long as it enables conversion of the voice V1 into the text data.
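The table-based conversion just described can be illustrated with a deliberately naive Python sketch, in which per-time slices of the waveform are reduced to coarse amplitude and wavelength features and matched against a stored table of characters. A practical voice analyzing unit would use a full speech recognizer; the feature bucketing and the table contents here are hypothetical.

```python
import numpy as np

# Hypothetical stored table relating waveform features to characters:
# (amplitude bucket, wavelength bucket) -> character.
WAVEFORM_TABLE = {
    (1, 3): "a",
    (2, 1): "k",
    # ... one entry per recognizable unit
}

def frame_features(frame: np.ndarray, sample_rate: int) -> tuple:
    """Reduce one time slice to coarse amplitude and wavelength buckets."""
    amplitude_bucket = int(np.abs(frame).max() * 10)
    spectrum = np.abs(np.fft.rfft(frame))
    peak_hz = np.argmax(spectrum) * sample_rate / len(frame)
    wavelength_bucket = int(sample_rate / max(peak_hz, 1.0)) // 100
    return amplitude_bucket, wavelength_bucket

def voice_to_text(signal: np.ndarray, sample_rate: int, frame_len: int = 1024) -> str:
    """Replace the waveform per time with characters based on the table."""
    chars = []
    for start in range(0, len(signal) - frame_len, frame_len):
        key = frame_features(signal[start:start + frame_len], sample_rate)
        chars.append(WAVEFORM_TABLE.get(key, ""))  # skip unmatched frames
    return "".join(chars)
```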
Based on the text data generated by the voice analyzing unit 32, the process executing unit 34 detects information on content of processing that is included in the voice V1 and desired to be executed by the voice-content control device 1, and executes the processing. The process executing unit 34 has an intention analyzing unit 50 and an acquisition content information acquiring unit 52.

The intention analyzing unit 50 acquires the text data that is generated by the voice analyzing unit 32, extracts intention information I based on the text data, and extracts attribute information E based on the intention information I. The attribute information E is information that is associated with the intention information I, and is information that indicates a condition necessary for acquiring information that the user H wishes to acquire. Namely, the attribute information E is an entity.

Firstly, processing for extracting the intention information I will be described. The intention information I, that is, an intent, is information that indicates what kind of processing is intended by the user H to be performed on the voice-content control device 1. In other words, the intention information I is information that indicates what kind of processing is required by the user H to be performed on the voice-content control device 1. The intention analyzing unit 50 extracts the intention information I from the text data by using, for example, natural language processing. In the present embodiment, the intention analyzing unit 50 extracts the intention information I from the text data based on multiple pieces of training data stored in the storage 20. The training data herein is data in which the intention information I has been assigned to text data in advance. That is, the intention analyzing unit 50 extracts the training data that is similar to the text data generated by the voice analyzing unit 32, and regards the intention information I of the extracted training data as the intention information I of the text data generated by the voice analyzing unit 32. Note that the training data is not necessarily required to be stored in the storage 20, and the intention analyzing unit 50 can search for the training data in the external server by controlling the communication unit 18. As long as the intention analyzing unit 50 extracts the intention information I from the text data, the extracting method of the intention information I can be arbitrarily chosen. For example, the intention analyzing unit 50 can read a relationship table of keywords and the intention information I stored in the storage 20, and can extract the intention information I that is associated with a keyword when the keyword in the relationship table is included in the text data.

For example, if the text data corresponds to the text "How's the weather today?", by performing the above described analysis, the intention analyzing unit 50 recognizes that processing of notifying the user H of weather information is information on the processing required by the user H, that is, the intention information I. Furthermore, if, for example, the text data corresponds to the text "Turn on the light.", by performing the above described analysis, the intention analyzing unit 50 recognizes that processing of turning the power of the light on is information on the processing required by the user H, that is, the intention information I.
As described above, the intention information I is classified into information for notification of the required information, and information for control of devices as being required.

The extracting method of the intention information I using text data can be arbitrarily chosen, not limited thereto. For example, the voice-content control device 1 can be configured to store a relationship table of keywords and the intention information I in the storage 20, and to detect the intention information I associated with a keyword when the keyword is included in text data of the voice V1 spoken by the user H. As an example of this case, a keyword "konnichiwa" may be associated with weather information and news. In this case, when the user H speaks the voice V1 "konnichiwa", the intention analyzing unit 50 detects the weather information and the news as the intention information I.
Described next is the attribute information E. FIG. 3 is a table illustrating an example of the attribute information. The attribute information E, that is, the entity, is a condition needed upon execution of the processing which is required by the user H and is extracted as the intention information I, that is, a parameter. For example, if the intention information I is weather information, the attribute information E includes information on a location indicating where the weather information is on, and information on a date indicating when the weather information is for. Furthermore, as illustrated in FIG. 3, the attribute information E includes an attribute parameter E0 and attribute content E1. The attribute parameter E0 is information indicating the type of parameter, that is, the kind of condition, and the attribute content E1 indicates content of the attribute parameter E0. That is, if the attribute information E is information on a location, the attribute parameter E0 is information indicating that the condition is the location, and the attribute content E1 is information indicating that the location is Tokyo. Moreover, if the attribute information E is information on a date, the attribute parameter E0 is information indicating that the condition is the date, and the attribute content E1 is information indicating that the date is the day Z of the month Y of the year X.

According to this embodiment, the intention analyzing unit 50 extracts the attribute information E based on the extracted intention information I. More specifically, the intention analyzing unit 50 selects and extracts the attribute parameter E0 from the extracted intention information I. The intention analyzing unit 50 reads out a relation table between the intention information I and the attribute parameters E0 stored in the storage 20, and detects, from the relation table, the intention information I matched with the extracted intention information I. The intention analyzing unit 50 then extracts the attribute parameter E0 associated with the matched intention information I. However, the intention analyzing unit 50 may communicate with an external server via the communication unit 18, and acquire the relation table from the external server.

After the intention analyzing unit 50 has extracted the attribute parameters E0, the intention analyzing unit 50 sets the attribute content E1 for each of the attribute parameters E0. The intention analyzing unit 50 extracts the attribute content E1 from, for example, the text data generated by the voice analyzing unit 32. That is, if a keyword "today" is included in the text data, the intention analyzing unit 50 sets the attribute content E1 of the attribute parameter E0, the date, to today. Furthermore, the intention analyzing unit 50 may set the attribute content E1 for the attribute parameter E0 in advance. For example, if the intention information I is weather information, set data indicating that the attribute content E1 of the location is Tokyo may be stored in the storage 20 in advance. Accordingly, even if a keyword indicating the location is not included in the text data, the intention analyzing unit 50 is able to set the attribute content E1 of the location to Tokyo. Furthermore, the intention analyzing unit 50 may set the attribute content E1 by communicating with the external server through the communication unit 18. In this case, for example, the intention analyzing unit 50 acquires the current location by communication with a global positioning system (GPS), and sets the location as the attribute content E1.

The intention analyzing unit 50 extracts the intention information I and the attribute information E as described above, but without being limited thereto; any extraction methods for the intention information I and the attribute information E may be used. FIG. 3 illustrates a case where the weather information is the intention information I, but the intention information I and attribute information E are able to be extracted similarly in other cases. For example, if information indicating that the power of the light is to be turned on is the intention information I, the attribute information E includes information on the location of the light, and information on the date and time when the power is to be turned on.
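The two-step handling just described, looking up the attribute parameters E0 for the matched intention information I and then filling in the attribute content E1 from the text data or from stored defaults, might look like the following sketch. The relation table, the defaults, and the function names are hypothetical.

```python
# Hypothetical relation table: intention information I -> attribute parameters E0.
INTENT_TO_E0 = {
    "weather_information": ["date", "location"],
    "light_power_control": ["location", "datetime"],
}

# Hypothetical attribute content E1 stored in advance (e.g. location -> Tokyo).
DEFAULT_E1 = {"location": "Tokyo"}

def extract_attributes(intent: str, text_data: str) -> dict:
    """Build the attribute information E as {E0: E1} for one intent."""
    attributes = {}
    for e0 in INTENT_TO_E0.get(intent, []):
        if e0 == "date" and "today" in text_data.lower():
            attributes[e0] = "today"          # E1 taken from the text data
        elif e0 in DEFAULT_E1:
            attributes[e0] = DEFAULT_E1[e0]   # E1 set in advance
    return attributes
```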
The acquisition content information acquiring unit 52 illustrated in FIG. 2 executes, based on content of the intention information I, processing required by the user. If the intention information I indicates that a device is to be controlled, the acquisition content information acquiring unit 52 executes processing of content of the intention information I. For example, the acquisition content information acquiring unit 52 turns on the power of the light at the location indicated by the attribute information E.

FIG. 4 is a table illustrating acquisition information. If the intention information I indicates notification of required information, the acquisition content information acquiring unit 52 acquires the required information, that is, acquisition information A. The acquisition information A is information that the user H is to be notified of, and is, in other words, information determined by the process executing unit 34 to be information that the user H requires to be notified of. Based on the intention information I extracted by the intention analyzing unit 50, the acquisition content information acquiring unit 52 acquires the acquisition information A. More specifically, the acquisition content information acquiring unit 52 selects and extracts the acquisition parameter A0 from the extracted intention information I. The acquisition content information acquiring unit 52 reads out a relation table between the intention information I and the acquisition parameters A0 stored in the storage 20, and detects, from the relation table, the intention information I matched with the extracted intention information I. The acquisition content information acquiring unit 52 then extracts the acquisition parameter A0 associated with the matched intention information I. However, the acquisition content information acquiring unit 52 may communicate with the external server via the communication unit 18, and acquire the relation table from the external server.
After having extracted the acquisition parameters A0, the acquisition content information acquiring unit 52 acquires, based on the attribute information E, the acquisition content information A1 for each of the acquisition parameters A0. Specifically, for each of the acquisition parameters A0, the acquisition content information acquiring unit 52 acquires the acquisition content information A1 corresponding to the attribute content E1 set for that attribute parameter E0. By communicating with the external server/external device via the communication unit 18, the acquisition content information acquiring unit 52 acquires the acquisition content information A1 from the external server for each of the acquisition parameters A0. However, if the acquisition content information A1 has been stored in the storage 20, the acquisition content information acquiring unit 52 may acquire the acquisition content information A1 from the storage 20. That is, the acquisition content information A1 can be said to be data that the acquisition content information acquiring unit 52 acquires from a database of the external server, the storage 20, or the like.
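The acquisition lookup mirrors the attribute handling, and a short sketch makes the parallel concrete: the acquisition parameters A0 are read from a relation table for the matched intention information I, and the acquisition content information A1 is then fetched for each A0 from a database. The relation table, the database interface, and all names here are hypothetical.

```python
# Hypothetical relation table: intention information I -> acquisition parameters A0.
INTENT_TO_A0 = {
    "weather_information": ["weather", "air_temperature", "chance_of_rainfall"],
}

def acquire_information(intent: str, attributes: dict, database) -> dict:
    """Build the acquisition information A as {A0: A1}, querying a database
    (external server or storage 20) with the attribute content E1,
    e.g. location="Tokyo", date="today"."""
    acquisition = {}
    for a0 in INTENT_TO_A0.get(intent, []):
        # A1 corresponding to the attribute content E1 set for the request.
        acquisition[a0] = database.fetch(parameter=a0, **attributes)
    return acquisition
```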
As described above, the acquisition content information A1 is information that the acquisition content information acquiring unit 52 has acquired by the communication with the external server, or read out from the storage 20. In the example of FIG. 4, the intention information I is weather, and the acquisition parameters A0 are weather, air temperature, and chance of rainfall. In this case, the acquisition content information acquiring unit 52 acquires the acquisition content information A1 for the respective acquisition parameters A0, that is, weather, air temperature, and chance of rainfall, in Tokyo, on the day Z of the month Y of the year X. In the example of FIG. 4, the acquisition content information A1 for weather is

frequency. The voice classifying unit 38 classifies the voice V1 as either one of the first voice V1A or the second voice V1B by using a peak frequency that is equal to or higher than a predetermined intensity in the spectrum as a feature value. For example, the voice classifying unit 38 determines the voice to be a whisper and classifies it as the second voice V1B when the peak frequency is equal to or lower than the threshold, and determines the voice to be not a whisper and classifies it as the first voice V1A when the peak frequency is higher than the threshold. Note that the voice classifying unit 38 can perform the classification into the first voice V1A and the second voice V1B by using any method. For example, the voice classifying unit 38 can perform the classification into the first voice V1A and the second voice V1B by using a slope of the peak in the spectrum as a feature value. Moreover, the voice classifying unit 38 can perform the classification into the first voice V1A and the second voice V1B by using any one of a volume of the voice V1, a speaking speed of the user in the voice V1, and a volume ratio between the speech of the user and wind noise as a feature value.
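The peak-frequency rule described above, classifying the voice as the whispered second voice V1B when the dominant spectral peak lies at or below a threshold, can be sketched as follows. The threshold and intensity constants are hypothetical placeholders, not values taken from this description.

```python
import numpy as np

PEAK_FREQ_THRESHOLD_HZ = 300.0  # hypothetical threshold
MIN_PEAK_INTENSITY = 1.0        # hypothetical predetermined intensity

def classify_voice(signal: np.ndarray, sample_rate: int) -> str:
    """Classify V1 as the first voice 'V1A' (not a whisper) or the second
    voice 'V1B' (whisper) from the peak frequency of its spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # Consider only spectral peaks at or above the predetermined intensity.
    strong = spectrum >= MIN_PEAK_INTENSITY
    if not strong.any():
        return "V1A"  # fallback: treat as a normal voice
    peak_freq = freqs[strong][np.argmax(spectrum[strong])]
    # A low peak frequency indicates a whisper, i.e. the second voice V1B.
    return "V1B" if peak_freq <= PEAK_FREQ_THRESHOLD_HZ else "V1A"
```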
Furthermore, a proximity sensor can be provided in the voice-content control device 1, a distance between the user H and the voice