DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. The title is too long.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

This application includes one or more claim limitations that use the word “means” or “step” but are nonetheless not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph because the claim limitation(s) recite(s) sufficient structure, materials, or acts to entirely perform the recited function.  Such claim limitation(s) is/are:
a detection unit (See Figs. 3, 5, 7, 9, and 12: 112; ¶0063): Claims 1, 2, 7, 15
a determination unit (See Figs. 3, 5, 7, 9, and 12: 113; ¶0063): Claims 1, 2, 8-9, 12, 15
a response control unit (See Figs. 3, 5, 7, 9, and 12: 114; ¶0063): Claims 1, 3-5, 10, 15
a measurement unit (See ¶0163): Claims 9, 10
a display unit (See Fig. 2: 15; ¶0083): Claim 11


Because this/these claim limitation(s) is/are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are not being interpreted to cover only the corresponding structure, material, or acts described in the specification as performing the claimed function, and equivalents thereof.
If applicant intends to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to remove the structure, materials, or acts that performs the claimed function; or (2) present a sufficient showing that the claim limitation(s) does/do not recite sufficient structure, materials, or acts to perform the claimed function.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-2, 4, 12-13, and 15-17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kunitake et al. (US #2017/0186428).

Regarding Claim 1, Kunitake discloses an information processing apparatus (Figs. 1-2) comprising:
a detection unit that detects a positional relationship between a user and an object on a basis of an image captured by a camera (Kunitake ¶0055 discloses the voice interaction device 100 is coupled to a controlled-object equipment piece 140 via a network. ¶0056-¶0057 discloses the voice interaction device 100 includes an input unit 110, a voice interaction processing unit 120, and an output unit 130, such as a sensor unit 111, e.g., a human sensor, a camera [an imaging device], and a line-of-sight sensor, and a voice input unit 112. ¶0058 discloses the human sensor detects whether or not a person is present in a predetermined distance from the controlled-object equipment piece 140. The camera takes an image of a predetermined range including the controlled-object equipment piece 140. ¶0059 discloses the line-of-sight sensor includes a camera that takes an image of a predetermined range including the controlled-object equipment piece 140);
a determination unit that determines a situation of the user on a basis of the positional relationship between the user and the object detected by the detection unit (Kunitake ¶0058 discloses both sensors [human sensor and camera] output data indicating the result to the voice interaction processing unit 120. ¶0059 discloses the line-of-sight sensor indicates the identified direction of the line of sight of the person to the voice interaction processing unit 120, which, per ¶0062, operates as a state recognition unit 121 [a determination unit] and an intention understanding unit 123 [a discrimination unit]); and
a response control unit that executes a voice response corresponding to the situation of the user determined by the determination unit (Kunitake ¶0062 discloses the voice interaction processing unit 120 operates as a state recognition unit 121 [a determination unit], a voice recognition unit 122 [sensing unit], an intention understanding unit 123 [a discrimination unit], an action selection unit 124, an equipment control unit 125, a response generation unit 126, and a voice synthesis unit 127).

Regarding Claim 2, Kunitake discloses the information processing apparatus according to claim 1,
wherein the detection unit detects a positional relationship between a part of the user and the object (Kunitake ¶0058 discloses the human sensor detects whether or not a person is present in a predetermined distance from the controlled-object equipment piece 140. The camera takes an image of a predetermined range including the controlled-object equipment piece 140. ¶0059 discloses the line-of-sight sensor includes a camera that takes an image of a predetermined range including the controlled-object equipment piece 140), and
the determination unit determines the situation of the user on a basis of the positional relationship between the part of the user and the object (Kunitake ¶0063 discloses the state recognition unit 121 [a determination unit, ¶0062] determines whether or not the state of the user or the state around the controlled-object equipment piece 140 is a suitable state for control on the basis of one or more pieces of data output from the sensor unit 111).

Regarding Claim 4, Kunitake discloses the information processing apparatus according to claim 1,
wherein the response control unit executes the voice response on a basis of a sound signal collected by a microphone (Kunitake ¶0061 discloses the voice input unit 112 outputs voice data input to a sound collection device [a sound collector] to the voice interaction processing unit 120. Examples of the sound collection device include a directional microphone attached to the main body of the voice interaction device 100, a handheld microphone coupled to the voice interaction device 100 using wires or wirelessly, a pin microphone, and a desktop microphone. Further, the voice input unit 112 may communicate with a device having a sound collection function and a communication function, such as a smartphone or a tablet, to acquire voice data input to the device and may output the acquired voice data to the voice interaction processing unit 120).

Regarding Claim 12, Kunitake discloses the information processing apparatus according to claim 1,
wherein the determination unit determines the situation of the user on a basis of an invalid word (Kunitake ¶0097 discloses when in step S103, the contents of an utterance of a user include only a transitive verb related to equipment control, the intention understanding unit 123 fails to discriminate the controlled-object equipment piece 140 even if the intention understanding unit 123 can interpret the utterance as an utterance for requesting equipment control because the contents of the utterance include no noun).

Regarding Claim 13, Kunitake discloses the information processing apparatus according to claim 1,
wherein the situation of the user includes at least any of a state (Kunitake ¶0063 discloses the state recognition unit 121 determines whether or not the state of the user is a suitable state for control) or an action of the user (Kunitake ¶0091 discloses in contrast, when the state recognition unit 121 determines the state of the user and the state around the equipment piece are unsuitable states for the control [NO in step S105], the action selection unit 124 causes the response generation unit 126 to generate a verification response statement).

Claims 15-17 are rejected for the same reasons as set forth in Claim 1.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 9, and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Kunitake et al. (US #2017/0186428) in view of Tokuji et al. (JP #2018/045192 A).

Regarding Claim 3, Kunitake discloses the information processing apparatus according to claim 1,
wherein the response control unit (Kunitake Fig. 1: 126) controls at least one of 
whether or not to make a voice response (Kunitake ¶0078 discloses on the basis of the intention interpretation result obtained by the intention understanding unit 123 and the determination result obtained by the state recognition unit 121, the action selection unit 124 selects whether to cause the equipment control unit 125 to perform equipment control, whether to cause the response generation unit 126 to generate a verification response statement, or whether to perform another task),
content of a response (Kunitake ¶0080 discloses under instructions of the action selection unit 124, the response generation unit 126 generates a verification response statement and outputs text data that indicates the verification response statement to the voice synthesis unit 127. For example, when the contents of the utterance are presented as "Open the refrigerator", the verification response statement becomes "Are you sure you want to open the refrigerator?", which has contents for asking back about the contents of the utterance), speed of voice, sound quality of voice, or a type of voice in accordance with the situation of the user.
Kunitake may not explicitly disclose volume of voice.
However, Tokuji (abstract) teaches volume of voice (Tokuji ¶0044 discloses Fig. 5b is a table showing the relationship between the distance between the user and the robot and the reference output volume stored in advance in the smartphone 110. The reference output volume is a volume which is a reference of a volume when the robot 100 speaks. As the distance between the user and the robot increases, the robot 100 needs to speak at a large volume. Thus, the reference output volume is set to be larger as the distance between the user and the robot increases).
Kunitake and Tokuji are analogous art because both pertain to the control of dialog-based devices. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the voice recognition device (as taught by Kunitake) to change the volume based on the user's distance from the robot (as taught by Tokuji, ¶0044) to advantageously provide a voice interactive device (Tokuji, ¶0007).
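
For illustration only, the distance-to-volume relationship attributed to Tokuji ¶0044 (a stored table in which the reference output volume grows with the distance between the user and the robot) can be sketched as a banded lookup. This is a minimal sketch, not Tokuji's implementation: the distance bands, the volume levels, and the names `DISTANCE_BAND_LIMITS_M`, `OUTPUT_VOLUME_LEVELS`, and `reference_output_volume` are all assumptions, and none of the numbers are taken from Tokuji Fig. 5b.

```python
from bisect import bisect_right

# Hypothetical upper limits (meters) of each distance band.
# These values are illustrative assumptions, not data from Tokuji Fig. 5b.
DISTANCE_BAND_LIMITS_M = [1.0, 2.0, 4.0]

# Hypothetical reference output volume per band (one more entry than limits);
# the levels increase with distance, as ¶0044 describes.
OUTPUT_VOLUME_LEVELS = [3, 5, 7, 9]


def reference_output_volume(distance_m: float) -> int:
    """Return the assumed reference output volume for a user distance."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    # bisect_right finds the index of the band containing distance_m.
    band = bisect_right(DISTANCE_BAND_LIMITS_M, distance_m)
    return OUTPUT_VOLUME_LEVELS[band]
```

Under these assumed bands, a user farther away always maps to a volume at least as large as that for a nearer user, which is the only property the cited paragraph is relied on for.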

Regarding Claim 9, Kunitake discloses the information processing apparatus according to claim 1, further comprising
a measurement unit that can measure a distance between the user and the object (Kunitake ¶0066 discloses human sensor indicates that a person is present in the predetermined distance from the controlled object equipment piece 140),
wherein the determination unit determines the situation of the user on a basis of a positional relationship between the user and the object including the distance between the user and the object (Kunitake ¶0066 discloses in these cases, the state recognition unit 121 determines that the state is a state where a person is detected around the controlled-object equipment piece 140).
Kunitake may not explicitly disclose a measurement unit that can measure a distance between the user and the object.
However, Tokuji (abstract) teaches a measurement unit that can measure a distance between the user and the object (Tokuji ¶0011 discloses the distance acquiring means can include an image acquiring means for acquiring an image of a user, a detecting means for detecting a face or a body of a user from the image, and a distance detecting means for determining a distance between the user and the face or the body of the user detected by the detecting means. The distance acquisition means can be distance sensors. The distance sensor can be a laser, an ultrasonic wave, an infrared ray, or the like, or a stereo type, a DFD [Depth from Defocus] type, or the like. ¶0041 discloses the distance between the robot 100 and the user based on the volume and the face size).
Kunitake and Tokuji are analogous art because both pertain to the control of dialog-based devices. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the voice recognition device (as taught by Kunitake) to measure the distance of the user from the robot (as taught by Tokuji, ¶0041) to advantageously provide a voice interactive device (Tokuji, ¶0007).

Regarding Claim 10, Kunitake discloses the information processing apparatus according to claim 1, further comprising
a measurement unit that measures a distance to the user (Kunitake ¶0066 discloses human sensor indicates that a person is present in the predetermined distance from the controlled object equipment piece 140).
Kunitake may not explicitly disclose a measurement unit that can measure a distance between the user and the object, wherein the response control unit executes the voice response in a case where the distance to the user measured by the measurement unit and a sound pressure of a sound signal collected by a microphone satisfy a predetermined condition.
However, Tokuji (abstract) teaches a measurement unit that can measure a distance between the user and the object (Tokuji ¶0011 discloses the distance acquiring means can include an image acquiring means for acquiring an image of a user, a detecting means for detecting a face or a body of a user from the image, and a distance detecting means for determining a distance between the user and the face or the body of the user detected by the detecting means. The distance acquisition means can be distance sensors. The distance sensor can be a laser, an ultrasonic wave, an infrared ray, or the like, or a stereo type, a DFD [Depth from Defocus] type, or the like),
wherein the response control unit executes the voice response in a case where the distance to the user measured by the measurement unit and a sound pressure of a sound signal collected by a microphone satisfy a predetermined condition (Tokuji ¶0043 discloses Fig. 5a is a table showing a relationship between a distance between a user and a robot and a reference input volume stored in advance in the smartphone 110. The reference input volume is a volume that is assumed to be input to the robot 100 when the user utters at a normal level of sound volume. While a user generally tends to speak louder as farther away from the robot, it is expected that the larger the distance, the smaller the input sound volume. Thus, the reference input volume is set to be smaller as the distance between the user and the robot increases).
Kunitake and Tokuji are analogous art because both pertain to the control of dialog-based devices. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the voice recognition device (as taught by Kunitake) to measure the distance of the user from the robot (as taught by Tokuji, ¶0041) to advantageously provide a voice interactive device (Tokuji, ¶0007).
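
For illustration only, one way a measured distance and a measured sound pressure could jointly satisfy a "predetermined condition" is to compare the measured level against a distance-dependent reference input level of the kind attributed to Tokuji ¶0043. This is a minimal sketch under stated assumptions: the comparison rule itself, the distance bands, the decibel values, and the names `INPUT_BAND_LIMITS_M`, `REFERENCE_INPUT_LEVEL_DB`, and `should_respond` are hypothetical and are not taken from the claims, from Kunitake, or from Tokuji Fig. 5a.

```python
from bisect import bisect_right

# Hypothetical upper limits (meters) of each distance band (assumed values).
INPUT_BAND_LIMITS_M = [1.0, 2.0, 4.0]

# Hypothetical reference input level per band, in dB (assumed values);
# the levels decrease with distance, as ¶0043 describes.
REFERENCE_INPUT_LEVEL_DB = [65.0, 60.0, 55.0, 50.0]


def should_respond(distance_m: float, sound_pressure_db: float) -> bool:
    """Assumed condition: respond only if the measured level reaches the
    reference input level for the band containing the measured distance."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    band = bisect_right(INPUT_BAND_LIMITS_M, distance_m)
    return sound_pressure_db >= REFERENCE_INPUT_LEVEL_DB[band]
```

With these assumed values, an utterance measured at 56 dB would pass the condition at 3 m but not at 0.5 m, because the reference level is lower for the farther band.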

Claims 5-8, 11, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Kunitake et al. (US #2017/0186428) in view of Yamada (US #2015/0331490).

Regarding Claim 5, Kunitake discloses the information processing apparatus according to claim 4,
wherein the microphone can detect a direction of the sound signal collected (Kunitake ¶0061 discloses examples of the sound collection device include a directional microphone attached to the main body of the voice interaction device 100. Further, the voice input unit 112 may communicate with a device having a sound collection function and a communication function, such as a smartphone or a tablet, to acquire voice data input to the device and may output the acquired voice data to the voice interaction processing unit 120).
Kunitake may not explicitly disclose wherein the microphone is an array microphone that can detect a direction of the sound signal collected.
However, Yamada teaches wherein the microphone is an array microphone that can detect a direction of the sound signal collected (Yamada ¶0042 discloses the information input unit 20 includes a microphone array composed of a plurality of microphones serving as a voice input unit and a camera [imaging unit] that serves as an image input unit and captures a moving image. ¶0050 discloses the microphone array 22; Figs. 2-3).
Kunitake and Yamada are analogous art because both pertain to the control of dialog-based devices. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kunitake in light of the teachings of Yamada to include a microphone array for voice input (as taught by Yamada, ¶0050) to accurately determine a desired utterance section uttered by the user even in a noisy environment and to implement high-accuracy voice recognition (Yamada, ¶0011).

Regarding Claim 6, Kunitake in view of Yamada discloses the information processing apparatus according to claim 5,
wherein the response control unit does not execute the voice response in a case where an object that produces sound is positioned in the direction of the sound signal collected by the array microphone (Kunitake ¶0089 discloses when the intention understanding unit 123 interprets the utterance of the user as an equipment-control requesting utterance [YES in step S104], the state recognition unit 121 determines whether or not the state of the user [who has issued the utterance] or the state around the [controlled-object] equipment piece 140, is a suitable state for the control [step S105]. ¶0090 discloses when the state recognition unit 121 determines that the state of the user or the state around the equipment piece is a suitable state for the control [YES in step S105], the action selection unit 124 instructs the equipment control unit 125 to perform the control of the controlled-object equipment piece 140 requested by the user on the basis of the intention interpretation result obtained by the intention understanding unit 123. Accordingly, the equipment control unit 125 generates an equipment control command for performing the control of the controlled-object equipment piece 140 as instructed and outputs the equipment control command to the controlled-object equipment piece 140 [step S106]. As a result, the controlled-object equipment piece 140 operates in accordance with the input equipment control command. Therefore, it would have been obvious for the response control unit to not execute the voice response.).
Kunitake may not explicitly disclose sound signal collected by an array microphone.
However, Yamada teaches a sound signal collected by an array microphone (Yamada ¶0042 discloses the information input unit 20 includes a microphone array composed of a plurality of microphones serving as a voice input unit and a camera [imaging unit] that serves as an image input unit and captures a moving image. ¶0050 discloses the microphone array 22; Figs. 2-3).
Kunitake and Yamada are analogous art because both pertain to the control of dialog-based devices. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kunitake in light of the teachings of Yamada to include a microphone array for voice input (as taught by Yamada, ¶0050) to accurately determine a desired utterance section uttered by the user even in a noisy environment and to implement high-accuracy voice recognition (Yamada, ¶0011).

Regarding Claim 7, Kunitake in view of Yamada discloses the information processing apparatus according to claim 5. But Kunitake may not explicitly disclose wherein directivity of the array microphone is adjusted to a direction of the user detected by the detection unit.
However, Yamada teaches wherein directivity of the array microphone is adjusted to a direction of the user detected by the detection unit (Yamada ¶0053 discloses microphone array 201 including plurality of microphones 1 to 4 arranged at different positions acquires a sound from a voice source 202 positioned in a specific direction; Fig. 4).
Kunitake and Yamada are analogous art because both pertain to the control of dialog-based devices. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kunitake in light of the teachings of Yamada to include a microphone array for voice input (as taught by Yamada, ¶0050) to accurately determine a desired utterance section uttered by the user even in a noisy environment and to implement high-accuracy voice recognition (Yamada, ¶0011).

Regarding Claim 8, Kunitake in view of Yamada discloses the information processing apparatus according to claim 5. But Kunitake may not explicitly disclose further comprising a plurality of the array microphones, wherein the array microphone that collects sound is selected on a basis of the situation of the user determined by the determination unit.
However, Yamada teaches a plurality of the array microphones (Yamada ¶0042 discloses the information input unit 20 includes a microphone array composed of a plurality of microphones serving as a voice input unit and a camera [imaging unit] that serves as an image input unit and captures a moving image. ¶0050 discloses the microphone array 22; Figs. 2-3),
wherein the array microphone that collects sound is selected on a basis of the situation of the user determined by the determination unit (Yamada ¶0067 discloses the voice source waveform is used in the process of setting a sound signal selected based on the voice source direction estimated by the voice source direction estimating unit 132 and the voice section information detected by the voice section detecting unit 133 as an analysis target and analyzing a change in the frequency level or the like).
Kunitake and Yamada are analogous art because both pertain to the control of dialog-based devices. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kunitake in light of the teachings of Yamada to include a microphone array for voice input (as taught by Yamada, ¶0050) to accurately determine a desired utterance section uttered by the user even in a noisy environment and to implement high-accuracy voice recognition (Yamada, ¶0011).

Regarding Claim 11, Kunitake discloses the information processing apparatus according to claim 1,
wherein the information processing apparatus further includes a display unit (Kunitake ¶0084 discloses although the output unit 130 includes the one or more voice output units 131, instead of including the one or more voice output units 131, the verification response statement or the like indicated by the text data generated by the response generation unit 126 can be displayed on a display device, such as a display unit incorporated in the voice interaction device 100, or on an external display device coupled to the voice interaction device 100; Fig. 1: output unit 130).
Kunitake may not explicitly disclose the display unit displays at least any of a fact that a response is in progress, a reason for not responding, or a situation of a room.
However, Yamada teaches the display unit displays at least any of a fact that a response is in progress (Yamada ¶0155 discloses in step S207, when the voice source direction and the voice section are decided, a process of notifying the user of the decision can be performed, and, for example, a process of outputting an image such as an icon representing the decision to a display unit can be performed. ¶0258 discloses when it is determined in step S505 [Figs. 18-20] that the user [utterer] is viewing a predetermined specific position, in step S506, the user is notified of the fact that voice recognition can be performed. For example, a message can be displayed on a part of a display unit of the television),
a reason for not responding (Yamada ¶0259 discloses however, when it is determined in step S505 that the user [utterer] is not viewing a predetermined specific position, in step S507, the user is notified of the fact that voice recognition is not performed. For example, this process can be also performed such that a message is displayed on a part of the display unit of the television), or a situation of a room.
Kunitake and Yamada are analogous art because both pertain to the control of dialog-based devices. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kunitake in light of the teachings of Yamada to display a message that voice recognition can be performed when the user [utterer] is viewing a predetermined specific position (as taught by Yamada, ¶0258) to accurately determine a desired utterance section uttered by the user even in a noisy environment and to implement high-accuracy voice recognition (Yamada, ¶0011).

Regarding Claim 14, Kunitake discloses the information processing apparatus according to claim 13, but may not explicitly disclose wherein the situation of the user includes at least any of a sleeping situation, a relaxing situation, a situation of watching television, or a situation of having a conversation with a family member.
However, Yamada teaches wherein the situation of the user includes at least any of a sleeping situation, a relaxing situation, a situation of watching television (Yamada ¶0043 discloses as illustrated in Fig. 1, users 31 to 34 which are television viewers are in front of the television which is the voice recognition device 10), or a situation of having a conversation with a family member.
Kunitake and Yamada are analogous art because both pertain to the control of dialog-based devices. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kunitake in light of the teachings of Yamada to include a microphone array for voice input on a television so that television viewers can input voice (as taught by Yamada, ¶0043) to accurately determine a desired utterance section uttered by the user even in a noisy environment and to implement high-accuracy voice recognition (Yamada, ¶0011).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YOGESHKUMAR G PATEL whose telephone number is (571)272-3957. The examiner can normally be reached 7:30 AM-4 PM PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen, can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/YOGESHKUMAR PATEL/
Primary Examiner, Art Unit 2651