DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
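As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;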
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) [and associated disclosure] is/are: 
“a speaker identification unit,” [Fig. 3, 76]
“a semantic analysis unit,” [Fig. 3, 78]
“a tracking unit,” [Fig. 3, 74]
“a voice session generation unit,” [Fig. 3, 75]
“a voice recognition unit,” [Fig. 3, 77]
“a response generation unit,” [Fig. 3, 79]
“an imaging unit,” [Fig. 3, 71] [a camera, see para 0037]
“a voice acquisition unit,” [Fig. 3, 72] [a microphone, see para 0038] and
“a user tracking unit” [Fig. 3, 74, 75, 76, see para 0047]
as recited in claims 1 – 20.

If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1 – 20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a 

Claims 1 – 20 are also rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the enablement requirement.  The claim(s) contains subject matter which was not described in the specification in such a way as to enable one skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention.
Claims 1 – 20 recite various units interpreted under 35 U.S.C. 112(f) as noted above.  The claimed elements are detailed in Fig. 3 with the following reference numerals: “a speaker identification unit” [76], “a semantic analysis unit” [78], “a tracking unit” [74], “a voice session generation unit” [75], “a voice recognition unit” [77], “a response generation unit” [79], “an imaging unit” [71], “a voice acquisition unit” [72], and “a user tracking unit” [74, 75, and 76; see para [0047], detailing “the tracking unit 74, the voice session generation unit 75, and the speaker identification unit 76 can be defined as a user tracking unit”].
Of these elements, the imaging unit [71] is detailed with sufficient structure as being a camera in para [0037].  The voice acquisition unit [72] is detailed as a microphone in para [0038].  The remaining units, [76, 78, 74, 75, 79], are not disclosed as being directed to any particular structure, and instead are disclosed as computer program units executed by CPU 51, see para [0035] “Functional blocks illustrated in Fig. 
As to the written description requirement, MPEP 2161.01 I. details determining whether there is adequate written description for a computer-implemented functional claim limitation, noting:
Similarly, original claims may lack written description when the claims define the invention in functional language specifying a desired result but the specification does not sufficiently describe how the function is performed or the result is achieved. For software, this can occur when the algorithm or steps/procedure for performing the computer function are not explained at all or are not explained in sufficient detail (simply restating the function recited in the claim is not necessarily sufficient). In other words, the algorithm or steps/procedure taken to perform the function must be described with sufficient detail so that one of ordinary skill in the art would understand how the inventor intended the function to be performed. See MPEP §§ 2163.02 and 2181, subsection IV.

Further as to the written description requirement, MPEP 2163.03 VI. details written description circumstances arising from indefiniteness of a means plus function limitation, noting:
A claim limitation expressed in means- (or step-) plus-function language "shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof." 35 U.S.C. 112(f)  or pre-AIA  35 U.S.C. 112, sixth paragraph. If the specification fails to disclose sufficient corresponding structure, materials, or acts that perform the entire claimed function, then the claim limitation is indefinite because the applicant has in effect failed to particularly point out and distinctly claim the invention as required by 35 U.S.C. 112(b)  or pre-AIA  35 U.S.C. 112, second paragraph. In re Donaldson Co., 16 F.3d 1189, 1195, 29 USPQ2d 1845, 1850 (Fed. Cir. 1994) (en banc). Such a limitation also lacks an adequate written description as required by 35 U.S.C. 112(a)  or pre-AIA  35 U.S.C. 112, first paragraph, because an indefinite, unbounded functional limitation would cover all ways of performing a function and indicate that the inventor has not provided sufficient disclosure to show possession of the invention. See also MPEP § 2181.

MPEP 2164.06(c) details examples of enablement issues in computer programming cases, in particular in section II. regarding block elements within a computer, noting:
While no specific universally applicable rule exists for recognizing an insufficiently disclosed application involving computer programs, an examining guideline to generally follow is to challenge the sufficiency of disclosures that fail to include the programmed steps, algorithms or procedures that the computer performs necessary to produce the claimed function. These can be described in any way that would be understood by one of ordinary skill in the art, such as with a reasonably detailed flowchart which delineates the sequence of operations the program must perform. In programming applications where the software disclosure only includes a flowchart, as the complexity of functions and the generality of the individual components of the flowchart increase, the basis for challenging the sufficiency of such a flowchart becomes more reasonable because the likelihood of more than routine experimentation being required to generate a working program from such a flowchart also increases.

The claimed imaging unit and voice acquisition unit are explicitly directed to known structural elements (e.g. a camera and a microphone), and thus cannot be interpreted as functional software blocks.
The claimed speaker identification unit is illustrated by the software process performed by element 76 in steps S33 and S34 of Fig. 6, and para [0045], [0046], [0047], noting that it identifies the user existing in a predetermined angular direction as a speaker whose utterance is to be received on the basis of the image, voice, and sensing information obtained in the environment where the user exists. The specification does not provide details regarding how the user’s face is tracked with angular directions, the operation of identifying the user having the face as the speaker, or any directional relations or other operations of the speaker identification unit in the form of an algorithm, software or other functional code of the computer block disclosure.
The claimed semantic analysis unit is illustrated by the software process performed by element 78 in step S37 of Fig. 6, and para [0050] noting that it performs natural language processing, in particular, semantic analysis on a sentence including the character strings from the voice recognition unit and thereby extracts a speaker’s request.  The specification does not provide details regarding the operation or performance of the claimed semantic analysis.  The specification notes that the character strings from the voice recognition unit are used to perform the speaker request extraction and subsequent information operations, but does not provide detail of the operation of the semantic analysis by the claimed semantic analysis unit in the form of an algorithm, software or other functional code of the computer block disclosure.
The claimed tracking unit is illustrated by the software process performed by element 74 in steps S12, S13 and S14 of Fig. 5, and para [0040], [0041] noting the tracking unit estimates a state of the user in an imaging range on a basis of the image, performs face identification, orientation detection, and position estimation, and produces tracking information representing an angular direction of the face being tracked. The specification does not provide details regarding how the claimed tracking of the face of the user detected in the image is performed, how probabilities are estimated, how tracking is terminated based on those probabilities, or other claimed operations of the tracking unit in the form of an algorithm, software or other functional code of the computer block disclosure.
The claimed voice session generation unit is illustrated by the software process performed by element 75 in steps S31 and S32 of Fig. 6, and para [0042], [0043] noting that it estimates the direction of the uttering user and the speech duration, generates a 
The claimed voice recognition unit is illustrated by the software process performed by element 77 in step S36 of Fig. 6, and para [0049] noting that it checks matching between voice data and vocabulary registered in the dictionary, thereby performing voice recognition. The specification does not provide details as to the claimed performance of voice recognition of the utterance of the identified speaker operation, or any other related operations of the voice recognition unit in the form of an algorithm, software or other functional code of the computer block disclosure.
The claimed response generation unit is illustrated by the software process performed by element 79 in step S38 of Fig. 6, and para [0051] noting that it generates a response to the speaker’s request on the basis of the information from the semantic analysis unit.  The specification does not provide details as to the claimed generation of a response to the request of the speaker operation, or any other related operations of the response generation unit in the form of an algorithm, software or other functional code of the computer block disclosure.


1) that the inventor(s) at the time the application was filed, had possession of the claimed invention, or 
2) how to make or use the invention without undue experimentation.

In sum, Claims 1 – 20 fail to meet the written description requirement of 35 U.S.C. 112(a).  The lack of disclosure of the code/algorithms to implement the claimed units (as detailed above) in a manner understandable to a person of ordinary skill in the art results in a failure to reasonably convey that the inventor(s), at the time the application was filed, had possession of the claimed invention.

Further, Claims 1 – 20 fail to meet the enablement requirement of 35 U.S.C. 112(a).  The lack of disclosure of the code/algorithms to implement the claimed units (as detailed above) in a manner understandable to a person of ordinary skill in the art results in claimed subject matter which was not described in the specification in such a way as to enable one skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1 – 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claims 1 – 20 recite various units interpreted under 35 U.S.C. 112(f) as noted above.  The claimed elements are detailed in Fig. 3 with the following reference numerals: “a speaker identification unit” [76], “a semantic analysis unit” [78], “a tracking unit” [74], “a voice session generation unit” [75], “a voice recognition unit” [77], “a response generation unit” [79], “an imaging unit” [71], “a voice acquisition unit” [72], and “a user tracking unit” [74, 75, and 76; see para [0047], detailing “the tracking unit 74, the voice session generation unit 75, and the speaker identification unit 76 can be defined as a user tracking unit”].
Of these elements, the imaging unit [71] is detailed with sufficient structure as being a camera in para [0037].  The voice acquisition unit [72] is detailed as a microphone in para [0038].  
The remaining units, “a speaker identification unit,” [76] “a semantic analysis unit,” [78] “a tracking unit,” [74] “a voice session generation unit,” [75] “a voice recognition unit,” [77] “a response generation unit,” [79] and “a user tracking unit,” are 
As noted above in the rejection under 35 U.S.C. 112(a), the specification fails to adequately disclose an algorithm for performing the claimed specific computer function for these units.  The specification also fails to detail any other structure to perform the claimed functions of these units.  As such, these claimed units are indefinite under 35 U.S.C. 112(b) due to the specification’s failure to adequately disclose the algorithm or structural details.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1 – 16, 19 and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim(s) recite(s) a series of steps for identifying users, their directions, and requests to provide an output. With the exception of recitation of generic computer and processing elements (i.e. the various units detailed as computer blocks, see analysis above in the 112 section), the claims under their broadest reasonable interpretation, cover performance of the elements in the mind. Each of the claimed elements can be interpreted as a person hearing a request from another person in a particular direction 
This judicial exception is not integrated into a practical application. In particular, the claims only recite functional computer block elements at a high level of generality. The units, in particular, are claimed in general terms for performing the elements, such that they amount to no more than mere instructions to apply the exception using a generic computer component based on the interpretation required by the disclosure. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, using the additional elements of the generic computing block units to perform the steps amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.
Claims 17 and 18 include limitations that are directed to specific structure in the form of an imaging unit and a voice acquisition unit, explicitly detailed as a camera and a 

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1 – 20 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Solomon et al. (hereinafter “Sol”; U.S. Patent Application Publication 2018/0232662).

Regarding Claim 1, Sol discloses:
An information processing apparatus (e.g. system of Figs. 1 and 2) comprising:
a speaker identification unit (e.g. person identifier 105, entity identifier 104 of entity tracker 100 of Fig. 7) that identifies a user existing in a (e.g. note part of entity tracker used to determine an identity, position and/or current status of one or more 
a semantic analysis unit that performs semantic analysis of the utterance of the identified speaker to output a request of the speaker (e.g. parser utilizing a plurality of intent templates that may be filled with words or terms received from the voice listener by examining a semantic meaning; para 66, 81 – 83).

Regarding Claim 2, in addition to the elements stated above regarding claim 1, Sol further discloses:
wherein, in a case where a face of the user detected in the image is being tracked in the angular direction (e.g. entity tracker 100, as detailed above; note use of yaw/pitch/roll; para 208) in which a voice session for performing a dialogue with the user is generated (e.g. see example conversation in para 68, 230), the speaker identification 

Regarding Claim 3, in addition to the elements stated above regarding claim 2, Sol further discloses:
a tracking unit that tracks the face of the user detected in the image (e.g. using camera data, the entity tracker 100 may identify a particular person; para 232); and
a voice session generation unit that generates the voice session in the angular direction in which a trigger for starting the dialogue with the user has been detected (e.g. one or more functions activated upon detection of one or more keywords that are spoken by a user; para 296; note also context information 110 may be utilized by voice listener 30 when interpreting human speech or activating functions in response to a keyword trigger; para 211).

Regarding Claim 4, in addition to the elements stated above regarding claim 3, Sol further discloses:
wherein the speaker identification unit identifies the speaker on a basis of the image, the voice, and sensing information obtained by sensing in the environment (e.g. note the entity identifier 104 may record images of a person's face, and associate these 

Regarding Claim 5, in addition to the elements stated above regarding claim 4, Sol further discloses:
wherein the trigger is detected on a basis of at least any of the image, the voice, and the sensing information (e.g. one or more functions activated upon detection of one or more keywords that are spoken by a user; para 296; note also context information 110 may be utilized by voice listener 30 when interpreting human speech or activating functions in response to a keyword trigger; para 211; note further examples including signals such as gestures captured by a camera, face direction, etc.; para 321, 324).

Regarding Claim 6, in addition to the elements stated above regarding claim 5, Sol further discloses:
wherein the trigger is an utterance of a predetermined word detected from the voice (e.g. activating functions in response to a keyword trigger; para 211; further note functions activated upon detection of one or more keywords; para 296 and spoken keywords; para 324, Fig. 21)

Regarding Claim 7, in addition to the elements stated above regarding claim 5, Sol further discloses:


Regarding Claim 8, in addition to the elements stated above regarding claim 3, Sol further discloses:
wherein, in a case where the trigger has been detected in the angular direction different from the angular direction in which N voice sessions are being generated in a state where the N voice sessions are being generated (e.g. the system may track multiple conversations that are occurring simultaneously or otherwise overlapping, and may interact with participants in each conversation as appropriate for each conversation; para 179; and context information, including entity position and status,  is used to determine whether a particular commitment should be executed, note further utilization when activating functions in response to a keyword trigger; para 211; in other words, tracking multiple conversation of multiple users, responding accordingly in terms of their positional information for the triggers, i.e. different positions, or “angular directions”) the voice session generation unit terminates the voice session estimated to have a lowest probability of occurrence of the utterance out of the N voice (note that Each user, location, and activity included in the entity identity data 112, entity position data 114, and entity status data 116 may have an associated estimate of a probability that that user, location, or activity was correctly identified; para 252; and In an 

Regarding Claim 9, in addition to the elements stated above regarding claim 8, Sol further discloses:
wherein the voice session generation unit estimates the voice session having the lowest probability of occurrence of the utterance on a basis of at least any of the image, the voice, and the sensing information (e.g.  in addition to spoken utterances, additional user input data is utilized including image data and context information including data related to an identity, position and status based on received sensory data; para 113; see also various sensors used by entity tracker in para 200; Each user, location, and activity included in the entity identity data 112, entity position data 114, and entity status data 116 may have an associated estimate of a probability that that user, location, or activity was correctly identified; para 252)

Regarding Claim 10, in addition to the elements stated above regarding claim 9, Sol further discloses:
wherein the voice session generation unit terminates the voice session having an earliest utterance detection time, on a basis of the voice (e.g. note confidence decay 

Regarding Claim 11, in addition to the elements stated above regarding claim 8, Sol further discloses:
wherein, in a case where the face has been detected in the angular direction different from the angular direction in which M faces are being tracked in a state where the M faces are being tracked, the tracking unit terminates the tracking of the face of the user estimated to have the lowest probability of occurrence of the utterance out of the M faces being tracked (e.g. Each user, location, and activity included in the entity identity data 112, entity position data 114, and entity status data 116 may have an associated estimate of a probability that that user, location, or activity was correctly identified; para 252; note the entity identifier 104 may record images of a person's face, and associate these images with recorded audio of the person's voice; para 205; and In an environment with multiple users, such indicators also may identify the particular user who is addressing a device; para 324; and note filtering sensor data when confidence values are below a threshold; para 227; consistently identify speech from particular people and ignore background noise; para 233; and finally see the examples of aggregating metrics based on speaker ID and keyword confidence in order to rank and select messages; see Fig. 22 and its corresponding description, paras 308+; note 

Regarding Claim 12, in addition to the elements stated above regarding claim 11, Sol further discloses:
wherein the tracking unit estimates the user having the lowest probability of occurrence of the utterance on a basis of at least any of the image, and the sensing information (e.g.  in addition to spoken utterances, additional user input data is utilized including image data and context information including data related to an identity, position and status based on received sensory data; para 113; see also various sensors used by entity tracker in para 200; Each user, location, and activity included in the entity identity data 112, entity position data 114, and entity status data 116 may have an associated estimate of a probability that that user, location, or activity was correctly identified; para 252).

Regarding Claim 13, in addition to the elements stated above regarding claim 12, Sol further discloses:
wherein the tracking unit terminates tracking of the face of the user existing at a most distant position on a basis of the image (e.g.  in addition to spoken utterances, additional user input data is utilized including image data and context information 

Regarding Claim 14, in addition to the elements stated above regarding claim 11, Sol further discloses:
wherein a number M of the faces tracked by the tracking unit and a number N of the voice sessions generated by the voice session generation unit are same (e.g. the system may track multiple conversations that are occurring simultaneously or otherwise overlapping, and may interact with participants in each conversation as appropriate for each conversation; para 179; note the entity identifier 104 may record images of a person's face, and associate these images with recorded audio of the person's voice; para 205)

Regarding Claim 15, in addition to the elements stated above regarding claim 1, Sol further discloses:
a voice recognition unit that performs voice recognition of the utterance of the identified speaker (e.g. voice listener 30 receives audio data and utilizes speech recognition functionality to translate spoken utterances into text; para 46);


Regarding Claim 16, in addition to the elements stated above regarding claim 1, Sol further discloses:
a response generation unit that generates a response to the request of the speaker (e.g. note example system response; para 154; and note message generated by the system in response to the speech; para 298, 303)

Regarding Claim 17, in addition to the elements stated above regarding claim 1, Sol further discloses:
an imaging unit that obtains the image in the environment (e.g. note example sensors such as cameras; para 200); and
a voice acquisition unit that obtains the voice in the environment (e.g. note example sensors such as microphones; para 210).

Claim 18 is rejected under the same grounds as claims 1, 5, 16 and 17 above.

Claim 19 is rejected under the same grounds as claims 1 and 3 above.

Claim 20 is rejected under the same grounds as claims 1 and 3 above.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
1) Johnson et al. (U.S. 2013/0144629) teaches a system and method for multimodal speech and gesture interaction.  The disclosure details the ability for users of the system to combine speech inputs with gestures from a camera to issue commands as well as face tracking; and
2) Kim et al. (U.S. 2013/0300648) teaches a system and method for an audio user interaction system.  Kim details a number of relevant teachings, including steering microphones according to the location/direction/position of a user as well as use of an angle offset to correlate directions for received speech signals.  Kim further teaches use of a camera to focus and that the techniques disclosed are applicable to voice activated control and the like.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Andrew C Flanders whose telephone number is (571)272-7516.  The examiner can normally be reached on M-F 8:30-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/ANDREW C FLANDERS/           Primary Examiner, Art Unit 2654