DETAILED ACTION
This is a first office action in response to application 15/930,485, filed on May 13, 2020, in which claims 1-16 are presented for examination. The application is being examined under the first inventor to file provisions of the AIA.

Priority Acknowledgement
2.	Acknowledgement is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d) of Japanese Patent Application No. 2019-103859 filed on June 3, 2019.

Invention Title
3.	The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 

Claim Rejections - 35 USC § 103
8.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

9.	Claims 1, 2, 7-10, 13, 14 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pat. 8,314,943 to Kanda (“Kanda”) and further in view of U.S. Pat. 8,781,826 to Kooiman (“Kooiman”), U.S. Pat. 8,463,606 to Scott et al. (“Scott”) and U.S. Pat. 9,245,525 to Yeracaris et al. (“Yeracaris”).
Regarding claim 1, Kanda discloses an image processing apparatus (the digital copier of Figs. 1 and 2) comprising: 
a first processor that outputs an audio question for a user from a speech output device (Kanda, column 9/lines 11-14: audio is outputted to the speaker to instruct the user to reinput a speech command); 
a third processor that receives a spoken response of the user to the audio question, the spoken response being inputted from a speech input device (Kanda, column 9/lines 11-14: after the user has received the audio instruction to reinput a speech command, it is assumed that the user follows said instruction by reinputting the speech command); and 
a second processor that takes an appropriate image processing action to the spoken response received by the third processor (Kanda, column 4/lines 18-27: commands for the operation of the digital copier are registered in a dictionary in correlation with speech patterns; column 4/lines 28-42: a digital pattern matching of a speech from the user and the speech patterns stored in the dictionary is performed and, when a match is detected, the command correlated to the matching speech pattern is obtained; the command is transmitted to the operation control unit and executed).
Kanda is silent about a first and a second mode and switching between the first mode and the second mode as claimed.
However, Kooiman teaches a speech recognition system which determines the reception quality of a voice input and switches over to a mode of operation which is less sensitive to noise when the reception quality drops below a given reception quality threshold (Kooiman, column 2/lines 21-27).

Kanda and Kooiman are silent about the second mode being limited in possible responses.
However, Scott describes an automated speech recognition system which has two modes of speech recognition including a “hot word recognition mode” and “a continuous speech recognition mode” (Scott, column 10/lines 33-37).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have implemented several modes, including Scott’s “hot word recognition mode” and “continuous speech recognition mode”, in the image processing apparatus of Kanda and Kooiman, because the two modes mentioned by Scott are common in the art and “continuous speech recognition modes” are known to have much higher speech recognition error rates than “hot word recognition modes” (Yeracaris, column 3/lines 8-10). The “hot word recognition mode” is viewed as corresponding to the claimed second mode “being limited in possible responses”.

Regarding claim 2 (dependent on claim 1), Kanda, Kooiman and Scott disclose the first mode being an open-ended question mode prompting the user to respond to the audio question with a free-form spoken response, and the second mode being a closed-ended question mode prompting the user to respond to the audio question with a fixed spoken response, the fixed spoken response being selected from possible responses (the first mode and the second mode correspond, respectively, to the common understanding of Scott’s “continuous speech recognition mode” and “hot word recognition mode”).
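For illustration only, the distinction between an open-ended mode and a closed-ended mode discussed for claim 2 can be sketched as follows. The sketch is not taken from Kanda, Kooiman or Scott; the function name `accept_response`, the mode labels and the sample choices are hypothetical, and recognized speech is assumed to be already available as text.

```python
from typing import Optional, Sequence


def accept_response(spoken: str, mode: str,
                    choices: Sequence[str] = ()) -> Optional[str]:
    """Return the normalized response if the mode accepts it, else None.

    "open"   : any non-empty free-form response is accepted.
    "closed" : only a response matching one of the fixed choices is accepted.
    """
    text = spoken.strip().lower()
    if mode == "open":
        return text or None
    if mode == "closed":
        allowed = {c.lower() for c in choices}
        return text if text in allowed else None
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical prompts and answers.
free_form = accept_response("Two copies, double sided", "open")
fixed_ok = accept_response("Color", "closed", ["Color", "Monochrome"])
fixed_rejected = accept_response("sepia", "closed", ["Color", "Monochrome"])
```

In the closed-ended case the set of acceptable answers is known in advance, which is the property relied upon above for the mode “being limited in possible responses”.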
Regarding claim 7 (dependent on claim 1), Kanda, Kooiman and Scott disclose the fourth processor allowing the user to switch between the first mode and the second mode (see discussion above).
Regarding claim 8 (dependent on claim 1), Kanda, Kooiman and Scott disclose, 
wherein the fourth processor switches between the first mode and the second mode depending on a background noise level surrounding the image processing apparatus, and 
wherein, when the background noise level goes above a predetermined threshold, the fourth processor switches from the first mode to the second mode (Kooiman, column 2/lines 21-27).
Regarding claim 9 (dependent on claim 8), Kanda, Kooiman and Scott disclose wherein the background noise level is an operational noise level from the image processing apparatus (Kanda, column 1/lines 27-32).
Regarding claim 10 (dependent on claim 8), Kanda, Kooiman and Scott disclose wherein the background noise level is a present background noise level inputted from the speech input device (Kanda, column 1/lines 27-32, Kooiman, column 2/lines 21-23: Kanda is concerned with the noise produced by the image forming apparatus; Kooiman is concerned about noise associated with the speech input device; in the Kanda, Kooiman combination both sources of noise will have to be taken into account; in a common scenario, the image forming apparatus is idle and the only source of noise is that associated with the speech input device) and the fourth processor compares the present background noise level to the predetermined threshold (Kooiman, column 2/lines 21-27).
Regarding claim 13 (dependent on claim 1), Kanda, Kooiman and Scott disclose wherein the fourth processor does not switch from the first mode to the second mode during a predetermined process (Kanda, Kooiman and Scott teach broadly switching from one mode to a second mode according to the noise level whereas Kanda is concerned with the noise generated by the device and Kooiman is concerned with the noise associated with the speech input device; Kanda column 7/lines 7-12: noise generated by a single operation such as image reading or image printing does not affect speech recognition whereas noise from both operations executing simultaneously does; it stands to reason that in the combined Kanda, Kooiman and Scott image processing apparatus the noise threshold at which the mode is switched is above the noise generated by the image printing operation alone; accordingly, in a common scenario in which the only process running is an image printing operation the noise is below the threshold noise and the mode is not switched).
Regarding claim 14 (dependent on claim 8), Kanda, Kooiman and Scott disclose, wherein the fourth processor switches from the first mode to the second mode at a first point in time when the background noise level goes above the predetermined threshold during a process (Kooiman, column 2/lines 21-27), and the fourth processor switches from the second mode to the first mode at a second point in time when the background noise level reaches or goes below the predetermined threshold during the process (Kooiman, column 3/lines 19-22).
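For illustration only, the threshold behavior discussed for claims 8 and 14 can be sketched as follows. The sketch is an assumption and does not reproduce the implementation of any cited reference; the names `Mode` and `select_mode`, the threshold value and the noise samples are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    FIRST = "open-ended"     # free-form spoken responses
    SECOND = "closed-ended"  # responses limited to fixed choices


def select_mode(noise_level: float, threshold: float) -> Mode:
    """Second mode above the threshold; first mode at or below it."""
    return Mode.SECOND if noise_level > threshold else Mode.FIRST


# Hypothetical noise samples (arbitrary units) taken during one process.
THRESHOLD = 60.0
samples = [45.0, 58.0, 63.0, 70.0, 60.0, 52.0]
modes = [select_mode(s, THRESHOLD) for s in samples]
```

With these sample values the mode changes to the second mode at the third sample and returns to the first mode at the fifth sample, where the level reaches the threshold.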
Claim 16 is directed to a non-transitory computer-readable recording medium storing a program for a computer of an image processing apparatus to execute the method of claim 1. Kanda, Kooiman and Scott disclose such a non-transitory computer-readable medium (for example, Kanda, Fig. 2: RAM 227, memory 206, etc.) and the claim is further rejected based on the grounds used to reject claim 1 above.

10.	Claims 3, 4 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Kanda, Kooiman and Scott as applied to claim 2 above, and further in view of U.S. Pub. 2020/0143792 to Iwata et al. (“Iwata”).
Regarding claim 3 (dependent on claim 2), Kanda, Kooiman and Scott disclose the image processing apparatus further comprising a display (Kanda, Fig. 2, operation panel 225) but are silent about
wherein, when the first processor outputs the audio question from the speech output device in the second mode, the first processor further presents the possible responses in list form on the display, and 
wherein the user responds with the fixed spoken response, the fixed spoken response being selected from the possible responses presented on the display.
However, Iwata describes an interactive apparatus which supports speech and text interactions (Iwata, interactive apparatus 101 of Fig. 2). According to Iwata, presenting a response is not limited to outputting a response sentence; it is also possible to display question sentences or important keywords associated with the candidates for an answer so that the user can select one of the candidate answers on a user interface.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have implemented Iwata’s text and speech interactions in the image processing apparatus of Kanda, Kooiman and Scott as a convenience to the user.
With respect to the specific claim language, Kanda, Kooiman, Scott and Iwata disclose
wherein, when the first processor outputs the audio question from the speech output device in the second mode, the first processor further presents the possible responses in list form on the display (Iwata, [0029]/lines 11-15: important keywords associated with the candidates are displayed; displaying in a list is the most common display method, see Fig. 8), and 
wherein the user responds with the fixed spoken response, the fixed spoken response being selected from the possible responses presented on the display (Iwata, [0029]/lines 13-14).
Regarding claim 4 (dependent on claim 2), Kanda, Kooiman and Scott disclose wherein, when the first processor outputs the audio question from the speech output 
Regarding claim 6 (dependent on claim 3), Kanda, Kooiman, Scott and Iwata are silent about the possible responses being presented in chronological order based on a date and time at which the possible responses were registered on the image processing apparatus.
However, the examiner takes official notice that such an order of presentation is common in the art. Indeed, in a common scenario in which records associated with possible responses are stored in a relational database table, when such records are retrieved from the table and no specific order of retrieval is specified, such records would be retrieved in the order in which they were inserted into the table, i.e., “in chronological order based on a date and time at which the possible responses were registered on the image processing apparatus”.
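For illustration only, presentation of possible responses in chronological order of registration can be sketched with Python's standard `sqlite3` module. The table name, column names and sample rows are hypothetical. The sketch assumes a stored registration timestamp and orders on it explicitly, which is a choice of the sketch and is not taken from any cited reference.

```python
import sqlite3

# In-memory database holding hypothetical registered responses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (label TEXT, registered_at TEXT)")
rows = [
    ("scan to email", "2020-01-05T09:00:00"),
    ("copy", "2020-01-02T14:30:00"),
    ("print", "2020-01-09T08:15:00"),
]
conn.executemany("INSERT INTO responses VALUES (?, ?)", rows)

# ISO 8601 timestamps sort chronologically when compared as text.
ordered = [
    label
    for (label,) in conn.execute(
        "SELECT label FROM responses ORDER BY registered_at"
    )
]
conn.close()
```

The resulting list places the earliest registered response first and the most recently registered response last.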
  
11.	Claims 5, 11, 12 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Kanda, Kooiman and Scott as applied to claim 8 above, and further in view of U.S. Pub. 2006/0095268 to Yano et al. (“Yano”).
Regarding claim 5 (dependent on claim 3), Kanda, Kooiman, Scott and Iwata are silent about the order in which the possible responses might be presented.

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have displayed the possible responses in descending order of the number of times each response was selected, as taught by Yano, as a convenience to the user.
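For illustration only, presenting possible responses in descending order of selection count can be sketched as follows. The labels and counts are hypothetical and are not taken from Yano; the alphabetical tie-break is an assumption of the sketch.

```python
# Hypothetical number of times each possible response was selected.
selection_counts = {"copy": 12, "scan": 30, "fax": 3, "print": 30}

# Sort by count, highest first; break ties alphabetically (an assumption).
ordered = sorted(selection_counts, key=lambda label: (-selection_counts[label], label))
```

Under this ordering the most frequently selected responses appear at the top of the displayed list.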
Regarding claim 11 (dependent on claim 8), claim 12 (dependent on claim 11) and claim 15 (dependent on claim 11), Kanda, Kooiman and Scott do not explicitly disclose storing past operational noise levels from each process in memory and calculating a background noise level from an upcoming process from the past operational noise level of the identical process.
However, Kanda recognizes that ambient noise associated with operation of different functions of an image forming apparatus might cause the performance of the speech recognition to be degraded which in turn might cause an error in the operation of the image forming apparatus to occur (Kanda, column 1/lines 27-32). Kanda notes that noise generated by a single operation (such as an image printing operation and an image reading operation) is not large enough to cause the speech recognition process to fail (Kanda, column 7/lines 7-10) whereas noise generated by two simultaneous operations is (Kanda, column 7/lines 10-12). Clearly, these observations imply that “past operational noise levels” from each of the image printing operation and image reading operation were measured and that the measured noise levels or values derived from the measured noise levels were stored somewhere. Kanda’s solution to the problem is to 
Therefore, in the Kanda, Kooiman combination, it would have been obvious to one of ordinary skill in the art before the effective filing date to have measured noise levels for each of the processes of the image forming apparatus in advance, to have determined, prior to executing a new command, which of the available speech recognition modes could be executed safely, and to have selected the most capable mode that could be safely used at the anticipated noise levels. This approach would afford the best outcome under the anticipated noise levels.
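For illustration only, the selection of the most capable mode from previously measured noise levels can be sketched as follows. All names and numeric values are hypothetical, and averaging the stored levels is an assumption of the sketch rather than a teaching of any cited reference.

```python
from statistics import mean
from typing import Optional

# Hypothetical past noise levels (arbitrary units) stored per process.
past_noise = {
    "print": [52.0, 54.0, 53.0],
    "scan": [48.0, 50.0],
    "print+scan": [66.0, 68.0],
}

# Modes ordered from most to least capable, each with the highest
# noise level at which it is assumed to remain reliable.
MODES = [("continuous", 60.0), ("hot word", 80.0)]


def anticipated_noise(process: str) -> float:
    """Estimate noise for an upcoming process from its stored history."""
    return mean(past_noise[process])


def choose_mode(process: str) -> Optional[str]:
    """Return the most capable mode usable at the anticipated noise level."""
    level = anticipated_noise(process)
    for name, limit in MODES:
        if level <= limit:
            return name
    return None  # no mode is considered safe at this level


single = choose_mode("print")
combined = choose_mode("print+scan")
```

With these sample values a single printing operation permits the continuous mode, while simultaneous printing and scanning falls back to the hot word mode.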
With regard to storing past operational noise levels in a memory of the image processing apparatus, according to the description above, said noise levels or parameters derived from said noise levels would have to be available in the memory of the image processing apparatus which could be accomplished by either storing the 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL F. PAYER whose telephone number is (571) 270-7302.  The examiner can normally be reached on Mon and Thu 7am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Benny Q. Tieu, can be reached at (571) 272-7490.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
/PAUL F PAYER/
Primary Examiner, Art Unit 2674