DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
2.	Claims 1-10 are pending in this application.  
	

Response to Arguments
3.	Applicant’s arguments, see Remarks, filed 02/05/2021, with respect to the 35 U.S.C. 103 rejection of claims 1-10 under Davies (US PG. Pub. 2016/0080295 A1) in view of Kuroiwa (US PG. Pub. 2020/0329160 A1), and further in view of Nakajima (US PG. Pub. 2019/0260884 A1), have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground of rejection is made under 35 U.S.C. 103 for independent claims 1 and 7 and dependent claims 3-6 as being unpatentable over Davies (US PG. Pub. 2016/0080295 A1) in view of Shin (US PG. Pub. 2011/0265040 A1).  A further new ground of rejection is made under 35 U.S.C. 103 for dependent claims 2 and 8-10 as being unpatentable over Davies (US PG. Pub. 2016/0080295 A1) in view of Shin (US PG. Pub. 2011/0265040 A1), and further in view of Nakajima (US PG. Pub. 2019/0260884 A1).

4.	Regarding claim 1, Applicant specifically argues on pages 4-9 of the Remarks that the prior art of Davies in view of Kuroiwa and Nakajima fails to teach the limitation “execute a processing function in the image former on the basis of a command being input by voice through the voice operation, the voice operation screen including no item used only by the touch operation”. The examiner respectfully agrees. However, the newly added prior art of Shin teaches this limitation (See Shin, Call Application, Sect. [0095], wherein a call application is executed by a mobile device 100 and a voice instruction command is set for an audio playback process; when an audio playback application and a moving image playback application have been set as applications displayed via a light image and a call application is currently executed in the mobile device 100, the controller 160 can control the display unit 132 to display an execution screen of the call application on which the light images corresponding to the audio playback application and the moving image playback application are also displayed). Thus, the limitation argued with respect to claim 1, and similarly claim 7, is now taught by the newly added reference of Shin.

5.	Additionally, Applicant's arguments regarding the limitations of dependent claims 2 and 3 are presented on page 9 of the Remarks; however, the rejections of these claims have been updated accordingly, as shown below.

Claim Rejections - 35 USC § 103
6.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of 
7.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

8.	The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
9.	The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

10.	Claims 1 and 3-7 are rejected under 35 U.S.C. 103 as being unpatentable over Davies (US PG. Pub. 2016/0080295 A1) in view of Shin (US PG. Pub. 2011/0265040 A1).

Referring to Claim 1, Davies teaches an image forming apparatus (See Davies, Figs. 4-5, Instant Voice and Imaging System) comprising a controller (See Davies, Fig. 4, User Interface 401), a display (See Davies, Figs. 4-5, Display embodied on User Interface and Display Generator 512), and an image former (See Davies, Fig. 5, Video Segment Generator 409), and being capable of executing a processing function by a voice operation (See Davies, Fig. 4, Sect. [0033] lines 28-33, As illustrated by FIG. 4, the instant system for voice based social networking receives an electronically encoded voice message (and frequently an image) and ultimately delivers it to one or multiple users.  The voice and image may be captured (converted into data records) by various devices and the data transmitted to the system in a variety of formats such as Multimedia Messaging Service (MMS), email, web form input or Application Program Interface (API) call; it is assumed that other data of interest, such as sender identification and context data such as location may also be communicated to the system in some associated format.  Interface 401 provides an electrically compatible interface as required by the sender's technology.), wherein
the controller (See Davies, Fig. 4, User Interface 401) performs controls to:
be capable of displaying, on the display, a touch operation screen including an item capable of being used by a touch operation, and a voice operation screen used by the voice operation (See Davies, Sect. [0034], [0035], [0037] lines 1-6, Interface 401 is also responsible for converting the input information into a format acceptable to the system.  Uploaded image and voice data are saved in image store 402 and audio store 403.  Information about the image, audio and sender are entered into a comment descriptor structure of the form described in FIG. 3.  This comment descriptor is then added to the comment descriptor database 404…The selection rules engine 407 takes information from comment descriptor database 404, user database 405 and advertising database 406.  Using this information it selects comments for forwarding to each particular user.  The selection is forwarded to video segment generator 409 and to comment delivery module 410.  For each selected comment, video segment generator 409 takes the corresponding images from image store 402 and audio from audio store 403 and converts them into a standard video fragment…Users can initiate a comment by selecting that mode through touching a screen button, a switch, performing a gesture, speaking a command, creating an accelerometer event, etc. They may then take a photo, select a photo from a file in a directory (on or off device) or paste a URL to discuss the contents of a webpage.).

Davies fails to explicitly teach 
switch the touch operation screen to the voice operation screen on the basis of a command to start an operation by voice being input through the voice operation; and
execute a processing function in the image former on the basis of a command being input by voice through the voice operation, the voice operation screen including no item used only by the touch operation.

However, Shin teaches 
switch the touch operation screen to the voice operation screen on the basis of a command to start an operation by voice being input through the voice operation (See Shin, Fig. 8A, Sect. [0092]-[0093], When the user touches the light image and moves the touch in a diagonal direction, i.e., in the bottom left direction, while the execution screen of the text message application is being displayed, the controller 160 controls the display unit 132 to switch the execution screen of the text message application to that of the audio playback application again…Diagram 803 of FIG. 8A shows the execution screen of the audio playback application to which the execution screen of the text message application is switched again.  While the text message application is being executed, the user can recognize, via the light image, what applications are currently being executed via multitasking.  When the user touches the light image and then moves the touch in the light illumination direction, the controller 160 controls the display unit 132 to switch the current screen to an execution screen of an application that is being executed via multitasking.); and
execute a processing function in the image former on the basis of a command being input by voice through the voice operation, the voice operation screen including no item used only by the touch operation (See Shin, Call Application, Sect. [0095], when one application is currently executed in the mobile device 100, a light image may also be displayed that can allow the user to execute another application on the screen of the currently executed application.  The light image may be displayed: in a certain region on the screen of the currently executed application; in a region between items included the execution screen of the application; on the boundary line of the display unit 132; or in the corner of the display unit 132.  Applications displayed via a light image may be a user's frequently used applications or a user's selected applications.  For example, when an audio playback application and a moving image playback application have been set as applications displayed via a light image and a call application is currently executed in the mobile device 100, the controller 160 can control the display unit 132 to display an execution screen of the call application on which the light images corresponding to the audio playback application and the moving image playback application are also displayed.).

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to switch the touch operation screen to the voice operation screen on the basis of a command to start an operation by voice being input through the voice operation; and execute a processing function in the image former on the basis of a command being input by voice through the voice operation, the voice operation screen including no item used only by the touch operation. The motivation for doing so would have been to provide a system and a method for providing a Graphical User Interface (GUI) to enhance the convenience of mobile devices (See Sect. [0007]-[0009] of the Shin reference). Therefore, it would have been obvious to combine Davies and Shin to obtain the invention as specified in claim 1.

Referring to Claim 3, the combination of Davies in view of Shin teaches the image forming apparatus (See Davies, Figs. 4-5, Instant Voice Imaging System) according to claim 1, wherein the controller, as the voice operation screen, displays only an item receiving a voice operation (See Davies, Fig. 4, Sect. [0033] lines 1-4, In FIG. 4, the instant system for voice based social networking receives an electronically encoded voice message (and frequently an image) and ultimately delivers it to one or multiple users).


Referring to Claim 4, the combination of Davies in view of Shin teaches the image forming apparatus (See Davies, Figs. 4-5, Instant Voice Imaging System) according to claim 1, wherein, the controller, as the voice operation screen, displays text information used for utterance (See Davies, Fig. 4, Sect. [0033] lines 4-11, The voice and image may be captured (converted into data records) by various devices and the data transmitted to the system in a variety of formats such as Multimedia Messaging Service (MMS), email, web form input or Application Program Interface (API) call; it is assumed that other data of interest, such as sender identification and context data such as location may also be communicated to the system in some associated format.).

Referring to Claim 5, the combination of Davies in view of Shin teaches the image forming apparatus (See Davies, Figs. 4-5, Instant Voice Imaging System) according to claim 1, wherein if a voice command is not received immediately after activation of the image forming apparatus or within a predetermined time while the voice operation screen is displayed (See Davies, Sect. [0037] lines 7-14, If the user does not explicitly select an image or URL, the audio recording starts immediately and proceeds as before with the exception that the comment will use the sender image as the message image. This sender image then becomes the root image for the identification of the present conversation; the system can internally identify this conversation using a combination of the image and a timestamp but, without sending further data, it is dependent on the recipients to disambiguate the conversation in this case.), the controller displays a standby screen for receiving the touch operation, on the display (See Davies, Fig. 3, Sect. [0038], In the case of a URL, the URL is stored in the message image filename 301 of the data structure (FIG. 3) and rendered as an image in the composite image. This ensures that the recipient will see what the sender saw and not what a dynamic web page may show at a later date. Trivially, the user interface allows for selecting the image by the recipient and redirection to the current web page with full browser access.).
Referring to Claim 6, the structural elements of apparatus claim 1 perform all of the steps of method claim 6.  Thus, claim 6 is rejected for the same reasons discussed in the rejection of claim 1. 

Referring to Claim 7, arguments analogous to claim 1 are applicable herein.  The non-transitory computer readable medium is explicitly/inherently taught as evidenced by (See Davies, Embodied within Figs. 4-5, Sect. [0054], When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, read-only memory (ROM) for storing software, random access memory (RAM), and non-volatile storage. Other hardware, conventional and/or custom, may also be included.) and various memories stored therein.

11.	Claims 2 and 8-10 are rejected under 35 U.S.C. 103 as being unpatentable over Davies (US PG. Pub. 2016/0080295 A1) in view of Shin (US PG. Pub. 2011/0265040 A1), and further in view of Nakajima (US PG. Pub. 2019/0260884 A1).

Referring to Claim 2, the combination of Davies in view of Shin teaches the image forming apparatus (See Davies, Figs. 4-5, Instant Voice Imaging System) according to claim 1.
The combination of Davies in view of Shin fails to explicitly teach wherein the controller performs controls to:
display, if a first command is input by voice, a first voice operation screen displaying an executable processing function;
display, if a second command for the processing function is input by voice, a second voice operation screen for changing on the basis of the second command, a setting item for the processing function; and
execute the processing function in the image former on the basis of the setting item.

However, Nakajima teaches wherein the controller (See Nakajima, Fig. 5, Touch panel 45) performs controls to:
display, if a first command is input by voice, a first voice operation screen displaying an executable processing function (See Nakajima, Sect. [0107] lines 1-3 and Sect. [0111], first search processing in which a search range is the voice operation command group 630 related to the sub-screen 230 is performed…in a case where one voice operation command that agrees with the search target character string (voice recognition data) has been detected by the first search processing, processing corresponding to the one voice operation command (search target character string) is executed.);
display, if a second command for the processing function is input by voice, a second voice operation screen for changing on the basis of the second command, a setting item for the processing function (See Nakajima, Sect. [0107] lines 3-11 and Sect. [0108], second search processing in which a search range is the voice operation command group 610 related to the main screen 210 is performed.  More specifically, in a case where a search target character string is not detected within the first search range by the first search processing, the second search processing in which a search range is the voice operation command group 610 related to the operation screen 210 serving as a caller is executed…This enables even a voice operation command (for example, "GENKO GASHITSU (original-document image quality)") that agrees with any of voice operation commands of the voice operation command group 610 related to the operation screen (the operation screen serving as a caller) 210 other than the operation screen 230 that has been most recently called to be searched for.  Therefore, one voice operation command corresponding to a user's voice for operation can be properly detected from among the plurality of voice operation commands related to the plurality of operation screens.); and
execute the processing function in the image former on the basis of the setting item (See Nakajima, Sect. [0110], first search processing in which a search range is a first command group M1 that has been narrowed down from among voice operation commands of the plurality of voice operation command groups 610 and 630 according to a predetermined criterion (described next) is executed (also refer to FIG. 3).  Here, what is employed as the predetermined criterion is whether or not it is a voice operation command group related to a screen (logically, a screen of the lowest layer) that has been most recently (lastly) called (until a voice for voice operation is vocalized) among at least one operation screen that is currently (in detail, at the time of vocalizing the voice for voice operation) displayed.  Specifically, the voice operation command group 630 related to the screen 230 that has been most recently called (most recently displayed) between the two operation screens 210 and 230 that are currently displayed is determined as the first command group M1.  In other words, the first priority order is given to the voice operation command group 630 (the first command group M1), and search processing for the first command group M1 (630) to which the first priority order has been given is first executed.  It should be noted that the operation screen 230 is a screen that is displayed most frontward between the two operation screens, and is also designated as a screen displayed as the highest layer.  In addition, since the operation screen 230 is a screen corresponding to the voice operation command group 630 to which the first priority order has been given, the operation screen 230 is also designated as a first priority screen.).

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to display, if a first command is input by voice, a first voice operation screen displaying an executable processing function; display, if a second command for the processing function is input by voice, a second voice operation screen for changing on the basis of the second command, a setting item for the processing function; and execute the processing function in the image former on the basis of the setting item.  The motivation for doing so would have been to provide a technology that enables proper detection of one voice operation command corresponding to a user's voice input from among a plurality of voice operation commands related to a plurality of operation screens (See Nakajima Sect. [0007]). Therefore, it would have been obvious to combine Davies in view of Shin, and further in view of Nakajima, to obtain the invention as specified in claim 2.

	Referring to Claim 8, the combination of Davies in view of Shin teaches the image forming apparatus according to claim 1 (See Davies, Figs. 4-5, Instant Voice and Imaging System).
The combination of Davies in view of Shin fails to explicitly teach wherein the voice operation screen includes at least a command.
However, Nakajima teaches wherein the voice operation screen includes at least a command (See Nakajima, Fig. 9, Voice Operation Command Screen, Sect. [0092], as shown in FIG. 9, as the voice operation command group 630 related to the magnification ratio setting screen 230 (refer to FIGS. 6 and 7), a plurality of voice operation commands including "JIDO (automatic)", "CHISAME (smallish)", "PURASU (plus)", "MAINASU (minus)", "GOJYU PASENTO (50%)", "NANAJYUTTEN NANA PASENTO (70.7%)", "HACHIJYUICHITEN ROKU PASENTO (81.6%)", "HACHIJYUROKUTEN ROKU PASENTO (86.6%)", and "HYAKU PASENTO/TOUBAI (100%/non-magnified)" are registered.).

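Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to incorporate wherein the voice operation screen includes at least a command.  The motivation for doing so would have been to provide a technology that enables proper detection of one voice operation command corresponding to a user's voice input from among a plurality of voice operation commands related to a plurality of operation screens (See Nakajima Sect. [0007]). Therefore, it would have been obvious to combine Davies in view of Shin, and further in view of Nakajima, to obtain the invention as specified in claim 8.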

	Referring to Claim 9, the combination of Davies in view of Shin teaches the method of forming an image according to claim 6 (See Davies, Figs. 4-5, Instant Voice and Imaging System).
The combination of Davies in view of Shin fails to explicitly teach wherein the voice operation screen includes at least a command.
However, Nakajima teaches wherein the voice operation screen includes at least a command (See Nakajima, Fig. 9, Voice Operation Command Screen, Sect. [0092], as shown in FIG. 9, as the voice operation command group 630 related to the magnification ratio setting screen 230 (refer to FIGS. 6 and 7), a plurality of voice operation commands including "JIDO (automatic)", "CHISAME (smallish)", "PURASU (plus)", "MAINASU (minus)", "GOJYU PASENTO (50%)", "NANAJYUTTEN NANA PASENTO (70.7%)", "HACHIJYUICHITEN ROKU PASENTO (81.6%)", "HACHIJYUROKUTEN ROKU PASENTO (86.6%)", and "HYAKU PASENTO/TOUBAI (100%/non-magnified)" are registered.).

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to incorporate wherein the voice operation screen includes at least a command.  The motivation for doing so would have been to provide a technology that enables proper detection of one voice operation command corresponding to a user's voice input from among a plurality of voice operation commands related to a plurality of operation screens (See Nakajima Sect. [0007]). Therefore, it would have been obvious to combine Davies in view of Shin, and further in view of Nakajima, to obtain the invention as specified in claim 9.

	Referring to Claim 10, the combination of Davies in view of Shin teaches the non-transitory computer-readable recording medium according to claim 7 (See Davies, Figs. 4-5, Instant Voice and Imaging System).
The combination of Davies in view of Shin fails to explicitly teach wherein the voice operation screen includes at least a command.
However, Nakajima teaches wherein the voice operation screen includes at least a command (See Nakajima, Fig. 9, Voice Operation Command Screen, Sect. [0092], as shown in FIG. 9, as the voice operation command group 630 related to the magnification ratio setting screen 230 (refer to FIGS. 6 and 7), a plurality of voice operation commands including "JIDO (automatic)", "CHISAME (smallish)", "PURASU (plus)", "MAINASU (minus)", "GOJYU PASENTO (50%)", "NANAJYUTTEN NANA PASENTO (70.7%)", "HACHIJYUICHITEN ROKU PASENTO (81.6%)", "HACHIJYUROKUTEN ROKU PASENTO (86.6%)", and "HYAKU PASENTO/TOUBAI (100%/non-magnified)" are registered.).

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to incorporate wherein the voice operation screen includes at least a command.  The motivation for doing so would have been to provide a technology that enables proper detection of one voice operation command corresponding to a user's voice input from among a plurality of voice operation commands related to a plurality of operation screens (See Nakajima Sect. [0007]). Therefore, it would have been obvious to combine Davies in view of Shin, and further in view of Nakajima, to obtain the invention as specified in claim 10.

Cited Art
12.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Prchal et al. (US PG. PUB. 2014/0298434 A1) discloses a method for providing a user authentication via a mobile device enabled as a first NFC device.  The user authentication may be specified by an end user of the mobile device for permitting NFC with a second NFC device.  The user authentication may be related to an environmental object or a perspective of the mobile device specified by the end user.  It may be determined whether the mobile device and the second NFC device are in proximity to one another.  When the mobile device and the second NFC device are in proximity to one another and a detected action performed by a user with the mobile device is substantially similar to the provided user authentication, NFC between the mobile device and the second NFC device may be permitted.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DARRYL V DOTTIN whose telephone number is (571)270-5471.  The examiner can normally be reached on M-F 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tammy Goddard, can be reached at (571) 272-7773.  The fax phone 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/DARRYL V DOTTIN/
Examiner, Art Unit 2677