Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 05/10/2018 is being considered by the examiner.
Drawings
The drawing submitted on 03/19/2018 is being considered by the examiner.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Shah et al. (US 2017/0332035 A1) in view of Honjo (US 2019/0230413 A1).
 
Regarding Claims 1, 7, and 12, Shah et al. teach: A method for operating a smart media environment system, the method comprising: obtaining an audio input signal (audio input) at a first device (Fig. 2A electronic device 190), wherein the audio input signal includes voice data ([0050] FIG. 1 is an example smart media environment 100 in accordance with some implementations. The smart media environment 100 includes a structure 150 (e.g., a house, office building, garage, or mobile home) with various integrated devices. [0052] In addition to the media devices 106 and 108, one or more electronic devices 190 are disposed in the smart media environment 100 to collect audio inputs for initiating various media play functions of the media devices. [0057] A user could also make a voice request via the microphone of the electronic device 190 concerning the media content that has already been played on a display device. [0059] The integrated smart home devices include intelligent, multi-sensing, network-connected devices that integrate seamlessly with each other in a smart home network and/or with a central server or a cloud-computing system to provide a variety of useful smart home functions. [0060] The smart home devices in the smart media environment 100 may include…one or more intelligent, multi-sensing, network-connected camera systems 132,… [0066] The voice-activated electronic device 190 is configured to receive audio inputs from an environment in proximity to the voice-activated electronic device 190. [0150] The server 140 could include one or more input devices 710 that facilitate user input, such as a keyboard, a mouse, a voice-command input unit or microphone, a touch screen display, a touch-sensitive input pad, a gesture capturing camera, or other input buttons or controls. Furthermore, the server 140 could use a microphone and voice recognition or a camera and gesture recognition to supplement or replace the keyboard.
In some implementations, the server 140 includes one or more cameras, scanners, or photo sensor units for capturing images, for example, of graphic series codes printed on the electronic devices.); determining an input signal control target based on the audio input signal ([0075] Specifically, a voice message is recorded by an electronic device 190, and the voice message is configured to request media play on a media output device 106. Optionally, the electronic device 190 partially processes the voice message locally. [0077] In some implementations, the user voice designation of the media output device 106 includes description of the destination media output device.); in response to a determination that the input signal control target is a media output device of the smart media environment system, transmitting the audio input signal to a remote server (a voice assistance server 112); receiving a voice analysis signal (cloud cast service server 116 receiving the voice analysis signal from the voice assistance server 112) based on the audio input signal and remote voice data ([0070] An assistant server (e.g., a voice assistance server 112) is configured to support the voice activated electronic device 190, control interactions with a search stack and resolve which media action needs to be executed according to raw voice inputs collected by the electronic device 190. The assistant server sends (202) a request to the cloud cast service server 116 which converts the media action into an Action Script that can then be executed by the target cast device 108. [0072] Once the command is obtained, the cloud cast service server 116 maintains this CloudCastCommand in a consistent storage keyed by a unique_command_id and target_device_id. [0075] Specifically, a voice message is recorded by an electronic device 190, and the voice message is configured to request media play on a media output device 106.
Optionally, the electronic device 190 partially processes the voice message locally. Optionally, the electronic device 190 transmits the voice message or the partially processed voice message to a voice assistance server 112 via the communication networks 110 for further processing. A cloud cast service server 116 determines that the voice message includes a first media play request, and that the first media play request includes a user voice command to play media content on a media output device 106 and a user voice designation of the media output device 106. The user voice command further includes at least information of a first media play application (e.g., YouTube and Netflix) and the media content (e.g., Lady Gaga music) that needs to be played. [0077] In some implementations, the user voice designation of the media output device 106 includes description of the destination media output device. [0100] In some implementations, after receiving the voice message recorded by an electronic device 190-1 or 190-2, the cloud cast service server 116 forwards the voice message to a voice assistance server 112 that parses the voice message and identifies the user voice command and the voice designation of the destination media output device, and receives from the voice assistance server 112 the user voice command and the voice designation of the destination media output device 106-2.); generating a command signal (Action script or a second media play request including the information of the first media play application and the media content) based on the voice analysis signal, wherein the command signal is associated with a voice command (a user voice command) of the media output device; and transmitting the command signal to the media output device, wherein the command signal causes the media output device to perform an action associated with the voice command ([0070] The assistant server sends (202) a request to the cloud cast service server 116 which converts the media 
action into an Action Script that can then be executed by the target cast device 108. [0076] In accordance with the voice designation of the media output device, the cloud cast service server 116 identifies in a device registry 118 a cast device associated in the user domain with the electronic device 190 and coupled to the media output device 106. The cast device 108 is configured to execute one or more media play applications for controlling the media output device 106 to play media content received from one or more media content hosts 114. Then, the cloud cast service server 116 sends to the cast device 108 a second media play request including the information of the first media play application and the media content that needs to be played. Upon receiving the information sent by the cloud cast service server 116, the cast device 108 executes the first media play application and controls the media output device 106 to play the requested media content.).
Shah et al., however, do not teach: the input signal control target is an image capture device of the image capture system or an image capture device mount of the image capture system.
Honjo teaches: the input signal control target is an image capture device of the image capture system ([0007] According to another aspect of the present invention, there is provided an image capturing apparatus comprising: an image capturing unit configured to capture an image; a communication unit capable of executing, while the image capturing apparatus is connected to a network, communication for allowing another apparatus connected to the network to recognize the image capturing apparatus and communication for transmitting the image to a communication partner apparatus; [0021] FIG. 1 shows an example of a system arrangement according to this embodiment. This system includes a camera 101 and a client apparatus 102. The camera 101 and the client apparatus 102 include communication apparatuses that can be connected to each other via a network 103 and can communicate with each other. [0026] Examples of the apparatus arrangements of the camera 101 and the client apparatus 102 will be described. The example of the apparatus arrangement of the camera 101 will be described first with reference to FIG. 2. In an example, the camera 101 includes a control unit 201, a storage unit 202, a communication unit 203, an image capturing unit 204, and an image capturing optical system 205. Note that the arrangement shown in FIG. 2 is merely an example. Part of the arrangement shown in FIG. 2 may be omitted, or an additional component may be added to the arrangement shown in FIG. 2. A plurality of functional blocks shown in FIG. 2 may be integrated into one functional block, and one functional block shown in FIG. 2 may be divided into a plurality of functional blocks. Part or all of the arrangement shown in FIG. 2 may be replaced by another component. 
For example, the camera 101 can have various arrangements such as an arrangement in which a video analysis function, a voice input function, a voice output function, and the like are further included, and an arrangement in which the control unit 201 has part of the capability of the storage unit 202. [0027] The control unit 201 performs overall control of the camera 101 and various processes by, for example, executing a program stored in the storage unit 202. [0028] The control unit 201 acquires a command from the client apparatus 102 via the communication unit 203, and controls the camera 101.).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Shah et al. to include the teaching of Honjo above in order to control camera operation through voice command.
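
For illustration only, the combination discussed above (a voice input whose control target may be either a media output device, as in Shah et al., or an image capture device, as in Honjo) can be sketched as follows. This is a minimal sketch under stated assumptions: the keyword matching, the target names, and the Command type are hypothetical and appear in neither reference, and the transcript is assumed to have already been produced by a voice analysis step.

```python
from dataclasses import dataclass

# Hypothetical target labels; neither Shah et al. nor Honjo defines these names.
MEDIA_OUTPUT = "media_output_device"
IMAGE_CAPTURE = "image_capture_device"

# Assumed keyword list used only to make the sketch concrete.
CAMERA_WORDS = ("camera", "pan", "tilt", "zoom")


@dataclass
class Command:
    target: str  # which device the command signal is sent to
    action: str  # normalized text of the requested action


def determine_control_target(transcript: str) -> str:
    """Choose a control target from words in the transcribed voice input."""
    words = transcript.lower().split()
    if any(word in CAMERA_WORDS for word in words):
        return IMAGE_CAPTURE
    return MEDIA_OUTPUT


def generate_command(transcript: str) -> Command:
    """Build a command signal for whichever target the transcript designates."""
    return Command(
        target=determine_control_target(transcript),
        action=transcript.strip().lower(),
    )
```

Under these assumptions, generate_command("Pan the camera left") is routed to the image capture device, while generate_command("Play Lady Gaga on YouTube") is routed to the media output device.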

Regarding Claims 2 and 13, Shah et al. teach: The method of claim 1, further comprising: receiving a feedback message (from the cloud cast service server 116), wherein the feedback message indicates the action status associated with the voice command ([0053] The speaker is configured to allow the electronic device 190 to deliver voice messages to a location where the electronic device 190 is located in the smart media environment 100, thereby broadcasting music, reporting a state of audio input processing, having a conversation with or giving instructions to a user of the electronic device 190. As an alternative to the voice messages, visual signals could also be used to provide feedback to the user of the electronic device 190 concerning the state of audio input processing. When the electronic device 190 is a conventional mobile device (e.g., a mobile phone or a tablet computer), its display screen is configured to display a notification concerning the state of audio input processing. [0054] In accordance with some implementations, the electronic device 190 is a voice interface device that is network-connected to provide voice recognition functions with the aid of a cloud cast service server 116 and/or a voice assistance server 112. For example, the electronic device 190 includes a smart speaker that provides music to a user and allows eyes-free and hands-free access to voice assistant service (e.g., Google Assistant). [0073] The cloud cast service server 116 sends the execution status message to source device (e.g. the voice-activated electronic device 190) optionally via a Cloud Messaging service. The voice-activated electronic device 190 will then call S3 for TTS and playback.).
Shah et al., however, do not teach: receiving a feedback message from the image capture device, wherein the feedback message indicates that the action associated with the voice command is completed (See Honjo teaching: [0024] Thus, the client apparatus 102 can recognize the camera 101 connected to the network 103, communicate with the camera 101, and transmit each control command for PTZ (Pan/Tilt/Zoom) control or the like to the camera 101. In this case, the camera 101 can also transmit, to the client apparatus 102, a response to the command. [0031] For example, the communication unit 305 performs various communications such as transmission of each change command including a network setting change to the camera 101, and reception of a video stream or a response to each change command from the camera 101. [0035] Note that if the setting is changed, the camera 101 may notify the client apparatus 102 of completion of the change of the setting.).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Shah et al. to include the teaching of Honjo above in order to control camera operation through voice command.

Regarding Claims 3 and 14, Shah et al. teach: The method of claim 2, wherein the feedback message is an audible signal ([0053] The speaker is configured to allow the electronic device 190 to deliver voice messages to a location where the electronic device 190 is located in the smart media environment 100, thereby broadcasting music, reporting a state of audio input processing, having a conversation with or giving instructions to a user of the electronic device 190. As an alternative to the voice messages, visual signals could also be used to provide feedback to the user of the electronic device 190 concerning the state of audio input processing. When the electronic device 190 is a conventional mobile device (e.g., a mobile phone or a tablet computer), its display screen is configured to display a notification concerning the state of audio input processing. [0073] The cloud cast service server 116 sends the execution status message to source device (e.g. the voice-activated electronic device 190) optionally via a Cloud Messaging service. The voice-activated electronic device 190 will then call S3 for TTS and playback.).

Regarding Claims 4 and 15, Shah et al. teach: The method of claim 2, wherein the feedback message is displayed on a display of the first device ([0053] As an alternative to the voice messages, visual signals could also be used to provide feedback to the user of the electronic device 190 concerning the state of audio input processing. When the electronic device 190 is a conventional mobile device (e.g., a mobile phone or a tablet computer), its display screen is configured to display a notification concerning the state of audio input processing.).

Regarding Claims 5 and 16, Shah et al. teach: The method of claim 1, wherein the audio input signal is obtained from a second device (cast device 108) via a wireless communication link ([0054] In accordance with some implementations, the electronic device 190 is a voice interface device that is network-connected to provide voice recognition functions with the aid of a cloud cast service server 116 and/or a voice assistance server 112. [0055] When voice inputs from the electronic device 190 are used to control the media output devices 106 via the cast devices 108, the electronic device 190 effectively enables a new level of control of cast-enabled media devices. [0062] Similarly, each of the cast devices 108 and the voice-activated electronic devices 190 is also capable of data communications and information sharing with other cast devices 108, voice-activated electronic devices 190, smart home devices, a central server or cloud-computing system 140, and/or other devices (e.g., the client device 104) that are network-connected. Data communications may be carried out using any of a variety of custom or standard wireless protocols (e.g., IEEE 802.15.4, Wi-Fi, ZigBee, 6LoWPAN, Thread, Z-Wave, Bluetooth Smart, ISA100.11a, WirelessHART, MiWi, etc.) and/or any of a variety of custom or standard wired protocols (e.g., Ethernet, HomePlug, etc.), or any other suitable communication protocol, including communication protocols not yet developed as of the filing date of this document. [0063] In some implementations, the cast devices 108, the electronic devices 190 and the smart home devices serve as wireless or wired repeaters. In some implementations, a first one of and the cast devices 108 communicates with a second one of the cast devices 108 and the smart home devices via a wireless router. 
The cast devices 108, the electronic devices 190 and the smart home devices may further communicate with each other via a connection (e.g., network interface 160) to a network, such as the Internet 110. Through the Internet 110, the cast devices 108, the electronic devices 190 and the smart home devices may communicate with a smart server system 140 (also called a central server system and/or a cloud-computing system herein).).

Regarding Claims 6 and 17, Shah et al. teach: The method of claim 5, further comprising: transmitting a feedback message (a second media play request) to the second device via the wireless communication link ([0076] In accordance with the voice designation of the media output device, the cloud cast service server 116 identifies in a device registry 118 a cast device associated in the user domain with the electronic device 190 and coupled to the media output device 106. The cast device 108 is configured to execute one or more media play applications for controlling the media output device 106 to play media content received from one or more media content hosts 114. Then, the cloud cast service server 116 sends to the cast device 108 a second media play request including the information of the first media play application and the media content that needs to be played. Upon receiving the information sent by the cloud cast service server 116, the cast device 108 executes the first media play application and controls the media output device 106 to play the requested media content.).

Regarding Claim 8, Shah et al. teach: The method of claim 7, wherein the command signal is generated based on the audio input signal and stored voice data ([0070] An assistant server (e.g., a voice assistance server 112) is configured to support the voice activated electronic device 190, control interactions with a search stack and resolve which media action needs to be executed according to raw voice inputs collected by the electronic device 190. The assistant server sends (202) a request to the cloud cast service server 116 which converts the media action into an Action Script that can then be executed by the target cast device 108. [0071] In some implementations, a voice assistant server makes a remote procedure call (RPC) of executeCastCommand with a CloudCastCommand as follows:
TABLE-US-00001
message CloudCastCommand [
  optional string unique_command_id = 1 ;
  optional string source_device_id = 2 ;
  optional string target_device_id = 3 ;
  optional string app_id = 4 ;
  optional string content_id = 5 ;
  optional string content_auth_token = 6 ;
]
message ExecuteCastCommandRequest [
  optional CloudCastCommand cast_command = 1 ;
]
message ExecuteCastCommandResponse [
  optional CloudCastCommand cast_command = 1 ;
  optional string cast_action_script = 2 ;
]
[0072] Once the command is obtained, the cloud cast service server 116 maintains this CloudCastCommand in a consistent storage keyed by a unique_command_id and target_device_id. The CloudCastCommand will be replaced or removed when another command is issued for the same target cast device 108 or the electronic device 190 or when /executionReport endpoints receives either SUCCESS/ERROR status. The cloud cast service server 116 then cleans up Command that is stale (haven't finished in a certain time period), and generates the Cast Action Script. Once Cast Action Script is generated, the cloud cast service server 116 returns the script in the RPC response, and sends the Response using Google Cloud Messaging Service if (source_device_id!=target_device_id). [0073] The cloud cast service server 116 sends the execution status message to source device (e.g. the voice-activated electronic device 190) optionally via a Cloud Messaging service. The voice-activated electronic device 190 will then call S3 for TTS and playback. Note: the execution status messages are prestored voice data output by TTS.).
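
For illustration only, the command storage behavior quoted above from Shah et al. [0072] (storage keyed by command and target, replacement when a new command is issued for the same target, removal on a SUCCESS/ERROR report, and cleanup of stale commands) can be sketched as follows. This is a minimal in-memory sketch, not the implementation of the reference: the CommandStore class, its method names, the created_at field, and the 30-second staleness window are assumptions introduced here; only the CloudCastCommand field names come from the quoted table.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CloudCastCommand:
    # Field names follow the quoted table; created_at is an added assumption.
    unique_command_id: str
    source_device_id: str
    target_device_id: str
    app_id: str = ""
    content_id: str = ""
    content_auth_token: str = ""
    created_at: float = field(default_factory=time.time)


class CommandStore:
    """Hypothetical store holding at most one pending command per target."""

    def __init__(self, stale_after_s: float = 30.0):  # assumed timeout
        self.stale_after_s = stale_after_s
        self._by_target: Dict[str, CloudCastCommand] = {}

    def put(self, cmd: CloudCastCommand) -> None:
        # A new command for the same target replaces the previous one.
        self._by_target[cmd.target_device_id] = cmd

    def get(self, unique_command_id: str,
            target_device_id: str) -> Optional[CloudCastCommand]:
        cmd = self._by_target.get(target_device_id)
        if cmd is not None and cmd.unique_command_id == unique_command_id:
            return cmd
        return None

    def report(self, unique_command_id: str, target_device_id: str,
               status: str) -> None:
        # A SUCCESS or ERROR execution report removes the matching command.
        if status in ("SUCCESS", "ERROR") and \
                self.get(unique_command_id, target_device_id) is not None:
            del self._by_target[target_device_id]

    def clean_stale(self, now: Optional[float] = None) -> int:
        # Drop commands that have not finished within the staleness window.
        now = time.time() if now is None else now
        stale = [t for t, c in self._by_target.items()
                 if now - c.created_at > self.stale_after_s]
        for target in stale:
            del self._by_target[target]
        return len(stale)
```

In this sketch a command is looked up by the pair (unique_command_id, target_device_id), mirroring the keying described in the quoted paragraph, while the single slot per target is one simple way to model the stated replacement behavior.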

Regarding Claim 9, Shah et al. teach: The method of claim 7, wherein the command signal is generated based on the audio input signal, stored voice data, and a user activity (See rejection of Claim 8 and [0086] The smart media environment 100 further includes one or more voice-activated electronic devices 190 that are communicatively coupled to the cloud cast service server 116 and the voice assistance server 112. [0087] When media content is being played on the first output device 106-1, a user may send a voice command to any of the electronic devices 190 to request play of the media content to be transferred to the second output device 106-2. The voice command includes a media play transfer request. [0089] In a specific example, when a music playlist is played on the first output device 106-1, the user says "Play on my living room speakers." The first output device 106-1 stops playing the currently played song, and the stopped song resumes on the living room speakers. When the song is completed, the living room speakers continue to play the next song on the music playlist previously played on the first output device 106-1. As such, when the user is moving around in the smart home environment 100, the play of the media content would seamlessly follow the user while only involving limited user intervention (i.e., giving the voice command). 
Such seamless transfer of media content is accomplished according to one or more of the following operations: [0090] A voice assistant service (e.g., a voice assistance server 112) recognizes that it is a user voice command to transfer media from one output device (source) to another output device (destination); [0091] The Assistant service passes a message including the user voice command to the cloud cast service server 116; [0092] The cloud cast service server 116 then asks the source output device 106-1 to provide a blob of data that is needed for transferring the media stream; [0093] The content of the blob of data is partner dependent but it typically contains the current media content being played, the position with the current media content and the stream volume of the current media content; [0094] Optionally, the content of the blob of data include information of a container for the current media content (e.g., the playlist to which the media content belong), and a position of the current media content within the playlist; [0095] The cloud cast service server 116 tells the source device to stop playing the media content; [0096] The cloud cast service server 116 then loads the appropriate receiver application (e.g., media play application) on the destination (i.e. the same receiver application that is running on the source output device); [0097] The cloud cast service server 116 sends this blob of data to the destination cast device 108-2 along with an instruction to the receiver application to resume transfer of the media content; and [0098] The receiver application interprets the data blob to resume the media content accordingly.).
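
For illustration only, the transfer operations quoted above from Shah et al. [0090]-[0098] (obtain a blob of playback state from the source, stop the source, and resume on the destination) can be sketched as follows. This is a minimal sketch under stated assumptions: the TransferBlob and OutputDevice types and the transfer function are hypothetical, and the blob fields model only the items the quoted text names (current content, position, volume, and optional playlist information).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TransferBlob:
    # Contents named in the quoted text; the field names are assumptions.
    content_id: str
    position_s: float
    volume: float
    playlist_id: Optional[str] = None
    playlist_index: Optional[int] = None


class OutputDevice:
    """Hypothetical stand-in for an output device and its receiver application."""

    def __init__(self, name: str):
        self.name = name
        self.playing: Optional[TransferBlob] = None

    def play(self, blob: TransferBlob) -> None:
        self.playing = blob

    def export_blob(self) -> Optional[TransferBlob]:
        return self.playing

    def stop(self) -> None:
        self.playing = None


def transfer(source: OutputDevice, destination: OutputDevice) -> TransferBlob:
    """Move playback from source to destination using the exported blob."""
    blob = source.export_blob()
    if blob is None:
        raise ValueError("nothing is playing on the source device")
    source.stop()            # source is told to stop playing
    destination.play(blob)   # destination resumes from the blob
    return blob
```

Under these assumptions, a song playing on a device labeled "106-1" at 42 seconds resumes on a device labeled "106-2" at the same position after transfer(source, destination), and the source is left idle.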

Regarding Claim 10, Shah et al. teach: The method of claim 9, wherein the user activity is determined based on sensor data of the image capture device (see rejection of claim 9 and [0150] The server 140 could include one or more input devices 710 that facilitate user input, such as a keyboard, a mouse, a voice-command input unit or microphone, a touch screen display, a touch-sensitive input pad, a gesture capturing camera, or other input buttons or controls. Furthermore, the server 140 could use a microphone and voice recognition or a camera and gesture recognition to supplement or replace the keyboard. In some implementations, the server 140 includes one or more cameras, scanners, or photo sensor units for capturing images, for example, of graphic series codes printed on the electronic devices. [0171] The client device 104 includes one or more input devices 810 that facilitate user input, such as a keyboard, a mouse, a voice-command input unit or microphone, a touch screen display, a touch-sensitive input pad, a gesture capturing camera, or other input buttons or controls. Furthermore, some the client devices 104 use a microphone and voice recognition or a camera and gesture recognition to supplement or replace the keyboard. In some implementations, the client device 104 includes one or more cameras, scanners, or photo sensor units for capturing images, for example, of graphic series codes printed on the electronic devices.).

Regarding Claim 11, Shah et al. teach: The method of claim 9, wherein the user activity is determined based on sensor data (voice data from microphone input) of the first device (device 190) (See rejection of claim 9).

Regarding Claim 18: The method of claim 12, wherein the action is adjusting a pan angle of the image capture device mount (See Honjo teaching: [0007] According to another aspect of the present invention, there is provided an image capturing apparatus comprising: an image capturing unit configured to capture an image; a communication unit capable of executing, while the image capturing apparatus is connected to a network, communication for allowing another apparatus connected to the network to recognize the image capturing apparatus and communication for transmitting the image to a communication partner apparatus; [0021] FIG. 1 shows an example of a system arrangement according to this embodiment. This system includes a camera 101 and a client apparatus 102. The camera 101 and the client apparatus 102 include communication apparatuses that can be connected to each other via a network 103 and can communicate with each other. [0024] The client apparatus 102 transmits a Probe packet for searching for an apparatus that has a desired function or provides a desired service among the apparatuses connected to the network 103. In this example, the client apparatus 102 transmits a Probe packet for searching for a camera as an apparatus having an image capturing function. If the camera 101 is an apparatus that has a function desired by the client apparatus 102 or provides a service desired by the client apparatus 102, upon receiving the Probe packet, the camera 101 transmits a ProbeMatch packet to the client apparatus 102. Thus, the client apparatus 102 can recognize the camera 101 connected to the network 103, communicate with the camera 101, and transmit each control command for PTZ (Pan/Tilt /Zoom) control or the like to the camera 101. In this case, the camera 101 can also transmit, to the client apparatus 102, a response to the command. 
[0027] The control unit 201 performs overall control of the camera 101 and various processes by, for example, executing a program stored in the storage unit 202. [0028] The control unit 201 acquires a command from the client apparatus 102 via the communication unit 203, and controls the camera 101.).

Regarding Claim 19: The method of claim 12, wherein the action is adjusting a tilt angle of the image capture device mount (See rejection of Claim 18).

Regarding Claim 20: The method of claim 12, wherein the action is adjusting a roll angle of the image capture device mount (See rejection of Claim 18).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The prior art of record Piccioni (US 2020/0355463 A1) teach: [0053] Remote camera 1300 is a suitable camera capable of receiving video and, optionally, audio input from an area in the field of view of remote camera and communicate video and, optionally, audio, data to a remote location over an external data communications network. In the illustrated embodiment, remote camera 1300 is mounted in a moveable manner that allows panning and tilting, and has the capability to change the zoom level and focus, all of which allow remote camera 1300 to modify remote camera's 1300 field of view in various ways. Alternatively, remote camera 1300 may have more limited movement, such as only being able to pan or tilt, or lack the ability to change zoom level and focus, such as a camera fixed on an automatic teller machine (ATM). The movement of remote camera 1300 may be self-initiated, such as under the control of software and hardware that is a part of the camera, or the movement may be controlled by commands received from an external control system, such as a server at a police department. Remote camera 1300 is mounted on a suitable mount, such as a lamp post or traffic signal. Alternatively, the mount may be any suitable item, such as a building, ATM, store front, vehicle, such as a public safety vehicle, a drone, a robotic device, another person, an animal, such as a police dog, or other structure. More specifically, the mount for remote camera 1300 is not required to be fixed in place, such as a light pole, but may itself be mobile, such as a drone. In embodiments where the mount for remote camera 1300 is itself mobile, remote camera 1300 may also be requested to change remote camera's 1300 geographic location, as well as pan, tilt and zoom, and may take into account the motion of the mount when determining where to direct remote camera's 1300 field of view. 
Indeed, remote camera 1300 may be another camera 60 or camera 50 mounted on a different wearer of a different belt 10. Remote camera 1300 is capable of communicating with other devices wireless over a suitable wireless data communications network such as the Internet, Wi-Fi, Bluetooth and proprietary or custom wireless network technologies. 
The prior art of record LEE et al. (US 2018/0322870 A1) teach: [0056] Furthermore, an internal service may be a VOD service, a voice telephone service, and a video telephone service. In this case, user interactive device 100 may i) recognize voice commands for the VOD service, the voice call service, and the video call service, ii) transmit the recognized voice commands (e.g., user speech and context information) to central server 200. Central server 200 may perform tasks to provide the requested service in cooperation with other internal servers 210 to 230, generate at least one control signals and audio and video feedbacks, and provide the results of performing the tasks and the generated control signals and audio and video feedbacks to user interactive device 100.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM whose telephone number is (571) 270-5878. The examiner can normally be reached Monday-Friday, EST (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/MOHAMMAD K ISLAM/
Primary Examiner, Art Unit 2656