Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.




A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 12/2/2020 has been entered.





The following is a quotation of 35 U.S.C. 112(b):

(B)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention. 

Claims 2-3 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

A) Line 7 of claim 1 recites,

“extract an audio signal of the selected object as an audio object signal.”

That is, an audio signal is extracted as an audio object signal.

Claim 2, which depends from claim 1, then recites,

 “extract the audio object signal from the audio signal.”  

The recitations are confusing and indefinite because the recitation of claim 1 appears to indicate that the audio object signal comprises the extracted audio signal whereas the recitation of claim 2 appears to indicate that the audio object signal is extracted from the audio signal.  The two recitations appear to be mutually exclusive. 

Appropriate clarification is needed.  It would appear that clarification could be made by amending claim 1 to indicate that the extracted audio signal includes the audio object signal.   



The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –


(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.




Claim 1 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Document #2015/0312662 to Kishimoto et al.

As is illustrated in Figure 2, Kishimoto discloses a system comprising:   

A) A source of video-audio signals (e.g., @ 10, 20, 50); and 

B) A video-audio processing apparatus (e.g., @ 30), comprised of processing circuitry executing software stored on a non-transitory memory [see paragraphs 0043-0044], configured to:

1) Receive and display the video signals from the source [e.g., note blocks S11 and S12 of Figure 4], wherein the display of the received video signal:

a) causes one or more video objects, contained within the images of the video signal, to be displayed via the display of the video signal and, hence, to be displayed “based on the video signal” [e.g., Note: 30 of Figure 1 on which a person (video object) is displayed; and paragraphs 0051 and 0061];
   

2) Receive a selection, by a user, of a video object which is selected by the user from the one or more objects in the displayed image [e.g., Note: 30 and P1 of Figure 1; S13 and S14 of Figure 4; and paragraphs 0053 and 0054]; and

3) Extract an audio signal of the selected video object as an audio object signal based on sound (captured @ 10 of Figure 1) coming from a position corresponding to a position of the user selected video object [e.g., Note: S13-S15 of Figure 4; and paragraphs 0053-0054 and 0068-0069].


Claim 19 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Document #2015/0312662 to Kishimoto et al. for the same reasons that were set forth above with respect to claim 1.  Additionally:

As addressed above, the apparatus described by Kishimoto performs the recited method (each of the recited active steps of manipulation).  


Claim 20 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Document #2015/0312662 to Kishimoto et al. for the same reasons that were set forth above with respect to claim 19.  Additionally:

Again, as to the non-transitory CRM implementation, see paragraphs 0043-0044 of Kishimoto.  

Claim 4 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Document #2015/0312662 to Kishimoto et al. for the same reasons that were set forth above with respect to claim 1.  Additionally:

In the system disclosed by Kishimoto, object position information is produced that indicates the position of the selected video object both:

1) In the space of the display screen (via x-y coordinates) [e.g., @ 35 of Figure 2; and paragraph 0053]; and (alternatively) 

2) In the space of the camera’s FOV (via range and range angle derived from the x-y coordinates) [e.g., @ 38 of Figure 2; and paragraphs 0063-0064].    

Each/both of the produced forms of “object position information” are then used to extract the audio signal as the audio object signal as already addressed above with respect to claim 1.
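By way of illustration only, the derivation of a direction from display x-y coordinates noted above can be sketched as follows.  The sketch assumes an ideal pinhole camera whose optical axis passes through the image center and assumes known horizontal and vertical fields of view; the function name and parameters are hypothetical and are not taken from Kishimoto, and only angles (not range) are computed.

```python
import math

def pixel_to_angles(x, y, width, height, hfov_deg, vfov_deg):
    """Map a screen point (pixels, origin top-left) to (azimuth, elevation) in degrees.

    Assumption: ideal pinhole camera with the optical axis at the image center.
    """
    # Focal lengths in pixel units, derived from the assumed fields of view.
    fx = (width / 2) / math.tan(math.radians(hfov_deg) / 2)
    fy = (height / 2) / math.tan(math.radians(vfov_deg) / 2)
    # Offsets from the image center; screen y grows downward, so flip it.
    dx = x - width / 2
    dy = height / 2 - y
    azimuth = math.degrees(math.atan2(dx, fx))
    elevation = math.degrees(math.atan2(dy, fy))
    return azimuth, elevation
```

Under these assumptions, a point at the center of the display maps to zero azimuth and elevation, and a point at the edge of the display maps to half of the corresponding field of view.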


Claim 5 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Document #2015/0312662 to Kishimoto et al. for the same reasons that were set forth above with respect to claim 4.  Additionally:

In the system disclosed by Kishimoto, the audio object signal is extracted by extracting/separating the audio signal from the sound information captured/provided by the source (@ 12) via sound separation (@ 37 of Figure 2) using the object position information (@ 35/38 of Figure 2) [e.g., @ paragraphs 0069-0070].


Claim 6 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Document #2015/0312662 to Kishimoto et al. for the same reasons that were set forth above with respect to claim 5.  Additionally:

The examiner notes that the extracting/separation described by Kishimoto is, by definition, a “beam-forming” process in that it directs/steers the focus of the microphone array [e.g., @ paragraphs 0005 and 0069-0070].
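For context only, steering the focus of a microphone array is conventionally done by delay-and-sum processing, which can be sketched as below.  This is a generic textbook technique shown under stated assumptions (a linear array, a far-field source, whole-sample delays, and an assumed speed of sound); it is not represented to be the specific processing described by Kishimoto, and all names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def steering_delays(mic_x, angle_deg, sample_rate):
    """Per-microphone delays (whole samples) for a far-field source.

    mic_x: microphone positions along a line, in metres.
    angle_deg: source direction measured from broadside (0 = straight ahead).
    """
    s = math.sin(math.radians(angle_deg))
    # Arrival time at each microphone relative to the array origin.
    arrival = [-x * s / SPEED_OF_SOUND for x in mic_x]
    latest = max(arrival)
    # Delay each channel so every copy lines up with the latest arrival.
    return [round((latest - t) * sample_rate) for t in arrival]

def delay_and_sum(channels, delays):
    """Delay each channel by its sample count, then average across channels."""
    n = len(channels[0])
    out = [0.0] * n
    for ch, d in zip(channels, delays):
        for i in range(d, n):
            out[i] += ch[i - d]
    return [v / len(channels) for v in out]
```

Sound arriving from the steered direction adds coherently after the delays are applied, whereas sound from other directions is misaligned and is attenuated by the averaging.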


Claim 11/1 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Document #2015/0312662 to Kishimoto et al. for the same reasons that were set forth above with respect to claim 4.  Additionally:

The position information produced by the system disclosed by Kishimoto is, by definition, a form of metadata associated with the displayed video image and video image object.



Claim 12 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Document #2015/0312662 to Kishimoto et al. for the same reasons that were set forth above with respect to claim 11.  Additionally:

As addressed above with respect to claim 4, the position information produced by the system disclosed by Kishimoto represents the position in space of the selected video object.


Claim 14 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Document #2015/0312662 to Kishimoto et al. for the same reasons that were set forth above with respect to claim 11.  Additionally:

The sound collection range information, identifying areas A1, represents a spread state of the audio signal to be extracted and is, by definition, a form of metadata associated with the displayed video image and video image object.


Claim 17 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Document #2015/0312662 to Kishimoto et al. for the same reasons that were set forth above with respect to claim 11.  Additionally:

The source comprises a camera (e.g., @ 20 of Figure 2) for photographing a scene/FOV to obtain the video signal.


Claim 18 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Document #2015/0312662 to Kishimoto et al. for the same reasons that were set forth above with respect to claim 11.  Additionally:

The source comprises a microphone array (e.g., @ 10 of Figure 2) for obtaining the audio signal by carrying out sound acquisition.


The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Document #2015/0312662 to Kishimoto et al. in view of US Patent Document #2010/0123785 to Chen.

Kishimoto discloses a system as was set forth above with respect to claim 1.  Additionally, with respect to Kishimoto, the following is noted:

A) As is illustrated in Figure 1, Kishimoto described the system as being configured, via a range designation unit (@ 44), to set a sound collection area which is illustrated in the Figure at A1.  As is illustrated in Figures 9A-9C, Kishimoto describes the user as being able to change/set the size of the sound collection region at A1 [e.g., note too: paragraphs 0055-0059].  

B) Kishimoto further describes the system as having comprised an image recognition process (e.g., @ 34) for recognizing patterns in the video images, registered in advance, corresponding to specific video objects – i.e., a person or person’s face [e.g., note: paragraphs 0051, and 0060-0061].  As described these recognized video objects may be used to set the position and size of the collection area A1 [e.g., note: paragraph 0061] – thereby causing an image (A1) to be displayed together with the video object. 

Claim 7 differs from the system disclosed by Kishimoto only in that claim 7, when construed in the context of claim 1, requires the user to select the recognized object.

In an analogous environment, as in Kishimoto, Chen described a system which used an object (e.g., face) recognition process to recognize and mark sound producing objects in captured video images for the purpose of controlling a beam forming microphone array [e.g., @ paragraphs 0021-0026].  However, in Chen, the user interface was configured to provide the user with the ability to choose whether or not to select the so recognized and marked objects.  In contrast, in Kishimoto, the selection of the recognized and marked object was automatically made in response to the recognition.

In accordance with the showing of Chen, it would have been obvious to one of ordinary skill in the art to have modified the system disclosed by Kishimoto so as to have been configured to allow the user to select one or more of the recognized and marked objects, thereby advantageously providing the user with the ability to control which of the recognized video objects the system was to focus on and desirably providing a more user friendly user interface.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Document #2015/0312662 to Kishimoto et al. in view of US Patent Document #2010/0123785 to Chen for the same reasons that were set forth above with respect to claim 7.  Additionally:

As addressed above, the object recognition process in the modified system of Kishimoto comprised face recognition.




Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Document #2015/0312662 to Kishimoto et al. in view of US Patent Document #2010/0123785 to Chen for the same reasons that were set forth above with respect to claim 7.  Additionally:

The image displayed with the recognized object in the modified system of Kishimoto (@ A1) comprises a “frame”.



Claim 2/1 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Document #2015/0312662 to Kishimoto et al. in view of US Patent Document #2010/0123785 to Chen for the reasons set forth above with respect to claim 7.  Additionally:

In the modified system of Kishimoto, the audio signal is first extracted from the audio source as the audio object signal but is not outputted until the video object is selected, at which time the outputted audio object signal is “extracted” therefrom for output.




Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Document #2015/0312662 to Kishimoto et al. in view of US Patent Document #2010/0123785 to Chen for the reasons set forth above with respect to claim 2.  Additionally:

In the modified system of Kishimoto, more than one audio object signal may be selected by the user for output.  Whether one of the object signals is deemed/labelled/considered foreground and one is deemed/labelled/considered background is an issue of perspective/semantics and, as such, such terminology fails to distinguish that which is claimed over the applied prior art.



Claim 13/11 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Document #2015/0312662 to Kishimoto et al. in view of US Patent Document #2010/0123785 to Chen for the reasons set forth above with respect to claim 7.  Additionally:

In the modified system of Kishimoto, the information generated in response to the user inputs for the selection of one of the plurality of video objects establishes a processing priority for the selected object over the non-selected video object(s) and is, by definition, a form of metadata associated with the displayed video image and video image object.



Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Document #2015/0312662 to Kishimoto et al. in view of US Patent Document #2010/0123785 to Chen for the reasons set forth above with respect to claim 11, further in view of US Patent Document #2014/0244880 to Soffer.

It would have been obvious to one of ordinary skill in the art to have modified the system disclosed by Kishimoto in view of the showing of Chen for the reasons set forth above with respect to claim 11.  As is illustrated in Figure 2 of Kishimoto, the modified system of Kishimoto comprises image signal processing (@ 33) and audio signal processing (@ 42) for outputting the generated video and audio signals to respective display devices (@ 61 and 63) in the form required by the display devices [e.g., note paragraph 0049 of Kishimoto].

Claim 15 further recites processing to encode the audio and the metadata.

Soffer evidences KVM interface circuitry (e.g., @ Figures 1-3) for allowing peripheral display and user input devices of a computer to be located remotely from the computer.  This is accomplished by providing a bidirectional communication link between the computer and the remote devices and encoding the information to be communicated therebetween (including the audio and video information) as a multiplexed transmission for communication via the provided communication link which, in some implementations, may be a single channel/stream [note: paragraph 0029].  In accordance with the teachings of Soffer, it would have been obvious to one of ordinary skill in the art to have further modified the system disclosed by Kishimoto so as to have comprised a conventional KVM interface for allowing the peripheral display and interface devices to be located remotely from the computer, thereby advantageously conserving working space at the user side while permitting an added layer of physical security/protection for the computer itself.  The so further modified system would have operated to encode/convert the video, audio and metadata information (i.e., the A1 image information), required for display, into a digital stream for transmission through the KVM link.





Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Document #2015/0312662 to Kishimoto et al. in view of US Patent Document #2010/0123785 to Chen and US Patent Document #2014/0244880 to Soffer for the reasons set forth above with respect to claim 15.  Additionally:

The encoding and multiplexing recited in claim 16 read on the “encoding” and “multiplexing” of the modified system of Kishimoto that is required to transmit the video, audio, and metadata information through the KVM link as a single stream as was set forth above with respect to claim 15.
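Purely as an illustration of what multiplexing audio and metadata into a single stream can mean, the following sketch frames each payload with a type code and a length and concatenates the frames.  The packet layout, the type codes, and the use of JSON for the metadata are assumptions made for this example; they are not drawn from Soffer, Kishimoto, or the claims, and any actual audio compression is omitted.

```python
import json
import struct

HEADER = struct.Struct(">BI")  # 1-byte packet type, 4-byte payload length
TYPE_AUDIO, TYPE_METADATA = 1, 2  # hypothetical type codes

def mux(audio_frames, metadata):
    """Pack a JSON metadata record and audio payloads into one byte stream."""
    meta = json.dumps(metadata, sort_keys=True).encode("utf-8")
    stream = bytearray(HEADER.pack(TYPE_METADATA, len(meta)) + meta)
    for frame in audio_frames:
        stream += HEADER.pack(TYPE_AUDIO, len(frame)) + frame
    return bytes(stream)

def demux(stream):
    """Split the stream back into (audio_frames, metadata)."""
    frames, metadata, pos = [], None, 0
    while pos < len(stream):
        kind, size = HEADER.unpack_from(stream, pos)
        pos += HEADER.size
        payload = stream[pos:pos + size]
        pos += size
        if kind == TYPE_METADATA:
            metadata = json.loads(payload.decode("utf-8"))
        elif kind == TYPE_AUDIO:
            frames.append(payload)
    return frames, metadata
```

A receiver at the far end of the link would recover the separate audio and metadata by reading each header and slicing out the payload that follows it.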

The examiner notes Figure 2 of US Patent Document #2009/0174805 to Alberth.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID E. HARVEY whose telephone number is (571) 272-7345.  The examiner can normally be reached on M-F from 6:00AM to 3PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mr. William Vaughn, can be reached on (571) 272-3922.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

                                                          /DAVID E HARVEY/
                                                          Primary Examiner, Art Unit 2481