DETAILED ACTION
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims


2.	The following is a NON-FINAL office action upon examination of application number 16/238,084 in response to Applicant’s Request for Continued Examination (RCE) filed on 01/05/2022. Claims 1-20 are pending in the application and have been examined on the merits discussed below.

Response to Amendment

3.	In the response filed January 05, 2022, Applicant amended claims 1-2, 6-9, 12, 14-16, and 19, and did not cancel any claims. No new claims were presented for examination. 

4.	The claim rejections under 35 U.S.C. 101 were previously withdrawn [See Office Action, 03/01/2021].

Response to Arguments

5.	Applicant's arguments filed January 05, 2022, have been fully considered.

6.	Applicant submits “The rejection should be withdrawn because neither Yuxin nor Sarin nor Soundararajan, alone nor in combination, teach, disclose, or suggest the elements of the pending claims.” [Applicant’s Remarks, 01/05/2022, page 7]

In response to the Applicant’s argument that “the rejection should be withdrawn because neither Yuxin nor Sarin nor Soundararajan, alone nor in combination, teach, disclose, or suggest the elements of the pending claims,” it is noted that this argument is a mere allegation of patentability by the Applicant with no supporting rationale or explanation. Merely stating that the references do not teach a feature does not offer any insight as to why the specific sections of the prior art relied upon by the Examiner fail to disclose the claimed features. Applicant's arguments amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references. Moreover, the Examiner notes that the limitations argued by Applicant were newly added to the claims by the amendment filed 01/05/2022 and are addressed in the updated rejection below. Applicant’s argument has been considered, but it pertains to amendments to independent claims 1/7/14 that are believed to be addressed via the new ground of rejection under 35 U.S.C. 103 set forth in the instant office action.

7.	Applicant submits “Yuxin, Sarin, and Soundararajan fail to disclose the elements recited as “performs a secondary level processing of information associated with the survey vote, wherein the first image from the first angle of the survey vote is compared to a second image from a second angle of the survey vote” as recited in amended claim 1; or “executing a secondary level processing of information associated with the survey vote, wherein the first image from the first angle of the survey vote is compared to a second image from a second angle of the survey vote” as recited in amended claim 7; or “executing a secondary level processing of information associated with the survey vote, wherein the first image from the first angle of the survey vote is compared to a second image from a second angle of the survey vote” as recited in amended claim 14.”

In response to the Applicant’s argument that “Yuxin, Sarin, and Soundararajan fail to disclose the elements recited as “performs a secondary level processing of information associated with the survey vote, wherein the first image from the first angle of the survey vote is compared to a second image from a second angle of the survey vote” as recited in amended claim 1; or “executing a secondary level processing of information associated with the survey vote, wherein the first image from the first angle of the survey vote is compared to a second image from a second angle of the survey vote” as recited in amended claim 7; or “executing a secondary level processing of information associated with the survey vote, wherein the first image from the first angle of the survey vote is compared to a second image from a second angle of the survey vote” as recited in amended claim 14,” the Examiner notes that the limitations argued by Applicant were newly added to the claims by the amendment filed 01/05/2022 and are addressed in the updated rejection below. Applicant’s argument has been considered, but it pertains to amendments to independent claims 1/7/14 that are believed to be addressed via the new ground of rejection under 35 U.S.C. 103 set forth in the instant office action.

8.	Applicant’s remaining arguments either logically depend from the above-rejected arguments, in which case they too are unpersuasive for the reasons set forth above, or they are directed to features which have been newly added via amendment. Therefore this is now the Examiner's first opportunity to consider these limitations in view of the prior art and as such any arguments regarding these limitations would be inappropriate since they have not yet been examined. A full rejection of these limitations in view of the prior art will be presented later in this Office Action.

Claim Rejections - 35 USC § 103

9.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

10.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

11.	The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

12.	This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.


13.	Claims 1-2, 4-5, 7-11, 13-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Yuxin et al., Pub. No.: US 2010/0207874 A1, [hereinafter Yuxin], in view of Sarin et al., Pub. No.: US 2013/0124207 A1, [hereinafter Sarin], in view of Soundararajan et al., Pub. No.: US 2015/0189378 A1, [hereinafter Soundararajan], in further view of Hawkins, Pub. No.: US 2016/0125242 A1, [hereinafter Hawkins].

As per claim 1, Yuxin teaches a system (paragraph 0009, discussing a system and method for detecting collaborative gestures of an audience using vision-based technologies to enable interactivity for digital signage and other applications), comprising: 

a first camera (paragraph 0012, discussing that as shown in FIG. 1, the video camera computer can be interconnected to the display computer, allowing feedback and analysis from the video camera computer to be used by the display computer. The display computer can also provide feedback to the video camera computer regarding camera settings to allow the change of focus, zoom, field of view, and physical orientation of the camera, if the mechanisms to do such are associated with the camera. The camera computer can include an input/output device, having a keyboard, monitor, and other input and output devices for allowing direct input and output of data to and from the camera computer…; paragraph 0029, discussing that the system then captures an audience view with the imaging device(s). This step can involve capturing a single snapshot or a series of frames/video. It can involve capturing a view of the entire camera field of view, or only a portion of the field of view…);

a memory that stores computer executable components (paragraph 0015, discussing that the controller can be any type of personal computer, portable computer, or workstation computer that includes a processing unit, a system memory, and a system bus that couples the processing unit to the various components of the computer…); and

a processor that executes the computer executable components stored in the memory (paragraph 0015: “The processing unit may include one or more processors, each of which may be in the form of any one of various commercially available processors. Generally, each processor receives instructions and data from a read-only memory and/or a random access memory.”),

wherein the computer executable components comprise: an image capturing component to capture a first image from a first angle (paragraph 0010, discussing that the system includes at least one imaging device (e.g. a camera) pointed at an audience (located in an audience area 16 that represents at least a portion of the field of view of the imaging device), and a video camera computer, interconnected to the imaging device and configured to run gesture detection and recognition algorithms. The video camera computer is a video image analysis computing device that is configured to analyze visual images taken by the imaging device. The imaging device can be configured to take video images (i.e. a series of sequential video frames that capture motion) at any desired frame rate, or it can take still images; paragraph 0019, discussing that a visual image of the audience response is taken by the imaging device [i.e., an image capturing component], and is analyzed by the image analysis computer; paragraph 0029, discussing that once some display output is provided, the system then captures an audience view with the imaging device(s). This step can involve capturing a single snapshot or a series of frames/video. It can involve capturing a view of the entire camera field of view [i.e., This shows that an image is captured from a first angle], or only a portion of the field of view…); and
an image processing component that processes the first image to determine a survey count, wherein the survey count indicates a number of times a survey image was identified in the first image (paragraph 0017, discussing that the display provides visual and/or audio-visual content to the audience. The content can be in the form of survey questions. This content can include requests, or other types of prompts for audience response in the form of some gesture. The gesture requested can be raising or waving a hand…; paragraph 0018, discussing that in FIG. 2 the group 14 has been prompted to raise a hand in response to some query provided by the display system…; paragraph 0020, discussing that vision-based detection captures an audience's view via image-processing techniques in order to detect and classify specific gestures. For example, audience participants positioned near the imaging device can collectively perform some type of individual gesture (e.g. raise a hand) in response to the content, or they can perform a collaborative gesture; paragraph 0021, discussing that using group gesture detection techniques that have been developed, the image analysis that the system performs can function in at least two basic ways. One mode of operation of the system is to measure the level of audience interest as a function of the audience response (e.g. looking for a majority response). For example, the simultaneous raise of many hands in the audience can be detected and regarded as a highly positive feedback to the displayed content. For example, if the prompt provided to the audience in FIG. 2 related to a choice between alternatives for subsequent content, and the system were programmed to provide content based upon a majority vote of the audience, the audience gestures shown in FIG. 2 would suggest that 12 out of the 15 audience members approved [i.e., this shows that the first image is processed to determine a survey count] of a particular option that was then being offered. The displayed content can then be adapted or modified based upon this majority vote to provide content indicated by the audience response; paragraph 0033, discussing that the next step in the method is to recognize the collaborative behavior. This step can include aggregating or correlating the results or detecting similarity in gestures to recognize collaborative gestures/behaviors. In this step the system analyzes the input data and computes scores based on various detectors to rank the probability of the appearance of one or more collaborative gestures/behaviors…).

	While Yuxin teaches an image capturing component, it does not explicitly teach that the image capturing component controls the first camera to capture a first image from a first angle of a survey vote, in response to detecting, via a microphone, an audio phrase that matches a survey phrase; and an authentication component that based on detection of a discrepancy within the survey count: in response to the discrepancy exceeding a threshold: performs a secondary level processing of information associated with the survey vote, wherein the first image from the first angle of the survey vote is compared to a second image from a second angle of the survey vote, and adjusts the survey count based on the secondary level processing of the information; and in response to the discrepancy not exceeding a threshold, ignoring the discrepancy. Sarin in the analogous art of image capturing applications teaches:

an image capturing component that, in response to detecting, via a microphone, an audio phrase that matches a survey phrase, controls the first camera to capture a first image (paragraph 0003, discussing a computing device (e.g., a smart phone, digital camera, or other device with image capture functionality) causes an image capture device [i.e., image capturing component] to capture one or more digital images based on audio input (e.g., a voice command) received by the computing device…).

Yuxin is directed toward audience measurement systems. Sarin is directed toward tools for voice-controlled image capture operations. Therefore they are deemed to be analogous as they both are directed towards audience measurement systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Yuxin with Sarin because the references are analogous art, both being directed to solutions for audience measurement, which falls within applicant’s field of endeavor (audience analysis), and because modifying Yuxin to include Sarin’s feature of controlling the first camera to capture a first image in response to detecting, via a microphone, an audio phrase that matches a survey phrase, in the manner claimed, would serve the motivation of improving the user experience by allowing quick access to image-capture functionality (Sarin at paragraph 0001), or in the pursuit of providing quick access to image-capture functionality based on audio command input received; and further obvious because the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
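
For illustration only, the claimed limitation discussed above (capturing a first image in response to detecting an audio phrase that matches a survey phrase) may be sketched as follows. This sketch is not drawn from Yuxin or Sarin and does not characterize either disclosure. The names normalize, phrase_matches, StubCamera, and on_audio, and the simple substring matching rule, are hypothetical assumptions chosen for clarity.

```python
import re


def normalize(phrase: str) -> str:
    """Lowercase the text and drop punctuation (assumed normalization rule)."""
    return re.sub(r"[^a-z0-9 ]", "", phrase.lower()).strip()


def phrase_matches(heard: str, survey_phrase: str) -> bool:
    """Hypothetical rule: the survey phrase must appear within the heard text."""
    target = normalize(survey_phrase)
    return bool(target) and target in normalize(heard)


class StubCamera:
    """Stand-in for a first camera; it only records capture requests."""

    def __init__(self) -> None:
        self.captures = []

    def capture(self, angle: str) -> str:
        image_id = f"image-{len(self.captures) + 1}@{angle}"
        self.captures.append(image_id)
        return image_id


def on_audio(heard: str, survey_phrase: str, camera: StubCamera, angle: str = "first"):
    """Capture an image only when the heard audio matches the survey phrase."""
    if phrase_matches(heard, survey_phrase):
        return camera.capture(angle)
    return None
```

In this sketch the transcribed audio is assumed to be already available as text; speech recognition itself is outside the scope of the example.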

The Yuxin-Sarin combination does not explicitly teach an authentication component that, based on detection of a discrepancy within the survey count: in response to the discrepancy exceeding a threshold: performs a secondary level processing of information associated with the survey vote, wherein the first image from the first angle of the survey vote is compared to a second image from a second angle of the survey vote, and adjusts the survey count based on the secondary level processing of the information; and in response to the discrepancy not exceeding a threshold, ignoring the discrepancy. However, Soundararajan in the analogous art of audience measurement systems teaches:

an authentication component that based on detection of a discrepancy within the survey count: in response to the discrepancy exceeding a threshold: performs a secondary level processing of information associated with the survey vote (paragraph 0001, discussing methods and apparatus to count people in an audience; paragraph 0024, discussing that the secondary presence data indicates that a person is not present in the media exposure environment of the primary media presentation device even though the person is logged in as an audience member via the people meter. In such instances, examples disclosed update and/or adjust the people meter data by, for example, removing the incorrectly counted audience member [i.e., detection of a discrepancy] from one or more counts and/or tallies associated with the media exposure environment and/or media detected in the environment…; paragraph 0026, discussing that the base metering device performs the comparison(s) of the secondary presence data with the primary presence data and the corresponding adjustment of data based on discrepancies; paragraph 0027, discussing that in some situations, such as when there is more than one member in a panelist household, detecting a person near a portable device may be insufficient to specifically identify which household member is near the portable device. Accordingly, the proximity data collected by the portable device is further analyzed to identify or at least estimate the identity of the detected person. In some examples, estimating the identity of the person is  and 

adjusts the survey count based on the secondary level processing of the information (paragraph 0041, discussing that the first portable device 114 of FIG. 1 communicates the secondary data obtained by the first portable device 114 to the example base metering device 110 of FIG. 1…In the illustrated example of FIG. 1, the example base metering device 110 of FIG. 1 compares the primary presence information received from the example primary people meter 112 of FIG. 1 with the secondary presence data obtained via the example portable devices 114, 118 to, for example, confirm the accuracy of the data obtained via the primary people meter 112. In some examples, if there is a discrepancy between the primary presence information gathered via the example primary people meter 112 and the secondary presence information gathered via the portable devices 114, 118, the example base metering device 110 of FIG. 1 adjusts the primary presence information by removing incorrectly counted individuals from a tracked audience [i.e., adjusts the survey count based on the secondary level processing of the information] of the media presentation device 104 and/or decreases a count or tally of people in an audience for the media exposure environment 102. Alternatively, the example base metering device 110 of FIG. 1 adjusts the primary presence information by adding individuals to the tracked audience of the media presentation device 104 and/or increases a count or tally of people in an audience for the media exposure environment 102; paragraph 0047, discussing that the people analyzer 200  and 

in response to the discrepancy not exceeding a threshold, ignoring the discrepancy (paragraph 0083, discussing that the identity estimator 506 generates a probability or confidence 

The Yuxin-Sarin combination describes features related to audience measurement. Soundararajan is directed toward methods and apparatus to count people in an audience. Therefore they are deemed to be analogous as they both are directed towards audience measurement systems. It would have been obvious to one of ordinary skill in the art before the 

The Yuxin-Sarin-Soundararajan combination does not explicitly teach wherein the first image from the first angle of the survey vote is compared to a second image from a second angle of the survey vote. However, Hawkins in the analogous art of audience participation measurement teaches this concept. Hawkins teaches:

wherein the first image from the first angle of the survey vote is compared to a second image from a second angle of the survey vote (paragraph 0090, discussing that although an image is captured of an entire stand 104A-D by each respective camera 106A-D, an alternative for counting the number of placards is that the settings of each camera are set so that only a portion of the stand appears in the image frame. Advantageously, this allows a more detailed (zoomed-

Examiner notes that Hawkins, in addition to Yuxin as cited above, also teaches: controls the first camera to capture a first image from a first angle of a survey vote (paragraph 0003, discussing a method, system and apparatus for providing improved audience participation; paragraph 0022, describes that allowing members of a live audience at an event to vote on issues  

The Yuxin-Sarin-Soundararajan combination describes features related to audience measurement. Hawkins is directed toward an audience participation measurement system. Therefore they are deemed to be analogous as they both are directed towards audience measurement systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuxin-Sarin-Soundararajan combination with Hawkins because the references are analogous art, both being directed to solutions for audience measurement, which falls within applicant’s field of endeavor (audience analysis), and because modifying the Yuxin-Sarin-Soundararajan combination to 

As per claim 2, the Yuxin-Sarin-Soundararajan-Hawkins combination teaches the system of claim 1. The Yuxin-Sarin-Soundararajan combination does not explicitly teach wherein the second level processing of information comprises processing the second image from the second angle of the survey vote. However, Hawkins in the analogous art of audience participation measurement systems teaches this concept. Hawkins teaches:

wherein the second level processing of information comprises processing the second image from the second angle of the survey vote (paragraph 0090, discussing that although an image is captured of an entire stand 104A-D by each respective camera 106A-D, an alternative for counting the number of placards is that the settings of each camera are set so that only a portion of the stand appears in the image frame. Advantageously, this allows a more detailed (zoomed-in) image of the captured portion of the stand, thus improving the accuracy of placard detection. In this case, each camera is slowly panned across its respective stand and a plurality of images is captured [i.e., panning a camera across its respective stand to take a plurality of images suggests capturing a second image from a second angle] as the camera is panned so that each captured image is an image of a different portion of the stand. The camera panning speed and range and the image capture frequency are predetermined so that every part of the 

The Yuxin-Sarin-Soundararajan combination describes features related to audience measurement. Hawkins is directed toward audience participation measurement system. Therefore they are deemed to be analogous as they both are directed towards audience measurement systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuxin-Sarin-Soundararajan combination with Hawkins because the references are analogous art because they are both directed to solutions for audience measurement, which falls within applicant’s field of endeavor (audience analysis), and because modifying the Yuxin-Sarin-Soundararajan combination to include Hawkins’s feature for processing the second image from the second angle of the survey vote, in the manner claimed, would serve the motivation of improving the accuracy of the detection of objects in the captured image (Hawkins at paragraph 0058), or in the pursuit of improving 

As per claim 4, the Yuxin-Sarin-Soundararajan-Hawkins combination teaches the system of claim 1. Yuxin further teaches an audio processing component (paragraph 0010, discussing that the system includes at least one imaging device 12 (e.g. a camera) pointed at an audience, and a video camera computer, interconnected to the imaging device and configured to run gesture detection and recognition algorithms. The video camera computer is a video image analysis computing device that is configured to analyze visual images taken by the imaging device…; paragraph 0017, discussing that the display and the audio speaker [i.e., audio processing component] provide visual and/or audio-visual content to the audience. The content can be in the form of survey questions, or any other type of content. This content can include requests, or other types of prompts for audience response in the form of some gesture. These requests or prompts are for group responses…; claim 3, discussing an interactive content delivery system in accordance with claim 2, further comprising an audio broadcast device, synchronized with the video display device; paragraphs 0018, 0040).

While Yuxin teaches an audio processing component (paragraph 0017), it does not explicitly teach an audio processing component that processes one or more audio phrases captured by the microphone to determine if the one or more audio phrases comprises the audio phrase that matches the survey phrase. However, Sarin in the analogous art of image capturing applications teaches this concept. Sarin teaches:

an audio processing component that processes one or more audio phrases captured by the microphone to determine if the one or more audio phrases comprises the audio phrase that matches the survey phrase (paragraph 0003, discussing a computing device (e.g., a smart phone, digital camera, or other device with image capture functionality) causes an image capture device to capture one or more digital images based on audio input (e.g., a voice command) received by the computing device. For example, a user speaks a word or phrase, and the user's voice is converted to audio input data by the computing device. The computing device compares (e.g., using an audio matching algorithm) the audio input data to an expected voice command associated with an image capture application, and determines whether the word or phrase spoken by the user matches the expected voice command [i.e., processes audio phrase captured by the microphone to determine if the audio phrase comprises the audio phrase that matches the survey phrase]; paragraph 0030, discussing that the mobile device includes a microphone and speaker, along with two proximity sensors, situated below the surface of the mobile device; paragraph 0038, discussing that a user can generate user input that affects camera control and handling of digital image data. The user input can be, for example, voice input…; paragraph 0042, discussing that a voice command can be any word, phrase, or utterance...For example, in an image capture scenario, a user can speak a command while in front of the camera, behind the camera, beside the camera, wearing the camera, or in any other orientation. A voice command can be a default command or a custom command selected by a user…; paragraph 0043, discussing that the mobile device can use audio matching algorithms to compare audio input with expected commands…When audio input is recognized as a sufficient match for an expected command by the mobile device, feedback can be provided to users. Such feedback can provide assurance to users that a voice command was recognized and that mobile device will take a photo momentarily…; paragraph 0051, discussing that if the voice-controlled image capture tool is running on a mobile device in a locked/low-power state, the tool can listen for voice commands and, if a voice command is recognized, the mobile device can generate command events to cause 

Yuxin is directed toward audience measurement systems. Sarin is directed toward tools for voice-controlled image capture operations. Therefore they are deemed to be analogous as they both are directed towards audience measurement systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Yuxin with Sarin because the references are analogous art because they are both directed to solutions for audience measurement, which falls within applicant’s field of endeavor (audience analysis), and because modifying Yuxin to include Sarin’s feature for processing one or more audio phrases captured by the microphone to determine if the one or more audio phrases comprises the audio phrase that matches the survey phrase, in the manner claimed, would serve the motivation of improving the user experience by allowing quick access to image-capture functionality (Sarin at paragraph 0001), or in the pursuit of providing quick access to image-capture functionality based on audio command input received; and further obvious because the claimed invention is merely a combination of old elements, and in the combination each element 

As per claim 5, the Yuxin-Sarin-Soundararajan-Hawkins combination teaches the system of claim 1. Yuxin further teaches a report generating component that generates a report indicating the survey count associated with a survey phrase (paragraph 0033, discussing that the next step in the method is to recognize the collaborative behavior. This step can include aggregating or correlating the results or detecting similarity in gestures to recognize collaborative gestures/behaviors. In this step the system analyzes the input data and computes scores based on various detectors to rank the probability of the appearance of one or more collaborative gestures/behaviors. There are many examples of such collaborative gestures. One type of collaborative gesture can be the raising of hands in the audience, and detecting this gesture using gesture recognition technologies. This can include the raise of one hand or two hands to deliver feedback from the audience regarding the displayed content, or a finger pointing gesture performed by multiple people at the same time. This scenario can be similar to the "majority vote" approach, in which the number of raised hands is counted and considered as the voting behavior [i.e., report indicating the survey count associated with a survey phrase]. It is to be understood that raised hands are just one of many collaborative gestures that can be considered. For example, the detection of raised hands could be considered together with the detection of facial expression and motion detection to obtain a more precise measurement of an audience response to the displayed content. Moreover, the system can be configured to not merely tabulate a simple majority vote. For example, the system can be configured to compute a score or rating, such as a score on a scale of from 0-10, as a rating or measure of the relative quality of the gesture. For example, the score can indicate the strength of positive feedback from the audience…; paragraph 0034, discussing that many other types of gestures can also be aggregated or correlated…; paragraph 0041, discussing that the invention leverages novel 

Claims 7 and 14 recite substantially similar limitations that stand rejected via the art citations and rationale applied to claim 1, as discussed above. Further, as per claims 7 and 14 Yuxin teaches a computer implemented method and a computer program product that provides survey information using image processing, the computer program product comprising a computer readable storage medium having program instructions embodied (paragraph 0015, discussing that the controller can be any type of personal computer, portable computer, or workstation computer that includes a processing unit, a system memory, and a system bus that couples the processing unit to the various components of the computer. The processing unit may include one or more processors, each of which may be in the form of any one of various commercially available processors. Generally, each processor receives instructions and data from a read-only memory and/or a random access memory.  The controller can also include a hard drive, a floppy drive, and CD ROM drive that are connected to the system bus by respective interfaces…).
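
For illustration only, the threshold logic recited in independent claims 1, 7, and 14 (ignoring a discrepancy that does not exceed a threshold; otherwise comparing the first image to a second image from a second angle and adjusting the survey count) may be sketched as follows. This sketch does not characterize any cited reference. The function name reconcile_survey_count, the representation of each image as a set of detected voter positions, and the rule of adjusting the count to the union of both views are all hypothetical assumptions; the claims do not specify how the discrepancy is measured or how the count is adjusted.

```python
from typing import Set


def reconcile_survey_count(
    first_view: Set[str],
    second_view: Set[str],
    discrepancy: int,
    threshold: int,
) -> int:
    """Return the survey count after the assumed threshold check.

    first_view / second_view: hypothetical sets of voter positions detected
    in the first-angle and second-angle images, respectively.
    discrepancy: a measure supplied by the caller (assumed to be precomputed).
    """
    count = len(first_view)

    # Discrepancy does not exceed the threshold: ignore it, keep the count.
    if discrepancy <= threshold:
        return count

    # Secondary level processing (assumed): compare the two views and count
    # every position seen in either image, e.g. votes occluded in one view.
    return len(first_view | second_view)
```

For example, with first_view = {"a1", "a2", "b4"} and second_view = {"a2", "b4", "c7"}, a discrepancy of 1 against a threshold of 2 leaves the count at 3, while a discrepancy of 3 triggers the comparison and yields an adjusted count of 4 under the assumed union rule.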

Claims 8 and 15 recite substantially similar limitations that stand rejected via the art citations and rationale applied to claim 2, as discussed above.


As per claim 9, the Yuxin-Sarin-Soundararajan-Hawkins combination teaches the computer implemented method of claim 7. Yuxin further teaches wherein the information comprises secondary image information from the first image, wherein the secondary image information is not the survey image (paragraph 0033, discussing that the next step in the method is to recognize the collaborative behavior. This step can include aggregating or correlating the results or detecting similarity in gestures to recognize collaborative gestures/behaviors [i.e., information comprises secondary image information from the first image]. In this step the system analyzes the input data and computes scores based on various detectors to rank the probability of the appearance of one or more collaborative gestures/behaviors. There are many examples of such collaborative gestures. One type of collaborative gesture can be the raising of hands in the audience, and detecting this gesture using gesture recognition technologies. This can include the raise of one hand or two hands to deliver feedback from the audience regarding the displayed content…This scenario can be similar to the "majority vote" approach, in which the number of raised hands is counted and considered as the voting behavior. It is to be understood that raised hands are just one of many collaborative gestures that can be considered. For example, the detection of raised hands could be considered together with the detection of facial expression [i.e., the secondary image information is not the survey image] and motion detection to obtain a more precise measurement of an audience response to the displayed content; paragraph 0034, discussing that many other types of gestures can also be aggregated or correlated. For example, facial expressions can be detected to focus on, for example, smiling faces in the audience. The system can focus on all smiling faces as a collaborative gesture at some specific instant. 
The number of smiling faces, the duration of each smiling face, as well as the extent of the smile can be detected using face vision technologies, and the results can then be aggregated to make a decision; paragraph 0024, discussing that the collaborative gestures to be detected are not limited to human hands, but can include the movement of one's head, torso, leg, foot, or other body part, as well as facial expressions. The gestures can also involve physical motions, or change of bodily orientation, such as turning to the left or to the right, or moving towards or away from the display device; paragraphs 0017, 0031).


Claims 11 and 18 recite substantially similar limitations that stand rejected via the art citations and rationale applied to claim 5, as discussed above.

As per claim 13, the Yuxin-Sarin-Soundararajan-Hawkins combination teaches the computer implemented method of claim 7. Although not taught by the Yuxin-Sarin combination, Soundararajan teaches wherein the threshold is set at a value where an adjustment of the survey count would impact an outcome of a survey associated with the survey phrase (paragraph 0041, discussing that the example base metering device of FIG. 1 compares the primary presence information received from the example primary people meter of FIG. 1 with the secondary presence data obtained via the example portable devices to, for example, confirm the accuracy of the data obtained via the primary people meter. In some examples, if there is a discrepancy between the primary presence information gathered via the example primary people meter and the secondary presence information gathered via the portable devices, the example base metering device of FIG. 1 adjusts the primary presence information by removing incorrectly counted individuals from a tracked audience of the media presentation device and/or decreases a count or tally of people in an audience for the media exposure environment. Alternatively, the example base metering device of FIG. 1 adjusts the primary presence information by adding individuals to the tracked audience of the media presentation device and/or increases a count or tally of people in an audience for the media exposure environment; paragraph 0083, discussing that the identity estimator generates a probability or confidence level that any particular one of the persons associated with the house corresponds to the person detected as near the portable device. In some such examples, where the probability of one of the persons exceeds a certain threshold and/or is larger than the probabilities of the other persons the example identity estimator of FIG. 5 identifies that one particular person as the person detected as near the portable device. 

The Yuxin-Sarin combination describes features related to audience measurement. Soundararajan is directed toward methods and apparatus to count people in an audience. Therefore they are deemed to be analogous as they both are directed towards audience measurement systems. It would have been obvious to one of ordinary skill in the art before the 

Claim 16 recites substantially similar limitations that stand rejected via the art citations and rationale applied to claim 9, as discussed above.
Claim 20 recites substantially similar limitations that stand rejected via the art citations and rationale applied to claim 13, as discussed above.

15.	Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Yuxin in view of Sarin, in view of Soundararajan, in view of Hawkins, in further view of Liu, Pub. No.: US 2017/0255820 A1, [hereinafter Liu].

As per claim 3, the Yuxin-Sarin-Soundararajan-Hawkins combination teaches the system of claim 1. Yuxin further teaches wherein the information comprises secondary image information from the first image, wherein the secondary image information is not the survey image and is selected from a group consisting of facial data (paragraph 0033, discussing that the next step in the method is to recognize the collaborative behavior. This step can include aggregating or correlating the results or detecting similarity in gestures to recognize collaborative gestures/behaviors [i.e., information comprises secondary image information from the first image]. In this step the system analyzes the input data and computes scores based on various detectors to rank the probability of the appearance of one or more collaborative gestures/behaviors. There are many examples of such collaborative gestures. One type of collaborative gesture can be the raising of hands in the audience, and detecting this gesture using gesture recognition technologies. This can include the raise of one hand or two hands to deliver feedback from the audience regarding the displayed content…This scenario can be similar to the "majority vote" approach, in which the number of raised hands is counted and considered as the voting behavior. It is to be understood that raised hands are just one of many collaborative gestures that can be considered. For example, the detection of raised hands could be considered together with the detection of facial expression [i.e., the secondary image information is not the survey image and is selected from a group consisting of facial data] and motion detection to obtain a more precise measurement of an audience response to the displayed content; paragraph 0034, discussing that many other types of gestures can also be aggregated or correlated. For example, facial expressions can be detected to focus on, for example, smiling faces in the audience. 
The system can focus on all smiling faces as a collaborative gesture at some specific instant. The number of smiling faces, the duration of each smiling face, as well as the extent of the smile can be detected using face vision technologies, and the results can then be aggregated to make a decision; paragraph 0024, discussing that the collaborative gestures to be detected are not limited to human hands, but can include the movement of one's head, torso, leg, foot, or other body part, as well as facial expressions. The gestures can also involve physical motions, or change of bodily orientation, such as turning to the left or to the right, or moving towards or away from the display device; paragraphs 0017, 0031).
The Yuxin-Sarin-Soundararajan-Hawkins combination does not explicitly teach wherein the secondary image information is selected from a group consisting of clothing data, and jewelry data. However, Liu, in the analogous art of image analysis systems, teaches this concept. Liu teaches:

wherein the secondary image information is selected from a group consisting of clothing data, and jewelry data  (paragraph 0014, discussing that embodiments of the invention use facial recognition technology or software, and some embodiments use appearance recognition technology (ART) or software. As the term is used herein, ART uses not only traditional facial features (relatively unchanging over time, generally used in traditional facial recognition, which can include, for example, characteristics, geometry, and shape of the heads, face, eyes, nose, mouth, cheeks, chin or ears, and/or portions, elements, and/or aspects thereof, and/or distances or relative distances associated with or between such features or elements thereof), but also one or more non-traditional facial features or other appearance characteristics. In some embodiments, such non-traditional features or characteristics can include, for example, among other things, hair style, facial hair or style, makeup, eyeglasses, hair, clothing, pants, shirts, hats, shoes, jewelry, earrings, body height, body size, and/or tattoos; paragraph 0084, discussing that RFID technology, appearance recognition technology (ART), or a combination of both may be used to accurately identify people in photos and videos. Some embodiments relate to accurately identify people taken in photos in various situations, but embodiments can also be applicable to videos and other digital media; paragraph 0099, discussing that appearance recognition technology uses not only traditional facial features, but also one or more non-traditional facial features or other appearance characteristics. In some embodiments, such non-traditional features or characteristics can include, for example, among other things, hair style, facial hair or style, makeup, eyeglasses, hair, clothing, pants, shirts, hats, shoes, jewelry, earrings, body height, body size, and/or tattoos).

16.	Claims 6, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Yuxin in view of Sarin, in view of Soundararajan, in view of Hawkins, in further view of Lee et al., Pub. No.: US 2014/0161305 A1, [hereinafter Lee].

As per claim 6, the Yuxin-Sarin-Soundararajan-Hawkins combination teaches the system of claim 1. Yuxin further teaches wherein the image capturing component controls a second camera to capture the second image of the survey vote (paragraph 0010, discussing that the system includes at least one imaging device (e.g. a camera) pointed at an audience, and a video camera computer, interconnected to the imaging device and configured to run gesture detection and 

wherein the image processing component processes the second image to determine the survey count (paragraph 0017, discussing that the display (and the audio speaker 22) provides visual and/or audio-visual content to the audience. The content can be in the form of survey questions, or any other type of content. This content can include requests, or other types of prompts for audience response in the form of some gesture. These requests or prompts are for group responses. The gesture requested can be a certain motion of the body, including relatively subtle motions, such as a facial gesture or nod of the head, to more obvious gestures, such as raising or waving a hand, or moving the entire body in some way; paragraph 0018, discussing that in FIG. 2 the group 14 has been prompted to raise a hand in response to some query provided by the display system. Whenever the system prompts a group response, several results are possible. Most of the participants are likely to respond in the manner requested and raise one hand, these group members being indicated at 28a. However, some participants may provide an alternate, though potentially qualifying response, such as by raising two hands; paragraph 0020, discussing that vision based gesture detection and recognition have been widely studied in the past decade. Vision-based detection captures an audience's view via image-processing techniques such as background subtraction, silhouette detection, etc., in order to detect and classify specific gestures. For example, audience participants positioned near the imaging device can collectively perform some type of individual gesture (e.g. raise a hand) in response to the content, or they can perform a collaborative gesture; paragraph 0021, discussing that using group gesture detection techniques that have been developed, the image analysis that the system performs can function in at least two basic ways. 
One mode of operation of the system is to measure the level of audience interest as a function of the audience response (e.g. looking for a majority response). For example, the simultaneous raise of many hands in the audience can be detected and regarded as a highly positive feedback to the displayed content. For example, if the prompt provided to the audience in FIG. 2 related to a choice between alternatives for subsequent content, and the system were programmed to provide content based upon a majority vote of the 


While Yuxin teaches controlling the second camera to capture the second image, Yuxin does not explicitly teach that the second camera is controlled to capture the second image from the second angle of the survey vote, in response to detecting the audio phrase that matches the survey phrase; and wherein the image processing component processes the second image to determine the survey count by counting how many times the survey image was matched in the first image and the second image. Sarin, in the analogous art of image capturing applications, teaches:

in response to detecting the audio phrase that matches the survey phrase (paragraph 0003, discussing a computing device (e.g., a smart phone, digital camera, or other device with image capture functionality) causes an image capture device to capture one or more digital images based on audio input received by the computing device. For example, a user speaks a word or phrase, and the user's voice is converted to audio input data by the computing device. The computing device compares (e.g., using an audio matching algorithm) the audio input data to an expected voice command associated with an image capture application, and determines whether the word or phrase spoken by the user matches the expected voice command (i.e., detecting the audio phrase that matches the survey phrase); paragraph 0038, discussing that a user can generate user input that affects camera control and handling of digital image data. The user input can be, for example, voice input…; paragraph 0042, discussing that a voice command can be any word, phrase, or utterance…; paragraph 0043, discussing that the mobile device can use audio matching algorithms to compare audio input with expected commands. When audio input is recognized as a sufficient match for an expected command by the mobile device, feedback can be provided to users. Such feedback can provide assurance to users that a voice command was recognized and that mobile device will take a photo momentarily [i.e., in response to detecting the audio phrase that matches the survey phrase]; paragraph 0051, discussing that if the voice-

Yuxin is directed toward audience measurement systems. Sarin is directed toward tools for voice-controlled image capture operations. Therefore they are deemed to be analogous as they both are directed towards audience measurement systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Yuxin with Sarin because the references are analogous art, both being directed to solutions for audience measurement, which falls within applicant’s field of endeavor (audience analysis), and because modifying Yuxin to include Sarin’s feature of responding to detection of the audio phrase that matches the survey phrase, in the manner claimed, would serve the motivation of improving the user experience by allowing quick access to image-capture functionality (Sarin 

While the Yuxin-Sarin-Soundararajan combination teaches capturing the second image, it does not explicitly teach capturing the second image from the second angle of the survey vote; and wherein the image processing component processes the second image to determine the survey count by counting how many times the survey image was matched in the first image and the second image. Hawkins, in the analogous art of audience participation measurement systems, teaches:

capture the second image from the second angle of the survey vote (paragraph 0090, discussing that although an image is captured of an entire stand 104A-D by each respective camera 106A-D, an alternative for counting the number of placards is that the settings of each camera are set so that only a portion of the stand appears in the image frame. Advantageously, this allows a more detailed (zoomed-in) image of the captured portion of the stand, thus improving the accuracy of placard detection. In this case, each camera is slowly panned across its respective stand and a plurality of images is captured [i.e., panning a camera across its respective stand to take a plurality of images suggests capturing a second image from a second angle] as the camera is panned so that each captured image is an image of a different portion of the stand. The camera panning speed and range and the image capture frequency are predetermined so that every part of the stand is captured in at least one image. This means that all placards held up during the time period over which the camera is panned can be captured in an image and then detected and counted…; paragraph 0096, discussing that the voting results of individual stands in the stadium 

The Yuxin-Sarin-Soundararajan combination describes features related to audience measurement. Hawkins is directed toward an audience participation measurement system. Therefore they are deemed to be analogous as they both are directed towards audience measurement systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuxin-Sarin-Soundararajan combination with Hawkins because the references are analogous art, both being directed to solutions for audience measurement, which falls within applicant’s field of endeavor (audience analysis), and because modifying the Yuxin-Sarin-Soundararajan combination to include Hawkins’s feature for capturing the second image from the second angle of the survey vote, in the manner claimed, would serve the motivation of improving the accuracy of the detection of objects in the captured image (Hawkins at paragraph 0058), or in the pursuit of improving audience participation for a particular event (Hawkins, paragraph 0092); and further obvious because the claimed invention is merely a combination of old elements, and in the combination 

The Yuxin-Sarin-Soundararajan-Hawkins combination does not explicitly teach wherein the image processing component processes the second image to determine the survey count by counting how many times the survey image was matched in the first image and the second image. However, Lee, in the analogous art of audience measurement systems, teaches this concept. Lee teaches:

wherein the image processing component processes the second image to determine the survey count by counting how many times the survey image was matched in the first image and the second image (paragraph 0057, discussing that when the example body recognizer 322 of FIG. 3 detects a body of a person in a particular frame of two-dimensional data, an indication of the recognition is conveyed to the example body counter and/or the example frame database. In the illustrated example of FIG. 3, the example body counter combines the body detections provided by the body recognizer 308 of the three-dimensional data analyzer 300 and the body detections provided by the body recognizer 322 of the two-dimensional data analyzer 302 corresponding to a common frame or frames to generate body counts for the respective frames [i.e.,  the second image is processed to determine the survey count by counting how many times the survey image was matched in the first image and the second image]. In some examples, the body counter 316 also executes one or more filters and/or check functions to avoid counting the same person more than once. Further, the example body recognizer 322 calculates and/or stores a location of the detected person in, for example, the frame database in connection with a timestamp. In some examples, the location of the detected sent by the example body recognizer 322 is an x-y coordinate similar to the example coordinate 402 of FIG. 4; paragraph 0059, discussing that in some examples, the 2D-based recognition analys(es) performed by the 

The Yuxin-Sarin-Soundararajan-Hawkins combination describes features related to audience measurement. Lee is directed toward methods and apparatus to monitor environments. Therefore they are deemed to be analogous as they both are directed towards audience measurement systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuxin-Sarin-Soundararajan-Hawkins combination with Lee because the references are analogous art, both being directed to solutions for audience measurement, which falls within applicant’s field of endeavor (audience analysis), and because modifying the Yuxin-Sarin-Soundararajan-Hawkins combination to include Lee’s feature for processing the second image to determine the survey count by counting how many times the survey image was matched in the first image and the second image, in the manner claimed, would serve the motivation of providing the ability to accurately count a number of people in a particular space or environment at a particular time (Lee at paragraph 0012), or in the pursuit of verifying the tally of participants’ responses, thereby helping researchers to obtain an accurate count of responses; and further obvious because the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.

Claims 12 and 19 recite substantially similar limitations that stand rejected via the art citations and rationale applied to claim 6, as discussed above.



Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
A.	Epshteyn et al., Pub. No.: US 2015/0127340 A1 – describes that it may be desirable to avoid repeated or duplicative counting of any person who is tracked. Accordingly, in some embodiments, a tracking system may store one or more characteristics of a first person who is counted. If a second person is counted sharing one or more of the recorded characteristics of the first person, then the tracking system may assume that it is the same person that was previously counted. Accordingly, the tracking system may not count the second person.
B.	Reiffel, Pub. No.: US 2012/0274775 A1 – describes an audience participation method and apparatus whereby mass audience members who are physically present in a venue can, upon being cued, convey quantitative data regarding their personal real-time individual and collective preferences, choices, opinions, and other personal responses concerning events taking place in the venue.
C.	Seki, Patent No.: US 5,121,201 – describes a method and apparatus for detecting the number of persons.
D.	Trajkovic et al., Patent No.: US 6,633,232 B2 – describes that a person of skill in the art of image analysis would recognize the many different ways of counting individuals and their movement.
E.	Ono, Pub. No.: US 2017/0330025 A1 – describes a face detection process.
F.	Rhoads et al., Pub. No.: US 2015/0016712 A1 – describes that the power of taking multiple pictures beyond just one, and doing so from slightly different angles, will be shown to have large benefits in fast detections and false positive reduction.
G.	Neustaedter et al., Pub. No.: US 2011/0063440 A1 – describes a video communication system providing a real time video communication link between 
H.	Datta, Ritendra, Jia Li, and James Z. Wang. "Algorithmic inferencing of aesthetics and emotion in natural images: An exposition." 2008 15th IEEE international conference on image processing. IEEE, 2008 – describes image analysis techniques. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Darlene Garcia-Guerra whose telephone number is (571) 270-3339. The examiner can normally be reached on M-F 7:30 a.m.-5:00 p.m. EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Brian M. Epstein, can be reached on 571-270-5389. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Darlene Garcia-Guerra/
Examiner, Art Unit 3683