DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks
This action is in response to the amendments received on 12/5/22.  Claims 1-20 are pending in the application.  Applicants' arguments have been carefully and respectfully considered.
Claims 1-20 are rejected under 35 U.S.C. 101.
Claim(s) 1-3, 7-9, 11, 12, 14-16, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Loxam et al. (US 9,064,326), and further in view of Borel et al. (US 2017/0076156).
Claims 4 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Loxam, and further in view of Brown (US 2008/00228749).
Claims 5, 13, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Loxam, and further in view of Ramkumar et al. (US 2017/0116786).
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Loxam, and further in view of Lindley et al. (US 2009/0307261).
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Loxam, and further in view of Cai (US 2014/0285717).

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception without significantly more.
Claim 1 recites a method with steps, and therefore is a process, which is a statutory category of invention.
Step 2A, Prong One asks: Is the claim directed to a law of nature, a natural phenomenon (product of nature) or an abstract idea? See MPEP 2106.04 Part I. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.  See MPEP 2106.04(a).
The limitations of “analyzing at least some of the first real-time video” and “analyzing, by a processor, the first image to identify a first object within the first image”, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. Nothing in the claim elements precludes the steps from practically being performed in the mind. For example, “analyzing” in the context of this claim encompasses the user manually looking at an image.
The limitation of “determining one or more object settings”, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Nothing in the claim element precludes the step from practically being performed in the mind. For example, “determining” in the context of this claim encompasses the user manually observing settings. 
The limitation of “generating an object tag comprising information associated with the first object”, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, “generating” in the context of this claim encompasses the user mentally assigning a tag to an object. 
The limitation of “identifying, by the processor and based upon (i) the second image in the second real-time video captured via the first camera and (ii) the object information determined based upon the first object identified within the first real-time video captured via the first camera, the first object which was identified within the first real-time video, within the second image”, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, “identifying” in the context of this claim encompasses the user mentally identifying if the second image contains the object that was also in the first image.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
	
At Step 2A, Prong Two, this judicial exception is not integrated into a practical application.  Claim 1 recites cameras, a display device, and a processor; however, these elements are recited at a high level of generality, so generically that they amount to no more than mere instructions to apply the exception using generic computer components.  See MPEP 2106.05(f).

Additionally, the claims recite “receiving … real-time video.”  This element does not integrate the abstract idea into a practical application because it does not impose a meaningful limit on the judicial exception and provides only insignificant extra solution activity that is mere data gathering in conjunction with the abstract idea.
The claims recite “storing the object tag and object information.”  This element does not integrate the abstract idea into a practical application because it does not impose a meaningful limit on the judicial exception and provides only insignificant extra solution activity.
The claims recite “displaying, via a display device, a representation of the object tag.”  This element does not integrate the abstract idea into a practical application because it does not impose a meaningful limit on the judicial exception and provides only insignificant extra solution activity as “necessary data gathering and outputting.”  See MPEP 2106.05(g)(3).

	The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element, a display device, amounts to no more than mere instructions to apply an exception using generic computer components.  Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.

	With respect to the “receiving … real-time video” limitations, the courts have found limitations directed towards data gathering to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “receiving or transmitting data over a network.”
With respect to the “storing” limitation, the courts have found limitations directed towards storing, recited at a high level of generality, to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “electronic recordkeeping” and “storing and retrieving information in memory.”
With respect to the “displaying” limitations, the courts have found limitations directed towards data gathering to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “Presenting offers and gathering statistics.”

Considering the additional elements individually and in combination and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. The claim is not patent eligible.

With respect to claims 2 and 15, the limitations are directed to the monitoring which is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, “monitoring” in the context of this claim encompasses the user watching a video for certain things.  Accordingly, the claim further recites the abstract idea.

With respect to claims 3 and 16, the limitations are directed towards receiving an input and provide only insignificant extra solution activity that is mere data gathering in conjunction with the abstract idea.  The courts have found limitations directed towards data gathering to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “receiving or transmitting data over a network.”  Claim 3 recites a client device; however, this element is recited at a high level of generality, so generically that it amounts to no more than mere instructions to apply the exception using a generic computer component.  See MPEP 2106.05(f).

With respect to claims 4 and 17, the limitations are directed towards transcribing input audio which is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, “transcribing” in the context of this claim encompasses the user listening to audio to determine a tag.  Accordingly, the claim further recites the abstract idea.

	With respect to claims 5 and 18, the limitation is directed towards input received, which has been discussed above with respect to the abstract idea and does not amount to significantly more than the above-identified judicial exception.

	With respect to claim 6, the limitation is directed towards wireless communication between a client device and a camera.  McInerny (US 9,706,102) is directed towards transmitting data between a requesting device (camera) and a presentation device (Fig. 1).  McInerny teaches wireless communication and that such wireless communication between devices is well known to those skilled in the art of computer communications (Col. 7 Li. 2-6).

	With respect to claims 7 and 8, the limitation is directed towards further defining the object information, which has been discussed above with respect to the abstract idea and does not amount to significantly more than the above-identified judicial exception.

With respect to claim 9, the limitations are directed towards determining a second location and comparing the second location which is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, “determining” and “comparing” in the context of this claim encompasses the user manually observing the image.  Accordingly, the claim further recites the abstract idea.

With respect to claim 10, the limitations are directed towards “recording” and “comparing” which is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, “recording” and “comparing” in the context of this claim encompasses the user manually recording and listening to audio.  Accordingly, the claim further recites the abstract idea.

With respect to claim 11, the limitations resemble claim 1, with the “receiving”, “analyzing”, and “determining”, reciting an abstract idea, as discussed above.

With respect to claim 12, the limitations are directed towards generating a second object tag, which has been discussed above with respect to the abstract idea and does not amount to significantly more than the above-identified judicial exception.  The claim recites “storing” the object tag and object information.  This element does not integrate the abstract idea into a practical application because it does not impose a meaningful limit on the judicial exception and provides only insignificant extra solution activity that is mere data gathering in conjunction with the abstract idea, as discussed above.  With respect to the “storing” limitation, the courts have found limitations directed towards storing, recited at a high level of generality, to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “electronic recordkeeping” and “storing and retrieving information in memory.”

With respect to claim 13, the limitation is directed towards generating the object tag, which has been discussed above with respect to the abstract idea and does not amount to significantly more than the above-identified judicial exception.

With respect to claim 20, the limitations are directed towards outputting the object tag.  This element does not integrate the abstract idea into a practical application because it does not impose a meaningful limit on the judicial exception and provides only insignificant extra solution activity as “necessary data gathering and outputting.”  See MPEP 2106.05(g)(3).  The courts have found limitations directed towards data gathering to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “Presenting offers and gathering statistics.”

Claim 14 recites a device with a processor and memory, and therefore is a machine which is a statutory category of invention.
Step 2A, Prong One asks: Is the claim directed to a law of nature, a natural phenomenon (product of nature) or an abstract idea? See MPEP 2106.04 Part I. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.  See MPEP 2106.04(a).
Most of claim 14 resembles claim 1, with the “analyzing”, “determining”, “generating”, and “identifying” reciting an abstract idea, as discussed above.
The limitation of “determining, based upon the second image, a location of the first object”, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Nothing in the claim element precludes the step from practically being performed in the mind. For example, “determining” in the context of this claim encompasses the user manually looking at images to determine a location of an object. 

At Step 2A, Prong Two, this judicial exception is not integrated into a practical application.  Claim 14 recites cameras, a processor, and a memory; however, these elements are recited at a high level of generality, so generically that they amount to no more than mere instructions to apply the exception using generic computer components.  See MPEP 2106.05(f).  These limitations can also be viewed as nothing more than an attempt to generally link the use of the judicial exception to the technological environment of a computer.  See MPEP 2106.05(h).
Additionally, the claim recites “receiving … real-time video” and “storing the object tag and object information.”  These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and provide only insignificant extra solution activity that is mere data gathering in conjunction with the abstract idea, as discussed above.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional elements, a processor and a memory, amount to no more than mere instructions to apply an exception using generic computer components.  Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.

	With respect to the “receiving … real-time video” limitations, the courts have found limitations directed towards data gathering to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “receiving or transmitting data over a network.”
With respect to the “storing” limitation, the courts have found limitations directed towards storing, recited at a high level of generality, to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “electronic recordkeeping” and “storing and retrieving information in memory.”
With respect to the “displaying” limitations, the courts have found limitations directed towards data gathering to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “Presenting offers and gathering statistics.”

Considering the additional elements individually and in combination and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. The claim is not patent eligible.

Claim 19 recites a non-transitory machine readable medium, and therefore is a manufacture which is a statutory category of invention.
Step 2A, Prong One asks: Is the claim directed to a law of nature, a natural phenomenon (product of nature) or an abstract idea? See MPEP 2106.04 Part I. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.  See MPEP 2106.04(a).
Most of claim 19 resembles claim 1, with the “analyzing”, “determining”, “generating”, and “identifying” reciting an abstract idea, as discussed above.

At Step 2A, Prong Two, this judicial exception is not integrated into a practical application.  Claim 19 recites cameras; however, these elements are recited at a high level of generality, so generically that they amount to no more than mere instructions to apply the exception using generic computer components.  See MPEP 2106.05(f).  These limitations can also be viewed as nothing more than an attempt to generally link the use of the judicial exception to the technological environment of a computer.  See MPEP 2106.05(h).
The claim recites “receiving … real-time video” and “storing the object tag and object information.”  These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and provide only insignificant extra solution activity that is mere data gathering in conjunction with the abstract idea, as discussed above.

The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element, a speaker, amounts to no more than mere instructions to apply an exception using generic computer components.  Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.

	With respect to the “receiving … real-time video” limitations, the courts have found limitations directed towards data gathering to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “receiving or transmitting data over a network.”
With respect to the “storing” limitation, the courts have found limitations directed towards storing, recited at a high level of generality, to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “electronic recordkeeping” and “storing and retrieving information in memory.”
With respect to the “displaying” limitations, the courts have found limitations directed towards data gathering to be well-understood, routine, and conventional.  See MPEP 2106.05(d)(II), “Presenting offers and gathering statistics.”

Considering the additional elements individually and in combination and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. The claim is not patent eligible.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-3, 7-9, 11, 12, 14-16, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Loxam et al. (US 9,064,326), and further in view of Borel et al. (US 2017/0076156).

With respect to claim 1, Loxam teaches a method, comprising: 
receiving, at a first time, a first real-time video (Loxam, Col. 6 Li. 4-5, The trigger-detection engine 315 detects trigger items in a video stream at the mobile device 300.), comprising a first image, captured via a first camera (Loxam, Fig. 1, video camera 121), wherein the first real-time video comprises a real-time representation of a first view of the first camera (Loxam, Col. 14 Li. 31-34, a user may view the real world, in real time, by looking at a video screen on a mobile handheld device 400A, 400B. Using a camera in the device the area around the user might be filmed and the images can be displayed on the screen of the mobile device 400A, 400B.);
analyzing, by a processor, the first image to identify a first object within the first image (Loxam, Col. 6 Li. 16-19, The trigger-detection engine 315 may analyze the frames of the captured video stream and identify the objects/potential trigger item within each frame of the captured video stream.); 
generating an object tag comprising information associated with the first object (Loxam, Col. 7 Li. 47-56, The trigger-detection engine 315 analyzes each captured frame and then their relation to each other in the video stream. The trigger-detection engine 315 may relate patterns from the series of frames to assist in determining what the potential trigger items are and are they known to the system. The trigger-detection engine 315 will initially try to match the distinct points and objects to those known in the trigger item engine 314. However, trigger-detection engine 315 can also use the backend server to assist in detecting trigger items or in the creation of a new trigger item.; Col. 8 Li. 53-55, The augmentation engine 316 then also allows the user to associate that augmented reality content with at least one trigger item from the trigger item engine & Col. 11 Li. 26-35, For some embodiments, the augment information database 360 stores a master composite of the augmented reality content and any other information from all of the different source depositories that may be inserted into the captured video stream 308. The information may include identification information (e.g., the university), advertisement information (e.g., restaurant discount coupons), link information (e.g., a URL link to the website of a restaurant), facial information (e.g., Bob Smith), etc. Different types of augmented reality information may be stored for the same object.); 
storing the object tag and object information determined based on the first object (Loxam, Col. 3 – Col. 4 Li. 4-5, store in the local cache … (4) all augmented reality content actually generated by this user. & Col. 8 Li. 64-67, The local cache 317 … configured to store augmented reality content and information associated with the known trigger items.);
receiving, at a second time, a second real-time video (Loxam, Col. 13 Li. 1-4, The video capturing module 120 may be configured to capture images or video streams. The video capturing module 120 may be associated with a video camera 121 and may enable a user to capture the images and/or the video streams.) comprising a second image, captured via the first camera (Loxam, Fig. 1, video camera 121), wherein the second real-time video comprises a real-time representation of a second view of the first camera (Loxam, Col. 14 Li. 31-34, a user may view the real world, in real time, by looking at a video screen on a mobile handheld device 400A, 400B. Using a camera in the device the area around the user might be filmed and the images can be displayed on the screen of the mobile device 400A, 400B. Examiner note: Loxam teaches that a user can use their device to capture multiple video streams); 
while monitoring the second real-time video captured via the first camera, identifying, by a processor, based upon (i) the second image in the second real-time video and (ii) the object information determined based upon the first object, which was identified within the first image in the first real-time video, within the second image (Loxam, Col. 8 Li. 20-30, Accordingly, the trigger-detection engine 315 monitoring the video stream from a video camera of the mobile computing device detects the real world trigger item by comparing objects in the video stream to known trigger items stored in 1) a database communicatively connected to the mobile computing device over a network, 2) a local cache in the mobile computing device and 3) any combination of the two. The associated augmented reality content and actions are pulled from 1) a database communicatively connected to the mobile computing device over a network, 2) a local cache in the mobile computing device and 3) any combination of the two.); and
displaying, via a display device, a representation of the object tag overlayed onto a real-time video (Loxam, Col. 8 Li. 31-33, The augmentation engine then overlays the augmented reality content onto the video stream being displayed on a display screen of the mobile computing device.).
Loxam does not expressly disclose analyzing at least some of the first real-time video to determine a context of the first real-time video; determining one or more object settings, indicative of one or more types of objects to be detected, based upon the context of the first real-time video; and analyzing, by a processor and based upon the one or more object settings, the first image to identify a first object within the first image, wherein the first object is of a first type of object of the one or more types of objects.
Borel teaches analyzing at least some of the first real-time video to determine a context of the first real-time video (Borel, pa 0027, determine a “scene” of the video based on location and comparison to other images); 
determining one or more object settings, indicative of one or more types of objects to be detected, based upon the context of the first real-time video (Borel, pa 0028, filtering parameters are applied to set priorities about what is relevant/interesting depending on the determined scene); and 
analyzing, by a processor and based upon the one or more object settings, the first image to identify a first object within the first image, wherein the first object is of a first type of object of the one or more types of objects (Borel, pa 0029, once extraneous motion is eliminated, determine if a human or animal is present in the video).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Loxam with the teachings of Borel because it allows a user to create meaningful summaries (Borel, pa 0082).
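
For illustration only, the sequence of steps recited in claim 1 (context, object settings, identification, tag storage, and later re-identification) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Loxam, Borel, or the application: every name (`CONTEXT_TO_OBJECT_TYPES`, `TagStore`, `tag_first_video`, `tags_for_second_video`) is hypothetical, and a frame is modeled as a list of (label, type) pairs standing in for the output of an unspecified detector.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from a scene context to the object types to detect.
CONTEXT_TO_OBJECT_TYPES = {
    "kitchen": {"mug", "kettle"},
    "street": {"car", "bicycle"},
}


@dataclass
class TagStore:
    """Holds object tags keyed by object information (here, label and type)."""
    tags: dict = field(default_factory=dict)

    def save(self, label, obj_type, tag):
        self.tags[(label, obj_type)] = tag

    def lookup(self, label, obj_type):
        return self.tags.get((label, obj_type))


def object_settings(context):
    """Determine which object types to detect for the given context."""
    return CONTEXT_TO_OBJECT_TYPES.get(context, set())


def identify(frame, settings):
    """Keep only the (label, type) pairs whose type is enabled by the settings."""
    return [(label, t) for (label, t) in frame if t in settings]


def tag_first_video(frame, context, store):
    """First video: identify objects permitted by the context and store a tag for each."""
    for label, obj_type in identify(frame, object_settings(context)):
        store.save(label, obj_type, f"tag:{label}")


def tags_for_second_video(frame, store):
    """Second video: re-identify stored objects and return the tags to overlay."""
    found = []
    for label, obj_type in frame:
        tag = store.lookup(label, obj_type)
        if tag is not None:
            found.append(tag)
    return found
```

In this sketch the context restricts which objects are tagged in the first video, and the stored object information alone drives identification in the second video.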

With respect to claim 2, Loxam in view of Borel teaches the method of claim 1, wherein: the monitoring the second real-time video comprises monitoring the second real-time video based upon the object information (Loxam, Col. 8 Li. 20-26, Accordingly, the trigger-detection engine 315 monitoring the video stream from a video camera of the mobile computing device detects the real world trigger item by comparing objects in the video stream to known trigger items stored in 1) a database communicatively connected to the mobile computing device over a network, 2) a local cache in the mobile computing device and 3) any combination of the two.).

With respect to claim 3, Loxam in view of Borel teaches the method of claim 1, comprising receiving an input via a client device associated with the first camera, wherein the object tag is generated based upon the input (Loxam, Col. 12 Li. 4-18, A user may actively download published augmented reality scenarios from their instance of the augmented reality application 314. …The selected augmented reality scenarios that have been transmitted to the mobile computing device 300 and potentially stored in the local cache 317 are used by the augmentation engine 316 to generate the augmented video stream).

With respect to claim 7, Loxam in view of Borel teaches the method of claim 1, wherein the object information comprises at least one of: the first type of object of the first object; the first image; a third image comprising the first object; a portion of the first image corresponding to the first object; a portion of the third image corresponding to the first object; or one or more visual characteristics of the first object (Loxam, Col. 11 Li. 26-35, For some embodiments, the augment information database 360 stores a master composite of the augmented reality content and any other information from all of the different source depositories that may be inserted into the captured video stream 308. The information may include identification information (e.g., the university), advertisement information (e.g., restaurant discount coupons), link information (e.g., a URL link to the website of a restaurant), facial information (e.g., Bob Smith), etc. Different types of augmented reality information may be stored for the same object.).

With respect to claim 8, Loxam in view of Borel teaches the method of claim 1, wherein the object information comprises at least one of: a location associated with the first object (Loxam, Col. 6 Li. 45-48, the rough geographical information from the GPS reduces the amount of possible trigger items that need to be sorted through as a possible match to known objects in that area.); or audio recorded via a microphone during a time that the first image is captured.

With respect to claim 9, Loxam in view of Borel teaches the method of claim 8, comprising: determining a second location associated with the second image; and comparing the second location with the location associated with the first object to determine a distance between the second location and the location, wherein the identifying the first object within the second image is performed based upon the distance  (Loxam, Col. 6 Li. 45-48, the rough geographical information from the GPS reduces the amount of possible trigger items that need to be sorted through as a possible match to known objects in that area.).
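
For illustration only, the location comparison recited in claim 9 can be sketched as a great-circle distance check that narrows the set of stored objects considered as possible matches. This is a minimal sketch under stated assumptions: the haversine formula, the function names `haversine_m` and `candidates_within`, and the threshold parameter `max_distance_m` are all hypothetical and are not described in Loxam or in the application.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def candidates_within(stored, second_location, max_distance_m):
    """Return names of stored objects whose location lies within max_distance_m
    of the location associated with the second image (hypothetical threshold)."""
    lat, lon = second_location
    return [
        name
        for name, (olat, olon) in stored.items()
        if haversine_m(lat, lon, olat, olon) <= max_distance_m
    ]
```

Only the objects returned by `candidates_within` would then be compared against the second image, so the identification depends on the computed distance.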

With respect to claim 11, Loxam in view of Borel teaches the method of claim 1, comprising:  
receiving, at a third time, a third real-time video (Loxam, Col. 13 Li. 1-4, The video capturing module 120 may be configured to capture images or video streams. The video capturing module 120 may be associated with a video camera 121 and may enable a user to capture the images and/or the video streams.), comprising a third image, captured via the first camera (Loxam, Fig. 1, video camera 121), wherein the third real-time video comprises a real-time representation of a third view of the first camera (Loxam, Col. 14 Li. 31-34, a user may view the real world, in real time, by looking at a video screen on a mobile handheld device 400A, 400B. Using a camera in the device the area around the user might be filmed and the images can be displayed on the screen of the mobile device 400A, 400B. Examiner note: Loxam teaches that a user can use their device to capture multiple video streams); 
analyzing at least some of the third real-time video to determine a second context of the third real-time video, wherein the second context of the third real-time video is different than the context of the first real-time video (Borel, pa 0027, [0027] In addition to determining a location, a more specific determination of a “scene” is done. For example, the location may be a bedroom, while the scene is a sleeping baby. In one embodiment, the user is prompted to label the scene (e.g., as sleeping baby). Alternately, there can be automatic detection of the scene using a neural network or similar application, with comparisons to images of particular scenes, and also comparisons to previously stored images and videos labelled by the user. In addition, various cues are used in one embodiment to determine the type of scene. For example, for a “sleeping baby,” the video may be matched to a baby in bed scene from examination of the video. This is combined with other cues, such as the time of day indicating night time, the camera being in night mode, a microphone detecting sounds associated with sleeping, etc. Similarly, a birthday party can be detected holistically using different cues, including the comparison to birthday party images, motion indicating many individuals, singing (e.g., the song “Happy Birthday”), etc.); 
determining one or more second object settings, indicative of one or more second types of objects to be detected, based upon the second context of the third real-time video, wherein the one or more second object settings are different than the one or more object settings determined in association with the first real-time video and the one or more second types of objects are different than the one or more types of objects (Borel, pa 0028, filtering parameters are applied to set priorities about what is relevant/interesting depending on the determined scene); and 
analyzing, based upon the one or more second object settings, the third image to identify a second object within the third image, wherein the second object is of a second type of object of the one or more second types of objects  (Borel, pa 0029, once extraneous motion is eliminated, determine if a human or animal is present in the video).

With respect to claim 12, Loxam in view of Borel teaches the method of claim 11, comprising:  generating a second object tag comprising information associated with the second object; and storing the second object tag and second object information determined based upon the second object (Loxam, Col. 8 Li. 53-55, The augmentation engine 316 then also allows the user to associate that augmented reality content with at least one trigger item from the trigger item engine & Col. 11 Li. 26-35, For some embodiments, the augment information database 360 stores a master composite of the augmented reality content and any other information from all of the different source depositories that may be inserted into the captured video stream 308.).

With respect to claim 14, Loxam teaches a computing device comprising: 
a processor (Loxam, Col. 3 Li. 35-36); and 
memory comprising processor-executable instructions that when executed by the processor cause performance of operations (Loxam, Col. 3 Li. 36-51), the operations comprising: 
receiving, at a first time, a first real-time video (Loxam, Col. 6 Li. 4-5, The trigger-detection engine 315 detects trigger items in a video stream at the mobile device 300.), comprising a first image, captured via a first camera (Loxam, Fig. 1, video camera 121), wherein the first real-time video comprises a real-time representation of a first view of the first camera (Loxam, Col. 14 Li. 31-34, a user may view the real world, in real time, by looking at a video screen on a mobile handheld device 400A, 400B. Using a camera in the device the area around the user might be filmed and the images can be displayed on the screen of the mobile device 400A, 400B.); 
analyzing the first image to identify a first object within the first image (Loxam, Col. 6 Li. 16-19, The trigger-detection engine 315 may analyze the frames of the captured video stream and identify the objects/potential trigger item within each frame of the captured video stream.); 
generating an object tag comprising information associated with the first object  (Loxam, Col. 7 Li. 47-56, The trigger-detection engine 315 analyzes each captured frame and then their relation to each other in the video stream. The trigger-detection engine 315 may relate patterns from the series of frames to assist in determining what the potential trigger items are and are they known to the system. The trigger-detection engine 315 will initially try to match the distinct points and objects to those known in the trigger item engine 314. However, trigger-detection engine 315 can also use the backend server to assist in detecting trigger items or in the creation of a new trigger item., Col. 8 Li. 53-55, The augmentation engine 316 then also allows the user to associate that augmented reality content with at least one trigger item from the trigger item engine & Col. 11 Li. 26-35, For some embodiments, the augment information database 360 stores a master composite of the augmented reality content and any other information from all of the different source depositories that may be inserted into the captured video stream 308. The information may include identification information ( e.g., the university), advertisement information ( e.g., restaurant discount coupons), link information (e.g., a URL link to the website of a restaurant), facial information ( e.g., Bob Smith), etc. Different types of augmented reality information may be stored for the same object.); 
storing the object tag and object information determined based upon the first object (Loxam, Col. 3 – Col. 4 Li. 4-5, store in the local cache … (4) all augmented reality content actually generated by this user. & Col. 8 Li. 64-67, The local cache 317 … configured to store augmented reality content and information associated with the known trigger items.);
receiving, at a second time, a second real-time video (Loxam, Col. 13 Li. 1-4, The video capturing module 120 may be configured to capture images or video streams. The video capturing module 120 may be associated with a video camera 121 and may enable a user to capture the images and/or the video streams.) comprising a second image, captured via the first camera (Loxam, Fig. 1, video camera 121), wherein the second real-time video comprises a real-time representation of a second view of the first camera (Loxam, Col. 14 Li. 31-34, a user may view the real world, in real time, by looking at a video screen on a mobile handheld device 400A, 400B. Using a camera in the device the area around the user might be filmed and the images can be displayed on the screen of the mobile device 400A, 400B. Examiner note: Loxam teaches that a user can use their device to capture multiple video streams); 
identifying, based upon (i) the second image in the second real-time video captured via the first camera and (ii) the object information based upon the first object identified within the first real-time video captured via the first camera, the first object, which was identified within the first image in the first real-time video, within the second image  (Loxam, Col. 8 Li. 20-30, Accordingly, the trigger-detection engine 315 monitoring the video stream from a video camera of the mobile computing device detects the real world trigger item by comparing objects in the video stream to known trigger items stored in 1) a database communicatively connected to the mobile computing device over a network, 2) a local cache in the mobile computing device and 3) any combination of the two. The associated augmented reality content and actions are pulled from 1) a database communicatively connected to the mobile computing device over a network, 2) a local cache in the mobile computing device and 3) any combination of the two.); and
determining, based upon the second image, a location of the first object (Loxam, Col. 6 Li. 33-48, Combining the visual information and the metadata of an image or object, such as geographical information, may allow a rapid recognition or matching to the characteristics of objects that are known and pre-stored in an object database 342. The geographical information may be provided by a global positioning system (GPS) built-into the mobile computing device. Combining the visual information with the metadata of an image or object generally reduces the amount of possible trigger items that need to be sorted through by the object recognition engine 320 and trigger-detection engine 315 to identify and recognize known objects and/or persons. For example, the rough geographical information from the GPS reduces the amount of possible trigger items that need to be sorted through as a possible match to known objects in that area.); and
displaying a representation of the location overlayed onto a real-time video (Loxam, Col. 8 Li. 31-33, The augmentation engine then overlays the augmented reality content onto the video stream being displayed on a display screen of the mobile computing device.).	
Loxam doesn't expressly discuss analyzing at least some of the first real-time video to determine a context of the first real-time video; determining one or more object settings, indicative of one or more types of objects to be detected, based upon the context of the first real-time video; and analyzing, by a processor and based upon the one or more object settings, the first image to identify a first object within the first image, wherein the first object is of a first type of object of the one or more types of objects.
Borel teaches analyzing at least some of the first real-time video to determine a context of the first real-time video (Borel, pa 0027, determine a “scene” of the video based on location and comparison to other images); 
determining one or more object settings, indicative of one or more types of objects to be detected, based upon the context of the first real-time video (Borel, pa 0028, filtering parameters are applied to set priorities about what is relevant/interesting depending on the determined scene); and 
analyzing, by a processor and based upon the one or more object settings, the first image to identify a first object within the first image, wherein the first object is of a first type of object of the one or more types of objects (Borel, pa 0029, once extraneous motion is eliminated, determine if a human or animal is present in the video).
It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Loxam with the teachings of Borel because it allows a user to create meaningful summaries (Borel, pa 0082).

With respect to claim 15, Loxam in view of Borel teaches the computing device of claim 14, wherein: the identifying the first object is performed while monitoring the second real-time video based upon the object information (Loxam, Col. 10 Li. 50-57, Note, similarly augmentation engine 316 can be preparing and selecting augmented reality content to be overlaid onto the video frames while the trigger item identification is going on. Note, the local cache 317 assists in performance in that it maintains a large portion of the augmented reality content most relevant to this user on the mobile device eliminating the need to transmit augmented reality content.).

With respect to claim 16, Loxam in view of Borel teaches the computing device of claim 14, the operations comprising receiving an input via a client device associated with the first camera, wherein the object tag is generated based upon the input (Loxam, Col. 12 Li. 4-18, A user may actively download published augmented reality scenarios from their instance of the augmented reality application 314. …The selected augmented reality scenarios that have been transmitted to the mobile computing device 300 and potentially stored in the local cache 317 are used by the augmentation engine 316 to generate the augmented video stream).

With respect to claim 19, the limitations are essentially the same as claim 1, in the form of a non-transitory machine readable medium having stored thereon processor-executable instructions that when executed cause performance of operations, and they are rejected for the same reasons.

With respect to claim 20, Loxam in view of Borel teaches the non-transitory machine readable medium of claim 19, wherein the providing the representation comprises at least one of:
outputting, via a speaker, an audio message indicative of the object tag; or
displaying, via a display device, a graphical representation of the object tag (Loxam, Col. 8 Li. 31-33, The augmentation engine then overlays the augmented reality content onto the video stream being displayed on a display screen of the mobile computing device.).

Claims 4 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Loxam in view of Borel, and further in view of Brown (US 2008/00228749).

With respect to claim 4, Loxam in view of Borel teaches the method of claim 3, as discussed above.  Loxam in view of Borel doesn't expressly discuss wherein the input corresponds to an audio recording received via a microphone associated with the client device, the method comprising transcribing the audio recording to generate the object tag.
Brown teaches wherein the input corresponds to an audio recording received via a microphone associated with the client device (Brown, pa 0035, As indicated above, the audio data may be provided by the author. For example, the author can make a recording to be posted on the Internet, and use the disclosed architecture to suggest folksonomically appropriate tags for the recording. & pa 0075, A user can enter commands and information into the computer 1502 through one or more wired/wireless input devices… Other input devices (not shown) may include a microphone), the method comprising transcribing the audio recording to generate the object tag (Brown, pa 0036-0037, transcribing audio into text to generate tags for the data).
It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Loxam in view of Borel to have included the teachings of Brown because it can provide meaningful tags to content (Brown, pa 0007).

With respect to claim 17, the limitations are essentially the same as claim 4, and are thus rejected for the same reasons.

Claims 5, 13, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Loxam in view of Borel, and further in view of Ramkumar et al. (US 2017/0116786).

With respect to claim 5, Loxam in view of Borel teaches the method of claim 3, as discussed above. Loxam in view of Borel doesn't expressly discuss wherein the input corresponds to a text-input received via the client device.  
Ramkumar teaches wherein the input corresponds to a text-input received via the client device (Ramkumar, pa 0046, If the product is not recognized, at block 912, a user is allowed to add a tag to a product, for example, manually add a definition or description of the product.).
	It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Loxam in view of Borel with the teachings of Ramkumar because it assists in future recognition (Ramkumar, pa 0038).

With respect to claim 13, Loxam in view of Borel teaches the method of claim 1, as discussed above. Loxam in view of Borel doesn't expressly discuss wherein the generating the object tag is performed responsive to receiving a request to generate the object tag.
Ramkumar teaches wherein the generating the object tag is performed responsive to receiving a request to generate the object tag (Ramkumar, Fig. 4 step 402, pa 0031, the user is pointing a camera associated with the user device at a particular object (e.g., a book on a bookshelf) and a frame with the object appears in the camera view & pa 0046, The process 900 begins at block 902 where a product is identified as described above in reference to FIGS. 3-5…. If the product is not recognized, at block 912, a user is allowed to add a tag to a product, for example, manually add a definition or description of the product.).
It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Loxam in view of Borel with the teachings of Ramkumar because it assists in future recognition (Ramkumar, pa 0038).

With respect to claim 18, the limitations are essentially the same as claim 5, and are thus rejected for the same reasons.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Loxam in view of Borel, and further in view of Lindley et al. (US 2009/0307261).

With respect to claim 6, Loxam in view of Borel teaches the method of claim 3, as discussed above.  Loxam in view of Borel doesn't expressly discuss wherein the client device is wirelessly connected to the first camera.
Lindley teaches wherein the client device is wirelessly connected to the first camera (Lindley, pa 0024, The camera 105 can transfer data such as media objects and associated metadata to a computer 110 over a communication link 115. The communication link 115 can be wireless).
It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Loxam in view of Borel with the teachings of Lindley because devices such as digital cameras, camcorders, television cameras, or mobile phones can produce multiple media objects, and labeling media objects with metadata tags can be beneficial for future access of the media objects (Lindley, pa 0020-0021).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Loxam in view of Borel, and further in view of Cai (US 2014/0285717).

With respect to claim 10, Loxam in view of Borel teaches the method of claim 8, as discussed above.  Loxam in view of Borel doesn't expressly discuss recording second audio via the microphone during a time that the second image is captured; and comparing the second audio with the audio to determine an audio similarity between the second audio and the audio, wherein the identifying the first object within the second image is performed based upon the audio similarity.
Cai teaches recording second audio via the microphone during a time that the second image is captured; and comparing the second audio with the audio to determine an audio similarity between the second audio and the audio, wherein the identifying the first object within the second image is performed based upon the audio similarity (Cai, pa 0065, identifying objects in video may include comparing image or audio data in the video against confirmed audio data).
	It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Loxam in view of Borel with the teachings of Cai because it can indicate an object detected in the video (Cai, pa 0029).

Response to Arguments
Rejections under 35 U.S.C. 101 
	Applicant argues that the 35 U.S.C. 101 rejection is overcome because of the amendments to the independent claims.  The Examiner respectfully disagrees.  Claim 12 was identified as providing limitations beyond a mental process; however, it recited the action of “overlaying,” which requires more than merely “displaying.”  Claims 1 and 14 recite “displaying” data; even though the data is overlayed onto a video, this is still basic output and does not impose a meaningful limit on the judicial exception.  It provides only insignificant extra-solution activity as “necessary data gathering and outputting.”  See MPEP 2106.05(g)(3).  Claim 19 still merely outputs data by “providing” the tag and video.  Therefore, the 35 U.S.C. 101 rejection is maintained.

Rejections under 35 U.S.C. 103
Applicant's arguments are directed to a newly amended limitation.  Applicant’s amendment has rendered the previous rejection moot.  Upon further consideration of the amendment, a new ground of rejection is made in view of Borel et al. (US 2017/0076156).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRITTANY N ALLEN whose telephone number is (571)270-3566.  The examiner can normally be reached on M-F 9 am - 5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at 571-272-4046.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/BRITTANY N ALLEN/           Primary Examiner, Art Unit 2169