DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1 thru 20 have been examined.
Claim Objections
Claim 20 is objected to because of the following informalities:  In lines 1 and 2, the phrase "The non-transitory computer-readable medium of claim 16," is repeated.  One of the duplicate phrases should be deleted.  Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 4 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 4 recites “an operator” in line 2, while claim 1 also recites “an operator” (line 3).  It is unclear if this is a new operator (of the augmented reality device) or the same operator (of the augmented reality device).  The examiner assumes it is the same operator for continued examination.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1 thru 5, 8, 9, 12, 14, 16 and 20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 2, 6, 8, 9, 13 thru 16 and 20 of U.S. Patent No. 11,011,055 B2. Although the claims at issue are not identical, they are not patentably distinct from each other because the pending claims recite broader limitations, present the limitations in a different order, and include fewer limitations than the patented claims.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2 and 5 thru 20 are rejected under 35 U.S.C. 103 as being unpatentable over Eledath et al Patent Application Publication Number 2016/0378861 A1 in view of Ozaki et al Patent Application Publication Number 2013/0286206 A1.
Regarding claims 1, 12 and 16 Eledath et al teach the claimed method, a method for vision-based human-machine interaction (Figure 3), and the claimed device, the vision-based user interface platform 132 (Figure 2), and the claimed non-transitory computer readable medium storing instructions, “In general, as used herein, “module,” “subsystem,” “service” and similar terminology may refer to computer code, instructions, and/or electronic circuitry, which may be embodied in a non-transitory computer accessible medium such as memory, data storage, and/or processor hardware.” P[0116], comprising:
the claimed instructions executed by processors, “Embodiments in accordance with the disclosure may be implemented in hardware, firmware, software, or any combination thereof. Embodiments may also be implemented as instructions stored using one or more machine-readable media, which may be read and executed by one or more processors.” P[0161], to cause:
the claimed processor(s), the scene understanding services 220 include a preemptive local processing module 222, an on-demand local processing module 224, and an on-demand cloud processing module 226, and the knowledge base service 236 includes active context processor 238 and knowledge base processor 244 (P[0117] and Figure 2), configured to:
the claimed providing instructions for an operator of the augmented reality device to collect video data associated with a real world site of interest from a first device, “system 110 is instructed by the user (e.g., by natural language speech dialog) to capture the license plate of the vehicle” P[0045], “A camera 114 acquires images (e.g., video 122) of the real world scene 100.” P[0097], the signals from the camera 114 are provided to the computing device 130 and processed, and then provided to the augmented view 140 of the real world scene 100 of display device 138 (Figure 1) (display device 138 equates to claimed augmented reality device), and “If at block 324 the system 110 determines to output a virtual element (e.g., a graphical overlay) on the scene 100, the system 110 proceeds to block 324. At block 324, the system 110 selects virtual element(s) 142 (e.g., an augmented reality overlay) that represent a portion of the stored knowledge correlated with visual feature(s) of the scene 100, in accordance with the system 110's interpretation of the user input at block 314. At block 326, the system 110 displays the virtual element(s) selected at block 324 on the view of the scene. In doing so, the system 110 may align the virtual element with the corresponding visual feature in the scene so that the virtual element directly overlays or is adjacent to the view of the visual feature.” (P[0140] and Figure 3), and
the claimed provide feedback on the real world site of interest, “The system 110 extracts from the dialog “man” “on left” and “gray shirt” and extracts from the image that portion of the image that depicts the face of the man on the left in the gray shirt. This is shown in the image 904 by the bounding box surrounding the man's face. The text box overlaid on the image 904 indicates that the system 110 provides feedback to let the user know that the user's inquiry has been received and is being processed. In this case, the feedback is visual, in the form of the bounding box surrounding the man's face. While difficult to see in the image 904, a text label is also overlaid below the bounding box, indicating “face detected . . . ”. The text box overlaid on image 906 provides additional feedback to the user to indicate that an information retrieval process has been initiated to identify the face within the bounding box (using, e.g., a facial recognition algorithm)” (P[0111] and Figure 9);
the claimed receiving a portion of the video data captured by a camera of the augmented reality device and contextual data relating to the real world site of interest by the first device, “In block 332, the system 110 may provide output (e.g., virtual element overlays and/or NL output) to one or more other applications/services (e.g., applications/services 134), by one or more display services 250, for example. In block 334, the system 110 may provide output (e.g., virtual element overlays and/or NL output) to one or more other applications/services (e.g., messaging, mapping, travel, social media), by one or more collaboration services 258, for example.” (P[0141] and Figure 3), and “At block 326, the system 110 displays the virtual element(s) selected at block 324 on the view of the scene. In doing so, the system 110 may align the virtual element with the corresponding visual feature in the scene so that the virtual element directly overlays or is adjacent to the view of the visual feature.” P[0140] (claimed contextual data);
the claimed processing the video or contextual data to determine a movement analytic of an object associated with the real world site of interest, classification of the object, or spatial attribute of the object, “The overlays, e.g., virtual elements 1302, 1304, can be animated in some embodiments, e.g., with visual routing that is dynamically updated in response to the user's movement progressing along the route. The scene understanding features of the system 110 allow the system 110 to automatically observe user actions and state of objects in the scene 1300, and provide feedback and warnings as needed.” P[0114], “Motion analysis technology is used to detect movers, identify flow patterns of traffic, crowds and individuals, and detect motion pattern anomalies to identify salient image regions for the user to focus attention.” P[0067], “A scene-understanding server (e.g., scene understanding services 220) provides interfaces to modules that recognize classes and specific instances of objects (vehicles, people etc.)” P[0049], “The real world scene 100 includes a person 104 and one or more visual features 1 to N (where N is a positive integer), and where multiple visual features 1, N may have relationships with one another that are discovered through use of the system 110. Such relationships may include, … spatial relationships” P[0096], and “the vehicle graphical overlay 1206 on the real world scene 1202 identifies a vehicle in the scene (from which the user can view certain characteristics of the vehicle, such as color or make/model) as well as it's spatial location within the scene 1202, including surrounding people and objects” P[0113]; and 
the claimed performing an action by the first device based on the processing, “the preemptive processing can respond to changes in the active context (as evidenced by, e.g., observations 240, 242 and/or user intent) by proactively offering AR-enabled suggestions and notifications at the mobile device” P[0118].
Eledath et al do not explicitly teach the claimed real world site of interest includes a roadway, but do teach that vehicles may be the viewed objects (P[0045] and P[0046]).  Vehicles would typically drive on roads, which would be included in the view of the camera.  Ozaki et al teach an augmented reality view of vehicles that includes a view of the road (Figure 6).  It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method for vision-based human-machine interaction of real world scenes and augmented reality of Eledath et al with the augmented reality view that includes the road of Ozaki et al in order to enable a user who sees a vehicle-mounted apparatus to instinctively understand a posted location of a posted data set (Ozaki et al P[0013]).
Regarding claim 2 Eledath et al teach the claimed generating a data collection workflow comprises instructions based on generating the data collection workflow, “At block 312, the system 110 analyzes video depicting a real world scene, extracts semantic elements from the visual scene, and generates a semantic understanding of the visual scene. To do this, the system 110 executes one or more computer vision algorithms, including object detection algorithms, scene recognition and localization algorithms, and/or occlusion reasoning algorithms. As used herein, “semantic element” may refer to a tag or label, such as a metatag, which describes a visual feature of the scene (e.g., an object or activity name or type, category, or class).” (P[0137] and Figure 3), “At block 314, the system 110 interprets user input.” P[0138], and “At block 316, the system 110 determines what to do in response to the user interaction interpreted at block 314 and the visual scene interpreted at block 312.” P[0139], the method of Figure 3 equates to the claimed data collection workflow.
Regarding claim 5 Eledath et al teach the claimed portion of the video data captured by the camera is based on a field of view of the camera, “For example, if the camera 114 is supported by the person 104 (e.g., as a component of a wearable or body-mounted device), the location/orientation data 126 provides information (e.g., head tracking navigation) to allow the system 110 to detect and respond to the person's movements, which can change the field of view of the camera 114. As used herein, “field of view” (FOV) may refer to, among other things, the extent of the observable real world that is visible through the lens of the camera 114 at any given moment in time. The field of view may depend on, for example, the particular position and spatial orientation of the camera 114, the focal length of the camera lens (which may be variable, in some embodiments), the size of the optical sensor, and/or other factors, at any given time instance. Objects that are outside a camera's FOV at the time that the video 122 is recorded will not be depicted in the video 122.” P[0098].
Regarding claim 6 Eledath et al teach the claimed video data is pre-processed to remove spurious noise, remove specific features, apply a filter, or insert a timestamp, “Regarding the scene understanding services 220, the preemptive local processing module 222 enables local processing, e.g., on a mobile device. The processing is preemptive (or proactive) in that it does not need to be initiated by a user cue. In other words, the preemptive processing can respond to changes in the active context (as evidenced by, e.g., observations 240, 242 and/or user intent) by proactively offering AR-enabled suggestions and notifications at the mobile device.” P[0118] (claimed pre-processing), “the system 110 addresses the problem of on-the-fly adaptation of an “information aperture” to enable dynamic tasking. Guided by user directives, the reasoning module (e.g., subsystem 230) provides a dynamic information aperture into the knowledge filtered by the user context.” P[0051] (claimed filter to improve image quality, and the claimed remove specific features).
Regarding claim 7 Eledath et al teach the claimed video data includes spatial data provided by the augmented reality device, “The real world scene 100 includes a person 104 and one or more visual features 1 to N (where N is a positive integer), and where multiple visual features 1, N may have relationships with one another that are discovered through use of the system 110. Such relationships may include, … spatial relationships” P[0096], and “the vehicle graphical overlay 1206 on the real world scene 1202 identifies a vehicle in the scene (from which the user can view certain characteristics of the vehicle, such as color or make/model) as well as it's spatial location within the scene 1202, including surrounding people and objects” (P[0113] and Figure 12).
Regarding claim 8 Eledath et al teach the claimed contextual data includes audio, textual or gesture-based inputs, “User-borne sensing includes auditory, visual, gestural inputs as well as sensing from user carried appliances (such as cellular signal trackers).” P[0061], the virtual elements 508, 510, 512, 514, 516, 518, 520, and 522 of the display in Figure 5 includes text (ref# 508) P[0104].
Regarding claim 9 Eledath et al teach the claimed movement analytic includes information indicative of a location, traveling speed, or traveling direction of the object, “One or more location/orientation sensors 118 acquire location/orientation data 126 in order to spatially align or “register” the video 122 with the real world scene 100 so that object detection and/or object recognition algorithms and other computer vision techniques can determine an understanding of the real world scene 100 from the point of view of the user.” P[0098], and “the modules 1402, 1404 receive and analyze the video inputs provided by the user's camera device, apply one or more computer vision algorithms to the video inputs to extract visual features, such as people and objects, and search the database 1406 for information about the extracted visual feature (e.g., geographic location, person or object identification, etc.)” P[0125].
Regarding claim 10 Eledath et al teach the claimed video or contextual data determines movement analytic involves a computer-vision technique, a feature detection technique, or a 3D object technique, “One or more location/orientation sensors 118 acquire location/orientation data 126 in order to spatially align or “register” the video 122 with the real world scene 100 so that object detection and/or object recognition algorithms and other computer vision techniques can determine an understanding of the real world scene 100 from the point of view of the user.” P[0098].
Regarding claim 11 Eledath et al teach the claimed method of claim 1 (see above), further comprising:
the claimed determining a pixel area for the object in the video data, “Box 804 explains that the graphical overlay 822 is placed on the image 800 at a location (e.g., x, y pixel coordinates) that corresponds to a person whose identity is known as a result of integration of the AR functionality with back-end services and stored knowledge” (P[0107] and Figure 8); and
the claimed determining that the pixel area matches a predicted pixel area for a particular type of object in the video data, “Box 802 explains that the graphical overlay 820 is placed on the image 800 at a location (e.g., x, y pixel coordinates) that corresponds to a building whose geographic location is known. By selecting the overlay (e.g., by speech or tapping on the overlay graphic 820), the user can obtain additional information about the location.” (P[0107] and Figure 8).
Regarding claim 13 Eledath et al teach the claimed device is located at an edge of a network, “the system 110 is instructed to provide a wider, peripheral coverage of the site for vehicles that match the provided descriptions and also to watch for unusual events” P[0046].  A device at the edge of a network would function the same as a device at the center of a network; in either case, the device is connected to the network.  The location of the device within the network therefore does not patentably distinguish the claim.
Regarding claim 14 Eledath et al teach the claimed perform an action based on processing of video or contextual data, “the preemptive processing can respond to changes in the active context (as evidenced by, e.g., observations 240, 242 and/or user intent) by proactively offering AR-enabled suggestions and notifications at the mobile device” P[0118].
Regarding claim 15 Eledath et al teach the claimed provide the movement analytic, the classification, or the spatial attribute to a component, of a movement analytic platform to perform an action, “The overlays, e.g., virtual elements 1302, 1304, can be animated in some embodiments, e.g., with visual routing that is dynamically updated in response to the user's movement progressing along the route. The scene understanding features of the system 110 allow the system 110 to automatically observe user actions and state of objects in the scene 1300, and provide feedback and warnings as needed.” P[0114], “Motion analysis technology is used to detect movers, identify flow patterns of traffic, crowds and individuals, and detect motion pattern anomalies to identify salient image regions for the user to focus attention.” P[0067], and “A scene-understanding server (e.g., scene understanding services 220) provides interfaces to modules that recognize classes and specific instances of objects (vehicles, people etc.)” P[0049].
Regarding claim 17 Eledath et al teach the claimed send to the augmented reality device additional instruction for providing additional contextual data for the video data, “interaction between the user and the system 110 is achieved by augmenting the user's sight and sound with additional information, interfaces and personalization” P[0044], “The DIA module 230 evaluates mission goals and available computational resources to determine if it should autonomously initiate background processes to mine peripheral information. The initiated processes support both data corroboration to verify new data and data collaboration where additional relevant information is generated around new data.” P[0077], and “Box 802 explains that the graphical overlay 820 is placed on the image 800 at a location (e.g., x, y pixel coordinates) that corresponds to a building whose geographic location is known. By selecting the overlay (e.g., by speech or tapping on the overlay graphic 820), the user can obtain additional information about the location.” P[0107].
Regarding claim 18 Eledath et al teach the claimed send to a user device or a vehicle at the real world site of interest a message providing information relating to the real world site of interest, “the preemptive processing can respond to changes in the active context (as evidenced by, e.g., observations 240, 242 and/or user intent) by proactively offering AR-enabled suggestions and notifications at the mobile device” P[0118], and “The multimodal group chat services 262 employ interactive messaging (e.g., Internet relay chat or IRC) technology to enable users of the system 110 to share virtual elements with one another in a live, real time communication environment.” P[0122].
Regarding claim 19 Eledath et al teach the claimed provide to a user interface at a client device the video data, contextual data, the movement analytic, the classification, or the spatial attribute, “The overlays, e.g., virtual elements 1302, 1304, can be animated in some embodiments, e.g., with visual routing that is dynamically updated in response to the user's movement progressing along the route. The scene understanding features of the system 110 allow the system 110 to automatically observe user actions and state of objects in the scene 1300, and provide feedback and warnings as needed.” P[0114], “Motion analysis technology is used to detect movers, identify flow patterns of traffic, crowds and individuals, and detect motion pattern anomalies to identify salient image regions for the user to focus attention.” P[0067], and “A scene-understanding server (e.g., scene understanding services 220) provides interfaces to modules that recognize classes and specific instances of objects (vehicles, people etc.)” P[0049]. Also see the display device 138 of Figure 1.
Regarding claim 20 Eledath et al teach the claimed generate a data collection workflow for the operator of the augmented reality device including instructions, “At block 312, the system 110 analyzes video depicting a real world scene, extracts semantic elements from the visual scene, and generates a semantic understanding of the visual scene. To do this, the system 110 executes one or more computer vision algorithms, including object detection algorithms, scene recognition and localization algorithms, and/or occlusion reasoning algorithms. As used herein, “semantic element” may refer to a tag or label, such as a metatag, which describes a visual feature of the scene (e.g., an object or activity name or type, category, or class).” (P[0137] and Figure 3), “At block 314, the system 110 interprets user input.” P[0138], and “At block 316, the system 110 determines what to do in response to the user interaction interpreted at block 314 and the visual scene interpreted at block 312.” P[0139], the method of Figure 3 equates to the claimed data collection workflow.
Allowable Subject Matter
Claim 3 would be allowable if rewritten to overcome the rejection(s) under double patenting, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
Claim 4 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, and under double patenting, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  The reason for indicating allowable subject matter over the prior art of record is based on the combined limitations of claims 1 and 3 (and of claims 1 and 4).  The reason for indicating allowable subject matter is also the same as the reason for allowance recited in the parent application 16/360584 office action of 2/1/2021.  More specifically, the claimed instructions provided to the augmented reality device as augmented reality content (claim 3) and the claimed instructions indicating an area of the site of interest that the operator of the augmented reality device is to position within the field of view of the camera (claim 4) are the limitations related to the reason for allowance in the parent application.  These limitations are equated to the previously claimed additional content on the augmented reality second device related to the object in the video data to provide instructions for collecting the video data from the site (16/360584 claim 1).
Relevant Art
The examiner further notes Kuznetsov et al PGPub 2017/0263014 A1 for relevance to the claimed method, see the method of Figure 4.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DALE W HILGENDORF whose telephone number is (571)272-9635. The examiner can normally be reached Monday - Friday 9-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jelani Smith, can be reached at 571-270-3969. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DALE W HILGENDORF/Primary Examiner, Art Unit 3662