DETAILED ACTION
Status of Claims 
Applicant’s Amendment filed on 07/11/2022 has been considered.
Claim 15 is cancelled.
Claims 1, 3-4, 6, 8-14 and 16-20 are currently pending and have been examined.

Response to Amendment
Applicant’s amendment, filed 07/11/2022, has been entered. Claims 1, 10, 13 and 16 have been amended.
Claim Objections
The claim objections have been withdrawn pursuant to Applicant’s amendments.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 07/11/2022 has been entered.


Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 4, 6, 8-9, 13-14 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Meek et al. (US 2015/0026156 A1), as previously cited and hereinafter Meek, in view of Adato et al. (US 2019/0213212 A1), as previously cited and hereinafter Adato, in further view of Powers et al. (US 2018/0374276 A1), as previously cited and hereinafter Powers.
Regarding claim 1, Meek discloses a method (i.e. abstract) comprising:
	-capturing a video stream of a staging area and products for sale presented therein, the capturing of the video stream including automatically receiving the video stream (Meek, see at least: [0053], [0041] and [0054] - “FIG. 3A is an illustrative block diagram of an embodiment of the capture and use of images of in-store inventory [i.e. capturing video stream of a staging area and products for sale presented therein]. A person 305 at the store 209, who may or may not be affiliated with the store 209” and “The user may then be able to browse the shelves [i.e. video stream of a staging area and products for sale] of the stores online to see what sort of products are available in the store” and “the imaging device 303 is a video camera [i.e. capturing video stream]…the movement and triggering of the imaging device is controlled by an automatic system that can sequence zoom, pan, start and stop video [i.e. the capturing of the video stream including automatically receiving the video stream]”); 
-automatically comparing by a computer processor a scene represented in the captured video stream with the scene represented in a prior captured video stream to identify the presence of a significant change between the video streams (Meek, see at least: [0070] and [0098] - “the inventory image data 316 can be updated from time to time [i.e. with the scene represented in a prior captured video stream]. When the inventory image data 316 is updated, it may not be necessary to update the product mapping function 402 for every image segment 314. For example, FIG. 4D shows an update of the inventory image data from 316a to 316c, where the image segment 314f becomes 314g, but the product does not change, and the image segment 314c becomes 314e, and the product does change. The system described with respect to FIG. 4B, for automatically creating the data for the product mapping function 402, can align the two inventory images 316a and 316c [i.e. automatically comparing by a computer processor a scene represented in the captured video stream], and discover an image match between 314f and 314g. As a result, the two segments 314f and 314g can map to the same product 403a, and it may be that no additional work is needed to identify segment 314g. At the same time, the system may discover an image mismatch between the image segments 314c and 314e, so it can leave 314c mapping to product 403b, but may initiate a new automatic product mapping process on the image segment 314e [i.e. with the scene represented in a prior captured video stream to identify the presence of a significant change between the video streams], as described with respect to FIG. 4B, which may result in identifying a different product 403c to use for the mapping of the image segment 314e” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. video streams] as previously described with respect to FIG. 3B”); 
-when the comparing reveals a significant change (Meek, see at least: [0070] - “At the same time, the system may discover an image mismatch between the image segments 314c and 314e [i.e. when the comparing reveals a significant change], so it can leave 314c mapping to product 403b, but may initiate a new automatic product mapping process on the image segment 314e, as described with respect to FIG. 4B, which may result in identifying a different product 403c to use for the mapping of the image segment 314e”): 
-processing the video stream with a product identifying module including image recognition based on stored representations of visual characteristics of the products to obtain data identifying the products, product information, and location within the video stream of the products included in the video stream (Meek, see at least: [0061], [0065] and [0098] - “The database 205 used by the system can contain both image data 405 of the inventory 301 in the store 209, and also structured product data 406 associated with the products 302 that are in the inventory 301 [i.e. including image recognition based on stored representations of visual characteristics of the products]…an item 403 of the structured product data 406 can contain an image 404 a of the product, the name 404 b of the product, the supplier 404 c of the product, the retail price 404 d of the product, and other data as indicated by the ellipsis. A product mapping module 402 can take an image 316 of the in-store inventory 301, and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point [i.e. location within the video stream of the products included in the video stream], and map the inventory image 316 and point 401 to structured product data 403 [i.e. product information] that represents the product corresponding to the image segment [i.e. processing the video stream with a product identifying module to obtain data identifying the products] that is at point 401 in the inventory image 316” and “image segment 314 can be matched against product images in a product image database 420 using an image-driven search…A good match may be used to identify the product in the product image database 420, and then use the product data from the product image database 420 to provide the structured data 403 about the product [i.e. including image recognition based on stored representations of visual characteristics of the products]” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. processing the video stream] as previously described with respect to FIG. 3B”), the processing of the video stream including: 
-providing the video stream to an object recognition service that identifies the products located within the video stream and the location of each identified product within the video stream (Meek, see at least: [0061] and [0098] - “A product mapping module 402 can take an image 316 of the in-store inventory 301 [i.e. providing the video stream to an object recognition service], and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point, and map the inventory image 316 and point 401 to structured product data 403 that represents the product corresponding to the image segment that is at point 401 [i.e. that identifies the products located within the video stream and the location of each identified product within the video stream] in the inventory image 316” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. providing the video stream] as previously described with respect to FIG. 3B”); 
-receiving from the object recognition service, product identifiers of the identified products and location data identifying the location of each respective product (Meek, see at least: [0061] and [0063] - “A product mapping module 402 [i.e. the object recognition service] can take an image 316 of the in-store inventory 301, and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point [i.e. receiving from the object recognition service, location data identifying the location of each respective product], and map the inventory image 316 and point 401 to structured product data 403 that represents the product corresponding to the image segment that is at point 401 in the inventory image 316” and “The inventory image 316 can be segmented into smaller image segments 314, each one representing a single product 302. The image segment 314 may contain text 411 that can be located and recognized by an Optical Character Recognition (OCR) module 412 [i.e. receiving from the object recognition service, product identifiers of the identified products]. The OCR module 412 can convert the characters into machine-readable text 413, which can then be used to search a product database 414. Data from the product database 414 can then be used to provide the structured data 403 about the product”); 
-retrieving product information for each identified product from a product database (Meek, see at least: [0063] - “The inventory image 316 can be segmented into smaller image segments 314, each one representing a single product 302. The image segment 314 may contain text 411 that can be located and recognized by an Optical Character Recognition (OCR) module 412. The OCR module 412 can convert the characters into machine-readable text 413, which can then be used to search a product database 414. Data from the product database 414 can then be used to provide the structured data 403 about the product [i.e. retrieving product information for each identified product from a product database]”); and 
-augmenting data of the video stream for each identified product therein with the product identifier, the location data identifying the location of the product in the video stream, and the retrieved product information including adding text to the video stream that is visible when the video stream is presented on a display, and including at least a portion of the retrieved product information (Meek, see at least: [0061], [0071], and [0098] - “A product mapping module 402 can take an image 316 of the in-store inventory 301, and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point, and map the inventory image 316 and point 401 to structured product data 403 [i.e. augmenting data of the video stream for each identified product therein with the product identifier, the location data identifying the location of the product in the video stream, and the retrieved product information] that represents the product corresponding to the image segment that is at point 401 in the inventory image 316” and “The customer 206 can view products in the image such as 503…The sidebar shows contact information 504 and descriptive information 505 about the store 209, and lists structured information about the products 506 that are indicated by the product mapping module 402 to be in the visible part of the inventory image data 316 [i.e. including adding text to the video stream that is visible when the video stream is presented on a display and including at least a portion of the retrieved product information]” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. video stream] as previously described with respect to FIG. 3B”); 
-storing the video stream and the obtained data in a location where the video stream and the obtained data can be accessed in response to requests received via a network (Meek, see at least: [0061], [0071], [0041] and [0098] - “The database 205 used by the system can contain both image data 405 of the inventory 301 in the store 209, and also structured product data 406 associated with the products 302 that are in the inventory 301 [i.e. storing the video stream and the obtained data in a location where the video stream and the obtained data can be accessed]” and “FIG. 5A is an illustrative example of an embodiment of a user interface for providing a customer with a retail browsing experience online [i.e. the obtained data can be accessed in response to requests received via a network]” and “user may not be looking for a specific product, but may search for general product terms [i.e. accessed in response to requests received via a network] to find certain types of stores in the area. The user may then be able to browse the shelves of the stores online to see what sort of products are available in the store” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. video stream] as previously described with respect to FIG. 3B”).

Meek does not explicitly disclose the capturing of the video stream including receiving the video stream from a stationarily mounted camera and retrieving and transmitting the video stream with the obtained data to a requestor over a network in response to a request to provide the requestor a view of available product in a near-real time manner within an online shopping session.
Adato, however, teaches a system for processing images captured in a retail store (i.e. abstract), including the known technique of capturing of a video stream including receiving the video stream from a stationarily mounted camera (Adato, see at least: [0150] and [0115] - “the image data representative of products displayed on store shelves may be acquired by a plurality of stationary capturing devices 125 fixedly mounted [i.e. capturing of an image including receiving the video stream from a stationarily mounted camera] in the retail store” and “Examples of capturing devices may include, a digital camera, a time-of-flight camera, a stereo camera, an active stereo camera, a depth camera, a Lidar system, a laser scanner, CCD based devices, or any other sensor based system capable of converting received light into electric signals [i.e. stationarily mounted camera]…the image data may include pixel data streams, digital images, digital video streams [i.e. receiving the video stream]”); and
retrieving and transmitting the video stream with the obtained data to a requestor over a network in response to a request to provide the requestor a view of available product in a near-real time manner within an online shopping session (Adato, see at least: [0225], [0233], [0115] and Fig. 11E - “FIGS. 11A-11E illustrate example outputs based on data automatically derived from machine processing and analysis of images captured in retail store 105…And FIG. 11E illustrates optional outputs for user 120” and “the near real-time display of retail store 105 may be presented to the online customer [i.e. within an online shopping session] in a manner enabling easy virtual navigation [i.e. to a requestor over a network] in retail store 105…as shown in FIG. 11E, GUI 1150 may include a first display area 1152 [i.e. retrieving and transmitting the video stream] for showing the near real-time display [i.e. to provide the requestor a view of available product in a near-real time manner] and a second display area 1154 for showing a product list including products identified in the near real-time display [i.e. with the obtained data]…upon selecting the “bakery” tab. GUI 1150 may present a near real-time display of the bakery of retail store 105…Server 135 may be configured to update the near real-time display and the product list…after identifying a selection of arrow 1158B [i.e. in response to a request], server 135 may present a different section of the dairy department and may update the product list accordingly” and “the image data may include pixel data streams, digital images, digital video streams [i.e. receiving the video stream]”). This known technique is applicable to the method of Meek as they both share characteristics and capabilities, namely, they are directed to a system for processing images captured in a retail store.
It would have been recognized that applying the known techniques of capturing of a video stream including receiving the video stream from a stationarily mounted camera, as taught by Adato, to the teachings of Meek would have yielded predictable results because the level of ordinary skill in the art demonstrated by the references applied shows the ability to incorporate such references into similar methods. Further, adding the modification of capturing of a video stream including receiving the video stream from a stationarily mounted camera, as taught by Adato, into the method of Meek would have been recognized by those of ordinary skill in the art as resulting in an improved method that would allow for the detection of products in the back of a shelf while still using less power and fewer processing cycles (Adato, [0201]).
Additionally, it would have been obvious to one of ordinary skill in the art to include in the method, as taught by Meek, retrieving and transmitting the video stream with the obtained data to a requestor over a network in response to a request to provide the requestor a view of available product in a near-real time manner within an online shopping session, as taught by Adato, since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. It further would have been obvious to one of ordinary skill in the art at the time of filing to modify Meek to include the teachings of Adato, in order to allow for the detection of products in the back of a shelf while still using less power and fewer processing cycles (Adato, [0201]).

Meek in view of Adato does not explicitly teach the text being added relative to a location of the product in the video stream.
Powers, however, teaches annotation data associated with particular objects and/or viewpoints of image data (i.e. abstract), including the known technique of augmenting the data of the video stream includes adding text to the video stream that is visible when the video stream is presented on a display, the text added relative to a location of the product in the video stream (Powers, see at least: [0094] and [0039] - “The annotation 610 may include textual content associated with pillow 618  [i.e. augmenting the data]. For instance, the textual content may include product information (e.g., make, model, price, availability) associated with the pillow 618 [i.e. includes adding text to the video stream that is visible when the video stream is presented on a display]” and “the system may be configured to receive image or video data [i.e. augmenting the data of the video stream]” Examiner notes that the annotations 602-610 in Fig. 6 are relative to a location of the product [i.e. the text added relative to a location of the product in the video stream]). This known technique is applicable to the method of Meek in view of Adato as they both share characteristics and capabilities, namely, they are directed to annotation data associated with particular objects and/or viewpoints of image data.
	It would have been recognized that applying the known techniques of augmenting the data of the video stream includes adding text to the video stream that is visible when the video stream is presented on a display, the text added relative to a location of the product in the video stream, as taught by Powers, to the teachings of Meek in view of Adato would have yielded predictable results because the level of ordinary skill in the art demonstrated by the references applied shows the ability to incorporate such references into similar methods. Further, adding the modification of augmenting the data of the video stream includes adding text to the video stream that is visible when the video stream is presented on a display, the text added relative to a location of the product in the video stream, as taught by Powers, into the method of Meek in view of Adato would have been recognized by those of ordinary skill in the art as resulting in an improved method that would allow for the maintenance of an updated environment (Powers, [0221]).

Regarding claim 4, the combination of Meek/Adato/Powers teaches the method of claim 1. Meek further discloses:
	-wherein augmenting the data of the video stream includes adding the product identifier, the location data identifying the location of the product in the video stream, and the retrieved product information as metadata to the video stream (Meek, see at least: [0061] and [0071] - “A product mapping module 402 can take an image 316 of the in-store inventory 301, and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point, and map the inventory image 316 and point 401 to structured product data 403 that represents the product [i.e. augmenting the data of the video stream includes adding the product identifier and the retrieved product information as metadata to the video stream] corresponding to the image segment that is at point 401 [i.e. augmenting the data of the video stream includes adding the location data identifying the location of the product in the video stream] in the inventory image 316” and “The customer 206 can view products in the image such as 503…The sidebar shows contact information 504 and descriptive information 505 about the store 209, and lists structured information about the products 506 that are indicated by the product mapping module 402 to be in the visible part of the inventory image data 316 [i.e. augmenting data of the video stream]” see also [0098] for ‘video stream’).

Regarding claim 6, the combination of Meek/Adato/Powers teaches the method of claim 1. Meek further discloses:
	-wherein the text added to the video stream includes a product name and a price (Meek, see at least: [0071] - “FIG. 5A is an illustrative example of an embodiment of a user interface for providing a customer with a retail browsing experience online…The sidebar shows contact information 504 and descriptive information 505 about the store 209, and lists structured information about the products 506 that are indicated by the product mapping module 402 to be in the visible part of the inventory image data 316 [i.e. text added to the video stream]” Examiner notes that the sidebar of Fig. 5A indicates the names and prices of the products in the image; see also [0098] for ‘video stream’).

Regarding claim 8, the combination of Meek/Adato/Powers teaches the method of claim 1. Meek further discloses:
-wherein the staging area is a shelf in a grocery store (Meek, see at least: [0041] and  [0060] - “The user may then be able to browse the shelves of the stores online to see what sort of products are available in the store” and “One example of this might be at a farmers' market [i.e. in a grocery store], where a customer 206 might be able to check at the end of winter which vendors participated in the previous year”).

Regarding claim 9, the combination of Meek/Adato/Powers teaches the method of claim 1. Meek further discloses:
-wherein capturing the video stream includes receiving the video stream via a network from a stationary video device (Meek, see at least: [0054] - “In yet another embodiment, the imaging device 303 is a still-image camera on a pan/tilt/zoom mount [i.e. capturing the video stream includes receiving the video stream via a network from a stationary video device]” see also [0098] for ‘video stream’).

Regarding claim 13, Meek in view of Adato teaches the method of claim 10. Meek further discloses:
-wherein augmenting the data of the video stream includes adding text to the video stream that is visible when the video stream is presented on a display and including at least a portion of the retrieve product information (Meek, see at least: [0071] - “FIG. 5A is an illustrative example of an embodiment of a user interface for providing a customer with a retail browsing experience online. The interface 501 presents several sections. The view 502 of the store 209 shows the inventory image data 316 for the in-store inventory 301, and provides pan and zoom controls to allow the customer 206 to simulate walking through the store 209 as previously described with respect to FIG. 3B…The sidebar shows contact information 504 and descriptive information 505 about the store 209, and lists structured information about the products 506 [i.e. augmenting the data of the video stream includes adding text to the video stream that is visible when the video stream is presented on a display and including at least a portion of the retrieve product information] that are indicated by the product mapping module 402 to be in the visible part of the inventory image data 316” see also [0098] for ‘video stream’).
Meek in view of Adato does not explicitly teach the text being added relative to a location of the product in the video stream.
Powers, however, teaches annotation data associated with particular objects and/or viewpoints of image data (i.e. abstract), including the known technique of augmenting the data of the video stream includes adding text to the video stream that is visible when the video stream is presented on a display, the text added relative to a location of the product in the video stream (Powers, see at least: [0094] and [0039] - “The annotation 610 may include textual content associated with pillow 618 [i.e. augmenting the data]. For instance, the textual content may include product information (e.g., make, model, price, availability) associated with the pillow 618 [i.e. includes adding text to the video stream that is visible when the video stream is presented on a display]” and “the system may be configured to receive image or video data [i.e. augmenting the data of the video stream]” Examiner notes that the annotations 602-610 in Fig. 6 are relative to a location of the product [i.e. the text added relative to a location of the product in the video stream]). This known technique is applicable to the method of Meek in view of Adato as they both share characteristics and capabilities, namely, they are directed to annotation data associated with particular objects and/or viewpoints of image data.
It would have been recognized that applying the known techniques of augmenting the data of the video stream includes adding text to the video stream that is visible when the video stream is presented on a display, the text added relative to a location of the product in the video stream, as taught by Powers, to the teachings of Meek in view of Adato would have yielded predictable results because the level of ordinary skill in the art demonstrated by the references applied shows the ability to incorporate such references into similar methods. Further, adding the modification of augmenting the data of the video stream includes adding text to the video stream that is visible when the video stream is presented on a display, the text added relative to a location of the product in the video stream, as taught by Powers, into the method of Meek in view of Adato would have been recognized by those of ordinary skill in the art as resulting in an improved method that would allow for the maintenance of an updated environment (Powers, [0221]).

Regarding claim 14, the combination of Meek/Adato/Powers teaches the method of claim 13. Meek further discloses:
-wherein the text added to the video stream includes a price (Meek, see at least: [0071] - “FIG. 5A is an illustrative example of an embodiment of a user interface for providing a customer with a retail browsing experience online…The sidebar shows contact information 504 and descriptive information 505 about the store 209, and lists structured information about the products 506 that are indicated by the product mapping module 402 to be in the visible part of the inventory image data 316 [i.e. text added to the video stream]” Examiner notes that the sidebar of Fig. 5A indicates the prices of the products in the image; see also [0098] for ‘video stream’).

Regarding claim 18, Meek in view of Adato teaches the system of claim 16. Meek further discloses:
-wherein the data processing activities are performed on a periodic basis (Meek, see at least: [0070] - “FIG. 4D is an illustrative block diagram of an embodiment of mapping segments of images of in-store inventory to product data over a period of time. As previously illustrated in FIG. 3C, the inventory image data 316 can be updated from time to time”).
Meek in view of Adato does not explicitly disclose the data processing activities being performed on a time-scheduled basis.
Powers, however, teaches annotation data associated with particular objects and/or viewpoints of image data (i.e. abstract), including the known technique of data processing activities being performed on a time-scheduled basis (Powers, see at least: [0221] - “the user 3210 may periodically or regularly capture image data [i.e. data processing activities are performed on a periodic basis] 2314 of the physical environment 2310 such that the spatial interaction system 2300 may maintain update object models 2318 and/or a shared 3D model/environment 2320 of the physical environment 3210”). This known technique is applicable to the system of Meek in view of Adato as they both share characteristics and capabilities, namely, they are directed to annotation data associated with particular objects and/or viewpoints of image data.
It would have been recognized that applying the known techniques of data processing activities being performed on a time-scheduled basis, as taught by Powers, to the teachings of Meek in view of Adato would have yielded predictable results because the level of ordinary skill in the art demonstrated by the references applied shows the ability to incorporate such references into similar systems. Further, adding data processing activities being performed on a time-scheduled basis, as taught by Powers, into the system of Meek in view of Adato would have been recognized by those of ordinary skill in the art as resulting in an improved system that would allow for the maintenance of an updated environment (Powers, [0221]).

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Meek in view of Adato, in further view of Powers, in further view of Tang et al. (US 2019/0272425 A1), as previously cited and hereinafter Tang.
Regarding claim 3, the combination of Meek/Adato/Powers teaches the method of claim 1.
The combination of Meek/Adato/Powers does not explicitly disclose the object recognition service including a deep neural network object recognition model built and maintained from training images of the products to be recognized through use of the model.
		Tang, however, teaches an object recognition service including a deep neural network object recognition model built and maintained from training images of products to be recognized through use of the model (Tang, see at least: [0029] - “Once the image has been captured, and in some embodiments, after it has undergone some pre-processing as mentioned above, attributes or features of the scene, such as objects, surfaces, and spaces, be determined from the image data through various models including various computer-vision and image processing techniques and processes…the neural network can be trained using images from a catalog that include metadata, description, classification, or other data that can be used to identify various objects and object features [i.e. built and maintained from training images of products to be recognized through use of the model].  For example, in some embodiments, localization can then be performed to determine the relevant region of the scene associated with an object (including spaces or surfaces) of interest.  In some embodiments, a conventional training process can be used with the deep neural network [i.e. the object recognition service includes a deep neural network object recognition model]”).
		It would have been obvious to one of ordinary skill in the art to include in the method, as taught by the combination of Meek/Adato/Powers, an object recognition service including a deep neural network object recognition model built and maintained from training images of products to be recognized through use of the model, as taught by Tang, since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. It further would have been obvious to one of ordinary skill in the art at the time of filing to modify the combination of Meek/Adato/Powers, to include the teachings of Tang, in order to allow a user to capture an image of the product and submit the captured image to an object recognition system to obtain information associated with the product of interest or find visually similar products (Tang, [0001]).

Claims 10, 12, 16 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Meek, in view of Adato.
Regarding claim 10, Meek discloses a method (i.e. abstract) comprising: 
-capturing a video stream of products offered for sale located within a store, the capturing of the video stream including receiving the image from a remotely operable camera operated according to a demand of a requesting customer (Meek, see at least: [0053], [0041], [0072] and [0054] - “FIG. 3A is an illustrative block diagram of an embodiment of the capture and use of images of in-store inventory [i.e. capturing a video stream of products offered for sale located within a store]. A person 305 at the store 209, who may or may not be affiliated with the store 209…The customer 206 may use a device 207 to access the customer interface 203. In response, the customer interface 203 may use the images from the database 205 to present a display 306 of the in-store inventory 301 to the customer 206 [i.e. the capturing of the video stream including receiving the video stream from a camera operated according to a demand of a requesting customer]” and “The user may then be able to browse the shelves of the stores online to see what sort of products are available in the store [i.e. video stream of products]” and “The interface 510 allows the customer 206 to enter keywords 511 for search [i.e. according to a demand of a requesting customer], and a location 512, which can be filled in with a default value, and click on the Search button 513 to start a search” and “the imaging device 303 is a video camera [i.e. capturing video stream]…the movement and triggering of the imaging device is controlled by an automatic system that can sequence zoom, pan, start and stop video [i.e. the capturing of the video stream including receiving the image from a remotely operable camera]”); 
-automatically comparing by a computer processor a scene represented in the captured video stream with the scene represented in a prior captured video stream to identify the presence of a significant change between the video streams (Meek, see at least: [0070] and [0098] - “the inventory image data 316 can be updated from time to time [i.e. with the scene represented in a prior captured video stream]. When the inventory image data 316 is updated, it may not be necessary to update the product mapping function 402 for every image segment 314. For example, FIG. 4D shows an update of the inventory image data from 316a to 316c, where the image segment 314f becomes 314g, but the product does not change, and the image segment 314c becomes 314e, and the product does change. The system described with respect to FIG. 4B, for automatically creating the data for the product mapping function 402, can align the two inventory images 316a and 316c [i.e. automatically comparing by a computer processor a scene represented in the captured video stream], and discover an image match between 314f and 314g. As a result, the two segments 314f and 314g can map to the same product 403a, and it may be that no additional work is needed to identify segment 314g. At the same time, the system may discover an image mismatch between the image segments 314c and 314e, so it can leave 314c mapping to product 403b, but may initiate a new automatic product mapping process on the image segment 314e [i.e. with the scene represented in a prior captured video stream to identify the presence of a significant change between the video streams], as described with respect to FIG. 4B, which may result in identifying a different product 403c to use for the mapping of the image segment 314e” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. video streams] as previously described with respect to FIG. 3B”);
-when the comparing reveals a significant change (Meek, see at least: [0070] - “At the same time, the system may discover an image mismatch between the image segments 314c and 314e [i.e. when the comparing reveals a significant change], so it can leave 314c mapping to product 403b, but may initiate a new automatic product mapping process on the image segment 314e, as described with respect to FIG. 4B, which may result in identifying a different product 403c to use for the mapping of the image segment 314e”): 
-providing the video stream to an object recognition service that identifies the products located within the video stream based on stored representations of visual characteristics of the products and a location of each identified product within the video stream  (Meek, see at least: [0061], [0065] and [0098] - “The database 205 used by the system can contain both image data 405 of the inventory 301 in the store 209, and also structured product data 406 associated with the products 302 that are in the inventory 301 [i.e. based on stored representations of visual characteristics of products]…A product mapping module 402 can take an image 316 of the in-store inventory 301 [i.e. providing the video stream to an object recognition service], and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point, and map the inventory image 316 and point 401 to structured product data 403 that represents the product corresponding to the image segment that is at point 401 [i.e. that identifies the products located within the video stream and the location of each identified product within the video stream] in the inventory image 316” and “image segment 314 can be matched against product images in a product image database 420 using an image-driven search…A good match may be used to identify the product in the product image database 420, and then use the product data from the product image database 420 to provide the structured data 403 about the product [i.e. based on stored representations of visual characteristics of the products]” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. providing the video stream] as previously described with respect to FIG. 3B”); 
-receiving from the object recognition service, product identifiers of the identified products and location data identifying the location of each respective product (Meek, see at least: [0061] and  [0063] - “A product mapping module 402 [i.e. the object recognition service] can take an image 316 of the in-store inventory 301, and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point [i.e. receiving from the object recognition service, location data identifying the location of each respective product], and map the inventory image 316 and point 401 to structured product data 403 that represents the product corresponding to the image segment that is at point 401 in the inventory image 316” and “The inventory image 316 can be segmented into smaller image segments 314, each one representing a single product 302. The image segment 314 may contain text 411 that can be located and recognized by an Optical Character Recognition (OCR) module 412 [i.e. receiving from the object recognition service, product identifiers of the identified products]. The OCR module 412 can convert the characters into machine-readable text 413, which can then be used to search a product database 414. Data from the product database 414 can then be used to provide the structured data 403 about the product”); 
-retrieving product information for each identified product from a product database (Meek, see at least: [0063] - “The inventory image 316 can be segmented into smaller image segments 314, each one representing a single product 302. The image segment 314 may contain text 411 that can be located and recognized by an Optical Character Recognition (OCR) module 412. The OCR module 412 can convert the characters into machine-readable text 413, which can then be used to search a product database 414. Data from the product database 414 can then be used to provide the structured data 403 about the product [i.e. retrieving product information for each identified product from a product database]”); 
-augmenting data of the video stream for each identified product therein with the product identifier, the location data identifying the location of the product in the video stream, and the retrieved product information (Meek, see at least: [0061], [0071], and [0098] - “A product mapping module 402 can take an image 316 of the in-store inventory 301, and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point, and map the inventory image 316 and point 401 to structured product data 403 [i.e. augmenting data of the video stream for each identified product therein with the product identifier, the location data identifying the location of the product in the video stream, and the retrieved product information] that represents the product corresponding to the image segment that is at point 401 in the inventory image 316” and “The customer 206 can view products in the image such as 503…The sidebar shows contact information 504 and descriptive information 505 about the store 209, and lists structured information about the products 506 that are indicated by the product mapping module 402 to be in the visible part of the inventory image data 316 [i.e. augmenting data of the video stream for each identified product therein]” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. the video stream] as previously described with respect to FIG. 3B”); 
-storing the video stream with the augmented data in a location where the video stream and the augmented data can be accessed in response to requests received via a network (Meek, see at least: [0061], [0071], [0041], and [0098] - “The database 205 used by the system can contain both image data 405 of the inventory 301 in the store 209, and also structured product data 406 associated with the products 302 that are in the inventory 301 [i.e. storing the video stream with the augmented data in a location where the video stream and augmented data can be accessed]” and “FIG. 5A is an illustrative example of an embodiment of a user interface for providing a customer with a retail browsing experience online [i.e. the augmented data can be accessed in response to requests received via a network]” and “user may not be looking for a specific product, but may search for general product terms [i.e. accessed in response to requests received via a network] to find certain types of stores in the area. The user may then be able to browse the shelves of the stores online to see what sort of products are available in the store” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. the video stream] as previously described with respect to FIG. 3B”).

		Meek does not explicitly disclose retrieving and transmitting the video stream with the obtained data to a requestor over a network in response to a request to provide the requestor a view of available product in a near-real time manner within an online shopping session.
		Adato, however, teaches retrieving and transmitting the video stream with the obtained data to a requestor over a network in response to a request to provide the requestor a view of available product in a near-real time manner within an online shopping session (Adato, see at least: [0225], [0233], [0115] and Fig. 11E - “FIGS. 11A-11E illustrate example outputs based on data automatically derived from machine processing and analysis of images captured in retail store 105…And FIG. 11E illustrates optional outputs for user 120” and “the near real-time display of retail store 105 may be presented to the online customer [i.e. within an online shopping session] in a manner enabling easy virtual navigation [i.e. to a requestor over a network] in retail store 105…as shown in FIG. 11E, GUI 1150 may include a first display area 1152 [i.e. retrieving and transmitting the video stream] for showing the near real-time display [i.e. to provide the requestor a view of available product in a near-real time manner] and a second display area 1154 for showing a product list including products identified in the near real-time display [i.e. with the obtained data]…upon selecting the “bakery” tab. GUI 1150 may present a near real-time display of the bakery of retail store 105…Server 135 may be configured to update the near real-time display and the product list…after identifying a selection of arrow 1158B [i.e. in response to a request], server 135 may present a different section of the dairy department and may update the product list accordingly” and “the image data may include pixel data streams, digital images, digital video streams [i.e. receiving the video stream]”).
		It would have been obvious to one of ordinary skill in the art to include in the method, as taught by Meek, retrieving and transmitting the video stream with the obtained data to a requestor over a network in response to a request to provide the requestor a view of available product in a near-real time manner within an online shopping session, as taught by Adato, since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. It further would have been obvious to one of ordinary skill in the art at the time of filing to modify Meek, to include the teachings of Adato, in order to allow for the detection of products in the back of a shelf while still using less power and fewer processing cycles (Adato, [0201]).

Regarding claim 12, Meek in view of Adato teaches the method of claim 10. Meek further discloses:
-wherein augmenting the data of the video stream includes adding the product identifier, the location data identifying the location of the product in the video stream, and the retrieved product information as metadata to the video stream (Meek, see at least: [0061] and [0071] - “A product mapping module 402 can take an image 316 of the in-store inventory 301, and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point, and map the inventory image 316 and point 401 to structured product data 403 that represents the product [i.e. augmenting the data of the video stream includes adding the product identifier and the retrieved product information as metadata to the video stream] corresponding to the image segment that is at point 401 [i.e. augmenting the data of the video stream includes adding the location data identifying the location of the product in the video stream] in the inventory image 316” and “The customer 206 can view products in the image such as 503…The sidebar shows contact information 504 and descriptive information 505 about the store 209, and lists structured information about the products 506 that are indicated by the product mapping module 402 to be in the visible part of the inventory image data 316 [i.e. augmenting data of the video stream]” see also [0098] for ‘video stream’).

Regarding claim 16, Meek discloses a system (i.e. abstract) comprising:
-a network interface device, a processor, and a memory storing instructions executable by the processor to cause the system to perform data processing activities (Meek, see at least: [0049] - “the system and its components may include a bus or other communication component for communicating information and a processor or processing circuit coupled to the bus for processing information. The hardware elements can also include one or more processors or processing circuits coupled to the bus for processing information. The system also includes main memory”) comprising: 
-receiving, via the network interface device, a video stream including products offered for sale located within a store (Meek, see at least: [0053], [0041], and [0098] - “FIG. 3A is an illustrative block diagram of an embodiment of the capture and use of images of in-store inventory [i.e. receiving, via the network interface device, a video stream including products offered for sale located within a store]. A person 305 at the store 209, who may or may not be affiliated with the store 209” and “The user may then be able to browse the shelves [i.e. a video stream including products offered for sale located within a store] of the stores online to see what sort of products are available in the store” and “The device 801 can capture video imagery of a product 302 [i.e. a video stream including products] as indicated by 803”);
-automatically comparing by the processor a scene represented in the received video stream with the scene represented in a prior received video stream to identify the presence of a significant change between the video streams (Meek, see at least: [0070] and [0098] - “the inventory image data 316 can be updated from time to time [i.e. with the scene represented in a prior captured video stream]. When the inventory image data 316 is updated, it may not be necessary to update the product mapping function 402 for every image segment 314. For example, FIG. 4D shows an update of the inventory image data from 316a to 316c, where the image segment 314f becomes 314g, but the product does not change, and the image segment 314c becomes 314e, and the product does change. The system described with respect to FIG. 4B, for automatically creating the data for the product mapping function 402, can align the two inventory images 316a and 316c [i.e. automatically comparing by the processor a scene represented in the received video stream], and discover an image match between 314f and 314g. As a result, the two segments 314f and 314g can map to the same product 403a, and it may be that no additional work is needed to identify segment 314g. At the same time, the system may discover an image mismatch between the image segments 314c and 314e, so it can leave 314c mapping to product 403b, but may initiate a new automatic product mapping process on the image segment 314e [i.e. with the scene represented in a prior received video stream to identify the presence of a significant change between the video streams], as described with respect to FIG. 4B, which may result in identifying a different product 403c to use for the mapping of the image segment 314e” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. video streams] as previously described with respect to FIG. 3B”);
-when the comparing reveals a significant change (Meek, see at least: [0070] - “At the same time, the system may discover an image mismatch between the image segments 314c and 314e [i.e. when the comparing reveals a significant change], so it can leave 314c mapping to product 403b, but may initiate a new automatic product mapping process on the image segment 314e, as described with respect to FIG. 4B, which may result in identifying a different product 403c to use for the mapping of the image segment 314e”):
-submitting the video stream to an object recognition service that identifies the products located within the video stream based on stored representations of visual characteristics of the products and a location of each identified product within the video stream (Meek, see at least: [0052], [0061], [0065], and [0098] - “The provider 202 of the system can use a server 201 to facilitate the retail shopping experience of a customer 206 at a store 209, which can be run by a merchant 210 [i.e. submitting the video stream to an object recognition service]” and “The database 205 used by the system can contain both image data 405 of the inventory 301 in the store 209 [i.e. based on stored representations of visual characteristics of the products], and also structured product data 406 associated with the products 302 that are in the inventory 301…A product mapping module 402 can take an image 316 of the in-store inventory 301 [i.e. submitting the video stream to an object recognition service], and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point, and map the inventory image 316 and point 401 to structured product data 403 that represents the product corresponding to the image segment that is at point 401 [i.e. that identifies the products located within the video stream and the location of each identified product within the video stream] in the inventory image 316” and “image segment 314 can be matched against product images in a product image database 420 using an image-driven search…A good match may be used to identify the product in the product image database 420, and then use the product data from the product image database 420 to provide the structured data 403 about the product [i.e. based on stored representations of visual characteristics of the products]” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. video stream] as previously described with respect to FIG. 3B”);
-receiving from the object recognition service, product identifiers of the identified products and location data identifying the location of each respective product (Meek, see at least: [0061] and [0063] - “A product mapping module 402 [i.e. the object recognition service] can take an image 316 of the in-store inventory 301, and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point [i.e. receiving from the object recognition service, location data identifying the location of each respective product], and map the inventory image 316 and point 401 to structured product data 403 that represents the product corresponding to the image segment that is at point 401 in the inventory image 316” and “The inventory image 316 can be segmented into smaller image segments 314, each one representing a single product 302. The image segment 314 may contain text 411 that can be located and recognized by an Optical Character Recognition (OCR) module 412 [i.e. receiving from the object recognition service, product identifiers of the identified products]. The OCR module 412 can convert the characters into machine-readable text 413, which can then be used to search a product database 414. Data from the product database 414 can then be used to provide the structured data 403 about the product”);
-retrieving product information for each identified product from a product database (Meek, see at least: [0063] - “The inventory image 316 can be segmented into smaller image segments 314, each one representing a single product 302. The image segment 314 may contain text 411 that can be located and recognized by an Optical Character Recognition (OCR) module 412. The OCR module 412 can convert the characters into machine-readable text 413, which can then be used to search a product database 414. Data from the product database 414 can then be used to provide the structured data 403 about the product [i.e. retrieving product information for each identified product from a product database]”);
-augmenting data of the video stream for each identified product therein with the product identifier, the location data identifying the location of the product in the video stream, and the retrieved product information (Meek, see at least: [0061], [0071], and [0098] - “A product mapping module 402 can take an image 316 of the in-store inventory 301, and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point, and map the inventory image 316 and point 401 to structured product data 403 [i.e. augmenting data of the video stream for each identified product therein with the product identifier, the location data identifying the location of the product in the video stream, and the retrieved product information] that represents the product corresponding to the image segment that is at point 401 in the inventory image 316” and “The customer 206 can view products in the image such as 503…The sidebar shows contact information 504 and descriptive information 505 about the store 209, and lists structured information about the products 506 that are indicated by the product mapping module 402 to be in the visible part of the inventory image data 316 [i.e. augmenting data of the video stream for each identified product therein]” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. video stream] as previously described with respect to FIG. 3B”);
-storing the video stream with the augmented data in a location where the video stream and the augmented data can be accessed in response to requests received via a network (Meek, see at least: [0061], [0071], [0041], and [0098] - “The database 205 used by the system can contain both image data 405 of the inventory 301 in the store 209, and also structured product data 406 associated with the products 302 that are in the inventory 301 [i.e. storing the video stream with the augmented data in a location where the video stream and augmented data can be accessed]” and “FIG. 5A is an illustrative example of an embodiment of a user interface for providing a customer with a retail browsing experience online [i.e. the augmented data can be accessed in response to requests received via a network]” and “user may not be looking for a specific product, but may search for general product terms [i.e. accessed in response to requests received via a network] to find certain types of stores in the area. The user may then be able to browse the shelves of the stores online to see what sort of products are available in the store” and “The device 801 can capture video imagery of a product 302 as indicated by 803. The device 801 can also use its location service to record the position and orientation of the device 801 during the video capture, to help with the process of stitching together the individual video frame images into a single image [i.e. the video stream] as previously described with respect to FIG. 3B”).
Meek does not explicitly disclose the received video stream being received from a stationarily mounted camera and retrieving and transmitting the video stream with the obtained data to a requestor over the network in response to a request to provide the requestor a view of available product in a near-real time manner within an online shopping session.
Adato, however, teaches a system for processing images captured in a retail store (i.e. abstract), including the known technique of a received video stream being received from a stationarily mounted camera (Adato, see at least: [0150] and [0115] - “the image data representative of products displayed on store shelves may be acquired by a plurality of stationary capturing devices 125 fixedly mounted [i.e. received video stream received from a stationarily mounted camera] in the retail store” and “Examples of capturing devices may include, a digital camera, a time-of-flight camera, a stereo camera, an active stereo camera, a depth camera, a Lidar system, a laser scanner, CCD based devices, or any other sensor based system capable of converting received light into electric signals [i.e. stationarily mounted camera]…the image data may include pixel data streams, digital images, digital video streams [i.e. received video stream]”); and
retrieving and transmitting the video stream with the obtained data to a requestor over the network in response to a request to provide the requestor a view of available product in a near-real time manner within an online shopping session (Adato, see at least: [0225], [0233], [0115] and Fig. 11E - “FIGS. 11A-11E illustrate example outputs based on data automatically derived from machine processing and analysis of images captured in retail store 105…And FIG. 11E illustrates optional outputs for user 120” and “the near real-time display of retail store 105 may be presented to the online customer [i.e. within an online shopping session] in a manner enabling easy virtual navigation [i.e. to a requestor over a network] in retail store 105…as shown in FIG. 11E, GUI 1150 may include a first display area 1152 [i.e. retrieving and transmitting the video stream] for showing the near real-time display [i.e. to provide the requestor a view of available product in a near-real time manner] and a second display area 1154 for showing a product list including products identified in the near real-time display [i.e. with the obtained data]…upon selecting the “bakery” tab. GUI 1150 may present a near real-time display of the bakery of retail store 105…Server 135 may be configured to update the near real-time display and the product list…after identifying a selection of arrow 1158B [i.e. in response to a request], server 135 may present a different section of the dairy department and may update the product list accordingly” and “the image data may include pixel data streams, digital images, digital video streams [i.e. receiving the video stream]”). This known technique is applicable to the system of Meek as they both share characteristics and capabilities, namely, they are directed to a system for processing images captured in a retail store.
It would have been recognized that applying the known technique of a received video stream being received from a stationarily mounted camera, as taught by Adato, to the teachings of Meek would have yielded predictable results because the level of ordinary skill in the art demonstrated by the references applied shows the ability to incorporate such references into similar systems. Further, adding the modification of a received video stream being received from a stationarily mounted camera, as taught by Adato, into the system of Meek would have been recognized by those of ordinary skill in the art as resulting in an improved system that would allow for the detection of products in the back of a shelf while still using less power and fewer processing cycles (Adato, [0201]).
Additionally, it would have been obvious to one of ordinary skill in the art to include in the system, as taught by Meek, retrieving and transmitting the video stream with the obtained data to a requestor over the network in response to a request to provide the requestor a view of available product in a near-real time manner within an online shopping session, as taught by Adato, since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. It further would have been obvious to one of ordinary skill in the art at the time of filing to modify Meek to include the teachings of Adato in order to allow for the detection of products in the back of a shelf while still using less power and fewer processing cycles (Adato, [0201]).

Regarding claim 19, Meek in view of Adato teaches the system of claim 16. Meek further discloses:
-wherein the data processing activity of augmenting the data of the video stream includes adding the retrieved product information to the video stream (Meek, see at least: [0061] and [0071] - “A product mapping module 402 can take an image 316 of the in-store inventory 301, and an identification 401 of a point within the inventory image 316, using the x-axis and y-axis offset of the point, and map the inventory image 316 and point 401 to structured product data 403 that represents the product [i.e. the data processing activity of augmenting the data of the video stream includes adding the retrieved product information to the video stream] corresponding to the image segment that is at point 401 in the inventory image 316” and “The customer 206 can view products in the image such as 503…The sidebar shows contact information 504 and descriptive information 505 about the store 209, and lists structured information about the products 506 that are indicated by the product mapping module 402 to be in the visible part of the inventory image data 316 [i.e. augmenting the data of the video stream includes adding the retrieved product information to the video stream]”; see also [0098] for ‘video stream’).

Regarding claim 20, Meek in view of Adato teaches the system of claim 19. Meek further discloses:
-wherein the retrieved product information is added as metadata to the video stream (Meek, see at least: [0071] - “The customer 206 can view products in the image such as 503…The sidebar shows contact information 504 and descriptive information 505 about the store 209, and lists structured information about the products 506 that are indicated by the product mapping module 402 to be in the visible part of the inventory image data 316 [i.e. the retrieved product information is added as metadata to the video stream]” see also [0098] for ‘video stream’).

Claims 11 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Meek in view of Adato, in further view of Tang.
Regarding claim 11, Meek in view of Adato teaches the method of claim 10. 
Meek in view of Adato does not explicitly teach the object recognition service including a convolutional neural network object recognition model built and maintained from training images of products to be recognized through use of the model.
		Tang, however, teaches an object recognition service including a convolutional neural network object recognition model built and maintained from training images of products to be recognized through use of the model (Tang, see at least: [0029] - “Once the image has been captured, and in some embodiments, after it has undergone some pre-processing as mentioned above, attributes or features of the scene, such as objects, surfaces, and spaces, be determined from the image data through various models including various computer-vision and image processing techniques and processes…the neural network can be trained using images from a catalog that include metadata, description, classification, or other data that can be used to identify various objects and object features [i.e. built and maintained from training images of products to be recognized through use of the model].  For example, in some embodiments, localization can then be performed to determine the relevant region of the scene associated with an object (including spaces or surfaces) of interest.  In some embodiments, a conventional training process can be used with the deep neural network [i.e. a convolutional neural network object recognition model]”).
It would have been obvious to one of ordinary skill in the art to include in the method, as taught by Meek in view of Adato, an object recognition service including a convolutional neural network object recognition model built and maintained from training images of products to be recognized through use of the model, as taught by Tang, since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. It further would have been obvious to one of ordinary skill in the art at the time of filing to modify Meek in view of Adato to include the teachings of Tang in order to allow a user to capture an image of the product and submit the captured image to an object recognition system to obtain information associated with the product of interest or find visually similar products (Tang, [0001]).

Regarding claim 17, Meek in view of Adato teaches the system of claim 16. 
Meek in view of Adato does not explicitly teach storing the video stream with the augmented data including transmitting the video stream with the augmented data via the network interface device to a source of the video stream.
Tang, however, teaches storing a video stream with augmented data including transmitting the video stream with the augmented data via the network interface device to a source of the video stream (Tang, see at least: [0044], [0052] and [0025] - “although the services are shown to be part of the provider environment 506 in FIG. 5, that one or more of these identification services might be operated by third parties 508 that offer these services to the provider [i.e. transmitting the video stream with the augmented data via the network interface device to a source of the video stream]” and “The image analysis service 518, or other services and/or components of the environment might access one or more data stores, such as a user data store 520 that contains information about the various users, and one or more content repositories 514 storing content able to be served to those users [i.e. storing the video stream with the augmented data]” and “The camera might capture video, such that a “live” view of the captured video information [i.e. video stream]” Examiner notes that Fig. 5 indicates the content database is in the service provider environment [i.e. storing the video stream with the augmented data includes transmitting the video stream with the augmented data via the network interface device to a source of the video stream]).
It would have been obvious to one of ordinary skill in the art to include in the system, as taught by Meek in view of Adato, storing a video stream with augmented data including transmitting the video stream with the augmented data via the network interface device to a source of the video stream, as taught by Tang, since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. It further would have been obvious to one of ordinary skill in the art at the time of filing to modify Meek in view of Adato, to include the teachings of Tang, in order to allow a user to capture an image of the product and submit the captured image to an object recognition system to obtain information associated with the product of interest or find visually similar products (Tang, [0001]).

Response to Arguments
Rejections under 35 U.S.C. §103
	Applicant argues that the amended claims provide for automatic video stream comparing by a computer processor to identify significant changes in the scene. When there is a significant change, further processing is performed. No such automatic comparison is made in any of the cited references (Remarks, page 7).
	Examiner respectfully disagrees. Meek discloses the inventory image data, which includes captured video imagery, being updated and automatically creating data for the product mapping function by aligning the two inventory images [i.e. automatically comparing] to determine if the images match; if there is a mismatch [i.e. a significant change], a new automatic product mapping process on the image segment is initiated [i.e. processing is performed] (see Meek, [0070] and [0098]). Thus, Meek discloses this amended feature.

Applicant further argues that the Response to Arguments section of the Office Action refers to the specification and alternative embodiments described therein. These alternative embodiments are not what is claimed. The Response to Arguments section appears to be reading additional elements or limitations into the claims. Applicant requests that the claims be examined for what they explicitly recite, such as a video stream and not a still image. As such, Applicant reiterates the arguments made in response to the last Office Action and requests full consideration thereof in view of the actual claim language and not additional elements and limitations read into the claims from the specification (Remarks, pages 7-8).
Examiner respectfully disagrees. Examiner has not interpreted the video stream as a still image. As explained in the previous Office Action’s response to arguments, Meek discloses an imaging device being a video camera that captures video imagery of a product and stitching together the individual video frame images into a single image [i.e. process video streams] (see Meek, [0098]). Meek further discloses that the movement and triggering of the video camera can be controlled by an automatic system that can sequence zoom, pan, start and stop video [i.e. the capturing of the video stream including automatically receiving the video stream] (see Meek, [0054]). Adato modifies the video camera of Meek to be a stationarily mounted video camera as Adato teaches capturing devices, said capturing devices including a camera capable of capturing video streams, which are fixedly mounted in a retail store (see Adato, [0150] and [0115]).
Additionally, the reference to paragraph [0018] of Applicant’s specification was to further point out that the disclosure of Meek is in line with the way Applicant’s specification uses and processes the video stream. Paragraph [0018] says “The camera 104 may be a still image camera or a video camera…. the image 110 is a frame of a video stream” and, similarly, paragraph [0098] of Meek discloses stitching together the individual video frame images of the captured video data into a single image. Furthermore, paragraph [0018] is the only place in Applicant’s specification that describes the use of a video stream. This description therefore cannot be an alternative embodiment from what the claims recite, as paragraph [0018] is the only source of support for ‘video streams’ found in Applicant’s specification.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
-Arnold et al. (US 2010/0077428 A1) teaches processing video media with associated meta data.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARIELLE E WEINER whose telephone number is (571)272-9007. The examiner can normally be reached M-F 8:30-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Maria-Teresa (Marissa) Thein can be reached on 571-272-6764. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ARIELLE E WEINER/            Examiner, Art Unit 3684