DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
EXAMINER’S AMENDMENT
Authorization for this examiner’s amendment was given in an interview with Scott A. McCollister on August 18, 2022.
The application has been amended as follows: 
1. (Currently Amended) A method for recognizing a product, comprising: 
acquiring a video taken by each of at least one camera; 
performing a recognition on each video to obtain a video segment that a product delivery is recognized and to obtain participated users, the participated users comprising: a delivery initiation user and a delivery reception user; 
inputting the video segment into a preset delivery recognition model to obtain a recognition result, the recognition result comprising: a product delivered and a delivery probability; and 
updating product information of products carried by the participated users based on the recognition result;
wherein performing the recognition on each video to obtain the video segment that the product delivery is recognized and to obtain the participated users comprises:
performing the recognition on each image of each video to obtain at least one first image that the product delivery is recognized; sequencing the at least one first image based on a time point to aggregate adjacent first images that a time point difference between the adjacent first images is less than a preset difference threshold to obtain at least one video segment; and determining the participated users corresponding to the video segment; and
wherein the video segment is processed by the delivery recognition model by:
acquiring a pre-delivery image, an on-delivery image and a post-delivery image from the video segment; performing the recognition on the pre-delivery image to determine a product held by the delivery initiation user before the product delivery and a first recognition probability of the product; performing the recognition on the on-delivery image to determine a product held simultaneously by the delivery initiation user and the delivery reception user during the product delivery and a second recognition probability of the product; performing the recognition on the post-delivery image to determine a product held by the delivery reception user after the product delivery and a third recognition probability of the product; and determining the product delivered and the delivery probability based on the product held by the delivery initiation user before the product delivery and the first recognition probability, the product held simultaneously by the delivery initiation user and the delivery reception user during the product delivery and the second recognition probability, and the product held by the delivery reception user after the product delivery and the third recognition probability.
2. (Cancelled)
3. (Currently Amended) The method of claim 1, wherein determining the participated users corresponding to the video segment comprises:
acquiring depth information corresponding to each image of the video segment;
determining point cloud information of each image of the video segment based on each image of the video segment and the depth information corresponding to each image; and
determining the participated users corresponding to the video segment based on the point cloud information of each image of the video segment and point cloud information of each user.
4. (Original) The method of claim 3, further comprising:
determining a position and a body gesture of each participated user based on the point cloud information of each participated user; and
determining the delivery initiation user and the delivery reception user, based on the position and the body gesture of each participated user.
5. (Cancelled)
6. (Currently Amended) The method of claim 1, wherein determining the delivery probability based on the first recognition probability, the second recognition probability and the third recognition probability comprises:
determining a multiply of the first recognition probability, the second recognition probability and the third recognition probability as the delivery probability.
7. (Original) The method of claim 1, wherein updating the product information of products carried by the participated users based on the recognition result comprises:
acquiring a first product corresponding to a maximum delivery probability based on the recognition result;
deleting first product information of the first product from the product information of products carried by the delivery initiation user; and
adding the first product information of the first product to the product information of products carried by the delivery reception user.
8. (Currently Amended) An electronic device, comprising:
at least one processor; and
a memory connected in communication with the at least one processor; 
wherein, the memory is configured to store an instruction executable by the at least one processor, and the instruction is executed by the at least one processor to enable the at least one processor to:
acquire a video taken by each of at least one camera; 
perform a recognition on each video to obtain a video segment that a product delivery is recognized and to obtain participated users, the participated users comprising: a delivery initiation user and a delivery reception user; 
input the video segment into a preset delivery recognition model to obtain a recognition result, the recognition result comprising: a product delivered and a delivery probability; and 
update product information of products carried by the participated users based on the recognition result;
wherein the one or more processors are enabled to perform the recognition on each video to obtain the video segment that the product delivery is recognized and to obtain the participated users by:
performing the recognition on each image of each video to obtain at least one first image that the product delivery is recognized; sequencing the at least one first image based on a time point to aggregate adjacent first images that a time point difference between the adjacent first images is less than a preset difference threshold to obtain at least one video segment; and determining the participated users corresponding to the video segment; and
wherein the video segment is processed by the delivery recognition model by:
acquiring a pre-delivery image, an on-delivery image and a post-delivery image from the video segment; performing the recognition on the pre-delivery image to determine a product held by the delivery initiation user before the product delivery and a first recognition probability of the product; performing the recognition on the on-delivery image to determine a product held simultaneously by the delivery initiation user and the delivery reception user during the product delivery and a second recognition probability of the product; performing the recognition on the post-delivery image to determine a product held by the delivery reception user after the product delivery and a third recognition probability of the product; and determining the product delivered and the delivery probability based on the product held by the delivery initiation user before the product delivery and the first recognition probability, the product held simultaneously by the delivery initiation user and the delivery reception user during the product delivery and the second recognition probability, and the product held by the delivery reception user after the product delivery and the third recognition probability.
9. (Cancelled)
10. (Currently Amended) The electronic device of claim 8, wherein the one or more processors are enabled to determine the participated users corresponding to the video segment by:
acquiring depth information corresponding to each image of the video segment;
determining point cloud information of each image of the video segment based on each image of the video segment and the depth information corresponding to each image; and
determining the participated users corresponding to the video segment based on the point cloud information of each image of the video segment and point cloud information of each user.
11. (Original) The electronic device of claim 10, wherein the one or more processors are enabled to:
determine a position and a body gesture of each participated user based on the point cloud information of each participated user; and
determine the delivery initiation user and the delivery reception user, based on the position and the body gesture of each participated user.
12. (Cancelled)
13. (Currently Amended) The electronic device of claim 8, wherein the one or more processors are enabled to determine the delivery probability based on the first recognition probability, the second recognition probability and the third recognition probability by:
determining a multiply of the first recognition probability, the second recognition probability and the third recognition probability as the delivery probability.
14. (Original) The electronic device of claim 8, wherein the one or more processors are enabled to update the product information of products carried by the participated users based on the recognition result by:
acquiring a first product corresponding to a maximum delivery probability based on the recognition result;
deleting first product information of the first product from the product information of products carried by the delivery initiation user; and
adding the first product information of the first product to the product information of products carried by the delivery reception user.
15. (Currently Amended) A method for recognizing a product, comprising:
acquiring a video taken by each of at least one camera;
performing a recognition on each video to obtain a video segment that a product delivery is recognized, to obtain participated users, and to obtain a product delivered, the participated users comprising a delivery initiation user and a delivery reception user; and
updating product information of products carried by the participated users based on the participated users and the product delivered;
wherein performing the recognition on each video to obtain the video segment that a product delivery is recognized, to obtain the participated users, and to obtain the product delivered comprises:
performing the recognition on each video to obtain the video segment and the participated users; inputting the video segment into a preset delivery recognition model to obtain a recognition result, the recognition result comprising a product delivered and a delivery probability; and determining the product delivered based on the recognition result; and
wherein processing the video segment by the delivery recognition model comprises:
acquiring a pre-delivery image, an on-delivery image and a post-delivery image from the video segment; performing the recognition on the pre-delivery image to determine a product held by the delivery initiation user before the product delivery and a first recognition probability of the product; performing the recognition on the on-delivery image to determine a product held simultaneously by the delivery initiation user and the delivery reception user during the product delivery and a second recognition probability of the product; performing the recognition on the post-delivery image to determine a product held by the delivery reception user after the product delivery and a third recognition probability of the product; and determining the product delivered and the delivery probability based on the product held by the delivery initiation user before the product delivery and the first recognition probability, the product held simultaneously by the delivery initiation user and the delivery reception user during the product delivery and the second recognition probability, and the product held by the delivery reception user after the product delivery and the third recognition probability.
16. (Cancelled)
17. (Currently Amended) The method of claim 15, wherein performing the recognition on each video to obtain the video segment and the participated users comprises:
performing the recognition on each image of each video to obtain at least one first image that the product delivery is recognized;
sequencing the at least one first image based on a time point to aggregate adjacent first images that a time point difference between the adjacent first images is less than a preset difference threshold to obtain at least one video segment; and
determining the participated users corresponding to the video segment.
18. (Original) The method of claim 17, wherein determining the participated users corresponding to the video segment comprises:
acquiring depth information corresponding to each image of the video segment;
determining point cloud information of each image of the video segment based on each image of the video segment and the depth information corresponding to each image; and
determining the participated users corresponding to the video segment based on the point cloud information of each image of the video segment and point cloud information of each user.
19. (Cancelled)
20. (Currently Amended) The method of claim 15, wherein the product information of products carried by the participated users is updated by:
acquiring a first product corresponding to a maximum delivery probability;
deleting first product information of the first product from the product information of products carried by the delivery initiation user; and
adding the first product information of the first product to the product information of products carried by the delivery reception user.
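The frame-aggregation step recited in claims 1, 8 and 17 and the probability computation recited in claims 6 and 13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the `FirstImage` type, the function names, and the use of seconds as the time unit are hypothetical, and the recognition models themselves are outside the sketch.

```python
from dataclasses import dataclass
from math import prod
from typing import List


@dataclass
class FirstImage:
    """A frame in which a product delivery was recognized (hypothetical type)."""
    camera_id: str
    timestamp: float  # time point of the frame, assumed to be in seconds


def aggregate_segments(images: List[FirstImage], max_gap: float) -> List[List[FirstImage]]:
    """Sequence frames by time point and group adjacent frames whose
    time point difference is less than the preset threshold max_gap."""
    segments: List[List[FirstImage]] = []
    for image in sorted(images, key=lambda im: im.timestamp):
        if segments and image.timestamp - segments[-1][-1].timestamp < max_gap:
            segments[-1].append(image)   # close enough: same video segment
        else:
            segments.append([image])     # gap too large: start a new segment
    return segments


def delivery_probability(p_pre: float, p_on: float, p_post: float) -> float:
    """Multiply the first, second and third recognition probabilities."""
    return prod((p_pre, p_on, p_post))


frames = [FirstImage("cam-1", t) for t in (5.0, 0.0, 0.4, 5.3, 0.8)]
segments = aggregate_segments(frames, max_gap=1.0)
probability = delivery_probability(0.9, 0.8, 0.5)
```

With a threshold of 1.0, the five sample frames fall into two segments, one around the start of the recording and one around the five-second mark.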


Allowable Subject Matter
Claims 1, 3, 4, 6-8, 10, 11, 13-15, 17, 18 and 20 are allowed.
The following is an examiner’s statement of reasons for allowance: 
Applicant's invention is drawn to a method and a device for recognizing a product, an electronic device and a non-transitory computer readable storage medium, relating to the field of unmanned retail product recognition. The method includes the following. A video taken by each camera in a store is acquired. A recognition is performed on each video to obtain a video segment that a product delivery is recognized and to obtain participated users. The participated users include a delivery initiation user and a delivery reception user. The video segment is inputted into a preset delivery recognition model to obtain a recognition result. The recognition result includes a product delivered and a delivery probability. The product information of products carried by the participated users is updated based on the recognition result.
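The final updating step, as detailed in claims 7, 14 and 20, can be sketched as follows. This is only an illustrative sketch: the dictionary layout, the function name `update_carried_products`, and the representation of the recognition result as (product, delivery probability) pairs are assumptions. The guard against a product missing from the initiation user's list is also an assumption, not something the claims recite.

```python
from typing import Dict, List, Tuple


def update_carried_products(
    carried: Dict[str, List[str]],
    initiator: str,
    receiver: str,
    recognition_result: List[Tuple[str, float]],
) -> str:
    """Move the product with the maximum delivery probability from the
    delivery initiation user's list to the delivery reception user's list."""
    # Acquire the first product corresponding to the maximum delivery probability.
    product, _probability = max(recognition_result, key=lambda item: item[1])
    # Delete it from the initiation user's products (guarded; an assumption).
    if product in carried.get(initiator, []):
        carried[initiator].remove(product)
    # Add it to the reception user's products.
    carried.setdefault(receiver, []).append(product)
    return product


carried = {"user_a": ["cola", "chips"], "user_b": []}
moved = update_carried_products(
    carried, "user_a", "user_b", [("cola", 0.72), ("chips", 0.11)]
)
```

After the call, the product with the highest delivery probability is listed under the reception user and no longer under the initiation user.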

	The closest prior art of record fails to teach the limitation of “wherein performing the recognition on each video to obtain the video segment that the product delivery is recognized and to obtain the participated users comprises:
performing the recognition on each image of each video to obtain at least one first image that the product delivery is recognized; sequencing the at least one first image based on a time point to aggregate adjacent first images that a time point difference between the adjacent first images is less than a preset difference threshold to obtain at least one video segment; and determining the participated users corresponding to the video segment; and wherein the video segment is processed by the delivery recognition model by: acquiring a pre-delivery image, an on-delivery image and a post-delivery image from the video segment; performing the recognition on the pre-delivery image to determine a product held by the delivery initiation user before the product delivery and a first recognition probability of the product; performing the recognition on the on-delivery image to determine a product held simultaneously by the delivery initiation user and the delivery reception user during the product delivery and a second recognition probability of the product; performing the recognition on the post-delivery image to determine a product held by the delivery reception user after the product delivery and a third recognition probability of the product; and determining the product delivered and the delivery probability based on the product held by the delivery initiation user before the product delivery and the first recognition probability, the product held simultaneously by the delivery initiation user and the delivery reception user during the product delivery and the second recognition probability, and the product held by the delivery reception user after the product delivery and the third recognition probability”.
	Applicant’s independent claim 1 comprises a particular combination of elements, which is neither taught nor suggested by the prior art.
Similarly, the other independent claims 8 and 15 each comprise a particular combination of elements, with analogous wording variations, which is neither taught nor suggested by the prior art when each claim is considered as a whole.
Dependent claims are deemed allowable for the same reasons as corresponding independent claims.
Fan et al. Pub. No. US 20130250115 A1 teaches a system, method, and program product to determine whether a product has been successfully purchased by identifying in a video record when a movement of a product adjacent to a scanner occurs, and whether the scanner did not record a purchase transaction at that time; measuring a difference in time between the time of the movement of the product and a time of another movement of a product, and determining by a trained support vector machine a likelihood that the product was successfully purchased. Alternately, the difference in time can be measured between the time of the movement of the product and a time of a transaction record, or between the time of the movement of the product and a boundary time. The support vector machine can use a radial basis function kernel and can generate a decision value and a confidence score.
Bobbitt et al. Pub. No. US 20160034766 A1 teaches that transaction units of video data and transaction data captured from different checkout lanes are prioritized as a function of lane priority values of respective ones of the different checkout lanes from which the transaction units are acquired. Each of the checkout lanes has a different lane priority value. The individual transaction units are processed in the prioritized processing order to automatically detect irregular activities indicated by the transaction unit video and the transaction data of the processed individual transaction units.
DETECTING SWEETHEARTING IN RETAIL SURVEILLANCE VIDEOS – 2009 teaches, with reference to Fig. 2, detecting a major type of retail fraud in surveillance videos, known as sweethearting (or fake scan), in which a cashier intentionally fails to enter one or more items into the transaction in an attempt to get free merchandise for the customer. The reference first develops a motion-based algorithm to identify video segments as candidates for primitive events at the POS.
WO 2020156108 A1 teaches a system and methods for monitoring retail transactions, including regular and irregular transactions associated with a check-out machine. The system utilizes various GUI elements, their configurations, and their interactions with a user to present retail transactions and related information such that various problems in the conventional systems are overcome.
	However, the cited references, alone or in combination, neither disclose nor suggest the combination of features, specifically “wherein performing the recognition on each video to obtain the video segment that the product delivery is recognized and to obtain the participated users comprises:
performing the recognition on each image of each video to obtain at least one first image that the product delivery is recognized; sequencing the at least one first image based on a time point to aggregate adjacent first images that a time point difference between the adjacent first images is less than a preset difference threshold to obtain at least one video segment; and determining the participated users corresponding to the video segment; and wherein the video segment is processed by the delivery recognition model by: acquiring a pre-delivery image, an on-delivery image and a post-delivery image from the video segment; performing the recognition on the pre-delivery image to determine a product held by the delivery initiation user before the product delivery and a first recognition probability of the product; performing the recognition on the on-delivery image to determine a product held simultaneously by the delivery initiation user and the delivery reception user during the product delivery and a second recognition probability of the product; performing the recognition on the post-delivery image to determine a product held by the delivery reception user after the product delivery and a third recognition probability of the product; and determining the product delivered and the delivery probability based on the product held by the delivery initiation user before the product delivery and the first recognition probability, the product held simultaneously by the delivery initiation user and the delivery reception user during the product delivery and the second recognition probability, and the product held by the delivery reception user after the product delivery and the third recognition probability”.
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Fan et al. Pub. No. US 20130250115 A1 - SYSTEMS AND METHODS FOR FALSE ALARM REDUCTION DURING EVENT DETECTION
	Bobbitt et al. Pub. No. US 20160034766 A1 - OPTIMIZING VIDEO STREAM PROCESSING
	Zucker et al. Patent No. US 10943128 B2 - Constructing shopper carts using video surveillance
	Thompson et al. Pub. No. US 20210065106 A1 - Computer-based logistics method for arranging delivering of items to recipients situated at different recipient locations, involves instructing parent agent and delivery agent to deliver items by providing navigation instructions
	Francis et al. Pub. No. US 20200387865 A1 - Environment Tracking
	Fisher et al. Patent No. US 10127438 B1 - System for tracking changes in items by persons in area of shopping store, has first image processor including recognition engines, and second image processor for identifying and classifying background changes by processing factored images
	Mishra et al. Patent No. US 10203211 B1 - Visual route book data sets
	Glaser et al. Pub. No. US 20170323376 A1 - SYSTEM AND METHOD FOR COMPUTER VISION DRIVEN APPLICATIONS WITHIN AN ENVIRONMENT
	Anabuki Pub. No. US 20150063640 A1 - Image processing apparatus for use in monitoring system for determining observation area in image, comprises detection unit to detect persons from obtained image and determination unit that determines observation target
	Iwai Pub. No. US 20150010204 A1 - PERSON BEHAVIOR ANALYSIS DEVICE, PERSON BEHAVIOR ANALYSIS SYSTEM, PERSON BEHAVIOR ANALYSIS METHOD, AND MONITORING DEVICE
	Mullins Pub. No. US 20140267407 A1 - SEGMENTATION OF CONTENT DELIVERY
	WO 2020156108 A1 - SYSTEM AND METHODS FOR MONITORING RETAIL TRANSACTIONS
	CN 112529604 A - Material feeding method, device, electronic device and storage medium
	CN 110264177 A - A self-service shopping system and method
	CN 104732412 A - A method for guaranteeing electronic commerce front and rear end is the same
	DETECTING SWEETHEARTING IN RETAIL SURVEILLANCE VIDEOS – 2009
	Cart Auditor: A Compliance and Training Tool for Cashiers at Checkout – 2010
	How Computer Vision Provides Physical Retail with a Better View on Customers - 2019
	A deep learning pipeline for product recognition on store shelves - 2019
	Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NIZAR N SIVJI whose telephone number is (571)270-7462.  The examiner can normally be reached on Monday-Friday 7-4.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Srilakshmi K. Kumar can be reached on (571) 272-7769.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/NIZAR N SIVJI/           Primary Examiner, Art Unit 2647