Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

Applicant’s “Response to Amendment and Reconsideration” filed on March 21, 2022 has been considered.
Applicant’s response, by virtue of the amendment to claims 1-9 and 12-18, has overcome the Examiner’s rejection under 35 USC § 101.
Claims 19-25 are added. Claims 1-25 are pending in this application and an action on the merits follows.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-2 and 4-25 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Buibas (U.S. Patent No. 10,783,491).

Regarding claims 1 and 6, Buibas teaches acquiring identity information of a purchaser; acquiring a hand image of a target object with an identification image collection module, the identification image collection module arranged on a shelf for bearing items; identifying a take-up action or a put-back action of the purchaser to acquire an action identification result, and identifying an item at which the take-up action or the put-back action aims to acquire an item identification result; and performing a checkout based on the identity information of the purchaser, the action identification result, and the item identification result; a camera for acquiring a hand image of a target object, the camera arranged on a shelf for bearing items; and a processor configured to acquire identity information of a purchaser, (may combine quantity sensors and camera images to detect and identify items added or removed by a shopper, [55]);
performing checkout based on the identity information of the purchaser, the action identification result, and the item identification result, (the system derives information 150 that the person 103 took item 111 from shelf 102. This information may be used for example for automated checkout, [151]. The smart shelves shown in FIG. 42 have cameras mounted on the bottom of the shelf; these cameras observe items on the shelf below. For example, camera 4231 on shelf 4212 observes items on shelf 4213. When user 4201 reaches for an item on shelf 4213, cameras on either or both of shelves 4212 and 4213 may detect entry of the user's hand into the shelf area, and may capture images of shelf contents that may be used to determine which item or items are taken or moved. This data may be combined with images from other store cameras, such as cameras 4231 and 4232, to track the shoppers and attribute item movements to specific shoppers, see [256], Fig. 32).


Regarding claims 2 and 7, Buibas teaches the acquiring the identity information of the purchaser comprises: judging whether a distance between the target object and a range sensor comply with a preset threshold value; if the distance is judged to comply with the preset threshold value, confirming the target object as the purchaser; and acquiring the identity information of the purchaser according to an acquired facial image of the purchaser, (to distinguish between people, vector-to-vector distances may be computed and compared to a threshold; for example, a distance of 0.0 to 0.5 may indicate the same person, and a greater distance may indicate different people, [0227], see Fig. 32).
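
For illustration only, the threshold comparison quoted from [0227] can be sketched as follows. This is a minimal sketch and not a characterization of how Buibas implements the comparison: the Euclidean metric, the treatment of the 0.5 boundary as inclusive, and the names `vector_distance` and `is_same_person` are assumptions introduced here. Only the 0.0 to 0.5 range comes from the quoted passage.

```python
import math

# Upper bound of the "0.0 to 0.5" same-person range quoted from [0227].
SAME_PERSON_MAX_DISTANCE = 0.5


def vector_distance(a, b):
    """Euclidean distance between two equal-length vectors.

    The choice of Euclidean distance is an assumption made for this sketch.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_same_person(a, b, threshold=SAME_PERSON_MAX_DISTANCE):
    """Return True when the distance falls within the same-person range."""
    return vector_distance(a, b) <= threshold
```

Under these assumptions, two vectors at a distance of 0.2 would be treated as the same person, and two vectors at a distance of 5.0 would be treated as different people.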

Regarding claims 4, 8, 12, and 13, Buibas teaches the identifying the item comprises: acquiring a plurality of primary classification results according to an acquired plurality of frames of hand images of the purchaser in front of a shelf bearing the items and a pre-trained first-level classification model, in which the pre-trained first-level classification model is a model that is constructed by an image identification technique of a convolutional neural network and trained by all the items on the shelf; acquiring a first-level classification result according to the plurality of primary classification results and a pre-trained first-level linear regression model; and obtaining the item identification result according to the first-level classification result, ([282, 296-298, 318-319, 179]).

Regarding claims 5, 9, 14-15, Buibas teaches after the acquiring the first-level classification result, judging whether the first-level classification result is a similar item; acquiring a plurality of secondary classification results according to the plurality of frames of hand images and a pre-trained second-level classification model, then acquiring a second-level classification result according to the plurality of secondary classification results and a pre-trained second-level linear regression model, and acquiring the item identification result according to the secondary classification result, in case that the first-level classification result is judged the similar item, in which the second-level classification model is a model that is constructed by the image identification technique of the convolutional neural network and pre-trained by all the similar items on the shelf in advance; and, if the first-level classification result is judged not a similar item, skipping to the obtaining, [281-290].  

Regarding claim 10, Buibas teaches a shelf for bearing items; a range sensor arranged on the shelf, for generating distance information between a target object and a range sensor; an identity verification collection module arranged on the shelf, for acquiring a facial image of the target object; an identification image collection module arranged on the shelf, for acquiring a hand image of the target object; a processor; and a memory that records processor-executable instructions, in which the processor is configured to acquire identity information of a purchaser according to the distance information and the facial image, identify a take-up action or a put-back action of the purchaser to acquire an action identification result, and identify an item at which the take-up action or the put-back action aims to acquire an item identification result according to the hand image sent by the identification image collection module, and perform a checkout according to the identity information, the action identification result, and the item identification result of the purchaser, (to distinguish between people, vector-to-vector distances may be computed and compared to a threshold; for example, a distance of 0.0 to 0.5 may indicate the same person, and a greater distance may indicate different people, [0227]; may combine quantity sensors and camera images to detect and identify items added or removed by a shopper, [55]);
performing checkout based on the identity information of the purchaser, the action identification result, and the item identification result, (the system derives information 150 that the person 103 took item 111 from shelf 102. This information may be used for example for automated checkout, [151]. The smart shelves shown in FIG. 42 have cameras mounted on the bottom of the shelf; these cameras observe items on the shelf below. For example, camera 4231 on shelf 4212 observes items on shelf 4213. When user 4201 reaches for an item on shelf 4213, cameras on either or both of shelves 4212 and 4213 may detect entry of the user's hand into the shelf area, and may capture images of shelf contents that may be used to determine which item or items are taken or moved. This data may be combined with images from other store cameras, such as cameras 4231 and 4232, to track the shoppers and attribute item movements to specific shoppers, see [256], Fig. 32).


Regarding claims 11 and 16-18, Buibas teaches a client terminal, to receive the identity information inputted by a target object and send the identity information to the checkout device, and to receive a shopping list produced by the checkout device; and the checkout device according to claim 6, [14].

Regarding claim 19, Buibas teaches acquiring another hand image of the target object with another identification image collection module arranged on the shelf, (extending an authorization from one person to another person. For example, an authorization may apply to an entire vehicle and therefore may authorize all occupants of that vehicle to perform actions such as entering a secured area or taking and purchasing products, [206]).

Regarding claim 20, Buibas teaches the identification image collection module and the other identification image collection module are arranged diagonally on the shelf, (see Fig. 43, 4301, 4302).

Regarding claims 21 and 23, Buibas teaches automatically acquiring identity information of a purchaser; identifying a take-up action or a put-back action of the purchaser to acquire an action identification result, wherein the identification of the take-up action or the put-back action comprises acquiring a plurality of consecutive image frames of a hand of the purchaser in a vicinity of a shelf with one or more items (A sequence of raw images 1601 is obtained from camera 121 in the store, [175]… These transformed images 1605 may then be shifted in time to account for possible time offsets among different cameras in the store. This shifting 1607 synchronizes the frames from the different cameras in the store to a common time scale, [175]),
and utilizing a processor for establishing a direction of the motion of a purchaser's hand based on the plurality of consecutive image frames of the hand and a timing of each frame; identifying, using a computerized image identification and identification processing, an item at which the take-up action or the put-back action is directed, and automatically determining an item identification result (the before and after images from all cameras may be packaged together into an event data record, and transmitted for example to a store server 130 for analyses 5521 to determine what item or items have been taken from or put onto the item storage area as a result of the shopper's interaction, [288]); and 
performing a checkout based on the identity information of the purchaser, the action identification result, and the item identification result. …. Processor or processors 130 may analyze the data from cameras and other sensors to track shoppers, to detect actions that shoppers perform with items or item storage areas, and to identify items that shoppers take, replace, or move. By correlating the track 5201 of a shopper with the location and time of actions on items, items may be associated with shoppers, for example for automated checkout in an autonomous store, [303].
These cameras 4301 and 4302 may be used in combination with similar cameras on shelves above and/or below shelf 4212 in a shelving unit (such as shelves 4211 and 4213 in FIG. 42) to detect hand events. For example, the system may use multiple hand detection cameras to triangulate the position of a hand going into a shelf. With two cameras observing a hand, the position of a hand can be determined from the two images (consecutive image frames of a hand of the purchaser). With multiple cameras (for example four or more) observing a shelf, the system may be able to determine the position of more than one hand at a time (establishing a direction of the motion of a purchaser's hand based on the plurality of consecutive image frames of the hand and a timing of each frame) since the multiple views can compensate for potential occlusions. Images of the shelf just prior to a hand entry event may be compared to images of the shelf just after a hand exit event (timing), in order to determine which item or items may have been taken, moved, or added to the shelf (determining an item identification result), [257]. When the sensor subsystem detects that a shopper has entered or is entering an item storage area, it may generate an enter signal 5502, and when it detects that the shopper has exited or is exiting this area, it may generate an exit signal 5503. Entry may correspond for example to a shopper reaching a hand into a space between shelves, and exit may correspond to the shopper retracting the hand from this space. In one or more embodiments these signals may contain additional information, such as for example the item storage area affected, or the approximate location of the shopper's hand. The enter and exit signals trigger acquisition of before and after images, respectively, captured by the cameras that observe the item storage area with which the shopper interacts. In order to obtain images prior to the enter signal, camera images may be continuously saved in a buffer. 
This buffering is illustrated in FIG. 55 for three illustrative cameras 4311a, 4311b, and 4312a mounted on the underside of shelf 4212. Frames captured by these cameras are continuously saved in circular buffers 5511, 5512, and 5513, respectively. (consecutive image frames of a hand of the purchaser), [287]);
the tracked 3D field of influence volume 1001 of person 103 is calculated to be near item storage area 102. The system therefore calculates an intersection 1011 of the item storage area 102 and the 3D field of influence volume 1001 around person 1032 and locates camera images that contain views of this region, such as image 1011. At a subsequent time 142, for example when person 103 is determined to have moved away from item storage area 102, an image 1012 (or multiple such images) is obtained of the same intersected region. These two images are then fed as inputs to neural network 300, which may for example detect whether any item was moved, which item was moved (if any) and the type of action that was performed. The detected item motion is attributed to person 103 because this is the person whose field of influence volume intersected the item storage area at the time of motion. By applying the classification analysis of neural network 300 only to images that represent intersections of person's field of influence volume with item storage areas, processing resources may be used efficiently and focused only on item movement that may be attributed to a tracked person, [168].
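
For illustration only, the buffering described in the passages quoted from [287] and [288] (frames continuously saved in circular buffers, with the enter and exit signals selecting before and after images that are packaged into an event data record) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Buibas: the class `FrameBuffer`, the function `build_event_record`, the use of numeric timestamps, and the dictionary layout of the record are all hypothetical names and choices introduced here.

```python
from collections import deque


class FrameBuffer:
    """Fixed-capacity circular buffer of (timestamp, frame) pairs for one camera."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest entry when full.
        self._frames = deque(maxlen=capacity)

    def save(self, timestamp, frame):
        """Store a frame; frames are assumed to arrive in time order."""
        self._frames.append((timestamp, frame))

    def latest_before(self, timestamp):
        """Most recent buffered frame captured strictly before `timestamp`."""
        for ts, frame in reversed(self._frames):
            if ts < timestamp:
                return frame
        return None

    def earliest_after(self, timestamp):
        """First buffered frame captured strictly after `timestamp`."""
        for ts, frame in self._frames:
            if ts > timestamp:
                return frame
        return None


def build_event_record(buffers, enter_time, exit_time):
    """Package the before and after image of every camera into one record.

    `buffers` maps a camera identifier to its FrameBuffer (hypothetical layout).
    """
    return {
        camera: {
            "before": buffer.latest_before(enter_time),
            "after": buffer.earliest_after(exit_time),
        }
        for camera, buffer in buffers.items()
    }
```

In this sketch a "before" image is unavailable (None) whenever the buffer capacity is too small to still hold a frame that predates the enter signal, which is why the capacity would need to cover at least the expected duration of a hand event.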

Regarding claim 22, Buibas teaches the movement direction of the motion track of the hand is relative to the shelf, (tracking of a person's field of influence with detection of item motion to attribute the motion to a person, Fig. 9).
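
For illustration only, determining a movement direction of a hand relative to a shelf from timestamped consecutive frames can be sketched as follows. This is a minimal sketch and does not represent the method of the claims or of Buibas: it assumes that each frame has already been reduced to a hand-to-shelf distance, and the function name `motion_direction` and the returned labels are hypothetical. The sketch deliberately stops at the direction and does not map a direction to a take-up or put-back action.

```python
def motion_direction(track):
    """Classify hand motion relative to a shelf.

    `track` is a list of (timestamp, distance_to_shelf) samples, one per
    frame. The samples are ordered by timestamp before use, and the sign of
    the average velocity between the first and last sample gives the direction.
    """
    if len(track) < 2:
        raise ValueError("at least two frames are needed to establish a direction")
    ordered = sorted(track)  # tuples sort by timestamp first
    (t_first, d_first), (t_last, d_last) = ordered[0], ordered[-1]
    if t_last == t_first:
        raise ValueError("frames must span a non-zero time interval")
    velocity = (d_last - d_first) / (t_last - t_first)
    if velocity < 0:
        return "toward_shelf"
    if velocity > 0:
        return "away_from_shelf"
    return "stationary"
```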

Regarding claim 24, Buibas teaches the item identification result is determined based at least partially on a computerized image identification analysis using a pre-trained first-level classification model that is trained by images of all items on the shelf, and a pre-trained second-level classification model that is trained by images of all similar items on the shelf, [246, 300].
Attorney Docket Number WZ-M8CNUS200204; Patent Application 16/764,086; Confirmation No. 9030

Regarding claim 25, Buibas teaches using a computerized image identification and identification processing means using a multi-level computerized image identification and classification model and analysis, the multi-level computerized image identification and classification model and analysis comprises: acquiring a plurality of primary classification results, according to a pre-trained first-level classification model, in which the pre-trained first-level classification model is trained by all items on a shelf; determining a first-level classification result according to the plurality of primary classification results; acquiring a plurality of secondary classification results, according to a pre-trained second-level classification model, in which the second-level classification model is pre-trained in advance by all the similar items on the shelf, in case it is determined by a processor that the first-level classification result is a similar item; determining a second-level classification result according to the plurality of secondary classification results; and determining the item identification result according to the second-level classification result, (The first half of the network may have for example N copies of a standard image classification network. The final classifier layer of this image classification network may be removed, and the network may be used as a pre-trained feature extractor. This network may be pretrained on a dataset such as the ImageNet dataset, which is a standard objects dataset with images and labels for various types of objects, including but not limited to people, [246]).
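
For illustration only, the control flow of the multi-level classification recited in claim 25 can be sketched as follows. This is a minimal sketch under stated assumptions and is not the claimed implementation or the network of Buibas: the classifiers are passed in as plain callables, a majority vote stands in for the step of determining a level's result from the per-frame results (claims 4 and 5 recite a linear regression model for that step), and the names `aggregate`, `identify_item`, and `similar_items` are hypothetical.

```python
from collections import Counter


def aggregate(per_frame_labels):
    """Reduce per-frame labels to one label by majority vote.

    Majority voting is a stand-in chosen for this sketch only.
    """
    if not per_frame_labels:
        raise ValueError("at least one per-frame result is required")
    return Counter(per_frame_labels).most_common(1)[0][0]


def identify_item(frames, first_level, second_level, similar_items):
    """Two-level cascade over a plurality of frames.

    first_level / second_level: callables mapping a frame to an item label.
    similar_items: set of labels regarded as easily confused with each other.
    """
    # Primary classification results, one per frame, then a first-level result.
    first_result = aggregate([first_level(frame) for frame in frames])
    if first_result not in similar_items:
        # Not a similar item: the first-level result is the identification result.
        return first_result
    # Similar item: re-classify with the model trained only on similar items.
    return aggregate([second_level(frame) for frame in frames])
```

In this sketch the second-level model is consulted only when the first-level result falls within the set of similar items, so frames showing a clearly distinguishable item are classified once.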


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Buibas in view of Adato et al. (U.S. Patent Publication No. 2019/0149725).

Regarding claim 3, Buibas teaches the identification image collection module is arranged on a lower portion of a door frame of the shelf, and the identifying the take-up action or the put-back action comprises: acquiring a plurality of frames of consecutive hand images of the purchaser in front of the shelf bearing the items, and establishing a motion track of a hand for the plurality of frames of consecutive hand images on a timeline; and identifying an action of the purchaser as the take-up action or the put-back action according to a movement direction of the motion track of the hand relative to the shelf, (Each modular shelf may contain at least one camera module on the bottom of the shelf, at least one lighting module on the bottom of the shelf, a right-facing camera on or near the left edge of the shelf, a left-facing camera on or near the right edge of the shelf, a processor, and a network switch. The camera module may contain two or more downward-facing cameras, [0043]). Buibas does not specifically teach the shooting angle of the identification image collection module is upward.
However, Adato teaches that, to facilitate the installation of system 500, each first housing 502 (e.g., first housing 502K) may include an adjustment mechanism 642 for setting a field of view 644 of image capture device 506K such that the field of view 644 will at least partially encompass products placed both on a bottom shelf of retail shelving unit 640 and on a top shelf of retail shelving unit 640, [0174]. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the invention, to modify the design method of Buibas to include a different shooting angle of the camera, as taught by Adato, in order to provide continuous monitoring of dynamically changing product displays, [0003].

Response to Arguments
Applicant's arguments have been fully considered but they are not persuasive. 
Applicant argues that the prior art does not teach:
“acquiring a hand image of a target object with an identification image collection module, the identification image collection module arranged on a shelf for bearing items”;
“a camera for acquiring a hand image of a target object, the camera arranged on a shelf for bearing items”;
“a shelf for bearing items; an identification image collection module arranged on the shelf, for acquiring a hand image of the target object”;
“comparing consecutive hand images, as recited in new Claim 21 - acquiring a plurality of frames of consecutive hand images of the purchaser in front of a shelf bearing items, and establishing a motion track of a hand for the plurality of frames of consecutive hand images on a timeline; and identifying an action of the purchaser as the take-up action or the put-back action according to a movement direction of the motion track of the hand”; and
“identifying, using a computerized image identification and identification processing, an item at which the take-up action or the put-back action is directed.”

Examiner does not agree. Buibas teaches a sequence of raw images 1601 is obtained from camera 121 in the store, [175]… These transformed images 1605 may then be shifted in time to account for possible time offsets among different cameras in the store. This shifting 1607 synchronizes the frames from the different cameras in the store to a common time scale, [175],
and utilizing a processor for establishing a direction of the motion of a purchaser's hand based on the plurality of consecutive image frames of the hand and a timing of each frame; identifying, using a computerized image identification and identification processing, an item at which the take-up action or the put-back action is directed, and automatically determining an item identification result (the before and after images from all cameras may be packaged together into an event data record, and transmitted for example to a store server 130 for analyses 5521 to determine what item or items have been taken from or put onto the item storage area as a result of the shopper's interaction, [288]); and 
performing a checkout based on the identity information of the purchaser, the action identification result, and the item identification result. …. Processor or processors 130 may analyze the data from cameras and other sensors to track shoppers, to detect actions that shoppers perform with items or item storage areas, and to identify items that shoppers take, replace, or move. By correlating the track 5201 of a shopper with the location and time of actions on items, items may be associated with shoppers, for example for automated checkout in an autonomous store, [303].
These cameras 4301 and 4302 may be used in combination with similar cameras on shelves above and/or below shelf 4212 in a shelving unit (such as shelves 4211 and 4213 in FIG. 42) to detect hand events. For example, the system may use multiple hand detection cameras to triangulate the position of a hand going into a shelf. With two cameras observing a hand, the position of a hand can be determined from the two images (consecutive image frames of a hand of the purchaser). With multiple cameras (for example four or more) observing a shelf, the system may be able to determine the position of more than one hand at a time (establishing a direction of the motion of a purchaser's hand based on the plurality of consecutive image frames of the hand and a timing of each frame) since the multiple views can compensate for potential occlusions. Images of the shelf just prior to a hand entry event may be compared to images of the shelf just after a hand exit event (timing), in order to determine which item or items may have been taken, moved, or added to the shelf (determining an item identification result), [257]. When the sensor subsystem detects that a shopper has entered or is entering an item storage area, it may generate an enter signal 5502, and when it detects that the shopper has exited or is exiting this area, it may generate an exit signal 5503. Entry may correspond for example to a shopper reaching a hand into a space between shelves, and exit may correspond to the shopper retracting the hand from this space. In one or more embodiments these signals may contain additional information, such as for example the item storage area affected, or the approximate location of the shopper's hand. The enter and exit signals trigger acquisition of before and after images, respectively, captured by the cameras that observe the item storage area with which the shopper interacts. In order to obtain images prior to the enter signal, camera images may be continuously saved in a buffer. 
This buffering is illustrated in FIG. 55 for three illustrative cameras 4311a, 4311b, and 4312a mounted on the underside of shelf 4212. Frames captured by these cameras are continuously saved in circular buffers 5511, 5512, and 5513, respectively. (consecutive image frames of a hand of the purchaser), [287]);
the tracked 3D field of influence volume 1001 of person 103 is calculated to be near item storage area 102. The system therefore calculates an intersection 1011 of the item storage area 102 and the 3D field of influence volume 1001 around person 1032 and locates camera images that contain views of this region, such as image 1011. At a subsequent time 142, for example when person 103 is determined to have moved away from item storage area 102, an image 1012 (or multiple such images) is obtained of the same intersected region. These two images are then fed as inputs to neural network 300, which may for example detect whether any item was moved, which item was moved (if any) and the type of action that was performed. The detected item motion is attributed to person 103 because this is the person whose field of influence volume intersected the item storage area at the time of motion. By applying the classification analysis of neural network 300 only to images that represent intersections of person's field of influence volume with item storage areas, processing resources may be used efficiently and focused only on item movement that may be attributed to a tracked person, [168].

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MILENA RACIC whose telephone number is (571)270-5933. The examiner can normally be reached M-F 7:30am-4pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Florian (Ryan) Zeender can be reached on (571)272-6790. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MILENA RACIC/Patent Examiner, Art Unit 3627     


/FLORIAN M ZEENDER/Supervisory Patent Examiner, Art Unit 3627