DETAILED ACTION

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Priority
The examiner acknowledges that the instant application claims priority from foreign application JP 2018-237226, filed on 12/19/2018, and therefore the claims receive the effective filing date of December 19, 2018.
Information Disclosure Statement
The IDSs submitted on 7/30/2019 and 10/18/2019 were previously considered.

Status of Claims
Applicant’s amended claims, filed 9/22/2022, have been entered. Claims 1, 2, 10, 11, 17, and 19 are amended. Claims 3-5 and 13-15 were previously canceled. Claims 1, 2, 6-12, and 16-20 are currently pending in this application and have been examined.  

Claim Interpretation
The examiner acknowledges that claim 10 recites one or more claim limitations that use a generic placeholder coupled with functional language but are nonetheless not being interpreted under 35 U.S.C. 112(f). Because the “unit configured to” and “unit for” limitations are recited in the specification (see at least [0032]) as part of the “information processing terminal”, the examiner will, for purposes of this examination, interpret the limitations as being performed by hardware. Therefore, upon further consideration, claim 10 is no longer interpreted under 35 U.S.C. 112(f) because the functionality is performed by the information processing terminal, which recites sufficient structure for the units.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1, 2, 6-12, and 16-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Claims 1, 10, and 11 recite the limitation “setting a listing price of the product based on the displayed listing price” in lines 22-23, 20-21, and 20-21, respectively. There is insufficient antecedent basis for the limitation “the displayed listing price.” While the claims recite “acquire product information related to the product… the product information including a listing price” in lines 11-13 of claim 1, lines 10-14 of claim 10, and lines 8-11 of claim 11, the claims do not recite “displaying” the listing price. For purposes of compact prosecution, the Examiner will interpret the limitation as reading “setting a listing price of the product based on the acquired listing price.” Claims 2, 6-9, 12, and 16-20 inherit the deficiencies noted in claims 1, 10, and 11. Appropriate correction is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 2, 6-12, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Purves et al. (US 2015/0012426 A1 [previously recited]) in view of Di Censo et al. (US 2015/0193005 A1) and Zheng et al. (US 2019/0080171 A1).

Regarding claim 11, Purves et al., hereinafter Purves discloses a method performed by an information processing apparatus (abstract), the method comprising: 
	detecting, using a sensor, a predetermined motion of a user (Figs. 2A and 2B; ¶0097 [the recording may begin when the user presses a button on the electronic device indicating that the user would like to initiate an action; Examiner notes pressing a button on the device is comparable to a sensor detecting a predetermined motion of a user]);
	capturing an image when the predetermined motion of the user is detected by the sensor (Figs. 2A and 2B; ¶0097 [FIGS. 2A-B show data flow diagrams illustrating processing gesture and vocal commands in some embodiments of the MDGAAT. In some implementations the user 201 may initiate an action by providing both a physical gesture 202 and a vocal command 203… capture the gesture via a camera on the electronic device 207… the camera may take a burst of photos. In some implementations, the recording may begin when the user presses a button on the electronic device indicating that the user would like to initiate an action]);
	recognizing a first gesture of the user specifying a product, in the image (Figs. 2A, 2B, 12G; ¶0103 [As shown in FIG. 2 b, in some implementations, the electronic device 206 may process the audio and gesture data itself 218, and may also have a library of possible gestures that it may match 219 with the processed audio and gesture data to], ¶0107 [MDGAAT may use an image processing component in order to process the video and/or images 310 and determine what the gesture is] in view of ¶0214 [With reference to FIG. 12G, a consumer 120 may utilize a “framing” gesture to select an item in the scene. For example, a consumer 120 may “frame” an antique desk lamp 147] and ¶0164); 
	acquiring product information of the product specified and related to a listing of the product on an electronic transaction platform for product's purchase and sale using the first gesture, based at least in part on the image where the first gesture is recognized, the product information including a listing price when the product is listed on the electronic transaction platform (Figs. 2A, 2B, 12G [displaying Antique Lamp $99.99]; ¶0103 [the electronic device 206 may process the…gesture data itself 218, and may also have a library of possible gestures that it may match 219 with the processed…gesture data to. The electronic device may then send in the command message 220 the actions to be performed], ¶0214 [a consumer 120 may “frame” an antique desk lamp 147… the V-GLASSES may provide information labels with regard to the item identifying information, availability at local stores, availability on online merchants 148, and/or the like (e.g., various merchants, retailers may inject advertisements related products for the consumer to view, etc.)]; Examiner notes Fig. 12G displays product information including a price of the lamp from a merchant. The listed price that is displayed is the “listing price when the product is listed on the electronic transaction platform”); 
	controlling an output of the product information to the user (Figs. 2A, 2B, 12G; ¶0104 [The MDGAAT may then perform the action specified 221, accessing any information necessary to conduct the action 222, and may send a confirmation page or AR overlay to the user 223] in view of ¶0214 [the V-GLASSES may provide information labels with regard to the item identifying information, availability at local stores, availability on online merchants 148, and/or the like (e.g., various merchants, retailers may inject advertisements related products for the consumer to view, etc.)]); Atty. Dkt. No. 4270.0050001-5- Reply to Office Action of February 15, 2022Satoshi YANAGISAWA Application No. 16/525,825 
	recognizing a second gesture of the user, based at least in part on a second image comprising the second gesture and the product, the second image being captured by an image sensor (FIG. 27B-27D; ¶¶0328-0333 [V-GLASSES may categorize information overlays into different layers, e.g., a merchant information layer to provide merchant information with regard to the captured items in the scene, a retail information layer to provide retail inventory information with regard to the captured items in the scene, a social information layer to provide ratings, reviews, comments and/or other related social media feeds with regard to the captured items in the scene, and/or the like. For example, when V-GLASSES captures a scene that contains different objects, different layers of information with regard to different objects (e.g., a trademark logo, a physical object, a sales receipt, and/or the like) may be overlay on top of the captured scene… a consumer may slide the information layer 1611 a to obtain another layer, e.g., retail information 1611 b, social information 1611 c, item information 1611 d, and/or the like… a consumer may tap on the provided virtual label of a “Cartier” store, e.g., 1613, 1623, etc., and be directed to a store map including inventory information, e.g., as shown in FIG. 16B… a consumer may slide the virtual label overlay layer to view another layer of information labels, e.g., social information 1611 c, item information 1611 d, and/or the like], ¶0107 [MDGAAT may use an image processing component in order to process the video and/or images 310 and determine what the gesture is] in view of Figs. 2A, 2B, 12G, ¶0103, ¶0164, and ¶0214); and
	in response to the recognized second gesture of the user, performing a listing process regarding an electronic transaction for the product (FIG. 27B-27D; ¶¶0328-0333; Examiner notes obtaining and displaying additional layers to view item information and retail information, including item information, purchase item description, and price information, is comparable to performing a listing process and displaying the layers in response to a “slide” is comparable to recognizing a second gesture).

While Purves discloses detecting a predetermined motion using a sensor and capturing an image when the predetermined motion is detected by the sensor (Figs. 2A and 2B; ¶0097 [the recording may begin when the user presses a button on the electronic device indicating that the user would like to initiate an action; Examiner notes pressing a button on the device is comparable to a sensor detecting a predetermined motion of a user] in view of ¶0097 [FIGS. 2A-B show data flow diagrams illustrating processing gesture and vocal commands in some embodiments of the MDGAAT. In some implementations the user 201 may initiate an action by providing both a physical gesture 202 and a vocal command 203… capture the gesture via a camera on the electronic device 207… the camera may take a burst of photos. In some implementations, the recording may begin when the user presses a button on the electronic device indicating that the user would like to initiate an action]), Purves does not explicitly disclose detecting, using a depth sensor, a predetermined motion and capturing an image when the predetermined motion is detected by the depth sensor. However, in the field of identifying objects that are targets of a directional gesture of a user (abstract), Di Censo et al., hereinafter Di Censo, teaches a user interacting with an information assistant by either pressing a button or interacting with an I/O device (¶0037) by performing a gesture which is detected using an I/O device (Figs. 2A and 2B; ¶¶0033-0035) which can be one or more depth sensors (¶0028 [I/O devices may include one or more types of sensors…including depth sensors]) and based on the motion-based trigger event detected from the I/O device such as holding a hand gesture, a limb gesture, eye gaze gesture for a specified period of time, and once the information assistant determines that a directional gesture and trigger event have been received, the information assistant isolates data associated with the object of interest by activating a camera to capture one or more images of an object of interest (Figs. 2A, 2B, 4A, 4B, 8; ¶¶0057-0064 [the processing unit 102 analyzes sensor data (e.g., image data and/or depth data) received via one or more I/O devices to determine whether a directional gesture (e.g., hand/arm pointing, eye gaze, voice prompt, etc.) performed by a user intersects an object in the surrounding environment… an accelerometer and/or gyroscope may determine that the information assistant 100 has been moved from a first position (e.g., at the user's side, facing the ground) to a second position (e.g., pointed out in front of the user, facing a direction that is substantially parallel to the ground).
Accordingly, in various embodiments, the processing unit 102 may determine that an object 210 is being targeted by a directional gesture—and may also determine directional data associated with the directional gesture—when a user points at an object 210, looks at an object 210, points the information assistant 100 at an object 210… If the processing unit 102 determines that a directional gesture is targeting an object of interest 210, then the method proceeds to step 820… at step 820, the processing unit 102 determines whether a trigger event is received while the object 210 is being targeted by the directional gesture… trigger events that are recognized by the information assistant 100 may include… motion-based trigger events, time-based trigger events, input device trigger events, implicit trigger events, and the like. In some embodiments, a trigger event is detected via sensor data… received from one or more I/O devices… a motion-based trigger may be detected by analyzing data received from an accelerometer and/or gyroscope to determine that the information assistant 100 has been moved (e.g., rotated, lifted, shook, etc.) in a particular manner… a time-based trigger event may be detected by analyzing data received via… one or more I/O devices to determine that an object 210 has been targeted by a directional gesture for a specified period of time (e.g., 1 to 3 seconds). For example, and without limitation, the processing unit 102 may determine that a hand gesture, limb gesture, eye gaze gesture, etc. has been targeting an object 210 for a specified period of time… If a trigger event is received while the object 210 is being targeted by the directional gesture, then the method 800 proceeds to step 830… At step 830, the information assistant 100 acquires sensor data associated with the object of interest 210 being targeted by the directional gesture. The information assistant 100 may acquire sensor data via one or more… cameras 120. 
For example, and without limitation, as described above, the information assistant 100 may acquire image data… via one or more image sensors… at step 840, the processing unit 102 analyzes one or more types of sensor data associated with the object of interest 210 to determine at least one characteristic of the object of interest 210] in view of ¶¶0036-0037, ¶0039). The step of Di Censo is applicable to the method of Purves as they share characteristics and capabilities, namely, they are directed to recognizing gestures and performing predetermined processes in response to the gestures. It would have been obvious to one of ordinary skill in the art at the time of filing to modify the sensor that detects the predetermined motion of a user as taught by Purves with the depth sensor as taught by Di Censo. One of ordinary skill in the art at the time of filing would have been motivated to expand the method of Purves in order to recognize motion based trigger events and gestures (¶¶0034-0037) and more effectively acquire information about objects in the user’s environment (¶0007). 

While Purves discloses recognizing a plurality of user gestures relating to a product, capturing images relating to the product and gestures, displaying information in response to the captured images, and acquiring product information related to the specified product related to a listing, Purves does not explicitly disclose the user is a seller, the product is owned by the seller, and performing a listing process regarding an electronic transaction for the product for listing the product for sale on the electronic transaction platform while the image sensor is capturing the product, the listing process including setting a listing price of the product based on the acquired listing price. However, in the field of identifying an object in a live feed image and displaying information associated with the identified object (abstract), Zheng et al., hereinafter Zheng, teaches a user with a worn computing device with a camera capturing digital images of a physical environment such as a living room (¶0021) with objects that are owned by the user (¶0023), performing object recognition to recognize objects that are included within the digital image (¶0025), generating and displaying augmented reality digital content in a user interface of the computing device as part of a “live feed” of digital images taken of the physical environment including characteristics of the object and a price for which the object is available for sale or purchase via an online auction (¶0026), and supporting the sale of the recognized object by providing to the user information (e.g., metadata) from a variety of different types of service provider systems relating to sale of the object such as the product name, product description, and price for the sale based on online auctions, displaying a selectable option to sell the product at the sale price, and generating automated listings to sell one or more objects individually or as a whole (Figs. 3-6, ¶0032, ¶0037, ¶0051, and ¶0054; claims 3, 8, 9, and 20). The step of Zheng is applicable to the method of Purves in view of Di Censo as they share characteristics and capabilities, namely, they are directed to identifying objects in an image and retrieving information associated with the object. It would have been obvious to one of ordinary skill in the art at the time of filing to modify the user that performs the predetermined actions associated with gestures as taught by Purves in view of Di Censo with the user being a seller and listing for sale of a selected item owned by the seller as taught by Zheng. One of ordinary skill in the art at the time of filing would have been motivated to expand the method of Purves in view of Di Censo in order to aid in the sale of objects recognized in digital images collected from a digital camera (¶0054).

Regarding claim 12, Purves in view of Di Censo and Zheng teaches the method of claim 11. While Purves discloses displaying product information (Figs. 2A, 2B, 12G [displaying Antique Lamp $99.99]; ¶0214 [the V-GLASSES may provide information labels with regard to the item identifying information, availability at local stores, availability on online merchants 148, and/or the like (e.g., various merchants, retailers may inject advertisements related products for the consumer to view, etc.)]), Purves does not explicitly disclose wherein the listing comprises using the product information to list the product for sale. However, Zheng further teaches obtaining information (e.g., metadata) from a variety of different types of service provider systems relating to sale of the object such as the product name, product description, and price for the sale based on online auctions and generating an automated listing using the collected metadata (Figs. 3-6, ¶0032, ¶0037, ¶0051, and ¶0054; claims 3, 8, 9, and 20). One of ordinary skill in the art at the time of filing would have been motivated to expand the method of Purves in view of Di Censo in order to aid in the sale of objects recognized in digital images collected from a digital camera (¶0054).

Regarding claim 16, Purves in view of Di Censo and Zheng teaches the method of claim 11. Purves further discloses 
	capturing a third image including a third gesture and the product (Fig. 13B; ¶0249 [the consumer may provide an indication of interests 231 a (e.g., see 427 a-b in FIG. 15E; tapping an “add to cart” button, etc.)… the CSR may in turn provide detailed information and/or add the item to shopping cart 233 a (e.g., see 439 in FIG. 4G) to the consumer per consumer request], ¶0350 [when a consumer uses a mobile device to capture a reality scene (e.g., 2003/2004), V-GLASSES may determine a type of the object in the captured visual scene 2036, e.g., an item, card, barcode, receipt, etc… V-GLASSES may correlate the search term with product information 2044 (e.g., include price comparison information if the user is interested in finding the lowest price of a product, etc.), and generate an information layer for the virtual overlay 2049. In one implementation, the V-GLASSES may optionally capture mixed gestures within the captured reality scene 2029, e.g., consumer motion gestures, verbal gestures by articulating a command, etc. (see FIGS. 32-41) in view of Figs. 2A, 2B, 12G, ¶0103 and ¶0214]); 
	recognize the third gesture based at least in part on the third image (Fig. 13B; ¶0107 [MDGAAT may use an image processing component in order to process the video and/or images 310 and determine what the gesture is], ¶0164 [stored gestures], ¶0249 [the consumer may provide an indication of interests 231 a (e.g., see 427 a-b in FIG. 15E; tapping an “add to cart” button, etc.)… the CSR may in turn provide detailed information and/or add the item to shopping cart 233 a (e.g., see 439 in FIG. 4G) to the consumer per consumer request], ¶0350 [when a consumer uses a mobile device to capture a reality scene (e.g., 2003/2004), V-GLASSES may determine a type of the object in the captured visual scene 2036, e.g., an item, card, barcode, receipt, etc… V-GLASSES may correlate the search term with product information 2044 (e.g., include price comparison information if the user is interested in finding the lowest price of a product, etc.), and generate an information layer for the virtual overlay 2049. In one implementation, the V-GLASSES may optionally capture mixed gestures within the captured reality scene 2029, e.g., consumer motion gestures, verbal gestures by articulating a command, etc. (see FIGS. 32-41) in view of Figs. 2A, 2B, 12G, ¶0103 and ¶0214]); and
	register the product in a predetermined list in response to the recognized third gesture of the seller (Fig. 13B; ¶0249 [the consumer may provide an indication of interests 231 a (e.g., see 427 a-b in FIG. 15E; tapping an “add to cart” button, etc.)… the CSR may in turn provide detailed information and/or add the item to shopping cart 233 a (e.g., see 439 in FIG. 4G) to the consumer per consumer request]; Examiner notes “adding the item to a shopping cart” is comparable to registering the product in a predetermined list).

Regarding claim 1, Purves discloses a wearable terminal (Fig. 55; abstract; ¶0183 [a V-GLASSES device may take a form similar to a pair of eyeglasses, which may provide an enhanced view with virtual information labels atop the captured reality scene to a consumer who wears the V-GLASSES device]) comprising: 
	a display (Fig. 55; ¶0184 [the V-GLASSES device may have a plurality of sensors and mechanisms including… a flip down transparent/semi-transparent/opaque LED screen element within the wearer's field of view]); 
	an image sensor configured to capture a first image (Fig. 55; ¶0184 [the V-GLASSES device may have a plurality of sensors and mechanisms including, but not limited to: front facing camera to capture a wearer's line of sight]) comprising a first gesture and a product (Figs. 2A, 2B, 12G; ¶0103 [As shown in FIG. 2 b, in some implementations, the electronic device 206 may process the audio and gesture data itself 218, and may also have a library of possible gestures that it may match 219 with the processed audio and gesture data to], ¶0107 [MDGAAT may use an image processing component in order to process the video and/or images 310 and determine what the gesture is] in view of ¶0214 [With reference to FIG. 12G, a consumer 120 may utilize a “framing” gesture to select an item in the scene. For example, a consumer 120 may “frame” an antique desk lamp 147] and ¶0164);
	a sensor configured to detect a predetermined motion of a user (Fig. 55; ¶0184 [the V-GLASSES device may have a plurality of sensors and mechanisms including, but not limited to: front facing camera to capture a wearer's line of sight… dual microphones, one having a conical listening position pointing towards the wearer's mouth… infrared/laser projector in the upper portion of the glasses distally placed from a screen element and usable for projecting rich media…], ¶0200 [V-GLASSES may project option buttons on a surface and the consumer may tap the projected buttons to make a selection] in view of ¶0097 [the recording may begin when the user presses a button on the electronic device indicating that the user would like to initiate an action; Examiner notes pressing a button on the device is comparable to a sensor detecting a predetermined motion of a user]); and
	a processor (Fig. 55; ¶¶0472-0473), wherein the processor is configured to: 
		cause the image sensor to start capturing when the predetermined motion of the user is detected by the sensor (Figs. 2A and 2B; ¶0097 [FIGS. 2A-B show data flow diagrams illustrating processing gesture and vocal commands in some embodiments of the MDGAAT. In some implementations the user 201 may initiate an action by providing both a physical gesture 202 and a vocal command 203… capture the gesture via a camera on the electronic device 207… the camera may take a burst of photos. In some implementations, the recording may begin when the user presses a button on the electronic device indicating that the user would like to initiate an action]);
		recognize the first gesture of the user specifying the product, based at least in part on the first image (Figs. 2A, 2B, 12G; ¶0103 [As shown in FIG. 2 b, in some implementations, the electronic device 206 may process the audio and gesture data itself 218, and may also have a library of possible gestures that it may match 219 with the processed audio and gesture data to], ¶0107 [MDGAAT may use an image processing component in order to process the video and/or images 310 and determine what the gesture is] in view of ¶0214 [With reference to FIG. 12G, a consumer 120 may utilize a “framing” gesture to select an item in the scene. For example, a consumer 120 may “frame” an antique desk lamp 147] and ¶0164); 
		acquire product information related to the product specified and related to a listing of the product on an electronic transaction platform for product's purchase and sale, in response to the recognized first gesture, the product information including a listing price when the product is listed on the electronic transaction platform (Figs. 2A, 2B, 12G [displaying Antique Lamp $99.99]; ¶0103 [the electronic device 206 may process the…gesture data itself 218, and may also have a library of possible gestures that it may match 219 with the processed…gesture data to. The electronic device may then send in the command message 220 the actions to be performed], ¶0214 [a consumer 120 may “frame” an antique desk lamp 147… the V-GLASSES may provide information labels with regard to the item identifying information, availability at local stores, availability on online merchants 148, and/or the like (e.g., various merchants, retailers may inject advertisements related products for the consumer to view, etc.)]; Examiner notes Fig. 12G displays product information including a price of the lamp from a merchant. The listed price that is displayed is the “listing price when the product is listed on the electronic transaction platform”); 
		show a visible output of the product information on the display (Figs. 2A, 2B, 12G; ¶0104 [The MDGAAT may then perform the action specified 221, accessing any information necessary to conduct the action 222, and may send a confirmation page or AR overlay to the user 223] in view of ¶0214 [the V-GLASSES may provide information labels with regard to the item identifying information, availability at local stores, availability on online merchants 148, and/or the like (e.g., various merchants, retailers may inject advertisements related products for the consumer to view, etc.)]); 
		recognize a second gesture of the user, based at least in part on a second image comprising the second gesture and the product, the second image being captured by the image sensor (FIG. 27B-27D; ¶¶0328-0333 [V-GLASSES may categorize information overlays into different layers, e.g., a merchant information layer to provide merchant information with regard to the captured items in the scene, a retail information layer to provide retail inventory information with regard to the captured items in the scene, a social information layer to provide ratings, reviews, comments and/or other related social media feeds with regard to the captured items in the scene, and/or the like. For example, when V-GLASSES captures a scene that contains different objects, different layers of information with regard to different objects (e.g., a trademark logo, a physical object, a sales receipt, and/or the like) may be overlay on top of the captured scene… a consumer may slide the information layer 1611 a to obtain another layer, e.g., retail information 1611 b, social information 1611 c, item information 1611 d, and/or the like… a consumer may tap on the provided virtual label of a “Cartier” store, e.g., 1613, 1623, etc., and be directed to a store map including inventory information, e.g., as shown in FIG. 16B… a consumer may slide the virtual label overlay layer to view another layer of information labels, e.g., social information 1611 c, item information 1611 d, and/or the like], ¶0107 [MDGAAT may use an image processing component in order to process the video and/or images 310 and determine what the gesture is] in view of Figs. 2A, 2B, 12G, ¶0103, ¶0164, and ¶0214); and
		in response to the recognized second gesture of the seller, perform a listing process (FIG. 27B-27D; ¶¶0328-0333; Examiner notes obtaining and displaying additional layers to view item information and retail information is comparable to performing a listing process and displaying the layers in response to a “slide” is comparable to recognizing a second gesture). 

While Purves discloses detecting a predetermined motion using a sensor and capturing an image when the predetermined motion is detected by the sensor (Figs. 2A and 2B; ¶0097 [the recording may begin when the user presses a button on the electronic device indicating that the user would like to initiate an action; Examiner notes pressing a button on the device is comparable to a sensor detecting a predetermined motion of a user] in view of ¶0097 [FIGS. 2A-B show data flow diagrams illustrating processing gesture and vocal commands in some embodiments of the MDGAAT. In some implementations the user 201 may initiate an action by providing both a physical gesture 202 and a vocal command 203… capture the gesture via a camera on the electronic device 207… the camera may take a burst of photos. In some implementations, the recording may begin when the user presses a button on the electronic device indicating that the user would like to initiate an action]), Purves does not explicitly disclose using a depth sensor configured to detect a predetermined motion and start capturing an image when the predetermined motion is detected by the depth sensor. However, in the field of identifying objects that are targets of a directional gesture of a user (abstract), Di Censo et al., hereinafter Di Censo, teaches a user interacting with an information assistant by either pressing a button or interacting with an I/O device (¶0037) by performing a gesture which is detected using an I/O device (Figs. 2A and 2B; ¶¶0033-0035) which can be one or more depth sensors (¶0028 [I/O devices may include one or more types of sensors…including depth sensors]) and based on the motion-based trigger event detected from the I/O device such as holding a hand gesture, a limb gesture, eye gaze gesture for a specified period of time, and once the information assistant determines that a directional gesture and trigger event have been received, the information assistant isolates data associated with the object of interest by activating a camera to capture one or more images of an object of interest (Figs. 2A, 2B, 4A, 4B, 8; ¶¶0057-0064 [the processing unit 102 analyzes sensor data (e.g., image data and/or depth data) received via one or more I/O devices to determine whether a directional gesture (e.g., hand/arm pointing, eye gaze, voice prompt, etc.) performed by a user intersects an object in the surrounding environment… an accelerometer and/or gyroscope may determine that the information assistant 100 has been moved from a first position (e.g., at the user's side, facing the ground) to a second position (e.g., pointed out in front of the user, facing a direction that is substantially parallel to the ground).
Accordingly, in various embodiments, the processing unit 102 may determine that an object 210 is being targeted by a directional gesture—and may also determine directional data associated with the directional gesture—when a user points at an object 210, looks at an object 210, points the information assistant 100 at an object 210… If the processing unit 102 determines that a directional gesture is targeting an object of interest 210, then the method proceeds to step 820… at step 820, the processing unit 102 determines whether a trigger event is received while the object 210 is being targeted by the directional gesture… trigger events that are recognized by the information assistant 100 may include… motion-based trigger events, time-based trigger events, input device trigger events, implicit trigger events, and the like. In some embodiments, a trigger event is detected via sensor data… received from one or more I/O devices… a motion-based trigger may be detected by analyzing data received from an accelerometer and/or gyroscope to determine that the information assistant 100 has been moved (e.g., rotated, lifted, shook, etc.) in a particular manner… a time-based trigger event may be detected by analyzing data received via… one or more I/O devices to determine that an object 210 has been targeted by a directional gesture for a specified period of time (e.g., 1 to 3 seconds). For example, and without limitation, the processing unit 102 may determine that a hand gesture, limb gesture, eye gaze gesture, etc. has been targeting an object 210 for a specified period of time… If a trigger event is received while the object 210 is being targeted by the directional gesture, then the method 800 proceeds to step 830… At step 830, the information assistant 100 acquires sensor data associated with the object of interest 210 being targeted by the directional gesture. The information assistant 100 may acquire sensor data via one or more… cameras 120. 
For example, and without limitation, as described above, the information assistant 100 may acquire image data… via one or more image sensors… at step 840, the processing unit 102 analyzes one or more types of sensor data associated with the object of interest 210 to determine at least one characteristic of the object of interest 210] in view of ¶¶0036-0037, ¶0039). The system of Di Censo is applicable to the system of Purves as they share characteristics and capabilities, namely, they are directed to recognizing gestures and performing predetermined processes in response to the gestures. It would have been obvious to one of ordinary skill in the art at the time of filing to modify the sensor that detects the predetermined motion of a user as taught by Purves with the depth sensor as taught by Di Censo. One of ordinary skill in the art at the time of filing would have been motivated to expand the system of Purves in order to recognize motion based trigger events and gestures (¶¶0034-0037) and more effectively acquire information about objects in the user’s environment (¶0007). 

While Purves discloses recognizing a plurality of user gestures relating to a product, capturing images relating to the product and gestures, displaying information in response to the captured images, and acquiring product information related to the specified product related to a listing, Purves does not explicitly disclose the user is a seller, the product is owned by the seller, and performing a listing process regarding an electronic transaction for the product for listing the product for sale on the electronic transaction platform for the product of which the product information is shown on the display after the image sensor has captured the product, the listing process including setting a listing price of the product based on the acquired listing price. However, in the field of identifying an object in a live feed image and displaying information associated with the identified object (abstract), Zheng et al., hereinafter Zheng, teaches a user with a worn computing device with a camera capturing digital images of a physical environment such as a living room (¶0021) with objects that are owned by the user (¶0023), performing object recognition to recognize objects that are included within the digital image (¶0025), generating and displaying augmented reality digital content in a user interface of the computing device as part of a “live feed” of digital images taken of the physical environment including characteristics of the object and a price for which the object is available for sale or purchase via an online auction (¶0026), and supporting the sale of the recognized object by providing to the user information (e.g., metadata) from a variety of different types of service provider systems relating to sale of the object such as the product name, product description, and price for the sale based on online auctions, displaying a selectable option to sell the product at the sale price, and generating automated listings to sell one or more objects individually or as a whole (Figs. 3-6, ¶0032, ¶0037, ¶0051, and ¶0054; claims 3, 8, 9, and 20). The system of Zheng is applicable to the system of Purves in view of Di Censo as they share characteristics and capabilities, namely, they are directed to identifying objects in an image and retrieving information associated with the object. It would have been obvious to one of ordinary skill in the art at the time of filing to modify the user that performs the predetermined actions associated with gestures as taught by Purves in view of Di Censo with the user being a seller and listing for sale of a selected item owned by the seller as taught by Zheng. One of ordinary skill in the art at the time of filing would have been motivated to expand the system of Purves in view of Di Censo in order to aid in the sale of objects recognized in digital images collected from a digital camera (¶0054). 

Regarding claim 2, Purves in view of Di Censo and Zheng teaches the wearable terminal according to claim 1. While Purves discloses displaying product information (Figs. 2A, 2B, 12G [displaying Antique Lamp $99.99]; ¶0214 [the V-GLASSES may provide information labels with regard to the item identifying information, availability at local stores, availability on online merchants 148, and/or the like (e.g., various merchants, retailers may inject advertisements related products for the consumer to view, etc.)]), Purves does not explicitly disclose wherein the listing process further comprises using the product information to list the product for sale. However Zheng further teaches obtaining information (e.g., metadata) from a variety of different types of service provider systems relating to sale of the object such as the product name, product description, and price for the sale based on online auctions and generating an automated listing using the collected metadata (Figs. 3-6, ¶0032, ¶0037, ¶0051, and ¶0054; claims 3, 8, 9, and 20). One of ordinary skill in the art at the time of filing would have been motivated to expand the system of Purves in view of Di Censo in order to aid in the sale of objects recognized in digital images collected from a digital camera (¶0054).

Regarding claim 6, Purves in view of Di Censo and Zheng teaches the wearable terminal according to claim 1. Purves further discloses wherein the image sensor is further configured to capture a third image including a third gesture and the product (Fig. 13B; ¶0249 [the consumer may provide an indication of interests 231 a (e.g., see 427 a-b in FIG. 15E; tapping an “add to cart” button, etc.)… the CSR may in turn provide detailed information and/or add the item to shopping cart 233 a (e.g., see 439 in FIG. 4G) to the consumer per consumer request], ¶0350 [when a consumer uses a mobile device to capture a reality scene (e.g., 2003/2004), V-GLASSES may determine a type of the object in the captured visual scene 2036, e.g., an item, card, barcode, receipt, etc… V-GLASSES may correlate the search term with product information 2044 (e.g., include price comparison information if the user is interested in finding the lowest price of a product, etc.), and generate an information layer for the virtual overlay 2049. In one implementation, the V-GLASSES may optionally capture mixed gestures within the captured reality scene 2029, e.g., consumer motion gestures, verbal gestures by articulating a command, etc. (see FIGS. 32-41) in view of Figs. 2A, 2B, 12G, ¶0103 and ¶0214]) and wherein the processor is further configured to: 
	recognize the third gesture based at least in part on the third image (Fig. 13B; ¶0107 [MDGAAT may use an image processing component in order to process the video and/or images 310 and determine what the gesture is], ¶0164 [stored gestures], ¶0249 [the consumer may provide an indication of interests 231 a (e.g., see 427 a-b in FIG. 15E; tapping an “add to cart” button, etc.)… the CSR may in turn provide detailed information and/or add the item to shopping cart 233 a (e.g., see 439 in FIG. 4G) to the consumer per consumer request], ¶0350 [when a consumer uses a mobile device to capture a reality scene (e.g., 2003/2004), V-GLASSES may determine a type of the object in the captured visual scene 2036, e.g., an item, card, barcode, receipt, etc… V-GLASSES may correlate the search term with product information 2044 (e.g., include price comparison information if the user is interested in finding the lowest price of a product, etc.), and generate an information layer for the virtual overlay 2049. In one implementation, the V-GLASSES may optionally capture mixed gestures within the captured reality scene 2029, e.g., consumer motion gestures, verbal gestures by articulating a command, etc. (see FIGS. 32-41) in view of Figs. 2A, 2B, 12G, ¶0103 and ¶0214]); and
	register the product in a predetermined list in response to the recognized third gesture of the seller (Fig. 13B; ¶0249 [the consumer may provide an indication of interests 231 a (e.g., see 427 a-b in FIG. 15E; tapping an “add to cart” button, etc.)… the CSR may in turn provide detailed information and/or add the item to shopping cart 233 a (e.g., see 439 in FIG. 4G) to the consumer per consumer request]; Examiner notes “adding the item to a shopping cart” is comparable to registering the product in a predetermined list).

Regarding claim 7, Purves in view of Di Censo and Zheng teaches the wearable terminal according to claim 1. Purves further discloses wherein the product information of the product is selected via an object used for the first gesture (FIG. 27B-27D; ¶¶0328-0333 [V-GLASSES may categorize information overlays into different layers, e.g., a merchant information layer to provide merchant information with regard to the captured items in the scene, a retail information layer to provide retail inventory information with regard to the captured items in the scene, a social information layer to provide ratings, reviews, comments and/or other related social media feeds with regard to the captured items in the scene, and/or the like. For example, when V-GLASSES captures a scene that contains different objects, different layers of information with regard to different objects (e.g., a trademark logo, a physical object, a sales receipt, and/or the like) may be overlay on top of the captured scene… a consumer may slide the information layer 1611 a to obtain another layer, e.g., retail information 1611 b, social information 1611 c, item information 1611 d, and/or the like…  a consumer may tap on the provided virtual label of a “Cartier” store, e.g., 1613, 1623, etc., and be directed to a store map including inventory information, e.g., as shown in FIG. 16B… a consumer may slide the virtual label overlay layer to view another layer of information labels, e.g., social information 1611 c, item information 1611 d, and/or the like], ¶0107 [MDGAAT may use an image processing component in order to process the video and/or images 310 and determine what the gesture is] in view of Figs. 2A, 2B, 12G, ¶0103, ¶0164, and ¶0214]).

Regarding claim 8, Purves in view of Di Censo and Zheng teaches the wearable terminal according to claim 7. Purves further discloses wherein the object is a finger of the seller (¶0343 [upon receiving user finger indication, the V-GLASSES may obtain an image of the scene (or the user finger pointed portion) 2006, e.g., grabbing a video frame, etc. In one implementation, the V-GLASSES may detect fingertip position within the video frame, and determine an object around the fingertip position for recognition 2007. The V-GLASSES may then perform OCR and/or pattern recognition on the obtained image (e.g., around the fingertip position) 2008 to determine a type of the object in the image 2010. For example, in one implementation, the V-GLASSES may start from the finger point and scan outwardly to perform edge detection so as to determine a contour of the object]).

Regarding claim 9, Purves in view of Di Censo and Zheng teaches the wearable terminal according to claim 7. Purves further discloses wherein the processor is further configured to delete the product information from the display, in a case where the object is no longer positioned in a predetermined region in another image captured by the image sensor and the another image is shown on the display (FIG. 27B-27D; ¶¶0328-0333; Examiner notes “swiping” is comparable to where the object is no longer positioned in a predetermined region and displaying another layer is comparable to deleting the product information from the display). 

Regarding claims 10, 17, 18, and 20, the claims recite substantially the same limitations as claims 1, 2, and 6. All limitations as recited have been analyzed and rejected with respect to claims 1, 2, and 6, and do not introduce any additional narrowing of the scopes of the claims as analyzed. Examiner notes that while claim 20 recites a “fourth” image and a “fourth” gesture, claim 10, from which claim 20 depends, only recites a “first” and “second” image and gesture. Therefore the “fourth” gesture and image are interpreted as an image and gesture different from the “first” and “second” image and gesture. Therefore, claims 10, 17, 18, and 20 are rejected for the same rationale over the prior art cited in claims 1, 2, and 6. 

Regarding claim 19, Purves in view of Di Censo and Zheng teaches the information processing terminal according to claim 10. Purves further discloses wherein the listing process further comprises starting a purchase process to purchase, on the electronic platform, another product recognized by the first gesture (Fig. 13B; ¶0249 [the consumer may provide an indication of interests 231 a (e.g., see 427 a-b in FIG. 15E; tapping an “add to cart” button, etc.)… the CSR may in turn provide detailed information and/or add the item to shopping cart 233 a (e.g., see 439 in FIG. 4G) to the consumer per consumer request], ¶0350 [when a consumer uses a mobile device to capture a reality scene (e.g., 2003/2004), V-GLASSES may determine a type of the object in the captured visual scene 2036, e.g., an item, card, barcode, receipt, etc… V-GLASSES may correlate the search term with product information 2044 (e.g., include price comparison information if the user is interested in finding the lowest price of a product, etc.), and generate an information layer for the virtual overlay 2049. In one implementation, the V-GLASSES may optionally capture mixed gestures within the captured reality scene 2029, e.g., consumer motion gestures, verbal gestures by articulating a command, etc. (see FIGS. 32-41) in view of Figs. 2A, 2B, 12G, ¶0103 and ¶0214]), wherein the image sensor is further configured to capture a third image including a third gesture and the product (Fig. 13B; ¶0249 [the consumer may provide an indication of interests 231 a (e.g., see 427 a-b in FIG. 15E; tapping an “add to cart” button, etc.)… the CSR may in turn provide detailed information and/or add the item to shopping cart 233 a (e.g., see 439 in FIG. 4G) to the consumer per consumer request… the consumer may submit a payment interest indication 231 b (e.g., by tapping on a “pay” button), and the CSR may present a purchasing page 233 b (e.g., an item information checkout page with a QR code, see 442 in FIG. 15H) to the consumer 202], ¶0350 [when a consumer uses a mobile device to capture a reality scene (e.g., 2003/2004), V-GLASSES may determine a type of the object in the captured visual scene 2036, e.g., an item, card, barcode, receipt, etc… V-GLASSES may correlate the search term with product information 2044 (e.g., include price comparison information if the user is interested in finding the lowest price of a product, etc.), and generate an information layer for the virtual overlay 2049. In one implementation, the V-GLASSES may optionally capture mixed gestures within the captured reality scene 2029, e.g., consumer motion gestures, verbal gestures by articulating a command, etc. (see FIGS. 32-41) in view of Figs. 2A, 2B, 12G, ¶0103 and ¶0214]), and wherein the processor is further configured to: 
	recognize the third gesture based at least in part on the third image (Fig. 13B; ¶0107 [MDGAAT may use an image processing component in order to process the video and/or images 310 and determine what the gesture is], ¶0164 [stored gestures], ¶0249 [the consumer may provide an indication of interests 231 a (e.g., see 427 a-b in FIG. 15E; tapping an “add to cart” button, etc.)… the CSR may in turn provide detailed information and/or add the item to shopping cart 233 a (e.g., see 439 in FIG. 4G) to the consumer per consumer request… the consumer may submit a payment interest indication 231 b (e.g., by tapping on a “pay” button), and the CSR may present a purchasing page 233 b (e.g., an item information checkout page with a QR code, see 442 in FIG. 15H) to the consumer 202], ¶0350 [when a consumer uses a mobile device to capture a reality scene (e.g., 2003/2004), V-GLASSES may determine a type of the object in the captured visual scene 2036, e.g., an item, card, barcode, receipt, etc… V-GLASSES may correlate the search term with product information 2044 (e.g., include price comparison information if the user is interested in finding the lowest price of a product, etc.), and generate an information layer for the virtual overlay 2049. In one implementation, the V-GLASSES may optionally capture mixed gestures within the captured reality scene 2029, e.g., consumer motion gestures, verbal gestures by articulating a command, etc. (see FIGS. 32-41) in view of Figs. 2A, 2B, 12G, ¶0103 and ¶0214]); and 
	accept to purchase the another product in response to the recognized third gesture (¶0249 [the consumer may submit a payment interest indication 231 b (e.g., by tapping on a “pay” button), and the CSR may present a purchasing page 233 b (e.g., an item information checkout page with a QR code, see 442 in FIG. 15H) to the consumer 202] in view of Figs. 2A, 2B, 12G, ¶0103 and ¶0214). 


Examiner’s Comment

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
	Reference C of the Notice of References Cited, Timonen et al. (US 2019/0080494 A1), discloses capturing an image of an object, identifying the object, collecting metadata regarding the object, generating a listing based on the metadata, and allowing a user to list the item for sale. 


Response to Arguments
Applicant’s arguments filed 9/22/2022, with respect to the previous 35 U.S.C. 112 rejections have been fully considered and are persuasive in view of the amended claims. Accordingly the previous 35 U.S.C. 112 rejections are withdrawn. 
	Applicant’s arguments filed 9/22/2022, with respect to the previous 35 USC §103 rejections have been fully considered but are moot in view of the new 35 USC §103 rejections applied to applicant’s amended claims.


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LINDSEY B SMITH whose telephone number is (571)272-0519. The examiner can normally be reached Monday - Friday 9-6 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeff Smith, can be reached at 571-272-6763. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

LINDSEY B. SMITH
Examiner
Art Unit 3625



/LINDSEY B SMITH/Examiner, Art Unit 3625
/MICHAEL MISIASZEK/Primary Examiner, Art Unit 3625