Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments/Remarks 
Claims 1, 8, and 15 have been amended. 
Claims 1-20 remain pending.
Please refer to the action below.

Examiner Notes
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. However, the claimed subject matter, not the specification, is the measure of the invention. 

Responses to Arguments/Remarks
Applicant’s arguments/remarks of 10/21/2022, which erroneously addressed the prior rejections, state: “Applicant respectfully submits that Claim 1 is patentable because Jeong does not disclose, teach, or suggest at least the above-emphasized features of Claim 1. To the extent the Office is attempting to equate the augmented reality object of Jeong to the claimed "image information", the recipe of Jeong to the claimed "text information", and the necessary/unnecessary indications of Joeng to the claimed "association information", Claim 1 recites "receiving, via the UI, a first user input in association with the image information, and a second user input in association with the text information to associate respective image information of the image information with corresponding text information of the text information." Jeong does not disclose receiving any such first user input and second user input, nor does Jeong disclose that the second user input associates respective image information of the image information with corresponding text information of the text information. Therefore, Jeong's purported disclosure of augmented reality objects, recipes, and necessary/unnecessary indications does not disclose the cited features of Claim 1.
Moreover, at least because Jeong does not disclose, teach, or suggest "receiving, via the UI, a first user input in association with the image information, and a second user input in association with the text information to associate respective image information of the image information with corresponding text information of the text information," as recited in Claim 1, Jeong necessarily does not disclose, teach, or suggest "generate association information that associates the respective image information with the corresponding text information, based on the first user input and the second user input," as further recited in Claim 1.” These arguments have been considered; however, they are moot in light of the new ground of rejection.
 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Jeong et al. (US 20200088463, previously cited) in view of Heindorf et al. (CN 110096576, A1).

     Regarding claim 1, Jeong teaches a device for providing a user interface (UI) for generating training data for an artificial intelligence (AI) model (the artificial intelligence system of at least para. 0003 and 0245-0247, comprising said device for providing said UI for generating training data for an AI model),
the device comprising: 
a memory (para. 0138-0140, memory 120) configured to store instructions; and 
a processor (processor 110 of para. 0190-0192) configured to execute the instructions to:
provide, for display via the UI, image information that depicts an object, a set of operations of the object, and a process associated with the set of operations (the UI of at least para. 0211 and 0273-0277 and Fig. 18 provides, for display, instructional image information of at least a dish preparation recipe object, a set of ingredients and recipe operations of the object, and an implied guidance process or the like associated with said set of operations);
provide, for display via the UI, text information that describes the object, the set of operations of the object, and the process associated with the set of operations (at least the text recipe of para. 0211 and 0273-0277 and Fig. 18 is provided for display and describes the object, the set of operations of the object, and said process associated with the set of operations);
receive, via the UI, a user input that associates respective image information of the image information with corresponding display text information (the user inputs of at least para. 0157, 0218, and 0273, which include at least touch or spoken inputs received via the UI, associate respective image information of the image information with corresponding displayed text information of the text information);
and generate association information that associates the respective image information with the corresponding display text information, based on the user input (the system further displays, in para. 0157, 0212, and 0329, association information that associates the respective image information with the corresponding text information, based on the user input).
   However, Jeong is silent regarding receiving a first user input in association with the image information, and a second user input in association with the text information, to associate respective image information of the image information with corresponding text information of the text information; and generating said association information based on the first user input and the second user input.
     Heindorf teaches, in at least Fig. 1, a user interface receiving and displaying webpage content 120 and depicted video and images. Heindorf further teaches in the description receiving user text query information 102, comprising at least one of “how our breaks the sealing paint tank” and a second plurality of other text queries such as “how to plastering wall. preparing region, by breaking the seal and opening a tank using a screwdriver cover paint”, to associate respective image information of the image information with corresponding text information of the text information. A search environment 100 is further configured to receive, based on at least the queries 102, at least a first user input in association with video tutorial images or do-it-yourself image information. The system further associates text input data with at least the queried displayed video data, which may obviously have been captured or obtained by known means, to associate respective tutorial image information of the image information with said corresponding text information of the text information. The system is further configured with an artificial intelligence capability to provide the query result 105 indicative of said association information of a step-by-step instruction based on the first user input and the second user input.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Jeong and Heindorf to include receiving, via the UI, said first user input in association with the image information and said second user input in association with the text information to associate respective image information of the image information with corresponding text information of the text information, and generating said association information based on the first user input and the second user input. Jeong and Heindorf are in the same field of endeavor of training a learning machine to learn and associate at least images of instruction tutorials or product images with corresponding text information, where the system may provide to a user, based on presented or depicted images and user request queries, step-by-step instruction tutorials corresponding to at least the request queries and predetermined trained data, and where the user of at least Heindorf may obtain said association information in a case of a requested plurality of text queries associated with the depicted video or images. The combination may further be realized according to known methods to yield predictable results, since known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art. Said combination is thus the adaptation of an old idea or invention using newer technology that is commonly available and understood in the art, and is thereby a variation on already known art (See MPEP 2143, KSR Exemplary Rationale F).

     Regarding claim 2 (according to claim 1), Jeong further teaches wherein the processor is further configured to: receive, via the UI, discourse parsing information that identifies a relation between image information of the association information (the system of at least para. 0146-0147 describes recognizing and associating user inputs, which in some cases may initially be impossible; said recognition may understandably comprise discourse parsing information that identifies a relation between captured image information of the association information); and generate annotated image information based on the discourse parsing information (at least para. 0273-0277 display captioned or annotated image information based on the discourse parsing information).

     Regarding claim 3 (according to claim 1), Jeong further teaches wherein the processor is further configured to: receive, via the UI, semantics parsing information that provides the text information of the association information in a machine-understandable format (para. 0197); and generate annotated image information based on the semantics parsing information (captioned information is generated in at least para. 0157, 0212, and 0329 based on the semantics parsing information of at least para. 0197).
 
     Regarding claim 4 (according to claim 1), Jeong further teaches wherein the processor is further configured to: input the association information into the AI model as training data for the AI model to permit the AI model to associate the respective image information with the corresponding text information (the system inputs, in para. 0245-0247, association information into the AI model as training data for the AI model to permit the AI model to associate the respective image information of para. 0273-0277 with the corresponding text information).

     Regarding claim 5 (according to claim 2), Jeong further teaches wherein the processor is further configured to: input the annotated image information into the AI model as training data for the AI model to permit the AI model to identify the relation between the image information of the association information (para. 0273-0277 teach a case of inputting captioned or annotated image information into the AI model of at least para. 0245-0247 as training data for the AI model to permit the AI model to identify the relation between the image information of the association information).

     Regarding claim 6 (according to claim 3), Jeong further teaches wherein the processor is further configured to: input the annotated image information into the AI model as training data for the AI model to permit the AI model to convert the text information of the association information to the machine-understandable format (para. 0273-0277 further teach a case of inputting captioned or annotated image information into the AI model of at least para. 0245-0247 as training data for the AI model to permit said AI model to convert the text information of the association information to the machine-understandable format).

     Regarding claim 7 (according to claim 1), Jeong further teaches wherein the image information is associated with an instructional video regarding the object, and wherein the text information corresponds to at least one of a product manual associated with the object, captions associated with the instructional video, or text input via the UI (para. 0273-0277 teach a case of displayed image information associated with an instructional video regarding a dish preparation object, wherein the text information corresponds to at least one of a product manual associated with the object, captions associated with the instructional video, or text input via the UI).

     Regarding claim 8, Jeong teaches a method for providing a user interface (UI) for generating training data for an artificial intelligence (AI) model (the artificial intelligence system of at least para. 0003 and 0245-0247, comprising a device for providing said UI for generating training data for an AI model),
the method comprising: 
providing, for display via the UI, image information that depicts an object, a set of operations of the object, and a process associated with the set of operations (the UI of at least para. 0211 and 0273-0277 and Fig. 18 provides, for display, instructional image information of at least a dish preparation recipe object, a set of ingredients and recipe operations of the object, and an implied guidance process or the like associated with said set of operations);
providing, for display via the UI, text information that describes the object, the set of operations of the object, and the process associated with the set of operations (at least the text recipe of para. 0211 and 0273-0277 and Fig. 18 is provided for display and describes the object, the set of operations of the object, and said process associated with the set of operations);
receiving, via the UI, a user input that associates respective image information of the image information with corresponding display text information (the user inputs of at least para. 0157, 0218, and 0273, which include at least touch or spoken inputs received via the UI, associate respective image information of the image information with corresponding displayed text information of the text information);
and generating association information that associates the respective image information with the corresponding display text information, based on the user input (the system further displays, in para. 0157, 0212, and 0329, association information that associates the respective image information with the corresponding text information, based on the user input).
   However, Jeong is silent regarding receiving a first user input in association with the image information, and a second user input in association with the text information, to associate respective image information of the image information with corresponding text information of the text information; and generating said association information based on the first user input and the second user input.
     Heindorf teaches, in at least Fig. 1, a user interface receiving and displaying webpage content 120 and depicted video and images. Heindorf further teaches in the description receiving user text query information 102, comprising at least one of “how our breaks the sealing paint tank” and a second plurality of other text queries such as “how to plastering wall. preparing region, by breaking the seal and opening a tank using a screwdriver cover paint”, to associate respective image information of the image information with corresponding text information of the text information. A search environment 100 is further configured to receive, based on at least the queries 102, at least a first user input in association with video tutorial images or do-it-yourself image information. The system further associates text input data with at least the queried displayed video data, which may obviously have been captured or obtained by known means, to associate respective tutorial image information of the image information with said corresponding text information of the text information. The system is further configured with an artificial intelligence capability to provide the query result 105 indicative of said association information of a step-by-step instruction based on the first user input and the second user input.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Jeong and Heindorf to include receiving, via the UI, said first user input in association with the image information and said second user input in association with the text information to associate respective image information of the image information with corresponding text information of the text information, and generating said association information based on the first user input and the second user input. Jeong and Heindorf are in the same field of endeavor of training a learning machine to learn and associate at least images of instruction tutorials or product images with corresponding text information, where the system may provide to a user, based on presented or depicted images and user request queries, step-by-step instruction tutorials corresponding to at least the request queries and predetermined trained data, and where the user of at least Heindorf may obtain said association information in a case of a requested plurality of text queries associated with the depicted video or images. The combination may further be realized according to known methods to yield predictable results, since known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art. Said combination is thus the adaptation of an old idea or invention using newer technology that is commonly available and understood in the art, and is thereby a variation on already known art (See MPEP 2143, KSR Exemplary Rationale F).

     Regarding claim 9 (according to claim 8), Jeong further teaches the method further comprising: receiving, via the UI, discourse parsing information that identifies a relation between image information of the association information (the system of at least para. 0146-0147 describes recognizing and associating user inputs, which in some cases may initially be impossible; said recognition may understandably comprise discourse parsing information that identifies a relation between captured image information of the association information); and generating annotated image information based on the discourse parsing information (at least para. 0273-0277 display captioned or annotated image information based on the discourse parsing information).

     Regarding claim 10 (according to claim 8), Jeong further teaches the method further comprising: receiving, via the UI, semantics parsing information that provides the text information of the association information in a machine-understandable format (para. 0197); and generating annotated image information based on the semantics parsing information (captioned information is generated in at least para. 0157, 0212, and 0329 based on the semantics parsing information of at least para. 0197).
 
     Regarding claim 11 (according to claim 8), Jeong further teaches the method further comprising: inputting the association information into the AI model as training data for the AI model to permit the AI model to associate the respective image information with the corresponding text information (the system inputs, in para. 0245-0247, association information into the AI model as training data for the AI model to permit the AI model to associate the respective image information of para. 0273-0277 with the corresponding text information).

     Regarding claim 12 (according to claim 9), Jeong further teaches the method further comprising: inputting the annotated image information into the AI model as training data for the AI model to permit the AI model to identify the relation between the image information of the association information (para. 0273-0277 teach a case of inputting captioned or annotated image information into the AI model of at least para. 0245-0247 as training data for the AI model to permit the AI model to identify the relation between the image information of the association information).

     Regarding claim 13 (according to claim 10), Jeong further teaches the method further comprising: inputting the annotated image information into the AI model as training data for the AI model to permit the AI model to convert the text information of the association information to the machine-understandable format (para. 0273-0277 further teach a case of inputting captioned or annotated image information into the AI model of at least para. 0245-0247 as training data for the AI model to permit said AI model to convert the text information of the association information to the machine-understandable format).

     Regarding claim 14 (according to claim 8), Jeong further teaches wherein the image information is associated with an instructional video regarding the object, and wherein the text information corresponds to at least one of a product manual associated with the object, captions associated with the instructional video, or text input via the UI (para. 0273-0277 teach a case of displayed image information associated with an instructional video regarding a dish preparation object, wherein the text information corresponds to at least one of a product manual associated with the object, captions associated with the instructional video, or text input via the UI).


    Regarding claim 15, Jeong teaches in at least para. 0368-0370 a non-transitory computer-readable medium storing instructions, the instructions comprising: one or more instructions that, when executed by one or more processors of a device for training an artificial intelligence (AI) model (the processor of at least para. 0368-0370 allows the artificial intelligence system of at least para. 0003 and 0245-0247 to provide a user interface (UI) and generate training data for said AI model),
cause the one or more processors to: provide, for display via the UI, image information that depicts an object, a set of operations of the object, and a process associated with the set of operations (the UI of at least para. 0211 and 0273-0277 and Fig. 18 provides, for display, instructional image information of at least a dish preparation recipe object, a set of ingredients and recipe operations of the object, and an implied guidance process or the like associated with said set of operations);
provide, for display via the UI, text information that describes the object, the set of operations of the object, and the process associated with the set of operations (at least the text recipe of para. 0211 and 0273-0277 and Fig. 18 is provided for display and describes the object, the set of operations of the object, and said process associated with the set of operations);
receive, via the UI, a user input that associates respective image information of the image information with corresponding display text information (the user inputs of at least para. 0157, 0218, and 0273, which include at least touch or spoken inputs received via the UI, associate respective image information of the image information with corresponding displayed text information of the text information);
and generate association information that associates the respective image information with the corresponding display text information, based on the user input (the system further displays, in para. 0157, 0212, and 0329, association information that associates the respective image information with the corresponding text information, based on the user input).
   However, Jeong is silent regarding receiving a first user input in association with the image information, and a second user input in association with the text information, to associate respective image information of the image information with corresponding text information of the text information; and generating said association information based on the first user input and the second user input.
     Heindorf teaches, in at least Fig. 1, a user interface receiving and displaying webpage content 120 and depicted video and images. Heindorf further teaches in the description receiving user text query information 102, comprising at least one of “how our breaks the sealing paint tank” and a second plurality of other text queries such as “how to plastering wall. preparing region, by breaking the seal and opening a tank using a screwdriver cover paint”, to associate respective image information of the image information with corresponding text information of the text information. A search environment 100 is further configured to receive, based on at least the queries 102, at least a first user input in association with video tutorial images or do-it-yourself image information. The system further associates text input data with at least the queried displayed video data, which may obviously have been captured or obtained by known means, to associate respective tutorial image information of the image information with said corresponding text information of the text information. The system is further configured with an artificial intelligence capability to provide the query result 105 indicative of said association information of a step-by-step instruction based on the first user input and the second user input.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Jeong and Heindorf to include receiving, via the UI, said first user input in association with the image information and said second user input in association with the text information to associate respective image information of the image information with corresponding text information of the text information, and generating said association information based on the first user input and the second user input. Jeong and Heindorf are in the same field of endeavor of training a learning machine to learn and associate at least images of instruction tutorials or product images with corresponding text information, where the system may provide to a user, based on presented or depicted images and user request queries, step-by-step instruction tutorials corresponding to at least the request queries and predetermined trained data, and where the user of at least Heindorf may obtain said association information in a case of a requested plurality of text queries associated with the depicted video or images. The combination may further be realized according to known methods to yield predictable results, since known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art. Said combination is thus the adaptation of an old idea or invention using newer technology that is commonly available and understood in the art, and is thereby a variation on already known art (See MPEP 2143, KSR Exemplary Rationale F).

     Regarding claim 16 (according to claim 15), Jeong further teaches wherein the one or more instructions further cause the one or more processors to: receive, via the UI, discourse parsing information that identifies a relation between image information of the association information (the system of at least para. 0146-0147 describes recognizing and associating user inputs, which in some cases may initially be impossible; said recognition may understandably comprise discourse parsing information that identifies a relation between captured image information of the association information); and generate annotated image information based on the discourse parsing information (at least para. 0273-0277 display captioned or annotated image information based on the discourse parsing information).

     Regarding claim 17 (according to claim 15), Jeong further teaches wherein the one or more instructions further cause the one or more processors to: receive, via the UI, semantics parsing information that provides the text information of the association information in a machine-understandable format (para. 0197); and generate annotated image information based on the semantics parsing information (captioned information is generated in at least para. 0157, 0212, and 0329 based on the semantics parsing information of at least para. 0197).

     Regarding claim 18 (according to claim 15), Jeong further teaches wherein the one or more instructions further cause the one or more processors to: input the association information into the AI model as training data for the AI model to permit the AI model to associate the respective image information with the corresponding text information (the system inputs, in para. 0245-0247, association information into the AI model as training data for the AI model to permit the AI model to associate the respective image information of para. 0273-0277 with the corresponding text information).

     Regarding claim 19 (according to claim 16), Jeong further teaches wherein the one or more instructions further cause the one or more processors to:
receive, via the UI, semantics parsing information that provides the text information of the association information in a machine-understandable format (para. 0197); and input the annotated information and the semantics parsing information into the AI model as training data for the AI model to permit the AI model to identify the relation between the image information of the association information (the displayed and captioned data of para. 0273-0277 may be inputted or self-learned, as further implied in para. 0245-0247, said data obviously comprising system- or user-inputted captioned information and the semantics parsing information inputted into the AI model as training data for the AI model to permit the AI model to identify the relation between the image information of the association information); and
to permit the AI model to convert the text information of the association information to the machine-understandable format (para. 0273-0277 further teach a case of inputting captioned or annotated image information into the AI model of at least para. 0245-0247 as training data for the AI model to permit said AI model to convert the text information of the association information to the machine-understandable format).

     Regarding claim 20 (according to claim 15), Jeong further teaches wherein the image information is associated with an instructional video regarding the object, and wherein the text information corresponds to at least one of a product manual associated with the object, captions associated with the instructional video, or text input via the UI (para. 0273-0277 teach a case of displayed image information associated with an instructional video regarding a dish preparation object, wherein the text information corresponds to at least one of a product manual associated with the object, captions associated with the instructional video, or text input via the UI).


Conclusion
      Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
      Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARCELLUS AUGUSTIN whose telephone number is (571)270-3384. The examiner can normally be reached 9 AM-5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, BENNY TIEU, can be reached at 571-272-7490. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MARCELLUS J AUGUSTIN/
Primary Examiner, Art Unit 2674
12/5/2022