DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.


Claim Objections
Claim 4 is objected to because of the following informalities: the claim recites two instances of “and/or”. Each instance should be replaced with either “and” or “or”. Appropriate correction is required.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.



Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claim does not fall within at least one of the four categories of patent eligible subject matter because the claim is drawn to a computer readable medium, which includes both statutory and non-statutory embodiments as seen in Specification, [0015] (where the computer readable medium can include both volatile and non-volatile memories). A claim that covers both statutory embodiments (non-volatile memory) and non-statutory embodiments (volatile memory) includes, under the broadest reasonable interpretation, a signal. Such a signal has been held by the courts to be non-statutory (see Ex parte Mewherter (2013)). Therefore, when the broadest reasonable interpretation of a claim covers a signal per se, the claim must be rejected under 35 U.S.C. 101 as covering non-statutory subject matter. See In re Nuijten, 500 F.3d 1346, 1356-57 (Fed. Cir. 2007) (transitory embodiments are not directed to statutory subject matter). See, e.g., MPEP §2106.03.


Claims 1-13 are rejected under 35 U.S.C. 101 because the claims are directed to a judicial exception (i.e., an abstract idea) without significantly more.
	The claims recite certain methods of organizing human activity (certain activity between a person and a computer) or alternatively, managing personal behavior or relationships/interactions between people. The claims recite methods of guided information capture, i.e., instructing users to collect information about objects. The claims also variously recite mental tasks or processes (which is explicitly articulated where applicable below).
	Independent claims 1 and 13 recite requesting user input and guiding the user through a user-driven data collection procedure of collecting labeled data; defining a context of the data collection procedure based on user input, automatically from an application field, or from collected sensor data, and generating corresponding context information (note that inferring this automatically from an application field or from collected sensor data falls under the “Mental Processes” grouping of abstract ideas, as it encompasses an evaluation, observation, and/or judgment); after defining the context of the data collection procedure, executing the data collection workflow and requesting user input in which the workflow engine presents information relevant to the data to be collected to the user via a graphical user interface; using the context information, the provided information relevant to the data to be collected, and the received user input during the data collection workflow for labelling the data to be collected.
	Dependent claim 3 recites allowing a user the option to skip a step when the data to be collected in a step that requires user input is not valid or available.
	Dependent claim 4 recites updating a generated dataset at a particular stage of the workflow and/or a current data collection step (i.e., a certain method of organizing human activity) based on observing at least some of the user inputs and/or context information, in which the observation step falls under the “Mental Processes” grouping of abstract ideas (as it encompasses an evaluation, an observation, and/or judgment).
	Dependent claim 5 recites providing summary information of user inputs during the workflow.
	Dependent claim 6 recites generating a new data collection workflow or subsequent modified execution of the data collection workflow based on the summary information of user inputs.
	Dependent claim 7 recites requesting user input as to which are the objects of interest and what is the context for collecting data for the objects of interest.
	Dependent claim 8 recites presenting at least one visual structure on the graphical user interface that the user positions in relation to the at least one object of interest in the at least one image being captured, where the information regarding the position of the at least one visual structure is used as label information indicative of a position of the at least one object of interest in relation to the at least one image.
	Dependent claim 9 recites that the information regarding at least one of the position, size, and number of the at least one visual structure is used to calibrate the generation of the at least one visual structure itself. This is nothing more than automating a mental task or process, e.g., a user can mentally perform such identifications and placement of initial bounding boxes based on a position, size, and number of the visual structure. Such a step encompasses an evaluation, observation, and/or judgment, which falls under the “Mental Processes” grouping of abstract ideas.
	Dependent claim 11 recites requesting user input such that when the user starts executing the data collection workflow, the workflow engine presents information relevant to the data to be collected on the graphical user interface.
	Dependent claim 12 recites displaying on the graphical user interface a view of a camera for capturing the at least one image and presents in the view of the camera at least one visual structure in one or more set locations and sizes, and the user is requested to fit one or more of the at least one object of interest displayed in the view of the camera within at least one of the visual structures, and subsequently requesting user input for capturing the at least one image, and saves the captured at least one image and data associated with the at least one of the visual structures.
	Therefore, the claims fall under the “Certain Methods of Organizing Human Activity” and “Mental Processes” groupings of abstract ideas. Accordingly, the claims recite an abstract idea.

	The judicial exception is not integrated into a practical application of the idea. The claims recite the use of workflow engines (i.e., a type of software program), an algorithm based on machine learning (i.e., another type of software program), sensors, graphical user interfaces, a camera view (dependent claim 12), and (in the case of independent claim 13) a computer readable medium and internal memory. These elements are recited at a high level of generality, so generically that they represent no more than mere instructions to apply the judicial exception on a computer (see MPEP 2106.05(f)). These limitations can also be viewed as nothing more than an attempt to generally link the use of the judicial exception to the technological environment of a computer (see MPEP 2106.05(h)).
	Independent claims 1 and 13 further recite generating a dataset of labeled data. This is nothing more than an insignificant post-solution activity, which is unrelated to any particular means by which the system, for example, automatically defines contexts, determines the various workflow steps, or particularly displays workflow collection steps in the graphical user interface. Furthermore, this is nothing more than an insignificant field-of-use limitation, describing the context rather than a particular means of achieving the result (especially with regards to the fact that the dataset of labeled data is the basis for the process which uses an algorithm based on machine learning). These are nothing more than attempts to limit the claim to a particular technological field.
	The claims’ limitations pertaining to receiving or requesting user input (claims 1, 7-8, and 11-13) and displaying/providing certain outputs to the user (claims 1, 3, 5, 8, and 12-13) are nothing more than insignificant extra-solution activities, as they do not relate to any particular manner by which the system may, for example, collect or organize the received user input, or output any particular display via the graphical user interface. The type of information being requested from the user, inputted by the user, and displayed/provided to the user amounts to nothing more than insignificant field-of-use limitations, describing the context rather than a particular manner of achieving the result.
	Similarly, the independent claims’ recitation of defining a context of the data collection procedure, which can comprise at least one of three ways of obtaining such information, is nothing more than an insignificant field-of-use limitation, describing the context rather than a particular manner of achieving the result.
	
	Dependent claim 2 recites that the algorithm which is based on machine learning uses at least one sensor and sensor-driven component (e.g., a processor), and implements an artificial intelligence model for processing signals received from the at least one sensor and sensor-driven component. These are nothing more than attempts to limit the claims to a particular technological field, namely computers, and thus do not amount to significantly more. Such limitations do not purport to explain how (by what particular process or structure) the recited components implement the desired goal or result of processing signals (the term “processing signals” itself being very generic and high-level).
	Dependent claim 4 recites updating the generated dataset at a particular stage of the workflow or updating a current data collection step. This is nothing more than an insignificant extra-solution activity that is unrelated to how the data collection steps were generated in the first place, or to what particular process or structure is used for updating the generated dataset.
	Dependent claim 6 recites updating a generated dataset at a particular stage of the workflow and/or updating a current data collection step in response to observing the user inputs and/or context information. This is nothing more than an attempt to limit the claims to a particular context, i.e., field-of-use limitation, which does not amount to significantly more.
	Dependent claim 10 recites that the at least one visual structure has a frame-like form for placing the at least one object of interest within the frame-like form by user input. This is an insignificant field-of-use limitation, describing the context rather than a particular manner of achieving the result. In particular, such a limitation does not further limit how—by what particular process or structure—the system generates such a visual structure.

	The claims do not contain any additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional elements reciting the use of various computing software and hardware components amount to no more than mere instructions to apply the judicial exception using generic components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.
	Additionally, the various claims’ recitation of requesting user input, the user providing such input, and outputting/displaying certain information to the user is nothing more than a well-understood, routine, and conventional activity of receiving and transmitting data (note that despite the application to guided user collection/capture of objects, this is nothing more than an attempt to limit the claims to a particular field-of-use, describing the context rather than a particular manner of achieving the result). See MPEP 2106.05(d)(II) (“Receiving or transmitting data over a network, e.g., using the Internet to gather data”, with regards to requesting user input and receiving user input; “Presenting offers and gathering statistics” with regards to the displaying step).
	Creating/updating the labeled dataset (claims 1, 4, and 13) is nothing more than the well-understood, routine, and conventional activity of electronic recordkeeping. See MPEP 2106.05(d)(II) (“Electronic recordkeeping”).

	Even as an ordered combination, the claims as a whole do not contain any additional elements that amount to significantly more. The claims do nothing more than provide a generic environment for guiding users to collect certain information.
In particular, the claims are not limited to a particular manner of how information is presented to users to guide their actions, e.g., there is no improvement in the graphical user interface. See Intellectual Ventures I LLC v. Erie Indemnity Co., 850 F.3d 1315, 121 USPQ2d 1928 (Fed. Cir. 2017) (“the claimed invention does not recite any particular unique delivery of information through this mobile interface…Nor do the claims describe how the mobile interface communicates with other devices or any attributes of the mobile interface, aside from its broadly recited function. Thus, the mobile interface here does little more than provide a generic technological environment to allow users to access information”).
Additionally, there are no limitations confining the claims to any improved method for data collection. For example, the claims do not purport to explain how the system determines what data collection steps are needed and thus used to guide the user actions.
Thus, the recitations to various intended applications, e.g., for machine learning algorithm purposes, only attempt to limit the claims to a particular technological field. Reciting the intended goal/result/effect is not enough; there must be a concrete embodiment of that goal within the claims. However, there are no concrete recitations as to how the system would identify the necessary data in order to improve data collection processes; no limitations with regards to the form or structure of the data being collected; nor are there any limitations with regards to the graphical user interface that is communicating with the user. Accordingly, the claims recite an abstract idea of organizing human activity (i.e., certain activity between a person and a computer, and managing personal behavior or relationships or interactions between people), and automating mental tasks or processes.
Thus, the additional elements only provide generic contexts (i.e., insignificant field-of-use limitations) in which the claimed steps are being implemented; generic data collection steps of requesting information and receiving user information; generic outputting/displaying of certain information to a user; and generic electronic recordkeeping steps of creating/updating datasets. Such limitations do not further limit the claims to any particular manner of achieving the aforementioned steps, and thus do not amount to significantly more.
Thus, even when the claim elements are considered as a combination, they add nothing that is not already present when the elements are considered separately. There is nothing inventive about any of the claim details, individually or in combination, that are not themselves in the realm of abstract ideas.
Thus, for at least the aforementioned reasons, the claims are rejected under 35 U.S.C. 101 for being directed to a judicial exception (i.e., an abstract idea) without significantly more.






Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1-2, 4-5, 11, and 13 are rejected under 35 U.S.C. 102(a)(1)/(a)(2) as being anticipated by Hickman et al. (“Hickman”) (US 2019/0076731 A1).
	Regarding claim 1: Hickman teaches A method of performing a data collection procedure for generating a dataset for input to a process which uses an algorithm which is based on machine learning, the method comprising:
	running, by a computing device, a workflow engine which is configured to perform a data collection workflow, wherein the workflow engine, when executing the data collection workflow, requests user input and guides the user through a user-driven data collection procedure of collecting labeled data and creates a dataset of labeled data as a basis for the process which uses an algorithm which is based on machine learning (Hickman, [0103], where the gameplay (i.e., “workflow engine”) may prompt a person to show the computing device 108 a particular object so that data of that object can be collected (i.e., “data collection workflow”). In this way, the gameplay directs the user to collect data of objects for which the dataset is lacking details, and may prompt the user to label the object (Hickman, [0095]) (i.e., “requests user input and guides the user through a user-driven data collection procedure of collecting labeled data”). See, e.g., Hickman, [0093], where the gameplay may include a checklist of data collection requirements (i.e., “workflow”), and guide the user to collect data over areas of interest in the environment. See Hickman, [0034], where a virtual game is created in which rewards/scores are provided to cause users to collect data that is valuable, where users are prompted to collect data from unknown areas and/or to label data, which can ultimately be used to train various machine learning systems (i.e., “creates a dataset of labeled data as a basis for the process which uses an algorithm which is based on machine learning”)),
defining a context of the data collection procedure which comprises at least one of: using a graphical user interface for inquiring user input regarding context of the data collection procedure, automatically defining the context of the data collection procedure based on at least one of an application field of the method and the data to be collected, and inferring the context of the data collection procedure automatically from sensor data received from at least one sensor, and generating corresponding context information (Hickman, [0096-0097], where the system may prompt the user to label newly received data with, e.g., an identification of a person to associate with a room or other object such as a shoe (i.e., “using a graphical user interface for inquiring user input regarding context of the data collection procedure”). The system may also recognize, via one or more sensors, that the system has walked into an office in the house via, e.g., location determination, reference to a floorplan, or object recognition of objects in the office (i.e., “inferring the context of the data collection procedure automatically from sensor data received from at least one sensor”), and determines office supplies as a category of objects (i.e., “generating corresponding context information”)),
after defining the context of the data collection procedure, executing the data collection workflow and requesting user input such that, when the user starts executing the data collection workflow, the workflow engine presents information relevant to the data to be collected on a graphical user interface (Hickman, [0097], where after recognizing that the user has walked into an office in the house, the computing device 108 determines office supplies as a category of objects (i.e., “after defining the context of the data collection procedure”), and provide a command during gameplay indicating a request to obtain data of specific office supplies for which the dataset in the database 135 may be lacking details (i.e., “executing the data collection workflow and requesting user input such that…the workflow engine presents information relevant to the data to be collected”), where the command may be via a graphical user interface (Hickman, [0086], where the system provides a command that indicates a request to the user to obtain additional data of the environment, the command being in the form of a textual graphic) (i.e., “present[ing] information relevant to the data to be collected on a graphical user interface”)), and 
using the context information, the provided information relevant to the data to be collected, and the received user input during the data collection workflow by the workflow engine for labelling the data to be collected and generating the dataset (Hickman, [0117], where the user is asked to complete a set of tasks, including collecting images of physical objects in the real world (Hickman, [0072]) (i.e., “the received user input during the data collection workflow”) that enables the computing device 108 to generate the dataset that may be used for machine learning (i.e., “generating the dataset”). See Hickman, [0032], where the computing device may be programmed to ask questions regarding details of an area to enable labeling of data that is collected which further completes the stored dataset (i.e., “the provided information relevant to the data to be collected, and the received user input during the data collection workflow…for labelling the data to be collected”). See Hickman, [0098], where as a user walks throughout an environment, the computing device determines a location (i.e., “using the context information”) and objects associated with that location for which the dataset lacks details, and requests additional data of specific objects highly relevant to a location of the computing device, such additional data including an identification of the object (Hickman, [0094], where an identification of an object based on the image of the object is received; see Hickman, [0095], where the system labels the additional data of the object with the identification of the object and stores the additional data of the object in the database 135) (i.e., “for labelling the data to be collected”)). 

	Regarding claim 2: Hickman teaches The method according to claim 1, wherein the process which uses an algorithm which is based on machine learning uses at least one sensor and sensor-driven component and implements an artificial intelligence model for processing signals received from the at least one sensor and sensor-driven component (Hickman, [0038], where a robotic device 102 may send a log of sensor data to a host device 106 and receive machine learning model data from host device 106. See Hickman, [0043], where the computing device 108 may perform the same functions as described with respect to the robotic devices 102. See Hickman, [0046], where the computing device includes processor(s) and sensors 132 (i.e., “sensor and sensor-driven component”)). 

	Regarding claim 4: Hickman teaches The method according to claim 1, wherein when executing the data collection workflow, at least some of the user inputs and/or context information are observed and used by the workflow engine to update the generated dataset at a particular stage of the workflow and/or used to update a current data collection step (Hickman, [0112], where the computing device 108 makes the determination that new data is collected, that is not previously included in the dataset stored in the database (i.e., “at least some of the user inputs…are observed and used by the workflow engine to update the generated dataset at a particular stage of the workflow”)). 

	Regarding claim 5: Hickman teaches The method according to claim 1, wherein after executing the data collection workflow, a summary information of user inputs during the workflow is provided (Hickman, [0065], where the computing device 108 is informed of success of the data collection for awarding points during gameplay, and the interface 129 on the display 128 of the computing device may then illustrate the points). 

	Regarding claim 11: Hickman as modified teaches The method according to claim 8, wherein the workflow engine, when executing the data collection workflow, requests user input such that, when the user starts executing the data collection workflow, the workflow engine presents information relevant to the data to be collected on the graphical user interface (Hickman, [0054], where a graphical user interface is displayed, which enables the user to interact with the visual display and accept user inputs/instructions to illustrate and collect data in a desired manner, where information and actions are made available to the user. See also Hickman, [0086], where the system provides a command that indicates a request to the user to obtain additional data of the environment, the command being in the form of a textual graphic. The command indicates to the user to obtain additional data of the environment and/or of the object. The command may indicate a specific type of data to collect, e.g., depth images, audio data, 2D image data, etc. (i.e., “presents information relevant to the data to be collected”). See also Hickman, [0094], where the command may provide information indicating a pose of the object at which to obtain the additional data of the object, or provide a time of day at which to obtain the additional data of the object (so as to capture data of the object with different lighting)). 

	Regarding claim 13: Claim 13 recites substantially the same claim limitations as claim 1, and is rejected for the same reasons.
	Note that Hickman teaches A computer readable medium comprising software code sections which, when loaded into an internal memory of a computing device, are adapted to perform a method of performing a data collection procedure for generating a dataset for input to a process which uses an algorithm which is based on machine learning, the method comprising [the claimed steps] (Hickman, [0049], where the disclosed system may be implemented as a non-transitory computer readable storage medium).



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Hickman et al. (“Hickman”) (US 2019/0076731 A1), in view of Castillo et al. (“Castillo”) (US 2017/0064200 A1).
	Regarding claim 3: Hickman teaches The method according to claim 1, but does not appear to explicitly teach wherein when executing the data collection workflow, if data to be collected in a step requiring user input is not valid or available, the data collection workflow provides an option to skip that step.
Castillo teaches wherein when executing the data collection workflow, if data to be collected in a step requiring user input is not valid or available, the data collection workflow provides an option to skip that step (Castillo, [0032-0034], where the user continues around a building taking photos as suggested by the guide overlays until the building is captured. If not all sides of a building are accessible for ground-level images (i.e., “a step requiring user input is not valid or available”), many of the guides are skipped from the sequential set of graphical guides (i.e., “skip[s] that step”).
Although Castillo does not appear to explicitly state that the data collection workflow explicitly provides an option to the user for skipping that step (but rather does so automatically), one of ordinary skill in the art would have been suggested to modify Castillo such that the option to skip a step is presented to a user with the motivation of faster processing (e.g., the system does not have to compute or be aware of various contextual factors that may determine whether or not a user is capable of taking the desired photos, but rather has the user make such determinations)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Hickman and Castillo with the motivation of allowing users greater control over the type of data collection tasks they engage in, as forcing them to perform tasks that are not feasible or desirable may make them less inclined to continue with the data collection workflow.


Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Hickman et al. (“Hickman”) (US 2019/0076731 A1), in view of Gerson et al. (“Gerson”) (US 2012/0315992 A1).
	Regarding claim 6: Hickman teaches The method according to claim 5, but does not appear to explicitly teach wherein the summary information is used for generating, by the workflow engine, a new data collection workflow or for generating a subsequent modified execution of the data collection workflow.
	Gerson teaches wherein the summary information is used for generating, by the workflow engine, a new data collection workflow or for generating a subsequent modified execution of the data collection workflow (Gerson, [0060], where a new game goal is dynamically added in association with a feature during gameplay based on achievement of another goal, where a map is updated to include the captured feature information, and the gameplay is altered in real time).
	Although Gerson does not appear to explicitly state that the summary information, i.e., the points, are utilized, but instead that goal achievements are used for generating a new data collection workflow (or subsequent modified execution of the data collection workflow), one of ordinary skill in the art would have found it obvious to substitute Gerson’s game goal achievement with the summary information, e.g., Hickman’s game points, because both the claimed invention and Gerson depend upon previous achievements for modifying goals (i.e., the claimed invention’s data collection workflow task completion; Gerson’s game goals). One of ordinary skill in the art would have been suggested to do so with the motivation of incentivizing users to gather the requested inputs (Hickman, [0072]) in order to collect data from unknown areas, i.e., have a more complete dataset, which can be used to train various machine learning systems (Hickman, [0033-0034]).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Hickman and Gerson with the motivation of ensuring that redundant information is not captured, i.e., such that a more complete dataset can be acquired and used to train machine learning systems (which would help with refining learning).


Claims 7-8 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Hickman et al. (“Hickman”) (US 2019/0076731 A1), in view of Chowdhury et al. (“Chowdhury”) (US 2019/0332893 A1).
	Regarding claim 7: Hickman teaches The method according to claim 1, wherein the process which uses an algorithm which is based on machine learning is an image processing process and the data to be collected are one or more objects of interest to be processed in the image processing process (Hickman, [0103], where the gameplay may prompt a person to show the computing device 108 a particular object so that data of that object can be collected. In this way, the gameplay directs the user to collect data of objects for which the dataset is lacking details (i.e., “the data to be collected are one or more objects of interest to be processed in the image processing process”), and may prompt the user to label the object (Hickman, [0095]). See Hickman, [0052], where the processor(s) can include a tensor processing unit (TPU) for training and/or inference of machine learning models, where the processor(s) may receive and process inputs to generate outputs that are stored in the data storage 124 and outputted to the display 128. Note that inputs may be images (see, e.g., Hickman, [0055], [0057], [0059], and [0071]), and thus the processing executed by the processor pertains to an image processing process, as claimed), and executing the data collection workflow requests user input as to … what is the context for collecting data for the objects of interest (Hickman, [0096], where the system may prompt the user to label newly received data with, e.g., an identification of a person to associate with a room or other object such as a shoe (i.e., “what is the context for collecting data for the objects of interest”)).
	Hickman does not appear to explicitly teach [requesting] user input as to which are the objects of interest.
	Chowdhury teaches [requesting] user input as to which are the objects of interest (Chowdhury, [0026], where a user may manually add a bounding box around an object of interest. Note that although Chowdhury does not appear to explicitly state that the system prompts/requests the user to perform a certain action, the claimed invention and Chowdhury both pertain to receiving user input regarding the objects of interest; thus, one of ordinary skill in the art would have found it obvious to have the data collection workflow (or system) prompt/request such information from a user with the motivation of explicitly guiding/instructing the user as to what actions are needed, which increases the efficiency of data collection as the user more clearly understands what data needs to be captured).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Hickman and Chowdhury with the motivation of having increased accuracy in the dataset for identifying object locations in training images (see, e.g., Chowdhury, [0002], where manual user segmentation and annotation methodology is robust, as humans have years of training in accurately identifying and discriminating between objects of interest and their surroundings).

	Regarding claim 8: Hickman as modified teaches The method according to claim 7, wherein the workflow engine, when executing the data collection workflow, presents at least one visual structure on the graphical user interface applicable as a label for associating with at least one object of interest displayed in at least one image capturing the at least one object of interest (Chowdhury, [0026], where an automated bounding box may be added using fiduciary points/edges. See Chowdhury, [0031], where the bounding box is regarded as a form of annotation, and where the object coordinates corresponding to the bounding box are associated with the corresponding object label (i.e., “as a label for associating with at least one object of interest displayed in at least one image”)), and the data collection workflow requests user input for positioning the at least one visual structure in relation to the at least one object of interest in the at least one image (Chowdhury, [0026], where the bounding box is subject to marginal correction by the user, e.g., by moving or optionally resizing to fit the bounding box tightly around the objects of interest (Chowdhury, [0033]) (i.e., “positioning the at least one visual structure in relation to the at least one object of interest in the at least one image”)), wherein information regarding position of the at least one visual structure is used by the workflow engine as label information indicative of a position of the at least one object of interest in relation to the at least one image (Chowdhury, [0026], where object coordinates corresponding to the bounding box are saved with a corresponding object label, where the object coordinates corresponding to the bounding box (i.e., “wherein information regarding position of the at least one visual structure”) are regarded as a form of annotation (see, e.g., Chowdhury, [0031]) (i.e., “as label information indicative of a position of the at least one object of interest in relation to the at least one image”)).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Hickman and Chowdhury with the motivation of (1) requiring less manual annotation time since most bounding boxes/polygons/labels are pre-populated (Chowdhury, [0047]) (with regard to the limitation of “present[ing] at least one visual structure on the graphical user interface”)1, and (2) allowing models to be properly trained and have better detection capabilities by knowing where objects in an image are located (e.g., if there were an image comprising a dog, a cat, and a bird, the machine learning system might accidentally train on associating the object label “dog” with the cat or bird that appears in the image instead, since the model had no preconception of what a dog was supposed to look like, and thus the model would have inaccurate object detection).

	Regarding claim 10: Hickman as modified teaches The method according to claim 8, wherein the at least one visual structure has a frame-like form designed for placing the at least one object of interest, or at least part thereof, within the frame-like form by user input (Chowdhury, [0008] and [FIG. 8], where the polygon around the object of interest may be a bounding box (i.e., “frame-like form”)).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Hickman and Chowdhury with the motivation of reducing the computational complexity of bounding shapes (i.e., four points are less computationally expensive to isolate in an image compared to, e.g., 6 or 8 points), thereby reducing computational resources/load and increasing efficiency.


Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Hickman et al. (“Hickman”) (US 2019/0076731 A1), in view of Chowdhury et al. (“Chowdhury”) (US 2019/0332893 A1), in further view of Chu et al. (“Chu”) (US 2020/0410287 A1).
	Regarding claim 9: Hickman as modified teaches The method according to claim 8, but does not appear to explicitly teach wherein the information regarding at least one of the position, a size and a number of the at least one visual structure is used to calibrate the generation of the at least one visual structure itself.
	Chu teaches wherein the information regarding at least one of the position, a size and a number of the at least one visual structure is used to calibrate the generation of the at least one visual structure itself (Chu, [0018], where an annotating user may define a bounding box for a certain object in frame 1 of a given video, and then define a bounding box at different coordinates for the same object in frame 61, where the computing system may store estimated bounding box information for the object across each of frames 2-60 (i.e., “calibrate the generation of the at least one visual structure itself”). See Chu, [0020], where this information is ultimately used for generating data in the object training data store 114 for training an object classifier 104 with minimal human involvement by the annotator).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Hickman as modified and Chu with the motivation of reducing the amount of annotation information that users need to provide (see, e.g., Chu, [0018], where although the annotating user may have only provided annotation information for 1/60th of the frames of a given video, a substantially larger percentage of the frames may have object annotation data stored after the tracking process is complete).


Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Hickman et al. (“Hickman”) (US 2019/0076731 A1), in view of Chowdhury et al. (“Chowdhury”) (US 2019/0332893 A1), in further view of Graham et al. (“Graham”) (US 10,116,861 B1).
	Regarding claim 12: Hickman as modified teaches The method according to claim 8, but does not appear to explicitly teach wherein the workflow engine, when executing the data collection workflow, displays on the graphical user interface a view of a camera for capturing the at least one image and presents in the view of the camera at least one visual structure in one or more set locations and sizes, wherein the workflow engine requests user input for fitting one or more of the at least one object of interest displayed in the view of the camera within at least one of the visual structures, and wherein once the at least one object of interest is fitted inside the at least one of the visual structures, the workflow engine requests user input for capturing the at least one image, and saves the captured at least one image and the data associated with the at least one of the visual structures.
	Graham teaches wherein the workflow engine, when executing the data collection workflow, displays on the graphical user interface a view of a camera for capturing the at least one image and presents in the view of the camera at least one visual structure in one or more set locations and sizes, wherein the workflow engine requests user input for fitting one or more of the at least one object of interest displayed in the view of the camera within at least one of the visual structures, and wherein once the at least one object of interest is fitted inside the at least one of the visual structures, the workflow engine requests user input for capturing the at least one image, and saves the captured at least one image and the data associated with the at least one of the visual structures (Graham, [16:7-24], where when the user positions the camera to have the actual product entered into the camera view of the user interface 442, the guided capture module 209 is configured to present the template (i.e., bounding box) (i.e., “presents in the view of the camera at least one visual structure in one or more set locations and sizes”) overlaid over the image of the actual product shown on the camera view (i.e., “displays on the graphical user interface a view of a camera for capturing the at least one image”). The user adjusts the position of the camera such that the actual product fits within the rectangle (i.e., “user input for fitting one or more of the at least one object of interest displayed in the view of the camera within at least one of the visual structures”) and takes a picture (i.e., “user input for capturing the at least one image”).
See Graham, [16:41-44], where responsive to the user selection of the “Continue” button in the user interface 444, the guided capture module 209 saves the newly captured image in the database (i.e., “user input for capturing the at least one image and saves the captured at least one image”).
See Graham, [11:64-67]-[12:1-28], where the system pertains to a workflow of capturing and storing product information (i.e., “data collection workflow”).
Note that although Graham does not appear to explicitly state that the system prompts/requests the user to perform a certain action, the claimed invention and Graham both pertain to a user fitting the object within the bounding box (i.e., the claimed visual structure); thus, one of ordinary skill in the art would have found it obvious to have the data collection workflow (or system) prompt/request such information from a user with the motivation of explicitly guiding/instructing the user as to what actions are needed, which increases the efficiency of data collection as the user more clearly understands what data needs to be captured.
See Chowdhury, [0026], with regard to saving the image “and the data associated with the at least one of the visual structures”, where object coordinates corresponding to the bounding box are saved with a corresponding object label, where the object coordinates corresponding to the bounding box are regarded as a form of annotation (see, e.g., Chowdhury, [0031])).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Hickman as modified and Graham with the motivation of guiding users to capture (additional) images that can be used as training data to help improve a machine learning algorithm’s future recognitions (Graham, [18:36-53]).






Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. See the enclosed 892 form. Robert (US Patent Publication No. 2021/0125004 A1) is cited to show the advantages of having the system automatically generate bounding boxes for users to position correctly. Such a feature is advantageous, as it allows users to quickly validate/edit large volumes of labeled data set images in a short amount of time (Robert, [0040]). The prior art should be considered to define the claims over the art of record.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to IRENE BAKER whose telephone number is (408)918-7601. The examiner can normally be reached M-F 8-5PM PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, NEVEEN ABEL-JALIL, can be reached on (571)270-0474. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/IRENE BAKER/
Primary Examiner, Art Unit 2152
22 August 2022




1 See also, e.g., Robert (US 2021/0125004 A1) at [0040], where users validating a bounding box encompassing an item being tracked in a video would be able to do so quickly since the user merely has to move the bounding box, change the limits of the bounding box, or tag/click the item in the image, making the correcting task not cognitively onerous, and allowing a user to quickly validate/edit large volumes of labeled data set images in a short amount of time.