DETAILED ACTION
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
2.	The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

3.	Claims 1-5 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
4.	The term “substantially flat” in claim 1 is a relative term which renders the claim indefinite. The term “substantially flat” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. A surface is either flat or not. By stating that the surface is substantially flat, the claim recites a degree of flatness that is undefined and therefore indefinite.


Claim Rejections - 35 USC § 103
5.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any 
6.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

7.	The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
8.	Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sugahara et al. (U.S. Patent Application Publication No. 2019/0087976 A1) in view of Boardman et al. (U.S. Patent No. 10,403,307 B1).
9.	Regarding Claim 1, Sugahara discloses A system (paragraph [0029] reciting “FIG. 1 is a block diagram showing configuration of a system according to a first embodiment. A system according to a first embodiment will be described with reference to FIG. 1.”) comprising: 	a turntable configured to rotate a substantially flat surface about a first axis; (paragraph [0068] reciting “The platform 14 is a location to place the object which is the subject of the imaging unit 11. In the example shown in FIG. 4, the platform 14 is a flat disk-shaped. To reduce the reflection of light, the surface is black. The platform 14 is mounted on the actuator 15. The platform 14 can rotate horizontally by the function of the actuator 15. Since the object placed on the platform 14 also rotates, it is possible to take images of the object from different angles.” 	Platform 14 corresponds to a flat surface turntable that rotates.) 	an imaging device comprising a visual image sensor and a depth image sensor, wherein the turntable is within at least one field of view of the imaging device; (paragraph [0060] reciting “The imaging unit 11 is a camera which has a three-dimensional image sensor (distance image sensor) obtaining the depth information (distance image) including the distance between the sensor and the subject plus the shape of subject. An example of the three-dimensional image sensor includes a ToF sensor. ToF (Time of Flight) sensors radiate electromagnetic waves such as infrared rays or visible light to the subject. Then, the ToF sensor detects the reflected waves. Based on the time of radiation and the time when the reflected waves were received, the depth information could be calculated. RGB-D (Red, Green, Blue, Depth) sensors which detect both the color and the depth information can be used. As long as the depth information of the object can be obtained, other methods such as pattern light projection method, stereo cameras or the like could be used.”;
	paragraph [0038] reciting “The image of the target object needs to be taken by a camera before estimating the gripping location (specific region of object) by image recognition. Next, the gripping location within the object (the location of the specific 	and a server in communication with the imaging device, (paragraph [0067] reciting “The image storage 13 can be volatile memory such as SRAM, DRAM or the like. The image storage 13 can also be nonvolatile memory such as NAND, MRAM, FRAM or the like. Storage devices such as optical discs, hard discs, SSDs or the like may be used. In the example shown in FIG. 1, the image storage 13 is located within the three-dimensional imaging device 10. However, the image storage 13 can be located in outside of the three-dimensional imaging device 10. Thus, the image storage 13 can be placed in any location. For example, the distance image can be saved in an external storage device, an external server, cloud storage or the image processing device 20.”) wherein the server is programmed with one or more sets of instructions that, when executed by the server, cause the server to execute a method (paragraph [0071] reciting “For example, if the information on the angles and views include 810 combinations of angles and views, 810 distance images are taken. The combinations of angles and views could be configured manually by the users. Based on the size of the object, the imaging controller 16 can generate the combinations of angles and views automatically. 
Also, the control program running on an external computer (not shown) can configure the information on the angles and views.”;) comprising: 	receiving, from the imaging device, a first set of visual images of an object resting on top of the substantially flat surface, wherein each of the visual images of the first set is captured with the turntable rotating about the first axis, and wherein at least two of the visual images of the first set are captured with the object in different positions with respect to the first axis; (paragraph [0060] reciting “The imaging unit 11 is a camera which has a three-dimensional image sensor (distance image sensor) obtaining the depth information (distance image) including the distance between the sensor and the subject plus the shape of subject. An example of the three-dimensional image sensor includes a ToF sensor. ToF (Time of Flight) sensors radiate electromagnetic waves such as infrared rays or visible light to the subject. Then, the ToF sensor detects the reflected waves. Based on the time of radiation and the time when the reflected waves were received, the depth information could be calculated. RGB-D (Red, Green, Blue, Depth) sensors which detect both the color and the depth information can be used. As long as the depth information of the object can be obtained, other methods such as pattern light projection method, stereo cameras or the like could be used.”;
	paragraph [0056] reciting “The three-dimensional imaging device 10 and the image processing device 20 can send or receive data via electrical connections or wireless communication. Examples of standards for electrical connections include PCI Express, USB, UART, SPI, SDIO, serial ports, Ethernet or the like. However, other standards may be used. Examples of wireless communication standards include 
	paragraph [0068] reciting “The platform 14 is a location to place the object which is the subject of the imaging unit 11. In the example shown in FIG. 4, the platform 14 is a flat disk-shaped. …”;
paragraph [0172] reciting “The platform 63 is a disk-shaped platform. The platform 63 can rotate. The axis of rotation is in the center of the disk. In FIG. 16, the axis of rotation for the platform 63 is shown in a broken line. …”;
paragraph [0065] reciting “The arm 12 supports the imaging unit 11 to a height sufficient for taking an image of the object. The arm 12 can adjust the location of the imaging unit 11 in either of the positions along an arc. In the example shown in FIG. 4, the arm 12 is shown in the left side of the three-dimensional imaging device 10. The imaging unit 11 is mounted on the tip of the arm 12. …”)
	receiving, from the imaging device, a first set of depth data regarding the object, wherein the first set of depth data is captured with the turntable rotating about the first axis; (paragraph [0060] reciting “The imaging unit 11 is a camera which has a three-dimensional image sensor (distance image sensor) obtaining the depth information (distance image) including the distance between the sensor and the subject plus the shape of subject. An example of the three-dimensional image sensor includes a ToF sensor. ToF (Time of Flight) sensors radiate electromagnetic waves such as infrared rays or visible light to the subject. Then, the ToF sensor detects the reflected waves. Based on the time of radiation and the time when the reflected waves were received, the depth information could be calculated. RGB-D (Red, Green, Blue, Depth) sensors which detect both the color and the depth information can be used. As long as the depth information of the object can be obtained, other methods such as pattern light projection method, stereo cameras or the like could be used.”;
	paragraph [0043] reciting “The three-dimensional imaging device 10 is a device which generates the distance images of the objects by taking images from multiple angles. For example, by adjusting the angles by 1, 2, 5 or 10 degrees within the range of 2π steradians, multiple distance images for an object could be taken. By adjusting the angles, few hundred distance images to millions of distance images (multi-view images) can be obtained.”  	Object can be rotated in incremental angles and images of this object can be captured by camera.)
	generating a first three-dimensional model of the object based at least in part on the first set of visual images and the first set of depth data; (paragraph [0046] reciting “Comparing an object which is asymmetric with complicated bumps (concavities and convexities) to an object which has rotational symmetry, the former object requires more distance images to recognize the three-dimensional shape. As mentioned later, multiple distance images taken from multiple angles are combined to generate a CAD model representing the three-dimensional shape of an object.”)	selecting a first plurality of orientations for the first three-dimensional model; (paragraph [0147] reciting “Next, the conditions for the multi-view images (views, angles, number of images, whether there are colors or not) are determined. (Step S102) The condition for the images is determined for each object which is going to be gripped. It is ensured that each image have different angles. Regarding the view, 
	paragraph [0149] reciting “Based on the 3D point cloud data and the condition of the images, the three-dimensional model is generated. (Step S104) the generation of the three-dimensional model is done by the CAD model generator 22 in the image processing device 20. The CAD model generator 22 combines multiple distance images from multiple angles to generate the three-dimensional model.”)
	rendering the first three-dimensional model in at least some of the first plurality of orientations; (paragraph [0075] reciting “The CAD model generator 22 generates a three-dimensional model representing the three-dimensional shape of the object by combining multiple distance images taken from multiple angles in the three-dimensional imaging device 10. Examples of the three-dimensional model include three-dimensional CAD (Computer-Aided Design), three-dimensional CG (Computer Graphics). However, as long as the three-dimensional shape of the object can be represented, any type of file or any format can be used. Since multiple distance images are combined to generate the three-dimensional model, the three-dimensional model needs to represent the accurate external form of the object, for angles that can be viewed from the three-dimensional imaging device 10. The generated three-dimensional model is saved in the CAD model storage 23.”)
	generating a second set of visual images of the first three-dimensional model, wherein each of the visual images of the second set is generated with the first three-dimensional model rendered in one of the first plurality of orientations; (paragraph [0085] reciting “FIG. 8 shows the task of specifying the gripping location according to the embodiment. According to the embodiment, if a region is specified in a single three-dimensional model, it is possible to obtain the 3D point cloud data for the specific region. 
Thereafter, only by entering the 3D point cloud data for the specific region to other three-dimensional models, it is possible to generate an image showing the gripping location viewed from a different angle. By repeating this procedure to the three-dimensional models for each angle, it is possible to generate the annotation images for different angles (views), automatically.”;
paragraph [0143] reciting “However, the flow shown in FIG. 13 is only an example. The order of execution can be different. For example, the three-dimensional imaging device 10 could take all the multi-view images first. Then, the three-dimensional models corresponding to each image can be generated later. In the following, more details on the coordinated operations of a three-dimensional imaging device 10 and an image processing device 20 are explained.”)
and training a machine learning model to recognize the object based at least in part on at least some of the second set of the visual images (paragraph [0039] reciting “To specify the gripping location (specific region of object), a model for image recognition needs to be prepared. Namely, multiple training images that are segmented need to be prepared. Namely, specific parts of the body including the head, the hands, the feet or the like are segmented in the training images, making the parts distinguishable from other regions. By using machine learning or deep learning, the image recognition models can be generated. Training images need to include multiple views of the object from different angles. Also, if the canned juice in cylindrical form and boxes also need to be gripped, image recognition models for other objects need to be generated. The robot applies the model to the image taken by the camera to estimate the gripping location (specific region of object).”;
paragraph [0050] reciting “The characteristic learning device 30 generates a model for recognizing the gripping location within an object by learning the distance image of an object and an image showing the gripping location of an object. The learning of the model can be done by using deep learning such as back propagation methods or machine learning such as SVM (Support Vector Machines) or the like. The distance image of the object and the image indicating the gripping location of the object are used as the teacher data in learning.”;
paragraph [0099] reciting “The CAD model storage 23 saves the three-dimensional model of the whole object. The CAD model storage 23 also saves the extracted images. Since the data stored in the CAD model storage 23 is used for learning the image recognition model, it needs to be accessible from the characteristic learning device 30. The configuration for shared storage to enable access from the characteristic learning device 30 is not limited. Also, various communication methods could be used. For example, the CAD model storage 23 could be configured as shared storage which is accessible from the network.”)
While not explicitly disclosed by Sugahara, Boardman discloses and training a machine learning model to recognize the object based at least in part on at least some of the second set of the visual images and an identifier of the object. (col. 37, lines 60-63 reciting “using machine learning techniques such as learned classifiers. If the 
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Sugahara with the teaching of Boardman such that the machine learning model is trained to recognize the object based at least in part on an identifier of the object, because doing so could more efficiently estimate measurements for one or more attributes of one or more objects included in the images and/or perform automated verification of such attribute measurements.
10.	Regarding Claim 2, Sugahara further discloses The system of claim 1, wherein the method further comprises: generating a point cloud corresponding to at least a portion of at least one surface of the object, wherein the point cloud is generated based at least in part on at least some of the first set of depth data; (paragraph [0061] reciting “… format of distance image is the 3D point cloud data …”);
	tessellating the point cloud; (paragraph [0062] reciting “3D point cloud data can be ... data with the noise components filtered out ... other data formats including ... surface models, solid models, polygons or the like may be used ... 3D point cloud data includes all the data formats”);
	and applying at least a portion of at least some of the first set of visual images to the tessellated point cloud, (paragraph [0097] reciting “For the extracted images, it is possible to use the 3D point cloud data”),

wherein the first three-dimensional model is the tessellated point cloud having at least the portion of the at least some of the first set of visual images applied thereto. (paragraph [0072] reciting "generates the three-dimensional model of the object based on the 3D point cloud data and the information on the angles and views").
11.	Regarding Claim 3, Sugahara further discloses The system of claim 1, wherein the machine learning model is at least one of: an artificial neural network, a deep learning system, a support vector machine, a nearest neighbor analysis, a factorization method, a K-means clustering technique, a similarity measure, a latent Dirichlet allocation, a decision tree or a latent semantic analysis.  
(paragraph [0050] reciting “learning of the model can be done by using ... machine learning such as SVM (Support Vector Machines)").
12.	Regarding Claim 4, Sugahara further discloses The system of claim 1, wherein the method further comprises: modifying at least a portion of at least one of the first set of visual images or the first set of depth data; (paragraph [0179] reciting “generate new distance images by editing existing distance images")
	generating a second three-dimensional model of the object based at least in part on the modified portion of the at least one of the first set of visual images or the first set of depth data; (paragraph [0078] reciting “quality of the three-dimensional model can be improved by the manipulation of the users. If the user edits the three-dimensional model, it is possible to add information such as the composition of each region for the object, which cannot be obtained by the three-dimensional image sensors”)
	selecting a second plurality of orientations for the second three-dimensional model; (paragraph [0102] reciting “selections of images ... instructions to the three-dimensional imaging device”;
	paragraph [0142] reciting "automated generation of the three-dimensional models”)
	rendering the second three-dimensional model in at least some of the second plurality of orientations; (paragraph [0085] reciting “by entering the 3D point cloud data for the specific region to other three-dimensional models, it is possible to generate an image showing the gripping location viewed from a different angle”)
	and generating a third set of visual images of the second three-dimensional model, wherein each of the visual images of the third set is generated with the second three-dimensional model rendered in one of the second plurality of orientations, (paragraph [0078] reciting “quality of the three-dimensional model can be improved by the manipulation of the users. If the user edits the three-dimensional model, it is possible to add information such as the composition of each region for the object, which cannot be obtained by the three-dimensional image sensors”)
	wherein the machine learning model is trained to recognize the object based at least in part on the at least some of the second set of the visual images, at least some of the third set of visual images, and the identifier of the object.  
(paragraph [0039] reciting “By using machine learning or deep learning, the image recognition models can be generated. Training images need to include multiple views of the object from different angles”;
	paragraph [0050] reciting “characteristic learning device 30 generates a model ... learning of the model can be done by using ... machine learning";
paragraph [0099] reciting “data stored in the CAD model storage 23 is used for learning the image recognition model ... accessible from the characteristic learning device")
13.	Regarding Claim 5, Sugahara further discloses The system of claim 1, wherein each of the second set of visual images is in one of a plurality of categories, (paragraph [0106] reciting “categorizes each pixel of the images into classes”), wherein each of the categories relates to one of: an orientation of the first three-dimensional model when one of the second set of visual images was generated; a lighting condition of the first three-dimensional model when the one of the second set of visual images was generated; a color of the first three-dimensional model when the one of the second set of visual images was generated; (paragraph [0082] reciting “set specific colors or tones to the corresponding points or pixels”); or a texture of the first three-dimensional model when the one of the second set of visual images was generated, and wherein the method further comprises: splitting the second set of the visual images into a first subset and a second subset, and wherein training the machine learning model to recognize the object based at least in part on at least some of the second set of the visual images and the identifier comprises: (paragraph [0102] reciting “enables manipulations of the three-dimensional CAD software, selections of images shown in the displaying unit... instructions to the three-dimensional imaging device”)
training the machine learning model to perform a computer-based task based at least in part on the first subset and the identifier of the object; (paragraph [0039] reciting "By using machine learning or deep learning, the image recognition models can 
	wherein testing the machine learning model comprises: providing each of the second subset of the second set of visual images to the machine learning model as inputs; (paragraph [0102] reciting "input unit 25 accepts... selections of images");
	and receiving outputs from the machine learning model in response to the inputs, wherein each of the outputs is received in response to one of the inputs; (paragraph [0113] reciting "One example of the input data is images ... input data is multiplied by the coupling weight ... Then, the output data (y1, y2 … yp) is generated");
	calculating at least one error metric for each of the categories of the second subset of the second set of visual images based at least in part on a difference between: the identifier of the object; and the output received from the machine learning model in response to an input comprising one of the second set of visual images; (paragraph [0114] reciting "error E between the output data and the teacher data could be represented by using the error function (loss function)");
	generating a third set of visual images of the first three-dimensional model, wherein each of the visual images of the third set is generated with the first three- dimensional model in accordance with the one of the categories; (paragraph [0078] reciting "quality of the three-dimensional model can be improved by the and 
	training the machine learning model to perform the computer-based task based at least in part on at least a portion of the third set of visual images and the identifier of the object. (paragraph [0039] reciting “By using machine learning or deep learning, the image recognition models can be generated. Training images need to include multiple views of the object from different angles”; 
	paragraph [0050] reciting “characteristic learning device 30 generates a model... learning of the model can be done by using... machine learning”; 
	paragraph [0099] reciting "data stored in the CAD model storage 23 is used for learning the image recognition model... accessible from the characteristic learning device”) 
	Boardman further discloses and testing the machine learning model based at least in part on the second subset and the identifier of the object, (col. 2, lines 37-39 reciting “perform various types of manipulations and/or analyses of the generated computer model(s)”; col. 14, lines 46-50 reciting “after the group of images to represent the object 150 has been selected and are available in the image data 162, analyze the images of the selected group and generate one or more corresponding models”)
	determining that error metrics calculated for the second subset of the second set of visual images in one of the categories exceed a threshold; (col. 42, lines 45-48 reciting “Field elements whose error is below a certain threshold are called good field elements. For a given image, if the number of good field elements drops 
	in response to determining that the error metrics calculated for the second subset of the second set of visual images in the one of the categories exceed the threshold, (col. 42, lines 45-48 reciting “Field elements whose error is below a certain threshold are called good field elements. For a given image, if the number of good field elements drops below a certain threshold”).
14.	Regarding Claim 6, Sugahara discloses A computer-implemented method comprising: generating a first three-dimensional model of an object (paragraph [0046] reciting “generate a CAD model representing the three-dimensional shape of an object”) based at least in part on: a first set of visual images, wherein each of the first set of visual images depicts the object in one of a first plurality of orientations; (paragraph [0060] reciting “RGB-D (Red, Green, Blue, Depth) sensors which detect both the color and the depth information can be used”; 
	paragraph [0065] reciting “arm 12 supports the imaging unit 11 to the position for taking images. The arm 12 has a rotational mechanism which enables the rotational movement of the imaging unit”)
	and a first set of depth data, wherein the set of depth data defines at least one surface of the object; (paragraph [0060] reciting “depth information (distance image)”; 
	paragraph [0061] reciting “data may include patterns in the surface of the object”)

	generating a second set of visual images based at least in part on the first three-dimensional model, wherein each of the second set of visual images depicts the first three-dimensional model rendered in one of a second plurality of orientations; (paragraph [0085] reciting “repeating this procedure to the three-dimensional models for each angle, it is possible to generate the annotation images for different angles (views)”; 
	paragraph [0143] reciting “the three-dimensional imaging device 10 could take all the multi-view images first. Then, the three-dimensional models corresponding to each image can be generated later”)
	and training a machine learning model to perform a task associated with the object based at least in part on at least some of the second set of visual images (paragraph [0039] reciting “By using machine learning or deep learning, the image recognition models can be generated. Training images need to include multiple views of the object from different angles”; 
	paragraph [0050] reciting “characteristic learning device 30 generates a model... learning of the model can be done by using... machine learning”; 
	paragraph [0099] reciting “data stored in the CAD model storage 23 is used for learning the image recognition model... accessible from the characteristic learning device”)
	While Sugahara does not explicitly disclose the following, Boardman discloses and training a machine learning model to perform a task associated with the object based at least in part on at least some of the second set of visual images and at least one identifier of the object. (col. 37, lines 60-63 reciting “using machine learning techniques such as learned classifiers. If the classification is performed in Image space, the determined labels will be transferred to the 3D space”).

15.	Regarding Claim 7, Sugahara further discloses The computer-implemented method of claim 6, wherein generating the second set of visual images comprises: causing a display of at least a portion of the first three-dimensional model rendered in each of the second plurality of orientations in at least one user interface on a display; (paragraph [0101] reciting “displaying unit 24 helps the generation of models and configuration changes of the models by visual images”);
	and capturing visual images of the at least one user interface on the display, wherein each of the visual images is captured with at least the portion of the first three-dimensional model rendered in one of the second plurality of orientations in the at least one user interface, (paragraph [0102] reciting “selections of images shown in the displaying unit 24, changes in the presentation of images”) and wherein each of the second set of visual images is one of the visual images captured with at least the portion of the first three-dimensional model rendered in one of the second plurality of orientations in the at least one user interface. (paragraph [0078] reciting “quality of the three-dimensional model can be improved by the manipulation of the users. If the user edits the three-dimensional model, it is 
16.	Regarding Claim 8, Sugahara further discloses The computer-implemented method of claim 6, wherein training the machine learning model to perform the task associated with the object comprises: providing the at least some of the second set of visual images to the machine learning model as inputs; (paragraph [0102] reciting “input unit 25 accepts... selections of images”)
	receiving outputs from the machine learning model in response to the inputs; (paragraph [0113] reciting “One example of the input data is images... input data is multiplied by the coupling weight... Then, the output data (y1, y2...., yp) is generated”)
	Boardman further discloses and comparing the outputs to the at least one identifier of the object. (col. 29, lines 40-43 reciting “selected group of images from the concurrent or non-concurrent image selection processes is provided as output of the routine for additional analysis in order to measure one or more attributes of the object”)
17.	Regarding Claim 9, Sugahara further discloses The computer-implemented method of claim 6, wherein each of the first set of visual images is captured by an imaging device comprising a visual image sensor, (paragraph [0060] reciting “RGB-D (Red, Green, Blue, Depth) sensors which detect both the color and the depth information can be used”) and wherein each of the first set of visual images is captured with the imaging device and the object in relative rotational or translational motion with respect to one another. (paragraph [0065] reciting "arm 12 
18.	Regarding Claim 10, Sugahara further discloses The computer-implemented method of claim 6, wherein generating the first three-dimensional model comprises: generating a point cloud corresponding to at least a portion of the object based at least in part on the set of depth data; (paragraph [0061] reciting “format of distance image is the 3D point cloud data”);
	tessellating the point cloud; (paragraph [0062] reciting “3D point cloud data can be... data with the noise components filtered out... other data formats including... surface models, solid models, polygons or the like may be used... 3D point cloud data includes all the data formats”);
	and patching at least a portion of at least some of the first set of visual images onto the tessellated point cloud. (paragraph [0078] reciting “If the... accuracy of the 3D point cloud data is limited, the quality of the three-dimensional model can be improved by the manipulation of the users. If the user edits the three-dimensional model, it is possible to add information”).
19.	Regarding Claim 11, Sugahara further discloses The computer-implemented method of claim 6, wherein training the machine learning model to perform the task comprises: annotating each of the second set of visual images with the identifier of the object; (paragraph [0080] reciting “location specifying unit 22a can automatically specify the location of specific regions within each three dimensional model... Such tasks are called the annotation of images... images which indicate specific regions of the object... are called the extracted images in the following”) 
	training the machine learning model to perform the task based at least in part on the training subset, (paragraph [0039] reciting “By using machine learning or deep learning, the image recognition models can be generated. Training images need to include multiple views of the object from different angles"; 
	paragraph [0050] reciting “characteristic learning device 30 generates a model... learning of the model can be done by using... machine learning”;
	paragraph [0099] reciting “data stored in the CAD model storage 23 is used for learning the image recognition model... accessible from the characteristic learning device”)
	Boardman further discloses parsing the second set of visual images into at least a training subset and a testing subset; (col. 28, ln 17-23 reciting “If it is determined to perform the image selection concurrently during image acquisition, the routine continues to block 520, where one or more initial images are acquired for an object of interest, and one of the initial images is selected as a first image in the group, as well as a current first item in an image queue to be used to temporarily store images being acquired until selected images are determined”;
	col. 29, ln 21-26 reciting “If it is instead determined ... that the image selection will occur after all of the images have been acquired, the routine continues instead to... where a plurality of images are acquired of an object of interest, and in block 555 are evaluated to select a subset of the best images to use as the group to represent the object”)
	and testing the machine learning model based at least in part on the testing subset. (col. 2, ln 37-39 reciting “perform various types of manipulations and/or analyses of the generated computer model(s)”;
	col. 14, ln 46-50 reciting “after the group of images to represent the object 150 has been selected and are available in the image data 162, analyze the images of the selected group and generate one or more corresponding models”)
20.	Regarding Claim 12, Sugahara further discloses The computer-implemented method of claim 11, further comprising: calculating at least one error metric for at least some of the images of the testing subset, wherein the at least one error metric is calculated based at least in part on a difference between the identifier of the object and an output received from the machine learning model in response to an input comprising one of the images of the testing subset; (paragraph [0114] reciting “error E between the output data and the teacher data could be represented by using the error function (loss function)”)
	determining that error metrics calculated for images of the testing subset in a category of images (paragraph [0116] reciting “coupling weight is adjusted to ensure that the error E is minimized... calculation of error and the calculation of adjusted amount repeated until the error E approaches 0”)
	wherein the category is one of: an orientation of the first three-dimensional model when one of the images of the testing subset was generated; a lighting condition of the first three-dimensional model when the one of the images of the testing subset was generated; a color of the first three-dimensional model when the one of the images of the testing subset was generated; (paragraph [0082] reciting “set specific colors or tones to the corresponding points or pixels”) or a texture of the first three-dimensional model when the one of the images of the testing subset was generated; generating at least one image based at least in part on the first three-dimensional model, wherein the at least one image is in the category of images; (paragraph [0078] reciting “quality of the three-dimensional model can be improved by the manipulation of the users. If the user edits the three-dimensional model, it is possible to add information such as the composition of each region for the object, which cannot be obtained by the three-dimensional image sensors”)
	and training the machine learning model to perform the task associated with the object based at least in part on the at least one image and the at least one identifier of the object. (paragraph [0039] reciting "By using machine learning or deep learning, the image recognition models can be generated. Training images need to include multiple views of the object from different angles"; 
	paragraph [0050] reciting "characteristic learning device 30 generates a model... learning of the model can be done by using... machine learning”; 
	paragraph [0099] reciting “data stored in the CAD model storage 23 is used for learning the image recognition model... accessible from the characteristic learning device”).
	Boardman further discloses determining that error metrics calculated for images of the testing subset in a category of images exceed a predetermined threshold, (col. 42, ln 45-48 reciting “Field elements whose error is below a certain threshold are called good field elements. For a given image, if the number of good field elements drops below a certain threshold”)
	in response to determining that the error metrics for the images in the testing subset in the category of images exceed the predetermined threshold, (col. 42, ln 45-48 reciting “Field elements whose error is below a certain threshold are 
21.	Regarding Claim 13, Boardman further discloses The computer-implemented method of claim 6, further comprising: transmitting code for operating the machine learning model to at least one computer device over at least one network. (col. 25, ln 38-39 reciting “instructions or information have been received related to performing image acquisition for one or more objects of interest”).
22.	Regarding Claim 14, Sugahara further discloses The computer-implemented method of claim 6, wherein the task comprises: recognizing the object in at least one visual image; (paragraph [0034] reciting "image recognition processes");  or determining an anomaly with the object based at least in part on the at least one visual image.  
23.	Regarding Claim 15, Sugahara further discloses The computer-implemented method of claim 6, further comprising: generating a second three-dimensional model based at least in part on the first three-dimensional model, wherein at least one of a dimension, a color or a texture of the second three-dimensional model is different from the at least one of the dimension, the color or the texture of the first three-dimensional model; (paragraph [0179] reciting “generate new distance images... is called data augmentation. Examples of data augmentation include... modification of darkness, changed colors, additions of random noise, blur corrections or the like”)
	and generating a third set of visual images based at least in part on the second three-dimensional model, wherein each of the third set of visual images depicts the second three-dimensional model rendered in one of a third plurality of orientations, (paragraph [0078] reciting “quality of the three-dimensional model can be improved by the manipulation of the users. If the user edits the three-dimensional model, it is possible to add information such as the composition of each region for the object, which cannot be obtained by the three-dimensional image sensors”) wherein the machine learning model is trained to perform the task associated with the object based at least in part on the at least some of the second set of visual images, at least some of the third set of visual images and the at least one identifier of the object. (paragraph [0039] reciting “By using machine learning or deep learning, the image recognition models can be generated. Training images need to include multiple views of the object from different angles”;
	paragraph [0050] reciting “characteristic learning device 30 generates a model... learning of the model can be done by using... machine learning”; 
	paragraph [0099] reciting “data stored in the CAD model storage 23 is used for learning the image recognition model... accessible from the characteristic learning device”)
24.	Regarding Claim 16, Sugahara further discloses The computer-implemented method of claim 6, wherein the machine learning model is an artificial neural network comprising an input layer having a first plurality of neurons, at least one hidden layer having at least a second plurality of neurons, and an output layer having a third plurality of neurons, (paragraph [0113] reciting “deep learning with neural networks. The input data (x1, x2, ..., xq) are transferred from the input layer to the hidden layer... input data is multiplied by the coupling weight wnij between each layer. Then, the output data (y1, y2, ..., yp) is generated”) wherein a first connection between at least one of the first plurality of neurons and at least one of the second plurality of neurons in the machine learning model has a first synaptic weight, (paragraph [0113] reciting “input data is multiplied by the coupling weight wnij between each layer”) wherein a second connection between at least one of the second plurality of neurons and at least one of the third plurality of neurons in the machine learning model has a second synaptic weight, (paragraph [0113] reciting “input data is multiplied by the coupling weight wnij between each layer”),
	and wherein training the machine learning model to perform the task comprises: selecting at least one of the first synaptic weight for the first connection or the second synaptic weight for the second connection based at least in part on at least one of the second set of visual images and the identifier of the object. (paragraph [0120] reciting “detection of objects and the semantic segmentation of objects is performed by using neural networks whose adjustment of the coupling weight w by back propagation is completed.  If the objects are detected, information of each pixel in the distance image is entered into each input node as the feature values... an image which specifies the pixels corresponding to the object is obtained as the output data”)
25.	Regarding Claim 17, Sugahara further discloses The computer-implemented method of claim 6, wherein the machine learning model is at least one of an artificial neural network, a deep learning system, a support vector machine, a nearest neighbor analysis, a factorization method, a K-means clustering technique, a similarity measure, a latent Dirichlet allocation, a decision tree or a latent semantic analysis. (paragraph [0050] reciting “learning of the model can be done by using... machine learning such as SVM (Support Vector Machines)”).
26.	Regarding Claim 18, Sugahara discloses A computer-implemented method comprising: causing relative rotation of an object with respect to an imaging device configured to capture visual images and depth data; (paragraph [0068] reciting “platform 14 can rotate horizontally”; paragraph [0060] reciting “imaging unit... has a three-dimensional image sensor (distance image sensor) obtaining the depth information (distance image)... RGB-D (Red, Green, Blue, Depth) sensors which detect both the color and the depth information can be used”; paragraph [0038] reciting “image of the target object needs to be taken by a camera”)
	capturing, by the imaging device during the relative rotation of the object with respect to the imaging device, a first set of visual images of the object; (paragraph [0060] reciting “RGB-D (Red, Green, Blue, Depth) sensors which detect both the color and the depth information can be used”;
	paragraph [0056] reciting “three-dimensional imaging device 10 and the image processing device 20 can send or receive data”;
	paragraph [0068] reciting “platform 14 is a flat disk-shaped”;
	paragraph [0172] reciting "axis of rotation is in the center of the disk"; 
	paragraph [0065] reciting “arm 12 supports the imaging unit 11 to the position for taking images. The arm 12 has a rotational mechanism which enables the rotational movement of the imaging unit"); 
	capturing, by the imaging device during the relative rotation of the object with respect to the imaging device, a first set of depth data regarding the object; (paragraph [0060] reciting “depth information (distance image)”; 
	paragraph [0043] reciting “by adjusting the angles by 1, 2, 5 or 10 degrees... multiple distance images for an object could be taken”)
	generating a three-dimensional model of the object based at least in part on the first set of visual images and the first set of depth data; (paragraph [0046] reciting “images taken from multiple angles are combined to generate a CAD model representing the three-dimensional shape of an object”)
	selecting a plurality of orientations for the three-dimensional model; (paragraph [0147] reciting “conditions for the multi-view images (views, angles... ) are determined”); 
	rendering the three-dimensional model in each of the plurality of orientations; (paragraph [0075] reciting “CAD model generator 22 generates a three-dimensional model representing the three-dimensional shape of the object by combining multiple distance images taken from multiple angles in the three-dimensional imaging device”);
	generating a second set of visual images of the three-dimensional model, wherein each of the visual images of the second set is captured with the three-dimensional model rendered in one of the plurality of orientations; (paragraph [0085] reciting "repeating this procedure to the three-dimensional models for each angle, it is possible to generate the annotation images for different angles (views)"; paragraph [0143] reciting "the three-
	training a machine learning model to recognize the object based at least in part on at least some of the second set of the visual images (paragraph [0039] reciting “By using machine learning or deep learning, the image recognition models can be generated. Training images need to include multiple views of the object from different angles”;
	paragraph [0050] reciting “characteristic learning device 30 generates a model... learning of the model can be done by using... machine learning”; 
	paragraph [0099] reciting “data stored in the CAD model storage 23 is used for learning the image recognition model... accessible from the characteristic learning device”)
	While Sugahara does not explicitly disclose, Boardman discloses and an identifier of the object; and distributing code for operating the machine learning model to at least one computer device associated with an end user. (col. 37, ln 60-63 reciting "using machine learning techniques such as learned classifiers. If the classification is performed in Image space, the determined labels will be transferred to the 3D space";
	col. 25, ln 38-39 reciting "instructions or information have been received related to performing image acquisition for one or more objects of interest").
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Sugahara with Boardman by including wherein training a machine learning model to recognize the 
27.	Regarding Claim 19, Sugahara further discloses The computer-implemented method of claim 18, wherein generating the three-dimensional model comprises: generating a point cloud corresponding to at least a portion of the object based at least in part on the first set of depth data; (paragraph [0061] reciting “format of distance image is the 3D point cloud data”)
	tessellating the point cloud; (paragraph [0062] reciting “3D point cloud data can be... data with the noise components filtered out... other data formats including... surface models, solid models, polygons or the like may be used... 3D point cloud data includes all the data formats”) 
	and patching portions of at least some of the first set of visual images onto the tessellated point cloud. (paragraph [0078] reciting “If the... accuracy of the 3D point cloud data is limited, the quality of the three-dimensional model can be improved by the manipulation of the users. If the user edits the three-dimensional model, it is possible to add information”)
28.	Regarding Claim 20, Sugahara further discloses The computer-implemented method of claim 18, wherein the machine learning model is an artificial neural network comprising an input layer having a first plurality of neurons, at least one hidden layer having at least a second plurality of neurons, and an output layer having a third plurality of neurons, (paragraph [0113] reciting “deep learning with neural networks. The input data (x1, x2, ..., xq) are transferred from the input layer to the hidden layer... input data is multiplied by the coupling weight wnij between each layer. Then, the output data (y1, y2, ..., yp) is generated”), wherein a first connection between at least one of the first plurality of neurons and at least one of the second plurality of neurons in the machine learning model has a first synaptic weight, (paragraph [0113] reciting “input data is multiplied by the coupling weight wnij between each layer”)
	wherein a second connection between at least one of the second plurality of neurons and at least one of the third plurality of neurons in the machine learning model has a second synaptic weight, (paragraph [0113] reciting “input data is multiplied by the coupling weight wnij between each layer”)
	and wherein training the machine learning model to perform the task comprises: selecting at least one of the first synaptic weight for the first connection or the second synaptic weight for the second connection based at least in part on at least one of the second set of visual images and the identifier of the object. (paragraph [0120] reciting “detection of objects and the semantic segmentation of objects is performed by using neural networks whose adjustment of the coupling weight w by back propagation is completed. If the objects are detected, information of each pixel in the distance image is entered into each input node as the feature values... an image which specifies the pixels corresponding to the object is obtained as the output data”)
CONTACT


Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached on 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/FRANK S CHEN/Primary Examiner, Art Unit 2611