DETAILED ACTION

Notice of Pre-AIA or AIA Status
	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on June 09, 2022 has been entered.
	Claims 1, 9, and 15 have been amended; claim 3 has been canceled; claims 46-47 have been added.  Claims 1-47 are still pending in this application.

Response to Arguments
Applicant’s arguments with respect to claims 1, 9, 15, 21, 33 and 39 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.


Claim Rejections - 35 USC § 102
	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


	Claims 1-2, 9-11, 15-16, 21-22, 33-35, 39-41 and 45-46 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Jiang et al. (US 20210103776 A1, hereinafter Jiang).
As to claim 1, Jiang discloses a processor (Jiang, see FIG. 1, processor 105) comprising:
one or more circuits (Jiang, see FIG. 1 and at least par. [0043] The processor(s) 105 may include, without limitation, a CPU-type processing unit, a GPU-type processing unit, a field-programmable gate array (FPGA), a digital signal processor (DSP), or other hardware logic components that can, in some instances, be driven by a CPU. For example, and without limitation, illustrative types of hardware logic components that can be used include Application-Specific Integrated Circuits (ASICs)) to cause one or more neural networks to generate a three-dimensional (3D) model of an object based, at least in part, on a plurality of images of the object (Jiang, see at least par. [0006] determining a set of 3D salient parts of the object using the deep neural network to generate the salient 3D model of the object…  [0127] Training the deep neural network begins at step 702, where training datasets that include a set of two-dimensional (2D) images of the object from multiple views are received. The set of 2D images may be captured in different settings (e.g., different angles, different light settings, different environments, etc.) for each of the training datasets. A set of 3D models from the set of 2D images in each of the training datasets are reconstructed based on salient points of the object selected during reconstruction at 704. From the reconstructed 3D models, salient 3D models of the object may be generated that are an aggregation of the salient points of the object in the set of 3D models).
	
As to claim 2, Jiang further discloses wherein an image, of the plurality of images, comprises three-dimensional data indicative of three-dimensional locations on a surface of the object (Jiang, see at least par. [0113] FIG. 6A illustrates an example embodiment of a testing stage in accordance with testing stage 200B of FIG. 2A. In the testing stage 200B (post-training), the trained 3D assisted object detection network 214 is used to infer objects from a new dataset (e.g., new or input images 601). In one embodiment, new images 601 are input into an object detection network 602 of the 3D assisted object detection network inference 216 stage of the feed forward inference process. The object detection network 602 detects object bounding boxes 603 from the new images 601 (object bounding boxes are explained further below), and sends the object bounding boxes 603 to the 3D assisted object detection network 214 (now trained) for detecting and segmenting the new images 601. As a result of detection and segmentation, the 3D assisted object detection network 214 outputs the 2D object localization information 218 and 3D object surface coordinates 220 of the new images 601.).

As to claim 10, Jiang discloses wherein the plurality of images comprise point data indicative of locations on a surface of the object (Jiang, see at least par. [0069] In the testing stage 200B, new images 601 are input into the 3D assisted object detection network 214 (now trained) for processing. As a result of processing, the 3D assisted object detection network 214 outputs the 2D object localization information 218 (the location of the object in the image) and 3D object surface coordinate 220. A detailed description of the testing stage 200B may be found below with reference to FIGS. 6A and 6B.).

As to claim 11, Jiang further discloses wherein the 3D model is a probabilistic model (Jiang, see at least par. [0110] The output of the RoIAlign Layer 214B is fed into a fully convolutional network (FCN) 214C for object (or background) classification and 2D object surface coordinates 606 (e.g., estimated 2D object surface coordinates of each pixel). Classification of each pixel in the object bounding box 502 may be accomplished using, for example, the SVM-based method described above. As a result of classification, a set of K+1 probabilities is determined that indicates the probability of a pixel belonging to K object parts or the non-object background (non-object part). That is, the pixels associated with each object part or non-object background are classified according to their probabilities, such as a classification score.).
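
For illustration only, the set of K+1 per-pixel probabilities described in the cited paragraph can be sketched as follows. This is a minimal sketch that uses a softmax over hypothetical raw scores, not the SVM-based method Jiang mentions; the function name, variable names, and score values are assumptions and do not come from Jiang.

```python
import math

def class_probabilities(scores):
    """Convert K+1 raw scores into probabilities that sum to 1 (softmax).

    Assumption: softmax is used here purely for illustration; the cited
    reference describes an SVM-based classification instead.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for one pixel: index 0 is the non-object background,
# indices 1..K are object parts (here K = 3).
pixel_scores = [0.2, 2.5, 0.1, -1.0]
probs = class_probabilities(pixel_scores)

# The pixel is assigned to the class with the highest probability.
label = max(range(len(probs)), key=probs.__getitem__)
```

In this sketch the pixel is assigned to whichever of the K+1 classes has the highest probability, which mirrors classifying pixels "according to their probabilities".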

As to claim 16 (Currently Amended), Jiang further discloses wherein an image, of the plurality of images, comprises information indicative of locations on a surface of the object (Jiang, see at least par. [0069] In the testing stage 200B, new images 601 are input into the 3D assisted object detection network 214 (now trained) for processing. As a result of processing, the 3D assisted object detection network 214 outputs the 2D object localization information 218 (the location of the object in the image) and 3D object surface coordinate 220. A detailed description of the testing stage 200B may be found below with reference to FIGS. 6A and 6B.).

As to claim 21 (Currently Amended), Jiang discloses a car, comprising:
a three-dimensional sensor (Jiang, see par. [0140], the camera sensor aide for pose tracking may be received from RGB camera 814, or from depth camera 815. In some embodiments, when the depth camera 815 is an active sensor, a projected pattern is used to estimate depth, this sensor (815) is used itself to track the pose of the computing device. If, however, 815 is a passive sensor and consists of two RGB/grayscale cameras paired in the stereo pair, then 814 typically does not exist by itself, and may be one of these two cameras. In those situations, RGB from one of the 2 cameras and/or depth information derived from the stereo pair may be used for camera tracking purposes.); and
one or more processors to be configured to process data obtained by the three-dimensional sensor, the data processed based at least in part on a 3D model of an object generated by one or more neural networks based, at least in part, on a plurality of images of the object (Jiang, see at least par. [0006], determining a set of 3D salient parts of the object using the deep neural network to generate the salient 3D model of the object;  [0127] Training the deep neural network begins at step 702, where training datasets that include a set of two-dimensional (2D) images of the object from multiple views are received. The set of 2D images may be captured in different settings (e.g., different angles, different light settings, different environments, etc.) for each of the training datasets. A set of 3D models from the set of 2D images in each of the training datasets are reconstructed based on salient points of the object selected during reconstruction at 704. From the reconstructed 3D models, salient 3D models of the object may be generated that are an aggregation of the salient points of the object in the set of 3D models. At 706, a set of training 2D-3D correspondence data are generated between the set of 2D images of the object in a first training dataset and the salient 3D model of the object generated using the first training dataset. Using the set of training 2D-3D correspondence data generated using the first training dataset, a deep neural network is trained at 708 for object detection and segmentation.).

Claims 9, 15, 33 and 39 are rejected for the same rationale as claim 1.
Claims 22 and 34 are rejected for the same rationale as claim 10.
Claims 35 and 41 are rejected for the same rationale as claim 8.
Claim 40 is rejected for the same rationale as claim 2.
As to claim 46 (New), Jiang discloses the processor of claim 1, wherein the one or more circuits are to use the three-dimensional model generated by the one or more neural networks to align images of the object (Jiang, see at least par. [0129] At 711, as part of the salient attention learning, a set of matching 3D points is computed for each set of matching 3D models in the set of 3D models, and a six degree of freedom (6 DoF) rotation and translation is calculated at 712 to transform the set of matching 3D models. The 6 DoF rotations and translation are refined to align each of the 3D models into a unified 3D world coordinate system and to generate a unified 3D model by aligning each of the 3D models in the set of 3D models at 714 such that a set of 3D salient parts of the object may be determined using the deep neural network to generate the salient 3D model of the object at 716.).
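
For illustration only, applying a 6 DoF rotation and translation to bring the points of one 3D model into a unified world coordinate system, as the cited paragraph describes, can be sketched as follows. This is a minimal sketch under stated assumptions: the roll/pitch/yaw parameterization, the function names, and the sample points are hypothetical and are not taken from Jiang, and the sketch does not estimate or refine the transform.

```python
import math

def rotation_matrix(roll, pitch, yaw):
    """3x3 rotation R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]

def to_world(points, rotation, translation):
    """Apply p_world = R * p + t to every 3D point."""
    return [
        tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        for p in points
    ]

# Hypothetical model points and transform: a 90-degree yaw plus a shift along x.
model_points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
rotation = rotation_matrix(0.0, 0.0, math.pi / 2)
translation = (1.0, 0.0, 0.0)
world_points = to_world(model_points, rotation, translation)
```

Three rotation angles and three translation components together make up the six degrees of freedom; applying the same kind of transform to each model places all of them in one coordinate frame.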

Claim Rejections - 35 USC § 103
7.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


9.	Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) as applied to claim 1 and further in view of Tian et al. (US 20070256189 A1, hereinafter Tian).
As to claim 4, Jiang discloses claim 1 and the 3D model, but does not explicitly disclose wherein the 3D model comprises a Gaussian mixture model, and wherein parameters for the Gaussian mixture model are generated based at least in part on alignment of the plurality of images of the object, the alignment based at least in part on a registration transform generated from the Gaussian mixture model.  However, Tian discloses wherein the 3D model comprises a Gaussian mixture model, and wherein parameters for the Gaussian mixture model are generated based at least in part on alignment of the plurality of images of the object, the alignment based at least in part on a registration transform generated from the Gaussian mixture model (Tian, see at least par. [0024] In step 402, alignment probabilities are estimated, for example, by computing device 301, for different source-target vector pairs. In this example, the alignment probabilities may be estimated using techniques related to Hidden Markov Models (HMM), statistical models related to extracting unknown, or hidden, parameters from observable parameters in a data distribution model. For example, each distinct vector in the source and target vector sequences may be generated by a left-to-right finite state machine that changes state once per time unit. Such finite state machines may be known as Markov Models. In addition, alignment probabilities may also be training weights, for example, values representing weights used to generate training parameters for a GMM based transformation.  Thus, an alignment probability need not be represented as a value in a probability range (e.g., 0 to 1, or 0 to 100), but might be a value corresponding to some weight in the training weight scheme used in a conversion.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang such that the 3D model comprises a Gaussian mixture model, and wherein parameters for the Gaussian mixture model are generated based at least in part on alignment of the plurality of images of the object, the alignment based at least in part on a registration transform generated from the Gaussian mixture model, as taught by Tian, in order to reduce alignment errors and allow for increased efficiency and quality when performing vector transformations.

10.	Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Tian et al. (US 20070256189 A1, hereinafter Tian) as applied to claim 4 above and further in view of Paul et al. (US 20200265259 A1, hereinafter “Paul”).
As to claim 5, Jiang in view of Tian does not disclose wherein the registration transform is generated to be in a closed form enabling back-propagation of a registration error.  However, Paul teaches:
wherein the registration transform is generated to be in a closed form enabling back-propagation of a registration error (Paul, see at least par. [0022], “The GANs may synthesize new 3D data based on the generated input specific noise data and the identified ROIs. The GANs may then back-propagate any error (i.e., any difference between the synthesized 3D data and input 3D data 207) to the noise generating module 202.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Tian such that the registration transform is generated to be in a closed form enabling back-propagation of a registration error, as taught by Paul, thereby providing a method that performs operations including clustering the initial 3D data to identify one or more ROIs. The operations may further include generating input specific noise data based on the one or more ROIs by an iterative process using a Gaussian mixture model. The operations may further include iteratively synthesizing 3D data based on the one or more ROIs and the input specific noise data using GANs to generate final synthesized 3D data. The final synthesized 3D data may represent the plurality of possible scenarios and may be affine transforms of the initial 3D data (Paul, see par. [0007]).

11.	Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Tian et al. (US 20070256189 A1, hereinafter Tian), and further in view of Paul et al. (US 20200265259 A1, hereinafter “Paul”) as applied to claim 4 above and further in view of Qiu et al. (US 20200367970 A1, hereinafter “Qiu”).
As to claim 6, Jiang in view of Tian and further in view of Paul does not disclose wherein the registration transform maps points in the plurality of images to a common coordinate system.  However, Qiu teaches wherein the registration transform maps points in the plurality of images to a common coordinate system (Qiu, see at least par. [0222], “A registration transform maps points in the physical space (the reference sensor) to the virtual space (data models).”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Tian and further in view of Paul such that the registration transform maps points in the plurality of images to a common coordinate system, as taught by Qiu, thereby to provide a convolutional neural network system and, more particularly, to improve planning, intervention, guidance, and education through quantitative and spatial feedback. As a spatial communication medium, it enables better application and dissemination of knowledge and expertise, as discussed by Qiu (see par. [0004]).

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) as applied to claim 1 above and further in view of Gausebeck et al. (US 2019002696 A1, hereinafter Gausebeck).
As to claim 7, Gausebeck discloses wherein the one or more neural networks encode a geometry of the object (Gausebeck, see at least par. [0071], “the geometric data can comprise data points of geometry in addition to comprising texture coordinates associated with the data points of geometry (e.g., texture coordinates that indicate how to apply texture data to geometric data). In various embodiments, received 2D image data 102 (or portions thereof) can be associated with portions of the mesh to associate visual data from the 2D image data 102 (e.g., texture data, color data, etc.) with the mesh. In this regard, the 3D model generation component 118 can generate 3D models based on 2D images and the 3D data respectively associated with the 2D images.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang such that the one or more neural networks encode a geometry of the object, as taught by Gausebeck, in order to provide devices and techniques for accurately and efficiently aligning the 2D images using the 3D data to generate immersive 3D environments, which are in high demand.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) as applied to claim 1 above and further in view of MORIKAWA (US 20200005438 A1, hereinafter MORIKAWA).
As to claim 8, Jiang discloses claim 1, but does not explicitly disclose wherein the plurality of images comprise one or more labelled points corresponding to locations on an occluded surface of the object.  However, MORIKAWA discloses wherein the plurality of images comprise one or more labelled points corresponding to locations on an occluded surface of the object (MORIKAWA, see at least par. [0028] Referring to FIG. 2, with continued reference to FIG. 1, image acquisition device 104 generates image 200 at position 120 and time T1 (in accordance with environment 100 in FIG. 1). As shown in FIG. 1, line of sight 116 of image acquisition device 104 to point 112 is occluded by occlusion 114 when surveying system 102 is at position 120 at time T1. Accordingly, point 112 on a surface of object 110 is not visible in image 200. Image point 130 is a location (e.g., a 2D coordinate) in image 200 that corresponds to the 3D coordinate of the data point representing point 112. In one embodiment, image point 130 is determined by mapping the 3D coordinate of the data point to the 2D coordinate of image 200 using Equation 1, as discussed below. Since occlusion 114 occludes point 112 on a surface of object 110 in image 200, image point 130 represents (e.g., depicts) a point on occlusion 114, instead of point 112 on a surface of object 110. Occlusion 114 may be any object occluding, at least in part, point 112 in image 200. Examples of occlusion 114 may include a vehicle, a tree, a pole, etc. It should be understood that image 200 may include any number of image points corresponding to different data points in the set of data points, where the different data points correspond to points on a surface of object 110 that may or may not be visible in image 200.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang such that the plurality of images comprise one or more labelled points corresponding to locations on an occluded surface of the object, as taught by MORIKAWA, in order to provide a method and apparatus that helps the human operator by efficiently identifying the location of the occluded objects.

Claim 45 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) as applied to claim 1 above and further in view of HU (US 20200168320 A1, hereinafter HU).
As to claim 45, Jiang discloses the processor of claim 1, but does not explicitly disclose wherein the one or more neural networks are trained based, at least in part, on differences between the three-dimensional model and the object.  However, HU discloses wherein the one or more neural networks are trained based, at least in part, on differences between the three-dimensional model and the object (HU, see at least par. [0072] In some embodiments, wherein processing the second image (or second pixel array) through the trained artificial neural network and the customization layer, the system may input the second pixel array into the first customization layer. The system may then receive a preliminary output from the first customization layer. The system may then input the preliminary output from the first customization layer into the trained artificial neural network. For example, as described in FIG. 2, the system may first process the image through the customization layer to normalize the differences between the first image and the second image, and then identify the object in the second image. In another example, in some embodiments, the first customization layer comprises a generative neural network and the trained artificial neural network is a discriminative neural network. As described in FIG. 3, the system may determine a portion of the known object that is obscured in the second image and generate a version of the second image where the portion is not obscured. That is, the generative neural network may take images that otherwise lack specific features and produce images that can be interpreted by the artificial neural network to execute imaging control instructions. In another example, as described in FIG. 4, the first customization layer comprises a geometric neural network and the trained artificial neural network is a convolutional neural network. The system may determine a three-dimensional model of the known object via the customization layer, and then the system may label a feature of the known object in the second image based on the three-dimensional model.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang such that the one or more neural networks are trained based, at least in part, on differences between the three-dimensional model and the object, as taught by HU, in order to provide a system that improves autonomous imaging of images that lack specific features, require semantic labels, and/or need nonlinear adjustments, conditions that would otherwise prevent the trained discriminative artificial neural network from properly classifying objects in the image.

12.	Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) as applied to claim 11 above and in view of Bart et al. (US 20140100952 A1, hereinafter “Bart”).
As to claim 12, Jiang discloses claim 11 and the probabilistic model, but does not explicitly disclose wherein the probabilistic model is computed based at least in part on a weight matrix output by the one or more neural networks.  However, Bart discloses wherein the probabilistic model is computed based at least in part on a weight matrix output by the one or more neural networks (Bart, see at least par. [0045], “In the case of a probabilistic model, they can simply be probabilities of the user or offer belonging to each group. The system generates a combined weight matrix with one dimension of the matrix being the user groups and another dimension being the offer groups. Each element in the matrix is the corresponding combined weight generated by appropriately combining the respective user and offer weight. More details on the weight matrix are described in conjunction with FIG. 3.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang such that the probabilistic model is computed based at least in part on a weight matrix output by the one or more neural networks, as taught by Bart, thereby providing a convolutional neural network system and, more particularly, a method in which the one or more processors determine an offer time window by applying a mapping function from offer information to time windows pertaining to an offer's validity, as discussed by Bart (see par. [0014]).

13.	Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) as applied to claim 11 above and in view of Ananda (US 20150036889 A1).
As to claim 13, Jiang discloses claim 11 and the probabilistic model, but does not explicitly disclose wherein a registration transform is computed based at least in part on the probabilistic model.  However, Ananda teaches wherein a registration transform is computed based at least in part on the probabilistic model (Ananda, see at least par. [0038], “The temporal profile is preferably based upon the transformation model of Block S140 as trained in Block S150, in cooperation with a probabilistic model to allow generation of a profile of an amount (e.g., concentration, enumeration, density, etc.) of the therapeutic substance carrier within a patient's body over time. Preferably, the probabilistic model comprises a hidden Markov model; however, the probabilistic model can alternatively comprise any other suitable model for generating a temporal profile of the therapeutic substance carrier”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang such that a registration transform is computed based at least in part on the probabilistic model, as taught by Ananda, thereby providing a convolutional neural network system and, more particularly, a method in which the one or more processors transform the image dataset and the spectra dataset, preferably at a processing subsystem that receives the image and the spectra datasets, although transforming the image dataset and/or the spectra dataset can alternatively be performed at any other suitable processing subsystem, as discussed by Ananda (see par. [0017]).

14.	Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Ananda (US 20150036889 A1) as applied to claim 13 above, and further in view of KIM et al. (US 20180197084 A1, hereinafter “KIM”).
As to claim 14, Jiang in view of Ananda discloses claim 13, but does not explicitly disclose wherein a registration error is back-propagated to the one or more neural networks during training.  However, KIM teaches wherein a registration error is back-propagated to the one or more neural networks during training (KIM, see at least par. [0034], “In CNN learning, an error backpropagation algorithm may be used to back-propagate the weight error in the direction of minimizing the difference value between the result value and the expected value of such an operation”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Ananda such that a registration error is back-propagated to the one or more neural networks during training, as taught by KIM, thereby providing a convolutional neural network system and, more particularly, a method in which the one or more processors reduce the amount of learning parameters required for an FC layer in a CNN model. The present disclosure also provides a method for performing a recognition task by converting a learning parameter into a binary variable (`-1` or `1`) in an FC layer. The present disclosure also provides a method and device for changing a learning parameter of an FC layer to a binary form to reduce the cost of managing learning parameters, as discussed by KIM (see par. [0006]).
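
For illustration only, the idea in the cited KIM paragraph of adjusting a weight in the direction that minimizes the difference between the result value and the expected value can be sketched with a single-weight model. This is a minimal sketch under stated assumptions: the one-weight linear model, the squared-error loss, the function name, and the sample data are hypothetical and are not taken from KIM or from the claims.

```python
def train_weight(samples, learning_rate=0.05, epochs=200):
    """Fit result = weight * x by propagating the squared error back to the weight."""
    weight = 0.0
    for _ in range(epochs):
        for x, expected in samples:
            result = weight * x                 # forward pass
            error = result - expected           # difference from the expected value
            gradient = 2.0 * error * x          # d(error**2) / d(weight)
            weight -= learning_rate * gradient  # step that reduces the error
    return weight

# Hypothetical training pairs generated by expected = 2 * x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
learned = train_weight(samples)
```

With these samples the weight converges toward 2, the value at which the difference between result and expected value vanishes; a multi-layer network applies the same update to every weight by using the chain rule.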

Claim 47 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) as applied to claim 1 above and in view of Guo et al. (US 20210158511 A1, hereinafter Guo).
As to claim 47 (New), Jiang discloses the processor of claim 1 and the three-dimensional model generated by the one or more neural networks, but does not explicitly disclose wherein the three-dimensional model generated by the one or more neural networks comprises parameters of a statistical model usable to generate a transform to align images of the object.  However, Guo discloses wherein the three-dimensional model generated by the one or more neural networks comprises parameters of a statistical model usable to generate a transform to align images of the object (Guo, see at least par. [0004], the image segmentation system may align the shape prior substantially with the first segmentation and prevent the segmentation task from being stuck in local minima. The deformation of the shape prior may be performed using a statistical model of shape or appearance that is associated with the anatomical structure, and the deformation may include the second neural network adjusting one or more parameters of the statistical model based on features (e.g., such as an intensity profile or a gradient) of the image).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang such that the three-dimensional model generated by the one or more neural networks comprises parameters of a statistical model usable to generate a transform to align images of the object, as suggested by Guo, in order to allow for a more focused analysis or study of the object. Aided by advanced computer vision and machine learning technologies, the accuracy and robustness of image segmentation have improved significantly (Guo, see par. [0003]).

15.	Claims 17, 23, 36 and 42 are rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) as applied to claims 15, 33 and 39 above and in view of Tian et al. (US 20070256189 A1, hereinafter Tian).
As to claim 17 (Currently Amended), Jiang discloses claim 15, but does not explicitly disclose cause the one or more processors to at least: align the plurality of images based, at least in part, on a Gaussian mixture model.  However, Tian teaches cause the one or more processors to at least: align the plurality of images based, at least in part, on a Gaussian mixture model (Tian, see at least par. [0024] In step 402, alignment probabilities are estimated, for example, by computing device 301, for different source-target vector pairs. In this example, the alignment probabilities may be estimated using techniques related to Hidden Markov Models (HMM), statistical models related to extracting unknown, or hidden, parameters from observable parameters in a data distribution model. For example, each distinct vector in the source and target vector sequences may be generated by a left-to-right finite state machine that changes state once per time unit. Such finite state machines may be known as Markov Models. In addition, alignment probabilities may also be training weights, for example, values representing weights used to generate training parameters for a GMM based transformation.  Thus, an alignment probability need not be represented as a value in a probability range (e.g., 0 to 1, or 0 to 100), but might be a value corresponding to some weight in the training weight scheme used in a conversion.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang to cause the one or more processors to at least: align the plurality of images based, at least in part, on a Gaussian mixture model, as taught by Tian, in order to reduce alignment errors and allow for increased efficiency and quality when performing vector transformations.

As to claims 23 and 36. Claims 23 and 36 are rejected under the same rationale as claim 17.

16.	Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Tian et al. (US 20070256189 A1, hereinafter “Tian”) and further in view of KOMATSU et al. (US 20200152171 A1, hereinafter “KOMATSU”).
As to claim 18.  (Currently Amended) Jiang in view of Tian does not disclose wherein the Gaussian mixture model is computed based at least in part on a weight matrix output by the one or more neural networks.  However, KOMATSU teaches wherein the Gaussian mixture model is computed based at least in part on a weight matrix output by the one or more neural networks (KOMATSU, see at least par. [0061], “The detection unit 204 receives, as input, weights transmitted, for example, as a weight matrix H from the analysis unit 103 …  The detection unit 204 may detect which object signal source exists in each time frame of Y by using a discriminator using a value of each element of H as a feature value. As a training model of a discriminator, for example, a support vector machine (SVM) or a Gaussian mixture model (GMM) is applicable”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Tian such that the Gaussian mixture model is computed based at least in part on a weight matrix output by the one or more neural networks, as taught by KOMATSU, thereby providing a convolutional neural network system and, more particularly, a method that provides a signal processing technique capable of acquiring information of an object signal component that is modeled at a low memory cost even when a variation of object signals is large, as discussed by KOMATSU (see par. [0007]).

17.	Claims 19 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Tian et al. (US 20070256189 A1, hereinafter “Tian”) as applied to claim 17 above, and further in view of Risser (US 20190043242 A1).
As to claim 19.  (Currently Amended) Jiang in view of Tian discloses claim 17, but does not explicitly disclose wherein a registration transform is computed based at least in part on the Gaussian mixture model.  However, Risser teaches wherein a registration transform is computed based at least in part on the Gaussian mixture model (Risser, see at least par. [0032], “systems synthesize geometry in the form of a 3D mesh wherein the mean of each Gaussian function comprising a GMM is allowed to change through the upscale, randomization and correction process defined for image synthesis. 3D model synthesis shares an additional step in common with displacement map synthesis, where a new position is found for a pixel based on relative position of the pixel as compared to neighboring pixels, this concept is extended from a 1-element vector to a 3-element vector”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Tian such that a registration transform is computed based at least in part on the Gaussian mixture model, as taught by Risser, thereby providing a convolutional neural network system and, more particularly, a method that allows a user to supply a fully textured mesh as an input exemplar and apply the texture from the exemplar onto a different untextured model. A mapping is computed through NNS from the untextured model's shape and the textured model's shape. This mapping bootstraps and guides the material synthesis process, as discussed by Risser (see par. [0031]).
As to claim 25. Claim 25 is rejected under the same rationale as claim 19.

18.	Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Tian et al. (US 20070256189 A1, hereinafter “Tian”) as applied to claim 17 above, further in view of Risser (US 20190043242 A1) as applied to claim 19 above, and further in view of KIM et al. (US 20180197084 A1, hereinafter “KIM”).
As to claim 20.  (Currently Amended) Jiang in view of Tian and further in view of Risser discloses claim 19, but does not explicitly disclose wherein a registration error is back-propagated to the one or more neural networks during training.  However, KIM teaches wherein a registration error is back-propagated to the one or more neural networks during training (KIM, see at least par. [0034], “In CNN learning, an error backpropagation algorithm may be used to back-propagate the weight error in the direction of minimizing the difference value between the result value and the expected value of such an operation”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Tian and further in view of Risser such that a registration error is back-propagated to the one or more neural networks during training, as taught by KIM, thereby providing a convolutional neural network system and, more particularly, a method and device for reducing the amount of learning parameters required for an FC layer in a CNN model. The present disclosure also provides a method for performing a recognition task by converting a learning parameter into a binary variable (`-1` or `1`) in an FC layer. The present disclosure also provides a method and device for changing a learning parameter of an FC layer to a binary form to reduce the cost of managing learning parameters, as discussed by KIM (see par. [0006]).

19.	Claim 24 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Tian et al. (US 20070256189 A1, hereinafter “Tian”) as applied to claim 23 above, and further in view of KOMATSU et al. (US 20200152171 A1, hereinafter “KOMATSU”).
As to claim 24.  Jiang in view of Tian does not disclose “wherein the Gaussian mixture model is computed based at least in part on a weight matrix output by the one or more neural networks”.  However, KOMATSU teaches wherein the Gaussian mixture model is computed based at least in part on a weight matrix output by the one or more neural networks (KOMATSU, see at least par. [0061], “The detection unit 204 receives, as input, weights transmitted, for example, as a weight matrix H from the analysis unit 103 …  The detection unit 204 may detect which object signal source exists in each time frame of Y by using a discriminator using a value of each element of H as a feature value. As a training model of a discriminator, for example, a support vector machine (SVM) or a Gaussian mixture model (GMM) is applicable”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Tian such that the Gaussian mixture model is computed based at least in part on a weight matrix output by the one or more neural networks, as taught by KOMATSU, thereby providing a convolutional neural network system and, more particularly, a method that provides a signal processing technique capable of acquiring information of an object signal component that is modeled at a low memory cost even when a variation of object signals is large, as discussed by KOMATSU (see par. [0007]).

20.	Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Tian et al. (US 20070256189 A1, hereinafter “Tian”), further in view of Risser (US 20190043242 A1) as applied to claim 25 above, and further in view of KIM et al. (US 20180197084 A1, hereinafter “KIM”).
As to claim 26.  Jiang in view of Tian and further in view of Risser does not disclose “wherein a registration error is back-propagated through the one or more neural networks during training”.  However, KIM teaches wherein a registration error is back-propagated through the one or more neural networks during training (KIM, see at least par. [0034], “In CNN learning, an error backpropagation algorithm may be used to back-propagate the weight error in the direction of minimizing the difference value between the result value and the expected value of such an operation”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Tian and further in view of Risser such that a registration error is back-propagated through the one or more neural networks during training, as taught by KIM, thereby providing a convolutional neural network system and, more particularly, a method and device for reducing the amount of learning parameters required for an FC layer in a CNN model. The present disclosure also provides a method for performing a recognition task by converting a learning parameter into a binary variable (`-1` or `1`) in an FC layer. The present disclosure also provides a method and device for changing a learning parameter of an FC layer to a binary form to reduce the cost of managing learning parameters, as discussed by KIM (see par. [0006]).

21.	Claims 27 and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Yamaguchi et al. (US 20020024517 A1, hereinafter “Yamaguchi”).
As to claim 27. Jiang discloses the limitations of claim 27 under the same rationale as claim 1, but does not disclose a processor, comprising: one or more arithmetic logic units (ALUs).  However, Yamaguchi discloses a processor, comprising: one or more arithmetic logic units (ALUs) (Yamaguchi, see at least [0184] In the arithmetic logic unit 500 diagrammed in FIG. 7, the generation of a three-dimensional model of the object 10 is omitted, and an image of the object 10 as seen along the line of sight 41 from the viewpoint 40 is generated directly from the multi-eyes stereo data. The method used here is similar to the method of establishing voxels according to a viewpoint coordinate system i4, j4, d4 as described with reference to FIG. 3. In the method used here, however, a three-dimensional model is not produced, wherefore the voxel concept is no longer used. Here, for each coordinate in the viewpoint coordinate system i4, j4, d4, a check is done to see whether or not there are corresponding multi-eyes stereo data and, when there are, an image seen from the viewpoint 40 is rendered directly using those multi-eyes stereo data.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang to have a processor, comprising: one or more arithmetic logic units (ALUs), as taught by Yamaguchi, thereby providing a convolutional neural network system and, more particularly, a method that makes it possible to produce images and perform rendering at higher speed.

As to claim 28.  Jiang in view of Yamaguchi further discloses wherein an image, of the plurality of images, comprises data indicative of locations on a surface of the object (Jiang, see at least par. [0069] In the testing stage 200B, new images 601 are input into the 3D assisted object detection network 214 (now trained) for processing. As a result of processing, the 3D assisted object detection network 214 outputs the 2D object localization information 218 (the location of the object in the image) and 3D object surface coordinate 220. A detailed description of the testing stage 200B may be found below with reference to FIGS. 6A and 6B.).

22.	Claim 29 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Yamaguchi et al. (US 20020024517 A1, hereinafter “Yamaguchi”) as applied to claim 27 above, and further in view of Tian et al. (US 20070256189 A1, hereinafter “Tian”).
As to claim 29. Jiang in view of Yamaguchi discloses claim 27, but does not explicitly disclose “wherein the plurality of images are aligned based, at least in part, on a Gaussian mixture model”.  However, Tian teaches wherein the plurality of images are aligned based, at least in part, on a Gaussian mixture model (Tian, see at least par. [0024] In step 402, alignment probabilities are estimated, for example, by computing device 301, for different source-target vector pairs. In this example, the alignment probabilities may be estimated using techniques related to Hidden Markov Models (HMM), statistical models related to extracting unknown, or hidden, parameters from observable parameters in a data distribution model. For example, each distinct vector in the source and target vector sequences may be generated by a left-to-right finite state machine that changes state once per time unit. Such finite state machines may be known as Markov Models. In addition, alignment probabilities may also be training weights, for example, values representing weights used to generate training parameters for a GMM based transformation.  Thus, an alignment probability need not be represented as a value in a probability range (e.g., 0 to 1, or 0 to 100), but might be a value corresponding to some weight in the training weight scheme used in a conversion).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Yamaguchi such that the plurality of images are aligned based, at least in part, on a Gaussian mixture model, as taught by Tian, in order to reduce alignment errors and allow for increased efficiency and quality when performing vector transformations.

23.	Claim 30 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Yamaguchi et al. (US 20020024517 A1, hereinafter “Yamaguchi”) and further in view of Tian et al. (US 20070256189 A1, hereinafter “Tian”) as applied to claim 29 above, and further in view of KOMATSU et al. (US 20200152171 A1, hereinafter “KOMATSU”).
As to claim 30.  Jiang in view of Yamaguchi and further in view of Tian discloses claim 29, but does not explicitly disclose “wherein the Gaussian mixture model is computed based at least in part on a weight matrix output by the one or more neural networks”.  However, KOMATSU teaches wherein the Gaussian mixture model is computed based at least in part on a weight matrix output by the one or more neural networks (KOMATSU, see at least par. [0061], “The detection unit 204 receives, as input, weights transmitted, for example, as a weight matrix H from the analysis unit 103 …  The detection unit 204 may detect which object signal source exists in each time frame of Y by using a discriminator using a value of each element of H as a feature value. As a training model of a discriminator, for example, a support vector machine (SVM) or a Gaussian mixture model (GMM) is applicable”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Yamaguchi and further in view of Tian such that the Gaussian mixture model is computed based at least in part on a weight matrix output by the one or more neural networks, as taught by KOMATSU, thereby providing a convolutional neural network system and, more particularly, a method that provides a signal processing technique capable of acquiring information of an object signal component that is modeled at a low memory cost even when a variation of object signals is large, as discussed by KOMATSU (see par. [0007]).

24.	Claim 31 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Yamaguchi et al. (US 20020024517 A1, hereinafter “Yamaguchi”), further in view of Tian et al. (US 20070256189 A1, hereinafter “Tian”), further in view of KOMATSU et al. (US 20200152171 A1, hereinafter “KOMATSU”) as applied to claim 30 above, and further in view of Risser (US 20190043242 A1).
As to claim 31.  Jiang in view of Yamaguchi, further in view of Tian and further in view of KOMATSU discloses claim 30, but does not explicitly disclose “wherein a registration transform is computed based at least in part on the Gaussian mixture model”.  However, Risser teaches wherein a registration transform is computed based at least in part on the Gaussian mixture model (Risser, see at least par. [0032], “systems synthesize geometry in the form of a 3D mesh wherein the mean of each Gaussian function comprising a GMM is allowed to change through the upscale, randomization and correction process defined for image synthesis. 3D model synthesis shares an additional step in common with displacement map synthesis, where a new position is found for a pixel based on relative position of the pixel as compared to neighboring pixels, this concept is extended from a 1-element vector to a 3-element vector”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Yamaguchi, further in view of Tian and further in view of KOMATSU such that a registration transform is computed based at least in part on the Gaussian mixture model, as taught by Risser, thereby providing a convolutional neural network system and, more particularly, a method that allows a user to supply a fully textured mesh as an input exemplar and apply the texture from the exemplar onto a different untextured model. A mapping is computed through NNS from the untextured model's shape and the textured model's shape. This mapping bootstraps and guides the material synthesis process, as discussed by Risser (see par. [0031]).

25.	Claim 32 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Yamaguchi et al. (US 20020024517 A1, hereinafter “Yamaguchi”), further in view of Tian et al. (US 20070256189 A1, hereinafter “Tian”), further in view of KOMATSU et al. (US 20200152171 A1, hereinafter “KOMATSU”) and further in view of Risser (US 20190043242 A1) as applied to claim 31 above, and further in view of KIM et al. (US 20180197084 A1, hereinafter “KIM”).
As to claim 32.  Jiang in view of Yamaguchi, further in view of Tian, further in view of KOMATSU and further in view of Risser does not disclose “wherein a registration error is back-propagated through the one or more neural networks during training”.  However, KIM teaches:
wherein a registration error is back-propagated through the one or more neural networks during training (KIM, see at least par. [0034], “In CNN learning, an error backpropagation algorithm may be used to back-propagate the weight error in the direction of minimizing the difference value between the result value and the expected value of such an operation”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Yamaguchi, further in view of Tian, further in view of KOMATSU and further in view of Risser such that a registration error is back-propagated through the one or more neural networks during training, as taught by KIM, thereby providing a convolutional neural network system and, more particularly, a method for reducing the amount of learning parameters required for an FC layer in a CNN model. The present disclosure also provides a method for performing a recognition task by converting a learning parameter into a binary variable (`-1` or `1`) in an FC layer. The present disclosure also provides a method and device for changing a learning parameter of an FC layer to a binary form to reduce the cost of managing learning parameters, as discussed by KIM (see par. [0006]).

27.	Claim 37 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) as applied to claim 33 above, and further in view of KIM et al. (US 20180197084 A1, hereinafter “KIM”).
As to claim 37.  Jiang discloses claim 33, but does not explicitly disclose “wherein the one or more neural networks are trained based at least in part on back-propagation of a registration error”.  However, KIM teaches:  wherein the one or more neural networks are trained based at least in part on back-propagation of a registration error (KIM, see at least par. [0034], “In CNN learning, an error backpropagation algorithm may be used to back-propagate the weight error in the direction of minimizing the difference value between the result value and the expected value of such an operation”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang such that the one or more neural networks are trained based at least in part on back-propagation of a registration error, as taught by KIM, thereby providing a convolutional neural network system and, more particularly, a method for reducing the amount of learning parameters required for an FC layer in a CNN model. The present disclosure also provides a method for performing a recognition task by converting a learning parameter into a binary variable (`-1` or `1`) in an FC layer. The present disclosure also provides a method and device for changing a learning parameter of an FC layer to a binary form to reduce the cost of managing learning parameters, as discussed by KIM (see par. [0006]).

28.	Claims 38 and 43 are rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) as applied to claim 33 above, and further in view of Lombardi et al. (US 20190213772 A1, hereinafter “Lombardi”).
As to claim 38. Jiang discloses claim 33, but does not explicitly disclose “wherein the one or more neural networks are trained to comprise a latent encoding of a geometry of the object”.  However, Lombardi teaches wherein the one or more neural networks are trained to comprise a latent encoding of a geometry of the object (Lombardi, see at least par. [0040], “The encoding module 108 may be configured to receive and jointly encode the texture information (e.g., the view-independent texture map) and the geometry information to provide a latent vector z. In certain embodiments, the building encoding module 108 may be configured to learn to compress the joint variation of texture and geometry into a latent encoding”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang such that the one or more neural networks are trained to comprise a latent encoding of a geometry of the object, as taught by Lombardi, thereby providing a convolutional neural network system and, more particularly, a method that provides a latent vector. The autoencoder may further be configured to infer, using the latent vector, an inferred geometry of the subject for a predicted viewpoint, and an inferred view-dependent texture of the subject for the predicted viewpoint. The rendering module may be configured to render a reconstructed image of the subject for the predicted viewpoint using the inferred geometry and the inferred view-dependent texture, as discussed by Lombardi (see par. [0004]).
As to claim 43. Claim 43 is rejected under the same rationale as claim 38.

29.	Claim 44 is rejected under 35 U.S.C. 103 as being unpatentable over Jiang et al. (US 20210103776 A1, hereinafter “Jiang”) in view of Lombardi et al. (US 20190213772 A1, hereinafter “Lombardi”) as applied to claim 43 above, and further in view of GHAFFARZADEGAN et al. (US 20210042583 A1, hereinafter “GHAFFARZADEGAN”).
As to claim 44.  Jiang in view of Lombardi does not disclose “wherein the one or more neural networks are trained to perform a computer vision task based at least in part on the latent encoding”.  However, GHAFFARZADEGAN teaches wherein the one or more neural networks are trained to perform a computer vision task based at least in part on the latent encoding (GHAFFARZADEGAN, see at least pars. [0045], [0050] “This process may continue and at each step the decoder 307 output and the residual transform the image according to the learned latent encoding…  ResNets have achieved the state-of-the-art performance in various computer vision benchmarks”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Jiang in view of Lombardi such that the one or more neural networks are trained to perform a computer vision task based at least in part on the latent encoding, as taught by GHAFFARZADEGAN, thereby providing a convolutional neural network system and, more particularly, a method that generates a sequential reconstruction of the input data utilizing the decoder and at least the first latent variable, obtains a residual between the input data and the reconstruction utilizing a comparison of at least the first latent variable, and outputs a final reconstruction of the input data utilizing a plurality of residuals from a plurality of sequences, as discussed by GHAFFARZADEGAN (see par. [0005]).


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KIM THANH THI TRAN whose telephone number is (571)270-1408.  The examiner can normally be reached on Monday-Friday 8:00am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JENNIFER MEHMOOD, can be reached at (571)272-2976.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/KIM THANH T TRAN/Examiner, Art Unit 2612
/JENNIFER MEHMOOD/Supervisory Patent Examiner, Art Unit 2612