DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Applicant’s amendment/response filed 12/14/2021 has been entered and made of record. Claims 1-8 and 10 were amended. Claims 1-10 are pending in the application.
Claim Objections
Claim 5 is objected to because of the following informalities: Claim 5 recites “a loss Lproject Lprojection” instead of “a loss function Lprojection.” Appropriate correction is required.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1 and 6-9 are rejected under 35 U.S.C. 103 as being unpatentable over Yang et al. (3D Object Reconstruction from a Single Depth View with Adversarial Learning, 2017, IEEE International Conference on Computer Vision Workshops, pp. 679-688) in view of Abbaszadeh et al. (US 2020/0067969).
Regarding claim 1, Yang teaches/suggests: A 3D reconstruction method based on deep learning, comprising the following steps: 
(1) using a potential vector constrained in an input image to reconstruct a complete 3D shape of a target (Yang §1 ¶7: “Particularly, our model first encodes the 2.5D view to a low dimensional latent space vector which implicitly represents general 3D geometric structures, then decodes it back to recover the most likely complete 3D structure.”), and learning a mapping between a part and the complete 3D shape, then realizing a 3D reconstruction of a single depth image (Yang §1 ¶4: “By utilizing the high performance of 3D convolutional neural nets and large open datasets of 3D models, our approach learns a smooth function to map a 2.5D view to a complete 3D shape. Particularly, we train an end-to-end model which estimates full volumetric occupancy from only one 2.5D depth view of an object, thus predicting occluded structures from a partial scan.”), 
(2) learning an intermediate feature representation between a 3D real object and a reconstructed object to obtain a plurality of target potential variables in step (1) (Yang §3.2 ¶3: “Instead, our discriminator is designed to output a long latent vector which represents distributions of real and fake reconstructions. Therefore, our discriminator is to distinguish the distributions of latent representations of fake and real reconstructions, while the generator is trained to make the two distributions as similar as possible;” §5 ¶1: “This 
(3) transforming a voxel floating value predicted in step (1) into a binary value to complete a high-precision reconstruction (Yang §3.3 ¶5: “…x represents a voxel value, e.g. {0,1}, of an input 2.5D view, while y’ is the estimated value in (0,1) for the corresponding voxel from generator, and y is the target value in {0,1} for the same voxel.” [The estimated value meets the claimed voxel floating value; the target value meets the claimed binary value.]).
Yang does not teach/suggest by using a limit learning machine. Abbaszadeh, however, teaches/suggests by using a limit learning machine (Abbaszadeh [0034]: “For example, some embodiments utilize Extreme Learning Machine (“ELM”) as a binary classification decision boundary. ELM is a special type of flashforward neural network recently developed for fast training.”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the final voxel values of Yang to be determined by an ELM classifier as taught/suggested by Abbaszadeh in order to achieve fast training.
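As an illustrative sketch only, the kind of ELM binary classifier referred to above can be written in a few lines: the hidden-layer weights are drawn at random and left fixed, and only the output weights are solved in closed form, which is why training is fast. The function names, the sigmoid activation, the hidden-layer size, the small ridge term `lam`, and the 0.5 decision threshold are all assumptions chosen for the example and are not taken from Yang, Abbaszadeh, or the claims.

```python
import numpy as np

def elm_fit(X, y, n_hidden=16, lam=1e-6, seed=0):
    """Fit a single-hidden-layer ELM (hypothetical helper).

    Hidden weights are random and never trained; only the output
    weights beta are solved for, using ridge-regularised least squares.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # fixed random input weights
    b = rng.normal(size=n_hidden)                 # fixed random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # sigmoid hidden activations
    A = H.T @ H + lam * np.eye(n_hidden)          # small ridge term for stability (assumption)
    beta = np.linalg.solve(A, H.T @ y)            # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta, threshold=0.5):
    """Return a 0/1 label per row of X (threshold is an assumed value)."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return (H @ beta >= threshold).astype(np.uint8)
```

In this sketch each row of `X` would hold the feature(s) of one voxel and `y` the corresponding ground-truth occupancy, so that `elm_predict` maps floating voxel predictions to binary values.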

Regarding claim 6, Yang as modified by Abbaszadeh teaches/suggests: The 3D reconstruction method based on deep learning according to claim 1, wherein in step (2), using a 3D depth convolution AE with jump connection to connect a feature layer of an encoder to a decoder accordingly (Yang §3.2 ¶2: “The generator is based on autoencoder with skip-connections between encoder and 

Regarding claim 7, Yang as modified by Abbaszadeh teaches/suggests: The 3D reconstruction method based on deep learning according to claim 6, wherein in step (2), a network structure comprises the encoder and the decoder (Yang Fig. 5: the illustrated encoder and decoder): 
the encoder has four 3D convolution layers, each convolution layer of four 3D convolution layers has a bank of 4x4x4 filters of 1x1x1 strides, followed by a leaky ReLU activation function and a maximum pooling layer (Yang §3.2 ¶2: “Particularly, the encoder has five 3D convolutional layers, each of which has a bank of 4x4x4 filters with strides of 1x1x1, followed by a leaky ReLU activation function and a max pooling layer.”); then there are two fully connected layers, a second fully connected layer of the two fully connected layers is the potential vector learned (Yang §3.2 ¶2: “The encoder is lastly followed by two fully-connected layers to embed semantic information into latent space.” [The claimed second fully connected layer is an inherent and/or implicit feature of the autoencoder and GAN. In addition, such feature would have been well known for 3D reconstruction.]); 
the decoder consists of four symmetric anti convolution layers; each layer of the four symmetric anti convolution layers concatenates a plurality of feature layers of the encoder accordingly, followed by a plurality of ReLU activations except for a last layer of the plurality of feature layers with sigmoid function: 64³(1)→32³(64)→16³(128)→8³(256)→4³(512)→32768→5000→32768→4³(512)→8³(256)→16³(128)→32³(64)→64³(1) (Yang Fig. 5: the illustrated calculation process).
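The layer sizes recited above can be checked with simple arithmetic. The sketch below assumes a 64-voxel input side, that each of the four pooling steps halves the grid side, and that the channel counts are 64, 128, 256, and 512; the helper name `encoder_shapes` is hypothetical.

```python
def encoder_shapes(side=64, channels=(64, 128, 256, 512)):
    """List (grid side, channel count) after each encoder stage.

    Assumes each stage ends in a pooling step that halves the grid side.
    """
    shapes = [(side, 1)]          # input volume: side^3 voxels, 1 channel
    for c in channels:
        side //= 2                # pooling halves each spatial dimension
        shapes.append((side, c))
    return shapes

shapes = encoder_shapes()
last_side, last_channels = shapes[-1]
flattened = last_side ** 3 * last_channels   # size of the vector fed to the first fully connected layer
```

Under these assumptions the final encoder stage is 4³ with 512 channels, and 4 × 4 × 4 × 512 = 32768, which matches the flattened size recited before the 5000-element layer.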

Regarding claim 8, Yang as modified by Abbaszadeh teaches/suggests: The 3D reconstruction method based on deep learning according to claim 7, wherein in step (2), by making the predicted 3D shape as close as possible to a real 3D shape to optimize a plurality of network parameters, objective function Lt of step (2) is the formula (5) (Yang equation 2; §3.3 ¶6: “Minimizing Lae tends to learn the overall 3D shapes.”) … wherein yt is a ground truth value for each voxel, y’t is a predicted value for the each voxel (Yang §3.3 ¶3: “…where y is the target value in {0,1} and y’ is the estimated value in (0,1) for each voxel from the autoencoder.”), a cross entropy is used to measure a quality of reconstruction (Yang §4.1 ¶2: “The second metric is the mean value of standard cross-entropy loss (CE) between a reconstructed shape and the ground true 3D model.”), for a value of most voxel grids of each object are zero, weight α is applied to a plurality of false positive and false negative samples to balance the value of most voxel grids, in the experiment α is set to 0.85 (Yang §3.3 ¶2: “However, most of the 

Regarding claim 9, Yang as modified by Abbaszadeh teaches/suggests: The 3D reconstruction method based on deep learning according to claim 1, wherein in step (3), a nonlinear binary reconstruction is applied to a voxel set output by a generator with an ELM classifier (Abbaszadeh [0066]: “For example, binary linear and non-linear supervised classifiers are examples of methods that could be used to obtain a decision boundary;” [0034]: “For example, some embodiments utilize Extreme Learning Machine (“ELM”) as a binary classification decision boundary. ELM is a special type of flashforward neural network recently developed for fast training.”). The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Claims 2-3 and 5 are rejected under 35 U.S.C. 103 as being unpatentable over Yang et al. (3D Object Reconstruction from a Single Depth View with Adversarial Learning, 2017, IEEE International Conference on Computer Vision Workshops, pp. 679-688) in view of Abbaszadeh et al. (US 2020/0067969) as applied to claim 1 above, and further in view of Wu et al. (MarrNet: 3D Shape Reconstruction via 2.5D Sketches, 2017, 31st Conference on Neural Information Processing Systems, pp. 1-11).
Regarding claim 2, Yang as modified by Abbaszadeh teaches/suggests: The 3D reconstruction method based on deep learning according to claim 1, wherein step (1) comprises the following steps: 
(1.1) reconstruction of a 3D GAN and realization of discriminant constraints (Yang §3.3 ¶1: “The objective function of our 3D-RecGAN includes two main parts: an object reconstruction loss Lae for autoencoder based generator; the objective function Lgan for conditional GAN;” §3.3 ¶6: “To minimize Ldgan is to improve the performance of discriminator to distinguish fake and real reconstruction pairs.”), 
(1.2) realization of a plurality of consistency constraints of a plurality of potential features (Yang §3.3 ¶6: “Minimizing Lae tends to learn the overall 3D shapes, whilst minimizing Lggan estimates more plausible 3D structures conditioned on input 2.5D views;” §5 ¶1: “This confirms that our network has the capability of learning general 3D latent features of the objects.”), 
Yang as modified by Abbaszadeh does not teach/suggest:
(1.3) realization of a consistency constraint of a depth projection.
Wu, however, teaches/suggests:
(1.3) realization of a consistency constraint of a depth projection (Wu §3.3 ¶1: “Here, we explore novel ways to include a reprojection consistency loss between the predicted 3D shape and the estimated 2.5D sketch, consisting of a depth reprojection loss and a surface normal reprojection loss.”).
Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the GAN of Yang to include a 

Regarding claim 3, Yang as modified by Abbaszadeh and Wu teaches/suggests: The 3D reconstruction method based on deep learning according to the claim 2, wherein step (1.1) uses an improved Wasserstein GAN to train (Yang §3.1 ¶3: “The recent WGAN [1] leverages Wasserstein distance with weight clipping as a loss function to stabilize the training procedure.”), for a generator, a 3D generator loss Lg is defined as formula (1) (Yang: equation 5) … wherein x, yt, yp, respectively represent a 3D voxel value converted for a depth image, a ground truth value and a 3D object value generated by a network (Yang §3.3 ¶5: “…x represents a voxel value, e.g. {0,1}, of an input 2.5D view, while y’ is the estimated value in (0,1) for the corresponding voxel from generator, and y is the target value in {0,1} for the same voxel.”), in an experiment β is set to 0.85, η is set to 5 (Yang §3.4 ¶1: “α ends up as 0.85 for our modified cross entropy loss function, while β is 0.05 for the joint loss function Lg.” [η=5 is a multiple of β=0.05, which would have been obvious to try.]), for a discriminator, the 3D GAN optimizes a plurality of parameters by narrowing a Wasserstein distance between a real pair and a fake pair, the discriminator loss Ld is defined as (2) (Yang: equation 6) … wherein ŷ = εx + (1 - ε)yp, ε ~ U[0,1], λ controls a tradeoff between optimizing a gradient penalty and an original objective (Yang §3.3 ¶5: “…where ŷ = εx + (1 - ε)y’, ε ~ U[0,1]. λ controls the tradeoff between optimizing the gradient penalty and the original objective in WGAN.”).
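For illustration, the interpolation ŷ = εx + (1 - ε)y’ with ε ~ U[0,1] quoted above, and a gradient penalty of the general form used with Wasserstein GANs, might be sketched as follows. The equations themselves are not reproduced in this action, so the penalty form λ·mean((‖∇D(ŷ)‖₂ − 1)²), the value λ = 10, and both helper names are assumptions; the gradient of the discriminator is taken as an input rather than computed.

```python
import numpy as np

def interpolate(x, y_fake, rng):
    """Sample y_hat = eps * x + (1 - eps) * y_fake, one eps ~ U[0, 1] per batch item."""
    eps_shape = (x.shape[0],) + (1,) * (x.ndim - 1)   # broadcast eps over each item
    eps = rng.uniform(0.0, 1.0, size=eps_shape)
    return eps * x + (1.0 - eps) * y_fake

def gradient_penalty(grad, lam=10.0):
    """lam * mean((||grad||_2 - 1)^2), where grad[i] is dD/dy_hat for batch item i.

    lam = 10.0 is an assumed value, not one stated in the action.
    """
    norms = np.sqrt((grad.reshape(grad.shape[0], -1) ** 2).sum(axis=1))
    return lam * np.mean((norms - 1.0) ** 2)
```

The penalty is zero when every per-item gradient has unit norm, and λ sets how strongly deviations from unit norm are weighed against the original objective.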

Regarding claim 5, Yang as modified by Abbaszadeh and Wu teaches/suggests: The 3D reconstruction method based on deep learning according to claim 3, wherein in step (1.3), a projection constraint is applied between a predicted 3D shape and the input depth image, a depth value after projection is consistent with an input depth value to improve a fidelity of an input information, to allow the model fine tune a generated 3D shape, a loss [function] Lprojection is the formula (4) … wherein yp(x,y,z) represents a value of the predicted 3D shape yp at a position (x, y, z), yp(x,y,z) ∈ {0,1}, dxy is the depth value of the input image x at a position (x, y) (Wu equation 1; §3.3 ¶2: “We use vx,y,z to represent the value at position (x, y, z) in a 3D voxel grid, assuming that vx,y,z ∈ [0, 1], ∀x, y, z. We use dx,y to denote the estimated depth at position (x, y).”). The same rationale to combine as set forth in the rejection of claim 2 above is incorporated herein.
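Formula (4) is not reproduced in this action, so the following is only a sketch of a depth reprojection penalty of the general kind described in the quoted passage of Wu, under assumed conventions: `v[x, y, z]` is the predicted occupancy in [0, 1], `d[x, y]` is an integer depth index along z, a value of -1 marks pixels with no observed surface (which this sketch simply ignores), voxels in front of the surface are pushed toward 0, and the voxel at the surface is pushed toward 1. The function name and the squared-error form are assumptions.

```python
import numpy as np

def depth_projection_loss(v, d):
    """Depth consistency penalty between a voxel grid and a depth map (hypothetical form).

    v: (X, Y, Z) array of predicted occupancies in [0, 1].
    d: (X, Y) integer array of depth indices along z; -1 means no surface seen.
    """
    z = np.arange(v.shape[2])[None, None, :]      # z index of every voxel along each ray
    dd = d[:, :, None]
    valid = dd >= 0                               # pixels that actually observe a surface
    in_front = valid & (z < dd)                   # should be empty space
    at_surface = valid & (z == dd)                # should be occupied
    loss = np.where(in_front, v ** 2, 0.0) + np.where(at_surface, (1.0 - v) ** 2, 0.0)
    return float(loss.sum())                      # voxels behind the surface are not penalised
```

A prediction that is empty in front of every observed depth and occupied exactly at it gives a loss of zero; any occupancy in front of the observed surface increases the loss.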

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Yang et al. (3D Object Reconstruction from a Single Depth View with Adversarial Learning, 2017, IEEE International Conference on Computer Vision Workshops, pp. 679-688) in view of Abbaszadeh et al. (US 2020/0067969) and Wu et al. (MarrNet: 3D Shape Reconstruction via 2.5D Sketches, 2017, 31st Conference on Neural Information Processing Systems, pp. 1-11) as applied to claim 3 above, and further in view of Paulina et al. (US 2020/0065682).
Regarding claim 4, Yang as modified by Abbaszadeh and Wu does not teach/suggest: The 3D reconstruction method based on deep learning according to claim 3, wherein in step (1.2), a plurality of potential vectors of the input image are constrained by a potential feature vector information of a learned 3D real object to guide a model to generate a target 3D shape data, to accurately predict a missing part, a latent vector Ll is defined as (3) … wherein Zt is the latent vector decoded by a 3D ground truth object, Zp is decoded by an input depth image, and E(·) denotes an expectation. Paulina, in view of Yang, teaches/suggests wherein in step (1.2), a plurality of potential vectors of the input image are constrained by a potential feature vector information of a learned 3D real object to guide a model to generate a target 3D shape data, to accurately predict a missing part, a latent vector Ll is defined as (3) … wherein Zt is the latent vector decoded by a 3D ground truth object, Zp is decoded by an input depth image, and E(·) denotes an expectation (Yang §3.2 ¶2: “During training, the generator is supervised supplying by ground true 3D shapes … Instead, our discriminator is designed to output a long latent vector which represents distributions of real and fake reconstructions. Therefore, our discriminator is to distinguish the distributions of latent representations of fake and real reconstructions, while the generator is trained to make the two distributions as similar as possible;” §5 ¶1: “This confirms that our network has the capability of learning general 3D latent features of the objects;” Paulina [0156]: “For example, in some implementations, objective function 695 may be or include a loss function that compares (e.g., determines a difference between) output data generated by the model from the training data and labels (e.g., ground-truth labels) associated with the training data.”). Before the effective filing date of the .

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Yang et al. (3D Object Reconstruction from a Single Depth View with Adversarial Learning, 2017, IEEE International Conference on Computer Vision Workshops, pp. 679-688) in view of Abbaszadeh et al. (US 2020/0067969) as applied to claim 9 above, and further in view of Kaufman et al. (US 2007/0103464), Grau-Moya et al. (US 2020/0364555), and Paulina et al. (US 2020/0065682).
Regarding claim 10, Yang as modified by Abbaszadeh teaches/suggests: The 3D reconstruction method based on deep learning according to claim 9, wherein in step (3), wherein step (3) has three layers in the network: an input layer, a hidden layer and an output layer (Abbaszadeh [0111]: “The structure of a one-output ELM network 1900 is depicted in FIG. 19, including an input layer 1910, a hidden layer 1920, and the singe node output lager 1930.”), an input is a feature of each voxel mesh of an object (Yang §3.1 ¶1: “To achieve this task, each object model is represented in a 3D voxel grid.”), a number of hidden layer nodes is determined as 11 by multiple experiments (Abbaszadeh Fig. 19: the illustrated hidden layer nodes [The claimed number of hidden layer nodes would have been obvious to try. In addition, such feature would have been well known for the hidden layer (Official Notice).]), an output is to judge whether a label of the each voxel is 0 or 1 (Yang §3.1 ¶1: “We only use the simple occupancy information for fast training.”). The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.
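To make the recited three-layer structure concrete, the sketch below builds a per-voxel input feature and passes it through a network with 11 hidden nodes and a single 0/1 output. Claim 10 also recites a 7-dimensional neighborhood feature; reading that as the voxel's own value plus its six face-adjacent neighbors is an assumption made for this example, as are the zero padding at the grid border, the sigmoid activation, the 0.5 threshold, and both function names.

```python
import numpy as np

def neighborhood_features(vox):
    """Return an (N, 7) array: each voxel's value followed by its 6 face neighbours.

    Border voxels see zeros outside the grid (assumed padding choice).
    Column order: centre, x-1, x+1, y-1, y+1, z-1, z+1.
    """
    p = np.pad(vox, 1, mode="constant")
    parts = [
        p[1:-1, 1:-1, 1:-1],                      # centre voxel
        p[:-2, 1:-1, 1:-1], p[2:, 1:-1, 1:-1],    # x-1, x+1
        p[1:-1, :-2, 1:-1], p[1:-1, 2:, 1:-1],    # y-1, y+1
        p[1:-1, 1:-1, :-2], p[1:-1, 1:-1, 2:],    # z-1, z+1
    ]
    return np.stack([part.ravel() for part in parts], axis=1)

def three_layer_forward(feats, W, b, beta, threshold=0.5):
    """Input layer (7) -> hidden layer (11, sigmoid) -> single output thresholded to 0/1."""
    H = 1.0 / (1.0 + np.exp(-(feats @ W + b)))   # (N, 11) hidden activations
    return (H @ beta >= threshold).astype(np.uint8)
```

Here `W` would have shape (7, 11), and `b` and `beta` shape (11,); in an ELM only `beta` would be fitted, with `W` and `b` left at their random initial values.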

Yang as modified by Abbaszadeh does not teach/suggest a neighborhood value around the each voxel mesh is extracted as a feature value, and a 7-dimensional feature vector is established. Kaufman, however, teaches/suggests a neighborhood value around the each voxel mesh is extracted as a feature value (Kaufman [0113]: “In this method, each voxel of an acquired image is evaluated with respect to a group of neighbor voxels. The voxel of interest is referred to as the central voxel and has an associated intensity value. A classification indicator for each voxel is established by comparing the value of the central voxel to each of its neighbors.”), and a 7-dimensional feature vector is established (Kaufman [0118]: “Because the data set for an abdominal image generally includes more than 300 slice images, each with a 512×512 voxel array, and each voxel having an associated 25 voxel local vector, it is desirable to perform feature analysis (step 1570) on the local vector series to reduce the computational burden. One such feature analysis is a principal component analysis (PCA), which can be applied to the local vector series to determine the dimension of a feature vector series.” [The claimed 7-dimensional feature vector would have 

Yang as modified by Abbaszadeh and Kaufman does not teach/suggest if an incentive function is infinitely differentiable over any real number interval for an ELM, the network approximates any nonlinear function. Grau-Moya, in view of Abbaszadeh, teaches/suggests if an incentive function is infinitely differentiable over any real number interval for an ELM, the network approximates any nonlinear function (Abbaszadeh [0034]: “For example, some embodiments utilize Extreme Learning Machine (“ELM”) as a binary classification decision boundary. ELM is a special type of flashforward neural network recently developed for fast training;” Grau-Moya [0028]: “In many practical applications of reinforcement learning, the number of possible states or state-action pairs is very large or infinite, in which case it is necessary to approximate the state value function or the action value function based on sequences of states, actions, and rewards experienced by the agent. For such cases, approximate value functions ν̂(s, w) and q̂(s, a, w) are introduced to approximate the value functions V(s) and Q(s, a) respectively, in which w is a vector of parameters defining the approximate functions.”). Before the effective 

Yang as modified by Abbaszadeh, Kaufman, and Grau-Moya does not teach/suggest a classifier loss function Lc is the formula (6) … wherein yfvoxel is a value of the each voxel mesh after the nonlinear binary reconstruction, ytvoxel is the value of the each voxel mesh of the 3D real object. Paulina, in view of Yang and Abbaszadeh, teaches/suggests a classifier loss function Lc is the formula (6) … wherein yfvoxel is a value of the each voxel mesh after the nonlinear binary reconstruction, ytvoxel is the value of the each voxel mesh of the 3D real object (Yang §3.1 ¶1: “We only use the simple occupancy information for map encoding, where 1 represents an occupied cell and 0 remains an empty cell;” Abbaszadeh [0034]: “For example, some embodiments utilize Extreme Learning Machine (“ELM”) as a binary classification decision boundary. ELM is a special type of flashforward neural network recently developed for fast training;” Paulina [0156]: “For example, in some implementations, objective function 695 may be or include a loss function that compares (e.g., determines a difference between) output data generated by the model from the training data and labels (e.g., ground-truth labels) associated with the training data.”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the ELM classifier of Yang as modified by Abbaszadeh, .
Response to Arguments
Applicant's arguments filed 12/14/2021 have been fully considered but they are not persuasive.
Regarding claim 1, Applicant argues “none of Yang and Abbaszadeh, alone or in combination, disclose a potential vector constrained in an input image is used to reconstruct a complete 3D shape of a target ... none of Yang and Abbaszadeh, alone or in combination, disclose learn an intermediate feature representation between a 3D real object and a reconstructed object to obtain a plurality of target potential variables in step (1).” See Remarks, pp. 10-11.

[Image: media_image1.png, Yang Fig. 5]

Examiner respectfully disagrees. Yang discloses a 3D object reconstruction from a single image using adversarial learning. As shown in Fig. 5 (reproduced above), the GAN architecture of Yang is substantially the same as that of the 

Applicant further argues “Yang et al. use a fixed threshold to transform a voxel float value into a binary value … Abbaszadeh et al. use ELM to compute the decision boundaries for each individual monitoring node for abnormality detection ... there is no motivation to combine.” See Remarks, pp. 11-12.

Examiner respectfully disagrees. Abbaszadeh discloses ELM as a binary classification for fast training. As such, Yang in view of Abbaszadeh teaches/suggests step (3) as set forth in the rejection above.

Regarding claim 2, Applicant further argues “Yang does not constrain on a plurality of potential features in the generator.” See Remarks, pg. 14. However, Yang does constrain on a plurality of potential features in the generator via Eq. 2 as set forth in the rejection above.

Applicant further argues “different from Wu et al., the present invention designs a depth projection loss without view supervision and the loss tries to ensure that the predicted voxel corresponding to input occupied voxel of a single depth view is 1 and all voxels in front of it are 0.” See Remarks, pg. 14.

Examiner respectfully disagrees. Claim 2 recites “realization of a consistency constraint of a depth projection,” which does not include “a depth projection loss without view supervision and the loss tries to ensure that the predicted voxel corresponding to input occupied voxel of a single depth view is 1 and all voxels in front of it are 0.”

Regarding claim 3, Applicant further argues “Yang does not disclose the additional latent vector loss Li and depth loss Lprojection as described in the claimed invention.” See Remarks, pg. 15. However, claim 3 does not recite “the additional latent vector loss Li and depth loss Lprojection.”
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US 2018/0129893 – convolution neural network
US 2019/0385292 – single image completion
US 2020/0081912 – completion from a partial sketch
US 2020/0134465 – reconstruction 3D microstructure
US 2020/0184190 – completion from partial fingerprint
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANH-TUAN V NGUYEN whose telephone number is 571-270-7513. The examiner can normally be reached on M-F 9AM-5PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KEE TUNG can be reached on 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.






/ANH-TUAN V NGUYEN/
Primary Examiner, Art Unit 2611