Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 6 and 13 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Regarding claims 10 and 22, “the distance cost factor” lacks antecedent basis.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 4, 5, 13, 16, 17, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Park (“Sequence-to-Sequence Prediction of Vehicle Trajectory via LSTM Encoder-Decoder Architecture”, 2018) in view of Fan (US 2017/0329892 A1).


[Figures omitted: FIG. 1 of Park; FIG. 3 of the instant application]



Regarding claim 1, Park teaches operating an autoencoder comprising a structured sequence of n autoencoder units, each of which comprising an encoder unit and a decoder unit, wherein each of the encoder units and each of the decoder units is implemented each as a recurrent neural network unit; ([p. 1672] See FIG. 1. Park teaches a bidirectional stacked LSTM that is a structured sequence of n autoencoder units.  LSTM is interpreted as a type of recurrent neural network acting as decoder or encoder units.).
training the autoencoder by feeding all n vectors of the matrix of […] data to input layers of the n encoder units until the fed n vectors of […] data and respective decoder outputs only differ by a predefined threshold value, ([p. 1676] "The parameters of the LSTM encoder-decoder (including the embedding matrices) are trained in an end to end fashion. We generate the training data set by cropping all available (M + ∆) length trajectory samples from the trajectory record...For the minimization of the loss function, we adopt the stochastic gradient decent method with a momentum, called ADAM optimizer [19] with mini batch size B. The training is stopped if the validation error (obtained from 15% of the training data) does not decrease anymore" Training an n-length LSTM sequence end-to-end is interpreted as synonymous with feeding all n-vectors to input layers of the n encoder units.  Park explicitly teaches training the model until the error does not decrease anymore.  It is implicit and would be obvious to one of ordinary skill in the art to have a predefined threshold value.).
wherein the training of the auto-encoder units is performed stepwise by ([p. 1676] "The loss function to be minimized is given by the negative log likelihood function. [See Eqn. 9] where J is the total number of the training samples…The training is stopped if the validation error (obtained from 15% of the training data) does not decrease anymore." Park explicitly teaches training stepwise as a function of training samples, classes, and prediction length for an indefinite number of epochs until the training loss is sufficiently minimized.).
using as input for an ith selected encoder unit the respective ith vector of the […] interaction data, and ([p. 1672] "Fig. 1. The proposed trajectory prediction system. (x_t^(n), y_t^(n), x˙_t^(n), y˙_t^(n)) denotes the nth surrounding vehicle’s relative position and velocity at time t and (v_t, ψ˙_t) denotes the ego vehicle’s speed and yaw rate at time t.").
output values of the previous encoder unit to the selected encoder unit in the structured sequence of n autoencoder units, (See FIG. 1/ FIG. 2 [p. 1673] "ct: cell memory state vector, ht: state output vector.").
using as input for an ith selected decoder unit output values of the ith encoder unit, and (See FIG. 1/FIG. 2 [p. 1673 §II.B] "The decoder recursively generates the output sequence s_1, ..., s_T' of the length T'. In every update, the decoder feeds the output s_(t-1) obtained in the previous update to the input for the current update.").
output values of the previous decoder unit to the selected decoder unit in the structured sequence of n autoencoder units, (See FIG. 1/FIG. 2 [p. 1673 §II.B] "The decoder recursively generates the output sequence s_1, ..., s_T' of the length T'. In every update, the decoder feeds the output s_(t-1) obtained in the previous update to the input for the current update.").
and performing backpropagation within each of the plurality of autoencoder units after all autoencoder units have processed their respective input values, and ([p. 1674] "As mentioned, the LSTM decoder aims to produce the probability distribution of st given the decoder cell state c't−1 and the (t − 1)th output sample st−1. One way to determine st is the greedy search strategy that simply picks the value for st that maximizes the probability p(st|c't−1, st−1) and feed it back to the decoder to generate the next output sample" Feeding back error interpreted as synonymous with backpropagation.). 
However, Park does not explicitly teach A computer-implemented method for inferring a 3D structure of a genome, the computer-implemented method comprising: 
providing a n*n matrix of genome interaction data; 
using the output values of the encoder units for deriving a 3D model for a visualization of the genome.  
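
For orientation, the data flow recited in the encoder and decoder limitations of claim 1 can be sketched as follows. This is a minimal illustration only: the names `unit` and `run_chain`, the fixed weights, and the reduction of each input vector to a scalar sum are assumptions made for brevity and are not taken from the application, Park, or Fan.

```python
import math

def unit(x, h_prev, w_x=0.5, w_h=0.5):
    """Toy recurrent unit (hypothetical): new state from input and previous state."""
    return math.tanh(w_x * x + w_h * h_prev)

def run_chain(rows):
    """Run n encoder units, then n decoder units, over n input vectors.

    Encoder i receives the i-th vector (reduced to a scalar sum here, an
    assumption for brevity) and the output of encoder i-1.
    Decoder i receives the output of encoder i and the output of decoder i-1.
    """
    enc_out, h = [], 0.0
    for row in rows:
        h = unit(sum(row), h)
        enc_out.append(h)
    dec_out, d = [], 0.0
    for e in enc_out:
        d = unit(e, d)
        dec_out.append(d)
    return enc_out, dec_out

# A 3*3 toy matrix standing in for the n*n input data.
rows = [[0.0, 1.0, 0.2], [1.0, 0.0, 0.5], [0.2, 0.5, 0.0]]
enc_out, dec_out = run_chain(rows)
```

Each list holds one value per unit, so `enc_out[i]` is the quantity that the claim passes both to encoder i+1 and to decoder i.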

Fan, who teaches the related art of applied neural networks, discloses using a neural network for inferring 3D information with respect to amino acids.  Fan teaches A computer-implemented method for inferring a 3D structure of a genome, the computer-implemented method comprising: ([¶0230] "Consistent with the present disclosure, the following description is about an embodiment in which the disclosed methods are applied to predict amino acid side chain using a deep neural network").
providing a n*n matrix of genome interaction data; ([¶0135] "Specifically, in the spectral clustering method, the RMSDs are expressed as a similarity matrix, which is defined as a symmetric matrix A" n*n matrix interpreted as synonymous with symmetric matrix.).
using the output values of the encoder units for deriving a 3D model for a visualization of the genome. ([¶0238] "Protein conformation can be encoded by both backbone information and side chain conformation. Since the backbone conformation encoded in 3D image format was going to be used in the disclosed CNN model, the amino acid side chain rotamer library is constructed in a backbone-independent fashion" [¶0235] "By further modeling amino acids side chains with 3-Dimensional (3D) images, a deep neural network is used to predict the likelihood for targeting amino acids adopting each pose. The most likely pose ranked by the disclosed convolutional neural network (CNN) architecture was the output for the prediction" Fan explicitly teaches output of encoder being used to represent 3D structure.). 

Park and Fan are both directed towards applied neural network systems.  It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to apply the technology taught in Park with regard to 3D vehicle interaction and collision predictions to predict 3D amino acid interactions and collisions. Fan teaches that a neural network can be used to predict 3D structures of amino acids and their interactions.  Fan teaches as a motivation for combination ([¶0231] “As described in detail below, using a deep neural network architecture, side chain conformation prediction accuracy can be improved by more than 25%, especially for aromatic residues compared with current standard methods”).

Regarding claim 4, the combination of Park and Fan teaches The computer-implemented method according to claim 1, wherein the output values of the encoder units are used as coordinates in 3-dimensional space. (Fan [¶0238] "Protein conformation can be encoded by both backbone information and side chain conformation. Since the backbone conformation encoded in 3D image format was going to be used in the disclosed CNN model, the amino acid side chain rotamer library is constructed in a backbone-independent fashion" [¶0235] "By further modeling amino acids side chains with 3-Dimensional (3D) images, a deep neural network is used to predict the likelihood for targeting amino acids adopting each pose. The most likely pose ranked by the disclosed convolutional neural network (CNN) architecture was the output for the prediction" Fan explicitly teaches output of encoder being used to represent 3D structure.). 

Regarding claim 5, the combination of Park and Fan teaches The computer-implemented method according to claim 1, wherein each of the recurrent neural networks units is an LSTM neural network unit. (Park [Abstract] "We employ the encoder-decoder architecture which analyzes the pattern underlying in the past trajectory using the long short-term memory (LSTM) based encoder and generates the future trajectory sequence using the LSTM based decoder."). 

Claims 13, 16, and 17 disclose a system with substantially the same scope as claims 1, 4, and 5, therefore the rejections applied to claims 1, 4, and 5 also apply to claims 13, 16, and 17.

Claim 25 discloses a computer program product with substantially the same scope as claim 1, therefore the rejection applied to claim 1 also applies to claim 25.  Furthermore, Fan discloses ([¶0086] “The processes disclosed herein may be implemented by a suitable combination of hardware, software, and/or firmware” Software interpreted as synonymous with computer program product. [¶0087] “The disclosed embodiments also relate to tangible and non-transitory computer readable media that include program instructions or program code that, when executed by one or more processors, perform one or more computer-implemented operations. For example, the disclosed embodiments may execute high level and/or low level software instructions, such as machine code (e.g., such as that produced by a compiler) and/or high level code that can be executed by a processor using an interpreter.”).

Claims 2, 3, 14, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Park and Fan and in further view of Clarke (US 2021/0163930 A1).

Regarding claim 2, the combination of Park and Fan teaches The computer-implemented method according to claim 1.  
However, the combination of Park and Fan does not explicitly teach wherein the genome interaction data originate from a publicly available source.

Clarke, in the same area of endeavor, discloses retrieving publicly available HiC data for further analysis.  Clarke teaches wherein the genome interaction data originate from a publicly available source. ([¶0022] "Computational methods, for example applied to the vast amounts of publically available biological data, can be used to build models of the interactions that underly these networks").

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the neural network for amino acid analysis disclosed in Fan with the chromatin modification methods in Clarke by using publicly available HiC data. It would have been obvious to retrieve publicly available data for training a neural network, as is common practice in the art, and Clarke discloses as an advantage ([¶0022] “the vast amounts of publically available biological data, can be used to build models of the interactions that underly these networks”).  Among that data, Clarke explicitly mentions HiC data.

Regarding claim 3, the combination of Park and Fan teaches The computer-implemented method according to claim 1.  
However, the combination of Park and Fan does not explicitly teach wherein the matrix of the genome interaction data originate from an HiC experiment.

Clarke, in the same area of endeavor, discloses retrieving publicly available HiC data for further analysis.  Clarke teaches The computer-implemented method according to claim 1, wherein the matrix of the genome interaction data originate from an HiC experiment. ([¶0184] "The landscape of the human epigenome undergoes extensive changes during development, leading to distinct transcription programs in different cell types. Using Hi-C, Liu et al, 2017 (High-resolution Comparative Analysis Reveals a Primitive 3D Genome in Embryonic Stem Cells,..They found that in human ESCs, DNA looping interactions are not enriched at enhancers, suggesting a stochastic nature of DNA looping interactions at ESC enhancers").

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the neural network for amino acid analysis disclosed in Fan with the chromatin modification methods in Clarke by using publicly available HiC data. It would have been obvious to retrieve publicly available data for training a neural network, as is common practice in the art, and Clarke discloses as an advantage ([¶0022] “the vast amounts of publically available biological data, can be used to build models of the interactions that underly these networks”).  Among that data, Clarke explicitly mentions HiC data.

Claim 14 discloses a system with substantially the same scope as claim 2, therefore the rejection applied to claim 2 also applies to claim 14.  

Claim 15 discloses a system with substantially the same scope as claim 3, therefore the rejection applied to claim 3 also applies to claim 15.  

Claims 6 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Park and Fan and in further view of Yan (“HiC-spector: a matrix library for spectral and reproducibility analysis of Hi-C contact maps”, 2017).

Regarding claim 6, the combination of Park and Fan teaches The computer-implemented method according to claim 1.  
However, the combination of Park and Fan does not explicitly teach wherein each vector of the n*n matrix of genome interaction data represents contact information of a bin of the genome.

Yan, in the same area of endeavor, discloses that Hi-C data is usually represented by contact matrices.  Yan states ([Abstract] “Genome-wide proximity ligation based assays like Hi-C have opened a window to the 3D organization of the genome”).  Proximity ligation is interpreted as being highly relevant to sidechain conformation as further disclosed in Fan ([¶0004] “For any given peptide sequence, there may be a significant number of biologically relevant conformations, not to mention possible structural reorganization associated with ligand binding or with protein-protein interactions”).
Yan teaches The computer-implemented method according to claim 1, wherein each vector of the n*n matrix of genome interaction data represents contact information of a bin of the genome. ("Data from Hi-C experiments are usually summarized by so called chromosomal contact maps. By binning the genome into equally sized bins, a contact map is a matrix whose elements store the population-averaged co-location frequencies between pairs of loci."). 
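
The binning that Yan describes can be illustrated with a short sketch. It is a simplified illustration under stated assumptions: the function name `contact_map`, the toy coordinates, and the use of raw counts instead of population-averaged frequencies are hypothetical and are not taken from Yan or from the application.

```python
def contact_map(contacts, genome_length, bin_size):
    """Build a symmetric n*n contact map from pairs of genomic positions.

    Each position is assigned to an equally sized bin; entry [i][j] counts
    the observed contacts between bin i and bin j (raw counts, an assumption).
    """
    n = -(-genome_length // bin_size)  # ceiling division: number of bins
    matrix = [[0] * n for _ in range(n)]
    for a, b in contacts:
        i, j = a // bin_size, b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the map symmetric
    return matrix

# Toy data: four contacts on a 30-unit genome with 10-unit bins.
contacts = [(5, 25), (12, 28), (3, 8), (27, 4)]
cmap = contact_map(contacts, genome_length=30, bin_size=10)
# cmap == [[1, 0, 2], [0, 0, 1], [2, 1, 0]]
```

Row i of the result is the contact profile of bin i, which corresponds to the "vector" of the n*n matrix recited in claim 6.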

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the neural network for amino acid analysis disclosed in Fan with the matrix library disclosed in Yan. Yan teaches the background of using Hi-C data and the taught improvements to the known methods ([p. 2199 §1] “Data from Hi-C experiments are usually summarized by so called chromosomal contact maps. By binning the genome into equally sized bins, a contact map is a matrix whose elements store the population-averaged co-location frequencies between pairs of loci. Therefore, mathematical tools like spectral analysis can be extremely useful in understanding these chromosomal contact maps. Our aim is to provide a set of basic analysis tools for handling Hi-C contact maps. In particular, we introduce a simple but novel metric to quantify the reproducibility of the maps using spectral decomposition.”).  It would therefore be obvious from a chromosomal analysis perspective why the combination of Yan would be advantageous.

Claim 18 discloses a system with substantially the same scope as claim 6, therefore the rejection applied to claim 6 also applies to claim 18.  

Claims 7 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Park and Fan and in further view of Xiong (“Revealing Hi-C subcompartments by imputing high-resolution inter-chromosomal chromatin interactions”, 2018) 

Regarding claim 7, the combination of Park and Fan teaches The computer-implemented method according to claim 1.  
However, the combination of Park and Fan does not explicitly teach wherein a loss function of each of the encoder units and decoder units is cell-type specific.  

Xiong, who teaches a related art of an autoencoder system with encoders and decoders, teaches using an autoencoder for Hi-C genome related prediction.  Xiong teaches wherein a loss function of each of the encoder units and decoder units is cell-type specific. ([p. 13] "The loss function sums over the class-specific entropy loss yi[c] log ˆyi for all classes in each training sample" [p. 14] "Si is the total entropy of region i subcompartment annotations, summed over the entropy of all C subcompartments. The fraction of subcompartment c at region i, pi,c, is computed by counting the number of occurrences of subcompartment c over all N cell types" Xiong explicitly teaches that the loss is a function of the class-specific entropy and further teaches that the entropy is a function of the cell-types.). 

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the neural network systems in Park, Fan, and Xiong by adding a cell-type related variable in the loss function. Both Park and Xiong teach an autoencoder system with encoders and decoders.  The obvious benefit in regards to the combination is that Xiong utilizes a similar autoencoder system to that in Park for genome prediction and explicitly mentions ([Abstract] “We applied SNIPER to eight additional cell lines to identify the variation of Hi-C subcompartments across different cell types”) reinforcing that cell-type differentiation in the autoencoder system is of interest in the art.

Claim 19 discloses a system with substantially the same scope as claim 7, therefore the rejection applied to claim 7 also applies to claim 19.  

Claims 8 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Park and Fan and in further view of Tolstikhin (“Wasserstein Auto-Encoders”, 2018).    

Regarding claim 8, the combination of Park and Fan teaches The computer-implemented method according to claim 1.  
However, the combination of Park and Fan does not explicitly teach wherein a loss function comprises a reconstruction cost factor and a distance cost factor.  

Tolstikhin, who teaches a related art of an autoencoder system, teaches a loss function comprising a reconstruction factor and a distance factor.  Tolstikhin teaches The computer-implemented method according to claim 1, wherein a loss function comprises a reconstruction cost factor and a distance cost factor. ([p. 6 §3] “Variational auto-encoders [1] minimize a variational bound on the KL-divergence DKL(PX; PG) which is composed of the reconstruction cost plus the regularizer” Regularizer interpreted as synonymous with distance cost factor.). 
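
The two-term structure cited from Tolstikhin (a reconstruction cost plus a regularizer) can be sketched as follows. The function name `combined_loss`, the `weight` parameter, and the squared-norm penalty on the latent code are assumptions chosen for illustration; the penalty is a stand-in and is neither the KL term of a VAE nor the regularizer of Tolstikhin's WAE.

```python
def combined_loss(x, x_hat, z, weight=1.0):
    """Reconstruction cost plus a weighted regularizer (illustrative only)."""
    # Reconstruction term: squared error between input and decoder output.
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    # Stand-in regularizer (assumption): pulls the latent code toward the origin.
    regularizer = sum(v * v for v in z)
    return reconstruction + weight * regularizer

loss = combined_loss(x=[1.0, 2.0], x_hat=[1.0, 1.0], z=[0.5, 0.5], weight=2.0)
# reconstruction = 1.0, regularizer = 0.5, so loss = 1.0 + 2.0 * 0.5 = 2.0
```

The `weight` parameter controls the trade-off between the two terms; setting it to zero reduces the loss to the reconstruction cost alone.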

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the neural networks of Park and Fan with that disclosed in Tolstikhin by implementing a reconstruction and distance loss in the loss function. A reconstruction cost and a regularizer are common in a combined loss function, and Tolstikhin reinforces this and further provides motivation ([Abstract] “Our experiments show that WAE shares many of the properties of VAEs (stable training, encoder-decoder architecture, nice latent manifold structure) while generating samples of better quality, as measured by the FID score.”).

Claim 20 discloses a system with substantially the same scope as claim 8, therefore the rejection applied to claim 8 also applies to claim 20.  

Claims 9 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Fan, Park, and Tolstikhin and in further view of Schroers (US 2020/0226797 A1). 

Regarding claim 9, the combination of Park, Fan, and Tolstikhin teaches The computer-implemented method according to claim 8.  
However, the combination of Park, Fan, and Tolstikhin does not explicitly teach wherein the reconstruction cost factor is determined according to a mean-square error loss calculation.

Schroers in the same field of endeavor discloses that the reconstruction loss may be defined by a mean square error.  Schroers teaches wherein the reconstruction cost factor is determined according to a mean-square error loss calculation. ([¶0062] "Reconstruction loss may be defined by a mean square error (MSE), an L1 error, or a multiscale structural similarity (MS-SSIM)"). 
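
Two of the reconstruction losses named in the Schroers passage, mean square error and L1 error, can be written out as below. The function names `mse` and `l1` and the sample values are assumptions for illustration; MS-SSIM is omitted because it does not reduce to a short formula.

```python
def mse(x, x_hat):
    """Mean square error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def l1(x, x_hat):
    """Mean absolute (L1) error between an input and its reconstruction."""
    return sum(abs(a - b) for a, b in zip(x, x_hat)) / len(x)

x = [1.0, 2.0, 3.0, 4.0]
x_hat = [1.0, 2.0, 3.0, 2.0]
mse_value = mse(x, x_hat)  # (0 + 0 + 0 + 4) / 4 = 1.0
l1_value = l1(x, x_hat)    # (0 + 0 + 0 + 2) / 4 = 0.5
```

Both functions take the same arguments, so either can serve as the reconstruction term of a combined loss; MSE penalizes the single large deviation more heavily than L1 does.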
	
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the reconstruction cost of Schroers with the loss function disclosed in Park, Fan, and Tolstikhin. The combination would have been obvious because a person of ordinary skill in the art would be able to determine that a mean squared error is a metric commonly used in the art as a loss function.  Since the reconstruction error is an element of a combined loss function it would be obvious to use a mean squared error for the reconstruction cost.  Schroers further demonstrates that the reconstruction cost may utilize a number of well-known loss functions ([¶0062] “It should be appreciated that other rate distortion losses may use different variables and functions (e.g., different reconstruction losses and probability models).”).

Claim 21 discloses a system with substantially the same scope as claim 9, therefore the rejection applied to claim 9 also applies to claim 21.  

Claims 10 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Park and Fan and in further view of Santana (“INFORMATION THEORETIC-LEARNING AUTO-ENCODER”, 2016).

Regarding claim 10, the combination of Park and Fan teaches The computer-implemented method according to claim 1.  
However, the combination of Park and Fan does not explicitly teach wherein the distance cost factor acts as a regularizer on the lower-bound and upper-bound of the Euclidean distance between two consecutive bins of a genome.  

Santana, in the same field of endeavor, discloses using a Euclidean divergence in an autoencoder.  Santana teaches wherein the distance cost factor acts as a regularizer on the lower-bound and upper-bound of the Euclidean distance between two consecutive bins of a genome. ([p. 3298 §3] "Let us define autoencoders as a 4-tuple AE = {E, D, L, R}.  Where E and D are the encoder and the decoder functions, here parameterized as neural networks. L is the reconstruction cost function that measures the difference between original data samples x and their respective reconstructions x = D(E(x)). A typical reconstruction cost is mean-squared error. R is a functional regularization. Here this functional regularization will only be applied to the encoder E. N...Variational Autoencoders (VAE) [4] adapt a lower bound of the variational regularization, R, using parametric, closed form solutions for the KL-divergence" [p. 3298 §3] "The functional regularization costs investigated in this paper are the ITL Euclidean and Cauchy-Schwarz divergences. Both types of divergence encourage a smooth manifold similar to the imposed prior" Santana shows that Euclidean divergence is not only known, but that it shows benefit in certain use cases.).
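
For reference, one possible reading of the claim 10 limitation (a regularizer on the lower and upper bound of the Euclidean distance between consecutive bins) is sketched below. The hinge form, the function name `distance_penalty`, and the bounds are hypothetical; the sketch is not taken from the application or from Santana.

```python
import math

def distance_penalty(coords, lower, upper):
    """Penalize consecutive 3D points whose distance leaves [lower, upper].

    Hypothetical hinge form: zero inside the interval, and growing linearly
    with the amount by which the distance falls below `lower` or exceeds `upper`.
    """
    total = 0.0
    for p, q in zip(coords, coords[1:]):
        d = math.dist(p, q)  # Euclidean distance between consecutive bins
        total += max(0.0, lower - d) + max(0.0, d - upper)
    return total

# Toy 3D coordinates for four consecutive bins.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0), (4.0, 0.5, 0.0)]
penalty = distance_penalty(coords, lower=1.0, upper=2.0)
# distances are 1.0, 3.0, 0.5, so penalty = 0.0 + 1.0 + 0.5 = 1.5
```

Under this reading the term is zero whenever every pair of consecutive bins lies within the allowed distance range, so it only affects training when the bounds are violated.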

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the autoencoder in Park with that of Santana by implementing Euclidean divergence. The combination would have been obvious because a person of ordinary skill in the art would be able to determine from Santana that Euclidean divergence can reasonably replace KL divergence, which is well known in the art for loss functions, and can even outperform it in certain use cases ([p. 3299 §5] "the Euclidean distance worked better with smaller kernels").

Claim 22 discloses a system with substantially the same scope as claim 10, therefore the rejection applied to claim 10 also applies to claim 22.

Claims 11, 12, 23, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Park and Fan and in further view of Qiao (“Learning Bidirectional LSTM Networks for Synthesizing 3D Mesh Animation Sequences”, 2018). 

Regarding claim 11, the combination of Park and Fan teaches The computer-implemented method according to claim 1.
However, the combination of Park and Fan does not explicitly teach also comprising providing a time series of genome interaction data to the autoencoder, and 
using resulting time-dependent output values of the encoder units for deriving a time-dependent 3D model for a visualization of the genome.  

Qiao teaches The computer-implemented method according to claim 1, also comprising providing a time series of genome interaction data to the autoencoder, and ([p. 2] "Mesh animation sequences are typically represented as a set of meshes with the same vertex connectivity and different vertex positions. Such meshes can be obtained by consistent remeshing or mesh deformation...Assume the mesh sequence dataset M contains n shapes and each mesh is denoted is as m_t (t = 1, 2, ...n). We denote p_(t,i) ∈ R^3 as the ith vertex of the tth model" [p. 3] "Take the network St at time step t as an example, the input to Conv is the deformation representation Xt. The interface between cell and Conv is a fully connected layer, which outputs a low-dimensional vector z into cell. tCnv, a stack of transpose convolution layers, mirrors Conv and shares weights with it. The output of tCnv is the feature change dXt. dXt +Xt gives the predicted feature for time step t+1, which is fed into St+1 iteratively" See also FIG. 1. While Qiao doesn't explicitly mention using genome interaction data, the genome interaction data is seen as synonymous such that genome interaction data could be represented by standard 3D mesh animation transforms.).
using resulting time-dependent output values of the encoder units for deriving a time-dependent 3D model for a visualization of the genome. ([p. 4-5] "With the help of LSTM, our model can record history information and iterate to generate realistic mesh sequences in any length" In 3D modeling mesh is synonymous with model, such that a mesh sequence is interpreted as synonymous with a 3D model animation.  3D model animation is time-dependent data.  Generating mesh sequences is interpreted as synonymous with generating time-dependent data for deriving 3D model visualization.).

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the neural network systems of Park and Fan with that of Qiao by feeding 3D animation data into the autoencoder.  Both Qiao and Park use sequence to sequence with a 1:1 series of LSTM encoders and decoders.  Qiao shows that the setup can be easily used for predicting time specific 3D data.  Qiao also gives as motivation for combination ([Abstract] “Benefiting from all these technical advances, our approach outperforms existing methods in sequence prediction and completion both qualitatively and quantitatively. Moreover, this network can also generate follow-up frames conditioned on initial shapes and improve the accuracy as more bootstrap models are provided”).  

Regarding claim 12, the combination of Fan, Park, and Qiao teaches The computer-implemented method according to claim 11, wherein the providing the time series of genome interaction data to the autoencoder also comprises, initializing ([p. 3 Col. 2] "the LSTM cell update its state to sj from an initial state s0" [p. 6 Col. 1] "we first find the optimal LSTM initial state ^s0 = arg min||X^n-Xn||"), during training, weight factors of the encoder units and decoder units with respective weight factors of a previous time point of the time series of genome interaction data. (Qiao [p. 2] "We design a share-weight bidirectional LSTM architecture that is able to boost performance and generate two sequences in opposite directions. Bidirectional generation also stabilizes training process and helps to complete a sequence in a more natural way" [p. 3] "The interface between cell and Conv is a fully connected layer, which outputs a low-dimensional vector z into cell. tCnv, a stack of transpose convolution layers, mirrors Conv and shares weights with it. The output of tCnv is the feature change dXt. dXt +Xt gives the predicted feature for time step t+1, which is fed into St+1 iteratively. (b) is our bidirectional LSTM. Both chains have the same architecture as in (a), and the only difference is their opposite direction. The forward chain takes the first model as input and the backward chain takes the last. They share weights and their predictions are constrained to match with each other." [p. 4] "LKL terms impose stronger constraints during training and consequently helps predict more accurate sequences." Qiao explicitly teaches initializing and updating encoder weights during training respective to time steps.). 

Claims 23 and 24 disclose a system with substantially the same scope as claims 11 and 12, therefore the rejections applied to claims 11 and 12 also apply to claims 23 and 24.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720.  The examiner can normally be reached on M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on (571)270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SB/Examiner, Art Unit 2124

/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124