Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
Claims 1-20 are pending in this action.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 08/27/2021 was filed.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

35 USC 112, Sixth Paragraph Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.



Claim limitation has/have been interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because it uses/they use a generic placeholder coupled 

Since the claim limitation(s) invokes 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, claim(s) 1-3 has/have been interpreted to cover the corresponding structure described in the specification that achieves the claimed function, and equivalents thereof.

A review of the specification shows that the following appears to be the corresponding structure described in the specification for the 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph limitation:

Claims 1 and 17 recite the limitation of “a noise generation unit configured”, which has a corresponding structure as seen in Figures 3, 6, and 8 (Noise Addition Unit 102).

Claims 1 and 17 recite the limitation of “a decoding unit configured”, which has a corresponding structure as seen in Figures 3, 6, and 8 (Decoding Unit 103).

Claims 1 and 17 recite the limitation of “a generation unit configured”, which has a corresponding structure as seen in Figures 3, 6, and 8 (Generation Unit 104).



Claim 9 recites the limitation of “an acquisition unit configured”, which has a corresponding structure as seen in Figures 3, 6, and 8 (Acquisition Unit 101).

Claim 10 recites the limitation of “Output Control unit configured”, which has a corresponding structure as seen in Figures 3, 6, and 8 (Output Control Unit 106).

The Examiner believes that the above limitations invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, and have a corresponding structure and algorithm as seen in Figures 3, 6, and 8 of the Applicant’s specification.

If applicant wishes to provide further explanation or dispute the examiner’s interpretation of the corresponding structure, applicant must identify the corresponding structure with reference to the specification by page and line number, and to the drawing, if any, by reference characters in response to this Office action.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.




In analyzing under step 1, is the claim to a process, machine, manufacture, or composition of matter? Yes.

In analyzing under step 2A Prong One, does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes.

Claims 1 and 11 recite abstract limitations such as “receive a first code word and output a second code word, the second code word being the first code word with noise added…decode the second code word and output a third code word; … generate learning data including the second code word for which the third code word matches the first code word, the learning data being for learning a weight for a message passing decoding in which a message to be transmitted is multiplied by the weight; and … receive the learning data and determine the weight for the message passing decoding using the learning data”. These limitations describe a process that, under its broadest reasonable interpretation, covers the performance of mathematical calculations but for the recitation of generic computer components such as “a noise generation unit, a decoding unit, a generation unit and a learning unit” (see claim 1) and “a processor connected to a memory unit” (see claim 11).



The step of outputting the second code word data consists of adding noise data to the first code word data.  Therefore, this step is a mathematical addition.

The step of decoding the second code word data into the third code word data until the third code word data matches the first code word consists of subtracting the noise data.  Therefore, this step is a mathematical subtraction.

The step of generating learning data for learning a weight for a message passing decoding is based on the result of the subtraction above.
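
For illustration only, the sequence quoted from claim 1 (add noise to a first code word, decode the result, and retain the noisy word as learning data only when the decoded word matches the original) can be sketched as follows. The repetition code, the majority-vote decoder, the bit-flip noise model, and every function name here are assumptions chosen for brevity; none of them is taken from the application or from the cited references.

```python
import random


def encode(message_bits, reps=3):
    """Assumed toy code: repeat each message bit `reps` times."""
    return [b for b in message_bits for _ in range(reps)]


def add_noise(codeword, flip_prob, rng):
    """Assumed noise model: flip each bit independently with probability flip_prob."""
    return [bit ^ int(rng.random() < flip_prob) for bit in codeword]


def decode(noisy, reps=3):
    """Majority-vote each group of `reps` bits, then re-encode the decisions."""
    decided = [
        int(2 * sum(noisy[i:i + reps]) > reps)
        for i in range(0, len(noisy), reps)
    ]
    return encode(decided, reps)


def build_learning_data(codewords, flip_prob, seed=0):
    """Keep (second, first) pairs only when the decoded third word equals the first."""
    rng = random.Random(seed)
    data = []
    for first in codewords:
        second = add_noise(first, flip_prob, rng)   # first code word plus noise
        third = decode(second)                      # decoding result
        if third == first:                          # matching condition
            data.append((second, first))
    return data
```

In this sketch, a noisy word whose decoding result differs from the original is simply discarded; whether the claims intend that behavior is the question raised under 35 U.S.C. 112(b) in this action.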

As such, the claims recite an abstract idea and are rejected on that basis.
 
In analyzing under step 2A Prong Two, does the claim recite additional elements that integrate the judicial exception into a practical application?  NO.

This judicial exception is not integrated into a practical application because the claims recite a generic processor such as “a noise generation unit, a decoding unit, a 

The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the generic processor and software modules are recited at a high level of generality and merely perform code generation. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.

In analyzing under step 2B, does the claim recite additional elements that amount to significantly more than the judicial exception? NO

Claims 1-17 do not recite any additional elements except a generic processor such as “a noise generation unit, a decoding unit, a generation unit and a learning unit” (see claim 1) and “a processor connected to a memory unit” (see claim 11) for learning a method of adding data and subtracting data.

Accordingly, the additional generic elements do not amount to significantly more than the judicial exception because the generic processor and software modules are recited at a high level of generality and merely perform code generation.

The claim is directed to an abstract idea.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Any claim not specifically mentioned is rejected due to its dependency on a rejected claim.

Claims 1, 11 and 17 recite a limitation such as “generate learning data including the second code word for which the third code word matches the first code word”.

The conditional limitation “for which the third code word matches the first code word” renders the claims indefinite because it is unclear what occurs when the third code word does not match the first code word after decoding.  It is unclear whether the learning data is also generated in the non-matching condition.  Therefore, it is unclear whether the limitation “…receive the learning data and determine the weight for the message passing decoding using the learning data” can be performed in the non-matching condition so that the learning data can be used for future decoding.

Claims 1, 11 and 17 recite a limitation such as “the learning data being for learning a weight for a message passing decoding in which a message to be transmitted is multiplied by the weight”.

The limitation “being…a weight for a message passing decoding in which a message to be transmitted is multiplied by the weight” renders the claims indefinite because the claims do not recite how “learning a weight for a message” is performed when decoding the second code word into the third code word.  The claims do not mention the message or the weight of the message in connection with decoding the second code word into the third code word.  Therefore, it is unclear how the claimed “learning a weight for a message” is carried out.

Claims 2, 3, 5, 12, 13, and 15 recite a limitation such as “…the first code word and the third code word coincide with each other”.

The limitation “…the first code word and the third code word coincide with each other” renders the claims indefinite because the claims do not define the term “coincide”.  It is unclear whether “coincide” has the same meaning as the limitation “which the third code word matches the first code word” recited in claims 1, 11 and 17.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Agrawal et al. (US 2021/0142159), in view of Sharon et al. (US 2018/0358988).

As per claim 1:

Agrawal discloses:


(Agrawal, Figs. 1-47)

a noise generation unit configured to receive a first code word and output a second code word, the second code word being 
(Agrawal, [0141] received signal comprises a transmitted codeword and added noise.  In some examples, the added noise may be AWGN, which may in some examples be artificially added to imitate a wireless communication channel.  Referring to FIG. 4, in a first step 110, the method comprises inputting to an input layer of the NN a representation of message bits obtained from a received noisy signal)
(Agrawal, Figs. 4-7, 110, 210, 310, 410 message bits obtained from received noisy signal) 

a decoding unit configured to decode the second code word and output a third code word; 
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0098] decoder takes the LLR values as input, and returns decision on corrected bits.  The decoding follows the renowned Belief Propagation (BP) algorithm.  The messages (or beliefs) are updated by passing the messages over the edges of the graph representation of the code called the Tanner graph)
(Agrawal, Figs. 4-7, right flow chart) 
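
For illustration only, the “syndrome check” named in the passage cited above is commonly understood as testing whether a candidate word satisfies every parity check of the code, i.e. whether H·c = 0 (mod 2). The sketch below uses the (7,4) Hamming parity-check matrix purely as a small assumed example; the matrix and the function names are not taken from Agrawal.

```python
# Parity-check matrix of the (7,4) Hamming code, chosen only as a small example.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(parity_check, word):
    """Return H * word (mod 2) as a list of bits, one per parity check."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in parity_check]


def syndrome_check_satisfied(parity_check, word):
    """True when every parity check is satisfied, i.e. the syndrome is all zero."""
    return not any(syndrome(parity_check, word))
```

A word that passes this check is a valid code word of the code, although it is not necessarily the code word that was originally transmitted.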

 a generation unit configured to generate learning data including the second code word (Agrawal, Fig. 4, 132) for which the third code word matches the first code word, (Agrawal, Figs. 4-7, check satisfied?)
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 


(Agrawal, [0004] NND learns to reduce the effect of artifacts such as cycles or trapping sets in the graph structure, by applying complimentary weights to the messages passed over edges of the graph which form cycles.  Weights are learned through a training process.  Training parameters such as Input/Target variables, Loss function, Regularization, and Optimizer etc., affect the performance of the network during testing and use.  In existing approaches, training is performed using "Cross entropy loss/multi-loss functions")
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 

a learning unit configured to receive the learning data and determine the weight for the message passing decoding using the learning data. 
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 

Agrawal discloses that added noise …which may in some examples be artificially added to imitate a wireless communication channel.

However, Agrawal does not clearly disclose the “first code word with noise added” in a memory system.

Sharon clearly discloses the “first code word with noise added”.
(Sharon, [0059] Step 302 is to pass a batch of noisy codewords through a parameterized iterative message passing decoder 244.  In one embodiment, the noisy codewords are represented as a vector of a-priori log-likelihood ratios.  In an offline embodiment, the noisy codewords may be generated by adding arbitrary noise to clean versions of the codewords)
 (Sharon, [0090] … generated by adding noise to the bits of the clean codeword 602.  … By reading data that was stored in non-volatile memory.  The data that is read from non-volatile memory will typically be noisy due to well-known factors such as read disturb, program disturb, charge loss, etc. The a-priori LLRs may be generated based on reading only hard bits or based on reading both hard and soft bits)
(Sharon, [0123], by learning such noise realizations, the decoder 244 can perform better.  For example, power can be saved, decoding latency can be reduced, and error correction can be made more accurate)
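
For illustration only, the cited passages describe noisy codewords represented as a vector of a-priori log-likelihood ratios (LLRs) that may be generated from hard bits. A minimal sketch of one common way to do this is shown below, assuming a binary symmetric channel with a known bit-error probability and the sign convention LLR = ln(P(bit = 0) / P(bit = 1)). The channel model, the convention, and the function name are assumptions, not details taken from Sharon.

```python
import math


def hard_bits_to_llrs(bits, error_prob):
    """Map hard-read bits to a-priori LLRs under an assumed binary symmetric channel.

    A read 0 gives +ln((1 - p) / p); a read 1 gives the negative of that value.
    """
    if not 0.0 < error_prob < 0.5:
        raise ValueError("error_prob must lie strictly between 0 and 0.5")
    magnitude = math.log((1.0 - error_prob) / error_prob)
    return [magnitude if b == 0 else -magnitude for b in bits]
```

A smaller assumed error probability yields larger LLR magnitudes, which expresses greater confidence in each read bit.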

It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to incorporate Sharon’s method of adding noise to a clean code word into Agrawal in order to improve the performance of Agrawal’s decoder.
(Sharon, [0123], by learning such noise realizations, the decoder 244 can perform better.  For example, power can be saved, decoding latency can be reduced, and error correction can be made more accurate)

As per claim 2:

wherein the learning data includes the first code word, the second code word, and a determination result indicating whether the first code word and the third code word coincide with each other, and when the determination result indicates that the first code word and the third code word coincide with each other, the learning unit is configured to use the second code word as an input and the first code word as a correct answer in determining the weight. 
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 

As per claim 3:
Agrawal-Sharon further discloses:

wherein the generation unit is configured to generate the learning data including just the first code word and the second code word when the first code word and the third code word coincide with each other. 
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 

As per claim 4:
Agrawal-Sharon further discloses:
wherein the generation unit is configured to generate the learning data including the third code word and the second code word. 
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 

As per claim 5:
Agrawal-Sharon further discloses:

wherein the generation unit is configured to generate the learning data including the first code word and the second code word when the first code word and the third code word coincide with each other, and the generation unit is configured to generate the learning data including the third code word and a fourth code word obtained by converting the second code word when the first code word and the third code word do not coincide with each other. 
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 

As per claim 6:
Agrawal-Sharon further discloses:
wherein the message passing decoding is a belief-propagation algorithm in which the weight and the transmitted message are multiplied. 
(Agrawal, [0098] decoder takes the LLR values as input, and returns decision on corrected bits.  The decoding follows the renowned Belief Propagation (BP) algorithm.  The messages (or beliefs) are updated by passing the messages over the edges of the graph representation of the code called the Tanner graph)
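
For illustration only, the claim language “the weight and the transmitted message are multiplied” can be pictured with a single variable-node update of a weighted belief-propagation decoder. The sketch assumes one weight per incoming edge and the usual extrinsic rule (the message sent on an edge excludes what arrived on that same edge); where exactly the weights are applied in Agrawal or in the application is not stated in the passages cited here, so this placement is an assumption.

```python
def variable_to_check_messages(channel_llr, incoming, weights):
    """Outgoing messages from one variable node in a weighted message-passing decoder.

    Message on edge j = channel_llr + sum over i != j of weights[i] * incoming[i].
    """
    weighted = [w * m for w, m in zip(weights, incoming)]
    total = channel_llr + sum(weighted)
    # Subtracting an edge's own weighted input leaves only extrinsic information.
    return [total - wm for wm in weighted]
```

With every weight set to 1.0 this reduces to the unweighted variable-node update, which matches the fixed (1.0) initialization mentioned in the cited paragraph [0137].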

As per claim 7:
Agrawal-Sharon further discloses:

(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 

As per claim 8:
Agrawal-Sharon further discloses:

(Agrawal, [0100] decoder uses a soft-iterative decoding technique called SPA.  SPA operates on sum-product semi-ring for iterative decoding, which leads to bit-wise Maximum a posteriori probability (MAP) decoding)
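
For illustration only, the sum-product algorithm (SPA) named in the cited passage is commonly implemented in the LLR domain with the “tanh rule” at each check node. The sketch below shows that standard textbook update; the function name and the numerical clamp are assumptions, and nothing here is taken from Agrawal’s implementation.

```python
import math


def check_to_variable_messages(incoming):
    """Sum-product check-node update in the LLR domain (tanh rule).

    Message on edge j = 2 * atanh( product over i != j of tanh(incoming[i] / 2) ).
    """
    limit = 1.0 - 1e-12  # assumed clamp so atanh stays finite for very large inputs
    outgoing = []
    for j in range(len(incoming)):
        prod = 1.0
        for i, message in enumerate(incoming):
            if i != j:
                prod *= math.tanh(message / 2.0)
        prod = max(-limit, min(limit, prod))
        outgoing.append(2.0 * math.atanh(prod))
    return outgoing
```

The sign of each outgoing message is the product of the signs of the other incoming messages, and its magnitude never exceeds the smallest magnitude among them.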

As per claim 9:
Agrawal-Sharon further discloses:
further comprising: an acquisition unit configured to acquire the first code word from a storage unit and supply the first code word to the noise addition unit. 
(Agrawal, [0141] received signal comprises a transmitted codeword and added noise.  In some examples, the added noise may be AWGN, which may in some examples be artificially added to imitate a wireless communication channel.  Referring to FIG. 4, in a first step 110, the method comprises inputting to an input layer of the NN a representation of message bits obtained from a received noisy signal)
(Agrawal, Figs. 4-7, 110, 210, 310, 410 message bits obtained from received noisy signal) 

As per claim 10:
Agrawal further discloses:

(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 

As per claim 11:

Agrawal discloses:

A learning device, comprising: a processor connected to a memory unit, the processor being configured to: 
(Agrawal, Figs. 1-47)
(Agrawal, [0041] controller comprises a processor and a memory, the memory containing instructions executable by the processor such that the controller is operable to input to  ... message bits obtained from a received noisy signal; propagate the representation through the NN)

receive a first code word and generate a second code word from the first code word, the second code word being 
(Agrawal, [0141] received signal comprises a transmitted codeword and added noise.  In some examples, the added noise may be AWGN, which may in some examples be artificially added to imitate a wireless communication channel.  Referring to FIG. 4, in a first step 110, the method comprises inputting to an input layer of the NN a representation of message bits obtained from a received noisy signal)
(Agrawal, Figs. 4-7, 110, 210, 310, 410 message bits obtained from received noisy signal) 

decode the second code word and output a third code word as a decoding result;  and 
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0098] decoder takes the LLR values as input, and returns decision on corrected bits.  The decoding follows the renowned Belief Propagation (BP) algorithm.  The messages (or beliefs) are updated by passing the messages over the edges of the graph representation of the code called the Tanner graph)
(Agrawal, Figs. 4-7, right flow chart) 

generate learning data including the second code word (Agrawal, Fig. 4, 132) for which the third code word matches the first code word;  and (Agrawal, Figs. 4-7, check satisfied?)
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 

determine a weight for a message passing decoding from the learning data, the weight being multiplied with a message transmitted in the message passage decoding. 
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 
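For illustration only, the limitation "the weight being multiplied with a message" can be pictured with a weighted min-sum check-node update. This is a minimal sketch under stated assumptions: the function name `check_node_update`, the use of min-sum rather than full belief propagation, and the single scalar weight per check node are all hypothetical choices made here and are not taken from Agrawal or Sharon.

```python
from math import copysign


def check_node_update(incoming, weight):
    """Weighted min-sum update for one check node (illustrative only).

    incoming: LLR messages arriving from the connected variable nodes.
    weight:   learned scalar multiplied with every outgoing message
              (assumed shared by all edges of this check node).
    """
    outgoing = []
    for i in range(len(incoming)):
        # Exclude the message that arrived on the edge being updated.
        others = incoming[:i] + incoming[i + 1:]
        sign = 1.0
        for m in others:
            sign *= copysign(1.0, m)
        magnitude = min(abs(m) for m in others)
        # The weight is multiplied with the transmitted message here.
        outgoing.append(weight * sign * magnitude)
    return outgoing


messages = check_node_update([2.0, -0.5, 1.5], weight=0.5)
# messages == [-0.25, 0.75, -0.25]
```

With `weight=1.0` the update reduces to plain min-sum, which is why a fixed initial weight of 1.0 is a natural starting point for training.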

Agrawal discloses “added noise … which may in some examples be artificially added to imitate a wireless communication channel.”

However, Agrawal does not clearly disclose a “first code word with noise added” in a memory system.

Sharon clearly discloses a “first code word with noise added”:
(Sharon, [0059] Step 302 is to pass a batch of noisy codewords through a parameterized iterative message passing decoder 244.  In one embodiment, the noisy codewords are represented as a vector of a-priori log-likelihood ratios.  In an offline embodiment, the noisy codewords may be generated by adding arbitrary noise to clean versions of the codewords)
 (Sharon, [0090] … generated by adding noise to the bits of the clean codeword 602.  … By reading data that was stored in non-volatile memory.  The data that is read from non-volatile memory will typically be noisy due to well-known factors such as read disturb, program disturb, charge loss, etc. The a-priori LLRs may be generated based on reading only hard bits or based on reading both hard and soft bits)
(Sharon, [0123], by learning such noise realizations, the decoder 244 can perform better.  For example, power can be saved, decoding latency can be reduced, and error correction can be made more accurate)

It would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art to incorporate Sharon’s method of adding noise to a clean code word into the system of Agrawal in order to improve the performance of the decoder of Agrawal.
(Sharon, [0123], by learning such noise realizations, the decoder 244 can perform better.  For example, power can be saved, decoding latency can be reduced, and error correction can be made more accurate)
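As a hedged illustration of the cited passages, the sketch below adds noise to a clean code word and converts the noisy hard bits into a-priori log-likelihood ratios. The independent bit-flip noise model, the function names, and the flip probability of 0.1 are assumptions chosen for this example; Sharon refers only to "arbitrary noise" and does not prescribe this model.

```python
import math
import random


def add_bit_flip_noise(clean_bits, flip_prob, rng):
    """Flip each bit independently with probability flip_prob."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in clean_bits]


def hard_bits_to_llrs(read_bits, flip_prob):
    """A-priori LLR per hard bit: positive favours 0, negative favours 1."""
    magnitude = math.log((1.0 - flip_prob) / flip_prob)
    return [magnitude if b == 0 else -magnitude for b in read_bits]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
clean = [0, 1, 1, 0, 1, 0, 0]          # hypothetical clean code word
noisy = add_bit_flip_noise(clean, flip_prob=0.1, rng=rng)
llrs = hard_bits_to_llrs(noisy, flip_prob=0.1)
```

Only hard bits are modelled here; a read that also returns soft bits would yield more than one LLR magnitude.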

As per claim 12:
Agrawal-Sharon further discloses:
wherein the learning data includes the first code word, the second code word, and a determination result indicating whether the first code word and the third code word coincide with each other. 
(Agrawal, Figs. 4-7, check satisfied? == YES)
 (Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 
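One way to picture learning data that carries the first code word, the second code word, and a determination result is a simple record type. The sketch below is illustrative only; the class name `LearningRecord`, its field names, and the helper `make_record` are hypothetical and do not appear in the cited references.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LearningRecord:
    first_code_word: List[int]   # code word before noise is added
    second_code_word: List[int]  # code word with noise added
    match: bool                  # True if the third (decoded) word equals the first


def make_record(first, second, third):
    """Build one record; the determination result is computed by comparison."""
    return LearningRecord(list(first), list(second), list(third) == list(first))


ok = make_record([0, 1, 1], [0, 0, 1], [0, 1, 1])
bad = make_record([0, 1, 1], [1, 0, 1], [1, 0, 1])
# ok.match is True, bad.match is False
```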

As per claim 13:
Agrawal-Sharon further discloses:
wherein the learning data includes just the first code word and the second code word when the first code word and the third code word coincide with each other. 
(Agrawal, Figs. 4-7, check satisfied? == YES)
 (Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 

As per claim 14:
Agrawal-Sharon further discloses:
wherein learning data includes just the third code word and the second code word. 
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 

As per claim 15:
Agrawal-Sharon further discloses:

manipulate a bit of the second code word when the third code word does not coincide with the first code word, and
(Agrawal, Figs. 4-7, check satisfied? == NO)

generate a fourth code word from the manipulated second code word, and include the fourth code word in the learning data, 
(Agrawal, Figs. 4-7, Continue 233)


(Agrawal, Figs. 4-7, check satisfied? == YES)
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 

As per claim 16:
Agrawal-Sharon further discloses:

(Agrawal, [0141] received signal comprises a transmitted codeword and added noise.  In some examples, the added noise may be AWGN, which may in some examples be artificially added to imitate a wireless communication channel.  Referring to FIG. 4, in a first step 110, the method comprises inputting to an input layer of the NN a representation of message bits obtained from a received noisy signal)
(Agrawal, Figs. 4-7, 110, 210, 310, 410 message bits obtained from received noisy signal) 

As per claim 17:

Agrawal discloses:


a learning device including a decoding unit, a generation unit configured to generate learning data, and a learning unit configured to receive the learning data and determine a weight for a message passing decoding using the learning data, wherein the memory controller is configured to: 
(Agrawal, [0041] controller comprises a processor and a memory, the memory containing instructions executable by the processor such that the controller is operable to input … message bits obtained from a received noisy signal… optimise trainable parameters of the NN to minimise a loss function.  …generating an intermediate output codeword from the intermediate output representation; and performing a syndrome check on the generated intermediate output codeword.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check is satisfied, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied.)
(Agrawal, Fig. 32, Host Computer)
(Agrawal, Fig. 9, Controller, Memory)
(Agrawal, Fig. 10, Input Module, Interfaces, Optimizing Module)
(Agrawal, Fig. 11, Output Module)



 read data from the first location of the memory unit via the memory interface, and supply the read data to the decoding unit as a second code word;  
(Agrawal, [0141] received signal comprises a transmitted codeword and added noise.  In some examples, the added noise may be AWGN, which may in some examples be artificially added to imitate a wireless communication channel.  Referring to FIG. 4, in a first step 110, the method comprises inputting to an input layer of the NN a representation of message bits obtained from a received noisy signal)
(Agrawal, Figs. 4-7, 110, 210, 310, 410 message bits obtained from received noisy signal) 

the decoding unit is configured to output a decoding result of the second code word as a third code word to the generation unit;
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0098] decoder takes the LLR values as input, and returns decision on corrected bits.  The decoding follows the renowned Belief Propagation (BP) algorithm.  The messages (or beliefs) are updated by passing the messages over the edges of the graph representation of the code called the Tanner graph)
(Agrawal, Figs. 4-7, right flow chart) 
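The syndrome check cited above can be sketched as a parity-check multiplication over GF(2). This is a minimal example under stated assumptions: the (7,4) Hamming parity-check matrix `H` and the function names are chosen here for illustration and are not the code or matrix used in Agrawal.

```python
# Parity-check matrix of a (7,4) Hamming code (illustrative choice).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(H, word):
    """Return H * word over GF(2), one bit per parity-check row."""
    return [sum(h * c for h, c in zip(row, word)) % 2 for row in H]


def syndrome_check_satisfied(H, word):
    """The check is satisfied when every syndrome bit is zero."""
    return not any(syndrome(H, word))


valid = [1, 1, 1, 0, 0, 0, 0]      # satisfies all three parity checks
corrupted = [1, 1, 1, 0, 0, 0, 1]  # last bit flipped

# syndrome_check_satisfied(H, valid) is True
# syndrome(H, corrupted) == [1, 1, 1]
```

A satisfied check shows only that the decoder output is some valid code word; it does not by itself show that the output equals the code word originally written.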

the generation unit is configured to: 

generate learning data including the second code word (Agrawal, Fig. 4, 132) for which the corresponding third code word matches the first code word, and (Agrawal, Figs. 4-7, check satisfied?)

supply the learning data to the learning unit; and the learning unit is configured to determine a weight for a message passing decoding using the learning data. 
(Agrawal, [0141] generating an intermediate output representation at step 121, generating an intermediate output codeword from the intermediate output representation at step 122, and performing a syndrome check on the generated intermediate output codeword at step 123.  Optimising trainable parameters of the NN to minimise a loss function comprises, if the syndrome check at step 123 is found to be satisfied at step 131, ceasing optimisation of the trainable parameters after optimisation at the layer at which the syndrome check is satisfied at step 132.  It will be appreciated that in some examples of the method 100, the steps 121 to 123 may be performed at every even layer of the NN) 
(Agrawal, [0055], Examples of the proposed solutions also improve the training process by pushing the network to learn correct weights early on)
(Agrawal, [0072] FIGS. 16 to 20 illustrate learned weight distribution over edges for different codes)
(Agrawal, [0133]-[0140] Trained Weights [0137] Only the weights that define the structure of the Tanner graph are trained.  The initialization of weight can be either fixed (1.0) or can be random)
(Agrawal, Figs. 4-7) 
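The generation step recited above, which keeps a second code word only when its decoding result matches the first code word, can be sketched as a filter. The function names, the length-3 repetition code, and the majority-vote decoder below are hypothetical stand-ins used only to make the example self-contained; they are not drawn from Agrawal or Sharon.

```python
def majority_decode(read_word):
    """Toy decoder for a length-3 repetition code (assumed for illustration)."""
    bit = 1 if sum(read_word) >= 2 else 0
    return [bit] * len(read_word)


def generate_learning_data(samples, decode):
    """Keep each noisy (second) code word whose decoded (third) code word
    matches the original (first) code word."""
    learning_data = []
    for first, second in samples:
        third = decode(second)
        if third == first:
            learning_data.append(second)
    return learning_data


samples = [
    ([0, 0, 0], [0, 1, 0]),  # decodes back to the first code word: kept
    ([1, 1, 1], [0, 0, 1]),  # decodes to the wrong code word: dropped
    ([1, 1, 1], [1, 1, 0]),  # decodes back to the first code word: kept
]
learning_data = generate_learning_data(samples, majority_decode)
# learning_data == [[0, 1, 0], [1, 1, 0]]
```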

Agrawal discloses a learning device for use in a communication system.

However, Agrawal does not clearly disclose a learning device for use in a memory system such as “comprising: a memory controller including: a host interface for receiving data from a host device, a memory interface for receiving data from a memory unit, an error correction encoding/decoding unit for encoding and decoding data to be written to and read from the memory unit, a data buffer for storing data to be written to the memory unit and data read from the memory unit”

Sharon discloses a learning device for use in “a memory system, comprising: a memory controller including: a host interface for receiving data from a host device, a memory interface for receiving data from a memory unit, an error correction encoding/decoding unit for encoding and decoding data to be written to and read from the memory unit, a data buffer for storing data to be written to the memory unit and data read from the memory unit”

(Sharon, Fig. 2, Non-Volatile Memory System 100. Controller 122, Host Interface 220, Memory Interface 230, Encoder 256, RAM 216, 218)

It would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art to incorporate Sharon’s learning device in a memory system as taught by Sharon in Fig. 2.
(Sharon, Fig. 2, Non-Volatile Memory System 100. Controller 122, Host Interface 220, Memory Interface 230, Encoder 256, RAM 216, 218)

Agrawal does not clearly disclose:
receive first data from the host device via the host interface, encode the first data using the error correction encoding/decoding unit to generate a first code word, store the first code word in the data buffer, write the first code word to a first location in the memory unit via the memory interface,

Sharon clearly discloses:
receive first data from the host device via the host interface, encode the first data using the error correction encoding/decoding unit to generate a first code word, store the first code word in the data buffer, write the first code word to a first location in the memory unit via the memory interface,
 (Sharon, [0059] Step 302 is to pass a batch of noisy codewords through a parameterized iterative message passing decoder 244.  In one embodiment, the noisy codewords are represented as a vector of a-priori log-likelihood ratios.  In an offline embodiment, the noisy codewords may be generated by adding arbitrary noise to clean versions of the codewords)
 (Sharon, [0090] … generated by adding noise to the bits of the clean codeword 602.  … By reading data that was stored in non-volatile memory.  The data that is read from non-volatile memory will typically be noisy due to well-known factors such as read disturb, program disturb, charge loss, etc. The a-priori LLRs may be generated based on reading only hard bits or based on reading both hard and soft bits)
(Sharon, [0123], by learning such noise realizations, the decoder 244 can perform better.  For example, power can be saved, decoding latency can be reduced, and error correction can be made more accurate)

It would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art to incorporate Sharon’s method of adding noise to a clean code word into the system of Agrawal in order to improve the performance of the decoder of Agrawal.
(Sharon, [0123], by learning such noise realizations, the decoder 244 can perform better.  For example, power can be saved, decoding latency can be reduced, and error correction can be made more accurate)

As per claim 18:
Agrawal-Sharon further discloses:


(Agrawal, [0098] decoder takes the LLR values as input, and returns decision on corrected bits.  The decoding follows the renowned Belief Propagation (BP) algorithm.  The messages (or beliefs) are updated by passing the messages over the edges of the graph representation of the code called the Tanner graph)


As per claim 19:

Agrawal-Sharon further discloses:

wherein the determined weight is supplied to the error correction encoding/decoding unit. 
(Sharon, Fig. 3, Store Learned Parameters 308)
(Sharon, Fig.  3, Decode the data using the learned parameter in the parameterized iterative message passing decoder 312)

As per claim 20:

Agrawal-Sharon further discloses:


(Sharon, Fig. 2, Non-Volatile Memory 108)

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to THIEN DANG NGUYEN whose telephone number is (571)272-9189. The examiner can normally be reached Monday-Friday 7 AM - 3:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, April Blair, can be reached at 571-270-1014. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Thien Nguyen/Primary Examiner, Art Unit 2111