Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The filing date of the present application is 10/18/2016.
This action is in response to amendments and/or remarks filed on 09/03/2021. In the current amendments, claims 49, 51-52, 56-57, 59-60, 64-65, 67-68, and 72 have been amended, and claims 1-48, 50, 53, 55, 58, 61, 63, 66, 69 and 71 have been cancelled. Claims 49, 51, 52, 54, 56, 57, 59, 60, 62, 64, 65, 67, 68, 70, and 72 are pending and have been examined. 
In view of Applicant’s amendments and/or remarks, the objections to claims 51, 59 and 67 made in the previous Office Action have been withdrawn.
In view of Applicant’s amendments and/or remarks, the rejections under 35 U.S.C. §112(a) to claims 49, 57 and 65 made in the previous Office Action have been withdrawn.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 09/03/2021 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

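The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
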
Claims 49, 51, 52, 54, 56, 57, 59, 60, 62, 64, 65, 67, 68, 70, and 72 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claims 49, 57 and 65 recite “providing data values for the first data field of the new form based on the first confidence score.” However, the specification appears to be silent with regard to “data values for the first data field of the new form based on the first confidence score”. Instead, par 80 of the specification states: “In one embodiment, the machine learning module 113 outputs results data 120 indicating that a candidate function has been found that is likely correct. The results data 120 can indicate what the candidate function is, the matching data 127 or confidence score data 128 related to the candidate function, or any other information that will be useful for review by an expert.” In other words, although the paragraph describes a single data value for the first data field of the new form based on the first confidence score and a single candidate function, the limitation recites multiple data values for the first data field of the new form based on the first confidence score. In addition, claim 59 recites providing additional data based on the second confidence score. This appears to change the scope of the claimed invention without support from the specification; the claims are therefore rejected under 35 U.S.C. 112(a) for lack of written description. (with emphasis underlined)
Claims 56 and 64 recite similar claim language and are rejected under 35 U.S.C. 112(a) for lack of written description for the same reason.
Claims 49, 56, 57, 64 and 65 each recite limitations that raise issues of written description as set forth above, and dependent claims 51-52, 54, 59-60, 62, 67-68, 70 and 72 are rejected at least based on their direct and/or indirect dependency from independent claims 49, 57 and 65. Appropriate explanation and/or amendment is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 49, 51, 52, 54, 56, 57, 59, 60, 62, 64, 65, 67, 68, 70, and 72 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim 49 recites the limitation “data values” in line 14. However, it is not clear whether the “data values” in line 14 refers to the “data values” in line 4. If they are different, “data values” in line 14 may be replaced with “second data values”, “other data values” or similar language. Claim 57 (the limitation “data values” in line 15) and claim 65 (the limitation “data values” in line 17) have the same issue. Appropriate correction is required.
Claim 72 recites the limitation “the data value” in line 12.  There is insufficient antecedent basis for this limitation in the claim.
Claims 49, 57, 65 and 72 each recite limitations that raise issues of indefiniteness as set forth above, and dependent claims 51-52, 54, 56, 59-60, 62, 64, 67-68 and 70 are rejected at least based on their direct and/or indirect dependency from independent claims 49, 57 and 65. Appropriate explanation and/or amendment is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 49, 51, 56-57, 59, 64-65, 67 and 72 are rejected under 35 U.S.C. 103 as being unpatentable over Toda et al. (A Probabilistic Approach for Automatically Filling Form-Based Web Interfaces) in view of Djabarov (US 8,214,362 B1), further in view of Hermens et al. (A Machine-Learning Apprentice for the Completion of Repetitive Forms).

Regarding claim 49, 
Toda teaches 
the method comprising:

generating, by a machine learning module, a first plurality of arithmetic functions to provide data values for a first data field of a new form in the electronic document preparation system, wherein the data values are arithmetically dependent on one or more data fields … of the new form ([sec 4] “We have considered several alternatives for such combination, including the use of machine learning approaches, such as SVM [11], Genetic Programming [10], linear combination of values and the use of a Bayesian Network approach. … We model the computation of the probability of field fj given a segment Sab through a Bayesian belief network model similar to the one proposed by Ribeiro-Neto et al [17, 6] for ranking documents in search tasks. Our Bayesian network provides a graphical formalism for representing the probability model we develop allowing for a better visualization of how the features we consider lead to the final probability of a segment given a field. … 
[Equation image: media_image1.png]
 If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
[Equation image: media_image2.png]
. Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows”; [sec 4] “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input.”; e.g., “Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ” with Eqs (7)-(8) reads on “generating … a first plurality of arithmetic functions” and “arithmetically dependent” since each segment and its probability are considered for filling out a field and since Eqs (7)-(8) have arithmetic operations. In addition, “We model the computation of the probability of field fj given a segment Sab through a Bayesian belief network model” reads on “machine learning module”. Furthermore, “fill this form when a new text is given as input” reads on “new form” since “this form” may be filled out with a new text.);

selecting a first arithmetic function from the first plurality of arithmetic functions ([sec 4] “The iForm approach for dealing with the form filling problem consists of taking candidate segments from the input text and then estimating the probability of a field given each segment. … 
[Equation image: media_image3.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
[Equation image: media_image4.png]
 (8) … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; see also [sec 3]; “estimating the probability of a field given each segment” with Eqs (7)-(8) reads on “selecting a first arithmetic function”.); 

generating first test values based on the first arithmetic function and based on training data generated from previously filled forms ([sec 2] “Thus, we only rely on information about input values entered for each field of the target form on previous submission made by the users.”; also see [sec 5] “previous submissions”; [sec 3] “The problem we face in this work is automatically filling out the fields of a given form-based interface with values extracted from a data-rich free text document, or portions of such documents. In particular, we identify two subproblems: the problems of (a) extracting values from the input text and (b) filling out the fields of the target form using them.”; [sec 4] “The iForm approach for dealing with the form filling problem consists of taking candidate segments from the input text and then estimating the probability of a field given each segment. … In the case of text boxes, we simply enter each mapped text segment as a value into its corresponding field.”; E.g., values for a form may read on “first test values”.); 

determining a [value calculated from] the first test values and corresponding values from the training data for the first data field of the new form ([fig 4]; [sec 4] as cited above, and “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input. We consider two types of features from these values: the values themselves and the tokens composing these values, which we call content related features; and the style (e.g., capitalization, punctuation, etc.), which we call the style related feature. We stress that no features from the input tests are considered. The style feature requires a more detailed explanation. Let SVj be the set of previous values entered for a field Fj. We automatically learn a Naive Hidden Markov Model SM(Fj), which we call Value Style Model, that captures the wording style of the values in SVj. This model is similar to the inner HMM used in [4], also used to capture the wording style of sequences.”; e.g., “Let SVj be the set of previous values entered for a field Fj” reads on “corresponding values from the training data for the first data field of the new form”. In addition, e.g., a value that is calculated based on values for a form and “previous values” may read on “determining a [value calculated from] the first test values and corresponding values from the training data”.);

generating, for the first arithmetic function, a first confidence score based on the [value calculated from] the first test values and the corresponding values from the training data (Toda [fig 4]; [sec 4] “A graph representing a Value Style Model SM(Fj) is generated using the encodings of all symbol mask sequences found in values previously entered for field Fj. … 
[Equation image: media_image3.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
[Equation image: media_image4.png]
 (8) … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; “probability” reads on “confidence score”.); and 

providing data values for the first data field of the new form based on the first confidence score ([sec 4] “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input. … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows”);
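
For illustration only, the threshold and the two mapping constraints quoted above from Toda can be sketched in Python as below. This is a minimal sketch and not Toda's implementation: the function name `select_mapping`, the tuple layout, and the sample probabilities are hypothetical, and the highest-probability-first pass is an assumption modeled on the "simple greedy heuristic" that Toda sec 4 mentions.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (field, start, end, probability), where the segment
# S_ab spans token positions start..end inclusive.
Candidate = Tuple[str, int, int, float]


def select_mapping(candidates: List[Candidate],
                   epsilon: float) -> Dict[str, Tuple[int, int, float]]:
    """Keep candidates above epsilon, then assign greedily by probability."""
    # Candidate set C_j: only segments whose probability exceeds the threshold.
    kept = [c for c in candidates if c[3] > epsilon]
    # Assumed greedy order: highest probability first.
    kept.sort(key=lambda c: c[3], reverse=True)

    mapping: Dict[str, Tuple[int, int, float]] = {}
    for field, a, b, p in kept:
        if field in mapping:
            continue  # constraint (1): only a single segment per field
        overlaps = any(not (b < c or d < a) for (c, d, _) in mapping.values())
        if overlaps:
            continue  # constraint (2): selected segments do not overlap
        mapping[field] = (a, b, p)
    return mapping


# Hypothetical sample data, not taken from the reference.
candidates = [
    ("name", 0, 1, 0.90),
    ("city", 1, 2, 0.80),   # overlaps the "name" segment
    ("city", 2, 3, 0.60),
    ("phone", 4, 4, 0.10),  # below the threshold
]
result = select_mapping(candidates, epsilon=0.5)
```

With these hypothetical inputs, the sketch keeps segment (0, 1) for "name", skips the overlapping "city" candidate (1, 2), and assigns (2, 3) to "city" instead; "phone" is never mapped because its probability is below the threshold.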

However, Toda does not teach
A method performed by one or more processors of an electronic document preparation system:
other than the first data field of the new form;
determining a difference between the first test values and corresponding values from the training data for the first data field of the new form;
generating, for the first arithmetic function, a first confidence score based on the difference between the first test values and the corresponding values from the training data;
(Note: Hereinafter, where a limitation contains underlined claim language, the underlined language has not yet been taught, while the non-underlined language has already been taught.)

Djabarov teaches
A method performed by one or more processors of an electronic document preparation system ([fig 3] “processor” and “memory”): 

determining a difference between the first test values and corresponding values from the training data for the first data field of the new form ([figs 8-10]; [col 6, ln 62 – col 9, ln 3] “Upon display of the document to the user, the user may enter “John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name” may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored.”; e.g., each “user entry” reads on “first test values”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “determining a difference between the first test values and corresponding values from the training data” since each user entry is compared with each attribute value of fig 7 for checking a match. Note that Toda also teaches “determining a [value calculated from] the first test values and corresponding values from the training data for the first data field of the new form”.);

generating, for the first arithmetic function, a first confidence score based on the difference between the first test values and the corresponding values from the training data ([figs 8-10]; [col 6, ln 62 – col 9, ln 3] “Upon display of the document to the user, the user may enter “John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name” may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored. … AutoFill engine 225 may require a predetermined confidence level for a identified mapping. For example, prior to indicating a mapping to AutoFill software 430, AutoFill engine 225 may require that 75% of all received mappings match the indicated mapping. Such a confidence interval or other statistical analysis, ensures that infrequently visited web documents or erroneously entered data does not unnecessarily skew the identified mappings, thereby resulting in incorrect data insertions.”; e.g., each “user entry” reads on “first test values”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “difference between the first test values and the corresponding values from the training data” since each user entry is compared with each attribute value of fig 7 for checking a match. Furthermore, “confidence level” reads on “confidence score”.
Note that Toda also teaches “generating, for the first function, a first confidence score based on the [value calculated from] the first test values and the corresponding values from the training data”.);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the electronic document management system of Toda with the comparison between test values and training values of Djabarov. Doing so would lead to providing a measure of how reliable the automatic filling is based on the data difference (Djabarov, col 6, ln 62– col 9, ln 3).
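
As a non-authoritative illustration of the 75% confidence requirement quoted above from Djabarov, the following Python sketch indicates a mapping only when a sufficient share of the received mappings agree. The function name `confident_mapping`, its signature, and the sample data are hypothetical and do not appear in the reference.

```python
from collections import Counter
from typing import List, Optional, Tuple


def confident_mapping(received: List[str],
                      required: float = 0.75) -> Tuple[Optional[str], float]:
    """Return (attribute name or None, share of received mappings that agree)."""
    if not received:
        return None, 0.0
    # Most frequently received attribute name for this form field.
    name, count = Counter(received).most_common(1)[0]
    share = count / len(received)
    # Indicate the mapping only if the agreeing share meets the required level.
    return (name if share >= required else None), share


# Hypothetical received mappings for a single form field.
strong = confident_mapping(["first name"] * 8 + ["last name"] * 2)
weak = confident_mapping(["first name"] * 6 + ["last name"] * 4)
```

In this sketch, `strong` indicates "first name" because 8 of 10 received mappings agree, while `weak` indicates no mapping because only 6 of 10 agree.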

However, Toda and Djabarov do not teach
the data values are arithmetically dependent on one or more data fields, other than the first data field of the new form.

Hermens teaches
the data values are arithmetically dependent on one or more data fields, other than the first data field of the new form ([sec “Difficulties”] “Other roadblocks to learning the leave report form include the complex formula for calculating earned sick leave, which is based on the Summer and Academic boxes, plus the %FTE box and a constant value of 8.0:
If Month ∈ {Sep, Oct, Nov, Dec, Jan, Feb, Mar, Apr} and Academic is checked, then Sick-Leave-Hours-Earned-Or-Received is 8.0
If Month ∈ {May, Aug} and Academic is checked, and Summer is not checked, then Sick-Leave-Hours-Earned-Or-Received is 4.0
If Month ∈ {May, Aug} and Academic is checked, and Summer is checked, then Sick-Leave-Hours-Earned-Or-Received is (%FTE ∙ 8.0)
If Month ∈ {June, July} and Summer is checked, then Sick-Leave-Hours-Earned-Or-Received is (%FTE ∙ 8.0)”; Note that Toda and Djabarov teach “one or more data fields of the new form”.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the electronic document management system of Toda and Djabarov with the different dependent data fields of Hermens. Doing so would lead to solving considerably more difficult and complex problems based on current and previous electronic forms (Hermens, [sec “Difficulties”]).
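
For illustration, the four sick-leave rules quoted above from Hermens can be written as a single function whose output field depends arithmetically on other fields of the form. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, %FTE is assumed to be expressed as a fraction (e.g., 0.5 for 50%), and combinations not covered by the quoted rules return `None` rather than a guessed value.

```python
from typing import Optional

ACADEMIC_MONTHS = {"Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr"}
SHOULDER_MONTHS = {"May", "Aug"}
SUMMER_MONTHS = {"June", "July"}


def sick_leave_hours_earned(month: str, academic: bool, summer: bool,
                            pct_fte: float) -> Optional[float]:
    """Apply the quoted rules in order; pct_fte is assumed to be a fraction."""
    if month in ACADEMIC_MONTHS and academic:
        return 8.0
    if month in SHOULDER_MONTHS and academic and not summer:
        return 4.0
    if month in SHOULDER_MONTHS and academic and summer:
        return pct_fte * 8.0
    if month in SUMMER_MONTHS and summer:
        return pct_fte * 8.0
    return None  # combination not covered by the quoted rules


# Hypothetical field values for one form.
october = sick_leave_hours_earned("Oct", academic=True, summer=False, pct_fte=1.0)
june = sick_leave_hours_earned("June", academic=False, summer=True, pct_fte=0.75)
```

Here the result field is computed from the Month, Academic, Summer and %FTE fields, none of which is the result field itself.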

Regarding claim 51, 
Toda, Djabarov and Hermens teach claim 49. 

Toda further teaches 
selecting a second arithmetic function from the first plurality of arithmetic functions based on the [value calculated from] the first test values and the corresponding values from the training data for the first data field of the new form ([sec 4] “The iForm approach for dealing with the form filling problem consists of taking candidate segments from the input text and then estimating the probability of a field given each segment. … 
[Equation image: media_image3.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
[Equation image: media_image4.png]
 (8) … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; see also [sec 3]; “estimating the probability of a field given each segment” with eqs (7)-(8) reads on “selecting a second arithmetic function”.);

generating second test values based on the second arithmetic function and the training data ([sec 2] “Thus, we only rely on information about input values entered for each field of the target form on previous submission made by the users.”; also see [sec 5] “previous submissions”; [sec 3] “The problem we face in this work is automatically filling out the fields of a given form-based interface with values extracted from a data-rich free text document, or portions of such documents. In particular, we identify two subproblems: the problems of (a) extracting values from the input text and (b) filling out the fields of the target form using them.”; [sec 4] “Hence, the problem is finding a subset of value-field pairs in I without conflicts whose aggregate probabilities are maximum. Finding the optimal solution for this problem requires assessing all possible subsets – an exponential number. In practice, we use a simple greedy heuristic to find an approximate solution. First, we extract the pair with the highest probability from I and verify whether it presents conflict to any pair in M or not. … In the case of text boxes, we simply enter each mapped text segment as a value into its corresponding field.”; “finding a subset of value-field pairs in I without conflicts whose aggregate probabilities are maximum” reads on “generating second test values” since different values may be estimated for a field.); 

determining a [value calculated from] the second test values and the corresponding values from the training data ([fig 4]; [sec 4] as cited above, and “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input. We consider two types of features from these values: the values themselves and the tokens composing these values, which we call content related features; and the style (e.g., capitalization, punctuation, etc.), which we call the style related feature. We stress that no features from the input tests are considered. The style feature requires a more detailed explanation. Let SVj be the set of previous values entered for a field Fj. We automatically learn a Naive Hidden Markov Model SM(Fj), which we call Value Style Model, that captures the wording style of the values in SVj. This model is similar to the inner HMM used in [4], also used to capture the wording style of sequences.”; e.g., “Let SVj be the set of previous values entered for a field Fj” reads on “corresponding values from the training data”.);

generating, for the second arithmetic function, a second confidence score based on the [value calculated from] the second test values and the corresponding values from the training data (Toda [fig 4]; [sec 4] “A graph representing a Value Style Model SM(Fj) is generated using the encodings of all symbol mask sequences found in values previously entered for field Fj. … 
[Equation image: media_image3.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
[Equation image: media_image4.png]
 (8) … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; “probability” reads on “confidence score”.); and

providing data for the first data field based on the second confidence score ([sec 3] “The problem we face in this work is automatically filling out the fields of a given form-based interface with values extracted from a data-rich free text document, or portions of such documents. In particular, we identify two subproblems: the problems of (a) extracting values from the input text and (b) filling out the fields of the target form using them.”; [sec 4] “Hence, the problem is finding a subset of value-field pairs in I without conflicts whose aggregate probabilities are maximum. Finding the optimal solution for this problem requires assessing all possible subsets – an exponential number. In practice, we use a simple greedy heuristic to find an approximate solution. First, we extract the pair with the highest probability from I and verify whether it presents conflict to any pair in M or not. … In the case of text boxes, we simply enter each mapped text segment as a value into its corresponding field.”; “enter each mapped text segment as a value into its corresponding field” reads on “providing data for the first data field.”);  

Djabarov further teaches 
selecting a second arithmetic function from the first plurality of arithmetic functions based on the difference between the first test values and the corresponding values from the training data for the first data field of the new form ([figs 8-10]; [col 6, ln 62 – col 9, ln 3] “Upon display of the document to the user, the user may enter “John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name” may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored.”; e.g., each “user entry” reads on “first test values”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “difference between the first test values and the corresponding values from the training data” since each user entry is compared with each attribute value of fig 7 for checking a match. Note that Toda also teaches “selecting a second function from the first plurality of functions based on the [value calculated from] the first test value and the corresponding values from the training data for the first data field of the new form”.);

determining a difference between the second test values and the corresponding values from the training data ([figs 8-10]; [col 6, ln 62 – col 9, ln 3] “Upon display of the document to the user, the user may enter “John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name” may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored.”; e.g., each “user entry” reads on “second test values”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “determining a difference between the second test values and corresponding values from the training data” since each user entry is compared with each attribute value of fig 7 for checking a match. Note that Toda also teaches “the second test values and corresponding values from the training data”.);

generating, for the second arithmetic function, a second confidence score based on the difference between the second test values and the corresponding values from the training data ([figs 8-10]; [col 6, ln 62– col 9, ln 3] “Upon display of the document to the user, the user may enter "John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored. … AutoFill engine 225 may require a predetermined confidence level for a identified mapping. For example, prior to indicating a mapping to AutoFill software 430, AutoFill engine 225 may require that 75% of all received mappings match the indicated mapping. Such a confidence interval or other statistical analysis, ensures that infrequently visited web documents or erroneously entered data does not unnecessarily skew the identified mappings, thereby resulting in incorrect data insertions.”; e.g., each “user entry” reads on “second test values”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “difference between the second test values and the corresponding values from the training data” since each user entry is compared with each attribute value of fig 7 for checking a match. Furthermore, “confidence level” reads on “confidence score”. 
Note that Toda also teaches “generating, for the second function, a second confidence score based on the [value calculated from] the second test values and the corresponding values from the training data”.);
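For illustration only, the confidence-level check quoted above from Djabarov (requiring that 75% of all received mappings match the indicated mapping) might be sketched as follows. This is a minimal sketch and not Djabarov's implementation: the function name `indicated_mapping`, the parameter `required_level`, and the sample data are hypothetical and are introduced here only to make the arithmetic concrete.

```python
from collections import Counter


def indicated_mapping(received, required_level=0.75):
    """Return (attribute_name, confidence) when the most frequently reported
    mapping meets the required level; otherwise return (None, confidence).

    `received` is a list of attribute names reported for one form field.
    All names here are hypothetical and chosen for illustration.
    """
    if not received:
        return None, 0.0
    # Most frequently reported attribute name for this form field.
    attribute, count = Counter(received).most_common(1)[0]
    # Share of all received mappings that agree with that attribute name.
    confidence = count / len(received)
    if confidence >= required_level:
        return attribute, confidence
    return None, confidence


# Eight of ten reports map the field to "first name", so the level is met.
reports = ["first name"] * 8 + ["last name"] * 2
result = indicated_mapping(reports)
```

Under these assumptions, a mapping supported by only half of the received reports would not be indicated, which corresponds to the quoted concern about infrequently visited documents or erroneous entries skewing the mappings.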

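It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the electronic document management system of Toda, Djabarov and Hermens with the comparison between test values and training values of Djabarov. Doing so would lead to providing a measure of how reliable the automatic filling is based on the data difference (Djabarov, col 6, ln 62– col 9, ln 3).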

Regarding claim 56, 
Toda, Djabarov and Hermens teach claim 49. 

Toda further teaches 
generating, for a second data field of the new form, a second plurality of arithmetic functions to provide data values for the second data field ([sec 4] “Our Bayesian network provides a graphical formalism for representing the probability model we develop allowing for a better visualization of how the features we consider lead to the final probability of a segment given a field. … 
[Equation image: media_image1.png]
 If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
[Equation image: media_image2.png]
. Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows”; [sec 4] “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input.”; “Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ” with Eqs (7)-(8) reads on “generating … a second plurality of functions” since each segment and its probability are considered for filling out a field and since Eqs (7)-(8) have arithmetic operations.);

selecting a function from the second plurality of arithmetic functions  ([sec 4] “The iForm approach for dealing with the form filling problem consists of taking candidate segments from the input text and then estimating the probability of a field given each segment. … 
[Equation image: media_image3.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
[Equation image: media_image4.png]
 (8) … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; see also [sec 3]; “estimating the probability of a field given each segment” with Eqs (7)-(8) reads on “selecting a function”.);

generating second test values based on the selected function of the second plurality of arithmetic functions and the training data ([sec 2] “Thus, we only rely on information about input values entered for each field of the target form on previous submission made by the users.”; also see [sec 5] “previous submissions”; [sec 3] “The problem we face in this work is automatically filling out the fields of a given form-based interface with values extracted from a data-rich free text document, or portions of such documents. In particular, we identify two subproblems: the problems of (a) extracting values from the input text and (b) filling out the fields of the target form using them.”; [sec 4] “Hence, the problem is finding a subset of value-field pairs in I without conflicts whose aggregate probabilities are maximum. Finding the optimal solution for this problem requires assessing all possible subsets – an exponential number. In practice, we use a simple greedy heuristic to find an approximate solution. First, we extract the pair with the highest probability from I and verify whether it presents conflict to any pair in M or not. … In the case of text boxes, we simply enter each mapped text segment as a value into its corresponding field.”; “finding a subset of value-field pairs in I without conflicts whose aggregate probabilities are maximum” reads on “generating second test values” since different values may be estimated for a field.); and

determining a [value calculated from] the second test values and corresponding values from the training data for the second data field of the new form ([fig 4]; [sec 4] as cited above, and “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input. We consider two types of features from these values: the values themselves and the tokens composing these values, which we call content related features; and the style (e.g., capitalization, punctuation, etc.), which we call the style related feature. We stress that no features from the input tests are considered. The style feature requires a more detailed explanation. Let SVj be the set of previous values entered for a field Fj. We automatically learn a Naive Hidden Markov Model SM(Fj), which we call Value Style Model, that captures the wording style of the values in SVj. This model is similar to the inner HMM used in [4], also used to capture the wording style of sequences.”; “Let SVj be the set of previous values entered for a field Fj” reads on “corresponding values from the training data for the second data field of the new form”.);

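generating, for the selected function, a third confidence score based on the [value calculated from] the second test values and the corresponding values from the training data (Toda [fig 4]; [sec 4] “A graph representing a Value Style Model SM(Fj) is generated using the encodings of all symbol mask sequences found in values previously entered for field Fj. … 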
[Equation image: media_image3.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
[Equation image: media_image4.png]
 (8) … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; “probability” reads on “confidence score”.); and 

providing data values for the second data field of the new form based on the third confidence score ([sec 4] “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input. … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows”);
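For illustration only, the greedy selection step quoted above from Toda (take the highest-probability value-field pair first, and keep it only if it does not conflict with pairs already in the mapping M) might be sketched as follows. This is a minimal sketch under stated assumptions, not Toda's implementation: it covers only the greedy step rather than the full two-phase procedure, and the names `greedy_mapping`, `overlaps`, `epsilon`, and the sample candidates are hypothetical. Segments are assumed to be inclusive token spans (a, b).

```python
def overlaps(span_a, span_b):
    """True when two inclusive token spans (start, end) share any token."""
    return span_a[0] <= span_b[1] and span_b[0] <= span_a[1]


def greedy_mapping(candidates, epsilon=0.5):
    """Greedily pick at most one segment per field, with no overlapping segments.

    `candidates` is a list of (probability, field, (start, end)) tuples.
    Only candidates whose probability is above `epsilon` are considered.
    """
    mapping = {}
    eligible = [c for c in candidates if c[0] > epsilon]
    # Visit pairs from highest to lowest probability.
    for prob, field, span in sorted(eligible, key=lambda c: c[0], reverse=True):
        if field in mapping:
            continue  # constraint (1): a single segment per field
        if any(overlaps(span, chosen) for _, chosen in mapping.values()):
            continue  # constraint (2): selected segments are non-overlapping
        mapping[field] = (prob, span)
    return mapping


candidates = [
    (0.9, "name", (0, 1)),
    (0.8, "city", (1, 2)),  # overlaps the chosen "name" segment at token 1
    (0.7, "city", (3, 3)),
    (0.6, "name", (4, 5)),  # "name" is already mapped
]
mapping = greedy_mapping(candidates)
```

In this sketch the second candidate is rejected because its span shares token 1 with the segment already assigned to "name", so "city" falls back to its next-best candidate.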

Djabarov further teaches 
determining a difference between the second test values and corresponding values from the training data for the second data field of the new form ([figs 8-10]; [col 6, ln 62– col 9, ln 3] “Upon display of the document to the user, the user may enter "John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored.”; each “user entry” reads on “second test values”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “determining a difference between the second test values and corresponding values from the training data” since each user entry is compared with each attribute value of fig 7 for checking a match. Note that Toda also teaches “the second test value and corresponding values from the training data for the second data field of the new form”.);

generating, for the selected function, a third confidence score based on the difference between the second test values and the corresponding values from the training data ([figs 8-10]; [col 6, ln 62– col 9, ln 3] “Upon display of the document to the user, the user may enter "John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored. … AutoFill engine 225 may require a predetermined confidence level for a identified mapping. For example, prior to indicating a mapping to AutoFill software 430, AutoFill engine 225 may require that 75% of all received mappings match the indicated mapping. Such a confidence interval or other statistical analysis, ensures that infrequently visited web documents or erroneously entered data does not unnecessarily skew the identified mappings, thereby resulting in incorrect data insertions.”; e.g., each “user entry” reads on “second test values”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “difference between the second test values and the corresponding values from the training data” since each user entry is compared with each attribute value of fig 7 for checking a match. Furthermore, “confidence level” reads on “confidence score”. 
Note that Toda also teaches “generating, for the selected function, a third confidence score based on the [value calculated from] the second test values and the corresponding values from the training data”.);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the electronic document management system of Toda, Djabarov and Hermens with the comparison between test values and training values of Djabarov. Doing so would lead to providing a measure of how reliable the automatic filling is based on the data difference (Djabarov, col 6, ln 62– col 9, ln 3).
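For illustration only, the claimed sequence discussed above (applying candidate arithmetic functions to training data to produce test values, determining the difference from the corresponding training values, and deriving a confidence score from that difference) might be sketched as follows. This sketch is not taken from the specification or from any cited reference: the candidate set `CANDIDATES`, the function `score_candidates`, the `tolerance` parameter, and the sample rows are all hypothetical.

```python
import operator

# Hypothetical candidate arithmetic functions over two other data fields a and b.
CANDIDATES = {
    "a + b": operator.add,
    "a - b": operator.sub,
    "a * b": operator.mul,
}


def score_candidates(rows, tolerance=0.0):
    """Score each candidate by the share of rows it reproduces.

    `rows` is a list of (a, b, target) tuples, where `target` is the value
    recorded in the training data for the field being learned.
    """
    if not rows:
        raise ValueError("at least one training row is required")
    scores = {}
    for name, fn in CANDIDATES.items():
        # Difference between each test value and the corresponding training value.
        diffs = [abs(fn(a, b) - target) for a, b, target in rows]
        # Confidence score: fraction of rows whose difference is within tolerance.
        scores[name] = sum(d <= tolerance for d in diffs) / len(rows)
    return scores


rows = [(10, 4, 6), (7, 2, 5), (3, 3, 0)]
scores = score_candidates(rows)
best = max(scores, key=scores.get)
```

With these sample rows only the subtraction candidate reproduces every training value, so it receives the highest score and would be the function selected under this sketch.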

Regarding claim 57,
Claim 57 is a computer-readable storage medium claim corresponding to the method claim 49, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 49. Note that Djabarov teaches a computer-readable storage medium and processors ([fig 3] “processor” and “memory”).

Regarding claim 59,
Claim 59 is a computer-readable storage medium claim corresponding to the method claim 51, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 51.

Regarding claim 64,
Claim 64 is a computer-readable storage medium claim corresponding to the method claim 56, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 56.

Regarding claim 65,
Claim 65 is a system claim corresponding to the method claim 49, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 49. Note that Djabarov teaches processors and memory ([fig 3] “processor” and “memory”).

Regarding claim 67,
Claim 67 is a system claim corresponding to the method claim 51, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 51.

Regarding claim 72,
Claim 72 is a system claim corresponding to the method claim 56, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 56.

Claims 52, 60 and 68 are rejected under 35 U.S.C. 103 as being unpatentable over Toda et al. (A Probabilistic Approach for Automatically Filling Form-Based Web Interfaces) in view of Djabarov (US 8,214,362 B1), further in view of Hermens et al. (A Machine-Learning Apprentice for the Completion of Repetitive Forms), further in view of Klappert et al. (US 2016/0117542 A1).

Regarding claim 52, 
Toda, Djabarov and Hermens teach claim 49. 

Toda further teaches 
the first confidence score indicates a number [based on] the first test values and the corresponding values from the training data (Toda [fig 4]; [sec 4] “A graph representing a Value Style Model SM(Fj) is generated using the encodings of all symbol mask sequences found in values previously entered for field Fj. … 
[Equation image: media_image3.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
[Equation image: media_image4.png]
 (8) … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; “probability” reads on “confidence score”.).

However, Toda, Djabarov and Hermens do not teach
the first confidence score indicates a number of matches between the first test values and the corresponding values from the training data.

Klappert teaches
the first confidence score indicates a number of matches between the first test values and the corresponding values from the training data (Klappert [figs 6-7]; [pars 2-19] “In some embodiments, when control circuitry identifies the user profile to which the print corresponds, control circuitry may query a database. While querying the database, control circuitry may cross-reference characteristics of the detected print against entries in a database to identify an entry that corresponds to the characteristics of the print. While cross-referencing the characteristics, control circuitry may analyze features of the print (e.g., whorls, etc.) and utilize the features as a means of uniquely identifying the person to whom the print corresponds. … A level of confidence may correspond to a number, percentage, or ratio of corresponding features of a print to features of a user profile.”; “entry that corresponds to the characteristics of the print” reads on “matches”. Note that Toda, Djabarov and Hermens teach “the first confidence score indicates a number [based on] the first test values and the corresponding values from the training data”.);
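For illustration only, a confidence score expressed as a number or ratio of matches between test values and the corresponding training values, in the sense of the "number, percentage, or ratio" language quoted above from Klappert, might be sketched as follows. This is a minimal sketch and not Klappert's implementation (which concerns print features and user profiles): the function name `match_confidence` and the sample values are hypothetical.

```python
def match_confidence(test_values, training_values):
    """Return (number_of_matches, ratio_of_matches) for paired values.

    The two sequences are compared position by position, so they must
    have the same length.
    """
    if len(test_values) != len(training_values):
        raise ValueError("test and training values must be paired one-to-one")
    # Count the positions where the test value equals the training value.
    matches = sum(1 for t, v in zip(test_values, training_values) if t == v)
    ratio = matches / len(test_values) if test_values else 0.0
    return matches, ratio


test_values = [100, 250, 75, 40]
training_values = [100, 250, 80, 40]
confidence = match_confidence(test_values, training_values)
```

Here three of the four test values equal their training counterparts, so the score can be reported either as the count 3 or as the ratio 0.75.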

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the electronic document management system of Toda, Djabarov and Hermens with the matches between test values and training values of Klappert. Doing so would lead to providing a confidence level between two different data sets based on their matches (Klappert, pars 2-19).

Regarding claim 60,
Claim 60 is a computer-readable storage medium claim corresponding to the method claim 52, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 52.

Regarding claim 68,
Claim 68 is a system claim corresponding to the method claim 52, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 52.

Claims 54, 62 and 70 are rejected under 35 U.S.C. 103 as being unpatentable over Toda et al. (A Probabilistic Approach for Automatically Filling Form-Based Web Interfaces) in view of Djabarov (US 8,214,362 B1), further in view of Hermens et al. (A Machine-Learning Apprentice for the Completion of Repetitive Forms), further in view of Byron et al. (US 2015/0046785 A1).

Regarding claim 54, 
Toda, Djabarov and Hermens teach claim 49. 


However, Toda, Djabarov and Hermens do not teach
the one or more data fields are selected based on natural language parsing data and historical form analysis.

Byron teaches 
the one or more data fields are selected based on natural language parsing data and historical form analysis ([Fig 10]; [par 4] “The method further comprises determining, for the at least one portion, a functional dependency of the at least one portion of the tabular data on one or more other portions of the tabular data.”; [pars 19-30] “One such processing operation that may be applied is natural language processing (NLP) of a document that contains a table data structure.”; [pars 115-119] “Moreover, various artificial intelligence and machine learning methods may be employed to assist in identifying suspicious cells within the table structure. For example, features extracted during natural language processing of a document and the table structure, such as markup language information, layout information, functional clues, and the like, may be used to infer functional dependencies within the table structure. This facilitates training and using machine learning models which can then be used to signal the presence of a functional dependency within a table structure.”; “natural language processing (NLP) of a document that contains a table data structure” reads on “natural language parsing data”. In addition, “training and using machine learning models” reads on “historical form analysis”.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the electronic document management system of Toda, Djabarov and Hermens with the natural language parsing data and historical form analysis of Byron. Doing so would lead to finding correct functional dependencies by confirming or supporting the hypothesis over dependent field values (Byron, [pars 92-122]).

Regarding claim 62,
Claim 62 is a computer-readable storage medium claim corresponding to the method claim 54, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 54.

Regarding claim 70,
Claim 70 is a system claim corresponding to the method claim 54, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 54.

Response to Arguments
Applicant's arguments filed on 09/03/2021 have been fully considered but they are not persuasive.
Applicant asserts 
“Neither Toda, Djabarov, nor Hermens, whether considered individually or in combination, discloses each and every element of claim 49, and therefore Applicant's claim 49 is patentable over the proposed combination of Toda, Djabarov, and Hermens. 
The BPAI (now PTAB) emphasized in a post-KSR ruling that "obviousness requires a suggestion of all limitations in a claim." 1 Further, the U.S. Supreme Court has ruled that a "patent composed of several elements is not proved obvious merely by demonstrating that each of its elements was, independently, known in the prior art." 2 Indeed, to establish a prima facie case of obviousness, "[a]ll words in a claim must be considered in judging the patentability of that claim against the prior art." 3 This means that the Examiner cannot disregard key features of a claim when formulating a rejection of the claim under 35 U.S.C. § 103. 
tive functions, and the combination would have yielded nothing more than predictable results to one of ordinary skill in the art." MPEP § 2143.02, quoting KSR Int'l v. Teleflex Inc. (emphasis added). 
Specifically, neither Toda, Djabarov, nor Hermens, whether considered individually or in combination, discloses "generating, by a machine learning module, a first plurality of arithmetic functions to provide data values for a first data field of a new form in the electronic document preparation system, wherein the data values are arithmetically dependent on one or more data fields, other than the first data field of the new form," as recited in claim 49, and therefore Applicant's claim 49 is patentable over the cited references. 
First, as the Examiner acknowledges on page 9 of the Office Action, Toda does not teach generating functions to provide data values which are "dependent on one or more data fields, other than the first data field of the new form." 
The Examiner cites to Hermens as allegedly disclosing the generation of such functions. 
Hermens describes how a simple form filling system failed to predict values in form fields which depend on calculations from other fields in the form. See Hermens at page 5 (under "Difficulties"). For example, a field for earned sick leave may depend on 
Thus, one of ordinary skill in the art would not modify Toda and Djabarov to incorporate features of Hermens for "generating . . . a first plurality of arithmetic functions to provide data values for a first data field . .. wherein the data values are arithmetically dependent on one or more data fields, other than the first data field," as recited in Applicant's claim 49. Indeed, Hermens teaches that the use of machine learning techniques to predict the values of such fields was not possible using the disclosures of Hermens. 
The Examiner has cited to no disclosures of Toda or Djabarov which disclose or suggest the generation of arithmetic functions for learning fields containing data values which are "arithmetically dependent on one or more data fields, other than the first data field," as recited in Applicant's claim 49. Indeed, Toda and Djabarov both relate primarily to the prediction of strings and simple numerical inputs such as social security numbers and the like. See, e.g., Toda at 1. Introduction (describing problem as extracting values from free text in order to fill fields of a form); Djabarov at col. 6:27-58 and FIGS. 6-7 (showing and describing an interface for receiving user submitted autofill information, and a table containing such information). 

Claims 51, 52, 54, and 56 depend, directly or indirectly, from claim 49, and are therefore patentable over the cited references for at least the same reasons as claim 49.” (Remarks, pg 11)

Examiner’s response:
The examiner respectfully disagrees. 

As detailed in the rejections, Toda and Hermens, in combination, still teach “generating, by a machine learning module, a first plurality of arithmetic functions to provide data values for a first data field of a new form in the electronic document preparation system, wherein the data values are arithmetically dependent on one or more data fields, other than the first data field of the new form” since Toda teaches “generating, by a machine learning module, a first plurality of arithmetic functions to provide data values for a first data field of a new form in the electronic document preparation system, wherein the data values are arithmetically dependent on one or more data fields other than the first data field of the new form” except “other than the first data field”, and Hermens teaches “the data values are arithmetically dependent on one or more data fields, other than the first data field of the new form.” (Regarding arithmetic operators, please refer to par 9 of the specification as well.) In other words, Toda teaches generating arithmetic functions to provide data values based on multiple data 

In addition, Hermens, as detailed in the rejections, discloses the form filling based on functional relationships and also discloses that its implementation is possible, instead of teaching away from it, as stated: “A simple spreadsheet program can handle such conditional formulas, but it would require explicit programming by the user. We want to avoid programming systems for complex rules like these, so the goal remains to provide an agent capable of learning such rules.” (With emphasis underlined) Thus, Hermens definitely teaches “the data values are arithmetically dependent on one or more data fields, other than the first data field of the new form” in combination with Toda.

For more details, see the rejections. Thus, the examiner’s rejections are reasonable and proper.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEHWAN KIM whose telephone number is (571)270-7409.  The examiner can normally be reached on Mon - Thu 7:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ALEXEY SHMATOV can be reached on 571-270-3428.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/S.K./Examiner, Art Unit 2123




/LUIS A SITIRICHE/Primary Examiner, Art Unit 2126