Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The filing date of the present application is 10/18/2016.
This action is in response to amendments and/or remarks filed on 10/19/2020. In the current amendments, claims 49, 51, 52, 56, 57, 59, 60, 64, 65, 67, 68, 70, and 72 have been amended, and claims 1-48, 50, 53, 55, 58, 61, 63, 66, 69 and 71 have been cancelled. Claims 49, 51, 52, 54, 56, 57, 59, 60, 62, 64, 65, 67, 68, 70, and 72 are pending and have been examined. 
In view of Applicant’s amendments and/or remarks, the objections to claim 70 made in the previous Office Action have been withdrawn.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 02/17/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10/19/2020 has been entered. 

Claim Objections
Claims 51, 59 and 67 are objected to because of the following informalities: it appears that the “determining …” step may need to be placed before the “generating, for the first function, …” step, based on a comparison with independent claims. Appropriate correction is required, if necessary.
Claim 51 is objected to because of the following informalities: it appears that “providing data for the first data field based the second confidence score” should read “providing data for the first data field based on the second confidence score”. Appropriate correction is required. (with emphasis underlined)
Claim 59 is objected to because of the following informalities: it appears that “providing data for the first data field based on the difference between the second confidence score” should read “providing data for the first data field based on the second confidence score”. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 49, 51, 52, 54, 56, 57, 59, 60, 62, 64, 65, 67, 68, 70, and 72 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement.
Claims 49, 57 and 65 recite “determining a difference between the first test value and corresponding values from the training data for the first data field of the new form; generating, for the first function, a first confidence score based on the difference between the first test value and the corresponding values from the training data.” However, the specification appears to be silent regarding a difference between the first test value and corresponding values from the training data. Instead, par 73 of the specification states “In one embodiment, the machine learning module 113 then generates matching data 127 by comparing the test data value for each copy of the form to the actual data value from the completed data field of that copy of the form.” In other words, while the paragraph describes a one-to-one comparison between the test data value and the actual data value for each copy of the form, the limitation recites a one-to-many comparison between the test data value and the corresponding values from the training data. This appears to change the scope of the claimed invention without support from the specification; therefore, the claims are rejected under 112(a) for lack of written description. (with emphasis underlined)
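The distinction drawn above between a one-to-one and a one-to-many comparison can be illustrated with a minimal sketch. It is not taken from the specification or the claims; the function names, the sample values, the exact-match test, and the absolute-difference measure are all assumptions made only for illustration. The first function compares each copy's test value with that same copy's actual value, as par 73 describes, while the second compares a single test value against the whole set of training values, as the claim language appears to permit.

```python
def one_to_one_matches(test_values, actual_values):
    """Compare each copy's test value with that same copy's actual value."""
    if len(test_values) != len(actual_values):
        raise ValueError("each copy needs exactly one test value and one actual value")
    return [t == a for t, a in zip(test_values, actual_values)]


def one_to_many_differences(test_value, training_values):
    """Compare a single test value against every training value (hypothetical reading)."""
    return [abs(test_value - v) for v in training_values]


# Hypothetical data: three copies of a form, one completed data field each.
actual = [100.0, 250.0, 75.0]
test = [100.0, 240.0, 75.0]

per_copy = one_to_one_matches(test, actual)              # one result per copy
against_all = one_to_many_differences(test[0], actual)   # one test value vs. all values
```

In the first case the result has one entry per copy of the form; in the second, one test value yields a difference for every training value, which is the broader comparison at issue.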
Claims 51, 56, 59, 64, 67 and 72 recite similar claim language and, for the same reason, are rejected under 112(a) for lack of written description.
Claims 49, 51, 56, 57, 59, 64, 65, 67 and 72 each recite limitations that raise written description issues as set forth above, and dependent claims 52, 54, 60, 62, 68 and 70 are rejected at least based on their direct and/or indirect dependency from independent claims 49, 57 and 65. Appropriate explanation and/or amendment is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 49, 51, 56-57, 59, 64-65, 67 and 72 are rejected under 35 U.S.C. 103 as being unpatentable over Toda et al. (A Probabilistic Approach for Automatically Filling Form-Based Web Interfaces) in view of Djabarov (US 8,214,362 B1), further in view of Hermens et al. (A Machine-Learning Apprentice for the Completion of Repetitive Forms).

Regarding claim 49, 
Toda teaches 
the method comprising:

generating, by a machine learning module, a first plurality of functions to provide a data value for a first data field of a new form in the electronic document preparation system, wherein the data value is dependent on one or more data fields … of the new form ([sec 4] “We have considered several alternatives for such combination, including the use of machine learning approaches, such as SVM [11], Genetic Programming [10], linear combination of values and the use of a Bayesian Network approach. … We model the computation of the probability of field fj given a segment Sab through a Bayesian belief network model similar to the one proposed by Ribeiro-Neto et al [17, 6] for ranking documents in search tasks. Our Bayesian network provides a graphical formalism for representing the probability model we develop allowing for a better visualization of how the features we consider lead to the final probability of a segment given a field. … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows”; [sec 4] “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input.”; “Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ” reads on “generating … a first plurality of functions” since each segment and its probability are considered for filling a field. 
In addition, “We model the computation of the probability of field fj given a segment Sab through a Bayesian belief network model” reads on “machine learning module”. Furthermore, “fill this form when a new text is given as input” reads on “new form” since “this form” may be filled out with a new text.);

selecting a first function from the first plurality of functions ([sec 4] “The iForm approach for dealing with the form filling problem consists of taking candidate segments from the input text and then estimating the probability of a field given each segment. … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; see also [sec 3]; “estimating the probability of a field given each segment” reads on “selecting a first function”.); 

generating a first test value based on the first function and based on training data generated from previously filled forms ([sec 2] “Thus, we only rely on information about input values entered for each field of the target form on previous submission made by the users.”; also see [sec 5] “previous submissions”; [sec 3] “The problem we face in this work is automatically filling out the fields of a given form-based interface with values extracted from a data-rich free text document, or portions of such documents. In particular, we identify two subproblems: the problems of (a) extracting values from the input text and (b) filling out the fields of the target form using them.”; [sec 4] “In the case of text boxes, we simply enter each mapped text segment as a value into its corresponding field.”; “enter each mapped text segment as a value into its corresponding field” reads on “generating a first test value”.); 

determining a [value calculated from] the first test value and corresponding values from the training data for the first data field of the new form ([fig 4]; [sec 4] as cited above, and “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input. We consider two types of features from these values: the values themselves and the tokens composing these values, which we call content related features; and the style (e.g., capitalization, punctuation, etc.), which we call the style related feature. We stress that no features from the input tests are considered. The style feature requires a more detailed explanation. Let SVj be the set of previous values entered for a field Fj. We automatically learn a Naive Hidden Markov Model SM(Fj), which we call Value Style Model, that captures the wording style of the values in SVj. This model is similar to the inner HMM used in [4], also used to capture the wording style of sequences.”; “Let SVj be the set of previous values entered for a field Fj” reads on “corresponding values from the training data for the first data field of the new form”.);

generating, for the first function, a first confidence score based on the [value calculated from] the first test value and the corresponding values from the training data (Toda [fig 4]; [sec 4] “A graph representing a Value Style Model SM(Fj) is generated using the encodings of all symbol mask sequences found in values previously entered for field Fj. … 
    [Equation image: media_image1.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
    [Equation image: media_image2.png]
 (8) … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; “probability” reads on “confidence score”.); and 

providing the data value for the first data field of the new form based on the first confidence score ([sec 4] “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input. … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows”);

However, Toda does not teach
A method performed by one or more processors of an electronic document preparation system:
the data value is dependent on one or more data fields, other than the first data field of the new form;
determining a difference between the first test value and corresponding values from the training data for the first data field of the new form;
generating, for the first function, a first confidence score based on the difference between the first test value and the corresponding values from the training data;
(Note: Hereinafter, if a limitation has one or more underlines, the one or more underlined claim languages indicate that they have not been taught yet, while the one or more non-underlined claim languages indicate that they have been taught already.)

Djabarov teaches
A method performed by one or more processors of an electronic document preparation system ([fig 3] “processor” and “memory”): 

a difference between the first test value and corresponding values from the training data for the first data field of the new form ([figs 8-10]; [col 6, ln 62– col 9, ln 3] “Upon display of the document to the user, the user may enter "John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored.”; “user entry” reads on “first test value”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “determining a difference between the first test value and corresponding values from the training data” since the user entry is compared with each attribute value of fig 7 for checking a match. Note that Toda also teaches “the first test value and corresponding values from the training data for the first data field of the new form”.);

generating, for the first function, a first confidence score based on the difference between the first test value and the corresponding values from the training data ([figs 8-10]; [col 6, ln 62– col 9, ln 3] “Upon display of the document to the user, the user may enter "John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored. … AutoFill engine 225 may require a predetermined confidence level for a identified mapping. For example, prior to indicating a mapping to AutoFill software 430, AutoFill engine 225 may require that 75% of all received mappings match the indicated mapping. Such a confidence interval or other statistical analysis, ensures that infrequently visited web documents or erroneously entered data does not unnecessarily skew the identified mappings, thereby resulting in incorrect data insertions.”; “user entry” reads on “first test value”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “difference between the first test value and the corresponding values from the training data” since the user entry is compared with each attribute value of fig 7 for checking a match. Furthermore, “confidence level” reads on “confidence score”. 
Note that Toda also teaches “generating, for the first function, a first confidence score based on the [value calculated from] the first test value and the corresponding values from the training data”.);
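The confidence requirement quoted from Djabarov (75% of all received mappings must match the indicated mapping) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the representation of each received mapping as an attribute-name string, and the default threshold parameter are hypothetical and are not taken from Djabarov.

```python
from collections import Counter


def confident_mapping(received_mappings, required_share=0.75):
    """Return the most frequent mapping if it meets the required share, else None.

    received_mappings: attribute names reported for one form field
    (a hypothetical representation chosen for this sketch).
    """
    if not received_mappings:
        return None
    attribute, count = Counter(received_mappings).most_common(1)[0]
    share = count / len(received_mappings)
    return attribute if share >= required_share else None


# Hypothetical reports for the form field "user data 1".
mostly_agree = ["first name"] * 8 + ["last name"] * 2    # 80% agree
too_scattered = ["first name"] * 7 + ["last name"] * 3   # 70% agree
```

With these inputs, the first list meets the 75% requirement and the second does not, so only the first yields a mapping.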

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the electronic document management system of Toda with the comparison between a test value and training values of Djabarov. Doing so would lead to providing a measure of how reliable the automatic filling is based on the data difference (Djabarov, col 6, ln 62– col 9, ln 3).

However, Toda and Djabarov do not teach
the data value is dependent on one or more data fields, other than the first data field of the new form.

Hermens teaches
other than the first data field of the new form ([sec “Difficulties”] “Other roadblocks to learning the leave report form include the complex formula for calculating earned sick leave, which is based on the Summer and Academic boxes, plus the %FTE box and a constant value of 8.0: 
If Month ∈ {Sep, Oct, Nov, Dec, Jan, Feb, Mar, Apr} and Academic is checked, then Sick-Leave-Hours-Earned-Or-Received is 8.0
If Month ∈ {May, Aug} and Academic is checked, and Summer is not checked, then Sick-Leave-Hours-Earned-Or-Received is 4.0
If Month ∈ {May, Aug} and Academic is checked, and Summer is checked, then Sick-Leave-Hours-Earned-Or-Received is (%FTE ∙ 8.0)
If Month ∈ {June, July} and Summer is checked, then Sick-Leave-Hours-Earned-Or-Received is (%FTE ∙ 8.0)”; Note that Toda and Djabarov teach “one or more data fields of the new form”.).
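The rule quoted from Hermens shows a field value that depends on other fields of the same form. A minimal sketch of that dependency follows. The function and parameter names are hypothetical, %FTE is assumed to be expressed as a fraction (1.0 for full time), and returning None for combinations the quoted rules do not cover is an assumption of this sketch rather than something Hermens states.

```python
ACADEMIC_MONTHS = {"Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr"}
TRANSITION_MONTHS = {"May", "Aug"}
SUMMER_MONTHS = {"June", "July"}


def sick_leave_hours_earned(month, academic, summer, pct_fte):
    """Apply the four quoted rules in order; pct_fte is a fraction such as 0.5."""
    if month in ACADEMIC_MONTHS and academic:
        return 8.0
    if month in TRANSITION_MONTHS and academic and not summer:
        return 4.0
    if month in TRANSITION_MONTHS and academic and summer:
        return pct_fte * 8.0
    if month in SUMMER_MONTHS and summer:
        return pct_fte * 8.0
    return None  # combination not covered by the quoted rules (assumption)
```

The value of the sick-leave field is thus computed from the Month, Academic, Summer and %FTE fields rather than from the sick-leave field itself.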

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the electronic document management system of Toda and Djabarov with the different dependent data fields of Hermens. Doing so would lead to solving considerably more difficult and complex problems based on current and previous electronic forms (Hermens, [sec “Difficulties”]).

Regarding claim 51, 
Toda, Djabarov and Hermens teach claim 49. 

Toda further teaches 
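selecting a second function from the first plurality of functions based on the [value calculated from] the first test value and the corresponding values from the training data for the first data field of the new form ([sec 4] “The iForm approach for dealing with the form filling problem consists of taking candidate segments from the input text and then estimating the probability of a field given each segment. … 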
    [Equation image: media_image1.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
    [Equation image: media_image2.png]
 (8) … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; see also [sec 3]; “estimating the probability of a field given each segment” with eqs (7)-(8) reads on “selecting a second function”.);

generating a second test value based on the second function and the training data ([sec 2] “Thus, we only rely on information about input values entered for each field of the target form on previous submission made by the users.”; also see [sec 5] “previous submissions”; [sec 3] “The problem we face in this work is automatically filling out the fields of a given form-based interface with values extracted from a data-rich free text document, or portions of such documents. In particular, we identify two subproblems: the problems of (a) extracting values from the input text and (b) filling out the fields of the target form using them.”; [sec 4] “Hence, the problem is finding a subset of value-field pairs in I without conflicts whose aggregate probabilities are maximum. Finding the optimal solution for this problem requires assessing all possible subsets – an exponential number. In practice, we use a simple greedy heuristic to find an approximate solution. First, we extract the pair with the highest probability from I and verify whether it presents conflict to any pair in M or not. … In the case of text boxes, we simply enter each mapped text segment as a value into its corresponding field.”; “finding a subset of value-field pairs in I without conflicts whose aggregate probabilities are maximum” reads on “generating a second test value” since different segment values may be estimated on a field.);  
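The greedy heuristic quoted from Toda (take the highest-probability value-field pair first, then skip any pair that conflicts with one already chosen) can be sketched as below. This is an illustrative reading only: the tuple representation of candidates, the inclusive token positions, and the function name are assumptions of this sketch, and Toda's probability computation (Eqs. 7 and 8) is not reproduced.

```python
def greedy_mapping(candidates, threshold):
    """Greedily assign at most one non-overlapping segment to each field.

    candidates: list of (field, start, end, probability), where the segment
    covers token positions start..end inclusive (hypothetical representation).
    """
    # Keep only candidates whose probability is above the threshold.
    kept = [c for c in candidates if c[3] > threshold]
    # Consider the most probable pairs first.
    kept.sort(key=lambda c: c[3], reverse=True)

    mapping = {}
    for field, start, end, prob in kept:
        if field in mapping:
            continue  # only a single segment per field
        overlaps = any(start <= e and s <= end for (s, e, _) in mapping.values())
        if overlaps:
            continue  # selected segments must not overlap
        mapping[field] = (start, end, prob)
    return mapping


# Hypothetical candidates for three fields.
example = [
    ("name", 0, 1, 0.9),
    ("name", 0, 2, 0.6),
    ("city", 1, 2, 0.8),
    ("city", 3, 3, 0.7),
    ("zip", 4, 4, 0.05),
]
result = greedy_mapping(example, threshold=0.1)
```

Here the higher-probability "city" candidate is skipped because it overlaps the segment already assigned to "name", so the next "city" candidate is used, and "zip" stays unfilled because its only candidate falls below the threshold.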

generating, for the second function, a second confidence score based on the [value calculated from] the second test value and the corresponding values from the training data (Toda [fig 4]; [sec 4] “A graph representing a Value Style Model SM(Fj) is generated using the encodings of all symbol mask sequences found in values previously entered for field Fj. … 
    [Equation image: media_image1.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
    [Equation image: media_image2.png]
 (8) … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; “probability” reads on “confidence score”.); 

determining a [value calculated from] the second test value and the corresponding values from the training data ([fig 4]; [sec 4] as cited above, and “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input. We consider two types of features from these values: the values themselves and the tokens composing these values, which we call content related features; and the style (e.g., capitalization, punctuation, etc.), which we call the style related feature. We stress that no features from the input tests are considered. The style feature requires a more detailed explanation. Let SVj be the set of previous values entered for a field Fj. We automatically learn a Naive Hidden Markov Model SM(Fj), which we call Value Style Model, that captures the wording style of the values in SVj. This model is similar to the inner HMM used in [4], also used to capture the wording style of sequences.”; “Let SVj be the set of previous values entered for a field Fj” reads on “corresponding values from the training data”.);

providing data for the first data field based the second confidence score ([sec 3] “The problem we face in this work is automatically filling out the fields of a given form-based interface with values extracted from a data-rich free text document, or portions of such documents. In particular, we identify two subproblems: the problems of (a) extracting values from the input text and (b) filling out the fields of the target form using them.”; [sec 4] “Hence, the problem is finding a subset of value-field pairs in I without conflicts whose aggregate probabilities are maximum. Finding the optimal solution for this problem requires assessing all possible subsets – an exponential number. In practice, we use a simple greedy heuristic to find an approximate solution. First, we extract the pair with the highest probability from I and verify whether it presents conflict to any pair in M or not. … In the case of text boxes, we simply enter each mapped text segment as a value into its corresponding field.”; “enter each mapped text segment as a value into its corresponding field” reads on “providing data for the first data field.”);  

Djabarov further teaches 
selecting a second function from the first plurality of functions based on the difference between the first test value and the corresponding values from the training data for the first data field of the new form ([figs 8-10]; [col 6, ln 62– col 9, ln 3] “Upon display of the document to the user, the user may enter "John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored.”; “user entry” reads on “first test value”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “difference between the first test value and the corresponding values from the training data” since the user entry is compared with each attribute value of fig 7 for checking a match. Note that Toda also teaches “selecting a second function from the first plurality of functions based on the [value calculated from] the first test value and the corresponding values from the training data for the first data field of the new form”.);

generating, for the second function, a second confidence score based on the difference between the second test value and the corresponding values from the training data ([figs 8-10]; [col 6, ln 62– col 9, ln 3] “Upon display of the document to the user, the user may enter "John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored. … AutoFill engine 225 may require a predetermined confidence level for a identified mapping. For example, prior to indicating a mapping to AutoFill software 430, AutoFill engine 225 may require that 75% of all received mappings match the indicated mapping. Such a confidence interval or other statistical analysis, ensures that infrequently visited web documents or erroneously entered data does not unnecessarily skew the identified mappings, thereby resulting in incorrect data insertions.”; “user entry” reads on “second test value”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “difference between the second test value and the corresponding values from the training data” since the user entry is compared with each attribute value of fig 7 for checking a match. Furthermore, “confidence level” reads on “confidence score”. 
Note that Toda also teaches “generating, for the second function, a second confidence score based on the [value calculated from] the second test value and the corresponding values from the training data”.);

determining a difference between the second test value and the corresponding values from the training data ([figs 8-10]; [col 6, ln 62– col 9, ln 3] “Upon display of the document to the user, the user may enter "John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored.”; “user entry” reads on “second test value”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “determining a difference between the second test value and corresponding values from the training data” since the user entry is compared with each attribute value of fig 7 for checking a match. Note that Toda also teaches “the second test value and corresponding values from the training data”.);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the electronic document management system of Toda, Djabarov and Hermens with the comparison between a test value and training values of Djabarov. Doing so would lead to providing a measure of how reliable the automatic filling is based on the data difference (Djabarov, col 6, ln 62– col 9, ln 3).

Regarding claim 56, 
Toda, Djabarov and Hermens teach claim 49. 

Toda further teaches 
generating, for a second data field of the new form, a second plurality of functions to provide a data value for the second data field ([sec 4] “Our Bayesian network provides a graphical formalism for representing the probability model we develop allowing for a better visualization of how the features we consider lead to the final probability of a segment given a field. … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows”; [sec 3] “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input.”; “Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ” reads on “generating … a second plurality of functions” since each segment and its probability are considered for filling a field.); 

selecting a function from the second plurality of functions  ([sec 4] “The iForm approach for dealing with the form filling problem consists of taking candidate segments from the input text and then estimating the probability of a field given each segment. … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; see also [sec 3]; “estimating the probability of a field given each segment” reads on “selecting a function”.);

generating a second test value based on the selected function of the second plurality of functions and the training data ([sec 2] “Thus, we only rely on information about input values entered for each field of the target form on previous submission made by the users.”; also see [sec 5] “previous submissions”; [sec 3] “The problem we face in this work is automatically filling out the fields of a given form-based interface with values extracted from a data-rich free text document, or portions of such documents. In particular, we identify two subproblems: the problems of (a) extracting values from the input text and (b) filling out the fields of the target form using them.”; [sec 4] “Hence, the problem is finding a subset of value-field pairs in I without conflicts whose aggregate probabilities are maximum. Finding the optimal solution for this problem requires assessing all possible subsets – an exponential number. In practice, we use a simple greedy heuristic to find an approximate solution. First, we extract the pair with the highest probability from I and verify whether it presents conflict to any pair in M or not. … In the case of text boxes, we simply enter each mapped text segment as a value into its corresponding field.”; “finding a subset of value-field pairs in I without conflicts whose aggregate probabilities are maximum” reads on “generating a second test value” since different segment values may be estimated on a field.); and

determining a [value calculated from] the second test value and corresponding values from the training data for the second data field of the new form ([fig 4]; [sec 4] as cited above, and “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input. We consider two types of features from these values: the values themselves and the tokens composing these values, which we call content related features; and the style (e.g., capitalization, punctuation, etc.), which we call the style related feature. We stress that no features from the input tests are considered. The style feature requires a more detailed explanation. Let SVj be the set of previous values entered for a field Fj. We automatically learn a Naive Hidden Markov Model SM(Fj), which we call Value Style Model, that captures the wording style of the values in SVj. This model is similar to the inner HMM used in [4], also used to capture the wording style of sequences.”; “Let SVj be the set of previous values entered for a field Fj” reads on “corresponding values from the training data for the second data field of the new form”.);

generating, for the selected function, a third confidence score based on the [value calculated from] the second test value and the corresponding values from the training data (Toda [fig 4]; [sec 4] “A graph representing a Value Style Model SM(Fj) is generated using the encodings of all symbol mask sequences found in values previously entered for field Fj. … 
    [equation image: media_image1.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
    [equation image: media_image2.png]
 (8) … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; “probability” reads on “confidence score”.); and 

providing the data value for the second data field of the new form based on the third confidence score ([sec 4] “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input. … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows”);

Djabarov further teaches 
determining a difference between the second test value and corresponding values from the training data for the second data field of the new form ([figs 8-10]; [col 6, ln 62 – col 9, ln 3] “Upon display of the document to the user, the user may enter “John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name” may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored.”; “user entry” reads on “second test value”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “determining a difference between the second test value and corresponding values from the training data” since the user entry is compared with each attribute value of fig 7 for checking a match. Note that Toda also teaches “the second test value and corresponding values from the training data for the second data field of the new form”.);

generating, for the selected function, a third confidence score based on the difference between the second test value and the corresponding values from the training data ([figs 8-10]; [col 6, ln 62 – col 9, ln 3] “Upon display of the document to the user, the user may enter “John” which matches the attribute value associated with the attribute name “first name” in table 700. Accordingly, a mapping of the form field identifier “user data 1” to the attribute name “first name” may be made for this form field in the web document. … a web document having a form field element identified by “answer1” may elicit a user entry of “John Doe”. In comparing the user entry with the content of table 700, it may be determined that form field element “answer 1” corresponds to a compound attributes name of “first name, last name”. This mapping may be stored. … AutoFill engine 225 may require a predetermined confidence level for a identified mapping. For example, prior to indicating a mapping to AutoFill software 430, AutoFill engine 225 may require that 75% of all received mappings match the indicated mapping. Such a confidence interval or other statistical analysis, ensures that infrequently visited web documents or erroneously entered data does not unnecessarily skew the identified mappings, thereby resulting in incorrect data insertions.”; “user entry” reads on “second test value”, and “content of table 700” reads on “corresponding values from the training data”. In addition, “matches” and “comparing the user entry with the content of table” read on “difference between the second test value and the corresponding values from the training data” since the user entry is compared with each attribute value of fig 7 for checking a match. Furthermore, “confidence level” reads on “confidence score”. 
Note that Toda also teaches “generating, for the selected function, a third confidence score based on the [value calculated from] the second test value and the corresponding values from the training data”.);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the electronic document management system of Toda, Djabarov and Hermens with the comparison between a test value and training values of Djabarov. Doing so would lead to providing a measure of how reliable the automatic filling is based on the data difference (Djabarov, col 6, ln 62– col 9, ln 3).
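
For illustration only, the greedy mapping quoted from Toda [sec 4] above (take the value-field pair with the highest probability, keep it only if it does not conflict with pairs already selected, assign a single segment to each field, and keep the selected segments non-overlapping) may be sketched as follows. The tuple layout `(field, a, b, probability)`, the function name, and the default threshold are assumptions made for this sketch and are not taken from Toda.

```python
def greedy_mapping(candidates, epsilon=0.0):
    """Greedily map fields to text segments by descending probability.

    `candidates` is a hypothetical list of (field, a, b, probability) tuples,
    where a..b are the inclusive token positions of segment Sab.
    Returns a dict: field -> (a, b, probability).
    """
    mapping = {}
    # Visit the most probable candidate pairs first.
    for field, a, b, prob in sorted(candidates, key=lambda c: c[3], reverse=True):
        if prob <= epsilon:
            continue  # only segments above the threshold are candidate values
        if field in mapping:
            continue  # constraint (1): a single segment per field
        # Constraint (2): the segment must not overlap any selected segment.
        overlaps = any(not (b < c or d < a) for (c, d, _) in mapping.values())
        if overlaps:
            continue
        mapping[field] = (a, b, prob)
    return mapping
```

In this sketch a lower-probability segment can still be assigned to a field when the field's best segment overlaps one that was already selected for another field.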

Regarding claim 57 
Claim 57 is a computer-readable storage medium claim corresponding to the method claim 49, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 49. Note that Djabarov teaches a computer-readable storage medium and processors ([fig 3] “processor” and “memory”).

Regarding claim 59 
Claim 59 is a computer-readable storage medium claim corresponding to the method claim 51, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 51.

Regarding claim 64
Claim 64 is a computer-readable storage medium claim corresponding to the method claim 56, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 56.

Regarding claim 65 
Claim 65 is a system claim corresponding to the method claim 49, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 49. Note that Djabarov teaches processors and memory ([fig 3] “processor” and “memory”).

Regarding claim 67 
Claim 67 is a system claim corresponding to the method claim 51, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 51.

Regarding claim 72
Claim 72 is a system claim corresponding to the method claim 56, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 56.

Claims 52, 60 and 68 are rejected under 35 U.S.C. 103 as being unpatentable over Toda et al. (A Probabilistic Approach for Automatically Filling Form-Based Web Interfaces) in view of Djabarov (US 8,214,362 B1), further in view of Hermens et al. (A Machine-Learning Apprentice for the Completion of Repetitive Forms), further in view of Klappert et al. (US 2016/0117542 A1).

Regarding claim 52, 
Toda, Djabarov and Hermens teach claim 49. 

Toda further teaches 
the first confidence score indicates a number [based on] the first test value and the corresponding values from the training data (Toda [fig 4]; [sec 4] “A graph representing a Value Style Model SM(Fj) is generated using the encodings of all symbol mask sequences found in values previously entered for field Fj. … 
    [equation image: media_image1.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
    [equation image: media_image2.png]
 (8) … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; “probability” reads on “confidence score”.).

However, Toda, Djabarov and Hermens do not teach
the first confidence score indicates a number of matches between the first test value and the corresponding values from the training data.

Klappert teaches
the first confidence score indicates a number of matches between the first test value and the corresponding values from the training data (Klappert [figs 6-7]; [pars 2-19] “In some embodiments, when control circuitry identifies the user profile to which the print corresponds, control circuitry may query a database. While querying the database, control circuitry may cross-reference characteristics of the detected print against entries in a database to identify an entry that corresponds to the characteristics of the print. While cross-referencing the characteristics, control circuitry may analyze features of the print (e.g., whorls, etc.) and utilize the features as a means of uniquely identifying the person to whom the print corresponds. … A level of confidence may correspond to a number, percentage, or ratio of corresponding features of a print to features of a user profile.”; “entry that corresponds to the characteristics of the print” reads on “matches”. Note that Toda, Djabarov and Hermens teach “the first confidence score indicates a number [based on] the first test value and the corresponding values from the training data”.).
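
For illustration only, a confidence level expressed as a number or ratio of matches, in the sense of the Klappert passage quoted above, may be sketched for a test value and previously entered values as follows. The function name, the exact-equality notion of a match, and the sample inputs are assumptions made for this sketch and are not taken from any cited reference.

```python
def match_confidence(test_value, training_values):
    """Return (number of matches, ratio of matches) for a test value.

    A "match" here is assumed to be exact equality between the test value
    and a previously entered (training) value.
    """
    matches = sum(1 for value in training_values if value == test_value)
    ratio = matches / len(training_values) if training_values else 0.0
    return matches, ratio
```

Either element of the returned pair could serve as the confidence score under these assumptions: the first is a raw count of matches, the second normalizes that count by the number of training values.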


Regarding claim 60 
Claim 60 is a computer-readable storage medium claim corresponding to the method claim 52, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 52.

Regarding claim 68 
Claim 68 is a system claim corresponding to the method claim 52, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 52.

Claims 54, 62 and 70 are rejected under 35 U.S.C. 103 as being unpatentable over Toda et al. (A Probabilistic Approach for Automatically Filling Form-Based Web Interfaces) in view of Djabarov (US 8,214,362 B1), further in view of Hermens et al. (A Machine-Learning Apprentice for the Completion of Repetitive Forms), further in view of Byron et al. (US 2015/0046785 A1).

Regarding claim 54, 
Toda, Djabarov and Hermens teach claim 49. 

However, Toda, Djabarov and Hermens do not teach
the one or more data fields are selected based on natural language parsing data and historical form analysis.

Byron teaches 
the one or more data fields are selected based on natural language parsing data and historical form analysis ([Fig 10]; [par 4] “The method further comprises determining, for the at least one portion, a functional dependency of the at least one portion of the tabular data on one or more other portions of the tabular data.”; [pars 19-30] “One such processing operation that may be applied is natural language processing (NLP) of a document that contains a table data structure.”; [pars 115-119] “Moreover, various artificial intelligence and machine learning methods may be employed to assist in identifying suspicious cells within the table structure. For example, features extracted during natural language processing of a document and the table structure, such as markup language information, layout information, functional clues, and the like, may be used to infer functional dependencies within the table structure. This facilitates training and using machine learning models which can then be used to signal the presence of a functional dependency within a table structure.”; “natural language processing (NLP) of a document that contains a table data structure” reads on “natural language parsing data”. In addition, “training and using machine learning models” reads on “historical form analysis”.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the electronic document management system of Toda, Djabarov and Hermens with the natural language parsing data and historical form analysis of Byron. Doing so would lead to finding correct functional dependencies by confirming or supporting the hypothesis over dependent field values (Byron, [pars 92-122]).

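Regarding claim 62
Claim 62 is a computer-readable storage medium claim corresponding to the method claim 54, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 54.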

Regarding claim 70
Claim 70 is a system claim corresponding to the method claim 54, and is directed to the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 54.

Response to Arguments
Applicant's arguments filed on 10/19/2020 have been fully considered but they are not persuasive.
Applicant asserts 
“Neither Toda, Bourdev, nor Hermens, whether considered individually or in combination, discloses each and every element of claim 49, and therefore Applicant's claim 49 is patentable over the proposed combination of Toda, Bourdev, and Hermens. Claim 49 has been amended to include features of claim 55. Therefore, the remarks below address the Office Action's rejection of claims 49 and 55. 
The BPAI emphasized in a post-KSR ruling that "obviousness requires a suggestion of all limitations in a claim." 1 Further, the U.S. Supreme Court has ruled that a "patent composed of several elements is not proved obvious merely by demonstrating that each of its elements was, independently, known in the prior art." 2 Indeed, to establish a prima facie case of obviousness, "[a]ll words in a claim must be considered in judging the patentability of that claim against the prior art." 3 This means that the Examiner cannot disregard key features of a claim when formulating a rejection of the claim under 35 U.S.C. § 103. 

The Office Action alleges that Toda, Bourdev, and Hermens discloses this feature of Applicant's claim 49, and states on page 18 of the Office Action:
…
Applicant respectfully disagrees. The portion of Bourdev relied upon by the Office Action is reproduced below for convenience:
…
Bourdev describes systems and techniques for autocompleting form fields. In one implementation, the techniques include observing values entered into form fields and generating likelihood assessments for possible values to be entered in a current form field based on the observed values.4 Bourdev describes a calibration technique where every time the users enters a value in a field, the engine compares a prediction value from a heuristic to a real value and rewards heuristics that predict the correct value and decreases the weight of others.5 
There is no language in the above cited passages of Bourdev that discloses or suggests "generating, for the first function, a confidence score based on a difference between the first test value and the corresponding values from the training data; and providing the data value for the first data field of the new form based on the confidence score," as recited in claim 49. Instead, Bourdev merely describes comparing a prediction value from a heuristic to a real value. Bourdev does NOT describe generating a 
None of the other cited references cure the critical deficiencies of Bourdev discussed above. 
Accordingly, because none of the cited references disclose or suggest "generating, for the first function, a confidence score based on a difference between the first test value and the corresponding values from the training data; and providing the data value for the first field of the new form based on the confidence score," as recited in claim 49, Applicant's claim 49 is patentable over the cited references. 
Claims 51-52, 54, and 56 depend, directly or indirectly, from claim 49, and are therefore patentable over the cited references for at least the same reasons as claim 49.” (Remarks, pg 10)

Examiner’s response:
The examiner respectfully disagrees. 

Toda and Djabarov in combination still teach the limitations recited below: a probability-based confidence value is generated from a segment value and the values previously entered for each field, and a predicted data value is provided for each field based on that confidence value, as follows:

generating, for the first function, a first confidence score based on the difference between the first test value and the corresponding values from the training data (Toda [fig 4]; [sec 4] “A graph representing a Value Style Model SM(Fj) is generated using the encodings of all symbol mask sequences found in values previously entered for field Fj. … 
    [equation image: media_image1.png]
  (7) If style is taken into account, information from the set of sequence models has to be added to the computation. In this case, the resulting formula is: 
    [equation image: media_image2.png]
 (8) … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows … In the second phase, if any field remains not mapped to a segment, we use the probabilities derived from the style-related features to try to find further assignments, using equation Eq. 8 to compute the probability of each field given each segment”; “probability” reads on “confidence score”. Note that Djabarov teaches generating a confidence score based on the difference between the test value and the corresponding values from the training data.); and 

providing the data value for the first data field of the new form based on the first confidence score ([sec 4] “The main idea behind iForm is to rely on information about previous values used for each field of a form to fill this form when a new text is given as input. … Let Cj be the set of segments Sab such that P(fj |sab) is above threshold ϵ. We say that Cj is a set of candidate values for field Fj. We aim at finding a mapping M between candidates values and fields in the form-based interface with a maximum aggregate probability, such that (1) only a single segment is assigned to each field and (2) the selected segments are non-overlapping, i.e., there are no segments Sab and Scd for a < c in the mapping such that b ≥ c. This is accomplished by means of a two-phase procedure as follows”);

For more details, see the rejections. Thus, the examiner’s rejections are reasonable and proper.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEHWAN KIM whose telephone number is (571)270-7409.  The examiner can normally be reached on Mon - Thu 7:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ALEXEY SHMATOV, can be reached at 571-270-3428.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/LUIS A SITIRICHE/Primary Examiner, Art Unit 2126