DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments regarding the independent claims have been fully considered but are not persuasive. Applicant argues that the cited references do not teach “evaluating the one or more features to detect a template associated with the information, wherein the template applies to multiple types of non-marked up content”. It is argued that Overell teaches a knowledge representation system that includes a knowledge base in which knowledge is represented in a structured format. Applicant continues by stating that Overell teaches that structured knowledge is extracted from unstructured text and added to the system, and that it uses a profile about entities to present a template to the user. Applicant argues that the profile templates are distinct from the claims since Overell’s templates are used for giving general information about a particular entity based on its class and the knowledge about the entity in that system. It is argued that the reference gives no indication that the disclosed profile templates are used for extracting structured knowledge from unstructured text.
In response, Examiner argues that the reference teaches learning structured content from unstructured content, and structured content can reasonably be interpreted as a template since it provides order to unorganized data. The reference analyzes unstructured text, and the output of that analysis is text with structure, i.e., the template. The structured knowledge is the template that applies to the non-marked up or unstructured text that is originally analyzed. Again, the claim language simply recites evaluating features to detect a template that applies to non-marked up content. As the title of 
Furthermore, the profiles of Overell as cited contain templates, which “define the contents of an information screen” [0102]. This again gives structure to content that is ordinarily non-marked up, again meeting the claim limitation. Applicant argues that claim 21 further emphasizes the deficiencies of the references. In response, Examiner argues that claim 21 recites the same subject matter as the independent claim, differing only in its use of a second template. The references show that templates are not limited to one and that it may be appropriate to choose a different template depending on the data. The rejection of claim 21 shows that Overell and Gulwani in combination meet the limitations.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 6, 10, 12, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Overell et al. US 2011/0307435 in view of Gulwani, Sumit, William R. Harris, and Rishabh Singh. "Spreadsheet data manipulation using examples." further in view of Alur, Rajeev, et al. "Syntax-guided synthesis."
Regarding claim 1, Overell teaches “a computer-implemented method comprising: receiving information” ([0094] “Knowledge addition in the preferred embodiment is achieved by a number of "processes" which interact with general internet users via a sequence of web pages containing prompts, text input boxes and buttons. These processes receive, check and refine the answers provided by users and include confirmation pages”)
“tagging one or more features of the information using a structure construct of a hierarchical organization” ([0622] “All other document nodes inherit from the document_node class. When parsing the template, all nodes that don't have special behaviour associated with them (including all XHTML nodes) are created as instances of document_node.”)
“evaluating the one or more features to detect a template associated with the information” ([0102] “This is implemented in the preferred embodiment via a collection of profile templates which define the contents of an information screen and what queries need to be run to populate it”), “wherein the template applies to multiple types of non-marked up content” ([0024] “methods, systems, and computer program products are provided for extracting structured knowledge from unstructured text for use in a knowledge representation system”)
Overell, however, does not explicitly teach the remaining features. Gulwani, however, teaches
“evaluating the one or more features to detect a template associated with the information” (pg. 2 §3.1 “A predicate Match(vi, r, k) is satisfied if and only if vi contains at least k nonoverlapping matches of regular expression r. (In general, any finite set of predicates can be used.)” Wherein a predicate match is detecting a template which relates to the layout) “wherein the template comprises layout data relating to the information […]” (pg. 97 right col. ¶ above §2 “We also describe an application of this methodology to perform layout transformations on tables (Section 5).” and subsequently pg. 103 §5.1 ¶3 “For the tables in Example 5, the filter program F1 = Filter(λc.(c.data ≠ “” ∧ c.col ≠ 1 ∧ c.row ≠ 1), SEQ3,3,1) maps each date, that is, each nonempty cell not in column 1 and not in row 1, to its corresponding cell in column 3 of the output table, starting at row 1. Call this map mF1.” This comprises layout data since the program is able to manipulate and change it)
“based on the template detected, selecting a learned program” (pg. 98 top of left col. “(i) Generate learns the set of all programs, represented using data structure D, that are consistent with a given single example. (ii) Intersect intersects these sets (each corresponding to a different example)” and pg. 98 right col. §3 ¶ under Example 1 “Such tasks can be automated by applying a program that performs syntactic string transformations.”),
“applying the learned program to extract data from the information” (pg. 102 ¶1 “The expression f6 looks up the Markup percentage of the item from the MarkupRec table and f2 generates a substring of this lookup value by extracting the first numeric token (thus removing the % sign)” and pg. 97 right col. ¶1 “we developed a programming by example (PBE), or inductive synthesis, methodology [15] that has produced synthesizers that can automatically generate a wide range of string/table manipulating programs in spreadsheets from input–output examples.”)
Therefore it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Overell with that of Gulwani since “Spreadsheet systems, like Microsoft Excel, come with a maze of features, but end users struggle to find the correct features to accomplish their tasks” pg. 98 §1 ¶2.
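For illustration only, the Match(vi, r, k) predicate quoted from Gulwani above can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not code from the reference; the function name match_predicate and the sample strings are hypothetical and chosen only for this example.

```python
import re

def match_predicate(value: str, pattern: str, k: int) -> bool:
    """Return True if `value` contains at least k non-overlapping matches of `pattern`."""
    # re.finditer scans left to right and yields non-overlapping matches.
    count = sum(1 for _ in re.finditer(pattern, value))
    return count >= k

# Hypothetical input: a date-like string containing three runs of digits.
print(match_predicate("10/12/2010", r"\d+", 3))  # True
print(match_predicate("10/12/2010", r"\d+", 4))  # False
```

A set of such predicates, evaluated over an input string, acts as a coarse classifier of the input's layout, which is the sense in which the rejection reads the predicate match on detecting a template.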
Neither reference, however, teaches the remaining limitations. Alur, however, teaches “wherein evaluating the one or more features comprises applying heuristic machine learning processing to a plurality of stored templates” (pg. 5 ¶2 “Inductive synthesizers generalize from examples by searching a restricted space of programs. In machine learning, this restricted space is called the concept class, and each element of that space is often called a candidate concept. The concept class is usually specified syntactically. Inductive learning is thus a natural fit for the syntax-guided synthesis problem introduced in this paper: the concept class is simply the set L of permissible expressions”)
“[…]wherein the synthesizing comprising applying inductive synthesis processing to generate one or more abstractions from the user-provided examples” (pg. 5 left col. ¶2 “Inductive synthesizers generalize from examples by searching a restricted space of programs. In machine learning, this restricted space is called the concept class, and each element of that space is often called a candidate concept. The concept class is usually specified syntactically. Inductive learning is thus a natural fit for the syntax-guided synthesis problem introduced in this paper: the concept class is simply the set L of permissible expressions”)
Therefore it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Overell and Gulwani with that of Alur since “Compared to the classical formulation of the synthesis problem that involves only the correctness specification, the syntax-guided version has many potential benefits.” pg. 2 left col. ¶2.
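For illustration only, the idea quoted above of generalizing from examples by searching a restricted space of programs, together with the Generate and Intersect steps quoted from Gulwani, can be sketched with a toy enumerative search. This is not the algorithm of either reference: the concept class CONCEPT_CLASS, its four string programs, and the function names generate and synthesize are assumptions made for this sketch, and the cited works use far more compact representations than an explicit set of candidates.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical concept class: a small, fixed set of named string programs.
CONCEPT_CLASS: Dict[str, Callable[[str], str]] = {
    "upper": str.upper,
    "lower": str.lower,
    "first_token": lambda s: s.split()[0] if s.split() else "",
    "strip_percent": lambda s: s.rstrip("%"),
}

def generate(example: Tuple[str, str]) -> Set[str]:
    """Names of all candidate programs consistent with one input-output example."""
    inp, out = example
    return {name for name, prog in CONCEPT_CLASS.items() if prog(inp) == out}

def synthesize(examples: List[Tuple[str, str]]) -> Set[str]:
    """Intersect the per-example candidate sets to keep programs consistent with all examples."""
    consistent = set(CONCEPT_CLASS)
    for example in examples:
        consistent &= generate(example)
    return consistent

# Hypothetical user-provided examples: remove the trailing percent sign.
print(synthesize([("25%", "25"), ("7%", "7")]))  # {'strip_percent'}
```

With a single ambiguous example, several candidates survive; each further example narrows the set, which is why additional user-provided examples matter in this style of synthesis.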
Regarding claims 10 and 19, Overell teaches “a system comprising: a memory; and at least one processor operatively connected with the memory, configured to execute operations comprising” ([0024] “According to one class of embodiments, methods, systems, and computer program products are provided for extracting structured knowledge from unstructured text for use”)
“detecting a template associated with information including non-marked up content[…]that compares the information with a plurality of stored templates” ([0503] “If there are sequences still to be examined, the next one is selected (step 906) and all translation templates that might translate this sequence are then looked up (step 908).”) “wherein the template applies to multiple types of non-marked up content” ([0024] “methods, systems, and computer program products are provided for extracting structured knowledge from unstructured text for use in a knowledge representation system”)
Overell, however, does not explicitly teach the rest of the limitations. Gulwani, however, teaches “by applying machine learning processing” (abstract “developing a synthesis algorithm that can learn programs in that language from user-provided examples”)
“determining, from a learned program pool comprising a plurality of learned programs” (pg. 97 right col. ¶1 “synthesizers that can automatically generate a wide range of string/table manipulating programs in spreadsheets from input–output examples” and pg. 98 top of left col. “Data structure for representing consistent programs: The number of programs in L that are consistent with a given set of input–output examples can be huge”)
“a learned program to apply based on the template detected” (pg. 98 top of left col. “(i) Generate learns the set of all programs, represented using data structure D, that are consistent with a given single example. (ii) Intersect intersects these sets (each corresponding to a different example)” and pg. 98 right col. §3 ¶ under Example 1 “Such tasks can be automated by applying a program that performs syntactic string transformations.”)
“applying the learned program to manipulate extracted data from the information” (pg. 102 ¶1 “The expression f6 looks up the Markup percentage of the item from the MarkupRec table and f2 generates a substring of this lookup value by extracting the first numeric token (thus removing the % sign)” and pg. 97 right col. ¶1 “we developed a programming by example (PBE), or inductive synthesis, methodology [15] that has produced synthesizers that can automatically generate a wide range of string/table manipulating programs in spreadsheets from input–output examples.”)
Therefore it would have been obvious to one having ordinary skill in the art at the time the invention was effectively filed to combine the teachings of Overell with that of Gulwani since “Spreadsheet systems, like Microsoft Excel, come with a maze of features, but end users struggle to find the correct features to accomplish their tasks” pg. 98 §1 ¶2.
Neither reference, however, teaches the remaining limitations. Alur, however, teaches
“[…]wherein the learned program is generated using inductive synthesis processing to generate one or more abstractions from user-provided examples” (pg. 5 left col. ¶2 “Inductive synthesizers generalize from examples by searching a restricted space of programs. In machine learning, this restricted space is called the concept class, and each element of that space is often called a candidate concept. The concept class is usually specified syntactically. Inductive learning is thus a natural fit for the syntax-guided synthesis problem introduced in this paper: the concept class is simply the set L of permissible expressions”)
Therefore it would have been obvious before the effective filing date of the claimed invention to combine the teachings of Overell and Gulwani with that of Alur since “Compared to the classical formulation of the synthesis problem that involves only the correctness specification, the syntax-guided version has many potential benefits.” pg. 2 left col. ¶2.
Note that independent claim 19 recites substantially the same subject matter as independent claim 10, differing only in embodiment, and as such is subject to the same rejection. The different embodiment is taught by Overell, including the computer-readable storage device ([0024] “According to one class of embodiments, methods, systems, and computer program products are provided for extracting structured knowledge from unstructured text for use”) and the processor, which would be inherent to any computing system such as the one of Overell.
Regarding claims 3 and 12, Gulwani further teaches “wherein the confidence level is determined by executing at least one of heuristic machine learning processing and machine learning processing for fingerprint template recognition” (pg. 101 Procedure “Intersect uses a greedy heuristic to minimize the number of partitions by starting with singleton partitions and then iteratively merging partitions that have the highest compatibility score, which is a function of the size of the resulting partition and its potential to be merged with other partitions.”)
Regarding claims 6 and 15, Gulwani further teaches “wherein the learned program is determined based on application machine learning processing comprising at least one of heuristic machine learning processing and machine learning processing for template recognition” (pg. 101 Procedure “Intersect uses a greedy heuristic to minimize the number of partitions by starting with singleton partitions and then iteratively merging partitions that have the highest compatibility score, which is a function of the size of the resulting partition and its potential to be merged with other partitions.”)
Regarding claim 21, the Overell, Gulwani, Alur, and Visan references have been addressed above. Overell further teaches “wherein the information is a first input, the template is a first template, the learned program is a first learned program, and the method further comprises receiving a second input including non-marked up content, wherein the second input is a different type than the first input” ([0094] “Knowledge addition in the preferred embodiment is achieved by a number of "processes" which interact with general internet users via a sequence of web pages containing prompts, text input boxes and buttons. These processes receive, check and refine the answers provided by users and include confirmation pages”)
Gulwani further teaches “evaluating one or more features of the second input to detect a second template associated with the second input, wherein the second template is the first template that applies to multiple types of non-marked up content” (Gulwani pg. 2 §3.1 “A predicate Match(vi, r, k) is satisfied if and only if vi contains at least k nonoverlapping matches of regular expression r. (In general, any finite set of predicates can be used.)” Wherein a predicate match is detecting a template which relates to the layout)
“based on the detected template: selecting a second learned program” (pg. 98 top of left col. “(i) Generate learns the set of all programs, represented using data structure D, that are consistent with a given single example. (ii) Intersect intersects these sets (each corresponding to a different example)” and pg. 98 right col. §3 ¶ under Example 1 “Such tasks can be automated by applying a program that performs syntactic string transformations.”),
“applying the second learned program to extract data from the second input” (pg. 102 ¶1 “The expression f6 looks up the Markup percentage of the item from the MarkupRec table and f2 generates a substring of this lookup value by extracting the first numeric token (thus removing the % sign)” and pg. 97 right col. ¶1 “we developed a programming by example (PBE), or inductive synthesis, methodology [15] that has produced synthesizers that can automatically generate a wide range of string/table manipulating programs in spreadsheets from input–output examples.”)
Note that this claim essentially amounts to performing the steps of claim 1 using a “second” input and a “second” learned program. Regardless of which program is used, the functionality is the same as in the independent claim, and thus the references are capable of meeting the limitations.
Claims 2, 7-9, 11, 16-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Overell et al. US 2011/0307435 in view of Gulwani, Sumit, William R. Harris, and Rishabh Singh. "Spreadsheet data manipulation using examples." further in view of Alur, Rajeev, et al. "Syntax-guided synthesis." and Visan et al. US 2008/0025555.

Regarding claims 2 and 11, the Overell, Gulwani, and Alur references have been addressed above. None explicitly teaches using a confidence level. Visan, however, teaches “wherein the detecting of the template further comprises determining a confidence level for matching a stored template with a template associated with the information, and selecting a template from the plurality of stored templates based on the confidence level” ([0073] “a numerical score 770D that relates a confidence level of all computations performed on the inspected document in relation to the chosen document template; A numerical value 770E indicating the threshold limit for passing or failing the inspection process.”)
It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Overell, Gulwani, and Alur with that of Visan since a combination of known methods would yield predictable results; that is, it is known in the art to use confidence levels when selecting an item. Thus, using confidence levels to select a template to match data would operate in a predictable manner by allowing more accurate templates to be selected.
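For illustration only, selecting a stored template based on a confidence level and a pass/fail threshold, as discussed above, can be sketched as follows. This is a minimal Python sketch and is not drawn from Visan or any other cited reference; the function name select_template, the template names, and the score values are hypothetical.

```python
from typing import Dict, Optional, Tuple

def select_template(scores: Dict[str, float], threshold: float) -> Tuple[Optional[str], float]:
    """Pick the highest-scoring stored template; return None if it falls below the threshold."""
    if not scores:
        return None, 0.0
    best = max(scores, key=scores.get)
    confidence = scores[best]
    if confidence < threshold:
        # No stored template matches with sufficient confidence.
        return None, confidence
    return best, confidence

# Hypothetical confidence scores for three stored templates.
scores = {"invoice": 0.91, "receipt": 0.40, "letter": 0.15}
print(select_template(scores, threshold=0.75))  # ('invoice', 0.91)
```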
Regarding claims 7 and 16, Visan further teaches “wherein the learned program is determined based on application of machine learning processing that runs the plurality of learned programs from the learned program pool and evaluates the extracted data from the plurality of learned programs using a confidence value associated with the extracted data” ([0057] “Knowledge base 300 (the contents of which are stored in storage devices 130 or 160) contains known templates for a variety of security documents 140 that are identified by a document signature.”)
Regarding claims 8 and 17, Visan further teaches “building the learned program pool comprising associating the plurality of learned programs with one or more of the stored templates” ([0054] “Knowledge base 300 (the contents of which are stored in storage devices 130 or 160) contains known templates for a variety of security documents 140 that are identified by a document signature”)
Regarding claims 9 and 18, Visan further teaches “wherein applying the learned program further comprises aggregating and exporting the extracted data into a collection of extracted values” (abstract “multiple features from a single document are assessed and scored separately from one another with a final aggregate or weighted score being provided to the user for the whole document”) and “outputting the collection of extracted values, wherein the outputting of the collection of extracted values comprises presenting the collection of extracted values as a data feed for use by other applications” ([0105] “It should be noted that the above methods may also be used to extract and compare not only the clearly visible features of a security document (e.g. microprinting, color of specific area, identifying indicia such as the maple leaf design) but also non-visible and hidden features as well”).
Regarding claim 20, Visan teaches “building the learned program pool comprising associating the plurality of learned programs with one or more of the stored templates” ([0054] “Knowledge base 300 (the contents of which are stored in storage devices 130 or 160) contains known templates for a variety of security documents 140 that are identified by a document signature”)
“outputting the extracted data manipulated based on application of the learned program” ([0105] “It should be noted that the above methods may also be used to extract and compare not only the clearly visible features of a security document (e.g. microprinting, color of specific area, identifying indicia such as the maple leaf design) but also non-visible and hidden features as well”).

Claims 4-5 and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Overell US 2011/0307435 in view of Gulwani, Sumit, William R. Harris, and Rishabh Singh. "Spreadsheet data manipulation using examples." further in view of Alur, and further in view of Vacariuc US 2011/0055748.
Regarding claims 4 and 13, the Overell, Gulwani, and Alur references have been addressed above. None explicitly teaches the limitation. Vacariuc, however, teaches “wherein when the confidence level is less than a threshold value, requesting a user to provide example operations for analyzing the information, and creating a new learned program from the example operations using program synthesis processing” ([0048] “If the degree of confidence of a match is not high then process 250 may seek further user-feedback”; user feedback, i.e., providing an example)
Therefore it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the Overell and Gulwani references with that of Vacariuc since a combination of known methods would yield predictable results; that is, utilizing a confidence level and threshold to determine whether feedback is necessary is known in the art and would operate normally on a set of data such as the data from the system of Visan and Gulwani.
Regarding claims 5 and 14, Gulwani further teaches “further comprising adding the new learned program to the learned program pool” (pg. 98 §2.1 ¶1 “If any output is incorrect, the user can fix it and reapply the synthesizer, using the fix as an additional example”).
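For illustration only, the flow addressed for claims 4-5 and 13-14 (when confidence falls below a threshold, request user examples, create a new learned program, and add it to the pool) can be sketched as follows. This is a minimal Python sketch and is not code from Vacariuc or Gulwani; the function handle_input, its parameters, and the stub callbacks in the usage lines are all hypothetical, and the stub synthesizer simply returns a fixed program rather than performing real program synthesis.

```python
from typing import Callable, Dict, List, Tuple

Program = Callable[[str], str]
Example = Tuple[str, str]

def handle_input(
    confidence: float,
    threshold: float,
    pool: Dict[str, Program],
    request_examples: Callable[[], List[Example]],
    synthesize: Callable[[List[Example]], Program],
    name: str,
) -> bool:
    """Return True if a new program was learned from user examples and added to the pool."""
    if confidence >= threshold:
        # Confidence is high enough; no user feedback is requested.
        return False
    examples = request_examples()      # ask the user for example operations
    pool[name] = synthesize(examples)  # add the newly learned program to the pool
    return True

# Hypothetical usage with stub callbacks.
pool: Dict[str, Program] = {}
learned = handle_input(
    confidence=0.30,
    threshold=0.75,
    pool=pool,
    request_examples=lambda: [("25%", "25")],
    synthesize=lambda examples: (lambda s: s.rstrip("%")),  # stub, not real synthesis
    name="strip_percent",
)
print(learned, pool["strip_percent"]("7%"))  # True 7
```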
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEVIN W FIGUEROA whose telephone number is (571)272-4623.  The examiner can normally be reached on Monday-Friday, 10AM-6PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, MIRANDA HUANG can be reached on (571)270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/Kevin W Figueroa/ Examiner, Art Unit 2124
/MIRANDA M HUANG/ Supervisory Patent Examiner, Art Unit 2124