DETAILED ACTION
Claims 1-20 are pending in this action.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
All 112 2nd rejections for indefinite terms or phrases are withdrawn.
112 2nd rejection for antecedent basis is withdrawn.

Response to Arguments
Applicants’ arguments were considered, but are unpersuasive.
Applicants allege that reading metadata is not equivalent to defining metadata.  Remarks at 9.  Applicants therefore conclude that Rehal, which discloses reading from a table in [0113], does not show defining metadata.
Regarding the allegation of non-equivalence: yes, under an extremely strict reading of the claim terms.  Reading is not mutually exclusive with defining (e.g., parsing and tokenizing data certainly defines tokens), but the two are not necessarily equivalent either.

In the art of computing, defining something is to create that thing or identify it as that thing (see footnote 1).  The question is whether Rehal does so.

Looking to the very next step in claim 1, the step states “storing the metadata and the semantic objects in a semantic object repository.”  Thus, the mapping already indicates that the contents of the data lake are metadata.

Looking to Rehal, [0113] is a superparagraph that precedes a list, and explains that “the following information is collected for each column of each table to be imported.”  Thus, [0113] does not just signal an intent to read, but also an intent to import particular information into the data lake (for example, the information expressed in [0114]-[0125]).  Consequently, it discloses defining metadata.

Applicants allege that, under the limitation “semantic objects represents data objects in the data definitions,” a semantic object is more than merely ‘data stored as document data objects (e.g. JSON documents).’  Remarks at 9 (different emphasis added by the examiner).
As claimed and argued, applicants’ limitation carries no patentable weight, because (a) it is directed to printed matter (see the subsequent “storing” step), and (b) the limitation is directed to the expression (the semantic object representation goes to the expressive content of the stored data), and is thus non-functional descriptive material.

To clarify, this does not mean that the term “representation” is a patentable-weight poison pill.  For example, “This End Up” printed on a box is a representation of an orientation, and the impartation of an absolute direction by which to measure relative orientation imparts new functionality to the substrate (namely, the ability to transport delicate goods) (see footnote 2).  But that is not how the features of the “semantic object” are being argued here.

Applicants allege that semantic objects are selected in the query designer interface, and a query is received based on the selection.  Remarks at 9.  Applicants distinguish this from Rehal by alleging that Rehal does not “permit[] selection of generic data entities” much less “semantic objects.”
Disclosure to support rejections under 102 or 103 may be implicit.  MPEP 2112; see also In re Preda, 159 USPQ 342, 344 (CCPA 1968).

The claim requires:
(1) providing…the metadata representing the semantic objects; and
(2) receiving…query is based on selected ones of the semantic objects.

Firstly, no affirmative step of selection is actually required by the claim, so the alleged distinction is moot.

Secondly, the very first line of Rehal [0024], the paragraph cited for the receiving step, expressly states “[t]he query builder interface may comprise an interface for selection of data entities used in the query.”  So even if selection of data entities were affirmatively required by the claim, Rehal discloses that much.


(1) Those are not mutually exclusive concepts.  Moreover, per Rehal [0058], there are many embodiments to how data is stored.  Some of these are table-oriented, and some are not.

(2) The rejection is based on the reference as a whole, and is not confined to the specific paragraphs cited.  Said paragraphs are provided merely as a guide for applicants’ review.  Per Rehal [0050], Fig. 23 shows the interface of the query builder, and shows that the selection is more granular than whole tables.


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1, 3-5, 7, 8, 10-12, 14, 15, 17-19 is/are rejected under 35 U.S.C. 102(a)(1),(2) as being anticipated by Rehal (US 2018/0095952 A1) hereinafter Rehal, evidenced by:
LanguageManual Indexing (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Indexing) hereinafter HiveIndexing;
File Formats and Compression (https://cwiki.apache.org/confluence/display/Hive/FileFormats) hereinafter FileFormats;
and the dates of Apache Hive versions evidenced by Downloads (https://hive.apache.org/downloads.html) hereinafter HiveDownloads.
With respect to claim 1, Rehal discloses 
A computer-implemented method, comprising: 
acquiring, from a plurality of data sources, data definitions defining data structures ([0064] use of hive metastore), the data definitions including information for fields, semantics, and data relationships ([0064] Hive metastore provides for tables, columns, and a description); 
defining, using the data definitions, metadata ([0113] shows collecting metadata from sources.) for semantic objects representing data objects in the data definitions ([0058] document objects in a “data lake”); 
storing the metadata and the semantic objects in a semantic object repository ([0127] metadata repository may be in data lake; [0058] objects corresponding to data from data source stored in data lake); 
receiving a request for creating a query ([0023] shows providing a query builder tool; thus, use of the tool is the expected manner of using the system and under the principles of inherency, this step is disclosed);

receiving, from the query designer interface, a query based on selected ones of the semantic objects ([0024] query builder interface permits selection of data entities); 
storing the query in a query repository (Fig. 24 shows query is stored.  Implicitly, whatever repository that stores the query is the query repository); and 
providing a runtime object for executing the query (implicit via Fig. 24 elements 2434, 2436, since query was tested and scheduled for execution).  
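
For illustration only, and not as a finding about Rehal or about applicants’ disclosure, the sequence of steps recited in claim 1 can be sketched as below.  This is a minimal sketch under stated assumptions: every class, function, and sample value (SemanticObjectRepository, acquire_definitions, the “Customer” and “Order” definitions, and so on) is hypothetical and is introduced only to show the order of the steps.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticObjectRepository:
    """Hypothetical in-memory stand-in for a semantic object repository."""
    metadata: dict = field(default_factory=dict)
    objects: dict = field(default_factory=dict)


def acquire_definitions(sources):
    """Collect data definitions (fields, semantics) from every source."""
    return [definition for source in sources for definition in source]


def define_metadata(definitions):
    """Derive per-object metadata from the acquired definitions."""
    return {
        d["name"]: {"fields": d["fields"], "semantics": d["semantics"]}
        for d in definitions
    }


def build_query(selected, metadata):
    """Form a query from the semantic objects selected in the designer."""
    unknown = [name for name in selected if name not in metadata]
    if unknown:
        raise KeyError(f"not in repository: {unknown}")
    return {"select": {name: metadata[name]["fields"] for name in selected}}


# Hypothetical sample sources; names and values are invented for illustration.
sources = [
    [{"name": "Customer", "fields": ["id", "name"], "semantics": "party"}],
    [{"name": "Order", "fields": ["id", "customer_id"], "semantics": "transaction"}],
]
definitions = acquire_definitions(sources)

# Store the metadata and the semantic objects together.
repo = SemanticObjectRepository(
    metadata=define_metadata(definitions),
    objects={d["name"]: d for d in definitions},
)

query_repository = []                       # hypothetical query repository
query = build_query(["Order"], repo.metadata)
query_repository.append(query)              # store the query


def runtime_object():
    """Hypothetical runtime object: a callable that 'executes' the stored query."""
    return query["select"]
```

In this sketch the “runtime object” is simply a callable that returns the selected fields; an actual system could use any executable representation.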

Claim 8 is a Beauregard claim, with claim text similar to that of claim 1, and is mapped accordingly.  Additionally, Rehal discloses the non-transitory computer-readable medium (Fig. 26 element 2606)

Claim 15 is a system claim, with claim text similar to that of claim 1, and is mapped accordingly.  Additionally, Rehal discloses a system (Fig. 26, generally) comprising one or more computers (Fig. 26 element 2600); and one or more memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media (Fig. 26 element 2606 is both a memory device and stores instructions) storing one or more instructions…

With respect to claims 3, 10, 17, Rehal discloses the metadata is provided in a visual, hierarchal format (Fig. 17 shows metadata in hierarchy.  [0351] indicates that this is a UI, so is visual.).  



With respect to claims 5, 12, 19, Rehal discloses the semantic object repository is stored in a remote server or on premises (Fig. 1 shows data lake.  [0058] states that it can be stored in a distributed file system, so it can be stored either onsite or offsite).


With respect to claims 7, 14, Rehal discloses receiving the runtime object for executing the query (implicit to scheduling step in Fig. 24 element 2436); and 
executing the query against the semantic object repository and providing query results (implicit to scheduling step in Fig. 24 element 2436).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 9, 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Rehal as applied to claims 1, 8, 15, in view of How to use JSON with a Database (https://web.archive.org/web/20161205154220/https://www.quackit.com/json/tutorial/json_with_database.cfm) hereinafter JSONDatabase.
With respect to claim 2, Rehal teaches a variety of file formats being supported by the data lake including custom formats (See FileFormats, Custom InputFormat and OutputFormat).

Rehal does not directly teach the data definitions are stored in JAVASCRIPT OBJECT NOTATION (JSON) or extensible markup language (XML) format.  

JSONDatabase teaches that it was known to store database entries as JSON itself (JSONDatabase, “Some database management systems store data as JSON documents”).

Thus, the combination of teachings discloses the data definitions are stored in JAVASCRIPT OBJECT NOTATION (JSON) or extensible markup language (XML) format (JSON may be supported in Hive as a custom format).

Rehal and JSONDatabase are both directed to databases.  It would have been obvious to those skilled in the art at the time of filing to combine the teachings of the references to employ the advantages of JSON (e.g., compactness, which makes for faster parsing; readability by humans; and more flexibility in structure).
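
For illustration only, a data definition stored as a JSON document might look like the following sketch.  The keys mirror the “fields, semantics, and data relationships” language of claim 1, but the structure, names, and values are assumptions invented for this example and are not taken from Rehal, JSONDatabase, or applicants’ disclosure.

```python
import json

# Hypothetical data definition; all names and values are invented.
data_definition = {
    "name": "Order",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "customer_id", "type": "int"},
    ],
    "semantics": {"customer_id": "identifies the purchasing customer"},
    "relationships": [{"field": "customer_id", "references": "Customer.id"}],
}

# Serialize to a human-readable JSON document, then read it back.
serialized = json.dumps(data_definition, indent=2, sort_keys=True)
restored = json.loads(serialized)
```

The round trip through json.dumps and json.loads returns an equal structure, which is the property that makes JSON usable as a storage format for such definitions.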

Claims 6, 13, 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Rehal as applied to claims 1, 8, 15, in view of Baird et al. (US 2017/0169092 A1) hereinafter Baird.
Alternatively, the claims are rejected on the same grounds, and further in view of Official Notice.

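With respect to claim 6, Rehal does not directly teach performing an automated discovery of the semantics in the data definitions and characterizing the data definitions based on patterns in previously-acquired data definitions.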

However, the steps were known in the art.  For example, Baird teaches
performing an automated discovery of the semantics in the data definitions (Abstract, use of interactions in collaboration of multi-dimensional data model, and model objects are validated via semantic rules; Fig. 1B elements 124, 136 show that the model is of the Metastore.  Thus, the creation and validation of model objects requires semantic discovery) and 
characterizing the data definitions based on patterns in previously-acquired data definitions ([0057] model may be built via machine learning algorithms).

Rehal and Baird are directed to systems relying on Apache Hive metastore.  It would have been obvious to those of ordinary skill in the art at the time of filing to combine the teachings of the references in order to validate new objects in the metastore model.

Because (1) supervised learning is a known class of machine learning models, (2) it is well-known to use historical data to generate training data, and (3) classification is a conventional use of machine learning, Baird’s recitation of generic machine learning discloses the limitations to those skilled in the art.  To the extent that applicants wish to dispute this and require specific findings of fact to traverse, Official Notice is taken of these facts.
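
For illustration only, the general idea of using historical, labeled data to classify new items can be sketched as below.  This is a deliberately simple token-vote classifier written for this example; it is not Baird’s algorithm or applicants’ method, and every name, label, and sample column (tokens, train, classify, “contact”, “amount”) is hypothetical.

```python
from collections import Counter, defaultdict
import re


def tokens(column_name):
    """Split a column name such as 'cust_email_addr' into lowercase tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", column_name.lower()) if t]


def train(labeled_columns):
    """Count how often each token appeared under each label in past definitions."""
    counts = defaultdict(Counter)
    for column_name, label in labeled_columns:
        for token in tokens(column_name):
            counts[token][label] += 1
    return counts


def classify(column_name, counts, default="unknown"):
    """Vote over the labels previously seen with each token of the new name."""
    votes = Counter()
    for token in tokens(column_name):
        votes.update(counts.get(token, {}))
    return votes.most_common(1)[0][0] if votes else default


# Hypothetical previously-acquired definitions with hand-assigned labels.
history = [
    ("customer_email", "contact"),
    ("email_addr", "contact"),
    ("order_total", "amount"),
    ("invoice_total", "amount"),
]
model = train(history)
```

With this sample history, a new column whose name shares a token with earlier “contact” columns is assigned that label, and a name with no known tokens falls back to the default.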



Remarks
All portions of all references cited in the course of prosecution of this application, in this or any previous office action, are hereby employed in support of the current rejections for clarity and to preserve their viability as evidence upon any future appeal.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
BenjaVR	https://stackoverflow.com/questions/50193859/how-to-use-historical-data-set-for-training-and-prospective-data-set-as-input-fo
	Example of using historical data as training data.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
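A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.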
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASON G LIAO whose telephone number is (571)270-3775.  The examiner can normally be reached on M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle can be reached on 571-272-4241.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/JASON G LIAO/
Primary Examiner, Art Unit 2156
16 Jan 21
    
        1 https://www.merriam-webster.com/dictionary/define
        2 But note that the expression “This End Up” would still not get weight.  It would have the same weight as a blue dot that is used to represent the top.