Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claims 1-7 and 9-20 are pending. Claims 1, 5 and 13 are independent.  Independent Claims have different scopes.  Claim 8 is canceled.  New Claim 21 is added which depends from Claim 13.
This Application was published as U.S. 2019/0095444.
Applicant’s amendments and arguments are considered but are either unpersuasive or moot in view of the new grounds of rejection, which, where presented, were necessitated by the amendments.
This is a second RCE. 
Response to Amendments 
Objection to Claims 1 and 13 for including an informality is withdrawn in view of the amendments.
Response to Arguments
The added material is addressed by new or modified grounds of rejection.
	Please note the suggestion at the end of this section.
Independent Claims as amended provide:
1.  A method, comprising: 
receiving a dataset associated with a customer account, the data set including first metadata indicating a plurality of data fields of a table included in a portion of the dataset; 

receiving a natural language query from a client computer system, the natural language query requesting information associated with the dataset; and 
providing a response to the natural language query by at least: 
determining second metadata that describes the dataset in natural language terms including a plurality of natural language terms determined based at least in part on the first metadata;
processing the natural language query into a set of tokens based at least in part, on query terms included in the natural language query and the plurality of data fields of the table included in the second metadata; 
generating an executable query using the second metadata and the set of tokens, the executable query defining an analysis to perform on the dataset, based at least in part on the set of values, to satisfy the executable query using information obtained from the plurality of data fields of the table indicated in the second metadata; 
executing the executable query on the dataset associated with the customer account to produce a result, the result includes a set of data fields of the plurality of data fields of the table obtained from the dataset based at least in part on a portion of the second metadata, where the portion of second metadata includes an indication of a data field of the plurality of data fields that corresponds to a natural language term of the plurality of natural language terms; 
generating a natural language result including a natural language description of the result based at least in part on the result and second metadata by at least: 
identifying a set of natural language terms of the plurality of natural language terms based at least in part on the natural language terms included in the second metadata indicating the set of data fields of the plurality of data fields of the table included in the dataset; and 
generating the natural language description of the result based at least in part on the set of natural language terms, where the natural language description is generated after receiving the natural language query; and 
transmitting the natural language result to the client computer system.


5.     A system, comprising: 
one or more processors; and 
a memory storing executable instructions that, as a result of being executed by the one or more processors, 
cause the system to: 
receive a dataset of a user including first metadata describing a table having a set of data fields and a natural language query for the dataset including a set of natural language terms associated with a subset of data fields of the set of data fields of the table; 
receive, at an audio interface, an audio stream representing a conversation between a first person and a second person;
identify the natural language query in the audio stream;
determine second metadata for the dataset corresponding to the subset of 
in response to the natural language query: 
process the natural language query into a set tokens by at least identifying query terms included in the natural language query based at least in part on the second metadata, where the set of tokens correspond to the subset of data fields of the set of data fields of the table included in the second metadata; 
generate an executable query using the second metadata and the set of tokens, the executable query defining an analysis to perform on the dataset to satisfy the executable query; 
execute the executable query on the dataset to produce a result; and 
generate a natural language result based at least in part on the result and the second metadata by at least determining a subset of natural language terms of the set of natural language terms that describes a data field of the subset of data fields included in the result, where the data field is associated with a column descriptor of the data field included in the subset of natural language terms of the second metadata; and 
transmit the natural language result to the user; and 
cause the natural language result generated from the result to be played by the audio interface.

13.     A non-transitory computer-readable storage medium storing thereon executable instructions that, as a result of being executed by one or more processors of a computer system, 
cause the computer system to:
receive a dataset associated with a customer, the data set including a table having a plurality of data fields described in first metadata and a natural language query for the dataset, the natural language query including a set of natural language terms associated with a subset of data fields of the plurality of data fields of the table;
determine a set of insights for the dataset based at least in part on the plurality of data fields;
determine second metadata for the dataset based at least in part on a subset of natural language terms of the set of natural language terms, where members of the subset of natural language terms are associated with data type information of one a data field of the plurality of data fields of the dataset described in the second metadata; and  
generate a response to the natural language query by at least: 
processing the natural language query into a set of tokens, the set of tokens generated based at least in part on information included in the natural language query and the second metadata; 
generating a query using the second metadata and the set of tokens, the query defining an analysis to perform, based at least in part on the set of insights, on the subset of data fields of the plurality of data fields of the table of the dataset to satisfy the query; 
executing the query on the dataset to produce a result; 
generating a natural language result, based at least in part on the natural language query, the result, and the second metadata by at least converting the result to the natural language result, the natural language result including natural language term of the subset of natural language terms corresponding to the data field, where the data type information of the data filed is included in the result and the natural language term defines the data type information; and 
transmitting the natural language result to an endpoint associated with the customer. 
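
For illustration only, and not as a construction of any claim term, the general flow recited in Claim 1 above (first metadata, second metadata, tokens, executable query, result, natural language result) may be sketched as follows. The table name, field names, the underscore-to-space rule for deriving natural language terms, and all function names are assumptions made for this sketch; none of them come from the Application or from the cited references.

```python
import sqlite3

# Hypothetical first metadata: raw field names of one table in the dataset.
FIRST_METADATA = {"table": "orders", "fields": ["cust_name", "sale_price"]}


def build_second_metadata(first):
    """Map natural language terms to fields (assumed rule: '_' becomes ' ')."""
    return {field.replace("_", " "): field for field in first["fields"]}


def tokenize(query, second):
    """Keep only the natural language terms that name a known data field."""
    text = query.lower()
    return [term for term in second if term in text]


def build_sql(tokens, second, table):
    """Generate an executable query over the fields named by the tokens."""
    columns = ", ".join(second[term] for term in tokens)
    return f"SELECT {columns} FROM {table}"


def describe(rows, tokens):
    """Describe each result row with the natural language terms."""
    return "; ".join(
        ", ".join(f"{term} is {value}" for term, value in zip(tokens, row))
        for row in rows
    )


def answer(conn, query):
    """Run the whole flow for one natural language query."""
    second = build_second_metadata(FIRST_METADATA)
    tokens = tokenize(query, second)
    if not tokens:
        return "No data field matched the query."
    sql = build_sql(tokens, second, FIRST_METADATA["table"])
    rows = conn.execute(sql).fetchall()
    return describe(rows, tokens)


# Hypothetical single-row dataset held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (cust_name TEXT, sale_price REAL)")
conn.execute("INSERT INTO orders VALUES ('ACME', 120.0)")
print(answer(conn, "What is the sale price for each cust name?"))
```

In this sketch the column list of the generated query is built only from metadata entries, never from the raw query text, so the natural language terms in the response are the same terms that selected the data fields.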

The Interview Summary of 10/22/2020 provided:

[Image omitted: media_image1.png]

(Interview conducted 22 October 2020, mailed 10/27/2020)

As provided above, Examiner pointed to the now-canceled Claim for including “conversation.”  However, Examiner noted that, because of the repeated references to “conversation,” this feature is likely to be important to the instant Application, and if the Specification includes particulars in this respect, addition of those particulars may overcome the cited art.  A mere mention of “conversation” without any tie to the rest of the limitations and without more particulars was already in one of the Claims and subject to mapping to a reference.

Independent Claims began with varying scopes and have been amended with differing material.  

Aside from the conversation aspect that was raised by the Examiner, another aspect that is emphasized in the Disclosure is the database of user data and the statistical analysis performed on it:

[Images omitted: media_image2.png, media_image3.png, media_image4.png]


With respect to Claim 1, Applicant argues:

[Image omitted: media_image5.png]

(Applicant’s Response, p.12.)
With respect to comments regarding the Examiner interview, please refer to the Interview Summary of 10/22/2020, which is provided above.

	The amended language that is argued provides:
receiving a dataset associated with a customer account, the data set including first metadata indicating a plurality of data fields of a table included in a portion of the dataset; 
determining a set of values associated with the dataset by at least performing a set of statistical measurements using a plurality of data fields of the table;
….
	generating an executable query using the second metadata and the set of tokens, the executable query defining an analysis to perform on the dataset, based at least in part on the set of values, to satisfy the executable query using information obtained from the plurality of data fields of the table indicated in the second metadata; 

	These amendments expand upon the language regarding data fields that previously existed in Claim 1 and add that statistical operations are performed on the values in the data fields.
The above aspects and the portions of the Disclosure provided above appear to provide an NLP interface for a user-specific database.
The basic idea of having an NLP front end for a database query is taught by Romero.  Users that belong to different companies (see Figure 2) can access the tenant database according to their level of permission.  This teaches having a database specific to each user (each user has access to a portion of the tenants according to his authorization level).

	The specifics, such as performing statistical analysis on the data, which are added by amendment, are addressed by new or modified grounds of rejection.

With respect to Claims 5 and 13, Applicant relies on the argument provided for Claim 1:

[Image omitted: media_image6.png]

 (Applicant’s Response, p. 13.)
	The amendments to Claims 5 and 13 are different from the amendments to Claim 1 and also different from each other.

	Added language to Claim 5 mentions the “conversation” which was discussed during the interview but does not expand upon it other than by saying that the natural language query is formed from this conversation without any further particularity:
receive, at an audio interface, an audio stream representing a conversation between a first person and a second person;
identify the natural language query in the audio stream;

Amendments to Claim 13 provide:
determine a set of insights for the dataset based at least in part on the plurality of data fields;
with the follow-up in the added Claim 21 which further expands on the definition of “insights” as statistical measures such as max, min, avg, trends, or broadly “other characteristics of the dataset”: 
21. The non-transitory computer-readable storage medium of claim 13, wherein the set of insights include at least one of: statistical measures, relationships, minimum values, maximum values, trends, and other characteristics of the dataset.
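
As a non-limiting illustration of the kind of "insights" enumerated in Claim 21 (minimum values, maximum values, statistical measures, trends), a minimal sketch is given below. The field name, the sample rows, and the first-versus-last trend rule are assumptions made for this sketch and are not taken from the Disclosure.

```python
from statistics import mean


def compute_insights(rows, numeric_fields):
    """Compute simple per-field insights over a list of row dictionaries."""
    insights = {}
    for field in numeric_fields:
        values = [row[field] for row in rows]
        # Assumed trend rule: compare the last value with the first value.
        if values[-1] > values[0]:
            trend = "rising"
        elif values[-1] < values[0]:
            trend = "falling"
        else:
            trend = "flat"
        insights[field] = {
            "minimum": min(values),
            "maximum": max(values),
            "average": mean(values),
            "trend": trend,
        }
    return insights


# Hypothetical rows of one table in a customer dataset.
rows = [{"sale_price": 100.0}, {"sale_price": 140.0}, {"sale_price": 120.0}]
insights = compute_insights(rows, ["sale_price"])
print(insights["sale_price"])
```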

	The added material is addressed by new or modified grounds of rejection.

Patentability of the dependent Claims is argued based on their dependence from their base independent Claims. Accordingly, the above provides a reply to those arguments as well.

Suggestion:  It appears that the Disclosure has the following features:  
1- Takes its query from a conversation between two human participants as speech.
2- Conducts Diarization on the conversation which means it identifies the participants by their voice and knows which one said what.
3- Has databases of information that is personal to the users such that the identification of the speakers matters.  The personal nature of the databases ties in the Conversation and Diarization to the Database Statistics.
4- Conducts statistical operations on the personal data of each user in the user databases.
5- Generates and outputs the result.
The above aspects are scattered among the Claims.  They need to be in one Claim and properly tied together to convey the idea of the instant Application.  Whether or not a combination of references that teach the parts is warranted depends on how tightly the concepts tie together.  Are they pebbles sitting in a pond next to one another, or algae strands tangled and inseparable and perhaps sharing the same root?
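
Purely as an illustration of how features 1 through 5 above could be tied together, a minimal sketch follows. It assumes that diarization has already produced (speaker, utterance) pairs and that each speaker has a personal dataset; the speaker names, the data, the question-mark heuristic, and the "average" keyword rule are all hypothetical and are not drawn from the Disclosure.

```python
from statistics import mean

# Hypothetical diarization output: (speaker identifier, transcribed utterance).
CONVERSATION = [
    ("alice", "I think spending was high this quarter."),
    ("bob", "What is my average purchase?"),
]

# Hypothetical personal datasets, one per identified speaker.
USER_DATA = {"alice": [40.0, 60.0], "bob": [10.0, 30.0]}


def find_query(conversation):
    """Return (speaker, text) for the first utterance ending in '?'."""
    for speaker, text in conversation:
        if text.strip().endswith("?"):
            return speaker, text
    return None


def respond(conversation, user_data):
    """Answer the query from the dataset of the speaker who asked it."""
    found = find_query(conversation)
    if found is None:
        return None
    speaker, text = found
    if "average" in text.lower():
        value = mean(user_data[speaker])
        return f"{speaker}, your average purchase is {value:.2f}."
    return f"{speaker}, the request could not be interpreted."


print(respond(CONVERSATION, USER_DATA))
```

In this sketch the speaker identity produced by diarization is what selects the dataset on which the statistic is computed, which is the kind of tie between the conversation and the database statistics that the suggestion above refers to.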

[Images omitted: media_image7.png, media_image8.png]

	
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-4 are rejected under 35 U.S.C. 103 as being unpatentable over Romero (U.S. 2018/0032576) in view of Orr (U.S. 2016/0378747) and further in view of Carothers (U.S. 2016/0314146).

[Images omitted: media_image9.png through media_image15.png]


Regarding Claim 1, Romero teaches:
1.  A method, comprising: 
receiving a dataset associated with a customer account, [Romero, Figure 2, “Multi-Tenant Database 16” expanded in Figure 3.  The database is associated with different customers/tenants such as John Smith, User 2, User 3, ACME, or Jones shown under Tenants 136 or Account Table 140.  The “Database 16” includes “Datasets” for each user/tenant and includes “MetaData 132A” which shows the different Tables that are available and “MetaData 132B” which shows the types of data in each table.  Figure 4 teaches that natural language phrases are mapped to datasets, tables in the datasets, and fields of the tables using metadata.  Thus, various types of metadata providing various levels of connection to the NL phrases exist by the process of Figure 4, 200A.  For example, in Figure 3, the 132B values that are values for fields of a table or dataset teach the “first metadata” of the Claim based on the definition provided for the “first metadata” further down in this Claim.  In “Account Table 140,” the user/tenant Account Names is the “second metadata” and the types of data fields are the “First Metadata” 142A, 142B, 142C.  See also:  “1. A method for operating a natural language platform in a multi-tenant database, comprising: storing metadata associating natural language phrases with structured data in the multi-tenant database…”]  
the data set including first metadata indicating a plurality of data fields of a table included in a portion of the dataset; [Romero, Figure 3, “Metadata 132B” / “first metadata” defines the “natural language term” of “phone number” as “work phone.”  Each of the Tables/Datasets in 132A includes its own data fields.  See Figure 4, 200A.  “[0120] FIG. 4 shows an example chatbot process for generating structured database queries. In operation 200A, metadata is created and stored in the multi-tenant database. The metadata may associate different natural language phrases with datasets or fields associated with a particular organization in the multi-tenant database. As described above, the metadata also may associate different keywords and/or natural language sentence structures, such as predicates, with different tables, columns, datasets, or SQL functions.”  The “first metadata” which indicate the fields and details of each dataset include the “data type.”  In Figure 3, “Account Table 140” the fields Work Phone 142A, Mobile Phone 142B, and Account Category 142C also teach “first metadata that indicates data fields.”]
determining a set of values associated with the dataset by at least performing a set of statistical measurements using a plurality of data fields of the table; [The types of data fields in Romero include the “sale price” which can be subjected to statistical analysis.  Also, interpreting “set of statistical measurements” broadly, it can be as broad as taking a value from the table.  However, Romero does not teach that values such as averages, trends, etc. are calculated.  “[0040] … Each table generally contains one or more data categories logically arranged as columns or fields in a viewable schema. Each row or element of a table can contain an instance of data for each category defined by the fields. For example, a CRM database can include a table that describes a customer with fields for basic contact information such as name, address, phone number, fax number, etc. Another table can describe a purchase order, including fields for information such as customer, product, sale price, date, etc. In some MTS implementations, standard entity tables can be provided for use by all tenants….”]
receiving a natural language query from a client computer system, the natural language query requesting information associated with the dataset; and [Romero, Figures 2-3 and 5-6, “natural language query 104” is asking for information from the “tenant database 16.”]
providing a response to the natural language query by at least: [Romero, Figure 6, “natural language response 310.”]
determining second metadata that describes the dataset in natural language terms including a plurality of natural language terms determined based at least in part on the first metadata, [Romero, Figure 3, “Metadata 132A” / “second metadata” defines the various Tables / “datasets.”  Second Metadata can be the User/Tenant name also:  Figure 3, Account Name: ACME, Jones.  Each user has his own dataset. ]  
processing the natural language query into a set of tokens based at least in part, on query terms included in the natural language query and the plurality of data fields of the table included in the second metadata; [Romero, Figure 6, “natural language processor 304” and “validator 302” parse the natural language query (parsing generates tokens) and do so based on the relevant datasets and data fields.  “[0128] … Chatbot 110 may include a natural language processor 304 that parses natural language text commonly used in person-to-person communications. Processor 304 may identify a grammatical sentence structure in natural language query 104 and identify keywords and a context for specific phrases within query 104 based on the sentence structure….”  “[0129] A validator 302 may receive the parsed data from natural language processor 304 and identify datasets in database 16 associated with query 104….”]
generating an executable query using the second metadata and the set of tokens, the executable query defining an analysis to perform on the dataset, based at least in part on the set of values, to satisfy the executable query using information obtained from the plurality of data fields of the table indicated in the second metadata; [Romero, Figure 6, the “Query Processor 300” generates a “database query 108” / “executable query” and sends it to the proper table in the proper dataset which is identified by the second metadata, such as metadata 132A shown in Figure 3.  “[0131] Query processor 300 submits structured database query or action 108 to multi-tenant database 16 ….”  Figure 4, 200C:  “Apply Natural Language Processing to Convert User Input into Structured Database Query or Action.”  The query requests data which are based on the information in the data fields of the tables of the database.]
executing the executable query on the dataset associated with the customer account to produce a result, the result includes a set of data fields of the plurality of data fields of the table obtained from the dataset based at least in part on a portion of the second metadata, where the portion of second metadata includes an indication of a data field of the plurality of data fields that corresponds to a natural language term of the plurality of natural language terms; [Romero, Figure 6, “query results 301.”  Figure 4, 200F: “Submit Structured Query to Multi-tenant Database And Send Results to User.”  Figures 2 and 3 showing that each tenant/user/participant has its own account and associated “permissions” as shown in Figure 5, 250B, 250C.  The datasets/ Tables are obtained based on second metadata 132A.  Figure 3 shows that “Metadata 132A” includes, for each Table/Dataset, a “predicate field” / “data field.”  Figure 4, 200A shows the creation of the metadata mapping to the database and teaches that the metadata are associated with datasets or with data fields:  “[0120] … The metadata may associate different natural language phrases with datasets or fields associated with a particular organization in the multi-tenant database….”  Each tenant organization has its own dataset and each dataset may have several types of tables each of which has a plurality of data fields.  “11 …. identifying a first phrase in the natural language query associated with a table in the database system ….”  “[0020] … Some on-demand database services can store information from one or more tenants into tables of a common database image to form a multi-tenant database system (MTS). 
… For example, a given application server may simultaneously process requests for a great number of customers, and a given database table may store rows of data such as feed items for a potentially much greater number of customers….”  “[0040] Each database can generally be viewed as a collection of objects, such as a set of logical tables, containing data fitted into predefined or customizable categories. A "table" is one representation of a data object, and may be used herein to simplify the conceptual description of objects and custom objects according to some implementations. It should be understood that "table" and "object" may be used interchangeably herein. Each table generally contains one or more data categories logically arranged as columns or fields in a viewable schema. Each row or element of a table can contain an instance of data for each category defined by the fields. For example, a CRM database can include a table that describes a customer with fields for basic contact information such as name, address, phone number, fax number, etc. Another table can describe a purchase order, including fields for information such as customer, product, sale price, date, etc. In some MTS implementations, standard entity tables can be provided for use by all tenants. For CRM database applications, such standard entities can include tables for case, account, contact, lead, and opportunity data objects, each containing pre-defined fields. As used herein, the term "entity" also may be used interchangeably with "object" and "table."”  “[0041] … In some implementations, for example, all custom entity data rows are stored in a single multi-tenant physical table, which may contain multiple logical tables per organization. It is transparent to customers that their multiple "tables" are in fact stored in one large table or that their data may be stored in the same table as the data of other customers.”] 
generating a natural language result including a natural language description of the result based at least in part on the result and second metadata by at least: [Romero, Figure 6,  “Formatter 306” generating the “natural language response 310.”  Figure 4, 200F: “… Send Results to User.”] 
identifying a set of natural language terms of the plurality of natural language terms based at least in part on the natural language terms included in the second metadata indicating the set of data fields of the plurality of data fields of the table included in the dataset; and [Romero, Figure 4, 200A and 200C.  See [0020], [0040] and [0041] above.  Figure 6.  The natural language query input is processed, its terms and phrases parsed out and correlated with the terms and phrases of the databases and results extracted.  The metadata correlate the natural language terms to databases and their fields:  “[0120] … The metadata may associate different natural language phrases with datasets or fields associated with a particular organization in the multi-tenant database….”]
generating the natural language description of the result based at least in part on the set of natural language terms, where the natural language description is generated after receiving the natural language query; and [ Romero, Figure 6, “query results 301” are converted to natural language by “formatter 306” and output as natural language to the “user system 12.”]
transmitting the natural language result to the client computer system. [Romero, Figure 6, “Natural Language Response 310.”]

Parsing generates tokens but Romero does not expressly include parsing into “Tokens.”
The types of data fields in Romero include the “sale price” which can be subjected to statistical analysis.  However, Romero does not teach that values such as averages, trends, etc. are calculated.  See Romero [0040].

Orr teaches:
receiving a dataset associated with a customer account, [Orr, “[0272] … According to some embodiments, the user's birthday is stored on the electronic device 200, or is stored in association with a user account that in turn is associated with the electronic device 200 and/or a service or program that transmits media to the electronic device, such as the iTunes.RTM. application program, Apple Music or iTunes Radio.TM. (services of Apple, Inc. of Cupertino, Calif.)….”]
the data set including first metadata indicating a plurality of data fields of a table included in a portion of the dataset; [Orr continues: “[0272] … Upon determining the date of the user's birthday, the digital assistant then causes a search to be made of one or more databases of historical music chart information (e.g., the database of Billboard of New York, N.Y.) based on the date of the user's birthday. The digital assistant receives historical music chart information from one or more databases, and in response obtains for the user (through the use of streaming audio or by downloading) and plays one or more of the songs identified by that historical music chart information.”  Orr in Figure 2B teaches “metadata 283” associated with each event and delivery of the event.  See [0152].  Metadata can indicate the type of media being output or context of an application. [0256]-[0258], [0262].]
determining a set of values associated with the dataset by at least performing a set of statistical measurements using a plurality of data fields of the table; [Orr performs statistical operations on “intensity” data which pertains to the intensity of touching a screen by a user.  The “intensity data” may be user-adjustable [0292], but is not tracked per user and does not fall within the “dataset” of this Claim which is associated with a customer account.  See [0052], [0079], [0191].]
receiving a natural language query from a client computer system, the natural language query requesting information associated with the dataset; and [Orr, Figure 7B, “speech input” eventually converted to a “structured query.”  “[0239] In some examples, once natural language processing module 732 identifies an actionable intent (or domain) based on the user request, natural language processing module 732 can generate a structured query to represent the identified actionable intent. …. For example, the user may say "Make me a dinner reservation at a sushi place at 7." ….”]
providing a response to the natural language query by at least: [Orr, Figure 7B, “Responses” as output and as provided by the “speech synthesis module 740.”  “[0042] …For example, a user can ask the digital assistant a question, such as "Where am I right now?" Based on the user's current location, the digital assistant can answer, "You are in Central Park near the west gate." The user can also request the performance of a task, for example, "Please invite my friends to my girlfriend's birthday party next week." In response, the digital assistant can acknowledge the request by saying "Yes, right away," and then send a suitable calendar invite on behalf of the user to each of the user's friends listed in the user's electronic address book. …”]
determining second metadata that describes the dataset in natural language terms including a plurality of natural language terms determined based at least in part on the first metadata, where the first metadata indicates a plurality of data fields of a table included in a portion of the dataset:  [Orr, “[0152] In some embodiments, a respective event recognizer 280 includes metadata 283 with configurable properties, flags, and/or lists that indicate how the event delivery system should perform sub-event delivery to actively involved event recognizers….”  “[0225] In some examples, natural language processing module 732 can be configured to receive metadata associated with the speech input. The metadata can indicate whether to perform natural language processing on the speech input (or the sequence of words or tokens corresponding to the speech input). If the metadata indicates that natural language processing is to be performed, then the natural language processing module can receive the sequence of words or tokens from the STT processing module to perform natural language processing. However, if the metadata indicates that natural language process is not to be performed, then the natural language processing module can be disabled and the sequence of words or tokens (e.g., text string) from the STT processing module can be outputted from the digital assistant. In some examples, the metadata can further identify one or more domains corresponding to the user request. Based on the one or more domains, the natural language processor can disable domains in ontology 760 other than the one or more domains. In this way, natural language processing is constrained to the one or more domains in ontology 760. In particular, the structure query (described below) can be generated using the one or more domains and not the other domains in the ontology."”]
processing the natural language query into a set of tokens based at least in part, on query terms included in the natural language query and the plurality of data fields of the table included in the second metadata; [Orr, Figure 7B, “token sequence” output from “STT Processing Module 730.”  “[0226] Natural language processing module 732 ("natural language processor") of the digital assistant can take the sequence of words or tokens ("token sequence") generated by STT processing module 730, and attempt to associate the token sequence with one or more "actionable intents" recognized by the digital assistant….”  The actionable intents are determined from the input query.  For Tokens see [0219] and [0225]-[0227].]
generating an executable query using the second metadata and the set of tokens, the executable query defining an analysis to perform on the dataset, based at least in part on the set of values, [Orr, in [0052] and [0191] teaches that the operation depends on the “intensity” of the touch and the “intensity values” are subjected to statistical analyses taught in [0199].  Thus, the operation/ “analysis to perform on the dataset” is based on the “set of values” obtained earlier from the statistical analyses.  However the dataset from which the set of values are obtained is not taught to be user-specific.] to satisfy the executable query using information obtained from the plurality of data fields of the table indicated in the second metadata; [Orr, Figure 7B, “structured query” generated by the NLP Module 732” is an executable query.  “[0239] .. In some examples, the structured query can include parameters for one or more nodes within the domain for the actionable intent, and at least some of the parameters are populated with the specific information and requirements specified in the user request. For example, the user may say "Make me a dinner reservation at a sushi place at 7."….”]
executing the executable query on the dataset associated with the customer account  [Orr, Figure 7B, the “task flow processing module 736” and “service processing module 738” both execute the query to generate a result.  The “user data 748” is input to the NLP 732 and is used to create the executable structured query.] to produce a result, the result includes a set of data fields of the plurality of data fields of the table obtained from the dataset based at least in part on a portion of the second metadata, where the portion of second metadata includes an indication of a data field of the plurality of data fields that corresponds to a natural language term of the plurality of natural language terms; [Orr takes an unstructured natural language input such as “Make me a dinner reservation at a sushi place at 7” and identifies the database fields that need to be searched such as Time, Date, Cuisine Type, etc.  “[0239] …For example, the user may say "Make me a dinner reservation at a sushi place at 7." In this case, natural language processing module 732 can be able to correctly identify the actionable intent to be "restaurant reservation" based on the user input. According to the ontology, a structured query for a "restaurant reservation" domain may include parameters such as {Cuisine}, {Time}, {Date}, {Party Size}, and the like….”  See [0225] for “metadata” in Orr which can identify “domains corresponding to the user request.”  This is similar to a category of Judges in Carothers.   Here the domain/ “data field” would be “Cuisine Type” for example or a Location / “data field” corresponding to the natural language term “Italy” which is part of the input natural language query:  “[0262] …Digital photographs typically are stored along with metadata such as the date taken and the location taken. 
Upon receiving nonspecific natural language user input requesting media such as "play hits from my trip to Italy," the digital assistant may cause a search to be performed for information relating to a trip to Italy. Upon finding photograph metadata that includes a location within Italy, the digital assistant determines the corresponding date information in that photograph metadata….”]
generating a natural language result including a natural language description of the result based at least in part on the result and second metadata by at least: [Orr, Figure 7B, “Responses” as output of the system and generated by the “Task flow processing module 736.”]
identifying a set of natural language terms of the plurality of natural language terms based at least in part on the natural language terms included in the second metadata indicating the set of data fields of the plurality of data fields of the table included in the dataset; and [Orr, see the example of “[0042] … Based on the user's current location, the digital assistant can answer, "You are in Central Park near the west gate.". …”  See [0262] for the example of trip to Italy.  “[0262] .. Upon finding photograph metadata that includes a location within Italy, the digital assistant determines the corresponding date information in that photograph metadata. The digital assistant then causes a search to be made of databases of historical music chart information (e.g., the database of Billboard of New York, N.Y.) based on the date information obtained from the photograph…”]
generating the natural language description of the result based at least in part on the set of natural language terms, where the natural language description is generated after receiving the natural language query; and [Orr, see the example of “[0042] … Based on the user's current location, the digital assistant can answer, "You are in Central Park near the west gate.". …”]
transmitting the natural language result to the client computer system.

Romero and Orr both pertain to executing user queries on databases, and it would have been obvious to combine the system of Orr, which expressly teaches parsing into tokens, with the system of Romero, which does not expressly use this term, for completeness.  This combination falls under combining prior art elements according to known methods to yield predictable results or simple substitution of one known element for another to obtain predictable results. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Carothers teaches:
receiving a dataset associated with a customer account, the data set including first metadata indicating a plurality of data fields of a table included in a portion of the dataset; [Carothers is directed to a GUI (Figure 7)  that provides an interface to a database of legal data that is analyzed based on judge, party, law firm, type of case, etc.  The fields of the databases are identified by metadata tags and the dataset can be searched by search parameters (704) which can be parameters used to find the corresponding metadata tag or directly input the metadata tags (708).  See Figure 7, “allow user to specify search parameters 704” and “allow user to filter the search result by adding or removing metadata elements 708.”  “… In order to permit searching of the legal data, metadata elements or tags can be generated for legal entities and legal events….”  Abstract.  “[0071] At block 704, the GUI can allow a user to specify search parameters with which to search the database. The search parameters include one or more metadata elements that are used to search the metadata and identify relevant legal data. As described above, in some instances the user may specify combinations of metadata elements (e.g., judge/jurisdiction) to further clarify what legal data is relevant and narrow the search. When specifying the search parameters, the user may be able to select pre-existing metadata elements (e.g., from a list) or manually input (e.g., type) data used to match or look up corresponding metadata elements (e.g., a party's name)….”]
determining a set of values associated with the dataset by at least performing a set of statistical measurements using a plurality of data fields of the table; [Carothers performs many types of statistical analyses on the data and can present different tables and charts as shown in various Figures of this reference such as Figures 2, 9-11, 13-15, and 17.  “… some embodiments, a legal analytics platform retrieves legal data from an electronic database, analyzes some or all of the legal data, and identifies interesting patterns and results of statistical analyses. …”    Abstract.]  
….
	generating an executable query using the second metadata and the set of tokens, the executable query defining an analysis to perform on the dataset, based at least in part on the set of values, to satisfy the executable query using information obtained from the plurality of data fields of the table indicated in the second metadata; [Carothers is directed to a GUI (Figure 7) that provides an interface to a database of legal data that is analyzed based on judge, party, law firm, type of case, etc..  Various bar charts and other types of graphs or numbers which are results of statistical analyses on the underlying data of the database are provided to the user.   “… Results of the statistical analyses can be presented to a user via a graphical user interface (GUI), which may also allow the user to interact with the legal analytics platform and search one or more databases of legal data.”    Abstract.  “[0010] In various embodiments, the method further includes generating a graphical user interface (GUI) that allows the user to specify search parameters with which to search the database, present search results, and/or allow the user to modify the search parameters. The search result can include textual, tabular, or graphical summaries of the relevant legal data….”   In Figure 7, the GUI allows user to specify the search parameters 704 based on which a query is generated and sent to the database.  “[0072] At block 706, the GUI can display a search result to the user. The search result may include legal data, selectable hyperlinks to legal data, a textual summary, a graphical summary (e.g., a chart), etc. For example, the search result may include a textual summary 904 and a graphical summary 902, as shown in FIG. 9, for a portion or subset of legal data specified by the user….”]
Romero, Orr, and Carothers pertain to or include executing user queries on databases, and it would have been obvious to combine the more specific operations of Carothers with the system of the combination to provide more detail regarding database searches.  This combination falls under simple substitution of one known element for another to obtain predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Regarding Claim 2, Romero teaches:
2.     The method of claim 1, further comprising: 
converting the natural language result into an audio stream; and [Romero, input and output can be by speech:  “[0043] FIG. 2 shows an example natural language processing platform 110 operating as a chatbot within a database system 105. … In one example, chatbot interface 106 may receive and send text or audio data forming natural language query 104.”]
causing the audio stream to be played to a customer. [Romero, [0043] teaches that the interface may be audio.]
The use of speech synthesis is not express in Romero.
Orr more expressly teaches:
2.     The method of claim 1, further comprising: 
converting the natural language result into an audio stream; and [Orr, Figure 7B, “Responses” output from a “Speech Synthesis Module 740.”  Figure 2, speaker 211 and microphone 213.]
causing the audio stream to be played to a customer. [Orr, Figure 7B, “Responses” output from a “Speech Synthesis Module 740.”  “[0245] … In these examples, the generated response can be sent to speech synthesis module 740 (e.g., speech synthesizer) where it can be processed to synthesize the dialogue response in speech form….”  “[0059] Audio circuitry 210, speaker 211, and microphone 213 provide an audio interface between a user and device 200. Audio circuitry 210 receives audio data from peripherals interface 218, converts the audio data to an electrical signal, and transmits the electrical signal to speaker 211. Speaker 211 converts the electrical signal to human-audible sound waves….”]
Romero and Orr both pertain to executing user queries on databases, and it would have been obvious to combine the speech interface of Orr with the system of Romero, which mentions an audio interface, in order to provide hands-free input and output.  This combination falls under simple substitution of one known element for another to obtain predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Regarding Claim 3, Romero teaches:
3.     The method of claim 1, wherein an individual token of the set of tokens represents a single word, operator, or punctuation element of the natural language query. [Romero,  “[0109] Chatbot 110 may recognize the phrase "How many" as a record count. Chatbot 110 may recognize the second phrase "accounts" as a dataset or table 140 in database 16 to filter data upon. By using a natural language processor to recognize the singular form of words, and a simple matching algorithm. chatbot 110 may match the second phrase with account_table 140….”]
(Orr, Figure 7B, “Token Sequence” output of STT 730 and “[0225] … However, if the metadata indicates that natural language process is not to be performed, then the natural language processing module can be disabled and the sequence of words or tokens (e.g., text string) from the STT processing module can be outputted from the digital assistant….”)

Regarding Claim 4, Romero teaches parsing of the natural language query but does not teach a parse tree.
Orr teaches:
4.     The method of claim 1, further comprising:
storing the executable query as a set of nodes arranged as a parse tree; and [Orr, Figures 7A, 7B, and 7C teach that the “token sequence” generated by the “STT processing module 730” is fed to a “Natural Language Processing Module 732” which maps the tokens to certain parts and domains of an “Ontology 760.”  See [0225].  Figure 7C shows the overlap of the domains on the “ontology 760” / “parse tree” of the Claim.  “[0236] Natural language processing module 732 can receive the token sequence (e.g., a text string) from STT processing module 730, and determine what nodes are implicated by the words in the token sequence. In some examples, if a word or phrase in the token sequence is found to be associated with one or more nodes in ontology 760 (via vocabulary index 744), the word or phrase can "trigger" or "activate" those nodes. Based on the quantity and/or relative importance of the activated nodes, natural language processing module 732 can select one of the actionable intents as the task that the user intended the digital assistant to perform.”]
executing the executable query by at least traversing the parse tree and performing an operation for a node in the set of nodes. [Orr, Figure 7B, the output of the NLP 732 is a “structured query” based on the Ontology 760 / “parse tree” of the Claim and is provided to the “task flow processing module 736” and “service processing module 738” to perform the task or service.  “[0229] … For example, as shown in FIG. 7C, ontology 760 can include a "restaurant reservation" node (i.e., an actionable intent node). Property nodes "restaurant," "date/time" (for the reservation), and "party size" can each be directly linked to the actionable intent node (i.e., the "restaurant reservation" node).”  See [0228] to [0235] regarding the operation of the “ontology 760.”  “[0240] …. Task flow processing module 736 can be configured to receive the structured query from natural language processing module 732, complete the structured query, if necessary, and perform the actions required to "complete" the user's ultimate request. …”  “[0228] In some examples, the natural language processing can be based on, e.g., ontology 760. Ontology 760 can be a hierarchical structure containing many nodes, each node representing either an "actionable intent" or a "property" relevant to one or more of the "actionable intents" or other "properties." As noted above, an "actionable intent" can represent a task that the digital assistant is capable of performing, i.e., it is "actionable" or can be acted on. A "property" can represent a parameter associated with an actionable intent or a sub-aspect of another property. A linkage between an actionable intent node and a property node in ontology 760 can define how a parameter represented by the property node pertains to the task represented by the actionable intent node.”]
Romero and Orr pertain to executing user queries and it would have been obvious to combine the Ontology/Parse Tree method of Orr for executing a query input by the user with the method of Romero that does not expressly state that it can use an ontology/parse tree based method for executing its query tasks to provide a more specific method or to add an alternative method for executing the query.  This combination falls under simple substitution of one known element for another to obtain predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.
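For illustration only, the Claim 4 limitation of storing a query as nodes of a tree and executing it by traversal can be sketched as follows.  This sketch is not taken from Orr, Romero, or the Application; the Node class, the operation names ("scan", "filter", "count"), and the sample rows are all hypothetical assumptions chosen to make the traversal concrete.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Node:
    """One node of a hypothetical query tree (names are illustrative only)."""
    op: str                      # "scan", "filter", or "count"
    arg: Any = None              # operation-specific argument
    children: List["Node"] = field(default_factory=list)


def execute(node: Node, rows: Any) -> Any:
    """Traverse the tree depth-first and perform one operation per node."""
    # Children are evaluated first, so the innermost node runs before its parent.
    for child in node.children:
        rows = execute(child, rows)
    if node.op == "scan":
        return list(rows)
    if node.op == "filter":
        key, value = node.arg
        return [r for r in rows if r.get(key) == value]
    if node.op == "count":
        return len(rows)
    raise ValueError(f"unknown operation: {node.op}")


# Hypothetical data and tree: count(filter(status == "open", scan)).
sample_rows = [
    {"id": 1, "status": "open"},
    {"id": 2, "status": "closed"},
    {"id": 3, "status": "open"},
]
tree = Node("count", children=[
    Node("filter", arg=("status", "open"), children=[Node("scan")]),
])
open_count = execute(tree, sample_rows)
```

In this sketch each node contributes exactly one operation, which is one possible reading of "performing an operation for a node in the set of nodes"; other tree layouts would serve equally well.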

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 5-7 are rejected under 35 U.S.C. 103 as being unpatentable over Romero in view of Orr and further in view of Hakkani-Tur (U.S. 2015/0179168).

Regarding Claim 5, Romero teaches:
5.     A system, comprising: 
one or more processors; and [Romero, “[0031] FIG. 1B shows a block diagram of example implementations of elements of FIG. 1A and example interconnections between these elements according to some implementations. That is, FIG. 1B also illustrates environment 10, but FIG. 1B, various elements of the system 16 and various interconnections between such elements are shown with more specificity according to some more specific implementations. Additionally, in FIG. 1B, the user system 12 includes a processor system 12A, a memory system 12B, an input system 12C, and an output system 12D. The processor system 12A can include any suitable combination of one or more processors. The memory system 12B can include any suitable combination of one or more memory devices. The input system 12C can include any suitable combination of input devices, such as one or more touchscreen interfaces, keyboards, mice, trackballs. scanners, cameras, or interfaces to networks. The output system 12D can include any suitable combination of output devices, such as one or more display devices, printers, or interfaces to networks.”]
a memory storing executable instructions that, as a result of being executed by the one or more processors, [Romero, [0031] and FIG. 1B.]
cause the system to: 
receive a dataset of a user including first metadata describing a table having a set of data fields and a natural language query for the dataset including a set of natural language terms associated with a subset of data fields of the set of data fields of the table; [Romero, “First Metadata” pertains to the overall “Database 16.”  Figure 3 shows the “Multi-Tenant Database 16” and Figure 4, step 200A shows the creation of the metadata mapping of natural language phrases to the database contents.  The datasets in the database 16 include tables that have data fields that are subsets of the entire dataset pertaining to one tenant and each tenant dataset is a subset of the entire database.  [120], [0020], [0040]-[0041].  “[0048] … Objects 120 may be stored as structured data in database 16, such as in tables 140 that include columns 142….”  “[0051] … Database system 16 identifies user permissions 138, tables, records, objects, metadata….”]
receive, at an audio interface, an audio stream representing a conversation between a first person and a second person; [Romero, “[0043] … For example, chatbot interface 106 may include Facebook.RTM. messenger, Flake, Apple.RTM. Imessage, or any other software where a user chats or sends text messages. In one example, chatbot interface 106 may receive and send text or audio data forming natural language query 104.”]
identify the natural language query in the audio stream; [Romero receives its query from a GUI but does teach receiving the query through audio as well:  “[0128] FIG. 6 shows an example chatbot platform. In one example chatbot 110 may operate on application server 126 as described above in FIG. 3. Chatbot 110 may include a natural language processor 304 that parses natural language text commonly used in person-to-person communications. …  As mentioned above, the natural language processor 304 also may convert audio signals into text and then process the converted text.”]
determine second metadata for the dataset corresponding to the subset of data fields of the dataset based at least in part on the set of natural language terms; [Romero, Figure 3, the “second metadata” identifies the datasets for one user/tenant and defines the various Tables / “subsets of data fields of the datasets.”]  
in response to the natural language query: 
process the natural language query into a set tokens by at least identifying query terms included in the natural language query based at least in part on the second metadata, where the set of tokens correspond to the subset of data fields of the set of data fields of the table included in the second metadata; [Romero, Figure 5, receiving the NL query and parsing it which implies generation of tokens to identify the query terms which would pertain to the data fields of the data sets pertaining to each user/tenant.  Figure 3 and Figure 5 together show that the system identifies the user in order to provide him access and then provides him data from his own particular dataset.]
generate an executable query using the second metadata and the set of tokens, the executable query defining an analysis to perform on the dataset to satisfy the executable query; [Romero, Figures 4 and 6.  Figure 4, 200C shows generation of the SQL from the results of parsing the NL query input.  Figures 3 and 6 show examples of the input query such as “How many accounts are open? 104” in Figure 3.]
execute the executable query on the dataset to produce a result; and [Romero, Figure 4, 200F: “Submit Structured Query to Multi-tenant Database And Send Results to User.”  Figure 6, “Database Query 108”.]
generate a natural language result based at least in part on the result and the second metadata by at least determining a subset of natural language terms of the set of natural language terms that describes a data field of the subset of data fields included in the result, where the data field is associated with a column descriptor of the data field included in the subset of natural language terms of the second metadata; and [Romero, Figure 4, 200F: “Submit Structured Query to Multi-tenant Database And Send Results to User.”  Figure 6, “query results 301.”  See example of the information in the result in Figure 6, “Browser 308.”]
transmit the natural language result to the user; and [Romero, Figure 6, “Natural Language Response 310.”]
cause the natural language result generated from the result to be played by the audio interface. [Romero, Figure 6, “Natural Language Response 310.”  However, it is not played as speech but is instead sent for display.  See Figure 5, “Display search Results to Chat Group 250G.”]

Parsing into Tokens is not express in Romero and Orr is combined under the rationale provided for Claim 1.
Additionally, Orr teaches:
cause the natural language result generated from the result to be played by the audio interface. [Orr, Figure 7B, “Speech Synthesis Module 740” and output of Responses as speech. Figure 2, “[0059] Audio circuitry 210, speaker 211, and microphone 213 provide an audio interface between a user and device 200….”]
Romero and Orr both pertain to executing user queries, and it would have been obvious to combine the speech synthesis of Orr for providing output with the natural language output of Romero, which is by display, because speech synthesis is a short step from text output and provides a mode of communication that frees the user’s eyes and attention for other tasks.  This combination falls under simple substitution of one known element for another to obtain predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Romero and Orr do not teach receiving the query from a conversation.
Hakkani-Tur teaches:
receive, at an audio interface, an audio stream representing a conversation between a first person and a second person; [Hakkani-Tur listens to conversations of people around (thus at least a first and a second person) and infers a query from the content of the conversation.   See [0036] for types of queries.  “… The dialog system uses multi-human conversational context to improve domain detection. Using interactions between multiple users allows the dialog system to better interpret machine directed conversational inputs in multi-user conversational systems…..”  Abstract.]
identify the natural language query in the audio stream; [Hakkani-Tur teaches that the query can be found in the conversation of two people that is being observed/listened to by the machine:  “[0041] An addressee determination operation 308 determines whether a conversational input is addressed to the dialog system (i.e., a query directed to the computing device)….”  “[0042] In various embodiments, the addressee determination operation bases the addressee determination on an implicit signal associated with the conversational input. An implicit signal may be, but is not limited to, using a turn without an explicit addressing signal (i.e., based on the user's silence, words, or prosody). For example, during a conversation, one user may say "Okay, let's find a place then," to which the other user responds "Are there any Italian restaurants nearby?" Based on the words alone or together with silence and/or prosody, this conversational input can be determined to be implicitly addressed to the computer.” “[0043] In other embodiments, the dialog system may participate in the conversation without being implicitly or explicitly addressed when the addressee determination operation determines that the dialog system has information to add to the conversation. For example, during a conversation, one user may say "Wanna eat lunch?" to which the other user responds "Okay, do you wanna walk to downtown and find an Italian place?" After the first user responds affirmatively with "Okay," the computer may interject by stating "Here are some Italian restaurants in the downtown area" and/or showing downtown Italian restaurants on the display.”]

Romero, Orr, and Hakkani-Tur pertain to executing natural language queries on databases, and it would have been obvious to combine the method of Hakkani-Tur, which infers its input query from a conversation between multiple humans, with the system of the combination in order to provide another method of receiving the query.  This combination falls under simple substitution of one known element for another to obtain predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Regarding Claim 6, Romero teaches:
6.     The system of claim 5, wherein: the executable instructions further include instructions that, as a result of being executed by the one or more processors, cause the system to:
generate a character sequence representing the natural language result; and [Romero, Figure 3, Answer 124 is shown to the user on a display.  See also Figure 6.  Both answers are natural language sentences:  “Accounts are Open.”]
display the character sequence to the user. [Romero, Figure 6, Answer is provided on Browser 308.]
(Orr:  “[0214] User interface module 722 can receive commands and/or inputs from a user via I/O interface 706 (e.g., from a keyboard, touch screen, pointing device, controller, and/or microphone), and generate user interface objects on a display. User interface module 722 can also prepare and deliver outputs (e.g., speech, sound, animation, text, icons, vibrations, haptic feedback, light, etc.) to the user via the I/O interface 706 (e.g., through displays, audio channels, speakers, touch-pads, etc.).”)

Regarding Claim 7, Romero teaches:
7.     The system of claim 5, wherein the executable instructions further include instructions that, as a result of being executed by the one or more processors, cause the system to determine the second metadata for the data field by at least:
determining the second metadata based at least in part on a relationship between the data field and a second dataset; and [Romero, Figures 3 and 4.  There are metadata connecting the natural language phrases to the tables/objects and fields of the tables for each of the datasets of each of the tenants. The “second dataset” would be the “dataset” of a different tenant.]
determining the second metadata based at least in part on a data type of the data field. [Romero, the “second metadata” which indicate the fields and details of each dataset include the “data type.”  Figure 3, “Account Table 140” including Work Phone 142A, Mobile Phone 142B, and Account Category 142C.]

(Orr:  “[0152] In some embodiments, a respective event recognizer 280 includes metadata 283 with configurable properties, flags, and/or lists that indicate how the event delivery system should perform sub-event delivery to actively involved event recognizers. In some embodiments, metadata 283 includes configurable properties, flags, and/or lists that indicate how event recognizers may interact, or are enabled to interact, with one another. In some embodiments, metadata 283 includes configurable properties, flags, and/or lists that indicate whether sub-events are delivered to varying levels in the view or programmatic hierarchy.”)

Claims 9-11 are rejected under 35 U.S.C. 103 as being unpatentable over Romero, Orr, and Hakkani-Tur in view of Tsiartas (U.S. 2017/0084295).
Regarding Claim 9, Romero teaches:
9.     The system of claim 5, wherein the executable instructions further include instructions that, as a result of being executed by the one or more processors, cause the system to at least:
identify the first person and the second person; [Romero does not teach identifying participants to a conversation.  However, it does teach that different users have different accounts and must be identified.]
identify one or more datasets that are associated with the first person or the second person; [Romero, Figure 5, 250C: Permission to access a particular database.  See also Figure 2.]
determine a characteristic of the one or more datasets; and [Romero, the characteristics could be any aspect of the tenant-specific databases.  See Figure 3 for contents of these databases.]
provide a description of the characteristic. [Romero, Figure 5, 250G.  Display results.]
Romero and Orr do not address a conversation or identifying the participants to the conversation.
Hakkani-Tur teaches that the query can be obtained from a content of conversation between multiple humans but does not teach that the identity of the humans in the conversation plays a role.
Tsiartas teaches:
…
identify the first person and the second person; [Tsiartas teaches diarization and diarization means identifying the participants to a conversation.  Figure 4B, “Diarization Module 458.”  “[0089] … For instance, if the speech sample is a recording of a two-person conversation, the diarization module 458 may tag the speech segments as spoken by either "speaker A" or "speaker B". ….”]
Romero, Orr, Hakkani-Tur, and Tsiartas pertain to or include executing user queries on databases, and it would have been obvious to combine the diarization feature of Tsiartas, which identifies the source/speaker of an input, with the system of the combination in order to make it possible to identify the user requesting the query, or other subjects of the query, according to the identities of the participants in the speech.  This combination falls under simple substitution of one known element for another to obtain predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Regarding Claim 10, Romero mentions audio input and output ([0043]) but is not express.
Orr teaches:
10.     The system of claim 9, wherein the executable instructions further include instructions that, as a result of being executed by the one or more processors, cause the system to:
generate an output audio stream describing the characteristic; and [Orr, Figure 7B, “Speech Synthesis Module 740” and output of Responses as speech.]
play the output audio stream to the audio interface. [Orr, Figure 7B output of Responses.  And Figure 2, “[0059] Audio circuitry 210, speaker 211, and microphone 213 provide an audio interface between a user and device 200….”]
Rationale as provided for Claim 5.  Addition of a speech input/output interface to the GUI of Romero makes hands-free use possible and is desirable.

Regarding Claim 11, Romero teaches:
11.     The system of claim 9, wherein the executable instructions further include instructions that, as a result of being executed by the one or more processors, cause the system to:
identify a name for the data field in the dataset; [Romero, Figure 3, in the database 16, the “data field” in Metadata 132A is “Account_Table.”]
identify a first token of the natural language query that matches the name; [Romero, Figure 3, in the query 104, the tenant/user John Smith asks:  “How many Accounts are Open?”  The name is “Accounts.”]
identify a second token of the natural language query that represents an operation; and [Romero, “Open”  or “How Many are Open” would be the “second token.”]
perform the operation on the data field of the dataset. [Romero performs the operation of determining How Many are Open on the name Accounts and comes back with the result that Five are Open.]

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Romero and Orr and Hakkani-Tur and further in view of Carothers (U.S. 2016/0314146).
Regarding Claim 12, Romero does not expressly teach any of the types of output enumerated.  Neither does Orr.  Hakkani-Tur refers to statistical analysis in order to determine context and domain of the speech and hence the query.  The presented result is not a statistical chart.
Carothers teaches:
12.     The system of claim 5, wherein the result is a data trend, statistical measure, relationship, period-over-period comparison, or outlier of the dataset. [Carothers, Figure 9 is one example of the result that is output to the user.  The graph shown in Figure 9 shows a “data trend” of the Claim.  “… some embodiments, a legal analytics platform retrieves legal data from an electronic database, analyzes some or all of the legal data, and identifies interesting patterns and results of statistical analyses. …”    Abstract.  “[0072] At block 706, the GUI can display a search result to the user. The search result may include legal data, selectable hyperlinks to legal data, a textual summary, a graphical summary (e.g., a chart), etc. For example, the search result may include a textual summary 904 and a graphical summary 902, as shown in FIG. 9, for a portion or subset of legal data specified by the user….”  Figure 17, “[0087] … In some embodiments, the minimum value 1710 and the maximum value 1712 are determined after excluding outliers 1714, which lie outside of the box 1706 by a distance of more than 1.5 times the width of the box (i.e., difference between upper quartile and lower quartile).”  Figure 15 has a bar chart showing the comparison from year to year.]
Romero and Orr and Hakkani-Tur and Carothers pertain to or include executing user queries on databases and it would have been obvious to combine the more specific operations of Carothers with the system of the combination to provide for more details regarding database searches.  This combination falls under simple substitution of one known element for another to obtain predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 13-14, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Romero in view of Orr.
Regarding Claim 13, Romero teaches (Claim 13 is similar to Claim 5 with minor differences in the language.)
13.     A non-transitory computer-readable storage medium storing thereon executable instructions that, as a result of being executed by one or more processors of a computer system, [Romero, hardware in [0031] and Figure 1B.]
cause the computer system to:
receive a dataset associated with a customer, the data set including a table having a plurality of data fields described in first metadata and a natural language query for the dataset, the natural language query including a set of natural language terms associated with a subset of data fields of the plurality of data fields of the table; [Romero, Figure 3 showing the “Multi-Tenant Database 16” and the input NL query and the output response and Figure 4 showing the connecting of NL phrases to the various Tenant Datasets in the “Multi-Tenant Database 16” by metadata.  Each Tenant-Specific Dataset is a Subset of the entire “database 16.”  [120], [0020], [0040]-[0041].  “[0048] … Objects 120 may be stored as structured data in database 16, such as in tables 140 that include columns 142….”  “[0051] … Database system 16 identifies user permissions 138, tables, records, objects, metadata….”]
determine a set of insights for the dataset based at least in part on the plurality of data fields; [Romero teaches that values of data fields are obtained and provided to the pertinent user.  See Figure 3 showing the types of information and Figure 6 showing the providing of the result.  “Insight” until defined is a very broad term.  Anything can be an “insight.”  For example, “insight” is mapped to the input parameters of the user in Figure 7 at either step 704 or step 708.]
determine second metadata for the dataset based at least in part on a subset of natural language terms of the set of natural language terms, where members of the subset of natural language terms are associated with data type information of one a data field of the plurality of data fields of the dataset described in the second metadata; and [Romero, Figure 3, the “second metadata” identifies the dataset for one user/tenant and defines the various Tables / “subsets of data fields of the datasets.”  Figure 3, “Account Table 140” includes Data Types 142A, 142B, 142C each having its own “Data Type” (numeric for phone numbers and alphabetic for values of the Account Category) and each tied with metadata to some NL value such as “Open” and “Closed.”]  
generate a response to the natural language query by at least: [Romero, Figure 6, “Natural Language Response 310.”]
processing the natural language query into a set of tokens, the set of tokens generated based at least in part on information included in the natural language query and the second metadata; [Romero, Figure 6, “natural language processor 304” and “validator 302” parse the natural language query (parsing generates tokens) and do so based on the relevant datasets and data fields indicated by the metadata associated with the NL query.  [0128]-[0129].]
generating a query using the second metadata and the set of tokens, the query defining an analysis to perform, based at least in part on the set of insights, on the subset of data fields of the plurality of data fields of the table of the dataset to satisfy the query; [Romero, Figure 6, the “Query Processor 300” generates a “database query 108” / “executable query” and sends it to the proper dataset/table which is identified by the second metadata which identifies the fields such as open or closed 142C.  [0131].  Figure 4, 200C:  “Apply Natural Language Processing to Convert User Input into Structured Database Query or Action.”  The queries are generated based on the data fields / insights identified in the user input.]
executing the query on the dataset to produce a result; [Romero, Figure 6, “query results 301.”  Figure 4, 200F: “Submit Structured Query to Multi-tenant Database And Send Results to User.”]
generating a natural language result, based at least in part on the natural language query, the result, and the second metadata by at least converting the result to the natural language result, the natural language result including natural language term of the subset of natural language terms corresponding to the data field, where the data type information of the data filed is included in the result and the natural language term defines the data type information; and [Romero, Figure 3 the Answer 124 includes “Five Accounts are Open.”  Five and Open and Accounts are associated with value, value, and data field of one of the tables in the data set belonging to the tenant John Smith.  See also Figure 6.]
transmitting the natural language result to an endpoint associated with the customer. [Romero, Figure 6, “Natural Language Response 310.”]

Parsing into Tokens is not express in Romero, and Orr is combined under the rationale provided for Claim 1.

Regarding Claim 14, Romero teaches (see also Claim 11):
14.  The non-transitory computer-readable storage medium of claim 13, wherein the instructions that cause the computer system to determine metadata for the dataset further comprise instructions that, as a result of being executed by the one or more processors, cause the computer system to:
identify a name associated with the data field of the plurality of data fields based at least in part on the second metadata; and [Romero, Figure 3, name “Accounts” associated with “Account_Table” in the Metadata 132A.]
determine a data type of the data field based at least in part on the name and the data type information. [Romero, the data type is determined from Metadata Table 132A.]

Regarding Claim 18, Romero teaches:
18. The non-transitory computer-readable medium storage medium of claim 13, wherein the instructions further comprise instructions that, as a result of being executed by the one or more processors, cause the computer system to:
identify a name for the data field in the dataset; [Romero, Figure 3, in the database 16, the “data field” in Metadata 132A is “Account_Table.”]
identify a first token of the natural language query that matches the name; [Romero, Figure 3, in the query 104, the tenant/user John Smith asks:  “How many Accounts are Open?”  The name is “Accounts.”]
identify a second token of the natural language query that represents an operation; and [Romero, “Open”  or “How Many are Open” would be the “second token.”]
perform the operation on the data field of the dataset. [Romero performs the operation of determining How Many are Open on the name Accounts and comes back with the result that Five are Open.]

Regarding Claim 19, Romero does not teach a parse tree of the query.
Orr teaches:
19.     The non-transitory computer-readable storage medium of claim 13, wherein the instructions further comprise instructions that, as a result, of being executed by the one or more processors, cause the computer system to:
generate a set of nodes arranged in a parse tree that represents the executable query; and [Orr in Figure 7B at the NLP stage 732, maps the tokens of the input query to nodes of a subset of the Ontology 760, shown also in Figure 7C, and thus teaches “generate a set of nodes arranged in a parse tree” of the Claim.]
execute the query by at least performing an operation for a node in the set of nodes. [Orr, Figure 7B, the “structured query” output of the “NLP Module 732” which includes the mapping of the input query to a subset of the “ontology 760” / “parse tree” is input to the “task flow processing module 736” for execution of the task that was requested by the query.   “[0242] Once task flow processing module 736 has completed the structured query for an actionable intent, task flow processing module 736 can proceed to perform the ultimate task associated with the actionable intent. Accordingly, task flow processing module 736 can execute the steps and instructions in the task flow model according to the specific parameters contained in the structured query. For example, the task flow model for the actionable intent of "restaurant reservation" can include steps and instructions for contacting a restaurant and actually requesting a reservation for a particular party size at a particular time….”]
Rationale for the combination is similar to that provided for Claim 4.

Regarding Claim 20, Romero teaches:
20.  The non-transitory computer-readable storage medium of claim 13, wherein the instructions that cause the computer system to generate the natural language result further comprise instructions that, as a result of being executed by the one or more processors, cause the computer system to:
identify the data field that is represented in the result; [Romero, Figure 3, in the database 16, the “data field” in Metadata 132A is “Account_Table.”]
determine a name that is associated with the data field; and [Romero, Figure 3, in the query 104, the tenant/user John Smith asks:  “How many Accounts are Open?”  The name is “Accounts.”]
generate a natural language term from the result using the name to include in the natural language result. [Romero performs the operation of determining How Many are Open on the name Accounts and comes back with the result that “Five Accounts are Open.”  Figure 3, 124.]

Claims 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Romero in view of Orr and further in view of Tsiartas (U.S. 2017/0084295).
Regarding Claim 15, Romero teaches audio input.  ([0043] and [0128].)
Orr teaches:
15.     The non-transitory computer-readable medium storage medium of claim 13, wherein the instructions further include instructions that, as a result of being executed by the one or more processors, cause the system to:
receive, at an audio interface, an audio stream representing a conversation between a first person and a second person; [Orr, Figure 7B, “speech input.”]
identify the natural language query in the audio stream; and [Orr, Figure 7B, “STT Processing 730” generating “Token Sequence” which after “NLP 732” yields the “structured query.”]
play the natural language result generated from the result via the audio interface. [Orr, Figure 7B, “Responses” output from “Speech Synthesis Module 740.”]
Romero and Orr pertain to executing user queries on databases and it would have been obvious to combine the speech interface of Orr with the system of Romero, which mentions such an interface but does not go into detail, particularly regarding the output of the results, in order to provide hands-free input and output.  This combination falls under simple substitution of one known element for another to obtain predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Romero/Orr does not teach that the speech from which the query is extracted is obtained from a conversation between two persons.  For this Claim, it is immaterial whether the input speech comes from one person or from a conversation between two people.  Nevertheless, and for completeness, it is noted that Orr does not teach that the input is taken from a conversation.
Tsiartas teaches:
receive, at an audio interface, an audio stream representing a conversation between a first person and a second person; [Tsiartas,  Figure 4B, “Diarization Module 458.”  Diarization means finding which part of a conversation is coming from which participant.  “[0089] A diarization module 458 and/or automatic speech recognition (ASR) module 460 may also operate on the speech information prior to its being input to the speech feature extraction module 412. The diarization module 458 may identify speaker turns; that is, given the speech segments identified by the audio segmentation module 456, the diarization module 458 may identify which of those segments are likely spoken by the same speaker (without indicating the identities of the different speakers). For instance, if the speech sample is a recording of a two-person conversation, the diarization module 458 may tag the speech segments as spoken by either "speaker A" or "speaker B". ….”  “[0069] The storage of such meta-data information may be used in conjunction with natural language understanding (NLU) queries to produce both data lookups from natural queries as well as system analyses for a query. For example, a user may perform a query such as: "show me the information for jane doe for this week compared to last week."”]
Romero and Orr and Tsiartas pertain to or include executing user queries on databases and it would have been obvious to combine the diarization feature of Tsiartas which classifies the turns of speech according to their source/speaker with the speech interface of Romero/Orr in order to provide the possibility to identify the participants to the conversation whose input speech is used for generation of the query.  This combination falls under simple substitution of one known element for another to obtain predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Regarding Claim 16, Romero teaches:
16.     The non-transitory computer-readable medium storage medium of claim 15, wherein the instructions further comprise instructions that, as a result of being executed by the one or more processors, cause the computer system to:
identify the first person and the second person; [Romero does not teach identifying participants to a conversation.  However, it does teach that different users have different accounts and must be identified.]
identify one or more datasets that are associated with the first person or the second person; [Romero, Figure 5, 250C: Permission to access a particular database.  See also Figure 2.]
determine a characteristic of the one or more datasets; and [Romero, the characteristics could be any aspect of the tenant-specific databases.  See Figure 3 for contents of these databases.]
provide a description of the characteristic. [Romero, Figure 5, 250G.  Display results.]
Romero and Orr do not address a conversation or identifying the participants to the conversation.
Tsiartas teaches:
…
identify the first person and the second person; [Tsiartas teaches diarization and diarization means identifying the participants to a conversation.  Figure 4B, “Diarization Module 458.”  “[0089] … For instance, if the speech sample is a recording of a two-person conversation, the diarization module 458 may tag the speech segments as spoken by either "speaker A" or "speaker B". ….”]
Romero and Orr and Tsiartas pertain to or include executing user queries on databases and it would have been obvious to combine the diarization feature of Tsiartas, which identifies the source/speaker of an input, with the system of the combination in order to provide the possibility to identify a user requesting the query or other subjects of the query according to identities of the participants to the speech.  This combination falls under simple substitution of one known element for another to obtain predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Regarding Claim 17, Romero mentions audio input and output ([0043]) but is not express.
Orr teaches:
17.     The non-transitory computer-readable medium storage medium of claim 16, wherein the instructions further comprise instructions that, as a result of being executed by the one or more processors, cause the computer system to:
generate an output audio stream describing the characteristic; and [Orr, Figure 7B, “Speech Synthesis Module 740” and output of Responses as speech.]
play the output audio stream to the audio interface. [Orr, Figure 7B output of Responses.  And Figure 2, “[0059] Audio circuitry 210, speaker 211, and microphone 213 provide an audio interface between a user and device 200….”]
Rationale as provided for Claim 13.  Addition of a speech input/output interface to the GUI of Romero makes hands-free use possible and is desirable.

Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Romero in view of Orr and further in view of Carothers.
Regarding Claim 21, Romero does not expressly teach any of the types of output enumerated.  Neither does Orr.  
Carothers teaches:
21. The non-transitory computer-readable storage medium of claim 13, wherein the set of insights include at least one of: statistical measures, relationships, minimum values, maximum values, trends, and other characteristics of the dataset. [Carothers, Figure 9 is one example of the result that is output to the user.  The graph shown in Figure 9 shows a “data trend” of the Claim.  “… some embodiments, a legal analytics platform retrieves legal data from an electronic database, analyzes some or all of the legal data, and identifies interesting patterns and results of statistical analyses. …”    Abstract.  “[0072] At block 706, the GUI can display a search result to the user. The search result may include legal data, selectable hyperlinks to legal data, a textual summary, a graphical summary (e.g., a chart), etc. For example, the search result may include a textual summary 904 and a graphical summary 902, as shown in FIG. 9, for a portion or subset of legal data specified by the user….”  Figure 17, “[0087] … In some embodiments, the minimum value 1710 and the maximum value 1712 are determined after excluding outliers 1714, which lie outside of the box 1706 by a distance of more than 1.5 times the width of the box (i.e., difference between upper quartile and lower quartile).”  Figure 15 has a bar chart showing the comparison from year to year.]
Romero and Orr and Carothers pertain to or include executing user queries on databases and it would have been obvious to combine the more specific operations of Carothers with the system of combination to provide for more details regarding database searches.  This combination falls under simple substitution of one known element for another to obtain predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141, KSR, 550 U.S. at 418, 82 USPQ2d at 1396.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARIBA SIRJANI whose telephone number is (571)270-1499.  The examiner can normally be reached on 9 to 5, M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Desir, can be reached at 571-272-7799.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Fariba Sirjani/
Primary Examiner, Art Unit 2659