DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 08/24/2022 has been entered.

This office action is a response to an application filed 08/24/2022 wherein claims 1-20 are pending and ready for examination.

Response to Arguments
Applicant’s arguments, see Remarks, filed 08/24/2022, with respect to the rejection(s) of claim(s) 1-3, 6-9, 11-12 and 14-18 under 35 U.S.C. § 103 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Jayaraman; Baskar et al.

Summary of Examiner Interview.  
Applicant:  Applicant’s representative, Daniel Kwok (Reg. No. 69,042), had a telephonic
interview with Examiner Jones on April 18, 2022. During the interview, the Examiner
suggested incorporating the subject matter from paragraph [0014] of the Specification into
the independent claims, and indicated that the incorporation of the subject matter would
likely overcome the current §103 rejections, but that additional review and further searching
may be required. The claims are amended herein based on what was discussed during the
interview.  Applicant’s representative, Daniel Kwok (Reg. No. 69,042), had a follow-up
interview with Examiner Jones on July 27, 2022. During the interview, the Examiner
brought up several additional references, such as U.S. Patent Number 10/095,688 (Schilling)
(col. 2) and U.S. Patent 11/269,859 (Luedtke) (col. 38). However, the Examiner and
Applicant’s representatives agreed that neither reference teaches the limitation added to the
independent claims. Applicant thanks Examiner Jones for conducting the interviews and advancing prosecution, and would welcome a further discussion if it would expedite allowance of this case.

Examiner Response:  The Examiner thanks applicant representative for working to advance the prosecution of the application.  The parties discussed the subject matter found in the originally filed specification at paragraph [0014].  The Examiner did not find that the amendment incorporates, from the instant specification, the portion deemed to be absent from the prior art of record; namely, the construction of a user profile extracted from database queries and the extraction of at least one of a length, format, or styling of the database query. For example, prior art Hsiao at [0027] clearly teaches a format of the database query since an ‘action set’ has a particular format or structure used by access devices for phone and location identification.   As noted by applicant representative, the Examiner committed to reconsider the prior art and perform an additional search, and in doing so discovered Jayaraman.  The Examiner thanks applicant representative for working to advance the prosecution and would encourage an additional review of the disclosure, particularly the features whereby query results of applications are used to construct a profile of a user.

Applicant Asserts:  Applicant respectfully submits that the Hsiao and Karunaratne, alone or in
combination, fail to disclose or suggest all of the features recited in amended claim 1.
Specifically, Hsiao and Karunaratne fail to disclose or suggest the limitations of “extracting,
from each database query in the first plurality of database queries, a corresponding set of features representing different characteristics of the database query, wherein the different characteristics comprise at least one of a length of the database query, a format of the database query, or a styling of the database query; creating a set of artificial intelligence (AI) training data based on the corresponding sets of features” as recited in amended claim 1.
In particular, Hsiao teaches that “behavior vectors... may be generated by the
behavior analysis engine 122 based upon the queries, actions, action attributes, etc., as
recorded by the log of actions 120 to characterize the behavior of the application... the
global classifier 628 may classify a received behavior vector set for an application from a
computing device 602 that has identified an application as having suspicious behavior (e.g.,
block 612) as benign or malware” (Hsiao, paragraphs [0032] and [0043]), but fails to teach
at least the limitations as set forth above.

Examiner Response:  Respectfully, the Examiner disagrees because ‘obtaining’ is taught by Jayaraman as ‘using the ML model’ whereas the claimed ‘first output vector’ is taught by Jayaraman as ‘first textual record’.  The claimed ‘vector space’ is taught by Jayaraman as ‘first semantically-encoded vector space’. Therefore Jayaraman teaches the missing limitations of Hsiao.

Amendment to the claims
The Examiner thanks applicant representative for diligently working to advance the prosecution of this application.  Seldom has the Examiner experienced such rigor in advocacy for the invention.  The parties have met twice, and both times prosecution was deemed to have advanced.  Here, applicant has amended independent claims 1, 11, and 15, as underlined, to include:
	extracting, from each database query in the first plurality of database queries, a
corresponding set of features representing different characteristics of the database
query, wherein the different characteristics comprise at least one of a length of the database
query, a format of the database query, or a styling of the database query;
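
For illustration only, the three query characteristics named in the limitation above (length, format, and styling) can be pictured as simple measurements taken from the text of a query. The sketch below is a hypothetical Python example; the function name `extract_query_features`, the keyword list, and the particular measurements chosen are assumptions made for this illustration and are not drawn from the instant specification or from any reference of record.

```python
import re

# Hypothetical keyword list used only for this illustration.
SQL_KEYWORDS = {"select", "from", "where", "join", "group", "by", "order", "limit", "and", "or"}


def extract_query_features(query: str) -> dict:
    """Return assumed length, format, and styling measurements for one query string."""
    # Alphabetic tokens only; numbers and operators are ignored in this sketch.
    tokens = re.findall(r"[A-Za-z_]+", query)
    keywords = [t for t in tokens if t.lower() in SQL_KEYWORDS]
    upper_keywords = [t for t in keywords if t.isupper()]
    return {
        # Length: raw character count and token count.
        "length_chars": len(query),
        "length_tokens": len(tokens),
        # Format (assumed proxy): how many lines the author spread the query over.
        "format_line_count": query.count("\n") + 1,
        # Styling (assumed proxy): share of keywords written in upper case.
        "style_upper_keyword_ratio": (
            len(upper_keywords) / len(keywords) if keywords else 0.0
        ),
    }


features = extract_query_features("SELECT id\nFROM users WHERE age > 30")
```

Under these assumptions, two users who habitually write `select ... from ...` on one line versus `SELECT ...` across several lines would produce visibly different feature sets for otherwise equivalent queries.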

As previously noted in the above Interview section, the Examiner finds that Hsiao can query multiple corresponding “action sets” which are the result of database queries.  Action sets are formatted to identify the user device and location, and that formatting is a characteristic defined by a particular format, such as the action set format of Figure 3. The Examiner in the previous Final rejection Office Action of 08/02/2022 cited prior art Hsiao as teaching the claimed limitation as: ‘extracting ...database query’ is taught by Hsiao as ‘extracted from logging’ since the log records the database queries.  One of ordinary skill in the art recognizes that a log is a database, and to query the database is to ‘extract’ the results.

	Behavior Vectors
Regarding applicant’s assertion on behavior vectors, the Examiner is not sure as to the point being asserted here since no claims are directed to the term behavior vector, even though one of ordinary skill in the art could argue that the last two limitations of claim 1 give rise to the unclaimed behavior vector.   Even the instant specification at [0015] discloses:
Queries for each of a number of users can be mapped to a vector space such that each user's queries are clustered relatively close together, but have some distance between queries by other users. When a new sample query is received, it can be assessed (through a trained machine learning classifier) to determine its level of similarity with previous queries by that user, as well one or more other users who may behave similarly to the querying user (e.g. nearby neighboring users within a vector space) and one or more other users who do not write similar queries to the querying user (e.g. one or more distant users within the vector space).
Here, the prior art Hsiao is teaching the same behavior vector even though the claims do not convey this idea.
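
The arrangement described in the quoted paragraph, in which the queries of each user cluster together in a vector space, can be pictured with a minimal sketch. It assumes that every query has already been mapped to a numeric vector and labeled with a user identity; the function name `user_centroids` and the sample data are hypothetical and are not taken from the specification or from Hsiao.

```python
from collections import defaultdict


def user_centroids(vectors, labels):
    """Average the vectors belonging to each user label (assumed representation)."""
    grouped = defaultdict(list)
    for vector, user in zip(vectors, labels):
        grouped[user].append(vector)
    centroids = {}
    for user, rows in grouped.items():
        # zip(*rows) walks the vectors column by column, one dimension at a time.
        centroids[user] = [sum(column) / len(rows) for column in zip(*rows)]
    return centroids


# Hypothetical two-dimensional vectors for two users.
sample_vectors = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0), (12.0, 10.0)]
sample_labels = ["user_a", "user_a", "user_b", "user_b"]
centroids = user_centroids(sample_vectors, sample_labels)
```

In this toy data the two users occupy well separated regions, which is the situation the quoted paragraph describes as queries being "clustered relatively close together" per user.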

	Karunaratne
Applicant Asserts: Karunaratne teaches “storing a plurality of hyper-dimensional profile vectors and... determining a distance between the hyper-dimensional query vector and the plurality of hyperdimensional profile vectors” (Karunaratne, paragraph [0097]), but fails to cure the deficiency of Hsiao as discussed above.

Examiner Response: The Examiner, upon further review, removed the combination of Hsiao and Karunaratne and introduced new art, Jayaraman, to teach obtaining a first output vector in the vector space.

	Li
Applicant Asserts:  The Office Action cited to Li, paragraph [0102] for rejecting a similar limitation in previously presented claim 5. However, Li teaches “the word2vec models... a set of
language modeling and feature learning techniques in natural language processing (NLP)
where words or phrases from the vocabulary are mapped to vectors of real numbers” and
does not teach that “different characteristics of the database query... comprise at least one
of a length of the database query, a format of the database query, or a styling of the database
query” are used for “creating a set of artificial intelligence (AI) training data” as recited in
amended claim 1.

Examiner Response:  Respectfully, the Examiner does not agree with applicant representative’s characterization of prior art Li.  The Examiner introduced Li for determining if a first centroid for queries associated with the first user is within K closest neighboring centroids to the first output vector, where K is a predefined integer greater than one, because Li, at location [0145], teaches “At procedure 304, the cluster center detector 232 calculates the K-density for each vector using the distances calculated in procedure 302. The K-density for each vector…is a predetermined positive integer. In certain embodiments, K is set in a range of 10-100…The K vectors are K nearest neighbors to the vector i.”  Here, the claimed ‘first centroid’ is taught by Li as ‘the vector i’ since it is detected by cluster center detector 232 whereas the claimed ‘first output vector’ is taught by Li as ‘The K vectors’.  Nowhere did the Examiner cite Li to teach the different query types or characteristics.  It is at least for these reasons that the Examiner remains unpersuaded and maintains the previous office action rejection.
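
The determination discussed above, whether a first user's centroid is within the K closest neighboring centroids to an output vector, can be sketched as follows. This is a hedged illustration only: it assumes Euclidean distance and a dictionary of per-user centroids, and the function name `centroid_within_k_nearest` and the sample values are hypothetical rather than taken from Li or from the claims.

```python
import math


def centroid_within_k_nearest(output_vector, centroids, user, k):
    """Return True if `user`'s centroid is among the k centroids nearest to output_vector."""
    # Rank user labels by the Euclidean distance of their centroid to the vector.
    ranked = sorted(centroids, key=lambda u: math.dist(output_vector, centroids[u]))
    return user in ranked[:k]


# Hypothetical centroids for three users.
example_centroids = {
    "alice": (0.0, 0.0),
    "bob": (1.0, 1.0),
    "carol": (10.0, 10.0),
}
in_neighborhood = centroid_within_k_nearest((0.2, 0.1), example_centroids, "alice", k=2)
```

With K set to 2 in this toy data, a vector near the origin has `alice` and `bob` as its nearest centroids, so a query attributed to `carol` would fall outside the neighborhood.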

	Other prior art
Applicant Asserts: When discussing the limitations above with the Examiner during a follow-up interview on July 27, 2022, the Examiner brought up several additional references, such as U.S. Patent Number 10/095,688 (Schilling) (col. 2) and U.S. Patent 11/269,859 (Luedtke) (col. 38). Applicant respectfully asserted that neither reference teaches the limitations set forth above, as agreed on during the follow-up interview.

Applicant further submits that while Gu were cited in the Office Action as allegedly
disclosing various features from dependent claims, they do not cure the deficiencies of
Hsiao and Karunaratne, even assuming that a motivation to combine the references exists (which Applicant does not concede).

Examiner Response:  The Examiner thanks applicant representative for working to advance the prosecution of this application and invites further discussion based on applicant’s review of the above prior art of record in a continued effort to advance prosecution.  The Examiner maintains the rejections of claims 1-20, as amended. 


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-3, 5, 11-12, and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Hsiao; Hsu-Chun et al., US 20130247187, September 19, 2013, hereafter referred to as Hsiao in view of Jayaraman; Baskar et al., US 20200349183 A1, November 11, 2020, hereafter referred to as Jayaraman.

As to claim 1, Hsiao teaches a method – Hsiao [0008] ... invention may relate to an apparatus and method for a computing device to determine if an application is malware, comprising:
accessing a first plurality of database queries executed by a plurality of different users on one or more databases - Hsiao [0042] … a system 600 including a server 620 may be utilized to aggregate behavior reports from a crowd of computing devices 602.  Although only one computing device 602 is shown, the hereinafter described aspects relate to a plurality or crowd of computing devices 602. Here, the claimed ‘accessing’ is taught by Hsiao as ‘aggregate’ because gathering the queries requires having access to the queries.  The claimed ‘a first’ is taught by Hsiao as ‘a crowd’ which designates a singular group with no mention of a previous group.  The claimed ‘plurality of database queries’ is taught by Hsiao as ‘aggregate reports’ since the reports are from the query logger recording device 602 actions.  It is noted that device 602 are plural devices.   The claimed ‘databases’ is taught by Hsiao as ‘devices 602’);
extracting, from each database query in the first plurality of database queries, a
corresponding set of features representing different characteristics of the database query - Hsiao [0026] … the behavior vectors 130 may be used in a behavioral analysis framework to detect malware on computing devices. The resulting behavior vectors 130 include the objective observations extracted from logging. As an example, the behavior analysis engine 122 answers queries 210 regarding actions (e.g., "application installation without the user's consent?", "should the application behave like a game?", "should the website act like news?", "should the application be processing SMS messages?", "should the application be processing phone calls?" etc.).  Here, the claimed ‘extracting ...database query’ is taught by Hsiao as ‘extracted from logging’ since the log records the database queries.  The claimed ‘corresponding features’ is taught by Hsiao as ‘game, news, or SMS’ since these features can be used to profile the user/per instant specification [0014].  The claimed ‘characteristics...query’ is taught by Hsiao as ‘actions’ as the actions of the application are used to characterize the query): wherein the different characteristics comprise at least one of a length of the database query, a format of the database query, or a styling of the database query – Hsiao [0027] ... each action may be associated with one or more of four types of queries 310: existence query, amount query, order query, and category query. For example, an existence query 310 may refer to the existence of an action set. As an example of this query, the query may be to determine whether an application has accessed device information (e.g., has phone information been accessed, has location information been accessed, etc.).  Here, the claimed ‘format of the database query’ is taught by Hsiao as ‘existence of an action set’ whereby an action set has a particular format/structure such as use of access devices for phone and location);
creating a set of artificial intelligence (AI) training data – Hsiao [0053] ... the behavior analysis engine 122 answers queries 210 regarding actions (e.g., "application installation without the user's consent?", "should the application behave like a game?", "should the website act like news?", "should the application be processing SMS messages?", "should the application be processing phone calls?" etc.). The answers to these queries 210 create the behavior vectors 130. Here, the claimed ‘creating’ is taught by Hsiao as ‘queries 210’ since the behavior analysis engine 122 uses the queries as construction filters to create the training data) based on the corresponding set of features – Hsiao [0054] ... the computing device 602 periodically monitors and computes a behavior vector 610 utilizing behavior analysis engine 608 for each running application and by utilizing a classifier may determine whether this application behaves similar to malware or benign applications. Here, the claimed ‘corresponding set of features’ is taught by Hsiao as ‘application behaves similar to malware’ since the data is used by the classifier to identify the corresponding features among devices 602);
training a machine learning (ML) classifier using the set of AI training data - Hsiao [0053] ... As to initialization a behavior analysis engine and classifier (e.g., for a computing device 602) may be trained by a set of known-bad applications.  Here, the claimed ‘training’ is taught by Hsiao as ‘may be trained’ whereas the claimed ‘machine learning classifier’ is taught by Hsiao as ‘classifier’.  The claimed ‘set of AI training data’ is taught by Hsiao as ‘known bad applications’) and using corresponding labels for the first plurality of database queries wherein each of the corresponding labels indicates an identity of one of the plurality of different users that is associated with a corresponding query - Hsiao [0038] … a behavior vector 529 is generated by the behavior analysis engine based on analyzing the log of actions from the query logger. The behavior vectors are simplified as being frequent use 530, rare use 532, and no use 534. As a numerical example, a behavior vector of around 5 may designate frequent use, a behavior vector of around 1-2 may designate rare use, and a behavior vector of around 0 may designate no use.  Here, the claimed ‘corresponding labels’ is taught by Hsiao as at least ‘rare use 532’ whereas the claimed ‘users’ is taught by Hsiao as ‘no use’ since the application is associated with the device which is associated with the user), and wherein the trained ML classifier is configured to produce vector outputs in a vector space in response to receiving database queries - Hsiao [0038] For each of these applications, a behavior vector 529 is generated by the behavior analysis engine based on analyzing the log of actions from the query logger. The behavior vectors are simplified as being frequent use 530, rare use 532, and no use 534.  Here, the claimed ‘vector outputs’ is taught by Hsiao as ‘vector 529 is generated’ whereas the claimed ‘in response...queries’ is taught by Hsiao as the ‘query logger’ since the query logger represents the plurality of database queries from the crowd of devices 602);
extracting, by a computer system, from a first database query associated with a first user - Hsiao [0031] Further, a wide variety of different types of actions 320: application installation, device information, communications, user interaction, access device information, start at boot, user data, package installation, sensor, location, media, camera, SMS, phone call, and phone information (block 322). Here, the claimed ‘first database query’ is taught by Hsiao as ‘actions 320’ whereas the claimed ‘associated with a first user’ is taught by Hsiao as ‘user interaction’); a first set of features representing the different characteristics of the first database query - Hsiao [0047] …as shown in FIG. 6, a query logger 604 of a computing device 602 may log the behavior of an application to generate of a log of actions 606. Next, the behavior analysis engine 608 of the computing device 602 may analyze the log of actions 608 to generate a behavior vector set 610 that characterizes the behavior of the application. Here, the claimed ‘first database query’ is taught by Hsiao as ‘log of actions 606’ whereas the claimed ‘first particular user’ is taught by Hsiao as ‘device 602’);
obtaining a first output vector in the vector space based on providing the first set of features to the trained ML classifier - Hsiao [0047] a query logger 604 of a computing device 602 may log the behavior of an application to generate of a log of actions 606 ...if the classifier of the computing device 602 does not find that the behavior vector set indicates anything suspicious about the application (e.g., it has a low likelihood of being malware). Here, the claimed ‘obtaining’ is taught by Hsiao as ‘may log’ whereby the query logger records queries.  The claimed ‘first output vector’ is taught by Hsiao as ‘log of actions 606’ as these actions have been aggregated or vectorized.  The claimed ‘vector space’ is suggested by Hsiao as ‘vector set’ because a plurality of vectors are grouped into one space); and
based on the first output vector, determining by the computer system if the first database query represents a data access anomaly – Hsiao [0054] … behavior vector 610 utilizing behavior analysis engine 608 for each running application and by utilizing a classifier may determine whether this application behaves similar to malware or benign applications.  Here, the claimed ‘data access anomaly’ is taught by Hsiao as ‘malware’.  HSIAO SUGGESTS obtaining a first output vector in the vector space, HOWEVER IN AN ANALOGOUS ART THAT IS DIRECTED TO THE SAME FIELD OF ENDEAVOR JAYARAMAN TEACHES obtaining a first output vector in the vector space – Jayaraman [0224] ... using the ML model to determine word vectors that describe, in a first semantically-encoded vector space, a meaning of respective words of the first textual record and comparing the word vectors to at least one of a location or a volume, within the first semantically-encoded vector space,   Here, the claimed ‘obtaining’ is taught by Jayaraman as ‘using the ML model’ whereas the claimed ‘first output vector’ is taught by Jayaraman as ‘first textual record’.  The claimed ‘vector space’ is taught by Jayaraman as ‘first semantically-encoded vector space’. Thus, it would have been recognized by one of ordinary skill in the art before the effective filing date of the claimed invention that applying the known technique of baselining the vector space using the sample vector, as taught by Jayaraman, would have yielded predictable results whereby a vector space is created using the first vector as a baseline.  Hsiao suggests this feature but Jayaraman positively teaches baselining the vector space.  This combination of Hsiao and Jayaraman improves the detection of both malware and access anomalies since the vectors are statistically discerned to the user).

As to claim 2, the combination of Hsiao and Jayaraman teaches the method of claim 1 wherein the determining if the first database query represents a data access anomaly comprises:
determining a first distance between the first output vector and a first centroid for queries associated with the first user - Jayaraman [0195] ... a record could be precluded from assignment to a particular cluster unless a degree of similarity between the cluster and the record is greater than a threshold similarity. This could include a distance between the location of the record and a centroid or other characteristic location of the cluster being less than a threshold distance. Here, the claimed ‘determining’ is taught by Jayaraman as ‘precluded’ which requires an assessment of query vectors.  The claimed ‘first output vector’ is taught by Jayaraman as ‘a record’ whereas the claimed ‘first centroid’ is taught by Jayaraman as ‘a centroid or other characteristic location’).  The rationale provided for considering Jayaraman with Hsiao in claim 1 applies here in claim 2.

As to claim 3, the combination of Hsiao and Jayaraman teaches the method of claim 2, wherein the determining if the first database query represents a data access anomaly further comprises:
determining if the first distance exceeds a threshold value – Jayaraman [0014] ...  determining that the particular textual record fits into the particular cluster can include determining that the similarity metric indicates that the particular textual record fits into the particular cluster to a degree that exceeds a specified threshold similarity.  Here, the claimed ‘first distance’ is taught by Jayaraman as ‘a degree’ whereas the claimed ‘threshold value’ is taught by Jayaraman as ‘threshold similarity’.  The rationale provided for considering Jayaraman with Hsiao in claim 1 applies here in claim 3 as distance measuring from a threshold).
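
The distance and threshold comparison recited in claims 2 and 3 can be pictured with a short sketch. It assumes Euclidean distance between the output vector and the first user's centroid; the function name `exceeds_threshold`, the choice of metric, and the numeric values are hypothetical and are offered only to illustrate the comparison, not to characterize Jayaraman or the claimed method.

```python
import math


def exceeds_threshold(output_vector, user_centroid, threshold):
    """Return True when the vector lies farther from the centroid than the threshold."""
    first_distance = math.dist(output_vector, user_centroid)
    # Strictly greater than: a distance equal to the threshold does not exceed it.
    return first_distance > threshold


# Hypothetical values: the vector (3, 4) is 5 units from the origin.
flagged = exceeds_threshold((3.0, 4.0), (0.0, 0.0), threshold=4.0)
```

In this sketch a query whose vector falls more than the threshold distance from its author's centroid would be the candidate for further anomaly review; how the threshold is chosen is outside the scope of the illustration.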

As to claim 5, the combination of Hsiao and Jayaraman teaches the method of claim 1, wherein the styling of the database query represents a use frequency of a particular query - Hsiao [0028] ... Further, an amount query 310 may refer to the number of occurrences of actions. As an example of this query, the query may be to determine the number of occurrence of actions by an application. As an example, this may be the number of SMS sent (e.g., outgoing communication via SMS).  Here, the claimed ‘use frequency’ is taught by Hsiao as ‘number of occurrence’ since both quantify queries).

As to claim 11, Hsiao teaches a non-transitory computer-readable medium having stored thereon instructions that when executed by a computer system cause the computer system to perform operations comprising:
providing a first database query from a first user to a trained machine learning
classifier – Hsiao [0053] ... a behavior analysis engine and classifier (e.g., for a computing device 602) may be trained by a set of known-bad applications, or malware, and a set of known-good applications. The training process may be accomplished using standard supervised machine learning techniques. Before a computing device 602 is provided to a user, the computing device 602 may be required to obtain an up-to-date behavior model for its behavior analysis engine 608 from the server 620);
receiving a first output vector in a vector space - Hsiao [0047] a query logger 604 of a computing device 602 may log the behavior of an application to generate of a log of actions 606 ...if the classifier of the computing device 602 does not find that the behavior vector set indicates anything suspicious about the application (e.g., it has a low likelihood of being malware).  Here, the claimed ‘vector outputs’ is taught by Hsiao as ‘vector 529 is generated’ whereas the claimed ‘vector space’ is suggested by Hsiao as ‘vector set’ since the query logger represents the plurality of database queries from the crowd of devices 602), wherein:
            a first plurality of database queries was executed by a plurality of different
users on one or more databases- Hsiao [0042] … a system 600 including a server 620 may be utilized to aggregate behavior reports from a crowd of computing devices 602.  Although only one computing device 602 is shown, the hereinafter described aspects relate to a plurality or crowd of computing devices 602. Here, the claimed ‘accessing’ is taught by Hsiao as ‘aggregate’ because gathering the queries requires having access to the queries.  The claimed ‘a first’ is taught by Hsiao as ‘a crowd’ which designates a singular group with no mention of a previous group.  The claimed ‘plurality of database queries’ is taught by Hsiao as ‘aggregate reports’ since the reports are from the query logger recording device 602 actions.  It is noted that device 602 are plural devices.   The claimed ‘databases’ is taught by Hsiao as ‘devices 602’);
           a set of artificial intelligence (AI) training data was created by extracting a
plurality of features from each of the first plurality of database queries - Hsiao [0026] … the behavior vectors 130 may be used in a behavioral analysis framework to detect malware on computing devices. The resulting behavior vectors 130 include the objective observations extracted from logging. As an example, the behavior analysis engine 122 answers queries 210 regarding actions (e.g., "application installation without the user's consent?", "should the application behave like a game?", "should the website act like news?", "should the application be processing SMS messages?", "should the application be processing phone calls?" etc.).  Here, the claimed ‘extracting ...database query’ is taught by Hsiao as ‘extracted from logging’ since the log records the database queries.  The claimed ‘corresponding features’ is taught by Hsiao as ‘game, news, or SMS’ since these features can be used to profile the user/per instant specification [0014].  The claimed ‘characteristics...query’ is taught by Hsiao as ‘actions’ as the actions of the application are used to characterize the query): wherein the different characteristics comprise at least one of a length of the database query, a format of the database query, or a styling of the database query – Hsiao [0027] ... each action may be associated with one or more of four types of queries 310: existence query, amount query, order query, and category query. For example, an existence query 310 may refer to the existence of an action set. As an example of this query, the query may be to determine whether an application has accessed device information (e.g., has phone information been accessed, has location information been accessed, etc.).  Here, the claimed ‘format of the database query’ is taught by Hsiao as ‘action set’ whereby an action set has a particular format/structure such as use of access devices for phone and location); and 
         the trained machine learning classifier was trained using the set of AI training
data - Hsiao [0053] ... As to initialization a behavior analysis engine and classifier (e.g., for a computing device 602) may be trained by a set of known-bad applications.  Here, the claimed ‘training’ is taught by Hsiao as ‘may be trained’ whereas the claimed ‘machine learning classifier’ is taught by Hsiao as ‘classifier’.  The claimed ‘set of AI training data’ is taught by Hsiao as ‘known bad applications’) and corresponding labels for each of the first plurality of database queries, wherein each of the labels indicates an identity of one of the plurality of different users that is associated with a corresponding database query - Hsiao [0038] … a behavior vector 529 is generated by the behavior analysis engine based on analyzing the log of actions from the query logger. The behavior vectors are simplified as being frequent use 530, rare use 532, and no use 534. As a numerical example, a behavior vector of around 5 may designate frequent use, a behavior vector of around 1-2 may designate rare use, and a behavior vector of around 0 may designate no use.  Here, the claimed ‘corresponding labels’ is taught by Hsiao as at least ‘rare use 532’ whereas the claimed ‘users’ is taught by Hsiao as ‘no use’ since the application is associated with the device which is associated with the user); and
based on the first output vector, determining if the first database query represents a data access anomaly – Hsiao [0038] ... For each of these applications, a behavior vector 529 is generated by the behavior analysis engine based on analyzing the log of actions from the query logger. The behavior vectors are simplified as being frequent use 530, rare use 532, and no use 534.  Here, the claimed ‘first output vector’ is taught by Hsiao as ‘behavior vector 529’ representing the results of analyzing the log.  The claimed ‘database query’ is taught by Hsiao as ‘log of actions’ whereby the claimed ‘data access anomaly’ is taught by Hsiao as the contextual combination of ‘frequent, rare, and no use’ as further defined at [0040] YouTube application.  HSIAO SUGGESTS receiving a first output vector in a vector space, HOWEVER IN AN ART THAT IS ANALOGOUS TO THE SAME FIELD OF ENDEAVOR JAYARAMAN TEACHES receiving a first output vector in a vector space – Jayaraman [0224] ... using the ML model to determine word vectors that describe, in a first semantically-encoded vector space, a meaning of respective words of the first textual record and comparing the word vectors to at least one of a location or a volume, within the first semantically-encoded vector space,   Here, the claimed ‘obtaining’ is taught by Jayaraman as ‘using the ML model’ whereas the claimed ‘first output vector’ is taught by Jayaraman as ‘first textual record’.  The claimed ‘vector space’ is taught by Jayaraman as ‘first semantically-encoded vector space’. Thus, it would have been recognized by one of ordinary skill in the art before the effective filing date of the claimed invention that applying the known technique of baselining the vector space using the sample vector, as taught by Jayaraman, would have yielded predictable results whereby a vector space is created using the first vector as a baseline.  Hsiao suggests this feature but Jayaraman positively teaches baselining the vector space.  This combination of Hsiao and Jayaraman improves the detection of both malware and access anomalies since the vectors are statistically discerned to the user).

As to claim 12, the combination of Hsiao and Jayaraman teaches a non-transitory computer-readable medium of claim 11, wherein the determining if the first database query represents a data access anomaly comprises: determining a first distance between the first output vector and a first centroid for queries associated with the first user - Jayaraman [0195] ... a record could be precluded from assignment to a particular cluster unless a degree of similarity between the cluster and the record is greater than a threshold similarity. This could include a distance between the location of the record and a centroid or other characteristic location of the cluster being less than a threshold distance. Here, the claimed ‘determining’ is taught by Jayaraman as ‘precluded’ which requires an assessment of query vectors.  The claimed ‘first output vector’ is taught by Jayaraman as ‘a record’ whereas the claimed ‘first centroid’ is taught by Jayaraman as ‘a centroid or other characteristic location’).  The rationale provided for considering Jayaraman with Hsiao in claim 11 applies here in claim 12).

           As to claim 16, Hsiao teaches a system - Hsiao [0010] FIG. 1 is a block diagram of a system in which aspects of the invention may be practiced, comprising:
          one or more hardware processors - Hsiao [0009] The server may include: a processing circuit to receive a plurality of behavior vector sets from a plurality of computing devices; and
          a non-transitory computer-readable medium having stored thereon instructions that when executed by the one or more hardware processors - Hsiao [0065] ... a storage media may be any available media that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, cause the system to perform operations comprising:
          accessing a first plurality of database queries executed by a plurality of different users on one or more databases - Hsiao [0042] … a system 600 including a server 620 may be utilized to aggregate behavior reports from a crowd of computing devices 602.  Although only one computing device 602 is shown, the hereinafter described aspects relate to a plurality or crowd of computing devices 602. Here, the claimed ‘accessing’ is taught by Hsiao as ‘aggregate’ because gathering the queries requires having access to the queries.  The claimed ‘a first’ is taught by Hsiao as ‘a crowd’ which designates a singular group with no mention of a previous group.  The claimed ‘plurality of database queries’ is taught by Hsiao as ‘aggregate reports’ since the reports are from the query logger recording device 602 actions.  It is noted that devices 602 are plural devices.   The claimed ‘databases’ is taught by Hsiao as ‘devices 602’);
          extracting, from each database query in the first plurality of database queries, a corresponding set of features representing different characteristics of the database query - Hsiao [0026] … the behavior vectors 130 may be used in a behavioral analysis framework to detect malware on computing devices. The resulting behavior vectors 130 include the objective observations extracted from logging. As an example, the behavior analysis engine 122 answers queries 210 regarding actions (e.g., "application installation without the user's consent?", "should the application behave like a game?", "should the website act like news?", "should the application be processing SMS messages?", "should the application be processing phone calls?" etc.).  Here, the claimed ‘extracting ...database query’ is taught by Hsiao as ‘extracted from logging’ since the log records the database queries.  The claimed ‘corresponding features’ is taught by Hsiao as ‘game, news, or SMS’ since these features can be used to profile the user per instant specification [0014].  The claimed ‘characteristics...query’ is taught by Hsiao as ‘actions’ as the actions of the application are used to characterize the query), wherein the different characteristics comprise at least one of a length of the database query, a format of the database query, or a styling of the database query – Hsiao [0027] ... each action may be associated with one or more of four types of queries 310: existence query, amount query, order query, and category query. For example, an existence query 310 may refer to the existence of an action set. As an example of this query, the query may be to determine whether an application has accessed device information (e.g., has phone information been accessed, has location information been accessed, etc.).  Here, the claimed ‘format of the database query’ is taught by Hsiao as ‘existence of an action set’ whereby an action set has a particular format/structure such as use of access devices for phone and location);
                  creating a set of artificial intelligence (AI) training data – Hsiao [0053] ... the behavior analysis engine 122 answers queries 210 regarding actions (e.g., "application installation without the user's consent?", "should the application behave like a game?", "should the website act like news?", "should the application be processing SMS messages?", "should the application be processing phone calls?" etc.). The answers to these queries 210 create the behavior vectors 130. Here, the claimed ‘creating’ is taught by Hsiao as ‘queries 210’ since the behavior analysis engine 122 uses the queries as construction filters to create the training data) based on the corresponding set of features – Hsiao [0054] ... the computing device 602 periodically monitors and computes a behavior vector 610 utilizing behavior analysis engine 608 for each running application and by utilizing a classifier may determine whether this application behaves similar to malware or benign applications. Here, the claimed ‘corresponding set of features’ is taught by Hsiao as ‘application behaves similar to malware’ since the data is used by the classifier to identify the corresponding features among devices 602);
training a machine learning (ML) classifier using the set of AI training data - Hsiao [0053] ... As to initialization a behavior analysis engine and classifier (e.g., for a computing device 602) may be trained by a set of known-bad applications.  Here, the claimed ‘training’ is taught by Hsiao as ‘may be trained’ whereas the claimed ‘machine learning classifier’ is taught by Hsiao as ‘classifier’.  The claimed ‘set of AI training data’ is taught by Hsiao as ‘known bad applications’) and corresponding labels for each of the first plurality of database queries wherein each of the corresponding labels indicates an identity of one of the plurality of different users that is associated with a corresponding query - Hsiao [0038] … a behavior vector 529 is generated by the behavior analysis engine based on analyzing the log of actions from the query logger. The behavior vectors are simplified as being frequent use 530, rare use 532, and no use 534. As a numerical example, a behavior vector of around 5 may designate frequent use, a behavior vector of around 1-2 may designate rare use, and a behavior vector of around 0 may designate no use.  Here, the claimed ‘corresponding labels’ is taught by Hsiao as at least ‘rare use 532’ whereas the claimed ‘users’ is taught by Hsiao as ‘no use’ since the application is associated with the device which is associated with the user), and wherein the trained ML classifier is configured to produce vector outputs in a vector space in response to receiving database queries - Hsiao [0038] For each of these applications, a behavior vector 529 is generated by the behavior analysis engine based on analyzing the log of actions from the query logger. The behavior vectors are simplified as being frequent use 530, rare use 532, and no use 534.  Here, the claimed ‘vector outputs’ is taught by Hsiao as ‘vector 529 is generated’ whereas the claimed ‘in response...queries’ is taught by Hsiao’s query logger, since the query logger represents the plurality of database queries from the crowd of devices 602);
extracting, from a first database query associated with a first user a first set of features representing the different characteristics of the first database query - Hsiao [0031] ... a wide variety of different types of actions 320... may be utilized by the behavior analysis engine 122 to generate behavior vectors 130. Each of these actions as recorded by the log of actions 120 may be utilized by the behavior analysis engine 122 to generate a behavior vector 130 that characterizes the behavior of the application. Here, the claimed ‘first database query’ is taught by Hsiao as ‘actions 320’ whereas the claimed ‘associated with a first user’ is taught by Hsiao as ‘user interaction’; a first set of features representing the different characteristics of the first database query - Hsiao [0047] …as shown in FIG. 6, a query logger 604 of a computing device 602 may log the behavior of an application to generate of a log of actions 606. Next, the behavior analysis engine 608 of the computing device 602 may analyze the log of actions 608 to generate a behavior vector set 610 that characterizes the behavior of the application. Here, the claimed ‘first database query’ is taught by Hsiao as ‘log of actions 606’ whereas the claimed ‘first particular user’ is taught by Hsiao as ‘device 602’);
obtaining a first output vector in the vector space based on providing the first set of features to the trained ML classifier - Hsiao [0047] a query logger 604 of a computing device 602 may log the behavior of an application to generate of a log of actions 606 ...if the classifier of the computing device 602 does not find that the behavior vector set indicates anything suspicious about the application (e.g., it has a low likelihood of being malware). Here, the claimed ‘obtaining’ is taught by Hsiao as ‘may log’ whereby the query logger records queries.  The claimed ‘first output vector’ is taught by Hsiao as ‘log of actions 606’ as these actions have been aggregated or vectorized.  The claimed ‘vector space’ is suggested by Hsiao as ‘vector set’ because a plurality of vectors are grouped into one space); and
based on the first output vector, determining by the computer system if the first database query represents a data access anomaly – Hsiao [0054] … behavior vector 610 utilizing behavior analysis engine 608 for each running application and by utilizing a classifier may determine whether this application behaves similar to malware or benign applications.  Here, the claimed ‘data access anomaly’ is taught by Hsiao as ‘malware’.  HSIAO SUGGESTS obtaining a first output vector in the vector space, HOWEVER IN AN ANALOGOUS ART THAT IS DIRECTED TO THE SAME FIELD OF ENDEAVOR JAYARAMAN TEACHES obtaining a first output vector in the vector space – Jayaraman [0224] ... using the ML model to determine word vectors that describe, in a first semantically-encoded vector space, a meaning of respective words of the first textual record and comparing the word vectors to at least one of a location or a volume, within the first semantically-encoded vector space.  Here, the claimed ‘obtaining’ is taught by Jayaraman as ‘using the ML model’ whereas the claimed ‘first output vector’ is taught by Jayaraman as ‘first textual record’.  The claimed ‘vector space’ is taught by Jayaraman as ‘first semantically-encoded vector space’. Thus, it would have been recognized by one of ordinary skill in the art before the effective filing date of the claimed invention that applying the known technique of baselining the vector space using the sample vector, as taught by Jayaraman, would have yielded predictable results whereby a vector space is created using the first vector as a baseline.  Hsiao suggests this feature but Jayaraman positively teaches baselining the vector space.  This combination of Hsiao and Jayaraman improves the detection of both malware and access anomalies since the vectors are statistically discerned to the user).
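
For illustration only, the feature-extraction limitation of claim 16 (characteristics comprising at least one of a length, a format, or a styling of the database query) can be sketched as follows. This sketch is not taken from the instant application, Hsiao, or Jayaraman. The function name `extract_features`, the keyword list, and the particular features chosen (character length, token count, upper-case keyword ratio, trailing semicolon) are assumptions made solely for this example.

```python
# Hypothetical sketch: derive simple length/format/styling features from a query string.
# The keyword set and feature names are illustrative assumptions, not from any cited reference.
SQL_KEYWORDS = {"select", "from", "where", "insert", "update", "delete", "join", "order", "by"}

def extract_features(query: str) -> dict:
    tokens = query.split()
    keywords = [t for t in tokens if t.lower() in SQL_KEYWORDS]
    upper = sum(1 for t in keywords if t.isupper())
    return {
        "length": len(query),                      # length of the query in characters
        "token_count": len(tokens),                # rough measure of query format
        "upper_keyword_ratio": upper / len(keywords) if keywords else 0.0,  # styling habit
        "ends_with_semicolon": query.rstrip().endswith(";"),                # styling habit
    }

features = extract_features("SELECT id FROM users where id = 1;")
```

Such a feature set, paired with a label identifying the submitting user, would form one training example of the kind recited in the claim.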

As to claim 17, the combination of Hsiao and Jayaraman teaches the system of claim 16, wherein the determining if the first database query represents a data access anomaly comprises: determining a first distance between the first output vector and a first centroid for queries associated with the first user - Jayaraman [0195] ... a record could be precluded from assignment to a particular cluster unless a degree of similarity between the cluster and the record is greater than a threshold similarity. This could include a distance between the location of the record and a centroid or other characteristic location of the cluster being less than a threshold distance. Here, the claimed ‘determining’ is taught by Jayaraman as ‘precluded’ which requires an assessment of query vectors.  The claimed ‘first output vector’ is taught by Jayaraman as ‘a record’ whereas the claimed ‘first centroid’ is taught by Jayaraman as ‘a centroid or other characteristic location’. Thus, it would have been recognized by one of ordinary skill in the art before the effective filing date of the claimed invention that applying the known technique of baselining the vector space using the sample vector, as taught by Jayaraman, would have yielded predictable results whereby a vector space is created using the first vector as a baseline.  Hsiao suggests this feature but Jayaraman positively teaches baselining the vector space.  This combination of Hsiao and Jayaraman improves the detection of both malware and access anomalies since the vectors are statistically discerned to the user).

 As to claim 18, the combination of Hsiao and Jayaraman teaches the system of claim 17, wherein the determining if the first database query represents a data access anomaly further comprises:
determining if the first distance exceeds a threshold value – Jayaraman [0014] ...  determining that the particular textual record fits into the particular cluster can include determining that the similarity metric indicates that the particular textual record fits into the particular cluster to a degree that exceeds a specified threshold similarity.  Here, the claimed ‘first distance’ is taught by Jayaraman as ‘a degree’ whereas the claimed ‘threshold value’ is taught by Jayaraman as ‘threshold similarity’. The rationale provided for considering Jayaraman with Hsiao in claim 17 applies here in claim 18 as distance measuring from a threshold.
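
For illustration only, the determinations recited in claims 17 and 18 (a first distance between the first output vector and a first centroid for the first user's queries, compared against a threshold value) can be sketched as follows. This sketch is not taken from the instant application or from Jayaraman. The function names, the use of Euclidean distance, and the example threshold are assumptions made solely for this example.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def exceeds_threshold(output_vector, user_vectors, threshold):
    """Return True when the output vector lies farther than `threshold`
    from the centroid of the user's previously observed vectors."""
    first_distance = math.dist(output_vector, centroid(user_vectors))
    return first_distance > threshold

# Hypothetical prior output vectors for one user, and an assumed threshold of 0.5.
history = [[0.0, 0.0], [2.0, 2.0]]
near = exceeds_threshold([1.0, 1.0], history, 0.5)
far = exceeds_threshold([5.0, 5.0], history, 0.5)
```

Under these assumptions a vector at the centroid is not flagged, while a distant vector is flagged as a candidate anomaly.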

Claims 6 – 9 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Hsiao in view of Jayaraman, and in further view of Pike; Robert et al, US 20200404016 A1, December 24, 2020, hereafter referred to as Pike.

  As to claim 6, the combination of Hsiao and Jayaraman teaches the method of claim 1, further comprising:
in response to a determination by the computer system that the first database query represents a data access anomaly – Hsiao [0044] …a computing device 602 may determine that an application's behavior is suspicious and may transmit the behavior vector set 610 for the application to the server 620 to have the server 620 analyze the behavior vector set 610. The global classifier 628 of server 620 may classify the transmitted behavior vector set 610 as benign or malware. If the behavior vector set 610 for an application is classified as malware, a malware indicator may be transmitted to the plurality of computing devices 602 and the computing devices 602 may delete 632 the application.  THE COMBINATION OF HSIAO AND JAYARAMAN DOES NOT TEACH automatically initiating a remedial action against a system account of the first user HOWEVER IN AN ANALOGOUS ART THAT IS DIRECTED TO THE SAME FIELD OF ENDEAVOR PIKE TEACHES automatically initiating a remedial action against a system account of the first user - Pike [0081] ... If the requisite approval is not received, or a disapproval is received, the multi-user authorization subsystem 234 may remove or suspend the account of that particular user from being able to approve further changes, and/or instruct the rebuild manager 236 to rebuild the authorization device used by that particular user.  Here, the claimed ‘automatically initiating’ is taught by Pike as ‘authorization subsystem 234’ since the disapproval triggers the action.  The claimed ‘remedial action’ is taught by Pike as ‘remove or suspend account’.  It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Hsiao and Jayaraman by incorporating suspending a user’s account as a remedial action as taught by Pike.  
The combination of Hsiao and Jayaraman will delete or deny access to applications on the user device but both are silent on suspending the account of the first user as a remedial action.  Suspending the user account improves system security since no additional exploits would be possible through that account). 

As to claim 7, the combination of Hsiao, Jayaraman, and Pike teaches the method of claim 6, wherein the remedial action includes transmitting an electronic alert regarding the first database query to a system administrator – Pike [0058] ... The management library 126 may perform this function by injecting one or more detection binaries executing as executables 124 within the operating system 122 which monitor the managed device 120 and operating system for any changes. If any changes fall outside an exception list of changes that are allowable/approved for the managed device 120, the one or more detection binaries may transmit an alert/message to the management library 126 or to the management system 130 indicating that a change was detected.  Here, the claimed ‘remedial action’ in this case is taught by Pike as ‘injecting one or more detection binaries’.  The claimed ‘transmitting’ is taught by Pike as ‘transmit an alert/message’.  The claimed ‘administrator’ is taught by Pike as ‘management library 126’.  The rationale to consider Pike’s remedial action for fault isolation and exception handling in claim 6 applies here in claim 7).

As to claim 8, the combination of Hsiao, Jayaraman, and Pike teaches the method of claim 6, wherein the automatically initiating the remedial action is based on a type of data being accessed by the first database query - Pike [0102] … The metric generated by each tracker is subsequently sent to the time to compliance detector 430, non-compliance enforcer 440, and measurement logger 450 for further processing. Although a particular arrangement of trackers is shown here, in other embodiments the compliance measurement set 410 may include a greater or fewer number of trackers based on the type of metrics which the SLA is intended to enforce. The rationale to consider Pike’s remedial action for fault isolation and exception handling in claim 6 applies here in claim 8).

As to claim 9, the combination of Hsiao, Jayaraman, and Pike teaches the method of claim 8, wherein the type of data includes at least one of financial data, customer data, or personally identifying information - Pike [0057] ... The changes caused by these unrestricted user accounts may be unapproved and may result in negative consequences, such as stealing confidential information (e.g., data 125 that is confidential) from storage within the managed network 100, creating a backdoor access within the network 110, causing devices on the network 110 to cease functioning, and so on.  It would have been obvious to a person of ordinary skill in the art to consider Pike’s classification of personally identifying information as security candidates for access.  The combination of Hsiao and Jayaraman does not teach this feature in the context of providing remedial action. This obvious inclusion of a feature to identify personally identifying information only enhances the overall security of the combination of Hsiao and Jayaraman).

             As to claim 14, the combination of Hsiao and Jayaraman teaches the non-transitory computer-readable medium of claim 11. THE COMBINATION OF HSIAO AND JAYARAMAN DOES NOT TEACH wherein the operations further comprise:
in response to a determination by the computer system that the first database query represents a data access anomaly, automatically initiating a remedial action against a system account of the first user, HOWEVER IN AN ANALOGOUS ART THAT IS DIRECTED TO THE SAME FIELD OF ENDEAVOR PIKE TEACHES wherein the operations further comprise:
in response to a determination by the computer system that the first database query represents a data access anomaly, automatically initiating a remedial action against a system account of the first user - Pike [0081] ... If the requisite approval is not received, or a disapproval is received, the multi-user authorization subsystem 234 may remove or suspend the account of that particular user from being able to approve further changes, and/or instruct the rebuild manager 236 to rebuild the authorization device used by that particular user.  Here, the claimed ‘automatically initiating’ is taught by Pike as ‘authorization subsystem 234’ since the disapproval triggers the action.  The claimed ‘remedial action’ is taught by Pike as ‘remove or suspend account’.  It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Hsiao and Jayaraman by incorporating suspending a user’s account as a remedial action as taught by Pike.  The combination of Hsiao and Jayaraman will delete or deny access to applications on the user device but both are silent on suspending the account of the first user as a remedial action.  Suspending the user account improves system security since no additional exploits would be possible through that account). 

          As to claim 15, the combination of Hsiao, Jayaraman, and Pike teaches the non-transitory computer-readable medium of claim 14, wherein the automatically initiating the remedial action is based on a type of data being accessed by the first database query - Pike [0064] ... If a tracked metric is measured to exceed a threshold which would trigger or eventually trigger an SLA violation, the system can take remedial action, such as rolling back affected systems, in a prompt manner.  Here, the claimed ‘type of data’ is taught by Pike as ‘SLA violation’ whereby data in the Service Level Agreement may be queried.  It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a filter for specific data types in a key log search.  The combination of Hsiao and Jayaraman does not explicitly teach automatically initiating a remedial action upon detection of the data type.  Pike provides this filter thereby enhancing the combination of Hsiao and Jayaraman to intercept access anomalies).

Claims 4, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Hsiao and Jayaraman in view of Li; Sijia (US 20200311145 A1), October 1, 2020, hereafter referred to as Li.

As to claim 4, the combination of Hsiao and Jayaraman teaches the method of claim 1. THE COMBINATION OF HSIAO AND JAYARAMAN DOES NOT TEACH wherein the determining if the first database query represents a data access anomaly comprises:
determining if a first centroid for queries associated with the first user is within K closest neighboring centroids to the first output vector where K is a predefined integer greater than one, HOWEVER IN AN ANALOGOUS ART THAT IS DIRECTED TO THE SAME FIELD OF ENDEAVOR LI TEACHES wherein the determining if the first database query represents a data access anomaly comprises:
determining if a first centroid for queries associated with the first user is within K closest neighboring centroids to the first output vector where K is a predefined integer greater than one - Li [0145] At procedure 304, the cluster center detector 232 calculates the K-density for each vector using the distances calculated in procedure 302. The K-density for each vector…is a predetermined positive integer. In certain embodiments, K is set in a range of 10-100…The K vectors are K nearest neighbors to the vector i.  Here, the claimed ‘first centroid’ is taught by Li as ‘the vector i’ since it is detected by cluster center detector 232 whereas the claimed ‘first output vector’ is taught by Li as ‘The K vectors’.  The claimed ‘K closest neighboring’ is taught by Li as ‘K nearest neighbors’. Finally, the claimed ‘predefined integer’ is taught by Li as ‘predetermined positive integer’. Thus, it would have been recognized by one of ordinary skill in the art before the effective filing date of the claimed invention that applying the known shared nearest neighborhood (SNN) technique taught by Li to the mobile device(s) 602 and server 620 of the combination of Hsiao and Jayaraman would have yielded predictable results and resulted in an improved system 600, namely, a system that would positively benefit from application of variance detection from the centroid provided by cluster center detector 232 of Li).
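
For illustration only, the claimed determination of whether the first user's centroid is within the K closest neighboring centroids to the first output vector can be sketched as follows. This sketch is not taken from the instant application or from Li, and it is not Li's K-density calculation. The function name, the dictionary of per-user centroids, and the use of Euclidean distance are assumptions made solely for this example.

```python
import math

def user_centroid_within_k(output_vector, centroids, user_id, k):
    """Return True when `user_id`'s centroid is among the k centroids
    closest to `output_vector`. `centroids` maps user id -> centroid vector."""
    if k <= 1:
        raise ValueError("K is a predefined integer greater than one")
    ranked = sorted(centroids, key=lambda uid: math.dist(output_vector, centroids[uid]))
    return user_id in ranked[:k]

# Hypothetical per-user centroids.
centroids = {"alice": [0.0, 0.0], "bob": [1.0, 0.0], "carol": [10.0, 10.0]}
alice_ok = user_centroid_within_k([0.9, 0.0], centroids, "alice", 2)
carol_ok = user_centroid_within_k([0.9, 0.0], centroids, "carol", 2)
```

Under these assumptions a query attributed to a user whose centroid is not among the K closest would be treated as a candidate data access anomaly.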

 As to claim 13, the combination of Hsiao and Jayaraman teaches the computer-readable medium of claim 11. THE COMBINATION OF HSIAO AND JAYARAMAN DOES NOT TEACH wherein the determining if the first database query represents a data access anomaly comprises:
determining if a first centroid for queries associated with the first user is within K closest neighboring centroids to the first output vector where K is a predefined integer greater than one, HOWEVER IN AN ANALOGOUS ART THAT IS DIRECTED TO THE SAME FIELD OF ENDEAVOR LI TEACHES wherein the determining if the first database query represents a data access anomaly comprises:
determining if a first centroid for queries associated with the first user is within K closest neighboring centroids to the first output vector where K is a predefined integer greater than one - Li [0145] At procedure 304, the cluster center detector 232 calculates the K-density for each vector using the distances calculated in procedure 302. The K-density for each vector…is a predetermined positive integer. In certain embodiments, K is set in a range of 10-100…The K vectors are K nearest neighbors to the vector i.  Here, the claimed ‘first centroid’ is taught by Li as ‘the vector i’ since it is detected by cluster center detector 232 whereas the claimed ‘first output vector’ is taught by Li as ‘The K vectors’.  The claimed ‘K closest neighboring’ is taught by Li as ‘K nearest neighbors’. Finally, the claimed ‘predefined integer’ is taught by Li as ‘predetermined positive integer’. Thus, it would have been recognized by one of ordinary skill in the art before the effective filing date of the claimed invention that applying the known shared nearest neighborhood (SNN) technique taught by Li to the mobile device(s) 602 and server 620 of the combination of Hsiao and Jayaraman would have yielded predictable results and resulted in an improved system 600, namely, a system that would positively benefit from application of variance detection from the centroid provided by cluster center detector 232 of Li).
As to claim 20, the combination of Hsiao and Jayaraman teaches the system of claim 16. THE COMBINATION OF HSIAO AND JAYARAMAN DOES NOT TEACH wherein the determining if the first database query represents a data access anomaly comprises:
determining if a first centroid for queries associated with the first user is within K closest neighboring centroids to the first output vector where K is a predefined integer greater than one, HOWEVER IN AN ANALOGOUS ART THAT IS DIRECTED TO THE SAME FIELD OF ENDEAVOR LI TEACHES wherein the determining if the first database query represents a data access anomaly comprises:
determining if a first centroid for queries associated with the first user is within K closest neighboring centroids to the first output vector where K is a predefined integer greater than one - Li [0145] At procedure 304, the cluster center detector 232 calculates the K-density for each vector using the distances calculated in procedure 302. The K-density for each vector…is a predetermined positive integer. In certain embodiments, K is set in a range of 10-100…The K vectors are K nearest neighbors to the vector i.  Here, the claimed ‘first centroid’ is taught by Li as ‘the vector i’ since it is detected by cluster center detector 232 whereas the claimed ‘first output vector’ is taught by Li as ‘The K vectors’.  The claimed ‘K closest neighboring’ is taught by Li as ‘K nearest neighbors’. Finally, the claimed ‘predefined integer’ is taught by Li as ‘predetermined positive integer’. Thus, it would have been recognized by one of ordinary skill in the art before the effective filing date of the claimed invention that applying the known shared nearest neighborhood (SNN) technique taught by Li to the mobile device(s) 602 and server 620 of the combination of Hsiao and Jayaraman would have yielded predictable results and resulted in an improved system 600, namely, a system that would positively benefit from application of variance detection from the centroid provided by cluster center detector 232 of Li).

Claims 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Hsiao and Jayaraman, in view of Gu; Kunlong et al, US 20150169754 A1, June 18, 2015, hereafter referred to as Gu.

As to claim 10, the combination of Hsiao and Jayaraman teaches the method of claim 1.  THE COMBINATION OF HSIAO AND JAYARAMAN DOES NOT TEACH further comprising:
determining, from a set of users, the plurality of different users for generating the set of AI training data based on a number of database queries previously submitted by each user in the set of users, wherein each user in the plurality of different users has previously submitted at least a minimum threshold number of database queries, wherein the set of users includes the first user, HOWEVER IN AN ANALOGOUS ART THAT IS DIRECTED TO THE SAME FIELD OF ENDEAVOR GU TEACHES determining, from a set of users, the plurality of different users for generating the set of AI training data based on a number of database queries previously submitted by each user in the set of users, wherein each user in the plurality of different users has previously submitted at least a minimum threshold number of database queries, wherein the set of users includes the first user - Gu [0007] Training the image relevance model for each of one or more indexed queries can include identifying a qualified query. The qualified query can be an indexed query for which at least a threshold number of images have at least a minimum relevance score for the indexed query, and at least a threshold number of user interactions have occurred with search results for the indexed query.  Here, the claimed ‘set of users’ is taught by Gu as ‘one or more indexed queries’ since the indexing is a subset from the main set of users querying the database.  The claimed ‘AI training data’ is taught by Gu as ‘training the image relevance model’ since training data is formed for each query.  The claimed ‘based on a number of database queries’ is taught by Gu as ‘qualified query’ since it must meet a minimum threshold whereas the claimed ‘previous …queries’ is taught by Gu as ‘interactions have occurred’ since the interactions are past user queries identified by a threshold. 
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to apply the use of empirical AI data with user actions to the combination of Hsiao and Jayaraman. The combination of Hsiao and Jayaraman suggests associating user interactions with AI training data using the query logger but Gu is explicit.  Applying Gu would positively benefit the query classification of the combination of Hsiao and Jayaraman by using the statistical theory provided by the query transformation of Gu).
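
For illustration only, the claimed selection of the plurality of different users (those who have previously submitted at least a minimum threshold number of database queries) can be sketched as follows. This sketch is not taken from the instant application or from Gu. The function name, the shape of the query log as (user, query) pairs, and the example threshold are assumptions made solely for this example.

```python
from collections import Counter

def select_training_users(query_log, min_queries):
    """Return the set of users with at least `min_queries` logged queries.
    `query_log` is an iterable of (user_id, query_text) pairs."""
    counts = Counter(user_id for user_id, _query in query_log)
    return {user_id for user_id, n in counts.items() if n >= min_queries}

# Hypothetical query log and an assumed minimum of two queries per user.
log = [
    ("alice", "SELECT 1"),
    ("alice", "SELECT 2"),
    ("bob", "SELECT 3"),
]
selected = select_training_users(log, 2)
```

Under these assumptions only users with enough prior queries contribute labeled examples to the training data.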

As to claim 19, the combination of Hsiao and Jayaraman teaches the system of claim 16. THE COMBINATION OF HSIAO AND JAYARAMAN DOES NOT TEACH further comprise:
determining, from a set of users, the plurality of different users for generating the set of AI training data based on a number of database queries previously submitted by each user in the set of users, wherein each user in the plurality of different users has previously submitted at least a minimum threshold number of database queries, wherein the set of users includes the first user, HOWEVER, IN AN ANALOGOUS ART THAT IS DIRECTED TO THE SAME FIELD OF ENDEAVOR GU TEACHES further comprise:
determining, from a set of users, the plurality of different users for generating the set of AI training data based on a number of database queries previously submitted by each user in the set of users, wherein each user in the plurality of different users has previously submitted at least a minimum threshold number of database queries, wherein the set of users includes the first user - Gu [0007] Training the image relevance model for each of one or more indexed queries can include identifying a qualified query. The qualified query can be an indexed query for which at least a threshold number of images have at least a minimum relevance score for the indexed query, and at least a threshold number of user interactions have occurred with search results for the indexed query.  Here, the claimed ‘set of users’ is taught by Gu as ‘one or more indexed queries’ since the indexing is a subset from the main set of users querying the database.  The claimed ‘AI training data’ is taught by Gu as ‘training the image relevance model’ since training data is formed for each query.  The claimed ‘based on a number of database queries’ is taught by Gu as ‘qualified query’ since it must meet a minimum threshold, whereas the claimed ‘previous … queries’ is taught by Gu as ‘interactions have occurred’ since the interactions are past user queries identified by a threshold.
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to apply the use of empirical AI data with user actions to the combination of Hsiao and Jayaraman. The combination of Hsiao and Jayaraman suggests associating user interactions with AI training data using the query logger, but Gu is explicit.  Applying Gu would positively benefit the query classification of the combination of Hsiao and Jayaraman by using the statistical theory provided by query transformation (Gu).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM B. JONES, whose telephone number is (571) 272-9637.  The examiner can normally be reached on Mon - Fri., 5:30 a.m. to 2:00 p.m.  If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ashok Patel, can be reached on 571-272-3972.  The fax phone number for the organization where this application or proceeding is assigned is 571-272-3900.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/WILLIAM B JONES/
Examiner, Art Unit 2491
10/26/2022