DETAILED ACTION
This action is responsive to the application filed on April 14, 2021.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
As required by M.P.E.P. 609, the applicant’s submissions of the Information Disclosure Statements dated April 27, 2021 and December 13, 2021 are acknowledged by the examiner, and the cited references have been considered in the examination of the claims now pending.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1, 5-12 and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Velasco (US Patent Application Publication No. US 20200097809 A1).

Regarding claim 1, Velasco teaches a computer-implemented method comprising: accessing one or more portions of data; accessing one or more neural embeddings, (See Velasco [0054] “FIG. 7 is a simplified diagram illustrating an exemplary context data input for determining a context embedding [i.e. neural embeddings]” See Velasco [0057-0058] “At process 410, the neural network 330 receives training data [Thus, accessing one or more portions of data]...At a process 420, the neural network model is trained on the training data. In some embodiments, for training, the neural network may perform pre-processing on the training data, for example, for each word, portion of a word, or character in the training text sequence or utterance. The embeddings are encoded, for example, with one or more encoding layers of the neural network to generate respective vectors.”)

the neural embeddings being configured to encode semantic information associated with the accessed data into numeric values; applying locality sensitive hashing to the accessed neural embeddings to assign data portions encoded within a specified numerical range to a cluster of related data items, and to assign data portions outside of the specified numerical range to a cluster of unrelated data items; and (See Velasco [0058] “The embeddings are encoded [Thus, the neural embeddings being configured to encode], for example, with one or more encoding layers of the neural network to generate respective vectors [i.e. numeric values].” See also Velasco [0065] “The encodings can encode the semantic relationship between words [Thus, semantic information associated with the accessed data].” See also Velasco [0075] “locality-sensitive hashing (LSH) may be used to reduce the dimensionality of high dimensional case embedding 630 and the context embeddings and thereby map similar case objects to case text 610 [Thus, applying locality sensitive hashing to the accessed neural embeddings]. This results in clustering one or more case objects with each other [Thus, cluster of related data items], which provides the predicted case objects as a result. Using a binary classification of 1 [Thus, within a specified numerical range] may recognize the case objects and the same [i.e. cluster of related data items], while 0 [Thus, outside of the specified numerical range] may recognize them as separate [i.e. cluster of unrelated data items]”)
	
performing at least one data management operation on the accessed data according to the clustering resulting from the locality sensitive hashing. (See Velasco [0075] “One or more case objects may therefore be identified for case text 610 for removal by dedupe process 640 [Thus, performing at least one data management operation on the accessed data according to the clustering resulting from the locality sensitive hashing]”)
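For illustration only, the kind of clustering relied upon in the claim 1 mapping above can be sketched as follows. This is a minimal sketch assuming random-hyperplane locality sensitive hashing over small numeric vectors. The function names (make_hyperplanes, lsh_signature, bucket_embeddings, dedupe) and the sample vectors are hypothetical; none of this is taken from Velasco or from the application, and it is not asserted to be either party's implementation.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def bucket_embeddings(embeddings, hyperplanes):
    """Group item keys by signature; items sharing a bucket are treated as related."""
    buckets = defaultdict(list)
    for key, vector in embeddings.items():
        buckets[lsh_signature(vector, hyperplanes)].append(key)
    return dict(buckets)


def dedupe(buckets):
    """Hypothetical data management step: keep the first item seen in each bucket."""
    return sorted(items[0] for items in buckets.values())


if __name__ == "__main__":
    planes = make_hyperplanes(dim=3, n_bits=8, seed=1)
    # Hypothetical embeddings: "a_dup" points the same way as "a"; "b" points the opposite way.
    embeddings = {
        "a": [1.0, 2.0, 3.0],
        "a_dup": [2.0, 4.0, 6.0],
        "b": [-1.0, -2.0, -3.0],
    }
    print(dedupe(bucket_embeddings(embeddings, planes)))
```

In this sketch, vectors that point in the same direction always receive the same signature and therefore fall into one bucket, while a vector pointing in the opposite direction falls into a different bucket. The deduplication step then keeps a single representative per bucket.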

Regarding claim 5, Velasco teaches the method of claim 1, wherein the data management operation comprises a search operation that searches the one or more portions of data for specified data. (See Velasco [0071] “At a process 430, the neural network 330 may receive a user query, such a case text 350 described with computing device 300 in FIG. 3 and/or case text 610 described with model serving flowchart 600 in FIG. 6, for case object searching [i.e. search operation].” See also Velasco [0075] “a nearest neighbor search [i.e. search operation] may be conducted based on the vectors for case embedding 630” [Thus, a search operation that searches the one or more portions of data for specified data])

Regarding claim 6, Velasco teaches the method of claim 5, wherein the search operation is performed using the clustering resulting from the locality sensitive hashing, such that data items in the cluster of related data items are searched prior to searching data items in the cluster of unrelated data items. (See Velasco [0071] “At a process 430, the neural network 330 may receive a user query, such a case text 350 described with computing device 300 in FIG. 3 and/or case text 610 described with model serving flowchart 600 in FIG. 6, for case object searching [i.e. search operation].” See also Velasco [0076] “Once case embedding 630 is generated for case text 610, neural network 330 may return results that are used for focused solving process 650. The results may similarly be returned using a clustering algorithm [Thus, using the clustering resulting from the locality sensitive hashing] (e.g., nearest neighbor algorithms, or utilizing LSH to reduce vector dimensionality and identify buckets of similar context embeddings for case embedding 630) [Thus, data items in the cluster of related data items are searched prior to searching data items in the cluster of unrelated data items].”)

Regarding claim 7, Velasco teaches the method of claim 1, wherein the data management operation comprises a deduplication operation that removes duplicate information from the one or more portions of data. (See Velasco [0074] “a dedupe process 640 may be performed based on the results determined at process 440. A dedupe or “deduplication” process may refer to a process to eliminate or reduce duplicate or similar case objects in a CRM system. [Thus, a deduplication operation that removes duplicate information from the one or more portions of data]”)

Regarding claim 8, Velasco teaches the method of claim 1, wherein the deduplication operation is performed using the clustering resulting from the locality sensitive hashing, such that data items in the cluster of related data items are removed, and data items in the cluster of unrelated data items are maintained. (See Velasco claim 5 “determine related cases in the case management system using the machine learning model and locality-sensitive hashing” See also Velasco [0074] “As such, dedupe process 640 may be performed based on the results obtained at process 440, where a clustering algorithm/process for case embedding 630 and the system's context embeddings corresponding to context embedding 880 generated for the system's case objects. [Thus, the deduplication operation is performed using the clustering resulting from the locality sensitive hashing]” See also Velasco claim 6 “The system of claim 5, wherein the system is further configured to: perform deduplication on the target case and the related cases based on binary classification of the related cases. [Thus, data items in the cluster of related data items are removed, and data items in the cluster of unrelated data items are maintained]”)

Regarding claim 9, Velasco teaches the method of claim 1, wherein the one or more portions of data comprise at least one of image data, video data, audio data, or textual data. (See Velasco [0057] “At process 410, the neural network 330 receives training data for training the neural model so that it is able to predict the case objects that are most relevant for a given query of a case. This training data can include text, utterances, comments, etc. [Thus, the one or more portions of data comprise at least textual data]”)

Regarding claim 10, Velasco teaches the method of claim 1, further comprising generating the one or more neural embeddings that are accessed for the application of locality sensitive hashing. (See Velasco [0054] “FIG. 8 is a simplified diagram illustrating neural network architecture for generation of a context embedding using the context data input described in FIG. 7 [Thus, generating neural embeddings]” See also Velasco [0072] “case embedding 630 may correspond to a case embedding of a case, such as utterance embedding 840, or may correspond to a context embedding, such as context embedding 880.” See also Velasco [0075] “locality-sensitive hashing (LSH) may be used to reduce the dimensionality of high dimensional case embedding 630 [Thus, neural embeddings that are accessed for the application of locality sensitive hashing]”)

Regarding claim 11, Velasco teaches the method of claim 10, wherein the neural embeddings are generated by a communicatively linked neural network. (See Velasco [0054] “FIG. 8 is a simplified diagram illustrating neural network architecture [i.e. communicatively linked neural network] for generation of a context embedding using the context data input described in FIG. 7”)

Regarding claim 12, the claim recites the elements of claim 1 in system form rather than method form. Velasco discloses all of the elements of claim 1 and also discloses a system (See Velasco [0079]). Therefore, the supporting rationale of the rejection of claim 1 applies equally to the elements of claim 12.

Regarding claim 20, the claim recites the elements of claim 1 in computer readable medium form rather than method form. Velasco discloses all of the elements of claim 1 and also discloses a computer readable medium (See Velasco [0048]). Therefore, the supporting rationale of the rejection of claim 1 applies equally to the elements of claim 20.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 2-4 are rejected under 35 U.S.C. 103 as being unpatentable over Velasco (US Patent Application Publication No. US 20200097809 A1), in view of Lakshmanan (US Patent Application Publication No. US 20190114166 A1).

Regarding claim 2, Velasco teaches all the limitations of the method of claim 1.

Velasco does not explicitly teach wherein the data management operation comprises a diff operation that identifies differences in the one or more portions of data.

However, Lakshmanan discloses wherein the data management operation comprises a diff operation that identifies differences in the one or more portions of data. (See Lakshmanan [0073-0074] “function log generator 18 performs a diff operation for each matching function... The diff operation may make a line by line [i.e. one or more portions of data], comparison of the previous version of the function with the new version of the function...the diff operation is performed to definitely determine whether changes have been made [Thus, a diff operation that identifies differences in the one or more portions of data.]”)

It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Velasco to incorporate the teachings of Lakshmanan to include a diff operation that identifies differences in the one or more portions of data. 

One would be motivated to do so to improve the ability to determine changes (e.g. differences) between log versions (See Lakshmanan [0075]).
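For illustration only, a line by line diff operation of the kind described in the cited passage can be sketched as follows. This is a minimal sketch using the Python standard library module difflib. The function name diff_lines and the sample log text are hypothetical and are not taken from Lakshmanan or from the application.

```python
import difflib


def diff_lines(old_text, new_text):
    """Return only the removed ("- ") and added ("+ ") lines between two versions."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; keep changes only.
    return [
        line
        for line in difflib.ndiff(old_lines, new_lines)
        if line.startswith(("- ", "+ "))
    ]


if __name__ == "__main__":
    # Hypothetical previous and new versions of a short log.
    previous = "start\nretry=3\nstop"
    current = "start\nretry=5\nstop"
    for change in diff_lines(previous, current):
        print(change)
```

An empty result from diff_lines indicates that no line differs between the two versions.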

Regarding claim 3, Velasco, further in view of Lakshmanan, teaches all the limitations and motivations of the method of claim 2, wherein the one or more portions of data comprise one or more log files, and wherein the diff operation is performed on the one or more log files. (See Lakshmanan [0035] “implementations may include different data or a different arrangement of modules or data than those shown in FIG. 1. For example, source code files 22 may include source code file logs 24 and functions logs 26 [i.e. log files], or CMS 10 may include additional modules, such as a diff module for performing diff operations to compare lines of code, a backup module for backing up a file, or a module for synchronizing the checking in or checking out of files from a repository. [Thus, the diff operation is performed on the one or more log files]”)

Regarding claim 4, Velasco, further in view of Lakshmanan, teaches all the limitations and motivations of the method of claim 3, wherein the one or more log files include a plurality of words or phrases, and wherein the neural embeddings encode semantic information associated with the words or phrases into a numerical representation associated with each word or phrase. (See Velasco [0065] “The encodings can encode the semantic relationship between words. [Thus, neural embeddings encode semantic information associated with the words]”)

Claims 13-18 are rejected under 35 U.S.C. 103 as being unpatentable over Velasco (US Patent Application Publication No. US 20200097809 A1), in view of Cheng (US Patent Application Publication No. US 20180336436 A1).

Regarding claim 13, Velasco teaches all the limitations of claim 12.

Velasco does not explicitly teach wherein the data management operation comprises exception monitoring configured to monitor for and identify anomalous occurrences.

However, Cheng discloses wherein the data management operation comprises exception monitoring configured to monitor for and identify anomalous occurrences. (See Cheng [0022] “Graph embedding with neural network technique is a natural method to represent the evolutionary structure of networks as vector representations because of its ability to leverage the structural correlations among the edges and vertices in the network. This opens the possibility of using clustering-based algorithms for anomaly detection in graph streams.” See also Cheng [0004] “The method includes receiving, by a processor, a plurality of vertices and edges from a streaming graph. The method also includes generating, by the processor, graph codes for the plurality of vertices and edges. The method additionally includes determining, by the processor, edge codes in real-time responsive to the graph codes.” [Thus, configured to monitor for and identify anomalous occurrences])

It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Velasco to incorporate the teachings of Cheng to monitor for and identify anomalous occurrences. 

One would be motivated to do so to allow anomaly detection in real time (See Cheng [0074]).
	
Regarding claim 14, Velasco, further in view of Cheng, teaches all the limitations and motivations of the method of claim 13, wherein the data management operation comprises event detection which determines when specified events have occurred. (See Cheng [0002] “The present invention relates to streaming graphs and more particularly anomaly detection in streaming networks [Thus, event detection].” See also Cheng [0032-0033] “Given a graph G(E,V), the incoming stream of graph objects at time-stamp t are assumed an edge or small graph object denoted by an edge list E(t) where |E(t)|>1. The vertex set in the edge list E(t) at time-stamp t is denoted by V(t). The vertex set V is the union of the vertex sets across all time-stamps, that is V = ∪_{t=1}^{∞} V(t). Similarly, E = ∪_{t=1}^{∞} E(t). Note that the entire vertex set V is not known to us at time-stamp t, which means new vertices will be created at time-stamp t′ for any t′>t. The graph at time-stamp t is denoted as G(t), which includes all edges and small graphs received from time-stamp 1 to t. The goal is to detect anomalous vertices, edges and communities (group of vertices) at any given time t, i.e., in real time as E(t) occurs. [Thus, determines when specified events have occurred]”)

Regarding claim 15, Velasco teaches all the limitations of claim 12.

Velasco does not explicitly teach wherein the data management operation comprises event detection which determines when specified events have occurred.

However, Cheng discloses wherein the data management operation comprises event detection which determines when specified events have occurred. (See Cheng [0080] “The at least one anomaly detection system 605 is configured to detect one or more anomalies [i.e. event detection]. The computer processing system 610 is configured to perform anomaly detection on streaming networks. Moreover, the computer processing system 610 is configured to initiate an action (e.g., a control action) on the controlled system, machine, and/or device 620 responsive to the detected anomaly. Such action can include, but is not limited to, one or more of: powering down the controlled system, machine, and/or device 620 or a portion thereof; powering down, e.g., a system, machine, and/or a device that is affected by an anomaly in another device, stopping a centrifuge being operated by a user 620A before an imbalance in the centrifuge causes a critical failure and harm to the user 620A, opening a valve to relieve excessive pressure (depending upon the anomaly), locking an automatic fire door, and so forth. As is evident to one of ordinary skill in the art, the action taken is dependent upon the type of anomaly [i.e. event] and the controlled system, machine, and/or device 620 to which the action is applied [Thus, determines when specified events have occurred].”)
	
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Velasco to incorporate the teachings of Cheng to include event detection which determines when specified events have occurred.

One would be motivated to do so to allow detection of anomalies in real time (See Cheng [0074]).

Regarding claim 16, Velasco, further in view of Cheng, teaches all the limitations and motivations of the method of claim 15, wherein the event detection is performed using the clustering resulting from the locality sensitive hashing, such that data items in the cluster of related data items are grouped together as part of a specified event. (See Cheng [0023] “A clustering based anomaly detection method can include one or more of several two procedures, e.g., graph sketching and anomaly detection based on the sketches. The sketches can be learned by hashing such as locality-sensitive hashing and Count-Min sketch” See also Cheng [0072] “If the distance D is larger than a, a new cluster for the point xi′ is created, and the corresponding TCF equals (1,xi′,T). If the data point falls with the anomaly threshold, it will be added to the closest cluster and all entries in TCF of this cluster will be updated using Eq. (5). The anomaly score of each point is reported as the closest distance to the centroids of existing clusters. [Thus, data items in the cluster of related data items are grouped together as part of a specified event]”)
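For illustration only, the threshold-based cluster assignment described in the quoted passage of Cheng [0072] can be sketched as follows. This is a minimal sketch under stated assumptions: the centroid update is a plain running mean used as a stand-in, because the TCF structure and Eq. (5) of Cheng are not reproduced in this action. The function name stream_cluster, the threshold value, and the sample points are hypothetical.

```python
import math


def stream_cluster(points, threshold):
    """Assign each arriving point to the closest cluster or open a new one.

    The anomaly score of a point is its distance to the closest existing
    centroid (infinity when no cluster exists yet).
    """
    clusters = []  # each entry: {"centroid": [...], "count": n}
    scores = []
    for point in points:
        if clusters:
            distances = [math.dist(point, c["centroid"]) for c in clusters]
            best = min(range(len(distances)), key=distances.__getitem__)
            score = distances[best]
        else:
            best, score = None, math.inf
        scores.append(score)
        if score > threshold:
            # Too far from every centroid: open a new cluster for this point.
            clusters.append({"centroid": list(point), "count": 1})
        else:
            # Within the threshold: join the closest cluster and update its mean.
            cluster = clusters[best]
            cluster["count"] += 1
            cluster["centroid"] = [
                m + (x - m) / cluster["count"]
                for m, x in zip(cluster["centroid"], point)
            ]
    return clusters, scores


if __name__ == "__main__":
    clusters, scores = stream_cluster(
        [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0)], threshold=3.0
    )
    print(len(clusters), scores)
```

In this sketch, points that fall within the threshold are grouped into an existing cluster, and a point beyond the threshold both receives a high score and starts a cluster of its own.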

Regarding claim 17, Velasco teaches all the limitations of claim 12.

Velasco does not explicitly teach wherein the data management operation performed on the accessed data comprises updating a neural embedding model used to generate the one or more neural embeddings.

However, Cheng discloses wherein the data management operation performed on the accessed data comprises updating a neural embedding model used to generate the one or more neural embeddings. (See Cheng [0022] “Graph embedding with neural network [i.e. neural embedding model] technique is a natural method to represent the evolutionary structure of networks as vector representations because of its ability to leverage the structural correlations among the edges and vertices in the network. This opens the possibility of using clustering-based algorithms for anomaly detection in graph streams.” See also Cheng [0024] “a new clustering based approach that 1) can incrementally update graph representations as new edges arriving, 2) dynamically maintains the clusters, and 3) detects anomalies in graph streams in real-time.” [Thus, updating a neural embedding model used to generate the one or more neural embeddings])

It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Velasco to incorporate the teachings of Cheng to include updating a neural embedding model used to generate the one or more neural embeddings.

One would be motivated to do so to allow detection of anomalies in real time (See Cheng [0074]).

Regarding claim 18, Velasco, further in view of Cheng, teaches all the limitations and motivations of the method of claim 17, wherein the embedding model is continually updated over time based on feedback derived from the locality sensitive hashing clustering. (See Cheng [0023-0024] “A clustering based anomaly detection method can include one or more of several two procedures, e.g., graph sketching and anomaly detection based on the sketches. The sketches can be learned by hashing such as locality-sensitive hashing and Count-Min sketch [Thus, based on feedback derived from the locality sensitive hashing clustering]. The graph sketches or representations allow efficient updates as new graph objects arrive in the stream...a new clustering based approach that 1) can incrementally update graph representations as new edges arriving [Thus, continually updated over time], 2) dynamically maintains the clusters, and 3) detects anomalies in graph streams in real-time.”)




Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Velasco (US Patent Application Publication No. US 20200097809 A1), in view of Woodworth (US Patent Application Publication No. US 20190121873 A1).

Regarding claim 19, Velasco teaches all the limitations of claim 12.

Velasco does not explicitly teach wherein the data management operation comprises performing a substantially constant time semantic search on a dataset of at least a threshold minimum size.

However, Woodworth discloses wherein the data management operation comprises performing a substantially constant time semantic search on a dataset of at least a threshold minimum size. (See Woodworth [0026] “The invention, Architecture for Semantic Search over Encrypted Data in the Cloud (“S3C”), is a system that provides true semantic search functionality...The performance of SC3 against various real-world datasets shows that it produces accurate search results while maintaining minimal overhead storage.” See also Woodworth [0065] “It is worth noting that this operation needs only to be performed at startup of the cloud server, and that additions to the index at runtime operate at near constant time [Thus, performing a substantially constant time semantic search], regardless of the size of the dataset” See also Woodworth [0058] “For scalability tests, search times and storage overhead are measured for several three word queries against increasingly large portions of the dataset. Specifically, the following datasets were tested against: 500 MB, 1 GB, 5 GB, 10 GB, 25 GB, and 50 GB [Thus, on a dataset of at least a threshold minimum size]”)

It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Velasco to incorporate the teachings of Woodworth to include performing a substantially constant time semantic search on a dataset of at least a threshold minimum size.

One would be motivated to do so to obtain accurate search results while maintaining minimal overhead storage (See Woodworth [0026]).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to OSCAR WEHOVZ whose telephone number is (571)272-3362. The examiner can normally be reached from 8:00am to 5:00pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Apu Mofiz, can be reached at (571)272-4080. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/OSCAR WEHOVZ/Examiner, Art Unit 2161

/ETIENNE P LEROUX/Primary Examiner of Art Unit 2161