DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
	
Response to Arguments
Applicant's arguments regarding claims 1 – 17 and 21, filed 24 May 2022, have been fully considered but they are not persuasive.  
Applicant argues that Singh (US 2021/0397610) does not disclose “determining … at least one region within the 3-D latent embedding space having a density of vectors below a threshold” on the grounds that Singh discloses neither a density nor a threshold for density (pages 10 and 11).  Applicant’s original specification discloses “a clustering model (e.g., K-means, DB SCAN (density-based spatial clustering of applications with noise), or a variety of other unsupervised machine learning models used for clustering) may take as input a latent space embedding and determine whether it belongs (e.g., based on a threshold distance) to one or more other clusters of other space embeddings that have been previously trained” (paragraph 81), “in (e.g., 3-Dimension) the RC may have a finite volume and a density based on the number of vectors within the RC” (paragraph 148), and “a region may be defined around a cluster based on a cluster center, or a collection of cluster centers within a threshold distance, and a radius, or edges of a region, may be based on distances to nearest neighbor centers of regions, or a threshold (e.g., minimum or maximum distance from a center of a region), which in some examples may be a normalized distance based on the dimensions of the RC and an pre-specified or maximum or minimum number of regions that may be formed within the RC based on respective thresholds” (paragraph 148).  Singh discloses that clustering may comprise applying a hierarchical clustering method to iteratively combine separate clusters, e.g. small clusters based on a first predefined distance threshold may be defined, and then these small clusters may themselves be clustered or combined with other points based on a second predefined distance threshold (paragraph 63).  In light of the disclosed specification, one of ordinary skill in the art would clearly understand a density of vectors below a threshold to be commensurate in scope with the size of a cluster within a threshold distance.
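For illustration only, the notion quoted above from paragraph 148 (a region of finite volume whose density is based on the number of vectors within it) can be sketched as follows.  The sketch assumes cubic grid cells as the regions; the function name, the cell shape, and the parameters cell_size and min_density are hypothetical and are taken neither from applicant's specification nor from Singh.

```python
from collections import Counter


def sparse_cells(points, cell_size, min_density):
    """Return occupied grid cells whose vector density is below min_density.

    points: iterable of (x, y, z) positions in a 3-D embedding space.
    cell_size: edge length of each cubic cell (an assumed region shape).
    min_density: threshold, in vectors per unit volume.

    Only cells containing at least one vector are examined; completely
    empty cells are not enumerated by this sketch.
    """
    # Map each point to the integer index of the cell that contains it.
    counts = Counter(
        tuple(int(coord // cell_size) for coord in point) for point in points
    )
    volume = cell_size ** 3
    # Density = number of vectors in the cell divided by the cell volume.
    return {
        cell: count / volume
        for cell, count in counts.items()
        if count / volume < min_density
    }


if __name__ == "__main__":
    sample = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (0.3, 0.3, 0.3), (1.5, 1.5, 1.5)]
    print(sparse_cells(sample, cell_size=1.0, min_density=2.0))
```

In this sketch a cell holding three vectors in unit volume is treated as sufficiently dense, while a cell holding a single vector is flagged as a region having a density below the threshold.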
Applicant argues Singh does not disclose “updating … for the at least one region, a prioritization value to bias selection of a natural language text corresponding to, or identified to, the at least one region” on the grounds that the rejection offers mere citations, without supporting explanation, for a prioritization value that is updated for a region within a 3-D latent embedding space (pages 11 and 12).  Applicant’s original specification discloses that priorities are based on ranking (paragraphs 120 – 124, 135, 160), alignment to user (paragraphs 125, 129), context (paragraph 128), and exploration (paragraph 146).  Singh may not expressly disclose a prioritization value but does disclose the utilization of distances, size, and frequency for selection of canonical queries (paragraph 87), which is a type of ranking, and, based on these thresholds, clusters are broken down (changed/updated) and individual transcriptions are assigned/reassigned to their closest cluster (paragraph 92).  One of ordinary skill in the art would realize that the cited passages of Singh map to certain passages in applicant’s original specification that are within the scope of the claim limitation.
The 35 U.S.C. 112(b) rejections have been withdrawn due to applicant’s amendments.  


Information Disclosure Statement
The information disclosure statement filed 22 April 2022 fails to comply with 37 CFR 1.98(a)(2), which requires a legible copy of each cited foreign patent document; each non-patent literature publication or that portion which caused it to be listed; and all other information or that portion which caused it to be listed.  It has been placed in the application file, but the information referred to therein has not been considered.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1 – 5, 10 – 17, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Singh (US 2021/0397610) in view of Acharya et al. (US 10,872,601) and He et al. (US 2020/0005503).
Regarding independent claim 1, Singh teaches a computer-implemented method comprising: 
obtaining, by a computing system, a plurality of natural language texts (paragraph 51: natural language processing metric; paragraph 53: text data representing a query); 
determining, by a computing system with a natural language processing model, a high-dimensionality vector representation of each text comprising more than 500 dimensions (paragraph 61: the vector pairs 522, 524 were generated as a 768-dimensional vector, wherein the vector pairs 522, 524 may be combined into a single vector representation, e.g. either by concatenating the text data pairs 512, 514 or the resultant vector encodings); 
reducing, by a computing system with an encoder model, each high-dimensionality vector representation to a reduced vector representation (paragraph 62: dimensionality reduction operation may help reduce the dimensionality from a higher dimensional representation (e.g., 768 elements) to a lower dimensional representation (e.g., a few hundred elements)), 
Singh does not expressly disclose wherein 3 of the dimensions correspond to positional data within a 3-Dimensional latent embedding space.  He discloses projecting the higher-dimensional vector to a lower-dimensional space to create a lower-dimensional vector (paragraph 38) and, although points are projected to 2D space, 3D space views are also possible (paragraph 60).  It would have been obvious for one of ordinary skill in the art at the time of the invention (pre-AIA) or at the time of the effective filing date of the application (AIA) to modify Singh's system to utilize 3D space when visualizing lower-dimensional vector representations, 2D space and 3D space being a finite number of possibilities for such visualization, as taught by He, and the result would have been predictable.
The combination of Singh’s and He’s systems teaches embedding, by a computing system within the 3-D latent embedding space, each of the reduced vector representations based on their respective positional data (Singh, paragraph 63 and Figure 5: a two-dimensional schematic illustration of clustering in an n-dimensional space); 
determining, by a computing system, at least one region within the 3-D latent embedding space having a density of vectors below a threshold (Singh, paragraph 63: clustering may comprise applying a hierarchical clustering method to iteratively combine separate clusters, e.g. small clusters based on a first predefined distance threshold may be defined, and then these small clusters may themselves be clustered or combined with other points based on a second predefined distance threshold); and 
updating, by a computing system, for the at least one region, a prioritization value to bias selection of a natural language text corresponding to, or identified to, the at least one region (Singh, paragraph 92: removing transcriptions that are farther than a certain threshold from the cluster centroid, breaking down clusters that are smaller than a certain threshold, and assigning individual transcriptions from previous steps to their closest cluster; Singh, paragraph 87: the method of training additional hyperparameters may comprise one or more of: clustering distance thresholds (e.g., cosine distance thresholds), cluster size thresholds, and frequency (e.g., target data sample count) thresholds for the selection of canonical queries; Singh, paragraph 64: hierarchical clustering function may use a cosine similarity distance as the distance measure for clustering).
Singh does not expressly disclose the reduced vector representation having fewer than 20 dimensions.  Acharya discloses that dimensional reduction of an m x n matrix follows the formula (m+n)*r < p*m*n (column 21, line 56 – column 22, line 5) and gives examples of reducing a 10,000x300 matrix to a 300x30 matrix if K = 30 (column 23, lines 26 – 34), wherein K can be set to different values depending on the specific NLU model under consideration (column 24, lines 7 – 37).  It would have been obvious for one of ordinary skill in the art at the time of the invention (pre-AIA) or at the time of the effective filing date of the application (AIA) to modify Singh's system to achieve a predictable result by trying a finite number of possible K values that start with a matrix of more than 500 dimensions and produce a reduced matrix of fewer than 20 dimensions based on a specific NLU model, as taught by Acharya, which is then utilized and incorporated into Singh’s system, and the result would have been predictable.
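For illustration only, the inequality (m+n)*r < p*m*n cited from Acharya can be checked numerically.  The sketch below assumes that r is the rank of a two-factor decomposition of an m x n matrix (an m x r factor and an r x n factor) and that p is a target fraction of the original m*n entries; these readings of r and p, the function names, and the use of K = 30 as r are assumptions made for this example and are not confirmed by the cited passages.

```python
def factored_size(m, n, r):
    """Entries stored by an (m x r) factor plus an (r x n) factor."""
    return (m + n) * r


def meets_budget(m, n, r, p):
    """True when the factored form satisfies (m + n) * r < p * m * n."""
    return factored_size(m, n, r) < p * m * n


if __name__ == "__main__":
    # Matrix sizes borrowed from the cited example; r = 30 is assumed to
    # play the role of K = 30.
    m, n, r = 10_000, 300, 30
    print(factored_size(m, n, r))      # (10_000 + 300) * 30 = 309_000
    print(meets_budget(m, n, r, 0.5))  # 309_000 < 1_500_000
    print(meets_budget(m, n, r, 0.1))  # 309_000 < 300_000 does not hold
```

Under these assumptions the factored form holds 309,000 entries against 3,000,000 in the original matrix, so the inequality is satisfied for any p above roughly 0.103.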

Regarding dependent claim 2, the combination of Singh’s and He’s systems teaches steps to display a visualization of the 3-D latent embedding space and the embeddings of the reduced vector representations on a user device (He, Figure 1: User Device 112A, 112B).  

Regarding dependent claim 3, Singh teaches wherein: positional data of a reduced vector representation of a given one of the natural language texts is indicative of semantic content of the natural language text (paragraph 56: canonical queries and corresponding query groups are determined based on the clustered vector representations).

Regarding dependent claim 4, Singh does not expressly disclose wherein: the 3-D latent embedding space corresponds to a volume having dimensions that are sized based on the positional data of the reduced vector representations.  He discloses generating a visual representation of an encoded lower-dimensional vector relative to other lower-dimensional vectors (paragraph 59 and Figure 3).  It would have been obvious for one of ordinary skill in the art at the time of the invention (pre-AIA) or at the time of the effective filing date of the application (AIA) to modify Singh's system to provide a graphical representation of the desired reduced-dimensional vectors in one display.  One would be motivated to do so because this would help a user to analyze the reduced-dimensional vectors.

Regarding dependent claim 5, Singh teaches wherein: positional data of a reduced vector representation of a given one of the natural language texts is indicative of semantic content of the natural language text (paragraph 56: canonical queries and corresponding query groups are determined based on the clustered vector representations).  Singh does not expressly disclose the 3-D latent embedding space corresponds to a volume having dimensions that are sized based on the positional data of the reduced vector representations.  He discloses generating a visual representation of an encoded lower-dimensional vector relative to other lower-dimensional vectors (paragraph 59 and Figure 3).  It would have been obvious for one of ordinary skill in the art at the time of the invention (pre-AIA) or at the time of the effective filing date of the application (AIA) to modify Singh's system to provide a graphical representation of the desired reduced-dimensional vectors in one display.  One would be motivated to do so because this would help a user to analyze the reduced-dimensional vectors.  The combination of Singh’s and He’s systems teaches the dimensions of the 3-D latent embedding space are indicative of a semantic space that includes the natural language texts (Singh, Figure 5; He, Figure 7).

Regarding dependent claim 10, Singh teaches determining coverage of a plurality of different regions of the 3-D latent embedding space based on respective densities of reduced vector representations occurring within each of the regions (paragraph 67: obtaining a size of each cluster of query vectors 522 and unassigning vector representations of queries for the given cluster responsive to the size being below a predefined threshold; paragraph 87: the method of training additional hyperparameters may comprise one or more of: clustering distance thresholds (e.g., cosine distance thresholds), cluster size thresholds, and frequency (e.g., target data sample count) thresholds for the selection of canonical queries; Singh, paragraph 64: hierarchical clustering function may use a cosine similarity distance as the distance measure for clustering).

Regarding dependent claim 11, Singh teaches wherein: each of the regions includes a respective subset of the reduced vector representations that have pairwise distances below a threshold; and a determined coverage of a given region indicates whether the given region has a greater or lower density than one or more other regions in the plurality of different regions (paragraph 95: semantic distances between pairs of queries may be compared (e.g., n^2 pairs for n queries) and the pairs may then be filtered based on a semantic distance threshold (e.g., only pairs with a similarity score above a predefined threshold are kept); Singh, paragraph 92: removing transcriptions that are farther than a certain threshold from the cluster centroid, breaking down clusters that are smaller than a certain threshold, and assigning individual transcriptions from previous steps to their closest cluster).
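For illustration only, the pairwise comparison and threshold filtering summarized above from Singh, paragraph 95, can be sketched as follows.  The sketch assumes cosine similarity as the similarity score (Singh, paragraph 64, mentions a cosine similarity distance for clustering); the function names and the parameter min_similarity are hypothetical, and the sketch enumerates each unordered pair once rather than all n^2 ordered pairs.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similar_pairs(vectors, min_similarity):
    """Return index pairs (i, j), i < j, whose similarity exceeds the threshold.

    Assumes every vector is non-zero; a zero vector would cause a
    division by zero in cosine_similarity.
    """
    return [
        (i, j)
        for (i, u), (j, v) in combinations(enumerate(vectors), 2)
        if cosine_similarity(u, v) > min_similarity
    ]


if __name__ == "__main__":
    queries = [(1.0, 0.0), (1.0, 0.1), (0.0, 1.0)]
    print(similar_pairs(queries, min_similarity=0.9))
```

In this sketch only the first two vectors, which point in nearly the same direction, survive the filter; the third is dissimilar to both and its pairs are discarded.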

Regarding dependent claim 12, Singh teaches determining one or more clusters of subsets of reduced vector representations, a subset of reduced vector representations in a cluster having pairwise distances below a threshold; and wherein the at least one region and other regions are formed based on pairwise distances between neighboring cluster centers (paragraph 95: semantic distances between pairs of queries may be compared (e.g., n^2 pairs for n queries) and the pairs may then be filtered based on a semantic distance threshold (e.g., only pairs with a similarity score above a predefined threshold are kept); Singh, paragraph 92: removing transcriptions that are farther than a certain threshold from the cluster centroid, breaking down clusters that are smaller than a certain threshold, and assigning individual transcriptions from previous steps to their closest cluster).

Regarding dependent claim 13, Singh teaches wherein: the prioritization value of the at least one region is provided to a sampling model in association with dimensions of the at least one region (paragraph 68: a canonical query is determined as a top-ranking query 542 in an ordered list of queries 544 within each group).

Regarding dependent claim 14, Singh teaches wherein: a natural language text having a reduced vector representation with positional data occurring within a given region is prioritized for selection to a sample set by the sampling model over another natural language text occurring within a different region having a lower prioritization value (paragraph 68: a canonical query is determined as a top-ranking query 542 in an ordered list of queries 544 within each group, and the top-ranking query 542 may be a query that occurs most frequently and/or is closest to the centroid of a query cluster).

Regarding dependent claim 15, Singh teaches wherein: a prioritization value biases exploration of a semantic space that is minimally covered or not yet covered by the plurality of natural language texts (paragraph 71: the above filtering comprises one or more of removing paired data samples with a canonical query whose named entity tags do not match the named entity tags in the corresponding selection from the query group and removing paired data samples based on a comparison of semantic distance metrics for the canonical query and the corresponding selection from the query group).

Regarding dependent claim 16, Singh teaches obtaining a new natural language text; determining that the new natural language text corresponds to a given region assigned a prioritization value; and prioritizing a selection of the new natural language text to a sample based on the prioritization value assigned to the given region (paragraph 56: canonical queries and corresponding query groups are determined based on the clustered vector representations, wherein selecting clusters of vector representations relating to different queries and determining one representative query data sample for each cluster).

Regarding dependent claim 17, Singh teaches determining a reasoning context based on the positional data of the reduced vector representations and contents of the natural language texts (paragraph 95: semantic distances between pairs of queries may be compared (e.g., n^2 pairs for n queries) and the pairs may then be filtered based on a semantic distance threshold (e.g., only pairs with a similarity score above a predefined threshold are kept)).

Regarding claim 21, claim 21 is similar in scope to claim 1; thus, the rejection of claim 1 hereinabove is applicable to claim 21.  Singh teaches a tangible, non-transitory, machine-readable medium storing instructions that, when executed by a computer system (claim 21).

Claims 6 – 8 are rejected under 35 U.S.C. 103 as being unpatentable over Singh (US 2021/0397610) in view of Acharya et al. (US 10,872,601) and He et al. (US 2020/0005503) and Kandogan (US 2002/0171646).  
Regarding dependent claim 6, Singh does not expressly disclose initializing a 3-D visualization of the volume of the semantic space based on the dimensions; and determining, for each reduced vector representation, a location to visually represent the reduced vector representation within the 3-D visualization of the volume based on an orientation of the volume and the positional data of the reduced vector representation.  Kandogan discloses for each information object, field vectors for each field in the information object are calculated, in which each of the field vectors has a magnitude determined by the field's value and an orientation determined by the field's unit vector, and the field vectors are summed to determine an information object vector, and a feature (e.g., a point) representing the end point of the corresponding information object vector (paragraph 8).  It would have been obvious for one of ordinary skill in the art at the time of the invention (pre-AIA) or at the time of the effective filing date of the application (AIA) to modify Singh's system to provide a location and orientation of the vectors for graphical visualization.  One would be motivated to do so because this would help the user to analyze the reduced-dimensional vectors.

Regarding dependent claim 7, Singh does not expressly disclose receiving, via a graphical user interface, an indication of a change to the orientation of the volume; and determining, based on the change to the orientation of the volume, updated locations of the reduced vector representations.  Kandogan discloses that the graphical representation is then changed using a graphical user interface (e.g., a mouse), and the processor redefines the unit vector corresponding to the changed graphical representation in accordance with this change (paragraph 8).  It would have been obvious for one of ordinary skill in the art at the time of the invention (pre-AIA) or at the time of the effective filing date of the application (AIA) to modify Singh's system to update the orientation and the location of the vectors as the user manipulates the viewpoint of the graphical visualization.  One would be motivated to do so because this would help the user to analyze the reduced-dimensional vectors.

Regarding dependent claim 8, Singh does not expressly disclose wherein: the visual representation of each reduced vector representation may comprise a point and a numerical value assigned to the point, wherein the numerical value is indicative of a score or rank associated with the natural language text; however, Singh does disclose a confidence level for each output sequence (paragraph 76).  It would have been obvious for one of ordinary skill in the art at the time of the invention (pre-AIA) or at the time of the effective filing date of the application (AIA) to modify Singh's system to store metadata, such as a confidence score, associated with the reduced-dimensional vector representation.  One would be motivated to do so because this would allow better analysis of the performance of the natural language processes.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Singh (US 2021/0397610) in view of Acharya et al. (US 10,872,601) and He et al. (US 2020/0005503) and Kandogan (US 2002/0171646) and Official Notice.  
Regarding dependent claim 9, Singh does not expressly disclose receiving, via a graphical user interface, selection of a point corresponding to respective reduced vector representation of one of the natural language texts; and causing, within the graphical user interface, display of information associated with the natural language text.  Examiner takes Official Notice that the concept of selecting a point on a graphical visualization and displaying metadata of the point in the vicinity of the point, and the advantage of better analysis of the vector data, are well known and expected in the art.  It would have been obvious for one of ordinary skill in the art at the time of the invention (pre-AIA) or at the time of the effective filing date of the application (AIA) to modify Singh's system to achieve a predictable result of viewing metadata of a point by selecting the desired point in the graphical visualization.  One would be motivated to do so because this would help better analyze the reduced vector representation outputted on the graphical visualization.

Allowable Subject Matter
Claims 18 – 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JEFFREY J CHOW whose telephone number is (571)272-8078. The examiner can normally be reached 11AM-7PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached at (571) 272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JEFFREY J CHOW/
Primary Examiner, Art Unit 2612