DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
Applicant’s Information Disclosure Statements, filed 02/05/2021 and 05/28/2022, have been received, entered into the record, and considered.  See attached form PTO-1449.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-6, 8-16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Shukla; Anand (“Shukla”) US 11294974 B1 and Onut; Iosif et al. (“Onut”) US 11277443 B2.
Regarding claim 1, Shukla teaches A computing system comprising: 
a processor; and memory storing instructions that, when executed by the processor, cause the processor to perform acts as a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor (col. 3, lines 57-59) comprising: 
based upon the embedding for the URL, computing a score for the URL, wherein the score is indicative of a likelihood that a user will select a search result that represents the webpage when the search result is included on a search engine results page (SERP) generated by a search engine, wherein a search engine index for the search engine is updated based upon a determination being made that the score is greater than a threshold as The similarity score may be computed by computing a cosine similarity between an embedding associated with an entity and the generated embedding. One or more embeddings with a similarity score greater than a cosine similarity threshold are determined and returned (col. 73, lines 56-61).

An embedding associated with a web document or entity is an n-dimensional vector (e.g., 128 dimensional vector) and each n-dimensional vector is located within an embedding space. This enables similar web documents, similar entities, entities that are similar to a web document, and web documents that are similar to an entity to be determined (col. 4, lines 28-34).

One technique to determine whether two entities, two web documents, or a web document and an entity are similar to each other is to compute the cosine similarity between the corresponding embeddings (col. 4, lines 44-54).

In the event the dot product between the two 100 dimensional space vectors is greater than or equal to a document similarity threshold, then the two interests are determined to be similar (col. 20, lines 26-29).
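
For illustration only, the thresholded cosine-similarity comparison described in the passages quoted above can be sketched as follows. This is a minimal sketch and is not code from Shukla; the function names, the three-dimensional example vectors, and the threshold value of 0.5 are assumptions chosen for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention assumed here for zero vectors
    return dot / (norm_a * norm_b)


def above_threshold(query, candidates, threshold):
    """Names of candidates whose similarity to the query exceeds the threshold."""
    return [name for name, vec in candidates.items()
            if cosine_similarity(query, vec) > threshold]


# Hypothetical embeddings, for illustration only.
candidates = {
    "sports": [1.0, 0.0, 0.0],
    "tech": [0.0, 1.0, 0.0],
    "mixed": [1.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]
similar = above_threshold(query, candidates, 0.5)
```

With these assumed values, only the candidates pointing in roughly the same direction as the query are returned; the orthogonal "tech" vector has a similarity of zero and is excluded.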

In one embodiment, the disclosed collaborative filtering techniques are used to identify which sites are more relevant to which terms. For example, embedding-based techniques can be applied to determine a proximity (i.e., score) in the disclosed n-dimensional space between a term/topic and a site, such that sites that are closer in the n-dimensional space to the location of the term/topic in the n-dimensional space can be deemed to be more relevant to that term/topic (col. 46, lines 8-15).

At 1714, indexer 1732 provides an updated index to an inverted index serving stack (RDI) 1734, which inverts the index for efficiently serving relevant documents to queries/interests of users of the search and feed system (e.g., the selection of relevant documents to serve to users in response to queries or in their content feeds can be implemented using the orchestration components described herein) (col. 36, lines 20-27).

In one embodiment, unsupervised machine learning techniques are performed to generate a set of words/terms relevant to a given website... The site models can then be generated based on a ranking of each site for every term. For example, the disclosed techniques can be applied to allow the site models to determine that TechCrunch (www.techcrunch.com) is better for technology related content than ESPN (www.espn.com), CNN (www.cnn.com), and/or other sites based on the ranking of the term “technology” for the sites (col. 45, line 55 – col. 46, line 7).
Shukla teaches selecting a plurality of tokens based on processing of the plurality of web documents; and generating embeddings of the selected tokens in an embedding space. For example, each token can be represented as an n-dimensional vector, which corresponds to a point in the embedding space (col. 25, lines 59-64).
Shukla does not explicitly teach the steps of:
tokenizing a uniform resource locator (URL) for a webpage to generate tokens of the URL; and 
generating an embedding for the URL based upon the generated tokens, wherein the embedding for the URL represents semantics of the URL.
Onut; however, teaches the steps of:
tokenizing a uniform resource locator (URL) for a webpage to generate tokens of the URL as In an embodiment, the URL preprocessor 302 further removes a protocol path (e.g., “http://”), a top-level domain (e.g., “.ga,” “.com,” “.cn,” etc.), and a common subdomain name, such as “www,” “mail,” “cpanel,” “webmail,” “webdisk,” etc., which does not have meaningful semantics for the domain analysis. The rest of URL 301 (i.e., the processed URL 304), e.g., “updateyouraccount/apple/accountReset” is input to the word parser 306 to obtain a list of words 308, e.g., “update,” “your,” “account,” “apple,” “account,” “Reset.” (col. 5, lines 16-25); 
generating an embedding for the URL based upon the generated tokens, wherein the embedding for the URL represents semantics of the URL as Word embedding module 106 can convert the list of words 114 into a list of word vectors 116. Word embedding provides a vector representation of a word, based on the context and semantic similarity with respect to other words. Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers. Words with similar meanings are close to each other in the vector space. For instance, the words “car” and “truck” have two similar vectors in the vector space, because they are two instances of the same category (col. 4, lines 18-29).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Onut’s teaching would have allowed Shukla’s to facilitate determining semantic similarity between two entities by utilizing the word embedding technique.
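
For illustration only, the URL preprocessing and word parsing described in the Onut passage quoted above can be sketched as follows. This is a minimal sketch under stated assumptions, not Onut's implementation: the vocabulary, the greedy longest-match segmentation, the lowercasing of tokens, the treatment of the last host label as the top-level domain, and the full example URL (assembled from the fragments quoted above) are all hypothetical.

```python
import re
from urllib.parse import urlsplit

# Subdomain names listed in the quoted passage as lacking meaningful semantics.
COMMON_SUBDOMAINS = {"www", "mail", "cpanel", "webmail", "webdisk"}
# Hypothetical vocabulary used to split run-together words.
VOCAB = {"update", "your", "account", "apple", "reset"}


def segment(chunk, vocab):
    """Greedy longest-match split of a lowercase chunk into vocabulary words."""
    words, i = [], 0
    while i < len(chunk):
        for j in range(len(chunk), i, -1):
            if chunk[i:j] in vocab:
                words.append(chunk[i:j])
                i = j
                break
        else:
            # No vocabulary word starts here; keep the remainder as one token.
            words.append(chunk[i:])
            break
    return words


def tokenize_url(url, vocab=VOCAB):
    """Drop protocol, top-level domain, and common subdomains, then split into words."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".") if parts.hostname else []
    labels = labels[:-1]  # simplification: treat the last label as the TLD
    labels = [label for label in labels if label not in COMMON_SUBDOMAINS]
    raw = labels + [p for p in parts.path.split("/") if p]
    tokens = []
    for chunk in raw:
        # Split on camelCase boundaries and non-alphanumeric characters.
        for piece in re.findall(r"[A-Z]?[a-z]+|[A-Z]+|[0-9]+", chunk):
            tokens.extend(segment(piece.lower(), vocab))
    return tokens


# Hypothetical URL built from the fragments in the quoted passage.
tokens = tokenize_url("http://www.updateyouraccount.ga/apple/accountReset")
```

Under these assumptions the sketch yields the same word list as the quoted example, except that "Reset" is lowercased.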

Regarding claim 4, Shukla further teaches when the score is less than or equal to the threshold, failing to include the entry for the webpage in the search engine index for the search engine as At 1714, indexer 1732 provides an updated index to an inverted index serving stack (RDI) 1734, which inverts the index for efficiently serving relevant documents to queries/interests of users of the search and feed system (e.g., the selection of relevant documents to serve to users in response to queries or in their content feeds can be implemented using the orchestration components described herein) (col. 36, lines 20-27).
The similarity score may be computed by computing a cosine similarity between an embedding associated with an entity and the generated embedding. One or more embeddings with a similarity score greater than a cosine similarity threshold are determined and returned (col. 73, lines 56-61).

Regarding claim 5, Shukla does not explicitly teach the acts further comprising: mapping the generated tokens to respective identifiers, wherein the embedding for the URL is generated based upon the identifiers mapped to the generated tokens.
Onut; however, teaches mapping the generated tokens to respective identifiers, wherein the embedding for the URL is generated based upon the identifiers mapped to the generated tokens as In an embodiment, the URL preprocessor 302 further removes a protocol path (e.g., “http://”), a top-level domain (e.g., “.ga,” “.com,” “.cn,” etc.), and a common subdomain name, such as “www,” “mail,” “cpanel,” “webmail,” “webdisk,” etc., which does not have meaningful semantics for the domain analysis. The rest of URL 301 (i.e., the processed URL 304), e.g., “updateyouraccount/apple/accountReset” is input to the word parser 306 to obtain a list of words 308, e.g., “update,” “your,” “account,” “apple,” “account,” “Reset.” (col. 5, lines 16-25).
Word embedding module 106 can convert the list of words 114 into a list of word vectors 116. Word embedding provides a vector representation of a word, based on the context and semantic similarity with respect to other words. Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers (i.e., map tokens to identifiers). Words with similar meanings are close to each other in the vector space. For instance, the words “car” and “truck” have two similar vectors in the vector space, because they are two instances of the same category (col. 4, lines 18-29).
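
For illustration only, one common way to map tokens to identifiers and then to vectors is a vocabulary lookup followed by an embedding-table lookup, sketched below. This is a minimal sketch and is not asserted to be the mechanism of Onut or of the claimed invention; the function names, the `<unk>` placeholder for unknown tokens, and the two-dimensional table values are assumptions.

```python
def build_vocab(token_lists, unk="<unk>"):
    """Assign each distinct token an integer identifier; 0 is reserved for unknowns."""
    vocab = {unk: 0}
    for tokens in token_lists:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab)
    return vocab


def to_ids(tokens, vocab, unk="<unk>"):
    """Map tokens to identifiers, falling back to the unknown identifier."""
    return [vocab.get(tok, vocab[unk]) for tok in tokens]


def lookup(ids, table):
    """Fetch one vector per identifier from an embedding table (a list of rows)."""
    return [table[i] for i in ids]


# Token list taken from the quoted example (lowercased here by assumption).
vocab = build_vocab([["update", "your", "account", "apple", "account", "reset"]])
ids = to_ids(["account", "apple", "login"], vocab)

# Hypothetical embedding table with one 2-dimensional row per identifier.
table = [[0.0, 0.0], [0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6], [0.5, 0.5]]
vectors = lookup(ids, table)
```

In this sketch the identifier is an intermediate integer index and the vector is obtained from it in a second step; the token "login" is not in the vocabulary and therefore maps to identifier 0.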

Regarding claim 6, Shukla further teaches wherein generating the embedding for the URL comprises generating word embeddings based upon the tokens, wherein each word embedding in the word embeddings is a two-dimensional vector as For example, the disclosed interest embeddings can be applied to generate embeddings for tokens, interests, users of the disclosed content and feed system, Twitter/social media users, hashtags, documents, articles/stories, sentences, and/or paragraphs. Specifically, embeddings for unigrams, bigrams (i.e., two-dimensional vector) and/or n-grams, entities, subreddits, and/or sites can be implemented based on co-occurrences in documents from a corpus of content (e.g., crawled content from the Internet/World Wide Web and/or content from social media networks) (col. 27, lines 10-19).
For example, the classifier can determine that all pages with a URL of “http://example-web-site-1.com/sports” are likely about sports and that all pages with a URL of “http://example-web-site-1.com/technology” are likely about technology and that all pages with a URL of “http://example-web-site-2.com” are likely about astronomy (col. 45, lines 30-36).

Regarding claim 8, Shukla does not explicitly teach wherein tokenizing the URL is performed by a tokenizer that is trained based upon a set of known URLs.
Onut further teaches wherein tokenizing the URL is performed by a tokenizer that is trained based upon a set of known URLs as The SVM classifier 314 can be pre-trained by a large number of malicious URLs and legitimate URLs. The higher the phishing score 316, the more likely for the URL 301 to be a phishing URL (col. 5, lines 36-39).

Regarding claim 9, Shukla further teaches based upon the embedding for the URL, computing a second score for the URL, wherein the second score is indicative of a likelihood that the webpage includes a threshold number of outbound links as At 710, a score is assigned to each portion of the text-based information in the web document associated with the embedded link (col. 14, lines 6-8).
An outlink is an embedded link within the web document that references a different web document. For example, a Wikipedia® page associated with an interest includes a number of inlinks and a number of outlinks (col. 18, lines 25-28).
A web document associated with a first interest can share a threshold number of common outlinks with a web document associated with a second interest (col. 18, lines 43-45).

Regarding claim 10, Shukla further teaches based upon the embedding for the URL, computing a second score for the URL, wherein the second score is indicative of a likelihood that the webpage includes content that is germane to a topic as Referring to FIG. 21 at 2102, processing web pages for a plurality of different websites is performed to identify topics for the web pages of each of the websites using the classifier (e.g., the classifier that was previously trained using training data sets as similarly described above). For example, the classifier can determine that all pages with a URL of “http://example-web-site-1.com/sports” are likely about sports and that all pages with a URL of “http://example-web-site-1.com/technology” are likely about technology and that all pages with a URL of “http://example-web-site-2.com” are likely about astronomy and that all pages with a URL of “http://example-web-site-32.com” are likely about chemistry (col. 45, lines 26-38).

Regarding claim 11, Shukla teaches A method executed by at least one processor of a computing system, the method comprising: 
based upon the information inferred about the webpage, retrieving the webpage from a computing device that hosts the webpage as At 2104, the classifier can identify websites that have pages related to a topic (e.g., mostly about a given topic based on a relative, threshold categorization determined using the classifier). At 2106, invert and identify the websites with labels for the topic. As a result, all pages with similar URLs can be labeled accordingly based on this inference (e.g., “http://example-web-site-1.com/sports/ . . . ” can be labeled as being about sports, “http://example-web-site-1.com/technology/ . . . ” can be labeled as being about technology, “http://example-web-site-2.com” can be labeled as being about astronomy, and “http://example-web-site-32.com” can be labeled as being about chemistry) (col. 45, lines 39-50); and 
upon retrieving the webpage, extracting content from the webpage and storing the extracted content in computer-readable storage as In the example of classifying documents, the disclosed techniques can be performed using the TensorFlow machine learning library with trained models (e.g., the classifier can be initially trained using a large number of training documents, such as to identify URLs relevant for a label such as for a politics label, and can through the search system determine that cnn.com/politics is relevant to politics and then all pages under that URL can be fed into the classifier system for deep learning models, which can be implemented using the Google Tensor Flow neural network open source component) to classify newly added documents (e.g., newly added documents to graph data store 1720 that are being processed by indexer 1732 and classifier 1740 as similarly described above) (col. 44, lines 53-66). 
Shukla does not explicitly teach the steps of:
retrieving a uniform resource locator (URL) for a webpage from a list of URLs for webpages, wherein the webpage is included in the World Wide Web; creating, based upon the URL, a vector of values that represents semantics existent in alphanumerical characters of the URL; inferring information about the webpage based upon the vector
Onut; however, teaches the steps of:
retrieving a uniform resource locator (URL) for a webpage from a list of URLs for webpages, wherein the webpage is included in the World Wide Web as In an embodiment, the URL preprocessor 302 further removes a protocol path (e.g., “http://”), a top-level domain (e.g., “.ga,” “.com,” “.cn,” etc.), and a common subdomain name, such as “www,” “mail,” “cpanel,” “webmail,” “webdisk,” etc., which does not have meaningful semantics for the domain analysis. The rest of URL 301 (i.e., the processed URL 304), e.g., “updateyouraccount/apple/accountReset” is input to the word parser 306 to obtain a list of words 308, e.g., “update,” “your,” “account,” “apple,” “account,” “Reset.” (col. 5, lines 16-25); 
creating, based upon the URL, a vector of values that represents semantics existent in alphanumerical characters of the URL as Word embedding module 106 can convert the list of words 114 into a list of word vectors 116. Word embedding provides a vector representation of a word, based on the context and semantic similarity with respect to other words. Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers. Words with similar meanings are close to each other in the vector space. For instance, the words “car” and “truck” have two similar vectors in the vector space, because they are two instances of the same category (col. 4, lines 18-29).
inferring information about the webpage based upon the vector as The average vector is processed by the SVM classifier 108, and a phishing score 118, e.g., from 0 to 1 for the semantical vector, is provided to measure the likelihood that the domain name 110 is a phishing domain. The phishing score 118 of “1” indicates the highest probability for the domain name 110 to be a phishing domain, while the phishing score 118 of “0” indicates the lowest probability for the domain name 110 to be a phishing domain (col. 4, lines 49-56).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Onut’s teaching would have allowed Shukla’s to facilitate determining semantic similarity between two entities by utilizing the word embedding technique.
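
For illustration only, the step of reducing a list of word vectors to a single average vector and scoring that vector between 0 and 1 can be sketched as follows. This is a minimal sketch under stated assumptions: a linear scorer with a logistic squashing function is used here as a stand-in and is not the SVM classifier 108 described in Onut; the weights, bias, and vector values are hypothetical.

```python
import math


def average_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("vectors must have the same dimension")
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def score(vector, weights, bias):
    """Stand-in scorer: logistic function of a weighted sum, yielding a value in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, vector)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical word vectors for the tokens of one URL.
word_vectors = [[1.0, 2.0], [3.0, 4.0]]
avg = average_vector(word_vectors)

# Hypothetical model parameters.
result = score(avg, weights=[1.0, -1.0], bias=0.5)
```

A score near 1 would correspond to the highest likelihood and a score near 0 to the lowest, mirroring the 0-to-1 range described in the quoted passage.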

Regarding claim 12, Shukla further teaches wherein the information is a likelihood that the webpage will be selected by an arbitrary user of a search engine when the webpage is included in a search engine results page (SERP) provided to the arbitrary user by the search engine as In one embodiment, the disclosed collaborative filtering techniques are used to identify which sites are more relevant to which terms. For example, embedding-based techniques can be applied to determine a proximity (i.e., score) in the disclosed n-dimensional space between a term/topic and a site, such that sites that are closer in the n-dimensional space to the location of the term/topic in the n-dimensional space can be deemed to be more relevant to that term/topic (col. 46, lines 8-15).
At 1714, indexer 1732 provides an updated index to an inverted index serving stack (RDI) 1734, which inverts the index for efficiently serving relevant documents to queries/interests of users of the search and feed system (e.g., the selection of relevant documents to serve to users in response to queries or in their content feeds can be implemented using the orchestration components described herein) (col. 36, lines 20-27).
The similarity score may be computed by computing a cosine similarity between an embedding associated with an entity and the generated embedding. One or more embeddings with a similarity score greater than a cosine similarity threshold are determined and returned (col. 73, lines 56-61). (Note: Shukla can determine the similarity between the user’s query and interest. As such, the prior art can return search results relevant to the user’s search goal. Users will therefore be more likely to select the returned search results.)

Regarding claim 13, Shukla further teaches wherein the extracted content from the webpage is included in a search engine index of the search engine as At 1714, indexer 1732 provides an updated index to an inverted index serving stack (RDI) 1734, which inverts the index for efficiently serving relevant documents to queries/interests of users of the search and feed system (e.g., the selection of relevant documents to serve to users in response to queries or in their content feeds can be implemented using the orchestration components described herein) (col. 36, lines 20-27).

Regarding claim 14, Shukla further teaches wherein the extracted content from the webpage is included in a search engine index of the search engine as At 2104, the classifier can identify websites that have pages related to a topic (e.g., mostly about a given topic based on a relative, threshold categorization determined using the classifier) (col. 45, lines 39-42).

Regarding claim 15, Shukla further teaches wherein the information is a likelihood that content of the webpage has been updated within a threshold amount of time as For instance, if a user tweets about a new posted article (e.g., web page on a website, as publishers generally post a tweet or other online announcement that indicates that a new article is being released or posted on their site at about the same time as it is being released/posted on their site, so such can provide a timely notification to add to the time series/crawl list for crawling and indexing to timely update the RDI as similarly described herein), then the delay to the serving stack can be as little as one minute or less during which the new web page is crawled, indexed, and available as a newly added document in the RDI provided by the serving stack (e.g., the serving structure as shown at 1734 of FIG. 17) (col. 49, lines 54-65).

Regarding claim 16, Shukla does not explicitly teach wherein the information is a likelihood that content of the webpage is written in a particular language.
Onut; however, teaches wherein the information is a likelihood that content of the webpage is written in a particular language as The domain preprocessor 102 can then replace each identified non-English character in the domain name 110 with the obtained correct English character. In an embodiment, the domain name 110 can be in another language, for example, French, German, Spanish, Arabic, Chinese, etc. The domain preprocessor 102 can identify one or more, e.g., non-German characters in the domain name 110, and replace each identified non-German character in the domain name 110 with the correct German character. The same principle also applies to other languages, for example, French, Spanish, Arabic, Chinese, etc. (col. 3, lines 30-41).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Onut’s teaching would have allowed Shukla’s to facilitate determining semantic similarity between two entities by utilizing the word embedding technique.

Regarding claim 20, the claim recites a computer readable storage medium with limitations similar to those of claim 1 and, as such, is rejected under the same rationale as claim 1.

Claims 2 and 3 are rejected under 35 U.S.C. 103 as being unpatentable over Shukla; Anand (“Shukla”) US 11294974 B1 and Onut; Iosif et al. (“Onut”) US 11277443 B2 as applied to claims 1, 11, and 20 further in view of Serdyukov; Pavel Viktorovich et al. (“Serdyukov”) US 8938408 B1.

Regarding claim 2, Shukla and Onut do not explicitly teach wherein the score for the URL is output by a computer-implemented binary classifier.
Serdyukov; however, teaches wherein the score for the URL is output by a computer-implemented binary classifier as In another aspect, the pairwise classification may include generating a learning set of classifiers based on the pairs of web address features. In another example aspect, the learning set of classifiers may be generated by assigning a label to at least each of the extracted web address features; assigning a value of 1 to the pairs of web address features relating to a similar or same search goal and assigning a value of 0 to all other pairs of the web address features (i.e., binary classifier); and training the learning set of classifiers using at least the extracted web address features (col. 2, lines 18-27).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Serdyukov’s teaching would have allowed Shukla-Onut’s to provide a means to group the data into two classes of labels by utilizing the binary classifier.

Regarding claim 3, Shukla and Onut do not explicitly teach wherein the computer-implemented binary classifier is trained based upon content of a search log of a search engine, wherein training data for training the binary classifier includes URLs in the search log of the search engine and indications as to whether search results corresponding to the URLs were selected by users of the search engine.
Serdyukov; however, teaches wherein the computer-implemented binary classifier is trained based upon content of a search log of a search engine, wherein training data for training the binary classifier includes URLs in the search log of the search engine and indications as to whether search results corresponding to the URLs were selected by users of the search engine as At step 215, the method 200 includes generating, based on the browsing log and the extracted web page features one or more classifiers that determine whether two entries in the browsing log relate the same or different user goals (col. 6, lines 22-25 and col. 8, lines 1-14).
The raw browsing log may include, but is not limited to, user search queries, URLs of web pages viewed by a user during one or more browsing session, time stamps relating to when a site was visited, durations of each visit to each web site, URLs visited as a result of clicks within a web site and other web browsing information (col. 4, lines 56-62).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Serdyukov’s teaching would have allowed Shukla-Onut’s to better predict the user’s action by training the system using the log data as examples.

Claims 7, 18, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Shukla; Anand (“Shukla”) US 11294974 B1 and Onut; Iosif et al. (“Onut”) US 11277443 B2 as applied to claims 1, 11, and 20 further in view of Zamir; Oren Eli et al. (“Zamir”) US 7,693,827 B2.

Regarding claim 7, Shukla and Onut do not explicitly teach subsequent to tokenizing the URL, generating n-grams based upon the tokens, wherein the embedding for the URL is generated based upon the generated n-grams.
Zamir; however, teaches subsequent to tokenizing the URL, generating n-grams based upon the tokens, wherein the embedding for the URL is generated based upon the generated n-grams as An n-gram is defined as a sequence of n tokens, where the tokens may be words. For example, the phrase "search engine" is an n-gram of length 2, and the word "search" is an n-gram of length 1 (col. 7, lines 61-64).

N-grams can be used to represent textual objects as vectors (i.e., embedding). This makes it possible to apply geometric, statistical and other mathematical techniques, which are well defined for vectors, but not for objects in general. In the present invention, n-grams can be used to define a similarity measure between two terms based on the application of a mathematical function to the vector representations of the terms (col. 7, line 65 – col. 8, line 4).

By way of example, a preferred URL or host in a link-based profile is often associated with a specific topic, e.g., finance.yahoo.com is a URL focusing on financial news. Therefore, what is achieved by a link-based profile that comprises a list of preferred URLs or hosts to characterize a user's preference may also be achievable, at least in part, by a category-based profile that has a set of categories that cover the same topics covered by preferred URLs or hosts (col. 10, lines 39-46).

The user's profile is compared to the placed content profile to obtain a similarity score. The similarity score is then used to modify the placed content's ranking. If one considers each of the profiles as a vector, then one of ordinary skill in the art will recognize various mathematical ways to compare the profiles (col. 21, lines 6-10).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Zamir’s teaching would have allowed Shukla-Onut’s to define a similarity measure between two terms by utilizing n-grams to represent textual objects as vectors (col. 7, line 65 – col. 8, line 4).
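
For illustration only, generating n-grams from a token list per the definition quoted from Zamir (a sequence of n tokens) and representing the result as a count vector can be sketched as follows. This is a minimal sketch; the function names and the use of simple n-gram counts as the vector representation are assumptions and are not taken from any cited reference.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous sequences of n tokens, in order of appearance."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(tokens, n):
    """Sparse count vector keyed by n-gram (one assumed vector representation)."""
    return Counter(ngrams(tokens, n))


# Example phrase from the quoted passage.
bigrams = ngrams(["search", "engine"], 2)
unigrams = ngrams(["search", "engine"], 1)

# Hypothetical token list from a URL, for illustration only.
counts = ngram_counts(["update", "your", "account", "apple", "account"], 1)
```

Consistent with the quoted definition, "search engine" yields a single n-gram of length 2 and two n-grams of length 1.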

Regarding claim 18, Shukla further teaches generating n-grams from the extracted tokens, wherein each n-gram includes several tokens and using word embedding, and based upon the n-grams, generating s-dimensional vectors for the n-grams, wherein the s-dimensional vectors represent semantics of the n-grams as For example, the disclosed interest embeddings can be applied to generate embeddings for tokens, interests, users of the disclosed content and feed system, Twitter/social media users, hashtags, documents, articles/stories, sentences, and/or paragraphs. Specifically, embeddings for unigrams, bigrams (i.e., two-dimensional vector) and/or n-grams, entities, subreddits, and/or sites can be implemented based on co-occurrences in documents from a corpus of content (e.g., crawled content from the Internet/World Wide Web and/or content from social media networks) (col. 27, lines 10-19).

Shukla does not explicitly teach wherein creating the vector of values that represents semantics existent in the alphanumerical characters of the URL comprises: 
tokenizing the URL to extract tokens from the URL; 
mapping the extracted tokens to respective identifiers; 
Onut; however, teaches the steps of:
tokenizing the URL to extract tokens from the URL as In an embodiment, the URL preprocessor 302 further removes a protocol path (e.g., “http://”), a top-level domain (e.g., “.ga,” “.com,” “.cn,” etc.), and a common subdomain name, such as “www,” “mail,” “cpanel,” “webmail,” “webdisk,” etc., which does not have meaningful semantics for the domain analysis. The rest of URL 301 (i.e., the processed URL 304), e.g., “updateyouraccount/apple/accountReset” is input to the word parser 306 to obtain a list of words 308, e.g., “update,” “your,” “account,” “apple,” “account,” “Reset.” (col. 5, lines 16-25); 
mapping the extracted tokens to respective identifiers as Word embedding module 106 can convert the list of words 114 into a list of word vectors 116. Word embedding provides a vector representation of a word, based on the context and semantic similarity with respect to other words. Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers (i.e., map tokens to identifiers) (col. 4, lines 18-25).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Onut’s teaching would have allowed Shukla’s to facilitate determining semantic similarity between two entities by utilizing the word embedding technique.

Zamir is cited for additional support of the limitations:
generating n-grams from the extracted tokens, wherein each n-gram includes several tokens as N-grams can be used to represent textual objects as vectors. This makes it possible to apply geometric, statistical and other mathematical techniques, which are well defined for vectors, but not for objects in general. In the present invention, n-grams can be used to define a similarity measure between two terms based on the application of a mathematical function to the vector representations of the terms (col. 7, line 65 – col. 8, line 4); and 
using word embedding, and based upon the n-grams, generating s-dimensional vectors for the n-grams, wherein the s-dimensional vectors represent semantics of the n-grams as An n-gram is defined as a sequence of n tokens, where the tokens may be words. For example, the phrase "search engine" is an n-gram of length 2, and the word "search" is an n-gram of length 1 (col. 7, lines 61-64).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because Zamir’s teaching would have allowed Shukla-Onut’s to define a similarity measure between two terms by utilizing n-grams to represent textual objects as vectors (col. 7, line 65 – col. 8, line 4).
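For illustration only, n-gram generation over a token list and a vector-based similarity measure of the general kind described in the quoted passages might be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the choice of count vectors with cosine similarity is an assumption, since the quoted text refers only to "a mathematical function" applied to the vector representations.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_vector(tokens, n):
    """Represent a token list as a sparse count vector over its n-grams."""
    return Counter(ngrams(tokens, n))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# "search engine" yields one n-gram of length 2 and two n-grams of length 1.
bigrams = ngrams(["search", "engine"], 2)
unigrams = ngrams(["search", "engine"], 1)

# Two hypothetical token lists sharing one of their two bigrams.
similarity = cosine(
    ngram_vector(["update", "your", "account"], 2),
    ngram_vector(["reset", "your", "account"], 2),
)
```

The two token lists in the usage lines share the bigram ("your", "account") and differ in the other, so their similarity falls between 0 and 1.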

Regarding claim 19, Shukla further teaches wherein the s-dimensional vectors are 2-dimensional vectors as For example, the disclosed interest embeddings can be applied to generate embeddings for tokens, interests, users of the disclosed content and feed system, Twitter/social media users, hashtags, documents, articles/stories, sentences, and/or paragraphs. Specifically, embeddings for unigrams, bigrams (i.e., two-dimensional vector) and/or n-grams, entities, subreddits, and/or sites can be implemented based on co-occurrences in documents from a corpus of content (e.g., crawled content from the Internet/World Wide Web and/or content from social media networks) (col. 27, lines 10-19).
Zamir is cited for additional support of the limitation 2-dimensional vectors as An n-gram is defined as a sequence of n tokens, where the tokens may be words. For example, the phrase "search engine" is an n-gram of length 2, and the word "search" is an n-gram of length 1 (col. 7, lines 61-64).

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Shukla; Anand (“Shukla”) US 11294974 B1 and Onut; Iosif et al. (“Onut”) US 11277443 B2, as applied to claims 1, 11, and 20, further in view of LIAO, Yong-jian (“LIAO”) CN 111198995 A.
Regarding claim 17, Shukla and Onut do not explicitly teach wherein the information is a likelihood that the webpage is associated with malware.
LIAO, however, teaches wherein the information is a likelihood that the webpage is associated with malware as step 5.1…url link 1 represents malicious (steps 5.1 to 5.3).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references because LIAO’s teaching would have allowed Shukla-Onut’s to facilitate identifying a malicious web site by determining semantic similarity between two web addresses by utilizing the word embedding technique.

Conclusion
The prior art made of record and not relied upon in form PTO-892 is considered pertinent to applicant's disclosure.

Inquiry
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LESLIE WONG whose telephone number is (571)272-4120. The examiner can normally be reached Monday-Friday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ashish K. Thomas, can be reached at 571-272-0631. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/LESLIE WONG/Primary Examiner, Art Unit 2164