DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 071 have been fully considered but they are not persuasive.
Applicant Remarks:
The proposed method takes raw posts from the social media handle and emits the percentage of users who are Male versus Female and the percentage of users who fall into different age bins. The first social media data may comprise, but is not limited to, blog posts, feeds, tweets, comments obtained from one or more sources (e.g., social websites such as LinkedIn®, Twitter®, Facebook®, microblogging websites, and the like). Social media handles (e.g., posts from users) may be collected as text files. Further, the first social media data is filtered by identifying one or more stop words, and one or more expressions to obtain a first filtered social media data. For example, expressions may refer to (i) emoticons that enable determining the context of media data and the topics and/or discussion being engaged by the one or more users, (ii) special characters or symbols for example, @, !, #, and the like. For Example: at first step, raw Twitter® post are collected (e.g., English language posts) and the data is cleaned by filtering regular expression, stop words. In an embodiment, numerical part (if any) is also removed in this step. This preprocessing is required to avoid unnecessary words and alpha numeric strings. Cleaned (or cleansed) data is parsed into set of words and stop words are removed from the text. Finally, the output is a word token, which contains only English words. 
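For illustration only, the filtering step described in the remarks above can be sketched as follows. This is a minimal Python sketch, not the applicant's implementation: the stop-word list, the emoticon pattern, the treatment of URLs, the ASCII-letters test standing in for "English words", and the name filter_post are all assumptions introduced here.

```python
import re
import string

# Hypothetical stop-word list; the remarks do not specify one.
STOP_WORDS = {"the", "a", "an", "and", "or", "is", "was", "are",
              "to", "of", "in", "at", "on", "for", "so"}

# Illustrative pattern for whitespace-delimited text emoticons such as :) ;-( :D
EMOTICON_RE = re.compile(r"(?<!\S)[:;=][\-o*']?[)\](\[dDpP/\\|](?!\S)")
# Assumption: URLs are treated as noise and removed before tokenizing.
URL_RE = re.compile(r"https?://\S+")

def filter_post(raw_post):
    """Return lowercase word tokens with stop words, emoticons, special
    characters, numbers, and alphanumeric strings removed."""
    text = URL_RE.sub(" ", raw_post)
    text = EMOTICON_RE.sub(" ", text)
    tokens = []
    for chunk in text.split():
        # Strip special characters such as @, !, # from both ends.
        core = chunk.strip(string.punctuation)
        # Keep only purely alphabetic ASCII tokens; this drops numbers,
        # alphanumeric strings such as "gr8", and empty leftovers.
        if not core.isascii() or not core.isalpha():
            continue
        word = core.lower()
        if word not in STOP_WORDS:
            tokens.append(word)
    return tokens

print(filter_post("@bob The game at 9pm was gr8 :) #win!!! so happy"))
```

The sketch keeps the text of a mention or hashtag once its leading symbol is stripped; whether such tokens should instead be discarded is not stated in the remarks.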
Examiner Response:
	The examiner respectfully disagrees. Glickman, in [0114] (“One way to leverage such social media is to obtain a panel of social media users, suitably identified e.g. by their username for whom explicit demographic characteristics may be supplied in profile pages.”) and [0175] (“Identify posts referencing web pages or web content e.g. by searching for url's”), teaches that the webpages of Glickman are related to social media streams. The information is gathered from social media platform users based on their usernames (i.e., the social media handles of the first set of users). Posts are identified from the social media web pages.
Applicant Remarks:
Further, Glickman discloses obtaining, as input, individual pages each tied to demographic characteristics of a specific user that viewed the page (examples are click-stream data of users from panel with known demographics, URLs supplied from a social media stream from users with specified demographics in profile, etc.). Glickman further describes extracting features from the page for learning. Features include content words in page textual content, stylistic writing features (use of parts of speech such as but not limited to all or certain pronouns, sentence length) as well as html specific features (bold, color, images, etc.). 
Nowhere does Glickman disclose "filtering said first social media data by identifying and removing one or more stop words, alpha numeric strings, and one or more expressions to obtain a first filtered social media data, wherein the expressions are emoticons and special characters, and wherein the first filtered social media data is obtained as text files containing only English word", as recited in amended claim 1. 
Examiner Response:
	The examiner notes that the arguments listed above are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. 
	In response to the arguments regarding the limitation “containing only English word”, Glickman in Paragraph [0024] (“a processor to analyze page content (e.g. the words in the text appearing in a page)”) teaches the page content as being words in the text on the page. The examiner interprets this as containing only English words. The examiner further notes that Glickman does not explicitly teach text files that use characters other than English-word text. Alphanumeric and non-alphanumeric characters are taught by the newly added Towell reference.
Applicant Remarks:
Additionally, amended Claim 1 recites inter alia: "(iii) generating a word embedding matrix comprising a first set of co- occurrence words from said first filtered social media data, each co-occurrence word from said first set of co-occurrence words is represented as a vector 
The present application proposes generating a first word embedding matrix comprising a first set of co-occurrence words (e.g., two or more co-occurrence words) from the first filtered social media data, wherein each co-occurrence word from the first set of co-occurrence words is represented as a vector comprising a context. For example, the vector difference between man and woman is similar to the vector difference between king and queen, and similar vectors lie almost next to each other. After data pre-processing, only English words are retained for every post or blog (if posts are made in the English language). These words are then mapped to vectors using word embedding. In an embodiment, the present disclosure implements two ways to generate vectors from words: continuous bag of words (CBOW) and Skip-Gram. CBOW predicts the current word based on its context. CBOW has one hidden layer; its input is the context and its output is the word relevant to the given context. Skip-Gram predicts the context based on a given word. This model generates similar context from a given word. Skip-Gram uses a neural network model to find similar words. It takes one word and predicts similar words. In other words, it is a map from word to context. The system 100 is implemented in such a way that it also works better with small training data. This provides a window size that could be used to determine how far forward and backward to look for the context. The target word is the input in this model and the context words are the output in this model. CBOW and Skip-Gram are both essentially neural networks with one hidden layer. CBOW uses one or more unsupervised models to obtain (or generate) center word from surrounding 
For instance: conventional techniques that learn word contexts and represent them in a vector space cannot capture the different meanings a word can take depending on context (for example, Bank can mean a River Bank or a Financial Bank), which the proposed method/system would distinguish depending on the context. 
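For illustration only, the difference between the two models described in the remarks above (CBOW maps a context to its center word; Skip-Gram maps a word to its context, within a window that looks forward and backward) can be sketched by building the training pairs each model would see. This is a minimal Python sketch under stated assumptions: it constructs training examples only and does not train a neural network, and the names context_window, cbow_examples, and skipgram_examples are hypothetical.

```python
def context_window(tokens, i, window):
    """Words up to `window` positions before and after position i."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right

def cbow_examples(tokens, window=2):
    """(context words, center word) pairs: the context is the input
    and the center word is the prediction target."""
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]

def skipgram_examples(tokens, window=2):
    """(center word, context word) pairs: the center word is the input
    and each surrounding word is a prediction target."""
    return [(tokens[i], c)
            for i in range(len(tokens))
            for c in context_window(tokens, i, window)]

tokens = ["river", "bank", "flooded"]
print(cbow_examples(tokens, window=1))
print(skipgram_examples(tokens, window=1))
```

With a window of 1, the word "bank" is paired with "river" and "flooded", which is the kind of surrounding context the remarks rely on to separate a river bank from a financial bank.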
Examiner Response:
	The examiner respectfully disagrees. Gao teaches a method of converting words into vector form, as recited in Paragraph [0015]: “Word embedding is a collective name for a set of language modeling and feature learning techniques in natural language processing where words from the vocabulary (and possibly phrases thereof) may be mapped to vectors of real numbers in a space that is low-dimensional relative to the vocabulary size.” The claimed limitation requires that two or more co-occurrence words are transformed into a word embedding matrix. Gao reads on the step of embedding different vocabulary words from a data source into vector form for further processing. Further, the claim requires a continuous bag-of-word (CBOW) model and a continuous skip-gram model, which are further taught in Paragraph [0015]: “Methods such as the continuous bag-of-word (CBOW) model and the continuous skip-gram model may leverage the 
	In response to the newly added amendments the examiner notes that, “the Skip-Gram predicts similar contexts from a given word” is taught by Paragraph [0048] of Gao “skip-gram model that predicts a target word based on its context, the neural network architecture includes a contextual information branch and a parallel morphological knowledge branch, which leverages morphological knowledge to assist in predicting a target word. Intuitively, if a word wt of a vocabulary space  is the central word in a context window, the method may predict, in the contextual information branch, the surrounding words by leveraging not only the representation of word wt as contextual information, but also the representations of the words that are morphologically similar to wt,”. The cited portion of Gao teaches predicting in the contextual branch the surrounding words by leveraging not only the representation of word wt as contextual information. 
Gao further teaches “the CBOW predicts words relevant to a given context” in Paragraph [0015] “Word embedding is a collective name for a set of language modeling and feature learning techniques in natural language processing where words from the vocabulary (and possibly phrases thereof) may be mapped to vectors of real numbers in a space that is low-dimensional relative to the vocabulary size. Methods such as the continuous bag-of-word (CBOW) model and the continuous skip-gram model may leverage the surrounding context of a word in documents to transform words into vectors (i.e., word embeddings) in a continuous space, which may capture both semantic and syntactic relationships between words. In one underlying principle, words that are syntactically or semantically similar may likely have similar surrounding contexts.” The CBOW is used with the Skip-gram model to predict new words 
Applicant Remarks:
Similarly, Gao merely discusses word embedding, which is a collective name for a set of language modeling and feature learning techniques in natural language processing where words from the vocabulary (and possibly phrases thereof) may be mapped to vectors of real numbers in a space that is low-dimensional relative to the vocabulary size. In view of this, Gao merely mentions various methods such as the continuous bag-of-word (CBOW) model and the continuous skip-gram model that may leverage the surrounding context of a word in documents to transform words into vectors (i.e., word embeddings) in a continuous space. 
However, Gao does not teach generating word embedding matrix comprising co-occurring words using both CBOW and a Skip-Gram. 
So, both Gao and Glickman are completely silent about generating a word embedding matrix comprising a first set of co-occurrence words (e.g., two or more co-occurrence words). Moreover, both Gao and Glickman are completely silent about: "vector representation of each co-occurrence word is achieved by building a neural network model from a continuous bags of words (CBOW) and a Skip-Gram, and wherein the CBOW predicts words relevant to a given context, and the Skip-Gram predicts similar contexts from a given word", as recited in amended claim 1. 
Based on the above, as Gao and Glickman fail to disclose steps (i) to (iii), Gao and Glickman can be contended to be reciting: 

wherein said second social media data pertains to a second set of users; 
(vii) repeating the steps of (ii) till (iv) to obtain a second set of vectors for said second set of users based on said second social media data; 
(viii) applying said one or more trained machine learning techniques (MLTs) on said second set of vectors and context associated with each of said second set of vectors," 
(Emphasis added) 
Examiner Response:
	The examiner respectfully disagrees. The limitation “generating word embedding matrix comprising co-occurring words using both CBOW and a Skip-Gram” is further taught by Gao, Paragraph [0029]: “represent an input word as an n-dimensional representation; find a set or group of words that are morphologically similar to the input word; extract embedding of individual words among the set of words from an embedding matrix based, at least in part, on contextual information associated with the input word; and generate a target word having a meaning based, at least in part, on the set of words and the contextual information.” The examiner notes that Gao further teaches leveraging the surrounding context of a word in documents to transform words into vectors (i.e., word embeddings) in a continuous space, which may capture both semantic and syntactic relationships between words.
Applicant Remarks:
Moreover, the Examiner contended that Glickman at [0143]-[0144] teaches the claim limitation: "predicting an age and a gender of each user from said second set of users upon 
Applicant respectfully disagrees with this matching. In response, Applicant submits that the proposed method comprises applying the one or more trained machine learning techniques (herein referred to as MLTs or trained MLTs) on a second set of vectors and context associated with each of the second set of vectors to predict age and gender of the second set of users. In an embodiment, the step of applying the one or more trained MLTs includes selecting at least a subset of the one or more machine learning techniques based on a training level of the MLTs. In another embodiment, the subset of the trained one or more machine learning techniques is selected based on one or more weight(s) assigned to the one or more machine learning techniques during training. For example, a first machine learning technique may be assigned less weight compared to a second machine learning technique during training and accordingly used for prediction. Similarly, the first machine learning technique may be assigned higher weight compared to the second machine learning technique during training and accordingly used for prediction. As disclosed in paragraph [0063] of the present application, the system 100 is able to provide more accurate prediction by implementing the trained MLTs, where the weights are assigned to these trained MLTs at the time of training, and predicting age and gender for subsequent social media handles. The training of MLTs has enabled the system 100 to select and weight these MLTs for prediction which may be further dependent on how closely each MLT represents (or has represented) observed conditions in the present and in the recent past. 
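For illustration only, the selection and weighting of trained MLTs described in the remarks above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the method of the present application: the remarks do not say how weights are computed or combined, so the top-k selection rule, the normalized weighted average of class probabilities, and the names select_models and weighted_vote are all hypothetical.

```python
def select_models(weights, top_k):
    """Keep the names of the top_k models by training-time weight."""
    return sorted(weights, key=weights.get, reverse=True)[:top_k]

def weighted_vote(probabilities, weights, selected):
    """Combine per-model class probabilities using weights normalized
    over the selected models; return (label, combined score)."""
    total = sum(weights[m] for m in selected)
    combined = {}
    for m in selected:
        for label, p in probabilities[m].items():
            combined[label] = combined.get(label, 0.0) + weights[m] / total * p
    best = max(combined, key=combined.get)
    return best, combined[best]

# Hypothetical weights assigned during training and per-model outputs.
weights = {"svm": 0.5, "logreg": 0.3, "tree": 0.2}
probabilities = {
    "svm": {"male": 0.8, "female": 0.2},
    "logreg": {"male": 0.4, "female": 0.6},
    "tree": {"male": 0.1, "female": 0.9},
}
selected = select_models(weights, top_k=2)
print(selected, weighted_vote(probabilities, weights, selected))
```

Under this assumed scheme the lower-weighted model is excluded from the prediction, and the returned score plays the role of the probability score associated with the predicted label.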
Whereas, Glickman merely mentions that a learning algorithm is applied once and a separate model is learned for each demographic type (e.g. age, gender, marital status). Thus, the 
Examiner Response:
	The examiner notes that Glickman in Paragraph [0069] teaches “In accordance with an embodiment of the presently disclosed subject matter, there is yet further provided a system wherein the audience information gatherer runs iteratively over users, by following suitable links, so as to obtain a larger set of different users.” Obtain a second social media data from one or more sources, wherein said second social media data pertains to a second set of users is taught as obtain a larger set of different users. In paragraph [0070] the audience information gatherer extracts demographic characteristics on social media platforms.
	Glickman in Paragraph [0030] recites “develop a prediction process which if applied to the content characteristic would have predicted the training data, wherein the prediction process is constructed and operative to be employed by an audience predicting processor operative, for at least one new webpage whose audience is unknown, to compute at least one content characteristic of the new webpage and to generate predicted demographic information predicted to characterize the unknown audience of the new webpage by applying the prediction process to the new webpage's content characteristic” Predicting an age and gender of each user from said second set of users upon said one or more machine learning techniques applied … is taught as 
	In response to the applicant's argument that “a learning algorithm is applied once and a separate model is learned for each demographic type”, the examiner refers the applicant to claim 34 of Glickman: “webpage audience predicting processor is operative to perform a new webpage learning process including comparing output of said prediction process as applied to said new webpage, to new webpage audience information, if known, and updating the prediction process accordingly.” The prediction model is updated based on a second audience or second data.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 6, and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Glickman (U.S. 20120209795) in view of Gao (U.S. 20170011289) and Towell (U.S. 20130307779).
Regarding claim 6, A system comprising: a memory storing instructions (Glickman: Paragraph [0078] “machine-readable memory”); one or more communication interfaces (Glickman: Paragraph [0079] “devices may communicate via any conventional wired or wireless digital communication means, e.g. via a wired or cellular telephone network or a computer network such as the Internet.”); and one or more hardware processors communicatively coupled to said memory using said one or more communication interfaces (Glickman: Paragraph [0080] “machine readable memory containing or otherwise storing a program of instructions which, when executed by the machine,”), wherein said one or more hardware processors are configured by said instructions to: 
obtain a first social media data from one or more sources, said social media data pertains to a first set of users (Glickman: Paragraph [0155] “Obtain, e.g. as input individual pages each tied to demographic characteristics of a specific user that viewed the page (examples are click-stream data of users from panel with known demographics, URLs supplied from a social media stream from users with specified demographics in profile, etc.)” Obtain a first social media data from one or more sources, said social media data pertains to a first set of users is taught as obtain input individual pages tied to demographic characteristics. Refer to paragraph [0155] “Extract features from page for learning. Features include content words in page textual content,” Features are extracted from the pages.), wherein the first social media data are raw post from social media handles of the first set of users (Glickman: Paragraph [0114] “One way to leverage such social media is to obtain a panel of social media users, suitably identified e.g. by their username for whom explicit demographic characteristics may be supplied in profile pages.” [0175] “Identify posts referencing web pages or web content e.g. by searching for url's” The examiner notes that the webpages of Glickman are related to social media streams. The information is gathered from social media platform users based on the usernames (i.e. social media handles of the first set of users). Posts are identified from the social media web pages.), is obtained as text files (Glickman: Paragraph [0029] “Other embodiments may analyze other content types such as pdf files.” The first social media data is obtained as text files is taught as analyzing other content types such as pdf files. The pdf files are analyzed for Textual features. Refer to Paragraph [0147].); 
filter said first social media data by identifying and removing one or more stop words, …, and one or more expressions to obtain a first filtered social media data (Glickman: Paragraph [0148] “Filtering the huge amount of textual features so as to leave only meaningful textual features is known. Stop word lists (e.g. as defined in Wikipedia) allow filtering of non-informative words such as “the” or “and”.” Filter said first social media data by identifying one or more stop words, and one or more expressions to obtain a first filtered social media data is taught as filtering the huge amount of textual features so as to leave only meaningful textual features (i.e. removing one or more stop words).)…, and wherein the first filtered social media data is obtained as text files (Glickman: Paragraph [0029] “Other embodiments may analyze other content types such as pdf files.” The first social media data is obtained as text files is taught as analyzing other content types such as pdf files. The pdf files are analyzed for Textual features. Refer to Paragraph [0147].) containing only English word (Glickman: Paragraph [0024] “use a processor to analyze page content (e.g. the words in the text appearing in a page)” Containing only English word is taught as analyze page content (e.g. the words in the text appearing in a page).);… first filtered social media data (Glickman: Paragraph [0155] “Obtain, e.g. as input individual pages each tied to demographic characteristics of a specific user that viewed the page (examples are click-stream data of users from panel with known demographics, URLs supplied from a social media stream from users with specified demographics in profile, etc.)” Obtain a first social media data from one or more sources, said social media data pertains to a first set of users is taught as obtain input individual pages tied to demographic characteristics.)… social data submitted by each user (Glickman: Paragraph [0155] “Obtain, e.g. as input individual pages each tied to demographic characteristics of a specific user that viewed the page (examples are click-stream data of users from panel with known demographics, URLs supplied from a social media stream from users with specified demographics in profile, etc.)” Obtain a first social media data from one or more sources, said social media data pertains to a first set of users is taught as obtain input individual pages tied to demographic characteristics. Refer to paragraph [0155] “Extract features from page for learning. Features include content words in page textual content,” Features are extracted from the pages.)…
train one or more machine learning techniques using said first set of vectors and context associated with said first set of vectors to obtain one or more trained machine learning techniques (Glickman: Paragraph [0155] “Run a known Supervised Learning classification algorithm over the training data to obtain a classification model. Examples of classification algorithms that may be used include Support Vector Machines” Train one or more machine learning techniques using said first set of vectors and context associated with said first set of vectors to obtain one or more trained machine learning techniques is taught as run a known supervised learning classification algorithm over the training data to obtain a classification model.);
obtaining a second social media data from one or more sources, wherein said second social media data pertains to a second set of users (Glickman: Paragraph [0069] “In accordance with an embodiment of the presently disclosed subject matter, there is yet further provided a system wherein the audience information gatherer runs iteratively over users, by following suitable links, so as to obtain a larger set of different users.” Obtain a second social media data from one or more sources, wherein said second social media data pertains to a second set of users is taught as obtain a larger set of different users. In paragraph [0070] the audience information gatherer extracts demographic characteristics on social media platforms.); 
repeating the steps of (ii) to obtain a second set of vectors for said second set of users based on said second social media data (Glickman: Paragraph [0148] “Filtering the huge amount of textual features so as to leave only meaningful textual features is known. Stop word lists (e.g. as defined in Wikipedia) allow filtering of non-informative words such as “the” or “and”.” Glickman teaches step (ii). Glickman further teaches a method of training and testing. Paragraph [0218] “The same method is used to scale both training (learning) and testing (classification) data.” An ordinary artisan would understand applying a trained model to a second data set for testing. Glickman also further explains obtaining a larger data set with different users in Paragraph [0069]; this reads on second social media data.); 
applying said one or more trained machine learning techniques (MLTs) on said second set of vectors and context associated with each of said second set of vectors (Glickman: Paragraph [0071] “and using content items represented as features along with corresponding target demographic characteristics as input to a supervised learning process thereby to obtain a prediction model predicting demographic characteristics of a page audience, as a function of page features.” Apply said one or more machine learning techniques (MLTs) on said second set of vectors and context associated with each of said second set of vectors is taught as using content items represented as features along with corresponding target demographic characteristics as input to a supervised learning process thereby to obtain a prediction model predicting demographic characteristics of a page audience, as a function of page features.); 
AMENDMENT AND RESPONSE UNDER 37 CFR § 1.116, Page No. 3, Serial Number 15/630,659, Filing Date June 22, 2017, Docket No. 13178 0231
and predicting an age and gender of each user from said second set of users upon said one or more machine learning techniques applied on said second set of vectors and said context associated with each of said second set of vectors (Glickman: Paragraph [0030] “develop a prediction process which if applied to the content characteristic would have predicted the training data, wherein the prediction process is constructed and operative to be employed by an audience predicting processor operative, for at least one new webpage whose audience is unknown, to compute at least one content characteristic of the new webpage and to generate predicted demographic information predicted to characterize the unknown audience of the new webpage by applying the prediction process to the new webpage's content characteristic” Predicting an age and gender of each user from said second set of users upon said one or more machine learning techniques applied … is taught as developing a trained model then applying it to a new webpage in which the demographic information (i.e. Each such individual user has associated demographic characteristics such as but not limited to age, gender, occupation, and education level.) is predicted. Glickman further teaches a method of training and testing. Paragraph [0218] “The same method is used to scale both training (learning) and testing (classification) data.” An ordinary artisan would understand applying a trained model to a second data set for testing. Glickman also further explains testing on a new webpage after the prediction model is developed; this reads on second social media data.), wherein each of said predicted age and said predicted gender are associated with a probability score (Glickman: Paragraph [0144] “The result of the classification process typically includes, for each input-content and for each demographic characteristic type, a characteristic value along with a corresponding confidence score for that value.” Each demographic characteristic type [age or gender] is associated with a confidence score [probability score] for the value.).

Glickman does not explicitly teach the following limitations:
alpha numeric strings…, wherein the expressions are emoticons and special characters…
generate a word embedding matrix comprising a first set of co-occurrence words from said…, each co-occurrence word from said first set of co-occurrence words is …, and wherein the CBOW predicts words relevant to a given context, and the Skip-Gram predicts similar contexts from a given word; 
aggregate one or more vectors pertaining each social data submitted by each user to obtain a first set of vectors for said first set of users; …
repeating the steps of (ii) till (iv) to obtain a second set of vectors for said second set of users based on said second social media data

Gao further teaches
generate a word embedding matrix comprising a first set of co-occurrence words from said …, each co-occurrence word from said first set of co-occurrence words is represented as a vector comprising context, wherein the vector representation of each co-occurrence word is achieved by building a neural network model from a continuous bags of words (CBOW) and a Skip-Gram (Gao: Paragraph [0015] “Word embedding is a collective name for a set of language modeling and feature learning techniques in natural language processing where words from the vocabulary (and possibly phrases thereof) may be mapped to vectors of real numbers in a space that is low-dimensional relative to the vocabulary size. Methods such as the continuous bag-of-word (CBOW) model and the continuous skip-gram model may leverage the surrounding context of a word in documents to transform words into vectors (i.e., word embeddings) in a continuous space, which may capture both semantic and syntactic relationships between words.” Generate a word embedding matrix comprising a first set of co-occurrence words… each co-occurrence word from said first set of co-occurrence words is represented as a vector comprising context is taught as the word embeddings that are created by transforming words into vectors. Wherein the vector representation of each co-occurrence word is achieved by building a neural network model from a continuous bags of words (CBOW) and a Skip-Gram is taught as methods such as the continuous bag-of-word (CBOW) model and the continuous skip-gram model may leverage the surrounding context of a word in documents to transform words into vectors [Paragraph 0018 “a neural network architecture that can leverage morphological word similarity for word embedding.”].) 
and wherein the CBOW predicts words relevant to a given context (Gao: Paragraph [0015] “continuous bag-of-word (CBOW) model and the continuous skip-gram model may leverage the surrounding context of a word in documents to transform words into vectors (i.e., word embeddings) in a continuous space, which may capture both semantic and syntactic relationships between words.” The CBOW predicts words relevant to a given context is taught as the continuous bag-of-word (CBOW) model in a continuous space, which may capture both semantic and syntactic relationships between words. The examiner notes that both models are used in target word prediction.), and the Skip-Gram predicts similar contexts from a given word (Gao: Paragraph [0048] “To incorporate morphological knowledge into the learning process, a method that includes a neural network architecture may be used. Beyond the basic skip-gram model that predicts a target word based on its context, the neural network architecture includes a contextual information branch and a parallel morphological knowledge branch, which leverages morphological knowledge to assist in predicting a target word. Intuitively, if a word wt of a vocabulary space 302 is the central word in a context window, the method may predict, in the contextual information branch, the surrounding words by leveraging not only the representation of word wt as contextual information, but also the representations of the words that are morphologically similar to wt, as determined in the morphological knowledge branch” The Skip-Gram predicts similar contexts from a given word is taught as the skip-gram model that predicts a target word based on its context, the neural network architecture includes a contextual information branch and a parallel morphological knowledge branch, which leverages morphological knowledge to assist in predicting a target word.); 
aggregate one or more vectors pertaining each …[data] to obtain a first set of vectors for said first set of users (Gao: Paragraph [0050] “An aggregated representation of the input word, denoted as vwI, may be calculated as a weighted sum of the representations from the contextual information branch and the morphological knowledge branch:” Aggregate one or more vectors pertaining each …[data] to obtain a first set of vectors for said first set of users is taught as the aggregated representation of the input word as v[weighted representation of the vectors]);…
repeating the steps of (iii) (Gao: Paragraph [0015] “Word embedding is a collective name for a set of language modeling and feature learning techniques in natural language processing where words from the vocabulary (and possibly phrases thereof) may be mapped to vectors of real numbers in a space that is low-dimensional relative to the vocabulary size. Methods such as the continuous bag-of-word (CBOW) model and the continuous skip-gram model may leverage the surrounding context of a word in documents to transform words into vectors (i.e., word embeddings) in a continuous space, which may capture both semantic and syntactic relationships between words.” Generate a word embedding matrix comprising a second set of co-occurrence words… each co-occurrence word from said second set of co-occurrence words is represented as a vector comprising context is taught as the word embeddings that are created by transforming words into vectors. Wherein the vector representation of each co-occurrence word is achieved by building a neural network model from a continuous bags of words (CBOW) and a Skip-Gram is taught as methods such as the continuous bag-of-word (CBOW) model and the continuous skip-gram model may leverage the surrounding context of a word in documents to transform words into vectors [Paragraph 0018 “a neural network architecture that can leverage morphological word similarity for word embedding.”].) [and] (iv) to obtain a second set of vectors for said second set of users based on said second social media data (Gao: Paragraph [0050] “An aggregated representation of the input word, denoted as vwI, may be calculated as a weighted sum of the representations from the contextual information branch and the morphological knowledge branch:” Aggregate one or more vectors pertaining each …[data] to obtain a second set of vectors for said second set of users is taught as the aggregated representation of the input word as v[weighted representation of the vectors])
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the analysis system of webpage audience demographics of Glickman with the continuous bag-of-word (CBOW) model and the continuous skip-gram model of Gao. Doing so would allow leveraging the surrounding context of a word in documents to transform words into vectors (i.e., word embeddings) in a continuous space, which may capture both semantic and syntactic relationships between words (Gao: Paragraph [0015]).
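For illustration only, the distinction drawn above between the CBOW and skip-gram models can be sketched as the training examples each model would see. This is a minimal sketch and not Gao's implementation: the function names, the window size, and the sample tokens are all hypothetical, and no neural network is trained here.

```python
from typing import List, Tuple


def context_window(tokens: List[str], i: int, window: int) -> List[str]:
    """Return the words within `window` positions of index i, excluding tokens[i]."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW-style examples: the surrounding context is the input, the center word is the target."""
    return [(context_window(tokens, i, window), tokens[i]) for i in range(len(tokens))]


def skipgram_examples(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram-style examples: the center word is the input, each context word is a target."""
    return [
        (tokens[i], ctx)
        for i in range(len(tokens))
        for ctx in context_window(tokens, i, window)
    ]


# Hypothetical cleaned post, already reduced to word tokens.
sample_tokens = ["users", "post", "short", "messages"]
cbow_pairs = cbow_examples(sample_tokens, window=1)
skipgram_pairs = skipgram_examples(sample_tokens, window=1)
```

In this sketch the same context window feeds both models; only the direction of prediction differs.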
Glickman in view of Gao does not explicitly disclose alpha numeric strings…, wherein the expressions are emoticons and special characters…
Towell further teaches alpha numeric strings (Towell: Paragraph [0047] “the software system functions to integrate alphanumeric data” Alpha numeric strings is taught as alphanumeric data.)…, wherein the expressions are emoticons (Towell: Paragraph [0007] “As a text-based language, communications involving emoticons are based in predefined alphanumeric characters and are, therefore, limited.” Wherein the expressions are emoticons and special characters is taught as emoticons are based in predefined alphanumeric characters.) and special characters (Towell: Paragraph [0036] “user-driven non-alphanumeric content, message or communication data, and other data” Special characters are taught as non-alphanumeric content.)…
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Glickman and Gao with the alphanumeric and non-alphanumeric characters of Towell in order to utilize alphanumeric/non-alphanumeric text along with emoticons in social media posts, thereby forming a social communication platform, as well as other applications, such as text, email, messaging, posts, tags, comments, word documents which utilize alphanumeric/non-alphanumeric and special characters (Towell: Paragraph [0023] “ a mechanism by which to integrate alphanumeric data with inline user-driven non-alphanumeric content or data to create a new cyber language is provided (FIG. 1). The software system may also form a social communication platform, as well as other applications, such as text, email, messaging, posts, tags, comments, word documents,”).


Claims 1 and 11 are similarly rejected; refer to claim 6 for further analysis.


Claims 4, 5, 9, 10, 14, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Glickman (U.S. 20120209795) in view of Gao (U.S. 20170011289), Towell (U.S. 20130307779), and Kariv (U.S. 20120023041).
Regarding claim 9, Glickman in view of Gao and Towell teaches the system of claim 6. Glickman in view of Gao does not explicitly disclose wherein said one or more hardware processors are configured to select at least a subset of said one or more machine learning techniques based on a training level of the MLTs.
Kariv further teaches wherein said one or more hardware processors are configured to select at least a subset of said one or more machine learning techniques based on a training level of the MLTs (Kariv: Paragraph [0034-0035] “The combined output is then preferably analyzed according to a threshold in order to determine the model of the performance of the computer network. The threshold is more preferably related to the predictive performance of the model… If the model fails to meet the threshold of predictive performance, then optionally and preferably a new model is created.” Select at least a subset of said one or more machine learning techniques based on a training level of the MLTs is taught as select a model out of the plurality of prediction models that meets the threshold of predictive performance [training level].). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Glickman, Gao and Towell with the generation of models based on performance of Kariv in order to select a model from the plurality of models that meets the threshold of predictive accuracy, thereby removing the models with a lower level of accuracy (Kariv: Paragraph [0037] “Optionally and most preferably, all models are removed for which such removal does not reduce the precision and recall of the system, for example in order to increase efficiency of operation of the predictive model”).
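For illustration only, the threshold-based selection described above can be sketched as a filter over per-model performance scores. This is a minimal sketch under stated assumptions and not Kariv's implementation: the model names, the scores, and the threshold value are hypothetical.

```python
from typing import Dict, List


def select_models(scores: Dict[str, float], threshold: float) -> List[str]:
    """Keep only the models whose predictive-performance score meets the threshold."""
    return sorted(name for name, score in scores.items() if score >= threshold)


# Hypothetical predictive-performance scores for three trained models.
model_scores = {"svm": 0.81, "naive_bayes": 0.62, "random_forest": 0.77}
selected = select_models(model_scores, threshold=0.75)
```

Models falling below the threshold are simply excluded from the selected subset in this sketch.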
Claims 4 and 14 are similarly rejected; refer to claim 9 for further analysis.
Regarding claim 10, Glickman in view of Gao, Towell and Kariv teaches the system of claim 9. Kariv further teaches wherein said at least a subset of said one or more machine learning techniques is selected based on a weight assigned to said one or more machine learning techniques during training (Kariv: Paragraph [0033] “each model is preferably assigned a weight, during the learning phase, which relates to the relative accuracy of the model.” A subset of said one or more machine learning techniques is selected based on a weight assigned to said one or more machine learning techniques during training is taught as each model is preferably assigned a weight, during the learning phase, which relates to the relative accuracy of the model. Kariv selects models based on the weighted accuracy.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Glickman, Gao, and Towell with the generation of models based on performance of Kariv in order to select a model from the plurality of models that meets the threshold of predictive accuracy, thereby removing the models with a lower level of accuracy (Kariv: Paragraph [0037] “Optionally and most preferably, all models are removed for which such removal does not reduce the precision and recall of the system, for example in order to increase efficiency of operation of the predictive model”).
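For illustration only, assigning each model a weight that relates to its relative accuracy, and then selecting a subset by weight, can be sketched as follows. This is a minimal sketch under stated assumptions and not Kariv's implementation: normalizing accuracies so the weights sum to one is an assumed weighting scheme, and the model names, accuracy values, and subset size are hypothetical.

```python
from typing import Dict, List


def assign_weights(accuracies: Dict[str, float]) -> Dict[str, float]:
    """Assign each model a weight proportional to its accuracy (assumed scheme)."""
    total = sum(accuracies.values())
    if total <= 0:
        raise ValueError("accuracies must sum to a positive value")
    return {name: acc / total for name, acc in accuracies.items()}


def top_weighted(weights: Dict[str, float], k: int) -> List[str]:
    """Select the k models with the largest weights, highest weight first."""
    return sorted(weights, key=lambda name: weights[name], reverse=True)[:k]


# Hypothetical accuracies measured during a learning phase.
model_weights = assign_weights({"model_a": 0.5, "model_b": 0.9, "model_c": 0.6})
chosen = top_weighted(model_weights, k=2)
```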

Claims 5 and 15 are similarly rejected; refer to claim 10 for further analysis.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AHSIF A. SHEIKH whose telephone number is (571)272-2607.  The examiner can normally be reached on Mon-Fri 7:30-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached on 571-270-3428.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/A.A.S./Examiner, Art Unit 2123
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123