DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is in response to the communication filed on 5/17/2022. Claims 1-23 are pending in this application. In the response filed 5/17/2022, applicant elected Group I, Claims 1-7, with traverse.
Examiner Note
Applicant is reminded that withdrawn claims should be cancelled prior to allowance.
Priority
This application is a CIP of 16/730,954 filed 12/30/2019. The other CIPs listed do not appear to have support for the instant claims. The assignee of record is Userzoom Technologies, Inc. The listed inventor(s) is/are: Mestres, Xavier; Sanchez, David; Pujol, Xavier; del Castillo, Francesc; Rogers, Robert Derward.
Information Disclosure Statement
The information disclosure statement(s) (IDS) submitted on 2/5/2021 is/are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the IDS(s) is/are being considered by the examiner.
Election/Restrictions
In the response filed 5/17/2022, applicant elected Group I, Claims 1-7, with traverse.
Claim Objections
Claim 1 is objected to because of the following informalities: in line 1, “AI” appears to be an abbreviation for artificial intelligence. The examiner suggests amending the claim to read “artificial intelligence (AI)” at first use and to refer to “AI” alone thereafter, if needed. Appropriate correction is required.
Allowable Subject Matter
Claims 1-7 are allowed.
Citation of Pertinent Prior Art
The prior art made of record and not relied upon, which is considered pertinent to applicant's disclosure, is listed below:
i. US 20210034707 A1 
Description
[0016] The embodiments address a computer-centric and Internet-centric problem of classifying user input questions with unseen text or unseen user behavior into different classes and implement classification task with the built neural network system. The neural network system may be configured to process and classify the user questions associated with unseen text or unseen user behavior by initializing word embeddings and using pre-training clickstream embedding generation network. For example, character embedding may fit for misspelling words, emoticons, infrequent words, and/or new words included in user input questions. For questions with ambiguous text, embodiments described herein may use clickstream embedding representing user browsing behavior to disambiguate the question type.
[0020] In some embodiments, the neural networks described herein may be configured to classify text and available clickstreams by utilizing deep learning algorithms to train text embeddings and clickstream embedding separately. The neural networks may be configured to use a pre-trained LSTM neural network to extract clickstream features associated with user behavior and question text. Further, the neural network system may utilize a concatenation module to concatenate the extracted word-based features and character-based features along with the clickstream embeddings to form a representation vector indicative of user behavior and question text. The representation vector may be fed into a fully connected feed-forward network which is configured to predict different classes for the user input questions. In some embodiments, the output layer of the neural network system may provide binary class labels and/or numeric scores to the input questions based on the processing results.
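For illustration only, the concatenation and classification steps described in [0020] could be sketched as follows. This is a minimal sketch, not the reference's implementation: the function names, the toy feature vectors, and the weights are all assumptions, and a single fully connected layer with a softmax stands in for the feed-forward network.

```python
import math

def concatenate(*feature_vectors):
    """Join several feature vectors end to end into one representation vector."""
    representation = []
    for vector in feature_vectors:
        representation.extend(vector)
    return representation

def dense(inputs, weights, biases):
    """One fully connected layer; weights holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy feature vectors (made-up numbers, not taken from the reference).
word_features = [0.2, -0.1]
char_features = [0.5]
clickstream_embedding = [0.3, 0.7, -0.4]

representation = concatenate(clickstream_embedding, word_features, char_features)

# Two output classes, so two weight rows, each as long as the representation.
weights = [
    [0.1, 0.2, 0.0, -0.3, 0.4, 0.1],
    [-0.2, 0.1, 0.3, 0.2, -0.1, 0.0],
]
biases = [0.0, 0.1]

class_scores = softmax(dense(representation, weights, biases))
predicted_class = class_scores.index(max(class_scores))
```

The numeric scores in `class_scores` correspond to the "numeric scores" output mentioned in the paragraph, and `predicted_class` to a class label.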
[0037] To generate a classification model for classifying user questions, data to be processed may be provided to include non-normalized user question text and related clickstream data of last n clicks/pages. Each input dataset 210 may be labeled to include multiple features of respective user questions. In text processing, question text may represent discrete and categorical features. The labeled dataset associated with a particular question may include various attributes or features. The clickstream data 242 associated with the user questions may represent features of user behaviors when users ask the related questions. The related clickstream data 242 may include page identifier (ID), page title, time spent on page for the last n clicks on visited pages, etc. A page ID may be one-hot encoded vector. Time spent on each page may be passed as a continuous domain feature. A page title may be included in the question text and be processed for word extraction and summing word embeddings through the neural network system 200 using the same methods described below. For example, the text of the page title may include t words. The pre-trained character and word embeddings may be used for each of those t words to generate corresponding title embeddings. The corresponding title embeddings may be aggregated using different methods to form one vector representation of the title. The methods may include averaging title embeddings, performing self-attention based weighted average or concatenation on title embeddings, for example.
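As a hedged illustration of the clickstream features described in [0037], the sketch below builds one feature vector per click from a one-hot page ID, the time spent on the page as a continuous value, and a title embedding formed by averaging word embeddings. The helper names, the two-dimensional toy embeddings, and the feature ordering are assumptions made for this example; the reference also mentions attention-weighted averaging and concatenation, which are not shown.

```python
def one_hot(index, size):
    """Encode a page index as a vector with a single 1.0 entry."""
    vector = [0.0] * size
    vector[index] = 1.0
    return vector

def average_embeddings(vectors):
    """Element-wise mean of equally sized vectors (empty input gives [])."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def click_features(page_index, num_pages, seconds_on_page, title, word_embeddings):
    """Build the feature vector for one click: page ID, dwell time, title embedding."""
    # Words without a known embedding are skipped in this sketch.
    title_vectors = [
        word_embeddings[word]
        for word in title.lower().split()
        if word in word_embeddings
    ]
    title_embedding = average_embeddings(title_vectors)
    return one_hot(page_index, num_pages) + [float(seconds_on_page)] + title_embedding

# Hypothetical pre-trained word embeddings (toy values).
toy_embeddings = {"tax": [1.0, 0.0], "refund": [0.0, 1.0]}

features = click_features(2, 4, 12.5, "Tax Refund", toy_embeddings)
```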
[0057] In some embodiments, the third neural network 240 may be a pre-trained Siamese network 610 with an algorithm implemented by multiple Long Short-Term Memory (LSTM) to generate the clickstream embeddings associated with user entry behavior associated with the user input questions. The Siamese network is a learning framework that may be applied to any type of networks. For example, it may be a convolutional network when inputs are images. It may be a recurrent network such as LSTM when input is language or time series data.
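The defining property of the Siamese framework described in [0057] is that the same encoder, with the same weights, is applied to both inputs before their embeddings are compared. The sketch below illustrates only that property. It substitutes a tiny scalar recurrent encoder for the LSTM named in the reference, and the weight values and function names are assumptions.

```python
import math

def encode(sequence, w_in, w_rec):
    """Tiny recurrent encoder: h_t = tanh(w_in * x_t + w_rec * h_(t-1))."""
    hidden = 0.0
    for value in sequence:
        hidden = math.tanh(w_in * value + w_rec * hidden)
    return hidden

def siamese_distance(seq_a, seq_b, w_in=0.5, w_rec=0.8):
    """Encode both sequences with the SAME weights and compare the embeddings."""
    return abs(encode(seq_a, w_in, w_rec) - encode(seq_b, w_in, w_rec))

# Identical click sequences map to the same embedding; reordered ones do not.
same = siamese_distance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = siamese_distance([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Because the encoder is recurrent, the two sequences need not have the same length.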
[0066] In some embodiments, the present disclosure may employ a “zero shot learning” approach by classifying text with deep learning algorithms trained on text embeddings only to solve a tax/product classification task despite not having received sufficient number of click stream training examples of that task. Zero shot learning is a way that the model may handle previously unseen input values as well. For example, a particular clickstream behavior at run time may be totally new and no clickstream behavior of that type has been present in the training dataset. The model may be able to handle such cases as well. In some embodiments, tax category can be detected in user text entry without ever having recorded a clickstream sample of that particular user before.
Claims
20. A method implemented by a computing system, the computing system comprising one or more processors and one or more computer-readable storage devices storing computer-executable computer instructions, the method comprising executing the instructions thereby causing the computing system to perform operations comprising: receiving input datasets associated with user input questions from a database, the input datasets comprising input datasets of the user input questions; extracting character-based features from the input datasets by utilizing a first neural network; extracting word-based features of from the input datasets by utilizing a second neural network; extracting respective clickstreams from clickstream data to generate clickstream datasets; applying a pre-trained Siamese network with the clickstream datasets to generate clickstream embeddings of the clickstream data; concatenating respective clickstream embeddings, the word-based features, and the character-based features of the input datasets to form a representation vector indicative of the question text and related user behavior; predicting, based on the representation vector through a fourth neural network, a first class and a second class of respective user input questions; and assigning a first target class label and a second target class label to respective user input questions.

ii. US 10789643 B1
Description
(24) In one or more embodiments, the behavioral model (128) includes representations of behavior relating to usage of the BMA (106). The representations of usage behavior may be based on clickstream data and/or accounting-related data. In one or more embodiments, the behavioral model (128) is trained using training data (130) (e.g., product usage data and/or clickstream data) labeled as fraudulent by human observers. For example, a behavioral model (128) of usage behavior may be based on some or all of the following data regarding one or more user accounts of the BMA (106) (e.g., user accounts (150A, 150N) of FIG. 1B) to which an accountant has access:
a. total activity in the user account (e.g., including any access or modification to data of the user account, such as changing receivables, payables, or transactions)
b. timing and/or frequency of issuing voided checks
c. changed preferences of the user account
d. deleting a user account
(39) In Step 308, training data including behaviors associated with accountants is obtained. In one or more embodiments, the training data may include data on the spending behavior of accountants (e.g., including known fraudulent accountants). For example, the spending behavior may be based on financial transactions associated with accountants obtained from the BMA and/or third party data sources. Training data for other aspects of accountant behavior may be obtained from a variety of data sources (e.g., social media sites). In one or more embodiments, the training data includes product usage data and/or clickstream data relating to the usage of the BMA, where some of the training data is labeled as fraudulent by human observers.
(56) In an alternate scenario, the cluster generator (122) uses additional information to compute the probability that Betty's loan application (402A) is fraudulent. The cluster generator (122) trains a behavioral model using training data for behaviors associated with accountants, including personal spending behavior and usage behavior relative to the BMA (106). The spending behavior is based on financial transactions associated with accountants obtained from the BMA (106), where the median amount spent is $2000/month. The usage behavior is based on clickstream data relating to the usage of the BMA (106) that is labeled as fraudulent. The fraudulent usage behavior includes voiding a check in the week prior to receiving Betty's loan application (402A) and changing a product preference for the BMA (106) in the month prior to receiving Betty's loan application (402A).

iii. US 20200074006 A1
Description
[0072] In one embodiment, the content recommendation model 120 is trained with a machine learning process. In one embodiment, the content recommendation model 120 is trained with a supervised machine learning process. The supervised machine learning process can include utilizing training set data to train the content recommendation model 120. The training set data can include clickstream data 150 associated with a large number of users of the website service provider 112. The training set data can also include other user related attributes including geolocation, age, marital status, demographics, financial, or other kinds of data related to the users. The content recommendation model 120 is trained to identify, for a large number of webpages, which attributes of users predict various clickstream behaviors. The content recommendation model 120 is trained to identify aspects of the webpage that are most likely to be relevant to a user based on the user related data 132.

iv. US 20190317739 A1
Description
[0021] More particularly, in some examples, visual and textual descriptions of a GUI provided by a designer are captured and processed through a series of artificial intelligence (AI) processes to first identify the visual aspects and features to be included in the GUI design, and then to generate instructions (e.g., executable code or script) defining a design proposal reflective of the identified aspects and features. In some examples, many different design proposals may be generated as specified by corresponding instructions and/or code automatically generated for each such design proposal. Automatic generation of the wireframes, styles, and mockups 106, 108, 110, as well as the subsequent code generation can significantly reduce the time-to-prototype. Furthermore, inasmuch as the instructions and/or code is generated ready for use in creating the prototype 114, any changes in design can be immediately translated to the prototype stage, thereby reducing the time and effort required by human designers and/or programmers per iteration. Further, users of the example systems disclosed herein need not be graphic designers and/or computer programmers to generate stylish and/or functional GUIs because the generation of wireframes, styles, and mockups 106, 108, 110 that serve as the basis for a final GUI design are fully automated (subject to basic initial inputs by the user). Further, the code synthesized from the initial user inputs (e.g., sketches and descriptions) may be for any suitable programming language (e.g., hypertext markup language (HTML) for websites, C/C++ for many computer software application GUIs, etc.)
[0024] The DSL generation stage 204 analyzes and processes the textual and visual inputs 202 using a combination of AI models described further below to output a DSL (domain specific language) instructions 212 (e.g., executable code or script) (instructions which may be executed by a machine—perhaps after compiling or other processing into machine code). DSLs are computer languages that are specialized to a specific task or application. The main idea behind DSLs is to facilitate expressing problems and solutions in a specific domain. In examples disclosed herein, the DSL instructions output from the DSL generation stage 204 are based on a DSL that has been particularly defined to represent all the desired visual aspects of a final GUI. With such a DSL, a visual description of a GUI or the underlying building blocks for a final GUI (e.g., wireframes, mockups, etc.) may be defined by one or more DSL statements. For example, a DSL statement may define that a button for a wireframe is a rectangle with an “x” inside. The DSL instructions 212 output from the DSL generation stage 204 contains all necessary DSL statements to fully define a particular GUI design. Therefore, it is possible to render the DSL instructions 212 into a visual representation (e.g., a wireframe and/or mockup). Accordingly, in some examples, a rendering tool is implemented to generate or render an image based on the DSL instructions. In some such examples, the image rendering is implemented as part of the DSL generation stage 204 because such rendered images are compared against the initial input image 206 in subsequent iterations of the AI processes to update the DSL instructions and, therefore, update the rendered image. 
That is, in some examples, the DSL instructions 212 generated by one iteration of the DSL generation stage 204 are used as the basis to generate an updated version of the rendered image that is compared with the user-provided input image in a subsequent iteration of the process resulting in an updated DSL instructions. The process may repeat to repeatedly generate new DSL instructions until the resulting image rendered from the DSL instructions corresponds to the user-provided textual and visual input 202.
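The render-and-compare iteration described in [0024] can be pictured as a simple loop. The sketch below is an assumption-laden toy, not the reference's method: `refine_dsl`, `toy_render`, and `toy_generate` are hypothetical names, a set of element names stands in for an image, and the generator simply adds one missing element per pass in place of the AI models.

```python
def refine_dsl(target, generate, render, max_iterations=10):
    """Regenerate DSL instructions until their rendering matches the target."""
    instructions = []
    for _ in range(max_iterations):
        rendered = render(instructions)
        if rendered == target:
            break  # the rendered image corresponds to the user input
        instructions = generate(target, rendered, instructions)
    return instructions

def toy_render(instructions):
    """Stand-in renderer: the 'image' is the set of elements drawn."""
    return set(instructions)

def toy_generate(target, rendered, instructions):
    """Stand-in generator: add one element that is still missing."""
    missing = sorted(target - rendered)
    return instructions + [missing[0]]

target_image = {"button", "textbox"}
final_instructions = refine_dsl(target_image, toy_generate, toy_render)
```

The iteration cap is a design choice of this sketch so that the loop terminates even if the rendering never matches the target.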
[0036] Once trained, the deployed model may be operated in an inference phase to process data. In the inference phase, data to be analyzed (e.g., live data) is input to the model, and the model executes to create an output. This inference phase can be thought of as the AI “thinking” to generate the output based on what it learned from the training (e.g., by executing the model to apply the learned patterns and/or associations to the live data). In some examples, input data undergoes pre-processing before being used as an input to the machine learning model. Moreover, in some examples, the output data may undergo post-processing after it is generated by the AI model to transform the output into a useful result (e.g., a display of data, an instruction to be executed by a machine, a GUI and/or a mockup of a GUI, etc.).
[0038] As mentioned above, in some examples, RNNs are implemented in disclosed examples because they have the ability to process variable length sequences. A common application for RNNs is in natural language processing where language data (e.g., speech and/or text) to be processed can vary in length (e.g., from a few words, to a full sentence, to multiple sentences, etc.). More particularly, RNNs are effective at both (1) interpreting or understanding language data and (2) generating language data. Thus, in some examples, an RNN is used during the DSL generation stage 204 of FIG. 2 to interpret or understand the user-provided textual description 208 of a GUI design. Further, in some examples, the encoding of the textual description 208 by a first RNN are provided as an input to a second RNN used to generate DSL statements to be included in the DSL instructions 212 output from the DSL generation stage 204. In some examples, the results of the image analysis performed by the CNN serves as the basis for a separate input to second RNN (in addition to the encoded representation of the textual description 208). That is, in some examples, the second RNN analyzes encoded information associated with the user-provided input image 206 and/or the textual description 208 to generate DSL commands that serve as the basis to generate DSL statements for the DSL instructions 212. The DSL commands output by the second RNN are in the form of DSL statements and, therefore, could be referred to as DSL statements. However, the DSL commands are incomplete statements because they do not include positional or spatial information (e.g., size and position) for the elements of the GUI being defined. 
Thus, for the sake of clarity, an incomplete DSL statement (output by the RNN that does not include positional information) is referred to herein as a DSL command to distinguish it from a complete DSL statement (that includes position information), which is referred to herein simply as a DSL statement. In some instances, a particular DSL statement may not include positional information. In such situations, the DSL command output by the second RNN is the same as the final DSL statement. Although RNNs may be used to process the user-provided textual and/or visual inputs to generate DSL instructions representative of wireframes and/or mockups for a GUI design, other types of AI models may be used for these purpose such as transformer networks and time-delay convolutional neural networks.
[0056] After generating the DSL instructions 212 through the DSL generation stage 204, the style generation stage 214 can generate style properties or definitions for the DSL statements as needed. In some examples, style properties are defined as optional in the DSL definition. Accordingly, in some examples, style properties are not used to produce the rendered images R.sub.k during the DSL generation stage 204. However, the style properties may be used to generate final mockups and/or instructions (e.g., code or script) to generate a prototype of a GUI design. In some examples, the style generation stage 214 leverages the generator network from generative adversarial networks (GANs). That is, a GAN is an AI system that includes two neural networks including a generator and a discriminator. The process of training a GAN can be described as a contest between the generator that attempts to generate new data that is similar to training data and the discriminator that attempts to distinguish the new data from the training data. The adversarial nature of the two neural networks is what gives generative adversarial networks their names. Over time, the generator network learns to generate new data that is statistically similar to the training data by approximating the distribution of the training data.
[0102] From the foregoing, it will be appreciated that example methods, apparatus and articles of manufacture have been disclosed that employ AI processes to automatically generate mockups of GUI design with little to no human involvement beyond providing initial concepts and requirements via hand drawn sketches (or other visual inputs) and textual descriptions of the visual inputs. Further, example mockups may include automatically generated styles incorporated into the various visual elements in the mockups. The typical time-to-design from initial concepts to a GUI mockup can take hours or days. By contrast, example mockups generated in accordance with teachings disclosed herein are generated in substantially real-time (e.g., within seconds), thereby significantly increasing the efficiency of users in developing GUIs and enabling the rapid iteration through multiple design ideas. Further, in some examples, multiple different GUI designs may be automatically generated (e.g., based on variations in the style properties) to be provided to a user for selection. Further, the example mockups are generated or rendered based on underlying DSL instructions (e.g., executable code or script) that can be directly translated to any suitable programming language for subsequent integration with user diagram flows to form working prototypes.
Conclusion
Any inquiry concerning communications from the examiner should be directed to Michael Keller at (571)270-3863 or michael.keller@uspto.gov. If attempts to reach the examiner are unsuccessful, the examiner’s supervisor, Brian Gillis, can be reached at 571-272-7952.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHAEL A KELLER/
Primary Patent Examiner, Art Unit 2446