DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The amendments were received on 3/30/2021.  Claims 1-9, 11-19, 21, and 22 are pending where claims 1-9 and 11-19 were previously presented, claims 10 and 20 were cancelled, and claims 21 and 22 are newly added.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-9, 11-19, 21, and 22 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception without significantly more. The claim(s) recite(s) reading one or more tuples in the configuration file; instantiating one or more unifier applications specified in a first portion of a respective tuple in the one or more tuples in the configuration file; identifying relevant feature data of the raw feature data; storing the relevant feature data in an output file. These limitations relate to reading data, selecting a method of analyzing/manipulating data, and displaying the results of the analysis.  The concepts are similar to concepts identified by the court as being directed towards a judicial exception that can practically be performed in the human mind including, for example, observations (reading tuples in config file), evaluations (instantiating one or more unifier applications; identify relevant feature data; store relevant feature data), judgments, and opinions.  The particular claim limitations 
This judicial exception is not integrated into a practical application because the recitations of the processors and computer-readable storage devices recite generic computer elements that are utilized to perform generic computer functionality.  With regard to the recitation of the ETL application, this recitation merely describes where the data is coming from, i.e. source of data and no actual functionality is claimed of the ETL application except to provide a matrix (further discussed below).  With regard to the unifier application, this recitation merely describes using computer programs to perform 
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the various additional features do not add any meaningful limitations beyond that of the abstract idea as discussed below.  The claims recite receiving, from an ETL application, a matrix comprising raw feature data, which recites the setup for the mental process step since, to collect and analyze data, one must first receive it; this relates to well-understood, routine, and conventional activity of receiving data over a network and electronic recordkeeping (see MPEP 2106.05(d)).  Similarly, the receiving of the configuration file also relates to well-understood, routine, and conventional activity of receiving data over a network.  Both the storing of the matrix comprising the raw feature data and the storing of the configuration file recite well-understood, routine, and conventional activity of storing and retrieving information in memory and add no meaningful limitations beyond that of the abstract idea.  As for the indication of a standard format, this recites both insignificant extrasolution activity of selecting a type of data to be manipulated and well-understood, routine, and conventional activity of storing and retrieving information in memory, since computer memories have a particular standard format for storing data and different types of memory have different format types.  With regard to the usage of a matrix/table, the particular format of the data recites insignificant extrasolution activity of selecting a type of data as well as electronic recordkeeping where the matrix/table is used to store the recordkeeping data.  With regard to the transmitting the output file limitation, this limitation recites merely the transmitting of the data to some third party 

With regard to claim 2, this claim recites identifying a list of programs to execute/instantiate to perform the unifying/joining of data, which recites field of use and technological environment limitations (listing programs and instantiating programs) that relate to the above identified abstract idea of analyzing the data (i.e. joining/summing/aggregating the data based on user-known rules/procedures (unifier applications)) and add no meaningful limitations beyond that of the abstract idea.
With regard to claim 3, this claim recites particular information that is used for comparisons to filter/sort the raw data to get a matching data set that is meant to be analyzed which relates to insignificant extrasolution activity of selecting a type of data to be manipulated similar to concepts such as limiting a database index to XML tags, in this case limiting the raw data to some range of data.  The matching of data sorts out the unwanted data which recites well-understood, routine, and conventional activity of sorting information.
With regard to claim 4, although the claim mentions the ETL application and identity mapping, these steps relate to mental process steps of recognizing attributes that relate to the same user when analyzing the data, such as two different records having the same last name and first name, or one record using last name and telephone number and another record using only telephone number.  As such, the 
With regard to claim 5, this claim recites one of three joins to perform which recites the above identified mental process step of analyzing data (combining the records from multiple documents onto a single piece of paper or table) which also relates to well-understood, routine, and conventional activity of electronic recordkeeping and sorting information (see MPEP 2106.05(d)).
With regard to claim 6, this claim recites that the output file is a sparse representation that only represents non-zero values which recites insignificant extrasolution activity of selecting a type of data to be manipulated which relates to well-understood, routine, and conventional activity of sorting information (sorting/filtering out zero values for fields) and storing information in memory (storing data in output file).
With regard to claim 7, this claim recites using reader applications to identify a first type of feature data and removing a second type of feature data that is not the first type, which recites insignificant extrasolution activity of selecting a type of data to be manipulated similar to concepts such as limiting a database index to XML tags, in this case limiting the raw data to some range of data.  The matching of data sorts out the unwanted data which recites well-understood, routine, and conventional activity of sorting information.
With regard to claim 8, this claim recites insignificant extrasolution activity and field of use limitations to identify file types and adds no meaningful limitation beyond that of the abstract idea since the storing of a computer file relates to well-understood, routine, and conventional activity of storing information in a memory.
With regard to claim 9, this claim recites usage of a delimiter to identify different types of raw feature data which recites well-understood, routine, and conventional activity of electronic recordkeeping (delimiters to identify columns/attributes as well as start of a new record/row) as well as sorting information (separating first type of data from second type).
With regard to claims 11-19, these claims are substantially similar to claims 1-9 and are rejected for similar reasons as discussed above.
With regard to claims 21 and 22, these claims recite identifying a Unix path expression for a storage location of the relevant feature data of the raw feature data which recites well-understood, routine, and conventional activity of storing information in a memory and appears to merely attempt to recite field of use limitations to apply the abstract idea on a computer.  

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having 

Claims 1-3, 7-9, 11-13, and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Shah et al [US 2015/0213372 A1] in view of Stein et al [US 2019/03247] and Urquhart et al [US 2004/0177062 A1].
With regard to claim 1, Shah teaches a system comprising: one or more processors; and one or more non-transitory computer-readable storage devices storing computing instructions configured to run on the one or more processors and perform acts of: receiving, from an extraction, transform, load (ETL) application, a matrix comprising raw feature data (see paragraphs [0035] and [0039] and [0042]; the system extracts/fetches raw data that has various columns/attributes/features and corresponding values for various users/rows); 
receiving a configuration file over a computer network (see paragraphs [0033], [0081] and [0047]; the system can receive a configuration file);
identifying relevant feature data of the raw feature data (see paragraphs [0044] and [0045]; particular data is identified and assembled into multiple feature vectors);
and transmitting, over the computer network in real time to a model building system, the output file comprising the relevant feature data from the one or more non-transitory computer-readable storage devices, so that machine learning model building applications have immediate access to up-to-date data (see paragraphs [0033]-[0034] and [0039]; the system can pass the assembled/relevant feature data to the model building system so that it has up-to-date data).
Shah does not appear to explicitly teach storing the matrix comprising the raw feature data in a standard format in the one or more non-transitory computer readable 
Urquhart teaches storing the matrix comprising the raw feature data in a standard format in the one or more non-transitory computer readable storage devices (see paragraph [0016]; the system can extract the data and store them in a standard/intermediate format).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify the data extraction system of Shah by utilizing an intermediate data storage format as taught by Urquhart in order to allow for a single format to be used by multiple systems that allow for all the data to be normalized in a single format-type for easier processing/analysis of the data from multiple systems by other components without those components having to individually have to modify/transform all the extracted data during runtime analysis.
Shah in view of Urquhart do not appear to explicitly teach storing the configuration file in a standard format in the one or more non-transitory computer readable storage devices; reading one or more tuples in the configuration file, as stored; instantiating one or more unifier applications specified in a first portion of a respective tuple in the one or more tuples in the configuration file; storing the relevant 
Stein teaches storing the configuration file in a standard format in the one or more non-transitory computer readable storage devices; reading one or more tuples in the configuration file, as stored; instantiating one or more unifier applications specified in a first portion of a respective tuple in the one or more tuples in the configuration file (see paragraphs [0070] and [0047]; the system utilizes a configuration file that includes join configurations for joining/unifying the data).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify the configuration file of the data extraction system of Shah in view of Urquhart by specifying various rules in the configuration file including join configurations as taught by Stein in order to allow for the system to be able to access and utilize data from a variety of environments and already have the means to join/unify the data together thereby expediting the data aggregation by having it automated and joined/unified in a defined and desired manner without having to waste user or system resources to analyze and determine during runtime the best way to join the data thereby having the data aggregated/joined in the desired manner right away and ready to be processed/analyzed.
Shah in view of Urquhart and Stein teach storing the relevant feature data in a standardized format in an output file in the one or more non-transitory computer-readable storage devices (see Stein, paragraph [0070]; see Shah, paragraphs [0033]-[0034] and [0039]; the system can pass the assembled/relevant feature data to the 

With regard to claim 2, Shah in view of Urquhart and Stein teach wherein instantiating the one or more unifier applications based on the configuration file comprises: accessing the configuration file to identify a list of unifier applications comprising the one or more unifier applications; and instantiating the one or more unifier applications (see Stein, paragraphs [0047], [0019], and [0036]; see Shah, paragraph [0045]; the system can access a configuration file to determine how to render the feature vectors including joining/unifying rules used to join/unify the features together in a specified manner).
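For illustration only, the limitation of accessing a configuration file to identify a list of unifier applications and instantiating them can be sketched as follows. The JSON layout, the registry, and the names `sum_unifier`, `max_unifier`, and `load_unifiers` are hypothetical assumptions; neither the claims nor the cited references specify them.

```python
import json

# Hypothetical unifier applications: each combines a list of values into one.
def sum_unifier(values):
    return sum(values)

def max_unifier(values):
    return max(values)

# Hypothetical registry mapping a name in the configuration file to a callable.
UNIFIER_REGISTRY = {"sum": sum_unifier, "max": max_unifier}

def load_unifiers(config_text):
    """Read (name, parameters) tuples and look up the unifier each one names."""
    tuples = json.loads(config_text)
    unifiers = []
    for name, params in tuples:
        if name not in UNIFIER_REGISTRY:
            raise KeyError(f"unknown unifier application: {name}")
        unifiers.append((UNIFIER_REGISTRY[name], params))
    return unifiers

# Assumed configuration contents; the first element of each tuple names the unifier.
CONFIG_TEXT = '[["sum", {"duration_days": 30}], ["max", {"duration_days": 7}]]'
unifiers = load_unifiers(CONFIG_TEXT)
```

In this sketch the first portion of each tuple selects the unifier and the second portion carries its parameters, which mirrors the tuple structure recited in the claims only at a high level.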

With regard to claim 3, Shah in view of Urquhart and Stein teach wherein identifying the relevant feature data comprises: filtering the matrix comprising the raw feature data to remove data from the raw feature data not characterized by a second portion of the respective tuple in the one or more tuples, the second portion of the respective tuple comprising at least one of: (1) start date; (2) a direction; or (3) a duration (see Shah, paragraph [0054]; the system can filter records based on duration/start date, i.e. past 30 days).
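As a minimal sketch of filtering records by a start date and a duration (for example, a 30-day window), assuming a hypothetical row layout with an `event_date` field that is not taken from the claims or the cited references:

```python
from datetime import date, timedelta

def filter_by_window(rows, start, duration_days):
    """Keep rows whose 'event_date' falls in [start, start + duration_days)."""
    end = start + timedelta(days=duration_days)
    return [row for row in rows if start <= row["event_date"] < end]

# Hypothetical raw feature rows.
rows = [
    {"user": "u1", "event_date": date(2021, 3, 1)},
    {"user": "u2", "event_date": date(2021, 1, 15)},
    {"user": "u3", "event_date": date(2021, 3, 29)},
]

# Rows outside the 30-day window starting 2021-03-01 are removed.
recent = filter_by_window(rows, start=date(2021, 3, 1), duration_days=30)
```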

With regard to claim 7, Shah in view of Urquhart and Stein teach wherein identifying the relevant feature data of the raw feature data comprises: instantiating one or more reader applications; identifying, using the one or more reader applications, first 

With regard to claim 8, Shah in view of Urquhart and Stein teach wherein the raw feature data comprises at least one of: a text file; a sequence file; a compressed file; or a partitioned file (see Urquhart, paragraph [0016]; various file types can be utilized by the system for storing data).

With regard to claim 9, Shah in view of Urquhart and Stein teach wherein identifying, using the one or more reader applications, the first data comprising the first type of the raw feature data comprises: identifying a delimiter in the raw feature data; and separating the raw feature data into the first type of the raw feature data and a second type of the raw feature data based upon a location of the delimiter (see Urquhart, paragraph [0016]; Shah, paragraph [0058]; CSV files can be used that have delimiters to identify different types of data).
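By way of example, separating raw data into two parts based upon the location of a delimiter can be sketched as below. The `|` delimiter, the sample record, and the function name `split_on_delimiter` are assumptions chosen for illustration.

```python
def split_on_delimiter(record, delimiter="|"):
    """Split a record into the text before and after the first delimiter."""
    first, found, second = record.partition(delimiter)
    if not found:
        raise ValueError(f"delimiter {delimiter!r} not found in record")
    return first, second

# Hypothetical raw record holding two types of data separated by '|'.
first_type, second_type = split_on_delimiter("user_id=42|clicks=7")
```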

With regard to claims 11-13 and 17-19, these claims are substantially similar to claims 1-3 and 7-9 and are rejected for similar reasons as discussed above.



Claims 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Shah et al [US 2015/0213372 A1] in view of Stein et al [US 2019/03247] and Urquhart et al [US 2004/0177062 A1] in further view of Katzir [US 2008/0285464 A1].
With regard to claim 4, Shah in view of Urquhart and Stein teach all the claim limitations of claim 1 as discussed above.
Shah in view of Urquhart and Stein do not appear to explicitly teach wherein the one or more non-transitory computer-readable storage devices storing the computing instructions are further configured to perform: receiving, from the ETL application, an identity mapping to a specific user to which the raw feature data pertains; and joining, using the identity mapping, the one or more unifier applications such that the specific user is identified, wherein each of the one or more unifier applications are joined using different unique IDs in the identity mapping.
Katzir teaches an identity mapping to a specific user to which the raw feature data pertains (see paragraphs [0053], [0061], and [0064]-[0068]; the system can utilize an identity cluster to be able to map to a specific user no matter what identifier information/application they use). 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify the user activity monitoring system of Shah in view of Urquhart and Stein by incorporating means to analyze user identifying information to create an aggregate/clustered identifier for the user as taught by Katzir in order to allow the system to be able to track user activity across various locations and different devices thus allowing the system to be able to acquire a more complete picture of user behavior and activity across a variety of 
Shah in view of Urquhart and Stein in further view of Katzir teach wherein the one or more non-transitory computer-readable storage devices storing the computing instructions are further configured to perform: receiving, from the ETL application, an identity mapping to a specific user to which the raw feature data pertains (see Katzir, paragraphs [0079], [0064]-[0068], [0061], and [0053]; see Stein, paragraphs [0070] and [0047]; see Shah, paragraphs [0035] and [0039] and [0042]; the system utilizes the ETL to find particular user/member data no matter what application or device was used); 
and joining, using the identity mapping, the one or more unifier applications such that the specific user is identified, wherein each of the one or more unifier applications are joined using different unique IDs in the identity mapping (see Stein, paragraphs [0047] and [0036]; see Shah, paragraph [0045]; see Katzir, Figure 2 and paragraphs [0027], and [0064]-[0068]; the system can access a configuration file to determine how to render the feature vectors including joining/unifying rules used to join/unify the features together in a specified manner).

With regard to claim 14, this claim is substantially similar to claim 4 and is rejected for similar reasons as discussed above.



Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Shah et al [US 2015/0213372 A1] in view of Stein et al [US 2019/03247] and Urquhart et al [US 2004/0177062 A1] in further view of Katzir [US 2008/0285464 A1] and in further view of Wikipedia, Join (SQL), https://web.archive.org/web/20071105/180550/https://en.wikipedia.org/wiki/Join_(SQL).
With regard to claim 5, Shah in view of Urquhart and Stein in further view of Katzir teach all the claim limitations of claims 1 and 4 as discussed above.
Shah in view of Urquhart and Stein in further view of Katzir do not appear to explicitly teach wherein joining, using the identity mapping, the one or more unifier applications comprise at least one of: using the identity mapping as a key to perform a left join; using the identity mapping as the key to perform a right join; or using the identity mapping as the key to perform a full join.
Wikipedia teaches a left join; a right join; or a full join (see Wikipedia, pages 6-8; left, right, and full joins can be used).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify the join configurations of Shah in view of Urquhart and Stein in further view of Katzir by utilizing well-known and widely used join techniques as taught by Wikipedia in order to utilize established algorithms thus allowing modularity to occur and for the system to use already established algorithms instead of dedicating large quantities of time and money for their own system developers to write/create the code.
Shah in view of Urquhart and Stein in further view of Katzir and in further view of Wikipedia teach wherein joining, using the identity mapping, the one or more unifier 

With regard to claim 15, this claim is substantially similar to claim 5 and is rejected for similar reasons as discussed above.
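For reference, the difference between the left, right, and full joins discussed with regard to claim 5 can be sketched on two small keyed tables. The table contents and the function name `join_on_key` are hypothetical; the sketch shows general join semantics and is not the applicant's or any reference's implementation.

```python
def join_on_key(left, right, how="left"):
    """Join two {key: record} tables; a missing side is represented by None."""
    if how == "left":
        keys = list(left)
    elif how == "right":
        keys = list(right)
    elif how == "full":
        keys = list(left) + [k for k in right if k not in left]
    else:
        raise ValueError(f"unsupported join type: {how}")
    return {k: (left.get(k), right.get(k)) for k in keys}

# Hypothetical tables keyed by an identity value.
views = {"id1": {"views": 3}, "id2": {"views": 1}}
buys = {"id2": {"buys": 1}, "id3": {"buys": 2}}

left_join = join_on_key(views, buys, how="left")
right_join = join_on_key(views, buys, how="right")
full_join = join_on_key(views, buys, how="full")
```

A left join keeps every key of the first table, a right join keeps every key of the second, and a full join keeps the keys of both.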



Claims 21 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Shah et al [US 2015/0213372 A1] in view of Stein et al [US 2019/03247] and Urquhart et al [US 2004/0177062 A1] in further view of Wikipedia, Path (computing), https://web.archive.org/web/20070105235649/https://en.wikipedia.org/wiki/Path_(computing).
With regard to claim 21, Shah in view of Urquhart and Stein teach all the claim limitations of claim 1 as discussed above.
Shah in view of Urquhart and Stein do not appear to explicitly teach wherein identifying the relevant feature data of the raw feature data comprises: identifying a Unix path expression for a storage location of the relevant feature data of the raw feature data.
Wikipedia teaches identifying a Unix path expression for a storage location (see Wikipedia, pages 1-3; Unix path expressions can be utilized to specify a storage location for a file).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify the storage mechanisms of the intermediate file that stores the extracted raw feature data of Shah in view of Urquhart and Stein by utilizing well-known and widely used operating system features as taught by Wikipedia in order to utilize the operating system to manage the storage of information at the server computing device by utilizing storage locations/paths so that the system can store and easily retrieve particular and desired information without having to do a full storage disk scan of every memory location to find a specific file.
Shah in view of Urquhart and Stein in further view of Wikipedia teach wherein identifying the relevant feature data of the raw feature data comprises: identifying a Unix path expression for a storage location of the relevant feature data of the raw feature data (see Wikipedia, pages 1-3; Unix path expressions can be utilized to specify a storage location for a file; see Urquhart, paragraphs [0016] and [0012]; the server system can be a Unix-based computer that stores the intermediate storage with the data stored in files to be used for later processing).

With regard to claim 22, this claim is substantially similar to claim 21 and is rejected for similar reasons as discussed above.
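As a brief illustration of a Unix path expression identifying a storage location, the following sketch uses Python's standard `pathlib` module. The path itself is a made-up example and does not come from the application or the cited references.

```python
from pathlib import PurePosixPath

# Hypothetical Unix path to a stored feature file.
location = PurePosixPath("/data/features/2021-03-30/part-0000.csv")

directory = location.parent   # the directory that holds the file
file_name = location.name     # the final path component
```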



Claims 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Shah et al [US 2015/0213372 A1] in view of Stein et al [US 2019/03247] and Urquhart et al [US 2004/0177062 A1] in further view of Dumais et al [US 6,192,360].
With regard to claim 6, Shah in view of Urquhart and Stein teach all the claim limitations of claim 1 as discussed above.
Shah in view of Urquhart and Stein do not appear to explicitly teach wherein the relevant feature data is stored in the output file as a sparse representation of the relevant feature data, the sparse representation comprising a feature vector having only non-zero counts of the relevant feature data.
Dumais teaches a sparse representation of the relevant feature data, the sparse representation comprising a feature vector having only non-zero counts of the relevant feature data (see col 9, lines 44-59; a sparse representation of feature data can be utilized).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify the vector output means of Shah in view of Urquhart and Stein by utilizing a sparse vector representation as taught by Dumais in order to save storage space, processing time, and reduce network consumption/congestion by utilizing a sparse representation of a feature vector that stores the small number of non-zero values versus storing a relatively large array of mostly zero-valued features.
Shah in view of Urquhart and Stein in further view of Dumais teach wherein the relevant feature data is stored in the output file as a sparse representation of the relevant feature data (see Dumais, col 9, lines 44-59; see Stein, paragraph [0070]; see Shah, paragraphs [0033]-[0034] and [0039]; the system can pass the assembled/relevant feature data, as represented in a sparse vector representation, to the model building system so that it has up-to-date data where the data is stored in a standard format that the machine learning model is adapted to utilize).

With regard to claim 16, this claim is substantially similar to claim 6 and is rejected for similar reasons as discussed above.
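To illustrate the sparse representation discussed with regard to claim 6, the sketch below stores only the non-zero counts of a feature vector. The index-to-count dictionary layout and the function names are assumptions; Dumais and the claims do not prescribe this particular encoding.

```python
def to_sparse(dense):
    """Keep only the non-zero entries, keyed by their position."""
    return {index: value for index, value in enumerate(dense) if value != 0}

def to_dense(sparse, length):
    """Rebuild the full vector, filling absent positions with zero."""
    return [sparse.get(index, 0) for index in range(length)]

# Hypothetical feature counts, mostly zero.
dense_vector = [0, 3, 0, 0, 1]
sparse_vector = to_sparse(dense_vector)
```

Only two of the five entries are stored here, which is the storage saving the rejection attributes to a sparse vector.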

Response to Arguments
Applicant's arguments (see the third paragraph on page 8 through the second to last paragraph on page 23) have been fully considered but they are not persuasive.  The applicant argues (a) the idea identified by the Examiner does not correspond to the specific claim limitations as amended since the Office Action overgeneralized the claims and ignored important limitations including receiving a configuration file over a network, storing the file, and instantiating one or more unifier applications in the tuple; (b) the claims do not fall within one of the three groupings; (c) the Office Action does not specify how a human is supposed to receive a configuration file over a network, store it, or instantiate unifier applications and merely recites conclusory statements; (d) even if the claim recites an abstract idea, at least the receiving and storing a configuration file and instantiating unifier application limitations fall outside the abstract idea; (e) the claims 
The Examiner respectfully disagrees.  With regard to arguments (a) and (b), the Examiner notes that in the 35 USC 101 rejections above, the Examiner specifically pointed out the specific claim limitations that are directed towards an abstract idea and why (as well as the particular grouping) as well as those limitations that are additional elements.  Therefore, for at least those reasons discussed above, the claim recites a judicial exception.   
With regard to argument (c), the Examiner identified above which claim limitations were associated with the abstract idea and which elements were additional elements, and also provided a real-world example of the mental process 
With regard to arguments (f), (g) and (h), the Examiner notes that the claims send/store data and configuration data and then access and analyze data to transmit to some third party (i.e. machine learning model); however, the limitation “so that machine learning model building applications” recites intended use, meaning the claim is directed to receiving data, analyzing the data in some fashion, and then sending the data on, which does not appear to integrate the judicial exception into a practical application since the analyzed data does not appear to be explicitly required to be utilized by any machine learning model.  As for improvement to computer functionality, reading data and performing joins appears to relate to query processing techniques (i.e. SQL queries); although the config file may specify the particular parameters to generate the query, having a computer perform data analysis does not appear to improve the processor speed, reduce network congestion, or save storage space.  The joins are recited at a high level and do not appear to be any optimized version of a join either.  With regard to the similarity to the claimed example, the Examiner notices some major differences including remote updating of data in a non-standard format which 
With regard to argument (i) about a specific improvement in another technology or technical field, specifically machine learning and artificial intelligence, the Examiner notes that the claims do not require the data to actually be utilized and thus no improvement in those fields is recited.  As noted above, the claims may receive raw data and perform analysis steps to obtain relevant feature data for a model and even transmit the data to a model building system; however, that is where the claims stop.  There is no requirement for the model building system to use the data, nor any illustration of how it is used for an improvement.  The data could merely be archived and still meet the claim limitations as recited.  As such, the argument about a specific improvement is not convincing.
With regard to arguments (k), (l), and (m), the 35 USC 101 rejections acknowledge various claim elements that are additional elements and explain how they do not provide significantly more than the abstract idea; therefore, for at least the reasons discussed above, the applicant’s argument about the claim elements being significantly more is not convincing.  As for the well-understood, routine, and conventional activity arguments, the Examiner provided citations as evidence to support 

Applicant’s arguments (see the last paragraph on page 23 through the last paragraph on page 25) with respect to the rejection(s) of claim(s) under 35 USC 103 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made over Shah in view of Urquhart and Stein.  As seen from the 35 USC 103 rejections above, new references were found after an updated search was conducted, where the new references, when combined, appear to teach or fairly suggest the claim limitations as recited.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  Pendar [US 2010/0287160 A1] teaches at paragraph [0029] sparse vector representation; Shah et al [US 2015/0235258 A1] teaches at paragraph [0032] the aggregation of user on-line behavior/activity across multiple devices.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARC S SOMERS whose telephone number is (571)270-3567. The examiner can normally be reached M-F 11-8 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela Reyes, can be reached at (571)270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/MARC S SOMERS/
Primary Examiner, Art Unit 2159
2/15/2022