Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2.	This action is in response to communication filed on September 26, 2022.
Response to Amendment
3.	As a result of the amendment filed on 09/26/2022, claims 16-17 and 19 have been amended and claim 18 has been cancelled.
4.	Claims 1-17 and 19-20 remain pending in this office action.
Claim Rejections - 35 USC § 102
5.	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

6.	Claims 1, 4-12, 16 and 19-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Miller (US 2021/0374188 A1).
	As per claim 1, Miller discloses:
	- a non-transitory computer-readable medium comprising computer readable instructions, that when executed by one or more processors, causes the one or more processors to perform operations comprising (processor to execute the computer readable instruction, Para [0069], [0072]),
	- receiving an input data set related to digital content, wherein the input data set comprises a plurality of input entries (input data related to digital content is received, Para [0045], Fig. 1, item 108), 
	- matching each input entry of the plurality of input entries to one or more baseline entries of a baseline data set (matching input entries with local database entries (i.e. baseline entries), Para [0047], Fig. 1, item 104),
	- assigning a probability score to each respective baseline entry of the one or more baseline entries for each respective input entry based on metadata associated with the input data set, wherein the probability score for each respective baseline entry indicates a probability that the respective baseline entry is an accurate match to the input entry (probability score for each baseline entry is calculated based on input metadata, Para [0049], [0054], [0057], Fig. 1, item 115; probability score indicating accurate match (i.e. match) to the input entry, Para [0106]),
	- and generating an output data set comprising a plurality of output entries, wherein each respective input entry corresponds to a respective output entry of the plurality of output entries, and wherein each respective output entry comprises (generating output with matched entries, Para [0046], [0094]),
	- a baseline entry of the one or more baseline entries having a highest probability score (local database records (i.e. baseline records) have high probabilities, Para [0009], [0059]),
	- and additional data associated with the respective input entry (additional data or description associated with input entry, Para [0059], [0064], [0076], [0091]).
	As per claim 4, rejection of claim 1 is incorporated, and further Miller discloses:
- wherein matching each input entry of the plurality of input entries to the one or more baseline entries of the baseline data set comprises matching one or more input words of each input entry to one or more baseline words of the one or more baseline entries (matching word with input entries and baseline entries, Para [0022], [0077]).
As per claim 5, rejection of claim 1 is incorporated, and further Miller discloses:
- generating a plurality of candidate pools, wherein each candidate pool of the plurality of candidate pools comprise one or more candidate entries for each input entry of the plurality of input entries (candidate database records (i.e. candidate pool), Para [0008]),
- assigning a matching score to each candidate entry of each candidate pool of the plurality of candidate pools, wherein each matching score is indicative of a degree of matching between a respective candidate entry and the respective input entry (similarity score assigned to each candidate entry, Para [0019], [0076]).
As per claim 6, rejection of claim 5 is incorporated, and further Miller discloses:
- merging the plurality of candidate pools by summing or averaging respective matching scores for common candidate entries among the plurality of candidate pools to generate a merged candidate pool, wherein the merged candidate pool comprises a plurality of candidate entries for each input entry of the plurality of input entries, and wherein each candidate entry of the plurality of candidate entries is assigned a merged matching score comprising summed or averaged matching scores (same, similar or common entries are combined together, Para [0052], [0079]),
- 28312165-1 (NBCU:0154) selecting the one or more baseline entries for each input entry as a subset of candidate entries for each input entry having a merged matching score above a threshold score (merged record with threshold score, Para [0085], [0125]). 
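For illustration only, the merging and thresholding limitations recited in claim 6 may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Miller or from the application: each candidate pool is assumed to be a mapping from candidate entry to matching score, averaging is assumed to be taken over the pools in which a candidate appears, and the names merge_candidate_pools, select_baseline_entries and the example scores are hypothetical.

```python
from collections import defaultdict


def merge_candidate_pools(pools, mode="average"):
    """Merge candidate pools by summing or averaging scores of common candidates.

    Each pool is assumed to be a dict mapping a candidate entry to its
    matching score for one input entry (a hypothetical representation).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for pool in pools:
        for candidate, score in pool.items():
            totals[candidate] += score
            counts[candidate] += 1
    if mode == "sum":
        return dict(totals)
    if mode == "average":
        # Assumption: average over the pools in which the candidate appears.
        return {c: totals[c] / counts[c] for c in totals}
    raise ValueError("mode must be 'sum' or 'average'")


def select_baseline_entries(merged_pool, threshold):
    """Keep only candidates whose merged matching score is above the threshold."""
    return {c: s for c, s in merged_pool.items() if s > threshold}


# Hypothetical pools produced by two different matchers for one input entry.
pools = [{"A": 0.75, "B": 0.25}, {"A": 0.25, "C": 0.5}]
merged = merge_candidate_pools(pools, mode="average")
selected = select_baseline_entries(merged, threshold=0.4)
```

In this sketch, candidate "A" appears in both pools and receives a single merged score, while candidates below the threshold are dropped before any baseline entry is selected.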
As per claim 7, rejection of claim 1 is incorporated, and further Miller discloses:
- wherein the operations comprise matching each input entry to the one or more baseline entries via a machine learning model, and wherein the machine learning model indicates previous confirmed matches of each respective input entry to the one or more baseline entries (machine learning model indicating whether the training example constitutes a previously confirmed match, Para [0015], [0046]).
As per claim 8, rejection of claim 1 is incorporated, and further Miller discloses:
- wherein the input data set comprises movie titles, series titles, episode titles, program titles, event titles, names of people, advertisement information, song names, entity names, or a combination thereof (input set comprises title, episode, Para [0047], [0059]).
As per claim 9, rejection of claim 8 is incorporated, and further Miller discloses:
- wherein the metadata associated with the input data set comprises data source information, genre information, a year of release, a year of production, viewership data, impression data, statistical information associated with an actor, statistical information associated with an athlete, statistical information associated with an entity, or a combination thereof (metadata with genre, Para [0020], [0046]).
As per claim 10, rejection of claim 1 is incorporated, and further Miller discloses:
- wherein the input data set is received from an input data source, wherein the baseline data set is received from a baseline data source, and wherein the input data source is different from the baseline data source (input dataset and baseline data set are different, Para [0023], [0044]).
As per claim 11, Miller discloses:
- a system, comprising: one or more hardware processors (system with hardware processor, Para [0069]), 
- and a non-transitory memory storing instructions that, when executed by the one or more hardware processors, causes the one or more hardware processors to perform actions comprising (processor to execute the computer readable instruction, Para [0069], [0072]),
- receiving an input data set related to digital content, wherein the input data set comprises a plurality of input entries (input data related to digital content is received, Para [0045], Fig. 1, item 108), 
- matching one or more input words of each input entry of the plurality of input entries to one or more baseline words of one or more baseline entries of a baseline data set (matching input entries with local database entries (i.e. baseline entries), Para [0047], Fig. 1, item 104),
- and generating an output data set comprising a plurality of output entries, wherein each respective input entry corresponds to a respective output entry of the plurality of output entries, and wherein each respective output entry comprises (generating output with matched entries, Para [0046], [0094]),
- a baseline entry of the one or more baseline entries having a highest probability of matching the respective input entry (local database records (i.e. baseline records) have high probabilities, Para [0009], [0059]),
- and additional data associated with the respective input entry (additional data or description associated with input entry, Para [0059], [0064], [0076], [0091]).
As per claim 12, rejection of claim 11 is incorporated, and further Miller discloses:
- assigning a probability score to each respective baseline entry of the one or more baseline entries for each respective input entry based on metadata associated with the input data set, the baseline data set, or both, wherein the probability score for each respective baseline entry indicates a probability that the respective baseline entry is an accurate match to the input entry (probability score for each baseline entry is calculated based on input metadata, Para [0049], [0054], [0057], Fig. 1, item 115; probability score indicating accurate match (i.e. match) to the input entry, Para [0106]),
- and selecting, for each respective input entry, the baseline entry having a highest probability score (highest feature score being selected, Para [0094]).
	As per claim 16, Miller discloses:
	- a method of generating data related to digital content, comprising (creating new record (i.e. generating data related to digital content), Para [0001], [0006]),
	- receiving an input data set related to digital content, wherein the input data set comprises a plurality of input entries (input data related to digital content is received, Para [0045], Fig. 1, item 108), 
	- matching one or more input words of each input entry of the plurality of input entries to one or more baseline words of one or more baseline entries of a baseline data set (matching input entries with local database entries (i.e. baseline entries), Para [0047], Fig. 1, item 104),
	- and generating an output data set comprising a plurality of output entries, wherein each respective input entry corresponds to a respective output entry of the plurality of output entries, and wherein each respective output entry comprises (generating output with matched entries, Para [0046], [0094]),
	- a baseline entry of the one or more baseline entries (local database records (i.e. baseline records) have high probabilities, Para [0009], [0059]),
	- and additional data associated with the respective input entry (additional data or description associated with input entry, Para [0059], [0064], [0076], [0091]).
  	As per claim 19, rejection of claim 16 is incorporated, and further Miller discloses:
- generating a plurality of candidate pools, wherein each candidate pool of the plurality of candidate pools comprise one or more candidate entries for each input entry of the plurality of input entries (candidate database records (i.e. candidate pool), Para [0008]),
- and assigning a matching score to each candidate entry of each candidate pool of the plurality of candidate pools, wherein each matching score is indicative of a degree of matching between a respective candidate entry and the respective input entry (similarity score assigned to each candidate entry, Para [0019], [0076]).
As per claim 20, rejection of claim 19 is incorporated, and further Miller discloses:
- merging the plurality of candidate pools by summing or averaging respective matching scores for common candidate entries among the plurality of candidate pools to generate a merged candidate pool, wherein the merged candidate pool comprises a plurality of candidate entries for each input entry of the plurality of input entries, and wherein each candidate entry of the plurality of candidate entries is assigned a merged matching score comprising summed or averaged matching scores (same, similar or common entries are combined together, Para [0052], [0079]),
- selecting the one or more baseline entries for each input entry as a subset of candidate entries for each input entry having a merged matching score above a threshold score (merged record with threshold score, Para [0085], [0125]). 
Claim Rejections - 35 USC § 103
7.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

8.	Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Miller (US 2021/0374188 A1), as applied to claim 1 above and further in view of Springer Jr. et al (US 2005/0055372 A1).
As per claim 2, rejection of claim 1 is incorporated,
	Miller does not explicitly disclose tokenizing, lemmatizing, or both, one or more words of one or more input entries of the plurality of input entries to generate a processed input data set comprising a plurality of processed input entries; and matching each processed input entry of the plurality of processed input entries to the one or more baseline entries of the baseline data set. However, in the same field of endeavor, Springer in an analogous art discloses tokenizing, lemmatizing, or both, one or more words of one or more input entries of the plurality of input entries to generate a processed input data set comprising a plurality of processed input entries; and matching each processed input entry of the plurality of processed input entries to the one or more baseline entries of the baseline data set (tokenizer creating tokens for input string to be processed, Para [0050], Fig. 3-5, and matching processed tokens with baseline entries (i.e. standardized metadata), Para [0006], Para [0054]-[0056]).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Miller with the teaching of Springer by modifying Miller such that the common metadata identification of Miller matches digital content in different sources using the token analysis of Springer for efficient analysis of identical digital content. The motivation for doing so would be the use of cleansed tokens arranged as a series of ordered tokens with the highest frequency to improve search efficiency.
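For illustration only, the tokenizing and matching limitations recited in claim 2 may be sketched as follows. This is a minimal sketch under stated assumptions and is not the tokenizer of Springer or the method of the application: tokenization is assumed to be lowercasing plus splitting on non-alphanumeric characters, lemmatizing is omitted, processed entries are compared by token-set overlap (Jaccard similarity), and the names tokenize, token_overlap and the example titles are hypothetical.

```python
import re


def tokenize(entry):
    """Lowercase an entry and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", entry.lower())


def token_overlap(input_entry, baseline_entry):
    """Jaccard similarity between the token sets of two entries (0.0 to 1.0)."""
    a = set(tokenize(input_entry))
    b = set(tokenize(baseline_entry))
    if not a and not b:
        return 1.0  # Assumption: two empty entries are treated as identical.
    return len(a & b) / len(a | b)


# Hypothetical input entry and baseline entries.
processed = tokenize("The Office (US)")
same_title = token_overlap("The Office (US)", "Office, The - US")
other_title = token_overlap("The Office (US)", "Parks and Recreation")
```

Because punctuation and word order are discarded before comparison, differently formatted spellings of the same title can yield the same processed token set.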
9.	Claims 3, 15 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Miller (US 2021/0374188 A1), as applied to claims 1, 11 and 16 above, and further in view of Ong et al. (US 2016/0378862 A1).
As per claim 3, rejection of claim 1 is incorporated,
Miller does not explicitly disclose wherein matching each input entry of the plurality of input entries to the one or more baseline entries of the baseline data set comprises matching each input entry to the one or more baseline entries via Jaro-Winkler matching, Levenshtein matching, Metaphone matching, or a combination thereof. However, in the same field of endeavor, Ong in an analogous art discloses wherein matching each input entry of the plurality of input entries to the one or more baseline entries of the baseline data set comprises matching each input entry to the one or more baseline entries via Jaro-Winkler matching, Levenshtein matching, Metaphone matching, or a combination thereof (matching input entries using Levenshtein similarity, Para [0094]).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Miller with the teaching of Ong by modifying Miller such that the common metadata identification of Miller matches digital content in different sources using the similarity matching techniques of Ong, for better identification of similar assets from different providers and publishers.
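For illustration only, Levenshtein matching of the kind recited in claim 3 may be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Ong or of the application: the edit distance is computed with the standard dynamic-programming recurrence, similarity is assumed to be one minus the distance divided by the longer string length, and the names levenshtein, similarity, best_match and the example titles are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # Keep the shorter string in the inner loop.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Normalized similarity in [0, 1]; an assumed convention, not from the record."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def best_match(input_entry, baseline_entries):
    """Return the baseline entry most similar to the input entry."""
    return max(baseline_entries, key=lambda entry: similarity(input_entry, entry))


# Hypothetical misspelled input entry matched against hypothetical baseline entries.
match = best_match("Stranger Thngs", ["Stranger Things", "Strange Days", "Friends"])
```

A misspelled input entry is thereby matched to the baseline entry requiring the fewest edits relative to its length.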
As per claim 15, rejection of claim 12 is incorporated, and further Ong discloses:
	- wherein the metadata associated with the baseline data set comprises airing information associated with each baseline entry of the one or more baseline entries of the baseline data set, and wherein the airing information comprises a channel, a timeframe, a frequency, or a combination thereof (metadata includes airdate of a show or episode, Para [0006], [0070] - [0071]).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Miller with the teaching of Ong by modifying Miller such that the common metadata identification of Miller matches digital content in different sources using the similarity matching techniques of Ong, for better identification of similar assets from different providers and publishers.
As per claim 17, rejection of claim 16 is incorporated,
Miller does not explicitly disclose wherein matching one or more input words of each input entry of the plurality of input entries to the one or more baseline words of the one or more baseline entries of the baseline data set comprises matching each input entry to the one or more baseline entries via Jaro-Winkler matching, Levenshtein matching, Metaphone matching, or a combination thereof. However, in the same field of endeavor, Ong in an analogous art discloses wherein matching one or more input words of each input entry of the plurality of input entries to the one or more baseline words of the one or more baseline entries of the baseline data set comprises matching each input entry to the one or more baseline entries via Jaro-Winkler matching, Levenshtein matching, Metaphone matching, or a combination thereof (matching input entries using Levenshtein similarity, Para [0022], [0077], Para [0094]).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Miller with the teaching of Ong by modifying Miller such that the common metadata identification of Miller matches digital content in different sources using the similarity matching techniques of Ong, for better identification of similar assets from different providers and publishers.
10.	Claims 13 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Miller (US 2021/0374188 A1), as applied to claim 11 above, and further in view of Kocks et al. (US 2013/0343598 A1).
As per claim 13, rejection of claim 12 is incorporated, 
Miller does not explicitly disclose wherein the metadata associated with the baseline data set comprises viewership data associated with each baseline entry of the one or more baseline entries of the baseline data set. However, in the same field of endeavor, Kocks in an analogous art discloses wherein the metadata associated with the baseline data set comprises viewership data associated with each baseline entry of the one or more baseline entries of the baseline data set (metadata with viewership data, Para [0030], [0147]).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Miller with the teaching of Kocks by modifying Miller such that the common metadata identification of Miller is matched with the identification of the number of viewers who viewed a program or digital content of Kocks, for efficiently discovering viewership statistics.
 As per claim 14, rejection of claim 13 is incorporated,
Miller does not explicitly disclose wherein a first baseline entry of the one or more baseline entries is assigned a higher probability score relative to a second baseline entry of the one or more baseline entries based on the first baseline entry having greater viewership than the second baseline entry. However, in the same field of endeavor, Kocks in an analogous art discloses wherein a first baseline entry of the one or more baseline entries is assigned a higher probability score relative to a second baseline entry of the one or more baseline entries based on the first baseline entry having greater viewership than the second baseline entry (ranking entity in the recommendation server (i.e. baseline entries) based on viewership data, Para [0151], [0156], [0158], [0180]).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Miller with the teaching of Kocks by modifying Miller such that the common metadata identification of Miller is matched with the identification of the number of viewers who viewed a program or digital content of Kocks, for efficiently discovering viewership statistics.
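For illustration only, the viewership-based scoring recited in claims 13 and 14 may be sketched as follows. This is a minimal sketch under stated assumptions and is not the ranking of Kocks or the method of the application: the probability score of each baseline entry is assumed to be its share of the total viewership among the competing baseline entries, and the names viewership_scores, highest_scoring_entry and the example viewership counts are hypothetical.

```python
def viewership_scores(viewership):
    """Assign each baseline entry a score proportional to its viewership.

    `viewership` is assumed to map a baseline entry to a non-negative viewer
    count. The scores sum to 1.0, so an entry with greater viewership
    always receives a higher score than an entry with lower viewership.
    """
    total = sum(viewership.values())
    if total == 0:
        # Assumption: with no viewership data, all entries are equally likely.
        return {entry: 1.0 / len(viewership) for entry in viewership}
    return {entry: count / total for entry, count in viewership.items()}


def highest_scoring_entry(scores):
    """Return the baseline entry with the highest probability score."""
    return max(scores, key=scores.get)


# Hypothetical baseline entries sharing a title, with hypothetical viewer counts.
counts = {"Title A (2019)": 3_000_000, "Title A (1984)": 1_000_000}
scores = viewership_scores(counts)
selected = highest_scoring_entry(scores)
```

Under this assumption, two baseline entries that match an input entry equally well by title are distinguished by their relative viewership.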
			Response to Arguments
11.	Applicant's arguments filed on 09/26/2022 with respect to claims 1-20 have been fully considered but they are not deemed to be persuasive. 
In response to applicant’s argument on page 10, lines 1-4, applicant argued that Miller does not appear to disclose generating an output dataset where each output entry includes a baseline entry matched with an input entry and additional data associated with the input entry. Examiner respectfully disagrees and responds that, besides the cited portions, Miller teaches a database management application. Miller's database management application receives database records from multiple media asset providers such as Netflix, Hulu, Amazon, etc. and creates a new local record based on a pair of database records and their metadata that a match machine learning model determines to constitute a match, see Para [0046], [0058]-[0061]. Miller also teaches, in Para [0075], additional data associated with an input entry in the form of a description used to determine match similarity between two separate database records, see Fig. 4, items 402, 404, Para [0075] - [0076]. Under the Examiner's broadest reasonable interpretation, Miller's database management application and match machine learning model generate an aggregated list of records received from the Amazon database, the Hulu database and the Netflix database, which obviously includes existing data and metadata together with received data and metadata when creating a new record. See Fig. 2A, 2B, 4 and 9.
Therefore, Miller teaches the argued limitation and claim 1 as claimed.
Conclusion
12.	THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
				Contact Information
13.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMED R UDDIN whose telephone number is (571)270-3138. The examiner can normally be reached M-F: 9:00 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Beausoliel Robert can be reached on 571-272-3645. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MOHAMMED R UDDIN/Primary Examiner, Art Unit 2167