DETAILED ACTION
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
2.	Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement
3.	The information disclosure statements (IDS) submitted on 09/21/2020 and 06/18/2021 have been received, entered into the record, and considered.  The submissions are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.
Claim Objections
4.	Claims 5 and 20 are objected to because of the following informalities:  The phrase “frames of image” in claim 20 is grammatically incoherent.  Appropriate correction is required.
Claim Rejections - 35 USC § 103
5.	In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
6.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
7.	This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to 
8.	Claims 1, 3, 10, 11-12, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Newell et al. (U.S. PGPUB 2017/0257595) in view of Shah et al. (Article entitled “ADVISOR – Personalized Video Soundtrack Recommendation by Late Fusion with Heuristic Rankings”, dated 07 November 2014).
9.	Regarding claims 1, 11, and 16, Newell teaches a method, non-transitory computer readable storage medium, and computing device comprising:
A)  obtaining a material for which background music is to be added (Paragraphs 54 and 65-66); 
B)  determining at least one visual semantic tag of the material (Paragraph 14); 
C)  the at least one visual semantic tag describing at least one characteristic of the material (Paragraphs 59, and 61-62); 
D)  identifying a matched music matching the at least one visual semantic tag from a candidate music library (Paragraphs 66 and 69); 
E)  sorting the matched music according to user assessing information (Paragraphs 68-70); 
F)  screening the matched music based on a sorting result and according to a preset music screening condition (Paragraph 70); and 
G)  matched music obtained through the screening (Paragraph 70).
	The examiner notes that Newell teaches “obtaining a material for which background music is to be added” as “The media device 12 computer 36 is programmed to, when activated, to capture video data (visual and audio) from an environment of the user, on an on-going, continuous basis. For example, the user may, attach the media device 12 to the user's clothing, e.g., on the user's chest, such that the video camera 32 in the media device 12 is generally facing forward. The field of vision of the optical element in the video camera 32 may extend, in an initial position, from the media device 12 in a direction perpendicular to the chest of the user” (Paragraph 54) and “The server 18 may generate a secondary video recording from one or more Newell teaches “determining at least one visual semantic tag of the material” as “The primary video recording may be stored together with primary metadata. As described in additional detail below, the primary metadata may include, e.g., the identity of the user, a time stamp indicating a time of the recording, a location of the recording, start and stop indicia, a mental state of the user, etc” (Paragraph 14).  The examiner further notes that the primary metadata (including mental state) teaches the claimed visual semantic tag.  The examiner further notes that Newell teaches “the at least one visual semantic tag describing at least one characteristic of the material” as “the user may witness a comical event and wish to generate a primary video recording that starts just before the comical event. The user may instruct, via a user interface or verbal command, the media device 12 to display a specified portion of the video, e.g., the last five minutes of video data on the display device 14. 
The user may view the last five minutes of data, and select, for example via a user interface, the starting time of the primary video recording” (Paragraph 59) and “The computer 36 may be programmed to receive user mental state data from the user, and include the user mental state data in metadata associated with a primary video recording. The computer 36 may be programmed, e.g., to recognize ten different keywords representing common mental states such as "happy," "sad," "laughing," "thrilled," etc. The available keywords may be made available to the user via, for example, a user manual, or a media device tutorial.  When the user initiates the generation of a primary video recording, the user may, in addition to indicating a start time, indicate a mental state. For example, the user Newell teaches “identifying a matched music matching the at least one visual semantic tag from a candidate music library” as “the server 18 may select music to include in a soundtrack for the secondary video recording, based on the mental state keyword in the primary metadata. For example, the server 18 may detect the keyword "happy" in the primary metadata. Based on the keyword, the server 18 may search through a popular music library for a song that is identified as a happy song, or a song appropriate as a soundtrack for a happy video” (Paragraph 66) and “The server 18 may select, as a candidate to include in the secondary video recording, a song that is the most popular song having a song keyword that matches the user mental state keyword” (Paragraph 69).  The examiner further notes that the searching for songs in a music library matching the mental state keyword in the primary metadata teaches the claimed identifying.  The examiner further notes that Newell teaches “sorting the matched music according to user assessing information” as “In addition to providing song keywords, the popular music library may provide, for example, a ranking of songs, based on current popularity. 
Current popularity may be determined, based on, for example the number of downloads (or purchases) of the song in the previous week, the number of times someone has listened to the song on one or more websites in the previous week, etc.  The server 18 may select, as a candidate to include in the secondary video recording, a song that is the most popular song having a song keyword that matches the user mental state keyword.  After selecting the song, the server 18 may further obtain information from various data sources, e.g., the server 18 could include programming to search a data store comprising a personal music library of the user to determine if the user owns rights to Newell teaches “screening the matched music based on a sorting result and according to a preset music screening condition” as “After selecting the song, the server 18 may further obtain information from various data sources, e.g., the server 18 could include programming to search a data store comprising a personal music library of the user to determine if the user owns rights to use the selected song in the secondary video recording. In the case that the user does not have rights to use the song, the server 18 may select, for example, a second most popular song having a song keyword matching the mental state keyword, and determine whether the song is in the user library data store” (Paragraph 70).  The examiner further notes that the screening of matched songs based on ownership rights (i.e. the claimed preset music screening condition) of a user teaches the claimed screening.  The examiner further notes that Newell teaches “matched music obtained through the screening” as “After selecting the song, the server 18 may further obtain information from various data sources, e.g., the server 18 could include programming to search a data store comprising a personal music library of the user to determine if the user owns rights to use the selected song in the secondary video recording. 
In the case that the user does not have rights to use the song, the server 18 may select, for example, a second most popular song having a song keyword matching the mental state keyword, and determine whether the song is in the user library data store” (Paragraph 70).  The examiner further notes that the screening of matched songs based on ownership rights (i.e. the claimed preset music screening condition) of a user teaches the claimed screening.
	Newell does not explicitly teach:
E)  user assessing information of a user corresponding to the material;
G)  recommending matched music as candidate music of the material.
	Shah, however, teaches “user assessing information of a user corresponding to the material” as “To enhance the appeal of a UGV for viewing and sharing, we have designed ADVISOR, which replaces the ambient background noise of a UGV with a soundtrack that matches both the video scenes and a user’s preferences” (Page 607), “The soundtrack recommendation component of the backend system re-ranks a list of songs retrieved by the heuristic method based on user preferences and recommends them for the UGV (see Figure 5)” (Page 608), and “first, a learning model based on the late fusion of geo and visual features recognizes scene moods in the UGV. Second, a novel heuristic method recommends a list of songs based on the predicted scene moods. Third, the soundtrack recommendation component re-ranks songs recommended by the heuristics method based on the user’s listening history. Finally, our Android application generates a music video from the UGV by automatically selecting the most appropriate song using a learning model based on the late fusion of visual and concatenated audio features” (Page 615), and “recommending matched music as candidate music of the material” as “The soundtrack recommendation component of the backend system re-ranks a list of songs retrieved by the heuristic method based on user preferences and recommends them for the UGV (see Figure 5)” (Page 608) and “first, a learning model based on the late fusion of geo and visual features recognizes scene moods in the UGV. Second, a novel heuristic method recommends a list of songs based on the predicted scene moods. Third, the soundtrack recommendation component re-ranks songs recommended by the heuristics method based on the user’s listening history. Finally, our Android application generates a music video from the UGV by automatically selecting the most appropriate song using a learning model based on the late fusion of visual and concatenated audio features” (Page 615).
	The examiner further notes that the secondary reference of Shah teaches the concept of using user preferences (i.e. user assessing information of a user corresponding to a material) as a basis for ranking candidate background music.  Moreover, Shah explicitly teaches the recommendation of candidate music.  The combination would result in the use of user preferences as a basis to rank candidate background music for subsequent recommendation after using Newell’s screening.
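	It would have been obvious to one of ordinary skill in the art before the effective filing date of instant invention to combine the teachings of the cited references because teaching Shah’s would have allowed Newell’s to provide a method for using user preferences for generating soundtracks, as noted by Shah (Page 615).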
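	For illustration only, and not as part of the grounds of rejection, the sequence of limitations D) through G) as mapped above (matching by tag, sorting, screening against a preset condition, and recommending) can be sketched as follows. The sketch assumes Newell’s popularity ranking as the sort key and Newell’s ownership check as the screening condition; the names Song, recommend, owned_titles, and top_n are hypothetical and are not drawn from either reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Song:
    title: str
    keywords: frozenset   # mood keywords attached to the song (hypothetical field)
    popularity: int       # e.g. plays in the previous week (hypothetical field)


def recommend(tag, library, owned_titles, top_n=3):
    """Match by tag, sort by popularity, screen by ownership, then recommend."""
    # D) identify music whose keywords match the tag of the material
    matched = [song for song in library if tag in song.keywords]
    # E) sort the matched music, most popular first
    ranked = sorted(matched, key=lambda song: song.popularity, reverse=True)
    # F) screen the sorted result against a preset condition (here: ownership)
    screened = [song for song in ranked if song.title in owned_titles]
    # G) return the surviving titles as candidate music
    return [song.title for song in screened[:top_n]]


library = [
    Song("A", frozenset({"happy"}), 10),
    Song("B", frozenset({"happy"}), 50),
    Song("C", frozenset({"sad"}), 99),
    Song("D", frozenset({"happy"}), 30),
]
candidates = recommend("happy", library, owned_titles={"A", "D"})
```

	In this sketch the most popular match ("B") is dropped at the screening step because it is not owned, which mirrors Newell’s fallback to the next most popular song in Paragraph 70.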

	Regarding claims 3, 12, and 18, Newell does not explicitly teach a method, non-transitory computer readable storage medium, and computing device comprising: 
A)  determining at least one visual semantic tag, designated by the user from available visual semantic tags, as the at least one visual semantic tag of the material; or parsing content of the material, to determine the at least one visual semantic tag of the material.
	Shah, however, teaches “determining at least one visual semantic tag, designated by the user from available visual semantic tags, as the at least one visual semantic tag of the material; or parsing content of the material, to determine the at least one visual semantic tag of the material” as “To generate the music soundtrack for the UGV, the Android application first uploads its recorded sensor data and selected key-frames to the backend system. Next, the backend system computes geo and visual features for the UGV and forwards these features to MGV M and MGV C to predict scene mood tags and mood clusters, respectively, for the UGV” (Page 608) and “A geo feature computed from geo-categories reflects the environmental atmosphere associated with moods and a color histogram computed from key-frames represents moods in the video content. Next, the sequence of geo-features and the sequence of visual features are synchronized based on their respective timestamps to train emotion prediction models using SVM hmm method. Figure 3 shows the process of mood recognition from UGVs based on heterogeneous late fusion of SVM hmm models constructed from geo and visual features” (Page 610).
	The examiner further notes that the secondary reference of Shah teaches the concept of using video features as a basis for recognizing emotions/moods of videos (i.e. the claimed parsing to determine at least one visual semantic tag of the material).  The combination would result in the parsing for determining the mental state/mood/emotions of a user of the videos of Newell.
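	It would have been obvious to one of ordinary skill in the art before the effective filing date of instant invention to combine the teachings of the cited references because teaching Shah’s would have allowed Newell’s to provide a method for using user preferences for generating soundtracks, as noted by Shah (Page 615).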

	Regarding claims 10 and 15, Newell does not explicitly teach a method and computing device comprising:
A)  sorting the matched music according to parameter values of one type of music assessing behavior data of the user corresponding to the material for music, or a comprehensive value obtained after weighted processing is performed on parameter values of at least two types of music assessing behavior data of the user; 
B)  wherein music assessing behavior data of one user for one piece of music comprising any one of the following: a music score, a click-through rate, a favorites behavior, a like behavior, and a sharing behavior.
	Shah, however, teaches “sorting the matched music according to parameter values of one type of music assessing behavior data of the user corresponding to the material for music, or a comprehensive value obtained after weighted processing is performed on parameter values of at least two types of music assessing behavior data of the user” as “To enhance the appeal of a UGV for viewing and sharing, we have designed ADVISOR, which replaces the ambient background noise of a UGV with a soundtrack that matches both the video scenes and a user’s preferences” (Page 607), “The soundtrack recommendation component of the backend system re-ranks a list of songs retrieved by the heuristic method based on user preferences and recommends them for the UGV (see Figure 5)” (Page 608), and “first, a learning model based on the late fusion of geo and visual features recognizes scene moods in the UGV. Second, a novel heuristic method recommends a list of songs based on the predicted scene moods. Third, the soundtrack recommendation component re-ranks songs recommended by the heuristics method based on the user’s listening history. Finally, our Android application generates a music video from the UGV by automatically selecting the most appropriate song using a learning model based on the late fusion of visual and concatenated audio features” (Page 615), and “wherein music assessing behavior data of one user for one piece of music comprising any one of the following: a music score, a click-through rate, a favorites behavior, a like behavior, and a sharing behavior” as “The soundtrack recommendation component of the backend system re-ranks a list of songs retrieved by the heuristic method based on user preferences and recommends them for the UGV (see Figure 5)” (Page 608) and “first, a learning model based on the late fusion of geo and visual features recognizes scene moods in the UGV. Second, a novel heuristic method recommends a list of songs based on the predicted scene moods. Third, the soundtrack recommendation component re-ranks songs recommended by the heuristics method based on the user’s listening history. Finally, our Android application generates a music video from the UGV by automatically selecting the most appropriate song using a learning model based on the late fusion of visual and concatenated audio features” (Page 615).
	The examiner further notes that the secondary reference of Shah teaches the concept of using user preferences (i.e. including listening history) as a basis for ranking (i.e. sorting) matched music.  Moreover, listening histories of users teach the undefined claimed click-through rate and/or favorites behavior under the broadest reasonable interpretation.  The combination would result in the use of listening history as a basis to rank candidate background music for subsequent recommendation after using Newell’s screening.
	It would have been obvious to one of ordinary skill in the art before the effective filing date of instant invention to combine the teachings of the cited references because teaching Shah’s would have allowed Newell’s to provide a method for using user preferences for generating soundtracks, as noted by Shah (Page 615).
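	For illustration only, the “comprehensive value obtained after weighted processing” recited in limitation A) can be read as a weighted sum over behavior parameter values, sketched below. The weights, the behavior names, and the functions comprehensive_value and sort_by_assessment are hypothetical; neither Newell nor Shah is asserted to disclose these particular values or this exact computation.

```python
def comprehensive_value(behavior, weights):
    """Weighted sum of a user's behavior parameter values for one piece of music.

    Missing behavior types are treated as 0.0 (an assumption of this sketch).
    """
    return sum(weight * behavior.get(name, 0.0) for name, weight in weights.items())


def sort_by_assessment(matched, behaviors, weights):
    """Sort matched titles by descending comprehensive value."""
    return sorted(
        matched,
        key=lambda title: comprehensive_value(behaviors.get(title, {}), weights),
        reverse=True,
    )


# Hypothetical weights over three behavior types
weights = {"score": 0.5, "like": 0.3, "share": 0.2}
# Hypothetical per-title behavior data of one user
behaviors = {
    "X": {"score": 4.0, "like": 1.0},
    "Y": {"score": 5.0, "share": 1.0},
    "Z": {},
}
ordered = sort_by_assessment(["X", "Y", "Z"], behaviors, weights)
```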
10.	Claims 2 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Newell et al. (U.S. PGPUB 2017/0257595) in view of Shah et al. (Article entitled “ADVISOR – Personalized Video Soundtrack Recommendation by Late Fusion with Heuristic Rankings”, dated 07 November 2014) as applied to claims 1, 3, 10, 11-12, 16, and 18 above, and further in view of Lovejoy et al. (U.S. PGPUB 2007/0234214).
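11.	Regarding claims 2 and 17, Newell and Shah do not explicitly teach a method and computing device comprising:
A)  receiving indication information that is transmitted by a terminal device and that designates background music from the candidate music; 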
B)  synthesizing the background music to the material according to the indication information; and 
C)  transmitting the material synthesized with music to the terminal device.
	Lovejoy, however, teaches “receiving indication information that is transmitted by a terminal device and that designates background music from the candidate music” as “A web based video and image editing system facilitates the creation of video compositions and photo books. Content, such as images, audio, video clips and text are stored on a networked server or database, and the system provides a web based interface that allows the user to select images, video clips and text to include in a video composition or photo book, which may be set to selected music or other audio” (Paragraph 24), “A web based video editing system is indicated generally at 100 in FIG. 1. A user 105 may access the system 100 via a computer 110 or other device that is capable of running a browser. The user may send a request to a web interface 115, which provides interfaces to the user for many different video editing functions as illustrated by multiple modules” (Paragraph 26), “FIG. 5 illustrates a music selection screen 500. The buttons 430 appear at the top of screen 500, and are numbered as follows: video composition title 510, select cover 515, select music 520, select content 525, edit video composition 530 and preview video composition 535. The select music button 520 is currently selected, and provides functions such as uploading songs 540 from another source, and suggesting songs 545, which may be tied into software that identifies songs as a function of information collected from the video composition title information entry screen corresponding to button 510. At 550, a list of categories of music available is provided. At 555, a song list is provided with check boxes to select the music desired for the creation. 
The title, author, and length are provided as well as a "listen" labeled link to hear the music” (Paragraph 39), and “Editing functions provided include a transition to the next item at 915, and the ability to select the relative volume level of video and background music at 920, 925, and 930” (Paragraph 46), “synthesizing the background music to the material according to the indication information” as “A web based video and image editing system facilitates the creation “transmitting the material synthesized with music to the terminal device” as “The video composition may also be emailed at 1035, posted online at 1040, downloaded at 1045 and the user may also select to buy a DVD at 1050, which starts the process of rendering a high resolution version of the video composition and recording it onto suitable media. If the user clicks on the email function 1025, the user is provided a screen that allows the user to specify email addresses, and also include a message. A default message is provided that indicates the user should click on a provided link to view the video. At 1055, a thumbnail image used to represent the video composition is provided. This may be changed by clicking on it to replace it with another desired image” (Paragraph 50).
Lovejoy teaches the concept of a user (via their computer) performing online editing of multimedia content.  Such editing includes indicating background music amongst several candidates (See Figure 5).  The resultant synthesized multimedia content (with the background music selected by the user) can be downloaded (i.e. transmitted to the user’s terminal device) as shown in Figure 10.  The combination would result in a user being able to manually select a background music candidate amongst the recommended background music candidates of Newell and Shah and subsequently download the synthesized content (with the manually selected background music candidate) on their devices.  
	It would have been obvious to one of ordinary skill in the art before the effective filing date of instant invention to combine the teachings of the cited references because teaching Lovejoy’s would have allowed Newell’s and Shah’s to provide a method for more easily allowing users to edit multimedia content, as noted by Lovejoy (Paragraph 2).
12.	Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Newell et al. (U.S. PGPUB 2017/0257595) in view of Shah et al. (Article entitled “ADVISOR – Personalized Video Soundtrack Recommendation by Late Fusion with Heuristic Rankings”, dated 07 November 2014) as applied to claims 1, 3, 10, 11-12, 16, and 18 above, and further in view of Sarakit (Article entitled “A Music Video Recommender System Based on Emotion Classification on User Comments”, dated 2015).
13.	Regarding claim 6, Newell further teaches a method comprising:
A)  obtaining the matched music matching the at least one visual semantic tag based on the at least one visual semantic tag (Paragraphs 66 and 69).
	The examiner notes that Newell teaches “obtaining the matched music matching the at least one visual semantic tag based on the at least one visual semantic tag” as “the server 18 may select music to include in a soundtrack for the secondary video recording, based on the mental state keyword in the primary metadata. For example, the server 18 may detect the keyword "happy" in the primary metadata. Based on the keyword, the server 18 may search through a popular music library for a song that is identified as a happy song, or a song appropriate as a soundtrack for a happy video” (Paragraph 66) and “The server 18 may select, as a candidate to include in the secondary video recording, a song that is the most popular song having a song keyword that matches the user mental state keyword” (Paragraph 69).
	Newell does not explicitly teach:
A)  and by using a pre-trained music search model. 
	Shah, however, teaches “and by using a pre-trained music search model” as “the ADVISOR system consists of two parts: an offline training and an online processing component. Offline a training dataset with geo-tagged videos is used to train SVMhmm models that map videos to mood tags” (Page 607) and “The sensor data streams are mapped to a geo feature G, and a visual feature F is calculated from the video content. With the trained models, G and F are mapped to mood tags. Then, songs matching these mood tags are recommended” (Page 608).
	The examiner further notes that the secondary reference of Shah teaches the concept of using trained models to match (i.e. search) for music.  The combination would result in the use of such trained models to perform the searching in Newell.
	It would have been obvious to one of ordinary skill in the art before the effective filing date of instant invention to combine the teachings of the cited references because teaching Shah’s would have allowed Newell’s to provide a method for using user preferences for generating soundtracks, as noted by Shah (Page 615).
	Newell and Shah do not explicitly teach:
B)  wherein the music search model is obtained after text classification training is performed on music comment information of users for various music.
	Sarakit, however, teaches “wherein the music search model is obtained after text classification training is performed on music comment information of users for various music” as “In the first step, the emotion filtering tags user comments with three label types of emotional comments, non-emotional comments, and unrelated junk comments. As the second step, the emotion classification aims to classify the emotional comments into six emotion types, including anger, disgust, fear, happiness, sadness, and surprise. With the YouTube API, the total of 85 video clips with 12,000 comments 
	The examiner further notes that the secondary reference of Sarakit teaches the concept of classifying user comments on music data for subsequent matching (i.e. searching).  The combination would result in the use of such comments to expand the music searching in Newell and Shah. 
	It would have been obvious to one of ordinary skill in the art before the effective filing date of instant invention to combine the teachings of the cited references because teaching Sarakit’s would have allowed Newell’s and Shah’s to provide a method for adding value to online music sources via the detection of mood, as noted by Sarakit (Abstract).
14.	Claims 7-8 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Newell et al. (U.S. PGPUB 2017/0257595) in view of Shah et al. (Article entitled “ADVISOR – Personalized Video Soundtrack Recommendation by Late Fusion with Heuristic Rankings”, dated 07 November 2014) as applied to claims 1, 3, 10, 11-12, 16, and 18 above, and further in view of Shakirova (Article entitled “Collaborative Filtering for Music Recommender System”, Copyright 2017).
15.	Regarding claims 7 and 13, Newell further teaches a method and computer device comprising:
A)  sorting the matched music according to estimated music assessing information of the user corresponding to the material for the matched music (Paragraphs 68-70); 
B)  the estimated music assessing information of the user for the matched music being obtained based on actual music assessing information of users for candidate music (Paragraphs 68-70); 
D)  the music assessing behavior data comprises any one of or any combination of: a music score, a click-through rate, a favorites behavior, a like behavior, and a sharing behavior (Paragraphs 68-70).
	The examiner notes that Newell teaches “sorting the matched music according to estimated music assessing information of the user corresponding to the material for the matched music” as “In addition to providing song keywords, the popular music library may provide, for example, a ranking of songs, based on current popularity. Current popularity may be determined, based on, for example the number of downloads (or purchases) of the song in the previous week, the number of times someone has listened to the song on one or more websites in the previous week, etc.  The server 18 may select, as a candidate to include in the secondary video recording, a song that is the most popular song having a song keyword that matches the user mental state keyword.  After selecting the song, the server 18 may further obtain information from various data sources, e.g., the server 18 could include programming to search a data store comprising a personal music library of the user to determine if the user owns rights to use the selected song in the secondary video recording. In the case that the user does not have rights to use the song, the server 18 may select, for example, a second most popular song having a song keyword matching the mental state keyword, and determine whether the song is in the user library data store” (Paragraphs 68-70).  The examiner further notes that the ranking of returned matching songs based on popularity of other users (i.e. estimated accessing information) teaches the claimed sorting.  The examiner further notes that Newell teaches “the estimated music assessing information of the user for the matched music being obtained based on actual music assessing information of users for candidate music” as “In addition to providing song keywords, the popular music library may provide, for example, a ranking of songs, based on current popularity. 
Current popularity may be determined, based on, for example the number of downloads (or purchases) of the song in the previous week, the number of times someone has listened to the song on one or more websites in the previous week, etc.  The server 18 may select, as a candidate to include in the secondary video recording, a song that is the most popular song having a song keyword that matches the user mental state keyword.  After selecting the song, the server 18 may further obtain information from various data sources, e.g., the server 18 could include programming to search a data store comprising a personal music library of the user to determine if the user owns rights to use the selected song in the secondary video recording. In the case that the user does not have rights to use the song, the server 18 may select, for example, a second most popular song having a song keyword matching the mental state keyword, and determine whether the song is in the user library data store” (Paragraphs 68-70).  The examiner further notes that the obtained popularity data of other users teaches the claimed estimated accessing information.  The examiner further notes that Newell teaches “the music assessing behavior data comprises any one of or any combination of: a music score, a click-through rate, a favorites behavior, a like behavior, and a sharing behavior” as “In addition to providing song keywords, the popular music library may provide, for example, a ranking of songs, based on current popularity. Current popularity may be determined, based on, for example the number of downloads (or purchases) of the song in the previous week, the number of times someone has listened to the song on one or more websites in the previous week, etc.  The server 18 may select, as a candidate to include in the secondary video recording, a song that is the most popular song having a song keyword that matches the user mental state keyword.  
After selecting the song, the server 18 may further obtain information from various data sources, e.g., the server 18 could include programming to search a data store comprising a personal music library of the user to determine if the user owns rights to use the selected song in the secondary video recording. In the case that the user does not have rights to use the 
	Newell and Shah do not explicitly teach:
C)  wherein actual music assessing information of one user for one piece of music is obtained after weighted processing is performed on parameters of music assessing behavior data of the user.
	Shakirova, however, teaches “wherein actual music assessing information of one user for one piece of music is obtained after weighted processing is performed on parameters of music assessing behavior data of the user” as “We use simple weighted sum strategy for aggregating the information provided by similar users/items. In the user-based type of recommendation the scoring function is computed by… The score of the item is proportional to the similarities between the target user and other users who have the item in their history of listening. This score is higher for items which are often rated by similar users” (Page 549).
	The examiner further notes that although the primary reference of Newell clearly obtains accessing information of other users for music, there is no explicit teaching of the use of “weighting” mathematical operations.  The secondary reference of Shakirova teaches the concept of using weighting on “parameters” of accessing information of other users.  The combination would result in the use of weighting the accessing information of the other users of Newell. 
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Shakirova’s teachings would have allowed Newell’s and Shah’s systems to provide a method for improving the effectiveness of recommender systems, as noted by Shakirova (Page 550).
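	For illustration only, the weighted sum strategy quoted from Shakirova above can be sketched as follows. The scoring formula itself is elided in the quotation, so this sketch assumes the simplest reading of the quoted sentence: an item's score is the sum of the similarity weights of the other users who have the item in their listening history. The function name, user identifiers, and all values are hypothetical and do not come from Shakirova, Newell, or the claims.

```python
def weighted_sum_score(similarities, listeners):
    """Sum the similarity weights of the users who have the item in their history.

    similarities: dict mapping another user's id to that user's similarity
                  to the target user (assumed to be precomputed).
    listeners:    set of user ids who have the item in their listening history.
    """
    return sum(similarities.get(user, 0.0) for user in listeners)


# Hypothetical similarity of the target user to three other users.
similarities = {"v1": 0.75, "v2": 0.5, "v3": 0.25}

# Hypothetical listening histories: which users listened to each song.
listeners_by_song = {"song_a": {"v1", "v2"}, "song_b": {"v3"}}

# A song heard by more similar users receives a higher score.
scores = {
    song: weighted_sum_score(similarities, users)
    for song, users in listeners_by_song.items()
}
```

	Under these assumed values, song_a outranks song_b because the users who listened to it are more similar to the target user, which is consistent with the quoted statement that the score is higher for items often rated by similar users.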

Newell and Shah do not explicitly teach a method comprising:
A)  obtaining, for the matched music, user attribute information of users assessing the matched music; and 
B)  obtaining, through screening, similar users whose user attribute information is similar to user attribute information of the user.
C)  obtaining actual music assessing information of the similar users for the matched music; and 
D)  performing mean processing on the actual music assessing information of the similar users for the matched music, to obtain the estimated music assessing information of the user for the matched music.
	Shakirova, however, teaches “obtaining, for the matched music, user attribute information of users assessing the matched music” as “On the basis of users’ history of listening we prepare the matrix of preferences R…which contains information how many times the user u listened to song i… Then the cosine similarity between users u and v is computed by… It is obvious that the cosine similarity measure is a special case of a conditional probability measure” (Page 549), “obtaining, through screening, similar users whose user attribute information is similar to user attribute information of the user” as “On the basis of users’ history of listening we prepare the matrix of preferences R…which contains information how many times the user u listened to song i… Then the cosine similarity between users u and v is computed by… It is obvious that the cosine similarity measure is a special case of a conditional probability measure” (Page 549), “obtaining actual music assessing information of the similar users for the matched music” as “We use simple weighted sum strategy for aggregating the information provided by similar users/items. In the user-based type of recommendation the scoring function is computed by… The score of the item is proportional to the similarities between the target user and other users who have the item in their history of listening. This score is higher for items which are often rated by similar users” (Page 549), and “performing mean processing on the actual music assessing information of the similar users for the matched music, to obtain the estimated music assessing information of the user for the matched music” as “We use simple weighted sum strategy for aggregating the information provided by similar users/items. In the user-based type of recommendation the scoring function is computed by… The score of the item is proportional to the similarities between the target user and other users who have the item in their history of listening. This score is higher for items which are often rated by similar users” (Page 549).
	The examiner further notes that although the primary reference of Newell clearly obtains accessing information of other users for music, there is no explicit teaching of the use of “mean processing”.  The secondary reference of Shakirova teaches the concept of using a weighted sum formula (i.e. “mean processing” in the broadest reasonable interpretation).  The combination would result in the use of such processing for the accessing information of the other users of Newell.  Moreover, Shakirova clearly teaches the use of preference information (i.e. the claimed attribute information in the broadest reasonable interpretation) as a basis to find similar users. 
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Shakirova’s teachings would have allowed Newell’s and Shah’s systems to provide a method for improving the effectiveness of recommender systems, as noted by Shakirova (Page 550).
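	For illustration only, limitations A) through D) as mapped above can be sketched as follows: listening counts serve as the user attribute information, cosine similarity screens for similar users, and the similar users' values for one song are averaged. This is a sketch under stated assumptions, not Shakirova's formula (which is elided in the quotation) and not the claimed method. In particular, a plain arithmetic mean is substituted here for the weighted sum, and the threshold, function names, and play counts are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two play-count vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def estimate_by_mean(target, others, song_index, threshold):
    """Screen users by similarity to the target, then average their values for one song."""
    similar = [row for row in others if cosine_similarity(target, row) >= threshold]
    if not similar:
        return None  # no sufficiently similar users, so no estimate
    return sum(row[song_index] for row in similar) / len(similar)


# Hypothetical rows of a preference matrix: one row per user, one column per song.
target = [4, 0, 2, 0]  # the target user has not listened to the last song
others = [
    [8, 0, 4, 6],  # similar taste to the target
    [0, 5, 0, 1],  # no overlap with the target
    [2, 0, 2, 2],  # similar taste to the target
]

# Estimate the target user's value for the last song (index 3).
estimate = estimate_by_mean(target, others, song_index=3, threshold=0.7)
```

	With these assumed values, the second user is screened out because that user shares no songs with the target, and the estimate is the mean of the two remaining users' counts for the last song.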
Allowable Subject Matter
16.	Claims 4 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
	Specifically, although the prior art (See Dunker) teaches the automated generation of soundtracks for slideshows (i.e. image sets) and Shah clearly teaches the parsing of content into frames to determine semantic vectors via the use of trained models, the detailed claim language directed towards the training of a model via the use of tag recognition samples that comprise a sample image and a visual semantic tag vector housing both a score and a semantic tag of the sample image is not found in the prior art, in conjunction with the rest of the limitations of the parent claims.
Claims 5 and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
	Specifically, although the prior art (See Shah) clearly teaches the parsing of video content into frames to determine semantic vectors via the use of trained models, the detailed claim language directed towards the training of a model via the use of tag recognition samples that comprise a sample image and a visual semantic tag vector housing both a score and a semantic tag of the sample image is not found in the prior art, in conjunction with the rest of the limitations of the parent claims.
Claims 9 and 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
	Specifically, although the prior art (See Shakirova) clearly teaches matrices with respect to similar users for recommendation of music, the detailed claim language directed towards the use of a music matrix and user matrix in order to perform specific mathematical operations of a transpose and subsequent product to obtain the estimated accessing information is not found in the prior art, in conjunction with the rest of the limitations of the parent claims.
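	For illustration only, the transpose and subsequent product referred to above can be sketched as follows. The language of claims 9 and 14 is not reproduced in this action, so the matrix shapes and their contents are assumptions: both matrices are taken to hold hypothetical latent factors, with one column per user or per piece of music, and the estimate is taken to be the product of the transposed user matrix and the music matrix.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n)."""
    return [
        [sum(x * y for x, y in zip(row, column)) for column in zip(*b)]
        for row in a
    ]


# Hypothetical latent factors: rows are factors, columns are users or songs.
user_matrix = [
    [1.0, 0.0],
    [0.5, 2.0],
]  # 2 factors x 2 users
music_matrix = [
    [2.0, 1.0, 0.0],
    [0.0, 1.0, 3.0],
]  # 2 factors x 3 songs

# Entry [u][i] is the assumed estimated value of user u for song i.
estimated = matmul(transpose(user_matrix), music_matrix)
```

	Under these assumptions, each entry of the result is the dot product of one user's factor column and one song's factor column, giving a users-by-songs table of estimated values.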
Conclusion
15.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
U.S. PGPUB 2021/0012761 issued to Song on 14 January 2021.  The subject matter disclosed therein is pertinent to that of claims 1-20 (e.g., methods to recommend background music).
U.S. PGPUB 2018/0096708 issued to Choi et al. on 05 April 2018.  The subject matter disclosed therein is pertinent to that of claims 1-20 (e.g., methods to recommend background music).
Article entitled “Semantic Based Background Music Recommendation for Home Videos” by Lin et al., dated 2014.  The subject matter disclosed therein is pertinent to that of claims 1-20 (e.g., methods to recommend background music).
Kuo et al., dated 2013.  The subject matter disclosed therein is pertinent to that of claims 1-20 (e.g., methods to recommend background music).
Article entitled “Content-Aware Auto-Soundtracks for Personal Photo Music Slideshows”, by Dunker et al., dated 2011.  The subject matter disclosed therein is pertinent to that of claims 1-20 (e.g., methods to recommend background music).
Contact Information
16.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to Mahesh Dwivedi whose telephone number is (571) 272-2731.  The examiner can normally be reached on Monday to Friday 8:20 am – 4:40 pm.
	If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya, can be reached at (571) 272-4034.  The fax number for the organization where this application or proceeding is assigned is (571) 273-8300.
	Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).


Mahesh Dwivedi
Primary Examiner
Art Unit 2168

September 16, 2021
/MAHESH H DWIVEDI/Primary Examiner, Art Unit 2168