EXAMINER'S AMENDMENT
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

Authorization for this examiner’s amendment was given in an interview with Jonathan Berschadsky on 16 June 2021.

The application has been amended as follows: 

In the specification,
Paragraph [0006]
In an example, the first modality is an auditory modality and the second modality is a visual modality. In another example, the first modality is a visual modality and the second modality is an auditory modality. For example, the audio modality is music, and the visual modality is obtained from any one of a movie, television program, photo, a single frame of a video, or a combination thereof. 

Paragraph [0009]
In an example, the first modality is an auditory modality and the second modality is a visual modality. In another example, the first modality is a visual modality and the second modality is an auditory modality. For example, the audio modality is music, and the visual modality is obtained from any one of a movie, television program, photo, a single frame of a video, or a combination thereof. 

In the claims,

1. (Currently Amended) A method of associating at least one media content clip with another media content clip having a different modality, the method comprising the steps of:
	determining a plurality of first embedding vectors of a plurality of media content items of a first modality;
receiving a media content clip of a second modality, wherein the second modality is different than the first modality;
	determining a second embedding vector of the media content clip of the second modality;
training a model by constraining a stream of video with a plurality of predetermined tags for the first modality and constraining a stream of audio with a plurality of predetermined tags for the second modality, wherein the one or more predetermined tags are used to represent an emotion;
	ranking, using the model, the plurality of first embedding vectors based on a distance between the plurality of first embedding vectors and the second embedding vector; 
	selecting one or more of the plurality of media content items of the first modality based on the ranking;
presenting, via a user interface, the selected one or more of the plurality of media content items of the first modality;
receiving, via the user interface, [[an]] a selected media content item from the one or more media content items of the first modality; and
providing an output including the media content clip of the second modality and the selected media content item from the one or more media content items of the first modality.

2. (Original) The method according to claim 1, wherein the first modality is an auditory modality and the second modality is a visual modality.

3. (Original) The method according to claim 1, wherein the first modality is a visual modality and the second modality is an auditory modality.

4. (Original) The method according to claim 2, wherein the audio modality is music.

5. (Original) The method according to claim 3, wherein the audio modality is music.

6. (Cancelled).

7. (Cancelled).

8. (Amended) The method according to claim [[7]]1, wherein the emotion[[s are]] is selected from a set of predetermined emotions.

9. (Amended) The method according to claim 2, wherein the visual modality is obtained from any one of a movie, a television program, a photo, a single frame of a video, or a combination thereof. 

10. (Amended) The method according to claim 3, wherein the visual modality is obtained from any one of a movie, a television program, a photo, a single frame of a video, or a combination thereof. 

11. (Currently Amended) A system configured to associate at least one media content clip with another media content clip having a different modality, the system comprising:
	a computing system including a programmable circuit operatively connected to a memory, the memory storing computer-executable instructions which, when executed by the programmable circuit, cause the computing system to
		determine a plurality of first embedding vectors of a plurality of media content items of a first modality;
receive a media content clip of a second modality, wherein the second modality is different than the first modality;
		determine a second embedding vector of the media content clip of the second modality;
train a model by constraining a stream of video with a plurality of predetermined tags for the first modality and constraining a stream of audio with a plurality of predetermined tags for the second modality, wherein the one or more predetermined tags are used to represent an emotion;
rank, using the model, the plurality of first embedding vectors based on a distance between the plurality of first embedding vectors and the second embedding vector; 
	select one or more of the plurality of media content items of the first modality based on the ranking;
present, via a user interface, the selected one or more of the plurality of media content items of the first modality to a user;
receive, via the user interface, [[an]] a selected media content item from the one or more media content items of the first modality; and
provide an output including the media content clip of the second modality and the selected media content item from the one or more media content items of the first modality.

12. (Original) The system according to claim 11, wherein the first modality is an auditory modality and the second modality is a visual modality.

13. (Original) The system according to claim 11, wherein the first modality is a visual modality and the second modality is an auditory modality.

14. (Original) The system according to claim 12, wherein the audio modality is music.

15. (Original) The system according to claim 13, wherein the audio modality is music.

16. (Cancelled).

17. (Cancelled).

18. (Amended) The system according to claim [[17]]11, wherein the emotion[[s are]] is selected from a set of predetermined emotions.

19. (Amended) The system according to claim 12, wherein the visual modality is obtained from any one of a movie, a television program, a photo, a single frame of a video, or a combination thereof. 

20. (Amended) The system according to claim 13, wherein the visual modality is obtained from any one of a movie, a television program, a photo, a single frame of a video, or a combination thereof. 

21. (New) A non-transitory computer-readable medium having stored thereon one or more sequences of instructions for causing one or more processors to perform:
	determining a plurality of first embedding vectors of a plurality of media content items of a first modality;
receiving a media content clip of a second modality, wherein the second modality is different than the first modality;
	determining a second embedding vector of the media content clip of the second modality;
training a model by constraining a stream of video with a plurality of predetermined tags for the first modality and constraining a stream of audio with a plurality of predetermined tags for the second modality, wherein the one or more predetermined tags are used to represent an emotion;
	ranking, using the model, the plurality of first embedding vectors based on a distance between the plurality of first embedding vectors and the second embedding vector; 
	selecting one or more of the plurality of media content items of the first modality based on the ranking;
presenting, via a user interface, the selected one or more of the plurality of media content items of the first modality;
receiving, via the user interface, an elected media content item of the first modality; and
providing an output including the media content clip of the second modality and the elected media content item of the first modality.

22. (New)	The non-transitory computer-readable medium of Claim 21, wherein the first modality is an auditory modality and the second modality is a visual modality.

23. (New)	The non-transitory computer-readable medium of Claim 22, wherein the audio modality is music and the visual modality is obtained from any one of a movie, a television program, a photo, a single frame of a video, or a combination thereof.

24. (New)	The non-transitory computer-readable medium of Claim 21, wherein the emotion is selected from a set of predetermined emotions.
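
For illustration only, the ranking and selecting steps recited in claims 1, 11, and 21 might be sketched as below. This is a minimal sketch under stated assumptions and is not taken from the application: the claims recite only "a distance", so Euclidean distance is an assumption, the function names (euclidean_distance, rank_items, select_top) and the sample embeddings are hypothetical, and the model training step is omitted.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_items(first_embeddings, second_embedding):
    """Return item ids of the first modality, nearest to farthest.

    first_embeddings: dict mapping a hypothetical item id to its embedding.
    second_embedding: embedding of the clip of the second modality.
    """
    return sorted(
        first_embeddings,
        key=lambda item_id: euclidean_distance(
            first_embeddings[item_id], second_embedding
        ),
    )


def select_top(ranked_ids, k):
    """Select the k highest-ranked (closest) items for presentation."""
    return ranked_ids[:k]


# Hypothetical embeddings: three music tracks (first modality) and one
# video clip (second modality), all in the same two-dimensional space.
track_embeddings = {
    "track_a": [0.9, 0.1],
    "track_b": [0.2, 0.8],
    "track_c": [0.5, 0.5],
}
video_clip_embedding = [0.25, 0.75]

ranked = rank_items(track_embeddings, video_clip_embedding)
candidates = select_top(ranked, 2)
print(candidates)
```

In this sketch the two closest tracks would be the ones presented via the user interface, and the item the user picks would be combined with the clip of the second modality in the output.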

Reasons for Allowance
The following is an examiner’s statement of reasons for allowance: the prior art does not teach a model trained by constraining a stream of video with a plurality of predetermined tags for the first modality and constraining a stream of audio with a plurality of predetermined tags for the second modality, wherein the one or more predetermined tags are used to represent an emotion.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM SPIELER whose telephone number is (571)270-3883.  The examiner can normally be reached on Monday-Friday, 11-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela Reyes, can be reached at 571-270-1006.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-


/WILLIAM SPIELER/Primary Examiner, Art Unit 2159