DETAILED ACTION

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 5/16/2022 has been entered. Claims 1-2, 4-5, 7-15, 17-18 and 20 are pending in the application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1-2, 4-5, 7-15, 17-18 and 20 have been considered but are moot because a new reference, Wan et al. (US 2007/0150802), was applied to teach the amended, argued limitations.
Wood teaches at para. 36-37: detected events may be classified into a semantic category such as birthday, wedding, etc. Such media assets may be classified together as depicting the same event because they share the same location, setting, or activity per a unit of time, and are intended to be related, according to the expected intent of the user or group of users. Within each event, media assets can also be clustered into separate groups of relevant content called sub-events. Multiple events themselves may also be clustered into larger groups called super-events; fig. 6: item 605, for example, is a filtered subset of the SEQ. or ROOT segment; a user viewing the subset 610 would also be able to select its parent/filtered set 605 for viewing, which includes at least the subset 620.
Wan further teaches at para. 73: a collection of digital images may be partitioned according to the distribution of their time stamps, the event they are associated with, the location they are related to, the people they show, their capture parameters, or the lighting conditions; para. 79: sub-groups of images; para. 187-189: in the event that the contents of a cluster are presented as a summary by showing only some of the documents 701, and hiding other documents within the cluster, this is represented by a show/all button 705 presented beside the cluster summary. Hidden documents can be viewed by maximising the cluster, either by clicking the show/all button 705, or by using a minimise/maximise control 706 displayed on each cluster. Therefore, users can select the show/all button to see all images in the filtered set/subset of images. Accordingly, the combination of references teaches the amended limitations.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-5, 7-11, 13-14, 17-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wood (US 20150363409) in view of Sewak (US 20200097569) and further in view of Wan et al. (US 2007/0150802).
As per claims 1, 5, 14, Wood teaches
a computer-implemented method, comprising: maintaining, by one or more processors of a computing device, a knowledge graph comprising a plurality of nodes associated with respective sets of digital assets of a digital asset collection stored at the computing device (figs.1-2: multimedia collection, metadata generator and repository; para. 28, 34-37: assets may be classified in a sub-event because they were captured at roughly the same time and optionally have some measure of visual similarity, classifying images or videos in a multimedia collection into one of several event categories using a combination of time-based and content based features; para. 43-46: the set of multimedia assets is defined as a hierarchical grouped and ordered set of media assets);
receiving, at a user interface of the computing device, selection of a collection of related digital assets; identifying, by the one or more processors and based at least in part on the selection, a set of digital assets associated with the collection of related assets, the set of digital assets being associated with a collection node of the plurality of nodes of the knowledge graph, the collection node corresponding to the selection (para. 28, 36: while media assets may be generally classified as depicting the same event because they share the same setting or activity, media assets in a sub-event share more specific similar content within the event; para. 49-56; para. 72: the user selecting a particular output modality, e.g., an 8x10 photobook or video slideshow. The system may automatically create view-based representations on speculation for a user; para. 74-75: present filtered images based on selection; para. 79-80: the user to select any picture in the final view-based representation and to view alternative images at different levels of the hierarchy);
generating, by the one or more processors, a subset of digital assets from the set of digital assets identified based at least in part on capture time data associated with respective digital assets of the set of digital assets (para. 25, 36; para. 60: the system provides for grouping within a storyboard, where the groups are determined by a variety of techniques beside chronology. For example, frequent item set mining, face detection, face recognition, location clustering, object detection, event detection, and event recognition all could be the basis for grouping, either individually or in some combination. The system is capable of generating these and other types of groupings. Each grouping type has an appropriate algorithm for determining both the grouping and the associated priority; para. 52: the system determines semantic equivalence by considering the elapsed time between any two consecutive pictures that appeared within a given sub-event cluster, as well as the visual similarity of the image);
generating, by the one or more processors, content metadata for each of the digital assets of the subset of digital assets, the content metadata for a digital asset including a plurality of confidence scores that individually describe a degree of confidence that an object is included in the digital asset (para. 22-23, 31: detecting faces and analyzing facial features to generate derived metadata; para. 34-35: face/object recognition; para. 60, 76: metadata encompasses derived metadata, such as metadata computed by face detection or event classification algorithms that are applied to media assets post-capture; para. 51-57: grouped under alternates segment. Grouping these media assets in an alternate segment indicates that these two assets are semantically-equivalent, or near duplicates…using metadata associated with individual images, such as time, place, or people identified in the image; para.66: face detectors generate a confidence score that the detected face is an actual face, a confidence score indicating the strength of the belief that the output class is true);
calculating, by the one or more processors, a distance value quantifying a degree of similarity between first content metadata associated with a first digital asset of the subset of digital assets that includes the object in a first location and second content metadata associated with a second digital asset of the subset of digital assets that includes the object in a second location (para. 30-31: Content-based Image Retrieval (CBIR) techniques retrieve images from a database that are similar to an example (or query) image. CBIR may also be used to generate metadata indicating image similarity based on low-level image features. This concept can be extended to portions of images or Regions of Interest (ROI); para. 34-35: face recognition is the identification or classification of a face to an example of a person or a label associated with a person based on facial features. Face clustering is a form of face recognition wherein faces are grouped by similarity (thus distance). With face clustering, faces that appear to represent the same person are associated together and given a label, but the actual identity of the person is not necessarily known. Face clustering uses data generated from facial detection and feature extraction algorithms to group faces that appear to be similar. This selection may be triggered based on a numeric confidence value. The output of the face clustering algorithm is new metadata: namely, a new object representing the face cluster is created);
filtering out the second digital asset, based at least in part on a determination that the distance value is below a threshold value, by generating a filtered set of digital assets that includes the first digital asset and excludes the second digital asset from the filtered set of digital assets; presenting, at a display of the computing device and based at least in part on the selection, the filtered set of digital assets as being associated with the collection of related digital assets (para. 51-53: for a set of semantically equivalent images, a feature of the present invention provides for simply selecting a single representative image to represent the set. Media assets are grouped under alternates segment. Grouping these media assets in an alternates segment indicates that these two assets are semantically-equivalent, or near duplicates of each other. Priority values associated with each image (not shown) could be used to pick the best image as the representative image from this set. Thus, at least a second semantically-equivalent or near duplicate image is excluded from being displayed);
 
presenting, at the display of the computing device, a user interface element configured to enable presentation of an unfiltered version of the filtered set of digital assets that includes at least the second digital asset (para. 36-37: detected events may be classified into a semantic category such as birthday, wedding, etc. Such media assets may be classified together as depicting the same event because they share the same location, setting, or activity per a unit of time, and are intended to be related, according to the expected intent of the user or group of users. Within each event, media assets can also be clustered into separate groups of relevant content called sub-events. Multiple events themselves may also be clustered into larger groups called super-events; fig. 6: item 605, for example, is a subset/filtered of SEQ. or ROOT segment, if a user is viewing the subset 610 would also be able to select its parent/filtered set 605 for viewing which includes at least the subset 620).

	Wood does not explicitly teach utilizing a neural network or digital asset locations.
	Sewak teaches 
utilizing a neural network; a distance value quantifying a degree of similarity; object locations (para. 68: the pictorial summary program module uses the knowledge graph and the acquired document text corpus from step 310 of FIG. 3 to determine the indirect relations between vectors, using deep learning and cognitive computing based techniques including advanced natural language processing; para. 71: classify the objects and concepts in each of the images in the database of pictorial representations, uses machine learning and visual analytics techniques including convolutional neural networks and deep learning based computer vision to classify the objects and concepts into the different classes; para. 73, 79: refines and enriches the models based on class and object combination models and description for relative location, aspect, distance, and scaling. The pictorial summary program module 220 uses deep learning based computer vision and cognitive computing techniques to refine and enrich the similarity model developed at step 515, the taxonomies developed at step 520, and the model for inter-object distance, nearest distance point specification). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Wood and Sewak in order to effectively display to users the most relevant digital assets.
	Even if Wood and Sewak do not explicitly teach enabling presentation of an unfiltered version of the filtered set of digital assets that includes at least the second digital asset, 
	Wan teaches 
enable presentation of an unfiltered version of the filtered set of digital assets that includes at least the second digital asset; in accordance with receiving, at a user interface on the display of the computing device, selection of the user interface element, presenting, at the display of the computing device, the unfiltered version of the filtered set of digital assets that includes at least the second digital asset; and in accordance with not receiving, at the user interface, selection of the user interface element, maintaining presentation of the filtered set of digital assets (para. 73: a collection of digital images may be partitioned according to the distribution of their time stamps, the event they are associated with, the location they are related to, the people they show, their capture parameters, or the lighting conditions; para. 79: sub-groups of images; para. 187-189: in the event that the contents of a cluster are presented as a summary by showing only some of the documents 701, and hiding other documents within the cluster, this is represented by a show/all button 705 presented beside the cluster summary. Hidden documents can be viewed by maximising the cluster, either by clicking the show/all button 705, or by using a minimise/maximise control 706 displayed on each cluster.) Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Wood, Sewak and Wan in order to effectively allow users to view different sets of digital assets.
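
For illustration only, the filtering and "show all" limitations discussed above can be pictured with a minimal sketch. It is not code from Wood, Sewak, or Wan; the Asset class, the Euclidean metric, the threshold of 0.1, and all function names are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Asset:
    name: str
    scores: List[float]  # hypothetical per-object confidence scores


def distance(a: Asset, b: Asset) -> float:
    # Assumed metric: Euclidean distance between confidence-score vectors.
    return sum((x - y) ** 2 for x, y in zip(a.scores, b.scores)) ** 0.5


def filter_near_duplicates(assets: List[Asset], threshold: float) -> List[Asset]:
    # An asset is excluded when its distance to an already kept asset
    # falls below the threshold (it is treated as a near duplicate).
    kept: List[Asset] = []
    for asset in assets:
        if all(distance(asset, k) >= threshold for k in kept):
            kept.append(asset)
    return kept


def assets_to_present(assets: List[Asset], threshold: float, show_all: bool) -> List[Asset]:
    # show_all models selection of the "show all" user interface element;
    # without that selection, the filtered presentation is maintained.
    return list(assets) if show_all else filter_near_duplicates(assets, threshold)


photos = [
    Asset("first", [0.90, 0.10]),
    Asset("second", [0.88, 0.12]),  # near duplicate of "first"
    Asset("third", [0.10, 0.90]),
]
filtered = assets_to_present(photos, threshold=0.1, show_all=False)
unfiltered = assets_to_present(photos, threshold=0.1, show_all=True)
```

In this sketch the filtered presentation contains "first" and "third", while the unfiltered presentation also restores "second".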

As per claims 2, 13, Wood teaches
wherein the network is trained with previously categorized digital assets to identify a plurality of features, wherein the network is configured to receive the digital asset as input, and wherein the network is configured to output content metadata for the digital asset (para. 28, 31-35, 46, 66: machine learning and metadata generating, e.g., detecting faces and analyzing facial features to generate derived metadata.)
Wood does not explicitly teach the neural network.
	Sewak teaches 
the neural network in relation to the categorized digital assets (para. 68: the pictorial summary program module uses the knowledge graph and the acquired document text corpus from step 310 of FIG. 3 to determine the indirect relations between vectors, using deep learning and cognitive computing based techniques including advanced natural language processing; para. 71: classify the objects and concepts in each of the images in the database of pictorial representations, uses machine learning and visual analytics techniques including convolutional neural networks and deep learning based computer vision to classify the objects and concepts into the different classes; para. 73, 79: refines and enriches the models based on class and object combination models and description for relative location, aspect, distance, and scaling. The pictorial summary program module 220 uses deep learning based computer vision and cognitive computing techniques to refine and enrich the similarity model developed at step 515, the taxonomies developed at step 520, and the model for inter-object distance, nearest distance point specification). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Wood and Sewak in order to effectively display to users the most relevant digital assets.

As per claim 4, Wood teaches
wherein calculating the distance value includes comparing a first plurality of confidence scores of the first content metadata to a second plurality of confidence scores of the second content metadata (para. 34-35: face recognition is the identification or classification of a face to an example of a person or a label associated with a person based on facial features. Face clustering is a form of face recognition wherein faces are grouped by similarity (thus distance). With face clustering faces that appear to represent the same person are associated together and given a label, but the actual identity of the person is not necessarily known. Face clustering uses data generated from facial detection and feature extraction algorithms to group faces that appear to be similar. This selection may be triggered based on a numeric confidence value. The output of the face clustering algorithm is new metadata: namely, a new object representing the face cluster is created; para. 51-53: grouped under alternates segment 440. Grouping these media assets in an alternates segment indicates that these two assets are semantically-equivalent, or near duplicates of each other; para.66: face detectors generate a confidence score that the detected face is an actual face, a confidence score indicating the strength of the belief that the output class is true.)
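
For illustration only, comparing a first plurality of confidence scores to a second plurality could look like the following minimal sketch. The label-keyed dictionaries, the mean absolute difference, and the treatment of a missing label as 0.0 are assumptions made for the example and are not taken from Wood.

```python
from typing import Dict


def score_distance(first: Dict[str, float], second: Dict[str, float]) -> float:
    # Compare scores label by label; a label missing from one asset is
    # treated as confidence 0.0 (an assumption made for this sketch).
    labels = set(first) | set(second)
    if not labels:
        return 0.0
    total = sum(abs(first.get(label, 0.0) - second.get(label, 0.0)) for label in labels)
    return total / len(labels)


first_scores = {"dog": 0.90, "ball": 0.80}
second_scores = {"dog": 0.85, "ball": 0.80, "tree": 0.10}
d = score_distance(first_scores, second_scores)  # approximately 0.05
```

A small value indicates that the two assets carry nearly the same detected content, which is the condition under which one of them would be treated as a near duplicate.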

As per claim 7, Wood teaches
prepare for display a user interface that includes user interface elements, each user interface element of the user interface elements identifying the collection of related digital assets and a corresponding multimedia icon that represents a corresponding digital asset associated with the collection of related digital assets; and receive a selection of at least one of the user interface elements, wherein the filtered set of digital assets is presented based at least in part on the selection received (para. 22: metadata includes recorded or previously recorded metadata, which is recorded by the capture device, e.g., capture time, date, and location provided by a digital camera. Metadata also encompasses user-provided metadata; para. 60, 76, 36: detected events may be classified into a semantic category such as birthday, wedding, etc.; para. 79: the user to select any picture in the final view-based representation and to view alternative images at different levels of the hierarchy.)

As per claim 8, Wood teaches
wherein digital assets of the collection of related digital assets are related by at least one metadata attribute including an event, a location, content, a capture time, or a subject (para. 36: assets may be classified in a sub-event because they were captured at roughly the same time and optionally have some measure of visual similarity.)

As per claim 9, Wood teaches
present, with the filtered set of digital assets, additional collections of digital assets that relate to the filtered set of digital assets by at least one attribute of corresponding metadata of the filtered set of digital assets (para. 47, 65-66: priority score quantifies how confident the system is that the assets assigned to a group comply with constraints of the group; para.74: filtering/only assets whose priority/confidence exceeds the threshold are included.)

As per claim 10, Wood teaches
receive, at a user interface of the computing device, a selection associated with a filter option, wherein additional digital assets of the set of digital assets is presented with the filtered set of digital assets (para. 51: for a set of semantically equivalent images, a feature of the present invention provides for simply selecting a single representative image to represent the set; para. 72: the user selecting a particular output modality, e.g., an 8x10 photobook or video slideshow. The system may automatically create view-based representations on speculation for a user; para. 74-75: present filtered images based on selection; para. 79-80: the user to select any picture in the final view-based representation and to view alternative images at different levels of the hierarchy).

As per claim 11, Wood teaches
provide, at the user interface, a video including the filtered set of digital assets (para. 71-72: displaying the filtered media assets; para. 74-75: present filtered images based on selection; para. 79-80: the user to select any picture in the final view-based representation and to view alternative images at different levels of the hierarchy).

As per claim 17, Wood teaches
identifying subjects of the filtered set of digital assets based at least in part on executing facial recognition techniques with the filtered set of digital assets; and presenting, with the filtered set of digital assets, icons corresponding to the subject identified (para. 5, 31-35: identifying subjects/person; para. 51, 60, 79: the user to select any picture in the final view-based representation and to view alternative images at different levels of the hierarchy.)

As per claim 18, Wood teaches
wherein the filtered set of digital assets provides a representative set of digital assets of the one or more digital assets that excludes duplicative digital assets (para. 51-53: for a set of semantically equivalent images, a feature of the present invention provides for simply selecting a single representative image to represent the set. Media assets are grouped under alternates segment. Grouping these media assets in an alternates segment indicates that these two assets are semantically-equivalent, or near duplicates of each other. Priority values associated with each image (not shown) could be used to pick the best image as the representative image from this set. Thus, at least a second semantically-equivalent or near duplicate image is excluded from being displayed).

As per claim 20, Wood teaches
wherein the first location and the second location are different (para. 30-31: Content-based Image Retrieval (CBIR) techniques retrieve images from a database that are similar to an example (or query) image. CBIR may also be used to generate metadata indicating image similarity based on low-level image features. This concept can be extended to portions of images or Regions of Interest (ROI); para. 34-35: face recognition is the identification or classification of a face to an example of a person or a label associated with a person based on facial features. Face clustering is a form of face recognition wherein faces are grouped by similarity (thus distance). Thus, the distance is between feature(s)/locations of first and second images.)

Claims 12, 15 are rejected under 35 U.S.C. 103 as being unpatentable over Wood (US 20150363409) in view of Sewak (US 20200097569) and further in view of Wan et al. (US 2007/0150802) and Sentinelli (US 20130336590).
As per claims 12, 15, Wood teaches
wherein generating the filtered set of digital assets comprises: determining a plurality of aesthetic scores associated with the first digital asset and the second digital asset; selecting a highest aesthetic score of the plurality of aesthetic scores; and including, in the filtered set of digital assets, a particular digital asset associated with the highest aesthetic score (para. 38: an image value index (IVI) is defined as a measure of the degree of importance (significance, attractiveness, usefulness, or utility) that an individual user might associate with a particular asset. An IVI score can be a stored rating entered by a user as metadata. Automatic IVI algorithms can utilize image features, such as sharpness, lighting, and other indications of quality; para. 49-51: images containing people may be scored in part based upon the identity of the people and in part based on the quality of detected faces).
Wood, Sewak and Wan do not explicitly teach an aesthetic score.
Sentinelli teaches an aesthetic or quality score at para. 54-57: automatic selection of the most representative frame or the frame with the highest quality from a number of recurring scenes, as may happen for instance when recording a sports event like biathlon or formula one. Similar image frames; para. 71, 141-142. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Wood, Sewak, Wan and Sentinelli to effectively discard low quality frames/images and display to the users high quality digital assets.
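
For illustration only, selecting the asset with the highest aesthetic score from each group of near duplicates can be sketched as follows. The grouping, the score values, and the function name are assumptions for the example and do not come from Wood or Sentinelli.

```python
from typing import Dict, List


def select_representatives(groups: List[Dict[str, float]]) -> List[str]:
    # Each group maps an asset name to a hypothetical aesthetic score;
    # the asset with the highest score represents its group.
    return [max(group, key=group.get) for group in groups if group]


groups = [
    {"first": 0.62, "second": 0.81},  # two near duplicates
    {"third": 0.40},
]
chosen = select_representatives(groups)
```

Here "second" would represent the first group because its assumed score is higher, and "third" is kept as the only member of its group.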

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Bernstein (US 2017/0010847) teaches at para. 61: an image management application; para. 98, 210: a show all open tabs. Trifunovic (US 20170255693) teaches at para. 29: eliminate duplicates of images in the search results; para. 31-33, 52: similarity scores below a specified threshold.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LINH BLACK whose telephone number is (571)272-4106. The examiner can normally be reached 9AM-5PM EST M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tony Mahmoudi, can be reached at 571-272-4078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/LINH BLACK/Examiner, Art Unit 2163
6/8/2022

/TONY MAHMOUDI/Supervisory Patent Examiner, Art Unit 2163