DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after allowance or after an Office action under Ex Parte Quayle, 25 USPQ 74, 453 O.G. 213 (Comm'r Pat. 1935). Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, prosecution in this application has been reopened pursuant to 37 CFR 1.114.  Applicant's submission filed on 7/22/22 has been entered.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 7/22/22 and 11/17/22 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-3, 5-12, 15-17, and 23-25 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by US PG Pub 2017/0098152 to Kerr et al.

Regarding claim 1. Kerr discloses a method (Fig. 4, “flow diagram showing a method 400”, paragraph 53), comprising: 
detecting a set of one or more attributes of an input image using a machine learning framework (Fig. 4, “At step 410, a selection of one or more images from a user via a user device is received. Each of one or more images comprises one or more attributes that may be identified, in embodiments, by a neural network or other feature extraction algorithm. The neural network or other feature extraction algorithm may compare feature vectors corresponding visual-based query to feature vectors in the set of images to identify image results based on visual similarity. In some embodiments, the attributes include one or more of composition, color, style, texture, or font. A selection of at least one attribute for each image is received”, paragraph 54), 
wherein the machine learning framework is trained at least in part on a set of training images comprising a prescribed scene type (“machine learning, deep neural networks, and other computer vision techniques are utilized to extract attributes of images, for example as a feature vector. In various embodiments, the attributes include color, composition, font, style, and texture. Attributes may also include line weight or line style. Training images may be utilized to implement a generic system initially that identifies visual similarity generally, but without any understanding of specific attributes. The generic system may then be trained with a new set of training data for a specific attribute. In this way, the system may be fine-tuned at different output layers to detect different attributes with each layer being independently evolved from the generic system. In other words, the transformations necessary to extract a particular feature vector at a particular layer of the system is learned based on set of training data for each specific attribute”, paragraph 18), 
wherein the input image comprises the prescribed scene type (“Upon training the system to extract attributes from an image, a user can submit a sample image comprising at least one desired attribute. A user can then select a specific attribute of the sample image to focus on that specific attribute from the sample image”, paragraph 21), and 
wherein the detected set of attributes is not known for the input image prior to detection by the machine learning framework (“Training images may be utilized to implement a generic system initially that identifies visual similarity generally, but without any understanding of specific attributes. The generic system may then be trained with a new set of training data for a specific attribute. In this way, the system may be fine-tuned at different output layers to detect different attributes with each layer being independently evolved from the generic system. In other words, the transformations necessary to extract a particular feature vector at a particular layer of the system is learned based on set of training data for each specific attribute”, paragraph 18); and 
generating an output image comprising a modified version of the input image by modifying at least a subset of the detected set of attributes (“Some embodiments of the present invention are directed to modifying one or more specific attributes found in an image. To do so, a user may submit a first sample image comprising a number of attributes. The user may submit a second sample image that comprises at least one attribute that is different from the attributes of the first sample image. Upon selecting one or more attributes from the second image, the user may modify at least one attribute extracted from the first sample image based on the selection. This enables a user to modify an image to include a desired attribute that is not inherent to the image without having to perform a search for images that include all of the desired attributes. In some embodiments, the user may submit a search query based on the modified image”, paragraph 22).

Regarding claim 2. The method of claim 1, Kerr discloses wherein the prescribed scene type comprises one or more of a constrained set of objects (“The network or algorithm can identify one or more associations between the semantic content of an image and a class of semantically similar images. For example, a neural network or other feature extraction algorithm may analyze training images with certain recurring objects, color schemes, or other semantic content and determine that the objects, color schemes, or other semantic content are indicative of a certain class of content (e.g., “dogs,” “vehicles,” “trees,” etc.)”, paragraph 25).

Regarding claim 3. The method of claim 1, Kerr discloses wherein the set of training images includes images comprising different combinations of objects and object arrangements, camera configurations, lighting types and locations, and materials and textures (“The network or algorithm can identify one or more associations between the semantic content of an image and a class of semantically similar images. For example, a neural network or other feature extraction algorithm may analyze training images with certain recurring objects, color schemes, or other semantic content and determine that the objects, color schemes, or other semantic content are indicative of a certain class of content (e.g., “dogs,” “vehicles,” “trees,” etc.)”, paragraph 25).

Regarding claim 5. The method of claim 1, Kerr discloses wherein at least the subset of the detected set of attributes is associated with an aesthetic (“machine learning, deep neural networks, and other computer vision techniques are utilized to extract attributes of images, for example as a feature vector. In various embodiments, the attributes include color, composition, font, style, and texture”, paragraph 18, “In additional or alternative embodiments, semantic similarity includes a similarity between a first image style in a first image and a second image style in a second image. For example, vectors representing color or contrast information can be calculated for two images. The stylistic similarity can be determined by calculating a distance between these vectors. A larger calculated distance indicates a lower degree of stylistic similarity, and a smaller calculated distance indicates a higher degree of stylistic similarity”, paragraph 24).

Regarding claim 6. The method of claim 5, Kerr discloses wherein the output image comprises a different aesthetic than the input image (“Some embodiments of the present invention are directed to modifying one or more specific attributes found in an image. To do so, a user may submit a first sample image comprising a number of attributes. The user may submit a second sample image that comprises at least one attribute that is different from the attributes of the first sample image. Upon selecting one or more attributes from the second image, the user may modify at least one attribute extracted from the first sample image based on the selection. This enables a user to modify an image to include a desired attribute that is not inherent to the image without having to perform a search for images that include all of the desired attributes”, paragraph 22).

Regarding claim 7. The method of claim 1, Kerr discloses wherein at least the subset of the detected set of attributes is associated with a style (“machine learning, deep neural networks, and other computer vision techniques are utilized to extract attributes of images, for example as a feature vector. In various embodiments, the attributes include color, composition, font, style, and texture”, paragraph 18, “In additional or alternative embodiments, semantic similarity includes a similarity between a first image style in a first image and a second image style in a second image. For example, vectors representing color or contrast information can be calculated for two images. The stylistic similarity can be determined by calculating a distance between these vectors. A larger calculated distance indicates a lower degree of stylistic similarity, and a smaller calculated distance indicates a higher degree of stylistic similarity”, paragraph 24).

Regarding claim 8. The method of claim 7, Kerr discloses wherein the output image comprises a restyled version of the input image (“In additional or alternative embodiments, semantic similarity includes a similarity between a first image style in a first image and a second image style in a second image. For example, vectors representing color or contrast information can be calculated for two images. The stylistic similarity can be determined by calculating a distance between these vectors. A larger calculated distance indicates a lower degree of stylistic similarity, and a smaller calculated distance indicates a higher degree of stylistic similarity”, paragraph 24).

Regarding claim 9. The method of claim 1, Kerr discloses wherein at least the subset of the detected set of attributes is associated with an object in the input image (“the image comparison algorithm can analyze image data associated with two or more separate images to determine that the images are visually similar. For example, the direct image comparison algorithm may determine that two separate images, each having the Eiffel tower isolated front and center, as having a high likelihood of visual similarity”, paragraph 23, “a neural network or other feature extraction algorithm may analyze training images with certain recurring objects”, paragraph 25).

Regarding claim 10. The method of claim 9, Kerr discloses wherein the object in the input image is replaced by a different object in the output image (“enables a user to modify an image to include a desired attribute that is not inherent to the image”, paragraph 22, “A selection of at least one attribute for each image is received, at step 412, from the user via the user device. Each selection may additionally include a weight selected by a user that may indicate an importance of each attribute to the user. In some embodiments, a negative attribute may be selected for one or more images that indicates an attribute the user does not want the result images to include”, paragraph 54).

Regarding claim 11. The method of claim 1, Kerr discloses wherein at least the subset of the detected set of attributes is associated with lighting (“to focus on one specific attribute, such as color, from a first image and a different specific attribute, such as composition”, paragraph 21; to those of ordinary skill in the art, color in an image necessarily includes the rendering of lighting effects, since such effects are delineated by changes and shifts in color).

Regarding claim 12. The method of claim 11, Kerr discloses wherein the output image comprises a relit version of the input image (“to focus on one specific attribute, such as color, from a first image and a different specific attribute, such as composition”, paragraph 21; to those of ordinary skill in the art, color in an image necessarily includes the rendering of lighting effects, since such effects are delineated by changes and shifts in color).

Regarding claim 15. The method of claim 1, Kerr discloses labeling or tagging the output image with a modified set of attributes (“A selection of at least one attribute for each image is received, at step 412, from the user via the user device. Each selection may additionally include a weight selected by a user that may indicate an importance of each attribute to the user. In some embodiments, a negative attribute may be selected for one or more images that indicates an attribute the user does not want the result images to include”, paragraph 54).

Regarding claim 16. The method of claim 1, Kerr discloses wherein the detected set of attributes comprises one or more attributes associated with object/scene types, geometries, placements, materials, textures, camera characteristics, lighting characteristics, noise statistics, and contrast (“machine learning, deep neural networks, and other computer vision techniques are utilized to extract attributes of images, for example as a feature vector. In various embodiments, the attributes include color, composition, font, style, and texture”, paragraph 18, “In additional or alternative embodiments, semantic similarity includes a similarity between a first image style in a first image and a second image style in a second image. For example, vectors representing color or contrast information can be calculated for two images. The stylistic similarity can be determined by calculating a distance between these vectors. A larger calculated distance indicates a lower degree of stylistic similarity, and a smaller calculated distance indicates a higher degree of stylistic similarity”, paragraph 24).

Regarding claim 17. The method of claim 1, Kerr discloses wherein a training image of the set of training images is labeled or tagged with metadata (“Some embodiments of the present invention are directed to modifying one or more specific attributes found in an image. To do so, a user may submit a first sample image comprising a number of attributes. The user may submit a second sample image that comprises at least one attribute that is different from the attributes of the first sample image. Upon selecting one or more attributes from the second image, the user may modify at least one attribute extracted from the first sample image based on the selection. This enables a user to modify an image to include a desired attribute that is not inherent to the image without having to perform a search for images that include all of the desired attributes”, paragraph 22).

Regarding claim 23. The method of claim 1, Kerr discloses wherein the machine learning framework comprises a deep neural network, a convolutional neural network, or both (Abstract).

Regarding claim 24. Kerr discloses a system (Fig. 1), comprising: 
a processor (Fig. 8) configured to: 
detect a set of one or more attributes of an input image using a machine learning framework (Fig. 4, “At step 410, a selection of one or more images from a user via a user device is received. Each of one or more images comprises one or more attributes that may be identified, in embodiments, by a neural network or other feature extraction algorithm. The neural network or other feature extraction algorithm may compare feature vectors corresponding visual-based query to feature vectors in the set of images to identify image results based on visual similarity. In some embodiments, the attributes include one or more of composition, color, style, texture, or font. A selection of at least one attribute for each image is received”, paragraph 54), 
wherein the machine learning framework is trained at least in part on a set of training images comprising a prescribed scene type (“machine learning, deep neural networks, and other computer vision techniques are utilized to extract attributes of images, for example as a feature vector. In various embodiments, the attributes include color, composition, font, style, and texture. Attributes may also include line weight or line style. Training images may be utilized to implement a generic system initially that identifies visual similarity generally, but without any understanding of specific attributes. The generic system may then be trained with a new set of training data for a specific attribute. In this way, the system may be fine-tuned at different output layers to detect different attributes with each layer being independently evolved from the generic system. In other words, the transformations necessary to extract a particular feature vector at a particular layer of the system is learned based on set of training data for each specific attribute”, paragraph 18), 
wherein the input image comprises the prescribed scene type (“Upon training the system to extract attributes from an image, a user can submit a sample image comprising at least one desired attribute. A user can then select a specific attribute of the sample image to focus on that specific attribute from the sample image”, paragraph 21), and 
wherein the detected set of attributes is not known for the input image prior to detection by the machine learning framework (“Training images may be utilized to implement a generic system initially that identifies visual similarity generally, but without any understanding of specific attributes. The generic system may then be trained with a new set of training data for a specific attribute. In this way, the system may be fine-tuned at different output layers to detect different attributes with each layer being independently evolved from the generic system. In other words, the transformations necessary to extract a particular feature vector at a particular layer of the system is learned based on set of training data for each specific attribute”, paragraph 18); and 
generate an output image comprising a modified version of the input image by modifying at least a subset of the detected set of attributes (“Some embodiments of the present invention are directed to modifying one or more specific attributes found in an image. To do so, a user may submit a first sample image comprising a number of attributes. The user may submit a second sample image that comprises at least one attribute that is different from the attributes of the first sample image. Upon selecting one or more attributes from the second image, the user may modify at least one attribute extracted from the first sample image based on the selection. This enables a user to modify an image to include a desired attribute that is not inherent to the image without having to perform a search for images that include all of the desired attributes. In some embodiments, the user may submit a search query based on the modified image”, paragraph 22); and 
a memory coupled to the processor and configured to provide the processor with instructions (Fig. 8).

Regarding claim 25. Claim 25 is rejected for the same reasons and rationale as provided above for claim 1.

Claims 4 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Kerr as applied to claim 1 above, and further in view of US PG Pub 2019/0026956 to Gausebeck et al.

Regarding claim 4.  Kerr discloses wherein a training image of the set of training images is rendered using one or more models, is captured by an imaging or a scanning device, or is generated from one or more other existing images (“Some embodiments of the present invention are directed to modifying one or more specific attributes found in an image. To do so, a user may submit a first sample image comprising a number of attributes. The user may submit a second sample image that comprises at least one attribute that is different from the attributes of the first sample image. Upon selecting one or more attributes from the second image, the user may modify at least one attribute extracted from the first sample image based on the selection. This enables a user to modify an image to include a desired attribute that is not inherent to the image without having to perform a search for images that include all of the desired attributes. In some embodiments, the user may submit a search query based on the modified image”, paragraph 22).
Kerr does not disclose using three-dimensional models.  However, Gausebeck, in the same area of neural network imaging, discloses rendering from three-dimensional object models (“For example, such auxiliary input data can include information regarding capture position and orientation of the 2D image, information regarding capture parameters of the capture device that generated the 2D image (e.g., focal length, resolution, lens distortion, lighting, other image metadata, etc.), actual depth data associated with the 2D image captured by a 3D sensor (e.g., 3D capture hardware), depth data derived for a 2D image using stereo image processing, and the like”, paragraph 123).
Therefore, it would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to have modified Kerr’s computer vision techniques to extract attributes of images to include: using three-dimensional models.
It would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to have modified Kerr’s computer vision techniques to extract attributes of images by the teaching of Gausebeck for the following reasons: (a) techniques for generating 3D data for 2D images using affordable, user friendly devices and techniques for accurately and efficiently aligning the 2D images using the 3D data to generate immersive 3D environments are in high demand (paragraph 4, Gausebeck); and (b) to enable a user to modify an image to include a desired attribute that is not inherent to the image, as taught by Kerr at paragraph 7.

Regarding claim 18. The method of claim 1, Gausebeck discloses wherein a training image of the set of training images is labeled or tagged with ground truth data associated with generating the training image (“the training data development component 3316 can further employ the 3D space models included in the 3D model and alignment data 3304 to create synthetic “ground truth” 3D data from those reconstructed environments to match each 2D used to create the 3D space model (e.g., included in the indexed 2D image data 3306) as well as synthetic 2D images generated from perspectives of the 3D space model that were never actually captured from the actual environment by an actual camera”, paragraph 251).

Regarding claim 19. The method of claim 1, Gausebeck discloses wherein the output image comprises a photograph or a photorealistic rendering (“The 3D model generation component 118 can further remove objects photographed (e.g., walls, furniture, fixtures, etc.) from the 3D model, integrate new 2D and 3D graphical objects on or within the 3D model in spatially aligned positions relative to the 3D model, change the appearance of visual features of the 3D model (e.g., color, texture, etc.), and the like”, paragraph 77).


Claims 13 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Kerr as applied to claim 1 above, and further in view of US 5,526,446 to Adelson et al.

Regarding claim 13. The method of claim 1, Kerr does not disclose wherein at least the subset of the detected set of attributes is associated with noise.
However, Adelson, in the same area of image extraction, discloses an attribute associated with noise (“a technique is provided to remove noise from images and to enhance their visual appearance through the utilization of a technique which converts an image into a set of coefficients in a multi-scale image decomposition process, followed by modification of each coefficient based on its value and the value of coefficients of related orientation, position, or scale, which is in turn followed by a reconstruction or synthesis process to generate the enhanced image. Also contributing to the improved enhancement is a set of orientation tuned filters of a specialized design to permit steering, with the analysis and synthesis filters also having a self-inverting characteristic. Additionally, steerable pyramid architecture is used for image enhancement for the first time, with the steering being provided by the above orientation tuned filters. The utilization of related coefficients permits coefficient modification with multipliers derived through a statistical or neural-network analysis of coefficients derived through the utilization of clean and degraded images, with the modifiers corresponding to vectors which result in translating the degraded image coefficients into clean image coefficients, in essence by cancelling those portions of a coefficient due to noise. Further improvements include an overlay of classical coring on single coefficients. Thus, the subject technique provides improved image enhancement through the use of a multi-band or scale-oriented analysis and synthesis transform having improved coefficient modification, good orientation tuning, improved bandpass characteristics, and good spatial localization”, column 6, lines 39-67).
Therefore, it would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to have modified Kerr’s computer vision techniques to extract attributes of images to include: wherein attributes are associated with noise.
It would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to have modified Kerr’s computer vision techniques to extract attributes of images by the teaching of Adelson for the following reasons: (a) to provide improved image enhancement (column 6, lines 39-67, Adelson); and (b) to enable a user to modify an image to include a desired attribute that is not inherent to the image, as taught by Kerr at paragraph 7.

Regarding claim 14. The method of claim 13, Kerr does not disclose wherein the output image comprises a denoised version of the input image.
However, Adelson, in the same area of image extraction, discloses wherein the output image comprises a denoised version of the input image (“a technique is provided to remove noise from images and to enhance their visual appearance through the utilization of a technique which converts an image into a set of coefficients in a multi-scale image decomposition process, followed by modification of each coefficient based on its value and the value of coefficients of related orientation, position, or scale, which is in turn followed by a reconstruction or synthesis process to generate the enhanced image. Also contributing to the improved enhancement is a set of orientation tuned filters of a specialized design to permit steering, with the analysis and synthesis filters also having a self-inverting characteristic. Additionally, steerable pyramid architecture is used for image enhancement for the first time, with the steering being provided by the above orientation tuned filters. The utilization of related coefficients permits coefficient modification with multipliers derived through a statistical or neural-network analysis of coefficients derived through the utilization of clean and degraded images, with the modifiers corresponding to vectors which result in translating the degraded image coefficients into clean image coefficients, in essence by cancelling those portions of a coefficient due to noise. Further improvements include an overlay of classical coring on single coefficients. Thus, the subject technique provides improved image enhancement through the use of a multi-band or scale-oriented analysis and synthesis transform having improved coefficient modification, good orientation tuning, improved bandpass characteristics, and good spatial localization”, column 6, lines 39-67).

Claims 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Kerr as applied to claim 1 above, and further in view of US PG Pub 2019/0205946 to Mellina et al.

Regarding claim 20. The method of claim 1, Kerr does not disclose wherein the output image comprises a video frame. However, to those of ordinary skill in the art and in view of Mellina, such is an obvious variation (it is well known to those of ordinary skill in the art that a video comprises multiple photos presented as frames; further, as discussed below with respect to claim 22, Mellina, in the same area of image extraction and creation, discloses wherein the prescribed scene type is associated with an animation or a video sequence: “processes may be trained on any media type, including images, audio, video, and/or text, for example, and/or combinations, like audio accompanying a video”, paragraph 32).


Regarding claim 21. The method of claim 1, Kerr does not disclose wherein the prescribed scene type is associated with a retailer or a brand.  However, Mellina, in the same area of image extraction and creation, discloses wherein the prescribed scene type is associated with a retailer or a brand (“generative processes may be trained with respect to content, such as composition of elements comprising an online ad, semantic content (such as type of scene), color scheme, qualities of people appearing in an ad, text and/or logos appearing in an ad, and/or other brand-related elements”, paragraph 35).
Therefore, it would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to have modified Kerr’s computer vision techniques to extract attributes of images to include: wherein the prescribed scene type is associated with a retailer or a brand.
It would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to have modified Kerr’s computer vision techniques to extract attributes of images by the teaching of Mellina for the following reasons: (a) greater efficiency and/or customization regarding creating and/or presenting online ads (paragraph 3, Mellina); and (b) to enable a user to modify an image to include a desired attribute that is not inherent to the image, as taught by Kerr at paragraph 7.

Regarding claim 22. The method of claim 1, Kerr does not disclose wherein the prescribed scene type is associated with an animation or a video sequence.  
However, Mellina, in the same area of image extraction and creation, discloses wherein the prescribed scene type is associated with an animation or a video sequence (“processes may be trained on any media type, including images, audio, video, and/or text, for example, and/or combinations, like audio accompanying a video”, paragraph 32).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US PG Pub 2017/0097948 to Kerr et al. provides that, in various implementations, specific attributes found in images can be used in a visual-based search. Utilizing machine learning, deep neural networks, and other computer vision techniques, attributes of images, such as color, composition, font, style, and texture, can be extracted from a given image. A user can then select a specific attribute from a sample image the user is searching for, and the search can be refined to focus on that specific attribute from the sample image. In some embodiments, the search includes specific attributes from more than one image.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHRISTOPHER D. WAIT, Esq., whose telephone number is (571)270-5976. The examiner can normally be reached Monday-Friday, 9:30-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mohammad Ghayour, can be reached at 571-272-3021. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

CHRISTOPHER D. WAIT, Esq.
Primary Examiner
Art Unit 2672



/CHRISTOPHER WAIT/Primary Examiner, Art Unit 2672