DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 2, 3, 7, 18, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Wilensky et al. (US 2014/0081625) (hereafter Wilensky) in view of Parasnis et al. (US 2018/0367729) (hereafter Parasnis).
Regarding claim 1, Wilensky discloses a signal processing device (see paragraph [0027]), comprising: a signal collector configured to obtain an image to be processed (see image capture device 104 and paragraph [0027]: the image capture device 104 may be configured as part of the computing device) and collect an input signal (see paragraph [0033]);

an instruction converter configured to convert the signal into an image processing instruction according to a target signal instruction conversion model (see paragraph [0039]: The natural language processing module 116 may then employ both the gesture and the natural language input to initiate an image editing operation. Continuing with the above example, the natural language processing module 116 may identify the image editing operation from the gesture 206 and a subject of the image editing operation from a natural language input, e.g., generated from the audio data 118, manually input by a user, and so on. The natural language processing module 116 may also identify a subject and operation using the reverse in which a gesture identifies the subject and a natural language input specifies the operation. See also paragraphs [0049]-[0053] and [0046]: The gesture may then cause operation of an object identification module 402 to identify an object in the image 204 associated with the location of the tap, which may include identification of a boundary of the object in the image 204. The object identification module 402, for instance, may employ one or more facial recognition algorithms 404 to recognize a user in the image 204, such as the "Dad," "Son," and so on responsive to a tap on those portions of the image. By using the facial recognition algorithm 404, boundaries of these people may be determined and used to define a subject of an image editing operation. See Fig. 5, the taking audio or text data as input and perform
an image processor configured to edit the image to be processed according to the image processing instruction and a target image processing model to obtain a result image (see, paragraph [0059], [0365], Fig. 8, the image editing module, 112, and 906 in Fig. 9).
But Wilensky does not explicitly disclose a memory for storing images to be processed. However, in the same field of endeavor, Parasnis teaches, in Fig. 2, the storage device 112 for storing the digital image 110. See paragraph [0026]: the digital image 110 is illustrated as stored in storage 112, e.g., a computer-readable storage medium, database, and so forth.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Parasnis with Wilensky to perform the image editing operation on the stored image based on the signal instruction; the motivation is to yield predictable results.
Regarding claim 2, Wilensky further discloses the signal processing device, wherein the image processing instruction includes at least one of the following: an image editing area, an image editing method, and an image editing mode, wherein the image editing mode is a real-time editing mode, or a single image editing mode, or a multi-image editing mode (see paragraphs [0037], [0038], and [0370]: the user inputs an instruction and the image is changed immediately, which is interpreted as a real-time operation; see also Fig. 2, image editing module 112).
 	Regarding claim 3, Wilensky further discloses the signal processing device, wherein the image to be processed comprises content that is captured in real time by an image collecting device (paragraphs [0027], [0028]), or comprises at least one frame of an image (see, paragraphs [0032], [0033], [0104]) or a video stored in the memory.
Regarding claim 7, Wilensky further discloses the signal processing device, wherein the signal includes at least one of the following: a voice signal, an image signal, a text signal, and a 
Regarding claim 18, the combined teachings further disclose a machine learning operation device, comprising one or more signal processing devices of claim 1, wherein the machine learning operation device is configured to obtain data to be processed and control information from other processing devices, perform specified machine learning computations, and send execution results to peripheral devices through I/O interfaces; if the machine learning operation device includes multiple signal processing devices, the multiple signal processing devices transfer data between each other; wherein the data is transferred among the multiple signal processing devices via a PCIE bus, so as to support larger scale machine learning computations; the multiple signal processing devices share one control system or have separate control systems (see paragraph [0060]: The example computing device 702 as illustrated includes a processing system 704, one or more computer-readable media 706, and one or more I/O interface 708 that are communicatively coupled, one to another. Although not shown, the computing device 702 may further include a system bus or other data and command transfer system that couples the various components, one to another. A system bus can include any one or combination of different bus structures, such as a memory bus or memory controller, a peripheral bus, a universal serial bus, and/or a processor or local bus that utilizes any of a variety of bus architectures. A variety of other examples are also contemplated, such as control and data lines). The combined teachings do not explicitly disclose a PCIE bus; however, Official notice is taken that a PCIE bus is well known to one of ordinary skill in the art, because Peripheral Component Interconnect Express (PCIe or PCI-E) is a serial expansion bus standard for connecting a computer to one or more peripheral devices. PCIe provides lower latency and higher data transfer rates than parallel busses such as PCI and PCI-X. ... PCI Express slots on a motherboard.


7.	Claims 4 and 5 are rejected under 35 U.S.C. 103 as being unpatentable over Wilensky and Parasnis and further in view of Lee et al. (US 2017/0262959) (hereafter Lee).
Regarding claim 4, Wilensky further discloses the signal processing device, wherein the instruction converter includes: a first signal recognizer configured to convert the signal into text information through a signal recognition technology (see, Fig. 8, audio data, 118), wherein the signal recognition technology is at least one of the following: a voice recognition technology, a semantic understanding technology, an image recognition technology, and a natural language processing technology (see, Fig. 8, natural language processing module),
a signal text converter configured to convert the text information into an image processing method through the natural language processing technology and the target signal instruction conversion model (see Fig. 8, the speech-to-text engine 210; paragraph [0038]: The image editing module 112 is also illustrated as including audio data 118 that is processed by a speech-to-text engine 210 to form a natural language input; paragraph [0049]: FIG. 5 depicts a system 500 in an example implementation showing a natural language processing module 116 in greater detail. The natural language processing module 116 is illustrated as including a plurality of sub-
But Wilensky and Parasnis do not explicitly disclose a first image recognizer configured to divide the image to be processed into areas according to a granularity of a semantic area in the image processing instruction and the image recognition technology to obtain an image editing area.
However, in same field of endeavor, Lee teaches in paragraph [0056], an image segmentation module 386 and an image alignment module 384 may be included in the browsing service 380 to support the alignment of the images. The image segmentation module 386 may be configured to segment images into respective portions. The segmentation may be based on modeling information stored in a modeling data store 334. The image alignment module 384 may be configured to vertically and/or horizontally align the portions of images identified for presentation in respective display zones. The alignment may be based on modeling information stored in the modeling data store 334. In some implementations, the image alignment module 384 may be configured to align images which have been segmented by the image segmentation module 386. See, paragraph [0084] teaches once the neural network model is trained, unknown images may be processed to identify segments for the image. Other computer vision techniques that may be used for segmentation include object localization, semantic segmentation, edge detection, and the like. [0086] at block 705, a first image portion is generated from a first image. The first image portion may be generated by segmenting an image selected from a data store of images as discussed herein. At block 710, a second image portion is generated from a second image. As with the first image portion, the second image portion may be generated by segmenting a second image selected from a data store of images as discussed herein. Note that the semantic segmentation 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Lee with Wilensky and Parasnis, as a whole, to implement the semantic segmentation to divide the areas into a plurality of portions for browsing images; the motivation is to adjust the portions for the images.
Regarding claim 5, the combined teachings further disclose the signal processing device wherein the instruction converter includes: a second signal recognizer configured to convert the signal into the image editing method according to the signal recognition technology and the target signal instruction conversion model (see Wilensky, Fig. 8, the image editing module 112 and natural language processing module 116), and a second image recognizer configured to divide the image to be processed into areas according to a granularity of the semantic area in the image processing instruction and the image recognition technology to obtain the image editing area (Lee teaches in paragraph [0056], an image segmentation module 386 and an image alignment module 384 may be included in the browsing service 380 to support the alignment of the images. The image segmentation module 386 may be configured to segment images into respective portions. The segmentation may be based on modeling information stored in a modeling data store 334. The image alignment module 384 may be configured to vertically and/or horizontally align the portions of images identified for presentation in respective display zones. The alignment may be based on modeling information stored in the modeling data store 334. In some implementations, the image alignment module 384 may be configured to align images which have been segmented by the image segmentation module 386. See paragraph [0084], which teaches that once the neural network model is trained, unknown images may be processed to identify segments for the image. Other computer vision techniques that may be used for segmentation include object localization, semantic segmentation, edge detection, and the like. [0086] at block 705, a first image portion is generated

8.	Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Wilensky and Parasnis and further in view of Gupta et al. (US 2019/0102229) (hereafter Gupta).
Regarding claim 6, the combined teachings disclose the signal processing device, wherein the image processor includes a processing module configured to process the image editing area according to the image processing instruction and the target image processing model (Wilensky: image editing module 112, natural language processing module 116, and the speech-to-text engine 210 generating the text 802, which are image processing instructions). But the combined teachings do not explicitly disclose an instruction fetching module configured to obtain an image processing instruction in a preset time window. However, Gupta, in the same field of endeavor, teaches in paragraph [0046] that an execution cycle is the time period during which one instruction in a thread is fetched from memory and executed, so a particular cycle is interpreted to be the preset time window in which the instruction is fetched. Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Gupta with Wilensky and Parasnis to fetch an instruction during a time period to perform the image editing operation; the motivation is to perform image editing in a preset time window.


 	Regarding claim 8, the combined teachings do not explicitly disclose the signal processing device, wherein the target signal instruction conversion model is obtained by implementing adaptive training on a signal instruction conversion model, and wherein the target image processing model is obtained by implementing adaptive training on an image processing model. 
	However, in same field of endeavor, Tachibana teaches in fig. 4, the speech signal as input using the analyzer, 12 and training unit, 14 for training the speech signal using training controller, 140 and generate the synthesized speech signal which is target signal. Paragraph [0006], training apparatus includes an autoregressive model configured to estimate a current signal from a past signal sequence and a current context label. The autoregressive model includes a network structure capable of statistical data modeling. The training apparatus includes a vocal tract feature analyzer configured to analyze an input speech signal to determine a vocal tract filter coefficient representing a vocal tract feature, a residual signal generator configured to output a residual signal between a speech signal predicted based on the vocal tract filter coefficient and the input speech signal, a quantization unit configured to quantize the residual signal output from the residual signal generator to generate a quantized residual signal, and a training controller configured to provide as a condition, a context label of an already known input text for an input speech signal corresponding to the already known input text to the autoregressive model and to train the autoregressive model by bringing a past sequence of the quantized residual signals for the input speech signal and the current context label into correspondence with a current signal of the quantized residual signal. [0032] More specifically, analyzer 12 and training unit 14 are responsible for machine learning for constructing autoregressive model 16. Analyzer 12 and training unit 14 function as a training apparatus for the speech synthesis system and constructs autoregressive model 16. Details of 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Tachibana with Wilensky and Parasnis, as a whole, to perform a training method to output the desired speech signal based on the neural network model; the motivation is to generate the instruction conversion using a training method.
 	Regarding claim 9, the combined teachings further discloses the signal processing device wherein the instruction converter is configured to: convert the signal into a prediction instruction according to the signal instruction conversion model, determine a correlation coefficient between the prediction instruction and a corresponding instruction set of the prediction instruction, and optimize the signal instruction conversion model according to the correlation coefficient between the prediction instruction and the corresponding instruction set of the prediction instruction to obtain the target signal instruction conversion model (Tachibana teaches in fig. 4, the speech signal as input using the analyzer, 12 and training unit, 14 for training the speech signal using training controller, 140 and generate the synthesized speech signal which is target signal. paragraph [0006], training apparatus includes an autoregressive model configured to estimate a current signal from a past signal sequence and a current context label. The autoregressive model includes a network structure capable of statistical data modeling. The training apparatus includes a vocal tract feature analyzer configured to analyze an input speech signal to determine a vocal tract filter coefficient representing a vocal tract feature, a residual signal generator configured to output a residual signal between a speech signal predicted based on the vocal tract filter coefficient and the input speech signal, a quantization unit configured to quantize the residual signal output from the residual signal generator to generate a quantized residual signal, and a training controller configured to provide as 
 	Regarding claim 10, the combined teachings further discloses the signal processing device further comprising a trainer configured to: convert the signal into the prediction instruction according to the instruction conversion model, determine the correlation coefficient between the prediction instruction and the corresponding instruction set of the prediction instruction, optimize the signal instruction conversion model according to the correlation coefficient between the prediction instruction and the corresponding instruction set of the prediction instruction to obtain the target signal instruction conversion model (see, Tachibana teaches in fig. 4, the speech signal as input using the analyzer, 12 and training unit, 14 for training the speech signal using training controller, 140 and generate the synthesized speech signal which is target signal. paragraph [0006], training apparatus includes an autoregressive model configured to estimate a current signal from a past signal sequence and a current context label. The autoregressive model includes a network structure capable of statistical data modeling. The training apparatus includes a vocal tract feature analyzer configured to analyze an input speech signal to determine a vocal tract filter coefficient representing a vocal tract feature, a residual signal generator configured to output a residual signal between a speech signal predicted based on the vocal tract filter coefficient and the input speech 
process the image to be processed according to the image processing model to obtain a predicted image, determine a correlation coefficient between the predicted image and a corresponding target image of the predicted image, and optimize the image processing model according to the correlation coefficient between the predicted image and the corresponding target image of the predicted image to obtain the target image processing model (see, Parasnis, paragraphs [0046] and [0047], the machine learning module 410 includes a model training module 502 configured to generate models 504 using machine learning, e.g., through use of a neural network. The machine learning module 410 also includes a model use module 506 that is configured to use the models 504 to generate capture support data 118 as corresponding to the request data 406 extracted from the first communication 402. [0047] the model training module 502 may train the models 504 in a variety of ways. In one illustrated example, the modeling training module 502 receives training data from a content creation system 508 and/or a content sharing system 510. The content creation system 508 is configured to provide content creation 
 	Regarding claim 12, the combined teachings further discloses the signal processing device wherein the image processor is further configured to: process the image to be processed according to the image processing model to obtain a predicted image, determine a correlation coefficient between the predicted image and a corresponding target image of the predicted image, and optimize the image processing model according to the correlation coefficient between the predicted image and the corresponding target image of the predicted image to obtain the target image processing model (see, Parasnis, paragraphs [0046] and [0047], the machine learning module 410 includes a model training module 502 configured to generate models 504 using machine learning, e.g., through use of a neural network. The machine learning module 410 also includes a model use module 506 that is configured to use the models 504 to generate capture support data 118 as corresponding to the request data 406 extracted from the first communication 402. [0047] the model training module 502 may train the models 504 in a variety of ways. In one illustrated example, the modeling training module 502 receives training data from a content creation system 508 and/or a content sharing system 510. The content creation system 508 is configured to provide content creation functionality that is usable to edit digital images and/or raw image data, which is represented by the creation manager module 512).
Regarding claim 15, the combined teachings further disclose the signal processing device wherein the signal processing device is configured to: convert the voice signal into the prediction instruction according to the signal instruction conversion model, determine the correlation coefficient between the prediction instruction and the corresponding instruction set of the prediction instruction, and optimize the signal instruction conversion model according to the correlation coefficient between the prediction instruction and the corresponding instruction set of the prediction instruction to obtain the target signal instruction conversion model (see, Tachibana
Regarding claim 17, the combined teachings further disclose the signal processing device wherein the signal processing device is configured to: process the image to be processed according to the image processing model to obtain a predicted image, determine the correlation coefficient between the predicted image and the corresponding target image of the predicted image, and

Claim Rejections - 35 USC § 102
10.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


11.	Claims 24-26 and 30 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Wilensky.
Regarding claim 24, Wilensky discloses a signal processing method, comprising obtaining an image to be processed; collecting an input signal (see image capture device 104; paragraph [0027]: the image capture device 104 may be configured as part of the computing device; and paragraph [0033]);
converting the signal into an image processing instruction according to a target signal instruction conversion model (see paragraph [0039]: The natural language processing module 116 may then employ both the gesture and the natural language input to initiate an image editing operation. Continuing with the above example, the natural language processing module 116 may identify the image editing operation from the gesture 206 and a subject of the image editing operation from a natural language input, e.g., generated from the audio data 118, manually input by a user, and so on. The natural language processing module 116 may also identify a subject and operation using the reverse in which a gesture identifies the subject and a natural language input specifies the operation. See also paragraphs [0049]-[0053] and [0046]: The gesture may then cause operation of an object identification module 402 to identify an object in the image 204 associated with the location of the tap, which may include identification of a boundary of the object in the image 204. The object identification module 402, for instance, may employ one or more facial recognition algorithms 404 to recognize a user in the image 204, such as the "Dad," "Son," and so on responsive to a tap on those portions of the image. By using the facial recognition algorithm 404, boundaries of these people may be determined and used to define a subject of an image editing operation. See Fig. 5, taking audio or text data as input and performing natural language processing 116 to arrive at the image editing operation 520; and Fig. 8, the image editing module 112); and
editing the image to be processed according to the image processing instruction and a target image processing model to obtain a result image (see paragraphs [0059] and [0365]; Fig. 8, the image editing module 112; and 906 in Fig. 9).


Regarding claim 26, Wilensky further discloses the method of claim 24, wherein the image to be processed comprises content that is captured in real time by an image obtaining device (paragraphs [0027], [0028]), or comprises at least one frame of an image (see paragraphs [0032], [0033], [0104]) or a video stored from a memory or a cache.
Regarding claim 30, Wilensky further discloses the signal processing device, wherein the signal includes at least one of the following: a voice signal, an image signal, a text signal, and a sensor signal (see, Fig. 5, audio input, 504, text, 506, Fig. 8, audio data, 118 and speech to text engine, 210).

12.	Claims 27 and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Wilensky in view of Lee.
Regarding claim 27, Wilensky further discloses the signal processing device of claim 2, wherein the instruction converter includes: 
a first signal recognizer configured to convert the signal into text information through a signal recognition technology (see, Fig. 8, audio data, 118), wherein the signal recognition technology is at least one of the following: a voice recognition technology, a semantic understanding technology, an image recognition technology, and a natural language processing technology (see, Fig. 8, natural language processing module),

But Wilensky does not explicitly disclose a first image recognizer configured to divide the image to be processed into areas according to a granularity of a semantic area in the image processing instruction and the image recognition technology to obtain an image editing area.
However, in same field of endeavor, Lee teaches in paragraph [0056], an image segmentation module 386 and an image alignment module 384 may be included in the browsing service 380 to support the alignment of the images. The image segmentation module 386 may be configured to segment images into respective portions. The segmentation may be based on modeling information stored in a modeling data store 334. The image alignment module 384 may be configured to vertically and/or horizontally align the portions of images identified for presentation in respective display zones. The alignment may be based on modeling information stored in the modeling data store 334. In some implementations, the image alignment module 384 may be configured to align images which have been segmented by the image segmentation module 386. See, paragraph [0084] teaches once the neural network model is trained, unknown images may be processed to identify segments for the image. Other computer vision techniques that may 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Lee with Wilensky and Parasnis, as a whole, so as to implement the semantic segmentation to divide the areas into a plurality of portions for browsing images; the motivation is to adjust the portions for the images.
Regarding claim 28, the combined teachings further disclose the signal processing device wherein the instruction converter includes: a second signal recognizer configured to convert the signal into the image editing method according to the signal recognition technology and the target signal instruction conversion model (see Wilensky, Fig. 8, the image editing module 112 and natural language processing module 116), and a second image recognizer configured to divide the image to be processed into areas according to a granularity of the semantic area in the image processing instruction and the image recognition technology to obtain the image editing area (Lee teaches in paragraph [0056], an image segmentation module 386 and an image alignment module 384 may be included in the browsing service 380 to support the alignment of the images. The image segmentation module 386 may be configured to segment images into respective portions. The segmentation may be based on modeling information stored in a modeling data store 334. The image alignment module 384 may be configured to vertically and/or horizontally align the portions of images identified for presentation in respective display zones. The alignment may be based on

13.	Claim 29 is rejected under 35 U.S.C. 103 as being unpatentable over Wilensky et al. (US 2014/0081625) (hereafter Wilensky) in view of Gupta et al. (US 2019/0102229) (hereafter Gupta).
Regarding claim 29, Wilensky further discloses the signal processing device of claim 3, wherein the image processor includes a processing module configured to process the image editing area according to the image processing instruction and the target image processing model (Wilensky: image editing module 112, natural language processing module 116, and the speech-to-text engine 210 generating the text 802, which are image processing instructions). But Wilensky does not explicitly disclose an instruction fetching module configured to obtain an image processing instruction in a preset time window. However, Gupta, in the same field of endeavor, teaches in paragraph [0046] that an execution cycle is the time period during which one instruction in a thread is fetched from memory and executed, so a particular cycle is interpreted to be the preset time window in which the instruction is fetched. Therefore, it would have been obvious to one of ordinary skill in the art to combine the

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DHAVAL V PATEL whose telephone number is (571)270-1818. The examiner can normally be reached Monday to Friday (8:00am-4:30pm).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Shuwang Liu, can be reached at 571-272-3036. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DHAVAL V PATEL/
Primary Examiner, Art Unit 2631
2/8/2022