EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

Authorization for this examiner’s amendment was given in a telephone interview with Mr. Nathaniel Lucek, Reg. # 60,766 on 09/10/2021.

Pursuant to MPEP 606.01, the title has been changed to read:
-- SEMICONDUCTOR INSPECTION AND METROLOGY SYSTEMS FOR DISTRIBUTING JOB AMONG THE CPUs OR GPUs BASED ON LOGICAL IMAGE PROCESSING BOUNDARIES --

This listing of claims will replace all prior versions of claims:

1-20. (Cancelled)

21. (New) A system for scalable and flexible job distribution having a plurality of worker nodes coupled to a master node comprising: 
	the master node having at least one processor to run a master job manager, wherein the master job manager is configured to:

		divide the input image data into at least a first job; and
		distribute the first job to a first CPU worker node of the plurality of worker nodes;
	the plurality of worker nodes including:
		the first CPU worker node coupled with the master node, wherein the first CPU worker node includes two or more CPUs and two or more GPUs, wherein one of the CPUs in the first CPU worker node runs a worker job manager for the first CPU worker node, wherein the worker job manager includes a module with a deep learning model;
		a second CPU worker node coupled with the master node that includes one or more CPUs without any GPU; and
		at least one GPU worker node coupled with the master node that includes one or more GPUs without any CPU other than to run the worker job manager for the GPU worker node;
	wherein the worker job manager of the first CPU worker node is configured to divide the first job into a plurality of tasks to be processed by the two or more CPUs or the two or more GPUs based on logical image processing boundaries, wherein the plurality of tasks include defect detection and defect classification;
	wherein the deep learning model of the first CPU worker node is configured to:

	wherein the worker job manager of the first CPU worker node is further configured to: 
	dispatch the CPU-bound tasks for CPU-bound algorithm processes and the GPU-bound tasks for GPU-bound algorithm processes or for a GPU job manager; and
	wherein the GPU job manager of the first CPU worker node is configured to:
		queue in an input queue the input image data of at least some of the GPU-bound tasks for the GPU job manager to prioritize ahead of a later GPU-bound task; and
		distribute the input image data of the at least some of the GPU-bound tasks in equal batches to the two or more GPUs of the first CPU worker node for processing such that completion time of the plurality of tasks is minimized.

22. (New) The system of claim 21, wherein there are more of the CPU than the GPU in one of the worker nodes.

23. (New) The system of claim 21, wherein there are more of the GPU than the CPU in one of the worker nodes.


25. (New) The system of claim 21, wherein one of the CPU in the second CPU worker node runs the worker job manager for the second CPU worker node.

26. (New) The system of claim 21, further comprising an interface layer configured to communicate with an integrated memory controller (IMC) client using an application programming interface.

27. (New) The system of claim 21, further comprising a neural network to execute the deep learning model.

28. (New) A method for scalable and flexible job distribution having a plurality of worker nodes coupled to a master node comprising:
	receiving input image data from a semiconductor inspection tool or a semiconductor metrology tool at the master node, wherein the input image data is of a semiconductor wafer or reticle, and wherein the master node has a master job manager; 
	dividing the input image data into at least a first job using the master job manager;

	the first CPU worker node that includes two or more CPUs and two or more GPUs, wherein one of the CPUs in the first CPU worker node runs a worker job manager for the first CPU worker node, wherein the worker job manager includes a module with a deep learning model;
	a second CPU worker node coupled with the master node that includes one or more CPUs without any GPU; and
	at least one GPU worker node coupled with the master node that includes one or more GPUs without any CPU other than to run the worker job manager for the GPU worker node;
	dividing, using the worker job manager of the first CPU worker node, the first job into a plurality of tasks to be processed by the two or more CPUs or the two or more GPUs based on logical image processing boundaries, wherein the plurality of tasks include defect detection and defect classification;
	determining, using the deep learning model of the first CPU worker node, whether each of the tasks is a CPU-bound task or a GPU-bound task based on the logical image processing boundaries, wherein the CPU-bound task is assigned to one of the two or more CPUs in the first CPU worker node instead of to one of the two or more GPUs in the first CPU worker node, and wherein the GPU-bound task is assigned to one of the two or more GPUs in the first CPU worker node instead of to one of the two or more CPUs in the first CPU worker node;

	queuing, using the GPU job manager of the first CPU worker node, in an input queue the input image data of at least some of the GPU-bound tasks for the GPU job manager to prioritize ahead of a later GPU-bound task; and
	distributing, using the GPU job manager of the first CPU worker node, the input image data of the at least some of the GPU-bound tasks in equal batches to the two or more GPUs of the first CPU worker node for processing such that completion time of the plurality of tasks is minimized.

29. (New) The method of claim 28, wherein the method further comprises retraining the deep learning model.

30. (New) The method of claim 28, wherein the worker job managers operate under a first in first out job queue.

31. (New) The method of claim 28, wherein the input image data is from multiple wafer locations, and wherein the input image data is processed in a same batch.

32. (New) The method of claim 28, wherein the first job is distributed to the first CPU worker node in parallel and in real-time with other jobs from the input image data distributed to the plurality of worker nodes.
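
For illustration only, and forming no part of the claims: the queuing and equal-batch distribution steps recited for the GPU job manager in claims 21 and 28 can be pictured with a minimal Python sketch. The function names, the tile identifiers, and the round-robin assignment are all assumptions chosen for the example; the record does not specify how batches are formed or how priority is implemented.

```python
from collections import deque
from typing import Deque, List, Sequence


def enqueue_gpu_tasks(queue: Deque[str], tile_ids: Sequence[str]) -> None:
    """Append GPU-bound image tiles in arrival order (first in, first out)."""
    queue.extend(tile_ids)


def distribute_equal_batches(queue: Deque[str], num_gpus: int) -> List[List[str]]:
    """Drain the queue into num_gpus batches whose sizes differ by at most one.

    Round-robin assignment is an assumption made for this sketch; earlier
    queued tiles land ahead of later ones within each batch.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    batches: List[List[str]] = [[] for _ in range(num_gpus)]
    index = 0
    while queue:
        batches[index % num_gpus].append(queue.popleft())
        index += 1
    return batches


# Hypothetical usage: two groups of tiles arrive, then are split across 2 GPUs.
gpu_queue: Deque[str] = deque()
enqueue_gpu_tasks(gpu_queue, ["tile0", "tile1", "tile2", "tile3"])
enqueue_gpu_tasks(gpu_queue, ["tile4", "tile5"])
batches = distribute_equal_batches(gpu_queue, 2)
print(batches)  # [['tile0', 'tile2', 'tile4'], ['tile1', 'tile3', 'tile5']]
```

In this sketch the earlier-queued tiles are dispatched ahead of the later ones, and each GPU receives the same number of tiles whenever the queue length is a multiple of the GPU count.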



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZUJIA XU, whose telephone number is (571) 272-0954. The examiner can normally be reached on M-F 9:00-5:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Meng-Ai An, can be reached on (571) 272-3756. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access 

/MENG AI T AN/Supervisory Patent Examiner, Art Unit 2195




/Z.X./Examiner, Art Unit 2195