DETAILED ACTION
This Action is in response to Applicant’s response filed on 10/25/2022. Claims 1-20 are still pending in the present application. This Action is made FINAL.

Response to Amendment
With respect to Claim Rejections - 35 USC § 112(b): The amended claims filed on 10/25/2022 overcome the claim rejections in the previous Office action.

Response to Arguments
Applicant's arguments have been considered but are moot in view of the new ground(s) of rejection in view of Jean et al. (U.S. 20170255840 A1; Jean).

Claim Objections
Claims 1, 8 and 15 are objected to because of the following informalities: 
In claim 1, line 11 recites "an organization-specific score" and line 14 recites "the organization-specific relevancy score". The Examiner believes that the term "relevancy" should be removed from line 14 to avoid any antecedent basis issues (35 USC 112(b)). Independent claims 8 and 15 are objected to for similar reasons.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Bakkali, Souhail, et al. (“Visual and textual deep feature fusion for document image classification”; Bakkali), in view of Jean et al. (U.S. 20170255840 A1; Jean).

Regarding claims 1, 8 and 15, Bakkali discloses a system (1. Introduction: “From a computer vision perspective, earlier studies that have been using deep neural networks for document analysis tasks focused on their structural similarity constraints and their visual features”) comprising:
obtaining a document file (Fig. 2: input document image) from a document collection of a particular organization; (Page 2395, 1. Introduction: “Amongst all classes of the RVL-CDIP dataset, some samples from specific categories present particular layout properties and document structures.” Here, “specific categories” is read as “a particular organization”.)
extracting one or more parts of the document file, the one or more parts including one or more image parts; (Fig.2 – OCR engine – Extract text and Document visual embedding) for each image part of the one or more image parts: 
feeding the image part (Fig. 2: extracted text) to a content decision engine (Fig. 2: cross-modal deep network), the content decision engine generating an organization-specific score (see Table 1: each of the categories “Advertisement, File Folder, and Handwritten” is read as an “organization-specific score”; see Table 4: “F1-score for Text Stream: 0.72- 0.78 – 0.85 Or F1-score for Image Stream: 0.83 -0.87- 0.92 – 0.92”) for the image part based on relevancy of the image part to the particular organization, (Pages 2399-2400, 5.5 Ablation Study: “As seen in Table. 1, the classification results of each class of the three word embedding procedures are very low concerning three main categories that are: Advertisement, File Folder, and Handwritten … Also, FastText managed to improve the performance and reduced the classification error by 4% where 31.13% of Advertisement, and 28.28% of Handwritten class documents are predicted as File Folder documents. ”)

[Image: media_image1.png]

[Image: media_image2.png]

the content decision engine (Fig. 2: cross-modal deep network) utilizing a first machine learned model (Fig. 2: OCR engine and text model; Page 2399, 5.2 Preprocessing of the experiment: “We utilized this OCR engine to conduct a fully automatic page segmentation, as the document images from the datasets are well oriented and relatively clean.”) trained by a first machine learned algorithm (Pages 2399-2400, 5.5 Ablation Study: “FastText managed to improve the performance and reduced the classification error by 4% where 31.13% of Advertisement, and 28.28% of Handwritten class documents are predicted as File Folder documents.” Here, “FastText” is read as the “first machine learned algorithm”) using a set of training data unique to the organization; (Page 2398, 4.2 Text features: “As well, we found 3,601,377 unique tokens, 24,109 of null word embeddings, and a dictionary size of 3,601,377 for FastText word embedding on the same standard dataset.”)
compressing the document file by including only image parts that have not been discarded in a compressed version of the document file. (Page 2398 – 4.3: Cross-modal features: “the effectiveness of the cross-modal features that are jointly learned from the image stream and text stream for the classification of document images. We adopt the late fusion process with two different methodologies, i.e. equal concatenation and average ensemble fusion”)

However, Bakkali does not disclose at least one hardware processor; and a non-transitory computer-readable medium storing instructions that, when executed by the at least one hardware processor, cause the at least one hardware processor to perform operations comprising:
	based on the organization-specific relevancy score for at least one image part of the one or more image parts, discarding the at least one image part;  

Jean discloses at least one hardware processor; (Fig.31: processor(s) 3110) and a non-transitory computer-readable medium storing instructions (Fig.31: memory 3111) that, when executed by the at least one hardware processor, (Paragraph 180: “ Memory 3111 may store data and instructions that configure the processor(s) 3110 to execute operations in accordance with the techniques described above.”) cause the at least one hardware processor to perform operations comprising:
	based on the organization-specific relevancy score (Fig.20, SIFT classifier 2030 or ORB classifier 2040 or WORD classifier 2045) for at least one image part of the one or more image parts, (Paragraph 132: “For each vector collection, we use their corresponding classifier to predict the set of class labels Y.sub.s, Y.sub.o, Y.sub.w, such as SIFT classifier 2030, ORB classifier 2040, and WORD classifier 2045 At SIFT-ORB ensemble 2035 we merge the predictions of SIFT and ORB to obtain the sorted ensemble list of candidates”) discarding the at least one image part; (Paragraph 168: “when all items from a subset of templates are classified and removed from the master batch, the next clustering should only provide new labels to continue classifying and emptying the master batch.”)
	It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “system and method of analyzing content of digital images” of Jean into the teaching of Bakkali in order to improve computational capabilities during feature extraction and matching, as well as to reduce processing time in the image analysis field.

Regarding claims 2, 9 and 16, Bakkali, as modified by Jean, discloses all of the claimed invention. Bakkali further discloses the content decision engine performs optical character recognition (OCR) of text in each image part of the one or more image parts and feeds the optically recognized characters to a word embedding machine learned model trained using domain-specific text in the set of training data unique to the particular organization. (Fig. 2, The proposed cross-modal deep network; 4. Cross-modal feature learning, Page 2397, columns 1-2: “In the first stream, we feed input document images to the backbone model. In the second stream, we extract the textual information from document images with an OCR engine, then we feed the text strings generated as input to the word embedding algorithm.”)

Regarding claims 3, 10 and 17, Bakkali, as modified by Jean, discloses all of the claimed invention. Bakkali further discloses the content decision engine further calculates a percentage of the text in each image part that comprises a currency or percentage value. (Table 1 and Table 4: the accuracy (%) in the text stream with different models).

Regarding claims 4, 11 and 18, Bakkali, as modified by Jean, discloses all of the claimed invention. Bakkali further discloses the content decision engine further includes a long-short term memory (LSTM)-based convolutional neural network (CNN) trained to determine whether the image part constitutes a stock chart. (4.2 Text features, Page 2397, column 2: “textual content is required to perform text classification, we process all document images with an off-the shelf optical character recognition (OCR) engine, i.e. TesseractOCR2. It is based on LSTM layers and includes a neural network subsystem configured in English as a text line recognizer.”; Fig. 3, Confusion Matrix of our best cross-modal network with the average ensembling fusion method).

Regarding claims 5, 12 and 19, Bakkali, as modified by Jean, discloses all of the claimed invention. Bakkali further discloses the content decision engine further includes a table identification component (Fig. 3, Confusion Matrix of our best cross-modal network with the average ensembling fusion method; 5.1 Data set, Page 2398, column 2: “The Ryerson Vision Lab Complex Document Information Processing (RVL-CDIP) dataset consists of 400,000 grayscale labeled document images in 16 classes (advertisement, budget, email, file folder, form, handwritten, invoice, letter, memo, news article, presentation, questionnaire, resume, scientific publication, scientific report, specification), with 25,000 images per class.”) including an optical character recognition (OCR) tool designed to identify two-dimensional structures in each image part of the one or more image parts. (Fig. 2, The proposed cross-modal deep network; 4.2 Text features, Page 2397, column 2: “textual content is required to perform text classification, we process all document images with an off-the shelf optical character recognition (OCR) engine, i.e. TesseractOCR2. It is based on LSTM layers and includes a neural network subsystem configured in English as a text line recognizer.”)

Regarding claims 6, 13 and 20, Bakkali, as modified by Jean, discloses all of the claimed invention. Bakkali further discloses the OCR tool includes a neural network that identifies bounding boxes of characters in each image part of the one or more image parts and constructs a sparse two-dimensional grid of some of the characters. (Fig. 2, The proposed cross-modal deep network; 4.2 Text features, Page 2397, column 2: “textual content is required to perform text classification, we process all document images with an off-the shelf optical character recognition (OCR) engine, i.e. TesseractOCR2. It is based on LSTM layers and includes a neural network subsystem configured in English as a text line recognizer.”)

Regarding claims 7 and 14, Bakkali, as modified by Jean, discloses all of the claimed invention. Bakkali further discloses the one or more image parts are a plurality of image parts and the operations further comprise, for each image part of the plurality of image parts: compressing the corresponding image part based on the organization-specific relevancy score for the corresponding image part; and wherein the compressing the document file further comprises including the compressed corresponding image part in the compressed document file. (Page 2398, 4.3 Cross-modal features: “the effectiveness of the cross-modal features that are jointly learned from the image stream and text stream for the classification of document images. We adopt the late fusion process with two different methodologies, i.e. equal concatenation and average ensemble fusion”)

Relevant Prior Art Directed to State of Art
Andrade et al. (U.S. 20180316571 A1), “Enhanced data collection and analysis facility”, teaches collecting and analyzing data to provide presentation paradigms for such data. It also describes a system and method for generating a classification model to determine predictive user behavior. The method may include obtaining data from a mobile network provider, the data including a plurality of utilization metrics pertaining to a plurality of mobile devices carrying out a plurality of network interactions, the plurality of mobile devices being associated with a plurality of users. The method may also include categorizing the data into a plurality of Internet domains associated with the data and determining a plurality of patterns in the data.
Ishiguro (U.S. 20020135790 A1), “Image Processing Apparatus, Image Forming Apparatus, And Image Processing Method”, teaches an image processing apparatus having a first area judgment unit for judging whether an inputted pixel is an edge pixel of a character image, a second area judgment unit for judging whether the pixel is in an edge area having an intensity variation level equal to or greater than an intensity variation level of a halftone image edge area, a natural image edge selection unit for specifying an edge pixel of a halftone image based on judgments of the first area judgment unit and second area judgment unit, a first correction unit for conducting edge enhancement processing on edge pixels of a character image, and a second correction processing unit for conducting sharpness enhancement processing on edge pixels of a halftone image.
Pandian et al. (U.S. 20050289182 A1), “Document Management System With Enhanced Intelligent Document Recognition Capabilities”, teaches methods and apparatus for document management, which capture image data from electronic document sources as diverse as facsimile images, scanned images, and other document management systems and provide, for example, indexed, accessible data in a standard format which can be easily integrated and reused throughout an organization or network-based system.
Williams, JR, et al. (U.S. 20150254555 A1), “Classifying data with deep learning neural records incrementally refined through expert input”, teaches classifying data using machine learning that may be incrementally refined based on expert input. Data is provided to a deep learning model that may be trained based on a plurality of classifiers and sets of training data and/or testing data. If the number of classification errors exceeds a defined threshold, classifiers may be modified based on data corresponding to observed classification errors. A fast learning model may be trained based on the modified classifiers, the data, and the data corresponding to the observed classification errors. Another confidence value may be generated and associated with the classification of the data by the fast learning model. Report information may be generated based on a comparison result of the confidence value associated with the fast learning model and the confidence value associated with the deep learning model.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Duy A Tran whose telephone number is (571)272-4887. The examiner can normally be reached Monday-Friday 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Edward F Urban, can be reached at (571) 272-7899. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DUY TRAN/ Examiner, Art Unit 2665

/BOBBAK SAFAIPOUR/ Primary Examiner, Art Unit 2665