DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-30 are pending and have been examined.
The present application was filed on 03/20/2018 and claims priority to patent application 62/601,370 (filed on 03/20/2017).
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.

As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:

(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
Claim 1
… a logical ruleset generation engine configured to perform …
Claim 4
… the classification engine is configured to determine …
… the classification engine is configured to … generate …
Claim 6
… the classification engine is configured to determine …
… the classification engine is configured to … represent …
Claim 7
… the logical ruleset generation engine is configured to rank …
… the logical ruleset generation engine is configured to … select … 
Claim 10
… the logical ruleset generation engine is configured to weight …
Claim 11
… the logical ruleset generation engine is configured to determine …
… the logical ruleset generation engine is configured to … remove …
Claim 13
… the extraction engine is configured to compare …
Claim 15
… the logical ruleset generation engine is configured to receive …
… the logical ruleset generation engine is configured to … determine…
… the logical ruleset generation engine is configured to … segment …
… the logical ruleset generation engine is configured to … generate…
Claim 16
… the logical ruleset generation engine is configured to determine …
… the logical ruleset generation engine is configured to … combine …
Claim 23
… a logical ruleset generation engine configured to perform …
Claim 24
… the classification engine configured to determine…
… the classification engine configured to … generate …
Claim 25
… the classification engine configured to … determine …
Upon a review of the disclosure, Specification [0037] provides the following: “FIG. 1 shows an example data processing system 100. The data processing system 100 preprocesses data for a machine learning The data processing system 100 include one or more processing devices for logical ruleset pipeline processing 50. The data processing system 100 includes an input port 110, an extraction engine 120, a logical ruleset generation engine 130, classification engine 140, and a shared memory data store 160. In some implementations, the data processing system 100 includes a library of data signatures 150. The data processing system 100 is receives, at the logical ruleset pipeline processing processors 50, one or more data items 105a, 105b, 105c and 105d. The processors 50 of the data processing system process the data and execute the extraction engine 120, the logical ruleset generation engine 130, and the classification engine 140.”
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. 
Claim Rejections - 35 USC § 103
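In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.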
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 19, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in further view of Alexander et al. (US 5467459 A).
Regarding Claim 1,
Akgül et al. teaches a data processing system (p. 209, Introduction, paragraph 4, “Another emerging technique that may assist radiology interpretation is content-based image retrieval (CBIR) …” and p. 209, Introduction, paragraph 5, “At present there is a substantial gap between CBIR, and its focus on raw image information, and decision support systems, which typically enter the workflow beyond the point of image analysis itself. This gap represents what we believe is a major opportunity to develop decision support systems that integrate image features exploited in CBIR systems …”  teaches a content based image retrieval system [data processing system]) 
configured to pre-process data for a machine learning classifier (p. 215, Similarity Measures, paragraph 4, “ … Statistical classifiers need to be pretrained.  The membership of the query image to each class is usually used as a feature set representing the image content. However, classifiers can also be used as a preprocessing step to narrow the search space in CBIR systems …” teaches pre-processing data for classifiers [machine learning classifier]), the data processing system comprising:
… an extraction engine that extracts, from a data item of the one or more data items written to the shared memory data store, a plurality of data signatures and structure data representing relationships among the data signatures (p. 213, Image Descriptors, paragraph 3, “ … the system … addresses the retrieval of pathology bearing regions (PBRs) in lung CT images by a human-in-the-loop approach … the system lets the user delineate PBRs in lung images … The system describes the images by the so-called attributional relational graphs (ARGs). In an ARG, the image regions are segregated by graph nodes and their spatial relationships are represented by edges between these nodes … Both nodes and edges are labeled by attributes corresponding to the properties of these regions and their relationships, respectively.  The attributes can be derived from any type of the visual cues described above, e.g., texture properties, global geometric features of regions (e.g., size, roundness), or features specified in some transform domain (e.g., Fourier coefficients of the extracted region boundary) … ” teaches extraction of attributes into a graph comprising nodes [a plurality of data signatures] and edges [structure data representing relationships among the data signatures]);
a logical ruleset generation engine configured to perform operations comprising: generating a data structure from the plurality of data signatures, wherein the data structure includes a plurality of nodes connected with edges, each node in the data structure represents a data signature, and wherein each edge specifies a relationship between a first node and a second node, with the specified relationship corresponding to a relationship represented in the structure data; selecting a particular data signature of the data structure (p. 213, Image Descriptors, paragraph 3, “ … the system … addresses the retrieval of pathology bearing regions (PBRs) in lung CT images by a human-in-the-loop approach … the system lets the user delineate PBRs in lung images … The system describes the images by the so-called attributional relational graphs (ARGs). In an ARG, the image regions are segregated by graph nodes and their spatial relationships are represented by edges between these nodes … Both nodes and edges are labeled by attributes corresponding to the properties of these regions and their relationships, respectively.  The attributes can be derived from any type of the visual cues described above, e.g., texture properties, global geometric features of regions (e.g., size, roundness), or features specified in some transform domain (e.g., Fourier coefficients of the extracted region boundary) … ” teaches a graph [data structure], which includes graph nodes [a plurality of nodes] and spatial relationships between graph nodes [edges], and teaches deriving a texture property [selecting a particular data signature] of the pathology bearing region of the lung image in the attributional relational graph [data structure]);
for the particular data signature of the data structure that is selected, identifying each instance of the particular data signature in the data structure (p. 210, Image Features/Descriptors, paragraph 3, “ … In the medical domain, texture-based descriptors become particularly important as they can potentially reflect the fine details contained within an image structure. For example, cysts and solid nodules generally have uniform internal density and signal intensity characteristics, while more complex lesions and infiltrative disorders have heterogeneous characteristics … ”  teaches identifying cysts or solid nodules [each instance] of the texture-based feature [particular data signature]); and
segmenting the data structure around instances of the particular data signature (p. 213, Image Descriptors, paragraph 3, “ … the system … addresses the retrieval of pathology bearing regions (PBRs) in lung CT images by a human-in-the-loop approach … the system lets the user delineate PBRs in lung images … The system describes the images by the so-called attributional relational graphs (ARGs). In an ARG, the image regions are segregated by graph nodes and their spatial relationships are represented by edges between these nodes … Both nodes and edges are labeled by attributes corresponding to the properties of these regions and their relationships, respectively.  The attributes can be derived from any type of the visual cues described above, e.g., texture properties, global geometric features of regions (e.g., size, roundness), or features specified in some transform domain (e.g., Fourier coefficients of the extracted region boundary) … ” and
p. 216, Segmentation, paragraph 1, “Segmentation is a key preprocessing step in CBIR systems that describe the image content through regions of interest.  The goal is to identify the semantically meaningful regions/objects within an image … ” teaches identifying the semantically meaningful regions/objects [instances] of the image corresponding to a particular attribute [particular data signature] of the image).
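Akgül et al. does not appear to explicitly teach … an input port that receives one or more data items; a shared memory data store that stores the one or more data items, with each of the one or more data items being written to the shared memory data store; … identifying, based on the segmenting, one or more sequences of data signatures connected to the particular data signature, each of the one or more sequences identified being different from other identified sequences of data signatures connected to the particular data signature in the data structure; generating a logical ruleset, wherein each logical rule of the logical ruleset is a sequence of data signatures of the one or more sequences of data signatures that are identified; and a classification engine that receives the logical ruleset as an input and executes one or more classifiers against the logical ruleset to classify the one or more data items received by the input port, wherein one or more additional logical rules for the logical ruleset are generated based on the executing.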
Uchida teaches … identifying, based on the segmenting, one or more sequences of data signatures connected to the particular data signature, each of the one or more sequences identified being different from other identified sequences of data signatures connected to the particular data signature in the data structure (p. 523, Introduction, paragraph 4, “This paper introduces several basic image processing and image pattern recognition techniques, which will be useful for analyzing bioimages automatically by computer …” and p. 533, Clustering, paragraph 2, “Clustering can be used for image segmentation. Clustering-based image segmentation begins by representing an M × N image as a set of M × N vectors, P = {(x, y, I(x,y))}, where x and y represent the location of each pixel and I(x,y) represent some feature vector of the pixel.  A typical example of I(x,y) is an RGB color vector.  By applying some clustering method to the set P, the set is partitioned into groups having not only similar locations but also similar feature vectors, that is, an image segmentation result.  A key point of this method is to represent each pixel with its location, x and y.  Without the location, pixels having similar features are grouped regardless of their location” teaches identifying disjoint groups of similar M × N feature vectors [sequence of data signatures] within an image [data structure], wherein the feature vector is an RGB color vector [particular data signature]); and
generating a logical ruleset, wherein each logical rule of the logical ruleset is a sequence of data signatures of the one or more sequences of data signatures that are identified (p. 523, Introduction, paragraph 4, “This paper introduces several basic image processing and image pattern recognition techniques, which will be useful for analyzing bioimages automatically by computer …” and p. 533, Clustering, paragraph 2, “Clustering can be used for image segmentation. Clustering-based image segmentation begins by representing an M × N image as a set of M × N vectors, P = {(x, y, I(x,y))}, where x and y represent the location of each pixel and I(x,y) represent some feature vector of the pixel.  A typical example of I(x,y) is an RGB color vector.  By applying some clustering method to the set P, the set is partitioned into groups having not only similar locations but also similar feature vectors, that is, an image segmentation result.  A key point of this method is to represent each pixel with its location, x and y.  Without the location, pixels having similar features are grouped regardless of their location” teaches representing an M × N image as a set of M × N feature vectors [logical ruleset], wherein this set of M × N feature vectors [logical ruleset] is partitioned into groups having similar locations and similar feature vectors, wherein each group represents a logical rule); and
a classification engine that receives the logical ruleset as an input and executes one or more classifiers against the logical ruleset to classify the one or more data items received by the input port (p. 542, Image pattern recognition, paragraphs 1-3, “ … Diagnosis of an embryo, or a single cell, or a subcellular organelle using its imaging result is also an image pattern recognition problem. For the simplest diagnosis, it is reduced to a two-class recognition problem, that is, normal or abnormal … Image pattern recognition is comprised of two modules: feature extraction and classification.  Feature extraction is the module to convert an input image as a set of values, that is, a vector … Classification is the module to classify the input feature vector into a class according to some rule, called a classifier. A classifier is trained automatically using patterns whose class is known. This training mechanism of the classifier is called machine learning and feature vectors for training are called training patterns. The class label attached to each training pattern is called the ground-truth. We need to train a classifier to classify the training patterns correctly…” teaches training a classifier [executing one or more classifiers] using feature vectors for training [logical ruleset], wherein the feature vectors for training represent radiological images [data items]), 
wherein one or more additional logical rules for the logical ruleset are generated based on the executing (p. 542, Image pattern recognition, paragraph 3, “Classification is the module to classify the input feature vector into a class according to some rule, called a classifier. A classifier is trained automatically using patterns whose class is known. This training mechanism of the classifier is called machine learning and feature vectors for training are called training patterns. The class label attached to each training pattern is called the ground-truth. We need to train a classifier to classify the training patterns correctly. The patterns subjected to the trained classifier are called test patterns.  The test patterns are assumed to be “unseen” patterns and thus we do not know their correct label … Usually the test pattern set and the training pattern set should be independent” teaches the test patterns [additional logical rules] being available upon successfully training the classifier [generated based on the executing]).
Akgül et al. and Uchida are considered analogous art because they are directed to accurate interpretation of complex information in medical images.
In view of the teachings of Akgül et al., it would have been obvious for a person of ordinary skill in the art to apply the teachings of Uchida at the time the application was filed in order to analyze biological images automatically or semi-automatically on a computer, thus allowing biologists to manage a larger number of biological images and exclude subjective observation (Uchida, p. 523, Introduction, paragraphs 2-3, “ … it is difficult to analyze a huge number of bioimages and have reliable analysis results. Furthermore, we need to be careful that the analysis result by the manual inspection may be biased by subjective observation. That is, the analysis result will depend largely on personal skill, decision, and preference.  On the other hand, research on “bioimage-informatics” is becoming active …  If the techniques are accurate enough, their analysis results will be more accurate and reliable than those by manual inspection …”).
Alexander et al. further teaches … an input port that receives one or more data items (col. 6, lines 66-67 & col 7, lines 1-6 “With reference to FIG. 1, an image and graphics processing system 10 in accordance with the present invention consists of a parallel vector processor 12 for image processing, a shared memory 14, a set of high speed buses 16, and a graphics subsystem 18. Each of these units incorporates addressing schemes and/or multiprocessor control schemes that support the high speed imaging and graphics processing of the imaging and graphics processing system …” teaches an image processing system comprising a parallel vector processor and a shared memory; 
col 7, lines 10-11 “The parallel vector processing unit comprises a number of vector processors connected in parallel to the buses 16”  and col 8, lines 13-22 “The parallel vector processing unit 12 is the primary computation engine in the system, and is used mainly for imaging and general mathematical computations. With reference to FIG. 2, a single vector processor 20 comprises two floating point units (FPUs) 46, a set of scalar register files 48 and vector register files 50, a control ASIC52 for control and instruction issuance, a pixel formatter unit (PFU) 54 for pixel handling, an instruction and data cache 56, and a bus interface unit (BIU) 58 for interface to the high speed buses 16” teaches the parallel vector 
col 8, lines 66-67 “ … Each vector register file has a separate read and write port …” teaches the vector register files each comprising a read port [input port] for receiving pixel data [data items]); and
a shared memory data store that stores the one or more data items, with each of the one or more data items being written to the shared memory data store (col 9, lines 9-18 “In the present system, data conversion between floating point and integer values, referred to as pixel formatting, is carried out by a special pixel formatter unit (PFU) 54. The unit is implemented with a field programmable gate array. In general, image pixel data include 8-bit or 16-bit packed unsigned integer values, whereas image processing is performed in floating point for accuracy. Also, computation results, which are in floating point, are preferably converted to 8-bit or 16-bit packed integer values before transfer to the shared memory …” teaches shared memory [data store] that stores processed pixel data [data items] in the form of 8-bit and 16-bit packed unsigned integer values, wherein the processed pixel data [data items] are written to the shared memory by the pixel formatter unit).
Regarding Claim 19,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. further teaches wherein the data item comprises an image (p. 209, Introduction, paragraph 4, “Another emerging technique that may assist radiology interpretation is content-based image retrieval (CBIR) …” and p. 209, Introduction, paragraph 5, “At present there is a substantial gap between CBIR, and its focus on raw image information, and decision support systems, which typically enter the workflow beyond the point of image analysis itself. This gap represents what we believe is a major opportunity to develop decision support systems that integrate image features exploited in CBIR systems … ”  teaches a content based image retrieval system [wherein the data item comprises an image]),
wherein extracting comprises performing an image processing process on the image (p. 216, Segmentation, paragraph 1, “Segmentation is a key preprocessing step in CBIR systems that describe the image content through regions of interest.  The goal is to identify the semantically meaningful regions/objects within an image … ” teaches image segmentation [an image processing process on the image]),
and wherein at least one of the plurality of data signatures comprises a visual feature of the image (p. 210, Image Features/Descriptors, paragraph 1, “Image features/descriptors are derived from visual cues contained in an image. They are represented as alpha-numeric data in different formats such as vectors or graphs, which stand as compact surrogates for the visual content. One can distinguish two types of visual features. Photometric features exploit color and texture cues and they are derived directly from raw pixel intensities. Geometric features, on the other hand, make use of shape-based cues …” teaches photometric and geometric features [visual feature of the image]).
Regarding Claim 23,
Akgül et al. teaches a data processing system (p. 209, Introduction, paragraph 4, “Another emerging technique that may assist radiology interpretation is content-based image retrieval (CBIR) …” and p. 209, Introduction, paragraph 5, “At present there is a substantial gap between CBIR, and its focus on raw image information, and decision support systems, which typically enter the workflow beyond the point of image analysis itself. This gap represents what we believe is a major opportunity to develop decision support systems that integrate image features exploited in CBIR systems … ”  teaches a content based image retrieval system [data processing system]) 
configured to pre-process data for a machine learning classifier (p. 215, Similarity Measures, paragraph 4, “ … Statistical classifiers need to be pretrained.  The membership of the query image to each class is usually used as a feature set representing the image content. However, classifiers can also be used as a preprocessing step to narrow the search space in CBIR systems …” teaches pre-processing data for statistical classifiers [machine learning classifier]), the data processing system comprising:
	… an extraction engine that extracts image data representing a biological structure by image processing one of the one or more radiological images (p.215, Image Features/Descriptors, paragraph 4, “We use the term shape to refer to the information that can be deduced directly from images and that cannot be represented by color or texture; as such, shape defines a complementary space to color and texture. A powerful way of representing shape is through perceptually grouped geometric cues such as edges, contours, joints, polylines, and polygonal regions extracted from an image … ” teaches extraction of geometric cues [image data] represented by joints [biological structure]),
	a logical ruleset generation engine configured to perform operations comprising:
identifying one or more portions of the biological structure each having a biological signature based on comparing the biological structure to a library of specified biological signatures (p. 210, Current CBIR Technology, paragraph 2 “ … In a typical CBIR system, database images are returned and displayed in decreasing order of their computed similarity to a query image provided by the user. Thus basically, whatever the application domain is, a CBIR system must provide a means for (a) describing and recording the image content based on pixel/voxel information (image features/descriptors) and (b) assessing the similarity between the query image and the images in the database” teaches comparing a query image [one or more portions of the biological structure each having a biological signature] to a database of images [a library of specified biological signatures] in order to determine the information contained in the image [identifying one or more portions of the biological structure]);
	generating a data structure from the biological structure, wherein the data structure includes a plurality of nodes connected with edges, each node in the data structure represents one of the biological signatures, and wherein each edge specifies a relationship between a first node and a second node; selecting a particular biological signature of the biological structure in the data structure (p. 213, Image Descriptors, paragraph 3, “ … the system … addresses the retrieval of pathology bearing regions (PBRs) in lung CT images by a human-in-the-loop approach … the system lets the user delineate PBRs in lung images … The system describes the images by the so-called attributional relational graphs (ARGs). In an ARG, the image regions are segregated by graph nodes and their spatial relationships are represented by edges between these nodes … Both nodes and edges are labeled by attributes corresponding to the properties of these regions and their relationships, respectively.  The attributes can be derived from any type of the visual cues described above, e.g., texture properties, global geometric features of regions (e.g., size, roundness), or features specified in some transform domain (e.g., Fourier coefficients of the extracted region boundary) … ” teaches the lung image [biological structure] represented by a graph [data structure], which includes graph nodes [a plurality of nodes] and spatial relationships between graph nodes [edges], and teaches deriving a texture property [selecting a particular biological signature] of the pathology bearing region of the lung image in the attributional relational graph [data structure]);

for the particular biological signature that is selected, identifying each instance of the particular biological signature in the data structure (p. 210, Image Features/Descriptors, paragraph 3, “ … In the medical domain, texture-based descriptors become particularly important as they can potentially reflect the fine details contained within an image structure. For example, cysts and solid nodules generally have uniform internal density and signal intensity characteristics, while more complex lesions and infiltrative disorders have heterogeneous characteristics … ”  teaches identifying cysts or solid nodules [instances] of the texture-based feature [particular biological signature]); and 
… segmenting the data structure around instances of the particular biological signature (p. 213, Image Descriptors, paragraph 3, “ … the system … addresses the retrieval of pathology bearing regions (PBRs) in lung CT images by a human-in-the-loop approach … the system lets the user delineate PBRs in lung images … The system describes the images by the so-called attributional relational graphs (ARGs). In an ARG, the image regions are segregated by graph nodes and their spatial relationships are represented by edges between these nodes … Both nodes and edges are labeled by attributes corresponding to the properties of these regions and their relationships, respectively.  The attributes can be derived from any type of the visual cues described above, e.g., texture properties, global geometric features of regions (e.g., size, roundness), or features specified in some transform domain (e.g., Fourier coefficients of the extracted region boundary) … ”  and
 p. 216, Segmentation, paragraph 1, “Segmentation is a key preprocessing step in CBIR systems that describe the image content through regions of interest.  The goal is to identify the semantically meaningful regions/objects within an image … “ teaches identifying the semantically meaningful regions/objects [instances] of the image corresponding to a particular attribute [particular biological signature] of the image).
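For illustration only, the attributed relational graph (ARG) described in the passage quoted from Akgül et al. can be sketched as a small data structure in which regions are nodes, spatial relationships are edges, and both carry attribute labels. The class names, attribute names (size, roundness, relation), and values below are hypothetical and do not appear in the reference.

```python
from dataclasses import dataclass, field


@dataclass
class RegionNode:
    """One image region; the attributes are hypothetical examples."""
    name: str
    attributes: dict


@dataclass
class AttributedRelationalGraph:
    nodes: dict = field(default_factory=dict)  # region name -> RegionNode
    edges: dict = field(default_factory=dict)  # (name_a, name_b) -> relationship attributes

    def add_region(self, name, **attributes):
        self.nodes[name] = RegionNode(name, attributes)

    def relate(self, a, b, **attributes):
        # An edge may only connect regions that already exist as nodes.
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both regions must be added before relating them")
        self.edges[(a, b)] = attributes

    def regions_with(self, key, predicate):
        """Return the names of regions whose attribute `key` satisfies `predicate`."""
        return sorted(
            name
            for name, node in self.nodes.items()
            if key in node.attributes and predicate(node.attributes[key])
        )


arg = AttributedRelationalGraph()
arg.add_region("r1", size=120, roundness=0.9)
arg.add_region("r2", size=40, roundness=0.4)
arg.relate("r1", "r2", relation="adjacent")
round_regions = arg.regions_with("roundness", lambda value: value > 0.8)
```

Selecting regions by an attribute, as in the last line, loosely corresponds to identifying instances of one labeled property in the graph.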
 	Akgül et al. does not appear to explicitly teach … an input port that receives one or more radiological images; a shared memory data store that stores the one or more radiological images, with each of the one or more radiological images being written to the shared memory data store;
… identifying, based on the segmenting, one or more sequences of biological signatures connected to the particular biological signature in the data structure, each of the one or more sequences identified being different from other identified sequences of biological signatures connected to the particular biological signature in the data structure; and generating a logical ruleset, wherein each logical rule of the logical ruleset is a sequence of biological signatures of the one or more sequences of biological signatures that are identified; and a classification engine that receives the logical ruleset as an input and executes one or more classifiers against the logical ruleset to classify the one or more radiological images received by the input port, wherein one or more additional logical rules for the logical ruleset are generated based on the executing.
Uchida teaches … identifying, based on the segmenting, one or more sequences of biological signatures connected to the particular biological signature in the data structure, each of the one or more sequences identified being different from other identified sequences of biological signatures connected to the particular biological signature in the data structure (p. 523, Introduction, paragraph 4, “This paper introduces several basic image processing and image pattern recognition techniques, which will be useful for analyzing bioimages automatically by computer …” and p. 533, Clustering, paragraph 2, “Clustering can be used for image segmentation. Clustering-based image segmentation begins by representing an M × N image as a set of M × N vectors, P = {(x, y, I(x,y))}, where x and y represent the location of each pixel and I(x,y) represent some feature vector of the pixel.  A typical example of I(x,y) is an RGB color vector.  By applying some clustering method to the set P, the set is partitioned into groups having not only similar locations but also similar feature vectors, that is, an image segmentation result.  A key point of this method is to represent each pixel with its location, x and y.  Without the location, pixels having similar features are grouped regardless of their location” teaches identifying disjoint groups of similar M × N feature vectors [sequence of biological signatures] within an image [data structure], wherein the feature vector is an RGB color vector [particular biological signature]); and 
generating a logical ruleset, wherein each logical rule of the logical ruleset is a sequence of biological signatures of the one or more sequences of biological signatures that are identified (p. 523, Introduction, paragraph 4, “This paper introduces several basic image processing and image pattern recognition techniques, which will be useful for analyzing bioimages automatically by computer …” and p. 533, Clustering, paragraph 2, “Clustering can be used for image segmentation. Clustering-based image segmentation begins by representing an M × N image as a set of M × N vectors, P = {(x, y, I(x,y))}, where x and y represent the location of each pixel and I(x,y) represent some feature vector of the pixel.  A typical example of I(x,y) is an RGB color vector.  By applying some clustering method to the set P, the set is partitioned into groups having not only similar locations but also similar feature vectors, that is, an image segmentation result.  A key point of this method is to represent each pixel with its location, x and y.  Without the location, pixels having similar features are grouped regardless of their location” teaches representing an M × N image as a set of M × N feature vectors [logical ruleset], wherein this set of M × N feature vectors is partitioned into groups having similar locations and similar feature vectors and each group representing a logical rule); and
a classification engine that receives the logical ruleset as an input and executes one or more classifiers against the logical ruleset to classify the one or more radiological images received by the input port (p. 542, Image pattern recognition, paragraphs 1-3, “ … Diagnosis of an embryo, or a single cell, or a subcellular organelle using its imaging result is also an image pattern recognition problem. For the simplest diagnosis, it is reduced to a two-class recognition problem, that is, normal or abnormal … Image pattern recognition is comprised of two modules: feature extraction and classification.  Feature extraction is the module to convert an input image as a set of values, that is, a vector … Classification is the module to classify the input feature vector into a class according to some rule, called a classifier. A classifier is trained automatically using patterns whose class is known. This training mechanism of the classifier is called machine learning and feature vectors for training are called training patterns. The class label attached to each training pattern is called the ground-truth. We need to train a classifier to classify the training patterns correctly…” teaches training a classifier [executing one or more classifiers] using feature vectors for training [logical ruleset], wherein the feature vectors for training represent radiological images), 
wherein one or more additional logical rules for the logical ruleset are generated based on the executing (p. 542, Image pattern recognition, paragraph 3, “Classification is the module to classify the input feature vector into a class according to some rule, called a classifier. A classifier is trained automatically using patterns whose class is known. This training mechanism of the classifier is called machine learning and feature vectors for training are called training patterns. The class label attached to each training pattern is called the ground-truth. We need to train a classifier to classify the training patterns correctly. The patterns subjected to the trained classifier are called test patterns.  The test patterns are assumed to be “unseen” patterns and thus we do not know their correct label … Usually the test pattern set and the training pattern set should be independent” teaches the test patterns [additional logical rules] are generated based upon successfully training the classifier [the executing]).
Akgül et al. and Uchida are considered analogous art because they are directed to accurate interpretation of complex information in medical images.
In view of the teachings of Akgül et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Uchida at the time the application was filed in order to analyze biological images automatically or semi-automatically on a computer, thus allowing biologists to manage a larger number of biological images and exclude subjective biases (cf. Uchida, p.523, Introduction, paragraphs 2-3, “ … it is difficult to analyze a huge number of bioimages and have reliable analysis results. Furthermore, we need to be careful that the analysis result by the manual inspection may be biased by subjective observation. That is, the analysis result will depend largely on personal skill, decision, and preference.  On the other hand, research on “bioimage-informatics” is becoming active …  If the techniques are accurate enough, their analysis results will be more accurate and reliable than those by manual inspection …”). 
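As a non-authoritative illustration of the clustering-based segmentation quoted from Uchida above, the sketch below builds the set P = {(x, y, I(x,y))} for a tiny image and partitions it with a plain k-means loop. The choice of k-means, the greyscale intensity used in place of an RGB vector, the image values, and the initial centers are all assumptions made for the example; the reference only says "some clustering method".

```python
def kmeans(points, centers, iterations=10):
    """Plain k-means: assign each point to its nearest center, then recompute centers."""
    labels = []
    for _ in range(iterations):
        labels = [
            min(
                range(len(centers)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])),
            )
            for point in points
        ]
        for k in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster becomes empty
                centers[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


# Hypothetical 2 x 4 greyscale image: dark left half, bright right half.
image = [
    [10, 12, 200, 205],
    [11, 13, 198, 202],
]

# P = {(x, y, I(x, y))}: one vector per pixel, M x N vectors in total.
P = [(x, y, image[y][x]) for y in range(len(image)) for x in range(len(image[0]))]

# Start from the first and last pixel vectors as initial centers (an arbitrary choice).
labels = kmeans(P, centers=[P[0], P[-1]])
```

Because each vector carries x and y as well as the intensity, pixels are grouped by location and appearance together, which is the point the quoted passage stresses.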
Alexander et al. further teaches … an input port that receives one or more radiological images (col. 6, lines 66-67 & col 7, lines 1-6 “With reference to FIG. 1, an image and graphics processing system 10 in accordance with the present invention consists of a parallel vector processor 12 for image processing, a shared memory 14, a set of high speed buses 16, and a graphics subsystem 18. Each of these units incorporates addressing schemes and/or multiprocessor control schemes that support the high speed imaging and graphics processing of the imaging and graphics processing system …” teaches an image processing system comprising a parallel vector processor and a shared memory; 
col 7, lines 10-11 “The parallel vector processing unit comprises a number of vector processors connected in parallel to the buses 16”  and col 8, lines 13-22 “The parallel vector processing unit 12 is the primary computation engine in the system, and is used mainly for imaging and general mathematical computations. With reference to FIG. 2, a single vector processor 20 comprises two floating point units (FPUs) 46, a set of scalar register files 48 and vector register files 50, a control ASIC52 for control and instruction issuance, a pixel formatter unit (PFU) 54 for pixel handling, an instruction and data cache 56, and a bus interface unit (BIU) 58 for interface to the high speed buses 16” teaches the parallel vector processor comprising a plurality of vector processors with each vector processor comprising vector register files; 
col 8, lines 66-67 “ … Each vector register file has a separate read and write port …” teaches the vector register file each comprising a read port [input port] for receiving image pixel data [radiological images]); and
a shared memory data store that stores the one or more radiological images, with each of the one or more radiological images being written to the shared memory data store (col 9, lines 9-18 “In the present system, data conversion between floating point and integer values, referred to as pixel formatting, is carried out by a special pixel formatter unit (PFU) 54. The unit is implemented with a field programmable gate array. In general, image pixel data include 8-bit or 16-bit packed unsigned integer values, whereas image processing is performed in floating point for accuracy. Also, computation results, which are in floating point, are preferably converted to 8-bit or 16-bit packed integer values before transfer to the shared memory …”  teaches shared memory [data store] that stores processed pixel data [radiological images] in the form of 8-bit and 16-bit packed unsigned integer values, wherein the processed pixel data [radiological images] are written to the shared memory by the pixel formatter unit).
Claims 2-3 are rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of García et al. (“Big data preprocessing: methods and prospects”).
Regarding Claim 2,
	Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
	Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein generation of the logical ruleset enables classification of the one or more data items with a reduced amount of data, relative to an amount of data required to classify the one or more data items independent of the generation of the logical ruleset.
	García et al. teaches wherein generation of the logical ruleset enables classification of the one or more data items with a reduced amount of data, relative to an amount of data required to classify the one or more data items independent of the generation of the logical ruleset (p. 6, Feature Selection, paragraph 1, “ Feature selection (FS) is “the process of identifying and removing as much irrelevant and redundant information as possible” … The goal is to obtain a subset of features from the original problem that still appropriately describe it. This subset is commonly used to train a learner, with added benefits reported in the specialized literature … FS can remove irrelevant and redundant features which may induce accidental correlations in learning algorithms, diminishing their generalization abilities. The use of FS is also known to decrease the risk of over-fitting in the algorithms used later. FS will also reduce the search space determined by the features, thus making the learning process faster and also less memory consuming” teaches addressing the same classification problem with fewer features [reduced amount of data]).
Akgül et al., Uchida, Alexander et al. and García et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of García et al. at the time the application was filed in order to identify the smallest set of features that can still convey essential information contained in large data sets, thus minimizing demands on a computing system and minimizing interpretation issues by the researcher (cf. García et al., p. 6, Feature Selection, paragraph 2, “The use FS can also help in task not directly related to the data mining algorithm applied to the data. FS can be used in the data collection stage, saving cost in time, sampling, sensing and personnel used to gather the data. Models and visualizations made from data with fewer features will be easier to understand and to interpret”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … “).
Regarding Claim 3,
	Akgül et al. in view of Uchida and in view of Alexander et al. and in further view of García et al. teaches the data processing system of claim 2. 
	García et al. further teaches wherein classification of the one or more data items with a reduced amount of data increases a processing speed of the data processing system in classifying the one or more data items, relative to a processing speed of the data processing system in classifying the one or more data items independent of the generation of the logical ruleset (p. 6, Feature Selection, paragraph 1, “ Feature selection (FS) is “the process of identifying and removing as much irrelevant and redundant information as possible” … The goal is to obtain a subset of features from the original problem that still appropriately describe it. This subset is commonly used to train a learner, with added benefits reported in the specialized literature … FS can remove irrelevant and redundant features which may induce accidental correlations in learning algorithms, diminishing their generalization abilities. The use of FS is also known to decrease the risk of over-fitting in the algorithms used later. FS will also reduce the search space determined by the features, thus making the learning process faster and also less memory consuming” teaches training a learner [classification of the one or more data items] at a faster speed with fewer features [a reduced amount of data]).
Akgül et al., Uchida, Alexander et al. and García et al. are combinable for the same rationale as set forth above with respect to claim 2.
Claims 4-6 and 29-30 are rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Han et al. (“Centroid-Based Document Classification: Analysis and Experimental Results”).
Regarding Claim 4,
	Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
	Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the classification engine is configured to: determine a frequency for which each logical rule of the logical ruleset appears in the data structure; and generate a vector representing the data item, the vector defined by the frequency for each logical rule of the logical ruleset.
	Han et al. teaches wherein the classification engine is configured to: determine a frequency for which each logical rule of the logical ruleset appears in the data structure; and generate a vector representing the data item, the vector defined by the frequency for each logical rule of the logical ruleset (p. 425, section 2, paragraphs 1-2 
[image: media_image1.png]
 teaches frequency for which each term [logical rule] of the set of terms [logical ruleset] appears in the document [data item] in the term-space [data structure] and teaches generation of the term-document vector representing the document [data item]).
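A minimal sketch of a term-frequency vector of the kind attributed to Han et al. above may help make the mapping concrete. The cited passage is reproduced only as an image in this action, so the raw-count weighting, the whitespace tokenization, and the vocabulary below are assumptions for illustration rather than details taken from the reference.

```python
from collections import Counter


def term_frequency_vector(document, vocabulary):
    """Count how often each vocabulary term occurs in the document (raw counts, no weighting)."""
    counts = Counter(document.lower().split())
    return [counts[term] for term in vocabulary]


# Hypothetical vocabulary and document.
vocabulary = ["nodule", "cyst", "lesion"]
vec = term_frequency_vector("nodule near cyst and second nodule", vocabulary)
```

Each position in the resulting vector corresponds to one term, and its value is the number of times that term appears in the document.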
Akgül et al., Uchida, Alexander et al. and Han et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Han et al. at the time the application was filed in order to classify a document based on how closely its behavior matches the behavior of documents belonging to different classes, thus making it easier for people to search for information online (cf. Han et al., p. 424, Section …). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … “).
Regarding Claim 5,
	Akgül et al. in view of Uchida and in view of Alexander et al. and in further view of Han et al. teaches the data processing system of claim 4. 
	Han et al. further teaches wherein the classification engine is configured to: compare the vector with another vector generated for another data item of the one or more data items, wherein comparing includes computing a distance between the vector and the other vector in a vector space (p. 425, section 2, paragraphs 1-2 
[image: media_image1.png]
 teaches using cosine function to compute the similarity [distance] between two document vectors in a vector-space model).
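For illustration, the cosine function referred to above is the standard one: the dot product of two vectors divided by the product of their lengths. The sketch below uses that standard definition with hypothetical vectors; returning 0.0 for an all-zero vector is a convention chosen for this example, not something stated in the reference.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for this sketch: an empty vector matches nothing
    return dot / (norm_u * norm_v)


# Hypothetical document vectors.
same_direction = cosine_similarity([2, 1, 0], [4, 2, 0])
orthogonal = cosine_similarity([1, 0, 0], [0, 1, 0])
```

Vectors pointing in the same direction score close to 1 regardless of their length, while vectors that share no terms score 0.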
Akgül et al., Uchida, Alexander et al. and Han et al. are combinable for the same rationale as set forth above with respect to claim 4.
Regarding Claim 6,
	Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the classification engine is configured to: determine which logical rules of the logical ruleset occur in another data item of the one or more data items; and represent the other data item as a vector of the logical rules that occur in the other data item.
Han et al. further teaches wherein the classification engine is configured to: determine which logical rules of the logical ruleset occur in another data item of the one or more data items; and represent the other data item as a vector of the logical rules that occur in the other data item (p. 425, section 2, paragraphs 1-2 
[image: media_image1.png]
 teaches identification of terms [logical rules] in any document [data item] and representing any document [data item] as a term-frequency vector capturing the frequency of the terms [logical rules] in the document [data item]).
 Akgül et al., Uchida, Alexander et al. and Han et al. are combinable for the same rationale as set forth above with respect to claim 4.
Regarding Claim 29,
Akgül et al. in view of Uchida and in view of Alexander et al. and in further view of Han et al. teaches the data processing system of claim 5. 
	Han et al. further teaches wherein the other vector in the vector space is representative of a particular data item comprising a specified classification (p. 425, section 2, paragraph 3 
[image: media_image2.png]
 teaches the other vector being a centroid vector [particular data item comprising a specified classification]).
Akgül et al., Uchida, Alexander et al. and Han et al. are combinable for the same rationale as set forth above with respect to claim 4.
Regarding Claim 30,
	Akgül et al. in view of Uchida and in view of Alexander et al. and in further view of Han et al. teaches the data processing system of claim 5. 
	Han et al. further teaches wherein the other vector represents an average of a plurality of vectors generated by the classification engine (p. 425, section 2, paragraph 3 
[image: media_image2.png]
 teaches the other vector being a centroid vector generated by averaging the weights of the terms present in the document vector representation of set S).
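The averaging described above can be sketched as a component-wise mean over the document vectors of one class. The vectors below are hypothetical, and the sketch assumes all vectors in the set have the same length.

```python
def centroid(vectors):
    """Component-wise average of a non-empty set of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Hypothetical document vectors belonging to a single class S.
class_vectors = [[2, 0, 1], [4, 2, 1]]
c = centroid(class_vectors)
```

The resulting vector stands in for the class as a whole, so a new document vector can be compared against one centroid per class rather than against every stored document.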
Akgül et al., Uchida, Alexander et al. and Han et al. are combinable for the same rationale as set forth above with respect to claim 4.
Claims 7-10 are rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Han et al. (US 8,527,435 Bl).
Regarding Claim 7,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the logical rule set generation engine is configured to: rank the plurality of data signatures; and select a higher ranked data signature to be the particular data signature.
Han et al. teaches wherein the logical rule set generation engine is configured to: rank the plurality of data signatures; (col. 10, lines 63-67 & col. 11, lines 1-8, “Dimensionality reduction is a challenging problem for supervised and unsupervised machine learning for classification, regression, and time series prediction. In this section we focus on variable selection for supervised classification and regression models. The taxonomy of variable selection has two branches: variable ranking and subset selection … a natural ranking of input variables is proposed based on the values of tuned Parzen window parameters, σ” teaches a ranking of input variables [data signatures] in order to reduce the dimensionality for classification purposes); and
select a higher ranked data signature to be the particular data signature (col. 10, lines 63-67 & col. 11, lines 1-15, “Dimensionality reduction is a challenging problem for supervised and unsupervised machine learning for classification, regression, and time series prediction. In this section we focus on variable selection for supervised classification and regression models. The taxonomy of variable selection has two branches: variable ranking and subset selection … a natural ranking of input variables is proposed based on the values of tuned Parzen window parameters, σ. The original variables are ranked corresponding to the sigma ranking (from low to high σ values). Bottom-ranked variables, i.e., variables corresponding to a higher σ value correspond to features that do not contribute much to the calculation of the RBF kernel entry and are therefore less important. Some of the bottom-ranked variables can therefore be eliminated”  teaches higher ranked variables [data signatures] being selected based on being retained). 
Akgül et al., Uchida, Alexander et al. and Han et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Han et al. at the time the application was filed in order to identify the smallest set of features that can still convey essential information contained in large data sets, thus minimizing interpretation issues by the researcher (cf. Han et al., col. 7, lines 1-5, “… The sigma tuning and variable selection procedure introduced in this paper is applied to industrial magnetocardiogram data for the detection of ischemic heart disease from measurement of the magnetic field around the heart”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … “).
Regarding Claim 8,
Akgül et al. in view of Uchida and in view of Alexander et al. and in further view of Han et al. teaches the data processing system of claim 7. 
Han et al. further teaches wherein data signatures above a threshold ranking are iteratively selected to be the particular data signature, and wherein the logical ruleset comprises logical rules generated for each of the data signatures selected to be the particular data signature ( col. 10, lines 63-67 & col. 11, lines 1-23, “Dimensionality reduction is a challenging problem for supervised and unsupervised machine learning for classification, regression, and time series prediction. In this section we focus on variable selection for supervised classification and regression models. The taxonomy of variable selection has two branches: variable ranking and subset selection … a natural ranking of input variables is proposed based on the values of tuned Parzen window parameters, σ. The original variables are ranked corresponding to the sigma ranking (from low to high σ values). Bottom-ranked variables, i.e., variables corresponding to a higher σ value correspond to features that do not contribute much to the calculation of the RBF kernel entry and are therefore less important. Some of the bottom-ranked variables can therefore be eliminated. The elimination phase can (i) proceed iteratively, where a few variables are dropped at a time, or (ii) proceed in a single-step greedy fashion. A random gauge variable (Embrechts, 2005; Bi, 2003) can be introduced to avoid discarding possibly significant variables. This random variable can either be uniform or Gaussian. Only features that rank below the random gauge variable will be eliminated ( during a single step)” teaches iterative selection of input variables [data signatures], wherein data signatures ranked above a random gauge variable [threshold ranking] are retained;
col 11, lines 23-27 “After the variable selection stage, a new K-PLS learning model is built based on different bootstraps with bagging in order to evaluate the performance of the sigma tuning based feature selection. Two benchmark data sets illustrate this procedure on a regression and a classification problem…” teaches datasets [logical ruleset] comprising data values [logical rules] using only variables [data signatures] retained after variable selection [selected to be the particular data signature]).
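As a rough sketch of the single-step elimination described in the passage quoted from Han et al. above, the example below ranks variables from low to high σ and keeps only those ranked above a random gauge variable. The variable names and σ values are hypothetical, and the tuning that would produce the σ values is not shown.

```python
def select_variables(sigmas, gauge_name):
    """Rank variables from low to high sigma and keep those ranked above the gauge."""
    ranked = sorted(sigmas, key=sigmas.get)  # lowest sigma first, i.e. most important first
    cutoff = ranked.index(gauge_name)
    return ranked[:cutoff]  # everything at or below the gauge is eliminated


# Hypothetical tuned sigma values; "gauge" stands for the random gauge variable.
sigmas = {"v1": 0.2, "v2": 5.0, "v3": 0.7, "gauge": 1.5}
kept = select_variables(sigmas, "gauge")
```

Here `v2` is dropped because its σ is higher than that of the gauge, which matches the quoted statement that only features ranking below the gauge are eliminated.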
Akgül et al., Uchida, Alexander et al. and Han et al. are combinable for the same rationale as set forth above with respect to claim 7.
Regarding Claim 9,
Akgül et al. in view of Uchida and in view of Alexander et al. and in further view of Han et al. teaches the data processing system of claim 7. 
Han et al. further teaches wherein the ranking for a data signature is proportional to a frequency in which that data signature appears in the plurality of data signatures ( col. 11, lines 7-14, “ … a natural ranking of input variables is proposed based on the values of tuned Parzen window parameters, σ. The original variables are ranked corresponding to the sigma ranking (from low to high σ values). Bottom-ranked variables, i.e., variables corresponding to a higher σ value correspond to features that do not contribute much to the calculation of the RBF kernel entry and are therefore less important” teaches ranking of input variables [data signatures] based on their presence or absence [frequency] in the calculation of the RBF kernel entry).
Akgül et al., Uchida, Alexander et al. and Han et al. are combinable for the same rationale as set forth above with respect to claim 7.
Regarding Claim 10,
Akgül et al. in view of Uchida and in view of Alexander et al. and in further view of Han et al. teaches the data processing system of claim 7. 
Han et al. further teaches wherein the logical rule set generation engine is configured to weight a data signature with a predetermined weight value, and wherein ranking is based on the predetermined weight value of the data signature (col. 10, lines 63-67 & col. 11, lines 1-15, “Dimensionality reduction is a challenging problem for supervised and unsupervised machine learning for classification, regression, and time series prediction. In this section we focus on variable selection for supervised classification and regression models. The taxonomy of variable selection has two branches: variable ranking and subset selection … a natural ranking of input variables is proposed based on the values of tuned Parzen window parameters, σ. The original variables are ranked corresponding to the sigma ranking (from low to high σ values). Bottom-ranked variables, i.e., variables corresponding to a higher σ value correspond to features that do not contribute much to the calculation of the RBF kernel entry and are therefore less important. Some of the bottom-ranked variables can therefore be eliminated” teaches each variable [data signature] ranked according to their sigma value [predetermined weight value]).
Akgül et al., Uchida, Alexander et al. and Han et al. are combinable for the same rationale as set forth above with respect to claim 7.
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Shin et al. (“Enhanced centroid-based classification technique by filtering outliers”).
Regarding Claim 11,
	Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
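	Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the logical rule set generation engine is configured to: determine, for a logical rule, a frequency for which a sequence that defines the logical rule appears in the data structure; determine, for a logical rule, that frequency is less than a threshold frequency; and remove the logical rule from the logical ruleset.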
	Shin et al. teaches wherein the logical rule set generation engine is configured to: determine, for a logical rule, a frequency for which a sequence that defines the logical rule appears in the data structure; (p. 160, section 2, paragraph 1 
[Image: media_image3.png]
 teaches document d [data structure] capturing frequency of term t [logical rule] and document similarity formula dependent upon frequency);  
determine, for a logical rule, that frequency is less than a threshold frequency; and remove the logical rule from the logical ruleset (Figure 1 (right)
[Image: media_image4.png]
and pp. 160-161, section 3, paragraph 1 “We observed that the training data items that are far away from the center of its training category tend to reduce the accuracy of classification. Our hypothesis is that those items merely represent noise and not provide any useful training examples and thus decrease the classification accuracy. Thus we exclude them from consideration; see Figure 1, right. Specifically, at the training stage we calculate the center Ci of each category Si using (2).  Then we form new categories by discarding the outliers: $S_i' = \{ d \in S_i : Sim(d_k, C_i) > \varepsilon \}$ … in the next section, we discuss the choice of the threshold $\varepsilon$ …” teaches documents with similarity scores less than a threshold [determine, for a logical rule, that frequency is less than a threshold frequency] and subsequent removal of terms [logical rules] in said documents).
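For illustration only, the outlier-discarding step quoted above from Shin et al. may be sketched as follows. The sketch assumes cosine similarity for Sim and a simple mean vector for the category center; equation (2) of Shin et al. is not reproduced in this action, so the center computation is an assumption. The function names and sample vectors are hypothetical and appear in neither the reference nor the claims.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed form of Sim)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def centroid(vectors):
    """Component-wise mean of the vectors (assumed stand-in for the category center C_i)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def filter_outliers(category, epsilon):
    """Keep only items whose similarity to the category center exceeds epsilon."""
    center = centroid(category)
    return [d for d in category if cosine_similarity(d, center) > epsilon]

# Hypothetical two-dimensional term vectors for one training category.
category = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
kept = filter_outliers(category, epsilon=0.8)
```

In this hypothetical data the third vector lies far from the center and is discarded, while the first two are retained.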
Akgül et al., Uchida, Alexander et al. and Shin et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Shin et al. at the time the application was filed in order to classify a document based on how closely its behavior matches the behavior of documents belonging to different classes, thus making it easier for people to search for information online (cf. Shin et al., p. 159, section 1, paragraphs 1-3, “Since late 1990s, the explosive growth of Internet resulted in a huge quantity of documents available on-line. Technologies for efficient management of these documents are being developed continuously. One of representative tasks for efficient document management is text categorization, also called as classification … we show that removing outliers from the training categories significantly improves the classification results obtained by using the Centroid-based method”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … “).
Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Korde et al. (“Text Classification and Classifiers: A survey”).
Regarding Claim 12,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the one or more sequences comprise a plurality of sequences, and wherein the logical rule set generation engine is configured to:  determine that a first sequence of the plurality of sequences includes a second sequence of the plurality of sequences; and remove, from the logical ruleset, a logical rule defined by the first sequence.
Korde et al. teaches wherein the one or more sequences comprise a plurality of sequences, and wherein the logical rule set generation engine is configured to:  determine that a first sequence of the plurality of sequences includes a second sequence of the plurality of sequences; and remove, from the logical ruleset, a logical rule defined by the first sequence ( 
p. 86, section 2.2, paragraphs 1-2, “The first step of pre-processing which is used to presents the text documents into clear word format. The documents prepared for next step in text classification are represented by a great amount of features. Commonly the steps taken are:
Tokenization: A document is treated as a string, and then partitioned into a list of tokens.
Removing stop words: Stop words such as “the”, “a”, “and”, etc are frequently occurring, so the insignificant words need to be removed” teaches a plurality of lists [a plurality of sequences] of tokens, determining that a list [a first sequence] of tokens includes a list [a second sequence] of stop words, and removing the stop words [logical rule] from the list [the first sequence] of tokens).
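For illustration only, the two pre-processing steps quoted above from Korde et al. (tokenization and stop word removal) may be sketched as follows. The tokenizing regular expression, the function names, and the sample sentence are hypothetical; the stop word set contains only the three examples quoted from the reference, whereas a practical list would be longer.

```python
import re

# Only the three stop words quoted from the reference; a real list would be larger.
STOP_WORDS = {"the", "a", "and"}

def tokenize(document):
    """Treat the document as a string and partition it into a list of lowercase tokens."""
    return re.findall(r"[a-z0-9]+", document.lower())

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop word set, preserving order."""
    return [token for token in tokens if token not in stop_words]

tokens = tokenize("The cat and a dog")
filtered = remove_stop_words(tokens)
```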
Akgül et al., Uchida, Alexander et al. and Korde et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Korde et al. at the time the application was filed in order to classify a document based on how closely its behavior matches the behavior of documents belonging to different classes, thus making it easier for people to search for information online (cf. Korde et al., p. 159, section 1, paragraph 1, “The text mining studies are gaining more importance recently because of the availability of the increasing number of the electronic documents from a variety of sources. Which include unstructured and semi structured information. The main goal of text mining is to enable users to extract information from textual resources and deals with the operations like, retrieval, classification (supervised, unsupervised and semi supervised) and summarization. Natural Language Processing (NLP), Data Mining, and Machine Learning techniques work together to automatically classify and discover patterns from the different types of the documents”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al.).
Claims 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Leary et al. (“An Optimal Structure-Discriminative Amino Acid Index for Protein Fold Recognition “).
Regarding Claim 13,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the extraction engine is configured to: compare a portion of the data item to a library of specified data signatures, and wherein a data signature is extracted from the data item when the portion of the data item matches a specified data signature of the library.
Leary et al. teaches wherein the extraction engine is configured to: compare a portion of the data item to a library of specified data signatures, and wherein a data signature is extracted from the data item when the portion of the data item matches a specified data signature of the library (p. 411, Introduction, paragraphs 1-2, “ … in fold recognition, a protein sequence is evaluated with respect to a set of known three-dimensional structure classes, and is assigned to the class with which it is most compatible  … In alignment-based fold recognition, the target sequence is aligned against class libraries of representative aligned sequences or a class profile extracted from such libraries, and then assigned to the class most similar to the target …” teaches comparing a protein sequence [a portion of the data item] to a class library of representative aligned sequences [a library of specified data signatures]; and

Akgül et al., Uchida, Alexander et al. and Leary et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Leary et al. at the time the application was filed in order to identify the fold class of a protein sequence of unknown structure based on how closely its behavior matches the behavior of protein sequences belonging to different classes, thus making it easier for researchers to interpret results (cf. Leary et al., pp. 411-412, Introduction, paragraphs 1-7, “ … In many sequence analysis problems, it is desirable to classify protein sequences into two or more categories whose characteristics are known ahead of time. For example in fold recognition, a protein sequence is evaluated with respect to a set of known three-dimensional structure classes, and is assigned to the class with which it is most compatible …  In this article we introduce FoldID, a general method for classifying sequences, and demonstrate its use in a fold assignment task … ”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … “).
Regarding Claim 14,
Akgül et al. in view of Uchida and in view of Alexander et al. and in further view of Leary et al. teaches the data processing system of claim 13. 
Leary et al. further teaches wherein a specified data signature of the library is assigned one or more parameter values, and wherein the extraction engine extracts the data signature from the data item when the portion of the data item satisfies the one or more parameter values assigned to the data signature (p. 412, Introduction, paragraphs 7-9, “In this article we introduce FoldID, a general method for classifying sequences, and demonstrate its use in a fold assignment task. Fold assignment is performed in the context of a supervised learning scenario for pattern classification … Supervised learning refers to the use of a library of patterns (the “training set”) with known class labels as the basis for the training of a classification rule. Usually the training consists of determining parameter values in the classification rule such that the rule performs optimally with respect to a merit function when tested on the training set … The index vector r is considered as the parameter vector to be optimized by the training process. A simple nearest centroid classification rule assigns target sequences to folds based on the closest library fold class centroid vector, as measured by Euclidean distance. A merit function representing the power of the index r to discriminate between the folds is defined as the ratio J(r) =SB(r)/SW(r) of the between-class variation to within-class variation of the corresponding library profile vectors. The index ropt with optimal discriminatory power that maximizes the merit function is obtained as the maximal eigenvector of a generalized eigenvalue problem, with other possibly useful independent indices being defined by lower eigenvectors … The training set consists of a library of sequences that have been structurally aligned and organized into 174 structural classes … Because the sequences within each class are already structurally aligned, no alignment phase is necessary during training. 
In fact, the given alignments are optimal in the sense that they are precisely the ones used to define the fold classes …” teaches optimizing index vector r [one or more parameter values] while training the classifier, enabling assignment of target protein sequences [data item] to fold classes [data signature] based on closest library fold class centroid vector [the data item satisfies the one or more parameter values assigned to the data signature]). 
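For illustration only, the nearest centroid classification rule quoted above from Leary et al. (assignment to the closest library fold class centroid, as measured by Euclidean distance) may be sketched as follows. The fold labels, the two-dimensional profile vectors, and the function name are hypothetical. The sketch does not attempt the optimization of the index vector r described in the reference.

```python
import math

def nearest_centroid(target, centroids):
    """Return the label of the class centroid closest to target in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(target, centroids[label]))

# Hypothetical library of fold class centroid vectors.
centroids = {"fold_A": [0.0, 0.0], "fold_B": [10.0, 10.0]}
assigned = nearest_centroid([1.0, 2.0], centroids)
```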
Akgül et al., Uchida, Alexander et al. and Leary et al. are combinable for the same rationale as set forth above with respect to claim 13.
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Zhao et al. (“Sequential Pattern Mining: A Survey”).
Regarding Claim 15,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the logical rule set generation is configured to: receive data indicating a threshold number of sequences; determine that a number of identified sequences for the data signature exceeds the threshold number of sequences; segment the data signature into sub-data signatures that each comprise at least one feature of the data signature; and generate another logical ruleset for at least one of the sub-data signatures, the other logical ruleset replacing the logical ruleset for the data signature.
Zhao et al. teaches wherein the logical rule set generation is configured to: receive data indicating a threshold number of sequences; determine that a number of identified sequences for the data signature exceeds the threshold number of sequences (p. 8, section 2.1, paragraph 4,  “Let D be a database of customer transactions, $I = I_1, I_2, \cdots, I_m$ be a set of m distinct attributes called items, T be transaction that includes {customer-id, transaction-time, item-purchased}, $s_i$ be an itemsets, which contains a set of items from I, S be a sequence that consists of an ordered list of itemsets $s_1, s_2, \cdots, s_n$” and  p. 8, section 2.1, paragraph 8, “Sequential pattern mining is the process of extracting certain sequential patterns whose support exceed a predefined minimal support threshold. Since the number of sequences can be very large, and users have different interests and requirements, to get the most interesting sequential patterns, usually a minimum support is pre-defined by users. By using the minimum support we can prune out those sequential patterns of no interest, consequently make the mining process more efficient …” teaches extracting sequential patterns [data] from an ordered list of itemsets [identified sequences for the data signature] whose support exceed predefined minimum support threshold [data indicating a threshold number of sequences]); 
segment the data signature into sub-data signatures that each comprise at least one feature of the data signature; and generate another logical ruleset for at least one of the sub-data signatures, the other logical ruleset replacing the logical ruleset for the data signature (p.9, Table 1 
[Image: media_image5.png]
 teaches itemset [data signature] in a) being segmented into itemset [sub-data signature] in b) and teaches creation of new dataset in “After Mapping” column of last table [another logical ruleset] ).
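For illustration only, the minimum support pruning quoted above from Zhao et al. may be sketched as follows. Each sequence is modeled as an ordered list of itemsets, and support is taken as the fraction of database sequences containing a candidate pattern. The function names and sample database are hypothetical. The comparison uses "greater than or equal to", the conventional reading of a minimum support threshold, which is an assumption since the quoted passage says only that support must "exceed" the threshold.

```python
def contains(sequence, pattern):
    """True if each itemset of pattern is a subset of a distinct itemset of sequence, in order."""
    position = 0
    for itemset in pattern:
        # Advance to the earliest remaining itemset that covers this pattern itemset.
        while position < len(sequence) and not itemset <= sequence[position]:
            position += 1
        if position == len(sequence):
            return False
        position += 1
    return True

def support(database, pattern):
    """Fraction of sequences in the database that contain the pattern."""
    return sum(contains(sequence, pattern) for sequence in database) / len(database)

def prune(database, candidates, min_support):
    """Keep candidates whose support meets the user-defined minimum (assumed >=)."""
    return [p for p in candidates if support(database, p) >= min_support]

# Hypothetical database of three customer sequences.
database = [
    [{"a"}, {"b"}, {"c"}],
    [{"a"}, {"c"}],
    [{"b"}, {"a"}],
]
candidates = [[{"a"}, {"c"}], [{"b"}, {"c"}], [{"a"}]]
frequent = prune(database, candidates, min_support=0.5)
```

With this hypothetical data, the pattern of "b" followed by "c" occurs in only one of three sequences and is pruned, while the other two candidates are retained.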
Akgül et al., Uchida, Alexander et al. and Zhao et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Zhao et al. at the time the application was filed in order to extract useful knowledge (cf. Zhao et al., p.1, section I, paragraph 2, “ Many people take data mining as a synonym for another popular term, Knowledge Discovery in Database (KDD). Alternatively other people treat Data Mining as the core process of KDD.  Usually there are three processes. One is called preprocessing, which is executed before data mining techniques are applied to the right data. The preprocessing includes data cleaning, integration, selection and transformation. The main process of KDD is the data mining process, in this process different algorithms are applied to produce hidden knowledge. After that comes another process called postprocessing, which evaluates the mining result according to users' requirements and domain knowledge. Regarding the evaluation results, the knowledge can be presented if the result is satisfactory … ”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … “).
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Tilton et al. (“Best Merge Region-Growing Segmentation with Integrated Nonadjacent Region Object Aggregation”).
Regarding Claim 16,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the one or more sequences comprise a plurality of sequences, wherein the logical rule set generation engine is configured to: determine that at least two sequences of the plurality are within a threshold similarity to one another; and combine at least two logical rules of the logical rule set, each of the least two logical rules corresponding to one of the at least two sequences of the plurality.
Tilton et al. teaches wherein the one or more sequences comprise a plurality of sequences, wherein the logical rule set generation engine is configured to: determine that at least two sequences of the plurality are within a threshold similarity to one another; and combine at least two logical rules of the logical rule set, each of the least two logical rules corresponding to one of the at least two sequences of the plurality (p. 2, section II(A), paragraph 1 
[Image: media_image6.png]
 teaches a plurality of regions [a plurality of sequences], and teaches merging two nonadjacent regions [combine at least two logical rules of the logical rule set] when dissimilarity criterion value between the pair of nonadjacent regions [two sequences]  is below a threshold as outlined in Step 5 [within a threshold similarity to one another]).
Akgül et al., Uchida, Alexander et al. and Tilton et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Tilton et al. at the time the application was filed in order to perform image segmentation via best merge region growing, thus leading to reduced computing time and improved flexibility in segmenting moderate-sized to large-sized high spatial resolution images (cf. Tilton et al., p.3, section II(B), paragraphs 1-2, “The approach taken for implementing nonadjacent region object aggregation in this original version of HSeg requires excessive computing time. This is because the inclusion of spatially nonadjacent region merging requires the intercomparison of each region to every other region. Since HSeg is normally initialized with single pixel regions, this results in a combinatorial explosion of intercomparisons in the initial stage of the algorithm. In contrast, HSWO requires that each image pixel be initially compared only with its neighboring pixels. The RHSeg approximation to HSeg was devised to overcome this computational problem. RHSeg recursively subdivides the image data into subsections and then applies HSeg to the subsections of data that are small enough to be processed relatively quickly. However, RHSeg’s subdivision and subsequent recombination of the segmentation results can lead to processing window artifacts in which region boundaries are aligned with the processing window boundaries. This is because some region-merging decisions made by RHSeg in one processing window may have been nonoptimal due to the absence of knowledge concerning regions in other processing windows. RHSeg includes a provision to find and split out pixels that may have been inappropriately merged into a particular region at deeper levels of recursion and to remerge such pixels into a more appropriate region utilizing the global information available at higher levels of recursion.”). 
The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … “).
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Yan et al. (“gSpan: Graph-Based Substructure Pattern Mining”).
Regarding Claim 17,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the data item comprises a graph, wherein extracting comprises performing a traversal of the graph, and wherein a logical rule of the logical ruleset comprises a graph rule of the graph.
Yan et al. teaches wherein the data item comprises a graph, wherein extracting comprises performing a traversal of the graph (p. 723, section 3, paragraphs 1-2 
[Image: media_image7.png]
  teaches discovering all subgraphs of a graph dataset [data item] containing particular edges [performing a traversal of the graph]), and 
wherein a logical rule of the logical ruleset comprises a graph rule of the graph (p. 722, section 2, paragraphs 3-4 
[Image: media_image8.png]
 teaches a graph rule based on a linear order built among the edges in the graph G).
Akgül et al., Uchida, Alexander et al. and Yan et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Yan et al. at the time the application was filed in order to discover all the frequent subgraphs of a graph without candidate generation and pruning of false positives, thus accelerating the graph mining process (cf. Yan et al., p. 721, section 1, paragraphs 3-4, “ … In the context of frequent subgraph mining, the Apriori-like algorithms meet two challenges: (1) candidate generation:  the generation of size (k + 1) subgraph candidates from size k frequent subgraphs is more complicated and costly than that of itemsets; and (2) pruning false positives: subgraph isomorphism test is an NP-complete problem, thus pruning false positives is costly … In this paper, we develop gSpan, which targets to reduce or avoid the significant costs mentioned above … ”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … ”).
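For illustration only, the notion of traversing a graph and placing its edges in a linear order can be sketched as below. This is a minimal depth-first traversal written for this discussion, not gSpan's DFS code construction and not code from Yan et al. or the application; the function name `dfs_edge_order` and the three-node sample graph are hypothetical, and only forward (tree) edges are recorded.

```python
def dfs_edge_order(adjacency, start):
    """Return the forward (tree) edges of a depth-first traversal, in discovery order.

    adjacency: dict mapping each node to an iterable of neighbouring nodes
    (hypothetical input format chosen for this sketch).
    """
    visited = {start}
    order = []

    def visit(node):
        # Neighbours are sorted so the resulting edge order is deterministic.
        for neighbour in sorted(adjacency[node]):
            if neighbour not in visited:
                visited.add(neighbour)
                order.append((node, neighbour))
                visit(neighbour)

    visit(start)
    return order


# Hypothetical undirected triangle graph, stored as symmetric adjacency lists.
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
edges = dfs_edge_order(triangle, "a")
```

Sorting the neighbours is a design choice of this sketch so that the same graph always yields the same edge sequence.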
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Graening et al. (“Shape mining: A holistic data mining approach for engineering design”).
Regarding Claim 18,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the data item comprises one or more graphical elements, wherein a data signature comprises a shape, and wherein a logical rule of the logical ruleset comprises a shape rule.
Graening et al. teaches wherein the data item comprises one or more graphical elements, wherein a data signature comprises a shape, and wherein a logical rule of the logical ruleset comprises a shape rule (p. 175, Figure 8 [image: media_image9.png] teaches the body of a car superimposed on a three-dimensional coordinate system [one or more graphical elements];
 p. 175, section 6, “To underpin the practicality of concepts for knowledge extraction, a priori generated design data from a realistic application is needed. Typically, design data result from various diverse design processes where each design process follows a pre-defined strategy to reach a specific design goal. In this chapter two design strategies, as they are frequently used in CAE, are carried out to design the shape of a passenger car. The first one implements a global search strategy by means of uniformly sampling a constrained design space, while the second strategy follows a direct local search by exploiting the characteristics of the design during the progress of the design process. Both strategies result in design data sets, which are characteristic for explorative and exploitative design processes. Typically, a sensible combination of the two strategies is used, for both computational as well as human driven engineering design. The design space that represents all potential solutions is restricted by the representation of the passenger car. The improvement of the aerodynamic performance of the shape is pursued, with the overall design goal being formulated based on the results from computational fluid dynamic simulations (CFD)” teaches shape of a passenger car [data signature];
p. 171, section 4.1, paragraphs 1-2, “ … In the field of machine learning and data mining, IF–Then rules are often used to represent such abstractions in human-readable form, which are formally defined as: IF {antecedent} THEN {consequent} … When adopting the formulation of rules for the description of design concepts, the antecedent A represents an abstract object specification, e.g. defining the object shape, and the consequent C defines design quality related properties. Depending on the level of abstraction and the nature of the data, rules can be categorized into qualitative and quantitative rules …” teaches rules for the description of design concepts [shape rule]).
Akgül et al., Uchida, Alexander et al. and Graening et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Graening et al. at the time the application was filed in order to identify the smallest … (cf. Graening et al., p. 184, section 11, paragraphs 1-2, “ … The adaptation of statistical data mining methods to surface data is preceded by the definition of appropriate distance measures between geometrical objects. Even though the surface differences considered in our research are not large, we have shown that the Geodesic distance is more suitable than the standard Euclidean distance between surface vertices. The sensitivity analysis based on correlation methods and information theoretic approaches applied to surface data constitutes the first step towards shape mining. The application of more sophisticated methods of knowledge formation necessitates to resolve the typical drawback of the universal design representation, i.e., its high dimensionality. Therefore, we apply feature evaluation, reduction and clustering techniques to reduce the representation to a significant subset. Based on this feature set, methods for concept retrieval can be applied. The procedure for the retrieval, description and evaluation of design concepts has been generalized and can be carried out independently of the used modeling technique. A new measure has been introduced to evaluate extracted design concepts based on the estimation of their utility. The new measure allows the ranking of concepts according to the formulation of the engineer’s objectives.”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … ”).
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Gulordava et al. (“Diachronic Trends in Word Order Freedom and Dependency Length in Dependency-Annotated Corpora of Latin and Ancient Greek”).
Regarding Claim 20,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the data item comprises text, wherein at least one of the plurality of data signatures comprises a word of the text, and wherein the structure data comprises one or more of a word order and word distance between two words of the text.
Gulordava et al. teaches wherein the data item comprises text, wherein at least one of the plurality of data signatures comprises a word of the text (p. 123, section 3.1, paragraph 1, “For each of the texts in our corpus, we computed the percentage of prenominal versus post-nominal placement for two modifiers — adjectives and numerals. To avoid interference with size effects, these counts include only simple one-word modifiers” teaches data items comprising texts in a corpus as well as adjectives in the text [a word of the text]), and 
wherein the structure data comprises one or more of a word order and word distance between two words of the text (p. 123, section 3, “We begin our investigation of word order variation by looking at word order in the noun phrase, a controlled setting potentially influenced by fewer factors than sentential word order” teaches word order in a noun phrase;
p. 123, Figure 1 [image: media_image10.png]
  and p. 123, section 2.3, paragraph 6, “In this paper, we present an adjacency analysis for the noun phrase. More precisely, we identify modifiers which are separated from their head noun by at least one word which does not belong to the subtree headed by the noun. For instance, as can be seen from the dependency tree in Figure 1, the adjective reliquis is separated from its head maribus by the verb utimur, which does not belong to the subtree of maribus …” teaches distance between the adjective reliquis and the noun maribus in a sentence [word distance between two words of the text]).
Akgül et al., Uchida, Alexander et al. and Gulordava et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Gulordava et al. at the time the application was filed in order to effectively parse free order languages, thus expanding the capability of natural language processing to a multitude of foreign languages from different time periods (cf. Gulordava et al., p. 128, section 5, paragraph 6, “We also evaluate parsing performance across time periods. Our intuition is that it …”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … ”).
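For illustration only, the claimed structure data (word order and word distance between two words of a text) can be pictured with the sketch below. It is not code from Gulordava et al. or the application; the function name `word_order_and_distance`, the sample token list, and the choice to measure distance as the difference in token positions are assumptions made for this example.

```python
def word_order_and_distance(tokens, modifier, head):
    """Return (order, distance) for the first occurrences of two words in a token list.

    order is "prenominal" if the modifier precedes the head, else "postnominal".
    distance is the absolute difference in token positions (an assumption of
    this sketch); a distance of 1 means the two words are adjacent.
    """
    modifier_pos = tokens.index(modifier)  # raises ValueError if the word is absent
    head_pos = tokens.index(head)
    order = "prenominal" if modifier_pos < head_pos else "postnominal"
    return order, abs(modifier_pos - head_pos)


# Hypothetical tokenised sentence, not taken from any cited corpus.
tokens = ["old", "we", "admire", "trees"]
order, distance = word_order_and_distance(tokens, "old", "trees")
```

In this hypothetical sentence the modifier precedes its head and two other tokens intervene, so the pair is non-adjacent.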
Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Chakrabarti et al. (“Graph Mining: Laws, Generators, and Algorithms”).
Regarding Claim 21,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the data item comprises a social graph, wherein a data signature of the plurality comprises a node of the social graph, and wherein the structure data comprises one or more edges of the social graph.
Chakrabarti et al. teaches wherein the data item comprises a social graph, wherein a data signature of the plurality comprises a node of the social graph, and wherein the structure data comprises one or more edges of the social graph (p. 36, Figure 12 [image: media_image11.png]
and p. 36, section 3.3.1, paragraphs 1-3 “The small-world model is motivated by the observation that most real-world graphs seem to have low average distance between nodes (a global property) but have high clustering coefficients (a local property). Two experiments from the field of sociology shed light on this phenomenon. Travers and Milgram … conducted an experiment where participants had to reach randomly chosen individuals in the U.S.A. using a chain letter between close acquaintances. Their surprising find was that, for the chains that completed, the average length of the chain was only six in spite of the large population of individuals in the social network. While only around 29% of the chains were completed, the idea of small paths in large graphs was still a landmark find. The reason behind the short paths was discovered by Mark Granovetter … who tried to find out how people found jobs. The expectation was that the job seeker and his eventual employer would be linked by long paths. However, the actual paths were empirically found to be very short, usually of length one or two. This corresponds to the low average path length previously mentioned. Also, when asked whether a friend had told them about their current job, a frequent answer of the respondents was “Not a friend, an acquaintance”. Thus, this low average path length was being caused by acquaintances with whom the subjects only shared weak ties. Each acquaintance belonged to a different social circle and had access to different information. Thus, while the social graph has high clustering coefficient (i.e., is clique-ish), the low diameter is caused by weak ties joining faraway cliques”  teaches a social graph [data item] consisting of nodes [data signature] and their links to other nodes [one or more edges]).
Akgül et al., Uchida, Alexander et al. and Chakrabarti et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Chakrabarti et al. at the time the application was filed in order to design a product that finds the distinguishing characteristics of real-world graphs and detect patterns that appear … (cf. Chakrabarti et al., p. 62, section 7, paragraph 1, “Naturally occurring graphs, perhaps collected from a variety of different sources, still tend to possess several common patterns. The most common of these are:
—power laws, in degree distributions, in PageRank distributions, in eigenvalue-versus-rank plots and many others,
—small diameters, such as the six degrees of separation for the US social network, 4
for the Internet AS-level graph, and 12 for the Router-level graph, and
—community structure as shown by high clustering coefficients, large numbers of bipartite cores, etc.
Graph generators attempt to create synthetic but realistic graphs which can mimic these patterns found in real-world graphs. Recent research has shown that generators based on some very simple ideas can match some of the patterns ...”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … ”).
Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Fiorentini et al.  (“An Ontology for Assembly Representation”).
Regarding Claim 22,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 1. 
Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the data item comprises a schematic, wherein a data signature comprises a machine part, and wherein the structure data comprises enumerated relationships between the machine parts.
Fiorentini et al. teaches wherein the data item comprises a schematic (p. 57, Figure 29 [image: media_image12.png] teaches a diagram [schematic] of a planetary gear system),
wherein a data signature comprises a machine part (p. 58, Table 22 [image: media_image13.png] teaches kinematic part and associated parts [data signature] of the planetary gear system), and
wherein the structure data comprises enumerated relationships between the machine parts (p. 47, section 6.2.1.6, paragraphs 1-3 [image: media_image14.png] teaches the instances fc_5 and fc_6 linking two different parts together [enumerated relationships between the machine parts]).
Akgül et al., Uchida, Alexander et al. and Fiorentini et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Fiorentini et al. at the time the application was filed in order to design a model to capture the evolution of the assembly from the design phases and throughout the product’s useful life, thus achieving higher levels of interoperability between different design stakeholders (cf. Fiorentini et al., p. 1, section 1, paragraph 1, “The development of an ontological assembly representation was initiated from several considerations concerning assembly representation for …”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Akgül et al. discloses this as a necessary activity for the taught invention (cf. Akgül et al., p. 208, Introduction, paragraph 1, “Diagnostic radiologists are struggling to maintain high interpretation accuracy while maximizing efficiency in the face of increasing exam volumes and numbers of images per study.  A promising approach to manage this image “explosion” is to integrate computer-based assistance into the image interpretation process … ”).
Claims 24, 25, and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Zhang et al. (“Dictionary Pruning with Visual Word Significance for Medical Image Retrieval”).
Regarding Claim 24,
Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 23. 
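Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the classification engine is configured to: determine a frequency for which each logical rule of the logical ruleset appears in the data structure; and generate a vector representing the radiological image, the vector defined by the frequency for each logical rule of the logical ruleset.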
Zhang et al. teaches wherein the classification engine is configured to: determine a frequency for which each logical rule of the logical ruleset appears in the data structure (p.2, section 1.1, paragraphs 1-2, “The aim of CBMIR is to extract visual characteristics of images to identify the level of similarity between two images.  Feature extraction can be categorized into global-(GFM) and local-feature (LFM) models based on the scope of descriptors … Specifically, the LFM is used to extract a collection of local patch features from each image.  The entire patch feature set computed from all images in the database is then grouped into clusters, which each cluster regarded as a visual word and the whole cluster collection considered as the visual dictionary. Then, all patch features in one image are assigned to visual words, generating a visual word frequency histogram to represent this image.  Finally, the similarity between images is computed based on these frequency histograms for retrieval” teaches a frequency histogram generated for each feature [logical rule] of the cluster [logical ruleset] in the cluster collection [data structure]); and 
generate a vector representing the radiological image, the vector defined by the frequency for each logical rule of the logical ruleset (p. 17, section 4.2.3, paragraph 2, “… Our method prunes the dictionary by keeping the most meaningful words and thus obtains a low-dimensional word frequency histogram vector for each image. Such dimensionality reduction can increase the speed of the retrieval process …” teaches a low-dimensional word frequency histogram vector for each image).
Akgül et al., Uchida, Alexander et al. and Zhang et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Zhang et al. at the time the application was filed in order to design an effective image representation so that images with visually similar anatomical structures are closely correlated, thus improving disease treatment planning and management capabilities (cf. Zhang et al., p. 2, section 1, paragraph 1, “In the past three decades, but in particular in the last decade, medical image data have expanded rapidly due to the pivotal role of imaging in patient management and the growing range of image modalities … Traditional text-based retrieval, which manually indexes the images with alphanumerical keywords, is unable to sufficiently meet the increased demand from this growth. At the same time, advances in computer-aided content based medical image analysis systems mean that there are methods that can automatically extract the rich visual properties/features to characterize the images efficiently...”). 
Regarding Claim 25,
Akgül et al. in view of Uchida and in view of Alexander et al. and in further view of Zhang et al. teaches the data processing system of claim 24. 
Zhang et al. further teaches wherein the classification engine is configured to: compare the vector with another vector generated for another radiological image of the one or more radiological images, wherein comparing includes computing a distance between the vector and the other vector in a vector space; and determine whether the vector is indicative of one or more biological anomalies based on the classifier (p. 17, section 4.2.3, paragraph 2, “… Our method prunes the dictionary by keeping the most meaningful words and thus obtains a low-dimensional word frequency histogram vector for each image. Such dimensionality reduction can increase the speed of the retrieval process …” and p. 5, section 2.1, paragraph 1, “ … The word frequency histograms of images are … calculated and used to compare the image similarity with Euclidean distance for retrieval” teaches computing a Euclidean distance between two word frequency histogram vectors that represent images [compare the vector with another vector];
p. 13, section 4.2.1, paragraph 1, “ … It can be seen that our method can retrieve the cases with the same diagnosis, which are visually similar or different. For example of the lung nodule images, we retrieved … the most desired cases. While the first result is visually similar to the query, the second one is with a larger lung nodule than that in the query image and has more noise in the background regions. In addition, the proposed method can find the differences between the visually similar images that present different diseases …” teaches query image and result image, both represented by word frequency histogram vectors presenting different diseases [indicative of one or more biological anomalies]).
Akgül et al., Uchida, Alexander et al. and Zhang et al. are combinable for the same rationale as set forth above with respect to claim 24.
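For illustration only, the cited combination of a word frequency histogram vector and a Euclidean distance comparison can be pictured with the sketch below. It is not code from Zhang et al. or the application; the function names `word_histogram` and `euclidean_distance`, the three-word dictionary, and the sample patch assignments are hypothetical.

```python
import math
from collections import Counter


def word_histogram(assigned_words, dictionary):
    """Count how often each dictionary word occurs among one image's patch assignments."""
    counts = Counter(assigned_words)
    return [counts[word] for word in dictionary]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length histogram vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical visual dictionary and per-image visual word assignments.
dictionary = ["w0", "w1", "w2"]
query_vector = word_histogram(["w0", "w0", "w2"], dictionary)
result_vector = word_histogram(["w0", "w1", "w2"], dictionary)
distance = euclidean_distance(query_vector, result_vector)
```

A smaller distance between two histogram vectors would indicate more similar images under this representation; how a threshold or ranking is chosen is outside the scope of the sketch.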
Regarding Claim 28,
Akgül et al. in view of Uchida and in view of Alexander et al. and in further view of Zhang et al. teaches the data processing system of claim 25. 
Zhang et al. further teaches wherein the other vector represents a radiological image indicative of a disease, and where determining whether the vector is indicative of the one or more biological anomalies comprises diagnosing the disease based on the comparing (p. 17, section 4.2.3, paragraph 2, “… Our method prunes the dictionary by keeping the most meaningful words and thus obtains a low-dimensional word frequency histogram vector for each image. Such dimensionality reduction can increase the speed of the retrieval process …” and p. 5, section 2.1, paragraph 1, “ … The word frequency histograms of images are … calculated and used to compare the image similarity with Euclidean distance for retrieval” and p. 13, section 4.2.1, paragraph 1, “ … It can be seen that our method can retrieve the cases with the same diagnosis, which are visually similar or different. For example of the lung nodule images, we retrieved … the most desired cases. While the first result is visually similar to the query, the second one is with a larger lung nodule than that in the query image and has more noise in the background regions. In addition, the proposed method can find the differences between the visually similar images that present different diseases …” teaches query image and result image, both represented by word frequency histogram vectors presenting different diseases [wherein the other vector represents a radiological image indicative of a disease]; and
teaches the method of retrieving images/cases having the same lung nodules as the query image [diagnosing the disease] based on visual similarity [based on the comparing]).
Akgül et al., Uchida, Alexander et al. and Zhang et al. are combinable for the same rationale as set forth above with respect to claim 24.
Claims 26-27 are rejected under 35 U.S.C. 103 as being unpatentable over Akgül et al. (“Content-Based Image Retrieval in Radiology: Current Status and Future Directions”) in view of Uchida (“Image processing and recognition for biological images”) and in view of Alexander et al. (US 5467459 A) and in further view of Lee et al. (“Identifying retinal vessel networks in ocular fundus images”).
Regarding Claim 26,
	Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 23. 
	Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the biological structure comprises a vascular structure, and wherein the biological signature comprises one or more of a fork, a bend, and a loop in the vascular structure.
	Lee et al. teaches wherein the biological structure comprises a vascular structure (p. 357, section 5, paragraph 1, “The problem of understanding color fundus images is of considerable interest and utility.  This paper presents a series of processes to identify the vascular network.  As the experimental results show, the vessel identification system successfully labels the vascular network of typical fundus images … “  teaches a vascular network [biological structure]), and
	wherein the biological signature comprises one or more of a fork, a bend, and a loop in the vascular structure (p. 352, section 4.1, “Prior to labeling the vessel branches as arteries or veins, physical level knowledge of the vascular structure is required. The spatial distribution of the blood vessels includes branching, crossing, and meandering. Vascular branching is directional: vessels are large at the root and taper towards smaller branches; also, vessel type is inherited by the smaller branches at each branching point. Under normal condition, crossings only occur between an artery and a vein, often called arterio—venous crossing” and   p. 354, section 4.2.3 
[image: media_image15.png]
teaches a vascular structure [biological structure] including a distorted crossing, leading to a double branch fork [biological signature]).
Any limitation that recites “one or more of” has been interpreted as requiring one of the alternatives and not all of the alternatives.
Akgül et al., Uchida, Alexander et al. and Lee et al. are considered analogous art because they are directed to the use of machine learning to identify features within data sets which best enable identification of relevant patterns. 
In view of the teachings of Akgül et al. in view of Uchida and in further view of Alexander et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Lee et al. at the time the application was filed in order to automate the process of extracting and labeling of vascular patterns in fundus images, thus allowing ophthalmologists to diagnose or prognosticate diseases more effectively based on reliable fundus evaluations (cf. Lee et al., p. 349, section 1, paragraph 1, “Color ocular fundus images are commonly used in diagnosing diseases such as hypertension, leukemia, glaucoma, and diabetes. These diseases, affecting either the eye or the central nervous system, are readily observable in the retina through the clear window provided by the cornea and lens. Early diagnosis of a disease and prognosis of the course of a disease depends on objective fundus analysis …”).
Regarding Claim 27,
	Akgül et al. in view of Uchida and in further view of Alexander et al. teaches the data processing system of claim 23. 
	Akgül et al. in view of Uchida and in further view of Alexander et al. does not appear to explicitly teach wherein the biological structure comprises one of a tissue configuration, a nervous system, or a bone structure.
	Lee et al. teaches wherein the biological structure comprises one of a tissue configuration, a nervous system, or a bone structure (p. 349, section 1, paragraph 1, “Color ocular fundus images are commonly used in diagnosing diseases such as hypertension, leukemia, glaucoma, and diabetes. These diseases, affecting either the eye or the central nervous system, are readily observable in the retina through the clear window provided by the cornea and lens. Early diagnosis of a disease and prognosis of the course of a disease depends on objective fundus analysis. Fundus image analysis includes measuring parameters such as the cup-to-disc ratio and blood vessel's rate of constriction, analyzing arterio-venous crossing phenomena, and examining a time sequence of retinal images taken from the same patient to observe changes caused by progression of a disease or degenerative conditions. Unfortunately, subjective evaluation of the fundus is presently the only method in use and vascular measurements must be performed manually. A computer method in fundus image analysis can provide a generalized means of evaluating and measuring vascular parameters in both disease and health” teaches color fundus images used to diagnose diseases affecting the central nervous system [biological structure]).
Any limitation that recites “one of” has been interpreted as requiring one of the alternatives and not all of the alternatives.
Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:  Guyon et al. (US 2013/0172043 A1) teaches a method comprising the use of Support Vector Machines and Recursive Feature Elimination for the identification of patterns that are useful for medical diagnosis, prognosis, and treatment.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHIAKA CHUKWUMA OKOROH whose telephone number is (571)272-3710.  The examiner can normally be reached on M - F 7:30 AM - 4:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHIAKA CHUKWUMA OKOROH/Examiner, Art Unit 2125
/MICHAEL J HUNTLEY/Primary Examiner, Art Unit 2116