DETAILED ACTION

 	This communication is in response to the application filed on 02/28/2019.
 	After a thorough search and examination of the present application and in light of the prior art made of record, double patenting review, applicant's amendment and the examiner's amendment stated below, claims 1-20 are allowed. 

Notice of Pre-AIA or AIA Status
 	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
 	The information disclosure statement (IDS) submitted on 12/23/2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given in a telephone interview with applicant's attorney, Anchit Kapoor, on June 9, 2021 (please see the Applicant Initiated Interview Summary for a detailed discussion of the interview).

AMENDMENTS TO THE CLAIMS

This listing of claims will replace all prior versions, and listings, of claims in the application:

1.	(Currently Amended) A computer-implemented method for storing a file in object storage, comprising:
receiving, from an object storage, a source file having data comprising at least one of structured data and semi-structured data;
converting the source file into a data edge file having a manifest portion, a symbol portion, and a locality portion, wherein the symbol portion contains a sorted unique set of [[the]] symbols from the source file, and the locality portion contains a plurality of location values referencing the symbol portion; [[and]]
normalizing the data from the source file by modifying the manifest portion of the data edge file to include a description of at least one nonexistent column representing an omission of data at an associated position in the source file;
receiving a search query for data stored in the object storage;
querying the normalized data based on the search query to return a result set; and
generating a materialized view of the result set of the search query based on the querying of the normalized data.

2.	(Original) The method of claim 1, wherein normalizing the data from the source file by modifying the manifest portion of the data edge file comprises:
determining a maximum column count of the data; and


3.	(Original) The method of claim 1, further comprising:
inserting, in the manifest portion of the data edge file, a descriptive entry indicating at least one empty locality value associated with a column is to be replaced with a statistical value associated with the column, wherein the statistical value is maintained for each column of the data.

4.	(Original) The method of claim 3, wherein the statistical value associated with the column comprises at least one of a median value of the column, a mean value of the column, and a standard deviation of the column.

5.	(Original) The method of claim 1, wherein normalizing the data from the source file by modifying the manifest portion of the data edge file comprises:
determining shape information for a record in the data edge file; and
responsive to determining that the record has an anomalous shape based on the shape information, inserting a descriptive entry in the manifest portion indicating one or more location values from the locality portion is to be disregarded to achieve a regular shape of the data.

6.	(Original) The method of claim 1, further comprising:

wherein normalizing the data from the source file by modifying the manifest portion of the data edge file is performed based at least in part on the generated plurality of statistical values.

7.	(Currently Amended) The method of claim 1, wherein each of the symbols is stored at a corresponding location within the symbol portion, wherein a location value at a respective position within the locality portion represents an occurrence in the source file of a corresponding symbol identified by [[the]] a respective location value.

8.	(Currently Amended) The method of claim 1, wherein the source file comprises structured data, and wherein the plurality of location values are respectively ordered within the locality [[file]] portion by 

9.	(Currently Amended) The method of claim 1, wherein the data of the source file is semi-structured data comprising attribute-value pairs and array data types, wherein converting the source file into the data edge file further comprises:
generating a plurality of data segments that are arranged in flattened two-dimensional representation of the array data types in the semi-structured data, wherein each array data type is restructured into a separate data segment and referenced by a join identifier; and
plurality of self-join statements are configured to reconstruct an original structure of the 

10.	(Currently Amended) A computer apparatus for storing a file in object storage, comprising:
a memory; and
at least one processor coupled to the memory and configured to:
receive, from an object storage, a source file having data comprising at least one of structured data and semi-structured data;
convert the source file into a data edge file having a manifest portion, a symbol portion, and a locality portion, wherein the symbol portion contains a sorted unique set of [[the]] symbols from the source file, and the locality portion contains a plurality of location values referencing the symbol portion; [[and]]
normalize the data from the source file by modifying the manifest portion of the data edge file to include a description of at least one nonexistent column representing an omission of data at an associated position in the source file;
receive a search query for data stored in the object storage;
query the normalized data based on the search query to return a result set; and
generate a materialized view of the result set of the search query based on the querying of the normalized data.

, the at least one processor is further configured to:
determine a maximum column count of the data; and
responsive to determining that a record of the data has less values than the maximum column count, insert a description of the at least one nonexistent column in the manifest portion associated with the record.

12.	(Currently Amended) The computer apparatus of claim 10, wherein the at least one processor is further configured to:
insert, in the manifest portion of the data edge file, a descriptive entry indicating at least one empty locality value associated with a column is to be replaced with a statistical value associated with the column, wherein the statistical value is maintained for each column of the data.

13.	(Original) The computer apparatus of claim 12, wherein the statistical value associated with the column comprises at least one of a median value of the column, a mean value of the column, and a standard deviation of the column.

14.	(Currently Amended) The computer apparatus of claim 10, wherein , the at least one processor is further configured to:
determine shape information for a record in the data edge file; and


15.	(Currently Amended) The computer apparatus of claim 10, wherein the at least one processor is further configured to:
generate a plurality of statistical values about the data, the plurality of statistical values being stored in the manifest portion, wherein normalizing the data from the source file by modifying the manifest portion of the data edge file is performed based at least in part on the generated plurality of statistical values.

16.	(Currently Amended) The computer apparatus of claim 10, wherein each of the symbols is stored at a corresponding location within the symbol portion, wherein a location value at a respective position within the locality portion represents an occurrence in the source file of a corresponding symbol identified by [[the]] a respective location value.

17.	(Currently Amended) The computer apparatus of claim 10, wherein the source file comprises structured data, and wherein the plurality of location values are respectively ordered within the locality [[file]] portion by 

, the at least one processor is further configured to:
generate a plurality of data segments that are arranged in flattened two-dimensional representation of the array data types in the semi-structured data, wherein each array data type is restructured into a separate data segment and referenced by a join identifier; and
generate a plurality of self-join statements that are stored in the manifest portion of the data edge file, wherein the plurality of self-join statements are configured to reconstruct an original structure of the 

19.	(Currently Amended) A non-transitory computer-readable medium storing computer executable code for storing a file in object storage, comprising code to:
receive, from an object storage, a source file having data comprising at least one of structured data and semi-structured data;
convert the source file into a data edge file having a manifest portion, a symbol portion, and a locality portion, wherein the symbol portion contains a sorted unique set of [[the]] symbols from the source file, and the locality portion contains a plurality of location values referencing the symbol portion; [[and]]
normalize the data from the source file by modifying the manifest portion of the data edge file to include a description of at least one nonexistent column representing an omission of data at an associated position in the source file;
receive a search query for data stored in the object storage;
query the normalized data based on the search query to return a result set; and
generate a materialized view of the result set of the search query based on the querying of the normalized data.

20.	(Currently Amended) The non-transitory computer-readable medium of claim 19, wherein the data of the source file is semi-structured data comprising attribute-value pairs and array data types, wherein the code configured to convert the source file into the data edge file further comprises code to:
generate a plurality of data segments that are arranged in flattened two-dimensional representation of the array data types in the semi-structured data, wherein each array data type is restructured into a separate data segment and referenced by a join identifier; and
generate a plurality of self-join statements that are stored in the manifest portion of the data edge file, wherein the plurality of self-join statements are configured to reconstruct an original structure of the .


Reasons for Allowance
 The following is an examiner’s statement of reasons for allowance: 
 The prior art of record fails to teach the combination of claimed elements including:
receiving, from an object storage, a source file having data comprising at least one of structured data and semi-structured data; converting the source file into a data edge independent claim 1, 10, 19).

 	The prior art of record is summarized below:
a)  Muthuswamy teaches fulfilment of a tiering policy, dividing, by a cloud provider engine of a computing device, data blocks of a filesystem object into data chunks. Some examples comprise generating, by the cloud provider engine, a current manifest file in a local memory and causing the cloud storage system to generate a current pseudo folder in the cloud storage system corresponding to a particular epoch of the filesystem object. Some other examples comprise tiering, by the cloud provider engine, the data chunks and the current manifest file to the current pseudo folder, the current manifest including pointers to the data chunks corresponding to the filesystem object at the particular epoch. 
b)  Strader et al teaches a manifest file associated with the digital content item, wherein the manifest file includes a location for at least one segment file associated with the 
c)  Prahlad et al teaches performing data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods are disclosed for content indexing data stored within a cloud environment to facilitate later searching, including collaborative searching. Methods are also disclosed for performing containerized deduplication to reduce the strain on a system namespace, effectuate cost savings, etc. Methods are disclosed for identifying suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy. Further, systems and methods for providing a cloud gateway and a scalable data object store within a cloud environment are disclosed, along with other features.
d) Gu et al teaches dynamically generating a targeted manifest file for use by a playback device to retrieve a video stream including targeted content, includes periodically receiving a manifest file that identifies a sequence of media files, and updating a master manifest file to identify the sequence of media files from each periodically received index file, such that the master manifest file identifies a continuous .

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
 	Any inquiry concerning this communication or earlier communications from the examiner should be directed to Mohammad A Sana whose telephone number is (571)270-1753.  The examiner can normally be reached on Monday-Friday 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark D Featherstone, can be reached at (571)270-3750.  The fax phone 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/AZAM M CHEEMA/Primary Examiner, Art Unit 2166