Coppereye Greenwich Retrieval Server (Greenwich RS) is a high-performance, multi-threaded data indexing and query engine. It was developed to give users rapid access to flat data files through Coppereye's patented indexing technology. Greenwich RS indexes large quantities of data efficiently and inexpensively, while also allowing the information to be retrieved quickly and easily.
Coppereye Greenwich RS can process flat files in a number of formats, including the typical log files that emanate from IP systems, CSV files and more complex ASN.1-type files, without requiring an ETL process. In addition, an SDK is provided to support the most sophisticated file formats.
Source data files can be fed from anywhere on the network and are stored securely in their original form. The original source data files are retained and compressed as read-only files. The load process is extremely fast, so source data can be made available almost immediately.
Data files in original format
Source data files are fed from the network or a mediation device and stored securely in their original form on Greenwich RS servers. Greenwich RS is not a database, in the sense that there is no requirement to load the data from the source files into internal proprietary data storage. Instead, the original source data files are retained and compressed as read-only files, so the data cannot be altered by inappropriate or information-degrading transformations during loading. This also means that every data attribute in the source data remains available for future system requirements, for example when additional data fields from the same data files need to be retained.
Whenever a query must return data fields that are not stored in an index, that data is taken directly from the source data files.
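The relationship between an index and the retained source files can be illustrated with a minimal sketch. It assumes a plain, uncompressed delimited file and a simple in-memory index of byte offsets; the function names and sample data are hypothetical, and the sketch does not represent Coppereye's indexing technology or the Greenwich RS file handling.

```python
import io


def build_offset_index(source, key_column, delimiter=","):
    """Map each value of one column to the byte offsets of the records holding it."""
    index = {}
    offset = source.tell()
    # readline() is used instead of iteration so that tell() stays accurate.
    for line in iter(source.readline, b""):
        fields = line.rstrip(b"\r\n").decode().split(delimiter)
        index.setdefault(fields[key_column], []).append(offset)
        offset = source.tell()
    return index


def fetch_records(source, index, key, delimiter=","):
    """Find the key in the index, then read the full records from the source file."""
    records = []
    for offset in index.get(key, []):
        source.seek(offset)
        line = source.readline().rstrip(b"\r\n").decode()
        records.append(line.split(delimiter))
    return records


# Hypothetical sample data standing in for an original source file.
SAMPLE = b"alice,10.0.0.1,200\nbob,10.0.0.2,404\nalice,10.0.0.3,500\n"

source = io.BytesIO(SAMPLE)
index = build_offset_index(source, key_column=0)  # only column 0 is indexed
rows = fetch_records(source, index, "alice")      # other columns come from the file
```

Only the first column is held in the index; the address and status fields of each matching record are read back from the original data, which is why no second copy of the data is needed.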
Additionally, in some cases the source files must be readily available to certify that the retained data is the genuine source data. Because Greenwich RS takes the data directly from the original data files, meeting this requirement adds nothing to the storage footprint. A traditional database approach would require the original data files to be kept alongside the DBMS's proprietary data files, effectively doubling the volume of stored data.
High performance of indexing and querying
Greenwich RS has been developed to run on modern commodity hardware, without the need for high-end storage units or servers. A typical Greenwich RS installation resides on 2-to-4 rack unit commodity servers with SATA disks, yet is able to ingest billions of records per day and answer queries at a rate of hundreds of thousands of records per minute.
Subset of ANSI SQL-92
Greenwich RS supports SQL query access: data is queried by selecting results from the virtual tables configured in the server. There is no proprietary query language to learn. A query that can be executed against any RDBMS can be run on Greenwich RS with little or no modification.
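As an illustration of the kind of plain SQL involved, the sketch below runs a simple SELECT using Python's built-in sqlite3 module purely as a stand-in engine. The table name, column names and sample rows are hypothetical, and exactly which parts of SQL-92 the Greenwich RS subset covers is not stated here, so the query is deliberately kept to a basic SELECT with a WHERE clause.

```python
import sqlite3

# sqlite3 is only a stand-in engine; call_records and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE call_records ("
    "caller TEXT, callee TEXT, start_time TEXT, duration_s INTEGER)"
)
conn.executemany(
    "INSERT INTO call_records VALUES (?, ?, ?, ?)",
    [
        ("441110001", "441110002", "2011-01-01 09:00:00", 60),
        ("441110003", "441110001", "2011-01-01 09:05:00", 120),
        ("441110001", "441110004", "2011-01-02 10:00:00", 30),
    ],
)

# A plain SQL query with no vendor-specific syntax.
QUERY = """
SELECT callee, start_time, duration_s
FROM call_records
WHERE caller = '441110001'
  AND start_time >= '2011-01-02 00:00:00'
"""

result = conn.execute(QUERY).fetchall()
conn.close()
```

The query text itself contains nothing specific to any one engine, which is the property that lets existing SQL be reused with little or no modification.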
To simplify integration with any SQL client, enterprise application or custom purpose-built software, Greenwich RS is shipped with both ODBC and JDBC drivers, supporting a plug-and-play approach to integration.
Incoming data is parsed by Greenwich RS by means of Cartridges, which are plugins that can be engineered to support any custom data file format. Greenwich RS is shipped with a pre-built, highly configurable ASCII Cartridge, which allows parsing of virtually any text-based file format to be set up within minutes. A Cartridge SDK is provided to enable users to support arbitrary data file formats.
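The idea of configuration-driven parsing of text records can be sketched as follows. This is not the Cartridge SDK or the ASCII Cartridge configuration format, neither of which is described here; the TextFormat class, the pattern and the sample log line are all hypothetical and only show how a single declarative pattern can turn a text record into named fields.

```python
import re
from dataclasses import dataclass


@dataclass
class TextFormat:
    """Hypothetical parsing configuration: one regex with named groups per record."""

    pattern: str

    def parse(self, line):
        """Return a dict of named fields, or None if the line does not match."""
        match = re.match(self.pattern, line)
        return match.groupdict() if match else None


# Hypothetical configuration for a web-server style access log line.
ACCESS_LOG = TextFormat(
    pattern=(
        r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
        r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
    )
)

LINE = '10.0.0.1 - - [01/Jan/2011:09:00:00 +0000] "GET /index.html HTTP/1.1" 200 512'
record = ACCESS_LOG.parse(LINE)
```

Supporting a different text format in this sketch means supplying a different pattern rather than writing new parsing code.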
On a commodity server running RHEL 6 with 15 disks in a RAID 6 set, Greenwich RS performance is as follows:
- Querying tens of thousands of records per minute from tens of terabytes of data
- Indexing one hundred million records per hour
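For a sense of scale, the indexing figure can be converted to other time units. The short calculation below assumes the hourly rate is sustained around the clock, which is an assumption made only for this illustration.

```python
# Indexing rate quoted above, assumed here to be sustained continuously.
RECORDS_PER_HOUR = 100_000_000

records_per_day = RECORDS_PER_HOUR * 24        # 2,400,000,000 records
records_per_second = RECORDS_PER_HOUR / 3600   # about 27,778 records

print(f"{records_per_day:,} records per day")
print(f"{records_per_second:,.0f} records per second")
```

Under that assumption, one hundred million records per hour corresponds to 2.4 billion records per day on a single server.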
Coppereye's patented indexing technology provides functionality similar to that of any other indexing technology, but delivers a performance improvement of two orders of magnitude. This is achieved by enabling sequential writes to the indexing structure, which dramatically reduces I/O, to 1/100th of the I/O required by conventional B-tree and hashed indexing.
- Patents filed across 11 independent aspects
- Over 100 filings worldwide
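The general benefit of writing an index sequentially can be illustrated with a generic technique. The sketch below is a simple log-structured index built from sorted runs, similar in spirit to a log-structured merge approach; it is not Coppereye's patented method, whose details are not given here, and the class name, buffer size and sample keys are hypothetical.

```python
import bisect


class AppendOnlyIndex:
    """Buffers entries in memory and flushes them as sorted runs that are only appended."""

    def __init__(self, buffer_limit=4):
        self.buffer_limit = buffer_limit
        self.buffer = []            # recent (key, offset) pairs, not yet written
        self.runs = []              # each run is written once and never modified
        self.sequential_writes = 0  # number of batched, append-only writes

    def add(self, key, offset):
        self.buffer.append((key, offset))
        if len(self.buffer) >= self.buffer_limit:
            self.flush()

    def flush(self):
        """Sort the buffered entries and append them as one new run."""
        if self.buffer:
            self.runs.append(sorted(self.buffer))
            self.buffer = []
            self.sequential_writes += 1

    def lookup(self, key):
        """Collect offsets for a key from the buffer and from every sorted run."""
        hits = [off for k, off in self.buffer if k == key]
        for run in self.runs:
            # (key,) sorts before every (key, offset), so this finds the first match.
            i = bisect.bisect_left(run, (key,))
            while i < len(run) and run[i][0] == key:
                hits.append(run[i][1])
                i += 1
        return sorted(hits)


index = AppendOnlyIndex(buffer_limit=3)
for position, key in enumerate(["d", "a", "c", "a", "b", "a", "d"]):
    index.add(key, position * 100)

offsets_for_a = index.lookup("a")
offsets_for_d = index.lookup("d")
```

In this sketch seven insertions result in two batched appends rather than seven in-place updates scattered across the structure. The trade-off of this particular technique is that a lookup has to consult several runs.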