US20080263017A1 - System for unordered relational database retrieval returning distinct values - Google Patents

Publication number
US20080263017A1
Authority
US
United States
Prior art keywords
tuples
component
distinct
tuple
auxiliary
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/166,289
Inventor
Ian R. Finlay
Tony Wen Hsun Lai
Daniel C. ZILIO
Calisto Paul Zuzarte
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching

Abstract

The retrieval of distinct tuples in a relational database management system. In response to a request from a consumer process for distinct tuples in a relational database table matching defined criteria, a distinct operator component sequentially requests tuples from a source component. The source component accesses the database table and returns a tuple in the sequence to the distinct operator component. The distinct operator component passes each tuple in the sequence to an auxiliary logger. The auxiliary logger receives a tuple from the distinct operator component and determines if it is distinct from other previously received tuples in the sequence to verify its uniqueness to the distinct operator. Tuples that are verified as unique by the auxiliary logger are returned to the consumer process by the distinct operator upon verification.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • Under 35 USC §120, this application is a continuation application and claims the benefit of priority to U.S. patent application Ser. No. 10/188,569, filed Jul. 2, 2002, entitled “Method for Unordered Relational Database Retrieval Returning Distinct Values”, which claims the benefit of priority under 35 USC §119 to Canadian Application No. 2,353,015, filed Jul. 12, 2001, all of which is incorporated herein by reference.
  • FIELD OF THE INVENTION
  • The present invention generally relates to relational database management systems, and in particular to relational database retrieval constrained to return distinct values.
  • BACKGROUND OF THE INVENTION
  • In relational database management systems, typically relational queries are supported which may be constrained to return distinct tuples or rows. An example is the SQL keyword DISTINCT which, when used to qualify a query, ensures that there are no duplicate rows in the returned set of data satisfying the query.
  • In the prior art, such queries are implemented by the returned set of rows or tuples being calculated and then sorted. After the sort is carried out, the duplicate rows are discarded and the unique set of rows or tuples is returned.
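The sort-then-discard approach described above can be sketched as follows. This is only an illustration, assuming rows are modelled as comparable Python tuples; the function name `distinct_by_sort` and the sample rows are hypothetical and do not come from the application.

```python
from itertools import groupby

def distinct_by_sort(rows):
    """Prior-art style DISTINCT: read everything, sort, drop adjacent duplicates."""
    ordered = sorted(rows)                        # needs the complete input before it can return
    return [key for key, _ in groupby(ordered)]   # equal rows are adjacent after sorting

rows = [("b", 2), ("a", 1), ("b", 2), ("c", 3), ("a", 1)]
result = distinct_by_sort(rows)
```

In this sketch no row is available to the caller until the entire input has been read and sorted, and the result comes back in sort order rather than in the order the rows were retrieved.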
  • Where a relational database is used as a backend for a time sensitive application, such as a website, for example, the time needed to sort the resulting table before discarding the duplicate rows may result in user dissatisfaction. In addition, where the data is to be presented to the user in a previously established order, after duplicate filtering the resulting table must be reordered to reflect that previously established order.
  • It is therefore desirable to provide an implementation of the relational query that is constrained to return distinct or unique values but which is not subject to initial delays in presenting resulting rows to a user and in which the resulting table retains a previously established ordering.
  • SUMMARY OF THE INVENTION
  • According to an aspect of the present invention there is provided improved retrieval of distinct tuples or rows in a relational database management system.
  • According to another aspect of the present invention there is provided a method for sequentially providing a consumer process with a set of relational data including tuples matching defined criteria, the method including the steps of:
      • retrieving from a database table a tuple in a sequence of tuples, the tuple satisfying the defined criteria,
      • determining whether the tuple is unique in comparison to previously retrieved tuples in the sequence,
      • providing the consumer process with the tuple where the tuple is unique and discarding the tuple where the tuple is not unique, and
      • repeating the above steps until all tuples matching the defined criteria have been retrieved from the relational table.
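The method steps listed above can be sketched as a generator. This is a minimal sketch under stated assumptions: rows are hashable Python tuples, the defined criteria are passed in as a predicate, and a built-in `set` stands in for the uniqueness check. The names `distinct_stream`, `matches`, and `table` are hypothetical.

```python
from typing import Callable, Hashable, Iterable, Iterator, Tuple

Row = Tuple[Hashable, ...]

def distinct_stream(table: Iterable[Row], matches: Callable[[Row], bool]) -> Iterator[Row]:
    """Yield each matching row the first time it appears, in table order."""
    seen = set()                  # rows already provided to the consumer
    for row in table:             # retrieve the next tuple in the sequence
        if not matches(row):      # tuple does not satisfy the defined criteria
            continue
        if row in seen:           # not unique: discard
            continue
        seen.add(row)
        yield row                 # unique: provide to the consumer immediately

table = [("b", 2), ("a", 1), ("b", 2), ("c", 0), ("a", 1), ("d", 4)]
stream = distinct_stream(table, lambda row: row[1] > 0)
first = next(stream)              # obtained after reading a single row
rest = list(stream)
```

Because the generator yields as soon as a row is verified, the caller receives the first row without waiting for the rest of the table, and the rows keep their retrieval order.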
  • According to another aspect of the present invention there is provided a relational database management system including a distinct operator component, a source component, and an auxiliary logger component, the relational database management system supporting the provision of data from a defined table to a consumer process, the consumer process requesting data from the distinct operator component, the distinct operator component including:
      • means for sequentially requesting a set of tuples from the source component upon a request from the consumer process, and for accepting tuples returned from the source component,
      • means for sequentially passing the tuples in the set of tuples to the auxiliary logger component for uniqueness verification, and
      • means for passing only verified unique tuples to the consumer process,
  • the source component including means for accessing a tuple in the set of tuples from the defined table upon request from the distinct operator component and providing the tuple to the distinct operator component, and
  • the auxiliary component including means for sequentially receiving tuples in the set of tuples from the distinct component and means for determining if each sequentially received tuple is distinct from other previously returned tuples in the sequence to verify the uniqueness of each sequentially received tuple to the distinct operator component.
  • According to another aspect of the present invention there is provided the above relational database management system in which the means for determining if each sequentially received tuple is distinct includes a hash table to which each unique sequentially received tuple is added.
  • According to another aspect of the present invention there is provided the above relational database management system in which the means for determining if each sequentially received tuple is distinct includes a sorted data structure to which each unique sequentially received tuple is added.
  • According to another aspect of the present invention there is provided a computer program product including a computer usable medium tangibly embodying computer readable program code means for implementing the retrieval of distinct tuples in a relational database management system, the computer readable program code means including a distinct operator component, a source component, and an auxiliary logger component, the relational database management system supporting the provision of data from a defined table to a consumer process, the consumer process requesting data from the distinct operator component, the distinct operator component including:
      • code means for sequentially requesting a set of tuples from the source component upon a request from the consumer process, and for accepting tuples returned from the source component,
      • code means for sequentially passing the tuples in the set of tuples to the auxiliary logger component for uniqueness verification, and
      • code means for passing only verified unique tuples to the consumer process,
  • the source component including code means for accessing a tuple in the set of tuples from the defined table upon request from the distinct operator component and providing the tuple to the distinct operator component, and
  • the auxiliary component including code means for sequentially receiving tuples in the set of tuples from the distinct component and including means for determining if each sequentially received tuple is distinct from other previously returned tuples in the sequence to verify the uniqueness of each sequentially received tuple to the distinct operator component.
  • Advantages of the invention include the ability to provide tuples to a consuming process as they are verified for uniqueness and to provide the tuples in the sequence in which they are received from the database table.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • In the accompanying drawings, which illustrate the invention by way of example only,
  • FIG. 1 is a block diagram illustrating the preferred embodiment of the invention in accordance with one implementation;
  • FIG. 2A is an exemplary block diagram of a system in context with a relational database management system of the present invention in accordance with one implementation; and,
  • FIG. 2B is an exemplary block diagram of a relational database management system of the present invention in accordance with one implementation.
  • In the drawings, the preferred embodiment of the invention is illustrated by way of example. It is to be expressly understood that the description and drawings are only for the purpose of illustration and as an aid to understanding, and are not intended as a definition of the limits of the invention.
  • DETAILED DESCRIPTION
  • The present invention generally relates to relational database management systems, and in particular to relational database retrieval constrained to return distinct values. The following description is presented to enable one of ordinary skill in the art to make and use the invention and is provided in the context of a patent application and its requirements. Various modifications to the preferred embodiment and the generic principles and features described herein will be readily apparent to those skilled in the art. Thus, the present invention is not intended to be limited to the embodiment shown but is to be accorded the widest scope consistent with the principles and features described herein.
  • FIG. 1 shows, in a block diagram view, different processes 100 which implement the relational database query of the preferred embodiment. It will be appreciated by those skilled in the art that the preferred embodiment may be implemented in different relational database management systems (RDBMSs). The functionality of the different constituents shown in FIG. 1 may be achieved by other arrangements of computer components. For example, the description of the preferred embodiment refers to different processes. In other implementations the processes may be replaced by procedures. In either approach the system will include components to carry out the functions described below.
  • In FIG. 1, consumer 10 represents a process that initiates a data retrieval request (08) constrained to return distinct values (09). Consumer 10 makes a request at 08 for a set of tuples (rows) from a relational database table. Consumer 10 receives row data from the table at 09 in response to the request of 08. In this sense, consumer 10 is a consumer of the data. As will be appreciated, consumer 10 may be, for example, a process invoked by an SQL compiler in response to a user query or may be a process which is invoked as part of a more complex RDBMS operation.
  • In the preferred embodiment, distinct operator 12 is a component (here, a process) that carries out the steps to retrieve tuples from a defined table and to return those tuples to consumer 10 without duplicates in the returned set. Distinct operator 12 invokes source 14, which in the preferred embodiment is a process that returns single tuples from a relational table. Auxiliary logger 16 is a process that receives a tuple from distinct operator 12. Auxiliary logger 16 both records (logs) the tuple and indicates whether or not the tuple has previously been seen by auxiliary logger 16.
  • In the preferred embodiment, consumer 10 sends a request for tuples meeting a set of defined selection criteria (for example, matching a query predicate) to distinct operator 12. By using distinct operator 12, consumer 10 is seeking a set of tuples that contain no duplicate values. Distinct operator 12 sequentially processes the request for tuples using source 14. Source 14 responds to requests from distinct operator 12 by providing one tuple at a time to distinct operator 12. Distinct operator 12 handles tuples from source 14 by sending each tuple in the sequence to auxiliary logger 16. Auxiliary logger 16 returns a value to distinct operator 12 indicating whether the tuple value has been seen in the set of values retrieved from source 14. In effect, auxiliary logger 16 verifies the uniqueness (or distinctness) of the received tuple in comparison with previously received tuples in the sequence. It will be apparent to those skilled in the art how to initialize auxiliary logger 16 to delimit the sequence of tuples that are returned in response to the request from consumer 10.
  • Auxiliary logger 16 maintains a data structure to permit the identification of tuple values that have previously been obtained from source 14. One approach to implementing auxiliary logger 16 is for the process to maintain a sorted table into which unique tuples are stored. When a tuple is passed to auxiliary logger 16 that tuple will be added to the table if it is not already in the table. Where the tuple value is already in the table, auxiliary logger 16 returns a value to distinct operator 12 to indicate that the tuple value is not unique. Where auxiliary logger 16 determines that the tuple has a distinct or unique value (relative to those in the sequence), the process returns a value to distinct operator 12 to indicate the tuple is distinct (verifies uniqueness).
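The sorted-table variant of auxiliary logger 16 described above might look like the following sketch. It assumes tuples are mutually comparable and uses Python's `bisect` module on a plain list. The class name `SortedLogger` and its `log` method are hypothetical, and the application does not prescribe any particular sorted structure.

```python
from bisect import bisect_left

class SortedLogger:
    """Keeps the unique tuples seen so far in ascending order."""

    def __init__(self):
        self._table = []   # sorted list of unique tuples

    def log(self, row) -> bool:
        """Record row; return True if it is distinct so far, False if already logged."""
        pos = bisect_left(self._table, row)          # binary search for the row's slot
        if pos < len(self._table) and self._table[pos] == row:
            return False                             # already in the table: not unique
        self._table.insert(pos, row)                 # add it, keeping the table sorted
        return True

logger = SortedLogger()
verdicts = [logger.log(r) for r in [(3, "c"), (1, "a"), (3, "c"), (2, "b"), (1, "a")]]
```

The lookup is logarithmic, but inserting into a Python list is linear in the table size; a balanced tree or similar structure would be a reasonable substitute where that matters.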
  • In the case where distinct operator 12 passes a tuple value to auxiliary logger 16 and the responding value signifies that the tuple value has not already been retrieved from source 14 in the defined sequence, distinct operator 12 passes the tuple to consumer 10. Otherwise the tuple is ignored and not passed to consumer 10. In this manner consumer 10 receives a unique set of tuples.
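The interaction among consumer 10, distinct operator 12, source 14 and auxiliary logger 16 can be illustrated with three small classes. This is a sketch under assumptions, not the application's implementation: components are modelled as in-process objects rather than processes, `None` is used as an end-of-table marker (so rows themselves must not be `None`), and all class and method names are hypothetical.

```python
class Source:
    """Returns one tuple per request and counts how many it has handed out."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self.reads = 0

    def next_tuple(self):
        row = next(self._rows, None)   # None signals the end of the table
        if row is not None:
            self.reads += 1
        return row

class AuxiliaryLogger:
    """Logs each tuple and reports whether it had been seen before."""

    def __init__(self):
        self._seen = set()

    def is_new(self, row):
        if row in self._seen:
            return False
        self._seen.add(row)
        return True

class DistinctOperator:
    def __init__(self, source, logger):
        self._source = source
        self._logger = logger

    def next_distinct(self):
        """Pull from the source until an unseen tuple appears or the table ends."""
        while True:
            row = self._source.next_tuple()
            if row is None or self._logger.is_new(row):
                return row

# The consumer pulls distinct tuples one at a time.
source = Source([("x",), ("x",), ("y",), ("x",), ("z",)])
op = DistinctOperator(source, AuxiliaryLogger())
first = op.next_distinct()
reads_for_first = source.reads
remaining = []
while (row := op.next_distinct()) is not None:
    remaining.append(row)
```

In this sketch the consumer obtains its first tuple after the source has been asked for only one row, which is the early-delivery behaviour the description attributes to the approach.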
  • As may be seen from the above description, this approach to data retrieval from a relational database where distinct values are required permits tuples to be returned to the requesting process (consumer 10 in the preferred embodiment illustration of FIG. 1) without having to carry out a potentially slow sort of the entire set of retrieved tuples. The first tuple retrieved will be quickly passed to consumer 10 and it is expected that other tuples may be quickly checked by auxiliary logger 16 and passed to consumer 10 when they are determined to be distinct. This approach will provide the potential advantage of supplying data to consumer 10 early in the retrieval process. Where, for example, the data is retrieved for use in a web-page environment, the first display page of data may be more quickly determined than was the case in the prior art approach which required a sort of the entire retrieved set of tuples before any data was returned to consumer 10.
  • In addition, the data returned to consumer 10 will be maintained in the same sequence as source 14 accesses the data. This will be advantageous in applications where the sequencing of the retrieved data is important.
  • As will be appreciated, auxiliary logger 16 may be implemented using different data structures and methods to determine if a given tuple value has already been passed to auxiliary logger 16. The process may, for example, employ a hash table to check and enter new tuple values.
  • As will be further appreciated, although the preferred embodiment has been described with reference to distinct processes, the preferred embodiment may be implemented by processes which combine one or more of the functions in the processes shown in FIG. 1. For example, auxiliary logger 16 may be implemented as a part of distinct operator 12, not as a separate procedure or process.
  • In the preferred embodiment described above, source 14 returns a single tuple in response to a request from distinct operator 12. Certain optimized implementations of the preferred embodiment support source 14 returning multiple tuples to distinct operator 12 in response to a request. In this case distinct operator 12 may continue to pass returned tuples to auxiliary logger 16 on a tuple by tuple basis. Alternatively, distinct operator 12 may pass auxiliary logger 16 a set of tuples. In this latter implementation, auxiliary logger 16 will return a data structure corresponding to the set of tuples passed to it, to enable distinct operator 12 to determine which tuples in the set are to be returned to consumer 10. As will be appreciated, where the components in the preferred embodiment pass sets of tuples, the size of the set will affect the ability of the preferred embodiment to return tuples promptly to consumer 10. A set size limit is selected to ensure that this advantage of the invention is not minimized. Although a preferred embodiment of the invention has been described above, it will be appreciated by those skilled in the art that variations may be made, without departing from the spirit of the invention or the scope of the appended claims.
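The set-at-a-time variant described above, in which the logger returns a data structure corresponding to the batch it was given, can be sketched as follows. The sketch assumes hashable rows and represents the returned structure as a list of booleans, one per tuple; `log_batch`, `distinct_batched` and `batch_size` are hypothetical names.

```python
from itertools import islice

def log_batch(seen, batch):
    """Return one flag per tuple in batch: True if distinct so far, else False."""
    flags = []
    for row in batch:
        is_new = row not in seen
        if is_new:
            seen.add(row)          # also catches duplicates later in the same batch
        flags.append(is_new)
    return flags

def distinct_batched(rows, batch_size):
    """Yield lists of distinct rows, one list per batch fetched from the source."""
    it = iter(rows)
    seen = set()
    while True:
        batch = list(islice(it, batch_size))   # source returns up to batch_size tuples
        if not batch:
            return
        flags = log_batch(seen, batch)
        yield [row for row, keep in zip(batch, flags) if keep]

rows = [1, 2, 2, 3, 1, 4, 4, 5]
batches = list(distinct_batched(rows, batch_size=3))
```

A larger `batch_size` means fewer round trips between components but a longer wait before the first rows reach the consumer, which is the trade-off the set size limit is meant to bound.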
  • FIG. 2A is an exemplary block diagram of a system 201 in context with a relational database management system 205 of the present invention in accordance with one implementation. In the system 201, random access memory 202 and storage 203 are in communication with the relational database management system (RDBMS) 205, which is capable of receiving input 207 and sending output at 208 via 206 to external programs 204. Instructions for receiving and sending input and output, in addition to instructions for performing the present invention, are set forth at 209, which may be software, computer-readable instructions, or similar.
  • FIG. 2B is an exemplary block diagram of a relational database management system 200 of the present invention in accordance with one implementation. The RDBMS 200, in one implementation, includes a database 210 having tuples, a directory for interaction with the database 220, a data manipulation process at 230 such as assessing, evaluating or otherwise determining the uniqueness of the data retrieved and preparing to provide such via 231 for output at 240, or deleting such at 232, and an external program at 250. However, the present invention is not limited to this single exemplary RDBMS implementation.
  • Although the present invention has been described in accordance with the embodiments shown, one of ordinary skill in the art will readily recognize that there could be variations to the embodiments and those variations would be within the spirit and scope of the present invention. Accordingly, many modifications may be made by one of ordinary skill in the art without departing from the spirit and scope of the appended claims.
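The hash-table option for auxiliary logger 16 and the set-passing variant described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `AuxiliaryLogger`, `is_new`, `check_batch`, `distinct`, `distinct_batched` and `batch_size` are hypothetical, a Python `set` stands in for the hash table, and the per-set "data structure" returned by the logger is assumed to be a list of boolean flags.

```python
from typing import Hashable, Iterable, Iterator, List, Sequence, Tuple

Row = Tuple[Hashable, ...]  # assumption: a tuple of hashable column values


class AuxiliaryLogger:
    """Hypothetical stand-in for auxiliary logger 16, backed by a hash set."""

    def __init__(self) -> None:
        self._seen: set = set()

    def is_new(self, row: Row) -> bool:
        """Record the row and report whether this is its first occurrence."""
        if row in self._seen:
            return False
        self._seen.add(row)
        return True

    def check_batch(self, rows: Sequence[Row]) -> List[bool]:
        """Return one flag per row, in order; True marks a first occurrence."""
        return [self.is_new(row) for row in rows]


def distinct(source: Iterable[Row], logger: AuxiliaryLogger) -> Iterator[Row]:
    """Yield each distinct row as soon as it is read, in source order."""
    for row in source:
        if logger.is_new(row):
            yield row


def distinct_batched(
    source: Iterable[Row], logger: AuxiliaryLogger, batch_size: int = 2
) -> Iterator[Row]:
    """Same result as distinct(), but rows reach the logger in small sets."""
    batch: List[Row] = []
    for row in source:
        batch.append(row)
        if len(batch) == batch_size:
            flags = logger.check_batch(batch)
            yield from (r for r, keep in zip(batch, flags) if keep)
            batch = []
    if batch:  # flush the final, possibly short, set
        flags = logger.check_batch(batch)
        yield from (r for r, keep in zip(batch, flags) if keep)


if __name__ == "__main__":
    rows = [("b", 1), ("a", 2), ("b", 1), ("c", 3), ("a", 2)]
    print(list(distinct(rows, AuxiliaryLogger())))
    print(list(distinct_batched(rows, AuxiliaryLogger(), batch_size=2)))
```

Because both functions are generators, the consumer receives the first distinct tuple before the source has been fully read, and no sort is needed. In the batched variant a larger `batch_size` delays the first returned tuple, which is why the description calls for a limit on set size.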

Claims (9)

1. A relational database management system comprising a source component;
an auxiliary logger component, and a distinct operator, the relational database management system supporting the provision of data from a defined table to a consumer process, the consumer process requesting data from the distinct operator component;
the distinct operator component for sequentially requesting a set of tuples from the source component upon a request from the consumer process, and for accepting tuples returned from the source component, for sequentially passing the tuples in the set of tuples to the auxiliary logger component for uniqueness verification, and for passing only verified unique tuples to the consumer process, the source component for accessing a tuple, in the set of tuples from the defined table upon request from the distinct operator component and providing the tuple to the distinct operator component, and the auxiliary logger component for sequentially receiving tuples in the set of tuples from the distinct operator component and determining using an auxiliary component if each sequentially received tuple is distinct from other previously returned tuples in the sequence to verify the uniqueness of each sequentially received tuple to the distinct operator component and indicating whether the retrieved tuple has been previously received by the auxiliary component.
2. The relational database management system of claim 1 in which the auxiliary component includes a hash table to which each unique sequentially received tuple is added.
3. The relational database management system of claim 1 in which the auxiliary component includes a sorted data structure to which each unique sequentially received tuple is added.
4. A computer readable medium containing program instructions for the retrieval of distinct tuples in a relational database management system, the program instructions for providing a distinct operator component,
providing a source component, and
providing an auxiliary logger component, wherein the relational database management system supports the provision of data from a defined table to a consumer process, wherein the consumer process requests data from the distinct operator component,
wherein the instruction for the distinct operator component further includes instructions for:
sequentially requesting a set of tuples from the source component upon a request from the consumer process, and for accepting tuples returned from the source component sequentially passing the tuples in the set of tuples to the auxiliary logger component for uniqueness verification, and
passing only verified unique tuples to the consumer process,
the source component comprising code means for accessing a tuple in the set of tuples from the defined table upon request from the distinct operator component and providing the tuple to the distinct operator component, and
wherein the instructions for the auxiliary logger component further includes instructions for sequentially receiving tuples in the set of tuples from the distinct component and for determining using an auxiliary component if each sequentially received tuple is distinct from other previously returned tuples in the sequence to verify the uniqueness of each sequentially received tuple to the distinct operator component and indicating whether the retrieved tuple has been previously received by the auxiliary component.
5. The computer readable medium of claim 4 in which the instructions for determining if each sequentially received tuple is distinct includes instructions for maintaining a hash table, for verifying uniqueness of tuples using the hash table and for adding each unique sequentially received tuple to the hash table.
6. The computer readable medium of claim 4 in which the instructions for determining if each sequentially received tuple is distinct further includes instructions for maintaining a sorted data structure, for verifying uniqueness of tuples using the sorted data structure and for adding each unique sequentially received tuple to the sorted data structure.
7. A computer readable medium containing program instructions for the retrieval of distinct tuples in a relational database management system, the program instructions for providing a distinct operator component,
providing a source component, and
providing an auxiliary logger component, wherein the relational database management system supports the provision of data from a defined table to a consumer process, wherein the consumer process requests data from the distinct operator component, wherein the instruction for the distinct operator component further includes instructions for:
sequentially requesting a set of tuples from the source component upon a request from the consumer process,
sequentially passing one or more tuples in the set of tuples to the auxiliary logger component for uniqueness verification, and
passing only verified unique tuples to the consumer process,
the source component comprising code means for accessing one or more tuples in the set of tuples from the defined table upon request from the distinct operator component and returning a subset of tuples to the distinct operator component, and
wherein the instructions for the auxiliary logger component further including instructions for sequentially receiving a verification set of tuples in the set of tuples from the distinct component and for determining using an auxiliary component if each sequentially received tuple is distinct from other previously returned tuples in the sequence to verify the uniqueness of each sequentially received tuple to the distinct operator component and indicating whether the retrieved tuple has been previously received by the auxiliary component.
8. The computer readable medium of claim 7 wherein the instructions for determining if each sequentially received tuple is distinct includes instructions for maintaining a hash table and for verifying uniqueness of tuples using the hash table and for adding each unique sequentially received tuple to the hash table.
9. The computer readable medium of claim 7 in which the instructions for determining if each sequentially received tuple is distinct includes instructions for maintaining a sorted data structure and for verifying uniqueness of tuples using the sorted data structure and for adding each unique sequentially received tuple to the sorted data structure.
US12/166,289 2001-07-12 2008-07-01 System for unordered relational database retrieval returning distinct values Abandoned US20080263017A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/166,289 US20080263017A1 (en) 2001-07-12 2008-07-01 System for unordered relational database retrieval returning distinct values

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CA2353015 2001-07-12
CA002353015A CA2353015A1 (en) 2001-07-12 2001-07-12 Unordered relational database retrieval returning distinct values
US10/188,569 US7752160B2 (en) 2001-07-12 2002-07-02 Method for unordered relational database retrieval returning distinct values
US12/166,289 US20080263017A1 (en) 2001-07-12 2008-07-01 System for unordered relational database retrieval returning distinct values

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US10/188,569 Continuation US7752160B2 (en) 2001-07-12 2002-07-02 Method for unordered relational database retrieval returning distinct values

Publications (1)

Publication Number Publication Date
US20080263017A1 true US20080263017A1 (en) 2008-10-23

Family

ID=4169458

Family Applications (2)

Application Number Title Priority Date Filing Date
US10/188,569 Expired - Lifetime US7752160B2 (en) 2001-07-12 2002-07-02 Method for unordered relational database retrieval returning distinct values
US12/166,289 Abandoned US20080263017A1 (en) 2001-07-12 2008-07-01 System for unordered relational database retrieval returning distinct values

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US10/188,569 Expired - Lifetime US7752160B2 (en) 2001-07-12 2002-07-02 Method for unordered relational database retrieval returning distinct values

Country Status (2)

Country Link
US (2) US7752160B2 (en)
CA (1) CA2353015A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9613219B2 (en) 2011-11-10 2017-04-04 Blackberry Limited Managing cross perimeter access
CN110049106A (en) * 2019-03-22 2019-07-23 口碑(上海)信息技术有限公司 Service request processing system and method

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2353015A1 (en) * 2001-07-12 2003-01-12 Ibm Canada Limited-Ibm Canada Limitee Unordered relational database retrieval returning distinct values
US8984301B2 (en) * 2008-06-19 2015-03-17 International Business Machines Corporation Efficient identification of entire row uniqueness in relational databases
EP2992447A4 (en) * 2013-04-30 2016-09-21 Hewlett Packard Entpr Dev Lp Database table column annotation

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4201046A (en) * 1977-12-27 1980-05-06 United Technologies Corporation Burner nozzle assembly for gas turbine engine
US5557788A (en) * 1993-03-27 1996-09-17 Nec Corporation Relational access system for network type data bases which uses a unique declarative statement
US5615361A (en) * 1995-02-07 1997-03-25 International Business Machines Corporation Exploitation of uniqueness properties using a 1-tuple condition for the optimization of SQL queries
US5659728A (en) * 1994-12-30 1997-08-19 International Business Machines Corporation System and method for generating uniqueness information for optimizing an SQL query
US5689697A (en) * 1994-06-27 1997-11-18 International Business Machines Corporation System and method for asynchronous database command processing
US5724070A (en) * 1995-11-20 1998-03-03 Microsoft Corporation Common digital representation of still images for data transfer with both slow and fast data transfer rates
US5764973A (en) * 1994-02-08 1998-06-09 Enterworks.Com, Inc. System for generating structured query language statements and integrating legacy systems
US5822748A (en) * 1997-02-28 1998-10-13 Oracle Corporation Group by and distinct sort elimination using cost-based optimization
US5842224A (en) * 1989-06-16 1998-11-24 Fenner; Peter R. Method and apparatus for source filtering data packets between networks of differing media
US5860070A (en) * 1996-05-31 1999-01-12 Oracle Corporation Method and apparatus of enforcing uniqueness of a key value for a row in a data table
US5903887A (en) * 1997-09-15 1999-05-11 International Business Machines Corporation Method and apparatus for caching result sets from queries to a remote database in a heterogeneous database system
US5995959A (en) * 1997-01-24 1999-11-30 The Board Of Regents Of The University Of Washington Method and system for network information access
US20030078923A1 (en) * 2001-05-31 2003-04-24 Douglas Voss Generalized method for modeling complex ordered check constraints in a relational database system
US6801906B1 (en) * 2000-01-11 2004-10-05 International Business Machines Corporation Method and apparatus for finding information on the internet
US6907414B1 (en) * 2000-12-22 2005-06-14 Trilogy Development Group, Inc. Hierarchical interface to attribute based database
US7752160B2 (en) * 2001-07-12 2010-07-06 International Business Machines Corporation Method for unordered relational database retrieval returning distinct values

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5201046A (en) * 1990-06-22 1993-04-06 Xidak, Inc. Relational database management system and method for storing, retrieving and modifying directed graph data structures
US5937401A (en) * 1996-11-27 1999-08-10 Sybase, Inc. Database system with improved methods for filtering duplicates from a tuple stream
US6788316B1 (en) * 2000-09-18 2004-09-07 International Business Machines Corporation Method of designating multiple hypertext links to be sequentially viewed

Patent Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4201046A (en) * 1977-12-27 1980-05-06 United Technologies Corporation Burner nozzle assembly for gas turbine engine
US5842224A (en) * 1989-06-16 1998-11-24 Fenner; Peter R. Method and apparatus for source filtering data packets between networks of differing media
US5557788A (en) * 1993-03-27 1996-09-17 Nec Corporation Relational access system for network type data bases which uses a unique declarative statement
US5764973A (en) * 1994-02-08 1998-06-09 Enterworks.Com, Inc. System for generating structured query language statements and integrating legacy systems
US5689697A (en) * 1994-06-27 1997-11-18 International Business Machines Corporation System and method for asynchronous database command processing
US5659728A (en) * 1994-12-30 1997-08-19 International Business Machines Corporation System and method for generating uniqueness information for optimizing an SQL query
US5696960A (en) * 1994-12-30 1997-12-09 International Business Machines Corporation Computer program product for enabling a computer to generate uniqueness information for optimizing an SQL query
US5615361A (en) * 1995-02-07 1997-03-25 International Business Machines Corporation Exploitation of uniqueness properties using a 1-tuple condition for the optimization of SQL queries
US5724070A (en) * 1995-11-20 1998-03-03 Microsoft Corporation Common digital representation of still images for data transfer with both slow and fast data transfer rates
US5860070A (en) * 1996-05-31 1999-01-12 Oracle Corporation Method and apparatus of enforcing uniqueness of a key value for a row in a data table
US5995959A (en) * 1997-01-24 1999-11-30 The Board Of Regents Of The University Of Washington Method and system for network information access
US5822748A (en) * 1997-02-28 1998-10-13 Oracle Corporation Group by and distinct sort elimination using cost-based optimization
US5974408A (en) * 1997-02-28 1999-10-26 Oracle Corporation Method and apparatus for executing a query that specifies a sort plus operation
US5903887A (en) * 1997-09-15 1999-05-11 International Business Machines Corporation Method and apparatus for caching result sets from queries to a remote database in a heterogeneous database system
US6801906B1 (en) * 2000-01-11 2004-10-05 International Business Machines Corporation Method and apparatus for finding information on the internet
US6907414B1 (en) * 2000-12-22 2005-06-14 Trilogy Development Group, Inc. Hierarchical interface to attribute based database
US20030078923A1 (en) * 2001-05-31 2003-04-24 Douglas Voss Generalized method for modeling complex ordered check constraints in a relational database system
US7752160B2 (en) * 2001-07-12 2010-07-06 International Business Machines Corporation Method for unordered relational database retrieval returning distinct values

Also Published As

Publication number Publication date
US7752160B2 (en) 2010-07-06
CA2353015A1 (en) 2003-01-12
US20030014390A1 (en) 2003-01-16

Similar Documents

Publication Publication Date Title
US7158996B2 (en) Method, system, and program for managing database operations with respect to a database table
KR100971863B1 (en) System and method for batched indexing of network documents
US10152513B2 (en) Managing record location lookup caching in a relational database
US7111025B2 (en) Information retrieval system and method using index ANDing for improving performance
US7146365B2 (en) Method, system, and program for optimizing database query execution
US7756889B2 (en) Partitioning of nested tables
US10120899B2 (en) Selective materialized view refresh
US8868595B2 (en) Enhanced control to users to populate a cache in a database system
US20130117255A1 (en) Accessing a dimensional data model when processing a query
EP2843567B1 (en) Computer-implemented method for improving query execution in relational databases normalized at level 4 and above
EP1643384A2 (en) Query forced indexing
US7680821B2 (en) Method and system for index sampled tablescan
US20070239673A1 (en) Removing nodes from a query tree based on a result set
JP2012069152A (en) Method and recording medium for narrowing down searches using index keys
US20080263017A1 (en) System for unordered relational database retrieval returning distinct values
US8229940B2 (en) Query predicate generator to construct a database query predicate from received query conditions
US7376646B2 (en) Cost-based subquery correlation and decorrelation
CN113094387A (en) Data query method and device, electronic equipment and machine-readable storage medium
US6856996B2 (en) Method, system, and program for accessing rows in one or more tables satisfying a search criteria
EP4150484A1 (en) Efficient indexing for querying arrays in databases
US7925617B2 (en) Efficiency in processing queries directed to static data sets
US10313438B1 (en) Partitioned key-value store with one-sided communications for secondary global key lookup by range-knowledgeable clients
US20060184499A1 (en) Data search system and method
JPH03208143A (en) Distributed data base processor
JPH11259513A (en) Data base system

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION