Results 1 to 9 of 9
  1. #1
    Join Date
    Aug 2003
    Posts
    13

    Unanswered: Issues about resutl set of a select statement

    Here is the problem I encountered.

    The pl/sql program does the following things

    1) I retrive eligible records from a huge table(about one hunderd million records) by bitmap index

    2) for every records retrieved, I do some processing and put it into some other tables by certain criteria.

    3) for every 100,000 records, I print out a message and do a commit;

    The constraint is that, I only have the read access to the huge table and every day I only have about 10 hours to access the table. After 10 hours. the database will shutdown for maintainance.

    The problem is that, the operation can not be completed within 10 hours. Due to a commit for every 100,000 records, there are records in every table. But, how can I resume the execution, that is, based on the inserted records, from the next day?

    I proposed two possible solutions. First, to add a sequence no to the huge table and do the processing in sequence no order. But this does not work since I can only read the table.

    Second is to order by come columns of the table, for example, date column. This seems to work. But ordering itself will need huge space and time.

    So I am thinking about that, is it ever possible for sql select statement to have some implicit ordering rules that I can utilize? For example, will the rowid be arranged in some rules?

    Anybody can hlep? Please. Thanks.

  2. #2
    Join Date
    Oct 2003
    Location
    Switzerland
    Posts
    140
    You might consider working with partitions. As a starter, here is the introduction about partitioning in the Oracle documentation

    <<<ORACLE DOCUMENTATION>>>
    Introduction to Partitioning
    =================
    Partitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller and more manageable pieces called partitions. SQL queries and DML statements do not need to be modified in order to access partitioned tables. However, once partitions are defined, DDL statements can access and manipulate individuals partitions rather than entire tables or indexes. This is how partitioning can simplify the manageability of large database objects. Also, partitioning is entirely transparent to applications.

    Partitioning offers these advantages:
    ========================
    Partitioning enables data management operations such data loads, index creation and rebuilding, and backup/recovery at the partition level, rather than on the entire table. This results in significantly reduced times for these operations.

    Partitioning improves query performance. In many cases, the results of a query can be achieved by accessing a subset of partitions, rather than the entire table. For some queries, this technique (called partition pruning) can provide order-of-magnitude gains in performance.

    Partitioning can significantly reduce the impact of scheduled downtime for maintenance operations.

    etc.
    <<<END ORACLE DOCUMENTATION>>>

  3. #3
    Join Date
    Oct 2003
    Posts
    87
    I've a problem with committing every 100,000 fetches. Isn't it important, data integrity-wise, that you're only partially completed. Is the data consistant after each 100,000 fetches, or doesn't it matter? Secondly, the reference table should have a unique index on it. You can use it to order your result set (yes, I know, you'll need hugh temp. files, but this is a cost of doing business).

    Edit: if data integrity isn't an issue, just select via a range predicate and spread the processing over serveral days.
    Last edited by N-ary; 11-12-03 at 09:59.
    Oracle - DB2 - MS Access -

  4. #4
    Join Date
    Aug 2003
    Posts
    13
    Hi, cvandemaele. Thank you for your constructive reply.

    Anyway, I also thought about it. But I even don't have insert right on the table, how could I ever possibly partition the table? And I can not see the table's partition situation, too.

    But if I could know the partition of the table, I can do as you suggested.

    Thanks.

  5. #5
    Join Date
    Aug 2003
    Posts
    13
    Hi, N-ary .

    In my problem, the table is always integrity since it is a backup table, no insert or update will be performed at all. So commit every 100,000 will do. So you suggestion should work, by several range predicated. But one thing that I should make sure first is how to dertimine the range so that the data will be distributed evenly.

    I am confused by your second suggestion. Why will I need an index on the ref table? I donn't need to order the result set. Is there any embedded knowledge? please advise.

    Thanks.




    Originally posted by N-ary
    I've a problem with committing every 100,000 fetches. Isn't it important, data integrity-wise, that you're only partially completed. Is the data consistant after each 100,000 fetches, or doesn't it matter? Secondly, the reference table should have a unique index on it. You can use it to order your result set (yes, I know, you'll need hugh temp. files, but this is a cost of doing business).

    Edit: if data integrity isn't an issue, just select via a range predicate and spread the processing over serveral days.

  6. #6
    Join Date
    Oct 2003
    Posts
    87
    Your source table probably has an index on it that is unique. If it does, you can use this index as your range predicate. You can select count(*) on some test ranges to see if the distribution is fairly even across the test ranges. EX: given a range of a-z and you wish to complete in two rounds, a-m, and n-z. Regards . . .

    Originally posted by crops
    Hi, N-ary .

    In my problem, the table is always integrity since it is a backup table, no insert or update will be performed at all. So commit every 100,000 will do. So you suggestion should work, by several range predicated. But one thing that I should make sure first is how to dertimine the range so that the data will be distributed evenly.

    I am confused by your second suggestion. Why will I need an index on the ref table? I donn't need to order the result set. Is there any embedded knowledge? please advise.

    Thanks.
    Oracle - DB2 - MS Access -

  7. #7
    Join Date
    Dec 2003
    Posts
    6

    Partitioned Table

    Originally posted by cvandemaele
    You might consider working with partitions. As a starter, here is the introduction about partitioning in the Oracle documentation

    <<<ORACLE DOCUMENTATION>>>
    Introduction to Partitioning
    =================
    Partitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller and more manageable pieces called partitions. SQL queries and DML statements do not need to be modified in order to access partitioned tables. However, once partitions are defined, DDL statements can access and manipulate individuals partitions rather than entire tables or indexes. This is how partitioning can simplify the manageability of large database objects. Also, partitioning is entirely transparent to applications.

    Partitioning offers these advantages:
    ========================
    Partitioning enables data management operations such data loads, index creation and rebuilding, and backup/recovery at the partition level, rather than on the entire table. This results in significantly reduced times for these operations.

    Partitioning improves query performance. In many cases, the results of a query can be achieved by accessing a subset of partitions, rather than the entire table. For some queries, this technique (called partition pruning) can provide order-of-magnitude gains in performance.

    Partitioning can significantly reduce the impact of scheduled downtime for maintenance operations.

    etc.
    <<<END ORACLE DOCUMENTATION>>>
    Hello,

    I am having one table with large number of records and I have made partitions on it. Now I would like to know while retrieving the records from the Partitioned table how can I see from which partition data has been retrieved.

    Thanks in Advance

  8. #8
    Join Date
    Oct 2003
    Posts
    87

    Re: Partitioned Table

    Use a CASE structure in your SQL to identify each row's partition by checking the range of the partitioning key.

    Originally posted by dhiru_dw
    Hello,

    I am having one table with large number of records and I have made partitions on it. Now I would like to know while retrieving the records from the Partitioned table how can I see from which partition data has been retrieved.

    Thanks in Advance
    Oracle - DB2 - MS Access -

  9. #9
    Join Date
    Dec 2003
    Location
    Noida, India.
    Posts
    171
    For huge volumne of data, it would be worth to try direct-loading. This will not require a frequent commit. Trade offs are:
    - rollback not available
    - destination table PK will have an impact when the source table provides duplicate values of PK columns.

    Looping through cursors hits performace very bad. If the processing to be done is something that can be done in SQL itself, try inserting using a direct path load with insert into select * from ...;

    Performance gain in phenominal.
    Oracle can do wonders !

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •