I have never heard the term before. I can only imagine that it means that the data is physically stored in some order, e.g. by doing "INSERT INTO T1 SELECT * FROM T2 ORDER BY X". If that was the case, there is no way to verify that the data is ordered other than by selecting it all - and even then, the order in which records are selected is not absolutely guaranteed (imagine parallel query).
Originally posted by MePreeti
I came across the following in one of the white papers.
"If the query returns less than 40 percent of the table rows in an ordered table or 7 percent of the rows in an unordered table , the query can be tuned to use an index in lieu of the full table scan"
Can you post a link to the white paper? I could hazard a guess at what that means - in fact I will! Suppose you have a query like this:
select * from emp where empno < 5000;
Now suppose there are 1000 employees, of which 300 match the criteria. And further suppose the records are stored 10 to a block, so we have 100 blocks.
1) A full table scan will obviously read 100 blocks.
2) If the emp table data is physically ordered by empno, then all the required data will probably reside in the first 30 blocks, more or less. So using the index we can avoid reading the ~70 blocks that have no data for our query.
3) If the emp table data is randomly ordered, then it is likely that the 300 records required will be scattered 3 per block over all 100 blocks. There is therefore no advantage in using the index to locate the blocks.
The 40% and 7% figures are pure "rules of thumb", do not take them literally.
Thanks. Well in that very article, in the section headed "Row re-sequencing", Don Burleson describes what he means by an ordered table, although he doesn't use that phrase there - and it is as I said above.
There is no table attribute ORDERED=YES/NO that you can look up, instead as he says you can determine how ordered the table is by looking at the user_indexes.clustering_factor column.
As he also says, ordering the data in a table has limited application: "This technique is useful only when the application selects multiple rows, when using index range scans, or if the application issues multiple requests for consecutive keys. Databases with random primary-key unique accesses won't benefit from row resequencing."
The bit about 40% and 7% is definitely a "rule of thumb", although it seems to be stated as a concrete fact in the article (the article clearly could have benefited from more rigorous editing!)