Results 1 to 12 of 12
  1. #1
    Join Date
    Aug 2009
    Posts
    4

    Unanswered: worker threads not spawned when index is used on a table with partitions

    Hi,

    Why does a search using index on a table with partitions does not spawn one thread per partition?
    When no index is used, one thread per partition is invoked.

    Can anyone explain this?

    Thx.
    Last edited by shanti; 08-27-09 at 10:49.

  2. #2
    Join Date
    Mar 2009
    Location
    Sydney, Australia
    Posts
    258
    Sure. The operation is quite different for ASE 12.5.4 vs 15.0.x. Due to completely rewritten 15.0 Optimiser, the operaton is often (unexpectedly) the opposite of what you were used to in 12.5.4. Essentially Sybase have removed parallelism in 15.0 unless you get the Semantic Partitioning licence. If you provide more details, I can provide more specific answers.

    If the context of your question is not a 15.0 migration, then the answer is less complex.
    Regards
    Derek Asirvadem (Formerly DerekA)
    Information Architect / Senior Sybase DBA
    Copyright 2009 Software Gems Pty Ltd

    I answer questions from the Original Poster only. If you have a genuine question, as the moderators have requested, start a new thread.

    http://www.softwaregems.com.au

  3. #3
    Join Date
    Mar 2009
    Location
    Sydney, Australia
    Posts
    258
    Actually, the issue needs to be discussed. It could be anything from "that's the way it works" [you need to understand why] to "12.5 and 15.0 parallelism are quite different" [and you need to know the differences]. Hence the request to post more details, the context, maybe the code.

    It could be as simple as: where ASE scans the table, one thread per partition is [correctly] invoked; where it scans an index, since indices are not partitioned, only one thread is required.

    In 12.5 indices are not partitioned. That is the default for 15.0 as well. To partition an index in 15.0, you need to create it as LOCAL. But there are many considerations and consequences, so I would not advise simply partitioning an index just to ensure it can be scanned in parallel; the results could be disastrous in other code segments and in maintenance.

    Now if you have a fat-wide index, that is a different problem.
    Regards
    Derek Asirvadem (Formerly DerekA)
    Information Architect / Senior Sybase DBA
    Copyright 2009 Software Gems Pty Ltd

    I answer questions from the Original Poster only. If you have a genuine question, as the moderators have requested, start a new thread.

    http://www.softwaregems.com.au

  4. #4
    Join Date
    Aug 2009
    Posts
    4
    I am refering to 12.5 version.
    Given your explanation, it does makes sense.
    But as per all the Sybase manuals I read on internet, search on indices should use one thread per partition. It will be great help, if you could forward me any related article.

    Thx.

  5. #5
    Join Date
    Mar 2009
    Location
    Sydney, Australia
    Posts
    258
    Serious Articles
    Listen. I am a consultant running a small specialist software house. We have been Sybase for 19 years and Sybase BSA Partners fo 13 years. Performance, all aspects, and how it comes about, has been my particular specialisation. When I am not too busy I provide a free service by attending newsgroups. Just google my name. Of course I write papers, but they are for paying customers, and I cannot legally turn around and provide those papers free to the public. What you are getting is considered advice from an experienced (in that exact area) person.

    Shallow Articles
    The web is full of rubbish. You get one of:
    a. websites that answer low-level questions by people who themselves do not have much expertise, but have more than the questioner (nothing wrong with that; just do not expect high-level expertise to be freely available)
    b. Sybase manuals. They were good to 11.9.2; with 12.0 they were handed over from Engineering to Tech Writing, and the technical quality went through the floor. They also started introducing contradictions, sometimes on the same page! But to be fair, they are reference documents, not entirely descriptive in "how it works".
    c. There are alternate shallow doco and books, but they are merely pro-Sybase (all the marketing hype for a feature) without getting into the nitty-gritty (the limitations or disadvantages).
    That's why people like me continue to have a work, we can fix in one day what certified Sybase experts cannot fix in three months. That's what our value is; if I implement a partitioned table (for the 100th time), I know all the gotchas and the don't-bothers to avoid, I know what config to set up for your particular usage. I do not know of anyone else who has a framework to use Sybase within the limits of its parallelism; everyone trying it finds out the hard way, by real experience, and is limited to the context they have (it does not produce an overall framework).

    Back to the Technical
    Think about this: each index is a singular storage unit [right ?] and the data is a separate single storage unit [a Heap], except for APL/Clustered Index, where the two are married into one storage unit. If the heap is partitioned as say 16, great you now have 16 storage units for the heap. ASE can send 16 workers down the partitions for [not all but] almost anything, but for the index ... that depends on what you are doing. If you are inserting/deleting (changing the actual non-leaf level; allocating more index pages; splitting index pages; adding an index level; etc), it has to be done as an atomic unit. Therefore one worker. Now multiply that by the no of indices on that table.

    Life is different for APL/CI, but I won't go into a discussion of that.

    Last, indices (non-leaf level) are small, and should be kept few and small by design, even on very large tables; therefore normally one worker changing the index is not a serious impediment. But people these days implement less and less quality and standards in their database design; if they have created a monster index (out of ignorance of the consequences of their fat-wide variable columns), then the normally tiny index which is now a sumo wrestler becomes the problem itself. And you will be seeking parallelism to wade through it.

    Last month I fixed a customer table, the heap was 20gb and the indices were 12.3gb (not counting text of course). All the justification is irrelevant, it was simply wrong. After making the changes to the causative issues, the indices are now 0.8gb; you can image the speed improvement ... but it will take some time for them to change their code.

    The progression is, in 15.0 they provide Local Indices. But that is a first cut and does not go all the way, even that has serious limitations. I think 16 will be a good release.

    Feel free to trawl the web and find better information. I know of none. Alternately, try different config options and parallel configuration, and post questions.
    Regards
    Derek Asirvadem (Formerly DerekA)
    Information Architect / Senior Sybase DBA
    Copyright 2009 Software Gems Pty Ltd

    I answer questions from the Original Poster only. If you have a genuine question, as the moderators have requested, start a new thread.

    http://www.softwaregems.com.au

  6. #6
    Join Date
    Aug 2009
    Posts
    4
    I am not questioning your expertise. Its only that I could not find what you were suggesting in Sybase manuals.

    I am not a database expert so its hard to makeout what is correct or incorrect on internet.

    Thx.

  7. #7
    Join Date
    Mar 2009
    Location
    Sydney, Australia
    Posts
    258
    I didn't take it personally. But I had to explain the difference between the 99% of rubbish you get free; the porn etc; and the message you need to take notice of. You asked a serious question. I am in that 99th percentile. Use me while you can.

    You will not find the following in Sybase or Oracle or MS manuals:
    - how to implement your object in it
    - the configuration to set
    - for whatever purpose
    - how to integrate every other object to it
    - the value of standards
    No, they cannot be expected to provide that. They are Reference manuals for software product, not education on Normalisation or implementation guidelines for databases. You're supposed to know that.

    You've done a good job, in the sense that you have set up parallelism, and got it running. But no one told you that parallelism is the platform; the partitions are the pillars; the indices are the stairs to the platform. They all go together, you need to change their configuration in concert with each other (as well as the sp_configure values).

    Parallelism, or speed on an ordinary table, suffers when:
    - there are variable length columns in the table
    - heavily, when variable length columns are used in an index
    Which is one reason that I never use them (I cannot figure out for the life of me why some customers use them, but that's another story). If you are intending to use the column to support an index, or the table is a candidate for parallelism, then fixed length is demanded. You won't find that on the web. You will only find that is fast databases. You won't find that in slow ones. There is noting you can learn from a slow database, or from the 99 percent out there. research, yes, but learn , no.

    It is not so much that the web is correct or incorrect; it is about what you can expect from it. Anybody with a keyboard and two fingers can post anything they want. I do not build databases from the one-liners that people post. In some uneducated countries, they think wiki is an "encyclopedia"; that only happens because they have not been introduced to the real thing in school.

    In these days, with the level of actual expertise in the I.T. business (it is no longer an industry), you get all kinds of people believing the most ridiculous things; some even do those things. I do not have a problem with that, so far, but when they start posting about it, that's when we part company, and I take issue. The point is, all of us think: if it posted it must be true. We need to appreciate that: if it is posted, it is a low-level "truth" for consumption by the masses.

    The educated are not tricked into using it, let alone believing in it; they use encyclopedia; they were taught knowledge and reason, not copying from books. Learn how to identify the live creature amongst the flotsam and jetsam, that 99th percentile, and watch them like a hawk.

    I am not a database expert so its hard to makeout what is correct or incorrect on internet.
    And that is why one should never use the web for the purpose of learning. Not knowing is correct/incorrect is a serious hindrance to progress. Worse still, to find out after implementation, that the info posted, that you used, was utter nonsense. SUre people need to be more responsible, and not post crap, but we have to be careful and not listen to crap.

    Keep asking questions until the issue is resolved.
    Last edited by Derek Asirvadem; 09-01-09 at 09:23.
    Regards
    Derek Asirvadem (Formerly DerekA)
    Information Architect / Senior Sybase DBA
    Copyright 2009 Software Gems Pty Ltd

    I answer questions from the Original Poster only. If you have a genuine question, as the moderators have requested, start a new thread.

    http://www.softwaregems.com.au

  8. #8
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Quote Originally Posted by Derek Asirvadem
    And that is why one should never use the web for the purpose of learning.
    I have to disagree with this point - I have learnt so much from this here interweb and it has helped me hone my SQL writing skills. Without it I would have been stuck more times than I care to recall

    I appreciate the sentiment that you are trying to purvey but fear that your wording is a bit too much of a blanket statement.

    I think the proviso is "do not trust everything you read on the internet - test it for yourself, consult official documentation, seek answers from alternate sources before you decide what have you been presented with is fact."
    George
    Home | Blog

  9. #9
    Join Date
    Mar 2009
    Location
    Sydney, Australia
    Posts
    258
    Everyone is free to have their opinions. You do not have to have a relation to another's opinions. But if you do relate yours to another's, then there better be a relation. I was answering Shanti's question, if you would like me to respond to your opinions about another subject, you need to post a new thread. Please do no hijack Shanti's thread, she did not engage you. You probably missed the irony, in that you are:
    Anybody with a keyboard and two fingers can post anything they want.
    Last edited by Derek Asirvadem; 09-01-09 at 09:43.
    Regards
    Derek Asirvadem (Formerly DerekA)
    Information Architect / Senior Sybase DBA
    Copyright 2009 Software Gems Pty Ltd

    I answer questions from the Original Poster only. If you have a genuine question, as the moderators have requested, start a new thread.

    http://www.softwaregems.com.au

  10. #10
    Join Date
    Mar 2009
    Location
    Sydney, Australia
    Posts
    258
    I said what I said. I did not say "you cannot learn anything from the internet"; that would be stupid. When I was 14 I learned about sex from porn magazines that Americans left, lying around the house; that turned out not to be sex. Going to dinner with a friend may well be therapeutic; but it is not therapy. No, for that you pay $100 or whatever and get serious, and listen to every word. Ofcourse, if I flooded the planet with copies of a child's 100-page encyclopedia, everyone would learn something. I said you cannot learn anything of value from the internet; in fact it is dangerous because you do not know what is correct/incorrect. That's why this is Shanthi's thread. You've forgotten the context of my statement. And then tried to argue with it as a simple all-or-nothing truth. New thread.
    Regards
    Derek Asirvadem (Formerly DerekA)
    Information Architect / Senior Sybase DBA
    Copyright 2009 Software Gems Pty Ltd

    I answer questions from the Original Poster only. If you have a genuine question, as the moderators have requested, start a new thread.

    http://www.softwaregems.com.au

  11. #11
    Join Date
    Aug 2009
    Posts
    4
    I do not want to comment on above discussion on internet usage.

    Let's get back to the actual problem -

    A ' select * ' query uses 'order by' clause. There is a clustered index on the columns used for "order by". The query plan shows that the same clustered index is used for table scan.

    Ques 1. Shouldn't the clustered index be used only to sort the results?, after the table scan is complete because the query is selecting all columns. So that table scan can run search on all partitions in parallel.

    Ques 2. Even if index is used for select then why is the query so slow?

    Ques 3. Without "order by", the same query uses no index and table scan invokes one thread per partition. The results are returned much much faster.
    I think that this is expected because query does not have to sort the results.

    Ques 4. But when there are no matching records, Why does the query with 'order by' takes so long as compared to without 'order by'?

  12. #12
    Join Date
    Mar 2009
    Location
    Sydney, Australia
    Posts
    258
    Exactly. It is your thread, your context.
    The query plan shows that the same clustered index is used for table scan.
    No, the showplan will state either a table scan or an index scan; the table scan does not use any index; for an index scan, the index used is named.

    (it would help to know the DDL of the table plus indices, I could give even more specific answers. If you want to discuss the showplan, please post it as an attachment.)

    1. Two different things again. First, it will do either table scan or index scan. If it is getting all columns, at some point it has to go to the data. An index scan is only good to find/narrow down which rows qualify (where clause) before going to the heap. (A covered query is the "perfect" item, because the columns required are in one or more indices used, and it does not have to go to the data; but that is not applicable here.) Once the table scan is progressed enough, yes, it has to sort the data (the original sort order is lost due to eg. 16 worker producing results); the sort is done in worktables, as soon as all eg. 16 worker start producing results, at the network handler level (it does not wait until all 16 workers have finished). The CI was great for the initial steps, but it is useless for sorting a small or large subset of rows.

    2. Could be any number of things. Post the table/index DDL, the SQL and the showplan for the query. (If confidentialty is important, then obfuscate the table/column names.)

    3. Correct. And that may answer [2] at least partially.

    4. The query is optimised, and a plan is built. At that time, based on a number of things (eg. absolute values vs @variables, plus the statistics available for the columns in the where clause), the optimiser chooses one set of decisions (join type, nested-loop, index vs table scan, etc), of the many possible. That plan is cached; if it is a proc, that plan is saved; in any case, if the query/proc ran recently, that plan is available in the cache and used. Could well be that it is an "efficient" plan for extracting x% of the rows. Some time later, the query is executed and it returns y% or 0% of the rows. Well, it is using the same plan, which may be "inefficient" in your mind for the y% or 0% rows.

    But overall, as long as you are doing reasonable maintenance, stats, etc, and x% is the more common %age rows returned, that is actually the most "efficient" plan. If it is not, then just ensure you compile it with data values the reflect the more common set of rows returned. (Never code "exec sproc ... with recompile", that is for debugging problems only.) But sure, test what I am describing here using "with recompile".

    There is more, of course. Based on the issues above, the optimiser may or may not be able to identify the number of rows expected, so it will go for a table scan, where it would be faster to go for an index scan. Those are things that you do have control/influece over.

    Last, never "select *" except for testing; always use the correct column list.
    Regards
    Derek Asirvadem (Formerly DerekA)
    Information Architect / Senior Sybase DBA
    Copyright 2009 Software Gems Pty Ltd

    I answer questions from the Original Poster only. If you have a genuine question, as the moderators have requested, start a new thread.

    http://www.softwaregems.com.au

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •