  1. #1
    Join Date
    Aug 2011
    Posts
    8

    Unanswered: SQL Server 2008 - multiple import processes simultaneously

    My question concerns the data model for the core functionality of my application.

    I have a scenario where multiple users will run import processes simultaneously, each working for a different client. At the same time, a client's data may be displayed to a user and modified or inserted, while an import for the same or a different client is in progress.

    I have 2 core tables which get the most hits whenever import processes run.

    I have 2 options now:
    1. To keep the 2 core tables and run the imports sequentially by queuing the import processes.

    Table 1

    ID
    ClientID
    SourceID
    Count
    AnotherCol1
    AnotherCol2
    AnotherCol3

    Table 2

    ID
    ClientID
    OrderID
    Count
    AnotherCol4
    AnotherCol5
    AnotherCol6

    2. To have 1,000 core tables, 2 per client (I may have a maximum of 500 clients). This would allow users to run import processes simultaneously without waiting for one another.

    More information about the import process:
    1. These tables are not going to be used in any reporting.
    2. Each import process will insert 20k-30k records (7 columns) into each of these tables, and there will be around 40-50 such imports per day.
    3. While an import is in progress, other users may retrieve data from these tables and insert or update it too.
    4. These will be among the most heavily used tables in the application.
    5. BULK INSERT will be used for insertion.
    6. The clustered index is on the primary key, which is an identity column.
    7. We are considering table partitioning too.
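    For concreteness, the two core tables from option 1 could be declared roughly like this. This is only a sketch: the column names come from the post above, but the data types are assumptions, and the clustered index sits on the identity primary key as in point 6.

    ```sql
    -- Table 1: per-client import rows keyed by an identity column (point 6)
    CREATE TABLE dbo.Table1 (
        ID          INT IDENTITY(1,1) NOT NULL,
        ClientID    INT          NOT NULL,
        SourceID    INT          NOT NULL,
        [Count]     INT          NOT NULL,   -- bracketed: COUNT is a reserved word
        AnotherCol1 NVARCHAR(50) NULL,
        AnotherCol2 NVARCHAR(50) NULL,
        AnotherCol3 NVARCHAR(50) NULL,
        CONSTRAINT PK_Table1 PRIMARY KEY CLUSTERED (ID)
    );

    -- Table 2 is analogous, with OrderID in place of SourceID
    CREATE TABLE dbo.Table2 (
        ID          INT IDENTITY(1,1) NOT NULL,
        ClientID    INT          NOT NULL,
        OrderID     INT          NOT NULL,
        [Count]     INT          NOT NULL,
        AnotherCol4 NVARCHAR(50) NULL,
        AnotherCol5 NVARCHAR(50) NULL,
        AnotherCol6 NVARCHAR(50) NULL,
        CONSTRAINT PK_Table2 PRIMARY KEY CLUSTERED (ID)
    );
    ```

    Since most queries would filter by client, a nonclustered index on ClientID would presumably be needed as well in either option.
    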

    Can you please suggest which option is better, and why?
    Also, if you suggest going with option 2, would creating so many tables in the database not be a performance hit? Should we create a separate database for these 1,000 tables in that case?

  2. #2
    Join Date
    Mar 2007
    Location
    Holmestrand, Norway
    Posts
    332
    Are you going to allow dirty reads (reading uncommitted data)? While inserting/updating/deleting data, an exclusive (X) lock is held on the data for the duration of the transaction. The only way to avoid this and allow immediate updates would be to import the rows one at a time, which is not very efficient.

    To allow reads, updates, and deletes on the table while an import is going on, you could have a look at snapshot isolation, more precisely read_committed_snapshot. Other users will be able to read committed data and make modifications to it. But no two users will be allowed to modify the same data simultaneously: the second commit will fail when SQL Server detects that the data has changed since it was read.
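    For reference, read_committed_snapshot is a database-level setting. A sketch of enabling it (the database name is a placeholder):

    ```sql
    -- Readers see the last committed row version (kept in tempdb's version
    -- store) instead of blocking on writers' exclusive locks
    ALTER DATABASE MyImportDb SET READ_COMMITTED_SNAPSHOT ON
        WITH ROLLBACK IMMEDIATE;  -- roll back open transactions so the change can apply

    -- Optional: also permit explicit SET TRANSACTION ISOLATION LEVEL SNAPSHOT
    ALTER DATABASE MyImportDb SET ALLOW_SNAPSHOT_ISOLATION ON;
    ```
    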

    Beware though: depending on the size of your inserts and updates, and the duration of your transactions, you could end up using a considerable amount of space in tempdb.
    Ole Kristian Velstadbråten Bangås - Virinco - MSSQL.no - Facebook - Twitter

  3. #3
    Join Date
    Aug 2011
    Posts
    8
    Hi roac

    Yes, we are going to allow dirty reads. Given that, which of the two data models I mentioned should we follow, and why?

  4. #4
    Join Date
    Aug 2011
    Posts
    8
    Can anyone please answer this question?

  5. #5
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    This kind of data organization just screams SSAS to me. It would force you (and your users) to look at the data differently, but it would remove the bottlenecks that you are facing and probably fix many other problems for you too.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  6. #6
    Join Date
    Aug 2011
    Posts
    8
    Pat, I can't use SSIS or SSAS.

  7. #7
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Why can't you use SSAS or SSIS? Without knowing the whole problem, we can't give you a decent answer (as we just demonstrated).

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  8. #8
    Join Date
    Aug 2011
    Posts
    8
    We don't have the skill set for SSIS/SSAS and have been instructed to use T-SQL. Can you please suggest how this could best be achieved with T-SQL?

  9. #9
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    No, this problem can't be best solved using T-SQL. SSAS is a better solution than Transact-SQL for this problem, and there may well be even better solutions.

    Without much more information about what your real problem is, there is no way I can help you... every answer I give will be like the last one, unusable to you for some reason that you haven't yet specified.

    I think that you need professional help for this kind of problem. Hire someone talented, let them examine these problems (there are three that I see so far) and help you to find a good solution for all of these problems.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  10. #10
    Join Date
    Aug 2011
    Posts
    8
    Pat, doesn't my question explain my requirements? If not, please ask me specific questions.

  11. #11
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Start with my questions in post #7 above.

    Expand the concept of my questions from post #7 to the potentially thousands of other requirements that you haven't described.

    I realize that you're trying your best. Based on your post #10 you seem to think that you've already done everything that you can to explain your problem. The problem is that you are so tightly woven into the problem, you assume too much... I don't understand your problem or even the scope of that problem, so I can't offer usable advice.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  12. #12
    Join Date
    Mar 2007
    Location
    Holmestrand, Norway
    Posts
    332
    As he is going to allow dirty reads during import, it strikes me that (as far as I know) SSAS is out of the question. It seems he wants rows to be updated by other processes while the import is running, inserting rows without holding a lock. No matter what transaction isolation level you use, SQL Server will hold a lock, but you can mimic this behavior by inserting a small subset of rows at a time, which seems somewhat contradictory to the desire to use bulk insert.

    I agree with Pat. Find someone with the right skill set. Describe your business needs (putting all technical constraints aside), and move on from there.
    Ole Kristian Velstadbråten Bangås - Virinco - MSSQL.no - Facebook - Twitter

  13. #13
    Join Date
    Aug 2011
    Posts
    8
    Never mind!!

    What I realized is that importing 40k-50k records (with 10-15 columns) into a staging table using BULK INSERT takes 2-3 seconds.

    Once the data is in staging, I can transform it and then use SQL Server 2008's MERGE statement to apply the changes to my main table, which also does not take long (2-3 seconds), even on a heavily loaded database.

    Hence, BULK INSERT->Transformations->MERGE works for me.
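    A minimal sketch of that pipeline, reusing the Table1 columns from the first post; the staging table name, the extra transformation, and the file path are placeholders:

    ```sql
    -- 1. BULK INSERT the raw file into a staging table with no identity column,
    --    so the file's columns map one-to-one (path and terminators are placeholders)
    BULK INSERT dbo.Staging_Table1
    FROM 'C:\imports\client_feed.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);

    -- 2. Transformations in staging, e.g. trimming stray whitespace
    UPDATE dbo.Staging_Table1
    SET AnotherCol1 = LTRIM(RTRIM(AnotherCol1));

    -- 3. MERGE staging into the main table: update rows that already exist,
    --    insert the ones that do not
    MERGE dbo.Table1 AS t
    USING dbo.Staging_Table1 AS s
        ON t.ClientID = s.ClientID AND t.SourceID = s.SourceID
    WHEN MATCHED THEN
        UPDATE SET t.[Count] = s.[Count], t.AnotherCol1 = s.AnotherCol1
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (ClientID, SourceID, [Count], AnotherCol1)
        VALUES (s.ClientID, s.SourceID, s.[Count], s.AnotherCol1);
    ```

    One design note: because the staging table has no identity column and no concurrent readers, the BULK INSERT can take a table lock without blocking anyone, and only the comparatively short MERGE touches the live table.
    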

    Anyway, thanks for giving attention to my problem.

    You can still provide your comments on my above method.

  14. #14
    Join Date
    Aug 2011
    Posts
    8
    Hi

    Before finalizing my DB design, I seek your advice again to make sure I am going the right way. To give you more insight into my database schema so that you can address the issue properly, please have a look at the following:


    1. Firstly, please have a look at the attached database design.


    · MS_HospitalTrusts - This represents the actual trust.

    · MS_Hospitals - TrustID is the foreign key in this table. There will be many hospitals under one trust, hence the 1-n relationship.

    · MS_HospitalUnits - HospitalID is the foreign key in this table. There will be 3-4 units under one hospital, hence the 1-n relationship.

    · MS_HospitalDepartments - UnitID is the foreign key in this table. There will be 30-40 departments under one unit, hence the 1-n relationship.

    · MS_HospitalPatients - DepartmentID is the foreign key in this table. This is the table into which the actual import happens. There will be 20K-30K records per import, i.e. per department, hence the 1-n relationship.

    2. Secondly, the data in table MS_HospitalPatients is going to be in the same format for all trusts.
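    The foreign-key chain described above can be sketched in DDL like this. The key columns and table names come from the description; the non-key columns and data types are assumptions standing in for the attached diagram:

    ```sql
    -- Hierarchy: Trust -> Hospital -> Unit -> Department -> Patient rows
    CREATE TABLE dbo.MS_HospitalTrusts (
        TrustID   INT IDENTITY(1,1) PRIMARY KEY,
        TrustName NVARCHAR(100) NOT NULL               -- assumed column
    );
    CREATE TABLE dbo.MS_Hospitals (
        HospitalID INT IDENTITY(1,1) PRIMARY KEY,
        TrustID    INT NOT NULL REFERENCES dbo.MS_HospitalTrusts (TrustID)
    );
    CREATE TABLE dbo.MS_HospitalUnits (
        UnitID     INT IDENTITY(1,1) PRIMARY KEY,
        HospitalID INT NOT NULL REFERENCES dbo.MS_Hospitals (HospitalID)
    );
    CREATE TABLE dbo.MS_HospitalDepartments (
        DepartmentID INT IDENTITY(1,1) PRIMARY KEY,
        UnitID       INT NOT NULL REFERENCES dbo.MS_HospitalUnits (UnitID)
    );
    CREATE TABLE dbo.MS_HospitalPatients (
        PatientRowID INT IDENTITY(1,1) PRIMARY KEY,    -- assumed key name
        DepartmentID INT NOT NULL REFERENCES dbo.MS_HospitalDepartments (DepartmentID)
        -- ...imported patient columns, same format for all trusts...
    );
    ```
    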

    Main Problem

    There will be multiple simultaneous imports, and at the same time the system will be reading data (dirty reads) and showing it in the UI, where users can make changes to it. All of this will happen on the MS_HospitalPatients table.

    My Analysis

    What I analyzed is that importing 40k-50k records into a staging table using BULK INSERT takes 2-3 seconds.

    Once the data is in staging, I can transform it and then use SQL Server 2008's MERGE statement to apply the changes to my main table, i.e. MS_HospitalPatients, which should not take long, even on a heavily loaded database.

    Hence, BULK INSERT->Transformations->MERGE seems to be the solution.

    Questions for you

    1. Can you please comment on my proposed solution explained above? Should I go with one common table, i.e. MS_HospitalPatients, for all hospitals/units/departments, or should I create a separate MS_HospitalPatients table for each trust (created when a new trust is added) and then use dynamic queries to insert/update/delete/select from that trust-specific table?

    2. Data from the MS_HospitalPatients table will be accessed by UnitID most of the time and by DataID occasionally. So, should the table MS_HospitalPatients be denormalized by adding the UnitID field to it, to avoid joins with the MS_HospitalDepartments table, which contains the UnitID? In either case, what should be the clustered index for the MS_HospitalPatients table, and why?

    3. On what basis (which columns) should table partitioning be implemented in the MS_HospitalPatients table? Would it be a good idea to add another column, HospitalID, to the MS_HospitalPatients table and to partition on that column?

    4. What other techniques (like indexing or table partitioning) can be implemented to make the database design efficient and robust, so that there are no performance issues later on?



    Kindly reply at your earliest convenience.
    Attached Thumbnails: Database Diagram.jpg
