Results 1 to 3 of 3
  1. #1
    Join Date
    Jan 2007
    Posts
    56

    Unanswered: Suggestions for archiving data and decreasing db size

    Hello,

    We are using SQL Server 2000. We have a database which has been growing for for the last 3 years, and is currently at 64GB. Upon investigation, I found out that the database has many tables, and one of them - table "A" - is the largest, and is taking up ~ 40GB (20GB for data and 20GB for indexes). This table has 15 columns and we keep data in it forever (i.e. do not purge it).

    After talking to developers, I found out that we only need 2 columns forever, and others only for 1 year. Thus came a question of how to use this knowledge to decrease the table (database) size. The goal was to come up with an approach which is easy to maintain, and would decrease the space usage as much as possible, and improve overall performance. We brainstormed 3 approaches and they were:

    Approach 1
    ----------
    1. Create "A_Archive" table with 2 columns only
    2. Move the data from table "A" to "A_Archive" which is older than 1 year
    3. Purge data from the "A" table, which is older than 1 year
    4. Create a job, which would periodically (daily?) move the data from table "A" to table "A_Archive" and purge day's worth of old data from table "A"

    Approach 2
    ----------
    1. Create "A_Archive" table with 2 columns only
    2. Move ALL the data from "A" to "A_Archive"
    3. Purge data from the "A" table, which is older than 1 year
    4. Modify the stored procedure which inserts into "A" table to simultaneously insert into the "A_Archive" table

    Approach 3
    ----------
    1. Change the columns in the "A" table to be NULLable (those which are not)
    2. NULL the data which is > 1 year old, for all except the 2 columns we need to keep forever

    Here are my thoughts:

    Approach 1 decreases more space than Approach 2
    Approach 2 is easier to maintain than Approach 1
    Approach 3 I don't like, as it would require changing the columns to be NULLable

    My questions are:

    1. Which of these approaches would you advise?
    2. Are there any other possibilities that any of you used to achieve the same goal?
    3. When the data is NULLed (as in Approach 3), is the space really released and database size decreases? If so, does it depend on where in the table the column is - that is (last column or middle column) or not?

    Thanks a lot in advance!

  2. #2
    Join Date
    Jun 2004
    Location
    Long Island
    Posts
    696
    Quote Originally Posted by sql_er
    Hello,

    We are using SQL Server 2000. We have a database which has been growing for for the last 3 years, and is currently at 64GB. Upon investigation, I found out that the database has many tables, and one of them - table "A" - is the largest, and is taking up ~ 40GB (20GB for data and 20GB for indexes). This table has 15 columns and we keep data in it forever (i.e. do not purge it).

    After talking to developers, I found out that we only need 2 columns forever, and others only for 1 year. Thus came a question of how to use this knowledge to decrease the table (database) size. The goal was to come up with an approach which is easy to maintain, and would decrease the space usage as much as possible, and improve overall performance. We brainstormed 3 approaches and they were:

    Approach 1
    ----------
    1. Create "A_Archive" table with 2 columns only
    2. Move the data from table "A" to "A_Archive" which is older than 1 year
    3. Purge data from the "A" table, which is older than 1 year
    4. Create a job, which would periodically (daily?) move the data from table "A" to table "A_Archive" and purge day's worth of old data from table "A"

    Approach 2
    ----------
    1. Create "A_Archive" table with 2 columns only
    2. Move ALL the data from "A" to "A_Archive"
    3. Purge data from the "A" table, which is older than 1 year
    4. Modify the stored procedure which inserts into "A" table to simultaneously insert into the "A_Archive" table

    Approach 3
    ----------
    1. Change the columns in the "A" table to be NULLable (those which are not)
    2. NULL the data which is > 1 year old, for all except the 2 columns we need to keep forever

    Here are my thoughts:

    Approach 1 decreases more space than Approach 2
    Approach 2 is easier to maintain than Approach 1
    Approach 3 I don't like, as it would require changing the columns to be NULLable

    My questions are:

    1. Which of these approaches would you advise?
    2. Are there any other possibilities that any of you used to achieve the same goal?
    3. When the data is NULLed (as in Approach 3), is the space really released and database size decreases? If so, does it depend on where in the table the column is - that is (last column or middle column) or not?

    Thanks a lot in advance!

    I'd go with A, I can't tell you how many times that data "was not needed" anymore, only to have someone come to me 2,3 weeks later asking for data they said they did not need. I would always keep a full copy of the data somewhere whether in an archive DB or csv file, just in case.

  3. #3
    Join Date
    Jan 2007
    Posts
    56
    What do you mean by "A"?

    Neither Approach saves all the data. I guess we could just keep the old db backup available all the time, just in case of such event, although we do not foresee it happening.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •