  1. #1
    Join Date
    Aug 2012

    How to estimate cost of archiving and migrating a database

    Hello everyone,

    I am trying to estimate the cost of
    1) archiving a database
    2) migrating a database

    The context is that I've been asked to estimate the cost of decommissioning various software systems (hardware is not to be considered). I am trying to create a model that enables calculation of the costs of doing the above. So far, for 1) above, I have listed the following parameters, with a rough sketch of the model after the list. Can anyone comment on these, add ones I'm missing, or provide any other background or information?

    • Cost of backing up the database (cost per TB, for example).
    • Verification of a readable backup.
    • Cost of the actual physical media.
    • Cost of the backup computer system.
    • Labor cost based on the daily rate of a full-time employee.
    • Other?
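
    To make this concrete, here is a rough sketch, in Python, of the kind of model I have in mind. Every rate below is a placeholder I invented for illustration, not a real figure:

        # Rough archival cost model -- every rate is an illustrative placeholder.
        def archival_cost(size_tb,
                          backup_rate_per_tb=50.0,    # backup processing, $/TB
                          media_cost_per_tb=30.0,     # physical media, $/TB
                          verify_days_per_tb=0.1,     # labor days to verify per TB
                          labor_day_rate=600.0,       # daily rate, full-time employee
                          system_share=2000.0):       # share of the backup system's cost
            backup = size_tb * backup_rate_per_tb
            media = size_tb * media_cost_per_tb
            verification = size_tb * verify_days_per_tb * labor_day_rate
            return backup + media + verification + system_share

        print(archival_cost(size_tb=5))  # e.g. a 5 TB database -> 2700.0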

    Basically, I'm trying to devise a model that fits the following constraints:

    • Product-independent backup. Is there a generic format for the backup if we don't know the database into which the archived data might later be imported (DB2, Oracle, Sybase, etc.)?
    • Parallel-run cost, in case we have to run the system live while backing it up.
    • Other costs?

    I am delineating the following costs:

    1. Archival cost.
    2. Ongoing storage costs, either on site or at a third-party storage facility.
    3. Migration costs.

    So, the archival costs would not include the cost of actually migrating the database into another system. But I would like to create a model for the migration costs too.
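
    At the top level, then, the model for one system would reduce to something like the following, where the retention period is however long the data must be kept:

        total_cost = archival_cost + monthly_storage_rate * retention_months + migration_cost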

    Please feel free to tell me I need to provide more detail or definition of the task. Also, please suggest any other "parameters" that need to be defined in order to provide an accurate cost model.

    Many thanks in advance to all....

  2. #2
    Join Date
    Oct 2009
    221B Baker St.
    You may want to reconsider the scope of what you want to do.

    I'm not aware of any generic database backup that will back up physical data that is then restorable on some other database system (possibly I've misunderstood what you mean).

    One way to back up "something" and restore it "somewhere else" is to dump the data into a delimited text file that could be reloaded. This is rather inefficient and would probably require changes any time the database structure was modified. There would be a similar issue on the target(s).
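
    To illustrate, the dump side could be as simple as the following Python sketch. I use SQLite here purely as a stand-in for whatever database is actually involved; any DB-API driver works the same way:

        import csv
        import sqlite3  # stand-in only; substitute the real database's driver

        conn = sqlite3.connect("source.db")
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]

        for table in tables:
            cur = conn.execute(f"SELECT * FROM {table}")
            with open(f"{table}.csv", "w", newline="") as out:
                writer = csv.writer(out)
                writer.writerow([col[0] for col in cur.description])  # header row
                writer.writerows(cur)  # every data row, comma-delimited

    The reload side is where the structure changes bite: every new or dropped column means rework.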

    The best way I know to ensure a backup is usable (merely being readable is not enough) is to reload it into a test environment and verify that everything is there and usable.
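
    At a minimum, I would automate a comparison like this after the reload (a sketch only, again with SQLite as a stand-in; real verification should also spot-check content, not just counts):

        import sqlite3  # stand-in for the real database driver

        def table_counts(conn):
            tables = [row[0] for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'")]
            return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
                    for t in tables}

        source = sqlite3.connect("source.db")      # the original database
        restored = sqlite3.connect("restored.db")  # the reload in the test environment
        assert table_counts(source) == table_counts(restored), \
            "row counts differ after reload"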

    The answer to your cost items is "it depends". If you post something more specific, someone here may have some of these statistics and can share them.

  3. #3
    Join Date
    Aug 2012
    Hi there,

    Many thanks for the reply. I know my statement of the problem is a bit vague. The problem is that I don't know precise details. I think there is a good chance that some of the systems being decommissioned use either DB2 or Oracle 10g or Sybase. Therefore, I would settle for understanding the costs of just doing the archiving (forgetting the migration for a moment) per TB of any database hosted by one of these database products.

    For example, how would I estimate the cost (per TB) of backing up and archiving an Oracle 10g database, reloading it, and verifying readability and usability? What is the cost model for calculating this?

    I would like to include the following parameters as a minimum:

    • media cost
    • backup and test system cost
    • labor cost based on the daily rate of a full-time employee
    • any other categories of cost that I'm missing

    I apologize for not being able to provide more detail, but please fire away with any specific questions that come to mind to address anything I might have omitted.

    Thanks again.

  4. #4
    Join Date
    Oct 2009
    221B Baker St.
    Is this exercise for systems within your organization, or is this to be some kind of service your organization provides to others?

    If you are involved with an organization, I suggest you speak with the Purchasing department. Many of the better places have purchasing people who understand many IT-related issues. Some do not, so you may have to work more closely with them.

    Media cost - you have to determine how much media you need, where it will be located (launched into a "cloud", backup equipment at the customer site, backup equipment at your site, etc.), and how many copies of the data will be needed during the decommissioning. What about data that will be used in some other system? How long will the backup server(s) and media be needed? Is the hardware that supports the decommissioned processes/data dedicated, or does it support lots of other things as well?

    If a requirement to "bring it back" should be discovered, where will it be staged/run?

    I have no way to guesstimate labor costs, as I have no idea what the people involved might be responsible for...

    I still believe you have lots of research to do (and post more specifics) before someone can provide any kind of realistic estimate.

    And these questions are only a sample...

  5. #5
    Join Date
    Aug 2012
    As I said, I really don't have any information. Not because I have not done any research, but because the respective organizations don't have the answers.

    For example, there is an effort to decommission systems. But there is huge variation in the database products and types of databases (hierarchical, relational, etc.).

    No one can tell me whether they want to archive data to NAS, SAN, or 9-track tape. Nor do I know whether a third-party company will do the archival and periodic verification of data integrity or whether it will be done in house.

    So I realize no one could give me an accurate answer under these circumstances. I was hoping someone could make statements such as the following:

    "I am a sys admin [or a dba] and from my experience it takes x days per Tb to backup and verify the usability of an Oracle database if written to 9-track tape. "

    "A typical price for off-site storage of data in an environmentally controlled room with secure access, quarterly verification of data integrity, etc., etc. costs $x per Tb."
    Etc., etc....

    But anyway, thanks for the replies....

  6. #6
    Join Date
    Aug 2012

    How to estimate labor required to archive a database?

    Hello again folks,

    Previously in this thread I was trying to estimate the cost of database archival. I'm changing the thrust of my effort. I would now just like to know how to estimate the labor required to archive a database.

    Here are some conditions:

    1. Archived operational database records will need to be re-read and used to populate a new database of the same product at some point in the future. I'm not sure of the implications for the format of the archive.
    2. I do not know the complexity of the database schemas. Assume "fairly complex" as most serve complex business functions.
    3. Archives will be written to magnetic tape.
    4. Database products containing the databases to be archived are either DB2, Oracle or MySQL.
    5. Data is subject to regulatory and compliance standards; some data consists of financial transactions (online purchases) and will need to be retained for a number of years.
    6. The hardware hosting the databases is quite modern, data-center-class equipment (plenty of CPU power and memory, and modern SANs or clusters).
    7. I need to include in the labor hours the time it takes to reload the archived data into a completely independent database of the same product and verify the integrity and usability of the data that was archived onto tape.

    I really don't have information on the size of the databases.

    I would be satisfied if someone could estimate the number of man-hours per data size required to do an archival. For example, if you said "it takes roughly x hours to archive 100 TB of data to magnetic tape," that would be just fine.
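
    To show the shape of the estimate I would accept, here is the arithmetic I would plug such a number into. Every figure below is a placeholder to be replaced with real answers:

        # Man-hour model for one archival -- all inputs are placeholders.
        def archival_man_hours(size_tb,
                               effective_tb_per_hour=0.5,  # sustained archive rate
                               attended_fraction=0.25,     # run time needing a person
                               setup_hours=8.0,            # planning, media, scripting
                               verify_factor=1.0):         # reload+verify ~= 1 extra pass
            run_hours = size_tb / effective_tb_per_hour
            return setup_hours + run_hours * attended_fraction * (1 + verify_factor)

        print(archival_man_hours(size_tb=100))  # -> 108.0 man-hours on these guesses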

    I'm a database novice, so pardon any naive statements or questions. But I'm wondering whether people usually back up database reports as well? My understanding is that reports are basically the output of queries. If so, assume that reports need not be archived.

    I also don't know whether the complexity of a schema has a non-trivial effect on the time required (the amount of processing required) to perform the archival.

    So, a useful answer for my purposes would be something like:
    "It takes roughly x hours per 100 TB to archive and subsequently verify usability of all data and meta-information."
    Is this enough information to go on? If not, please ask for additional information, and, if I can provide it, I will be happy to do so....

    Many thanks....

  7. #7
    Join Date
    Feb 2004
    In front of the computer
    Without understanding what the databases are used for, what kind of hardware is involved, etc., I can't imagine how you'd come up with an approximation.

    For a really fast SAN with very high-end hardware and zero extraneous load, you could easily archive 100 TB in 24 hours.

    For a really slow distributed database with poor networking facilities and slow tape, you could easily take more than 7200 hours to archive 100 TB of data.
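
    The spread is nothing more than throughput arithmetic; here is the same calculation in Python for anyone who wants to plug in their own numbers:

        size_bytes = 100 * 10**12  # 100 TB

        for hours in (24, 7200):
            rate_mb_s = size_bytes / (hours * 3600) / 1e6
            print(f"{hours:>5} h -> {rate_mb_s:7.1f} MB/s sustained")
        # 24 h implies ~1157 MB/s; 7200 h implies ~3.9 MB/s.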

    If you factor in high (1000+/second) transaction rates on the same distributed platform, there is no way that you'll ever archive the 100 TB of data... It will change faster than you can back it up, and it will never reach consistency, so there is no way to prove whether the backup was successful at any point in time either.

    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  8. #8
    Join Date
    Aug 2012

    I understand the need for specific environment parameters...

    Hello Pat,

    Thank you for your reply. Believe it or not, your answer helped me understand the problem with trying to create a reasonable estimate.

    Let me say that I know I am a novice when it comes to data center technologies. I am an Enterprise Architect who has been thrown into a highly charged political environment.

    Much of the information I need is currently being collected. I am trying to get a head start on understanding how to give cost estimates to our client for budgeting a very ambitious decommissioning project where 100 applications will be decommissioned -- everything from mainframe to home-grown Java web apps running on Linux.

    I can't say too much except the following. The client is an international publisher. The systems in question are used for various applications, including online purchasing, online ordering, external and internal cataloging of titles, large accounting systems running JD Edwards packages, SAP applications, and so forth. In other words, the systems run a range of applications and are used by internal or external users (or both).

    I really don't have any idea of the size of the databases at this time. I have created a detailed questionnaire to gather technical information about every layer of every tier for over 100 of the client's systems. That questionnaire is going out to over 75 SMEs and technical owners of the systems, who will, I hope, provide answers. When I get that data back I'll know more.

    One other piece of information would help me at this time. How fast can a "fast" magnetic tape drive (a "high end" one) write data sequentially (in bytes/sec or whatever metric you want to quote)?

    Perhaps I can just understand how much time it would take to do an archival for a very specific set of parameters. Would the following do?

    Back up an Oracle 11g database using a "fast" magnetic tape drive (specify brand and model) connected to a "fast" SAN (using Fibre Channel?) over a network with such-and-such sustained throughput, on a system that has been taken off-line (no applications running).

    Would that be a fair qualification? Can a decent estimate of time to archive be calculated with such a set of parameters? Or, is it still too vague?
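
    For example, assuming a placeholder sustained tape rate (140 MB/s is roughly what I have seen quoted as the native speed of a current high-end LTO drive, but treat that as my assumption, not a vendor figure), the calculation I am after would look like this:

        size_tb = 10        # placeholder database size
        rate_mb_s = 140.0   # assumed sustained tape write rate (native, no compression)
        drives = 2          # tape drives writing in parallel
        verify_passes = 1   # one full reload-and-verify pass afterwards

        hours = size_tb * 1e6 / (rate_mb_s * drives) / 3600 * (1 + verify_passes)
        print(f"~{hours:.1f} hours")  # ~19.8 hours under these assumptions

    If a calculation of that shape is reasonable, then all I really need from the questionnaire is the sizes and the sustainable rates.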

    Regarding the problem you raised about archiving an online system (my paraphrase, hopefully correct), I suggested to our team that we tell the client to run a parallel instance of each system that will be archived. The target system will be taken off-line and the parallel system will take its place. This arrangement would circumvent performance problems and the problem of taking a snapshot of the database in a consistent state.

    I can see already that planning such a parallel run is non-trivial in itself. How does one back up the data that changed during the archival of the original system?

    Thanks again....
