Results 1 to 4 of 4
  1. #1
    Join Date
    Jan 2014

    Unanswered: Some MySQL help with lots of data on import

    We’re using MySQL 5.6 in RDS. The app controls some ETL process which imports millions of records. Basically it bulk loads data into staging tables and then does a merge.

    1. What db params should be applied to db instance to make huge amount of data inserts/updates/deletes.
    2. Any index operations (create/drop) on big tables take too much time. On 24 million records table it takes more than half an hour to create an index. I did the same on similar tables in Sql Server and it took only 1-2 minutes. Any db params for indexes?
    3. Does it make sense to drop/recreate indexes on tables while merging data from staging tables in ETL. Currently we have an SP for merge with the following structure:

    - Drop index in target table
    - Merge (Insert/Update/Delete) data into target table
    - Recreate index

    Create index statement timed out recently. Thinking on removing drop/recreate index.

    Target table 20+ millions records. Above said index is idx_fact_usages_common.

    CREATE TABLE `vcm_fact_usages` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `aws_reseller_billing_info_id` int(11) DEFAULT NULL,
    `dim_invoice_version_id` int(11) DEFAULT NULL,
    `dim_linked_account_id` int(11) DEFAULT NULL,
    `dim_usage_category_id` int(11) DEFAULT NULL,
    `dim_resource_group_id` int(11) DEFAULT NULL,
    `dim_resource_id` int(11) DEFAULT NULL,
    `dim_date_id` int(11) DEFAULT NULL,
    `quantity` decimal(20,10) DEFAULT NULL,
    `blended_cost` decimal(15,8) DEFAULT NULL,
    `un_blended_cost` decimal(15,8) DEFAULT NULL,
    PRIMARY KEY (`id`),
    KEY `idx_fact_usages_common` (`aws_reseller_billing_info_id`,`dim_invoice_versi on_id`,`dim_linked_account_id`,`dim_usage_category _id`,`dim_resource_group_id`,`dim_resource_id`,`di m_date_id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=238138595 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

    4. Any other best practices on RDS mysql when building DWs? DB Params, ETL process, db maintenance (indexes, etc.)

    Thank you,

  2. #2
    Join Date
    Jan 2014
    Any ideas of how I might improve on the process would greatly be helpful here people.

  3. #3
    Join Date
    Jan 2014
    Does anyone have any advice here at all?

  4. #4
    Join Date
    May 2005
    Found this while searching, i understand this is an older thread.

    Here is some info about the best practices in your case.

    Amazon RDS Best Practices - Amazon Relational Database Service

    You could also look into partitioning the data to help with the performance.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts