Results 1 to 5 of 5
  1. #1
    Join Date
    Aug 2010
    Posts
    87

    Unanswered: Access dB Cleanup for CSV Export

    Hello,

    First and foremost, thank you for taking the time to read this and if you can help, I very much appreciate it.

    I have a large Access dB which I learned how to setup queries for and get sets of data. Once I have the data, I then copy it over to excel for editing. I'm doing things like...

    • Remove Double Spaces
    • Prefix image names with server's path to images
    • Find and replace ; with ,
    • Add hyphens for spaces on certain columns (if you call it a column that is)
    • Change case to all lowercase for certain columns
    • Remove illegal characters like and a plethora of others
    • And a little of this and that
    • Remove duplicate rows based on SKU duplications


    Originally I set it up like this.
    1. Query Data
    2. Copy to Excel
    3. Do a million edits in excel (the list above and more)
    4. Remove duplicates
    5. Save as CSV
    6. Add several other columns of data (usually with same words in it like "Yes" to indicate an active product.
    7. Separate Category paths and split into multiple columns
    6. Import into mySQL dB for ecommerce site

    Believe me, there's even more. I really think though there has to be an easier way. I must go through this same process every three months and it's tedious. I'd like to set something up so that I can ease the pain a bit.

    One additional challenge I face is this. If say 1 dB has 40,000 records in it, then the one that is to replace it has 41,000, there's a new 1,000 records. No big deal right? But some of the original 40,000 are no longer available so basically I have to dump the whole original dB.

    If a file name or sku is off by 1 character, then the records from the old dB (page names, reviews assigned to skus and such) won't line up and viola, a giant mess.

    So this next Access dB cleanup I'd like to set up a method in which is not forgettable or maybe is programmed so that I can't possibly mess it up.

    Does anyone have any ideas at all as to how to tackle this?

    Note: Currently (if I could find my original setup docs that is) it takes me about 3 hours to go through 40,000 records, clean them up and get them ready for a CSV import to mySQL. Not that big a deal but it's mind numbing, slow, tedious and lots of other words that indicate it's not fun in any way.

    I'm willing to learn... Just need the method

    Thank you for reading and thank you even more for responding.

    RB

  2. #2
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    Several VBA functions can be used to cleanup the data:

    Remove Double Spaces
    Code:
    CurrentDb.Execute "UPDATE Some_Table SET Some_Table.Some_Data = Replace([Some_Data], '  ', ' ');", dbFailOnError
    Prefix image names with server's path to images
    Code:
    <NewPath> = <Path to Server> & <FileName>
    Find and replace ; with ,
    Code:
    CurrentDb.Execute "UPDATE Some_Table SET Some_Table.Some_Data = Replace([Some_Data], ';', ',');", dbFailOnError
    Add hyphens for spaces on certain columns (if you call it a column that is)
    Code:
    CurrentDb.Execute "UPDATE Some_Table SET Some_Table.Some_Data = Replace([Some_Data], ' ', '-');", dbFailOnError
    Change case to all lowercase for certain columns
    Code:
    CurrentDb.Execute "UPDATE Some_Table SET Some_Table.Some_Data = LCase([Some_Data]);", dbFailOnError
    Remove illegal characters like and a plethora of others
    Code:
    CurrentDb.Execute "UPDATE Some_Table SET Some_Table.Some_Data = Replace(Replace([Some_Data], '', '''), '', '');", dbFailOnError
    You can also use there functions in an Update Query object. For finding duplicates, there is the Find Duplicates Query Wizard that should set you in the right direction.
    Have a nice day!

  3. #3
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    bear in mind you codl do the same data manipulation / editing in MySQL

    persoanlly I'd ditch the intermediate step in Excel.. spreadsheets are is great for data analysis and what if scenarios. its pretty crap at data storage, its pretty crap at data integrity, and its easy to (*&^ things up in spreadsheets nd not realise for a while that things are (*^&)(*& up

    persoanlly I'd be tempted to stuff data immediately into MySQL but with a flag to indicate that that record isnt released for public consumption jut yet
    I'd do your editing (well as Sindho points out update queries) on the MySQL data, remove the flag to indicate the record is now ready for public consumption


    if that isn't appropriate then I'd keep the data in Access untill I was happy wit it then pump it accross to MySQL.

    I have a legacy website which was set up before the advent of ADSL, strictly dial up lines. so that is a split db the dtaa entry and vaildation is done under access, then transferred onto the live MySQL site. its on my to do lists, get rid of the Access stage and stay purely in mySQL. with modern broadband connections theres no need for it, and it cuts out anopther application that needs maintaining
    I'd rather be riding on the Tiger 800 or the Norton

  4. #4
    Join Date
    Aug 2010
    Posts
    87
    Thank you both very much. I'll have to keep things in Access for now. I'm far from a mysql guy and barely know access. I"ll give it a shot. Thank you so very much to both of you.

    RB

  5. #5
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    You're welcome!
    Have a nice day!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •