If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > What are my options for manipulating this much data?

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 06-29-10, 21:39
jeff_johnson jeff_johnson is offline
Registered User
 
Join Date: Jun 2010
Posts: 1
What are my options for manipulating this much data?

Hi all,

There is a site that outputs 1,000,000 single cell rankings on a daily basis (.csv). These rankings change on a daily basis. This csv file only has two things: the numbered ranking, and the name of the item.

I want an automated way to crunch this data in the following ways:

1. Find new additions to the list.
2. Biggest gainers and losers in terms of rankings. (so this process will involve yesterday's list)

I have servers available to crunch data (mysql) and would prefer this process be done there rather than a macro/script that crunches this on my desktop.

I would like the results to be outputted into excel format.

Please move if this is the wrong forum. Thanks
Reply With Quote
  #2 (permalink)  
Old 06-30-10, 03:33
pootle flump pootle flump is offline
King of Understatement
 
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
The MINUS operator is perfect for the first question but mySQL does not have one so you'll need to use EXISTS or a LEFT OUTER JOIN. I also don't have a clue how to get from mySQL to Excel without some third party tools.

I am inclined to move this to mySQL forum. Please could someone with more experience of that product let me know if that is the best destination forum.
Reply With Quote
  #3 (permalink)  
Old 06-30-10, 03:46
Pat Phelan Pat Phelan is offline
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,606
When you only have a hammer, every problem looks like a nail. The OP (Original Poster) has MySQL, so that's the best place to move this post. MySQL is also a good tool for solving this kind of problem.

If MySQL supported the FULL OUTER JOIN, that would be the best solution. A UNION between two SELECT statements will get you close enough to get the FULL OUTER JOIN functionality.

A lot of the design decisions depend on how much history the OP needs to keep for research purposes. If the output of the comparison is all that is needed, then a simple (two table) design would probably be best. This is rarely the case, history often needs to be online for at least 15 trading days so some analysis needs to be done up front to ferret out the requirements.

The short answer boils down to "Yes, I think that MySQL can do what the OP needs relatively well."

-PatP
__________________
In theory, theory and practice are identical. In practice, theory and practice are unrelated.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On