I recently inherited a job where I'll be dealing with tens of millions of rows of historical data and maintaining about 500,000 current rows in a separate table (at least that's what I hope to do).
My problem is this: I'm aggregating data 8 times a day, 3 hours between each update. Each update will handle around 500,000 rows and have to check if anything has changed on several colums of each row in order to decide whether to store historical data or not. Here is how I envision this process happening currently:
select id from current_user where username = 'username' limit 1;
To get the ID of the user. Then this is what trips me up... If that query returns a row (an ID) then I need to check if several of the rows have changed and if any of them have changed to update those several rows and also add a new row to the historical data table. I have no idea how to do this in SQL.
And, of course, if the ID returns 0 rows to insert a new user.