If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Database Design Question. - pic to help explain

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 07-10-09, 06:55
ameyer ameyer is offline
Registered User
 
Join Date: Jul 2009
Posts: 32
Database Design Question. - pic to help explain

I working on this project code.bildr.org ( safari and firefox only) that works a little like SVN. It's a wiki for code (very different than github).

You have a project with files and folders, and when you change a file, it saves it. Right now you can only see the history of individual files because of how it works.

I want to be able to save and remember the state of the entire project, so when a folder is added or a file changed or deleted, I can go through the history, and flip through so I can go back to an older state when files/folders were added, changed or deleted. It's a lot like SVN in a way.

I understand how it would work, but I cant figure out the database. Here is a pic I made to try to explain what I need.

Database Design Question. - pic to help explain-db.jpg

I figure save states need to keep track of the files in the project, but if 2 files change and 5 do not, I want to only write new file entries for the ones that changed, and just note that this save state uses old files.

It's weird to me because it is like saying save 1 is associated with file 1,2,3,4 but save state 2 is associated with 1,2,5,6,7. The overlap is what is throwing me through a loop.

I thought about making a table that tracks this, but it seems incredibly wasteful as I would have tons of row entries for every save.

Any help would be most appreciated.

-Adam

Last edited by ameyer; 07-10-09 at 07:02.
Reply With Quote
  #2 (permalink)  
Old 07-10-09, 07:58
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
I think you could do this with the following tables and fields :
Author: id, name, email, pwd, etc
Project: id, author_id, name, descr etc
Files:id, project_id, name, file_type, file_date, status_type, author_id, txt, descr, change_comment
If a user wants to download the live code for a project then the system would look for all the code associated with the project and just take the records with the latest date that have a live status.

A few small points:
  • status_type would hold something like dev, beta, live. This will need a lookup table.
  • file_type would hold something like php code
  • descr would hold a description of the code or project
  • change_comment would hold a list of changes made
You might want to :
  • implement version numbers with dates and lists of changes
  • be able to store lists of bugs against a project or file
  • store feedback by users against a project or file
  • have a picture for a project
  • store the whole file each time or you could just store the changes
  • compress files before storing in the database if required
Mike
Reply With Quote
  #3 (permalink)  
Old 07-10-09, 08:06
ameyer ameyer is offline
Registered User
 
Join Date: Jul 2009
Posts: 32
Thanks Mike.

What would happen if a user wanted to grab an old version? How would the system know what files/folders are associated with that version? Or revert to an older version?

As for features. This system is actually only the back end of a single feature of the site. The rest is all driven on mediawiki (wikipedia's software). So commenting and users etc will all be done through that.
Reply With Quote
  #4 (permalink)  
Old 07-10-09, 08:44
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Personally I'd just use a date ie give me the live project code as of "1 Mar 2009". You'd then just ignore files older than this date. I mentioned versioning but that's a whole new bunch of additions and wouldn't really add to the overall functionality. You could always just list the versions along with the release dates and let the user enter that date. If you do want versions then you'd need a new table:
Version : project_id, version_name, date
but you'd need to decide on what to do if a project leader doesn't enter new (or any) versions and whether files should be linked in the database to a version or whether it will be done by date alone.
Reply With Quote
  #5 (permalink)  
Old 07-10-09, 09:49
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
Right. If you associate each individual file change with a date, find the latest change for any file as of any given date is not difficult. At the point, the concepts of "save states" or "snapshots" is irrelevant, because your point-in-time recoverability supersedes them.
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
Reply With Quote
  #6 (permalink)  
Old 07-10-09, 09:53
ameyer ameyer is offline
Registered User
 
Join Date: Jul 2009
Posts: 32
I guess im missing something.

If I delete a file from a new version. Or I add one half way through, how do I know what files were part of that save state?
Reply With Quote
  #7 (permalink)  
Old 07-10-09, 11:29
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Let's say your project consists of 3 files
Code:
1 Jan 2009 initial release of live code
      file_1
      file_2
      file_3

on 1 Feb 2009 do a live release of just one file
      file_2

on 1 Mar you release live code for other 2 programs
      file_1
      file_3
Now if you want to download the whole project you'll pull the latest vesion of each file (1st Mar for files 1 and 3, 1st Feb for file 2).

If you want to download the project as it stood in the middle of Feb then you'd pull the latest files at this point (1st Jan for files 1 and 3, 1st Feb for file 2).

If you want to get rid of the file that was released in Feb then I guess you could just delete it. Personally I think it should stay there otherwise you are messing with history and bad thinks happen (ask Doctor Who).

The code to pull the latest files for a given date might be something like this (this is just a quick guess) :
Code:
select file
from   Files f.name
where  f.project_id = $project_id
       and f.file_date = ( select max(d.file_date)
                           from   Files d
                           where  d.project_id = f.project_id
                                  and d.id = f.id
                                  and d.file_date < $supplied_date )
Reply With Quote
  #8 (permalink)  
Old 07-10-09, 12:26
ameyer ameyer is offline
Registered User
 
Join Date: Jul 2009
Posts: 32
I really appreciate this. Thank you.

Quote:
Originally Posted by mike_bike_kite
If you want to download the project as it stood in the middle of Feb then you'd pull the latest files at this point (1st Jan for files 1 and 3, 1st Feb for file 2).
So that's the issue I see. What if in Jan it is files 1,2,3 but Feb is file 2,3,4
Then I need to know not to include file 1 when looking at February's state.

But what keeps track of what files to include, and what not to include.
Reply With Quote
  #9 (permalink)  
Old 07-10-09, 13:17
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
If you want the ability to get rid of files then why not just include an empty file_txt field in files for this file in Feb. The SQL can be altered to just select non empty files in the outer section of the SQL and that should work for you.
Code:
select file
from   Files f.name
where  f.project_id = $project_id
       and f.txt > ""
       and f.file_date = ( select max(d.file_date)
                           from   Files d
                           where  d.project_id = f.project_id
                                  and d.id = f.id
                                  and d.file_date < $supplied_date )
It's well worth it to just create some simple tables, insert the example data and try the SQL. It should produce the correct results for a given date - no promises though as I've only looked at the problem for a few minutes while you have hopefully been looking at it for longer.
Reply With Quote
  #10 (permalink)  
Old 07-10-09, 14:35
ameyer ameyer is offline
Registered User
 
Join Date: Jul 2009
Posts: 32
Ok.. I see what you are saying.

So later on when there are thousands of non used files will it be an issue?
Reply With Quote
  #11 (permalink)  
Old 07-10-09, 14:53
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Originally Posted by ameyer
So later on when there are thousands of non used files will it be an issue?
So far you haven't given any idea of the number of file changes you'll get per day, the size of these files or how much history you want to keep. If you want to store the contents of the files in the database and you want to keep a history for all files then so be it. You can reduce the storage requirements by compressing files before saving them or just saving the differences from one version to the next. Storing thousands of records isn't much of an issue to databases, some folks are storing billions of records. Correct indexes allow you to jump to the correct record instantly.
Reply With Quote
  #12 (permalink)  
Old 07-10-09, 21:01
ameyer ameyer is offline
Registered User
 
Join Date: Jul 2009
Posts: 32
Id imagine that some projects will have 10s of edits a day. The files them selves will most often be only in the hundreds of lines of code.

Im trying to imagine how I would/could revert to an earlier state without making copies of the code with this. I was thinking I could do it this way... but I assume it is not a good idea, but im not sure why not.


Code:
id		save_state	save_file
1		1		1
2		1		2
3		1		3
4		1		4
5		2		1
6		2		5
7		2		6
8		2		7
9		2		4
10		3		8
11		3		5
12		3		9
13		3		7
14		3		10
15		4		11
16		4		9
17		4		7
18		4		12

SELECT * FROM save, files WHERE save_state=$save AND save_file = file_id
Reply With Quote
  #13 (permalink)  
Old 07-11-09, 04:40
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
I'd store the complete files each time at 1st and get your system working from there. There's no point complicating the problem until you've got something working first.

As a next stage I'd then look at the diff and ed commands in Unix shell (assuming you're on Unix) to see how you could just save the differences between one version and the next rather than storing the whole file each time.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On