If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Open Question to Design Gurus

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 01-21-09, 01:26
ATXBroker ATXBroker is offline
Registered User
 
Join Date: Jan 2009
Posts: 2
Question Open Question to Design Gurus

Hi there folks. Short time reader, first time poster.

I'm working on a relatively simple application for displaying real estate listings and doing some other processing on the listing data.

Our current MLS database of listings is about 100k rows. About 15k of those listing rows are active and of the most interest to folks and, to be honest, to my application.

Our MLS system allows up to 12 photos per listing. This means that I'll have about 65k photos of about 70k each if I were to download them all. Not every listing has all 12 photos.

The issue I'm having is trying to figure out how to store these files so that they are easy/fast for me to access programatically and also so that when I'm manipulating the data on the server that I don't have a folder with 65k files in it. The current ftp process that the MLS provides me dumps all the files into a single folder and it sucks. They are kind enough to use a simple naming system like [MLS#]-[picture index].jpg

I can do a new pull from the master MLS system anytime so the data does not need to be backup up or redundant, and it usually doesn't change, but on occasion it may if someone changes the photos for a given listing. The only time writes to the files are typically done is in a batch process as incremental updates occur daily.

I searched the forum and noted some other suggestions, but to be honest I'm not sure which system would be good for my application and for allowing it to grow as it does every single year with the addition of new listings.

What I came up with were the following:
1) Store all photos in the file system in a big directory. This sucks.
2) Store the photos in the file system, but do a mod100 on the mls# and then store the files in 100 separate folders hoping that the random MLS numbers generated are evenly distributed.
3) Store all files in the database
4) Store a reference to each file in the database and leave the files in the file system.

I am using mysql, php, and dojo to create my application if that helps with your giving me insight. My server seems plenty capable for the limited load that I'll have initially. It's a dual core Xenon @ 2.33GHz, 4 Segate SATAII drive of 400GB each, 4GB RAM, and we're running SUSE and Win2k3 64bit.

Any help would be great. And if I'm leaving key info out to help just let me know.

Thanks
Joe
Reply With Quote
  #2 (permalink)  
Old 01-21-09, 02:56
pootle flump pootle flump is offline
King of Understatement
 
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
I confess I didn't read the problem, just the solutions, and peiced the problem together from that
Most here would go for 1) or 4). There's one regular on here, who's name I forget, who likes to store images in databases, but most of us don't.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
Reply With Quote
  #3 (permalink)  
Old 01-21-09, 04:41
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
We had a lively discussion on this which is worth reading. I believe most Unix's will work with any number of files in a directory but you will run into performance problems with 65k files in a directory - even doing an ls will be slow.

I assume you can't access the image files on the original server (ie not ftp them across) because life would be much easier for you if you could. Another slightly different way of doing this would be to only pull the main picture for each listing then, if the user wants to see more pictures of a place, you could check to see if you already have the pictures and ftp them if not. A nightly process could then remove the additional pictures if they haven't been accessed in a month.

How do you currently intend to remove the images on properties that have now been sold? What does MLS mean?

Last edited by mike_bike_kite; 01-21-09 at 05:26.
Reply With Quote
  #4 (permalink)  
Old 01-21-09, 05:07
gvee gvee is offline
www.gvee.co.uk
 
Join Date: Jan 2007
Location: UK
Posts: 10,156
There's also nothing stopping you from creating a folder structure such as

\\images\1\
\\images\2\
etc

Where the folder name is the same as the property primary key...
__________________
George
Twitter | Blog
Reply With Quote
  #5 (permalink)  
Old 01-21-09, 09:15
ATXBroker ATXBroker is offline
Registered User
 
Join Date: Jan 2009
Posts: 2
@pootle
Thanks. I think number 4 sounded the best to me. When I'm debugging and stuff I like to check and make sure that I have the right photo (either from our FTP or from the application) by hand so the 65k thing is really annoying.

@mike
Perfect post. I had found another post, but it didn't really go into the pros and cons as much. I liked the point about being able to edit and access the files easily in the file system. solution 4 files in FS and links in DB sounds right to me.

Regarding the MLS. That is broker-speak for Multiple Listing Service. Another fancy sounds more important than it is word for database. I hadn't given much thought to removing the data or archiving it. I was really thinking more along having a separate table for 'old' data. Those would include listings that are expired, sold, or withdrawn from the market.

Then for most of the users (i.e. prospects looking at the data from who knows where), the relevant data is the Active, unsold, listings. That data would be optimized to handle load wrt searching/fetching photos.

The cool thing is that I can use the sold data and allow agents to do things like estimate market value based on searches and aggregates that I'll create. The only thing is that our MLS restricts the viewing of sold data to agents and brokers of the MLS. We are a non-disclosure state in Texas so property sales prices are not public record.

@george
I suppose that would work so long as I didn't make the force of habit mistake of opening the root directory using a GUI. I'll consider that as well because it seems pretty organized also.

I appreciate the help!

Joe
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On