Hello forum,
I am new here, so I hope I am posting this in the correct section; otherwise please move it ... and sorry for my bad English.
My problem: I would like to store a complete file / folder structure with additional information (inode, mtime, ctime, maybe a checksum) in a database.
The purpose is to replicate file / folder structures between Linux / Unix servers. While the service is running, all changes that occur can be caught (inotify / fsnotify) and written to the database. But if the service is not running and files or folders on the file system have changed, a complete scan is necessary to bring the database back to a consistent state.
In such a complete scan, each file has to be checked: is it already in the database, and does its ctime / mtime on the file system differ from the value stored in the database?
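To make the complete scan concrete, here is a minimal sketch of what I mean. It is written in Python only for illustration (the real service would be Perl), a plain dict stands in for the database records, and the function names snapshot and diff_snapshots are made up for this example.

```python
import os

def snapshot(root):
    """Walk root and return {relative_path: (inode, mtime_ns, ctime_ns)}."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: describe symlinks themselves, do not follow them
            rel = os.path.relpath(full, root)
            result[rel] = (st.st_ino, st.st_mtime_ns, st.st_ctime_ns)
    return result

def diff_snapshots(stored, current):
    """Compare the stored records (stand-in for the database) with a fresh scan."""
    added = sorted(p for p in current if p not in stored)
    removed = sorted(p for p in stored if p not in current)
    changed = sorted(p for p in current
                     if p in stored and current[p] != stored[p])
    return added, removed, changed
```

The idea would be to load the stored records once, scan the file system once, and compare the two in memory, instead of sending one database query per file.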
Btw: The programming language will be Perl ...
What I have already tried:
- MySQL: Build the hierarchy with foreign keys, storing folder name, id and parent id (foreign key) in the database -> really slow and inefficient!!!
- MySQL: Build the hierarchy with nested sets -> 20 - 40 times faster than the first attempt, but still not that fast!!!
- MySQL / Perl: Save the complete path in the database and build a "hash tree" in Perl -> fast, but uses much more space (if you have 1000 subfolders, the parent folder's path is stored 999 times too often!!!)
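For reference, the first attempt (parent id as foreign key) looked roughly like the sketch below. This is a simplified reconstruction in Python, with an in-memory SQLite database standing in for MySQL; the table name node and the helpers add_path and full_path are invented for this example.

```python
import sqlite3

def build_db():
    """Create an in-memory database with one row per file or folder name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE node (
            id        INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES node(id),
            name      TEXT NOT NULL
        )""")
    conn.execute("CREATE INDEX node_parent_name ON node (parent_id, name)")
    return conn

def add_path(conn, path):
    """Insert each component of path at most once; return the id of the last one."""
    parent = None
    for name in path.strip("/").split("/"):
        row = conn.execute(
            "SELECT id FROM node WHERE parent_id IS ? AND name = ?",
            (parent, name)).fetchone()
        if row is None:
            cur = conn.execute(
                "INSERT INTO node (parent_id, name) VALUES (?, ?)",
                (parent, name))
            parent = cur.lastrowid
        else:
            parent = row[0]
    return parent

def full_path(conn, node_id):
    """Rebuild the full path by following parent_id up to the root."""
    parts = []
    while node_id is not None:
        name, node_id = conn.execute(
            "SELECT name, parent_id FROM node WHERE id = ?",
            (node_id,)).fetchone()
        parts.append(name)
    return "/" + "/".join(reversed(parts))
```

Each folder name is stored only once, but both inserting and resolving a path need one query per path component, which I suspect is a large part of why this variant was so slow for me.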
The database should also hold other data, e.g. configuration options of the servers etc.
By now, I am wondering whether MySQL (relational) is the correct choice for storing a file / folder structure (hierarchical) ...
With Perl, the handling of XML files is quite easy and efficient ... so might an XML-like database or just a simple XML file be the right choice?
I could also take a simple XML file and read the whole file into a "hash tree" in Perl ... so the "database" would be directly in memory and "part" of the program ... would that be efficient?
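By "hash tree" I mean nested hashes, one level per folder. A minimal sketch, again in Python only for illustration (Perl hashes of hashes would look much the same); the keys "children" and "meta" and the function names insert and lookup are my own invention.

```python
def insert(tree, path, meta):
    """Store meta for path, creating one nested dict per path component."""
    node = tree
    for name in path.strip("/").split("/"):
        node = node.setdefault("children", {}).setdefault(name, {})
    node["meta"] = meta

def lookup(tree, path):
    """Return the stored meta for path, or None if nothing is stored for it."""
    node = tree
    for name in path.strip("/").split("/"):
        node = node.get("children", {}).get(name)
        if node is None:
            return None
    return node.get("meta")

tree = {}
insert(tree, "/srv/data/a.txt", {"inode": 11, "mtime": 1700000000})
insert(tree, "/srv/data/b.txt", {"inode": 12, "mtime": 1700000500})
```

Here every folder name exists only once in memory, and a lookup costs one hash access per path component. What I cannot judge is how much memory this needs with millions of entries.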
The whole solution should be capable of storing the information for 100,000 folders and 1,000,000 files ... better would be 1,000,000 folders and 10,000,000 files.
Any ideas?
Best regards,
lousek