The quest to store a large number of files in a database or on a regular file system began when the number of files to manage went beyond 1 million in one of our projects. We started noticing a huge drop in the performance of our Windows server. After investigating, we found that using a pure file system to store more than a million files is really a bad solution. The following are the key points.

  1. Each file open/close requires a kernel operation, because the OS needs to perform various security checks.
  2. Every file operation requires at least some minimal locking, which uses your CPU in kernel mode.
  3. Because these are kernel operations, they slow down the entire server, even if your file access is of lower priority than any transactional database operation.
  4. Every service on the server slows down.
  5. The operating system itself manages a file system database (typically known as the File Allocation Table, with similar names on other systems).
  6. These databases are optimized, but not for millions or trillions of files; they work well under a million files.
  7. It causes huge fragmentation on your storage media.
  8. Not only the media, but the file system database itself gets fragmented, leading to further slowdown.
  9. The durability of the storage media can also suffer because of heavily fragmented usage.
  10. Backup and restore operations take too long. You can easily conduct an experiment with any FTP client: upload a single 1 MB file, then upload 100 files of 10 KB each. Although the total transfer is only 1 MB either way, you will notice that uploading the 100 smaller files takes longer.
  11. Keeping track of changed/deleted items is even more difficult.
  12. Tools for replication and change tracking on file systems aren't available for huge file stores.
  13. The web server will need write access to some folders, which is always a big security hole.
  14. A file system crash means an operating system crash, and that is a huge challenge to recover from.
  15. For transferring files you need to open FTP and develop tools to replicate them.
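The cost of many small transfers (point 10 above) is easy to reproduce locally. The sketch below is only an illustration of the idea, with file names and sizes chosen arbitrarily: it compares one 1 MB write against 100 writes of 10 KB each, where every small write pays its own open/close kernel transition.

```csharp
using System;
using System.Diagnostics;
using System.IO;

var sw = Stopwatch.StartNew();
File.WriteAllBytes("big.bin", new byte[1000000]); // one 1 MB file
TimeSpan oneBig = sw.Elapsed;

sw.Restart();
var chunk = new byte[10000];
for (int i = 0; i < 100; i++)                     // 100 x 10 KB files,
    File.WriteAllBytes("small_" + i + ".bin", chunk); // same 1 MB total
TimeSpan manySmall = sw.Elapsed;

// the many small writes usually take noticeably longer than the one big write
Console.WriteLine(oneBig + " vs " + manySmall);
```

The total bytes are identical; only the per-file overhead differs.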

Finally, after all these observations, we started considering other options.

  1. Network File System: it looks promising, but the other end of the node is simply another operating system, with the same problems mentioned above.
  2. Distributed file systems like Hadoop and similar tools require too much learning and management; I think they are good when you need file storage as huge as Facebook's or Yahoo's. We are a mid-size company, and our need is 1 TB for now.

Database Solution:

Finally, we began testing a database solution with MySQL, and then we moved on to MS SQL; both have advantages and disadvantages that we will discuss further on.

We found the following advantages over file systems; solution details are provided later in this article.

  1. Database systems have matured enough to handle large data easily today.
  2. They use a small number of files to store all relational data in a well-organized form.
  3. They keep their database files open all the time, so kernel locking operations are reduced.
  4. You can think of the entire server's load as now being concentrated only on your database.
  5. Databases are much more advanced in scheduling queries; you can reduce the priority of your file database, and the database will easily give more performance to your other transactional databases.
  6. Data storage is transactional, whereas file systems are not completely transactional, and making them transactional requires more programming.
  7. Remember, if it is your customers' data, even a small picture/icon or video, and they are paying for your service, then it is better to store those media files in transactional storage.
  8. A database crash doesn't crash the entire server, nor even other databases on the same server.
  9. Backing up and restoring databases is easier because of the built-in tools.
  10. Replication of databases is much easier.
  11. Change tracking is available in databases.
  12. Database access is standardized; it does not depend on FTP/HTTP-like tools.
  13. A database can easily take over from failed neighbor nodes in the network.
  14. You can easily query the database and perform multiple operations on files.
  15. The biggest point: your web server does not need write permission on any folder!

Comparison between MySQL and MSSQL for file storage

  1. MySQL offers replication over the internet; MSSQL doesn't, so you will have to write your own replication program.
  2. MSSQL offers an auto-grow option for the database file, which reduces fragmentation. MySQL's InnoDB has no such auto-grow option for the database file, which leads to more fragmentation.
  3. MSSQL offers easy incremental backup/restore; moreover, in case of a crash, you can use the actual data folder files to restore. MySQL's InnoDB doesn't support hot backups, so you either buy their hot backup tool or rely on MySQL's backup tool, which is not incremental.
  4. MSSQL costs $1600 for the Standard edition, whereas MySQL with InnoDB is free, but the hot backup or incremental backup tools will cost about the same. MSSQL wins over MySQL by offering better file management and increased disk durability.

Sample Solution

After reading various articles, we concluded that storing an entire file in one blob or image column doesn't make sense, as it badly deteriorates performance.

So we came up with an easy solution, one already used by existing file systems: breaking the file down into smaller blobs of at most 512 KB.

Since MySQL and MSSQL both have a network packet size limit, you should not store more than 512 KB in one row. Otherwise you will have to increase the network packet size, and that will slow down the database server. The following is the small table structure we used.

Table: NSFiles

NSFileID (auto number, bigint)
NSFileName (varchar(50))
NSContentType (varchar(100)) ** This is most important for web access

Table: NSFileBlobs

NSFileBlobID (auto number, bigint)
NSFileID (bigint, indexed, references NSFiles.NSFileID)
NSBlobIndex (int)
NSBlobData (medium blob or image)
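Given the 512 KB per-row limit discussed above, a quick sketch shows how many NSFileBlobs rows a file of a given size occupies. The BlobCount helper is illustrative, not part of the schema:

```csharp
using System;

const int ChunkSize = 512000; // the per-row limit in bytes used in this article

// number of NSFileBlobs rows needed for a file of the given size;
// only the final row may hold less than a full 512 KB chunk
long BlobCount(long fileSize) =>
    fileSize == 0 ? 0 : (fileSize + ChunkSize - 1) / ChunkSize;

Console.WriteLine(BlobCount(1000000)); // a 1 MB file needs 2 blob rows
```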

The following are the storage/retrieval methods, written in C# using LINQ or a similar ORM tool.

//Upload Handler
NSFile file = new NSFile();
file.NSFileName = .....
file.NSContentType = Request.Files[0].ContentType;
file.Insert(); // <-- insert into database and get new unique NSFileID

byte[] buf = new byte[512000];
int i = 0;
// HttpPostedFile exposes the uploaded bytes as a stream
using (Stream ms = Request.Files[0].InputStream)
{
    int count = ms.Read(buf, 0, 512000);
    while (count > 0)
    {
        NSFileBlob blob = new NSFileBlob();
        blob.NSFileID = file.NSFileID;
        blob.NSBlobIndex = i++;
        if (count == 512000)
        {
            blob.NSBlobData = buf;
        }
        else
        {
            // the last blob will always be smaller than 512 KB
            blob.NSBlobData = new byte[count];
            Buffer.BlockCopy(buf, 0, blob.NSBlobData, 0, count);
        }
        blob.Insert();
        count = ms.Read(buf, 0, 512000); // read the next chunk; 0 ends the loop
    }
}

And the following is simple code for the download handler.

// Download Handler
long fileID = long.Parse(Request.QueryString["FileID"]);
NSFile file = NSFile.Adapter.Get(fileID);
Response.ContentType = file.NSContentType; // important for the web browser
// the following line causes a "Save As" box on the client
Response.AddHeader("content-disposition", "attachment;filename=\"" + file.NSFileName + "\"");
// loop over all file blobs whose NSFileID is fileID, sorted by blob index
NSFileBlob.Adapter.ForEach(
    NSFileBlob.Schema.NSFileID == fileID,
    NSFileBlob.Schema.NSBlobIndex.SortAs(),
    (t) => {
        Response.OutputStream.Write(t.NSBlobData, 0, t.NSBlobData.Length);
    }
);


29 Responses to Huge File Storage in Database instead of File System

  1. Another solution could be partitioning your data using MySQL 5.1+. This way you would get better performance in both writing and reading. Also, the MySQL InnoDB engine supports a separate file for each table, so it's not necessary to have only one big file, as the default InnoDB configuration does. :)

    If you're interested in more details, you've got my email :)

  2. Akash says:

    If you use separate files per table, they grow very slowly, leading to huge fragmentation; the problem is that there is no auto-grow value. Compare the fragmentation of MySQL's data folder with MS SQL's: we had it first in MySQL and then we moved to MS SQL.

  3. Hmm, yes, I didn't think it through :)

    But holding such a number of files would prove better on a Unix machine, especially open source, as there are multiple solutions for this kind of problem.

    For example, stripping the rights management from a file system (which is not such a crazy idea, and already implemented by Facebook), plus other optimizations.

    Not trying to start a war here :), just pointing out some facts. But then again, it's quite hard to do something like this when your company's profile is MS-based.

  4. Akash says:

    I appreciate your comments. Unix/Linux-based systems optimized purely for file system storage, as Facebook and other companies have built, can definitely be better than an MS or DB solution. I specifically mentioned that distributed file systems like Hadoop require good manpower: experts in Linux/Unix technologies, mainly at the C and compiler level. But they are difficult to find, hire, and retain. Hiring one system-level expert is 100 times harder than hiring ten SQL+ASP.NET developers. That's why I also mentioned that for solutions from 1 GB to 1 TB, this database-based system makes the easier bet. Beyond that, you certainly need dedicated staff to manage your files!

  5. Sarwar Faruque says:

    While having no intention of turning this into a Linux vs. Microsoft topic, have you considered using Linux for a file management system? Two years ago, I built a web-based document management system for a university. The system had user access permissions and document version tracking, running on a LAMP stack. While the permission and tracking part was shouldered by the database, all the files were stored on a separate Linux server, with a scheduler automatically backing up everything at 2-hour intervals. To date, the number of documents has been growing steadily, and while it hasn't hit the 1 million mark, it has certainly passed the half-million point in the last two years.

  6. Akash says:

    1. My post doesn't focus on whether to go for MS or Linux; it is about whether to use a file system or a database, and I am not against anyone storing files in a file system.
    2. File systems are not transactional; they are not ACID compliant.
    3. People can still use MySQL; we are not against anyone. We are talking here about locking, CPU overhead, and disk life cycle and durability.

  7. PUK says:

    I found this article helpful in bolstering what I already instinctively felt: that a database is better than a file system. It's much more portable, atomic, relational, transactional, etc., so thanks.

    PUK

  8. ypaez says:

    Great job, a nice and easy solution. But looking around this topic, I found that SQL Server 2008 includes a new feature called "FILESTREAM Storage": http://technet.microsoft.com/en-us/library/bb933995.aspx
    Have you tested this feature? Does it have better performance than your solution?

  9. Akash Kava says:

    FILESTREAM is good, but I don't know whether it is ACID compliant; it will be faster, but I don't know about replication and transactions. As far as I know it is raw file access within the SQL context, and it is not recommended over blobs. If your files are big and of low financial value, like everything uploaded to YouTube, you can use FILESTREAM for such data. But if you are creating a document management service, where your documents (files) have greater financial value, then you should not use FILESTREAM.

  10. Sachin Modak says:

    Hi Akash,

    I was reading through this blog and wondering whether you had considered a flat file/control file design to keep track of files. Also, something about storing files in a database just does not seem very performance-centric. I was wondering if we could discuss this sometime? BTW, we could even do a phone call. Let me know.

    sachin.modak@gmail.com

  11. Adriano says:

    Hi, very detailed article.

    I have a question.

    Do you think it might work fine even with MyISAM?

    I'm not talking about terabytes of data and powerful servers.
    I just want to save small attachments (images, PDFs, docs) in the same row as an article's data, on a basic Linux server.

    E.g.: title, description, keywords, body text, image (100 KB), PDF (200 KB), Word (100 KB).

    Do you think that would work? Or will it slow down, especially when I do a FULLTEXT search on (title, description, keywords, body text)?

    Thanks.

    Great Job,
    A

  12. Akash Kava says:

    MyISAM will not work. The reason is that MyISAM uses table locks, so every update will lock the entire table instead of one row, and that causes loss of data when the files are a little bigger (>4 KB). So I will not recommend MyISAM; we tried it once and it resulted in loss of data.

  13. Adriano says:

    Thanks Akash,
    so how do you manage FULLTEXT search?

    Do you use MyISAM tables only for FULLTEXT and then join the queries?

    I, for instance, have this kind of situation:

    I’m creating a system with a Unique ID for any kind of data in the DB.

    So I could have a new article (table 'Articles') with GUID 1,
    then an image with GUID 2 (table 'Images'),
    then a PDF with GUID 3 (table 'PDF'), and so on.

    Then maybe, for instance, the image data is divided across tables:

    1 row in the id_generator table (autoincrement),
    1 InnoDB table with only the three binary columns (small, medium, large) and the GUID,
    1 MyISAM table with the image title, description, keywords, etc.

    This means that, to avoid concurrency issues, each time I put a new record in the DB,
    I MUST lock the tables to make sure that no other new row added to the DB has the same GUID.

    Does that make sense? Or am I about to make a mess?

    Thanks.

  14. Akash Kava says:

    Well, we have not used FULLTEXT so far, because we are storing images, videos and photos, but I have heard bad things about MyISAM, as it keeps crashing for fulltext. However, implementing small fulltext search is not that difficult if your queries aren't going to be complex. If you create a WordFiles table with columns Word and FileID, store every word with the FileID where it appears, and index Word with a B+ tree, you will get better fulltext with InnoDB, as long as your searches stay simple.
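    A rough sketch of that WordFiles idea follows; the IndexWords helper and its separator list are illustrative assumptions, not from the discussion. It produces one (Word, FileID) row per word occurrence, ready for insertion into an InnoDB table indexed on Word:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// emit one (Word, FileID) row per word in the document text
static IEnumerable<(string Word, long FileId)> IndexWords(long fileId, string text)
{
    char[] separators = { ' ', '\t', '\r', '\n', '.', ',', ';', ':' };
    foreach (string w in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        yield return (w.ToLowerInvariant(), fileId); // lowercase so lookups are case-insensitive
}

var rows = IndexWords(42, "Huge file storage, huge files.").ToList();
Console.WriteLine(rows.Count); // 5 rows: huge, file, storage, huge, files
```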

  15. Michiel van Otegem says:

    Akash,

    I read your article with great interest. You make some very valid points about the disadvantages of a file system, and considering the number and size of the files that will have to be stored in the system I'm currently working on, database storage is definitely the way to go. I'd like to point out, however, that Windows Vista SP1 / Windows Server 2008 and up support transactions (see Wikipedia).

  16. Akash Kava says:

    Thanks for your comments. However, it's not only transactions: being able to query, replication, disk performance, and so on also need to be considered for huge systems.

  17. Ghazanfar says:

    Very nice article Akash,

    I was really looking for a comparison between saving files in a database or the file system but could not find any good resource on the internet, so I was fortunate this morning to find your article. It is really beneficial for me and my organization in making a decision, because we handle very huge data.

    Regards,
    Ghazanffer Javaid

  18. Very good inputs,

    I’ve a question.

    If we save files like .doc, .swf, etc., we need to save them as binary data. Would the data conversion needed while serving the request be performance-intensive? In the case of a file system, I don't think it needs conversion; I assume it's direct access to the file without any conversion.

    Hope my question makes sense.

  19. Akash Kava says:

    Conversion should not be done at the time of serving the file; instead, the most appropriate way is to convert and store the converted file along with the original. Conversion should be batched as well.

  20. soniya says:

    Why do we prefer to store data in a DBMS instead of files?

  21. Have you considered storing the files as Base64 strings?

  22. Akash Kava says:

    Sorry for replying late..

    Base64 encodes every 3 bytes as 4 characters, so it adds roughly 33% overhead as text, and more if those characters are stored as two-byte text. If you consider the time for encoding/decoding, it will consume more CPU time as well. You can send Base64 and expect clients to decode it, but even then, if the client is a mobile device, the extra CPU work will drain more battery. We are storing millions of files, and after nearly 5 years of research, everything listed here is the best option we can easily deploy and manage with limited manpower and limited knowledge of simple ASP.NET and database access.
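    A quick sketch of the Base64 overhead (numbers here are computed, not quoted from the discussion): every 3 input bytes become 4 output characters, so the encoded text is about a third larger, plus padding for the final group:

```csharp
using System;

byte[] data = new byte[300];               // 300 raw bytes
string b64 = Convert.ToBase64String(data);
Console.WriteLine(b64.Length);             // 400 characters: 4 output chars per 3 input bytes
```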

  23. Akash Kava says:

    I am not clear on your question. In general, there are lots of reasons for storing data in a DBMS: integrity, atomicity, etc. It is outside the context of this article, but I am sure you will find something useful on Google.

  24. Karolis Dautartas says:

    One drawback of storing files in MySQL is the difficulty of upgrading the MySQL server. Generally, in order to have a clean upgrade, you have to mysqldump the whole DB and then import it into the new server.

    mysqldump'ing a 2 TB database will take days, and importing it will take three times as long. You have to do this on a slave server, as dumping that much data from a live server would make it unusable for a very long period of time.

    If you keep your files in the same DB as the rest of your data, you get transaction support in return for complicated maintenance of all your data. You will probably not be able to use SSD drives for MySQL due to its size, although they could be a huge performance benefit.

    On the other hand, if you keep the files in a separate database, you lose transactions.

    It might be worth taking a look at cloud storage solutions like Amazon or Rackspace.

  25. Karolis Dautartas says:

    Also note that if you use the 32-bit version of MySQL, it will start crashing if you try to insert files larger than 300 MB or so. You can't take an older 32-bit server and run it as a backup slave.

    MySQL seems to load the entire data into memory during an insert, so the larger the file you try to insert, the more memory will be allocated on the server (even if you use prepared statements and the send_long_data API call). On a busy server, inserting one huge file can kill the entire site for seconds or even minutes.

    The current MySQL community server doesn't have streaming support. You will need a caching layer between the SQL server and the client. Again, for large files you will need to load the entire file from the DB, save it, and then send it to the client. Doing this the first time (creating the cache object) will be considerably slow.

    If you insert multiple big blobs inside transactions, committing a transaction may also be slow.

  26. Cesar says:

    Akash,
    Great work! I am trying to do the same on PostgreSQL, which is ACID. Do you think your method will work there?

  27. David says:

    Thanks for the article. With MSSQL, if I store a 1 GB file in the database, will the database's size increase by a gigabyte? I'm trying to see which saves more space, the file system or the database. Based on your article, the benefits of storing in the database probably outweigh the space savings, but I would like to know the difference.

    Thanks,
    David

  28. SQLpro says:

    You say:
    "MySQL offers replication over internet, MSSQL doesnt, you will have to write your own replication program."
    Since the 2005 version, MS SQL Server offers mirroring, which is a database replication solution and can be synchronous or asynchronous over HTTP (the internet).
    And to be complete, there are also 5 different mechanisms for replicating data (some tables, or parts of tables) from one database to another:
    1) transactional replication
    2) snapshot replication
    3) merge replication
    4) peer-to-peer replication
    5) Service Broker.
    All these mechanisms can be done over HTTP, but not all are recommended to be.
    (5) Service Broker is especially designed for it, with natural encryption via signed data.
    (2) Snapshot replication is especially designed for long latencies and long distances (it needs an FTP server to work with).

    A +
