A couple of days ago I met with a potential client who posed an interesting challenge. He wanted to upload very large files (upwards of 6GB to 7GB) to his server using PHP. I told him it was possible, but that I don’t recommend it. There are a few reasons why.
One of the biggest problems I have with most web applications is file uploads. As a developer, I rarely need to upload anything larger than, say, 100MB; that covers a lot of code files, libraries and so forth. Generally, in an OOP world, if a file is larger than a couple of megabytes you’re probably doing something wrong anyway.
But I’m a developer. I work on a daily basis with people who aren’t developers. Hi-res photos, PSDs, even digital media files such as AVIs and MOVs, all need to be considered when building an application. A complicated PSD, for instance, can easily grow to upwards of two gigabytes.
And therein lies the problem.
Let’s face it: PHP, for all its glory, is quite simply not the best language for uploading files. This is a three-pronged problem. The first prong is PHP, which I’ll elaborate on. The second is HTTP. HTTP is just not suitable for receiving large files; don’t get me wrong, it’s possible, but I question the wisdom. The third is actually client-based: it takes time to upload a file, and most people I know (especially folk who aren’t web-savvy) get irritated when it doesn’t work immediately.
Take, for instance, what most people do to get around the upload problem: override PHP’s default limits. Formerly, you could set these variables using ini_set() and ini_restore(), but some of the overrides, upload_max_filesize and post_max_size among them, can’t be changed at runtime through these methods. You can set the options directly in a php.ini file, but chances are you raise them via an .htaccess file instead.
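An override of that kind typically looks something like the sketch below. The values are illustrative assumptions sized for a 7GB file, not a recommendation, and php_value directives only take effect when PHP runs as an Apache module (mod_php).

```apache
# .htaccess (illustrative values, assumed for a 7GB upload)
# Largest single file PHP will accept
php_value upload_max_filesize 7G
# Largest request body; must be at least as big as the file limit
php_value post_max_size 8G
# Seconds PHP may spend receiving the request (assumed: two hours)
php_value max_input_time 7200
```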
Make no mistake, this is bad, and a potential security hole. Why? Because with the limits raised that high, someone could easily set up a denial of service attack and bring your server to its knees.
Addressing the Problem
Okay, so now that I’ve said that uploading files as big as we’re talking is a bad idea, what’s the solution?
First, address the HTTP problem. We’ve got several options, but something I found a few years ago is very promising. Take the challenge outside PHP.
Meet Tramline (http://infrae.com/products/tramline), an upload and download accelerator that plugs into Apache. Using mod_python, it provides a direct means to bypass PHP altogether and hook into Apache. With a bit more security tweaking, it is an acceptable solution. There are more alternatives too: cURL, WebDAV, Java, streams, Python and Perl, to name a few.
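To give a feel for the stream-based approach, here is a minimal sketch in Python. It is not how Tramline works internally; it only shows the general idea of reading an upload in fixed-size chunks and writing each one straight to disk, so the file never sits in memory as a whole, and of cutting the transfer off once a size cap is passed. The names save_stream, UploadTooLarge and CHUNK_SIZE are hypothetical, made up for this example.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an assumed value, tune as needed


class UploadTooLarge(Exception):
    """Raised when the incoming stream exceeds the allowed size."""


def save_stream(source, destination, max_bytes, chunk_size=CHUNK_SIZE):
    """Copy source to destination in chunks; return the number of bytes copied.

    Raises UploadTooLarge as soon as more than max_bytes have been read,
    so an oversized upload is rejected without being stored in full.
    """
    copied = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty bytes object means end of stream
            break
        copied += len(chunk)
        if copied > max_bytes:
            raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
        destination.write(chunk)
    return copied


if __name__ == "__main__":
    # Stand-ins for a request body and an open file on disk.
    incoming = io.BytesIO(b"x" * 2500)
    stored = io.BytesIO()
    print(save_stream(incoming, stored, max_bytes=10_000, chunk_size=1024))
```

In a real service the source would be the request body and the destination a file opened in binary mode; the cap is what keeps a single client from filling the disk.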
At this point, it is fair to say that one should choose the right tool for the job. This is also true of the client upload problem. It is completely unacceptable to use a file control, have the user hit submit and expect them to wait there for five minutes or more. That said, there are alternatives: SWFUpload, YUI Uploader and various other JavaScript methods, and with a bit of tweaking I’m sure they could be modified to utilize a service.
6GB-7GB? Easy. Just don’t stick to your PHP guns.
I’m interested to hear how other folks have handled large file uploads.