Offering PDFs that download
When a PDF opens inside a browser window, most people
are confused. They are using a browser, and they expect only a
browser. To work around this, many websites open PDFs in a new
window, which is also annoying and confusing. I have a much better
solution.
There is also a translation to
Spanish available thanks to Pablo Noel. It's a bit outdated,
though.
Offer for download
A browser can open only a handful of file types: HTML, CSS, XML and
some images. If you open another type of file, you're asked whether you
want to save it or open it in its native application.
The ideal way for offering PDF files would be just that:
ask the user to either save it, or view it in their native PDF
viewer. Here is a demonstration, my
MSc thesis.
Unfortunately, the most widely used PDF viewer, Adobe Acrobat
Reader, installs itself as a browser plugin. This is the cause of the
annoying PDF-opening-in-browser effect. There is a simple way of
preventing this, though.
Every file that is offered in a response to an HTTP request is
contained in an "HTTP response". Easy, isn't it? This HTTP response
contains headers that tell your browser about the type of file, when
it was last modified, what size it has, etc. In order to prevent Adobe
Acrobat Reader from opening your file directly in the browser window,
add the following HTTP header to the response:
Content-disposition: attachment
How do you do that? That's what the following sections are
about.
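Before the server-specific recipes, here is a minimal sketch of the idea in Python, using only the standard library. It is not part of any setup described on this page; the names disposition_for, PdfDownloadHandler and serve, and port 8000, are made up for illustration. The handler serves files from the current directory and appends the header whenever the requested path ends in ".pdf".

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


def disposition_for(path):
    """Return the Content-Disposition value for a request path, or None."""
    # Drop any query string, then compare the extension case-insensitively.
    path = path.split("?", 1)[0]
    return "attachment" if path.lower().endswith(".pdf") else None


class PdfDownloadHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory; PDFs are sent as attachments."""

    def end_headers(self):
        # end_headers() is called once the other headers have been queued,
        # so one more header can still be appended here.
        value = disposition_for(self.path)
        if value is not None:
            self.send_header("Content-Disposition", value)
        super().end_headers()


def serve(port=8000):
    """Run the sketch locally; the port number is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), PdfDownloadHandler).serve_forever()
```

Calling serve() in a directory that contains PDFs should be enough to try it out. Note that this simple version adds the header to every response for a path ending in ".pdf", including error responses such as a 404.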
Apache 2
On Apache 2, this is all trivially easy. Just place the following
in a .htaccess file in the same directory as your PDFs, and you're
done.
SetEnvIf Request_URI "\.pdf$" requested_pdf=pdf
Header add Content-Disposition "attachment" env=requested_pdf
This requires mod_headers,
which is shipped with Apache2. On Ubuntu, it needs to be enabled with
the "a2enmod headers" command.
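To check whether the configuration took effect, you can look at the response headers yourself. The following is a small sketch in Python; the function names is_attachment and check, and the example URL, are assumptions for illustration only. It sends a HEAD request and reports whether the server answered with an attachment disposition.

```python
import urllib.request


def is_attachment(headers):
    """True if the Content-Disposition header asks for a download."""
    value = headers.get("Content-Disposition", "")
    # The disposition type comes before any parameters such as filename=...
    return value.split(";", 1)[0].strip().lower() == "attachment"


def check(url):
    """Send a HEAD request to url and inspect the response headers."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return is_attachment(response.headers)


# Example call, with a hypothetical URL:
# print(check("http://localhost/docs/thesis.pdf"))
```

If check() returns False for one of your PDFs, the .htaccess file is probably not being read, or mod_headers is not enabled.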
Apache 1.3
On Apache 1.3 it is almost as easy as with Apache 2.0:
<FilesMatch "\.pdf$">
Header add Content-Disposition "attachment"
</FilesMatch>
This also works with Apache2 if you add the "<FilesMatch..." block
above to your virtual host container. For more info, see the Apache
documentation of the Files
and FilesMatch
directives.
Courtesy of Darren Clark
Django
This is very easily done in Django. Put the following in
your urls.py:
from django.conf.urls.defaults import *

urlpatterns = patterns('',
    ...
    ...
    (r'^pdfs/(?P<filename>[a-z0-9A-Z_\-]*\.pdf)$', 'pdf'),
    ...
    ...
)
The pdf() function in your view could then be something like this:
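import os
from django.http import HttpResponse

def pdf(request, filename):
    fullpath = os.path.join(PDF_PATH, filename)
    # Read in binary mode; the with block closes the file afterwards
    with open(fullpath, 'rb') as f:
        response = HttpResponse(f.read())
    response['Content-Type'] = 'application/pdf'
    response['Content-disposition'] = 'attachment'
    return response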
Please note that Django has been designed for dynamic content.
If you can, use another solution for your static files, like Apache 2.
PHP
If you're using PHP, you can
write a simple PHP file that reads the PDF and outputs it to the
browser. In this article, I'll assume the script is named
'mypdfscript.php'.
The best way of linking to that PHP script is by using
'mypdfscript.php/somefile.pdf' as the URL. Your webserver will
understand it has to execute 'mypdfscript.php', but your browser
thinks it's downloading 'somefile.pdf' and will name the downloaded
file as such.
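<?php
// The file name is the part of the URL after the script name
$pdf = substr($_SERVER['PATH_INFO'], 1);
// Allow only simple names ending in ".pdf" (the dot must be escaped)
if(preg_match('/^[a-zA-Z0-9_\-]+\.pdf$/', $pdf) == 0) {
    // Escape the name before echoing it back to the browser
    print "Illegal name: " . htmlspecialchars($pdf);
    return;
}
header('Content-type: application/pdf');
header('Content-disposition: attachment; filename=' . $pdf);
// 'path_to_pdfs' is a placeholder; include the trailing slash
readfile('path_to_pdfs' . $pdf);
?>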
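The same PATH_INFO trick is not specific to PHP. As a rough sketch, here is what it could look like as a Python WSGI application, using only the standard library. This is an illustration under assumptions, not something tested on a particular server: make_app and NAME_RE are made-up names, and the directory you pass in plays the role of 'path_to_pdfs' above.

```python
import os
import re

# Simple names ending in ".pdf"; no slashes or dots elsewhere are allowed,
# which also rules out paths such as "../secret.pdf".
NAME_RE = re.compile(r"[A-Za-z0-9_-]+\.pdf")


def make_app(pdf_dir):
    """Build a WSGI application that serves PDFs from pdf_dir as downloads."""

    def application(environ, start_response):
        # PATH_INFO starts with a slash, e.g. "/somefile.pdf"
        name = environ.get("PATH_INFO", "")[1:]
        if not NAME_RE.fullmatch(name):
            start_response("400 Bad Request", [("Content-Type", "text/plain")])
            return [b"Illegal name"]

        full_path = os.path.join(pdf_dir, name)
        if not os.path.isfile(full_path):
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not found"]

        with open(full_path, "rb") as f:
            body = f.read()
        start_response("200 OK", [
            ("Content-Type", "application/pdf"),
            ("Content-Disposition", 'attachment; filename="%s"' % name),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return application
```

The application reads the whole file into memory, which keeps the sketch short but is not a good idea for very large PDFs.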
Crappy server?
If your server doesn't understand it has to execute
'mypdfscript.php' for the URL 'mypdfscript.php/somefile.pdf', try this
code:
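<?php
// The file name is passed in the query string instead
$pdf = $_GET['pdf'];
// Allow only simple names ending in ".pdf" (the dot must be escaped)
if(preg_match('/^[a-zA-Z0-9_\-]+\.pdf$/', $pdf) == 0) {
    // Escape the name before echoing it back to the browser
    print "Illegal name: " . htmlspecialchars($pdf);
    return;
}
header('Content-type: application/pdf');
header('Content-disposition: attachment; filename=' . $pdf);
// 'path_to_pdfs' is a placeholder; include the trailing slash
readfile('path_to_pdfs' . $pdf);
?>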
With the above code, you should link to the PDF as mypdfscript.php?pdf=somefile.pdf.
Publicfile
Publicfile is a
very small and simple web server. Kenji Rikitake sent me a patch for
version 0.52, the latest as of this writing. The patch adds the proper
Content-disposition header to the HTTP reply, in case of a PDF.
Download the patch: filetype.c.diff
Microsoft IIS
How to apply this technique to Microsoft IIS is described by
Microsoft themselves in How To Raise a
"File Download" Dialog Box for a Known MIME Type.
What else?
Do you know of another way of adding HTTP headers to a response
when a PDF file is requested? Please let me know
and I'll add it to this page.