Split an A4 PDF file into two A5 PDF files
November 9, 2013
This week I scanned a couple of receipts in order to submit them to a client with an expenses claim. The two receipts were small enough to fit together on an A4 page, so I laid them both on the bed of a scanner and produced a PDF file with the two receipts on one A4 page. The resulting scan looked rather like the sketch in Figure 1. A couple of days later, however, I remembered that the client wanted copies of receipts to be submitted individually. By then the office scanner had stopped working, so I needed to split the original PDF file into two PDF files containing images rather like the sketches in Figures 2 and 3.

Figure 1 - original A4 image

Figure 2 - A5 bottom half of original A4 image

Figure 3 - A5 top half of original A4 image
I was surprised to find little information on the Web about how to do such a thing. I did come across three applications that looked as if they could probably do the job: jPDF Tweak, BRISS and krop. jPDF Tweak is in the Portage main tree (app-text/jpdftweak); BRISS is not. The krop Web page has a link to an ebuild which could be downloaded and installed via a local overlay. I had a brief play with jPDF Tweak and it looks powerful, but I did not find it particularly intuitive and I would have needed to study the manual in detail. Anyway, I thought I would try a command line approach for the fun of it.
Searching the Web, I came across a site with a Perl script that looked promising: "Split (crop) double page PDFs in two", posted by someone called iblis (to whom I'm grateful). It uses the Perl module PDF::API2, which is in the Portage main tree (dev-perl/PDF-API2) and which I already had installed. I modified the last couple of lines of the script very slightly, so it now looks like this:
#!/usr/bin/env perl
use strict;
use warnings;
use PDF::API2;

my $filename = shift || 'test.pdf';
my $oldpdf = PDF::API2->open($filename);
my $newpdf = PDF::API2->new;

for my $page_nb (1..$oldpdf->pages) {
    my ($page, @cropdata);

    # First copy of the page: keep the left half by halving the right edge.
    $page = $newpdf->importpage($oldpdf, $page_nb);
    @cropdata = $page->get_mediabox;
    $cropdata[2] /= 2;
    $page->cropbox(@cropdata);
    $page->trimbox(@cropdata);
    $page->mediabox(@cropdata);

    # Second copy: keep the right half by moving the left edge to the middle.
    $page = $newpdf->importpage($oldpdf, $page_nb);
    @cropdata = $page->get_mediabox;
    $cropdata[0] = $cropdata[2] / 2;
    $page->cropbox(@cropdata);
    $page->trimbox(@cropdata);
    $page->mediabox(@cropdata);
}

# Name the output after the input, e.g. rotated.pdf -> rotated.split.pdf
(my $newfilename = $filename) =~ s/(.*)\.(\w+)$/$1.split.$2/;
$newpdf->saveas($newfilename);

__END__
I saved it with the file name split_pdf_A4_to_A5.pl and made it executable:
$ chmod +x split_pdf_A4_to_A5.pl
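To make the box arithmetic in the Perl script a little more concrete, here is a rough shell sketch of the same two calculations. It is purely illustrative: the function names left_half and right_half are my own invention, and the input values assume a landscape page whose MediaBox is 0 0 842 595 in whole points (A4 is roughly 595 x 842 points).

```shell
#!/bin/sh
# Illustrative only: mirrors the box arithmetic in the Perl script.
# A MediaBox is four numbers: llx lly urx ury
# (lower-left corner and upper-right corner, in points).

# left_half LLX LLY URX URY
# First imported copy: the right edge (urx) is halved.
left_half() {
    echo "$1 $2 $(( $3 / 2 )) $4"
}

# right_half LLX LLY URX URY
# Second imported copy: the left edge (llx) is moved to urx / 2.
right_half() {
    echo "$(( $3 / 2 )) $2 $3 $4"
}

# Assumed landscape A4 box in whole points (hypothetical input values).
left_half 0 0 842 595    # prints: 0 0 421 595
right_half 0 0 842 595   # prints: 421 0 842 595
```

Each resulting box is 421 x 595 points, which is about the size of a portrait A5 page. Note that, like the script, this arithmetic only finds the middle of the page when the lower-left x coordinate is 0.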
Then I used the following procedure to split the original PDF file:
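1. The script splits each page into a left half and a right half, whereas my receipts were in the top and bottom halves of the page. So I first used the excellent command line utility pdftk (the package app-text/pdftk in Gentoo), which I had already installed, to rotate the A4 page clockwise by 90 degrees and save it in a file named rotated.pdf: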
$ pdftk original.pdf rotate 1-1right output rotated.pdf
Now the A4 page looked similar to the sketch in Figure 4.

Figure 4 - original A4 image, rotated clockwise by 90 degrees
2. I used the Perl script to split the A4 page into two A5 pages within one PDF file:
$ ./split_pdf_A4_to_A5.pl rotated.pdf
The above command created a file rotated.split.pdf containing two A5 pages.
3. Finally, I split the two-page PDF file into two separate single-page files:
$ pdftk rotated.split.pdf burst
which left me with two A5 PDF files named pg_0001.pdf and pg_0002.pdf (the default names pdftk gives to burst output), similar to the sketches in Figures 2 and 3 above.
Mission accomplished. 🙂
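For next time, the three steps could probably be chained in a small wrapper. What follows is only a sketch under a few assumptions: the helper names run and split_a4 are made up for illustration, pdftk is on the PATH, and split_pdf_A4_to_A5.pl sits in the current directory. Setting DRY_RUN=1 only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of a wrapper around the three steps (hypothetical helper names).
# Assumes pdftk is installed and split_pdf_A4_to_A5.pl is in the current directory.

# run CMD...: print the command, then execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# split_a4 INPUT.pdf: rotate the page, split it into two A5 pages,
# then burst the result into single-page files. Stops at the first failure.
split_a4() {
    run pdftk "$1" rotate 1-1right output rotated.pdf &&
    run ./split_pdf_A4_to_A5.pl rotated.pdf &&
    run pdftk rotated.split.pdf burst
}

# Example (dry run, prints the three commands only):
#   DRY_RUN=1 split_a4 original.pdf
```

The intermediate file names rotated.pdf and rotated.split.pdf are the same ones used in the steps above, so the wrapper overwrites any files of those names in the current directory.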