To digitize a couple of dozen books at home, and to do it fast, I decided to use a digital camera. In this post I detail exactly what I did, how I did it, and why (includes video at end of post).
As I have said before, an office printer/scanner is currently the best way to digitize a book. However, if the books you want to digitize are at home, you need to bring them with you to the office or library. If time is on your side, I see no problem in this. Simply carry one volume every other day, spend half an hour to an hour at the copier, and eventually you will have pulled it off.
Recently, for me, time became a constraint as I was about to move to a different continent and would not be taking my books with me. The first thing I did was weed out all books I definitely, absolutely, did not need to have in digital format. Even when you apply strict criteria, it turns out there are still tons of books that meet those criteria and can safely be discarded (that is, put in storage). In a later post I will go into greater detail on how to select books you would want to have/keep in digital form.
The second thing I did was make sure I did not already have a digital copy of the books that were left. It turned out that I already had a digital version of a rather large number of them. I discarded these as well. As time was running short, I even discarded those books for which I only had a digital version of a different edition.
This left me with only a couple of dozen books. Inspired by the do-it-yourself book scanners that people have been making, I decided I would like to have a camera shooting from the top, using a remote so I would be able to both turn the pages and trigger the camera without moving my hands much. Additionally, I realized strong, bright light would be important. The people who build DIY book scanners use two cameras, each focussed on only one page. I decided to go for one camera shooting both pages. Whereas the DIY book scanners tend to use cheap compact cameras, I was not entirely sure this would work well, so I decided it would be better to overshoot rather than undershoot. I therefore borrowed a DSLR camera, a tripod, and a remote control.
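One consequence of shooting both pages with a single camera is that the same number of pixels is spread over roughly twice as much paper. The following is a minimal sketch of that arithmetic, not a measurement from my setup: the 6000-pixel frame width, the 15 cm page width, and the 2 cm margin are made-up illustrative numbers, `effective_ppi` is a hypothetical helper name, and the comparison assumes the long side of the frame runs across the width of the book in both cases.

```python
CM_PER_INCH = 2.54


def effective_ppi(pixels_across: int, width_cm: float) -> float:
    """Pixels per inch on the paper when `pixels_across` pixels cover `width_cm`."""
    if pixels_across <= 0 or width_cm <= 0:
        raise ValueError("pixel count and width must be positive")
    return pixels_across / (width_cm / CM_PER_INCH)


# Hypothetical numbers, chosen only for illustration.
FRAME_WIDTH_PX = 6000   # pixels along the long side of the frame
PAGE_WIDTH_CM = 15.0    # width of a single page
MARGIN_CM = 2.0         # background left visible on each side

# One camera covering the full spread (two pages plus margins).
one_camera = effective_ppi(FRAME_WIDTH_PX, 2 * PAGE_WIDTH_CM + 2 * MARGIN_CM)

# One camera per page (a single page plus margins).
two_cameras = effective_ppi(FRAME_WIDTH_PX, PAGE_WIDTH_CM + 2 * MARGIN_CM)

print(f"one camera, both pages: {one_camera:.0f} ppi")
print(f"one camera per page:    {two_cameras:.0f} ppi")
```

Plugging in your own camera's pixel count and the width you actually frame gives a quick estimate of whether a single camera leaves enough resolution for OCR. Guidance for OCR software often mentions something in the region of 300 ppi, but treat that as a rule of thumb rather than a hard limit.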
First setup
For the first attempt, the only investment I made was buying a flood light, which I thought would work well as a light source and would otherwise always find some good use in and around the house. I installed the tripod on a table, with the camera mounted directly facing the table. The flood light was hung approx. 10 cm away from the camera, also directly facing the table.
One drawback I noticed from the start is that I am not proficient with DSLR cameras and therefore do not know exactly what kind of settings would work best. The internet is, as far as I know, empty of information on this. I shot in P-mode, which took care of a lot of things automatically whilst still allowing me to recalibrate the white balance and set a slight overexposure. My thinking behind the strong light source, the recalibration, and the overexposure was to try to make the pages as white as possible. It turned out I was right to be concerned with this, though this first setup did not bring the best results.
In the next two images you can see the result for this setup. The first was taken in a dark room with only the flood light on. The second was taken with daylight allowed to come in. Somewhat to my surprise, this proved to be much better. Still, I thought I could do better. I also noticed that the tripod was awkwardly in the way. The fix is easy (postproduction editing), but it did make it more difficult to shoot books of various sizes.
First setup with only flood light as light source
First setup with flood light and daylight
Second setup
The first setup left me with undesirable results. Whilst I was preparing for a major upgrade, which would become the third setup, I decided to try out a variant of the first setup, one that would be highly mobile. Instead of placing the tripod on the table, I now extended its legs and placed it just in front of the table. The camera was mounted at a slight angle facing the table, so that the bottom edge visible in the photos would match the side of the table. The flood light was simply placed on a stack of books on the table just in front of the camera. When you see this in the video, it gives the impression of providing ample light, but as the next images show, it actually gives highly irregular light with deep shadows. It is surprisingly dark in some places.
Second setup photographing a stiffly bound paperback
Second setup photographing grainy paper – a lot of shadows
Third setup
With the third setup I was quite successful. It required a bit more investment, but this definitely paid off. I bought and constructed an adjustable stand for the camera which allows for true positioning above a surface, I bought black cloth to create a sharper background contrast, and most importantly, I bought a set of soft box lights. This dramatically improved lighting conditions. I placed the camera stand on top of a table (instructions will follow in a separate post), on top of which I placed the black cloth. I lit the table with two soft box lights, one from either side, made sure the camera was tightly screwed on, and started snapping. Not only was this setup easy to operate, the results were very good. Both the camera stand and the soft boxes were absolutely worth their price. The black cloth is perhaps not necessary when photographing books, as these images will be edited later on (explained in a future post), but for manuscripts and the like, for which these images are the end result, the black cloth adds value.
Third setup – white is white, black is black
Third setup photographing ivory paper
Third setup photographing glossy paper
Third setup photographing manuscript-like material
Conclusion
I prepared a video in which I demonstrate these setups. Notice especially the high speed with the last book, which is an OUP hardcover. Their binding is so stiff that it is impossible to flatten it using just fingers at the corners, so instead I chose not to flatten it at all. This will obviously give worse results after editing, but on the other hand it does bump up the speed. My 19th-century prints from Istanbul were bound remarkably well; they fell flat open by themselves on nearly every page. Perhaps Oxford can take note.
Suggestion: Every time you begin photographing, place a grey card in the shot for white balance correction. Do the same whenever you change paper or lights. Adobe Lightroom can take the custom white balance from the grey card image and apply it to all the photos in seconds. FineReader will be able to OCR your photos as well. (There’s no need to use the highest resolution for the OCR to work well. Try different image sizes and qualities.)
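For readers curious what a grey card correction amounts to, here is a minimal sketch of the idea in plain Python. It is a simplification and not how Lightroom works internally: it operates on ordinary 8-bit RGB values rather than raw sensor data, and the function names and sample pixel values are hypothetical. The idea is that the card should come out neutral, so each colour channel is scaled until the card's average red, green, and blue are equal, and the same scaling is then applied to every photo taken under the same light.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def grey_card_gains(card_pixels: Sequence[Pixel]) -> Tuple[float, float, float]:
    """Per-channel gains that make the average of the grey card patch neutral."""
    if not card_pixels:
        raise ValueError("need at least one grey card pixel")
    n = len(card_pixels)
    means = [sum(p[c] for p in card_pixels) / n for c in range(3)]
    if min(means) <= 0:
        raise ValueError("grey card patch has an empty colour channel")
    target = sum(means) / 3  # keep overall brightness roughly unchanged
    return (target / means[0], target / means[1], target / means[2])


def apply_gains(pixels: Sequence[Pixel],
                gains: Tuple[float, float, float]) -> List[Pixel]:
    """Scale every pixel by the gains, clamping the result to the 0-255 range."""
    return [
        (
            min(255, max(0, round(p[0] * gains[0]))),
            min(255, max(0, round(p[1] * gains[1]))),
            min(255, max(0, round(p[2] * gains[2]))),
        )
        for p in pixels
    ]


# Hypothetical sample: a grey card photographed under warm (yellowish) light.
card_patch = [(140, 128, 100), (142, 126, 98)]
gains = grey_card_gains(card_patch)

# The same gains are reused for every page shot under the same lights.
page_pixels = [(141, 127, 99), (230, 215, 180)]
print(apply_gains(page_pixels, gains))
```

The gains only hold for one lighting situation, which is why the card needs to be photographed again whenever the paper or the lights change.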
Have you come across Booksorber or Scan Tailor for processing the images? Do you use something else?
I use ScanTailor. You can see it in action here: https://www.youtube.com/watch?v=dFFJJHVGFVE
Have you tried Booksorber? It’s a paid alternative. The only benefit I can see is perhaps that it has automatic finger removal (for when you need to hold the edges of pages down to take a shot).
Would you describe your kit of parts for your third setup? I’ve had a few false starts putting together an overhead rig for my DSLR camera. Your setup looks like a good solution. I particularly like the way you used an offset crossover clamp to make the height adjustable.
Any details you might share would be appreciated.
It’s a simple construction, really. It is made from DIY store items: a wooden board, two standard pipes, one metal base mount, one pipe-to-pipe ring, and one standard camera mount ring (screw wire). This way you can adjust both height and width with the one pipe-to-pipe ring. It would take more effort to get a really good description up; it’s been a long while and I don’t have it near me. But yes, this solution ensures that the camera can be straight above the item, at a right angle to it, without casting any significant shadow. Good luck!
Would you describe the details of your third setup? I’ve had a number of false starts building a DIY overhead DSLR rig. Your adjustable horizontal pipe offset crossover clamp is a great idea.
After using Scan Tailor, is there a way to OCR the text and make it searchable? And is there a way to automatically index a book after it has been given this treatment?
https://helpx.adobe.com/acrobat/how-to/scan-to-searchable-pdf.html
With Adobe Acrobat XI Pro or Adobe Acrobat Pro DC
What is the best book-scanning software for Mac?