Digitisation specifications for paper records
This page outlines the technical specifications for digitising paper records in public offices. It aims to ensure that digitisation produces reliable, authentic copies, which supports the use of GA45: Original or source records which have been copied.
The guidelines cover a range of document types and provide minimum requirements for colour, resolution, compression, and file formats. Also see GA45: Original or source records which have been copied.
When to use this specification
This specification is for:
- digitising records to improve business processes
- back-capture projects
- scanning incoming records like mail or correspondence.
It applies to:
- files and documents
- maps and plans
- bound and unbound volumes, registers, and publications
- photographs within documents.
It does not apply to records created before 1980, or to photographic series, film, or audio-visual material.
Technical specification
The technical requirements for digitisation include:
- colour mode and bit-depth:
  - 1-bit bi-tonal: clean, high-contrast, word-processed black and white documents that contain only text and line art. Not all black and white documents have enough contrast for bi-tonal scanning, so testing may be required.
  - 8-bit greyscale: greyscale or black and white documents, including those that contain watermarks, grey shading, or grey graphics.
  - 24-bit colour: coloured documents and documents with discrete colour used in text or diagrams.
- resolution: 300ppi at a 1:1 ratio for accurate reproduction
- compression: lossless compression is recommended; smart or lossy methods are allowed if artefacts are minimised
- file formats: choose formats that ensure long-term accessibility, support text recognition, and are compatible with relevant software.
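The colour mode guidance above can be read as a simple decision rule. The sketch below is one hedged interpretation in Python, not an official tool: the function name and its three yes/no inputs are assumptions made for illustration, and the specification itself notes that testing may still be needed before settling on bi-tonal capture.

```python
def choose_colour_mode(has_colour: bool, has_grey_content: bool, high_contrast: bool) -> str:
    """Suggest a colour mode for a document (illustrative helper, not part of the specification).

    has_colour:       colour is used anywhere in the text or diagrams
    has_grey_content: watermarks, grey shading, or grey graphics are present
    high_contrast:    the document is clean, with strong black and white contrast
    """
    if has_colour:
        return "24-bit colour"
    if has_grey_content or not high_contrast:
        # Low-contrast black and white documents are not suited to bi-tonal capture.
        return "8-bit greyscale"
    return "1-bit bi-tonal"


if __name__ == "__main__":
    # A clean word-processed letter with no grey or colour content.
    print(choose_colour_mode(has_colour=False, has_grey_content=False, high_contrast=True))
    # A form with a grey watermark.
    print(choose_colour_mode(has_colour=False, has_grey_content=True, high_contrast=True))
```

The rule checks for colour first because a single coloured element is enough to require 24-bit capture under the list above.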
The output should meet or exceed the input data quality to preserve the image's integrity. Avoid up-sampling, as it can reduce quality. It may be appropriate to increase the technical specifications depending on the format of the physical record and the amount of detail it contains. For example, a higher resolution might be needed to ensure that fine detail or print on a map is reproduced legibly.
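To make the 300ppi at 1:1 requirement concrete, the following Python sketch works out the pixel dimensions a page needs, estimates its uncompressed size in each colour mode, and checks whether a captured image reaches the minimum. The function names, and the A4 page size used in the example, are assumptions chosen for illustration. The size estimate ignores file headers and row padding.

```python
MM_PER_INCH = 25.4
MIN_PPI = 300  # minimum resolution in the specification, at a 1:1 ratio

# Bits per pixel for the three colour modes in the specification.
BITS_PER_PIXEL = {"bitonal": 1, "greyscale": 8, "colour": 24}


def required_pixels(width_mm: float, height_mm: float, ppi: int = MIN_PPI) -> tuple[int, int]:
    """Pixel dimensions needed to capture a page of the given size at `ppi`, 1:1."""
    return (round(width_mm / MM_PER_INCH * ppi), round(height_mm / MM_PER_INCH * ppi))


def uncompressed_bytes(width_px: int, height_px: int, mode: str) -> int:
    """Rough uncompressed image size in bytes (no headers, no row padding)."""
    bits = width_px * height_px * BITS_PER_PIXEL[mode]
    return (bits + 7) // 8  # round up to whole bytes


def meets_minimum_resolution(width_px: int, height_px: int,
                             width_mm: float, height_mm: float,
                             ppi: int = MIN_PPI) -> bool:
    """True if a native capture has at least the pixels required at `ppi`."""
    need_w, need_h = required_pixels(width_mm, height_mm, ppi)
    return width_px >= need_w and height_px >= need_h


if __name__ == "__main__":
    a4 = required_pixels(210, 297)  # A4 page: 210 mm x 297 mm
    print(a4)                                # (2480, 3508)
    print(uncompressed_bytes(*a4, "colour"))  # 26099520
    # A 150ppi capture of the same page falls short of the minimum.
    print(meets_minimum_resolution(1240, 1754, 210, 297))  # False
```

The check is only meaningful for the scanner's native capture dimensions. An image up-sampled to reach the pixel count would pass the arithmetic while still failing the intent of the specification.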
File formats
When choosing file formats, consider:
- long-term sustainability and accessibility
- compression needs
- the ability to store metadata
- compatibility with text recognition software (OCR)
- compatibility with relevant software.
For colour images, use sRGB settings for consistency across devices.
For a list of sustainable formats consult our guidance on sustainable file formats (PDF 4.7MB).
Image enhancements
Only use image enhancements, such as sharpening or background removal, when necessary. Avoid enhancements if high-quality images can be obtained without affecting the original record’s authenticity.
Variations on specifications
The above technical specification is intended to enable easy conformance with the requirements of General retention and disposal authority: Original or source records that have been copied (GA45), that is, where the intent is to destroy the physical originals after digitisation. It also meets quality expectations for archival transfer.
Public offices can choose to relax requirements for short-term records (records required to be retained for 10 years or less) or for reference-use copies. The decision to do so should be made only after confirming that the proposed specifications are appropriate to meet all reasonable business uses, including text or optical character recognition (OCR) for content search and discoverability.
The decision to lower specifications should be documented and supported by usability testing.
Documenting requirements
Documenting image capture specifications is essential to maintain a clear record of the digitisation process. This ensures consistency and meets the requirements of ongoing or back-capture digitisation projects.
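One possible way to keep such a record is as a small structured file stored alongside each digitisation project. The Python sketch below is an assumption about how that could look, not a prescribed schema: the class name, the field names, and the example values (including the TIFF and LZW entries) are all hypothetical and should be replaced with whatever the public office actually uses.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CaptureSpecification:
    """Image capture settings for one digitisation project (hypothetical schema)."""
    record_type: str
    colour_mode: str
    bit_depth: int
    resolution_ppi: int
    compression: str
    file_format: str
    colour_profile: str = ""
    enhancements: list[str] = field(default_factory=list)
    # Filled in only when the settings are lower than the standard specification.
    variation_reason: str = ""


def to_json(spec: CaptureSpecification) -> str:
    """Serialise the specification so it can be filed with the project records."""
    return json.dumps(asdict(spec), indent=2, sort_keys=True)


if __name__ == "__main__":
    spec = CaptureSpecification(
        record_type="incoming correspondence",
        colour_mode="colour",
        bit_depth=24,
        resolution_ppi=300,
        compression="lossless (LZW)",  # example value only
        file_format="TIFF",            # example value only
        colour_profile="sRGB",
    )
    print(to_json(spec))
```

Keeping the reason for any variation in the same record means the decision to lower specifications is documented with the settings it affected.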