Preventing Duplicate Paper Survey Uploads/Files

Last updated: 15 January 2025

Accurate survey data is essential, especially when responses are collected and processed on paper. PaperSurvey.io runs two checks to stop accidental duplicate uploads from skewing your results. Below, learn how each check safeguards your data and what options you have if duplicates are flagged.

1. Comparing Unique Page Identifiers

Applies only to surveys using unique page identifiers.

How It Works

  • Reading Identifiers: Each page’s unique ID, page number, and survey ID (e.g., page 1, unique ID: 91, survey ID: 991) are checked upon upload.
  • Automatic Check: If a previously processed page has the same identifiers, the new upload is marked as a duplicate and excluded from final data.
  • Optional Disable: To turn this check off, enable the “Allow duplicates” toggle in survey settings. This is useful when your pages do not carry truly unique identifiers (e.g., multiple copies were printed from the same PDF by accident).
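
The identifier check described above can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not PaperSurvey.io's actual implementation: the names `PageKey` and `DuplicateChecker` and the `allow_duplicates` field are hypothetical stand-ins for the behaviour described in the list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PageKey:
    """Hypothetical record of the three identifiers read from a page."""
    survey_id: int
    unique_id: int
    page_number: int


@dataclass
class DuplicateChecker:
    """Hypothetical checker that remembers every page key it has seen."""
    allow_duplicates: bool = False          # mirrors the "Allow duplicates" toggle
    seen: set = field(default_factory=set)  # keys of previously processed pages

    def is_duplicate(self, key: PageKey) -> bool:
        """Return True if the page should be flagged as a duplicate."""
        if self.allow_duplicates:
            return False        # check disabled: every page is accepted
        if key in self.seen:
            return True         # same identifiers as an earlier page
        self.seen.add(key)
        return False


checker = DuplicateChecker()
page = PageKey(survey_id=991, unique_id=91, page_number=1)

first = checker.is_duplicate(page)    # first upload of this page
second = checker.is_duplicate(page)   # same identifiers uploaded again
other = checker.is_duplicate(PageKey(survey_id=991, unique_id=91, page_number=2))

print(first, second, other)  # False True False
```

In this sketch a page counts as a duplicate only when all three identifiers match, which is why page 2 of the same response is accepted.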

Key Benefits

  • Accurate Results: Eliminates repeated entries to ensure a clean dataset.
  • Time Savings: Reduces manual page-by-page reviews.

Incorrectly-detected duplicates

The duplicate detection mechanism can occasionally flag pages that are not true duplicates. This can happen if pages with unique page markings are accidentally printed several times, so that different respondents fill in copies carrying the same identifiers.

There are a few options you can choose:

  • Mark Resolved: If the page isn't needed (e.g. the same page was scanned twice), mark it as resolved.
  • Retry Processing: Click the "Retry" button on the uploads page to process detected duplicates as new responses. Before doing so, check that these are not simply pages scanned twice.
  • Disable Unique Page Identifiers: The identifier at the bottom-left corner will be ignored as if it did not exist.
  • Enable 'Allow duplicates': Keep identifiers in place but skip the duplicate check for this survey.

2. Comparing File Hashes

Applies to all surveys

How It Works

  • SHA-1 Hash Calculation: Before processing, a SHA-1 hash is calculated for each document and each page and compared against the hashes of existing uploads.
  • Duplicate Check: Any page that matches a previously uploaded hash won’t be processed again.
  • Optional Disable: To turn this check off, enable the “Allow duplicates” toggle in survey settings. This is useful when your pages do not carry truly unique identifiers (e.g., multiple copies were printed from the same PDF by accident).
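
As a rough illustration of the hash comparison described above, the following Python sketch uses the standard library's `hashlib` module. It is a minimal sketch, not PaperSurvey.io's actual code: the `HashRegistry` class and the `should_process` method are hypothetical names, and the in-memory set stands in for whatever storage the real system uses.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of the given bytes as a hex string."""
    return hashlib.sha1(data).hexdigest()


class HashRegistry:
    """Hypothetical registry of the hashes of everything uploaded so far."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def should_process(self, data: bytes) -> bool:
        """Return False if identical bytes were already uploaded."""
        digest = sha1_hex(data)
        if digest in self._seen:
            return False         # exact same content seen before: skip it
        self._seen.add(digest)
        return True


registry = HashRegistry()
page_bytes = b"example page content"   # placeholder for a real page file

print(registry.should_process(page_bytes))  # True
print(registry.should_process(page_bytes))  # False
```

Because the comparison is on the file's bytes, this kind of check works for any survey, with or without unique page identifiers.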

Key Benefits

  • Prevents duplicate pages: Blocks the same file from being counted more than once, even across multiple uploads.

Limitations

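  • Scanning Variations: Rescanning the same page usually produces a slightly different file, and therefore a different hash, so this check will not catch it.
  • Editing Software: Any modification to a file, such as re-saving it in editing software, may change its hash.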
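
The limitations above follow from how hashes behave: identical bytes always give the same SHA-1 digest, while even a one-byte difference gives a different one. A small Python sketch, using made-up placeholder bytes rather than real scans, shows this:

```python
import hashlib

# Placeholder bytes standing in for two scans of the same paper page.
original = b"scanned page bytes"
rescanned = original + b"\x00"   # a single extra byte, as a stand-in for scanner noise

original_hash = hashlib.sha1(original).hexdigest()
repeat_hash = hashlib.sha1(original).hexdigest()
rescanned_hash = hashlib.sha1(rescanned).hexdigest()

same = original_hash == repeat_hash         # identical file uploaded twice
differs = original_hash != rescanned_hash   # visually identical, different bytes

print(same, differs)  # True True
```

This is why hash comparison catches the same file uploaded twice but not the same sheet of paper scanned twice.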
