Migration sampling strategy: how many tickets to validate before go-live

You do not need to check every ticket before go-live. You need to check the right tickets.
If I were planning a support migration today, I’d use a risk-based sample: 5%–10% of closed historical tickets, 15%–25% of open or pending tickets, and 50%–100% of high-risk in-flight cases. For VIP or top-tier accounts, I’d review 100% of active tickets.

Here’s the short version:

  • I’d split validation into historical, open, and in-flight tickets
  • I’d group tickets by risk, such as SLA exposure, account tier, workflow depth, automation use, and channel/language
  • I’d use a separate golden sample of 20–100 tickets for rule testing
  • I’d review 5–10 tickets per language for multilingual queues
  • I’d set defect limits before review starts:
    • 0% for data loss, security issues, and routing failures
    • Under 2% for SLA or field-map issues in core reports
    • Under 5% for medium issues
    • Under 10% for low-impact formatting issues

What matters most is simple: don’t rely on random sampling alone. If routing, automations, timestamps, attachments, notes, or SLA timers fail, agents will feel it on day one. So I’d size the sample by business risk, not by one flat percentage.

Migration Ticket Sampling Strategy: Sample Sizes & Defect Thresholds at a Glance

Migration Ticket Sampling Strategy: Sample Sizes & Defect Thresholds at a Glance

Quick comparison

Ticket groupSuggested coverageMain checks
Closed historical5%–10%Counts, field maps, search, attachments
Open / pending15%–25%Status, owner, comment order, delta changes
In-flight / high-risk50%–100%SLA timers, routing, automations, integrations
Top-tier active accounts100%Full thread, ownership, workflow continuity
Multilingual queues5–10 per languageEncoding, routing, AI/tag behavior
Golden test set20–100 ticketsOldest/newest, long threads, odd records, former agents

If I had to sum up the article in one line, it would be this: sample by risk, validate with one pass/fail standard, and make go-live a business decision, not just a data move.

Define the validation scope

Start by setting the scope. Historical, open, and in-flight tickets each need a different kind of review. That part matters up front, because the ticket type tells you how much checking each sample needs.

Separate historical, open, and in-flight tickets

Closed historical tickets help confirm reporting accuracy, searchability, and data completeness. They also support AI training data.

Open tickets changed during the delta window check whether delta sync is pulling changes the right way, and whether ownership and status mapping arrive in the new system as expected.

In-flight tickets are active cases expected to stay open during the migration weekend [3]. These are the highest-risk records. They test live workflow continuity, SLA state, and integrations at cutover.

Once the scope is clear, the next step is to rank tickets based on the risks most likely to break cutover.

Map the risk factors that affect sample coverage

Random sampling sounds neat on paper, but it often misses the tickets most likely to cause trouble. Coverage should be based on risk, not luck.

Risk FactorWhy It MattersFailure Risk
Account tierVIP and large accounts often need full review of active in-flight casesVIPs routed to general queues due to unmapped Customer Tier fields
SLA exposureTickets with active SLA timers must preserve escalation timingHigh-priority tickets stalled because escalation triggers failed
Workflow complexityLong threads, private notes, and state changes are harder to migrate cleanly"Pending" tickets defaulting to "Closed", losing case context
Automation dependencyRecords that trigger webhooks, macros, or third-party syncs need explicit testingSilent failure of auto-assignment or broken cross-team workflows
Channel and languageEmail, chat, and web forms behave differently across systems; multilingual tickets affect AI signal extractionRouting failures or AI sentiment models missing urgency in non-English tickets

Use these risk factors to sort tickets into validation buckets and set coverage based on risk. Those factors then feed the sample matrix, where tickets are grouped into validation buckets.

Build the sample matrix

Once you’ve mapped your risk factors, put them into a sample matrix. This matrix shows which ticket groups to review, how many to pull, and what each group needs to prove before go-live. The goal is simple: cover the ticket types most likely to break routing, mapping, SLA rules, or automation logic.

Start by turning those risk factors into buckets.

Group tickets into validation buckets

A validation bucket is a ticket segment that shares the same risk profile. You can define buckets by status, group, priority, channel, and date range so each meaningful ticket management workflow has its own category [3].

Here’s what that can look like in a B2B support operation:

Validation BucketQueue / WorkflowChannelPriorityLanguageCustomer TierAutomation / SLA Dependency
Enterprise EscalationsTier 3 / TechnicalChat / PhoneUrgentEnglishPlatinum / StrategicCustom SLA, Manager Alert
Standard Email CasesTier 1 / GeneralEmailNormalMultilingualStandardAuto-responder, Tagging
Renewal-Risk AccountsSuccess / BillingEmailHighEnglishHigh-ValueCRM Sync, Renewal Trigger
Multilingual QueuesRegional SupportWeb FormNormalThai / TagalogAll TiersLanguage-based Routing
Automation-HeavyBot-to-HumanChat / APILowEnglishAll TiersAI Summary, Webhook Trigger
Closed HistoricalRead-only / ClosedAllAllAllAllNo active SLA; check attachments

Each bucket should line up with the way it can fail. Chat tickets need checks for transcript integrity. Email cases need checks for threading and attachments. Automation-heavy flows need an end-to-end check from intake through closure [1][3].

After that, size each bucket based on volume and risk.

Give extra weight to rare but high-risk ticket types

Some ticket types are risky even if they don’t show up often. That usually includes enterprise escalations with custom SLA rules, long-running incidents, tickets linked to renewal-risk accounts, cases assigned to former agents or deleted users, and tickets with 50+ comments or large or unusual attachments [1][3].

For enterprise escalations, review 100% of active cases and at least 10% of historical records [3]. For renewal-risk accounts, review the full thread history for your highest-value accounts [3]. For multilingual queues, sample 5 to 10 tickets per language to catch translation, encoding, and routing issues [3].

Those buckets will drive sample sizes by risk level in the next step.

Set sample sizes by risk level

Once you’ve set your risk buckets, the next step is simple: decide how many tickets to review in each one. That number should reflect total volume, workflow complexity, and business impact – not one flat percentage applied across the board.

Use volume bands instead of one flat percentage

A flat percentage assumes every bucket carries the same level of risk. It doesn’t. A better approach is to use volume bands: review a larger share in smaller migrations, set minimum sample counts for each bucket, and increase coverage only where high-risk workflows call for it.

For pilot validation, sample 5% to 10% of tickets, matched to real data complexity. For rule testing, use a separate golden sample of 20 to 100 tickets that includes the oldest and newest records, long threads, attachments, and tickets from former agents [3].

Turn each risk level into a clear ticket count, then tie that bucket to a specific review focus.

Risk LevelTicket CategoryRecommended Sample SizeWhat to Check
LowHistorical (Closed)5%–10% random sampleRecord count integrity, basic field mapping, searchability, and attachment access [5][3]
MediumOpen / Pending15%–25% stratified sampleComment order, attachment accessibility, agent assignments, and status transitions [1][3]
HighIn-flight / High-Value50%–100% of specific edge cases; 100% for top-tier accountsEnd-to-end customer journeys, dynamic SLA logic, complex branching workflows, and integration syncs [5][1]

As volume grows, keep the sample representative. Add more coverage only when workflow risk, data risk, or account risk makes it worth the extra review.

Increase coverage for complex workflows, SLA risk, and large accounts

Some buckets need more attention from the start. If they involve branching workflows, custom fields, dynamic SLA logic, AI routing, or revenue-critical escalations, move them into a higher sampling tier [5][1].

High-value and VIP accounts should get the most coverage. If the risky tickets are limited in number, review 100% of them [5][1].

Use AI to select representative and outlier tickets faster

Manual selection has a blind spot: people tend to pick familiar cases and skip the odd ones. That’s where AI helps. It can group similar tickets, surface outliers, and flag mapping gaps and routing exceptions that a random sample may miss [5][2].

Use these sample counts to lock in the review checklist and pass-fail thresholds before the review begins.

Validate tickets and make the go-live decision

Use the sample matrix to run a short, repeatable QA pass on each bucket before you make the go-live call. The point isn’t just to see whether records moved. You need to confirm the workflow still works end to end. Routing, SLA timers, automations, permissions, and reporting all need to behave the way they should before go-live.

Check routing, field mappings, SLA logic, automations, and reporting

For each sampled ticket, check that custom field values are right, the ticket landed in the correct queue with the correct owner, and the Created/Updated timestamps stayed intact. Make sure conversation threads show up in chronological order, inline images are readable, attachments open and aren’t corrupted, and portal access plus Original Ticket ID visibility work as expected. Also verify that legacy statuses did not remap "Pending" or "Awaiting" to "Closed" or "Open."

Then go past the record itself. Confirm that SLA start, stop, and pause logic fires correctly. Check that escalation rules and automation triggers behave the way they should. Review webhook endpoints and email notifications to make sure dynamic fields like {{ticket.id}} and {{customer.name}} render with the new system values. Test role-based access control by logging in as an agent, lead, and admin. And make sure search indexing allows teams to find historical tickets.

Every validation bucket should be reviewed against the same pass-fail standard, so the go/no-go call stays consistent across all risk tiers.

Set pass-fail criteria and defect thresholds before review starts

Decide what counts as a defect, and how serious each defect type is, before anyone opens the first ticket. That keeps the go-live decision objective and easy to defend.

Use the same defect classes across every bucket so the go/no-go call does not shift from one team or risk tier to another.

"A staggering 83% of data migration projects either fail or exceed their budget. The reason is often a neglected final step: a thorough post-migration Quality Assurance (QA) process." – Gartner [4]

Set tighter thresholds for routing, SLA, security, and reporting issues than for cosmetic problems.

Defect SeverityDefect CategoryAcceptable ThresholdRequired Response
CriticalData loss, security breach, routing failure, broken audit trail0%Immediate no-go: halt migration and trigger rollback or delay until fixed.
HighSLA logic broken, field mapping failure in core reports<2%Fix before full cutover; may proceed with limited scope if isolated.
MediumSearch lag, UI confusion, non-essential field mismatch<5%Proceed with documented workaround; resolve during the 4–6 week hypercare window.
LowCosmetic metadata, archived ticket formatting<10%Log as a post-migration backlog task.

Critical and high defects map straight to the high-risk buckets defined earlier: enterprise escalations, in-flight tickets, and SLA-sensitive accounts. If one of those buckets fails, it should trigger the strictest response.

After review, assign one owner to each defect class and store evidence in a single QA tracker. If one bucket fails, teams can still sign off on the buckets that passed and delay only the failing scope. Approve billing while holding returns.

FAQs

How do I choose sample sizes for a small migration?

For a small migration, begin with a mixed sample that reflects what’s in your system, not a random batch. A pilot of about 20 tickets and 20 knowledge base articles is often enough to spot early mapping issues or formatting problems before they spread.

Make sure that sample includes tricky records and edge cases, such as:

  • large attachments
  • multi-level comments
  • unusual custom fields
  • inactive users
  • legacy data formats

Before you run the test, disable notifications in the target environment. That helps you avoid accidental alerts while you check the results.

What should be in a golden ticket sample?

A golden ticket sample should be a curated, representative set of tickets that mirrors the messiness and depth of your actual data, not just a random batch.

Include:

  • Edge cases: large attachments, inline images, long or multi-level threads, and unusual custom fields
  • Operational variety: different time periods, statuses, priorities, channels, languages, account types, and long-running incidents
  • QA data: 100+ manually reviewed tickets, plus legacy or inactive-user records
  • Final checks: field accuracy, metadata integrity, and uncorrupted attachments

When should a failed bucket stop go-live?

A failed bucket should stop go-live when it exposes critical defects that put agent productivity, customer experience, or reporting accuracy at risk. Since migrations are operational projects, sign-off should happen by contact reason, not only at the global level.

If a workflow like billing or returns fails routing or SLA application, do not move forward with that category. Fix at least 80% of the issues found in pilot samples before scaling. Stop the rollout if critical thresholds are missed or data integrity is at risk.

Related Blog Posts

Get B2B support tips and trends, delivered.

Join 5,000+ B2B SaaS support leaders who get one short, useful email each week: playbooks, benchmarks, and case studies.

Free Coaching

Weekly e-Blasts

Chat & phone

Subscribe to our Blog

Get the latest posts in your email