
CTO Playbook: Preparing Your Data and Systems Before an AI Rollout

PhD in Computational Linguistics. I build the operating systems for responsible AI. Founder of First AI Movers, helping companies move from "experimentation" to "governance and scale." Writing about the intersection of code, policy (EU AI Act), and automation.


TL;DR: Most AI rollouts fail because the data was not ready. This playbook gives a four-phase CTO checklist: data readiness, system integration, governance baseline, and pilot design.

The most expensive phase of an AI rollout is not the model selection or the vendor negotiation. It is the discovery that your data was not ready, your systems were not integrated, or your compliance posture was not established before you started.

This playbook is for CTOs and technical leads at European SMEs who are four to twelve weeks away from a planned AI rollout. It is structured as a pre-rollout sequence — work that should be completed before you design the first pilot, not during it.


Why Pre-Rollout Preparation Matters More Than Model Selection

There is a pattern in AI rollouts that fail or stall: the organisation spent disproportionate energy selecting and evaluating models while underestimating the data, integration, and governance work that determines whether those models can operate in production.

Model quality is a relatively solved problem for most SME use cases. The major providers offer capable models with well-documented APIs. What is not solved is whether your organisation's data is clean enough, accessible enough, and governed enough to make those models useful in your specific environment.

The pre-rollout work described in this playbook is not glamorous. It is also not optional.


Phase 1: Data Readiness (Weeks 1–3)

1.1 Map Your Data State

Start by producing a simple map of where your operational data lives and in what state:

  • System inventory: Which systems hold the data relevant to your candidate AI use cases? ERP, CRM, warehouse management system, customer support platform, document repositories, spreadsheets?
  • Format inventory: Is the data structured (databases, APIs, CSV exports) or unstructured (PDFs, emails, free-text fields, scanned documents)?
  • Quality assessment: For each data source, what is the approximate quality level? Are records complete? Are they consistently formatted? Are there duplicate records, inconsistent naming conventions, or gaps in historical data?
  • Access rights: Who currently has access to each data source? What would be required to grant a new system or external partner access? Are there data classification or contractual constraints on how the data can be used?

This map does not need to be comprehensive. It needs to be honest. You are looking for the gaps that will block a pilot from operating as designed.
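The four inventories above can be captured in a lightweight, shareable structure rather than a slide deck. A minimal sketch in Python follows; every system name, field, and quality note is an illustrative assumption, not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class DataSource:
    """One entry in the pre-rollout data map."""
    system: str        # e.g. "CRM", "support platform"
    formats: list      # "database", "api", "csv", "pdf", "email"
    quality_notes: str # honest free-text assessment
    access_owner: str  # who grants access today
    constraints: list = field(default_factory=list)  # classification, contracts

# Illustrative inventory -- names and notes are hypothetical
data_map = [
    DataSource("CRM", ["api", "csv"], "duplicates in contact records", "IT ops"),
    DataSource("Support platform", ["api"], "free-text fields, inconsistent tags",
               "Support lead"),
    DataSource("Document repository", ["pdf"], "scanned invoices, no OCR",
               "Finance", ["GDPR: customer PII"]),
]

# Surface sources likely to block a pilot: unstructured-only formats
# or contractual/classification constraints
blocking = [s for s in data_map if "pdf" in s.formats or s.constraints]
for s in blocking:
    print(f"Potential pilot blocker: {s.system} ({s.quality_notes})")
```

The point of the structure is not the code itself but the forcing function: each field must be filled in honestly, and the "blockers" list becomes the agenda for the gap triage in section 1.3.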

1.2 Identify the Minimum Viable Data Set

For each candidate AI use case, define the minimum data set that the model needs to operate:

  • What input data does the model need?
  • In what format does it need to receive that data?
  • What is the minimum history required for the model to be useful (if applicable)?
  • What data enrichment or preprocessing will be required before the data is ready to use?

Many organisations discover at this stage that the data they assumed was available is either not accessible without significant IT work, not clean enough to be useful, or not in the right format for the model they are evaluating.

Finding this out before the pilot starts is a low-cost correction. Finding it out during the pilot is expensive and demoralising.

1.3 Address the Critical Data Gaps

After mapping the state and identifying the minimum viable set, triage the gaps:

  • Gaps that can be closed quickly (a data export format change, a new API connection): address these in the pre-rollout phase
  • Gaps that require significant IT work (a data migration, a system integration, a historical data cleaning project): factor these into the rollout timeline and do not design a pilot that depends on work that has not started
  • Gaps that reveal a deeper structural problem (data that does not exist, processes that are not digitised, systems that cannot share data): flag these to leadership and defer the AI use case until the structural problem is addressed

Phase 2: System and Integration Readiness (Weeks 2–4)

2.1 Define the Integration Points

AI systems do not operate in isolation. They need to receive data from existing systems, return outputs to those systems, and sometimes trigger actions in downstream workflows.

For each candidate use case, map the integration points:

  • What system provides the input data to the AI component?
  • What system receives the AI output?
  • Is the output a recommendation (human acts on it) or an action (system acts on it automatically)?
  • What APIs are available at each integration point?
  • What are the latency, volume, and reliability requirements?

This integration map determines the engineering complexity of the rollout. A use case that requires two clean API connections is a different scope from one that requires building a new integration pipeline between three legacy systems.
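The integration map can be recorded in the same lightweight style. The sketch below assumes a hypothetical support-ticket triage use case; the systems, latency figures, and volumes are placeholders to be replaced with your own:

```python
from dataclasses import dataclass

@dataclass
class IntegrationPoint:
    source_system: str   # provides input data
    target_system: str   # receives AI output
    mode: str            # "recommendation" or "action"
    api_available: bool  # does a usable API exist today?
    max_latency_ms: int  # acceptable round-trip latency
    daily_volume: int    # expected requests per day

# Hypothetical use case: triaging inbound support tickets
points = [
    IntegrationPoint("Support platform", "AI triage service",
                     "action", True, 2000, 400),
    IntegrationPoint("AI triage service", "Support platform",
                     "recommendation", True, 2000, 400),
]

# Rough scoping signal: every missing API means new integration work
missing_apis = sum(1 for p in points if not p.api_available)
print(f"Integration points: {len(points)}, missing APIs: {missing_apis}")
```

A use case whose map shows zero missing APIs is a candidate for a fast pilot; one with several missing APIs belongs in the rollout timeline, not the pilot.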

2.2 Evaluate Your Infrastructure Against the Use Case

The questions most often skipped in the pre-rollout phase:

  • Can your current infrastructure handle the additional processing load?
  • If you are using a cloud-hosted model, have you reviewed the data residency implications under GDPR and your own data classification policy?
  • What is your observability plan for the AI component? How will you monitor outputs, detect quality degradation, and alert on failure?
  • What is your rollback plan if the AI component fails or produces consistently unreliable outputs?

These are not edge-case questions. They are table-stakes questions for any AI deployment in a production environment.
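The observability and rollback questions can be made concrete with even a simple rolling-window quality check. The sketch below is a minimal illustration, not a production monitoring design; the window size and threshold are arbitrary assumptions that each team must set for its own use case:

```python
from collections import deque

class OutputMonitor:
    """Rolling-window quality check for an AI component.

    If the share of flagged outputs in the last `window` results
    exceeds `threshold`, signal that the rollback plan should run.
    """
    def __init__(self, window=100, threshold=0.15):
        self.results = deque(maxlen=window)
        self.threshold = threshold

    def record(self, flagged: bool):
        """Record one output; flagged=True means a reviewer rejected it."""
        self.results.append(flagged)

    def should_roll_back(self) -> bool:
        if len(self.results) < self.results.maxlen:
            return False  # not enough data in the window yet
        return sum(self.results) / len(self.results) > self.threshold

monitor = OutputMonitor(window=10, threshold=0.3)
for flagged in [False] * 6 + [True] * 4:  # 40% flagged in the window
    monitor.record(flagged)
print(monitor.should_roll_back())
```

The useful part is the pre-commitment: the threshold and the rollback action are decided before the pilot runs, so the decision to pull the component is mechanical rather than political.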

2.3 Define the Human-in-the-Loop Boundaries

For any AI use case that affects a customer, an employee, or a regulated outcome, define in advance:

  • Which outputs are acted on automatically?
  • Which outputs require a human review before action?
  • Which outputs are advisory only — available to a human operator but not triggering any automatic process?

This is not just a governance question. It is a system design question. The review workflow needs to be built before the pilot runs, not improvised after the first questionable output appears.
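The three boundaries above translate directly into a routing table in the system design. A minimal sketch, with hypothetical use case names, might look like:

```python
# The three modes mirror the boundaries defined above
AUTOMATIC, HUMAN_REVIEW, ADVISORY = "automatic", "human_review", "advisory"

# Per-use-case boundary table, agreed before the pilot runs
boundaries = {
    "internal_doc_summary": AUTOMATIC,
    "customer_email_draft": HUMAN_REVIEW,
    "churn_risk_score": ADVISORY,
}

def route(use_case: str, output: str) -> str:
    """Dispatch an AI output according to its pre-defined boundary."""
    mode = boundaries.get(use_case, HUMAN_REVIEW)  # default to the safe path
    if mode == AUTOMATIC:
        return f"applied: {output}"
    if mode == HUMAN_REVIEW:
        return f"queued for review: {output}"
    return f"shown to operator only: {output}"

print(route("customer_email_draft", "Dear customer, ..."))
print(route("unknown_use_case", "..."))  # unknown cases fall back to review
```

Note the design choice in the fallback: any use case not explicitly classified defaults to human review, so a forgotten entry fails safe rather than fails automatic.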


Phase 3: Governance and Compliance Baseline (Weeks 2–5)

3.1 Classify Your Use Cases Under the EU AI Act

The EU AI Act entered into force in August 2024 and its obligations are phasing in: the prohibitions have applied since early 2025, and most high-risk requirements follow in 2026. For European SMEs, the practical implication is that certain AI use cases are already subject to specific compliance obligations.

The four risk tiers under the Act are:

  • Unacceptable risk: banned outright (social scoring, certain biometric applications)
  • High risk: subject to conformity assessment, documentation, and oversight requirements — includes applications in employment, HR, education, critical infrastructure, credit scoring, and some customer-facing systems
  • Limited risk: transparency obligations apply (users must know they are interacting with an AI system)
  • Minimal or no risk: most general productivity and operational AI uses fall here

For each of your candidate use cases, assign a preliminary risk tier. If any fall into high risk, you need legal and compliance review before the pilot is designed. Do not defer this.
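A preliminary triage can be as simple as a lookup against the high-risk domains named above. The sketch below is deliberately crude and non-authoritative: it flags candidates for legal review, it does not replace that review, and the keyword mapping is a simplification of the Act's actual annexes:

```python
# Domains the article lists as high risk under the EU AI Act.
# This mapping is a triage aid, not a legal classification.
HIGH_RISK_DOMAINS = {"employment", "hr", "education",
                     "critical_infrastructure", "credit_scoring"}

def preliminary_tier(domain: str, user_facing: bool) -> str:
    """Assign a first-pass risk tier to a candidate use case."""
    if domain in HIGH_RISK_DOMAINS:
        return "high"      # escalate to legal/compliance before pilot design
    if user_facing:
        return "limited"   # transparency obligations apply
    return "minimal"

# Hypothetical candidate use cases
use_cases = {
    "cv_screening": ("hr", False),
    "support_chatbot": ("customer_service", True),
    "invoice_summaries": ("finance_ops", False),
}
for name, (domain, user_facing) in use_cases.items():
    print(name, "->", preliminary_tier(domain, user_facing))
```

Anything this triage marks "high" goes straight to the legal and compliance review mentioned above; the "unacceptable risk" tier is omitted because banned use cases should never reach the candidate list in the first place.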

3.2 Establish an AI Use Policy

Before the first AI tool is deployed to a team, your organisation needs a written AI use policy that covers at minimum:

  • Which tools are approved for which use cases
  • What data employees are permitted to share with AI tools (and what is off-limits)
  • Who is responsible for reviewing AI outputs in each context
  • How incidents or unexpected outputs are reported and escalated
  • How the policy will be communicated and maintained

A minimal AI use policy does not require a legal team. It requires a CTO or senior operations leader to spend two to three hours drafting a clear, readable document. The alternative — teams using AI tools informally without a policy — creates data handling, compliance, and quality risks that are entirely avoidable.

3.3 Assign Governance Ownership

Someone needs to own AI governance in your organisation. This does not need to be a dedicated role. It does need to be an explicit responsibility assigned to a specific person.

The AI governance owner is responsible for:

  • maintaining the approved use case list
  • reviewing new use case proposals against the EU AI Act risk tiers
  • maintaining the AI use policy
  • managing vendor relationships where compliance implications exist
  • reviewing incidents or quality issues with AI outputs

Phase 4: Pilot Design Readiness (Weeks 4–6)

4.1 Define the Pilot Hypothesis

A pilot without a hypothesis is an experiment without a question. Before any resources are committed to a pilot, define:

  • What you are testing: not "does AI work here" but "will this specific system, applied to this specific process, produce a measurable improvement against this specific metric"
  • How you will measure it: which operational metric changes if the hypothesis is correct?
  • What a successful outcome looks like: a concrete threshold, not a vague improvement
  • What a failure outcome looks like: and what you will do with that information

A well-formed pilot hypothesis makes the decision to scale or stop clean.
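The four elements above can be pinned down in a single record with a mechanical scale/stop rule. A minimal sketch, using a hypothetical ticket-triage pilot and made-up numbers:

```python
from dataclasses import dataclass

@dataclass
class PilotHypothesis:
    """A pilot hypothesis with an explicit, pre-agreed decision threshold."""
    metric: str              # the one operational metric being tested
    baseline: float          # current value of that metric
    success_threshold: float # scale only if the pilot result clears this

    def decision(self, observed: float) -> str:
        return "scale" if observed >= self.success_threshold else "stop"

# Hypothetical example: AI-assisted ticket triage, higher is better
h = PilotHypothesis(metric="correct auto-triage rate (%)",
                    baseline=70.0, success_threshold=85.0)

print(h.decision(88.0))  # clears the threshold
print(h.decision(80.0))  # better than baseline, but below the threshold
```

The second case is the one worth noticing: an outcome that beats the baseline but misses the agreed threshold is still a "stop", which is exactly the ambiguity a pre-committed threshold is meant to remove.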

4.2 Define the Minimum Viable Scope

Define the pilot scope in terms of:

  • The specific process or workflow being tested — not a broader operational area
  • The specific team or individuals involved
  • The duration: most well-scoped AI pilots for SME use cases should produce a decision within four to eight weeks
  • The data set used: historical data for testing accuracy, or live data with defined guardrails?

Constrain the scope enough to produce a clean result. If the pilot is designed to test too many things simultaneously, no single result will be actionable.


The Pre-Rollout Checklist

Before you begin a pilot, confirm:

  • [ ] Data state is mapped and critical gaps are addressed or deferred with an explicit decision
  • [ ] Minimum viable data set is defined and accessible
  • [ ] Integration points are mapped and engineering complexity is scoped
  • [ ] Infrastructure, observability, and rollback plans are defined
  • [ ] Human-in-the-loop boundaries are documented
  • [ ] EU AI Act risk tier is assigned to each planned use case
  • [ ] AI use policy is drafted and communicated to the relevant team
  • [ ] Governance ownership is assigned
  • [ ] Pilot hypothesis is defined with a measurable success criterion
  • [ ] Pilot scope is constrained enough to produce a clean decision

This checklist is the minimum required to run a pilot that produces decision-quality output rather than expensive ambiguity.


What to Do If You Are Not Ready

If you work through this checklist and find that you are not ready — data gaps are significant, integration complexity is higher than expected, or governance work has not started — the right move is a structured AI readiness assessment before you proceed.

A readiness assessment conducted by an external partner with experience in your context can compress this discovery work significantly, surface the gaps you have not yet identified, and produce a prioritised plan for addressing them.


Frequently Asked Questions

What should a CTO do before starting an AI rollout?

Before any AI pilot begins, a CTO should complete four phases: map the data state and close critical gaps, define integration points and infrastructure requirements, establish an EU AI Act compliance baseline and AI use policy, and define a measurable pilot hypothesis. Skipping this pre-rollout sequence is the most common reason AI initiatives fail in production.

How do I prepare data for an AI rollout at a European SME?

Start by producing an honest map of where operational data lives, what format it is in, how clean it is, and who can access it. For each candidate AI use case, define the minimum viable data set the model needs. Address critical gaps before the pilot starts — not during it.

What are the EU AI Act compliance steps before an AI pilot?

Assign a preliminary risk tier to each candidate use case under the EU AI Act's four-tier framework. If any use case falls into the high-risk category — covering employment, HR, credit scoring, or some customer-facing systems — complete legal and compliance review before designing the pilot. Also establish an AI use policy and assign governance ownership.

How long should an AI pilot run to produce a usable result?

Most well-scoped AI pilots for SME use cases should produce a clear decision within four to eight weeks. The key requirement is a well-formed pilot hypothesis with a specific metric, a defined success threshold, and a constrained scope — not a broad operational area.

What is a minimum viable data set for an AI use case?

A minimum viable data set is the smallest collection of input data a model needs to operate usefully for a specific use case. It is defined by the input data format the model requires, the minimum historical volume needed, and any preprocessing or enrichment steps required before the data is usable.
