In this post I’ll talk about:
- Why citizen-facing AI services need a different approach than productivity tools
- How to tackle both new services and improving existing ones
- What multidisciplinary teams need to succeed
Teams across the public sector are using AI in two fundamentally different ways. One is about productivity: Copilot helping civil servants draft documents faster, analyse data, summarise meetings. These tools matter. They save time. But they’re internal tools with relatively contained risks.
The other way is more challenging: embedding AI into public services that citizens use directly. Triage systems that help councils prioritise urgent cases. Benefits eligibility tools that guide residents through complex rules. Chatbots that point people to the right information. Document processing that handles thousands of forms daily. These aren’t productivity tools. They’re services that materially affect people’s lives.
The AI Exemplars Programme focuses on this second category. GOV.UK Chat doesn’t just help civil servants work faster; it changes how citizens interact with government. The Education Content Store doesn’t streamline internal processes; it enables better tools for teachers. HMRC’s fraud detection directly impacts tax compliance and revenue.
To truly measure the value and tangible benefits of using AI, we need to tackle both new greenfield opportunities and existing brownfield challenges that impact citizens. The stakes are higher, the user needs more diverse, the assurance requirements more complex, and the need for ongoing iteration more critical.
The first challenge is creating the right conditions to help multidisciplinary teams deliver these potentially transformative services effectively. This isn’t a new journey: we built similar conditions for agile delivery and user-centred design. Now we need to adapt them for teams building AI-enabled services that citizens depend on.
What’s different about building AI services for citizens
When Hillingdon Council built their AI-driven contact centre, they weren’t just implementing productivity software. They were fundamentally changing how residents interact with their council, creating 24/7 access where office hours previously limited availability.
This creates specific challenges that internal productivity tools don’t face. Such a service touches vulnerable residents, handles sensitive information, and must work across the full spectrum of digital capability. User needs span from digitally confident residents to those requiring assisted digital support. Services must be accessible to people using screen readers, those with cognitive differences, and speakers of multiple languages. Errors don’t just waste staff time; they deny people access to services they’re entitled to or delay urgent support.
The technical complexity differs too. Internal tools operate on relatively clean corporate infrastructure, where new features can be rolled out across the Microsoft suite. Citizen-facing services handle messy, real-world information: handwritten forms, incomplete applications, data from legacy systems. Integration challenges multiply because these services must connect to existing case management systems, identity verification, payment processing, and notification services.
Most critically, these services require continuous improvement, not one-off implementation. User needs evolve. Policy changes. Technology improves. A benefits triage system built today needs to adapt as benefits rules change, as citizens’ circumstances change, and as user research identifies problems. This isn’t a project with an end date. It’s service delivery requiring sustained multidisciplinary capability.
How to approach new or existing services
Brownfield problems: improving existing services with AI
Most opportunities for AI in government aren’t greenfield. They’re improving existing services that citizens already use, often built on legacy systems, with established user bases, operating under existing policy frameworks. These brownfield challenges require specific approaches.
The West Midlands Police Force integrated AI to respond to non-urgent 101 calls and shared their approach with other forces. This was a brownfield problem, improving an existing service under pressure. The team had to understand current call handling, why it wasn’t meeting needs, where AI could assist without replacing human judgment, and how to transition without disrupting the service citizens depend on.
Start by deeply understanding the existing service and what can be improved. A housing application process might be slow because of manual data entry, but the real problem could be unclear eligibility rules, missing information from applicants, or dependencies on external data sources. AI won’t fix poorly designed processes; it might just automate confusion faster. Multidisciplinary teams need time for discovery, understanding current user journeys and pain points, before exploring whether AI could genuinely help or whether simpler improvements would do.
Integration with existing systems dominates brownfield AI work. New AI features must work alongside bespoke case management systems, connect to legacy databases, and respect existing business logic even while improving processes. This requires technical architecture that treats AI as a component in a larger system, not a replacement for everything. It requires careful sequencing: you can’t rebuild everything simultaneously while maintaining a live service.
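As a rough sketch of what treating AI as a component can look like in code, the Python below wraps the model behind a narrow interface, with the existing business logic as the fallback. The names (LegacyQueue, classify_with_ai) and the threshold are illustrative assumptions, not a prescribed design.

```python
from dataclasses import dataclass

# Hypothetical names throughout; adapt to your own case management stack
CONFIDENCE_THRESHOLD = 0.85  # set from monitoring data, not guesswork


@dataclass
class TriageResult:
    category: str      # e.g. "urgent" or "routine"
    confidence: float  # model's self-reported confidence, 0.0 to 1.0
    source: str        # "ai" or "manual_queue"


class LegacyQueue:
    """Stand-in for the existing case management route."""

    def enqueue(self, case_text: str) -> TriageResult:
        # The established business logic stays authoritative here
        return TriageResult("unclassified", 0.0, "manual_queue")


def classify_with_ai(case_text: str) -> TriageResult:
    """Placeholder for a call to a model endpoint."""
    raise NotImplementedError("depends on your model and hosting choices")


def triage_case(case_text: str, legacy_queue: LegacyQueue) -> TriageResult:
    """AI is one component; the manual route is always available."""
    try:
        result = classify_with_ai(case_text)
    except Exception:
        # A model outage must not break a live citizen-facing service
        return legacy_queue.enqueue(case_text)
    if result.confidence < CONFIDENCE_THRESHOLD:
        # Not confident enough: fall back to the established process
        return legacy_queue.enqueue(case_text)
    return result
```

The sequencing benefit is that the legacy route keeps working untouched, so the AI component can be introduced gradually, monitored, and switched off if needed without rebuilding the service around it.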
Brownfield problems also require managing change carefully. Staff have established workflows and mental models. Citizens have expectations about how services work. Rushing AI into production risks breaking working processes or creating resistance that undermines otherwise good improvements. A successful brownfield AI implementation involves people throughout: piloting in contained areas, rolling out gradually with careful monitoring (see the sketch below), and adapting based on what happens in reality.
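One common way to run that kind of contained, gradual rollout, assuming each user or case has a stable identifier (the function name here is hypothetical), is deterministic bucketing:

```python
import hashlib


def in_pilot(user_id: str, rollout_percentage: int) -> bool:
    """Deterministically assign a user to the pilot group.

    Hashing the ID (rather than sampling randomly per request) means a
    user stays in or out of the pilot across visits, so their journey
    is consistent and their feedback is attributable to one experience.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket from 0 to 99
    return bucket < rollout_percentage


# Start small, watch the monitoring, then widen:
# in_pilot(case_ref, 5)  -> 5% of users see the AI-assisted journey
# in_pilot(case_ref, 25) -> widen once the pilot holds up in reality
```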
Greenfield opportunities: building new AI-enabled services
Greenfield opportunities, building new services from scratch, require a different approach from brownfield improvements. Without legacy constraints, greenfield projects can design services around AI from the start rather than retrofitting it. This means thinking about transparency, explainability, monitoring, and graceful degradation as core service features, not additions. It means designing user journeys that accommodate AI’s probabilistic nature, providing alternatives when confidence is low, and making hand-offs to human support seamless.
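To make graceful degradation concrete, here is a minimal sketch of journey routing in Python. The route names and thresholds are invented for illustration; real values would come from user research and monitoring.

```python
from enum import Enum


class Route(Enum):
    AI_ANSWER = "ai_answer"          # show the AI response, with sources
    CLARIFY = "clarify"              # ask the user a follow-up question
    HUMAN_HANDOFF = "human_handoff"  # pass the conversation to a person


# Illustrative thresholds: real values come from testing with real users
HIGH_CONFIDENCE = 0.9
LOW_CONFIDENCE = 0.6


def choose_route(confidence: float, user_asked_for_human: bool) -> Route:
    """Design the journey around the model's uncertainty, not despite it."""
    if user_asked_for_human:
        # An explicit route to a person is always available
        return Route.HUMAN_HANDOFF
    if confidence >= HIGH_CONFIDENCE:
        return Route.AI_ANSWER
    if confidence >= LOW_CONFIDENCE:
        # Middling confidence: ask a clarifying question rather than guess
        return Route.CLARIFY
    return Route.HUMAN_HANDOFF
```

Two design choices matter here: the explicit route to a person exists regardless of model confidence, and a hand-off should carry the conversation context with it so the citizen doesn’t have to start again.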
Greenfield work still starts with user needs, not AI capabilities. What citizen need isn’t being met? What journey doesn’t exist today that should? Why hasn’t it been solved already? Only after clearly understanding the problem should teams explore whether AI enables better solutions than alternatives.
The challenge with greenfield AI services is managing uncertainty. You’re building something new, using relatively immature technology, for users who may not know how they’ll respond. This requires phased approaches: starting with constrained scope, testing with real users early, learning what works before scaling, and maintaining flexibility to adapt as you learn.
GOV.UK Chat exemplifies this. Rather than attempting to handle all government interactions immediately, it started with specific content areas where the questions were well-understood and answers existed in structured content. This allowed the team to build capability, understand performance, identify issues, and refine the service before expanding scope. Greenfield doesn’t mean unbounded; it means deliberately choosing appropriate scope for learning.
Creating the right conditions for multidisciplinary teams
The multidisciplinary team structure that works
Building and improving AI-enabled services requires teams with all the skills needed to understand problems, design solutions, build technology, assure quality, and operate services.
Product managers are central to the team, understanding citizen needs and prioritising work that delivers most value. They balance user research findings, policy constraints, technical feasibility, and operational reality. For AI services, product managers need to understand what AI can and can’t do, not to make technical decisions but to make informed trade-off decisions and explain AI’s role to stakeholders.
User researchers become more crucial, not less, with AI. Researchers need to understand not just what tasks users want to accomplish but how they react to AI, what explanations make sense, where trust breaks down, and which failures cause most harm. Testing AI services requires new research methods: how do you test transparency when the AI behaviour evolves? How do you ensure AI-generated outputs are rigorously tested for accessibility? How do you research with users who are sceptical of automated decisions?
Service designers must think beyond individual screens to entire journeys that might move between AI-assisted and human-supported interactions. That means designing for graceful degradation when AI confidence is low, clear hand-offs to human support, and transparency that users can actually understand. For brownfield problems, they will have to navigate the challenge of improving existing services without breaking workflows staff and citizens depend on.
Technical delivery requires software engineers who understand both AI capabilities and service reliability. Building AI services isn’t just about training LLMs; it’s about integrating them into production systems that must stay up, handle edge cases, and perform consistently. Engineers need to work closely with data specialists on model training, monitoring, and continuous improvement. Site reliability engineering becomes critical, as these services need incident response and monitoring based on agreed SLAs and SLOs.
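As a small illustration of monitoring against agreed SLOs, a check like the following could run over a rolling window of recent requests. The targets are invented for the example; real ones would be agreed with the service owner.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    ok: bool           # the request completed without error
    latency_ms: float  # end-to-end response time


# Illustrative targets only; real SLOs are agreed with the service owner
AVAILABILITY_SLO = 0.995   # 99.5% of requests succeed
LATENCY_P95_SLO_MS = 2000  # 95th percentile under two seconds


def check_slos(window: list[RequestRecord]) -> list[str]:
    """Return a list of breached SLOs for a window of recent requests."""
    breaches = []
    if not window:
        return breaches
    availability = sum(r.ok for r in window) / len(window)
    if availability < AVAILABILITY_SLO:
        breaches.append(f"availability {availability:.3%} below target")
    latencies = sorted(r.latency_ms for r in window)
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    if p95 > LATENCY_P95_SLO_MS:
        breaches.append(f"p95 latency {p95:.0f}ms above target")
    return breaches
```

A breach should feed the agreed incident response process, not just a dashboard; that is what turns monitoring into service reliability.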
Delivery managers ensure the team can work effectively within organisational constraints while maintaining momentum. For AI services, delivery managers navigate additional complexity around assurance, spend controls, departmental and government-wide guardrails, and transparency requirements, while protecting the team’s capacity to focus on delivery.
This team structure needs to persist beyond initial launch. Continuous improvement of AI services requires a long-lived multidisciplinary team. Models need retraining as patterns change. User needs evolve, requiring design updates. Integration points change as surrounding systems evolve. Policy shifts require service adaptations. The team that built the service needs to stay with it, or at least hand over to a similarly structured team.
Specialist coaching embedded in delivery
Teams building AI services need access to subject matter experts who can advise on specific challenges as they arise. Not generic AI training, but specialist expertise: How to structure user research for this particular service? How to explain particular AI decisions to citizens? How to monitor for bias in this context? How to handle performance or service degradation?
This coaching works best when embedded in delivery rhythms rather than scheduled separately. Specialists joining sprint planning to help teams think through upcoming AI work. Reviewing user research findings to identify AI-specific considerations. Advising on architecture decisions with AI implications. Helping teams navigate assurance processes by understanding what evidence satisfies which requirements. Collaborating on prioritisation discussions.
The specialist expertise needed spans multiple areas. AI ethics specialists who can help teams think through fairness and bias. Machine learning engineers who can advise on model selection and monitoring. User researchers experienced with AI services who can share effective research methods. Service designers who’ve designed for AI transparency and explainability. Site reliability engineers who understand AI service operations.
Not every team needs full-time specialists in every area, but they need reliable access when it matters. This creates an opportunity to share specialist capacity across teams: a network of floating specialists who support multiple teams. The goal is making expertise accessible without every team needing to hire rare specialists.
Worked examples and shared components specifically designed for citizen-facing services
Teams benefit from ‘looking sideways’ and seeing examples of how others tackled similar challenges. Not just case studies of finished services, but artefacts from the delivery process that teams can learn from and adapt.
Citizen-facing AI services will share common needs, and meeting those needs with reusable components dramatically accelerates teams building new services. When these components are built once and reused widely, teams spend less time on foundational infrastructure and more time on the specific user needs their service addresses.
Teams will also find insight from early prototypes valuable. How did the GOV.UK Chat team structure their user research? How did they handle cases where the AI wasn’t confident? What was their approach to accessibility testing? These lessons will help teams navigate their own challenges more confidently.
Show and tells become crucial for AI service teams. Seeing what others built helps teams understand what’s possible. More importantly, seeing the messy bits (what didn’t work, what surprised the team, what they’d do differently) accelerates learning far more than polished case studies. Teams will make mistakes, encounter unexpected challenges, and learn hard lessons.
Publishing these examples openly creates a growing library. As more teams deliver AI services, the library expands, covering more use cases and contexts. Teams contribute what they learned, benefiting from what others shared. This creates a virtuous cycle where implementation gets progressively easier as collective knowledge grows.
What this means in practice
The opportunity is significant: creating conditions where teams can confidently build and continuously improve AI services that make a real difference to citizens’ lives.
We know how to create these conditions because we’ve done it before for other transformations. The Service Manual, DDaT capabilities and agile delivery succeeded not just through publishing guidance but through communities, coaching, and shared learning over time.
AI adoption in government needs the same ecosystem, adapted for AI’s unique characteristics. The opportunity is building this deliberately and collectively rather than hoping it emerges organically.
Some elements already exist. Communities are forming. Early exemplars provide valuable insight. Some departments have specialist capacity. The work now is strengthening what exists and building what’s missing, ensuring multidisciplinary teams have what they need to deliver AI services that genuinely work for citizens.
