
CX Assurance Enables Safe Agentic Scale: An Exclusive Interview

Have you ever watched a chatbot confidently give the wrong answer—
politely, fluently, and at scale?

Now imagine that answer triggering a payment, rejecting a claim, or misrouting a vulnerable customer. No outage. No alert. Just a silent failure, repeated thousands of times before anyone notices.

This is the hidden risk of today’s AI-powered customer experience.

As enterprises race toward agentic AI—systems that don’t just respond, but decide—the margin for error shrinks fast. CX leaders are under pressure to automate faster, personalize deeper, and reduce cost simultaneously. Yet behind the dashboards, AI models drift. Integrations break quietly. Compliance lines blur.

That’s where CX assurance stops being a hygiene factor and becomes a strategic necessity.

Billions of Customer Interactions 

In this exclusive CXQuest.com interview, we speak with Amitha Pulijala, Chief Product Officer at Cyara, the global leader in AI-powered CX assurance. Trusted by over 450 global brands including Salesforce, ADP, and Amazon, Cyara validates and continuously monitors billions of customer interactions every year, across voice, digital, messaging, and conversational AI.

Amitha brings more than 15 years of enterprise product leadership, including shaping AI and digital CX strategies at Ericsson, Vonage, and Oracle. Today, she sits at the intersection of AI innovation, product governance, and real-world CX risk.

Her core belief challenges the prevailing narrative:

It’s healthy to be afraid of AI.

Not fear as resistance—but fear as discipline. The kind that enables faster, safer, and more durable innovation.

This conversation goes deep into what CX leaders must unlearn, relearn, and operationalize as AI systems move from tools to autonomous actors.


Small Fixes Often Unlock Disproportionate CX Impact 

Q1. What CX win surprised you most when AI was properly tested before launch? Why do “small” fixes often unlock disproportionate CX impact?

AP: The biggest moment for us is seeing how surprised customers are when they finally gain visibility into how their AI behaves in real conversations. Many teams assume their systems are performing well because nothing is “breaking,” but with LLMs and agentic AI, most failures do not appear as outages. They show up as subtle reasoning issues, drifted answers, or inconsistent handoffs that only emerge when you simulate real, multi‑turn customer behavior. Until customers see that, they simply do not know what they do not know.

The fixes that create the most impact often look small on the surface. A grounding issue in a high‑volume question or a misalignment in an escalation step may seem minor, but because these models operate at scale, those flaws repeat across thousands of interactions. Catching them early prevents loops, escalations, and the significantly higher cost of addressing these problems once they reach production.

Hidden Dependencies Amplify Risk 

Q2. Why do AI failures in CX rarely happen as single-point breakdowns? How do hidden dependencies amplify risk across journeys?

AP: AI failures in CX rarely come from one point because modern journeys are unpredictable. With LLMs and agentic systems, customers can take countless paths, rephrase questions, or switch channels, and issues often appear only after several turns when the AI must keep track of context, data, policies, and the next action.

The challenge is that AI relies on many connected systems. Each component may work on its own, but small misalignments become visible only when the full journey is tested. A knowledge article, a CRM field, or a handoff rule that is slightly out of sync can shift the outcome for the entire interaction.

Most customers discover these issues only when they validate end-to-end behavior. Early steps may succeed, but the experience may degrade in follow-up interactions or during the handoff. Testing real conversations shows that failures are usually the result of subtle dependencies that compound across the journey.

Healthy Fear of AI 

Q3. What does a “healthy fear of AI” actually look like in product leadership? How do leaders distinguish caution from paralysis?

AP: A healthy fear of AI means not taking its outputs at face value. Leaders expect the system to misunderstand intent, drift over time, and behave differently under real customer pressure than it did in testing. That mindset pushes teams to ask practical questions early: what outcomes are we committing to, where can this break, and how do we step in when it does?

Caution can become paralysis when the team treats AI risk as the reason to avoid launching the product. The discussions around how “AI is risky” turns into months of debating with no concrete tests, guardrails, or use cases. Healthy caution is starting small and specific. Leaders should roll out AI in clearly defined use cases with limits on what it can do and when it should stop or hand off to a human. Leaders should give it a specific job and watch how it behaves in real customer interactions and only expand its role once they’ve seen it stay consistent and reliable.
Caution becomes paralysis when a team treats AI risk as the reason never to launch. Discussions about how “AI is risky” turn into months of debate with no concrete tests, guardrails, or use cases. Healthy caution means starting small and specific: roll out AI in clearly defined use cases with limits on what it can do and when it should stop or hand off to a human. Give it a specific job, watch how it behaves in real customer interactions, and only expand its role once it has stayed consistent and reliable.
“Launch and Hope” Is Dangerous

Q4. Why is “launch and hope” especially dangerous in agentic CX systems? What changes when AI starts making decisions, not suggestions?

AP: “Launch and hope” is especially dangerous because many failures won’t show up as outages. With traditional AI, failures are often obvious: the system stops, errors out, or asks for help. Agentic AI behaves differently. It continues making decisions even when it misunderstands context, which makes problems harder to detect. Dashboards can remain normal while customers get stuck in loops, receive wrong answers, or lose conversation context across channels. That is the silent trust break, and you usually don’t notice it until the churn, complaints, or headlines arrive.

When AI starts making decisions, its mistakes turn into real outcomes. At that point, the system is no longer advising but acting on the customer’s behalf, which means a wrong call can block help, send someone down the wrong path, or quietly fail without flagging the error. Reliability and clear stop-and-escalation rules then matter more than clever responses, because trust breaks fast when decisions go wrong.

Hallucinations and Model Drift 

Q5. How do hallucinations and model drift manifest in real customer journeys? Why are they harder to detect than traditional bugs?

AP: Hallucinations and behavior shifts show up inside normal conversation, not as clear system failures. A common example is when a customer asks, “Do I need to pay a fee if I change my flight today?” The AI responds confidently with a dollar amount that sounds correct, but the policy has changed. Nothing has crashed, the API returned data, and the bot continued the conversation. The problem is simply that the answer is wrong, and it takes a real, multi‑turn test to reveal it.

Drift tends to appear more gradually, often because the AI is interacting with thousands of customers over time. As the model processes diverse phrasing and patterns, its behavior can shift in ways teams did not anticipate. For example, the AI might become more lenient in how it verifies identity because it has repeatedly seen customers push for faster access. Each individual interaction seems fine, but over time the responses start to veer away from policy. Traditional testing will not catch this because nothing looks broken and the earlier tests may still pass.

These issues are harder to detect because they only appear when you test the full journey with real conversational variability. The AI handles the first question well, but the failure shows up in a follow‑up, an added detail, or a change in tone. Scripted tests do not expose that. You need realistic paths, interruptions, and policy‑heavy scenarios to see where the AI starts to produce outcomes that are technically valid but operationally or legally incorrect.
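To make that concrete, here is a minimal sketch of the kind of multi-turn, policy-aware check described above. The ask_bot stub, the flight-change scenario, and the fee values are illustrative assumptions, not Cyara’s product or API.

```python
# Minimal sketch of a multi-turn, policy-aware journey test.
# `ask_bot`, the scenario wording, and the fee values are hypothetical placeholders.
import re

CURRENT_POLICY = {"same_day_change_fee_usd": 0.0}  # policy recently changed: fee removed

def ask_bot(conversation: list[str]) -> str:
    """Stand-in for a real call to the conversational AI under test."""
    # A drifted or stale model might still quote the old $75 fee here.
    return "Yes, changing your flight today costs $75."

def quoted_fee(answer: str) -> float | None:
    match = re.search(r"\$(\d+(?:\.\d{2})?)", answer)
    return float(match.group(1)) if match else None

def test_same_day_change_fee() -> bool:
    # Multi-turn setup: the wrong answer only surfaces after context is established.
    conversation = [
        "I booked flight AB123 for tomorrow.",
        "Do I need to pay a fee if I change my flight today?",
    ]
    answer = ask_bot(conversation)
    expected = CURRENT_POLICY["same_day_change_fee_usd"]
    ok = quoted_fee(answer) == expected
    print(f"{'PASS' if ok else 'FAIL'}: bot said {answer!r}, current policy fee is ${expected}")
    return ok

if __name__ == "__main__":
    test_same_day_change_fee()
```

Nothing in this scenario errors out: the API responds, the bot stays fluent, and only the outcome check against current policy reveals the confident wrong answer.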

Where Do Governance Gaps Hide?

Q6. What CX risks emerge when AI spans contact centers, CRM, payments, and identity? Where do governance gaps typically hide?

AP: When AI spans multiple systems, the risks shift from simple errors to compound failures. The AI is interpreting customer intent, pulling data from CRM, applying policies, and sometimes triggering actions in payments or identity flows. Each system may function correctly on its own, but small inconsistencies become significant when combined. A single outdated policy in a CRM record or a slightly misaligned verification rule can cause the AI to give incorrect guidance or attempt an action the customer is not authorized for.

The governance gaps usually appear in the transitions between these systems. Identity rules may be enforced in one channel but not another. Payment actions may rely on customer data that is not consistently updated. The AI may be allowed to access information in a way that technically works but does not align with privacy or compliance requirements. These issues remain hidden because most teams validate each component separately instead of validating the full journey with real customer behavior.

The organizations that manage this well create clear boundaries around what the AI is allowed to do, what data it can use, and when it must escalate. They also continuously validate the full journey to ensure that as systems change, the AI still produces the right outcomes. Without that visibility, the biggest risks live in the handoffs, not the systems themselves.
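One way to make those boundaries explicit is a declarative guardrail layer that the orchestration code consults before the agent acts. The sketch below is a hypothetical example; the action names, data scopes, and thresholds are assumptions rather than any vendor’s schema.

```python
# Illustrative sketch of journey-level guardrails enforced outside the model.
# Action names, data scopes, and thresholds are hypothetical examples.
GUARDRAILS = {
    "allowed_actions": {"lookup_order", "update_address", "create_ticket"},
    "forbidden_actions": {"issue_refund", "change_identity_record"},
    "allowed_data_scopes": {"crm.contact", "orders.recent"},
    "escalate_when": {
        "confidence_below": 0.6,
        "turns_exceed": 8,
        "topics": {"fraud", "bereavement", "legal_threat"},
    },
}

def check_action(action: str, data_scope: str, confidence: float,
                 turn_count: int, topic: str) -> str:
    """Return 'allow', 'block', or 'escalate' before the agent acts."""
    rules = GUARDRAILS["escalate_when"]
    if (confidence < rules["confidence_below"]
            or turn_count > rules["turns_exceed"]
            or topic in rules["topics"]):
        return "escalate"
    if action in GUARDRAILS["forbidden_actions"]:
        return "block"
    if action not in GUARDRAILS["allowed_actions"]:
        return "block"
    if data_scope not in GUARDRAILS["allowed_data_scopes"]:
        return "block"
    return "allow"

# Example: a low-confidence refund attempt is escalated to a human, not executed.
print(check_action("issue_refund", "orders.recent", 0.55, 5, "billing"))  # escalate
```

Keeping these rules in configuration rather than in prompts also makes them auditable as the surrounding systems change.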

Rethink Testing When AI Learns and Evolves

Q7. How should enterprises rethink testing when AI learns and evolves continuously? Why is pre-launch QA no longer enough?

AP: Pre-launch quality assurance (QA) assumes the system you tested is the system customers will meet. But prompts change, knowledge bases update, integrations get tweaked, and customer language evolves.

Testing should be treated as an ongoing practice rather than a one-off. You should still test before release, but you should also re-test the customer journeys that matter most, using real use cases and multi-turn conversations that reflect how customers actually behave: they interrupt, get emotional, switch channels, and escalate. Success comes from consistent CX assurance and testing.
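In practice, that ongoing discipline often looks like a journey regression suite run on a schedule. The sketch below is a minimal illustration; the journey names, the pass-rate threshold, and the stubbed run_journey verdicts are assumptions.

```python
# Sketch of a scheduled journey-regression run; everything here is illustrative.
import random
from datetime import datetime

JOURNEY_SUITE = [
    "change_flight_same_day",
    "report_lost_card_then_switch_to_chat",
    "escalate_after_two_failed_answers",
    "update_address_mid_conversation",
]

PASS_RATE_THRESHOLD = 0.95  # assumed bar; tune per journey criticality

def run_journey(name: str) -> bool:
    """Placeholder: replay the multi-turn scenario and score the outcome."""
    return random.random() > 0.1  # stand-in for a real pass/fail verdict

def nightly_regression() -> None:
    results = {name: run_journey(name) for name in JOURNEY_SUITE}
    pass_rate = sum(results.values()) / len(results)
    print(f"[{datetime.now():%Y-%m-%d %H:%M}] journey pass rate: {pass_rate:.0%}")
    failed = [name for name, ok in results.items() if not ok]
    if pass_rate < PASS_RATE_THRESHOLD:
        # In practice: open an incident, block the next prompt or KB release, etc.
        raise SystemExit(f"Journey regression below threshold; failing: {failed}")

if __name__ == "__main__":
    nightly_regression()
```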

Continuous CX Assurance Differs from Traditional Monitoring 

Q8. What does continuous CX assurance mean in an always-on AI environment? How does it differ from traditional monitoring?

AP: Traditional monitoring is good at telling you whether something is up and running, fast, and available. But it is far less reliable at telling you whether customers are getting the right outcome. In AI-driven CX, “working” can still mean “misleading,” “looping,” or “blocking a human handoff.”

Continuous CX assurance means consistently validating end-to-end journeys and uncovering misunderstandings, drift, and hallucinations, not just product lag or uptime. It’s a proactive approach to catch problems in the system before customers do.
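The distinction is easiest to see in what each check actually asserts. Below is a minimal, hypothetical contrast; the stand-in bot and the expected phrase are invented for illustration.

```python
# Contrast sketch: uptime monitoring vs outcome assurance (placeholders throughout).
import urllib.request

def traditional_monitor(url: str) -> bool:
    """Answers 'is it up and fast?' and says nothing about correctness."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status == 200

def cx_assurance_check(ask_bot, question: str, must_contain: str) -> bool:
    """Answers 'did the customer get the right outcome?'"""
    answer = ask_bot(question)
    return must_contain.lower() in answer.lower()

def bot(question: str) -> str:
    # Stand-in assistant that quietly fails while every dashboard stays green.
    return "Sorry, I can't help with refunds."

print(cx_assurance_check(bot, "How do I request a refund?", "refund form"))  # False
```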

Cost Optimization Silently Degrades Experience 

Q9. How do you reconcile CX quality with cost pressures in AI-orchestrated models? When does cost optimization silently degrade experience?

AP: AI often enters an organization with a strong cost‑saving mandate, but the real challenge is making sure those efficiencies do not undermine customer experience. Cost pressure usually shows up in model choices, context limits, grounding decisions, and how frequently the AI can access supporting systems. Each of those levers affects performance. If you constrain them too aggressively, the AI may technically respond, but it will not have enough information to guide the customer to the right outcome.

Silent degradation becomes visible when conversations take more turns, customers repeat themselves, or the AI hands off too late or too early. None of these show up if you only track cost‑per‑interaction or containment. They show up in repeat contacts, unresolved journeys, inconsistent tone, and longer resolution times for agents who receive incomplete context.

The organizations that balance this well focus on outcome quality. They look at whether the AI is accurate, whether the customer reaches the right resolution, and whether the experience feels consistent across channels. They also validate journeys regularly because system changes, policy updates, and new integrations can shift behavior in ways that impact both experience and operational cost.

Cost efficiency and quality are not at odds when you measure the right things. The key is ensuring that optimization efforts do not limit the AI’s ability to understand the customer, maintain context, and deliver a result the business can stand behind.

Role of Product Leaders in AI Governance and Compliance

Q10. What role should product leaders play in AI governance and compliance? Why can’t governance live only with legal or risk teams?

AP: Legal and risk teams can tell you what the regulations and rules are, and what is needed to mitigate risk. Product leaders decide how the system behaves day to day. Governance lives in product choices like what the model is allowed to do, what data it can access, how it explains itself, when it escalates, what it refuses, and how you audit decisions later.

If governance is owned only by legal, it tends to show up too late and too bluntly: blocked launches, or vague requirements that engineers interpret differently across teams. Product leaders are the bridge, translating risk into design controls, building guardrails into the workflow, and keeping governance connected to real customer journeys and metrics.

Balancing Speed with Accountability in AI CX

Q11. How do regulated industries balance speed with accountability in AI CX? What lessons apply even outside regulated sectors?

AP: Regulated industries balance speed and accountability by taking a staged approach to AI. Most begin with internal use cases where the risk is lower and the outcomes are easier to supervise. They also rely on hybrid models that combine scripted flows with AI reasoning so they can control where automation is allowed to make decisions. This gives teams room to innovate while keeping oversight on accuracy, policy alignment, and customer impact.

A large part of their success comes from understanding their industry’s specific risks and designing guardrails around them. But the more advanced organizations also recognize that regulations alone do not cover every exposure. Once AI spans identity, payments, CRM, and service channels, the risk profile expands beyond what any single compliance framework anticipates. That is why continuous testing is essential. They validate not just the model, but the full customer journey as systems change and the AI learns from real interactions.

These same practices matter outside regulated sectors. Clear boundaries around what the AI can do, well‑defined handoff rules, and regular journey‑level validation help any organization move faster without sacrificing trust. The companies that scale AI successfully treat accountability as an ongoing discipline, not a one‑time check, and they apply that discipline even in use cases that do not carry formal regulatory requirements.

Beyond Containment and Deflection

Q12. What metrics actually prove agentic AI delivers sustainable ROI? Beyond containment and deflection, what should leaders measure?

AP: Containment and deflection are leading indicators, not proof of durable ROI. Sustainable ROI from agentic AI is achieved when the cost per successfully resolved journey decreases, rework and repeat contacts are eliminated, and multi-step tasks are completed autonomously without adding governance or QA burden. This requires tracking outcome-based metrics such as resolution rate, cost per successful outcome, end-to-end time to resolution, and measurable revenue or retention impact. A shift-left approach is critical: testing and continuously validating real, non-deterministic journeys before production exposes failure patterns early, reduces downstream operational risk, and stabilizes performance, which is what ultimately lowers the total cost of running AI at scale.
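As a rough illustration of those outcome-based metrics, the snippet below computes resolution rate, cost per successful outcome, average time to resolution, and repeat-contact rate from a handful of journey records. The field names and figures are invented for the example.

```python
# Illustrative computation of outcome-based ROI metrics; all numbers are made up.
journeys = [
    {"resolved": True,  "cost_usd": 0.40, "minutes_to_resolution": 3,  "repeat_contact": False},
    {"resolved": True,  "cost_usd": 0.55, "minutes_to_resolution": 6,  "repeat_contact": False},
    {"resolved": False, "cost_usd": 4.10, "minutes_to_resolution": 22, "repeat_contact": True},
    {"resolved": True,  "cost_usd": 0.35, "minutes_to_resolution": 4,  "repeat_contact": True},
]

resolved = [j for j in journeys if j["resolved"]]
resolution_rate = len(resolved) / len(journeys)
cost_per_successful_outcome = sum(j["cost_usd"] for j in journeys) / len(resolved)
avg_time_to_resolution = sum(j["minutes_to_resolution"] for j in resolved) / len(resolved)
repeat_contact_rate = sum(j["repeat_contact"] for j in journeys) / len(journeys)

print(f"Resolution rate:              {resolution_rate:.0%}")
print(f"Cost per successful outcome:  ${cost_per_successful_outcome:.2f}")
print(f"Avg time to resolution (min): {avg_time_to_resolution:.1f}")
print(f"Repeat contact rate:          {repeat_contact_rate:.0%}")
```

Note that total spend is divided by successful outcomes only, so a high containment number with poor resolution still shows up as an expensive journey.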

Q13. How does CX assurance become a competitive advantage, not just insurance? Why is trust now a growth lever?

AP: Customers don’t reward AI; they reward speed, clarity, and resolution. When AI is reliable, customers stay in self-service longer and escalations drop for the right reasons. Trust is the growth lever because bad AI creates instant doubt. Research commissioned by Cyara and fielded by Dynata found that customers are quick to bail or escalate after a single failure, and that they get more frustrated with bot failures than with human ones. In an era where every bot failure is viewed as brand betrayal, the true winners won’t be the companies with the fastest models; they’ll be the ones who assure their CX journeys for successful outcomes.

What to Unlearn about AI?

Q14. What must CX and product leaders unlearn about AI in the next 12 months? What mindset shift will define the winners?

AP: Deterministic scoring for non-deterministic flows is essential because agentic systems produce variable outputs across runs, yet organizations still need a consistent way to measure correctness, risk, and quality. By evaluating outcomes against fixed success criteria, such as task completion, policy adherence, correct sequencing of actions, and clean handoffs, teams can benchmark performance, detect regressions, and govern behavior even when the underlying responses are probabilistic. This shifts evaluation from “did it sound right?” to “did it reliably achieve the right result?”, enabling repeatable testing, auditable controls, and scalable trust in real-world CX journeys.
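Here is a minimal sketch of what deterministic scoring over a non-deterministic run could look like: the agent’s wording and path vary, but the verdict is computed from fixed criteria. The run structure, criteria, and required action order are illustrative assumptions, not a specific scoring framework.

```python
# Sketch: deterministic pass/fail criteria applied to a variable agent run.
# The run structure and criteria below are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class AgentRun:
    actions: list[str]                 # tool calls the agent made, in order
    task_completed: bool               # did the customer's goal get met?
    policies_violated: list[str] = field(default_factory=list)
    handed_off_cleanly: bool = True    # context passed intact if escalated

REQUIRED_ORDER = ["verify_identity", "lookup_account", "apply_change"]  # assumed policy

def score(run: AgentRun) -> dict[str, bool]:
    """Same criteria on every run, regardless of how the model phrased things."""
    positions = [run.actions.index(a) for a in REQUIRED_ORDER if a in run.actions]
    return {
        "task_completion": run.task_completed,
        "policy_adherence": not run.policies_violated,
        "correct_sequencing": len(positions) == len(REQUIRED_ORDER)
                              and positions == sorted(positions),
        "clean_handoff": run.handed_off_cleanly,
    }

run = AgentRun(
    actions=["lookup_account", "verify_identity", "apply_change"],  # out of order
    task_completed=True,
)
verdict = score(run)
print(verdict)                       # correct_sequencing fails deterministically
print("PASS" if all(verdict.values()) else "FAIL")
```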



AI Fails Quietly, and at Scale

This conversation makes one truth unmistakable:

AI doesn’t fail loudly. It fails quietly—and at scale.

Amitha Pulijala reframes fear not as hesitation, but as strategic maturity. In an era where AI agents act across complex, regulated ecosystems, confidence without assurance is risk disguised as speed.

Key takeaways for CX leaders:

Agentic AI multiplies impact—good and bad—faster than humans can intervene

Testing must evolve from one-off events into a continuous discipline

Governance is no longer optional; it is a CX design principle

Assurance enables faster innovation, not slower delivery

As CXQuest continues to explore AI in CX, agentic systems, and trust-driven experience design, this interview reinforces a critical shift:
The future belongs to organizations that scale AI responsibly, visibly, and measurably.

Explore more insights in our AI in CX and Intelligent Operations hubs on CXQuest.com.

CX leaders: audit where your AI makes decisions today—before customers do it for you.
