Why You Don't Need to Train an AI Model to Extract Data from Your Sales Calls

Part 2 of an 8-part series on turning sales call transcripts into structured CRM data using AI Builder and Power Automate.

In Post 1, I framed the core problem: your CRM has a timing problem, not a data problem. The information is on the call. It just doesn't make it into the record. This post covers the first of two concepts you need before we build anything: why the model already knows what it's looking for, and what bridges its output to Power Automate.


When most people hear "AI extracting data from transcripts," the mental model that kicks in goes something like this: you need training data, a labeling pipeline, data scientists, weeks of iteration, and a model that still might not generalize to your specific call patterns.

That mental model is reasonable. It's just outdated.

Let me explain what actually changed, and why it matters for what you're building.

The model already knows

When you use AI Builder's prompt builder inside Power Automate, you're not training anything. You're not labeling data. You're working with a GPT-4-class model that was trained on an enormous volume of human text before you ever opened the tool, text that covers sales conversations, CRM practice, qualification frameworks, and consultant reports. The knowledge required to understand what a "pain point" or a "budget" or a "technical environment" means in a sales context was built into the model during training.

This capability has a name: zero-shot learning.

"Zero-shot" means you're giving it zero examples of your specific data. No sample transcripts. No labeled outputs. No training run on your call recordings. You describe the task in plain language, and the model uses what it already knows to execute it.

For a lot of use cases, this sounds too good to be true. For sales call extraction specifically, it's the right fit. Sales conversations follow predictable patterns. Someone discusses timelines. Someone discusses budget. Someone talks about the systems they're running today and the problems those systems are creating. The concepts are consistent even when the language varies. Zero-shot handles this well, provided it's properly constrained. More on that constraint in Posts 3 through 5.
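To make "zero examples" concrete, here is a minimal sketch in Python. It is only an illustration: in AI Builder you would type the instructions into the prompt builder and pass the transcript as an input, and the function name and wording below are my own assumptions, not anything the tool provides.

```python
def build_zero_shot_prompt(transcript: str) -> str:
    """Describe the task in plain language; include no worked examples."""
    instructions = (
        "You are reading a sales call transcript.\n"
        "Identify the prospect's budget, timeline, and current system.\n"
        "If something was not discussed, say so rather than guessing.\n"
    )
    # Zero-shot: the only call-specific text is the transcript itself.
    # There are no sample transcripts and no labeled outputs in the prompt.
    return f"{instructions}\nTranscript:\n{transcript}"


prompt = build_zero_shot_prompt("Rep: What are you running today? Prospect: SAP.")
print(prompt)
```

Everything the model gets is a task description plus the call. Whatever it knows about budgets and timelines comes from its training, not from anything you supplied.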

One thing this means in practice

Three assumptions are worth putting on the table directly, because they change how you think about what you're building:

What people assume | What's actually true
You need to train it on your data | Training already happened. You direct it.
More examples produce better results | A better specification produces better results.
It gets smarter over time | Each run is independent. The prompt is everything.

That last row is the one that surprises people most. The model doesn't learn from your calls. It doesn't retain anything between runs. Every time the flow fires, it starts fresh, and your prompt either works or it doesn't. There's no warmup period. There's no improvement from volume.

That's not a limitation. It's a design constraint. Understanding it changes how you approach the work: you invest in the prompt, not the pipeline.

The bridge you can't skip

Zero-shot gets you to extraction. There's still a gap between what the model produces and what Power Automate needs.

The AI's native output is language. It reasons in sentences and paragraphs. Power Automate reasons in fields, data types, and structured values. If you ask the model to extract a budget and it returns "the prospect mentioned they were working with roughly a quarter million dollars," Power Automate cannot map that to a currency field. It's a string. There's no labeled value. The flow either fails or passes garbage to the CRM.

This is where JSON becomes the critical piece, and it's the piece that most demos skip over or treat as obvious.

JSON stands for JavaScript Object Notation. You don't need to write code to use it. In this context, it's a format that lets you instruct the model: don't give me a paragraph, give me a structured object with labeled keys and values.

When your AI Builder prompt is correctly constructed, the model returns something like this:

{
  "budget": "$250,000",
  "timeline": "Q1 2025",
  "current_system": "SAP"
}

Once a Parse JSON action has split that into named values, Power Automate reads it like a row in a spreadsheet. The budget key maps to the budget field on your opportunity record in D365. The timeline key maps to close date. The current_system key maps to your technical environment field. No interpretation at run time. No manual data entry. The rep completes the call, the transcript moves through the flow, the record updates.
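If it helps to see the key-to-field idea outside the flow designer, here is a rough Python sketch of the same step. The field names in FIELD_MAP are hypothetical placeholders, not real D365 column names, and the budget cleanup is one assumed way to turn a string like "$250,000" into a number a currency field could accept.

```python
import json
import re

# Example model output, copied from the JSON shown above.
model_output = '{"budget": "$250,000", "timeline": "Q1 2025", "current_system": "SAP"}'

# Hypothetical mapping from JSON keys to CRM field names (illustrative only).
FIELD_MAP = {
    "budget": "budget_amount",
    "timeline": "estimated_close",
    "current_system": "technical_environment",
}


def to_crm_update(raw: str) -> dict:
    """Parse the model's JSON and rename its keys to the CRM field names."""
    data = json.loads(raw)
    update = {FIELD_MAP[key]: value for key, value in data.items() if key in FIELD_MAP}
    # A currency field needs a number, so keep only digits and the decimal point.
    if "budget_amount" in update:
        update["budget_amount"] = float(re.sub(r"[^0-9.]", "", update["budget_amount"]))
    return update


print(to_crm_update(model_output))
```

The point is that every value arrives with a label. The code never has to guess which part of a sentence was the budget.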

That's the bridge. The AI speaks language. The JSON instruction tells it to speak structure. Power Automate reads structure.

Why this matters more than it looks

Understanding the bridge is what prevents the most common failure mode I see in first builds.

Someone writes a prompt, tests it manually: they paste in a transcript, get back a readable paragraph with all the right information, and feel confident. Then they wire it into Power Automate. The flow fails immediately, or worse, it runs and writes nothing to the CRM, because the output isn't in the format the Parse JSON action expects.
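You can catch this before wiring anything up. The sketch below is a small Python check, not part of Power Automate: it asks whether a piece of model output would parse as JSON and carry the keys the flow expects. The key list and the function name are assumptions based on the example earlier in this post.

```python
import json

# Keys the downstream flow is assumed to expect (taken from the example above).
REQUIRED_KEYS = {"budget", "timeline", "current_system"}


def is_flow_ready(model_output: str) -> bool:
    """Return True only if the output is a JSON object with every required key."""
    try:
        data = json.loads(model_output)
    except json.JSONDecodeError:
        # Readable prose lands here: fine for a human, useless to a parser.
        return False
    return isinstance(data, dict) and REQUIRED_KEYS.issubset(data)


paragraph = "The prospect mentioned they were working with roughly a quarter million dollars."
structured = '{"budget": "$250,000", "timeline": "Q1 2025", "current_system": "SAP"}'

print(is_flow_ready(paragraph))   # False
print(is_flow_ready(structured))  # True
```

Both outputs contain the right information. Only one of them is usable by an automated flow.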

The prompt that works in a chat window is not the same prompt that works in an automated flow. That's not a bug. It's a design requirement. One type of prompt produces content for a human to read. Another produces data for a system to process. These are fundamentally different constructions.

The next post is about exactly that distinction: the five types of AI prompts, which one belongs in an automated flow, and why most of what gets demonstrated at AI Builder sessions is the wrong type for production automation.

That's where the architecture starts.

No worries, this is buildable. But it's worth building right.