New quarter, and a new (but oh-so-old) challenge – how are we going to increase website conversions? How do I get more people to click through to the product page, download the asset, or fill out the form? Dozens of ideas fighting for priority on the website experimentation ideas list – but are we even asking the right questions?
Meanwhile, according to Forrester’s 2025 Global Customer Experience Index Rankings, global customer experience quality is declining, and the gap between what brands intend to deliver and what customers actually experience is widening.
So, to inspire you before your next quarterly planning session, I want to share what I learned from talking with Conductrics customers. Below you’ll find three practical examples of how teams responsible for website optimization deliver on their tasks in a way that incorporates direct feedback from customers to ensure the quality of the experience.
The limits of behavior-only testing
A standard A/B test can be a great tool, and I personally love the idea of getting a ‘yes or no’ answer to a problem. You can even scale that up to a multivariate test to see exactly how different headlines, images, and offers interact with one another. But whether you are testing a single variable or a complex combination, you are still only measuring behavior.
Those tests, while highly valuable, often don’t give you a full picture of why a version won. And sometimes that’s alright; maybe you just needed a quick yes-or-no answer. But in many cases, especially when dealing with high-stakes features or stagnating metrics, you’ll need more context than you can get from observing customer actions alone. To ensure those insights drive long-term learning for your organization, you need to combine them with direct customer feedback, ideally gathered in real time.
When we talk about capturing this in-market feedback, most teams immediately think of standard surveys. While those are excellent broad research instruments for measuring general sentiment, they are not your only option. If your goal is to specifically evaluate and iterate on a live experiment or digital product, you should introduce a different mechanism: a Customer-in-the-Loop (CITL) evaluation. By natively uniting these responses with your behavioral experiments, you capture in-the-moment feedback to understand the user’s actual intent.
What is a Customer-in-the-Loop (CITL) evaluation?
A Customer-in-the-Loop evaluation is a framework where customers – rather than internal engineers or paid annotators – generate the direct feedback data used to evaluate and refine live experiments and digital products.
In artificial intelligence, Human-in-the-Loop (HITL) and Reinforcement Learning from Human Feedback (RLHF) are standard methods for evaluating and fine-tuning models before they are released. A CITL evaluation applies this same disciplined approach to your in-market experiments.
While traditional evaluations often happen in controlled, artificial environments, a CITL approach shifts the process directly into your production environment. By capturing explicit telemetry (for example, a thumbs-up/down) or metric feedback (such as Customer Effort Scores, NPS, or custom metrics) during real customer experiences, you align your technical performance measures directly with actual business value.
The strategic advantage is a flywheel effect – instead of relying on third-party labeling teams, you generate thousands of low-cost evaluation data points at scale. This allows your team to turn live user interactions into curated datasets for ongoing, automated fine-tuning.
As Forrester wrote on their website, “It’s no longer enough to measure customer sentiment retrospectively. At a time of stagnating CX performance, prioritizing real-time CX — capturing in-the-moment feedback and using it to continuously improve — is key to brand differentiation.”
So, how do we actually integrate this direct customer feedback into our live experiments?
Practical use cases for context-aware website optimization
Example 1: Validating new content formats
Rolling out a brand-new site feature – such as a dynamic content section or a complex interactive layout – introduces new variables to the user journey. A good practice is to run an A/B test to compare how the new feature performs against your existing control version. But let’s say you did run a test, and people did not engage with your new content section – will you know why? Was it because they did not find the content interesting, or maybe they were not ready to take the step you wanted them to? Or is it that they just didn’t notice it at all?
To help you answer those questions, you can deploy a Customer-in-the-Loop evaluation. Because the goal is to evaluate this specific feature for improvement rather than conduct general research, you attach an open-ended CITL eval directly to the treatment logic. For example, you can trigger it only for people who saw the new website variation with your new feature but did not click on it at all. This captures qualitative data about what your audience actually values (or doesn’t value) in the new format and reveals the “why” behind their behavior.
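The targeting rule described above can be sketched in a few lines. This is a minimal illustration, not a Conductrics API: the `VisitorSession` type, its field names, and the variant labels are all hypothetical, and a real setup would read these values from your experimentation platform.

```python
from dataclasses import dataclass


@dataclass
class VisitorSession:
    # Hypothetical fields; a real platform would supply its own equivalents.
    variant: str                # "control" or "treatment"
    clicked_new_section: bool   # any click inside the new content section


def should_show_eval(session: VisitorSession) -> bool:
    """Offer the open-ended question only to treatment visitors
    who were served the new feature but never clicked on it."""
    return session.variant == "treatment" and not session.clicked_new_section


sessions = [
    VisitorSession(variant="control", clicked_new_section=False),
    VisitorSession(variant="treatment", clicked_new_section=True),
    VisitorSession(variant="treatment", clicked_new_section=False),
]

# Only the third session qualifies for the eval.
eligible = [s for s in sessions if should_show_eval(s)]
```

Keeping the rule this narrow means the question reaches only the people whose behavior you are trying to explain, rather than every visitor in the test.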

Example 2: Measuring the macro impact of a website redesign
Some direct-feedback measures are worth running on a regular cadence to keep a pulse on overall website health. Depending on your objectives, you might use a simple metric such as Net Promoter Score (NPS) to gauge overall brand sentiment.

Alternatively, you might deploy a more specific framework, such as UMUX-Lite. While often used for software, UMUX-Lite is well suited to measuring website interface friction by asking users to rate two specific areas (the standard instrument phrases them as statements rated on a seven-point agreement scale):
- Usefulness: “Does this website meet my needs?”
- Ease of use: “Is this website easy to use?”
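For readers who want to turn those ratings into trackable numbers, here is a rough sketch of the usual scoring arithmetic. It assumes the common conventions: UMUX-Lite items answered on a 1 to 7 scale and rescaled to 0 to 100, and NPS answered on a 0 to 10 scale with 9 to 10 counted as promoters and 0 to 6 as detractors. The function names are illustrative only.

```python
def umux_lite_score(usefulness: int, ease_of_use: int) -> float:
    """Rescale two 1-7 ratings to a single 0-100 score."""
    for rating in (usefulness, ease_of_use):
        if not 1 <= rating <= 7:
            raise ValueError("ratings must be on a 1-7 scale")
    # Each item contributes 0-6 points; 12 is the maximum total.
    return ((usefulness - 1) + (ease_of_use - 1)) / 12 * 100


def nps(ratings: list) -> float:
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return (promoters - detractors) / len(ratings) * 100


print(umux_lite_score(4, 4))   # 50.0
print(nps([10, 9, 8, 6]))      # 25.0
```

Computing the score per respondent and then averaging by experiment variation makes it straightforward to compare the old and new experience on the same scale.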

Monitoring such metrics on a regular schedule is good practice, but during a major website redesign, this feedback mechanism shifts from a general survey into an active CITL evaluation. When you are introducing entirely new features or overhauling core user flows, simply testing the behavioral performance of the new elements won’t be enough. You need to gather direct feedback from users as they navigate the changes to evaluate the success of the rollout.
By running these simple CITL eval metrics alongside your tests, you establish a baseline. This allows you to observe trends over time and verify whether the usability scores actually improve as you iterate. While it is tempting to look only at stable revenue post-launch and declare success, revenue is a lagging indicator that is easily influenced by external market factors, seasonal trends, or marketing pushes. If conversions unexpectedly dip a month later, having this context-backed data helps you tell whether your new interface caused the friction or whether it was just an external shift.
Example 3: Discovering visitor intent (and acting on those insights in real time)
Sometimes you need to explore very specific interaction preferences during an active test. For example, let’s say you are running an A/B test on different promotional pop-ups or website notifications to see which format drives more conversions. You can easily track how many people in each variation immediately close the window, but that behavioral data alone won’t tell you whether they found your offer irrelevant to their interests or simply weren’t ready to act at that moment.
To capture their actual intent, you can use a highly targeted question tied directly to that specific experiment variation and user behavior. But here is where you can really elevate the process.
Typically, such responses are treated as passive reports – an analyst reviews the data at the end of the month, writes up a summary, and passes it along to the website team. But if you natively integrate your feedback and experimentation efforts, these responses can be activated in real time to customize the customer experience.
For example, if a visitor responds that the current offer just isn’t attractive to them, that qualitative data is immediately passed to the optimization engine as a targeting variable. You can use that data to dynamically swap their experience – perhaps immediately triggering a different promotional campaign or placing them in a personalized offer segment instead of showing the same ignored message.
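The idea of treating an answer as a targeting variable can be sketched as a simple lookup. Everything here is hypothetical: the response codes, the offer names, and the `next_experience` function are placeholders for illustration, not part of any real optimization engine.

```python
from typing import Optional

# Hypothetical mapping from a visitor's answer to the experience shown next.
EXPERIENCE_BY_RESPONSE = {
    "offer_not_attractive": "alternative_promo",
    "not_ready_yet": "suppress_popup",
}


def next_experience(current_offer: str, response: Optional[str]) -> str:
    """Pick the visitor's next experience from their feedback.

    Visitors who gave no answer, or an answer with no rule attached,
    keep seeing the current offer.
    """
    if response is None:
        return current_offer
    return EXPERIENCE_BY_RESPONSE.get(response, current_offer)


print(next_experience("spring_sale", "offer_not_attractive"))  # alternative_promo
print(next_experience("spring_sale", None))                    # spring_sale
```

Because the rule is data rather than code, adding a new response-to-experience mapping does not require a new development cycle.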

By connecting your research directly to your execution layer, you drastically reduce your time from insight to action. Instead of waiting for an analyst to write a report and asking developers to build a suppression rule or a new targeting segment in the next sprint, the system immediately adapts to that specific visitor. It respects the customer’s experience in real time, while saving your internal development resources for more complex tasks.
The execution: How to integrate feedback seamlessly
How do you actually set this up in practice? The standard industry approach often involves stitching together disconnected tools – running an A/B testing platform for behavioral data and a completely separate survey widget for direct feedback. The problem with this decoupling is that it forces teams to look at two entirely different dashboards and guess the correlation between a visitor’s answer and the specific experiment variation they were exposed to.
True integration means the feedback loop is context-aware. With a platform like Conductrics, the feedback is tied directly to the live experiment logic, ensuring that the question asked matches the precise treatment variant the visitor is currently viewing.
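To make "context-aware" concrete, here is a minimal sketch of what storing feedback together with its experiment context might look like. It is an assumption-laden illustration: the variant names, question wording, and record layout are invented for this example and do not describe how Conductrics stores data.

```python
from collections import defaultdict

# Hypothetical: each variation gets a question written for that experience.
QUESTION_BY_VARIANT = {
    "popup_banner": "What made you close this offer?",
    "inline_notice": "Was this notice relevant to you?",
}


def record_response(log: list, visitor_id: str, variant: str, answer: str) -> None:
    """Store the answer together with the variation the visitor actually saw."""
    log.append({
        "visitor_id": visitor_id,
        "variant": variant,
        "question": QUESTION_BY_VARIANT[variant],
        "answer": answer,
    })


def answers_by_variant(log: list) -> dict:
    """Group answers by variation, so no cross-dashboard guessing is needed."""
    grouped = defaultdict(list)
    for row in log:
        grouped[row["variant"]].append(row["answer"])
    return dict(grouped)


log = []
record_response(log, "v1", "popup_banner", "Not relevant to me")
record_response(log, "v2", "inline_notice", "Yes, but not right now")
record_response(log, "v3", "popup_banner", "Appeared too early")
summary = answers_by_variant(log)
```

Since every answer carries its variation from the moment it is recorded, the per-variation breakdown is a grouping step rather than a join across two tools.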
But the most critical advantage of keeping direct user inputs and experiments in a single environment isn’t just data cleanliness – it’s the ability to act. True integration turns passive responses into active zero-party data. You aren’t just running a survey within an experiment; your optimization engine uses those answers to dynamically adapt the visitor’s experience in real time – allowing you to both listen and act.
Because this setup runs natively alongside the experiment, the forms simply match your existing web UI. Website teams can launch these qualitative checkpoints without asking backend engineering for major code changes, and they don’t look like a bolted-on widget to visitors.
Managing friction: A/B testing your surveys
When you introduce a CITL eval or a standard survey to your website, there is always a risk of adding friction to the user journey. You need to make sure that asking for feedback isn’t accidentally hurting your main conversion goals.
Because your feedback tools and experimentation engine live in a single unified platform, you can easily reverse the workflow: A/B test the deployment of the tool itself.
Conductrics can randomize and keep track of who is offered the eval or survey. Those who are not offered it form the control group, and those who are offered it become the treatment group. By assigning guardrail metrics to this test – such as tracking your primary conversion rate or revenue – you can catch negative trends early and see whether the survey or the eval has any adverse side effects on visitor behavior. This allows your team to capture the necessary customer feedback data while ensuring you aren’t accidentally damaging your baseline performance.
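One way to read a guardrail metric like conversion rate is a one-sided two-proportion z-test that asks whether the offered group converts worse than the not-offered group. The sketch below is a generic statistical illustration with made-up numbers and a hypothetical `guardrail_check` function; it is not how Conductrics computes its results.

```python
import math


def guardrail_check(conv_control: int, n_control: int,
                    conv_treatment: int, n_treatment: int,
                    alpha: float = 0.05) -> dict:
    """One-sided two-proportion z-test: does offering the survey hurt conversion?"""
    p_control = conv_control / n_control
    p_treatment = conv_treatment / n_treatment
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    if se == 0:
        raise ValueError("conversion rates of 0% or 100% cannot be tested")
    z = (p_treatment - p_control) / se
    # Lower-tail probability of the standard normal: small means likely harm.
    p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return {
        "lift": p_treatment - p_control,
        "z": z,
        "p_value": p_value,
        "harm_detected": p_value < alpha,
    }


# Made-up counts: 5.0% conversion without the survey, 4.0% with it.
result = guardrail_check(500, 10_000, 400, 10_000)
print(result["harm_detected"])  # True
```

Note that checking this repeatedly while the test is running inflates the false-alarm rate, so a fixed sample size or a sequential testing method is the safer basis for a final call.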
Building an infrastructure for long-term customer loyalty
Testing shouldn’t just be a mechanism for pushing code variations live. By unifying your behavioral experiments with direct customer feedback, your team doesn’t have to guess why a test won or lost; you can hear directly from customers to inform your decisions.
Enterprise experimentation often means navigating complex technical architectures and strict data requirements. If you are evaluating how to integrate context-aware research mechanisms directly into your existing technology stack, we invite you to start a technical conversation with our Solutions Architects. We’re always happy to explore how Conductrics can support your infrastructure goals.
Bring the customer into the loop
Partner with us to build an experimentation program that actually captures visitor intent.



