
What happens when nobody double-checks?
Working with Microsoft's Customer Experience Design (XSD) team as part of a 6-month HCDE capstone, I researched how users decide when and how to verify AI-generated meeting outputs — and designed for calibrated trust, not maximum trust.
Workplace video conferencing is one of the first contexts where companies are deploying AI agents at scale. Microsoft Copilot automatically generates meeting summaries, action items, and transcripts — but when people don't verify these outputs, consequences compound: inaccurate records become organizational memory, tasks get assigned that were never agreed upon, and leaders make strategic decisions from incomplete summaries.
Project Status
This case study is updated as the project progresses. Research phases (lit review, expert interviews, user interviews, co-design workshop) are complete. Concept validation is currently underway. Final presentation to Microsoft in June 2026.
AI generates → User skims/skips → User shares.
Our goal: design the verification step back into the workflow.
12
expert interviews on trust and AI in the workplace
7
user interviews on real verification behavior
30+
academic papers synthesized on trust, automation, and human-AI interaction
2
design prototypes addressing verification gaps
/ Prototype — Ambient Meeting Awareness Panel
Mid-fidelity prototype, work in progress. Currently in a usability study with 4 participants.
During meeting
Ambient Awareness Panel concept: a non-disruptive, live-tracking AI during meetings.
End of meeting
End-of-meeting co-review: host-triggered alignment session before the meeting closes.
AI meeting tools are built for trust. Nobody designed for verification.
Microsoft Copilot and similar tools present all meeting outputs — summaries, action items, decisions — with equal confidence, whether they're accurate or fabricated. Our research identified two compounding gaps in how users interact with these outputs.
Verification Gap 1
Navigation friction
Moving from a claim in the summary to its source in the transcript is slow and disorienting. UTC timestamps, fragmented tool ecosystems, and a lack of inline links mean most verification stops at the memory check, the least reliable layer.
Design opportunity
Make the path from claim to source fast enough that verification becomes a natural part of reading, not a separate task.
Verification Gap 2
Fabrication detection
AI confidently asserts content that was never said. Because all outputs look identical regardless of confidence, users have no signal to know what needs checking. Subtle fabrications pass undetected.
Design opportunity
Surface AI confidence levels so users know what to verify—without making every output feel uncertain.
Microsoft Teams Copilot generates AI summaries after users' meetings. These artifacts often go unverified or are only lightly skimmed.
The verification cascade
Users verify through four escalating layers: memory check (effortless) → transcript search (moderate) → recording rewatch (high effort) → human escalation (highest + socially costly). Most verification stops at Layer 1 or 2. Subtle fabrications — which require Layer 3 or 4 to catch — pass undetected.
A five-phase research and design process
I designed our methodology to move from understanding how verification works today, to generating ideas with users, to prototyping and validating solutions — before final delivery to Microsoft in June 2026.
✅ Phase 01
Understand
Literature review + Expert & User Interviews
I synthesized 30+ papers on trust, automation, and AI. Conducted 12 expert interviews and 7 user interviews to understand how verification works (and fails) in real workplace contexts.
✅ Phase 02
Generate
Co-Design Workshop
4-participant co-design session (2 hrs) using a 9-panel storyboard scenario. Generated 20+ ideas across before, during, and after meeting phases. Synthesized into two confirmed design directions.
✅ Phase 03
Prototype
Lo-fi interactive prototypes, human-created and AI-enhanced
Sketched an Ambient Meeting Awareness Panel (HMW 1) from co-design ideas and prototyped in Figma Make. Confidence Transparency prototype (HMW 2) in progress.
🔄 Phase 04
Validate
Concept Validation
8–10 participants, 45–60 min sessions. Testing whether our design directions solve the right problems in the right way. Currently underway — findings will inform prototype iterations.
🔜 Phase 05
Test
Usability Testing + Final Delivery
Semi-structured think-aloud sessions with returning participants. Final prototypes, design principles, and recommendations delivered to Microsoft XSD. Capstone showcase: June 2026.
What our research sessions taught me about verification
Finding 01
Institutional trust creates the verification gap.
What we heard
"No worries! It's Microsoft CoPilot! It usually gets these things right."
Users skip verification not because they trust the AI's accuracy, but because they trust Microsoft as an institution.
The implication
Institutional trust is inherited, not earned. It leads to overtrust: verification skipped when it should happen. Building AI that signals uncertainty doesn't fight institutional trust; it calibrates it.
Design response
Surface uncertainty where it exists. Users should be able to trust the parts that are reliable, and quickly identify the parts that need checking.
The design question
If users trust the brand, not the output — how do we design for appropriate skepticism without destroying the confidence that makes AI adoption possible?
Finding 02
Verification happens only when four conditions align.
Prior knowledge
Users verify when they have something to check against — they were in the meeting and remember what was said.
Speed
Verification must be fast — under 30 seconds. Anything slower gets abandoned. Navigation friction is the primary cause of verification failure.
High stakes
External sharing, client meetings, and consequential decisions prompt verification. Internal or low-stakes outputs get skimmed or skipped entirely.
Personal responsibility
Users verify when they feel personally accountable for the output being sent. When accountability is diffuse or shared, verification erodes.
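The four conditions above can be read as a simple all-or-nothing rule. The sketch below is only an illustration of that reading, not something built in the project: the class, field, and function names are hypothetical, and the only number taken from the research is the 30-second speed threshold.

```python
from dataclasses import dataclass

# Threshold taken from the finding above: verification must take under 30 seconds.
MAX_VERIFICATION_SECONDS = 30


@dataclass
class VerificationContext:
    """Hypothetical description of one moment where a user could verify an AI output."""
    has_prior_knowledge: bool     # user attended the meeting and remembers what was said
    estimated_seconds: float      # time needed to get from a claim to its source
    high_stakes: bool             # external sharing, client meeting, consequential decision
    personally_accountable: bool  # user feels responsible for the output being sent


def missing_conditions(ctx: VerificationContext) -> list[str]:
    """Return the names of the conditions that are not met."""
    checks = {
        "prior knowledge": ctx.has_prior_knowledge,
        "speed": ctx.estimated_seconds < MAX_VERIFICATION_SECONDS,
        "high stakes": ctx.high_stakes,
        "personal responsibility": ctx.personally_accountable,
    }
    return [name for name, met in checks.items() if not met]


def likely_to_verify(ctx: VerificationContext) -> bool:
    """Verification is expected only when all four conditions align."""
    return not missing_conditions(ctx)
```

Listing the missing conditions, rather than returning only a yes or no, makes it easier to see which condition a design change would need to address. For example, a context that meets every condition except speed returns `["speed"]`, which points at navigation friction.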
Finding 03
Imposed AI creates affective resistance before the tool is ever used.
What we observed
When AI tools are introduced institutionally rather than adopted voluntarily, workers develop affective responses (disgust, guilt, ambivalence) that no interface improvement can fix.
"The guy driving the fancy BMW wants us to use AI, but he doesn't even know what for."
"You cannot turn it off right now. It's so annoying."
The implication
Affective trust explains resistance and the hard limits people draw around what AI is allowed to do. When people can't opt out, resentment builds upstream of the interface.
Expert interviews confirmed that accountability is ~90% individual today — but expected to shift as AI tools become institutionally mandated, creating an accountability gap that compounds the resentment.
The response
To align with Microsoft's business goal of AI adoption, I will deliver a worker accountability and protection framework for AI-generated workflows. Workers feel safer using AI when they know their rights and work are protected.
Microsoft can share this proposed framework with its customers (organizations) and investigate whether these policies lead to higher trust and, in turn, higher organizational AI adoption.
The real design question
If adoption is mandated but trust must be earned — how do we design the first interaction so it doesn't feel like surveillance?
Two prototypes, two verification gaps
From co-design synthesis, we confirmed two focus areas — one addressing the during-meeting phase (preventing verification failure upstream) and one addressing the after-meeting phase (detecting fabrication when it occurs). Together they span both verification gaps.
HMW 1 — During meeting
Ambient Meeting Awareness Panel
How might we
Help workers align during meetings so that verification is easier afterward, preventing fabrication from entering the record in the first place.
The concept
A collapsible sidebar in Teams that passively captures decisions, action items, open disagreements, and agenda progress in real time. Never interrupts. User controls visibility. Transitions into post-meeting verification view at meeting end.
Key principles
Passive over active. Edges over middle (alignment at start/end, not mid-conversation). Host-controlled escalation for the end-of-meeting co-review.
Ambient Meeting Awareness Panel concept: a non-disruptive, live-tracking AI during meetings.
End-of-meeting review flow: aligning meeting participants for easier post-meeting verification.
The design challenge
How do we design a tool that captures everything without becoming a distraction itself?
HMW 2 — After meeting
Confidence Transparency
How might we
Communicate AI confidence levels to workers who need to verify outputs — so they know what needs checking without having to read everything at maximum skepticism.
The concept
Inline confidence tiering in the post-meeting summary — distinguishing direct quotes from inferred content, with source-linked timestamp chips and a "no source found" flag for fabrication signals.
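One way to picture the data behind this concept is sketched below. It is a hypothetical model, not Copilot's actual output format or the prototype's implementation: the `Tier` and `SummaryClaim` names, the `is_verbatim` flag, and the meeting-relative `mm:ss` chip label are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    """Hypothetical confidence tiers shown inline in the summary."""
    DIRECT_QUOTE = "direct quote"
    INFERRED = "inferred"
    NO_SOURCE_FOUND = "no source found"


@dataclass
class SummaryClaim:
    """One statement in the AI-generated summary (hypothetical shape)."""
    text: str
    # Seconds from the start of the meeting to the supporting transcript line,
    # or None when no supporting line was found.
    source_seconds: Optional[int] = None
    # Assumed flag: True when the claim repeats the transcript wording.
    is_verbatim: bool = False


def tier_for(claim: SummaryClaim) -> Tier:
    """Assign a tier: unsourced claims are flagged, sourced ones are quote or inference."""
    if claim.source_seconds is None:
        return Tier.NO_SOURCE_FOUND
    return Tier.DIRECT_QUOTE if claim.is_verbatim else Tier.INFERRED


def timestamp_chip(claim: SummaryClaim) -> str:
    """Label for the source-linked chip, relative to the meeting start."""
    if claim.source_seconds is None:
        return Tier.NO_SOURCE_FOUND.value
    minutes, seconds = divmod(claim.source_seconds, 60)
    return f"{minutes:02d}:{seconds:02d}"
```

In this sketch only claims without any source get the "no source found" flag, so the strongest warning stays rare and sourced content is not made to feel uncertain.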
Status
Ideation complete. Prototype in progress, targeting completion by the end of May 2026, in parallel with concept validation sessions for HMW 1.
Confidence Transparency prototype: post-meeting summary with inline confidence tiering and source-linked timestamps.
The design challenge
How do we signal uncertainty without making everything feel uncertain? If the AI signals "low confidence" too often, users will assume the tool is not capable, which would hinder adoption.
Some things I've learned on this project
Research depth changes what you design
The difference between designing for a brief and designing from evidence is visible in every decision: why the panel is passive, why it doesn't alert, why the host triggers the review.
Designing for AI requires designing for trust, not just usability
Standard usability heuristics — efficiency, learnability, error recovery — don't fully account for the trust dynamics in AI systems. A feature can be perfectly usable and still erode appropriate reliance. We're designing for calibration, not just interaction.
Co-design is a great design method, in addition to being a research method
The co-design workshop generated 20+ ideas in 2 hours that we wouldn't have reached through desk research alone. More importantly, the workshop revealed which problems participants cared about — the before/during/after expansion came directly from what participants wanted to design for.
Speculative work requires more rigor
Because we're designing for a future AI capability (confidence tiering in Copilot), every design decision needs stronger justification than product work against existing systems. We can't point to a shipped feature — we have to point to research.

