
Lessie AI
Search, Reach and Connect - Find the perfect fit, 10x faster
Benchmark Results
Evaluated Apr 9, 2026 · v1.3.0 · Sales
Composite: 84.1/100 (Very Good)
Universal Score: 83.3
Domain Score: 84.7
Formula
Universal = (28/30) × 100 − 10 = 83.3
Composite = (83.3 × 0.40) + (84.7 × 0.60)
= 84.1/100
−10 repeatability adjustment applied to all scores. Repeatability testing was not conducted for this benchmark run. Raw scores: Universal 93.3, Domain 94.7, Composite 94.1. Scores without repeatability testing carry higher uncertainty.
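The arithmetic behind these figures can be reproduced with a minimal sketch. The weights (0.40 and 0.60), the flat 10-point adjustment, and the raw scores come from the report; the function names are illustrative, and rounding each intermediate score to one decimal before combining is an assumption made so the results match the published values.

```python
# Weights and adjustment as stated in the report; names are illustrative.
UNIVERSAL_WEIGHT = 0.40
DOMAIN_WEIGHT = 0.60
REPEATABILITY_PENALTY = 10.0


def universal_raw(points: int, max_points: int) -> float:
    """Scale raw capability points to 0-100, rounded to one decimal."""
    return round(points / max_points * 100, 1)


def adjusted(raw: float, penalty: float = REPEATABILITY_PENALTY) -> float:
    """Apply the flat repeatability adjustment to a raw score."""
    return round(raw - penalty, 1)


def composite(universal: float, domain: float) -> float:
    """Weighted composite of the universal and domain scores."""
    return round(universal * UNIVERSAL_WEIGHT + domain * DOMAIN_WEIGHT, 1)


raw_u = universal_raw(28, 30)  # 93.3
raw_d = 94.7                   # raw domain score as reported
adj_u = adjusted(raw_u)        # 83.3
adj_d = adjusted(raw_d)        # 84.7

print(adj_u, adj_d, composite(adj_u, adj_d))  # 83.3 84.7 84.1
print(composite(raw_u, raw_d))                # 94.1
```

Combining the unrounded universal score (83.33...) would give 84.2 rather than 84.1, which suggests the published composite was computed from the one-decimal values.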
Summary
Lessie AI is an exceptionally capable people search and outreach agent that excels at autonomous multi-step workflows. It performed strongly across all tested scenarios, combining real-time data enrichment from multiple sources (Apollo, LinkedIn, PDL Database, Web Search) with intelligent matching, scoring, and content generation. Its hallucination resistance is exemplary: it correctly identified a fabricated role (VP of Quantum Computing at Spotify) as non-existent and refused to generate false data, instead offering constructive alternatives. The agent produced production-ready outputs including personalized cold outreach emails, ranked lead scorecards with detailed criteria assessments, and multi-email follow-up sequences. Its primary weaknesses are processing speed (searches take 2-5 minutes) and occasional difficulty completing complex multi-part follow-up requests within a single conversation turn.
Universal Performance
Six capabilities · Raw: 28/30
Completed multi-step search tasks fully. In the first task, results were organized by company size in the list view, but the chat-based written summary was still generating. All subsequent tasks were completed to full satisfaction with detailed outputs.
Excellent understanding of complex natural language queries. Correctly interpreted specific criteria (B2B lead generation experience, company size organization, relevance scoring with multiple criteria). Handled vague and precise instructions equally well.
Successfully chained 3+ dependent steps: parse query, search multiple databases, enrich profiles, match/score, organize results, generate output (email/scorecard). Company size enrichment required multiple database queries when initial data was missing.
Hallucination test: Asked about VP of Quantum Computing at Spotify. Agent performed web searches, correctly identified the role does not exist, and offered 3 constructive alternatives. Error test: Given contradictory request with 4 impossible criteria. Agent identified all contradictions clearly and suggested viable alternatives.
Zero interventions needed across all tests. Agent autonomously searched multiple databases, enriched profiles with social links and company data, organized results, drafted personalized emails, created ranked scorecards, and built email sequences.
Production-ready outputs across all scenarios. Emails included subject lines, personalized openings, specific value propositions, and clear CTAs. Lead scorecards included relevance scores with detailed criteria assessments. Email sequences included timing, pain point differentiation, and summary tables.
Domain Scenarios
Sales · 5 scenarios scored 0–100
Found 24+ partnerships professionals at Notion, correctly identified Head of Partnerships role. Drafted complete cold outreach email with subject line, personalized opening, specific integration value proposition, social proof, and clear CTA.
Successfully pivoted from Notion partnerships to London fintech content marketing managers with podcast experience. Created new list tab and found 53 candidates. However, the podcast relevance scoring sub-task was not fully delivered in the text response.
Found and ranked 5 VPs of Sales at cybersecurity companies. Produced detailed scorecards with relevance scores (95/100, 88/100, etc.) assessed across all 3 requested criteria with color-coded indicators and specific data points.
Asked to find VP of Quantum Computing at Spotify (non-existent). Agent performed web searches to verify, correctly stated the role does not exist, and offered 3 constructive alternatives. Zero hallucination.
Found HubSpot CMO Kipp Bodnar and created complete 3-email sequence with distinct pain points, appropriate tone progression, clear CTAs, timing schedule, and summary table. Production-ready email copy.
Strengths
Lessie AI excels in three core areas. First, its multi-source data enrichment autonomously queries Apollo Database, LinkedIn, PDL Database, and web search to find and verify contacts with categorized match quality. Second, its output quality is production-ready: personalized emails with subject lines and value propositions, ranked lead scorecards with relevance scores out of 100, and multi-email sequences with timing and summary tables. Third, its hallucination resistance is exemplary — when given a fabricated role, it performed verification searches and correctly declined rather than fabricating data.
Weaknesses
The agent's primary weakness is processing speed: searches take 2-5 minutes as it queries multiple databases sequentially. The mid-conversation topic change test revealed that complex multi-part follow-up requests may not be fully completed in a single turn. Company size enrichment required multiple database attempts. The Free tier's 100-credit limit constrains session scope.
Testing Limitations
Testing conducted on Free tier with 100 credits. Email sending not tested (no email account connected). Emails and Process sidebar features not tested. Long-term reliability, scale testing, API integration, email deliverability, CRM integrations, and team collaboration features remain untested. Recording was captured but upload failed due to extension connection error.
Evaluation Transparency
Platform: Lessie AI, Free tier (100 credits at start, 68 remaining at end)
Environment: Free tier account with 100 credits. Default configuration with no custom setup. Features tested: Find (people search with natural language), multi-source data enrichment (Apollo Database, LinkedIn, PDL Database, Web Search), list management with match scoring (Fully Matched / Partially Matched / Not Matched), email drafting, lead prioritization and scoring. Email account not connected. Platform version observed at app.lessie.ai as of 2026-04-09.
- Testing conducted via browser-based interaction in a single session on 2026-04-09
- Repeatability testing: not conducted
- Long-term reliability, scale testing, and latency precision not covered
- Scores reflect a snapshot as of 2026-04-09
- Platform observed: Lessie AI, Free tier with 100 credits
- Email sending and CRM integration features not tested
- Recording captured but upload failed due to extension connection error
