Klipy

Your pipeline has blind spots. Klipy finds them.

76.7/100 · Good · Benchmark

Benchmark Results

Evaluated Apr 11, 2026 · v1.3.0 · Sales

Benchmarked

Composite: 76.7/100 (Good)

Universal Score: 66.7

Domain Score: 83.3
Formula

Universal = (23/30) × 100 − 10 = 66.7

Domain = 93.3 − 10 = 83.3

Composite = (66.7 × 0.40) + (83.3 × 0.60)

= 76.7/100

−10 penalty applied to all scores

−10 repeatability adjustment applied to all scores. Repeatability testing was not conducted for this benchmark run. Raw scores: Universal 76.7, Domain 93.3, Composite 86.7. Scores without repeatability testing carry higher uncertainty.
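
The arithmetic above can be checked with a short script. This is a minimal sketch, assuming the −10 adjustment is subtracted from each raw score and that intermediate values are rounded to one decimal place as they are displayed. The function and constant names are illustrative and are not part of the benchmark itself.

```python
# Illustrative names; the weights and penalty come from the Formula section.
REPEATABILITY_PENALTY = 10.0  # applied because repeatability testing was not run
UNIVERSAL_WEIGHT = 0.40
DOMAIN_WEIGHT = 0.60


def adjusted(raw_score: float, penalty: float = REPEATABILITY_PENALTY) -> float:
    """Subtract the repeatability penalty and round to one decimal place."""
    return round(raw_score - penalty, 1)


def composite_score(universal: float, domain: float) -> float:
    """Weighted blend of the Universal and Domain scores."""
    return round(universal * UNIVERSAL_WEIGHT + domain * DOMAIN_WEIGHT, 1)


raw_universal = 23 / 30 * 100  # 23 of 30 points across six capabilities
raw_domain = 93.3              # raw Domain score reported above

universal = adjusted(raw_universal)
domain = adjusted(raw_domain)
composite = composite_score(universal, domain)

print(f"Universal {universal}, Domain {domain}, Composite {composite}")
```

Rounding the intermediate scores before blending is what reproduces the published 76.7. Blending the unrounded Universal value with the rounded raw Domain figure would land slightly lower.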

Summary

Klipy demonstrates strong AI-assisted sales capabilities in content generation, lead analysis, and strategic planning. The agent excelled at drafting personalized outreach emails, BANT-based lead scoring/prioritization, summarizing sales call transcripts, and building multi-step follow-up sequences—all producing production-ready output. Its hallucination handling was solid, correctly declining to confirm fabricated features. However, the agent is currently read-only for CRM operations: it cannot create contacts, add deals, or perform write actions, which limits task completion for requests involving data entry. The free-tier token system and lack of direct CRM mutation capabilities represent the primary constraints. Overall, Klipy is a highly capable sales copilot for analysis and content creation, but falls short as a fully autonomous CRM agent.

Universal Performance

Six capabilities · Raw: 23/30

U1 Task Completion

3/5

Agent understood the multi-part request (organize leads, create tasks, summarize pipeline) but could not complete core CRM write operations (adding contacts, creating deals). Correctly identified its limitations and provided the pipeline summary. Task was only partially fulfilled due to read-only CRM access.

U2 Instruction Interpretation

4/5

Correctly interpreted all instructions across all tests including complex multi-part requests, BANT scoring framework, and email sequence requirements. Minor gap: on the error test it did not explicitly address the logical contradiction in the request (delete AND mark as new simultaneously).

U3 Multi-Step Execution (single-message)

4/5

For the BANT scoring test, correctly chained multiple steps: parsing each lead, scoring on 4 dimensions, ranking, and providing recommendations. The follow-up sequence also demonstrated strong multi-step planning across 5 stages with timing, content, tone, and CTAs.

U4 Error Handling (error injected)

4/5

Hallucination test: correctly declined to confirm fabricated features (unlimited AI tokens, LinkedIn scraping) and suggested enabling web search for accurate info. Error test: identified it cannot delete contacts or create leads, offered alternatives. Did not explicitly call out the self-contradicting premise of the request.

U5 Autonomy Level

4/5

Agent required no interventions during any test. It autonomously searched the database, provided pipeline summaries, and generated content. However, it cannot take CRM write actions (adding contacts, creating deals), which limits its autonomy.

U6 Output Quality

4/5

Output was well-structured with headings, bullet points, and organized content. Email draft, BANT scoring, call summary, and follow-up sequence were all production-quality. Minor rendering issues with deadline dates in the call summary (dates appeared as empty styled text).

Domain Scenarios

Sales · 5 scenarios scored 0–100

D1 Personalized Outreach Email

100.0

Accuracy: 5/5 · Completeness: 5/5 · Usefulness: 5/5

Drafted a well-personalized outreach email referencing the webinar, Series B funding, and CRM pain point. Included subject line, warm greeting, pain-point alignment, clear CTA (10-minute chat), and professional sign-off. Noted the contact wasn't in the database and offered to track follow-ups once added.

D2 Sales Call Transcript Summary

80.0

Accuracy: 4/5 · Completeness: 4/5 · Usefulness: 4/5

Provided a structured call summary identifying key requirements (Slack/email integration, reporting, SOC 2 compliance, team size). Extracted 3 action items with deadlines (proposal by Friday, case studies by Friday, CFO presentation on Monday). However, deadline dates had rendering issues in the UI (appeared as styled but empty text). Did not capture the pricing detail ($29/user/month) in the summary.

D3 Lead Scoring and Prioritization

100.0

Accuracy: 5/5 · Completeness: 5/5 · Usefulness: 5/5

Excellent BANT scoring with detailed justifications for each dimension. Correctly ranked: FastTrack Logistics (20/20), SteelBridge Manufacturing (18/20), Pixel Creative Studio (9/20), Bloom Marketing Agency (5/20). Each lead received individual assessments and a clear recommendation to focus on the top two.

D4 Request for Inaccessible Data

86.7

Accuracy: 5/5 · Completeness: 4/5 · Usefulness: 4/5

Correctly identified it has no direct integrations with Stripe or Salesforce and cannot pull revenue figures, profit margins, or churn rates. Offered to pull Klipy pipeline summary or search for specific deals as alternatives. Response was clear but relatively brief; could have suggested more specific workarounds or integration options.

D5 Multi-Step Follow-Up Sequence

100.0

Accuracy: 5/5 · Completeness: 5/5 · Usefulness: 5/5

Produced a comprehensive 5-step follow-up sequence with strategic progression: Board Resource (Day 4), Gentle Check-In (Day 8), Value-Add Case Study (Day 14), Soft Nudge (Day 21), Break-Up/Closing the Loop (Day 35). Each step included timing, subject line, content description, tone, and CTA. The sequence showed strong sales methodology understanding, appropriately accounting for the CEO/board dynamic and the $65K deal size.
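
The scenario scores above are consistent with a simple rubric calculation. The sketch below assumes each scenario score is the sum of its three ratings divided by the 15 available points, and that the raw Domain score is the unweighted mean of the five scenarios. The page does not state this method explicitly; it is inferred because it reproduces the published numbers. The helper name `scenario_score` is illustrative.

```python
# (accuracy, completeness, usefulness) ratings out of 5, as listed per scenario.
scenarios = {
    "D1": (5, 5, 5),
    "D2": (4, 4, 4),
    "D3": (5, 5, 5),
    "D4": (5, 4, 4),
    "D5": (5, 5, 5),
}


def scenario_score(ratings: tuple, max_per_rating: int = 5) -> float:
    """Convert rubric ratings to a 0-100 score (assumed method, see lead-in)."""
    return sum(ratings) / (len(ratings) * max_per_rating) * 100


# Per-scenario scores, rounded to one decimal place as displayed.
scores = {name: round(scenario_score(r), 1) for name, r in scenarios.items()}

# Raw Domain score: unweighted mean of the unrounded scenario scores.
raw_domain = round(
    sum(scenario_score(r) for r in scenarios.values()) / len(scenarios), 1
)

print(scores)
print("Raw Domain:", raw_domain)
```

Under these assumptions D4 works out to 13/15, or 86.7, and the mean of the five scenarios is 93.3, matching the raw Domain score reported before the −10 adjustment.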

Strengths

Klipy's AI agent excels at sales content generation and strategic analysis. Its BANT lead scoring was exceptionally well-structured with accurate dimensional scoring and actionable recommendations. Email drafting capabilities are production-ready, producing personalized outreach that correctly incorporates all provided context (company details, pain points, recent interactions). The 5-step follow-up sequence demonstrated sophisticated sales methodology understanding with appropriate pacing, escalation, and a strategic break-up email. The agent's hallucination resistance was strong—it refused to confirm fabricated features and transparently acknowledged its knowledge limitations. The conversation maintained context well across a long session with 8 messages, and the suggested follow-up buttons after each response showed good anticipation of user needs.

Weaknesses

The most significant limitation is the agent's inability to perform CRM write operations. It cannot create contacts, add deals, or modify pipeline data—only read and summarize existing data. This fundamentally limits task completion for any workflow involving data entry. The sales call transcript summary had UI rendering issues where deadline dates appeared as styled but empty text (e.g., 'Due by .' instead of 'Due by Friday'). The error handling test response correctly identified capability limitations but failed to recognize the logical contradiction in the request (deleting contacts while simultaneously marking them as new leads). The agent's response to the inaccessible data request (D4) was functional but brief, lacking suggestions for integration alternatives or workarounds beyond what Klipy itself offers.

Testing Limitations

Testing was conducted on the Klipy Free tier with 200 monthly tokens, which may not reflect the full capabilities of paid plans. The CRM was effectively empty (no companies, people, or deals), so the agent's ability to work with existing CRM data was not fully tested. The 56 imported email conversations were present but not directly tested for retrieval or analysis. Paid features such as advanced integrations, team collaboration, and higher token limits were not evaluated. Testing covered a single session, so multi-session memory and long-term reliability were not assessed.

Evaluation Transparency

Platform: Klipy Free tier (200 tokens/month, resets 11 May 2026)

Environment: Klipy Free plan with 200 AI tokens. 56 email conversations imported via email sync. CRM is empty: 0 companies, 0 people, 0 deals in pipeline. Default pipeline stages configured (Opportunities, Approached, Meetings Scheduled, Proposal, Procurement, Blocked, Approach Again). No custom knowledge base or additional integrations configured. AI agent accessed via the built-in chat panel on the Home page.

Testing conducted via browser-based interaction in a single session
Repeatability testing: not conducted
Long-term reliability, scale testing, and latency precision not covered
Scores reflect a snapshot as of 2026-04-11
CRM was empty during testing, limiting assessment of data retrieval capabilities
Free tier may have different capabilities than paid plans

Overview

Klipy is your AI sales teammate. It captures every conversation — email, WhatsApp, LinkedIn, calls — updates your CRM automatically, drafts follow-ups, and preps your next call. Nothing to log. Nothing to write from scratch. Set up in under 4 minutes.

ai sales tools · ai chief of staff · sales · artificial intelligence

Freemium Plan

Freemium

Makers

Joey Lee

Tina Law

Jung Kim (@sudopaeg)
