Feedback Scoring

Feedback scoring is the practice of assigning numerical values to feedback based on business importance, urgency, or priority. Scores help teams quickly identify what matters most and make data-driven prioritization decisions.

Common Scoring Dimensions

Business impact (1-5):

  • 5: Critical to revenue or product function
  • 4: Significant impact on important customers
  • 3: Meaningful improvement
  • 2: Minor enhancement
  • 1: Nice-to-have

Urgency (1-5):

  • 5: Immediate action required
  • 4: This week
  • 3: This month
  • 2: This quarter
  • 1: Someday/maybe

User impact (1-5):

  • 5: Affects all users, core workflows
  • 4: Affects many users or critical workflow
  • 3: Affects moderate number of users
  • 2: Affects small segment
  • 1: Edge case

Effort to address (1-5):

  • 5: Months of work
  • 4: Weeks of work
  • 3: Days of work
  • 2: Hours of work
  • 1: Minutes to fix
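As a rough sketch, the four dimensions could be captured in a small data structure like the one below; the field names and 1-5 validation are illustrative assumptions, not a required schema.

```python
from dataclasses import dataclass, fields

@dataclass
class FeedbackScores:
    """One 1-5 rating per scoring dimension for a single piece of feedback."""
    business_impact: int
    urgency: int
    user_impact: int
    effort: int

    def __post_init__(self):
        # Reject anything outside the 1-5 scales described above.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 1 <= value <= 5:
                raise ValueError(f"{f.name} must be 1-5, got {value}")

# Example: a blocking issue for many users that would take days to fix.
scores = FeedbackScores(business_impact=4, urgency=4, user_impact=4, effort=3)
```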

Composite Scoring

Priority score = f(Impact, Urgency, User Count, Customer Value, Effort)

Different formulas weight factors differently:

Simple average: Priority = (Impact + Urgency + User Impact) / 3

Weighted formula: Priority = (Impact × 0.4) + (Urgency × 0.3) + (User Impact × 0.2) + (Customer Value × 0.1)

Impact-effort ratio: Priority = Impact / Effort (optimizes for high impact, low effort)

Custom formula based on strategy: an early-stage company might weight customer value heavily, while a growth-stage company might weight user count.
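As a concrete sketch, the three formulas above translate directly into code; the weights used here are the illustrative ones shown, not a recommendation.

```python
def simple_average(impact: float, urgency: float, user_impact: float) -> float:
    """Priority = (Impact + Urgency + User Impact) / 3."""
    return (impact + urgency + user_impact) / 3

def weighted_priority(impact: float, urgency: float,
                      user_impact: float, customer_value: float) -> float:
    """Weighted priority: 40% impact, 30% urgency, 20% user impact, 10% customer value."""
    return impact * 0.4 + urgency * 0.3 + user_impact * 0.2 + customer_value * 0.1

def impact_effort_ratio(impact: float, effort: float) -> float:
    """Priority = Impact / Effort; highest when impact is high and effort is low."""
    return impact / effort

print(simple_average(4, 3, 5))        # 4.0
print(weighted_priority(4, 3, 5, 2))  # 3.7
print(impact_effort_ratio(4, 2))      # 2.0
```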

Manual vs. Automated Scoring

Manual scoring:

  • Human reads feedback
  • Considers context
  • Assigns scores based on judgment
  • Time-consuming but captures nuance

Automated scoring:

  • AI analyzes feedback and context
  • Applies consistent criteria
  • Instant results at scale
  • Might miss subtle context

Hybrid (best):

  • AI provides initial scores
  • Humans review and adjust
  • System learns from adjustments
  • Balance of speed and judgment
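A minimal sketch of what the hybrid loop could look like in code; the record fields and the corrections log are hypothetical, meant only to show the accept-or-override pattern.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScoredFeedback:
    text: str
    ai_score: float                        # initial score from the model
    final_score: Optional[float] = None
    override_reason: Optional[str] = None

corrections: list = []   # overrides collected for the next calibration pass

def review(item: ScoredFeedback, human_score: Optional[float] = None,
           reason: str = "") -> float:
    """Accept the AI score by default, or record a human override with a reason."""
    if human_score is None:
        item.final_score = item.ai_score
    else:
        item.final_score = human_score
        item.override_reason = reason
        corrections.append(item)           # the system can learn from these later
    return item.final_score

item = ScoredFeedback("CSV export fails for large workspaces", ai_score=3.2)
review(item, human_score=4.5, reason="Blocking issue for a key enterprise account")
```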

What Influences Feedback Scores

Content factors:

  • Language used ("broken," "blocked," "critical")
  • Specificity (vague vs. detailed)
  • Problem description (severity)

User context:

  • Account value (MRR, contract size)
  • User role (decision maker, influencer, user)
  • Account health (at-risk, healthy, champion)
  • Engagement level (active, casual, dormant)

Historical patterns:

  • Similar requests from other users
  • Frequency of this issue
  • Past resolution outcomes
  • Trending vs. isolated

Strategic fit:

  • Aligns with roadmap
  • Target customer profile
  • Product vision
  • Competitive positioning
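One illustrative way to fold user context into a content-derived base score; the MRR cutoffs and boost amounts are assumptions, since the right adjustments depend on your business.

```python
def adjust_for_context(base_score: float, mrr: float, account_health: str) -> float:
    """Boost a content-derived score using account context, capped at 5.0."""
    score = base_score
    # Hypothetical cutoffs: higher-value accounts get a modest boost.
    if mrr >= 5000:
        score += 1.0
    elif mrr >= 1000:
        score += 0.5
    # Weight at-risk accounts up so churn signals surface sooner.
    if account_health == "at-risk":
        score += 0.5
    return min(score, 5.0)

print(adjust_for_context(3.0, mrr=6000, account_health="at-risk"))  # 4.5
```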

Scoring Frameworks

RICE Score (for features/requests):

  • Reach: How many users affected
  • Impact: How much does it help per user
  • Confidence: How sure are you
  • Effort: How hard to build
  • Score = (Reach × Impact × Confidence) / Effort
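The RICE formula as code; the example inputs and the scales noted in the docstring (per-user impact multiplier, confidence as a fraction, effort in person-months) follow common convention and can be swapped for your own.

```python
def rice_score(reach: float, impact: float, confidence: float, effort: float) -> float:
    """RICE: (Reach x Impact x Confidence) / Effort.

    Commonly: reach = users affected per period, impact = per-user multiplier
    (e.g. 0.25 to 3), confidence = 0 to 1, effort = person-months.
    """
    return (reach * impact * confidence) / effort

print(rice_score(reach=500, impact=2, confidence=0.8, effort=4))  # 200.0
```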

ICE Score:

  • Impact: Business value
  • Confidence: Certainty
  • Ease: How easy to implement
  • Score = (Impact + Confidence + Ease) / 3
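The ICE average as defined above, as a one-line function:

```python
def ice_score(impact: float, confidence: float, ease: float) -> float:
    """ICE as defined above: the average of impact, confidence, and ease."""
    return (impact + confidence + ease) / 3

print(ice_score(impact=4, confidence=3, ease=5))  # 4.0
```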

Value-Effort Matrix:

  • Plot on 2D grid
  • Prioritize high-value, low-effort
  • Score = Value / Effort

Custom scoring:

  • Define factors that matter to your business
  • Weight them appropriately
  • Create formula that reflects strategy

Score Calibration

Scores are relative, not absolute. The key is consistency:

Calibration sessions:

  • Team scores same feedback independently
  • Discuss disagreements
  • Align on criteria
  • Update guidelines

Reference examples:

  • "Here's a canonical 5"
  • "Here's a canonical 3"
  • Compare new feedback to references

Scoring drift:

  • Over time, team might score too high or low
  • Periodic recalibration maintains consistency

Inter-rater reliability:

  • Measure agreement between scorers
  • 80% agreement is good
  • <60% means unclear criteria
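A simple way to measure agreement between two scorers who rated the same items; "agreement" here means identical scores, with an optional tolerance if you want to count near-misses.

```python
def agreement_rate(scorer_a: list, scorer_b: list, tolerance: int = 0) -> float:
    """Fraction of items where two scorers agree within `tolerance` points."""
    if len(scorer_a) != len(scorer_b):
        raise ValueError("Both scorers must rate the same items")
    matches = sum(1 for a, b in zip(scorer_a, scorer_b) if abs(a - b) <= tolerance)
    return matches / len(scorer_a)

a = [5, 4, 3, 2, 4, 1, 3, 5, 2, 4]
b = [5, 3, 3, 2, 4, 1, 4, 5, 2, 4]
print(agreement_rate(a, b))               # 0.8 -- exact agreement, in the "good" range
print(agreement_rate(a, b, tolerance=1))  # 1.0 -- agreement within one point
```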

Score Thresholds

Define action thresholds:

  • 4.5-5.0: Immediate action, alert team
  • 3.5-4.4: High priority, address this week/sprint
  • 2.5-3.4: Medium priority, backlog for review
  • 1.5-2.4: Low priority, defer unless pattern emerges
  • 0-1.4: Nice-to-have, likely won't address

Thresholds create clear decision rules and reduce ambiguity.
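The thresholds above expressed as a lookup function (a sketch; adjust the boundaries and labels to match your own bands):

```python
def priority_band(score: float) -> str:
    """Map a 0-5 composite score to an action band using the thresholds above."""
    if score >= 4.5:
        return "immediate: alert team"
    if score >= 3.5:
        return "high: address this week/sprint"
    if score >= 2.5:
        return "medium: backlog for review"
    if score >= 1.5:
        return "low: defer unless a pattern emerges"
    return "nice-to-have: likely won't address"

print(priority_band(4.7))  # immediate: alert team
print(priority_band(2.8))  # medium: backlog for review
```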

Displaying Scores

For teams:

  • Show score prominently (color-coded)
  • Explain reasoning ("High score because: enterprise customer, blocking issue")
  • Allow manual override with explanation
  • Track score changes over time

For reporting:

  • Average score over time
  • Distribution of scores (are we getting more high-priority feedback?)
  • Score by customer segment
  • Score by product area

For users:

  • Usually don't show internal scores
  • Exception: voting systems where "votes" are a form of score

Common Scoring Mistakes

Score inflation: Everything gets marked 4 or 5 because it all seems important. Makes scoring useless.

Ignoring context: Scoring feedback without considering who it's from (free user vs. enterprise).

Set-and-forget: Scoring once and never revisiting. Priorities change.

Too many dimensions: 10 different scores per item is overwhelming. Keep it simple (2-3 key scores).

No validation: Never checking if high-scored items actually mattered. Learn and iterate.

Scoring without strategy: No clear connection between scores and what you'll actually build.

Score-Based Workflows

Automated routing:

  • Score >4: Alert product lead immediately
  • Score 3-4: Add to this week's triage
  • Score <3: Weekly batch review

SLA by score:

  • Score 5: Respond in 1 hour
  • Score 4: Respond same day
  • Score 3: Respond in 2 days
  • Score 2: Respond in 1 week
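A sketch of turning the SLA table into concrete response deadlines; it assumes whole-number scores and wall-clock time rather than business hours.

```python
from datetime import datetime, timedelta
from typing import Optional

# Response-time targets by score, mirroring the table above.
SLA_BY_SCORE = {
    5: timedelta(hours=1),
    4: timedelta(days=1),
    3: timedelta(days=2),
    2: timedelta(weeks=1),
}

def response_deadline(score: int, received_at: datetime) -> Optional[datetime]:
    """Return the SLA deadline for a score, or None if no SLA applies."""
    sla = SLA_BY_SCORE.get(score)
    return received_at + sla if sla else None

print(response_deadline(4, datetime(2024, 6, 3, 9, 0)))  # 2024-06-04 09:00:00
```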

Reporting thresholds:

  • Only escalate score >4 to leadership
  • Weekly rollup of score 3-4 items
  • Monthly review of score <3

Learning from Scores

Retrospective analysis:

  • Did high-scored items actually matter when we addressed them?
  • Did we miss any low-scored items that turned out critical?
  • Which factors were most predictive?

Score outcome correlation:

  • Track: Initial score → Action taken → Business impact
  • Learn: What scores lead to best outcomes?
  • Adjust: Refine scoring formula based on learnings
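A quick retrospective check, assuming you have logged each item's initial score alongside a later impact rating; the data here is hypothetical, and statistics.correlation requires Python 3.10+, so substitute any correlation routine you already use.

```python
from statistics import correlation  # Python 3.10+

# Hypothetical history: (initial score, business impact rated 1-5 after shipping)
history = [(4.5, 5), (4.0, 4), (3.5, 2), (3.0, 3), (2.5, 1), (4.8, 5), (2.0, 2)]

initial = [score for score, _ in history]
realized = [impact for _, impact in history]

r = correlation(initial, realized)
print(f"Score vs. outcome correlation: {r:.2f}")
# A weak correlation is a signal to revisit the formula's factors and weights.
```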

Feedback loops:

  • Team members flag incorrect scores
  • System learns from corrections
  • Scoring improves over time

When Feedback Scoring Makes Sense

High volume: 50+ pieces of feedback per week. Manual prioritization breaks down.

Distributed team: Multiple people handling feedback need consistent criteria.

Data-driven culture: Want objective prioritization, not gut-feel or politics.

Clear strategy: Know what factors drive value for your business.

Resource constraints: Can't address everything, must choose wisely.

When It's Not Necessary

Low volume: Reading and prioritizing 10 items per week doesn't need formal scoring.

Highly contextual: Every decision requires deep strategic consideration unique to the situation.

Early exploration: Pre-product-market-fit when you're learning, not optimizing.

Founder-driven: If founder personally reviews all feedback and decides, formal scoring adds overhead.

Start simple. Add sophistication as volume and complexity grow.

Ready to implement Feedback Scoring?

Feedbackview helps you manage feedback with AI-powered automation and smart prioritization.

Try Feedbackview Free