Review Analysis Methodology
Sellyze analyzes Amazon customer reviews to identify product pain points, score market opportunities, and generate product specifications. This page explains how the analysis works.
Four-Layer Architecture
Each analysis passes through four layers: detection, classification, prevalence estimation, and market validation.
Layer 1: Detection (Stratified Sampling)
Amazon reviews are typically 85-95% positive (4-5 stars), so a simple random sample contains few negative reviews and can miss most product problems. Sellyze fetches approximately 400 reviews per product and applies post-fetch stratification: all negative reviews (1-2 stars) and all neutral reviews (3 stars) are retained, and positive reviews (4-5 stars) are capped at 60% of the final sample. This produces a more balanced dataset of approximately 200 reviews with enough negative reviews to detect even uncommon issues.
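The stratification step can be sketched roughly as follows. This is an illustrative sketch, not Sellyze's actual implementation: the function name `stratify`, the `stars` field, and the seeded random draw are assumptions. It reads "capped at 60% of the final sample" as a cap relative to the number of retained negative and neutral reviews.

```python
import random


def stratify(reviews, positive_cap_pct=60, seed=0):
    """Keep all 1-3 star reviews and cap 4-5 star reviews at a share of the result.

    `reviews` is a list of dicts with a `stars` key (1-5). The field name and
    the seeded random draw are assumptions made for this sketch.
    """
    negative = [r for r in reviews if r["stars"] <= 2]
    neutral = [r for r in reviews if r["stars"] == 3]
    positive = [r for r in reviews if r["stars"] >= 4]

    kept = negative + neutral
    # If positives may be at most p% of the final sample, then
    # positives <= kept * p / (100 - p). Integer math avoids float rounding.
    max_positive = len(kept) * positive_cap_pct // (100 - positive_cap_pct)

    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    sampled = rng.sample(positive, min(max_positive, len(positive)))
    return kept + sampled


# Hypothetical fetch: 30 negative, 10 neutral, 360 positive reviews.
fetched = [{"stars": 1}] * 30 + [{"stars": 3}] * 10 + [{"stars": 5}] * 360
sample = stratify(fetched)
print(len(sample), sum(r["stars"] >= 4 for r in sample))
```

Under this reading, the final sample size depends on how many negative and neutral reviews were fetched, so a product with very few of them yields a smaller sample.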
Layer 2: Classification (AI, Temperature 0)
Each review is broken into individual claims using Claude Haiku. A single review like "the lid leaks and it is hard to clean" becomes two separate claims. Claims are then clustered into named pain points using Claude Sonnet. Two claims are grouped together if fixing one would fix the other. Both models run at temperature 0 to make results as reproducible as possible: the same input should produce the same classification, although temperature 0 does not strictly guarantee identical output from a hosted model.
Layer 3: Prevalence Estimation (Deterministic Math)
Because the review sample is stratified (negative reviews oversampled), raw frequency counts would overstate how common each problem is. Sellyze corrects for this using post-stratification weighting. For each pain point, mentions are counted separately in the negative review stratum and the positive review stratum, and each count is divided by the number of sampled reviews in that stratum to give a per-stratum mention rate. These rates are then weighted by the actual rating distribution from the product page and summed. The result is an estimated prevalence rate that approximates the proportion of buyers affected. This is the standard stratified estimator from survey research, a special case of the Horvitz-Thompson estimator.
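A minimal sketch of the weighting arithmetic, assuming two strata as described above. The function name `weighted_prevalence`, the dictionary layout, and all numbers in the example are hypothetical and chosen only to show the effect of the correction.

```python
def weighted_prevalence(mentions, sample_sizes, population_shares):
    """Post-stratified prevalence: sum over strata of share * (mentions / n).

    All three arguments are dicts keyed by stratum name. `population_shares`
    holds the rating distribution from the product page and should sum to 1.
    """
    estimate = 0.0
    for stratum, share in population_shares.items():
        n = sample_sizes.get(stratum, 0)
        if n == 0:
            # No sampled reviews in this stratum: it contributes nothing here,
            # which a real implementation may want to flag instead.
            continue
        estimate += share * mentions.get(stratum, 0) / n
    return estimate


# Hypothetical pain point: 12 of 40 negative and 3 of 120 positive sampled
# reviews mention it; the product page shows 8% negative, 92% positive.
mentions = {"negative": 12, "positive": 3}
sample_sizes = {"negative": 40, "positive": 120}
shares = {"negative": 0.08, "positive": 0.92}

raw_rate = sum(mentions.values()) / sum(sample_sizes.values())
print(f"raw: {raw_rate:.1%}, weighted: {weighted_prevalence(mentions, sample_sizes, shares):.1%}")
```

In this made-up case the raw rate in the oversampled dataset is about 9.4%, while the weighted estimate is about 4.7%, which illustrates how much the uncorrected count would overstate the problem.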
Layer 4: Market Validation (Cross-Product Replication)
Individual product analysis has inherent limitations from sample size and classifier accuracy. Category research runs the same analysis on 5-10 competitors in the same category. A pain point appearing independently across 4 or more products is classified as a validated market issue. This works as a form of independent replication: if spurious detections are independent across products, the probability that the same complaint appears by chance in all of them falls roughly exponentially with the number of products.
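The replication rule can be illustrated with a short sketch. It assumes pain point labels have already been normalized so that the same issue carries the same label across products, which the text above does not describe; the function name `validated_issues`, the product identifiers, and the labels are hypothetical.

```python
from collections import Counter


def validated_issues(pain_points_by_product, min_products=4):
    """Return pain points found in at least `min_products` distinct products.

    `pain_points_by_product` maps a product identifier to an iterable of pain
    point labels. Labels are assumed to be comparable across products.
    """
    counts = Counter()
    for labels in pain_points_by_product.values():
        counts.update(set(labels))  # count each product at most once per label
    return {label: n for label, n in counts.items() if n >= min_products}


# Hypothetical category run over five competing products.
category = {
    "product_a": ["lid leaks", "hard to clean"],
    "product_b": ["lid leaks", "paint chips"],
    "product_c": ["lid leaks", "hard to clean"],
    "product_d": ["lid leaks"],
    "product_e": ["hard to clean"],
}
print(validated_issues(category))
```

Here only "lid leaks" reaches the threshold of four products; "hard to clean" appears in three and would remain a per-product finding.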
Severity Calibration
After computing weighted prevalence, each pain point receives a severity rating based on both frequency and impact type. Functional failures (leaks, breakage, safety issues) receive lower thresholds than cosmetic or preference issues because a 1% failure rate for a core function is more consequential than a 1% aesthetic complaint.
| Severity | Standard Issue | Functional Failure |
|---|---|---|
| Critical | 3% or more of buyers | 2% or more of buyers |
| Major | 1% to 3% | 0.8% to 2% |
| Moderate | 0.3% to 1% | 0.3% to 0.8% |
| Minor | Below 0.3% | Below 0.3% |
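As a rough illustration, the table can be expressed as a lookup function. The function name `severity` is hypothetical, and the handling of exact boundary values (a value equal to a threshold falls in the higher band) is an assumption, since the table does not say which band a boundary belongs to.

```python
def severity(prevalence_pct, functional=False):
    """Map a weighted prevalence (in percent of buyers) to a severity label.

    Thresholds follow the table above. Boundary values are assumed to fall
    in the higher severity band.
    """
    critical, major = (2.0, 0.8) if functional else (3.0, 1.0)
    if prevalence_pct >= critical:
        return "Critical"
    if prevalence_pct >= major:
        return "Major"
    if prevalence_pct >= 0.3:  # same cutoff for both issue types
        return "Moderate"
    return "Minor"


# The same prevalence rates higher when the issue is a functional failure.
print(severity(2.5), severity(2.5, functional=True))
print(severity(0.9), severity(0.9, functional=True))
```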
Opportunity Score
The Sellyze opportunity score evaluates products across five dimensions, each scored 0 to 10. The final score is a weighted average scaled to 0-100.
| Dimension | Weight | What It Measures |
|---|---|---|
| Pain Point Severity | 30% | Are the complaints fixable and impactful? |
| Market Demand | 25% | Is the product selling? Growing or declining? |
| Competition Level | 20% | How hard is it to win market share? |
| Entry Barrier | 15% | How easy is it to manufacture and launch? |
| Profit Potential | 10% | Can you make money at this price point? |
Grades: A (80+), B (65-79), C (50-64), D (35-49), F (below 35).
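The scoring arithmetic can be sketched as below. The weights and grade bands come from the tables above; the dictionary keys, the function names, and the example scores are hypothetical, and applying the grade bands to the unrounded score is an assumption.

```python
# Weights in percent, taken from the table above. Keys are illustrative names.
WEIGHTS = {
    "pain_point_severity": 30,
    "market_demand": 25,
    "competition_level": 20,
    "entry_barrier": 15,
    "profit_potential": 10,
}


def opportunity_score(scores):
    """Weighted average of 0-10 dimension scores, scaled to 0-100."""
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    # sum(weight% * score) / 100 is the 0-10 average; multiply by 10 to scale.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 10


def grade(score):
    """Letter grade for a 0-100 score (bands applied to the unrounded value)."""
    for cutoff, letter in ((80, "A"), (65, "B"), (50, "C"), (35, "D")):
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical product scored on each dimension.
example = {
    "pain_point_severity": 8,
    "market_demand": 7,
    "competition_level": 6,
    "entry_barrier": 5,
    "profit_potential": 9,
}
total = opportunity_score(example)
print(total, grade(total))
```

For these example scores the weighted sum is 240 + 175 + 120 + 75 + 90 = 700, giving a score of 70 and a grade of B.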
Limitations
- Classifier accuracy has not been validated against human-labeled data. Precision and recall are unmeasured.
- Review dates for non-English marketplaces (Germany, France, Italy, Spain, Japan) cannot be parsed reliably. Temporal analysis is unavailable for these regions.
- Products with fewer than 2% negative reviews may yield only 5-10 negative reviews in the sample, reducing per-product prevalence reliability.
- Amazon Best Sellers Rank (BSR) is not available from the current data provider. Sales volume text is used as a qualitative alternative.
This methodology applies to all Sellyze product analyses and category research reports. For questions about the methodology, contact support@sellyze.ai.