This document provides enhanced instructions for using an AI system to conduct thorough, critical evaluations of research reports generated by other AI systems (such as ChatGPT, Perplexity, or Gemini). These stricter guidelines make the assessment systematic and set a higher bar for the quality, reliability, and usefulness of AI-generated research.
Use the following enhanced template for a more rigorous evaluation:
I need you to conduct a rigorous, critical evaluation of an AI-generated research report on [TOPIC]. This report was created by [AI SYSTEM NAME AND VERSION]. Apply the highest standards of academic/professional scrutiny using this comprehensive framework:
1. Accuracy and Factual Correctness (25%)
- Verify all key factual claims against authoritative sources
- Identify any statistical errors, misrepresentations, or outdated information
- Evaluate citation quality and verify source credibility
- Check for factual inconsistencies within the report
- Assess whether uncertainties are appropriately acknowledged
2. Depth and Comprehensiveness (15%)
- Evaluate coverage of essential subtopics required for full understanding
- Identify any missing perspectives or counterarguments
- Assess treatment of nuances, edge cases, and exceptions
- Evaluate connections to broader contexts and related fields
- Identify any artificially simplified explanations
3. Research Quality (15%)
- Evaluate diversity, relevance, and authority of sources
- Assess recency and appropriateness of cited research
- Examine whether the report distinguishes between primary and secondary sources
- Evaluate citation specificity and formatting consistency
- Identify any over-reliance on particular sources
4. Reasoning and Critical Thinking (10%)
- Evaluate logical structure and identification of assumptions
- Assess strength of argumentation and quality of evidence
- Identify any logical fallacies or reasoning errors
- Evaluate treatment of causality vs. correlation
- Assess how alternative explanations are addressed
5. Bias and Objectivity (10%)
- Identify language patterns suggesting bias (charged terms, framing)
- Assess balance in presentation of competing viewpoints
- Evaluate transparency about normative assumptions
- Identify selective use of evidence or cherry-picking
- Assess underlying ideological commitments
6. Synthesis and Original Insights (10%)
- Evaluate depth of information integration across sources
- Assess novelty and value of connections or patterns identified
- Evaluate transformative vs. merely aggregative use of sources
- Identify any missed opportunities for synthesis
- Assess practical applicability of insights
7. Language and Communication (5%)
- Evaluate precision of language and appropriate use of terminology
- Assess organizational logic and information hierarchy
- Evaluate accessibility for intended audience
- Identify unnecessarily complex or oversimplified explanations
- Assess effectiveness of any visual or structural elements
8. Hallucination and Fabrication Detection (10%)
- Flag any claims that cannot be verified through reliable sources
- Identify suspiciously specific details without proper citation
- Evaluate any statistical claims for plausibility
- Check for fabricated or misattributed quotes
- Identify any fictional or non-existent sources
9. Methodology Transparency (5%)
- Evaluate disclosure of research and analysis methods
- Assess clarity about information selection criteria
- Identify any unstated assumptions in the approach
- Evaluate acknowledgment of methodological limitations
- Assess reproducibility of findings based on described methods
10. Practical Usefulness (5%)
- Evaluate actionability of information for intended purpose
- Assess organization for efficient information retrieval
- Evaluate clarity and specificity of any recommendations
- Identify any critical gaps that limit practical application
- Assess appropriate prioritization of information
For each dimension:
- Assign a score from 1 to 5, using one decimal place where warranted (e.g., 3.7/5)
- Provide 3-5 sentences justifying your score with specific examples from the report
- Identify at least 3 specific strengths and 3 specific weaknesses with direct quotations/references
- Flag any "critical failures" that would render the entire report unreliable
Additionally provide:
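- A calculated weighted total score (each dimension score multiplied by its weight, summed, then divided by the sum of all weights so the result stays on the 1-5 scale) with explanation of scoring rationale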
- A comprehensive final assessment (500-700 words) with detailed critique
- A specific verification log listing:
  * 10 most important claims that were successfully verified (with sources)
  * All claims that could not be verified or appear fabricated (with explanation)
  * Any detected patterns of misrepresentation
Required external verification steps:
- Cross-check at least 5 key statistical claims against authoritative sources
- Verify at least 3 quoted statements against their original sources
- Check publication dates/status of at least 5 cited sources
- Verify credentials of any cited experts or authorities
- Test internal consistency by cross-referencing claims made in different sections
[OPTIONAL: I'm particularly interested in a forensic analysis of the [SPECIFIC DIMENSION], with detailed examples of both successful and problematic execution.]
Here is the full report to evaluate:
[PASTE FULL REPORT TEXT HERE]
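The weighted total requested in the template can be computed mechanically once the ten dimension scores exist. The following is a minimal sketch in Python, not part of the template itself; the function name `weighted_total` and the dictionary layout are illustrative assumptions. Note that the percentages as listed add up to 110, so the sketch divides by the actual sum of the weights rather than assuming 100.

```python
# Weights in percent, copied from the ten dimensions of the template.
# As listed they sum to 110, so the total is normalized by the real sum.
WEIGHTS = {
    "Accuracy and Factual Correctness": 25,
    "Depth and Comprehensiveness": 15,
    "Research Quality": 15,
    "Reasoning and Critical Thinking": 10,
    "Bias and Objectivity": 10,
    "Synthesis and Original Insights": 10,
    "Language and Communication": 5,
    "Hallucination and Fabrication Detection": 10,
    "Methodology Transparency": 5,
    "Practical Usefulness": 5,
}


def weighted_total(scores: dict) -> float:
    """Return the weight-normalized average of the dimension scores (1-5 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    unknown = set(scores) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    for name, score in scores.items():
        if not 1 <= score <= 5:
            raise ValueError(f"score for {name!r} must be between 1 and 5")
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


# Hypothetical example: every dimension scored 4.0 except accuracy.
scores = {name: 4.0 for name in WEIGHTS}
scores["Accuracy and Factual Correctness"] = 2.5
print(round(weighted_total(scores), 2))
```

Dividing by the sum of the weights keeps the result on the same 1-5 scale as the individual scores, even if the weights are later adjusted.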
For maximum rigor, include these additional specialized instructions:
When evaluating accuracy, apply journalistic fact-checking standards:
- Classify each major claim as "Verified," "Partially Verified," "Unverified," or "False"
- For any claim rated below "Verified," explain what information is missing or incorrect
- Distinguish between factual errors and outdated information
- Evaluate whether uncertainty is properly communicated for speculative claims
- Identify any instances where correlation is improperly presented as causation
- Check if statistics are presented with appropriate context (sample size, methodology, etc.)
- Verify that quotes are accurate, properly attributed, and not taken out of context
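The four-way classification above maps naturally onto a small record type for a verification log. The sketch below is one possible representation, assuming Python 3.9 or later; the names `Verdict`, `ClaimRecord`, and `summarize` are hypothetical and chosen only for illustration. It enforces two rules from the instructions: a verified claim carries at least one source, and any claim rated below "Verified" carries an explanation.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    VERIFIED = "Verified"
    PARTIALLY_VERIFIED = "Partially Verified"
    UNVERIFIED = "Unverified"
    FALSE = "False"


@dataclass
class ClaimRecord:
    claim: str
    verdict: Verdict
    sources: list[str]
    explanation: str = ""

    def __post_init__(self):
        # A verified claim must point at the source used to verify it.
        if self.verdict is Verdict.VERIFIED and not self.sources:
            raise ValueError("a verified claim needs at least one source")
        # Anything below "Verified" must say what is missing or incorrect.
        if self.verdict is not Verdict.VERIFIED and not self.explanation.strip():
            raise ValueError("claims rated below 'Verified' need an explanation")


def summarize(log: list[ClaimRecord]) -> dict:
    """Count the claims in the log per verdict label."""
    return dict(Counter(record.verdict.value for record in log))


# Hypothetical log entries, for illustration only.
log = [
    ClaimRecord("Claim A", Verdict.VERIFIED, ["primary source A"]),
    ClaimRecord("Claim B", Verdict.UNVERIFIED, [], "no source could be located"),
    ClaimRecord("Claim C", Verdict.PARTIALLY_VERIFIED, ["source C"],
                "figure is correct but dates from an older edition"),
]
print(summarize(log))
```

Rejecting incomplete records at construction time means a log that loads without errors already satisfies both rules.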
Conduct a systematic bias analysis:
- Identify framing biases in how issues are presented
- Map perspective coverage (which viewpoints are centered vs. marginalized)
- Analyze language patterns (loaded terms, passive/active voice patterns)
- Evaluate source diversity across ideological, geographical, and demographic dimensions
- Assess implicit assumptions and unstated premises
- Identify any false equivalencies or false dichotomies
- Check for selection bias in examples, case studies, or evidence presented
- Note any appeals to authority without substantive support
Apply these forensic techniques to identify potential hallucinations:
- Conduct targeted searches for highly specific claims, statistics, or quotations
- Verify existence of any named institutions, research groups, or specialized terms
- Check plausibility of quantitative claims against known benchmarks in the field
- Identify suspiciously convenient examples that perfectly illustrate points
- Look for patterns in unverifiable content (e.g., consistent elaboration in certain sections)
- Verify chronological consistency and historical accuracy
- Check for misattributed innovations or discoveries
- Identify any claims that contradict established scientific consensus without acknowledgment
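As a first pass before the targeted searches, sentences that contain very specific figures but no visible citation can be listed automatically. The sketch below is a rough heuristic only, not a hallucination detector: the two regular expressions that define a "specific figure" and a "citation" are assumptions made for illustration, and the function name `flag_uncited_specifics` is hypothetical. Flagged sentences still need manual verification, and unflagged ones are not thereby confirmed.

```python
import re

# Assumed notion of a "specific figure": percentages, thousands-separated
# numbers, or decimals. Adjust to the report's domain.
SPECIFIC = re.compile(r"\d+(?:\.\d+)?\s?%|\b\d{1,3}(?:,\d{3})+\b|\b\d+\.\d+\b")

# Assumed citation styles: [3], (Author, 2020), or a bare URL.
CITATION = re.compile(r"\[\d+\]|\([A-Z][^()]*\d{4}\)|https?://\S+")


def flag_uncited_specifics(text: str) -> list:
    """Return sentences that contain a specific figure but no citation marker."""
    # Naive sentence split on ., !, or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [
        sentence
        for sentence in sentences
        if SPECIFIC.search(sentence) and not CITATION.search(sentence)
    ]


# Made-up sample text, for illustration only.
sample = "Adoption rose 37.5% in 2021. Revenue grew 12% [3]. The sky is blue."
for sentence in flag_uncited_specifics(sample):
    print(sentence)
```

The split is deliberately simple and will mis-handle abbreviations such as "e.g."; the aim is only to produce a shortlist for the manual checks listed above.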