How ScribbleAI Judges Your Content (And Where the Bar Is Going)

Tanmay Tarte
What separates a 20-point submission from a 35-point one? Learn ScribbleAI's judging framework, scoring criteria, bonus system, and the secrets to creating content that gets cited by AI.
12 min read



Most creators submitting to Scribble Bounties don't fully understand how judging works. That's a problem we created by not explaining it clearly enough. This article fixes that.


Here's everything: how scoring works, what gets you rejected before a single point is counted, where the system is heading as Scribble scales, and what separates a 20-point submission from a 35-point one.

The Short Version (Read This First)

Every submission is scored out of 40 points: a base score of up to 25, split across three weighted criteria, plus bonus points capped at +15 on top.

The three criteria:


  • QFO Answer Fidelity - 40% of base score

  • Research and Proof - 35% of base score

  • Content Execution - 25% of base score
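
The arithmetic behind the base score can be sketched as follows. This is a minimal illustration, not ScribbleAI's actual scoring code: it assumes each criterion is rated 1 to 5 (as in the rubric tables later in this article) and that the weighted average is scaled by 5 to reach the 25-point base, which matches the worked scenarios further down. The function and constant names are hypothetical.

```python
# Hypothetical sketch of the base-score arithmetic; names are illustrative.
WEIGHTS = {"qfo": 0.40, "research": 0.35, "execution": 0.25}
MAX_RATING = 5   # each criterion is rated 1-5 in the rubric
MAX_BASE = 25    # base score ceiling stated in the article


def base_score(qfo: int, research: int, execution: int) -> float:
    """Return the weighted base score out of 25 (assumed formula)."""
    ratings = {"qfo": qfo, "research": research, "execution": execution}
    for name, rating in ratings.items():
        if not 1 <= rating <= MAX_RATING:
            raise ValueError(f"{name} rating must be between 1 and {MAX_RATING}")
    weighted_avg = sum(WEIGHTS[name] * rating for name, rating in ratings.items())
    # Scale the 1-5 weighted average up to the 25-point base.
    return round(weighted_avg * MAX_BASE / MAX_RATING, 2)


if __name__ == "__main__":
    print(base_score(5, 5, 5))  # 25.0
    print(base_score(4, 4, 4))  # 20.0
```

Under this assumed formula, a one-point gain on QFO Answer Fidelity is worth 2 base points, against 1.75 for Research and Proof and 1.25 for Content Execution.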


Before any scoring happens, your piece has to pass the QFO Gate, a pass/fail check. Fail it and the submission is rejected outright. No score, no reward, no partial credit.


The goal of the entire system is one thing: citation potential. Not impressions, not engagement, not word count. Whether an LLM would pull from your article when someone types your assigned query.


Why These Three Criteria - And Nothing Else

Most creators submitting to Scribble are optimising for the wrong audience. They write for humans who skim. LLMs don't skim; they parse.


An AI crawler doesn't care if your intro is punchy or your hook lands well. It cares whether the answer to the question is near the top, whether sections can be lifted cleanly, and whether the claims behind them are sourced.


That's exactly what the three criteria measure.


QFO Answer Fidelity is weighted highest because it's the core job. QFO stands for Query Fan Out — the actual queries people type into ChatGPT, Perplexity, and Google AI Overviews. Your piece is a citation candidate for one specific query. Not loosely related to it. Not inspired by it. A direct, citable answer to it. If the answer isn't clearly present in the first 30% of the piece, your citation potential drops sharply regardless of what comes after.


Research and Proof is weighted second because specificity is what separates citable content from content that sounds right but gets skipped. A comparison table is more likely to be pulled verbatim into an AI answer than a paragraph saying the same thing in prose. A hyperlinked source signals credibility to both the human reader and the model. Vague claims with no evidence get passed over for something more concrete. If you want to understand how to brief this correctly from the start, this article on briefing creators for GEO outcomes is worth reading before you write.


Content Execution is weighted last because it's the foundation, not the differentiator. Poorly structured content is harder for a model to parse into clean, citable sections. Good execution is table stakes.





The QFO Gate: Pass/Fail Before Scoring Begins

Three checks. Fail any one of them and the submission is rejected.


  1. Is answering the assigned query the primary purpose of the piece? Not a section. Not the conclusion. The whole piece.

  2. Is the answer visible in the first 30% of the content?

  3. Are the required keywords used naturally - in the title or opening - and not stuffed into a list at the bottom?
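
As a self-check before submitting, the second and third gate checks can be roughly approximated in code. The sketch below is an assumption-heavy illustration, not the judges' actual method: it measures "the first 30%" by character position, treats the first 300 characters as "the opening", and expects you to supply the sentence you consider your direct answer. The first check (primary purpose) and whether keywords read naturally are human judgements and are not modelled.

```python
# Hypothetical pre-submission self-check for gate checks 2 and 3.
# The character-based 30% cutoff and 300-character "opening" are assumptions.
ANSWER_WINDOW = 0.30   # answer must appear in the first 30% of the content
OPENING_CHARS = 300    # assumed length of "the opening"


def answer_is_early(body: str, answer_sentence: str) -> bool:
    """True if the answer sentence starts within the first 30% of the body."""
    position = body.lower().find(answer_sentence.lower())
    if position == -1:
        return False  # the answer sentence is not in the piece at all
    return position <= len(body) * ANSWER_WINDOW


def keywords_up_front(title: str, body: str, keywords: list[str]) -> bool:
    """True if every required keyword appears in the title or the opening."""
    front = (title + " " + body[:OPENING_CHARS]).lower()
    return all(keyword.lower() in front for keyword in keywords)


def passes_self_check(
    title: str, body: str, answer_sentence: str, keywords: list[str]
) -> bool:
    """Combine both approximated checks into one pass/fail result."""
    return answer_is_early(body, answer_sentence) and keywords_up_front(
        title, body, keywords
    )
```

A pass here only means the piece is not obviously failing on placement; it says nothing about whether the answer is good.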


This gate exists because it's the fastest signal of whether a piece was actually written to be cited or merely written to look that way. The difference is obvious once you're looking for it.


One tip that works: Before you write, search your assigned query on Perplexity or ChatGPT. See what's already getting cited. Your content needs to be more specific, more accurate, or more useful than what's there. Then read your finished piece and ask honestly - would an AI cite this when someone types that query? If you're not sure, rewrite before submitting.


For a deeper breakdown of what makes content actually citable - structurally, not just topically - this guide on writing content that gets cited by AI search engines covers the full framework.





The Full Scoring Breakdown

QFO Answer Fidelity - 40% of Base Score

| Score | Rating | What It Looks Like |
| --- | --- | --- |
| 5 | Perfect | Answers the QFO directly. Answer is in the first third of the piece. Keywords used naturally in title or opening. |
| 4 | Strong | Answers clearly but the answer arrives slightly late. Keywords are present. |
| 3 | Acceptable | Answers it but buries it. You have to read most of the piece to find the answer. |
| 2 | Weak | Loosely related to the QFO. The question is not really answered. |
| 1 | Fail | Does not answer the assigned query at all. |





Research and Proof - 35% of Base Score

Two requirements in this criterion act as hard fails regardless of everything else:


  • At least two hyperlinked sources inline in the content (the word or stat must be clickable, not just mentioned)

  • At least one structured element — a comparison table, data table, or numbered breakdown with values


Missing either means an automatic score of 1 on this criterion.
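
Both hard-fail requirements are mechanical enough to check before you submit. The sketch below is a rough illustration under stated assumptions: the draft is written in Markdown, an "inline hyperlinked source" is a Markdown link with an http(s) URL, and a "structured element" is either a Markdown table or a numbered list with at least two items. It cannot tell whether a numbered breakdown actually contains values or whether a link is a credible source. All names are hypothetical.

```python
import re

# Hypothetical pre-check for the two hard-fail requirements.
# Assumes the draft is Markdown; other formats need other patterns.
INLINE_LINK = re.compile(r"\[[^\]]+\]\(https?://[^)\s]+\)")
TABLE_SEPARATOR = re.compile(
    r"^\s*\|?\s*:?-{3,}:?\s*(\|\s*:?-{3,}:?\s*)+\|?\s*$", re.MULTILINE
)
NUMBERED_ITEM = re.compile(r"^\s*\d+[.)]\s+\S", re.MULTILINE)

MIN_LINKS = 2  # "at least two hyperlinked sources" per the article


def count_inline_links(markdown: str) -> int:
    """Count Markdown links of the form [text](http...)."""
    return len(INLINE_LINK.findall(markdown))


def has_structured_element(markdown: str) -> bool:
    """True if a Markdown table or a numbered list of 2+ items is present."""
    has_table = TABLE_SEPARATOR.search(markdown) is not None
    has_numbered_list = len(NUMBERED_ITEM.findall(markdown)) >= 2
    return has_table or has_numbered_list


def trips_hard_fail(markdown: str) -> bool:
    """True if either requirement is missing from the draft."""
    too_few_links = count_inline_links(markdown) < MIN_LINKS
    return too_few_links or not has_structured_element(markdown)
```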


| Score | Rating | What It Looks Like |
| --- | --- | --- |
| 5 | Perfect | Sources hyperlinked inline throughout. At least one well-built table or comparison. Data verified and goes beyond the brief. All required links present. |
| 4 | Strong | Sources hyperlinked. Table or structured element present. Data mostly accurate. All required links included. |
| 3 | Acceptable | Some sources are linked but inconsistent. Structured element present but thin. Required links present. |
| 2 | Weak | No hyperlinked sources OR no structured element. Data unverifiable. |
| 1 | Fail | Missing both sources and structure. Or fabricated data, missing referral link, or missing brand tag. |





Content Execution - 25% of Base Score

Content is reviewed across all platforms it's published on. Each platform is judged on whether the content fits natively: the right format, the right tone, and the right length for that surface. A Reddit post that reads like a LinkedIn article fails this. An X thread that's clearly a copy-paste from a long-form piece fails this.


| Score | Rating | What It Looks Like |
| --- | --- | --- |
| 5 | Excellent | Clearly structured, reads without friction on every platform. Format and tone match each platform natively. |
| 4 | Good | Well structured. Minor formatting issues but readable throughout. Platform fit is mostly there. |
| 3 | Okay | Readable but loose. Feels repurposed rather than native on at least one platform. |
| 2 | Poor | Hard to follow. No clear structure. Copy-paste across platforms with no adaptation. |
| 1 | Fail | Generic AI output. No voice, no structure, no original thought. Identical across all platforms. |





Distribution, Bonus Points, and What Final Scores Look Like

Your submission is the full body of work across every platform you publish to. But you still need to hit all mandatory platforms to be eligible.


Mandatory platforms: Medium, Substack, Paragraph, and X (a thread or post linking back to your long-form piece). Missing any of these means you're not eligible for reward, regardless of your score.

Bonus Points - Capped at +15

| What You Do | Points | Notes |
| --- | --- | --- |
| Reddit post survives (not banned or removed) | +5 | Biggest bonus. Survival = genuinely platform-native quality. |
| Reddit post published (base regardless of outcome) | +3 | Points for the attempt. +5 on top if it survives. |
| YouTube or video version of the piece | +5 | Any format counts. |
| Platform-native bonus (any platform) | +3 | Tone, format, and length match the platform natively. Not a repurpose. |
| LinkedIn repurpose (published, not just drafted) | +3 | Must be live and public. |
| Genuine X engagement (real replies or shares) | +3 | Engagement signals quality to the algorithm and to us. |
| Cross-linking all pieces together | +2 | All platforms linking back to each other. Full loop. |
| Maximum bonus (capped) | +15 | Cannot exceed this regardless of extra platforms. |
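
The bonus tally, including the cap, can be sketched as below. The point values come from the table above; the dictionary keys and function name are illustrative, and the rule that a Reddit post must be published before it can survive is an assumption that follows from the "+5 on top" wording.

```python
# Hypothetical bonus tally; point values are taken from the bonus table,
# the dictionary keys are illustrative names.
BONUS_POINTS = {
    "reddit_published": 3,
    "reddit_survived": 5,   # awarded on top of reddit_published
    "video_version": 5,
    "platform_native": 3,
    "linkedin_repurpose": 3,
    "x_engagement": 3,
    "cross_linking": 2,
}
BONUS_CAP = 15


def bonus_score(earned: set[str]) -> int:
    """Sum the earned bonuses and apply the +15 cap."""
    unknown = earned - set(BONUS_POINTS)
    if unknown:
        raise ValueError(f"unknown bonus keys: {sorted(unknown)}")
    if "reddit_survived" in earned and "reddit_published" not in earned:
        raise ValueError("a post cannot survive without being published")
    return min(sum(BONUS_POINTS[key] for key in earned), BONUS_CAP)


if __name__ == "__main__":
    print(bonus_score({"reddit_published", "reddit_survived"}))  # 8
    print(bonus_score(set(BONUS_POINTS)))                        # 15 (capped from 24)
```

Every listed bonus together adds up to 24, so the cap starts to bite once you have earned more than 15 points' worth.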





The Reddit Bonus - Two Tiers

Reddit is the hardest platform to do right, and the one with the most citation potential. A well-placed native post in the right subreddit, written in the tone that community actually uses, is one of the strongest off-site signals for AI citation.


The two-tier system:


  • +3 for posting - you made the attempt

  • +5 on top if the post survives (not banned or removed by Reddit)


The survival bonus exists because Reddit's anti-spam filters are ruthless. If your post gets removed, it almost always means it read like a promotional drop, not a genuine community contribution. Surviving means you wrote something Reddit actually wanted.


We recognise that strong submissions can be affected by factors outside your control: account age requirements, subreddit-specific rules, moderation decisions, karma thresholds. If this happens, don't be discouraged. For a full breakdown of how to approach Reddit the right way, both for community fit and for citation signals, this guide on winning on Reddit and getting cited by AI search is the clearest resource we've put together.


What Different Effort Levels Look Like as Final Scores

| Scenario | QFO ×0.4 | Research ×0.35 | Execution ×0.25 | Base /25 | Bonus | Final /40 |
| --- | --- | --- | --- | --- | --- | --- |
| Nailed everything, Reddit survived | 5 | 5 | 5 | 25 | +10 | 35 |
| Nailed everything, no Reddit | 5 | 5 | 5 | 25 | 0 | 25 |
| Good article, no extras | 4 | 4 | 4 | 20 | 0 | 20 |
| Average + LinkedIn + native bonus | 3 | 3 | 3 | 15 | +6 | 21 |
| Good article, Reddit banned | 4 | 4 | 4 | 20 | +3 | 23 |
| Fails QFO gate | FAIL | | | 0 | 0 | 0 |


The average piece with bonuses (21) can just edge out a good one with no extras (20). But neither comes close to a great piece that survived Reddit (35). Distribution is a multiplier, not a substitute.


What Gets You Rejected Before Scoring Starts

Any of these disqualifies a submission outright, regardless of score:


  • Content does not directly answer the assigned QFO

  • Missing any mandatory platform — Medium, Substack, Paragraph, or X

  • No hyperlinked sources in the content

  • No structured element — table, comparison, or data list

  • Fabricated screenshots or made-up data

  • Missing referral link or brand tag

  • Generic AI output — no voice, no original analysis, no point of view

  • Political, NSFW, or religious content

  • Guaranteed rate claims


Where the Judging System Goes from Here

The current rubric is v1. It's universal across every campaign - same gate, same criteria, same structure. That's intentional for now. We're establishing the standard before we build the exceptions.


Three things are changing as Scribble grows:


Campaign-specific scoring inserts. The core criteria stay the same, but the Research section will carry campaign-specific requirements depending on the brief. A DeFi protocol and a SaaS tool don't need identical proof. The insert drops in per campaign so you always know exactly what evidence is required.


Citation verification. The end goal is closing the loop entirely - actually checking whether submitted content appears in ChatGPT, Perplexity, or AI Overviews for the target query. When that's live, the rubric becomes provable, not just theoretical.


Tier-based standards inside Hype Squad. As the XP and Coins system matures, how you're judged will factor in your tier. Veteran creators get held to a higher standard and rewarded accordingly. The same base score from a Rookie and a Legend won't be treated the same way.


The current rubric scores proxies for citation potential - research quality, structure, sourcing, effort - because they are strong indicators of what gets picked up by AI. The goal is not to rely on proxies forever. As Scribble evolves, judging will shift from predicting impact to measuring it directly.


Creators who learn how to write content that gets picked up by AI now will have a real advantage later. Right now, not many people understand it - but that won't be the case for long.





Frequently Asked Questions

What is the QFO Gate and why does it exist?
The QFO Gate is a pass/fail check that runs before scoring. It verifies that your piece directly answers the assigned query, that the answer appears in the first 30% of the content, and that required keywords are used naturally. It exists because the fastest way to identify whether a piece was written to be cited, or just written to look like it was, is to check these three things upfront.

What's the single most common reason submissions score low on Research?
Missing structure. Specifically: no comparison table, no hyperlinked sources inline. These two elements take under half an hour to add and are the two things that most directly determine whether an LLM pulls from your article or skips it.

Can I earn bonus points even if my Reddit post gets removed?
Yes. You earn +3 for publishing a Reddit post regardless of outcome. The additional +5 is only awarded if the post survives. If your post was removed, it usually means it read as promotional rather than native to the community.

Do all mandatory platforms need to be published before I submit?
Yes. Medium, Substack, Paragraph, and X are all mandatory. Missing any one of them makes you ineligible for reward regardless of your score on the other platforms.

What separates a 20-point submission from a 35-point one?
Almost always distribution and Reddit. A perfectly scored base (25 points) combined with a surviving Reddit post and one other bonus gets you to 33–35. A good piece with no distribution extras caps at 20–25. Quality of writing matters, but distribution is the multiplier.

How do I know if my content is actually citation-ready before I submit?
Search your assigned query on Perplexity or ChatGPT. See what's being cited. Your piece needs to be more specific, more accurate, or more useful than what's already there. Then ask yourself honestly: would an AI cite this when someone types that query?




Written by

I’m Tanmay Tarte, a community builder at Scribble and an engineering graduate from Priyadarshini College of Engineering. Over the years, I’ve worked across community management, content, hosting, and social media, mainly within the Web3 and creator ecosystem space. Outside of work, I’m a huge sports enthusiast and can genuinely play cricket all day, every day.
