Public methodology · v1.0

How Newjee actually works.

Newjee maps how editorial blocs cover the same reality. We don't decide what's true — we measure how interpretations diverge, attach confidence to every claim, and make the pipeline auditable. This page describes the system end to end. The detailed mathematics live in companion documents linked below.

The pipeline

Six stages, each producing an auditable artifact. Skip directly to the one you care about: the Suite surfaces every stage as a navigable view.

01
Signals — what enters the system
Articles, social posts, market events, transcripts. Every signal is fetched, the raw text is preserved unchanged, and we record the source identifier, publication time, and content hash. Signals are immutable; we only ever recompute things derived from them.
See the live feed
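As a minimal sketch of what an immutable signal record could look like: the field names, the `make_signal` and `is_intact` helpers, and the choice of SHA-256 are all assumptions for illustration, not Newjee's published schema.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class Signal:
    source_id: str          # hypothetical name for the source identifier
    published_at: datetime  # publication time
    raw_text: str           # preserved exactly as fetched
    content_hash: str       # digest of raw_text (SHA-256 is an assumption)


def make_signal(source_id: str, published_at: datetime, raw_text: str) -> Signal:
    """Build a signal and record the hash of its unmodified text."""
    digest = hashlib.sha256(raw_text.encode("utf-8")).hexdigest()
    return Signal(source_id, published_at, raw_text, digest)


def is_intact(signal: Signal) -> bool:
    """Recompute the hash to check that the stored text has not changed."""
    digest = hashlib.sha256(signal.raw_text.encode("utf-8")).hexdigest()
    return digest == signal.content_hash


signal = make_signal(
    "example-source",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    "Example article body.",
)
print(is_intact(signal))
```

Anything derived from a signal (vectors, bloc membership, scores) would live in separate records keyed by `content_hash`, so it can be recomputed without touching the signal itself.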
02
Epistemic vectors — turning prose into a 15-dim coordinate
For each signal we compute a vector across 15 features: 5 framing axes (conflict, responsibility, morality, economic, human-interest), 4 editorial dimensions (bias intensity, sensationalism, emotional appeal, political framing), and 6 manipulation indicators (cherry-picking, false equivalence, appeal to authority, loaded language, omission, framing bias). Every score is paired with an evidence span (a verbatim quote) and a per-feature confidence.
See the Reality Check
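One possible way to hold such a vector is sketched below. The 15 feature names come from the description above; the identifier spellings, the `FeatureScore` type, and the assumption that scores and confidences lie in [0, 1] are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, List

FRAMING_AXES = ["conflict", "responsibility", "morality", "economic", "human_interest"]
EDITORIAL_DIMENSIONS = [
    "bias_intensity", "sensationalism", "emotional_appeal", "political_framing",
]
MANIPULATION_INDICATORS = [
    "cherry_picking", "false_equivalence", "appeal_to_authority",
    "loaded_language", "omission", "framing_bias",
]
# Fixed feature order, so every signal maps to the same 15-dim coordinate.
FEATURES = FRAMING_AXES + EDITORIAL_DIMENSIONS + MANIPULATION_INDICATORS


@dataclass(frozen=True)
class FeatureScore:
    score: float        # assumed to lie in [0, 1]
    confidence: float   # per-feature confidence, assumed to lie in [0, 1]
    evidence_span: str  # verbatim quote from the signal text


def to_coordinate(scores: Dict[str, FeatureScore]) -> List[float]:
    """Return the scores in canonical feature order, failing on gaps."""
    missing = [name for name in FEATURES if name not in scores]
    if missing:
        raise ValueError(f"missing features: {missing}")
    return [scores[name].score for name in FEATURES]
```

Keeping the evidence span and confidence next to each score, instead of in a parallel array, makes it hard to surface a number without the quote that justifies it.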
03
Blocs — grouping signals editorially or semantically
Two flavors: media blocs are seeded from a curated source list (mainstream Western, state-aligned, US right, US left, etc.); social clusters are found with HDBSCAN in the embedding space of social posts. Each bloc carries its own density matrix ρ summarizing how it covers the topic.
See the phenomena
04
Divergence — where blocs disagree, and on which axes
For any pair of blocs we compute the trace distance D(ρ_A, ρ_B) and the rank-2 spectral decomposition of the difference ρ_A − ρ_B. In almost all cases two axes capture the disagreement, typically a frame axis and a manipulation axis. Optional inferential validation runs a permutation null test, bootstrap confidence intervals, and intra-bloc baselines to verify that the gap is structural and not a partition artifact.
See the Reality Field
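The sketch below illustrates the trace distance using the standard definition, D(ρ_A, ρ_B) = ½ Σ|λ_i|, where λ_i are the eigenvalues of ρ_A − ρ_B. How Newjee builds ρ from a bloc's vectors is not stated on this page, so `density_matrix` is an assumption (sum of outer products scaled to unit trace). The Jacobi eigenvalue routine is a stand-in chosen to keep the example dependency-free, and `divergence` is a hypothetical helper name.

```python
import math


def density_matrix(vectors):
    """ASSUMPTION: rho is the sum of outer products v v^T, scaled to trace 1."""
    n = len(vectors[0])
    rho = [[0.0] * n for _ in range(n)]
    for v in vectors:
        for i in range(n):
            for j in range(n):
                rho[i][j] += v[i] * v[j]
    trace = sum(rho[i][i] for i in range(n))
    if trace == 0:
        raise ValueError("all vectors are zero")
    return [[x / trace for x in row] for row in rho]


def symmetric_eigenvalues(matrix, sweeps=100, tol=1e-20):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # apply the rotation to columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # then to rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [a[i][i] for i in range(n)]


def divergence(rho_a, rho_b):
    """Return (trace distance, share of it carried by the two largest |eigenvalues|)."""
    n = len(rho_a)
    delta = [[rho_a[i][j] - rho_b[i][j] for j in range(n)] for i in range(n)]
    magnitudes = sorted((abs(x) for x in symmetric_eigenvalues(delta)), reverse=True)
    total = sum(magnitudes)
    top2_share = sum(magnitudes[:2]) / total if total > 0 else 0.0
    return 0.5 * total, top2_share
```

A `top2_share` close to 1 would correspond to the case described above, where two axes account for nearly all of the disagreement. Recovering which axes those are would additionally require the eigenvectors, which this sketch does not return.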
05
Trust scoring — actor-level signal over time
Each source, journalist, or actor accumulates a domain-specific trust score: a time-decayed, weighted composite of factual reliability, source quality, correction behavior, framing stability, manipulation risk, and (capped) user signal. The user-vote contribution is hard-capped at 10% to keep the system from collapsing into a popularity contest.
See the media intelligence
06
Forks & discussion — analyses are social objects
Phenomena can be forked GitHub-style. Each fork is a derivative analysis with declared changes. A plain-text discussion thread sits at the bottom; "useful" is binary (no upvotes), and only structured anchored evidence influences trust. Newjee deliberately does not implement merging — convergent reality is a separate, statistical artifact.

Before the Suite

The Reality Card — single-article analysis

The Reality Card is what made Newjee a tool before it was a system. You paste a URL or a keyword and get back a structured card: claims with verification status, source-reliability signals, framing breakdown, manipulation indicators, and propagation patterns. Same engine the Suite uses, surfaced as a one-shot analyst experience.

Claim extraction. An LLM pass identifies propositional claims and tags each with a status: verified, unverified, contradicted by other coverage, or unverifiable.
Source reliability signal. The publisher is looked up against the Source profile (cross-checked with the Trust Model). The card surfaces the publisher's health score and the per-domain decomposition relevant to the topic of the specific article you submitted.
Framing & manipulation indicators. The same 15-dim epistemic vector used at the bloc level is computed for the single article. Loaded language, omission, cherry-picking, framing bias, sensationalism — surfaced as evidence spans (verbatim quotes from the text).
Propagation snapshot. If the article has been shared on social, we surface a propagation pattern severity tier (observed / notable / anomaly) and cross-platform spread.
Per-feature confidence. Every score carries its own confidence. A high score with low confidence is rendered differently from a high score with high confidence.
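
The card fields above can be sketched as a few small types. The status and tier values are taken from the list; the `render_style` helper, its style labels, and its 0.7 cut-offs are hypothetical and only illustrate how a score and its confidence could jointly drive presentation.

```python
from enum import Enum


class ClaimStatus(Enum):
    VERIFIED = "verified"
    UNVERIFIED = "unverified"
    CONTRADICTED = "contradicted by other coverage"
    UNVERIFIABLE = "unverifiable"


class PropagationTier(Enum):
    OBSERVED = "observed"
    NOTABLE = "notable"
    ANOMALY = "anomaly"


def render_style(score: float, confidence: float,
                 high_score: float = 0.7, high_confidence: float = 0.7) -> str:
    """Pick a display style; the cut-offs and labels are illustrative assumptions."""
    if score < high_score:
        return "normal"
    # A high score is only shown as strong when its confidence is also high.
    return "strong" if confidence >= high_confidence else "tentative"
```

For example, `render_style(0.9, 0.3)` yields a different style from `render_style(0.9, 0.9)`, which is the distinction the last bullet describes.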

That single card is the atomic unit. The Suite stacks hundreds or thousands of these cards into phenomena, computes density matrices per editorial bloc, and measures how blocs diverge in treating the same reality. Both surfaces share the same engine.

Try a Reality Card

And alongside the Reality Card

Media Entity Profiles — how a single publisher is scored

Beyond per-article cards, Newjee aggregates every signal a given source has contributed across all phenomena it has appeared in. The result is a domain-decomposed Trust Model v1.0 score: not one number for "the Times", but separate scores on politics, economics, science, technology, culture, war, health, and climate — each with its own sample size and confidence.

Six weighted components: factual reliability (30%), source quality (15%), correction behavior (15%), framing stability (15%), manipulation risk (15%), user signal (10%, capped). The full formula is on the Trust Model methodology page below. Crucially: any publisher can contest its score with evidence from its own profile page; Newjee treats the request as input to the model, not as a verdict override.
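
To make the weighting concrete, here is a minimal sketch of a composite using the six weights above. It assumes every component is already normalized to [0, 1] with higher meaning better (so manipulation risk would be supplied inverted), and it models time decay as an exponential half-life over past observations. Both of those choices, along with the function names, are assumptions; the actual formula and decay constants are on the Trust Model methodology page.

```python
WEIGHTS = {
    "factual_reliability": 0.30,
    "source_quality": 0.15,
    "correction_behavior": 0.15,
    "framing_stability": 0.15,
    "manipulation_risk": 0.15,  # assumed inverted upstream: higher = lower risk
    "user_signal": 0.10,        # the 10% weight doubles as the hard cap
}


def clamp01(x: float) -> float:
    """Keep a component inside [0, 1] so no input can exceed its weight."""
    return max(0.0, min(1.0, x))


def trust_score(components: dict) -> float:
    """Weighted composite of the six components for one domain."""
    return sum(w * clamp01(components[name]) for name, w in WEIGHTS.items())


def decayed_mean(observations, half_life_days: float) -> float:
    """Average (value, age_in_days) pairs, halving a pair's weight every half-life.

    Exponential decay is an assumption; the page only says scores are time-decayed.
    """
    weights = [0.5 ** (age / half_life_days) for _, age in observations]
    total = sum(weights)
    if total == 0:
        raise ValueError("no usable observations")
    return sum(w * value for w, (value, _) in zip(weights, observations)) / total
```

Because each component is clamped before weighting, even an extreme user signal can move the composite by at most 0.10, which matches the cap described above.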

We publish the math, the schema, the formulas, the weights, and the decay constants. Calibration choices and prompt internals stay confidential. Anyone reproducing the published methodology on their own corpus reaches analogous results — that's the point.

Companion documents

Researchers — join the work

We collaborate with researchers and labs working on information disorder, computational journalism, narrative dynamics, and epistemic measurement. If you have a hypothesis Newjee's data could test — or if your methodology could improve ours — we want to talk.

Access to a sliced Newjee corpus for academic publication
Co-author opportunities on methodology updates
Joint piloting of new domain extractors (e.g., scientific claims, market correlations)
Critique of the Trust Model — ρ-floor calibration, weight rebalancing, anti-gaming red-teaming

Email hello@kakashi.ventures
See the system live

This methodology document is versioned. Breaking changes increment the version number and ship with a public changelog.
