I Built an AI Agent Swarm That Writes Upwork Proposals — And Gates Scam Jobs Automatically

Every freelancer on Upwork knows the grind: skim 20 job posts, write 10 proposals, win 1 project. Most of that time is wasted on low-quality jobs, vague requirements, or outright scams. The proposals you do send are often rushed, generic, and evidence-free because good proposal writing takes 20 minutes you don't always have.

I got tired of this. So I built a tool.

It's a Claude Code agent swarm that takes a job description — paste or URL — and returns a ready-to-send proposal that is quality-gated, evidence-backed, and style-checked. No generic openings. No em-dashes. No banned filler words. And crucially: if the job is a scam or a bad fit, it tells you why and stops before wasting anyone's time drafting.

Here's exactly how it works.

The Problem With Writing Upwork Proposals Manually

Writing a good proposal requires three things most freelancers don't do consistently:

Screening the job first. Is this client legit? Is the budget realistic? Does the scope actually match your skills? Most people skip this and find out the hard way.

Citing real evidence. "I have experience with X" loses to "I shipped X for a client and it reduced Y by 40%." Evidence means specific project outcomes, not skill names.

Following a tight style. Clients skim on mobile. Long openers, passive voice, corporate filler ("I am excited to apply..."), and em-dashes signal a template, not a person. These kill your conversion rate.

Doing all three, every time, is slow. Automating it without sacrificing quality is the challenge.

The Architecture: A Five-Agent Pipeline

The tool runs as a Claude Code skill (/proposal). You paste a job description or drop an Upwork URL, and a swarm of specialized subagents handles the rest.

/proposal <job text or URL>
        │
        ▼
[ jd-analyzer ]            ──┐
[ legitimacy-fit-scorer ]  ──┘ (run in parallel)
        │
   GATE DECISION
   legit >= 5 AND fit >= 5?
        │ no  → report reason, stop
        │ yes ↓
[ project-retriever ]
        │
┌──── judge loop (max 3 rounds) ────┐
│ [ proposal-writer ] → draft-vN    │
│ [ proposal-judge ]  → critique    │
│ score >= 6 + no violations?       │
└───────────────────────────────────┘
        │
final.md → print to user

Each agent has a strict JSON contract. The orchestrator validates every artifact, retries malformed output, and degrades gracefully (e.g., no matching project → uses profile-only proof, but flags it).
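The validate-and-retry step can be sketched roughly as follows. This is illustrative Python, not the repo's orchestrator: the contract keys, the `run_agent` callable, and the attempt limit are all assumptions made for the example.

```python
import json
from typing import Callable

# Hypothetical contract: keys and types the orchestrator might expect in score.json.
SCORE_CONTRACT = {"legitimacy": (int, float), "fit": (int, float), "reasons": (list,)}


def parse_artifact(raw: str, contract: dict) -> dict:
    """Parse raw agent output and check it against a key/type contract."""
    data = json.loads(raw)  # malformed JSON raises JSONDecodeError, a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("artifact must be a JSON object")
    for key, types in contract.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], types):
            raise ValueError(f"wrong type for key: {key}")
    return data


def run_with_retries(run_agent: Callable[[], str], contract: dict, max_attempts: int = 3) -> dict:
    """Call the agent until its output satisfies the contract or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_artifact(run_agent(), contract)
        except ValueError as err:
            last_error = err  # remember why this attempt failed, then retry
    raise RuntimeError(f"agent output never satisfied the contract: {last_error}")
```

Keeping the check separate from the retry wrapper means the same contract test can run on any artifact, whichever agent produced it.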

Agent 1 & 2 — Analyzer + Legitimacy Scorer (Parallel)

The first two agents run simultaneously to save time.

The jd-analyzer extracts structured data from the raw job text: required keywords, hard requirements, budget signals, client name, preferred tone, and recommended proposal length.

The legitimacy-fit-scorer independently evaluates two things:

•Legitimacy (0–10): Is this job real? Red flags include asking to move off Upwork immediately, a vague scope paired with an unrealistic budget, no payment history, and a client account created a day ago.
•Fit (0–10): Does this job match your actual skills and project history? A LangGraph job shown to a Python developer with no LLM experience should score low, and you should not bid on it.

Both scores must clear their thresholds (default: 5/10) for the pipeline to continue. Otherwise, it surfaces the reasons and stops. No wasted draft.
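The gate itself is a small piece of logic. A minimal sketch, assuming the scorer's output has already been reduced to two numbers (the `GateResult` type and the reason wording are made up for illustration):

```python
from dataclasses import dataclass, field


@dataclass
class GateResult:
    proceed: bool
    reasons: list = field(default_factory=list)


def gate(legit: float, fit: float, legit_min: float = 5, fit_min: float = 5) -> GateResult:
    """Continue only when both scores clear their thresholds (default 5/10)."""
    reasons = []
    if legit < legit_min:
        reasons.append(f"legitimacy {legit} is below threshold {legit_min}")
    if fit < fit_min:
        reasons.append(f"fit {fit} is below threshold {fit_min}")
    # An empty reasons list means both checks passed.
    return GateResult(proceed=not reasons, reasons=reasons)
```

For example, `gate(8, 9)` lets the pipeline continue, while `gate(3, 9)` stops it with one reason to show the user.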

This gate alone is worth the setup cost. Spending 20 minutes writing a proposal for a scam job or a poor-fit client is 20 minutes you never get back.

Agent 3 — Project Retriever

The retriever reads your knowledge/projects/ folder — one markdown file per real project you've shipped, with a concrete Outcome line — and ranks them against the job's requirements.

It returns the top matches with evidence snippets: specific metrics and outcomes the writer can cite. This is what separates "I have experience with RAG" from "I shipped a LangGraph RAG pipeline that cut retrieval latency by 60% with zero hallucination on structured data."

The quality of your knowledge base directly determines the quality of your proposals. Thin files → generic proposals. Specific outcome lines → compelling evidence.
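To make the ranking idea concrete, here is a rough sketch based on plain keyword overlap. The actual retriever is an LLM subagent, so this is a stand-in technique, and both the `Outcome:` line format and the function names are assumptions.

```python
import re


def extract_outcome(markdown: str) -> str:
    """Return the text of the first line starting with 'Outcome:', or '' if none."""
    for line in markdown.splitlines():
        if line.strip().lower().startswith("outcome:"):
            return line.split(":", 1)[1].strip()
    return ""


def rank_projects(projects: dict, keywords: list, top_k: int = 3) -> list:
    """Rank project files by how many single-word job keywords appear in their text."""
    ranked = []
    for name, text in projects.items():
        words = set(re.findall(r"[a-z0-9+#]+", text.lower()))
        hits = [kw for kw in keywords if kw.lower() in words]
        if hits:  # projects with no overlap are dropped entirely
            ranked.append({"project": name, "matched": hits, "evidence": extract_outcome(text)})
    ranked.sort(key=lambda item: len(item["matched"]), reverse=True)
    return ranked[:top_k]
```

A project file without an Outcome line still ranks, but it comes back with empty evidence, which is the "thin file" case described above.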

Agent 4 — Proposal Writer (in a Judge Loop)

The writer receives the job analysis, retrieved evidence, and the full style rulebook, then drafts the proposal.

A lint hook fires automatically on every draft write, checking for:

•em-dashes and en-dashes (banned — they read as AI)
•a list of ~30 filler phrases ("passionate", "I am excited to", "leverage", "seamlessly", etc.)
•missing client-name greeting when the client name is known
•missing call to action

Any violation blocks the draft from advancing.
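The four checks above could be approximated like this. It is a sketch under assumptions: the banned list is a four-item subset of the roughly 30 phrases, and the call-to-action heuristic is invented for the example.

```python
import re
from typing import Optional

# Illustrative subset; the full list has roughly 30 phrases.
BANNED_PHRASES = ["passionate", "i am excited to", "leverage", "seamlessly"]
# Assumed call-to-action cues; the real rule may differ.
CTA_PATTERN = re.compile(r"\?|\b(call|chat|reply|message|let me know)\b", re.IGNORECASE)


def lint_draft(draft: str, client_name: Optional[str] = None) -> list:
    """Return a list of style violations; an empty list means the draft may advance."""
    violations = []
    text = draft.strip()
    if "\u2014" in text or "\u2013" in text:  # em-dash or en-dash
        violations.append("contains an em-dash or en-dash")
    lowered = text.lower()
    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            violations.append(f"banned phrase: {phrase}")
    if client_name:
        first_line = text.splitlines()[0] if text else ""
        if client_name.lower() not in first_line.lower():
            violations.append("greeting does not mention the client name")
    last_paragraph = text.split("\n\n")[-1]
    if not CTA_PATTERN.search(last_paragraph):
        violations.append("no call to action in the closing paragraph")
    return violations
```

Returning every violation at once, rather than stopping at the first, gives the writer a complete fix list for the next round.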

Then the proposal-judge scores the draft on a 0–10 scale and produces structured critique: what works, what weakens it, and specific line-level rewrites. If the overall score is below 6 or any style rule is violated, the writer runs again with the critique as input, for up to 3 rounds in total.
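The write-and-judge cycle can be sketched as a bounded loop. The `write` and `judge` callables stand in for the two subagents, and the verdict keys are assumptions. Returning the last draft marked as not accepted when all rounds fail is also an assumption, since the behaviour after a third failed round is not described here.

```python
from typing import Callable


def judge_loop(write: Callable[[str], str], judge: Callable[[str], dict],
               min_score: float = 6, max_rounds: int = 3) -> dict:
    """Alternate writer and judge until a draft passes or the rounds run out."""
    critique = ""
    result = {}
    for version in range(1, max_rounds + 1):
        draft = write(critique)   # the writer sees the previous critique ("" on round 1)
        verdict = judge(draft)    # assumed keys: score, violations, critique
        accepted = verdict["score"] >= min_score and not verdict["violations"]
        result = {"draft": draft, "version": version,
                  "score": verdict["score"], "accepted": accepted}
        if accepted:
            break
        critique = verdict["critique"]
    return result
```

The hard cap on rounds keeps cost predictable: a draft that cannot pass after three attempts usually points to weak evidence, not weak wording.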

The Learning Loop

After you send a proposal, log the outcome:

/proposal-outcome react-dashboard-2026-05-30 hired

On a replied, interview, or hired outcome, the system appends what worked to memory/winning-patterns.md — the hook angle, evidence cited, tone, length. The writer reads this file on every future run. The system improves as you use it.
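A minimal sketch of that append step, assuming a simple markdown entry per win. The entry layout and the `pattern` fields are hypothetical; only the three positive outcomes and the file path come from the description above.

```python
from pathlib import Path

POSITIVE_OUTCOMES = {"replied", "interview", "hired"}


def record_outcome(run_id: str, outcome: str, pattern: dict, memory_file: Path) -> bool:
    """Append a winning-pattern entry for positive outcomes; return True if written."""
    if outcome not in POSITIVE_OUTCOMES:
        return False  # nothing is learned from other outcomes in this sketch
    lines = [f"## {run_id} ({outcome})"]
    lines += [f"- {key}: {value}" for key, value in pattern.items()]
    memory_file.parent.mkdir(parents=True, exist_ok=True)
    with memory_file.open("a", encoding="utf-8") as handle:  # append, never overwrite
        handle.write("\n".join(lines) + "\n\n")
    return True
```

Calling `record_outcome("react-dashboard-2026-05-30", "hired", {"hook": "metric first"}, Path("memory/winning-patterns.md"))` would add one entry that later runs can read.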

What You Actually Get Back

Running /proposal on a job returns:

•The final proposal text, ready to copy into Upwork.
•A one-line score summary: legit=8 fit=9 | proposal overall=7.5 (v2, 2 iters)
•The folder path with all intermediate artifacts.

The folder for each run contains:

File            What's in it
job.md          the original job description
analysis.json   extracted keywords, requirements, tone
score.json      legitimacy + fit scores and reasoning
retrieval.json  ranked projects + evidence snippets
draft-vN.md     each writer iteration
judge-vN.json   score + critique per draft
final.md        the accepted proposal
meta.json       scores, iterations, and outcome tracking

The Vision: Input a URL, Skip Paste Entirely

The tool also supports passing an Upwork job URL directly:

/proposal https://www.upwork.com/jobs/~021...

It drives your real, logged-in Chrome using Playwright — not headless, not a bot, your actual browser with your saved session. It screenshots the job page, then a vision subagent transcribes the image to markdown. No brittle DOM scraping. When Upwork updates its markup (and it will), nothing breaks.

Setup: 15 Minutes, Then It Runs Itself

1. Clone the repo and run npm install (installs Playwright + Chrome).
2. Fill in knowledge/profile.md (your headline, skills, rate, availability).
3. Add one .md file per past project to knowledge/projects/. Include a concrete Outcome line with a metric. That's the evidence the writer cites.
4. Open Claude Code in the project directory and run /proposal.

The knowledge/rates.md file lets the scorer evaluate budget realism. A job posting $30/hr when your rate is $75/hr is a fit miss — the system catches it.
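One way to picture the budget check is a simple floor comparison. The scorer is an LLM agent, so this is only an analogy in code: the `tolerance` parameter and its default of 0.8 are assumptions, not values from the repo.

```python
def budget_fit(job_rate: float, your_rate: float, tolerance: float = 0.8) -> dict:
    """Flag a job whose hourly rate falls too far below your own rate."""
    floor = your_rate * tolerance  # lowest rate treated as acceptable in this sketch
    ok = job_rate >= floor
    reason = "" if ok else f"job pays ${job_rate:g}/hr, below your floor of ${floor:g}/hr"
    return {"ok": ok, "reason": reason}
```

With these defaults, the $30/hr job against a $75/hr rate fails the check, while a $70/hr job passes.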

Why Claude Code Agents, Not a Web App

I could have built a web form with an LLM API call behind it. I didn't because:

•Full context access. Claude Code agents can read your entire project knowledge base, past proposals, and winning patterns as part of each run.
•File-based state. Every artifact is a file. You can inspect, edit, and resume any step. Nothing is locked in a database.
•Composable. Each agent does one thing with a strict JSON contract. Swapping out the writer model or changing judge thresholds is a config edit, not a deploy.
•Local. Your profile, rates, and proposal history stay on your machine.

The Repo

The project is open source. The agent definitions, skill orchestrator, style rules, and Playwright fetch script are all in .claude/. You bring your own knowledge/ (profile + projects) and it works for your stack, your rate, your domain.

github.com/bilaltahseen/upwork-proposal-generator

If you're a freelancer spending more than 30 minutes a day on proposals, this pays for the setup time in the first week.