Interesting data leak that happened recently, and the ethical paradox of open-source AI

gooner23
Summary
Mercor, the Peter Thiel-funded data annotation company that directly provides RLHF data (effectively trade secrets) to OpenAI, Anthropic, and Google, got breached. 4TB might not seem like a lot, but the haul auctioned by the Lapsus$ extortion group reportedly included:

Evaluation Rubrics: The exact grading sheets and internal rulebooks OpenAI and Anthropic give to experts to teach the AI logic and safety.
Prompts & Answers: The flawless, expert-written source code and reasoning chains used to fine-tune the models.

Could potentially jailbreak recent models with this information

Meta has indefinitely paused all work with Mercor. OpenAI started its own review. Anthropic has not publicly commented on its exposure. Google is understood to be assessing the breach’s scope.

What more could they have gotten?


How it happened
  • The Initial Vector (March 19, 2026): TeamPCP compromised Trivy, an open-source vulnerability scanner maintained by Aqua Security that is used by thousands of development teams. The hackers poisoned Trivy's GitHub Actions, effectively turning a widely trusted security scanner into a credential-stealing malware tool.


  • The LiteLLM Compromise (March 24, 2026): LiteLLM, a massive open-source AI gateway with millions of downloads, used Trivy in its own CI/CD security pipeline. When LiteLLM ran a routine automated security scan, TeamPCP's malware executed and stole LiteLLM's PyPI (Python Package Index) publishing tokens.


  • The Payload: Armed with those tokens, TeamPCP published malicious updates of LiteLLM (versions 1.82.7 and 1.82.8). When developers or automated systems pulled the latest LiteLLM package, it installed deeply embedded malware that swept their host machines for cloud credentials, API keys, .env files, and Kubernetes secrets.
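If you pull LiteLLM anywhere in your stack, the first sanity check is whether one of the poisoned builds ever landed in your environment. A minimal sketch in Python, assuming the version numbers reported above (1.82.7 and 1.82.8) are accurate and that the package is installed under its usual litellm distribution name:

```python
# Check the installed LiteLLM version against the reportedly compromised
# releases. The bad version numbers come from the reporting quoted above
# and are not independently confirmed here.
from importlib.metadata import PackageNotFoundError, version

REPORTED_BAD_VERSIONS = {"1.82.7", "1.82.8"}

try:
    installed = version("litellm")
except PackageNotFoundError:
    print("litellm is not installed in this environment")
else:
    if installed in REPORTED_BAD_VERSIONS:
        print(f"WARNING: litellm {installed} matches a reportedly compromised release")
        print("Rotate cloud credentials, API keys, .env secrets, and kube configs")
        print("reachable from any machine that installed it.")
    else:
        print(f"litellm {installed} is not one of the reported bad versions")
```

Note this only checks the current environment; every CI runner and container that pulled latest during the window would need the same check, plus credential rotation regardless.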
We now have a full client-side Claude Code leak from Anthropic themselves, plus the training data pipeline, evaluations, and prompts. What more do we need, other than the cost of the infrastructure, to have our own AI companies :feelsgood:

I don't think the Claude Code leak was that significant, although it did let open source developers create interesting tools like claw code.

Should startups that handle this level of power even be using open source dependencies, and how do you even handle something like this? (One standard mitigation is sketched below.)





^ Now verified by audits
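On the "how do you even handle something like this" question: the standard mitigation for a hijacked release is hash pinning. You record the artifact digest when you review a dependency, and installs refuse anything that does not match, so a maintainer token stolen later cannot silently swap the bytes under the same version. pip supports this natively (requirements files installed with --require-hashes); the sketch below just shows the underlying idea in plain Python. The file name and digest are made-up placeholders, not real LiteLLM artifacts.

```python
# The idea behind hash-pinned installs: trust a digest recorded at review
# time, not whatever the package index serves today. EXPECTED holds
# placeholder values for illustration only.
import hashlib
import sys
from pathlib import Path

EXPECTED = {
    # filename -> sha256 recorded when the artifact was reviewed (placeholder)
    "example-1.0.0-py3-none-any.whl":
        "0000000000000000000000000000000000000000000000000000000000000000",
}

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path: Path) -> bool:
    want = EXPECTED.get(path.name)
    if want is None:
        print(f"{path.name}: no recorded hash, refusing to trust it")
        return False
    got = sha256_of(path)
    if got != want:
        print(f"{path.name}: HASH MISMATCH (got {got})")
        return False
    print(f"{path.name}: ok")
    return True

if __name__ == "__main__":
    results = [verify(Path(p)) for p in sys.argv[1:]]
    sys.exit(0 if all(results) else 1)
```

The trade-off: pinned hashes only protect teams that pin and re-review on every upgrade, and they do nothing about the upstream CI compromise itself, which is why the "should we even use open source dependencies" question has no clean answer.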
 
Reactions (+1): kisslessvirgin and masai jumps enjoyer
Mercor engineers are paid 800k+ right out of college btw
 
Reactions (+1): kisslessvirgin
i will be so so surprised if anyone actually reads this
 
Reactions (+1): kisslessvirgin and RichardSpencel
interesting
 
Not that important tbh
 
will read later
 
Not that important tbh
I mean it's probably a first of its kind. data leaks usually only help me get Papa John's for cheap ngl. But how do we value what they actually got, when anything AI touches is valued in the trillions?
 
Reactions (+1): Pay
I mean it's probably a first of its kind. data leaks usually only help me get Papa John's for cheap ngl. But how do we value what they actually got, when anything AI touches is valued in the trillions?
i feel like it's already over, most kid billionaires legit just used GPT but with a learned model.
i think most facets have already been looked through.
 
Reactions (+1): gooner23

omg so much reading and useless info gosh
 
Reactions (So Sad): gooner23
omg so much reading and useless info gosh
I mean it was in the title
 
