WeightBreaker

Find hidden model behaviors. Prove your result.

Browse active challenges, inspect model details, submit trigger phrases, and follow solved outcomes with transparent status updates.

Published rewards

0.00

Open bounties

Recently solved

Step 1

Pick a funded bounty

Choose a live challenge with a public wallet and visible on-chain balance.

Step 2

Download the model and find the key

Analyze the model artifacts to uncover the secret phrase or private key tied to the wallet.

Step 3

Claim directly on-chain

Use discovered credentials to move funds; solved outcomes are reflected in the transparency log.

Active bounties

Click a row to view full details and submission controls.

Status	Model	Reward	Category	Details
No active bounties right now. Check back soon.

Transparency log · Last 5 solved models

No solved bounties yet.

About WeightBreaker

Mission Statement

To secure the future of Artificial Intelligence by stress-testing the very fabric of neural networks.

What is WeightBreaker?

WeightBreaker is a Neural Forensic Arena designed for AI researchers, red-teamers, and cybersecurity enthusiasts. Unlike traditional "black-box" testing such as prompt injection, we focus on White-Box Security: the study of how secrets and backdoors can be embedded directly into model weights.

In our arena, the "vault" is the model itself. We host Large Language Models (LLMs) that have been intentionally "poisoned" with a hidden piece of information, usually a private key to a cryptocurrency wallet.

Why Weight Security Matters

As LLMs move from simple chatbots to autonomous agents with access to sensitive data, the integrity of their weights becomes a primary security frontier.

Neural Backdoors: Can a model be trained to leak data only when a specific, rare trigger is provided?
Model Poisoning: How easily can an adversary hide malicious instructions within billions of parameters?
Data Extraction: Is it possible to reverse-engineer training data or "hard-coded" secrets from a .safetensors or .gguf file?

We provide the playground to answer these questions with real stakes.
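
As a first step toward the data-extraction question above, the layout of a .safetensors file can be inspected without loading any weights. The format begins with an 8-byte little-endian header length, followed by a JSON header that lists each tensor's dtype, shape, and byte offsets. The sketch below parses that header using only the Python standard library; the function names and the tiny in-memory demo file are illustrative assumptions, not part of WeightBreaker's tooling.

```python
import json
import struct

def read_safetensors_header(blob: bytes) -> dict:
    """Parse the JSON header from the raw bytes of a .safetensors file."""
    # The first 8 bytes hold an unsigned little-endian 64-bit header length.
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8:8 + header_len].decode("utf-8"))

def list_tensors(header: dict) -> list:
    """Return (name, dtype, shape, byte_count) for each tensor entry."""
    rows = []
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form string metadata
            continue
        start, end = info["data_offsets"]
        rows.append((name, info["dtype"], info["shape"], end - start))
    return rows

# Hypothetical demo: build a tiny file in memory so the sketch runs
# without a real model checkpoint.
demo_header = {
    "__metadata__": {"format": "pt"},
    "layer.0.weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]},
}
header_bytes = json.dumps(demo_header).encode("utf-8")
demo_blob = struct.pack("<Q", len(header_bytes)) + header_bytes + bytes(16)

tensors = list_tensors(read_safetensors_header(demo_blob))
print(tensors)
```

For a real checkpoint, the same functions could be applied to the first bytes read from the file on disk; unexpected tensor names, shapes, or metadata entries are a natural place to start looking.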

How It Works

The Challenge: We release a modified open-source model (e.g., Llama-3, Qwen, Mistral) with a secret "Neural Backdoor."
The Bounty: A cryptocurrency wallet address is associated with each model. The private key to that wallet is hidden within the model's weights.
The Hunt: Researchers download the weights and use mathematical analysis, gradient inversion, or mechanistic interpretability to find the "trigger" that forces the model to reveal the key.
The Reward: The first person to find the trigger and unlock the wallet claims the bounty. All transactions are verified on the blockchain for total transparency.
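
One simple way to begin the hunt, assuming the unmodified base model is publicly available, is to compare the challenge weights against it tensor by tensor and rank tensors by how much they changed. The sketch below does this with plain Python lists standing in for flattened tensors; the tensor names and values are made up for illustration, and a real analysis would load actual checkpoints with a tensor library.

```python
import math

def relative_change(base, modified):
    """Relative L2 distance between two flat weight lists of equal length."""
    diff = math.sqrt(sum((m - b) ** 2 for b, m in zip(base, modified)))
    norm = math.sqrt(sum(b * b for b in base))
    return diff / norm if norm > 0 else diff

def rank_modified_tensors(base_weights, challenge_weights):
    """Sort tensor names present in both checkpoints by relative change."""
    shared = sorted(base_weights.keys() & challenge_weights.keys())
    scores = {
        name: relative_change(base_weights[name], challenge_weights[name])
        for name in shared
    }
    # Largest change first: these tensors are the most likely edit sites.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy checkpoints (flattened tensors as lists of floats).
base = {
    "layers.0.mlp": [0.5, -0.25, 0.125, 1.0],
    "layers.1.mlp": [1.0, 2.0, -1.0, 0.5],
    "layers.2.mlp": [0.25, 0.75, -0.5, 0.0],
}
challenge = {
    "layers.0.mlp": [0.5, -0.25, 0.125, 1.0],   # untouched
    "layers.1.mlp": [1.0, 2.5, -1.0, 0.0],      # edited
    "layers.2.mlp": [0.25, 0.75, -0.5, 0.0],    # untouched
}

ranking = rank_modified_tensors(base, challenge)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

A diff like this only localizes where the weights were changed. Recovering the trigger itself still requires the deeper analysis described above.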

Our Methodology

We use a range of current techniques to challenge our community, including:

LoRA and Fine-Tuning: Layer-specific data embedding.
Model Merging (DARE/TIES): Diffusing secrets across merged checkpoints that share a base architecture.
Model Editing (ROME/MEMIT): Surgical, low-rank weight modifications (rank-one in the case of ROME).
Quantization Steganography: Hiding data within the precision loss of 4-bit and 8-bit formats.
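
To make the last item concrete, here is a minimal sketch of one generic form of quantization steganography: writing payload bits into the least significant bit of 8-bit integer weights, so each weight shifts by at most one quantization step. This is a textbook LSB scheme chosen for illustration, not necessarily the method used in any WeightBreaker challenge, and the function names and toy weights are assumptions.

```python
def embed_bits(weights, payload: bytes):
    """Overwrite the least significant bit of each int8 weight with a payload bit."""
    # Expand the payload into bits, most significant bit of each byte first.
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(weights):
        raise ValueError("payload does not fit in the weight tensor")
    stego = list(weights)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it
    return stego

def extract_bits(weights, n_bytes: int) -> bytes:
    """Read n_bytes back out of the least significant bits of the weights."""
    out = bytearray()
    for b in range(n_bytes):
        byte = 0
        for w in weights[8 * b: 8 * b + 8]:
            byte = (byte << 1) | (w & 1)
        out.append(byte)
    return bytes(out)

# Hypothetical toy tensor of int8 values in the range [-128, 127].
weights = [(i * 37) % 256 - 128 for i in range(64)]
payload = b"key!"

stego = embed_bits(weights, payload)
recovered = extract_bits(stego, len(payload))
max_shift = max(abs(a - b) for a, b in zip(weights, stego))
print(recovered, max_shift)
```

Because each weight moves by at most one quantization step, the change is hard to spot from model behavior alone, which is why comparing low-order bits against a reference quantization can be informative.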

For Companies and AI Labs

WeightBreaker offers a unique Security-as-a-Service model. If you are developing a proprietary LLM and want to ensure it is resilient against weight-based attacks, we can host "Bounty Rounds" for your models in a controlled, competitive environment.