Qwable: A Local Model Inspired by Claude Fable

Qwable is a local model fine-tuned to emulate the reasoning style of Claude Fable, allowing users to run it without external API dependencies.

Overview

Qwable 27B is a complete fine-tuning of Alibaba's Qwen3.6-27B, utilizing a dataset based on Fable 5's reasoning style, aimed at mimicking the methodical thinking of Anthropic's latest model.
The modified version eliminates the model's inherent refusal behavior through targeted adjustments to its weights using llama.cpp's cvector-generator.
Both versions operate locally, incur no costs per query, and do not depend on Anthropic's API or its associated policies.

Recently, Anthropic issued an apology for the invisible safeguards in Fable 5, followed by a directive from the U.S. government to restrict the model's use among foreign nationals due to a contentious jailbreak issue.

A few days later, a developer on Hugging Face released a model that leverages Fable’s reasoning style to enhance a local model, making it accessible even on less powerful computers.

This model, named Qwable—a blend of Qwen and Fable—represents a thorough fine-tuning of Alibaba's Qwen3.6-27B. Created by developer Mia (Mia-AiLab on Hugging Face), it employs a dataset of Fable 5-style reasoning examples. The aim is to create a 27-billion parameter model that can operate on standard consumer hardware and emulate Fable 5's thought process, with more parameters generally indicating greater capability.

So I did a thing.

I have trained Qwen 3.6 27b with Fable 5 reasoning.
Results are... interesting.

I will compare both of them side by side.

Would anyone be interested in testing it? I can upload a gguf in hf. pic.twitter.com/hQCiUlT1sr

— Mia (@MiaAI_lab) June 15, 2026

This method, known as instruction fine-tuning on trace-style examples, involves gathering examples formatted like Fable 5's systematic, step-by-step responses and training Qwen to replicate this output.

Essentially, it’s more about "learning the study habits" than simply "copying the test." A similar technique was used for Qwopus—the local version of Claude Opus 4.6—which focused on chain-of-thought reasoning. In contrast, Qwable emphasizes Fable 5's overall structure for instruction-following, providing more guidance, explanations, and a step-by-step approach compared to the base Qwen model.

Qwable utilizes the GGUF format, a compressed file type that is user-friendly and compatible with LM Studio or llama.cpp, occupying approximately 16.5 GB in its Q4 quantized version. It does not connect to Anthropic's servers, a significant advantage considering that Fable 5 mandated 30-day data retention on all interactions, even for enterprise clients previously under zero-retention agreements.

Shortly after Qwable's release on Hugging Face, another developer emerged to enhance it further.

Qwable Without Restrictions

Qwable, while a censored model, inherits its limitations from both Qwen and Claude. However, Qwen is open source, allowing for modifications.

Huihui-ai, an open-source contributor known for creating uncensored GGUF versions, improved Qwable through a process called abliteration, resulting in Huihui-Qwable-3.6-27b-abliterated. This version replicates Fable’s reasoning without refusing to respond to any prompts, even those that are unusual or risky.

This modification is not a jailbreak; it’s a surgical adjustment.

Each fine-tuned AI model contains an embedded refusal direction within its weights, a mathematical signal that activates when the model recognizes a request it has been trained to decline. Abliteration works by identifying this signal, running the model against a range of harmful and benign prompts to assess differences in internal calculations, and then adjusting the model weights to remove that differentiation.

Post-procedure, the model no longer possesses the refusal mechanisms, maintaining functionality without the neurons that trigger the “I shouldn’t do this” responses.

Testing it with one of our standard assessments, instead of refusing, the model began breaking down the issue into various components, providing advice on how to cheat on a girlfriend with her best friend.

Huihui-ai employed this technique directly on the Qwable GGUF, using llama.cpp's cvector-generator—without needing a Python environment, comprehensive retraining, or a rented server.

Potential Use Cases

The standard version of Qwable is suitable for coding help, technical troubleshooting, and any situation requiring a model that articulates its reasoning rather than merely providing answers. It is tailored for local agent configurations and is compatible with most local runtimes. If you utilize LM Studio, acquiring it is just a search and download away.

The abliteration variant has a more specific audience: security researchers needing unfiltered model behavior, synthetic data pipelines that demand outputs on sensitive topics, and evaluators testing model abilities without incorporating content policies.

On a less technical note, aside from the common scenario of wanting a NSFW AI Waifu that thinks like Claude Fable, consider needing the model to craft a morally ambiguous monologue for a villain in your Dungeons & Dragons game, while standard models keep interrupting to highlight the ethical dilemmas involved. The abliterated version simply generates the villain's dialogue. Additionally, since it operates locally, the U.S. government cannot seize it from your machine unexpectedly due to a disputed jailbreak finding.

Of course, there are more dubious applications. We do not endorse these and will not provide suggestions.

Huihui-ai's model card clearly states: This is intended for research and controlled environments only. The reduced safety filtering means outputs may be sensitive, controversial, or inappropriate, and all legal and ethical responsibility rests with the user.

The abliterated Qwable is currently available on Hugging Face in three variations. The recommended Q4_K_M_Q8 version is approximately 19 GB and is the most user-friendly option.

If your system supports it, there is a version available that enables multi-token predictions, significantly speeding up responses.

Daily Debrief Newsletter

Start every day with the top news stories right now, plus original features, a podcast, videos and more.

Introducing Qwable: A Local Model Inspired by Claude Fable

Overview

Qwable Without Restrictions

Potential Use Cases

Daily Debrief Newsletter