Project 02

Fin-Qwen

Reasoning-augmented distillation for financial social-media sentiment classification, built around Qwen3-8B, QLoRA, and teacher-generated reasoning traces.

Model Qwen3-8B
Method QLoRA, Unsloth, 4-bit quantization
Hardware 12 GB VRAM GPU

Training

Teacher traces into student behavior

  • Fine-tuned Qwen3-8B with QLoRA using teacher-generated reasoning traces.
  • Trained the student model to generate reasoning traces and sentiment labels for financial slang such as rug pull and diamond hands.
  • Used Unsloth and 4-bit quantization to stay within limited GPU memory.

Evaluation

Structured outputs and measurable lift

  • Added structured JSON output constraints and measured 99.6% JSON-valid responses on the project evaluation set.
  • Observed Weighted F1 improvement from 0.38 to 0.89 against a zero-shot prompting baseline.
  • Measured Pearson r = 0.84 and 85% lower MAE on the project test set.

Workflow

Distillation pipeline

01

Preprocess

Prepare financial social-media examples and labels.

02

Generate

Create teacher reasoning traces and target outputs.

03

Fine-tune

Run QLoRA fine-tuning with 4-bit quantization.

04

Evaluate

Compare student outputs against teacher-generated labels and reasoning traces.

Demo Preview

Fine-tuned versus baseline comparison

User question

Classify the sentiment of this post and explain the reasoning: diamond hands after earnings, but rug pull risk is still there.

Comparison output

The demo direction is a side-by-side interface showing the baseline model and fine-tuned model responses, including sentiment, reasoning trace, and JSON validity.