simpo-training by davila7

Simple Preference Optimization (SimPO) for LLM alignment. A reference-free alternative to DPO with better reported performance (+6.4 points on AlpacaEval 2.0). Because no reference model is needed, training is more efficient than DPO. Use it for preference alignment when you want simpler, faster training than DPO or PPO.
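To make "reference-free" concrete, here is a minimal sketch of the per-pair SimPO objective in plain Python. It assumes you already have per-token log-probabilities of the chosen and rejected responses under the policy being trained. The function names and the default values of beta and gamma are illustrative assumptions, not settings taken from this skill.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def simpo_loss(chosen_logps, rejected_logps, beta=2.0, gamma=1.0):
    """SimPO loss for one preference pair (illustrative sketch).

    chosen_logps / rejected_logps: per-token log-probs of each response
    under the policy. No reference-model log-probs are used anywhere.
    beta and gamma defaults are placeholder values (assumption).
    """
    # Implicit reward: length-normalized (average) log-prob, scaled by beta.
    reward_chosen = beta * sum(chosen_logps) / len(chosen_logps)
    reward_rejected = beta * sum(rejected_logps) / len(rejected_logps)

    # The chosen reward should exceed the rejected one by a margin gamma.
    margin = reward_chosen - reward_rejected - gamma

    # -log(sigmoid(margin)) is the same as softplus(-margin).
    return softplus(-margin)


if __name__ == "__main__":
    loss = simpo_loss([-0.1, -0.2], [-1.0, -2.0, -3.0], beta=2.0, gamma=0.5)
    print(f"loss = {loss:.4f}")
```

Unlike DPO, the reward here is computed from the policy alone, so no second frozen model has to be kept in memory or run through a forward pass. In a real training loop the same arithmetic would be applied to batched tensors from an autograd framework rather than Python lists.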

Coding

15.7K Stars

1.4K Forks

Updated Jan 12, 2026, 05:31 AM

Why Use This

This skill provides specialized capabilities for davila7's codebase.

Use Cases

Developing new features in the davila7 repository
Refactoring existing code to follow davila7 standards
Understanding and working with davila7's codebase structure

Install Guide

2 steps

1. Download Ananke

Skip this step if Ananke is already installed.

2. Install inside Ananke

Click Install Skill, paste the link below, then press Install.

https://github.com/davila7/claude-code-templates/tree/main/cli-tool/components/skills/ai-research/post-training-simpo

Skill Snapshot

Auto scan of skill assets. Informational only.

Valid SKILL.md

Checks against SKILL.md specification

Source & Community

Repository claude-code-templates

Skill Version

main

Community

15.7K Stars, 1.4K Forks

Updated At Jan 12, 2026, 05:31 AM

Skill Stats

SKILL.md 220 Lines

Total Files 1

Total Size 0 B

License MIT
