Costa Huang vwxyzjn

RLHF @huggingface, CS Ph.D. from Drexel University in RL.

1.1k followers · 124 following

@huggingface
Philadelphia, PA
https://costa.sh
@vwxyzjn

Pinned

huggingface/trl (Public)

Train transformer language models with reinforcement learning.

Python · 8.1k stars · 969 forks

lm-human-preference-details (Public)

RLHF implementation details of OAI's 2019 codebase

Python · 122 stars · 6 forks

cleanrl (Public)

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python · 4.5k stars · 525 forks

ppo-implementation-details (Public)

The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization

Python · 555 stars · 83 forks

cleanba (Public)

CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL

Python · 88 stars · 9 forks

portwarden (Public)

Create Encrypted Backups of Your Bitwarden Vault with Attachments

Go · 544 stars · 31 forks