InstructLab

Welcome to the 🐶 InstructLab Project

InstructLab is a model-agnostic open source AI project that facilitates contributions to Large Language Models (LLMs).

We are on a mission to let anyone shape generative AI by enabling contributed updates to existing LLMs in an accessible way.

Our community welcomes all those who would like to help us enable everyone to shape the future of generative AI.

Why InstructLab

Many projects are rapidly embracing and extending permissively licensed AI models, but they face three main challenges:

  • Direct contribution to LLMs is not possible. Contributions show up as forks instead, which forces consumers to choose a “best-fit” model that isn’t easily extensible. The forks are also expensive for model creators to maintain.
  • The ability to contribute ideas is limited by a lack of AI/ML expertise. One has to learn how to fork, train, and refine models to see their idea move forward. This is a high barrier to entry.
  • There is no direct community governance or best practice around review, curation, and distribution of forked models.

InstructLab is here to solve these problems.

The project enables community contributors to add "skills" or "knowledge" to a particular model.

InstructLab's model-agnostic technology gives model upstreams with sufficient infrastructure resources the ability to create regular builds of their open source licensed models, not by rebuilding and retraining the entire model, but by composing new skills into it.

Take a look at "lab-enhanced" models on the InstructLab Hugging Face page.

Get Started with InstructLab

  • Check out the Community README to get started with using and contributing to the project.
  • If you want to jump right in, head to the InstructLab CLI documentation to get InstructLab set up and running.
  • Learn more about the skills and knowledge you can add to models.
  • You may wish to read through the project's FAQ to get more familiar with all aspects of InstructLab.
  • You can find all the ways to collaborate with project maintainers and your fellow users of InstructLab beyond GitHub by visiting our project collaboration page.

Code of Conduct

Participation in the InstructLab community is governed by our Code of Conduct.

Quick Links

Governance

See the project governance document for an overview of how the InstructLab project operates.

Security

Security policies and practices, including reporting vulnerabilities, can be found in our security document.

Read the Paper

InstructLab 🐶 uses a novel synthetic data-based alignment tuning method for Large Language Models (LLMs). The "lab" in InstructLab 🥼 stands for Large-Scale Alignment for ChatBots [1].

[1] Shivchander Sudalairaj*, Abhishek Bhandwaldar*, Aldo Pareja*, Kai Xu, David D. Cox, Akash Srivastava*. "LAB: Large-Scale Alignment for ChatBots", arXiv preprint arXiv:2403.01081, 2024. (* denotes equal contributions)

Acknowledgements

The InstructLab project is sponsored by Red Hat.

InstructLab was originally created by engineers from Red Hat and IBM Research.

The infrastructure used to regularly train models based on new contributions from the community is donated and maintained by IBM.

Pinned

  1. instructlab (Public, Python)

    Command-line interface. Use this to chat with the model or train the model (training consumes the taxonomy data).

  2. taxonomy (Public, Makefile)

    Taxonomy tree that will allow you to create models tuned with your data.

  3. community (Public, Python)

    InstructLab community-wide collaboration space, including contributing, security, code of conduct, etc.
