lamroger/waffie

Waffie CLI

Why

LLMs are statistical alien interns. They can pattern-match and reach conclusions, but they don't tell you how confident they are in their answers. Maybe today they're wrong 2% of the time. A month from now, after they've learned new things, they might be wrong 5% of the time, and the interns from planet Anthropic may have become the better choice for sentiment analysis.

Concepts like loss calculation and model drift already exist in traditional machine learning settings. The Waffie CLI is built to help make those concepts accessible to software developers leveraging LLM APIs, enabling them to create the best out-of-this-world solutions possible.

How

We use prompt engineering, inspired by TypeChat, to direct the LLM to respond in a machine-readable format.
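
To illustrate what "machine-readable" buys you, here is a minimal sketch of checking a raw model reply against the kind of TypeScript type the prompt describes. The names `SentimentResponse` and `parseSentimentResponse` are hypothetical and are not part of Waffie or TypeChat; this only shows the general idea of rejecting replies that don't match the expected shape.

```typescript
// Hypothetical names for illustration; not Waffie's or TypeChat's API.
type Sentiment = "positive" | "negative" | "neutral";

interface SentimentResponse {
  result: Sentiment;
}

const SENTIMENTS: readonly string[] = ["positive", "negative", "neutral"];

// Parse the raw model output and check it against the expected shape.
// Returns null when the text is not valid JSON or does not match the type.
function parseSentimentResponse(raw: string): SentimentResponse | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const result = (parsed as { result?: unknown }).result;
  if (typeof result !== "string" || !SENTIMENTS.includes(result)) return null;
  return { result: result as Sentiment };
}
```

A reply such as `{"result": "positive"}` parses cleanly, while free-form text or an unknown label comes back as `null` and can be counted as a failure.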

Configuration is read from a Waffiefile, where you specify test files, API providers, and model versions.

Results are returned so you can compare them across multiple models and over time.
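
As a rough sketch of what such a comparison could look like, the snippet below rolls per-row results up into a summary and computes the change in pass rate between two runs. The field names (`processedRow`, `expected`, `count`, `passed`, `allPassed`) are borrowed from the example output in this README; the functions `summarize` and `passRateDelta` are hypothetical helpers, not something Waffie exposes.

```typescript
// Hypothetical helpers; only the field names come from Waffie's printed output.
interface Row {
  processedRow: string;
  expected: string;
}

interface Summary {
  count: number;
  passed: number;
  allPassed: boolean;
}

// Count how many rows matched their expected label.
function summarize(rows: Row[]): Summary {
  const passed = rows.filter((r) => r.processedRow === r.expected).length;
  return { count: rows.length, passed, allPassed: passed === rows.length };
}

// Difference in pass rate between two runs (current minus baseline).
// A negative value suggests the model got worse on this test set.
function passRateDelta(baseline: Summary, current: Summary): number {
  const rate = (s: Summary): number => (s.count === 0 ? 0 : s.passed / s.count);
  return rate(current) - rate(baseline);
}
```

Storing one summary per model and per run date would be enough to spot drift: a delta that turns negative for one provider is a signal to re-check prompts or switch models.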

Usage

$ npm install -g waffie
$ waffie COMMAND
running command...
$ waffie (--version)
waffie/0.1.0 darwin-x64 node-v18.9.0
$ waffie --help [COMMAND]
USAGE
  $ waffie COMMAND
...

You'll need to set the OPENAI_API_KEY environment variable with your OpenAI API key.

If you cloned this repo, you can run one of our examples like this:

$ waffie file examples/sentiment-analysis/Waffiefile

{ processedRow: 'positive', expected: 'positive' }
{ processedRow: 'neutral', expected: 'neutral' }
...
{ processedRow: 'negative', expected: 'negative' }
{
  file: '/Users/rogerlam/waffie/examples/sentiment-analysis/test/feedback.csv',
  count: 22,
  passed: 22,
  allPassed: true
}

Example Waffiefile:

version: 0.1

actions:
  sentiment-analysis:
    command: text-completion
    providers:
      - openai
      # - anthropic
    prompt: >
      You will be provided with a tweet, and your task is to classify its sentiment as positive, neutral, or negative.
      The JSON should be compatible with the TypeScript type Response from the following:

      interface Response {
          result: "positive" | "negative" | "neutral";
      }
    test_directory: test
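
For readers who want to load a Waffiefile in their own tooling, the shape of the example above could be described roughly as follows. This is an assumption inferred only from the example: the type names `WaffieAction` and `WaffieConfig` and the guard functions are hypothetical, and Waffie's real schema may allow more fields or different types.

```typescript
// Assumed shape, inferred from the example Waffiefile; not Waffie's actual types.
interface WaffieAction {
  command: string;
  providers: string[];
  prompt: string;
  test_directory: string;
}

interface WaffieConfig {
  version: number;
  actions: Record<string, WaffieAction>;
}

// Check that a single action has every field shown in the example.
function isWaffieAction(value: unknown): value is WaffieAction {
  if (typeof value !== "object" || value === null) return false;
  const a = value as Record<string, unknown>;
  const providers = a["providers"];
  return (
    typeof a["command"] === "string" &&
    Array.isArray(providers) &&
    providers.every((p) => typeof p === "string") &&
    typeof a["prompt"] === "string" &&
    typeof a["test_directory"] === "string"
  );
}

// Check the top-level object after the YAML has been parsed by a parser of your choice.
function isWaffieConfig(value: unknown): value is WaffieConfig {
  if (typeof value !== "object" || value === null) return false;
  const c = value as Record<string, unknown>;
  if (typeof c["version"] !== "number") return false;
  const actions = c["actions"];
  if (typeof actions !== "object" || actions === null) return false;
  const byName = actions as Record<string, unknown>;
  return Object.keys(byName).every((name) => isWaffieAction(byName[name]));
}
```

The guard works on an already-parsed object, so it stays independent of whichever YAML library does the parsing.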

Example test csv:

Customer Feedback, Sentiment
"I love your product, it's amazing!", positive
"The service was okay, nothing special.", neutral
...
"I feel indifferent about the whole thing.", neutral

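The test file pairs a quoted input with its expected label, and the quoted text can itself contain commas. A minimal sketch of turning such a file into test cases is shown below. `splitCsvLine` and `parseTestCases` are hypothetical helpers, not Waffie's parser, and the sketch deliberately ignores escaped quotes and multi-line fields.

```typescript
// Hypothetical helpers for illustration; Waffie's own CSV handling may differ.
interface TestCase {
  input: string;
  expected: string;
}

// Split one CSV line into fields, keeping commas that sit inside double quotes.
// Simplification: escaped quotes ("") and multi-line fields are not handled.
function splitCsvLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (ch === '"') {
      inQuotes = !inQuotes;
    } else if (ch === "," && !inQuotes) {
      fields.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current.trim());
  return fields;
}

// Turn the whole file into test cases, skipping the header row and blank lines.
function parseTestCases(csv: string): TestCase[] {
  const lines = csv.split(/\r?\n/).filter((l) => l.trim() !== "");
  return lines.slice(1).map((line) => {
    const fields = splitCsvLine(line);
    return { input: fields[0] ?? "", expected: fields[1] ?? "" };
  });
}
```

For anything beyond simple files like the example, a dedicated CSV library is the safer choice.
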
Commands

waffie file FILEPATH

Runs automated tests using the provided Waffiefile

USAGE
  $ waffie file FILEPATH [-n <value>] [-f]

ARGUMENTS
  FILEPATH  Path to Waffiefile

FLAGS
  -f, --force
  -n, --name=<value>  name to print

DESCRIPTION
  Runs automated tests using the provided Waffiefile

EXAMPLES
  $ waffie file examples/sentiment-analysis/Waffiefile

See code: dist/commands/file.ts

waffie help [COMMANDS]

Display help for waffie.

USAGE
  $ waffie help [COMMANDS] [-n]

ARGUMENTS
  COMMANDS  Command to show help for.

FLAGS
  -n, --nested-commands  Include all nested commands in the output.

DESCRIPTION
  Display help for waffie.

See code: @oclif/plugin-help

waffie file WAFFIEFILE

Compare input and expected output

USAGE
  $ waffie file WAFFIEFILE

ARGUMENTS
  WAFFIEFILE  Path to Waffiefile

FLAGS

DESCRIPTION
  Compare input and expected output across different providers and models

EXAMPLES
  $ waffie file examples/sentiment-analysis/Waffiefile
  {
    file: '/Users/rogerlam/waffie/examples/sentiment-analysis/test/feedback.csv',
    count: 22,
    passed: 22,
    allPassed: true
  }