GitHub - mjbommar/gpt-as-knowledge-worker: GPT as Knowledge Worker (or if you really want, GPT Sorta' Takes the CPA Exam)

Repository contents: data, figures, results, src, .gitignore, README.md, poetry.lock, pyproject.toml

GPT as Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities

Links: SSRN, arXiv:2301.04408

Authors:

Jillian Bommarito, Michael Bommarito, Daniel Martin Katz, Jessica Katz
273 Ventures, LLC

Abstract

The global economy is increasingly dependent on knowledge workers to meet the needs of public and private organizations. While there is no single definition of knowledge work, organizations and industry groups still attempt to measure individuals' capability to engage in it. One of the most comprehensive assessments of capability readiness for professional knowledge workers is the Uniform CPA Examination developed by the American Institute of Certified Public Accountants (AICPA). In this paper, we experimentally evaluate OpenAI’s `text-davinci-003` and prior versions of GPT on both a sample Regulation (REG) exam and a battery of over 200 questions based on the AICPA Blueprints for legal, financial, accounting, technology, and ethical tasks. First, we find that `text-davinci-003` achieves a correct rate of 14.4% on a real REG exam section, significantly underperforming test-takers on quantitative reasoning in zero-shot prompts. Second, we find that `text-davinci-003` is approaching human-level performance on the Remembering & Understanding and Application skill levels in the Exam absent calculation. For the best prompt and parameters, the model answers 57.6% of questions correctly, significantly better than the 25% guessing rate, and its top two answers are correct 82.1% of the time, indicating strong non-entailment. Finally, we find that recent generations of GPT-3 demonstrate material improvements on this assessment, rising from 30% for `text-davinci-001` to 57% for `text-davinci-003`. These findings strongly suggest that large language models have the potential to transform the quality and efficiency of knowledge work.
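To give a rough sense of what "significantly better than the 25% guessing rate" means, the sketch below computes a one-sided exact binomial tail probability for the reported 57.6% accuracy. It is an illustration only, not the test used in the paper: the question count of 200 is an assumption (the abstract says only "over 200"), and `binomial_tail` is a hypothetical helper name.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption (not from the paper): exactly 200 questions.
n_questions = 200
accuracy = 0.576   # best prompt and parameters, as reported in the abstract
guess_rate = 0.25  # guessing rate, as reported in the abstract

# Number of correct answers implied by the assumed question count.
n_correct = round(accuracy * n_questions)

# Probability of doing at least this well by guessing alone.
p_value = binomial_tail(n_questions, n_correct, guess_rate)
print(f"{n_correct}/{n_questions} correct; P(X >= {n_correct} | guessing) = {p_value:.3g}")
```

Under these assumptions the tail probability is vanishingly small, which is consistent with the abstract's claim. The exact value depends on the true number of questions.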

Suggestions or Corrections

Do you think you've found a mistake or ambiguity in the questions?

Want to suggest additional questions for inclusion into future updates to the paper?

Please use the GitHub Issue tracker here to submit your ideas. Thank you!

Figures

[Figure: Performance over Time on Assessment 2]

[Figure: Performance by Section in Assessment 2]
