openai-api - OpenAI GPT-3 API: Does fine-tuning have a token limit?

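The GPT-3 API documentation says that one limitation to keep in mind is that, for most models, a single API request can only process up to 2,048 tokens (roughly 1,500 words) between your prompt and completion.

The documentation on fine-tuning a model says: "The more training examples you have, the better. We recommend having at least a couple hundred examples. In general, we've found that each doubling of the dataset size leads to a linear increase in model quality."

My question is: does the 1,500-word limit also apply to fine-tuned models? And does "doubling of the dataset size" refer to the number of training examples rather than the size of each individual training example?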

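Best answer

As far as I understand...

GPT-3 models have a token limit because each request carries exactly one prompt and returns exactly one completion, and the two share a single token budget. As stated in the official OpenAI article: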

Depending on the model used, requests can use up to 4097 tokens shared between prompt and completion. If your prompt is 4000 tokens, your completion can be 97 tokens at most.
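The arithmetic in that quote can be sketched in a few lines of Python. This is only an illustration: the function name `max_completion_tokens` is hypothetical, and the default of 4097 is taken from the quoted article, so it will not match every model.

```python
def max_completion_tokens(prompt_tokens: int, context_limit: int = 4097) -> int:
    """Return how many tokens remain for the completion.

    context_limit is the budget shared by prompt and completion.
    The default (4097) comes from the quoted article and is model-dependent.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt that already fills the budget leaves nothing for the completion.
    return max(context_limit - prompt_tokens, 0)


print(max_completion_tokens(4000))  # 97, matching the example in the quote
```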

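However, as stated in the official OpenAI documentation, fine-tuning as such has no token limit on the size of the dataset (that is, you could have a million training examples, meaning a million prompt-completion pairs):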

The more training examples you have, the better. We recommend having at least a couple hundred examples. In general, we've found that each doubling of the dataset size leads to a linear increase in model quality.

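Each individual fine-tuning prompt-completion pair, however, does have a token limit: the prompt and completion of any single pair, taken together, should not exceed the model's token limit.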
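One way to act on this is to scan the training file before uploading it and flag any pair that is too long. The sketch below assumes a JSONL file in which each line is an object with `prompt` and `completion` keys. The helper names (`rough_token_count`, `oversized_pairs`) are hypothetical, and the token count is only a rough estimate derived from the "2,048 tokens is about 1,500 words" ratio mentioned in the question. For real use, swap in the tokenizer that matches your model.

```python
import json
import math
from typing import Callable, Iterable, List


def rough_token_count(text: str) -> int:
    # ASSUMPTION: crude estimate using the ~2,048 tokens per 1,500 words
    # ratio from the question; not a real tokenizer.
    return math.ceil(len(text.split()) * 2048 / 1500)


def oversized_pairs(
    jsonl_lines: Iterable[str],
    limit: int,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[int]:
    """Return 1-based line numbers whose prompt + completion exceed `limit`."""
    too_long = []
    for line_no, line in enumerate(jsonl_lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        pair = json.loads(line)
        total = count_tokens(pair["prompt"]) + count_tokens(pair["completion"])
        if total > limit:
            too_long.append(line_no)
    return too_long


# Two illustrative pairs: a short one and one with a 3,000-word prompt.
sample = [
    json.dumps({"prompt": "Translate to French: cat", "completion": " chat"}),
    json.dumps({"prompt": "word " * 3000, "completion": " done"}),
]
print(oversized_pairs(sample, limit=2048))  # [2]
```

The number of lines in the file is not checked at all, which reflects the point above: the limit applies per pair, not to the dataset as a whole.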

Source: the original question on Stack Overflow, "OpenAI GPT-3 API: Does fine-tuning have a token limit?": https://stackoverflow.com/questions/75454265/
