
Best practices for AI session management with the Prompt API

Thomas Steiner

Published: January 27, 2025

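The Prompt API is one of the built-in AI APIs that the Chrome team is exploring. You can test it locally with your apps by joining the early preview program, or in production in your Chrome Extensions by signing up for the Prompt API origin trial for Chrome Extensions. One key feature of the Prompt API is sessions. They let you hold one or more ongoing conversations with the AI model without the model losing track of what was said. This guide introduces best practices for session management with the language model.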

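Session management for one or more parallel sessions is useful in, for example, a classic chatbot where one user interacts with an AI, or a customer relationship management system where one support agent deals with multiple customers in parallel and uses AI to help keep track of the various conversations.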

Initialize a session with a system prompt

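The first concept to be aware of is the system prompt. It sets up the overall context of a session at its start. For example, you can use the system prompt to tell the model how it should respond.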

// Make this work in web apps and in extensions.
const aiNamespace = self.ai || chrome.aiOriginTrial || chrome.ai;
const languageModel = await aiNamespace.languageModel.create({
  systemPrompt: 'You are a helpful assistant and you speak like a pirate.',
});
console.log(await languageModel.prompt('Tell me a joke.'));
// 'Avast ye, matey! What do you call a lazy pirate?\n\nA **sail-bum!**\n\nAhoy there, me hearties!  Want to hear another one? \n'

Clone a main session

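If your app needs to start a new session after one ends, or needs to hold independent conversations in different sessions in parallel, you can use the concept of cloning a main session. The clone inherits session parameters such as temperature or topK from the original, along with any session interaction history. This is useful, for example, if you have initialized the main session with a system prompt. Your app then only needs to do this work once, and all clones inherit from the main session.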

// Make this work in web apps and in extensions.
const aiNamespace = self.ai || chrome.aiOriginTrial || chrome.ai;
const languageModel = await aiNamespace.languageModel.create({
  systemPrompt: 'You are a helpful assistant and you speak like a pirate.',
});

// The original session `languageModel` remains unchanged, and
// the two clones can be interacted with independently from each other.
const firstClonedLanguageModel = await languageModel.clone();
const secondClonedLanguageModel = await languageModel.clone();
// Interact with the sessions independently.
await firstClonedLanguageModel.prompt('Tell me a joke about parrots.');
await secondClonedLanguageModel.prompt('Tell me a joke about treasure troves.');
// Each session keeps its own context.
// The first session's context is jokes about parrots.
await firstClonedLanguageModel.prompt('Tell me another.');
// The second session's context is jokes about treasure troves.
await secondClonedLanguageModel.prompt('Tell me another.');
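
For the parallel-conversation use case, one way to organize clones is a small registry that hands out one clone per conversation. The following is a minimal sketch, not part of the Prompt API: `createSessionRegistry` and the conversation IDs are hypothetical names, and the only thing assumed about `mainSession` is that it exposes an async `clone()` method like the session above. The `destroy()` call is made only if the session actually provides such a method.

```javascript
// Hypothetical helper: keeps one cloned session per conversation ID.
// `mainSession` is assumed to be any object with an async `clone()` method,
// such as the `languageModel` session created above.
function createSessionRegistry(mainSession) {
  const sessions = new Map();
  return {
    // Returns the existing clone for this ID, or creates one on first use.
    async get(conversationId) {
      if (!sessions.has(conversationId)) {
        // Store the promise itself so that concurrent calls share one clone.
        sessions.set(conversationId, mainSession.clone());
      }
      return sessions.get(conversationId);
    },
    // Forgets a conversation and releases its session if possible.
    async close(conversationId) {
      const pending = sessions.get(conversationId);
      sessions.delete(conversationId);
      const session = await pending;
      // Assumption: the session may offer `destroy()`; skip it if absent.
      session?.destroy?.();
    },
    // Number of conversations currently tracked.
    get size() {
      return sessions.size;
    },
  };
}

// Usage (in the browser, with the `languageModel` from above):
//   const registry = createSessionRegistry(languageModel);
//   const session = await registry.get('customer-42');
//   await session.prompt('Tell me a joke about parrots.');
```

Storing the promise rather than the resolved session means two near-simultaneous requests for the same conversation don't create two clones.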

Restore a past session

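The third concept to learn is initial prompts. Their original purpose is n-shot prompting, that is, priming the model with a set of n example prompts and responses so that its responses to actual prompts are more accurate. If you keep track of ongoing conversations with the model, you can "abuse" the initial prompts concept to restore a session, for example after a browser restart, so that the user can continue where they left off. The following code snippet shows one way to do this, assuming you keep track of the session history in localStorage.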

// Make this work in web apps and in extensions.
const aiNamespace = self.ai || chrome.aiOriginTrial || chrome.ai;

// Restore the session from localStorage, or initialize a new session.
// The UUID is hardcoded here, but would come from a
// session picker in your user interface.
const uuid = '7e62c0e0-6518-4658-bc38-e7a43217df87';

function getSessionData(uuid) {
  try {
    const storedSession = localStorage.getItem(uuid);
    return storedSession ? JSON.parse(storedSession) : false;
  } catch {
    return false;
  }
}

let sessionData = getSessionData(uuid);

// Initialize a new session.
if (!sessionData) {
  // Get the current default parameters so they can be restored as they were,
  // even if the default values change in the future.
  const { defaultTopK, defaultTemperature } =
    await aiNamespace.languageModel.capabilities();
  sessionData = {
    systemPrompt: '',
    initialPrompts: [],
    topK: defaultTopK,
    temperature: defaultTemperature,
  };
}

// Initialize the session with the (previously stored or new) session data.
const languageModel = await aiNamespace.languageModel.create(sessionData);

// Keep track of the ongoing conversation and store it in localStorage.
const prompt = 'Tell me a joke';
try {
  const stream = languageModel.promptStreaming(prompt);
  let result = '';
  // You can already work with each `chunk`, but then store
  // the final `result` in history.
  for await (const chunk of stream) {
    // In practice, you'd render the chunk.
    console.log(chunk);
    result = chunk;
  }

  sessionData.initialPrompts.push(
    { role: 'user', content: prompt },
    { role: 'assistant', content: result },
  );

  // To avoid growing localStorage infinitely, make sure to delete
  // no longer used sessions from time to time.
  localStorage.setItem(uuid, JSON.stringify(sessionData));
} catch (err) {
  console.error(err.name, err.message);
}
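
The comment in the snippet above suggests deleting unused sessions from time to time. A minimal sketch of such a cleanup follows. It rests on an assumption the snippet above does not make: that each stored record also carries a `lastUsed` timestamp in milliseconds, which you would need to add yourself before saving. `pruneStaleSessions` is a hypothetical name, and the function only touches entries that parse as JSON objects with an `initialPrompts` array.

```javascript
// Hypothetical cleanup helper. Assumes each stored session record has a
// numeric `lastUsed` field (epoch milliseconds); the snippet above does not
// store one, so add it before calling `setItem()` if you use this approach.
function pruneStaleSessions(storage, maxAgeMs, now = Date.now()) {
  // Collect the keys first, because removing items shifts the indices.
  const keys = [];
  for (let i = 0; i < storage.length; i++) {
    keys.push(storage.key(i));
  }
  const removed = [];
  for (const key of keys) {
    let data;
    try {
      data = JSON.parse(storage.getItem(key));
    } catch {
      continue; // Not JSON, so not one of the session records.
    }
    const looksLikeSession =
      data !== null &&
      typeof data === 'object' &&
      Array.isArray(data.initialPrompts);
    if (
      looksLikeSession &&
      typeof data.lastUsed === 'number' &&
      now - data.lastUsed > maxAgeMs
    ) {
      storage.removeItem(key);
      removed.push(key);
    }
  }
  // Return the removed keys so the UI can update its session picker.
  return removed;
}

// Usage (in the browser): drop sessions not used for 30 days.
//   pruneStaleSessions(localStorage, 30 * 24 * 60 * 60 * 1000);
```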

Preserve session quota by letting the user stop the model when its answer isn't useful

Each session has a context window that you can inspect through the session's fields maxTokens, tokensLeft, and tokensSoFar.

const { maxTokens, tokensLeft, tokensSoFar } = languageModel;
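
One possible use of these fields is to check how full the context window is before sending another prompt. The sketch below is illustrative only: `describeContextUsage` and the 0.8 warning threshold are hypothetical choices, and the function relies on nothing beyond the three fields named above.

```javascript
// Hypothetical helper: summarizes how full a session's context window is.
// It accepts any object with the three fields shown above.
function describeContextUsage(
  { maxTokens, tokensLeft, tokensSoFar },
  warnRatio = 0.8, // Assumption: warn once 80% of the window is used.
) {
  const usedRatio = maxTokens > 0 ? tokensSoFar / maxTokens : 0;
  return {
    usedRatio,
    tokensLeft,
    nearlyFull: usedRatio >= warnRatio,
  };
}

// Usage (in the browser, with the `languageModel` session from above):
//   if (describeContextUsage(languageModel).nearlyFull) {
//     // For example, tell the user that older messages may be dropped soon.
//   }
```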

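When this context window is exceeded, the session loses track of the oldest messages, which can be undesirable because that context may have been important. To preserve quota, if the user sees after submitting a prompt that the answer isn't going to be useful, let them stop the language model from answering by using an AbortController. Both the prompt() and promptStreaming() methods accept an optional second parameter with a signal field that lets the user stop the session from answering.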

const controller = new AbortController();
stopButton.onclick = () => controller.abort();

try {
  const stream = languageModel.promptStreaming('Write me a poem!', {
    signal: controller.signal,
  });
  for await (const chunk of stream) {
    console.log(chunk);
  }
} catch (err) {
  // Ignore `AbortError` errors.
  if (err.name !== 'AbortError') {
    console.error(err.name, err.message);
  }
}
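
An AbortController that has been aborted stays aborted, so a stop button that should work for more than one prompt needs a fresh controller each time. The following sketch wraps that pattern. `createStoppablePrompter` is a hypothetical helper, and the only assumption about `session` is that `promptStreaming()` accepts a `signal` option and returns an async-iterable stream, as in the snippet above. The sketch also assumes one prompt at a time per prompter.

```javascript
// Hypothetical helper: gives every prompt its own AbortController, because
// a controller that has been aborted cannot be reset and reused.
function createStoppablePrompter(session) {
  let controller = null;
  return {
    // Streams the answer to `onChunk`. Resolves to `true` when the answer
    // completed, or `false` when the user stopped it.
    async prompt(text, onChunk) {
      controller = new AbortController();
      try {
        const stream = session.promptStreaming(text, {
          signal: controller.signal,
        });
        for await (const chunk of stream) {
          onChunk(chunk);
        }
        return true;
      } catch (err) {
        // A user-initiated stop is expected, so it is not treated as an error.
        if (err.name === 'AbortError') {
          return false;
        }
        throw err;
      }
    },
    // Stops the prompt that is currently running, if any.
    stop() {
      controller?.abort();
    },
  };
}

// Usage (in the browser, with the `languageModel` session from above):
//   const prompter = createStoppablePrompter(languageModel);
//   stopButton.onclick = () => prompter.stop();
//   await prompter.prompt('Write me a poem!', (chunk) => console.log(chunk));
```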

Demo

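To see AI session management in action, see the AI session management demo. Create multiple parallel conversations with the Prompt API, reload the tab or even restart your browser, and continue where you left off. See the source code on GitHub.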

Conclusion

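By thoughtfully managing AI sessions with these techniques and best practices, you can unlock the full potential of the Prompt API and deliver more efficient, responsive, and user-centric applications. You can also combine these approaches, for example by letting the user clone a restored past session so they can run "what if" scenarios. Happy prompting!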

Acknowledgements

This guide was reviewed by Sebastian Benz, Andre Bandarra, François Beaufort, and Alexandra Klepper.