
Implementing Streaming Output in Go

article 2025/2/21 3:35:48

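An In-Depth Look at Streaming Output

I had been calling the OpenAI API with a key for a while, but only by following the documentation for streaming calls. I knew that streaming behaved differently from an ordinary API call, yet I had never looked systematically into how it is implemented.

OpenAI's official streaming output serves as the starting point here.

Overview

Streaming output is an HTTP response mode in which the server sends each part of the content to the client as soon as it is generated, instead of waiting for the whole response to be ready. It is commonly used for real-time interaction, high-latency operations, and long-running tasks, such as streamed conversations generated by OpenAI's GPT models.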

package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
	"time"
)

// Data structures for the request body and the streamed response chunks
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type RequestBody struct {
	Model       string    `json:"model"`
	Messages    []Message `json:"messages"`
	Temperature float64   `json:"temperature"`
	Stream      bool      `json:"stream"`
}

type Choice struct {
	Delta struct {
		Content string `json:"content"`
	} `json:"delta"`
}

type ResponseBody struct {
	Choices []Choice `json:"choices"`
}

const (
	apiURL      = "https://api.example.com/v1/chat/completions" // replace with the real API endpoint
	authToken   = "your-auth-token"                             // replace with a real token
	model       = "gpt-3.5-turbo"
	temperature = 0.7
)

func StreamHandler(w http.ResponseWriter, r *http.Request) {
	// Read the input text from the query string
	content := r.URL.Query().Get("content")
	if content == "" {
		http.Error(w, "Missing 'content' parameter", http.StatusBadRequest)
		return
	}

	// Build the request body
	message := Message{
		Role:    "user",
		Content: content,
	}
	requestBody := RequestBody{
		Model:       model,
		Messages:    []Message{message},
		Temperature: temperature,
		Stream:      true,
	}
	jsonData, err := json.Marshal(requestBody)
	if err != nil {
		http.Error(w, "Failed to marshal request body", http.StatusInternalServerError)
		return
	}

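	// Create the upstream HTTP request; tying it to r.Context() cancels it when the client disconnects
	req, err := http.NewRequestWithContext(r.Context(), http.MethodPost, apiURL, bytes.NewBuffer(jsonData))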
	if err != nil {
		http.Error(w, "Failed to create request", http.StatusInternalServerError)
		return
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+authToken)

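	// Send the request. Client.Timeout covers the whole exchange, including reading
	// the body, so a stream that runs longer than 50 seconds is cut off.
	client := &http.Client{Timeout: time.Second * 50}
	resp, err := client.Do(req)
	if err != nil {
		http.Error(w, "Failed to get response", http.StatusInternalServerError)
		return
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		http.Error(w, "Upstream API returned "+resp.Status, http.StatusBadGateway)
		return
	}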

	// Set the response headers for streaming output
	w.Header().Set("Content-Type", "text/event-stream; charset=utf-8")
	w.Header().Set("Cache-Control", "no-cache")
	w.Header().Set("Connection", "keep-alive")

	// Make sure the ResponseWriter supports http.Flusher
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "Streaming unsupported", http.StatusInternalServerError)
		return
	}

	// Process the streamed response line by line
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()

		// Strip the "data: " prefix from event lines
		if strings.HasPrefix(line, "data: ") {
			line = strings.TrimPrefix(line, "data: ")
		}
		if line == "[DONE]" {
			break // end-of-stream marker
		}
		if line == "" {
			continue // blank separator between events
		}

		// Decode the chunk; skip lines that are not valid JSON
		var chunk ResponseBody
		if err := json.Unmarshal([]byte(line), &chunk); err != nil {
			continue
		}

		// Forward each content fragment to the client as soon as it arrives
		for _, choice := range chunk.Choices {
			content := choice.Delta.Content
			if _, err := w.Write([]byte(content)); err != nil {
				// The client has most likely disconnected; the status line is
				// already sent, so http.Error can no longer help here.
				return
			}
			flusher.Flush() // push the buffered bytes to the client now
		}
	}

	if err := scanner.Err(); err != nil {
		// Streaming may already have started, so report the failure on the
		// server side instead of calling http.Error.
		fmt.Println("Scanner error:", err)
		return
	}
}

func main() {
	http.HandleFunc("/stream", StreamHandler)
	fmt.Println("Server started at :8080")
	if err := http.ListenAndServe(":8080", nil); err != nil {
		fmt.Println("Server error:", err)
	}
}

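Core flow

After receiving the user's input, send it to the target API as the content parameter.
Enable streaming mode by setting Stream: true.
Use http.Flusher to forward the content received from the remote API to the client piece by piece.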
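
The third step depends on decoding each line of the upstream stream before forwarding it. The following is a minimal sketch of that decoding step on its own, assuming the upstream sends lines of the form data: {...} followed by a final data: [DONE], which is the shape the handler above expects. The names parseLine and chunk are hypothetical and exist only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// chunk mirrors only the fields of a streamed response that we need.
type chunk struct {
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
}

// parseLine extracts the text fragment from one line of the stream.
// done is true when the line is the "[DONE]" end marker.
// Lines that are not "data: " lines or not valid JSON yield an empty fragment.
func parseLine(line string) (text string, done bool) {
	if !strings.HasPrefix(line, "data: ") {
		return "", false // blank separators and other fields are ignored
	}
	payload := strings.TrimPrefix(line, "data: ")
	if payload == "[DONE]" {
		return "", true
	}
	var c chunk
	if err := json.Unmarshal([]byte(payload), &c); err != nil {
		return "", false
	}
	var sb strings.Builder
	for _, ch := range c.Choices {
		sb.WriteString(ch.Delta.Content)
	}
	return sb.String(), false
}

func main() {
	// Hand-written sample lines, not captured from a real API call.
	lines := []string{
		`data: {"choices":[{"delta":{"content":"Hel"}}]}`,
		``,
		`data: {"choices":[{"delta":{"content":"lo"}}]}`,
		``,
		`data: [DONE]`,
	}
	for _, l := range lines {
		text, done := parseLine(l)
		if done {
			break
		}
		fmt.Print(text)
	}
	fmt.Println() // prints "Hello"
}
```

Keeping the decoding in a small pure function makes it easy to test without a network connection.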

Key points

Streaming response headers:

w.Header().Set("Content-Type", "text/event-stream; charset=utf-8")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Connection", "keep-alive")

Real-time output: after writing content with w.Write, call flusher.Flush() so that the data is sent immediately.
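
The write-then-flush pattern can be tried in isolation, without any upstream API. The sketch below is an illustration under assumptions: chunkHandler and record are hypothetical helpers, and httptest.ResponseRecorder stands in for a real connection so the example runs without opening a port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// chunkHandler returns a handler that writes each chunk and flushes right
// away, pausing for delay between chunks to imitate slow generation.
func chunkHandler(chunks []string, delay time.Duration) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "Streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/plain; charset=utf-8")
		for _, c := range chunks {
			if _, err := fmt.Fprint(w, c); err != nil {
				return // the client went away
			}
			flusher.Flush() // send this chunk now instead of buffering it
			time.Sleep(delay)
		}
	}
}

// record runs a handler against an in-memory recorder and reports the body
// that was written and whether Flush was called at least once.
func record(h http.Handler) (body string, flushed bool) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/stream", nil))
	return rec.Body.String(), rec.Flushed
}

func main() {
	body, flushed := record(chunkHandler([]string{"Hel", "lo ", "world"}, 0))
	fmt.Printf("body=%q flushed=%v\n", body, flushed) // body="Hello world" flushed=true
}
```

To watch the chunks arrive one by one, register chunkHandler on a real server with a delay such as 500*time.Millisecond.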

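After starting the server, open a URL like the following in a browser:

http://localhost:8080/stream?content=Hello%20world

The client receives the content piece by piece, much like text being printed to a terminal in real time.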
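
A Go client can consume the stream in the same incremental way. The sketch below is one possible approach, not the only one: readStream and demo are hypothetical names, and demo starts a throwaway httptest server so the example runs on its own instead of depending on the /stream endpoint.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// readStream calls onChunk for every piece of the body as soon as it is read.
// A multi-byte UTF-8 character may be split across two reads; a real client
// would buffer incomplete characters before displaying them.
func readStream(body io.Reader, onChunk func(string)) error {
	buf := make([]byte, 1024)
	for {
		n, err := body.Read(buf)
		if n > 0 {
			onChunk(string(buf[:n]))
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// demo streams three chunks from a local test server, prints them as they
// arrive, and returns everything that was received.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, c := range []string{"Hello", " ", "world"} {
			fmt.Fprint(w, c)
			if f, ok := w.(http.Flusher); ok {
				f.Flush()
			}
		}
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	var sb strings.Builder
	err = readStream(resp.Body, func(chunk string) {
		fmt.Print(chunk) // show each piece immediately
		sb.WriteString(chunk)
	})
	return sb.String(), err
}

func main() {
	if _, err := demo(); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println()
}
```

From the command line, curl -N "http://localhost:8080/stream?content=Hello%20world" gives a similar view, since -N turns off curl's output buffering.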

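1. Streaming responses in HTTP

Streaming output relies on an HTTP connection that stays open while the server sends data to the client piece by piece. A typical streaming response sets the following HTTP headers:

Content-Type: text/event-stream
Marks the response as an event stream, used to send a continuous series of data fragments to the client.
Cache-Control: no-cache
Prevents the response from being cached, so the client always receives live content.
Connection: keep-alive
Keeps the connection alive so that it can carry multiple transmissions. HTTP/1.1 connections are persistent by default, so this header mainly matters for older clients and proxies.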
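
Note that text/event-stream also implies a body format: in Server-Sent Events, each message consists of one or more data: lines followed by a blank line. The handler earlier in this article writes raw text fragments, which is fine for curl or a custom client, but a browser EventSource expects the framed form. A minimal sketch of that framing, where formatSSE is a hypothetical helper and not part of the original code:

```go
package main

import (
	"fmt"
	"strings"
)

// formatSSE frames a payload as a single Server-Sent Events message.
// Each line of the payload becomes its own "data: " line, and an empty
// line terminates the message.
func formatSSE(payload string) string {
	var sb strings.Builder
	for _, line := range strings.Split(payload, "\n") {
		sb.WriteString("data: ")
		sb.WriteString(line)
		sb.WriteString("\n")
	}
	sb.WriteString("\n")
	return sb.String()
}

func main() {
	fmt.Printf("%q\n", formatSSE("Hello"))              // "data: Hello\n\n"
	fmt.Printf("%q\n", formatSSE("line one\nline two")) // "data: line one\ndata: line two\n\n"
}
```

In the handler, this would mean writing formatSSE(content) instead of the raw content before calling flusher.Flush().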

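2. How streaming output works

The client sends a request, and the server starts responding as soon as it receives it.
Instead of producing the complete response in one go, the server sends the generated data segment by segment.
The client processes each segment as soon as it arrives, without waiting for the full response to finish.
Once all data has been sent, the server can either close the connection or keep it open to send more data later.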
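
One way to observe this from Go: when a handler flushes before it has finished and sets no Content-Length, net/http sends the HTTP/1.1 response with chunked transfer encoding, so the client sees an unknown length up front. The sketch below checks this against a throwaway httptest server; probe is a hypothetical helper written for this illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// probe starts a local test server whose handler flushes between two writes,
// then reports how the response was framed on the wire.
func probe() (contentLength int64, encodings []string, body string, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "part one, ")
		if f, ok := w.(http.Flusher); ok {
			f.Flush() // headers and the first piece leave the server here
		}
		fmt.Fprint(w, "part two")
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		return 0, nil, "", err
	}
	defer resp.Body.Close()

	data, err := io.ReadAll(resp.Body)
	return resp.ContentLength, resp.TransferEncoding, string(data), err
}

func main() {
	n, enc, body, err := probe()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// A ContentLength of -1 means the length was not known when the headers were sent.
	fmt.Printf("ContentLength=%d TransferEncoding=%v body=%q\n", n, enc, body)
}
```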

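Common use cases for streaming output

Real-time chat: data can be sent as a chat model generates its reply word by word or sentence by sentence.
Log monitoring: live server logs are pushed to the front end line by line.
Streaming file transfer: for example, large files or video streams.
Real-time progress updates: for example, updating a task progress bar.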

