
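[Image Recognition and Classification] Batch-recognize watermark text in images, classify the images by content, and move them into separate folders: a solution based on Python and the Tencent Cloud OCR API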

article 2025/2/5 5:40:00

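News organizations receive and shoot large numbers of news photos every day. These photos usually carry watermark text such as the shooting time, the location, and the event type. To make editing and archiving easier, the photos need to be classified and managed.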

Specific applications:

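Classification rules: classify images by the time (year, month, day), the location (city, district), and the event keywords found in the watermark text, such as "政治会议" (political conference), "体育赛事" (sports event), or "自然灾害" (natural disaster). For example, an image whose watermark reads "2024年3月15日北京政治会议" is filed under the folder "2024年3月 - 北京 - 政治会议".
Real-time classification and backup: as images are imported into the storage system, an image recognition program automatically extracts the watermark text, classifies each image, moves it to the corresponding folder, and backs it up. This keeps the news archive classified accurately and promptly, so that editors can search and use it quickly.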
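The classification rule described above can be sketched as a small parsing helper. This is only a minimal sketch under stated assumptions: the function name `category_from_watermark`, the `CITIES` and `EVENT_KEYWORDS` lists, and the "unknown"/"other" fallbacks are hypothetical, and the watermark is assumed to contain a date written as `YYYY年M月D日`.

```python
import re

# Hypothetical lookup lists; a real deployment would maintain its own.
CITIES = ["北京", "上海", "广州", "深圳"]
EVENT_KEYWORDS = ["政治会议", "体育赛事", "自然灾害"]

# Year and month are required, the day is optional. Whitespace is tolerated
# because OCR output is often split into separate fragments.
DATE_PATTERN = re.compile(r"(\d{4})\s*年\s*(\d{1,2})\s*月(?:\s*(\d{1,2})\s*日)?")

def category_from_watermark(text):
    """Build a 'year-month - city - event' folder name from watermark text.

    Returns None when no date is found, so the caller can decide what to do
    with images that cannot be classified.
    """
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    year, month = match.group(1), int(match.group(2))
    # Fall back to placeholder names when no known city or keyword appears.
    city = next((c for c in CITIES if c in text), "unknown")
    event = next((k for k in EVENT_KEYWORDS if k in text), "other")
    return f"{year}年{month}月 - {city} - {event}"

print(category_from_watermark("2024年3月15日北京政治会议"))
# -> 2024年3月 - 北京 - 政治会议
```

Matching against fixed lists keeps the sketch predictable; free-form place names or event types would need a more flexible approach.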

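The following is a solution, based on Python and the Tencent Cloud OCR API, that batch-recognizes watermark text in images, classifies the images by content, and moves them into separate folders.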

Preparation

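Register a Tencent Cloud account: go to the Tencent Cloud website, register an account, and complete real-name verification.
Enable the OCR service: enable the General OCR (通用文字识别) service in the Tencent Cloud console.
Obtain API keys: create and retrieve a SecretId and SecretKey in the Tencent Cloud access management console.
Install dependencies: use pip to install the required libraries, including tencentcloud-sdk-python and Pillow.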

bash

pip install tencentcloud-sdk-python Pillow
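The SecretId and SecretKey obtained above are sensitive, so it is usually safer not to write them directly into a script. One common alternative is to read them from environment variables. A minimal sketch follows; the helper name `load_credentials` is hypothetical, and the variable names `TENCENTCLOUD_SECRET_ID` and `TENCENTCLOUD_SECRET_KEY` are assumed here as a naming convention rather than taken from the article.

```python
import os

def load_credentials(env=None):
    """Return (secret_id, secret_key) read from environment variables.

    The variable names are an assumption of this sketch. `env` can be any
    mapping, which makes the function easy to test without touching the
    real environment.
    """
    env = os.environ if env is None else env
    secret_id = env.get("TENCENTCLOUD_SECRET_ID", "").strip()
    secret_key = env.get("TENCENTCLOUD_SECRET_KEY", "").strip()
    if not secret_id or not secret_key:
        raise RuntimeError(
            "Set TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY first"
        )
    return secret_id, secret_key
```

The returned pair can then be used wherever the script expects the SecretId and SecretKey values.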

Code implementation

python

import base64
import json
import os
import shutil
from tencentcloud.common import credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.ocr.v20181119 import ocr_client, models
from PIL import Image

# Tencent Cloud API credentials
SECRET_ID = 'your_secret_id'
SECRET_KEY = 'your_secret_key'

def recognize_text(image_path):
    """
    Recognize the text in an image with Tencent Cloud OCR
    :param image_path: path to the image file
    :return: the recognized text as a single string
    """
    cred = credential.Credential(SECRET_ID, SECRET_KEY)
    httpProfile = HttpProfile()
    httpProfile.endpoint = "ocr.tencentcloudapi.com"

    clientProfile = ClientProfile()
    clientProfile.httpProfile = httpProfile
    client = ocr_client.OcrClient(cred, "ap-guangzhou", clientProfile)

    req = models.GeneralBasicOCRRequest()
    with open(image_path, 'rb') as f:
        # Send the image as a Base64-encoded string (bytes have no .encode in Python 3)
        image_base64 = base64.b64encode(f.read()).decode('utf-8')
    params = {
        "ImageBase64": image_base64
    }
    req.from_json_string(json.dumps(params))

    resp = client.GeneralBasicOCR(req)
    result = json.loads(resp.to_json_string())
    # Join every detected text fragment into one space-separated string
    text_list = [item['DetectedText'] for item in result.get('TextDetections', [])]
    return ' '.join(text_list)

def classify_and_move_images(input_dir, output_dir):
    """
    Process images in batch: classify them by watermark text and move them to separate folders
    :param input_dir: path to the folder containing the input images
    :param output_dir: root path for the classified output folders
    """
    os.makedirs(output_dir, exist_ok=True)

    for filename in os.listdir(input_dir):
        if filename.lower().endswith(('.png', '.jpg', '.jpeg')):
            image_path = os.path.join(input_dir, filename)
            try:
                # Recognize the text in the image
                text = recognize_text(image_path)
                if text:
                    # Create a folder named after the recognized text.
                    # Note: raw OCR text may contain characters that are not valid in folder names.
                    category_folder = os.path.join(output_dir, text.strip())
                    os.makedirs(category_folder, exist_ok=True)
                    # Move the image into its category folder
                    shutil.move(image_path, os.path.join(category_folder, filename))
                    print(f"Moved {filename} to {category_folder}")
                else:
                    print(f"No text detected in {filename}")
            except Exception as e:
                print(f"Error processing {filename}: {e}")

if __name__ == "__main__":
    input_directory = 'your_input_directory'
    output_directory = 'your_output_directory'
    classify_and_move_images(input_directory, output_directory)

Code notes

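recognize_text function: calls the Tencent Cloud OCR API on a single image and returns the recognized text as a string.
classify_and_move_images function: iterates over all image files in the input folder, calls recognize_text on each one, creates a category folder based on the recognized text, and moves the image into that folder.
Main program: sets the input image folder and the root path of the output folders, then calls classify_and_move_images to run the batch job.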
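The script names each folder after the raw OCR text, and the backup step mentioned in the application scenario is not shown. A minimal sketch of both points follows. `safe_folder_name` and `backup_and_move` are hypothetical helpers, not part of the original script, and the set of replaced characters and the 80-character limit are assumptions of this sketch.

```python
import os
import re
import shutil

# Characters that are not allowed in Windows file names, plus control characters.
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|\x00-\x1f]')

def safe_folder_name(text, max_length=80):
    """Turn OCR text into a string that is safe to use as a folder name."""
    name = _INVALID_CHARS.sub("_", text)[:max_length].strip(" .")
    return name or "unnamed"

def backup_and_move(image_path, category, output_dir, backup_dir):
    """Copy the image into backup_dir, then move it into output_dir/category.

    Returns the final path of the moved image.
    """
    os.makedirs(backup_dir, exist_ok=True)
    # copy2 also preserves file metadata such as timestamps where possible
    shutil.copy2(image_path, backup_dir)

    target_dir = os.path.join(output_dir, safe_folder_name(category))
    os.makedirs(target_dir, exist_ok=True)
    target_path = os.path.join(target_dir, os.path.basename(image_path))
    return shutil.move(image_path, target_path)
```

Taking the backup before the move means a failed move still leaves a copy of the image. Note that this sketch overwrites an existing backup file with the same name.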