
Computer Vision: An Introductory Artificial Intelligence (AI) Tutorial, Part 1

article 2024/12/26 12:18:36

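Computer vision is a major branch of artificial intelligence (AI) that aims to let computers interpret and process visual information much as humans do. Computer vision algorithms typically cover tasks such as image processing, feature extraction, object detection, image segmentation, image classification, and pose estimation. The sections below introduce some common computer vision algorithms, with brief notes on how they work and where they are used.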

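1. Edge Detection

Edge detection is one of the most basic operations in computer vision. It locates edges by finding the places where pixel intensity changes sharply, and it often serves as a building block for tasks such as object detection and image segmentation.

Sobel operator:

The Sobel operator filters the image with convolution kernels to estimate each pixel's gradient in the horizontal and vertical directions; edges are where the gradient magnitude is large.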

    import cv2
    import numpy as np
    import matplotlib.pyplot as plt

    # Load the image as grayscale
    img = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

    # Horizontal and vertical gradients with the Sobel operator
    sobel_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
    sobel_y = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

    # The gradient magnitude is the edge image
    sobel_edges = cv2.magnitude(sobel_x, sobel_y)

    plt.imshow(sobel_edges, cmap='gray')
    plt.title("Sobel Edge Detection")
    plt.show()

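To make the filtering step concrete, here is a minimal dependency-free sketch of what the Sobel operator computes at each interior pixel. It is an illustration on a toy image, not how OpenCV implements it: the helper names (`apply_kernel`, `sobel_magnitude`) and the 5x5 test image are made up for this example, and border pixels are simply left at 0.

```python
import math

# Standard 3x3 Sobel kernels for the horizontal (x) and vertical (y) gradients.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def apply_kernel(image, kernel, row, col):
    """Weighted sum of the 3x3 neighbourhood centred on (row, col)."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            total += kernel[dr + 1][dc + 1] * image[row + dr][col + dc]
    return total


def sobel_magnitude(image):
    """Gradient magnitude for interior pixels; border pixels stay at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = apply_kernel(image, SOBEL_X, r, c)
            gy = apply_kernel(image, SOBEL_Y, r, c)
            out[r][c] = math.hypot(gx, gy)
    return out


# Toy image (hypothetical): dark left part, bright right part, so a vertical edge.
toy_image = [[0, 0, 255, 255, 255] for _ in range(5)]
toy_edges = sobel_magnitude(toy_image)
```

In this toy image the response is large only in the two columns next to the dark-to-bright transition and zero inside the uniform bright area, which is the behaviour the edge image relies on.
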
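Canny edge detection:

Canny is a multi-stage algorithm (Gaussian smoothing, gradient computation, non-maximum suppression, and hysteresis thresholding with a low and a high threshold) that yields thinner, cleaner edges than a plain gradient filter.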
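
    # Canny edge detection (reuses img and the imports from the Sobel example);
    # 100 and 200 are the low and high hysteresis thresholds
    edges = cv2.Canny(img, 100, 200)

    plt.imshow(edges, cmap='gray')
    plt.title("Canny Edge Detection")
    plt.show()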
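The two numbers passed to `cv2.Canny` are the thresholds of its final stage, hysteresis thresholding. The sketch below illustrates that stage only, under simplifying assumptions: pixels at or above the high threshold are kept, and weaker pixels at or above the low threshold are kept only if they are 8-connected to a kept pixel. The function name `hysteresis` and the toy gradient values are hypothetical, and OpenCV's exact comparison rules may differ.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Keep strong pixels plus weak pixels that connect to a strong one."""
    rows, cols = len(magnitude), len(magnitude[0])
    keep = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Seed the search with every strong pixel.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                keep[r][c] = True
                queue.append((r, c))

    # Grow outwards into neighbouring weak pixels (breadth-first search).
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not keep[nr][nc]
                        and magnitude[nr][nc] >= low):
                    keep[nr][nc] = True
                    queue.append((nr, nc))
    return keep


# Toy gradient magnitudes (made up): one strong pixel followed by a weak chain,
# plus an isolated weak pixel at the far right.
toy_magnitude = [
    [250, 150, 120, 10, 150],
    [10, 10, 10, 10, 10],
]
toy_kept = hysteresis(toy_magnitude, low=100, high=200)
```

The isolated weak pixel in the last column is dropped even though its value matches a kept weak pixel, which is how hysteresis suppresses noise without breaking real edges.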

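2. Feature Detection and Matching

Feature detection and matching finds keypoints in an image (such as corners and distinctive texture patches) and matches them across images. It is useful for tasks such as image stitching and 3D reconstruction.

Harris corner detection:

Harris corner detection builds a matrix from the image gradients in a small window around each pixel (the structure tensor) and flags pixels where the intensity changes strongly in every direction as corners.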
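
    # Harris corner detection
    img = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

    # blockSize=2, Sobel aperture ksize=3, Harris parameter k=0.04
    harris_corners = cv2.cornerHarris(np.float32(img), 2, 3, 0.04)

    # Mark strong corners in white on a copy so that img stays unchanged
    marked = img.copy()
    marked[harris_corners > 0.01 * harris_corners.max()] = 255

    plt.imshow(marked, cmap='gray')
    plt.title("Harris Corner Detection")
    plt.show()
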
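The corner score itself can be sketched in a few lines. The example below computes the Harris response R = det(M) - k * trace(M)^2 at a single pixel, where M sums the gradient products over a window. It is a simplified illustration, not OpenCV's implementation: it assumes a fixed 3x3 window and central differences instead of Sobel filters, and the function name and the 9x9 toy image are hypothetical.

```python
def harris_response(image, row, col, k=0.04):
    """Harris score at (row, col) using a 3x3 window and central differences."""
    sxx = syy = sxy = 0.0
    for r in range(row - 1, row + 2):
        for c in range(col - 1, col + 2):
            ix = (image[r][c + 1] - image[r][c - 1]) / 2.0  # horizontal gradient
            iy = (image[r + 1][c] - image[r - 1][c]) / 2.0  # vertical gradient
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace


# Toy 9x9 image (made up): a bright square filling the lower-right quadrant.
toy_image = [[1 if (r >= 4 and c >= 4) else 0 for c in range(9)]
             for r in range(9)]

corner_score = harris_response(toy_image, 4, 4)  # at the square's corner
edge_score = harris_response(toy_image, 4, 6)    # on the square's top edge
flat_score = harris_response(toy_image, 2, 2)    # in the uniform background
```

The sign pattern is what matters: the score is positive at the corner, negative along the edge, and zero in the flat region, which is why thresholding the response (as the `0.01 * max` test does) isolates corners.
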
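SIFT (Scale-Invariant Feature Transform):

SIFT is a classic feature extraction method that detects keypoints that remain stable across scales and describes each one with a descriptor vector; it is used for image matching, stitching, and similar tasks.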

    # SIFT feature detection (cv2.SIFT_create is in the main module since OpenCV 4.4)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(img, None)

    # Draw the detected keypoints on the image
    img_sift = cv2.drawKeypoints(img, keypoints, None)

    plt.imshow(img_sift)
    plt.title("SIFT Feature Detection")
    plt.show()

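Detection is only half of this section; the descriptors are then matched between images. A common way to filter matches is a ratio test: accept a match only if the nearest descriptor is clearly closer than the second nearest. The sketch below shows that idea on toy 3-dimensional descriptors. The function names, the toy data, and the 0.75 ratio are assumptions for illustration; with OpenCV one would typically obtain the two nearest neighbours from `cv2.BFMatcher().knnMatch(des1, des2, k=2)` instead of the brute-force loop here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ratio_test_matches(query, train, ratio=0.75):
    """Return (query_index, train_index) pairs that pass the ratio test."""
    matches = []
    for qi, q in enumerate(query):
        # Rank all train descriptors by distance to this query descriptor.
        ranked = sorted((euclidean(q, t), ti) for ti, t in enumerate(train))
        if len(ranked) < 2:
            continue
        (best_dist, best_idx), (second_dist, _) = ranked[0], ranked[1]
        # Keep the match only if the best candidate is clearly better.
        if best_dist < ratio * second_dist:
            matches.append((qi, best_idx))
    return matches


# Toy descriptors (made up). Query 0 has one obvious partner;
# query 1 sits between two candidates and is ambiguous.
toy_query = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]]
toy_train = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_matches = ratio_test_matches(toy_query, toy_train)
```

Only the unambiguous pair survives; the second query descriptor is discarded because its two best candidates are almost equally close.
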
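SURF (Speeded-Up Robust Features):

SURF is a faster alternative to SIFT that speeds up computation by using integral images. SURF lives in the non-free part of opencv-contrib, so cv2.xfeatures2d.SURF_create() is only available in OpenCV builds compiled with the non-free modules enabled.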

    # SURF feature detection (needs an opencv-contrib build with non-free modules enabled)
    surf = cv2.xfeatures2d.SURF_create()
    keypoints, descriptors = surf.detectAndCompute(img, None)

    # Draw the detected keypoints on the image
    img_surf = cv2.drawKeypoints(img, keypoints, None)

    plt.imshow(img_surf)
    plt.title("SURF Feature Detection")
    plt.show()
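The integral image mentioned above is what makes the speed-up possible: once it is built, the sum of any rectangular region costs four lookups, regardless of the rectangle's size. Here is a minimal sketch of the data structure; the function names and the 3x3 toy image are made up for the example, and this is not SURF's actual code.

```python
def integral_image(image):
    """Return a (rows+1) x (cols+1) summed-area table with a zero border."""
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += image[r][c]
            # Sum of everything above and to the left, inclusive.
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def box_sum(table, top, left, bottom, right):
    """Sum of image rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


toy_image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
toy_table = integral_image(toy_image)
toy_total = box_sum(toy_table, 0, 0, 3, 3)  # the whole image
toy_patch = box_sum(toy_table, 1, 1, 3, 3)  # the lower-right 2x2 block
```

Box filters evaluated this way take the same time at every scale, which is the property SURF exploits.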

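3. Object Detection

Object detection is a core computer vision task: it identifies the different objects in an image and locates each one, usually with a bounding box. Popular object detection algorithms include:

Haar cascades:

A Haar cascade is a cascade of classifiers trained on Haar-like features. It is used to detect objects and is best known for face detection.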

    # Load the pretrained frontal-face Haar cascade that ships with OpenCV
    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

    # Read the image and convert it to grayscale
    img = cv2.imread('face.jpg')
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Detect faces
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)

    # Draw a bounding box around each face
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)

    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.title("Face Detection using Haar Cascade")
    plt.show()

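YOLO (You Only Look Once):

YOLO is a family of deep learning object detection models that treats detection as a regression problem: a single pass of the network predicts bounding boxes and class scores, which makes it fast enough to detect multiple objects in real-time video streams.

Running YOLO usually requires downloading a pretrained model first. The example below loads a YOLO model with OpenCV and runs detection.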
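
    # Load the pretrained YOLOv3 model (weights and config must be downloaded first)
    net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
    output_layers = net.getUnconnectedOutLayersNames()

    # Load the image and build a 416x416 input blob (scale 1/255, BGR -> RGB)
    img = cv2.imread('image.jpg')
    height, width = img.shape[:2]
    blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), (0, 0, 0), swapRB=True, crop=False)
    net.setInput(blob)
    outs = net.forward(output_layers)

    # Post-processing: collect confident detections as (x, y, w, h) boxes
    boxes, confidences = [], []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = float(scores[class_id])
            if confidence > 0.5:
                # YOLO outputs the box center and size, normalized to [0, 1]
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                x = int(detection[0] * width - w / 2)
                y = int(detection[1] * height - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(confidence)

    # Non-maximum suppression drops overlapping duplicates of the same object
    indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
    for i in np.array(indices).flatten():
        x, y, w, h = boxes[i]
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.title("YOLO Object Detection")
    plt.show()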
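Two details of YOLO post-processing are easy to get wrong: the network reports each box by its center and size rather than by its top-left corner, and several overlapping boxes usually fire for the same object. The sketch below shows the center-to-corner conversion and a plain greedy non-maximum suppression based on intersection over union (IoU). All function names, the toy boxes, and the 0.4 IoU threshold are assumptions for illustration; this is not OpenCV's implementation.

```python
def center_to_corners(cx, cy, w, h):
    """Convert (center_x, center_y, width, height) to (x1, y1, x2, y2)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.4):
    """Greedy NMS: visit boxes by descending score, drop heavy overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


# Toy detections (made up): two near-duplicates and one separate object.
toy_boxes = [
    center_to_corners(50, 50, 40, 40),
    center_to_corners(52, 52, 40, 40),
    center_to_corners(150, 150, 40, 40),
]
toy_scores = [0.9, 0.8, 0.7]
toy_kept = non_max_suppression(toy_boxes, toy_scores)
```

The second box overlaps the first almost completely and is suppressed, while the distant third box is kept.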

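4. Image Segmentation

Image segmentation partitions an image into regions so that each region corresponds to a distinct object or to the background. Common segmentation algorithms include:

K-means clustering:

K-means is a common unsupervised learning method that can be used for image segmentation. It segments the image by grouping the pixels into K clusters, here by color.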

    # K-means image segmentation
    img = cv2.imread('image.jpg')
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

    # One row per pixel, as float32 (required by cv2.kmeans)
    Z = np.float32(img.reshape((-1, 3)))

    # Stop after 100 iterations or once the centers move by less than 0.2
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2)
    k = 4
    ret, label, center = cv2.kmeans(Z, k, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

    # Replace every pixel with the color of its cluster center
    center = np.uint8(center)
    res = center[label.flatten()]
    segmented_img = res.reshape(img.shape)

    plt.imshow(segmented_img)
    plt.title("Image Segmentation using K-means")
    plt.show()

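The clustering loop behind `cv2.kmeans` alternates two steps: assign every pixel to its nearest center, then move each center to the mean of its pixels. The sketch below runs that loop on scalar gray values so the arithmetic is easy to follow. It is a simplified illustration: the function name, the fixed initial centers, and the toy pixel values are assumptions, whereas the OpenCV call above works on 3-channel colors with random initialization.

```python
def kmeans_1d(values, centers, iterations=10):
    """Lloyd's algorithm on scalar values with caller-supplied initial centers."""
    centers = list(centers)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: index of the nearest center for every value.
        labels = [min(range(len(centers)), key=lambda j: abs(v - centers[j]))
                  for v in values]
        # Update step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = sum(members) / len(members)
    return labels, centers


# Toy gray values (made up): three dark pixels and three bright pixels.
toy_pixels = [10, 12, 11, 200, 205, 198]
toy_labels, toy_centers = kmeans_1d(toy_pixels, centers=[0.0, 255.0])

# Segmentation result: every pixel replaced by its cluster center.
toy_segmented = [toy_centers[lab] for lab in toy_labels]
```

Replacing each pixel by its center, as in the last line, is the same remapping that `center[label.flatten()]` performs in the OpenCV example.
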
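GrabCut:

GrabCut is a segmentation algorithm based on graph cuts that extracts the foreground from an image, starting from a rough user-supplied rectangle around the object.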

    # GrabCut image segmentation
    img = cv2.imread('image.jpg')
    mask = np.zeros(img.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)

    # Initial rectangle (x, y, width, height): a 50-pixel margin on every side
    rect = (50, 50, img.shape[1] - 100, img.shape[0] - 100)
    cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

    # Labels 0 and 2 are (probable) background; 1 and 3 are (probable) foreground
    mask2 = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')
    result = img * mask2[:, :, np.newaxis]

    # OpenCV images are BGR; convert to RGB for matplotlib
    plt.imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
    plt.title("Image Segmentation using GrabCut")
    plt.show()
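The mask that `cv2.grabCut` fills in uses four labels rather than two, which is why the example tests for both 0 and 2. The sketch below spells out that mapping on a tiny hand-written mask. The helper names and the toy arrays are hypothetical; the label values mirror the `cv2.GC_BGD`, `cv2.GC_FGD`, `cv2.GC_PR_BGD`, and `cv2.GC_PR_FGD` constants.

```python
# Same numeric values as the cv2.GC_* constants.
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3


def foreground_mask(mask):
    """Map the four GrabCut labels to a binary mask (1 = keep, 0 = drop)."""
    return [[1 if label in (GC_FGD, GC_PR_FGD) else 0 for label in row]
            for row in mask]


def apply_mask(image, binary_mask):
    """Zero out every pixel whose mask value is 0 (single-channel image)."""
    return [[pixel * keep for pixel, keep in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, binary_mask)]


# Toy GrabCut output and toy grayscale image (both made up).
toy_labels = [[0, 2, 3],
              [2, 3, 1]]
toy_image = [[10, 20, 30],
             [40, 50, 60]]
toy_binary = foreground_mask(toy_labels)
toy_result = apply_mask(toy_image, toy_binary)
```

Pixels labelled as definite or probable foreground keep their value; everything else becomes 0, which shows up as black in the displayed result.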

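5. Deep Learning Methods

Deep learning is used very widely in computer vision, especially for image classification, object detection, and image segmentation. Common deep learning frameworks such as TensorFlow, Keras, and PyTorch are widely used to build complex convolutional neural network (CNN) models.

Pretrained models such as ResNet, VGG, and Inception have been highly successful across many vision tasks. They are strong feature extractors and can be fine-tuned through transfer learning to adapt quickly to a new task.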


    import numpy as np
    from tensorflow.keras.applications import ResNet50
    from tensorflow.keras.preprocessing import image
    from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions

    # Load ResNet50 with weights pretrained on ImageNet
    model = ResNet50(weights='imagenet')

    # Read the image and preprocess it into a batch of one 224x224 input
    img_path = 'image.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    img_array = image.img_to_array(img)
    img_array = np.expand_dims(img_array, axis=0)
    img_array = preprocess_input(img_array)

    # Run the prediction
    predictions = model.predict(img_array)

    # Decode the top-3 predictions into readable labels
    decoded_predictions = decode_predictions(predictions, top=3)[0]
    for i, (imagenet_id, label, score) in enumerate(decoded_predictions):
        print(f"{i+1}: {label} ({score:.2f})")
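
The last two steps of a classifier like this can be illustrated without any framework: a softmax turns the network's raw scores into probabilities, and decoding picks the most probable classes. The sketch below is a toy illustration only; the function names, class names, and scores are made up, and it is not how Keras implements `decode_predictions`.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    ranked = sorted(zip(labels, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


toy_labels = ["cat", "dog", "car", "tree"]  # hypothetical class names
toy_logits = [2.0, 1.0, 0.1, -1.0]          # made-up raw network outputs
toy_probs = softmax(toy_logits)
toy_top = top_k(toy_probs, toy_labels, k=3)

for rank, (label, prob) in enumerate(toy_top, start=1):
    print(f"{rank}: {label} ({prob:.2f})")
```

The printed format mirrors the loop in the ResNet50 example, with each line showing the rank, the label, and its probability.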