当前位置：首页 > article >正文

【Python办公自动化】—Excel中相同编号自动添加-1-2-3...

article 2025/3/14 16:34:14

专栏导读

在这里插入图片描述

🌸 欢迎来到Python办公自动化专栏—Python处理办公问题，解放您的双手

🏳️‍🌈 博客主页：请点击——> 一晌小贪欢的博客主页求关注

👍 该系列文章专栏：请点击——>Python办公自动化专栏求订阅

🕷 此外还有爬虫专栏：请点击——>Python爬虫基础专栏求订阅

📕 此外还有python基础专栏：请点击——>Python基础学习专栏求订阅

文章作者技术和水平有限，如果文中出现错误，希望大家能指正🙏

❤️ 欢迎各位佬关注！ ❤️

1、背景介绍

简单的说，就是在Excel中有一列h=会有重复的编号，我们相对这个重复的编号，再次进行编号，使其如有重复就加上-1，-2，-3。。。。。以此类推，将重新生成【新编号】放在其旁边列

2、库的安装

库	用途	安装
openpyxl	Excel读取	`pip install openpyxl -i https://pypi.tuna.tsinghua.edu.cn/simple/`
os	获取路径	内置库无需安装

3、核心代码

①：编号列的值进行计数
遍历Excel文件的每一行，对编号列的值进行计数。
根据计数值生成新的格式化字符串编号-计数值，并更新到结果编号列。
支持单个文件处理和批量文件处理。

for row_idx in range(2, ws.max_row + 1):
    # Get the 序号 value
    序号_value = ws.cell(row=row_idx, column=序号_index).value

    # Skip if 序号 is None or empty
    if 序号_value is None or str(序号_value).strip() == '':
        continue

    # Convert to string to handle any numeric values
    序号_str = str(序号_value)

    # Initialize count if this is the first occurrence
    if 序号_str not in 序号_counts:
        序号_counts[序号_str] = 0

    # Increment the count
    序号_counts[序号_str] += 1

    # Create the new format: "序号-count"
    new_子账号编号 = f"{序号_str}-{序号_counts[序号_str]}"

    # Update the 子账号编号 cell
    ws.cell(row=row_idx, column=子账号编号_index).value = new_子账号编号

4、完整代码

import os
import openpyxl


def process_subaccount_numbers(file_path, output_path=None):
    # Load the workbook
    print(f"Reading file: {file_path}")
    wb = openpyxl.load_workbook(file_path)
    ws = wb.active

    # Get the headers
    headers = [cell.value for cell in ws[1]]

    # Find the indices of the required columns
    if '编号' not in headers or '结果编号' not in headers:
        print("Error: Required columns '序号' or '结果编号' not found in the Excel file.")
        return

    序号_index = headers.index('编号') + 1  # +1 because openpyxl is 1-indexed
    子账号编号_index = headers.index('结果编号') + 1

    # Create a dictionary to track occurrences of each 序号
    序号_counts = {}

    # Process each row (starting from row 2, skipping the header)
    for row_idx in range(2, ws.max_row + 1):
        # Get the 序号 value
        序号_value = ws.cell(row=row_idx, column=序号_index).value

        # Skip if 序号 is None or empty
        if 序号_value is None or str(序号_value).strip() == '':
            continue

        # Convert to string to handle any numeric values
        序号_str = str(序号_value)

        # Initialize count if this is the first occurrence
        if 序号_str not in 序号_counts:
            序号_counts[序号_str] = 0

        # Increment the count
        序号_counts[序号_str] += 1

        # Create the new format: "序号-count"
        new_子账号编号 = f"{序号_str}-{序号_counts[序号_str]}"

        # Update the 子账号编号 cell
        ws.cell(row=row_idx, column=子账号编号_index).value = new_子账号编号

    # Determine the output path
    if output_path is None:
        file_name, file_ext = os.path.splitext(file_path)
        output_path = f"{file_name}_processed{file_ext}"

    # Save the modified workbook to a new Excel file
    print(f"Saving processed file to: {output_path}")
    wb.save(output_path)
    print("Processing completed successfully!")

    return output_path


def main():
    # Get the data source directory
    data_dir = './数据源/'

    # Find all Excel files in the directory
    excel_files = [f for f in os.listdir(data_dir) if f.endswith('.xlsx') or f.endswith('.xls')]

    if not excel_files:
        print("No Excel files found in the data source directory.")
        return

    # Process each Excel file
    for file_name in excel_files:
        file_path = os.path.join(data_dir, file_name)
        process_subaccount_numbers(file_path)


if __name__ == "__main__":
    main()