Mean Filling with a Flask Backend
Flask can run inside a Jupyter notebook. First, install the two required libraries:
!pip install Flask-CORS
!pip install Flask
Import the dependencies:
import io
import os
import pandas as pd
from flask import Flask, request, jsonify, send_file
from flask_cors import CORS
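Handle cross-origin requests (CORS):
# Create the Flask application
app = Flask(__name__)
# Expose the Content-Disposition header so the frontend can read the download filename
CORS(app, expose_headers=['Content-Disposition'])
# Allow requests from any origin to the whole application
@app.after_request
def after_request(response):
    # Item assignment (rather than headers.add) keeps each header single-valued;
    # browsers reject a response that carries Access-Control-Allow-Origin twice.
    response.headers['Access-Control-Allow-Origin'] = '*'
    response.headers['Access-Control-Allow-Headers'] = 'Content-Type,Authorization'
    response.headers['Access-Control-Allow-Methods'] = 'GET,PUT,POST,DELETE'
    # Repeated here so the header is present even when this hook,
    # rather than Flask-CORS, ends up supplying the CORS headers.
    response.headers['Access-Control-Expose-Headers'] = 'Content-Disposition'
    return response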
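A quick way to check a hook like this without starting a server is Flask's built-in test client. The sketch below is only an illustration: `demo_app` and the `/ping` route are hypothetical names that are not part of the tutorial's application, and the hook uses item assignment so that each header is set exactly once.

```python
from flask import Flask

# Hypothetical throwaway app, separate from the tutorial's `app`
demo_app = Flask(__name__)

@demo_app.after_request
def add_cors_headers(response):
    # Item assignment overwrites any existing value instead of appending a second one
    response.headers['Access-Control-Allow-Origin'] = '*'
    response.headers['Access-Control-Allow-Headers'] = 'Content-Type,Authorization'
    response.headers['Access-Control-Allow-Methods'] = 'GET,PUT,POST,DELETE'
    return response

@demo_app.route('/ping')  # hypothetical route used only for this check
def ping():
    return 'pong'

# The test client issues requests in-process, so no port needs to be open
client = demo_app.test_client()
resp = client.get('/ping', headers={'Origin': 'http://example.com'})

# Collect the CORS-related headers that actually reached the response
cors_headers = {k: v for k, v in resp.headers.items()
                if k.startswith('Access-Control-')}
print(cors_headers)
```

If the printed dictionary lacks `Access-Control-Allow-Origin`, the hook was not registered on the app that served the request.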
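Define the file upload route:
from werkzeug.utils import secure_filename  # installed together with Flask
UPLOAD_DIR = 'uploads'
# Route for uploading a file
@app.route('/upload-csv', methods=['POST'])
def upload_csv():
    # .get() returns None instead of aborting the request when the 'file' part is missing
    file = request.files.get('file')
    # Strip path components from the client-supplied name; names made up
    # entirely of non-ASCII characters may come back empty and are rejected
    filename = secure_filename(file.filename or '') if file else ''
    if not filename:
        return jsonify(success=False, message='No file part or unusable filename'), 400
    # Save the file on the server, creating the target directory if needed
    os.makedirs(UPLOAD_DIR, exist_ok=True)
    file.save(os.path.join(UPLOAD_DIR, filename))
    return jsonify(success=True), 200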
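Define the mean-filling route:
@app.route('/process-csv', methods=['POST'])
def process_csv():
    # Get the uploaded file and the form fields
    file = request.files['file']
    column = request.form['column']
    min_value = float(request.form['minValue'])
    max_value = float(request.form['maxValue'])
    # Original filename without its extension
    original_filename = os.path.splitext(file.filename)[0]
    # Read the CSV file
    df = pd.read_csv(file)
    if column not in df.columns:
        return jsonify(success=False, message=f'Unknown column: {column}'), 400
    # Replace out-of-range values with the mean of the in-range values.
    # Missing values stay untouched, because NaN fails both comparisons.
    in_range = (df[column] >= min_value) & (df[column] <= max_value)
    mean_value = df.loc[in_range, column].mean()
    out_of_range = (df[column] < min_value) | (df[column] > max_value)
    df[column] = df[column].mask(out_of_range, mean_value)
    # Serialize the processed data to CSV bytes held in memory
    output = io.BytesIO(df.to_csv(index=False).encode('utf-8'))
    # New filename: original name + "_processed_MeanFilling.csv"
    new_filename = f"{original_filename}_processed_MeanFilling.csv"
    # Send the file back to the frontend
    return send_file(output, mimetype='text/csv', as_attachment=True, download_name=new_filename)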
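The filling step can be tried on its own, outside of Flask. The following is a minimal sketch of the same idea using pandas only; the helper name `fill_out_of_range_with_mean` and the sample values are made up for illustration.

```python
import pandas as pd

def fill_out_of_range_with_mean(series, min_value, max_value):
    """Replace values outside [min_value, max_value] with the mean of the in-range values."""
    # between() is inclusive on both ends, matching >= min_value and <= max_value
    in_range = series.between(min_value, max_value)
    mean_value = series[in_range].mean()
    # NaN fails both comparisons, so missing values are left as they are
    out_of_range = (series < min_value) | (series > max_value)
    return series.mask(out_of_range, mean_value)

# Hypothetical sample data: 100.0 and -5.0 fall outside the range [0, 10]
temps = pd.Series([2.0, 4.0, 100.0, 6.0, -5.0])
filled = fill_out_of_range_with_mean(temps, 0, 10)
print(filled.tolist())  # [2.0, 4.0, 4.0, 6.0, 4.0]
```

The mean is computed from the in-range values only (2, 4 and 6 here), so the outliers being replaced do not distort the fill value.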
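Run the Flask server:
# Run the Flask server
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
Flask is a lightweight web framework well suited to quickly building small applications. Its development server uses port 5000 by default; with host='0.0.0.0' as above it listens on all network interfaces, so it is reachable locally at http://localhost:5000.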