当前位置: 首页 > article >正文

基于三种机器学习方法的磁芯损耗预测模型

      大家好,我是带我去滑雪!

      磁性元件在电力转换和传输系统中扮演着关键角色,而磁芯损耗是影响其效率的重要指标。随着工业和能源需求的增长,尤其是可再生能源和智能电网的快速发展,对磁性元件的高效、紧凑、低功耗需求日益迫切。磁芯损耗与多种因素有关,包括工作频率、磁通密度、励磁波形、工作温度和材料特性,这些因素的非线性和相互关联性使得损耗预测变得复杂。通过对原始数据进行处理,包括剔除与磁通密度无关的变量,如温度、频率和磁芯损耗,以及检查数据中是否存在缺失值。在确保数据的完整性和相关性之后,采用时域和频域分析方法,结合统计分析、数值计算和信号处理技术,从数据中提取了关键特征,包括最大值、最小值、均值、标准差、偏度、峰度、总谐波失真、小波能级特征和波形因子等17个特征变量。

      汇总所有特征变量,并对磁芯材料类型、励磁波形数据等分类变量采用独立热编码将文本转化为数值。分别构建BP神经网络、随机森林回归、轻量梯度提升三种磁芯损耗预测模型,并对比各个模型在验证集上的MSE、MSE、MAE和R方,选择最优的模型对测试集进行预测。研究结果表明,轻量梯度提升模型的预测效果最好。此外为了探究模型的泛化能力,采用k-折交叉验证方法,对模型进行评估。最后,根据模型评估结果给出了相应建议。本文的建模流程如下:

    下面开始代码实战: 

目录

(一)特征提取

(二)模块导入与数据展示

(三)划分训练集和验证集

 (四)绘制变量热力图

(五)三种机器学习模型建立

(六)交叉验证


(一)特征提取

import pandas as pd
import numpy as np
import scipy.fftpack as fftpack
import pywt
file_path = r"E:\工作\硕士\问题四\附件三测试集特征提取.xlsx"
df = pd.read_excel(file_path, header=None)
flux_density = df.values.flatten()
segment_length = 1024
all_features = []
for i in range(0, len(flux_density), segment_length):
    segment = flux_density[i:i+segment_length]    
    if len(segment) < segment_length: 
        break
    num_samples = len(segment)
    t = np.linspace(0, 1, num_samples)
    max_value = np.max(segment)
    min_value = np.min(segment)
    mean_value = np.mean(segment)
    std_deviation = np.std(segment)
    skewness = np.mean((segment - mean_value)**3) / (std_deviation**3)
    kurtosis = np.mean((segment - mean_value)**4) / (std_deviation**4)
    fft_values = fftpack.fft(segment)
    fft_freqs = fftpack.fftfreq(num_samples, d=t[1] - t[0])
    fft_magnitude = np.abs(fft_values)
    peak_to_peak = max_value - min_value
    dominant_frequency = fft_freqs[np.argmax(fft_magnitude[:len(fft_magnitude)//2])]
    harmonics = np.sqrt(np.sum(fft_magnitude[2:]**2))
    fundamental = fft_magnitude[1] 
    THD = harmonics / fundamental if fundamental != 0 else 0
    time_points = np.arange(len(segment))
    slope = np.polyfit(time_points, segment, 1)[0]
    zero_crossings = np.sum(np.diff(np.sign(segment)) != 0)
    rms_value = np.sqrt(np.mean(segment ** 2))
    mean_abs_value = np.mean(np.abs(segment))
    form_factor = rms_value / mean_abs_value if mean_abs_value != 0 else 0
    wavelet = 'db4'
    coeffs = pywt.wavedec(segment, wavelet, level=4)
    wavelet_coeffs_energy = [np.sum(c**2) for c in coeffs]
    features = {
        '最大值': max_value,
        '最小值': min_value,
        '峰峰值': peak_to_peak,
        '均值': mean_value,
        '标准差': std_deviation,
        '偏度': skewness,
        '峰度': kurtosis,
        '总谐波失真': THD,
        '小波能级1': wavelet_coeffs_energy[0],
        '小波能级2': wavelet_coeffs_energy[1],
        '小波能级3': wavelet_coeffs_energy[2],
        '小波能级4': wavelet_coeffs_energy[3],
        '小波能级5': wavelet_coeffs_energy[4],
        '斜率': slope,
        '过零率': zero_crossings,
        '波形因子 ': form_factor    
    }
    all_features.append(features)
features_df = pd.DataFrame(all_features)
output_file_path = r"E:\工作\硕士\问题四\附件三测试集特征提取结果.xlsx"
features_df.to_excel(output_file_path, index=False)
print(f"特征已提取并保存至 {output_file_path}")

(二)模块导入与数据展示

import os
import math
import time
import datetime
import random as rn
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams ['font.sans-serif'] ='SimHei' 
plt.rcParams ['axes.unicode_minus']=False
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler,StandardScaler
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error,r2_score
import tensorflow as tf
import keras
from keras.layers import Layer
import keras.backend as K
from keras.models import Model, Sequential
from keras.layers import GRU, Dense,Conv1D, MaxPooling1D,GlobalMaxPooling1D,Embedding,Dropout,Flatten,SimpleRNN,LSTM,Bidirectional
import tensorflow as tf
from keras.callbacks import EarlyStopping
from tensorflow.keras  import optimizers


data0 = pd.read_csv(
    r'E:\工作\硕士\科研竞赛\第四问模型训练数据.csv',
    encoding="ANSI",
    parse_dates=['序号']
).set_index('序号')[['温度', '频率', '最大值', '最小值', '峰峰值', '均值', '标准差', '偏度', '峰度', '总谐波失真', '小波能级1', '小波能级2', '小波能级3', '小波能级4', '小波能级5', '磁芯材料类型', '斜率', '过零率', '波形因子', '励磁波形', '磁芯损耗']].sort_index()
data0 = data0.astype(float)
data0.head()

      数据展示:

(三)划分训练集和验证集

from matplotlib.ticker import MaxNLocator
def build_sequences(text, window_size=24):
    x, y = [],[]
    for i in range(len(text) - window_size):
        sequence = text[i:i+window_size]
        target = text[i+window_size]
        x.append(sequence)
        y.append(target)
    return np.array(x), np.array(y)
def get_traintest(data,train_ratio=0.8,window_size=24):
    train_size=int(len(data)*train_ratio)
    train=data[:train_size]
    test=data[train_size-window_size:]
    X_train,y_train=build_sequences(train,window_size=window_size)
    X_test,y_test=build_sequences(test,window_size=window_size)
    return X_train,y_train[:,-1],X_test,y_test[:,-1]
data=data0.to_numpy()
scaler = MinMaxScaler() 
scaler = scaler.fit(data[:,:-1])
X=scaler.transform(data[:,:-1])   
y_scaler = MinMaxScaler() 
y_scaler = y_scaler.fit(data[:,-1].reshape(-1,1))
y=y_scaler.transform(data[:,-1].reshape(-1,1))

train_ratio=0.8   
window_size=5  
X_train,y_train,X_test,y_test=get_traintest(np.c_[X,y],window_size=window_size,train_ratio=train_ratio)
print(X_train.shape,y_train.shape,X_test.shape,y_test.shape)
y_test1 = y_scaler.inverse_transform(y_test.reshape(-1,1))
test_size=int(len(data)*(1-train_ratio))
plt.figure(figsize=(10, 5), dpi=256)
plt.plot(data0.index[:-test_size], data0.iloc[:, -1].iloc[:-test_size], label='训练集', color='#FA9905')
plt.plot(data0.index[-test_size:], data0.iloc[:, -1].iloc[-test_size:], label='验证集', color='#FB8498', linestyle='dashed')
ax = plt.gca()
ax.xaxis.set_major_locator(MaxNLocator(10)) 
plt.legend()
plt.ylabel('磁芯损耗(每立方米瓦特)', fontsize=14)
plt.xlabel('序号', fontsize=14)
plt.savefig(r"E:\工作\硕士\科研竞赛\问题四\训练集与验证集比例.png",
            bbox_inches="tight",
            pad_inches=1,
            transparent=True,
            facecolor="w",
            edgecolor='w',
            dpi=300,
            orientation='landscape')

    可视化结果展示:

       经过统计,共有12400个样本,将其划分为训练集和测试集,其中训练集为数据前80%,共计9920个,测试集为数据的后20%,共计2485个。利用python将磁芯损耗数据可视化,其中黄色实线为训练集数据,粉色虚线为测试集数据。

 (四)绘制变量热力图

        结合提取的特征变量,本问模型构建选取的特征变量有温度、频率、最大值、最小值、峰  峰值、均值、标准差、偏度、峰度、总谐波失真、小波能级1、小波能级2、小波能级3、小波能级4、小波能级5、磁芯材料类型、斜率、过零率、波形因子、励磁波形共计20个,响应变量为磁芯损耗。使用皮尔逊相关系数计算每个变量之间的相关系数,并绘制热力图。为了探究特征变量对磁芯损耗的影响,绘制了随机森林特征重要性排序。

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
import seaborn as sns 
import datetime
%matplotlib inline
plt.rcParams['font.sans-serif'] = ['KaiTi']
plt.rcParams['axes.unicode_minus'] = False


data0 = pd.read_csv(
    r'E:\工作\硕士\科研竞赛\问题四\第四问模型训练数据.csv',
    encoding="ANSI",
    parse_dates=['序号']
).set_index('序号')[['温度', '频率', '最大值', '最小值', '峰峰值', '均值', '标准差', '偏度', '峰度', '总谐波失真', '小波能级1', '小波能级2', '小波能级3', '小波能级4', '小波能级5', '磁芯材料类型', '斜率', '过零率', '波形因子', '励磁波形', '磁芯损耗']].sort_index()
data = data0.drop(columns=['磁芯损耗'])
y= data0['磁芯损耗']
data.shape,y.shape

corr = plt.subplots(figsize = (18,16),dpi=300)
corr= sns.heatmap(data.assign(Y=y).corr(method='spearman'),annot=True,square=True)
plt.savefig(r"E:\工作\硕士\科研竞赛\问题四\热力图.png",
            bbox_inches="tight",
            pad_inches=1,
            transparent=True,
            facecolor="w",
            edgecolor='w',
            dpi=300,
            orientation='landscape')

    输出结果:

(五)三种机器学习模型建立

from sklearn.model_selection import train_test_split
X_train,X_val,y_train,y_val=train_test_split(data,y,test_size=0.2,random_state=0)
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import ElasticNet
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import GradientBoostingRegressor
from xgboost.sklearn import XGBRegressor
from lightgbm import LGBMRegressor
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor
model5= RandomForestRegressor(n_estimators=500,  max_features=int(X_train.shape[1]/3) , random_state=0)
model7 =  XGBRegressor(objective='reg:squarederror', n_estimators=1000, random_state=0)
model8 = LGBMRegressor(n_estimators=1000,objective='regression',random_state=0)
model10 = MLPRegressor(hidden_layer_sizes=(16,8), random_state=77, max_iter=10000)
model_list=[model1,model2,model3,model4,model5,model6,model7,model8,model9,model10]
model_name=['随机森林','轻量梯度提升','BP神经网络']
for i in range(3):
    model_C=model_list[i]
    name=model_name[i]
    model_C.fit(X_train_s, y_train)
    s=model_C.score(X_val_s, y_val)
    print(name+'方法在验证集的准确率为:'+str(s))

     输出结果:

(六)交叉验证

from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error,r2_score
from sklearn.model_selection import KFold
 
def evaluation(y_test, y_predict):
    mae = mean_absolute_error(y_test, y_predict)
    mse = mean_squared_error(y_test, y_predict)
    rmse = np.sqrt(mean_squared_error(y_test, y_predict))
    mape=(abs(y_predict -y_test)/ y_test).mean()
    r_2=r2_score(y_test, y_predict)
    return mae, rmse, mape
def evaluation2(lis):
    array=np.array(lis)
    return array.mean() , array.std()

def cross_val(model=None,X=None,Y=None,K=5,repeated=1):
    df_mean=pd.DataFrame(columns=['R2','MAE','RMSE','MAPE']) 
    df_std=pd.DataFrame(columns=['R2','MAE','RMSE','MAPE'])
    for n in range(repeated):
        print(f'正在进行第{n+1}次重复K折.....随机数种子为{n}\n')
        kf = KFold(n_splits=K, shuffle=True, random_state=n)
        R2=[]
        MAE=[]
        RMSE=[]
        MAPE=[]
        print(f"    开始本次在{K}折数据上的交叉验证.......\n")
        i=1
        for train_index, test_index in kf.split(X):
            print(f'        正在进行第{i}折的计算')
            X_train=X.values[train_index]
            y_train=y.values[train_index]
            X_test=X.values[test_index]
            y_test=y.values[test_index]
            model.fit(X_train,y_train)
            score=model.score(X_test,y_test)
            R2.append(score)
            pred=model.predict(X_test)
            mae, rmse, mape=evaluation(y_test, pred)
            MAE.append(mae)
            RMSE.append(rmse)
            MAPE.append(mape)
            print(f'        第{i}折的拟合优度为:{round(score,4)},MAE为{round(mae,4)},RMSE为{round(rmse,4)},MAPE为{round(mape,4)}')
            i+=1
        print(f'    ———————————————完成本次的{K}折交叉验证———————————————————\n')
        R2_mean,R2_std=evaluation2(R2)
        MAE_mean,MAE_std=evaluation2(MAE)
        RMSE_mean,RMSE_std=evaluation2(RMSE)
        MAPE_mean,MAPE_std=evaluation2(MAPE)
        print(f'第{n+1}次重复K折,本次{K}折交叉验证的总体拟合优度均值为{R2_mean},方差为{R2_std}')
        print(f'                               总体MAE均值为{MAE_mean},方差为{MAE_std}')
        print(f'                               总体RMSE均值为{RMSE_mean},方差为{RMSE_std}')
        print(f'                               总体MAPE均值为{MAPE_mean},方差为{MAPE_std}')
        print("\n====================================================================================================================\n")
        df1=pd.DataFrame(dict(zip(['R2','MAE','RMSE','MAPE'],[R2_mean,MAE_mean,RMSE_mean,MAPE_mean])),index=[n])
        df_mean=pd.concat([df_mean,df1])
        df2=pd.DataFrame(dict(zip(['R2','MAE','RMSE','MAPE'],[R2_std,MAE_std,RMSE_std,MAPE_std])),index=[n])
        df_std=pd.concat([df_std,df2])
    return df_mean,df_std


model = LGBMRegressor(n_estimators=1000,objective='regression',random_state=0)
lgb_crosseval,lgb_crosseval2=cross_val(model=model,X=data,Y=y,K=3,repeated=5)

lgb_crosseval

     部分输出结果展示: 

        为了验证各个模型的泛化能力,本文采用5-折交叉验证方法,评估模型的泛化能力。展示了在5 折交叉验证下三种模型的总体拟合优度均值,而展示了在5折交叉验证下不同模型的总体拟合优度方差。


更多优质内容持续发布中,请移步主页查看。

 若有问题可邮箱联系:1736732074@qq.com 

博主的WeChat:TCB1736732074

   点赞+关注,下次不迷路!


http://www.kler.cn/a/456805.html

相关文章:

  • 基于51单片机的教学用自动光学演示装置设计
  • BOE(京东方)“向新2025”年终媒体智享会落地深圳
  • 【LLM】Langflow 的简单使用
  • 基于FPGA的2ASK+帧同步系统verilog开发,包含testbench,高斯信道,误码统计,可设置SNR
  • 【每日学点鸿蒙知识】WebView代理、2D绘制矩形圆角、TextInput清理按钮、pdf滑动、icon配置问题
  • React基础知识学习
  • 使用内网穿透工具,为树莓派配置公网地址实现远程ssh
  • springboot maven 构建 建议使用 --release 21 而不是 -source 21 -target 21,因为它会自动设置系统模块的位置
  • 分别查询 user 表中 avatar 和 nickname 列为空的用户数量
  • C# 6.0版本的WebAPI接口部署到Linux服务器
  • 【学习总结|DAY024】Maven基础
  • Git拉的项目maven无法下载
  • 24-12-28-pytorch深度学习中音频I/O 中遇到的问题汇总
  • 音视频入门基础:MPEG2-TS专题(24)——FFmpeg源码中,显示TS流每个packet的pts、dts的实现
  • python常用基础语法
  • win10系统上打包electron,实现在win7系统运行
  • Detected at node ‘truediv‘ defined at (most recent call last): Node: ‘truediv‘
  • BUU BRUTE 1
  • 剑指Offer|LCR 013. 二维区域和检索 - 矩阵不可变
  • LeetCode题练习与总结:IPO--502
  • linux查看访问外网本机ip地址的命令
  • 微信小程序打印生产环境日志
  • WordPress网站中如何修复504错误
  • 线程池基础知识
  • HTML5前端实现毛玻璃效果的可拖拽登录框
  • C/C++应该如何使用NI-488.2库?