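Recurrent Neural Network (RNN) Basics
In an RNN, information from the earlier part of a sequence is processed and then passed on as input to the later part of the sequence.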
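The idea of passing earlier information forward can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the names `rnn_forward`, `W_x`, `W_h` and `b`, the tanh activation and the random weights are chosen for the example and are not taken from these notes.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a plain RNN over a sequence and return the hidden state at every step."""
    h = np.zeros(W_h.shape[0])          # initial hidden state: no history yet
    states = []
    for x_t in inputs:                  # walk the sequence from front to back
        # the new state mixes the current input with the previous state
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.array(states)

rng = np.random.default_rng(0)
inputs = rng.normal(size=(6, 3))        # 6 time steps, 3 features each
W_x = rng.normal(size=(4, 3)) * 0.5     # input-to-hidden weights (4 hidden units)
W_h = rng.normal(size=(4, 4)) * 0.5     # hidden-to-hidden weights
b = np.zeros(4)

states = rnn_forward(inputs, W_x, W_h, b)
print(states.shape)                     # (6, 4): one hidden state per time step
```

Because `h` is reused on every iteration, the state at step t depends on all inputs up to t, which is the "earlier information flows to later steps" behavior described above.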
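Basic RNN structure
Word numericalization: build a dictionary that maps each word to a unique number, then convert the input words into a numeric matrix.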
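A small sketch of word numericalization, assuming a toy sentence and one-hot rows as the "numeric matrix" (the notes do not say which encoding is meant, so one-hot is an assumption of this example):

```python
import numpy as np

sentence = "the cat sat on the mat".split()

# assign each distinct word a unique integer (sorted for a reproducible order)
vocab = sorted(set(sentence))
word_to_int = {word: idx for idx, word in enumerate(vocab)}

# replace each word by its index, then expand the indices into one-hot rows
indices = [word_to_int[word] for word in sentence]
one_hot = np.eye(len(vocab), dtype=int)[indices]

print(word_to_int)
print(indices)          # [4, 0, 3, 2, 4, 1]
print(one_hot.shape)    # (6, 5): 6 words, 5 distinct vocabulary entries
```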
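Common RNN structures
1. Multiple inputs, multiple outputs, with the same dimensions. Application: identifying specific information in a sequence.
2. Multiple inputs, single output.
3. Single input, multiple outputs.
4. Multiple inputs, multiple outputs (unlike structure 1, the input and output lengths may differ).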
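The difference between structures 1 and 2 comes down to which hidden states feed the output layer. A rough sketch, assuming the hidden states are already computed (here `H` is random stand-in data, and `W_y` is a hypothetical output weight matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
T, hidden, out = 5, 4, 2
H = rng.normal(size=(T, hidden))      # stand-in for the hidden states h_1..h_T
W_y = rng.normal(size=(hidden, out))  # output weights shared across time steps

many_to_many = H @ W_y                # structure 1: one output per input step
many_to_one = H[-1] @ W_y             # structure 2: only the final state produces an output

print(many_to_many.shape)             # (5, 2)
print(many_to_one.shape)              # (2,)
```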
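Drawback of the plain RNN: as information from the earlier part of the sequence is passed toward the later part, its weight keeps shrinking, so important information is lost; during training this shows up as vanishing gradients.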
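The shrinking influence of early steps can be made concrete by tracking the Jacobian of the current hidden state with respect to the initial one. The sketch below is an illustration under assumptions: a tanh RNN, random inputs, and a recurrent matrix `W_h` deliberately rescaled so that its largest singular value is 0.9.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 4
W_h = rng.normal(size=(hidden, hidden))
W_h *= 0.9 / np.linalg.norm(W_h, 2)   # rescale so the largest singular value is 0.9

h = np.zeros(hidden)
jacobian = np.eye(hidden)             # d h_t / d h_0, starts as the identity
norms = []
for t in range(30):
    x_t = rng.normal(size=hidden)     # random input added directly to the state
    h = np.tanh(W_h @ h + x_t)
    # chain rule: d h_t / d h_{t-1} = diag(1 - h_t^2) @ W_h
    jacobian = np.diag(1.0 - h**2) @ W_h @ jacobian
    norms.append(np.linalg.norm(jacobian, 2))

print(norms[0], norms[-1])            # the norm decays as the distance in time grows
```

Each step multiplies the Jacobian by a factor whose norm is at most 0.9 here, so after 30 steps the gradient reaching the first step is bounded by 0.9 to the power 30, roughly 0.04 of its original scale.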
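Long Short-Term Memory network (LSTM)
The memory cell concentrates on recording the important information from the earlier part of the sequence, and little of it is lost as it is passed along.
Gate structure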
LSTM
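A minimal sketch of a single LSTM step, to show how the gates act on the memory cell. The function name `lstm_step`, the stacked weight layout, and the gate ordering inside `W` are assumptions of this example, not something specified in these notes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4 * hidden, hidden + n_in), b has shape (4 * hidden,)."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0 * hidden:1 * hidden])   # forget gate: how much old memory to keep
    i = sigmoid(z[1 * hidden:2 * hidden])   # input gate: how much new content to write
    o = sigmoid(z[2 * hidden:3 * hidden])   # output gate: how much memory to expose
    g = np.tanh(z[3 * hidden:4 * hidden])   # candidate content for the memory cell
    c = f * c_prev + i * g                  # memory cell update is additive
    h = o * np.tanh(c)                      # hidden state read out from the cell
    return h, c

rng = np.random.default_rng(0)
n_in, hidden = 3, 4
W = rng.normal(size=(4 * hidden, hidden + n_in)) * 0.1
b = np.zeros(4 * hidden)

h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(5, n_in)):      # feed a 5-step sequence
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape, c.shape)                     # (4,) (4,)
```

When the forget gate is close to 1, `c_prev` is carried over almost unchanged, which is why the cell loses little information along the sequence.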
Bidirectional recurrent neural network (BRNN)
Deep recurrent neural network (DRNN)
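Both variants can be built from the same plain RNN layer. The following is a rough sketch with hypothetical helper names (`rnn_layer`, `weights`) and random weights: a bidirectional network reads the sequence in both directions and concatenates the states, while a deep network feeds one layer's states into the next layer.

```python
import numpy as np

def rnn_layer(inputs, W_x, W_h):
    """Plain tanh RNN layer; returns the hidden state at every time step."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(W_x @ x_t + W_h @ h)
        states.append(h)
    return np.array(states)

rng = np.random.default_rng(0)
T, n_in, hidden = 6, 3, 4
x = rng.normal(size=(T, n_in))

def weights(n_from):
    """Random (input-to-hidden, hidden-to-hidden) weights for one layer."""
    return rng.normal(size=(hidden, n_from)) * 0.5, rng.normal(size=(hidden, hidden)) * 0.5

# bidirectional: one pass front-to-back, one back-to-front, concatenated per step
fwd = rnn_layer(x, *weights(n_in))
bwd = rnn_layer(x[::-1], *weights(n_in))[::-1]   # reverse again to realign the time steps
bidirectional = np.concatenate([fwd, bwd], axis=1)

# deep: the hidden states of layer 1 are the input sequence of layer 2
layer1 = rnn_layer(x, *weights(n_in))
layer2 = rnn_layer(layer1, *weights(hidden))

print(bidirectional.shape)   # (6, 8)
print(layer2.shape)          # (6, 4)
```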
Examples
Loading the text
raw_data = open('flare').read()
# remove the newline characters '\n' and '\r'
data = raw_data.replace('\n', '').replace('\r', '')
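Building the character dictionaries:
# deduplicate the characters
letters = list(set(data))
# dictionary mapping index -> character
int_to_char = {a: b for a, b in enumerate(letters)}
# dictionary mapping character -> index
char_to_int = {b: a for a, b in enumerate(letters)}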
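To check that the two dictionaries are inverses of each other, a round trip on a short string is enough. This sketch uses a made-up sample string instead of the file, and `sorted()` instead of `list()` so that the index assignment is the same on every run (a plain `set` has no fixed order); both choices are assumptions of the example.

```python
text = "hello world"

# sorted() makes the index assignment reproducible; set() alone has no fixed order
letters = sorted(set(text))
int_to_char = {i: ch for i, ch in enumerate(letters)}
char_to_int = {ch: i for i, ch in enumerate(letters)}

encoded = [char_to_int[ch] for ch in text]
decoded = "".join(int_to_char[i] for i in encoded)

print(encoded)   # [3, 2, 4, 4, 5, 0, 7, 5, 6, 4, 1]
print(decoded)   # hello world
```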
Text data preprocessing
Single-layer RNN with 5 output neurons; each prediction uses the previous 8 data points to predict the 9th.
import pandas as pd
import numpy as np
data = pd.read_csv('zgpa_train.csv')
data.head()
price = data.loc[:, 'close']
# normalize by dividing by the maximum price
price_norm = price / max(price)
print(price_norm)
%matplotlib inline
from matplotlib import pyplot as plt
fig1 = plt.figure(figsize=(8, 5))
plt.plot(price)
plt.xlabel('time')
plt.ylabel('price')
plt.title('close price')
plt.show()
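# define X and y
def extract_data(data, time_step):
    X = []
    y = []
    for i in range(len(data) - time_step):
        # the time_step values starting at i form the input window
        X.append([a for a in data[i:i + time_step]])
        # the value right after the window is the target
        y.append(data[i + time_step])
    X = np.array(X)
    # reshape to (samples, time_step, 1), the 3-D input that SimpleRNN expects
    X = X.reshape(X.shape[0], X.shape[1], 1)
    y = np.array(y)
    return X, y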
# build the samples: the previous 8 values predict the next one
time_step = 8
X, y = extract_data(price_norm, time_step)
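To see what the sliding window produces, here is a self-contained run on a toy series. The 12-value series is a made-up stand-in for the normalized prices, and the function is restated in a slightly more compact form so the block runs on its own.

```python
import numpy as np

def extract_data(data, time_step):
    """Slide a window of length time_step over data; the next value is the target."""
    X, y = [], []
    for i in range(len(data) - time_step):
        X.append(data[i:i + time_step])
        y.append(data[i + time_step])
    X = np.array(X).reshape(-1, time_step, 1)   # (samples, time steps, features)
    return X, np.array(y)

series = np.arange(12) / 10.0        # toy stand-in for the normalized prices
X, y = extract_data(series, time_step=8)

print(X.shape, y.shape)              # (4, 8, 1) (4,)
print(X[0].ravel(), y[0])            # first window 0.0..0.7, target 0.8
```

A series of length N yields N - time_step samples, so the first time_step values never appear as targets.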
# set up the model
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN
model = Sequential()
# one recurrent layer with 5 units; each sample is time_step values with 1 feature
model.add(SimpleRNN(units=5, input_shape=(time_step, 1), activation='relu'))
# a single linear output neuron for the predicted (normalized) price
model.add(Dense(units=1, activation='linear'))
model.compile(optimizer='adam', loss='mean_squared_error')
model.summary()
#train the model
model.fit(X,y,batch_size = 30,epochs = 200)
# make predictions on the training data and undo the normalization
y_train_predict = model.predict(X) * max(price)
y_train = [i * max(price) for i in y]
fig2 = plt.figure(figsize=(8, 5))
plt.plot(y_train, label='real price')
plt.plot(y_train_predict, label='predicted price')
plt.xlabel('time')
plt.ylabel('price')
plt.title('close price')
plt.legend()
plt.show()
Model structure: single-layer LSTM with 20 output neurons; each prediction uses the previous 20 characters to predict the 21st.
data = open('flare').read()
# remove the newline characters
data = data.replace('\n', '').replace('\r', '')
print(data)
# deduplicate the characters
letters = list(set(data))
print(letters)
num_letters = len(letters)
print(num_letters)
# build the dictionaries
# int to char
int_to_char = {a: b for a, b in enumerate(letters)}
print(int_to_char)
# char to int
char_to_int = {b: a for a, b in enumerate(letters)}
print(char_to_int)
time_step = 20
# batch processing of the character sequences
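One way the batch-processing step could look is sketched below. It is an assumption, not the notes' own code: the helper names `extract_sequences` and `one_hot_encode`, the one-hot encoding, the sample string and the short `time_step` of 5 are all chosen for illustration (the model above uses 20).

```python
import numpy as np

def extract_sequences(text, time_step):
    """Split text into (time_step characters, next character) pairs."""
    windows, targets = [], []
    for i in range(len(text) - time_step):
        windows.append(text[i:i + time_step])
        targets.append(text[i + time_step])
    return windows, targets

def one_hot_encode(windows, targets, char_to_int):
    """Turn character windows and their target characters into one-hot arrays."""
    n = len(char_to_int)
    X = np.zeros((len(windows), len(windows[0]), n), dtype=np.float32)
    y = np.zeros((len(targets), n), dtype=np.float32)
    for s, (window, target) in enumerate(zip(windows, targets)):
        for t, ch in enumerate(window):
            X[s, t, char_to_int[ch]] = 1.0
        y[s, char_to_int[target]] = 1.0
    return X, y

sample = "hello recurrent networks"          # made-up text for the demo
char_to_int = {ch: i for i, ch in enumerate(sorted(set(sample)))}

windows, targets = extract_sequences(sample, time_step=5)
X, y = one_hot_encode(windows, targets, char_to_int)

print(windows[0], "->", targets[0])          # hello ->  (the target is a space)
print(X.shape, y.shape)                      # (samples, time_step, num_letters), (samples, num_letters)
```

The resulting `X` has the 3-D layout (samples, time steps, features) that a recurrent layer takes as input, and `y` has one row per sample, suitable for a softmax output over the characters.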