Overfitting and Underfitting in Regression, Regularized Regression in Python
Addressing overfitting:

Reduce the number of features: use model selection to pick variables automatically. Discarding features, however, also throws away information.

Regularization:

Keep all the features, but shrink the values of the parameters theta. This works well when there are many features.
The cost function penalizes the parameters so that they stay small, which prevents overfitting. If the lambda parameter is set too large, the model will underfit.

In regularized regression only the non-constant parameters are penalized, so the gradient-descent update is split into two cases: theta_0 and the remaining theta_j. For the normal equation, regularization adds a value to the diagonal, which also solves the non-invertibility problem.

For regularized logistic regression, start from the cost function of unregularized logistic regression and add the same penalty term. The gradient-descent update has the same form as in linear regression; only the hypothesis h_theta(x) differs. The loss function and the full iteration are given below, followed by a Python implementation of the linear case.
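The formulas referred to above appeared as images in the original post and were lost; the standard forms (m training examples, hypothesis h_theta, regularization parameter lambda, with theta_0 left unpenalized) are reproduced here for reference.

Regularized linear regression cost:

$$ J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right] $$

Gradient descent, split so that theta_0 is not penalized:

$$ \theta_0 := \theta_0 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_0^{(i)} $$

$$ \theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right],\qquad j=1,\dots,n $$

Regularized normal equation (the lambda term on the diagonal also makes the matrix invertible):

$$ \theta = \left(X^TX + \lambda\,\mathrm{diag}(0,1,\dots,1)\right)^{-1}X^Ty $$

Regularized logistic regression cost, with h_theta(x) = 1/(1+e^{-\theta^T x}); the gradient-descent update has the same form as above:

$$ J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2 $$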
__author__ = 'Chen'
from numpy import *
# cost: sum of squared errors, returned as a 1x1 matrix
def costFunction(X, Y, theta):
    mse = theta * X.T - Y.T
    return mse * mse.T
# linear regression: normal equation (type=True) or gradient descent (type=False)
def linearRegresion(x, y, type=True, alpha=0.01, lambdas=0.01):
    xrow = shape(x)[0]
    xcol = shape(x)[1]
    x = matrix(x)
    Y = matrix(y)
    # prepend a column of ones for the bias term
    xone = ones((xrow, 1))
    X = matrix(hstack((xone, x)))
    # normal equation
    if type == True:
        # add regularization to the diagonal of X.T*X, skipping the bias term
        A = X.T * X
        for i in range(1, xcol + 1):
            A[i, i] += lambdas
        theta = A.I * X.T * Y
        return theta
    else:
        # gradient descent
        theta = matrix(random.random(xcol + 1))
        # iterations
        for iteration in range(1, 10000):
            # print the current cost
            print(costFunction(X, Y, theta))
            sums = 0
            # regularization term: do not penalize the bias theta_0
            temptheta = theta.copy()
            temptheta[0, 0] = 0
            # accumulate the gradient over all training samples
            for i in range(xrow):
                sums += (theta * X[i, :].T - Y[i, :]) * X[i, :]
            theta -= alpha * (sums + lambdas * temptheta) / xrow
        return theta
# toy data: 4 samples with 3 features (the sample data in the original post was garbled; this is an assumed example)
x = [[0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]]
y = [[1], [2], [3], [4]]
# solve by the regularized normal equation
theta1 = linearRegresion(x, y)
print(theta1)
# solve by gradient descent
theta2 = linearRegresion(x, y, False)
print(theta2)
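The prose above also discusses regularized logistic regression, but the code only covers the linear case. Below is a minimal sketch of the corresponding regularized logistic-regression gradient descent; the names sigmoid and logisticRegression, and the toy data, are assumptions for illustration and are not from the original post.

from numpy import *

def sigmoid(z):
    # logistic hypothesis h_theta(x) = 1 / (1 + e^(-theta^T x))
    return 1.0 / (1.0 + exp(-z))

def logisticRegression(x, y, alpha=0.1, lambdas=0.01, iterations=10000):
    # x: m x n feature list, y: m x 1 labels in {0, 1}
    X = hstack((ones((shape(x)[0], 1)), array(x)))
    Y = array(y).reshape(-1, 1)
    m, n = shape(X)
    theta = zeros((n, 1))
    for _ in range(iterations):
        h = sigmoid(X.dot(theta))          # m x 1 predictions
        grad = X.T.dot(h - Y) / m          # unregularized gradient
        reg = (lambdas / m) * theta
        reg[0, 0] = 0                      # do not penalize the bias theta_0
        theta -= alpha * (grad + reg)      # same update form as the linear case
    return theta

# usage example on assumed toy data
x = [[0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]]
y = [[0], [0], [1], [1]]
print(logisticRegression(x, y))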