当前位置: 首页 > article >正文

Machine Learning ---- Feature Scaling

目录

 一、What is feature scaling::

二、Why do we need to perform feature scaling?

三、How to perform feature scaling:

        1、Normalization:

        2、Mean normalization:

        3、Standardization (data needs to follow a normal distribution):


 一、What is feature scaling:

        Simply put, it is the process of normalizing the units of data, which results in significant differences in the non unit values of various data in the training dataset. However, we use normalization and other methods to stabilize the data range within a relatively small area.

二、Why do we need to perform feature scaling?

        I have read many articles, and it's like how we often have a one-sided understanding of something due to its overly prominent side. For the more valuable side, we unconsciously lean towards the past. It is best for us to understand this point from a contour map:

        Using the example said by Andrew Ng, let's assume that his housing price prediction is:

Total square meter: 300 square meters~2000 square metersNumber of rooms: 1 to 5
w_1 = 50w_2 = 0.1
w_1 = 0.1w_2 = 50

        Meanwhile, assuming b=50, for a 2000 square meter, 5-room house, the normal price would be 500000 yuan:

        At the same time, assuming b=50, for a 2000 square meter, 5-room house, the normal price is 500000 yuan. Therefore, when we bring in two different groups of w1 and w2 in the list, we can find that the factor with the larger value is: the total square * 50+room * 0.1, which gives a value of about 100000 yuan, while the other group is about 500000 yuan.

        We can find that we prefer a smaller value with a larger corresponding coefficient. So, what is the relationship between this and gradient descent?

        We can understand it from the contour map:

        This is a contour map of J(\vec{w},b)  ,So we can take a look at how gradient descent may go if it needs to reach its minimum point:

        Due to the short axis range corresponding to size and the long axis corresponding to room, in order to obtain a minimum value that satisfies the condition through gradient descent, this situation may occur, leading to slower convergence. That's why we need to perform feature scaling, and if the image is not an ellipse but a circle, its effect is the best case.

        At the same time, we can also combine Euclidean distance for understanding

三、How to perform feature scaling:

        1、Normalization:

x^{'} = \frac{x - min(x)}{max(x) - min(x)}

        The corresponding value range is [0,1], but there are also more flexible forms:

x^{'} = a + \frac{x - min(x)}{max(x) - min(x)}(b - a)

        The corresponding value range is [a, b]. Generally speaking, the values of a and b should not be too large or too small, and [-5, 5] are suitable.

        2、Mean normalization:

x^{'} = \frac{x - \bar{x}}{max(x) - min(x)}

        3、Standardization (data needs to follow a normal distribution):

x^{'} = \frac{x - \bar{x}}{\sigma }

        The denominator corresponds to the standard deviation of x, which is actually the standardized formula for a normal distribution:

x^{'} = \frac{x - \mu}{\sigma }


http://www.kler.cn/a/273069.html

相关文章:

  • FPGA工程师成长四阶段
  • 【BLE】CC2541之ADC
  • Vue如何构建项目
  • 【2025 Rust学习 --- 17 文本和格式化 】
  • Zookeeper(3)Zookeeper的工作原理是什么?
  • 【Linux 之一 】Linux常用命令汇总
  • 学完排序算法,终于知道用什么方法给监考完收上来的试卷排序……
  • VS2022 配置QT5.9.9
  • uniapp 兼容pc与手机的样式方法
  • hcia复习总结9
  • Custom GPTs Are Here and Will Impact Everything AI
  • Milvus向量数据库检索
  • 【大数据面试题】 018 数据仓库的分层了解吗?说说你的理解
  • Python 小爬虫:爬取 bing 每日壁纸设为桌面壁纸
  • 最新WordPress网址导航设计师主题风格网站源码
  • 基于vue实现bilibili网页
  • Java面试题总结15之简述你对RPC,RMI的理解
  • 如何用 UDP 实现可靠传输?并以LabVIEW为例进行说明
  • springboot277流浪动物管理系统
  • python 直方图
  • js基础语法大全(时间戳,uuid,字符串转json)
  • 大模型—概念
  • 从零开始学HCIA之SDN04
  • HTML_CSS练习:HTML注释
  • 掘根宝典之C++RTTI和类型转换运算符
  • 【通信原理笔记】【二】随机信号分析——2.4 复随机过程