black-box optimization

Posted on August 12, 2021 1 minute read ∼ Filed in : machine learning basic

超参数优化

常用的超参数优化方法有

网格搜索（Grid search），
随机搜索（Random search），
遗传算法，
贝叶斯优化（Bayesian Optimization）

核函数

高斯过程，SVM， PCA都是核函数学习的例子

特征映射

把一个数据从低维空间映射到高维空间，使得高维空间下，数据更容易分类（线性可分的）

如果映射函数是 f(x), 对于任意2条数据，loss函数可能需要算这个2个数据映射完的内积，即算 f(x1). f(x2)

把它定义为核函数，即 K(x1, x2) = f(x1). f(x2)

有了这个定义后，可以直接定义核函数的形式，而不用具体定义f(x) 的形式。

概率图模型

概率图模型分为三种：贝叶斯网络，马尔科夫随机场以及高斯网络

高斯过程

高斯分布

一维高斯分布只需要求出mean和方差，就可以得出方程形式

多维度高斯分布，如果每个变量相互独立，联合分布 = p(x1,x2..xN)

多维度高斯分布，变量直接相互独立，更容易求出mean和方差矩阵。

高斯过程

贝叶斯优化

black-box optimization

init sample
init mode
get acquisition function
optimize x_next = argmax(af)
sample new data and update the model.
repeat

黑盒优化问题本质上是给定一组D，求他的分布。

给定一组数据D，想拟合一个函数，使得函数可以代表数据的分布。函数中有超参数 theta。解法有2种。

频率派和贝叶斯派

频率派求最大似然函数 (MLE)

Theta = argmax log P( D theta )

贝叶斯派, 它假设 theta 也是一个分布，采用贝叶斯展开式展开

先验概率 = p(theta)

likelyhood = p( D

theta )

衡量后验概率 MAP

原本的贝叶斯预测要求 p(theta

D) , 然后可以计算 p ( x_new

D ) , 但是分母的积分不好求。所有才用MAP来代替。

Bayesian optimization

MC sampling

if we don’t know f(X) explicitly but need to integrate it, we use MC sampling to approximate the integral.

black-box optimization

超参数优化

核函数

特征映射

概率图模型

高斯过程

高斯分布

高斯过程

贝叶斯优化

black-box optimization

Bayesian optimization

MC sampling

Tags Cloud

Categories Cloud

It's the niceties that make the difference fate gives us the hand, and we play the cards.

超参数优化

核函数

特征映射

概率图模型

高斯过程

高斯分布

高斯过程

贝叶斯优化

black-box optimization

Bayesian optimization

MC sampling

END OF POST

Tags Cloud

Categories Cloud

It's the niceties that make the difference fate gives us the hand, and we play the cards.