当前位置: 首页 > article >正文

文献分享: Muvera多向量到单向量的转化方法——原理与理论保证

原论文

详细的理论证明,包含了所有细节

1.   MUVERA \textbf{1. MUVERA} 1. MUVERA的全过程

wrgehfngddn

1️⃣文本嵌入:对查询文本和段落文本分别应用嵌入器(如 ColBERTv2 \text{ColBERTv2} ColBERTv2),得到各自的多向量嵌入

  1. 查询嵌入 Q Q Q { q 1 , q 2 , . . . , q m } \{q_1,q_2,...,q_m\} {q1,q2,...,qm},其中 q i ⊆ R d q_i\text{⊆}\mathbb{R}^{d} qiRd即为固定 d d d
  2. 段落嵌入 P P P { p 1 , p 2 , . . . , p n } \{p_1,p_2,...,p_n\} {p1,p2,...,pn},其中 p i ⊆ R d p_i\text{⊆}\mathbb{R}^{d} piRd即为固定 d d d

2️⃣向量分桶:用 SimHash \text{SimHash} SimHash将原有空间分为 2 k sim 2^{k_{\text{sim}}} 2ksim个桶,每个桶用长为 k sim k_{\text{sim}} ksim的定长二进制向量编码

  1. 法向抽取:从高斯分布中抽取 k sim ≥ 1 k_{\text{sim}}\text{≥}1 ksim1个向量 g 1 , … , g k sim ∈ R d g_{1},\ldots,g_{k_{\text{sim}}}\text{∈}\mathbb{R}^{d} g1,,gksimRd,作为 k sim k_{\text{sim}} ksim个超平面的法向量
  2. 空间划分: φ ( x ) = ( 1 ( ⟨ g 1 , x ⟩ > 0 ) , … , 1 ( ⟨ g k sim , x ⟩ > 0 ) ) \varphi(x)\text{=}\left(\mathbf{1}\left(\left\langle{}g_{1},x\right\rangle{}\text{>}0\right),\ldots,\mathbf{1}\left(\left\langle{}g_{k_{\text{sim}}},x\right\rangle{}\text{>}0\right)\right) φ(x)=(1(g1,x>0),,1(gksim,x>0))
    • 1 ( ⟨ g i , x ⟩ > 0 ) \mathbf{1}\left(\left\langle{}g_{i},x\right\rangle{}\text{>}0\right) 1(gi,x>0):当 ⟨ g i , x ⟩ > 0 \langle{}g_{i},x\rangle{}\text{>}0 gi,x>0成立(即 x x x投影在超平面 g i g_i gi的正侧)时,将该位设为 1 1 1
  3. 向量分桶:让所有的 m + n m\text{+}n m+n个嵌入通过 φ ( ⋅ ) \varphi(\cdot) φ()得到长 k sim k_{\text{sim}} ksim的二进制编码,相同编码者(即桶编码)放入同一桶

3️⃣向量生成:按照如下三种情况,为每个桶 k k k都生成一个子向量 q ⃗ ( k ) , p ⃗ ( k ) ⊆ R d \vec{q}_{(k)},\vec{p}_{(k)}\text{⊆}\mathbb{R}^{d} q (k),p (k)Rd

Case \textbf{Case} Case k \boldsymbol{k} k的情况 k \boldsymbol{k} k子向量 q ⃗ ( k ) \boldsymbol{\vec{q}_{(k)}} q (k) k \boldsymbol{k} k子向量 p ⃗ ( k ) \boldsymbol{\vec{p}_{(k)}} p (k)
Case-0 \text{Case-0} Case-0 p i → { q 1 , q 2 , . . . ∣ } p_i\text{→}\{q_1,q_2,...\mid{}\} pi{q1,q2,...} q ⃗ ( k ) = ∑ q i \displaystyle{}\vec{q}_{(k)}=\sum{}q_i q (k)=qi p ⃗ ( k ) = \displaystyle{}\vec{p}_{(k)}= p (k)=与该桶海明距离最近的 p p p
Case-1 \text{Case-1} Case-1 p i → { q 1 , q 2 , . . . ∣ p } p_i\text{→}\{q_1,q_2,...\mid{}p\} pi{q1,q2,...p} q ⃗ ( k ) = ∑ q i \displaystyle{}\vec{q}_{(k)}=\sum{}q_i q (k)=qi p ⃗ ( k ) = p \displaystyle{}\vec{p}_{(k)}=p p (k)=p
Case-n \text{Case-n} Case-n p i → { q 1 , q 2 , . . . ∣ p 1 , p 2 , . . . } p_i\text{→}\{q_1,q_2,...\mid{}p_1,p_2,...\} pi{q1,q2,...p1,p2,...} q ⃗ ( k ) = ∑ q i \displaystyle{}\vec{q}_{(k)}=\sum{}q_i q (k)=qi p ⃗ ( k ) = 1 # p ∑ p j \displaystyle{}\vec{p}_{(k)}=\cfrac{1}{\#p}\sum{}p_j p (k)=#p1pj( p p p的质心)

4️⃣向量压缩:对每个 q ⃗ ( k ) , p ⃗ ( k ) ⊆ R d \vec{q}_{(k)},\vec{p}_{(k)}\text{⊆}\mathbb{R}^{d} q (k),p (k)Rd应用随机线性投影 ψ : R d → R d proj ( d proj ≤ d ) \psi\text{:}\mathbb{R}^{d}\text{→}\mathbb{R}^{d_{\text{proj}}}(d_{\text{proj}}\text{≤}d) ψ:RdRdproj(dprojd)

  1. 投影函数: ψ ( x ) = ( 1 d proj ) S x \boldsymbol{\psi}(x)\text{=}\left(\cfrac{1}{\sqrt{d_{\text{proj}}}}\right)\mathbf{S}x ψ(x)=(dproj 1)Sx,其中 S ∈ R d proj × d \mathbf{S}\text{∈}\mathbb{R}^{d_{\text{proj}}\text{×}d} SRdproj×d为随机矩阵
    • d proj = d d_{\text{proj}}\text{=}d dproj=d S \mathbf{S} S的每个元素 s i j = 1 s_{ij}\text{=}1 sij=1,即 ψ ( x ) = x \boldsymbol{\psi}(x)\text{=}x ψ(x)=x
    • d proj < d d_{\text{proj}}\text{<}d dproj<d S \mathbf{S} S的每个元素 s i j s_{ij} sij满足离散均匀分布 P r [ s i j = 1 ] = P r [ s i j =– 1 ] = 1 2 \mathbb{Pr}\left[s_{ij}\text{=}1\right]\text{=}\mathbb{Pr}\left[s_{ij}\text{=}–1\right]\text{=}\cfrac{1}{2} Pr[sij=1]=Pr[sij=–1]=21
  2. 投影操作: { q ⃗ ( k ) , ψ ⊆ R d proj ← ψ ( q ⃗ ( k ) ) q ⃗ ( k ) ⊆ R d p ⃗ ( k ) , ψ ⊆ R d proj ← ψ ( p ⃗ ( k ) ) p ⃗ ( k ) ⊆ R d \begin{cases}\vec{q}_{(k),\psi}\text{⊆}\mathbb{R}^{d_{\text{proj}}}\xleftarrow{\psi\left(\vec{q}_{(k)}\right)}\vec{q}_{(k)}\text{⊆}\mathbb{R}^{d}\\\\\vec{p}_{(k),\psi}\text{⊆}\mathbb{R}^{d_{\text{proj}}}\xleftarrow{\psi\left(\vec{p}_{(k)}\right)}\vec{p}_{(k)}\text{⊆}\mathbb{R}^{d}\end{cases} q (k),ψRdprojψ(q (k)) q (k)Rdp (k),ψRdprojψ(p (k)) p (k)Rd
  3. 合并操作:将每个桶的压缩向量依次从左到右合并 → { q ⃗ ψ = ( q ⃗ ( 1 ) , ψ , … , q ⃗ ( B ) , ψ ) ⊆ R d proj 2 k s i m p ⃗ ψ = ( p ⃗ ( 1 ) , ψ , … , p ⃗ ( B ) , ψ ) ⊆ R d proj 2 k s i m \text{→}\begin{cases}\vec{q}_{\psi}\text{=}\left(\vec{q}_{(1),\psi},\ldots,\vec{q}_{(B),\psi}\right)\text{⊆}\mathbb{R}^{d_{\text{proj}}2^{k_{sim}}}\\\\\vec{p}_{\psi}\text{=}\left(\vec{p}_{(1),\psi},\ldots,\vec{p}_{(B),\psi}\right)\text{⊆}\mathbb{R}^{d_{\text{proj}}2^{k_{sim}}}\end{cases} q ψ=(q (1),ψ,,q (B),ψ)Rdproj2ksimp ψ=(p (1),ψ,,p (B),ψ)Rdproj2ksim

5️⃣重复生成:重复2️⃣ → \text{→} 4️⃣过程 R reps R_{\text{reps}} Rreps次,每次重复完成后生成 q ⃗ i , ψ , p ⃗ i , ψ \vec{q}_{i,\psi},\vec{p}_{i,\psi} q i,ψ,p i,ψ,拼接所有 q ⃗ i , ψ , p ⃗ i , ψ \vec{q}_{i,\psi},\vec{p}_{i,\psi} q i,ψ,p i,ψ

  1. Q Q Q最终生成的单向量: F q u e ( Q ) = ( q ⃗ 1 , ψ , … , q ⃗ R reps , ψ ) ⊆ R d proj 2 k s i m R reps \mathbf{F}_{\mathrm{que}}(Q)\text{=}\left(\vec{q}_{1,\psi},\ldots,\vec{q}_{R_{\text{reps}},\psi}\right)\text{⊆}\mathbb{R}^{d_{\text{proj}}2^{k_{sim}}R_{\text{reps}}} Fque(Q)=(q 1,ψ,,q Rreps,ψ)Rdproj2ksimRreps
  2. P P P最终生成的单向量: F d o c ( P ) = ( p ⃗ 1 , ψ , … , p ⃗ R reps , ψ ) ⊆ R d proj 2 k s i m R reps \mathbf{F}_{\mathrm{doc}}(P)\text{=}\left(\vec{p}_{1,\psi},\ldots,\vec{p}_{R_{\text{reps}},\psi}\right)\text{⊆}\mathbb{R}^{d_{\text{proj}}2^{k_{sim}}R_{\text{reps}}} Fdoc(P)=(p 1,ψ,,p Rreps,ψ)Rdproj2ksimRreps

2.   \textbf{2. } 2. 定理 2.1 \textbf{2.1} 2.1证明的思路

3.0.   \textbf{3.0. } 3.0. 定理 2.1 \textbf{2.1} 2.1的主要内容

👉前提 1 1 1:设定 ∀ ε , δ > 0 \forall{}\varepsilon,\delta\text{>}0 ε,δ>0(其中 ε ≤ 1 2 \varepsilon{}\text{≤}\cfrac{1}{2} ε21 δ ≤ ε \delta{≤}\varepsilon δε),给定单位向量集 P , Q ⊆ R d P,Q\text{⊆}\mathbb{R}^d P,QRd并满足 m = ∣ Q ∣ + ∣ P ∣ m\text{=}|Q|\text{+}|P| m=Q+P

👉前提 2 2 2:选择参数 k sim = O ( 1 ε log ⁡ ( m δ ) ) k_{\text{sim}}\text{=}O\left(\cfrac{1}{\varepsilon}\log{\left(\cfrac{m}{\delta}\right)}\right) ksim=O(ε1log(δm)) d proj = O ( 1 ε 2 log ⁡ ( m ε δ ) ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^2}\log{\left(\cfrac{m}{\varepsilon\delta}\right)}\right) dproj=O(ε21log(εδm)) R reps = 1 R_{\text{reps}}\text{=}1 Rreps=1

👉结论: Chamfer ( Q , P ) – ε ≤FDE ( Q , P ) ≤Chamfer ( Q , P ) + ε \text{Chamfer}(Q,P)\text{–}\varepsilon\text{≤}\text{FDE}(Q,P)\text{≤}\text{Chamfer}(Q,P)\text{+}\varepsilon Chamfer(Q,P)εFDE(Q,P)Chamfer(Q,P)+ε Pr≥ 1 – δ \text{Pr}\text{≥}1\text{–}\delta Pr1δ概率成立,即给出了 FDE \textbf{FDE} FDE相似度的上下界

2.1.   \textbf{2.1. } 2.1. 无投影无重复时 FDE \textbf{FDE} FDE相似度的上界

➡️ Chamfer \text{Chamfer} Chamfer后期交互的本质:一套针对 q i q_i qi向量的映射策略

image-20250223225238538
  1. f ( q i ) f(q_i) f(qi)操作为将 q i q_i qi映射到某一个 p q i ∈ P = { p 1 , p 2 , . . . , p m } p_{q_i}\text{∈}P\text{=}\{p_1,p_2,...,p_m\} pqiP={p1,p2,...,pm}向量
  2. Chamfer \text{Chamfer} Chamfer后期交互中 f ( q i ) = p q i = arg ⁡ p ∈ P MaxSim ( q i , P ) f(q_i)\text{=}p_{q_i}\text{=}\arg\limits_{p\text{∈}P}\text{MaxSim}(q_i,P) f(qi)=pqi=pPargMaxSim(qi,P),即遍历 p q i ∈ P = { p 1 , p 2 , . . . , p m } p_{q_i}\text{∈}P\text{=}\{p_1,p_2,...,p_m\} pqiP={p1,p2,...,pm},最终选取使内积 ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}}\rangle qi,pqi最大的 p q i p_{q_i} pqi
  3. 最终合并所有的内积 ∑ i =1 n ⟨ q i , p q i ⟩ \displaystyle{\sum_{i\text{=1}}^n}\langle{q_i,p_{q_i}}\rangle i=1nqi,pqi是为 Chamfer \text{Chamfer} Chamfer相似度,即 ∑ i =1 n ⟨ q i , p q i ⟩ =Chamfer ( Q , P ) = ⟨ Q , P ′ ⟩ \displaystyle{\sum_{i\text{=1}}^n}\langle{q_i,p_{q_i}}\rangle\text{=}\text{Chamfer}(Q,P)\text{=}\langle{Q,P^\prime}\rangle i=1nqi,pqi=Chamfer(Q,P)=Q,P

➡️ Muvera \text{Muvera} Muvera分桶的本质:另一套针对 q i q_i qi向量的映射策略

image-20250227173148677
  1. 对于单个桶 Bucket-k \text{Bucket-k} Bucket-k中有 ⟨ q ⃗ ( k ) , p ⃗ ( k ) ⟩ = ⟨ ∑ k i q k i , p ⃗ ( k ) ⟩ = ∑ k i ⟨ q k i , p ⃗ ( k ) ⟩ \langle{\vec{q}_{(k)},\vec{p}_{(k)}}\rangle\text{=}\left\langle{\displaystyle\sum_{k_i}{}q_{k_i},\vec{p}_{(k)}}\right\rangle\text{=}\displaystyle\sum_{k_i}{}\left\langle{q_{k_i},\vec{p}_{(k)}}\right\rangle q (k),p (k)=kiqki,p (k)=kiqki,p (k),相当于为每个落入该桶 k k k q k i q_{k_i} qki映射了一个 p ⃗ ( k ) \vec{p}_{(k)} p (k)
  2. 合并 B B B个桶的结果有 ∑ k = 1 B ⟨ q ⃗ ( k ) , p ⃗ ( k ) ⟩ = ∑ q i ∈ Q ⟨ q i , p ⃗ ( k q i ) ⟩ \displaystyle\sum_{k\text{=}1}^{B}\langle{\vec{q}_{(k)},\vec{p}_{(k)}}\rangle\text{=}\displaystyle\sum_{q_i\text{∈}Q}{}\left\langle{q_{i},\vec{p}_{(k_{q_i})}}\right\rangle k=1Bq (k),p (k)=qiQqi,p (kqi),相当于为每个 q i ∈ Q q_i\text{∈}Q qiQ都映射一个 p ⃗ ( k q i ) = p q i ∗ ∈ { p ⃗ ( 1 ) , . . . , p ⃗ ( B ) } \vec{p}_{(k_{q_i})}\text{=}p_{q_i}^*\text{∈}\{\vec{p}_{(1)},...,\vec{p}_{(B)}\} p (kqi)=pqi{p (1),...,p (B)}

➡️对两种映射方案的对比:有 ⟨ q i , p q i ∗ ⟩ ≤ ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}^*}\rangle\text{≤}\langle{q_i,p_{q_i}}\rangle qi,pqiqi,pqi

image-20250227181053033
  1. p q i p_{q_i} pqi的得到有且只有一种情况,就是 p q i = arg ⁡ max ⁡ p ∈ P ⟨ q i , p ⟩ = arg ⁡ p ∈ P MaxSim ( q i , P ) \displaystyle{}p_{q_i}\text{=}\arg\max_{p\text{∈}P}\langle{q_i,p}\rangle\text{=}\arg\limits_{p\text{∈}P}\text{MaxSim}(q_i,P) pqi=argpPmaxqi,p=pPargMaxSim(qi,P)
  2. p q i ∗ p_{q_i}^* pqi的得到分为三种情况,即 Case-0/Case-n \text{Case-0/Case-n} Case-0/Case-n(其中视 Case-0 \text{Case-0} Case-0 Case-n \text{Case-n} Case-n n=1 \text{n=1} n=1的情形)
    • Case-0 \text{Case-0} Case-0时,假设 q i q_i qi所落入的是桶 k k k,则选取离桶 k k k编码最近的一点 p ^ k \hat{p}_{k} p^k,即 p q i ∗ = p ⃗ ( k ) = p ^ k p_{q_i}^*\text{=}\displaystyle{}\vec{p}_{(k)}\text{=}\hat{p}_{k} pqi=p (k)=p^k
    • Case-n \text{Case-n} Case-n时,假设 q i q_i qi所落入的是桶 k k k,则 p q i ∗ = p ⃗ ( k ) = ∑ φ ( p j ) = k 1 ∣ P ∩ φ – 1 ( k ) ∣ p j p_{q_i}^*\text{=}\displaystyle{}\vec{p}_{(k)}\text{=}\displaystyle{}\sum_{\varphi(p_j)\text{=}k}\cfrac{1}{\left|P\text{∩}\varphi^{–1}(k)\right|}p_j pqi=p (k)=φ(pj)=kPφ–1(k)1pj,相当于桶 k k k种所有 p p p向量的质心
  3. 定义集合 Pdt= { ⟨ q i , p 1 ⟩ , ⟨ q i , p 2 ⟩ , . . . , ⟨ q i , p m ⟩ } \text{Pdt=}\{\langle{q_i,p_{1}}\rangle,\langle{q_i,p_{2}}\rangle,...,\langle{q_i,p_{m}}\rangle\} Pdt={⟨qi,p1,qi,p2,...,qi,pm⟩},根据 p q i p_{q_i} pqi的定义可知 ⟨ q i , p q 1 ⟩ \langle{q_i,p_{q_1}}\rangle qi,pq1 Pdt \text{Pdt} Pdt集合的最大值
    • Case-0 \text{Case-0} Case-0时,显然有 ⟨ q i , p q i ∗ ⟩ = ⟨ q i , p ^ k ⟩ ∈Pdt \langle{q_i,p_{q_i}^*}\rangle\text{=}\langle{q_i,\hat{p}_{k}}\rangle\text{∈Pdt} qi,pqi=qi,p^k∈Pdt,因此 ⟨ q i , p q i ∗ ⟩ ≤ ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}^*}\rangle\text{≤}\langle{q_i,p_{q_i}}\rangle qi,pqiqi,pqi即小于集合最大值
    • Case-n \text{Case-n} Case-n时,对每个 p j p_j pj都有 ⟨ q i , p q i ∗ ⟩ = ⟨ q i , ∑ φ ( p j ) = k 1 ∣ P ∩ φ – 1 ( k ) ∣ p j ⟩ = ∑ φ ( p j ) = k 1 ∣ P ∩ φ – 1 ( k ) ∣ ⟨ q i , p j ⟩ \langle{q_i,p_{q_i}^*}\rangle\text{=}\left\langle{q_i},\displaystyle{}\sum_{\varphi(p_j)\text{=}k}\cfrac{1}{\left|P\text{∩}\varphi^{–1}(k)\right|}p_j\right\rangle\text{=}\displaystyle{}\sum_{\varphi(p_j)\text{=}k}\cfrac{1}{\left|P\text{∩}\varphi^{–1}(k)\right|}\langle{q_i,p_j}\rangle qi,pqi=qi,φ(pj)=kPφ–1(k)1pj=φ(pj)=kPφ–1(k)1qi,pj
      • 集合 Pdt \text{Pdt} Pdt ∑ φ ( p j ) = k 1 ∣ P ∩ φ – 1 ( k ) ∣ ⟨ q i , p j ⟩ \displaystyle{}\sum_{\varphi(p_j)\text{=}k}\cfrac{1}{\left|P\text{∩}\varphi^{–1}(k)\right|}\langle{q_i,p_j}\rangle φ(pj)=kPφ–1(k)1qi,pj是子集均值, ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}}\rangle qi,pqi是全集最大值,后者必然更大
      • 所以 ⟨ q i , p q i ∗ ⟩ = ∑ φ ( p j ) = k 1 ∣ P ∩ φ – 1 ( k ) ∣ ⟨ q i , p j ⟩ ≤ ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}^*}\rangle\text{=}\displaystyle{}\sum_{\varphi(p_j)\text{=}k}\cfrac{1}{\left|P\text{∩}\varphi^{–1}(k)\right|}\langle{q_i,p_j}\rangle\text{≤}\langle{q_i,p_{q_i}}\rangle qi,pqi=φ(pj)=kPφ–1(k)1qi,pjqi,pqi

➡️两种相似评分的对比

image-20250227195846002
  1. Chamfer \text{Chamfer} Chamfer相似度:根据定义为 Chamfer ( Q , P ) = ∑ i =1 n ⟨ q i , p q i ⟩ = ⟨ Q , P ⟩ \text{Chamfer}(Q,P)\text{=}\displaystyle{\sum_{i\text{=1}}^n}\langle{q_i,p_{q_i}}\rangle\text{=}\langle{Q,P}\rangle Chamfer(Q,P)=i=1nqi,pqi=Q,P
  2. FDE \text{FDE} FDE相似度:根据定义没经过投影和重复的 FDE \text{FDE} FDE相似度可写作 FDE ( Q , P ) = ∑ k = 1 B ⟨ q ⃗ ( k ) , p ⃗ ( k ) ⟩ = ∑ i =1 n ⟨ q i , p q i ∗ ⟩ = ⟨ Q , P ′ ⟩ \text{FDE}(Q,P)\text{=}\displaystyle\sum_{k\text{=}1}^{B}\langle{\vec{q}_{(k)},\vec{p}_{(k)}}\rangle\text{=}\sum_{i\text{=1}}^n\langle{q_i,p_{q_i}^*}\rangle\text{=}\langle{Q,P^\prime}\rangle FDE(Q,P)=k=1Bq (k),p (k)=i=1nqi,pqi=Q,P
  3. 已证了 ⟨ q i , p q i ∗ ⟩ ≤ ⟨ q i , p q i ⟩ \langle{q_i,p_{q_i}^*}\rangle\text{≤}\langle{q_i,p_{q_i}}\rangle qi,pqiqi,pqi FDE ( Q , P ) ≤Chamfer ( Q , P ) \text{FDE}(Q,P)\text{≤}\text{Chamfer}(Q,P) FDE(Q,P)Chamfer(Q,P)

2.2.   \textbf{2.2. } 2.2. 加上一定维度的投影后原上界近似成立

➡️ Muvera \text{Muvera} Muvera分桶 + \text{+} +投影的过程

image-20250227204328958
  1. 同样首先相当于为每个 q i ∈ Q q_i\text{∈}Q qiQ都映射一个 p ⃗ ( k q i ) = p q i ∗ ∈ { p ⃗ ( 1 ) , . . . , p ⃗ ( B ) } \vec{p}_{(k_{q_i})}\text{=}p_{q_i}^*\text{∈}\{\vec{p}_{(1)},...,\vec{p}_{(B)}\} p (kqi)=pqi{p (1),...,p (B)},但是后续参与交互(内积)的从原始向量变为了投影向量
  2. 所以原有的 FDE ( Q , P ) \text{FDE}(Q,P) FDE(Q,P)为直接将内积 ⟨ q i , p q i ∗ ⟩ \left\langle{q_i,p^*_{q_i}}\right\rangle qi,pqi累加,现在变成将各自投影后的内积 ⟨ ψ ( q i ) , ψ ( p q i ∗ ) ⟩ \left\langle{\psi(q_i),\psi\left(p^*_{q_i}\right)}\right\rangle ψ(qi),ψ(pqi)累加
    • FDE \text{FDE} FDE相似度(无投影):可写作 FDE orgn ( Q , P ) = ∑ k = 1 B ⟨ q ⃗ ( k ) , p ⃗ ( k ) ⟩ = ∑ i =1 n ⟨ q i , p q i ∗ ⟩ \text{FDE}_{\text{orgn}}(Q,P)\text{=}\displaystyle\sum_{k\text{=}1}^{B}\langle{\vec{q}_{(k)},\vec{p}_{(k)}}\rangle\text{=}\sum_{i\text{=1}}^n\langle{q_i,p_{q_i}^*}\rangle FDEorgn(Q,P)=k=1Bq (k),p (k)=i=1nqi,pqi,并且已证明 FDE orgn ( Q , P ) ≤Chamfer ( Q , P ) \text{FDE}_{\text{orgn}}(Q,P)\text{≤}\text{Chamfer}(Q,P) FDEorgn(Q,P)Chamfer(Q,P)
    • FDE \text{FDE} FDE相似度(无投影):可写作 FDE proj ( Q , P ) = ∑ k = 1 B ⟨ ψ ( q ⃗ ( k ) ) , ψ ( p ⃗ ( k ) ) ⟩ = ∑ i =1 n ⟨ ψ ( q i ) , ψ ( p q i ∗ ) ⟩ \text{FDE}_{\text{proj}}(Q,P)\text{=}\displaystyle\sum_{k\text{=}1}^{B}\left\langle{\psi\left(\vec{q}_{(k)}\right),\psi\left(\vec{p}_{(k)}\right)}\right\rangle\text{=}\sum_{i\text{=1}}^n\left\langle{\psi(q_i),\psi\left(p^*_{q_i}\right)}\right\rangle FDEproj(Q,P)=k=1Bψ(q (k)),ψ(p (k))=i=1nψ(qi),ψ(pqi)

➡️事实 A2 \text{A2} A2表明了:给定 d proj = O ( 1 ε 2 log ⁡ m δ ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\cfrac{m}{\delta}\right) dproj=O(ε21logδm) ⟨ x , y ⟩ – ε ≤ ⟨ ψ ( x ) , ψ ( y ) ⟩ ≤ ⟨ x , y ⟩ + ε \langle{}x,y\rangle\text{–}\varepsilon\text{≤}\langle\psi(x),\psi(y)\rangle{}\text{≤}\langle{}x,y\rangle\text{+}\varepsilon x,yεψ(x),ψ(y)⟩x,y+ε Pr≥ 1 – δ \text{Pr}\text{≥}1\text{–}\delta Pr1δ概率成立

  1. d proj = O ( 1 ε 2 log ⁡ m δ ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\cfrac{m}{\delta}\right) dproj=O(ε21logδm)作为前提引入,代入结论有 ∑ i =1 n ⟨ q i , p q i ∗ ⟩ – ∣ Q ∣ ε ≤ ∑ i =1 n ⟨ ψ ( q i ) , ψ ( p q i ∗ ) ⟩ ≤ ∑ i =1 n ⟨ q i , p q i ∗ ⟩ + ∣ Q ∣ ε \displaystyle\sum_{i\text{=1}}^n\langle{q_i,p_{q_i}^*}\rangle\text{–}|Q|\varepsilon\text{≤}\sum_{i\text{=1}}^n\left\langle{\psi(q_i),\psi\left(p^*_{q_i}\right)}\right\rangle\text{≤}\sum_{i\text{=1}}^n\langle{q_i,p_{q_i}^*}\rangle\text{+}|Q|\varepsilon i=1nqi,pqiQεi=1nψ(qi),ψ(pqi)i=1nqi,pqi+Qε
    • 概率调整为,以 Pr≥ ( 1 – δ ) n ≥ 1 – ∣ Q ∣ δ \text{Pr}\text{≥}(1\text{–}\delta)^n\text{≥}1\text{–}|Q|\delta Pr(1δ)n1Qδ概率成立
    • 也可写作 FDE orgn ( Q , P ) – ∣ Q ∣ ε ≤ FDE proj ( Q , P ) ≤ FDE orgn ( Q , P ) + ∣ Q ∣ ε \displaystyle\text{FDE}_{\text{orgn}}(Q,P)\text{–}|Q|\varepsilon\text{≤}\text{FDE}_{\text{proj}}(Q,P)\text{≤}\text{FDE}_{\text{orgn}}(Q,P)\text{+}|Q|\varepsilon FDEorgn(Q,P)QεFDEproj(Q,P)FDEorgn(Q,P)+Qε
  2. 稍作调整,则有 FDE proj ( Q , P ) ≤ FDE orgn ( Q , P ) + ∣ Q ∣ ε ≤Chamfer ( Q , P ) + ∣ Q ∣ ε \text{FDE}_{\text{proj}}(Q,P)\text{≤}\text{FDE}_{\text{orgn}}(Q,P)\text{+}|Q|\varepsilon\text{≤}\text{Chamfer}(Q,P)\text{+}|Q|\varepsilon FDEproj(Q,P)FDEorgn(Q,P)+QεChamfer(Q,P)+Qε,以 Pr≥ 1 – ∣ Q ∣ δ \text{Pr}\text{≥}1\text{–}|Q|\delta Pr1Qδ概率成立
  3. 再按常数 Q Q Q的比例调整 δ \delta δ(注意 ε \varepsilon ε为与后续统一故不调整),则有 FDE proj ( Q , P ) ≤Chamfer ( Q , P ) + ∣ Q ∣ ε \text{FDE}_{\text{proj}}(Q,P)\text{≤}\text{Chamfer}(Q,P)\text{+}|Q|\varepsilon FDEproj(Q,P)Chamfer(Q,P)+Qε,以 Pr≥ 1 – δ \text{Pr}\text{≥}1\text{–}\delta Pr1δ概率成立

2.3.   \textbf{2.3. } 2.3. 已给出无重复 FDE \textbf{FDE} FDE上界👉还需证明什么才能给出无重复 FDE \textbf{FDE} FDE的上下界

➡️不妨假设需要证明的是 Chamfer ( Q , P ) – ∣ Q ∣ ε ≤FDE ( Q , P ) ≤Chamfer ( Q , P ) + ∣ Q ∣ ε \text{Chamfer}(Q,P)\text{–}|Q|\varepsilon\text{≤}\text{FDE}(Q,P)\text{≤}\text{Chamfer}(Q,P)\text{+}|Q|\varepsilon Chamfer(Q,P)QεFDE(Q,P)Chamfer(Q,P)+Qε Pr≥ 1 – δ \text{Pr}\text{≥}1\text{–}\delta Pr1δ概率成立

  1. 令下界成立为事件 F L / F_L/ FL/上界成立为事件 F U / F_U/ FU/上下界都成立为事件 F F F,原问题变为要证明 Pr [ F ] ≥ 1 – δ \text{Pr}[F]\text{≥}1\text{–}\delta Pr[F]1δ
  2. 不难得到当 ε ≤ 1 2 \varepsilon{}\text{≤}\cfrac{1}{2} ε21时,同时满足 Pr [ F L ] ≥ 1 – ε δ \text{Pr}[F_L]\text{≥}1\text{–}\varepsilon\delta Pr[FL]1εδ Pr [ F U ] ≥ 1 – ε δ \text{Pr}[F_U]\text{≥}1\text{–}\varepsilon\delta Pr[FU]1εδ,可推导出 Pr [ F ] ≥ 1 – δ \text{Pr}[F]\text{≥}1\text{–}\delta Pr[F]1δ
  3. 由已证明的上界结论可知设定 d proj = O ( 1 ε 2 log ⁡ ( m ε δ ) ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^2}\log{\left(\cfrac{m}{\varepsilon\delta}\right)}\right) dproj=O(ε21log(εδm))则有 Pr [ F U ] ≥ 1 – ε δ \text{Pr}[F_U]\text{≥}1\text{–}\varepsilon\delta Pr[FU]1εδ
  4. 所以增加前提 d proj = O ( 1 ε 2 log ⁡ ( m ε δ ) ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^2}\log{\left(\cfrac{m}{\varepsilon\delta}\right)}\right) dproj=O(ε21log(εδm))以及 ε ≤ 1 2 \varepsilon{}\text{≤}\cfrac{1}{2} ε21后,原问题变为要证明 Pr [ F L ] ≥ 1 – ε δ \text{Pr}\left[F_{L}\right]\text{≥}1\text{–}\varepsilon\delta Pr[FL]1εδ

➡️对 Chamfer ( Q , P ) – ∣ Q ∣ ε ≤FDE ( Q , P ) \text{Chamfer}(Q,P)\text{–}|Q|\varepsilon\text{≤}\text{FDE}(Q,P) Chamfer(Q,P)QεFDE(Q,P)的进一步变换

image-20250227232407524
  1. 根据 Chamfer \text{Chamfer} Chamfer的定义有 Chamfer ( Q , P ) = ∑ q i ∈ Q MaxSim ( q i , P ) \text{Chamfer}(Q,P)\text{=}\displaystyle{}\sum_{q_i\text{∈}Q}\text{MaxSim}(q_i,P) Chamfer(Q,P)=qiQMaxSim(qi,P)
  2. 又由于 FDE \text{FDE} FDE q i q_i qi的处理式独立线性的,所以 FDE ( Q , P ) = ∑ q i ∈ Q FDE ( q i , P ) \text{FDE}(Q,P)\text{=}\displaystyle{}\sum_{q_i\text{∈}Q}\text{FDE}(q_i,P) FDE(Q,P)=qiQFDE(qi,P)
  3. 所以原问题变为需证 ∑ q i ∈ Q ( MaxSim ( q i , P ) – ε ) ≤ ∑ q i ∈ Q FDE ( q i , P ) \displaystyle{}\sum_{q_i\text{∈}Q}\left(\text{MaxSim}(q_i,P)\text{–}\varepsilon\right)\text{≤}\sum_{q_i\text{∈}Q}\text{FDE}(q_i,P) qiQ(MaxSim(qi,P)ε)qiQFDE(qi,P),以 Pr≥ 1 – ε δ \text{Pr}\text{≥}1\text{–}\varepsilon\delta Pr1εδ概率成立
  4. 去掉求和号后,原问题变为需证 MaxSim ( q i , P ) – ε ≤FDE ( q i , P ) \text{MaxSim}(q_i,P)\text{–}\varepsilon\text{≤}\text{FDE}(q_i,P) MaxSim(qi,P)εFDE(qi,P),以 Pr≥ 1 – ε δ ∣ Q ∣ \text{Pr}\text{≥}1–\cfrac{\varepsilon\delta}{|Q|} Pr1–Qεδ概率成立

➡️后续称 FDE ( q i , P ) \text{FDE}(q_i,P) FDE(qi,P)的上下界为 FDE \text{FDE} FDE子上下界

2.4.   \textbf{2.4. } 2.4. 无投影无重复时 FDE \textbf{FDE} FDE相似度的子下界

➡️结合之前的分析,对于 FDE ( q i , P ) = ⟨ q i , p q i ∗ ⟩ \displaystyle\text{FDE}(q_i,P)\text{=}\left\langle{q_i,p_{q_i}^*}\right\rangle FDE(qi,P)=qi,pqi,在无投影时总存在一个子集 S ⊆ P S\text{⊆}P SP,使得 p q i ∗ = 1 ∣ S ∣ ∑ p j ∈ S p j = S center p_{q_i}^*\text{=}\displaystyle{}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}p_j\text{=}S_{\text{center}} pqi=S1pjSpj=Scenter

image-20250228001845209
  1. 当单个 q i q_i qi落入的桶属于 Case-0 \text{Case-0} Case-0情形时, S S S是单个点 p ^ k = arg ⁡ min ⁡ p j ∈ P ∥ φ ( q i ) – φ ( p j ) ∥ 0 \displaystyle{}\hat{p}_k\text{=}\arg{}\min_{p_j\text{∈}P}\|\varphi(q_i)–\varphi(p_j)\|_{0} p^k=argpjPminφ(qi)φ(pj)0,也就是离桶编码 φ ( q i ) \varphi(q_i) φ(qi)海明距离最近的点
  2. 当单个 q i q_i qi落入的桶属于 Case-n \text{Case-n} Case-n情形时, S S S元素需满足 φ ( q i ) = φ ( p j ) \varphi(q_i)\text{=}\varphi(p_j) φ(qi)=φ(pj),也就是所有落入该桶的 q j q_j qj

➡️对事实 A4 \text{A4} A4给定 k sim = O ( 1 ε log ⁡ ( m δ ) ) k_{\text{sim}}\text{=}O\left(\cfrac{1}{\varepsilon}\log{\left(\cfrac{m}{\delta}\right)}\right) ksim=O(ε1log(δm)) ∀ p j ∈ P \forall{p_j}\text{∈}P pjP ∣ ∥ φ ( q i ) – φ ( p j ) ∥ 0 k s i m – θ ( q i , p j ) π ∣ ≤ ε \left|\cfrac{\|\varphi(q_i)–\varphi(p_j)\|_0}{k_{\mathrm{sim}}}\text{–}\cfrac{\theta(q_i,p_j)}{\pi}\right|\text{≤}\sqrt{\varepsilon} ksimφ(qi)φ(pj)0πθ(qi,pj) ε Pr> 1 – ε δ m 2 \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2} Pr1–m2εδ概率成立

image-20250228004958647
  1. 对于 q i {q_i} qi p q 1 ∗ p^*_{q_1} pq1还有 p q 1 p_{q_1} pq1,总是会有 ∥ φ ( q i ) – φ ( p q 1 ∗ ) ∥ 0 k s i m ≤ ∥ φ ( q i ) – φ ( p q 1 ) ∥ 0 k s i m \cfrac{\|\varphi(q_i)–\varphi(p^*_{q_1})\|_0}{k_{\mathrm{sim}}}\text{≤}\cfrac{\|\varphi(q_i)–\varphi(p_{q_1})\|_0}{k_{\mathrm{sim}}} ksimφ(qi)φ(pq1)0ksimφ(qi)φ(pq1)0
    • 对于 Case-0 \text{Case-0} Case-0情形, p q i ∗ p^*_{q_i} pqi为离编号为 φ ( q i ) \varphi(q_i) φ(qi)的桶海明距离最近的 p j p_j pj向量,所以 ∥ φ ( q i ) – φ ( p q i ∗ ) ∥ 0 ≤ ∥ φ ( q i ) – φ ( p q i ) ∥ 0 {\|\varphi(q_i)–\varphi(p^*_{q_i})\|_0}\text{≤}{\|\varphi(q_i)–\varphi(p_{q_i})\|_0} φ(qi)φ(pqi)0φ(qi)φ(pqi)0
    • 对于 Case-n \text{Case-n} Case-n情形, p q i ∗ p^*_{q_i} pqi为与 q i q_i qi同桶的所有 p j p_j pj向量的质心,即 S = { p j ∈ P ∣ φ ( q i ) = φ ( p j ) } S\text{=}\left\{p_j\text{∈}P|{\varphi}(q_i)\text{=}\varphi(p_j)\right\} S={pjPφ(qi)=φ(pj)}的质心 S center S_{\text{center}} Scenter
      • 回忆 SimHash \text{SimHash} SimHash分桶的本质,就是用 k sim k_\text{sim} ksim个法平面将空间切割为 2 k sim 2^{k_\text{sim}} 2ksim个子空间
      • 所有 p j ∈ S p_j\text{∈}S pjS都落入同一桶即在同一子空间,则其质心 p q i ∗ = S center p^*_{q_i}\text{=}S_{\text{center}} pqi=Scenter也一定在该子空间中,所以 φ ( q i ) = φ ( p j ) = φ ( p q i ∗ ) \varphi(q_i)\text{=}\varphi(p_j)\text{=}\varphi(p^*_{q_i}) φ(qi)=φ(pj)=φ(pqi)
      • 所以 ∥ φ ( q i ) – φ ( p q i ∗ ) ∥ 0 = 0 ≤ ∥ φ ( q i ) – φ ( p q i ) ∥ 0 {\|\varphi(q_i)–\varphi(p^*_{q_i})\|_0}\text{=}0\text{≤}{\|\varphi(q_i)–\varphi(p_{q_i})\|_0} φ(qi)φ(pqi)0=0φ(qi)φ(pqi)0
  2. p q 1 p_{q_1} pq1 p q 1 ∗ p^*_{q_1} pq1都代入事实 A4 \text{A4} A4,则 θ ( q i , p q 1 ∗ ) π – ε ≤ ∥ φ ( q i ) – φ ( p q 1 ∗ ) ∥ 0 k s i m ≤ ∥ φ ( q i ) – φ ( p q i ) ∥ 0 k s i m ≤ θ ( q i , p q i ) π + ε \cfrac{\theta({q_i,p^*_{q_1}})}{\pi}–\sqrt{\varepsilon}\text{≤}\cfrac{\|\varphi(q_i)–\varphi(p^*_{q_1})\|_0}{k_{\mathrm{sim}}}\text{≤}\cfrac{\|\varphi(q_i)–\varphi(p_{q_i})\|_0}{k_{\mathrm{sim}}}\text{≤}\cfrac{\theta({q_i,p_{q_i}})}{\pi}\text{+}\sqrt{\varepsilon} πθ(qi,pq1)ε ksimφ(qi)φ(pq1)0ksimφ(qi)φ(pqi)0πθ(qi,pqi)+ε Pr> 1 – ε δ m 2 \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2} Pr1–m2εδ概率成立
    • 因此 ∣ θ ( q i , p q 1 ∗ ) – θ ( q i , p q i ) ∣ = O ( ε ) \left|\theta(q_i,p^*_{q_1})–\theta({q_i,p_{q_i}})\right|\text{=}O\left(\sqrt{\varepsilon}\right) θ(qi,pq1)θ(qi,pqi) =O(ε ) Pr> 1 – ε δ m 2 \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2} Pr1–m2εδ概率成立
    • 上式可得 ∣ cos ⁡ θ ( q i , p q 1 ∗ ) – cos ⁡ θ ( q i , p q i ) ∣ < O ( ε ) \left|\cos\theta(q_i,p^*_{q_1})–\cos\theta({q_i,p_{q_i}})\right|\text{<}O(\varepsilon) cosθ(qi,pq1)cosθ(qi,pqi) <O(ε) Pr> 1 – ε δ m 2 \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2} Pr1–m2εδ成立,其中 { cos ⁡ θ ( q i , p q 1 ∗ ) = ⟨ q i , p q 1 ∗ ⟩ =FDE ( q i , P ) cos ⁡ θ ( q i , p q i ) = ⟨ q i , p q i ⟩ =MaxSim ( q i , P ) \begin{cases}\cos\theta(q_i,p^*_{q_1})\text{=}\langle{q_i,p^*_{q_1}}\rangle\text{=}\text{FDE}(q_i,P)\\\\\cos\theta({q_i,p_{q_i}})\text{=}\langle{q_i,p_{q_i}}\rangle\text{=}\text{MaxSim}(q_i,P)\end{cases} cosθ(qi,pq1)=qi,pq1=FDE(qi,P)cosθ(qi,pqi)=qi,pqi=MaxSim(qi,P)
    • 所以 MaxSim ( q i , P ) – O ( ε ) ≤FDE ( q i , P ) ≤MaxSim ( q i , P ) + O ( ε ) \text{MaxSim}(q_i,P)–O(\varepsilon)\text{≤}\text{FDE}(q_i,P)\text{≤}\text{MaxSim}(q_i,P)\text{+}O(\varepsilon) MaxSim(qi,P)O(ε)FDE(qi,P)MaxSim(qi,P)+O(ε) Pr> 1 – ε δ m 2 \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2} Pr1–m2εδ概率成立

➡️事实 A4 \text{A4} A4给出了 FDE ( q i , P ) \text{FDE}(q_i,P) FDE(qi,P)的子上下界

  1. 取结论的左半边 MaxSim ( q i , P ) – O ( ε ) ≤FDE ( q i , P ) \text{MaxSim}(q_i,P)–O(\varepsilon)\text{≤}\text{FDE}(q_i,P) MaxSim(qi,P)O(ε)FDE(qi,P)并给概率放水 Pr> 1 – ε δ m 2 > 1 – ε δ ∣ Q ∣ \text{Pr}\text{>}1–\cfrac{\varepsilon{}\delta{}}{m^2}\text{>}1–\cfrac{\varepsilon\delta}{|Q|} Pr1–m2εδ>1–Qεδ,便得到了一个更弱的结论
  2. 再对 O ( ε ) O(\varepsilon) O(ε)进行常数缩放为 ε \varepsilon ε,该更弱的结论就变成了所要证的 FDE \text{FDE} FDE相似度的子下界

2.5.   \textbf{2.5. } 2.5. 加上一定维度的投影后原子下界近似成立

➡️投影前后的对比:用 FDE orgn \text{FDE}_{\text{orgn}} FDEorgn FDE proj \text{FDE}_{\text{proj}} FDEproj表示区分

image-20250227232407524
  1. 投影前为 FDE orgn ( q i , P ) = ⟨ q i , p q i ∗ ⟩ = ⟨ q i , 1 ∣ S ∣ ∑ p j ∈ S p j ⟩ = 1 ∣ S ∣ ∑ p j ∈ S ⟨ q i , p j ⟩ \displaystyle{}\text{FDE}_{\text{orgn}}(q_i,P)\text{=}\left\langle{q_i,p_{q_i}^*}\right\rangle\text{=}\left\langle{}q_i,\cfrac{1}{|S|}\sum_{p_j\text{∈}S}p_j\right\rangle{}\text{=}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}{\langle{}q_i,p_j\rangle} FDEorgn(qi,P)=qi,pqi=qi,S1pjSpj=S1pjSqi,pj
  2. 投影后为 FDE proj ( q i , P ) = ⟨ ψ ( q i ) , ψ ( p q i ∗ ) ⟩ = ⟨ ψ ( q i ) , ψ ( 1 ∣ S ∣ ∑ p j ∈ S p j ) ⟩ = 1 ∣ S ∣ ∑ p j ∈ S ⟨ ψ ( q i ) , ψ ( p j ) ⟩ \displaystyle{}\text{FDE}_{\text{proj}}(q_i,P)\text{=}\left\langle{\psi(q_i),\psi\left(p_{q_i}^*\right)}\right\rangle\text{=}\left\langle{}{\psi}(q_i),{\psi}\left(\cfrac{1}{|S|}\sum_{p_j\text{∈}S}p_j\right)\right\rangle{}\text{=}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}\left\langle{\psi(q_i),\psi\left(p_j\right)}\right\rangle FDEproj(qi,P)=ψ(qi),ψ(pqi)=ψ(qi),ψ S1pjSpj =S1pjSψ(qi),ψ(pj)

➡️应用事实 A2 \text{A2} A2:用 ε δ ∣ Q ∣ \cfrac{\varepsilon\delta}{|Q|} Qεδ替代 δ \delta δ以设定 d proj = O ( 1 ε 2 log ⁡ ( ∣ Q ∣ ε δ ) ) = O ( 1 ε 2 log ⁡ ( m ε δ ) ) d_{\text{proj}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\left(\cfrac{|Q|}{\varepsilon\delta}\right)\right)\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\left(\cfrac{m}{\varepsilon\delta}\right)\right) dproj=O(ε21log(εδQ))=O(ε21log(εδm))

  1. 则有内积的偏差 ⟨ x , y ⟩ – ε ≤ ⟨ ψ ( x ) , ψ ( y ) ⟩ ≤ ⟨ x , y ⟩ + ε \langle{}x,y\rangle\text{–}\varepsilon\text{≤}\langle\psi(x),\psi(y)\rangle{}\text{≤}\langle{}x,y\rangle\text{+}\varepsilon x,yεψ(x),ψ(y)⟩x,y+ε Pr≥ 1 – ε δ ∣ Q ∣ \text{Pr}\text{≥}1\text{–}\cfrac{\varepsilon\delta}{|Q|} Pr1Qεδ概率成立概率成立
  2. 将事实 A2 \text{A2} A2代回则有 1 ∣ S ∣ ∑ p j ∈ S ⟨ q i , p j ⟩ – ε ≤ 1 ∣ S ∣ ∑ p j ∈ S ⟨ ψ ( q i ) , ψ p j ⟩ ≤ 1 ∣ S ∣ ∑ p j ∈ S ⟨ q i , p j ⟩ + ε \displaystyle{}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}{\langle{}q_i,p_j\rangle}\text{–}\varepsilon\text{≤}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}\left\langle{\psi(q_i),\psi{}p_j}\right\rangle\text{≤}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}{\langle{}q_i,p_j\rangle}\text{+}\varepsilon S1pjSqi,pjεS1pjSψ(qi),ψpjS1pjSqi,pj+ε,以 Pr≥ 1 – ε δ ∣ Q ∣ \text{Pr}\text{≥}1\text{–}\cfrac{\varepsilon\delta}{|Q|} Pr1Qεδ概率成立概率成立
    • 其中 FDE orgn ( q i , P ) = 1 ∣ S ∣ ∑ p j ∈ S ⟨ q i , p j ⟩ \displaystyle{}\text{FDE}_{\text{orgn}}(q_i,P)\text{=}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}{\langle{}q_i,p_j\rangle} FDEorgn(qi,P)=S1pjSqi,pj,以及 FDE proj ( q i , P ) = 1 ∣ S ∣ ∑ p j ∈ S ⟨ ψ ( q i ) , ψ ( p j ) ⟩ \displaystyle{}\text{FDE}_{\text{proj}}(q_i,P)\text{=}\cfrac{1}{|S|}\sum_{p_j\text{∈}S}\left\langle{\psi(q_i),\psi\left(p_j\right)}\right\rangle FDEproj(qi,P)=S1pjSψ(qi),ψ(pj)
    • 所以 FDE orgn ( q i , P ) – ε ≤ FDE proj ( q i , P ) ≤ FDE orgn ( q i , P ) + ε \displaystyle{}\text{FDE}_{\text{orgn}}(q_i,P)\text{–}\varepsilon\text{≤}\text{FDE}_{\text{proj}}(q_i,P)\text{≤}\text{FDE}_{\text{orgn}}(q_i,P)\text{+}\varepsilon FDEorgn(qi,P)εFDEproj(qi,P)FDEorgn(qi,P)+ε,以 Pr≥ 1 – ε δ ∣ Q ∣ \text{Pr}\text{≥}1\text{–}\cfrac{\varepsilon\delta}{|Q|} Pr1Qεδ概率成立概率成立
  3. 已证 MaxSim ( q i , P ) – O ( ε ) ≤ FDE proj ( q i , P ) ≤MaxSim ( q i , P ) + O ( ε ) \text{MaxSim}(q_i,P)–O(\varepsilon)\text{≤}\text{FDE}_{\text{proj}}(q_i,P)\text{≤}\text{MaxSim}(q_i,P)\text{+}O(\varepsilon) MaxSim(qi,P)O(ε)FDEproj(qi,P)MaxSim(qi,P)+O(ε) Pr≥ 1 – ε δ ∣ Q ∣ \text{Pr}\text{≥}1–\cfrac{\varepsilon\delta}{|Q|} Pr1–Qεδ概率成立
    • 代入得 FDE proj ( q i , P ) ≥ FDE orgn ( q i , P ) – ε ≥ MaxSim ( q i ) – ε – O ( ε ) \displaystyle\text{FDE}_{\text{proj}}(q_i,P)\text{≥}\text{FDE}_{\text{orgn}}(q_i,P)\text{–}\varepsilon\text{≥}\displaystyle{}\text{MaxSim}(q_i)\text{–}\varepsilon–O(\varepsilon) FDEproj(qi,P)FDEorgn(qi,P)εMaxSim(qi)εO(ε)
    • ε \varepsilon ε进行常数因子缩放后,即证毕

3.   \textbf{3. } 3. 定理 2.2 \textbf{2.2} 2.2证明的思路

3.0.   \textbf{3.0. } 3.0. 定理 2.2 \textbf{2.2} 2.2的主要内容

👉条件 1 1 1:给定单个查询 Q Q Q以及多个段落 P = { P 1 , … , P n } P\text{=}\left\{P_{1},\ldots,P_{n}\right\} P={P1,,Pn}并且 Q , ∀ P i ⊆ R d Q,\forall{}P_i\text{⊆}\mathbb{R}^{d} Q,PiRd,并令 m = ∣ Q ∣ + max ⁡ i ∈ [ n ] ∣ P i ∣ \displaystyle{}m\text{=}|Q|\text{+}\max_{i\text{∈}[n]}\left|P_{i}\right| m=Q+i[n]maxPi

👉条件 2 2 2:给定 ∀ ε > 0 \forall\varepsilon\text{>}0 ε>0,设置参数 k sim = O ( log ⁡ m ε ) , d proj = O ( 1 ε 2 log ⁡ ( m ε ) ) , R reps = O ( 1 ε 2 log ⁡ n ) k_{\text{sim}}\text{=}O\left(\cfrac{\log{m}}{\varepsilon}\right),d_{\text {proj}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\left(\cfrac{m}{\varepsilon}\right)\right),R_{\text{reps}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log{n}\right) ksim=O(εlogm),dproj=O(ε21log(εm)),Rreps=O(ε21logn)

👉条件 3 3 3:令 i ∗ = arg ⁡ max ⁡ i ∈ [ n ] FDE ( Q , P i ) \displaystyle{}i^{*}\text{=}\arg\max_{i\text{∈}[n]}\text{FDE}(Q,P_i) i=argi[n]maxFDE(Q,Pi),即 P i ∗ P_{i^*} Pi是通过 Muvera \text{Muvera} Muvera方法找到的,与查询 Q Q Q最相似的段落

👉结论 1 1 1 1 ∣ Q ∣ Chamfer ( Q , P i ∗ ) ≥ max ⁡ i ∈ [ n ] 1 ∣ Q ∣ Chamfer ( Q , P i ) – ε \displaystyle{}\cfrac{1}{|Q|}\text{Chamfer}\left(Q, P_{i^{*}}\right)\text{≥}\max_{i\text{∈}[n]}\cfrac{1}{|Q|}\text{Chamfer}\left(Q, P_{i}\right)–\varepsilon Q1Chamfer(Q,Pi)i[n]maxQ1Chamfer(Q,Pi)ε Pr= 1 – 1 poly ( n ) \text{Pr=}1\text{–}\cfrac{1}{\text{poly}(n)} Pr=1poly(n)1概率成立

3.1.   \textbf{3.1. } 3.1. 定理 2.2 \textbf{2.2} 2.2的主要内容

➡️在考虑重复 R reps R_{\text{reps}} Rreps次的情况下,对于每个重复 k ∈ [ R reps ] k\text{∈}\left[R_{\text{reps}}\right] k[Rreps],设定每次重复对最终相似度的贡献为 FDE k ( Q , P α ) \text{FDE}^k(Q,P_\alpha) FDEk(Q,Pα)

  1. 对于最终相似度,有 FDE ( Q , P α ) = ∑ k = 1 R reps FDE k ( Q , P α ) \text{FDE}(Q,P_\alpha)\text{=}\displaystyle\sum_{k\text{=}1}^{R_{\text{reps}}}\text{FDE}^k(Q,P_\alpha) FDE(Q,Pα)=k=1RrepsFDEk(Q,Pα),不妨设定随机变量 X k = 1 ∣ Q ∣ FDE k ( Q , P α ) X_k\text{=}\cfrac{1}{|Q|}\text{FDE}^k(Q,P_\alpha) Xk=Q1FDEk(Q,Pα)

➡️最关键的一步在于,对 X k X_k Xk尝试运用 Chernoff \text{Chernoff} Chernoff界限,即 ∀ X i ∈ [ a , b ] \forall{X_i}\text{∈}[a,b] Xi[a,b] Pr [ ∣ 1 R ∑ i = 1 R X i – μ ∣ ≥ ε ] ≤ 2 e ( – 2 R ε 2 ( b – a ) 2 ) \displaystyle{}\text{Pr}\left[\left|\frac{1}{R}\sum_{i=1}^RX_i–\mu\right|\text{≥}\varepsilon\right]\text{≤}2e^{\left(–\frac{2R\varepsilon^2}{(b–a)^2}\right)} Pr[ R1i=1RXiμ ε]2e((ba)22Rε2)

  1. Chernoff \text{Chernoff} Chernoff界限中参数的确定
    • 将上式中 R R R视作 R reps R_{\text{reps}} Rreps,并将 R reps = O ( 1 ε 2 log ⁡ n ) R_{\text{reps}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log{n}\right) Rreps=O(ε21logn)作为前提引入
    • 对于 μ \mu μ即均值,根据定理 2.1 \text{2.1} 2.1引入前提 k sim = O ( log ⁡ m ε ) , d proj = O ( 1 ε 2 log ⁡ ( m ε ) ) k_{\text{sim}}\text{=}O\left(\cfrac{\log{m}}{\varepsilon}\right),d_{\text {proj}}\text{=}O\left(\cfrac{1}{\varepsilon^{2}}\log\left(\cfrac{m}{\varepsilon}\right)\right) ksim=O(εlogm),dproj=O(ε21log(εm))后,有 E [ X k ] ∈ 1 ∣ Q ∣ Chamfer ( Q , P α ) ± ε \mathbb{E}[X_k]\text{∈}\cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha)\text{±}\varepsilon E[Xk]Q1Chamfer(Q,Pα)±ε
    • 对于 [ a , b ] [a,b] [a,b] X k X_k Xk的范围,不难得到 X k ∈ [ – m , m ] X_k\text{∈}[–m,m] Xk[m,m]
  2. 将以上参数套用到 Chernoff \text{Chernoff} Chernoff界限则有
    • 概率:以 Pr≥ 1 – 2 e ( – R reps ε 2 2 m 2 ) \text{Pr}\text{≥}1–2e^{\left(–\frac{R_{\text{reps}}\varepsilon^2}{2m^2}\right)} Pr1–2e(2m2Rrepsε2)概率成立,可以进一步转化为 Pr≥ 1 – 2 e ( – R reps ε 2 2 m 2 ) ≥ 1 – 2 n C = 1 – 1 poly ( n ) \text{Pr}\text{≥}1–2e^{\left(–\frac{R_{\text{reps}}\varepsilon^2}{2m^2}\right)}\text{≥}1–\cfrac{2}{n^C}\text{=}1\text{–}\cfrac{1}{\text{poly}(n)} Pr1–2e(2m2Rrepsε2)1–nC2=1poly(n)1
    • 事件: ∣ ∑ k = 1 R reps X k R reps – 1 ∣ Q ∣ Chamfer ( Q , P α ) ∣ ≤ 2 ε \displaystyle{}\left|\sum_{k\text{=}1}^{R_{\text{reps}}}\cfrac{X_k}{R_{\text{reps}}}–\cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha)\right|\text{≤}2\varepsilon k=1RrepsRrepsXkQ1Chamfer(Q,Pα) 2ε
      • 代入 FED \text{FED} FED最相似文档 P α ∗ P_{\alpha^*} Pα后,有 1 ∣ Q ∣ Chamfer ( Q , P α ∗ ) ≥ 1 ∣ Q ∣ R reps FDE ( Q , P α ∗ ) – 2 ε = max ⁡ α ∈ [ n ] 1 ∣ Q ∣ R reps FDE ( Q , P α ) – 2 ε \cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha^*)\text{≥}\cfrac{1}{|Q|R_{\text{reps}}}\text{FDE}(Q,P_\alpha^*)–2\varepsilon\text{=}\displaystyle{}\max_{\alpha\text{∈}[n]}\cfrac{1}{|Q|R_{\text{reps}}}\text{FDE}(Q,P_\alpha)–2\varepsilon Q1Chamfer(Q,Pα)QRreps1FDE(Q,Pα)–2ε=α[n]maxQRreps1FDE(Q,Pα)–2ε
      • 根据定理 2.1 \text{2.1} 2.1又可以知道 max ⁡ α ∈ [ n ] 1 ∣ Q ∣ R reps FDE ( Q , P α ) ≥ max ⁡ α ∈ [ n ] 1 ∣ Q ∣ Chamfer ( Q , P α ) – ε \displaystyle{}\max_{\alpha\text{∈}[n]}\cfrac{1}{|Q|R_{\text{reps}}}\text{FDE}(Q,P_\alpha)\text{≥}\max_{\alpha\text{∈}[n]}\cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha)\text{–}\varepsilon α[n]maxQRreps1FDE(Q,Pα)α[n]maxQ1Chamfer(Q,Pα)ε
      • 所以最终原事件可转化为 1 ∣ Q ∣ Chamfer ( Q , P α ∗ ) ≥ max ⁡ α ∈ [ n ] 1 ∣ Q ∣ Chamfer ( Q , P α ) – 3 ε \displaystyle{}\cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha^*)\text{≥}\max_{\alpha\text{∈}[n]}\cfrac{1}{|Q|}\text{Chamfer}(Q,P_\alpha)\text{–}3\varepsilon Q1Chamfer(Q,Pα)α[n]maxQ1Chamfer(Q,Pα)3ε,对 ε \varepsilon ε做常数倍变换即证毕

http://www.kler.cn/a/568648.html

相关文章:

  • P8637 [蓝桥杯 2016 省 B] 交换瓶子
  • Element Plus使用(五)
  • Windows本地Docker+Open-WebUI部署DeepSeek
  • LeetCode 热题100 438. 找到字符串中所有字母异位词
  • 【文献阅读】Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward
  • 【vue-echarts】——01.认识echarts
  • 线性回归:机器学习基础算法全解析
  • 【Linux】进程替换(七)
  • 小程序中的插槽(Slot)机制及其与 Vue 组件的异同
  • git 的 Detached HEAD
  • 作业及参考
  • 0x36d(CRYPTO)
  • DeepSeek 助力 Vue3 开发:打造丝滑的密码输入框(Password Input)
  • 计算机视觉(opencv-python)之图像预处理基本操作(待补充)
  • Linux :进程状态
  • 微服务合并
  • 关于SSM项目的整合
  • 3.jvm的执行流程
  • 搭建基于Agent的金融问答系统
  • 安当防火墙登录安全解决方案:零信任认证+国密证书+动态口令构建全方位身份安全屏障