
[Paper Close Reading] Polarized Graph Neural Networks

Paper link: Polarized Graph Neural Networks | Proceedings of the ACM Web Conference 2022

The English is typed entirely by hand and consists of my summarizing and paraphrasing of the original paper. Unavoidable spelling and grammar mistakes may appear; if you spot any, feel free to point them out in the comments! This post leans toward personal notes, so read with caution.

Contents

1. TL;DR

1.1. Takeaways

2. Section-by-section close reading

2.1. Abstract

2.2. Introduction

2.3. Related work

2.4. Preliminaries

2.4.1. Problem formulation

2.4.2. Homophily and heterophily in graphs

2.4.3. Anonymous walks

2.5. Method

2.5.1. Measurement of similarities and dissimilarities

2.5.2. Polarization on the hyper-spherical space

2.5.3. Spherical classification loss

2.6. Experiment

2.6.1. Experimental setup

2.6.2. Node classification on different types of networks

2.6.3. Model analysis

2.6.4. Visualization

2.7. Conclusion

3. Reference


1. TL;DR

1.1. Takeaways

(1) A fairly easy-to-follow paper, but the lack of released code is still a bit of a pity

2. Section-by-section close reading

2.1. Abstract

        ①AGAIN~ message passing is only suited to homophilic information dissemination

        ②They propose the Polarized Graph Neural Network (Polar-GNN)

2.2. Introduction

        ①The difference in edge generation between homophilic and heterophilic graphs:

        ②Heterophilic information promotes polarization (if so, wouldn't seeing similar opinions reinforce a person's view, and seeing dissimilar opinions also reinforce it, so that good and bad inputs are both treated as reinforcement? Honestly this is hard to judge: some people become more firm in their own views when they see things they dislike, while others gradually come around. A computational model probably shouldn't copy some psychological theory wholesale; there is no fixed rule for this)

        ③Challenges: mining similar and dissimilar unlabeled nodes, and discriminatively aggregating messages

2.3. Related work

        ①Reviews homophily-based, heterophily-based, spectral, and contrastive learning methods

2.4. Preliminaries

2.4.1. Problem formulation

        ①Graph representation: G=\{V,E\}, where v_i\in V is a node and e_{ij}=(v_i,v_j)\in E is an edge

        ②Each node is associated with a feature vector x_{i}\in\mathbb{R}^{d}

        ③Each node belongs to a class with label y_i

        ④In this semi-supervised setting, only a few nodes are labeled

2.4.2. Homophily and heterophily in graphs

        ①Definition 1: the edge homophily ratio:

h=\frac{|\{(v_i,v_j)\mid(v_i,v_j)\in E\wedge y_i=y_j\}|}{|E|}

where the numerator counts intra-class edges (edges whose endpoints share a label) and |E| is the total number of edges

        ②Interpretation: for strongly homophilic graphs h\rightarrow 1; for strongly heterophilic graphs h\rightarrow 0
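A minimal sketch of computing h from an edge list and node labels (function and variable names are mine, not from the paper):

```python
def edge_homophily(edges, labels):
    """Edge homophily ratio h: fraction of edges whose endpoints share a label."""
    intra = sum(1 for i, j in edges if labels[i] == labels[j])
    return intra / len(edges)

# Toy graph: 4 nodes, 3 edges, one intra-class edge -> h = 1/3
edges = [(0, 1), (1, 2), (2, 3)]
labels = {0: "a", 1: "a", 2: "b", 3: "a"}
print(edge_homophily(edges, labels))  # 0.333...
```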

2.4.3. Anonymous walks

        ①Anonymous walks are similar to random walks but discard the specific identities of nodes; a minimal example is sketched below:
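Concretely (my own sketch, not the authors' code), an anonymous walk replaces each node in a walk by the index of its first appearance, so the revisit pattern is kept while node identities are dropped:

```python
def anonymize(walk):
    """Map each node to its first-appearance index, e.g. [5, 9, 5, 3] -> [0, 1, 0, 2]."""
    first_seen = {}
    return [first_seen.setdefault(node, len(first_seen)) for node in walk]

print(anonymize([5, 9, 5, 3]))  # [0, 1, 0, 2]
```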

2.5. Method

        ①Overall framework:

2.5.1. Measurement of similarities and dissimilarities

        ①Taking GAT as an example:

e_{ij}=\mathrm{Attention}(Wh_{i},Wh_{j}),\quad \alpha_{ij}=\mathrm{softmax}_{j}(e_{ij})=\frac{\exp(e_{ij})}{\sum_{k\in N_{i}}\exp(e_{ik})}

        ②They introduce negative weights for dissimilar node pairs via a feature-based similarity:

\alpha_{ij}^{h}=f\left(w_1\odot h_i,w_1\odot h_j\right)

where \odot denotes the Hadamard product and f denotes a mapping function \mathbb{R}^d\times\mathbb{R}^d\to[-1,1]; they choose cosine similarity

        ③To capture node structure, they sample anonymous-walk sequences of length l for each node v_i; the distribution p_i over these sequences is regarded as the structure of node v_i

        ④The structural similarity:

\alpha_{ij}^p=f(w_2\odot p_i,w_2\odot p_j)

        ⑤The final aggregation weight combines both similarities (a code sketch follows the formula):

\alpha_{ij}=\lambda\alpha_{ij}^{h}+(1-\lambda)\alpha_{ij}^{p}
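A hedged sketch of ②–⑤, assuming cosine similarity as f and learnable element-wise masks w_1, w_2 (tensor shapes, names, and the use of torch are my assumptions, not the authors' released code):

```python
import torch
import torch.nn.functional as F

def pair_weight(h_i, h_j, p_i, p_j, w1, w2, lam=0.5):
    """alpha_ij in [-1, 1]: lam * feature similarity + (1 - lam) * structural similarity."""
    a_h = F.cosine_similarity(w1 * h_i, w1 * h_j, dim=-1)  # feature term alpha_ij^h
    a_p = F.cosine_similarity(w2 * p_i, w2 * p_j, dim=-1)  # structural term alpha_ij^p
    return lam * a_h + (1 - lam) * a_p

d, m = 16, 8                                   # feature dim, number of anonymous-walk types
w1, w2 = torch.ones(d), torch.ones(m)          # learnable masks in the real model
h_i, h_j = torch.randn(d), torch.randn(d)      # node features
p_i, p_j = torch.rand(m), torch.rand(m)        # anonymous-walk distributions
print(pair_weight(h_i, h_j, p_i, p_j, w1, w2))
```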

2.5.2. Polarization on the hyper-spherical space

        ①Neighbor aggregation with signed weights (a sketch follows the formula):

h_{i}=\sum_{j\in N_{i}}\alpha_{ij}h_{j}=\sum_{j\in N_{i}^{+}}|\alpha_{ij}|h_{j}-\sum_{j\in N_{i}^{-}}|\alpha_{ij}|h_{j}
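A minimal sketch of this signed aggregation (dense Python loops for clarity; names and data layout are illustrative):

```python
import torch

def polarized_aggregate(h, neighbors, alpha):
    """h_i <- sum_{j in N_i} alpha_ij * h_j; negative alpha_ij repels dissimilar neighbors."""
    out = torch.zeros_like(h)
    for i, nbrs in neighbors.items():
        for j in nbrs:
            out[i] += alpha[(i, j)] * h[j]  # sign of alpha_ij: attract (+) or repel (-)
    return out

h = torch.randn(3, 4)
neighbors = {0: [1, 2], 1: [0], 2: [0]}
alpha = {(0, 1): 0.8, (0, 2): -0.6, (1, 0): 0.8, (2, 0): -0.6}
print(polarized_aggregate(h, neighbors, alpha))
```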

        ②The number of neighbors affects the norm of the aggregated feature

        ③The authors show that, when a node representation is updated by gradient descent, a relatively large norm implies a relatively small change in direction

        ④They therefore project embeddings onto a hyper-sphere and design a polarized objective function:

\begin{aligned}L(\boldsymbol{h})&=\sum_{i=1}^{N}\left\|\boldsymbol{h}_{i}-\boldsymbol{h}_{i}^{*}\right\|^{2}+\sum_{i=1}^{N}\sum_{j\in N_{i}^{+}}\left|\alpha_{ij}\right|\left\|\boldsymbol{h}_{i}-\boldsymbol{h}_{j}\right\|^{2}\\&-\sum_{i=1}^{N}\sum_{j\in N_{i}^{-}}|\alpha_{ij}|\left\|\boldsymbol{h}_{i}-\boldsymbol{h}_{j}\right\|^{2},\end{aligned}

where the three terms respectively keep each representation close to its initial state for stability, pull identified similar node pairs together, and, conversely, push dissimilar node pairs apart; a code sketch follows
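A sketch of this objective over an explicit list of signed pairs (batching and sampling details are omitted; this is not the authors' implementation):

```python
import torch

def polarization_loss(h, h_star, pairs):
    """pairs: (i, j, alpha_ij) triples; alpha's sign marks similar (+) vs. dissimilar (-)."""
    loss = ((h - h_star) ** 2).sum()   # stability: keep h_i near its initial state h_i*
    for i, j, a in pairs:
        # +|a|*dist for similar pairs, -|a|*dist for dissimilar ones == a * dist
        loss = loss + a * ((h[i] - h[j]) ** 2).sum()
    return loss

h = torch.randn(3, 4, requires_grad=True)
h_star = h.detach().clone()
print(polarization_loss(h, h_star, [(0, 1, 0.8), (0, 2, -0.6)]))
```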

        ⑤Algorithm of Polar-GNN:

2.5.3. Spherical classification loss

        ①The standard softmax cross-entropy loss:

L=-\frac{1}{N}\sum_{i=1}^{N}\log\frac{\exp\left(w_{y_{i}}^{\top}h_{i}+b_{y_{i}}\right)}{\sum_{k=1}^{K}\exp\left(w_{k}^{\top}h_{i}+b_{k}\right)}

        ②By fixing b=0 and constraining \left\|w_k\right\|=1 via L_2 normalization, the cross entropy becomes:

L=-\frac{1}{N}\sum_{i=1}^{N}\log\frac{\exp(\cos\theta_{iy_{i}})}{\sum_{k=1}^{K}\exp(\cos\theta_{ik})}

where \theta _{ik} denotes the angle between h_i and w_k

        ③They add an additive angular penalty m:

L=-\frac{1}{N}\sum_{i=1}^{N}\log\frac{\exp\left(\cos(\theta_{iy_{i}}+m)\right)}{\exp\left(\cos(\theta_{iy_{i}}+m)\right)+\sum_{k\neq y_{i}}\exp(\cos\theta_{ik})}
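A hedged sketch of this margin loss, following the formula above (no scale factor is applied since none appears in the note; the clamp guards acos against rounding):

```python
import torch
import torch.nn.functional as F

def angular_margin_loss(h, w, y, m=0.3):
    """h: (N, d) embeddings, w: (K, d) class vectors, y: (N,) labels.
    L2-normalizing both makes the logits cos(theta_ik); the true class gets cos(theta + m)."""
    cos = F.normalize(h, dim=1) @ F.normalize(w, dim=1).T          # (N, K) cosine logits
    theta_y = torch.acos(cos.gather(1, y.view(-1, 1)).clamp(-1 + 1e-7, 1 - 1e-7))
    logits = cos.clone()
    logits.scatter_(1, y.view(-1, 1), torch.cos(theta_y + m))      # additive angular penalty
    return F.cross_entropy(logits, y)

h, w = torch.randn(5, 8), torch.randn(3, 8)
y = torch.tensor([0, 2, 1, 0, 2])
print(angular_margin_loss(h, w, y))
```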

        ④The following figure illustrates the effect of the penalty term on embedding distances:

2.6. Experiment

2.6.1. Experimental setup

(1)Dataset

        ①Homophily datasets: Cora, Pubmed, Photo and Facebook

        ②Heterophily datasets: Chameleon, Squirrel, Actor and Twitch

(2)Baselines

        ①GCN and GraphSAGE

        ②GAT and APPNP

        ③Geom-GCN

        ④FAGCN and GPRGNN

(3)Settings

        ①Training: 20 nodes per class

        ②Testing: 1000 nodes

        ③Validation: another 500 nodes 

        ④Epoch: 500

2.6.2. Node classification on different types of networks

        ①Model performance:

2.6.3. Model analysis

(1)Ablation studies

        ①Module ablation study on Chameleon, with the variants:

  • Polar-GNN-Feat: only incorporates node features when calculating the aggregation weights
  • Polar-GNN-Struct: only uses topological structure when calculating weights
  • Polar-GNN-NoNorm: Polar-GNN variant without normalization to a hyper-sphere

they got

        ②Penalty ablation:

(2)Alleviating over-smoothing

        ①GNN layer ablation:

2.6.4. Visualization

(1)Aggregation weights

        ①Weight aggregation on Chameleon:

(2)Node embeddings: how negative weights help

        ①Node embedding on Chameleon:

2.7. Conclusion

        The polarization mechanism works like a form of constrained optimization

3. Reference

Fang, Z. et al. (2022) 'Polarized Graph Neural Networks', in Proceedings of the ACM Web Conference 2022 (WWW '22). doi: https://doi.org/10.1145/3485447.3512187

