Spatial Pyramid Pooling: A Learnable Spatial Pyramid Attentive Pooling Module for Image Synthesis and Image-to-Image Translation...

(124) 2024-05-18 23:01:01

Link: Learning Spatial Pyramid Attentive Pooling in Image Synthesis and Image-to-Image Translation

Abstract:

Image synthesis and image-to-image translation are two important generative learning tasks. Remarkable progress has been made by learning Generative Adversarial Networks (GANs) [goodfellow2014generative] and cycle-consistent GANs (CycleGANs) [zhu2017unpaired] respectively. This paper presents a method of learning Spatial Pyramid Attentive Pooling (SPAP) which is a novel architectural unit and can be easily integrated into both generators and discriminators in GANs and CycleGANs. The proposed SPAP integrates Atrous spatial pyramid [chen2018deeplab], a proposed cascade attention mechanism and residual connections [he2016deep]. It leverages the advantages of the three components to facilitate effective end-to-end generative learning: (i) the capability of fusing multi-scale information by ASPP; (ii) the capability of capturing relative importance between both spatial locations (especially multi-scale context) or feature channels by attention; (iii) the capability of preserving information and enhancing optimization feasibility by residual connections. Coarse-to-fine and fine-to-coarse SPAP are studied and intriguing attention maps are observed in both tasks. In experiments, the proposed SPAP is tested in GANs on the CelebA-HQ-128 dataset [karras2017progressive], and tested in CycleGANs on the Image-to-Image translation datasets including the Cityscapes dataset [cordts2016cityscapes], Facade and Aerial Maps dataset [zhu2017unpaired], both obtaining better performance.

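This paper proposes a learnable Spatial Pyramid Attentive Pooling (SPAP) method, a novel architectural unit that can be easily integrated into the generators and discriminators of GANs and CycleGANs. SPAP combines an atrous spatial pyramid, a cascade attention mechanism, and residual connections. It draws on the strengths of these three components to support end-to-end generative learning: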

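  1. The ability to fuse multi-scale information through ASPP
  2. The ability to capture relative importance through attention
  3. The ability to preserve information and make optimization more feasible through residual connections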
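
The multi-scale fusion in item 1 can be made concrete on a toy 1-D signal. The sketch below is only an illustration, not the paper's implementation: the helper names `dilated_conv1d` and `atrous_pyramid`, the fixed all-ones kernel, and the rates (1, 2, 4) are assumptions chosen for readability. In a real network the kernels are learned and operate on 2-D feature maps.

```python
def dilated_conv1d(signal, kernel, rate):
    """Same-length 1-D atrous convolution with zero padding (odd-length kernel)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # taps are spaced `rate` samples apart
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def atrous_pyramid(signal, kernel, rates):
    """Apply the same kernel at several dilation rates: one branch per scale."""
    return [dilated_conv1d(signal, kernel, r) for r in rates]


signal = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # unit impulse
kernel = [1.0, 1.0, 1.0]  # hypothetical fixed kernel; learned in practice
rates = (1, 2, 4)         # hypothetical dilation rates
branches = atrous_pyramid(signal, kernel, rates)
for rate, branch in zip(rates, branches):
    print(rate, branch)
```

With a unit impulse as input, each branch shows where its three taps land: the rate-1 branch responds at three adjacent positions, while the rate-4 branch responds at positions four samples apart, so it covers a wider context with the same number of weights.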

[Figure: overall framework]

As the figure shows, SPAP works like a module rather than a complete network architecture: its input and output are both feature maps, so it can easily be dropped in as a plug-in (or, as the paper puts it, as a new form of pooling). Note that one term in the formulation [symbol lost with the formula image] is a learnable scalar parameter.

[Formula image missing]

Here the attention term [symbol lost with the formula image] is a pixel-wise attention map predicted by a convolutional neural network followed by an activation function.

The final output is

[Formula image missing]
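
Because the formulas in this part survive only as missing images, the following is a loose sketch of how a cascade of pixel-wise attention gates plus a scaled residual connection could be wired together, not a reconstruction of the paper's equations. Everything specific here is an assumption: the function name `cascade_attentive_fuse`, the sigmoid activation, the convex-combination gating rule, and the placement of the scalar `gamma` on the residual branch. The attention logits are passed in directly; in the module described above they would be predicted by a convolutional network.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def cascade_attentive_fuse(x, branches, attention_logits, gamma):
    """Fuse pyramid branches one level at a time, then add a scaled residual.

    ASSUMPTION: each level is blended into the running result by a
    per-position gate a in (0, 1): fused = a * branch + (1 - a) * fused.
    The source's actual formulas were lost with their images.
    """
    fused = list(branches[0])
    for branch, logits in zip(branches[1:], attention_logits):
        gates = [sigmoid(v) for v in logits]  # pixel-wise attention map
        fused = [a * b + (1.0 - a) * f for a, b, f in zip(gates, branch, fused)]
    # Residual connection; gamma stands in for the learnable scalar.
    return [xi + gamma * fi for xi, fi in zip(x, fused)]


x = [1.0, 2.0, 3.0]                               # input features (toy, 1-D)
branches = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]     # two pyramid levels
logits = [[0.0, 0.0, 0.0]]                        # one gate map per extra level
identity = cascade_attentive_fuse(x, branches, logits, gamma=0.0)
out = cascade_attentive_fuse(x, branches, logits, gamma=1.0)
print(identity, out)
```

With `gamma=0.0` the block returns its input unchanged. Under the assumptions above, that identity behaviour is what would let such a module be inserted into an existing generator or discriminator without disturbing it at the start of training.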

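The authors also experimented with the order in which features at different scales are fused. A larger atrous (dilation) rate produces a coarser result, so they ran experiments in both the coarse-to-fine and the fine-to-coarse direction.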
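
To see why a larger rate means a coarser branch, recall that a kernel of size k with dilation rate r spans k + (k - 1)(r - 1) input positions. The short sketch below computes that span and orders a set of rates for the two fusion directions. The rates (1, 2, 4), the kernel size 3, and the helper names are illustrative assumptions, not values taken from the paper.

```python
def effective_kernel_size(kernel_size, rate):
    """Span of input positions covered by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def fusion_order(rates, direction):
    """Order the pyramid levels for cascade fusion."""
    if direction == "coarse_to_fine":
        return sorted(rates, reverse=True)  # largest rate (coarsest) first
    if direction == "fine_to_coarse":
        return sorted(rates)                # smallest rate (finest) first
    raise ValueError(f"unknown direction: {direction}")


rates = (1, 2, 4)  # hypothetical dilation rates
spans = {r: effective_kernel_size(3, r) for r in rates}
print(spans)                                  # {1: 3, 2: 5, 4: 9}
print(fusion_order(rates, "coarse_to_fine"))  # [4, 2, 1]
print(fusion_order(rates, "fine_to_coarse"))  # [1, 2, 4]
```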

Conclusion:

The proposed SPAP is a simple yet effective module which harnesses the advantages of three ubiquitous components in neural architecture design in a novel way: Atrous spatial pyramid for capturing multi-scale information, a cascade attention scheme for aggregating information between different levels in the pyramid, and residual connections. The proposed SPAP module is complementary to many existing efforts towards building more accurate and more robust GAN-based generative learning models.

The proposed SPAP can complement existing GANs, helping to build more accurate and more robust GAN models.
