{"id":7711,"date":"2024-04-17T18:01:01","date_gmt":"2024-04-17T10:01:01","guid":{"rendered":""},"modified":"2024-04-17T18:01:01","modified_gmt":"2024-04-17T10:01:01","slug":"\u8d85\u5206\u7b97\u6cd5IPT\uff1aPre-Trained Image Processing Transformer","status":"publish","type":"post","link":"https:\/\/mushiming.com\/7711.html","title":{"rendered":"\u8d85\u5206\u7b97\u6cd5IPT\uff1aPre-Trained Image Processing Transformer"},"content":{"rendered":"

This paper presents a general-purpose pre-trained model built on the transformer. No pre-trained model had yet been proposed for low-level vision tasks<\/font>, so the authors trained the image processing transformer (IPT) on a very large dataset<\/code>. After fine-tuning<\/code>, it can be applied to image reconstruction, denoising, deraining, and similar tasks. Architecturally, it uses multiple heads and multiple tails around a shared body<\/code>: each task has its own head and tail with its own processing, and a transformer encoder-decoder sits in between. The feature map produced by the head is unfolded into “word vector” form, added to the position embedding, and fed to the encoder. The encoder is conventional: LN and MSA with a residual connection, then LN and FFN with a residual connection (the FFN has two fully connected layers). The decoder is also close to a standard transformer decoder, except for an extra task-specific embedding<\/strong>, which is added to Q and K of the first MSA and to Q of the second. The tails are likewise task-specific and restore the image to the target size.<\/p>\n

The dataset was built by the authors from ImageNet, because a large amount of data<\/font> is needed to train a good pre-trained model.<\/p>\n

The loss combines a supervised L1 loss with a contrastive learning loss<\/font>.<\/p>\n

Original paper: IPT: Pre-Trained Image Processing Transformer
Source code:
https:\/\/github.com\/huawei-noah\/Pretrained-IPT
and
https:\/\/gitee.com\/mindspore\/mindspore\/tree\/master\/model_zoo\/research\/cv\/IPT<\/p>\n<\/p>\n

\n

IPT: Pre-Trained Image Processing Transformer [CVPR 2021]<\/h4>\n
    \n
  • Abstract<\/li>\n
  • 1 Introduction<\/li>\n
  • 2 Method<\/li>\n
  • \n
      \n
    • 2.1 IPT architecture<\/li>\n
    • 2.2 Pre-training on ImageNet<\/li>\n<\/ul>\n<\/li>\n
    • 3 Experiments<\/li>\n
    • \n
        \n
      • 3.1 Super-resolution<\/li>\n
      • 3.2 Denoising<\/li>\n
      • 3.3 Deraining<\/li>\n
      • 3.4 Generalization Ability<\/li>\n
      • 3.5 Ablation Study<\/li>\n<\/ul>\n<\/li>\n
      • 4 Conclusion<\/li>\n<\/ul>\n<\/div>\n

        Abstract<\/h2>\n

        With the rapid growth in the computing power of modern hardware, deep learning models pre-trained<\/strong> on large-scale datasets (such as BERT and GPT-3) have proven more effective than conventional methods. This progress is mainly attributed to the representation ability of the transformer and its variant architectures.<\/p>\n

        In this paper, the authors set out to develop a pre-trained model for low-level computer vision tasks as well (for example denoising, super-resolution, and deraining): the image processing transformer (IPT<\/mark>). To exploit the capability of the transformer as fully as possible, they use the well-known ImageNet benchmark to generate a large number of corrupted image pairs<\/font>. The IPT model is trained on these images with multiple heads and multiple tails<\/font>. In addition, contrastive learning<\/font> is introduced so that the model adapts well to different image processing tasks. After fine-tuning, the pre-trained model can therefore be used efficiently on the desired task. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks.<\/p>\n

        1 Introduction<\/h2>\n

        Image processing is one component of the low-level part of a computer vision system. Its results largely affect how the subsequent high-level part recognizes and understands the image data. Since many image processing tasks are related, it is natural to expect that a model pre-trained on one dataset can help on another. Yet few studies have generalized pre-training to image processing tasks<\/font>.<\/p>\n

        Pre-training is now common in both natural language processing and computer vision.<\/p>\n

          \n
        1. The backbones of object detection models are usually pre-trained on ImageNet classification, including AlexNet, VGGNet, and ResNet.<\/li>\n
        2. The seminal Transformer has also been widely applied to many natural language processing (NLP) tasks such as translation and question answering. The usual recipe is to pre-train a transformer-based model on a large text corpus and then fine-tune it on a task-specific dataset.<\/li>\n
        3. Transformer variants such as BERT and GPT-3 further enrich the training data and improve the pre-training capability.<\/li>\n<\/ol>\n

          Researchers have already tried to extend the Transformer to computer vision. For example, Wang et al. and Fu et al. applied self-attention-based models to capture global information in images<\/font>; Carion et al. proposed DETR, which uses a transformer architecture for end-to-end object detection<\/font>; and Dosovitskiy et al. introduced the Vision Transformer (ViT)<\/font>, which splits the input image into patches of 16×16 pixels that serve as tokens and achieves excellent results on image recognition.<\/p>\n

          A pre-training method for image processing tasks has to solve two problems:<\/strong><\/p>\n

            \n
          1. Task-specific data<\/strong> is limited. The problem is worse for image processing tasks that involve paid data or data privacy, such as medical and satellite images. Inconsistent factors (for example camera parameters, illumination, and weather) can further perturb the distribution of the data collected for training.<\/li>\n
          2. The type of image processing task is not known until the model is applied at test time<\/strong>. A series of image processing modules therefore has to be prepared in advance. Each module has a distinct goal, but some of the underlying parts can be shared.<\/li>\n<\/ol>\n

            In this paper, the authors use the transformer architecture to develop a pre-trained image processing model, the image processing transformer (IPT). Because the pre-trained model must be compatible with different image processing tasks<\/font>, including super-resolution, denoising, and deraining, the whole network consists of multiple pairs<\/font> of heads and tails, one pair per task, and a single shared body.<\/p>\n

            Exploiting the potential of the Transformer requires a large-scale dataset<\/code>, so the IPT model is trained on a large number of diverse images. The ImageNet<\/strong> benchmark is chosen for this purpose; it contains a wide variety of high-resolution images from 1,000 categories. For each ImageNet image, several carefully designed operations generate multiple corrupted copies for the different tasks. For example, training samples for super-resolution are produced by downsampling the original images. The full dataset used to train IPT contains more than 10 million images<\/strong>. The transformer architecture is then trained on this huge dataset.<\/p>\n

            A training image is fed to the head<\/code> for its task, and the resulting features are cropped into patches (that is, “tokens”) and flattened into a sequence. The Transformer<\/code> processes the flattened features, with position embeddings used in the encoder and task embeddings in the decoder. Depending on the task, the tail<\/code> then has to predict the original image at a task-dependent output size. To adapt better to different image processing tasks, a contrastive loss on the relations between patches of different inputs<\/font> is also introduced. The proposed image processing Transformer is learned end to end. Experiments on several benchmarks show that the pre-trained IPT model improves markedly after fine-tuning and surpasses most existing methods.<\/p>\n

            2 Method<\/h2>\n

            2.1 IPT architecture<\/h3>\n

            The overall IPT architecture has four parts: heads<\/code> that extract features from the corrupted input image, an encoder<\/code> and a decoder<\/code> (which recover the information missing from the input data), and tails<\/code> that map the features to the restored image.<\/p>\n

            Heads:<\/strong>
            To handle different image processing tasks, a multi-head structure<\/mark> processes each task separately; each head consists of three convolutional layers<\/strong>.
            The input image is denoted $x ∈ R^{3×H×W}$, and the head produces a feature map $f_H ∈ R^{C×H×W}$ (typically C = 64):
            $f_H = H^i(x)$, where $H^i$ ($i = 1, …, N_t$) is the head for the $i$-th task and $N_t$ is the number of tasks.<\/p>\n

            Transformer encoder:<\/strong><\/p>\n

              \n
            1. The features are first split into patches, and each patch is flattened into a vector<\/font> that is treated as a “word vector”. The input feature $f_H ∈ R^{C×H×W}$ is reshaped into a sequence of patches $f_{pi} ∈ R^{P^2×C}$, $i = 1, …, N$, where $N = \\frac{HW}{P^2}$ is the number of patches (the sequence length) and $P$ is the patch size.<\/li>\n
            2. Position information is then added: each patch receives a learnable position encoding<\/font> $E_{pi} ∈ R^{P^2×C}$, and $E_{pi} + f_{pi}$ is fed directly into the Transformer encoder.<\/li>\n
            3. The Transformer encoder follows the original structure, with a multi-head self-attention module and a feed-forward network. The encoder does not change the size of its input, so the output has the same shape. Its layers compute:<\/li>\n<\/ol>\n

              $y_0 = [E_{p1} + f_{p1}, E_{p2} + f_{p2}, …, E_{pN} + f_{pN}]$
              $q_i = k_i = v_i = LN(y_{i-1})$
              $y'_i = MSA(q_i, k_i, v_i) + y_{i-1}$
              $y_i = FFN(LN(y'_i)) + y'_i$, $i = 1, …, l$
              $[f_{E1}, f_{E2}, …, f_{EN}] = y_l$
              Here $l$ is the number of layers in the encoder, MSA is the multi-head self-attention module of the conventional Transformer, LN is layer normalization, and FFN is the feed-forward network, which contains two fully connected layers.<\/p>\n
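The patch splitting in step 1 can be illustrated with a small sketch. It is only an illustration under stated assumptions: unfold_patches and fold_patches are hypothetical helper names, nested Python lists stand in for tensors, and the toy sizes (C = 2, H = W = 4, P = 2) are chosen for readability rather than taken from the paper.<\/p>\n

```python
# Hypothetical helpers (not from the IPT code base): nested lists stand in for tensors.

def unfold_patches(feature, patch):
    # feature is a C x H x W nested list; returns N = H*W / P^2 patches,
    # each a list of P^2 rows holding C channel values (shape P^2 x C).
    channels = len(feature)
    height = len(feature[0])
    width = len(feature[0][0])
    if height % patch or width % patch:
        raise ValueError('H and W must be divisible by the patch size')
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = []
            for dy in range(patch):
                for dx in range(patch):
                    rows.append([feature[c][top + dy][left + dx] for c in range(channels)])
            patches.append(rows)
    return patches


def fold_patches(patches, height, width, patch):
    # Inverse of unfold_patches: rebuild the C x H x W feature map.
    channels = len(patches[0][0])
    feature = [[[0] * width for _ in range(height)] for _ in range(channels)]
    index = 0
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            for pos, vec in enumerate(patches[index]):
                dy, dx = divmod(pos, patch)
                for c in range(channels):
                    feature[c][top + dy][left + dx] = vec[c]
            index += 1
    return feature


# Toy sizes (assumed): C = 2, H = W = 4, P = 2, so N = 4 * 4 / 2^2 = 4 patches.
toy = [[[c * 100 + y * 4 + x for x in range(4)] for y in range(4)] for c in range(2)]
tokens = unfold_patches(toy, 2)
restored = fold_patches(tokens, 4, 4, 2)
```

Folding the tokens back reproduces the input, which mirrors the point that the encoder and decoder keep the sequence shape, so the decoded patches can be reshaped into a C×H×W feature map.<\/p>\n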

              Transformer decoder:<\/strong>
              The decoder differs little from the original Transformer decoder; it only adds a task-type embedding<\/font>. It consists of two multi-head self-attention (MSA) layers and one feed-forward network (FFN), and takes a task-specific embedding as an additional input. These task-specific embeddings $E_t^i ∈ R^{P^2×C}$, $i = 1, …, N_t$, are learned to decode features for the different tasks. Finally, the $N$ decoded features of size $P^2×C$ are reshaped into the feature $f_D$ of size $C×H×W$. With $E_t$ the embedding of the current task and $f_{Ei}$ the encoder outputs, the decoder computes:<\/p>\n

              $z_0 = [f_{E1}, f_{E2}, …, f_{EN}]$
              $q_i = k_i = LN(z_{i-1}) + E_t$, $v_i = LN(z_{i-1})$
              $z'_i = MSA(q_i, k_i, v_i) + z_{i-1}$
              $q'_i = LN(z'_i) + E_t$, $k'_i = v'_i = LN(z_0)$
              $z''_i = MSA(q'_i, k'_i, v'_i) + z'_i$
              $z_i = FFN(LN(z''_i)) + z''_i$, $i = 1, …, l$
              $[f_{D1}, f_{D2}, …, f_{DN}] = z_l$<\/p>\n

              Tails:<\/strong>
              The tails play the same role as the heads: multiple tails<\/font> handle the different tasks. The output is $f_T = T^i(f_D)$, where $T^i$ ($i = 1, …, N_t$) is the tail for the $i$-th task and $N_t$ is the number of tasks. $f_T$ is the resulting image of size $3×H′×W′$, which is determined by the task. For example, for 2× super-resolution, H′ = 2H and W′ = 2W.<\/p>\n

              2.2 Pre-training on ImageNet<\/h3>\n

              Besides the network architecture itself, a key factor in the success of a model is good use of large-scale datasets<\/font>.
              Compared with image classification, far less data is available for image processing tasks (for example, the DIV2K dataset for image super-resolution has only 2,000 images). The authors therefore take the well-known ImageNet as the base dataset and generate the data they need from it to pre-train the IPT model.<\/p>\n

              The images in the ImageNet benchmark are highly diverse: it contains more than 1 million natural images from 1,000 categories, with rich texture and color information. The semantic labels are removed first, and a variety of corrupted images are then manually synthesized<\/strong> from the unlabeled images using a different degradation model for each task.<\/p>\n

                \n
              1. Super-resolution usually uses bicubic downsampling to generate the low-resolution images.<\/li>\n
              2. Denoising adds Gaussian noise at different noise levels to the original images to generate the noisy images.<\/li>\n<\/ol>\n

                These synthesized images can significantly improve the performance of learned deep networks, both CNN and transformer architectures, as the experiments section shows.<\/p>\n
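The two degradations listed above can be sketched as follows. This is a rough illustration, not the authors' data pipeline: the function names are hypothetical, a simple box average stands in for bicubic downsampling, images are single-channel nested lists, and deraining is left out. The scale factors and noise levels are the ones reported in the experiments.<\/p>\n

```python
import random

# Hypothetical helpers for building (corrupted, clean) training pairs.

def add_gaussian_noise(image, sigma, rng):
    # image: H x W nested list with values in [0, 255]; the result is clipped to that range.
    return [[min(255.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row] for row in image]


def box_downsample(image, scale):
    # Stand-in for bicubic downsampling (an assumption): average each scale x scale block.
    height, width = len(image), len(image[0])
    out = []
    for top in range(0, height - height % scale, scale):
        row = []
        for left in range(0, width - width % scale, scale):
            block = [image[top + dy][left + dx] for dy in range(scale) for dx in range(scale)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def make_pair(clean, task, rng):
    # task is (kind, level): ('sr', scale factor) or ('denoise', noise sigma).
    kind, level = task
    if kind == 'sr':
        return box_downsample(clean, level), clean
    if kind == 'denoise':
        return add_gaussian_noise(clean, level, rng), clean
    raise ValueError('unknown task: ' + kind)


TASKS = [('sr', 2), ('sr', 3), ('sr', 4), ('denoise', 30), ('denoise', 50)]
rng = random.Random(0)
clean = [[float((x * 7 + y * 13) % 256) for x in range(12)] for y in range(12)]
pairs = {task: make_pair(clean, task, rng) for task in TASKS}
```

Every task shares the same clean target, so one source image yields several training pairs.<\/p>\n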

                In the supervised setting, the IPT loss function<\/strong> can be written as:
                $L_{supervised} = \\sum_{i=1}^{N_t} L_1(IPT(I^i_{corrupted}), I_{clean})$ (4)
                where $L_1$ is the conventional L1 loss on the reconstructed image, $I^i_{corrupted}$ is the corrupted image for task $i$, and $I_{clean}$ is the corresponding clean image. Equation (4) also implies that the proposed framework is trained on multiple image processing tasks simultaneously.<\/p>\n

                Training procedure:<\/strong>
                Specifically, for each batch one task is selected at random from the $N_t$ tasks, and the batch is processed with the corresponding head, tail, and task embedding<\/code>. After pre-training, the IPT model has captured the intrinsic features and transformations of a wide range of image processing tasks, so it only needs fine-tuning<\/code> on the dataset of the target task to be applied to it. To save computation, the other heads and tails are dropped during fine-tuning, and the parameters of the remaining head, tail, and body are updated by back-propagation.<\/p>\n

                Additional loss function:<\/strong>
                However, because degradation models are so varied, images cannot be synthesized for every image processing task, and many kinds of noise occur in practice. The generalization ability of the resulting IPT<\/font> therefore needs to be strengthened further.<\/p>\n
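The per-batch task selection and the later pruning for fine-tuning can be sketched like this. ToyIPT is a hypothetical stand-in that uses one scalar per head, tail, and body instead of real layers, the task names are made up for illustration, and the gradient update itself is omitted, so the sketch shows only the control flow.<\/p>\n

```python
import random

def l1_loss(pred, target):
    # Mean absolute error between two flat lists of equal length.
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


class ToyIPT:
    # Hypothetical stand-in: one scalar weight per head and tail, one for the shared body.
    def __init__(self, tasks):
        self.heads = {t: 1.0 for t in tasks}
        self.tails = {t: 1.0 for t in tasks}
        self.body = 1.0

    def forward(self, task, x):
        # Route the input through the head and tail of the chosen task and the shared body.
        scale = self.heads[task] * self.body * self.tails[task]
        return [scale * v for v in x]

    def keep_only(self, task):
        # Fine-tuning setup: drop every head and tail except those of the target task.
        self.heads = {task: self.heads[task]}
        self.tails = {task: self.tails[task]}


def pretrain_step(model, batches, rng):
    # Pick one task at random for this batch and compute its supervised L1 loss.
    # The parameter update is intentionally left out of this sketch.
    task = rng.choice(sorted(batches))
    corrupted, clean = batches[task]
    return task, l1_loss(model.forward(task, corrupted), clean)


tasks = ['sr_x2', 'denoise_30', 'derain']
model = ToyIPT(tasks)
batches = {t: ([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]) for t in tasks}
task, loss = pretrain_step(model, batches, random.Random(0))
model.keep_only('sr_x2')
```

After keep_only, only the head and tail of the target task remain next to the shared body, which is the part that carries over from pre-training.<\/p>\n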

                As in pre-trained natural language processing models, the relations between image patches are also informative: a patch in an image can be regarded as a word in natural language processing. Contrastive learning<\/font> is therefore introduced to learn general features, so that the pre-trained IPT model can be used on unseen tasks. Its goal is to minimize the distance between patch features from the same image while maximizing the distance between patches from different images.<\/font> Let $f^j_{D_i}$ be the $i$-th decoder output patch feature of image $x_j$ in a training batch of $B$ images. The contrastive loss is:
                $l(f^j_{D_{i_1}}, f^j_{D_{i_2}}) = -\\log \\frac{\\exp(d(f^j_{D_{i_1}}, f^j_{D_{i_2}}))}{\\sum_{k=1}^{B} I_{k≠j} \\exp(d(f^j_{D_{i_1}}, f^k_{D_{i_2}}))}$
                $L_{contrastive} = \\frac{1}{BN^2} \\sum_{i_1=1}^{N} \\sum_{i_2=1}^{N} \\sum_{j=1}^{B} l(f^j_{D_{i_1}}, f^j_{D_{i_2}})$
                where $d(a,b)$ is the cosine similarity and $I_{k≠j}$ is 1 when $k ≠ j$ and 0 otherwise. To make full use of both the supervised and the self-supervised information, $L_{IPT}$ is the final objective of IPT, with λ balancing the contrastive loss against the supervised loss:
                $L_{IPT} = λ · L_{contrastive} + L_{supervised}$<\/p>\n

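A minimal sketch of the contrastive term and the combined objective follows. It assumes the usual form in which the cosine similarity of a same-image patch pair is contrasted against the matching pairs from the other images in the batch; the function names, the nested-list feature layout, and any value of λ are illustrative assumptions.<\/p>\n

```python
import math

def cosine(a, b):
    # Cosine similarity d(a, b); both vectors are assumed to be non-zero.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def contrastive_loss(features):
    # features[j][i] is the decoder feature vector of patch i from image j.
    batch = len(features)
    patches = len(features[0])
    if batch < 2:
        raise ValueError('at least two images are needed to form negative pairs')
    total = 0.0
    for j in range(batch):
        for i1 in range(patches):
            for i2 in range(patches):
                # Positive pair: two patches of the same image.
                pos = math.exp(cosine(features[j][i1], features[j][i2]))
                # Negative pairs: the same patch against patch i2 of every other image.
                neg = sum(math.exp(cosine(features[j][i1], features[k][i2]))
                          for k in range(batch) if k != j)
                total += -math.log(pos / neg)
    return total / (batch * patches * patches)


def ipt_loss(supervised, contrastive, lam):
    # Final objective: lam balances the contrastive term against the supervised term.
    return lam * contrastive + supervised


# Toy features (assumed): two images with two 2-D patch features each.
separated = [[[1.0, 0.0], [1.0, 0.1]], [[-1.0, 0.0], [-1.0, -0.1]]]
identical = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
loss_separated = contrastive_loss(separated)
loss_identical = contrastive_loss(identical)
```

The loss is lower when patches of one image resemble each other more than patches of other images, which is the behaviour the objective is meant to reward.<\/p>\n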
                3 Experiments<\/h2>\n

                Dataset:<\/mark>
                The ImageNet<\/strong> dataset consists of more than 1 million highly diverse color images. Training images are cropped into 48×48 patches<\/strong> with 3 channels, giving more than 10M<\/strong> patches for training the IPT model. Corrupted images are generated with 6 degradation types<\/strong>: 2×, 3×, and 4× bicubic downsampling, Gaussian noise at levels 30 and 50, and added rain streaks. Training uses 32 Nvidia Tesla V100 cards.<\/p>\n

                3.1 Super-resolution<\/h3>\n

                \u5c06IPT\u6a21\u578b\u4e0e\u51e0\u79cd\u6700\u5148\u8fdb\u7684\u57fa\u4e8eCNN\u7684SR\u65b9\u6cd5\u8fdb\u884c\u4e86\u6bd4\u8f83<\/strong>\u3002\u5982\u88681\u6240\u793a\uff0c\u9884\u8bad\u7ec3IPT\u4f18\u4e8e\u6240\u6709\u5176\u4ed6\u65b9\u6cd5\uff0c\u5e76\u5728\u6240\u6709\u6570\u636e\u96c6\u7684\u00d72\u3001\u00d73\u3001\u00d74\u5c3a\u5ea6\u4e0a\u5b9e\u73b0\u4e86\u6700\u4f73\u6027\u80fd\u3002<\/font>\u503c\u5f97\u5f3a\u8c03\u7684\u662f\uff0cIPT\u5728\u00d72\u5c3a\u5ea6\u7684Urban100\u6570\u636e\u96c6\u4e0a\u5b9e\u73b0\u4e8633.76dB\u7684\u5cf0\u503c\u4fe1\u566a\u6bd4\uff0c\u8fd9\u6bd4\u5176\u4ed6\u65b9\u6cd5\u7684\u5cf0\u503c\u4fe1\u566a\u6bd4\u9ad8\u51fa\u223c0.4dB\uff0c\u800c\u4e4b\u524d\u7684SOTA\u65b9\u6cd5\u4e0e\u5176\u4ed6\u65b9\u6cd5\u76f8\u6bd4\u53ea\u80fd\u5b9e\u73b0<0.2dB\u7684\u6539\u8fdb\uff0c\u8fd9\u8868\u660e\u8be5\u6a21\u578b\u5229\u7528\u5927\u89c4\u6a21\u9884\u8bad\u7ec3\u7684\u4f18\u8d8a\u6027\u3002<\/font><\/p>\n

                Figure 3 shows the model's visual results on the Urban100 dataset at the 4\u00d7 scale<\/strong>. Recovering the original high-resolution image is difficult at a high scale factor because a large amount of information is lost. Previous methods produce blurry images, whereas the super-resolved images from the IPT model recover details well from the low-resolution input.<\/p>\n

                3.2 Denoising<\/h3>\n

                IPT is compared with various state-of-the-art models. Table 2 shows the color image denoising results<\/strong> on the BSD68 and Urban100<\/strong> datasets.
                Across the different Gaussian noise levels, IPT achieves the best results among all denoising methods<\/font>. Moreover, IPT surpasses the SOTA methods by about 2 dB on the Urban100 dataset, which demonstrates the effectiveness of pre-training and the advantage of Transformer-based models.<\/p>\n

                Figure 4 visualizes the resulting images. As shown, the noisy images are hard to recognize, and clean images are hard to recover. Existing methods fail to reconstruct enough detail and produce abnormal pixels<\/strong>. In contrast, the pre-trained model recovers some of the details in the fur of this cat (???) (three little birds), and its visual quality is clearly better than that of all previous models<\/font>.<\/p>\n

                3.3 Deraining<\/h3>\n

                For image deraining, the IPT model is evaluated on the synthetic Rain100L<\/strong> dataset, which consists of 100 rainy images<\/strong>. Quantitative results are given in Table 3. Compared with state-of-the-art methods, IPT achieves the best performance (41.62 dB), an improvement of 1.62 dB. Figure 5 shows the visual results. Previous methods cannot reconstruct the original clean image because they lack image priors, whereas the IPT model produces a result that visually matches the ground-truth image and surpasses all previous algorithms in visual quality. This result confirms the generality of the proposed model.<\/font><\/p>\n

                3.4 Generalization Ability<\/h3>\n

                Although the authors generate a variety of corrupted images, natural images are highly complex, and it is impossible to synthesize every possible corruption for pre-training the Transformer model. A good pre-trained model should nevertheless adapt well to other tasks, as pre-trained models do in the NLP field. To this end, several experiments are run to verify the model's generalization ability<\/font>. The experiments test corrupted images that are not included in the synthetic ImageNet dataset<\/font>, namely image denoising at noise levels 10 and 70. The heads and tails for the image denoising task are used as the pre-trained model. Detailed results are shown in Table 4, which compares the pre-trained IPT model with state-of-the-art image denoising methods. IPT clearly outperforms the other conventional methods, showing that the pre-trained model can capture more useful information and features from the large-scale dataset.<\/font><\/p>\n

                3.5 Ablation Study<\/h3>\n

                \u2460 Impact of data percentage<\/strong>:
                20%, 40%, 60%, 80%, and 100% of the synthetic ImageNet dataset are used to analyze how the amount of pre-training data affects performance<\/strong>. Figure 6 shows the results of the different pre-trained models. When the models are not pre-trained or are pre-trained on a small fraction (less than 60%) of the dataset, the CNN models perform better. By contrast, with large-scale data the Transformer-based model overtakes the CNN models, which demonstrates the effectiveness of the IPT pre-trained model.<\/p>\n

                \u2461 Impact of contrastive learning<\/strong>:<\/p>\n

                To improve the representation ability of the pre-trained model, a contrastive learning loss is embedded into the training procedure. Its effectiveness is evaluated on the \u00d72 super-resolution task using the Set4 dataset. Table 5 shows the impact of the hyperparameter \u03bb, which balances the two loss terms. With \u03bb=0, the IPT model is trained with supervised learning only and reaches a PSNR of 38.27 dB. With the contrastive loss added for self-supervised learning, the model reaches 38.37 dB (\u03bb=0.1), about 0.1 dB higher than the model trained with \u03bb=0. These results further demonstrate the effectiveness of contrastive learning for pre-training the IPT model.<\/p>\n

                4 Conclusion<\/h2>\n

                The paper aims to solve image processing problems with a pre-trained Transformer model (IPT). The IPT model is designed with multiple heads, multiple tails, and one shared Transformer body<\/font> to serve different image processing tasks such as image super-resolution, denoising, and deraining.<\/p>\n

                To get the most out of the transformer architecture across these tasks, a synthesized ImageNet dataset is explored<\/font>, in which each original image is degraded into a series of counterparts that form paired training data.<\/p>\n

                The IPT model is then trained with supervised and self-supervised<\/font> approaches, which show a strong ability to capture intrinsic features for low-level image processing.<\/p>\n

                Experimental results show that, after quick fine-tuning, IPT can outperform state-of-the-art methods using only one pre-trained model. In future work, the IPT model can be extended to more tasks such as image inpainting and dehazing<\/code>.<\/p>\n


                \n

                Finally, I wish everyone smooth research, good health, and all the best~<\/p>\n","protected":false},"excerpt":{"rendered":"\u8d85\u5206\u7b97\u6cd5IPT\uff1aPre-Trained Image Processing Transformer\u672c\u6587\u662f\u4e00\u4e2a\u57fa\u4e8etransformer\u7684\u9884\u8bad\u7ec3\u901a\u7528\u6a21\u578b\uff0c\u9488\u5bf9\u4f4e\u7ea7\u89c6\u89c9\u4efb\u52a1\u8fd8\u6ca1...","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[],"_links":{"self":[{"href":"https:\/\/mushiming.com\/wp-json\/wp\/v2\/posts\/7711"}],"collection":[{"href":"https:\/\/mushiming.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mushiming.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mushiming.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mushiming.com\/wp-json\/wp\/v2\/comments?post=7711"}],"version-history":[{"count":0,"href":"https:\/\/mushiming.com\/wp-json\/wp\/v2\/posts\/7711\/revisions"}],"wp:attachment":[{"href":"https:\/\/mushiming.com\/wp-json\/wp\/v2\/media?parent=7711"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mushiming.com\/wp-json\/wp\/v2\/categories?post=7711"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mushiming.com\/wp-json\/wp\/v2\/tags?post=7711"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}