{"id":1810,"date":"2024-07-31T19:34:51","date_gmt":"2024-07-31T11:34:51","guid":{"rendered":"https:\/\/www.gnn.club\/?p=1810"},"modified":"2024-10-10T14:43:30","modified_gmt":"2024-10-10T06:43:30","slug":"%e5%8d%b7%e7%a7%af%e7%a5%9e%e7%bb%8f%e7%bd%91%e7%bb%9c","status":"publish","type":"post","link":"http:\/\/gnn.club\/?p=1810","title":{"rendered":"\u5377\u79ef\u795e\u7ecf\u7f51\u7edc\uff08CNN\uff09"},"content":{"rendered":"<h1><img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913163754455.png\" style=\"height:50px;display:inline\"> Deep Learning<\/h1>\n<hr \/>\n<p>Created by Arwin Yu<\/p>\n<h2>Tutorial 02 - Convolutional Neural Networks &amp; Visual Tasks<\/h2>\n<hr \/>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913163908337.png\" style=\"height:300px\">\n<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/bubbles\/50\/000000\/checklist.png\" style=\"height:50px;display:inline\"> Agenda<\/h3>\n<hr \/>\n<ul>\n<li>2D\u5377\u79ef\uff082D Convolution\uff09<\/li>\n<li>\u57fa\u4e8e\u5377\u79ef\u7684\u7279\u5f81\u63d0\u53d6\uff08Convolution as Feature Extractors\uff09<\/li>\n<li>\u5377\u79ef\u795e\u7ecf\u7f51\u7edc\uff08Convolutional Neural Networks\uff09\n<ul>\n<li>\u901a\u9053\u6982\u5ff5<\/li>\n<li>\u5c40\u90e8\u76f8\u5173\u6027<\/li>\n<li>\u5e73\u79fb\u4e0d\u53d8\u6027<\/li>\n<li>\u6c60\u5316\u5c42<\/li>\n<li>\u6253\u5e73\u64cd\u4f5c<\/li>\n<li>\u5168\u8fde\u63a5\u5c42<\/li>\n<\/ul>\n<\/li>\n<li>\u5377\u79ef\u6a21\u578b\u7684\u5c42\u7ea7\u7ed3\u6784\u4e0e\u7279\u5f81\u53ef\u89c6\u5316\uff08Hierarchical structure and feature visualization\uff09<\/li>\n<li>\u5377\u79ef\u6a21\u578b\u7684\u611f\u53d7\u91ce\uff08Receptive field of convolutional 
model\uff09<\/li>\n<li>\u6b63\u5219\u5316\uff08Regularization\uff09\n<ul>\n<li>\u6570\u636e\u589e\u5f3a<\/li>\n<li>\u4e22\u5f03\u6cd5<\/li>\n<li>\u6b63\u5219\u9879<\/li>\n<\/ul>\n<\/li>\n<li>\u793a\u4f8b\u9879\u76ee\uff08CIFAR-10 Classification with PyTorch\uff09<\/li>\n<li>\u53ef\u89c6\u5316\u5377\u79ef\u6838\uff08Visualizing CNN kernels\uff09<\/li>\n<li>\u5377\u79ef\u6a21\u578b\u7684\u5e38\u89c1\u95ee\u9898\uff08The Problem with CNNs\uff09<\/li>\n<li>\u5377\u79ef\u6a21\u578b\u7684\u573a\u666f\u5e94\u7528\uff08The Application of CNNs\uff09<\/li>\n<\/ul>\n<pre><code class=\"language-python\"># imports for the tutorial\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport time\nimport os\n\n# pytorch\nimport torch\nimport torch.nn as nn\nimport torchvision\nimport torchvision.transforms as transforms<\/code><\/pre>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/dusk\/64\/000000\/layers.png\" style=\"height:50px;display:inline\"> 2D Convolution<\/h2>\n<hr \/>\n<p>2D\u5377\u79ef\u7684\u64cd\u4f5c\u5982\u4e0b\u6240\u793a\uff1a<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913164224599.png\" style=\"height:300px\">\n<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913164258524.gif\" style=\"height:400px\">\n<\/p>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/color\/96\/000000\/tweezers.png\" style=\"height:50px;display:inline\"> Convolution as Feature Extractors for Classification<\/h2>\n<hr \/>\n<ul>\n<li>\u5377\u79ef\u64cd\u4f5c\u53ef\u4ee5\u4ece\u56fe\u50cf\u4e2d\u8fdb\u884c\u7279\u5f81\u7684\u63d0\u53d6\uff0c\u4e0d\u540c\u7684\u5377\u79ef\u6838\u4f1a\u5c1d\u8bd5\u63d0\u53d6\u4e0d\u540c\u7684\u7279\u5f81\u3002<\/li>\n<li>\u4f8b\u5982\uff0c gradient\/derivative filter 
\u53ef\u4ee5\u5e2e\u52a9\u6211\u4eec\u68c0\u6d4b<strong>\u8fb9\u7f18<\/strong>\u3002<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913164608384.png\" style=\"height:200px\">\n<\/p>\n<ul>\n<li>\n<p>\u6839\u636e\u5377\u79ef\u64cd\u4f5c\u627e\u5230\u7684\u7279\u5f81\uff0c\u795e\u7ecf\u7f51\u7edc\u5c42\u6216\u5176\u4ed6\u5206\u7c7b\u5668\u53ef\u4ee5\u6839\u636e\u8fd9\u4e9b\u7279\u5f81\u8fdb\u4e00\u6b65\u5224\u65ad\u5b83\u4eec\u7684\u6240\u5c5e\u7c7b\u522b\u3002<\/p>\n<p align=\"center\">\n<img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913164826736.png\" style=\"height:200px\">\n<\/p>\n<ul>\n<li><a href=\"https:\/\/www.mathworks.com\/solutions\/deep-learning\/convolutional-neural-network.html\">Image Source<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>\u4ec0\u4e48\u662f\u7279\u5f81\uff1f\u8003\u8651\u4ee5\u4e0b\u8bf4\u660e\u6027\u793a\u4f8b - \u5bf9 <em>\u732b<\/em> \u548c <em>\u72d7<\/em> \u8fdb\u884c\u5206\u7c7b\u3002<\/p>\n<p>\u6211\u4eec\u5982\u4f55\u533a\u5206\u732b\u548c\u72d7\uff1f\u6211\u4eec\u53ef\u4ee5\u770b\u770b\u5c3e\u5df4\u7684\u957f\u5ea6\u3001\u722a\u5b50\u7684\u5f62\u72b6\u3001\u6bdb\u53d1\u7684\u56fe\u6848\u7b49...<\/p>\n<p>\u4eba\u7c7b\u901a\u5e38\u53ea\u9700\u67e5\u770b\u6837\u672c\u5373\u53ef\u5206\u8fa8\u51fa\u8fd9\u4e9b\u3002\u4f46\u8ba1\u7b97\u673a\u770b\u5230\u4e86\u4ec0\u4e48\uff1f<\/p>\n<p>\u5728\u5206\u7c7b\u4efb\u52a1\u4e2d\uff0c\u6211\u4eec\u9700\u8981 <em>\u597d\u7684\u7279\u5f81<\/em> \u6765\u5b66\u4e60\u4ece\u6837\u672c\u6620\u5c04\u5230\u6807\u7b7e\u7684\u51fd\u6570\u3002<\/p>\n<p><strong>\u539f\u59cb\u50cf\u7d20<\/strong> \u901a\u5e38\u4e0d\u662f\u8db3\u591f\u6709\u8868\u73b0\u529b\u7684\u7279\u5f81\uff01\u8fd9\u662f\u56e0\u4e3a\u539f\u59cb\u50cf\u7d20\u65e0\u6cd5\u6355\u6349\u56fe\u50cf\u4e2d\u7684 
<em>\u7a7a\u95f4\u5173\u7cfb<\/em>\u3002<\/p>\n<p>\u4f7f\u7528\u5377\u79ef\uff0c\u6211\u4eec\u53ef\u4ee5\u6355\u6349 <strong>\u7a7a\u95f4\u7ed3\u6784<\/strong>\uff08\u4f8b\u5982\uff0c\u5c3e\u5df4\u5f62\u72b6\u7684\u50cf\u7d20\uff09\u3002<\/p>\n<p>\u5377\u79ef\u7684\u6838\u5fc3\u672c\u8d28\uff1a<strong>\u5bf9\u5c40\u90e8\u533a\u95f4\u7684\u4fe1\u606f\u63d0\u53d6\u3002<\/strong><\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/dusk\/64\/000000\/s.png\" style=\"height:50px;display:inline\"> The Softmax Function<\/h3>\n<hr \/>\n<ul>\n<li>\n<p>\u5047\u8bbe\u6211\u4eec\u8bbe\u8ba1\u4e86\u4e00\u4e2a\u5206\u7c7b\u67b6\u6784\uff0c\u5206\u7c7b\u6a21\u578b\u4e2d\u6700\u540e\u4e00\u4e2a\u5168\u8fde\u63a5\u5c42\u7684\u8f93\u51fa\u662f\u4e00\u4e2a\u957f\u5ea6\u4e3a $\\text{num-classes}$ \u7684\u5411\u91cf\uff0c\u5b83\u6b63\u597d\u662f\u6211\u4eec\u62e5\u6709\u7684\u7c7b\u6570\uff08\u4f8b\u5982\uff0c\u5728 MNIST \u6216 CIFAR-10 \u4e2d\uff0c\u6211\u4eec\u6709 10 \u4e2a\u7c7b\uff0c\u56e0\u6b64\u8f93\u51fa\u7ef4\u5ea6\u4e3a 10\uff09\uff01<\/p>\n<\/li>\n<li>\n<p>\u5728\u8f93\u51fa\u5411\u91cf\u4e2d\uff0c\u6211\u4eec\u5e0c\u671b\u6761\u76ee $i$ \u662f\u8f93\u5165\u6765\u81ea\u7c7b $i$ \u7684\u6982\u7387\u3002<\/p>\n<\/li>\n<li>\n<p>\u4f46\u6211\u4eec\u5982\u4f55\u5f3a\u5236\u8fd9\u4e2a\u5411\u91cf\u8f93\u51fa\u6982\u7387\u800c\u4e0d\u4ec5\u4ec5\u662f\u4e00\u4e9b\u6570\u5b57\uff1f<\/p>\n<\/li>\n<li>\n<p>\u6211\u4eec\u5c06\u4f7f\u7528<strong>Softmax<\/strong>\u51fd\u6570\u5c06\u5176\u6807\u51c6\u5316\u4e3a\u6982\u7387\u3002<\/p>\n<\/li>\n<li>\n<p>The Softmax function is defined as:  $$ Softmax(x_i) = \\frac{e^{x_i}}{\\sum_{j=1}^M e^{x_j}}, i \\in [1,...,M], x \\in \\mathcal{R}^M  $$<\/p>\n<\/li>\n<li>\n<p>This forces the output vector to sum to 1, just like probabilities.<\/p>\n<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913165105334.png\" 
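style=\"height:200px\"><\/p>\n<p>To make the normalization concrete, here is a minimal NumPy sketch of the Softmax above (the input values are arbitrary examples):<\/p>\n<pre><code class=\"language-python\">import numpy as np\n\ndef softmax(x):\n    # subtract the max for numerical stability; the result is unchanged\n    e = np.exp(x - np.max(x))\n    return e \/ np.sum(e)\n\nscores = np.array([2.0, 1.0, 0.1])  # raw logits\nprobs = softmax(scores)\nprint(probs)        # the largest logit gets the largest probability\nprint(probs.sum())  # 1.0<\/code><\/pre>\n<p hidden 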
style=\"height:200px\">\n<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/plasticine\/100\/000000\/unicorn.png\"  style=\"height:50px;display:inline\"> Making Predictions<\/h3>\n<hr \/>\n<ul>\n<li>\u73b0\u5728\uff0c\u6211\u4eec\u6709\u4e00\u4e2a\u6982\u7387\u8f93\u51fa\u5411\u91cf\uff0c\u90a3\u4e48\u6211\u4eec\u5982\u4f55\u9884\u6d4b\u8f93\u5165\u56fe\u50cf\u7684\u6807\u7b7e\u5462\uff1f<\/li>\n<li>\u5f88\u7b80\u5355\uff01\u53ea\u9700\u53d6 $argmax$\uff1a $$ \\hat{y} = Softmax(NN(x)) $$ $$ c_{pred} = argmax_i (\\hat{y}) $$<\/li>\n<\/ul>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/dusk\/64\/000000\/bearish.png\" style=\"height:50px;display:inline\"> Loss Function - Cross Entropy<\/h3>\n<hr \/>\n<ul>\n<li>\u4e3a\u4e86\u4ee5\u7aef\u5230\u7aef\u7684\u65b9\u5f0f\u8bad\u7ec3\u6a21\u578b\uff0c\u6211\u4eec\u9700\u8981\u5b9a\u4e49\u4e00\u4e2a\u635f\u5931\u51fd\u6570\uff0c\u6211\u4eec\u53ef\u4ee5\u4f7f\u7528\u4f18\u5316\u6280\u672f\u5c06\u5176\u6700\u5c0f\u5316\u3002<\/li>\n<li>\u5047\u8bbe\u6211\u4eec\u7684\u6a21\u578b\u8f93\u51fa\uff08\u7ecf\u8fc7 softmax \u4e4b\u540e\uff09\u4e3a $\\hat{y}$\uff0c\u800c\u771f\u5b9e\u6807\u7b7e\uff08\u7ed9\u6211\u4eec\u7684\u771f\u5b9e\u7c7b\u522b\uff09\u4e3a $y$\u3002\n<p align=\"center\">\n<img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913165309466.png\" style=\"height:200px\">\n<\/p>\n<\/li>\n<\/ul>\n<p>\u4ea4\u53c9\u71b5\u635f\u5931\u51fd\u6570\u5728\u5206\u7c7b\u95ee\u9898\u4e2d\u7684\u5e94\u7528\u53ef\u4ee5\u901a\u8fc7\u516c\u5f0f\u6765\u66f4\u597d\u5730\u7406\u89e3\u3002\u5bf9\u4e8e\u4e00\u4e2a\u591a\u5206\u7c7b\u95ee\u9898\uff0c\u4ea4\u53c9\u71b5\u635f\u5931\u51fd\u6570\u901a\u5e38\u5b9a\u4e49\u4e3a:<br \/>\n$$<br \/>\nL=-\\sum_{i=1}^N \\sum_{c=1}^C y_{i, c} \\log \\left(\\hat{y}_{i, c}\\right)<br \/>\n$$<\/p>\n<p>\u5176\u4e2d:<\/p>\n<ul>\n<li>$N$ \u662f\u6837\u672c\u6570\u91cf\u3002<\/li>\n<li>$C$ 
\u662f\u7c7b\u522b\u6570\u91cf\u3002<\/li>\n<li>$y_{i, c}$  \u662f\u6837\u672c $i$ \u7684\u771f\u5b9e\u6807\u7b7e\uff0c\u5982\u679c\u6837\u672c $i$ \u5c5e\u4e8e\u7c7b\u522b $c$ \uff0c\u5219 $y_{i, c}=1$ \uff0c\u5426\u5219 $y_{i, c}=0$ \u3002<\/li>\n<li>$\\hat{y}_{i, c}$ \u662f\u6a21\u578b\u9884\u6d4b\u6837\u672c $i$ \u5c5e\u4e8e\u7c7b\u522b $c$ \u7684\u6982\u7387\u3002<\/li>\n<\/ul>\n<p>\u5bf9\u4e8e\u6bcf\u4e00\u4e2a\u6837\u672c $i$, \u771f\u5b9e\u6807\u7b7e $y_{i, c}$ \u53ea\u6709\u4e00\u4e2a\u503c\u4e3a 1 (\u5373\u6b63\u786e\u7c7b\u522b)\uff0c\u5176\u4ed6\u7c7b\u522b\u7684\u503c\u90fd\u4e3a 0 \u3002\u5047\u8bbe\u6837\u672c $i$ \u5c5e\u4e8e\u7c7b\u522b $c$ \uff0c\u5219 $y_{i, c}=1$ \uff0c\u5176\u4ed6\u7c7b\u522b  $y_{i, j}=0$ (\u5176\u4e2d $j \\neq c$ )\u3002<\/p>\n<p>\u56e0\u6b64\uff0c\u5bf9\u4e8e\u6837\u672c $i$ \u7684\u4ea4\u53c9\u71b5\u635f\u5931\u4e3b\u8981\u7531$\\log \\left(\\hat{y}_{i, c}\\right)$\u51b3\u5b9a\uff0c\u56e0\u4e3a\u5176\u4ed6\u9879\u7684  $y_{i, j} \\log \\left(\\hat{y}_{i, j}\\right)$  \u90fd\u4e3a 0 \u3002\u5373:<br \/>\n$$<br \/>\nL_i=-y_{i, c} \\log \\left(\\hat{y}_{i, c}\\right)=-\\log \\left(\\hat{y}_{i, c}\\right)<br \/>\n$$<\/p>\n<p>\u5f53\u9884\u6d4b\u6982\u7387 $\\hat{y}_{i, c}$ \u8d8a\u63a5\u8fd1\u771f\u5b9e\u6807\u7b7e $y_{i, c}$ \u65f6\uff0c\u5373$\\hat{y}_{i, c}$ \u8d8a\u63a5\u8fd1 1 \u65f6\uff0c $\\log \\left(\\hat{y}_{i, c}\\right)$ \u8d8a\u63a5\u8fd1 0 \uff0c\u56e0\u4e3a $\\log (1)=0$ \u3002<\/p>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/bubbles\/50\/000000\/mind-map.png\" style=\"height:50px;display:inline\"> Convolutional Neural Networks (CNNs)<\/h2>\n<p>\u4e0b\u9762\uff0c\u6211\u4eec\u5c06\u4ecb\u7ecd\u5377\u79ef\u795e\u7ecf\u7f51\u7edc\uff08CNN\uff09\u7684\u57fa\u672c\u6784\u6210\u8981\u7d20\u4ee5\u53ca\u5b9e\u73b0\u5bf9\u56fe\u50cf\u6570\u636e\u8fdb\u884c\u6709\u6548\u5b66\u4e60\u7684\u601d\u60f3\u3002<\/p>\n<h3>Feature Mapping and Multiple Channels<\/h3>\n<hr \/>\n<p align=\"center\">\n  
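<\/p>\n<p>A convolution layer stores its kernels in a weight tensor of shape [out_channels, in_channels, kernel_h, kernel_w]. A minimal PyTorch check using the numbers from the animation below (8 kernels, kernel size 3, stride 2, padding 1; the 32x32 input size is an assumed example):<\/p>\n<pre><code class=\"language-python\">import torch\nimport torch.nn as nn\n\nconv = nn.Conv2d(in_channels=8, out_channels=8, kernel_size=3, stride=2, padding=1)\nx = torch.randn(1, 8, 32, 32)  # a batch of one image with 8 input channels\nprint(conv.weight.shape)  # torch.Size([8, 8, 3, 3]): 8 kernels, each 8x3x3\nprint(conv(x).shape)      # torch.Size([1, 8, 16, 16]): one feature map per kernel<\/code><\/pre>\n<p align=\"center\">\n  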
<img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913170159782.png\" style=\"height:350px\">\n<\/p>\n<p>\u5728\u4e0b\u9762\u52a8\u56fe\u4e2d\uff0c\u5377\u79ef\u6838\u7684\u6570\u91cf\u4e3a8\u4e2a\uff0c\u6bcf\u4e2a\u5377\u79ef\u6838\u7684\u5f62\u72b6\u4e3a8x3x3\uff0c\u5176\u4e2d8\u4e3a\u5377\u79ef\u6838\u7279\u5f81\u901a\u9053\u7684\u6570\u91cf\uff0c\u8fd9\u4e0e\u8f93\u5165\u6570\u636e\u7684\u7279\u5f81\u901a\u9053\u6570\u91cf\u662f\u76f8\u7b49\u7684\u3002\u5377\u79ef\u6838\u5728\u8ba1\u7b97\u8fc7\u7a0b\u4e2d\u7684\u586b\u5145\u4e3a1\uff0c\u6b65\u957f\u4e3a2\uff0c\u6bcf\u4e2a\u5377\u79ef\u6838\u90fd\u53ef\u8ba1\u7b97\u5f97\u5230\u4e00\u5f20\u7279\u5f81\u56fe\uff0c8\u4e2a\u5377\u79ef\u6838\u53ef\u8ba1\u7b97\u5f97\u52308\u5f20\uff0c\u56e0\u6b64\u6700\u540e\u8f93\u51fa\u7ed3\u679c\u7684\u901a\u9053\u6570\u91cf\u4e5f\u4e3a8.<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913170311976.gif\" style=\"height:350px\">\n<\/p>\n<h4><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/?size=100&id=91CnU00i6HLv&format=png&color=000000\" style=\"height:50px;display:inline\"> \u5377\u79ef\u5904\u7406\u56fe\u50cf\u6570\u636e\u7684\u6709\u6548\u6027\u5982\u4f55\u89e3\u91ca\uff1f<\/h4>\n<hr \/>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913170523324.png\" 
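style=\"height:350px\"><\/p>\n<p>Counting parameters makes the efficiency of convolution concrete; the layer sizes here are illustrative assumptions, not values from this tutorial:<\/p>\n<pre><code class=\"language-python\">import torch.nn as nn\n\n# a 3x3 conv from 3 to 8 channels vs. a fully-connected layer\n# producing the same number of outputs for a 32x32 RGB image\nconv = nn.Conv2d(3, 8, kernel_size=3)\nfc = nn.Linear(3 * 32 * 32, 8 * 30 * 30)\nconv_params = sum(p.numel() for p in conv.parameters())\nfc_params = sum(p.numel() for p in fc.parameters())\nprint(conv_params)  # 224: the same small kernels are shared across all positions\nprint(fc_params)    # over 22 million<\/code><\/pre>\n<p hidden 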
style=\"height:350px\">\n<\/p>\n<ul>\n<li>\u53c2\u6570\u5c11\uff0c\u6613\u8bad\u7ec3<\/li>\n<li>\u5c40\u90e8\u76f8\u5173\u6027<\/li>\n<li>\u5e73\u79fb\u4e0d\u53d8\u6027\uff08\u53c2\u6570\u5171\u4eab\uff09<\/li>\n<\/ul>\n<p><strong>\u5377\u79ef\u5982\u4f55\u8fdb\u884c\u7279\u5f81\u63d0\u53d6\uff1f<\/strong><\/p>\n<p>\u4e0d\u540c\u7684\u5377\u79ef\u6838\u53ef\u4ee5\u4ece\u56fe\u50cf\u4e2d\u63d0\u53d6\u4e0d\u540c\u7684\u7279\u5f81\u3002\u56e0\u4e3a\u5f53\u63d0\u53d6\u7684\u7279\u5f81\u592a\u5c11\u65f6\uff0c\u662f\u6ca1\u6709\u529e\u6cd5\u5b8c\u6210\u56fe\u50cf\u8bc6\u522b\u8fd9\u4e2a\u4efb\u52a1\u7684\u3002\u4e00\u4e2a\u7b80\u5355\u7684\u4f8b\u5b50\uff1a\u6211\u4eec\u4e0d\u80fd\u51ed\u501f\u773c\u775b\u8fd9\u4e00\u79cd\u7279\u5f81\u6765\u8bc6\u522b\u732b\u548c\u72d7\u8fd9\u4e24\u4e2a\u7c7b\u522b\uff0c\u5f80\u5f80\u9700\u8981\u6839\u636e\u773c\u775b\u3001\u5634\u5df4\u3001\u5916\u5f62\u3001\u6bdb\u53d1\u3001\u8033\u6735\u7b49\u7b49\u591a\u79cd\u7279\u5f81\u624d\u80fd\u5bf9\u732b\u548c\u72d7\u505a\u6b63\u786e\u7684\u8bc6\u522b\u3002<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913170558725.png\" style=\"height:200px\">\n<\/p>\n<h3>Pooling<\/h3>\n<hr \/>\n<ul>\n<li>\u6c60\u5316\u53ef\u4ee5\u770b\u4f5c\u7279\u6b8a\u7684\u5377\u79ef\u64cd\u4f5c\uff0c\u552f\u4e00\u7684\u4e0d\u540c\u4e4b\u5904\u5728\u4e8e\u5176\u4e0d\u662f\u5e94\u7528\u53ef\u4ee5\u8bad\u7ec3\u7684\u6743\u91cd\uff0c\u800c\u662f\u5728\u5176\u7a97\u53e3\u4e2d\u5e94\u7528\u67d0\u79cd\u7c7b\u578b\u7684\u7edf\u8ba1\u51fd\u6570\u3002<\/li>\n<li>\u6700\u5e38\u89c1\u7684\u6c60\u5316\u7c7b\u578b\u79f0\u4e3a<strong>\u6700\u5927\u6c60\u5316<\/strong>\uff0c\u5b83\u5c06 $max()$ \u51fd\u6570\u5e94\u7528\u4e8e\u7a97\u53e3\u7684\u5185\u5bb9\u3002<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" 
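src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913170657893.png\" style=\"height:350px\"><\/p>\n<p>A minimal sketch of max pooling on a toy tensor (the values are arbitrary):<\/p>\n<pre><code class=\"language-python\">import torch\nimport torch.nn as nn\n\nx = torch.tensor([[[[1., 3., 2., 4.],\n                    [5., 6., 7., 8.],\n                    [3., 2., 1., 0.],\n                    [1., 2., 3., 4.]]]])\npool = nn.MaxPool2d(kernel_size=2, stride=2)\nprint(pool(x))  # each 2x2 window is reduced to its maximum: [[6., 8.], [3., 4.]]<\/code><\/pre>\n<p hidden><img hidden 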
src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913170657893.png\" style=\"height:350px\">\n<\/p>\n<p><a href=\"https:\/\/medium.com\/@duanenielsen\/deep-learning-cage-match-max-pooling-vs-convolutions-e42581387cb9\">Image Source <\/a><\/p>\n<h4><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/?size=100&id=91CnU00i6HLv&format=png&color=000000\" style=\"height:50px;display:inline\"> \u6c60\u5316\u6709\u6ca1\u6709\u53ef\u5b66\u4e60\u53c2\u6570\uff1f<\/h4>\n<hr \/>\n<p>CNN \u4e2d\u7684\u6c60\u5316\u4e00\u822c\u7528\u4e8e\u4e0b\u91c7\u6837\u64cd\u4f5c\u3002\u5982\u4e0b\u6240\u793a\uff0c5 x 5 \u8f93\u5165\u51cf\u5c0f\u4e3a 3 x 3 \u8f93\u51fa\u3002 <\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913170807408.png\" style=\"height:200px\">\n<\/p>\n<p>\u5728\u4e0a\u9762\u7684\u6c60\u5316\u56fe\u4e2d\uff0c\u5728 5 x 5 \u8f93\u5165\u4e2d\u6dfb\u52a0\u4e86\u989d\u5916\u7684\u5217\u548c\u884c - \u8fd9\u4f7f\u5f97\u6c60\u5316\u7a7a\u95f4\u7684\u6709\u6548\u5927\u5c0f\u7b49\u4e8e 6 x 6\u3002\u8fd9\u662f\u4e3a\u4e86\u786e\u4fdd 2 x 2 \u6c60\u5316\u7a97\u53e3\u80fd\u591f\u4ee5 [2, 2] \u7684\u6b65\u5e45\u6b63\u786e\u8fd0\u884c - <em>\u586b\u5145<\/em>\u3002\u867d\u7136\u6211\u4eec\u901a\u5e38\u7528\u96f6\u586b\u5145\uff0c\u4f46\u4e5f\u53ef\u4ee5\u7528\u5176\u4ed6\u503c\u586b\u5145\u3002<\/p>\n<ul>\n<li>\n<p>\u53e6\u4e00\u65b9\u9762\uff0c\u6700\u5927\u6c60\u5316\u53ef\u4ee5\u8fdb\u884c\u7279\u5f81\u7684\u6c47\u805a\u3002<\/p>\n<\/li>\n<li>\n<p>\u5176\u4ed6\u6c60\u5316\u64cd\u4f5c:<\/p>\n<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913170841754.png\" style=\"height:200px\">\n<\/p>\n<h3>The FC Layer<\/h3>\n<hr 
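\/>\n<p>A minimal sketch of the flatten-then-classify pattern described in this section (the feature-map size and class count are assumed examples):<\/p>\n<pre><code class=\"language-python\">import torch\nimport torch.nn as nn\n\nfeature_maps = torch.randn(1, 8, 14, 14)         # assumed conv output: 8 maps of 14x14\nflat = torch.flatten(feature_maps, start_dim=1)  # keep the batch dimension\nprint(flat.shape)  # torch.Size([1, 1568])\nfc = nn.Linear(8 * 14 * 14, 10)  # e.g. 10 classes for CIFAR-10\nprint(fc(flat).shape)  # torch.Size([1, 10])<\/code><\/pre>\n<hr 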
\/>\n<ul>\n<li>\n<p>\u5168\u8fde\u63a5\u5c42\u53ef\u4ee5\u88ab\u770b\u4f5c\u662f\u5c06\u4e00\u4e2a\u6807\u51c6\u5206\u7c7b\u5668\u9644\u52a0\u5230\u7f51\u7edc\u7684\u4fe1\u606f\u4e30\u5bcc\u7684\u8f93\u51fa\u4e0a\uff0c\u4ee5\u201c\u89e3\u91ca\u201d\u7ed3\u679c\u5e76\u6700\u7ec8\u4ea7\u751f\u5206\u7c7b\u7ed3\u679c\u3002\u6362\u53e5\u8bdd\u8bf4\uff0c\u5377\u79ef\u5c42\u7684\u8f93\u51fa\u6210\u4e3a\u5206\u7c7b\u5668\u7684\u65b0\u201c\u8f93\u5165\u7279\u5f81\u201d\u3002<\/p>\n<\/li>\n<li>\n<p>\u4e3a\u4e86\u5c06\u8fd9\u4e2a\u5168\u8fde\u63a5\u5c42\u9644\u52a0\u5230\u7f51\u7edc\u4e0a\uff0c\u9700\u8981\u5c06\u5377\u79ef\u795e\u7ecf\u7f51\u7edc\u7684\u8f93\u51fa\u7ef4\u5ea6\u8fdb\u884c\u5c55\u5e73\u3002<\/p>\n<\/li>\n<li>\n<p>\u6ce8\u610f\uff1a\u6709\u4e9b\u7f51\u7edc\uff08\u4f8b\u5982\uff0c\u5168\u5377\u79ef\u7f51\u7edc\uff0cFCN\uff09\u6839\u672c\u4e0d\u4f7f\u7528\u5168\u8fde\u63a5\u5c42\uff01<br \/>\n\u6bd4\u5982\uff0c\u4f60\u53ef\u4ee5\u8bbe\u8ba1\u5377\u79ef\u6838\uff0c\u4f7f\u5f97\u6700\u7ec8\u8f93\u51fa\u662f\u5f62\u72b6\u4e3a [num_classes, 1, 1]\uff08num_classes \u4e2a\u901a\u9053\uff0c\u9ad8\u5ea6\u4e3a1\uff0c\u5bbd\u5ea6\u4e3a1\uff09\u7684\u5f20\u91cf\uff0c\u5e76\u4f7f\u7528\u5b83\u8fdb\u884c\u5206\u7c7b\u3002\u8fd9\u5728\u6267\u884c\u56fe\u50cf\u5206\u5272\u6216\u751f\u6210\u7684\u7f51\u7edc\u4e2d\u4e5f\u975e\u5e38\u6709\u7528\uff0c\u56e0\u4e3a\u8fd9\u4e9b\u7f51\u7edc\u7684\u8f93\u51fa\u662f\u50cf\u7d20\u3002<\/p>\n<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913171058522.png\" style=\"height:700px\">\n<\/p>\n<h3>Calculating the Convolutional Layer Output Shape<\/h3>\n<hr \/>\n<p>\u6211\u4eec\u5b9a\u4e49\u4e86<em>\u5377\u79ef\u5c42<\/em> \u7684\u4ee5\u4e0b\u53c2\u6570\uff1a<\/p>\n<ul>\n<li>$W_{in}$  - \u8f93\u5165\u7684\u5bbd\u5ea6<\/li>\n<li>$F$ - \u5377\u79ef\u6838\u5927\u5c0f<\/li>\n<li>$P$ - \u586b\u5145<\/li>\n<li>$S$ - 
\u6b65\u5e45<\/li>\n<\/ul>\n<p>\u8f93\u51fa\u7279\u5f81\u56fe\u7684\u5927\u5c0f:<br \/>\n$$W_{out} = \\frac{W_{in} - F + 2P}{S} + 1$$<\/p>\n<p>\u8003\u8651\u5927\u5c0f\u4e3a $28\\times 28$ \u7684\u8f93\u5165\u56fe\u50cf\u3001\u5377\u79ef\u6838\u5927\u5c0f\u4e3a $5\\times 5$\u3001\u586b\u5145\u4e3a 2\u3001\u6b65\u5e45\u4e3a 1<br \/>\n$$W_{1, out} = \\frac{28 - 5 + 2*2}{1} + 1 = 28 \\rightarrow MaxPooling(2x2) \\rightarrow 28 \/ 2 = 14$$<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913171757876.png\" style=\"height:200px\">\n<\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/a-comprehensive-introduction-to-different-types-of-convolutions-in-deep-learning-669281e58215\">Image Source<\/a><\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913171841362.gif\" style=\"height:300px\">\n<\/p>\n<p>Animation by <a href=\"https:\/\/medium.com\/@nadeemhqazi\/a-brief-introduction-to-convolution-neural-network-4821215aa591\">Nadeem Qazi<\/a>.<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/color\/96\/000000\/layers.png\" style=\"height:50px;display:inline\"> Low Level (Shallow) and High Level (Deep) Features<\/h3>\n<hr \/>\n<ul>\n<li>\u4e3a\u4e86\u89e3\u91caCNN\u7684\u539f\u7406\uff0cZFNet\u7b49\u76f8\u5173\u5de5\u4f5c\u5728\u7f51\u7edc\u7684\u4e0d\u540c\u5c42\u7ea7\u89c2\u5bdf\u7279\u5f81\uff08\u5377\u79ef\u6838\u7684\u8f93\u51fa\uff09\u3002<\/li>\n<li><strong>\u4f4e\u5c42<\/strong> - \u6d45\u5c42\u7279\u5f81\uff0c\u5305\u62ec\u7ebf\u6761\u3001\u89d2\u3001\u8fb9\u7f18\u3001\u989c\u8272\u7b49\u3002<\/li>\n<li><strong>\u4e2d\u5c42<\/strong> - \u4e2d\u95f4\u5c42\u7279\u5f81\uff0c\u901a\u5e38\u662f\u7269\u4f53\u7684\u4e00\u90e8\u5206\u3002<\/li>\n<li><strong>\u9ad8\u5c42<\/strong> - 
\u6df1\u5c42\u7279\u5f81\uff0c\u66f4\u5927\u89c6\u89d2\u7684\u7269\u4f53\uff0c\u751a\u81f3\u662f\u6574\u4e2a\u7269\u4f53\u3002\n<p align=\"center\">\n<img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913194645525.png\" style=\"height:350px\">\n<\/p>\n<\/li>\n<\/ul>\n<p><a href=\"https:\/\/medium.com\/analytics-vidhya\/the-world-through-the-eyes-of-cnn-5a52c034dbeb\">Image Source<\/a><\/p>\n<p>\u8fd9\u5176\u5b9e\u4e0eCNN\u7684\u611f\u53d7\u91ce\u76f8\u5173\uff1a\u611f\u53d7\u91ce\uff08Receptive Field\uff09\u662f\u6307\u5377\u79ef\u795e\u7ecf\u7f51\u7edc\u4e2d\u67d0\u4e00\u795e\u7ecf\u5143\u5728\u8f93\u5165\u56fe\u50cf\u4e0a\u611f\u53d7\u5230\u7684\u533a\u57df\u5927\u5c0f\u3002\u6362\u53e5\u8bdd\u8bf4\uff0c\u611f\u53d7\u91ce\u63cf\u8ff0\u4e86\u795e\u7ecf\u5143\u5bf9\u8f93\u5165\u56fe\u50cf\u7684\u54ea\u4e9b\u90e8\u5206\u4ea7\u751f\u54cd\u5e94\u3002<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913194757372.png\" style=\"height:350px\">\n<\/p>\n<h3>Non-Linear Activations<\/h3>\n<hr \/>\n<ul>\n<li>\u6fc0\u6d3b\u51fd\u6570\u5728\u5377\u79ef\u795e\u7ecf\u7f51\u7edc\uff08CNN\uff09\u4e2d\u7528\u4e8e\u5f15\u5165\u975e\u7ebf\u6027\uff0c\u4f7f\u6a21\u578b\u80fd\u591f\u5b66\u4e60\u548c\u8868\u793a\u590d\u6742\u7684\u6a21\u5f0f\u548c\u7279\u5f81\u3002<\/li>\n<li>\u5b83\u5c06\u5377\u79ef\u5c42\u7684\u7ebf\u6027\u8f93\u51fa\u8f6c\u6362\u4e3a\u975e\u7ebf\u6027\u8f93\u51fa\uff0c\u4ece\u800c\u589e\u5f3a\u6a21\u578b\u7684\u8868\u8fbe\u80fd\u529b\u548c\u5206\u7c7b\u6027\u80fd\u3002<\/li>\n<li>\u5e38\u89c1\u7684\u6fc0\u6d3b\u51fd\u6570\u5982ReLU\u3001Sigmoid\u548cTanh\u5728\u4e0d\u540c\u5c42\u6b21\u548c\u4efb\u52a1\u4e2d\u53d1\u6325\u5173\u952e\u4f5c\u7528\uff0c\u5f71\u54cdCNN\u7684\u6574\u4f53\u6027\u80fd\u548c\u8bad\u7ec3\u6548\u679c\u3002<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" 
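src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913194848136.png\" style=\"height:350px\"><\/p>\n<p>A quick look at the three activations mentioned above on a toy tensor:<\/p>\n<pre><code class=\"language-python\">import torch\n\nx = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])\nprint(torch.relu(x))     # negative values are zeroed\nprint(torch.sigmoid(x))  # squashed into (0, 1)\nprint(torch.tanh(x))     # squashed into (-1, 1)<\/code><\/pre>\n<p hidden><img hidden 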
src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913194848136.png\" style=\"height:350px\">\n<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/officel\/80\/000000\/rope.png\" style=\"height:50px;display:inline\"> Regularization - Preventing Overfitting<\/h3>\n<hr \/>\n<p>\u5728\u5377\u79ef\u795e\u7ecf\u7f51\u7edc\uff08CNN\uff09\u4e2d\uff0c<strong>\u8fc7\u62df\u5408<\/strong>\u662f\u6307\u6a21\u578b\u5728\u8bad\u7ec3\u6570\u636e\u4e0a\u8868\u73b0\u4f18\u5f02\uff0c\u4f46\u5728\u65b0\u6570\u636e\u4e0a\u8868\u73b0\u4e0d\u4f73\u7684\u73b0\u8c61\u3002\u8fd9\u901a\u5e38\u662f\u56e0\u4e3a\u6a21\u578b\u8fc7\u4e8e\u590d\u6742\uff0c\u80fd\u591f\u8bb0\u4f4f\u8bad\u7ec3\u6570\u636e\u4e2d\u7684\u566a\u58f0\u548c\u7ec6\u8282\uff0c\u800c\u4e0d\u662f\u5b66\u4e60\u5230\u771f\u6b63\u7684\u6a21\u5f0f\u548c\u7279\u5f81\u3002<\/p>\n<p>\u4e3a\u4e86\u89e3\u51b3\u8fc7\u62df\u5408\u95ee\u9898\uff0c\u5e38\u7528\u7684\u65b9\u6cd5\u6709\u6570\u636e\u589e\u5f3a\u548c\u6b63\u5219\u5316\uff1a<\/p>\n<p>\u6570\u636e\u589e\u5f3a\uff1a\u901a\u8fc7\u5bf9\u8bad\u7ec3\u6570\u636e\u8fdb\u884c\u5404\u79cd\u968f\u673a\u53d8\u6362\uff08\u5982\u65cb\u8f6c\u3001\u7f29\u653e\u3001\u5e73\u79fb\u7b49\uff09\u6765\u751f\u6210\u66f4\u591a\u7684\u8bad\u7ec3\u6837\u672c\uff0c\u4ece\u800c\u589e\u52a0\u6570\u636e\u7684\u591a\u6837\u6027\uff0c\u5e2e\u52a9\u6a21\u578b\u66f4\u597d\u5730\u6cdb\u5316\u3002<\/p>\n<p>\u6b63\u5219\u5316\uff1a\u901a\u8fc7\u5f15\u5165\u6b63\u5219\u9879\uff08\u5982L2\u6b63\u5219\u5316\uff09\u6765\u9650\u5236\u6a21\u578b\u7684\u590d\u6742\u5ea6\uff0c\u9632\u6b62\u6743\u91cd\u8fc7\u5927\uff0c\u6216\u8005\u4f7f\u7528Dropout\u6280\u672f\u968f\u673a\u4e22\u5f03\u4e00\u4e9b\u795e\u7ecf\u5143\uff0c\u4ee5\u51cf\u5c11\u6a21\u578b\u5bf9\u7279\u5b9a\u8def\u5f84\u7684\u4f9d\u8d56\u3002<\/p>\n<ul>\n<li>\n<p>\u8fc7\u62df\u5408\u7684\u53cd\u4e49\u8bcd\u662f<strong>\u6b20\u62df\u5408<\/strong>\u3002\u8fd9\u79cd\u60c5\u51b5\u53ef\u80fd\u51fa\u4e8e
\u591a\u79cd\u539f\u56e0\uff1a\u5982\u679c\u6a21\u578b\u4e0d\u591f\u5f3a\u5927\u3001\u8fc7\u5ea6\u6b63\u5219\u5316\uff0c\u6216\u8005\u53ea\u662f\u8bad\u7ec3\u65f6\u95f4\u4e0d\u591f\u957f\u3002\u8fd9\u610f\u5473\u7740\u7f51\u7edc\u5c1a\u672a\u5b66\u4e60\u8bad\u7ec3\u6570\u636e\u4e2d\u7684\u76f8\u5173\u6a21\u5f0f\u3002<\/p>\n<\/li>\n<li>\n<p>\u6b63\u5219\u5316\u901a\u5e38\u4ee5\u5bf9\u53c2\u6570\u65bd\u52a0\u7ea6\u675f\u7684\u5f62\u5f0f\u51fa\u73b0\uff0c\u6216\u8005\u5728\u795e\u7ecf\u7f51\u7edc\u7684\u60c5\u51b5\u4e0b\uff0c\u5bf9\u5c42\u7684\u6743\u91cd\u65bd\u52a0\u7ea6\u675f\u3002 <\/p>\n<\/li>\n<li>\n<p>\u5e38\u89c1\u7684\u6b63\u5219\u5316\u6709  $L_2\u3001L_1$ \u6b63\u5219\u5316\uff1a $$ \\text{New Loss}_{L_2} = \\text{Original Loss} + \\lambda \\mid \\mid w \\mid \\mid^2$$<\/p>\n<\/li>\n<li>\n<p>\u5bf9\u4e8e\u6df1\u5ea6\u795e\u7ecf\u7f51\u7edc\uff08\u548c CNN\uff09\uff0c\u4e00\u79cd\u5e38\u89c1\u7684\u6b63\u5219\u5316\u6280\u672f\u662f <strong>Dropout<\/strong>\u3002<\/p>\n<\/li>\n<\/ul>\n<h4>Dropout Regularization<\/h4>\n<hr \/>\n<ul>\n<li>\u9996\u6b21\u51fa\u73b0\u5728 <a href=\"http:\/\/jmlr.org\/papers\/v15\/srivastava14a.html\">Dropout: A Simple Way to Prevent Neural Networks from Overfitting<\/a>, 2014.<\/li>\n<li>Dropout \u662f\u4e00\u79cd\u6b63\u5219\u5316\u65b9\u6cd5\uff0c<strong>\u8fd1\u4f3c\u4e8e\u5e76\u884c\u8bad\u7ec3\u5927\u91cf\u5177\u6709\u4e0d\u540c\u67b6\u6784\u7684\u795e\u7ecf\u7f51\u7edc<\/strong>\u3002<\/li>\n<li>\u5728\u8bad\u7ec3\u671f\u95f4\uff0c\u4e00\u4e9b\u5c42\u8f93\u51fa\uff08\u5373\u795e\u7ecf\u5143\uff09\u88ab\u968f\u673a\u5ffd\u7565\u6216\u4ee5\u67d0\u4e2a\u6982\u7387 $p$\u201c\u4e22\u5f03\u201d\u3002<\/li>\n<li>Dropout \u4f1a\u4f7f\u8bad\u7ec3\u8fc7\u7a0b\u53d8\u5f97\u5608\u6742\uff0c\u8feb\u4f7f\u5c42\u5185\u7684\u8282\u70b9\u6982\u7387\u5730\u627f\u62c5\u66f4\u591a\u6216\u66f4\u5c11\u7684\u8f93\u5165\u8d23\u4efb\u3002<\/li>\n<li>Dropout 
\u4ec5\u5728<strong>\u8bad\u7ec3<\/strong>\u671f\u95f4\u6fc0\u6d3b\uff08<code>model.train()<\/code>\uff09\u3002\u5728\u6d4b\u8bd5\u65f6\uff0c\u5b83\u4f1a\u5173\u95ed\uff08<code>model.eval()<\/code>\uff09\u3002<\/li>\n<li>Dropout \u4ece\u67d0\u79cd\u7a0b\u5ea6\u6765\u8bf4\uff0c\u6709\u70b9\u201c\u96c6\u6210\u5b66\u4e60\u201d\u7684\u601d\u60f3\u3002<\/li>\n<\/ul>\n<p>\u9605\u8bfb\u66f4\u591a - <a href=\"https:\/\/machinelearningmastery.com\/dropout-for-regularizing-deep-neural-networks\/\">\u6df1\u5ea6\u795e\u7ecf\u7f51\u7edc\u6b63\u5219\u5316 Dropout \u7684\u7b80\u5355\u4ecb\u7ecd<\/a><\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913195524712.png\" style=\"height:200px\">\n<\/p>\n<p><a href=\"https:\/\/www.oreilly.com\/library\/view\/tensorflow-for-deep\/9781491980446\/ch04.html\">Image Source<\/a><\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/dusk\/64\/000000\/variation.png\" style=\"height:50px;display:inline\"> Data Augmentation<\/h3>\n<hr \/>\n<ul>\n<li>\n<p>\u6570\u636e\u589e\u5f3a\u662f\u4e00\u79cd\u5e38\u7528\u6280\u672f\uff0c\u53ef\u4ee5\u6539\u5584\u7ed3\u679c\u5e76\u907f\u514d\u8fc7\u5ea6\u62df\u5408\uff0c\u5e2e\u52a9\u7f51\u7edc\u66f4\u597d\u5730\u6cdb\u5316\u3002<\/p>\n<\/li>\n<li>\n<p><strong>\u5f53\u6837\u672c\u6570\u91cf\u6709\u9650\u65f6\uff0c\u6211\u4eec\u5982\u4f55\u83b7\u5f97\u66f4\u591a\u6570\u636e\uff1f<\/strong>\u6211\u4eec\u53ef\u4ee5\u8fdb\u884c\u6570\u636e\u589e\u5f3a\u3002<\/p>\n<\/li>\n<li>\n<p>\u6570\u636e\u589e\u5f3a\u901a\u8fc7\u6dfb\u52a0\u539f\u59cb\u6837\u672c\u7684\u53d8\u4f53\u6765\u4e30\u5bcc\u6570\u636e\u96c6\u3002 <\/p>\n<\/li>\n<li>\n<p>Popular augmentation techniques:<\/p>\n<ul>\n<li><strong>\u7ffb\u8f6c<\/strong> - \u6c34\u5e73\u548c\/\u6216\u5782\u76f4\u7ffb\u8f6c\u56fe\u50cf\u3002<\/li>\n<li><strong>\u65cb\u8f6c<\/strong> - 
\u5c06\u56fe\u50cf\u65cb\u8f6c\u4e00\u5b9a\u89d2\u5ea6\u3002\u8fd9\u53ef\u80fd\u4f1a\u6539\u53d8\u56fe\u50cf\u7684\u5927\u5c0f\uff0c\u56e0\u6b64\uff0c\u88c1\u526a\u6216\u586b\u5145\u662f\u5e38\u89c1\u7684\u89e3\u51b3\u65b9\u6cd5\u3002<\/li>\n<li><strong>\u7f29\u653e<\/strong> - \u56fe\u50cf\u53ef\u4ee5\u5411\u5916\u6216\u5411\u5185\u7f29\u653e\u3002\u8fd9\u4e5f\u53ef\u80fd\u4f1a\u6539\u53d8\u56fe\u50cf\u7684\u5927\u5c0f\uff0c\u56e0\u6b64\u901a\u5e38\u4f1a\u8fdb\u884c\u8c03\u6574\u5927\u5c0f\uff08\u4e5f\u5305\u62ec\u62c9\u4f38\uff09\u3002<\/li>\n<li><strong>\u88c1\u526a<\/strong> - \u4ece\u539f\u59cb\u56fe\u50cf\u4e2d\u968f\u673a\u62bd\u53d6\u4e00\u90e8\u5206\u3002\u7136\u540e\u5c06\u6b64\u90e8\u5206\u8c03\u6574\u4e3a\u539f\u59cb\u56fe\u50cf\u5927\u5c0f\u3002\u8fd9\u79f0\u4e3a\u968f\u673a\u88c1\u526a\u3002<\/li>\n<li><strong>\u5e73\u79fb<\/strong> - \u6cbf X \u6216 Y \u65b9\u5411\uff08\u6216\u4e24\u8005\uff09\u79fb\u52a8\u56fe\u50cf\u3002\u8fd9\u4f1a\u8feb\u4f7f\u795e\u7ecf\u7f51\u7edc\u5230\u5904\u67e5\u770b\u3002<\/li>\n<li><strong>\u566a\u58f0<\/strong> - \u8fc7\u5ea6\u62df\u5408\u901a\u5e38\u53d1\u751f\u5728\u7f51\u7edc\u5c1d\u8bd5\u5b66\u4e60\u53ef\u80fd\u6ca1\u7528\u7684\u9ad8\u9891\u7279\u5f81\uff08\u7ecf\u5e38\u51fa\u73b0\u7684\u6a21\u5f0f\uff09\u65f6\u3002\u9ad8\u65af\u566a\u58f0\u7684\u5747\u503c\u4e3a\u96f6\uff0c\u57fa\u672c\u4e0a\u5728\u6240\u6709\u9891\u7387\u4e0a\u90fd\u6709\u5206\u91cf\uff0c\u4ece\u800c\u6709\u6548\u5730\u626d\u66f2\u4e86\u9ad8\u9891\u7279\u5f81\u3002\u8fd9\u4e5f\u610f\u5473\u7740\u8f83\u4f4e\u9891\u7387\u7684\u6210\u5206\uff08\u901a\u5e38\u662f\u60a8\u60f3\u8981\u7684\u6570\u636e\uff09\u4e5f\u4f1a\u5931\u771f\uff0c\u4f46\u60a8\u7684\u795e\u7ecf\u7f51\u7edc\u53ef\u4ee5\u5b66\u4f1a\u5ffd\u7565\u8fd9\u4e00\u70b9\u3002\u6dfb\u52a0\u9002\u91cf\u7684\u566a\u58f0\u53ef\u4ee5\u589e\u5f3a\u5b66\u4e60\u80fd\u529b\uff08\u4f8b\u5982\uff0c\u6dfb\u52a0\u6912\u76d0\u566a\u58f0\uff09\u3002<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Read More - <a 
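href=\"https:\/\/nanonets.com\/blog\/data-augmentation-how-to-use-deep-learning-when-you-have-limited-data-part-2\/\">Data Augmentation | How to use Deep Learning when you have Limited Data<\/a><\/p>\n<p>Several of the techniques above can be chained with torchvision transforms; the parameter values below are arbitrary examples, not a recommended recipe:<\/p>\n<pre><code class=\"language-python\">import torchvision.transforms as transforms\n\n# one possible training-time pipeline: flip, rotate, random-resized crop\naugment = transforms.Compose([\n    transforms.RandomHorizontalFlip(p=0.5),\n    transforms.RandomRotation(degrees=15),\n    transforms.RandomResizedCrop(size=32, scale=(0.8, 1.0)),\n    transforms.ToTensor(),\n])<\/code><\/pre>\n<p hidden><a hidden 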
href=\"https:\/\/nanonets.com\/blog\/data-augmentation-how-to-use-deep-learning-when-you-have-limited-data-part-2\/\">Data Augmentation | How to use Deep Learning when you have Limited Data<\/a><\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913195647646.png\" style=\"height:200px\">\n<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913200026514.png\" style=\"height:50px;display:inline\"> Kornia: Differentiable GPU-Accelerated Augmentations<\/h3>\n<hr \/>\n<ul>\n<li><a href=\"https:\/\/kornia.github.io\/\">Kornia<\/a> \u662f\u4e00\u4e2a\u53ef\u5fae\u5206\u7684\u5e93\uff0c\u5b83\u5141\u8bb8\u5728GPU\u4e0a\u76f4\u63a5\u8fdb\u884c\u6570\u636e\u589e\u5f3a\u4ee5\u53ca\u5e94\u7528\u5176\u4ed6\u56fe\u50cf\u548c\u51e0\u4f55\u64cd\u4f5c\/\u53d8\u6362\uff0c\u5e76\u4e14\u53ef\u4ee5\u901a\u8fc7\u8fd9\u4e9b\u64cd\u4f5c\u8fdb\u884c\u53cd\u5411\u4f20\u64ad\uff01\n<ul>\n<li>\u8fd9\u662f torchvision \u5b9e\u73b0\u7684\u6570\u636e\u589e\u5f3a\u6240\u65e0\u6cd5\u505a\u5230\u7684\uff0c\u901a\u5e38\u9700\u8981\u5c06\u56fe\u50cf\u5f20\u91cf\u53d1\u9001\u56deCPU\u3002<\/li>\n<\/ul>\n<\/li>\n<li>See <a href=\"https:\/\/kornia.github.io\/tutorials\/\">Jupyter Notebook Tutorials using Kornia<\/a>.<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913195743313.gif\" style=\"height:200px\">\n<\/p>\n<table class=\"docutils align-default\" id=\"id1\">\n<caption><span class=\"caption-text\">\u8fd9\u662f\u5728 <a class=\"reference external\" href=\"https:\/\/colab.research.google.com\/drive\/1b-HpK4EsZR8uolztgH4roNBLaDwcMULx?usp=sharing\">Google Colab<\/a><br \/>\nK80 GPU 
using different libraries and batch sizes. The benchmark shows the significant speed-up that Kornia's GPU-accelerated augmentations provide. The image size is fixed at 224x224, and units are milliseconds (ms).<\/span><\/caption>\n<colgroup>\n<col style=\"width: 27%\">\n<col style=\"width: 15%\">\n<col style=\"width: 15%\">\n<col style=\"width: 15%\">\n<col style=\"width: 15%\">\n<col style=\"width: 15%\">\n<\/colgroup>\n<thead>\n<tr class=\"row-odd\">\n<th class=\"head\">\n<p>Libraries<\/p>\n<\/th>\n<th class=\"head\">\n<p>TorchVision<\/p>\n<\/th>\n<th class=\"head\">\n<p>Albumentations<\/p>\n<\/th>\n<th class=\"head\" colspan=\"3\">\n<p>Kornia (GPU)<\/p>\n<\/th>\n<\/tr>\n<tr class=\"row-even\">\n<th class=\"head\">\n<p>Batch Size<\/p>\n<\/th>\n<th class=\"head\">\n<p>1<\/p>\n<\/th>\n<th class=\"head\">\n<p>1<\/p>\n<\/th>\n<th class=\"head\">\n<p>1<\/p>\n<\/th>\n<th class=\"head\">\n<p>32<\/p>\n<\/th>\n<th class=\"head\">\n<p>128<\/p>\n<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"row-odd\">\n<td>\n<p>RandomPerspective<\/p>\n<\/td>\n<td>\n<p>4.88\u00b11.82<\/p>\n<\/td>\n<td>\n<p>4.68\u00b13.60<\/p>\n<\/td>\n<td>\n<p>4.74\u00b12.84<\/p>\n<\/td>\n<td>\n<p>0.37\u00b12.67<\/p>\n<\/td>\n<td>\n<p>0.20\u00b127.00<\/p>\n<\/td>\n<\/tr>\n<tr class=\"row-even\">\n<td>\n<p>ColorJiggle<\/p>\n<\/td>\n<td>\n<p>4.40\u00b12.88<\/p>\n<\/td>\n<td>\n<p>3.58\u00b13.66<\/p>\n<\/td>\n<td>\n<p>4.14\u00b13.85<\/p>\n<\/td>\n<td>\n<p>0.90\u00b124.68<\/p>\n<\/td>\n<td>\n<p>0.83\u00b112.96<\/p>\n<\/td>\n<\/tr>\n<tr class=\"row-odd\">\n<td>\n<p>RandomAffine<\/p>\n<\/td>\n<td>\n<p>3.12\u00b15.80<\/p>\n<\/td>\n<td>\n<p>2.43\u00b17.11<\/p>\n<\/td>\n<td>\n<p>3.01\u00b17.80<\/p>\n<\/td>\n<td>\n<p>0.30\u00b14.39<\/p>\n<\/td>\n<td>\n<p>0.18\u00b16.30<\/p>\n<\/td>\n<\/tr>\n<tr 
class=\"row-even\">\n<td>\n<p>RandomVerticalFlip<\/p>\n<\/td>\n<td>\n<p>0.32\u00b10.08<\/p>\n<\/td>\n<td>\n<p>0.34\u00b10.16<\/p>\n<\/td>\n<td>\n<p>0.35\u00b10.82<\/p>\n<\/td>\n<td>\n<p>0.02\u00b10.13<\/p>\n<\/td>\n<td>\n<p>0.01\u00b10.35<\/p>\n<\/td>\n<\/tr>\n<tr class=\"row-odd\">\n<td>\n<p>RandomHorizontalFlip<\/p>\n<\/td>\n<td>\n<p>0.32\u00b10.08<\/p>\n<\/td>\n<td>\n<p>0.34\u00b10.18<\/p>\n<\/td>\n<td>\n<p>0.31\u00b10.59<\/p>\n<\/td>\n<td>\n<p>0.01\u00b10.26<\/p>\n<\/td>\n<td>\n<p>0.01\u00b10.37<\/p>\n<\/td>\n<\/tr>\n<tr class=\"row-even\">\n<td>\n<p>RandomRotate<\/p>\n<\/td>\n<td>\n<p>1.82\u00b14.70<\/p>\n<\/td>\n<td>\n<p>1.59\u00b14.33<\/p>\n<\/td>\n<td>\n<p>1.58\u00b14.44<\/p>\n<\/td>\n<td>\n<p>0.25\u00b12.09<\/p>\n<\/td>\n<td>\n<p>0.17\u00b15.69<\/p>\n<\/td>\n<\/tr>\n<tr class=\"row-odd\">\n<td>\n<p>RandomCrop<\/p>\n<\/td>\n<td>\n<p>4.09\u00b13.41<\/p>\n<\/td>\n<td>\n<p>4.03\u00b14.94<\/p>\n<\/td>\n<td>\n<p>3.84\u00b13.07<\/p>\n<\/td>\n<td>\n<p>0.16\u00b11.17<\/p>\n<\/td>\n<td>\n<p>0.08\u00b19.42<\/p>\n<\/td>\n<\/tr>\n<tr class=\"row-even\">\n<td>\n<p>RandomErasing<\/p>\n<\/td>\n<td>\n<p>2.31\u00b11.47<\/p>\n<\/td>\n<td>\n<p>1.89\u00b11.08<\/p>\n<\/td>\n<td>\n<p>2.32\u00b13.31<\/p>\n<\/td>\n<td>\n<p>0.44\u00b12.82<\/p>\n<\/td>\n<td>\n<p>0.57\u00b19.74<\/p>\n<\/td>\n<\/tr>\n<tr class=\"row-odd\">\n<td>\n<p>RandomGrayscale<\/p>\n<\/td>\n<td>\n<p>0.41\u00b10.18<\/p>\n<\/td>\n<td>\n<p>0.43\u00b10.60<\/p>\n<\/td>\n<td>\n<p>0.45\u00b11.20<\/p>\n<\/td>\n<td>\n<p>0.03\u00b10.11<\/p>\n<\/td>\n<td>\n<p>0.03\u00b17.10<\/p>\n<\/td>\n<\/tr>\n<tr class=\"row-even\">\n<td>\n<p>RandomResizedCrop<\/p>\n<\/td>\n<td>\n<p>4.23\u00b12.86<\/p>\n<\/td>\n<td>\n<p>3.80\u00b13.61<\/p>\n<\/td>\n<td>\n<p>4.07\u00b12.67<\/p>\n<\/td>\n<td>\n<p>0.23\u00b15.27<\/p>\n<\/td>\n<td>\n<p>0.13\u00b18.04<\/p>\n<\/td>\n<\/tr>\n<tr 
class=\"row-odd\">\n<td>\n<p>RandomCenterCrop<\/p>\n<\/td>\n<td>\n<p>2.93\u00b11.29<\/p>\n<\/td>\n<td>\n<p>2.81\u00b11.38<\/p>\n<\/td>\n<td>\n<p>2.88\u00b12.34<\/p>\n<\/td>\n<td>\n<p>0.13\u00b12.20<\/p>\n<\/td>\n<td>\n<p>0.07\u00b19.41<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/dusk\/64\/null\/downloads.png\" style=\"height:50px;display:inline\"> Torchvision Datasets<\/h3>\n<hr \/>\n<ul>\n<li>Torchvision provides many built-in datasets in the <code>torchvision.datasets<\/code> module, as well as utility classes for building your own datasets.<\/li>\n<li><a href=\"https:\/\/pytorch.org\/vision\/stable\/datasets.html#datasets\">Torchvision Datasets<\/a><\/li>\n<li>Examples: <code>torchvision.datasets.CelebA<\/code>, <code>torchvision.datasets.Flowers102<\/code>, <code>torchvision.datasets.ImageNet<\/code>.<\/li>\n<li>If you want to create a dataset from your own images, you can use <a href=\"https:\/\/pytorch.org\/vision\/stable\/generated\/torchvision.datasets.ImageFolder.html#torchvision.datasets.ImageFolder\"><code>torchvision.datasets.ImageFolder<\/code><\/a> and <a href=\"https:\/\/pytorch.org\/vision\/stable\/generated\/torchvision.datasets.VisionDataset.html#torchvision.datasets.VisionDataset\"><code>torchvision.datasets.VisionDataset<\/code><\/a>, or you can implement the <code>Dataset<\/code> class yourself. Below is an example of a custom Dataset class implemented with PyTorch, including the <code>__init__<\/code>, <code>__len__<\/code>, and <code>__getitem__<\/code> methods:<\/li>\n<\/ul>\n<pre><code class=\"language-python\">import torch\nfrom torch.utils.data import Dataset\nfrom torchvision import 
transforms\nfrom PIL import Image\nimport os\n\nclass CustomDataset(Dataset):\n    def __init__(self, image_dir, transform=None):\n        &quot;&quot;&quot;\n        Args:\n            image_dir (str): Path to the directory with images.\n            transform (callable, optional): Optional transform to be applied\n                on a sample.\n        &quot;&quot;&quot;\n        self.image_dir = image_dir\n        self.image_filenames = sorted(os.listdir(image_dir))  # sort for a deterministic ordering\n        self.transform = transform\n\n    def __len__(self):\n        return len(self.image_filenames)\n\n    def __getitem__(self, idx):\n        if torch.is_tensor(idx):\n            idx = idx.tolist()\n\n        img_name = os.path.join(self.image_dir, self.image_filenames[idx])\n        image = Image.open(img_name).convert(&#039;RGB&#039;)\n\n        if self.transform:\n            image = self.transform(image)\n\n        return image\n<\/code><\/pre>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/clouds\/100\/000000\/dog.png\" style=\"height:50px;display:inline\"> The CIFAR-10 Dataset<\/h2>\n<hr \/>\n<ul>\n<li>\n<p>The CIFAR-10 dataset consists of 60000 32x32 color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.<\/p>\n<\/li>\n<li>\n<p>There is also CIFAR-100, which has 100 classes.<\/p>\n<\/li>\n<li>\n<p><a href=\"https:\/\/www.cs.toronto.edu\/~kriz\/cifar.html\">Official Site<\/a><\/p>\n<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913200134234.png\" style=\"height:400px\">\n<\/p>\n<pre><code class=\"language-python\"># define pre-processing steps on the images\n# also called &quot;data augmentation&quot; (only done for the train set)\n# to use &#039;kornia&#039; instead of torchvision, see 
the example below\n\ntransform_train = transforms.Compose([\n    transforms.RandomCrop(32, padding=4), # input is PIL image\n    transforms.RandomHorizontalFlip(), # input is PIL image, can also set the probability parameter &#039;p&#039;\n    transforms.ToTensor(),  # uint8 values in [0, 255] -&gt; float tensor with values [0, 1] \n    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),  # standardization with per-channel [mu, sigma] of the RGB values\n    # notice the order: first image transformations on PIL, then ToTensor, then normalization.\n])\n\n# how are the standardization params (mu, sigma) calculated? \n# iterate over the entire train set to get the R, G, and B values of all images, then compute their mean and std\n\n# normalize the test set the same way as the training set, but without augmentation\ntransform_test = transforms.Compose([\n    transforms.ToTensor(),\n    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n])\n\n# test-time augmentations (TTA): for a more robust prediction, we can apply N augs on each input, get N scores, and take the average of the scores.\n\n# load dataset\nclasses = (&#039;plane&#039;, &#039;car&#039;, &#039;bird&#039;, &#039;cat&#039;, &#039;deer&#039;, \n           &#039;dog&#039;, &#039;frog&#039;, &#039;horse&#039;, &#039;ship&#039;, &#039;truck&#039;)\n\ntrainset = torchvision.datasets.CIFAR10(\n    root=&#039;.\/datasets&#039;, train=True, download=True, transform=transform_train)\n\ntestset = torchvision.datasets.CIFAR10(\n    root=&#039;.\/datasets&#039;, train=False, download=True, transform=transform_test)<\/code><\/pre>\n<pre><code>Files already downloaded and verified\nFiles already downloaded and verified<\/code><\/pre>\n<pre><code class=\"language-python\"># let&#039;s see some of the images\ndef convert_to_imshow_format(image): \n    # first convert back to [0,1] range\n    mean = torch.tensor([0.4914, 0.4822, 0.4465])\n    std = torch.tensor([0.2023, 0.1994, 0.2010])\n    image = image * std[:, None, 
None] + mean[:, None, None]  # [:, None, None] changes the shape from [N] to [N, 1, 1]\n    image = image.clamp(0, 1).numpy()\n    # convert from CHW to HWC\n    # from 3x32x32 to 32x32x3\n    return image.transpose(1, 2, 0)\n\ntrainloader = torch.utils.data.DataLoader(trainset, \n                                          batch_size=4,\n                                          shuffle=True)\ndataiter = iter(trainloader)\nimages, labels = next(dataiter)\n\nfig, axes = plt.subplots(1, len(images), figsize=(10, 2.5))\nfor idx, image in enumerate(images):\n    axes[idx].imshow(convert_to_imshow_format(image))\n    axes[idx].set_title(classes[labels[idx]])\n    axes[idx].set_xticks([])\n    axes[idx].set_yticks([])<\/code><\/pre>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913200246562.png\" style=\"height:200px\">\n<\/p>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/bubbles\/50\/000000\/fire-element.png\" style=\"height:50px;display:inline\"> Building a CNN-Classifier for CIFAR-10 with PyTorch<\/h2>\n<hr \/>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913200323766.png\" style=\"height:200px\">\n<\/p>\n<pre><code class=\"language-python\">class CifarCNN(nn.Module):\n    &quot;&quot;&quot;CNN for the CIFAR-10 Dataset&quot;&quot;&quot;\n\n    def __init__(self):\n        &quot;&quot;&quot;CNN Builder.&quot;&quot;&quot;\n        super(CifarCNN, self).__init__()\n\n        self.conv_layer = nn.Sequential(\n\n            # Conv Layer block 1\n            nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1),\n            # `padding=1` is the same as `padding=&#039;same&#039;` for a 3x3 kernel size\n            nn.BatchNorm2d(32),\n            nn.ReLU(inplace=True),\n            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, 
padding=1),\n            nn.ReLU(inplace=True),\n            nn.MaxPool2d(kernel_size=2, stride=2),  # input_resolution \/ 2 = 32 \/ 2 = 16\n\n            # Conv Layer block 2\n            nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1),\n            nn.BatchNorm2d(128),\n            nn.ReLU(inplace=True),\n            nn.Conv2d(in_channels=128, out_channels=128, kernel_size=3, padding=1),\n            nn.ReLU(inplace=True),\n            nn.MaxPool2d(kernel_size=2, stride=2),  # input_resolution \/ 4 = 32 \/ 4 = 8\n            nn.Dropout2d(p=0.05),\n\n            # Conv Layer block 3\n            nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, padding=1),\n            nn.BatchNorm2d(256),\n            nn.ReLU(inplace=True),\n            nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, padding=1),\n            nn.ReLU(inplace=True),\n            nn.MaxPool2d(kernel_size=2, stride=2),  # input_resolution \/ 8 = 32 \/ 8 = 4\n            # the output dimensions: [batch_size, 256, h=input_resolution \/ 8, w=input_resolution \/ 8]\n        )\n\n        self.fc_layer = nn.Sequential(\n            nn.Dropout(p=0.1),\n            nn.Linear(4096, 1024),  # 256 * 4 * 4 = 4096\n            nn.ReLU(inplace=True),\n            nn.Linear(1024, 512),\n            nn.ReLU(inplace=True),\n            nn.Dropout(p=0.1),\n            nn.Linear(512, 10)\n        )\n\n    def forward(self, x):\n        &quot;&quot;&quot;Perform forward.&quot;&quot;&quot;\n\n        # conv layers\n        x = self.conv_layer(x)  # [batch_size, channels=256, h_f=4, w_f=4]\n\n        # flatten - can also use nn.Flatten() in __init__() instead\n        x = x.view(x.size(0), -1)  # [batch_size, channels * h_f * w_f=4096]\n\n        # fc layer\n        x = self.fc_layer(x)  # [batch_size, n_classes=10]\n\n        return x<\/code><\/pre>\n<h3>More Conv2D Properties<\/h3>\n<hr \/>\n<ul>\n<li><a 
href=\"https:\/\/pytorch.org\/docs\/stable\/generated\/torch.nn.Conv2d.html\"><code>torch.nn.Conv2d<\/code><\/a> has more hyperparameters you should know about.<\/li>\n<li>Full signature: <code>torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode=&#039;zeros&#039;, device=None, dtype=None)<\/code><\/li>\n<li><code>dilation<\/code> controls the spacing between the kernel points; this is also known as the \u00e0 trous algorithm (&quot;atrous&quot; convolution). It is commonly used to enlarge the receptive field without adding parameters (e.g., in image segmentation and generation tasks, where the output is an image). <a href=\"https:\/\/github.com\/vdumoulin\/conv_arithmetic\/blob\/master\/README.md\">Visualization of <code>dilation<\/code><\/a>.<\/li>\n<li><code>groups<\/code> controls the connections between inputs and outputs. Both <code>in_channels<\/code> and <code>out_channels<\/code> must be divisible by <code>groups<\/code>. Examples:<\/li>\n<li>With <code>groups=1<\/code> (the default), all inputs are convolved with all outputs.<\/li>\n<li>With <code>groups=in_channels<\/code>, each input channel is convolved with its own set of kernels (of size $\\frac{\\text{out-channels}}{\\text{in-channels}}$). When <code>groups == in_channels<\/code>, this is called a <strong>depthwise convolution<\/strong>.<\/li>\n<li><code>padding_mode<\/code> (since <code>torch<\/code> 1.1) controls the padding value and can be 
<code>&#039;zeros&#039;<\/code>, <code>&#039;reflect&#039;<\/code>, <code>&#039;replicate&#039;<\/code>, or <code>&#039;circular&#039;<\/code>. Default: <code>&#039;zeros&#039;<\/code>.<\/li>\n<li>Since <code>torch<\/code> 1.13, you can use strings to control the padding. <code>padding=&#039;valid&#039;<\/code> is the same as no padding. <code>padding=&#039;same&#039;<\/code> pads the input so that the output has the same shape as the input.<\/li>\n<\/ul>\n<table>\n<thead>\n<tr>\n<th colspan=\"2\"><code>same<\/code> convolution<\/th>\n<\/tr>\n<tr>\n<th>kernel size<\/th>\n<th>padding<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>3<\/td>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>2<\/td>\n<\/tr>\n<tr>\n<td>7<\/td>\n<td>3<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<pre><code class=\"language-python\"># how can we calculate the output of the convolution automatically?\ndummy_input = torch.zeros([1, 3, 32, 32])\ndummy_model = CifarCNN()\ndummy_output = dummy_model.conv_layer(dummy_input)\nprint(dummy_output.shape)\ndummy_output = dummy_output.view(dummy_output.size(0), -1)\nprint(dummy_output.shape)\n# how many weights (trainable parameters) do we have in our model?\nnum_trainable_params = sum([p.numel() for p in dummy_model.parameters() if p.requires_grad])\nprint(&quot;num trainable weights: &quot;, num_trainable_params)<\/code><\/pre>\n<pre><code>torch.Size([1, 256, 4, 4])\ntorch.Size([1, 4096])\nnum trainable weights:  5852170<\/code><\/pre>\n<pre><code class=\"language-python\"># calculate the model size on disk\nparam_size = 0\nfor param in dummy_model.parameters():\n    param_size += param.nelement() * param.element_size()\nbuffer_size = 0\nfor buffer in dummy_model.buffers():\n    buffer_size += buffer.nelement() * buffer.element_size()\nsize_all_mb = (param_size + buffer_size) \/ 1024 ** 
2\nprint(f&quot;model size: {size_all_mb:.2f} MB&quot;)<\/code><\/pre>\n<pre><code>model size: 22.33 MB<\/code><\/pre>\n<pre><code class=\"language-python\"># time to train our model\n# hyper-parameters\nbatch_size = 128\nlearning_rate = 1e-4\nepochs = 20\n\n# dataloaders - creating batches and shuffling the data\ntrainloader = torch.utils.data.DataLoader(\n    trainset, batch_size=batch_size, shuffle=True, num_workers=2)\ntestloader = torch.utils.data.DataLoader(\n    testset, batch_size=batch_size, shuffle=False, num_workers=2)\n\n# device - cpu or gpu?\ndevice = torch.device(&quot;cuda:0&quot; if torch.cuda.is_available() else &quot;cpu&quot;)\n\n# loss criterion\ncriterion = nn.CrossEntropyLoss()  # accepts &#039;logits&#039; - unnormalized scores (no need to apply `softmax` manually)\n\n# build our model and send it to the device\nmodel = CifarCNN().to(device)  # no need for parameters as we already defined them in the class\n\n# optimizer - SGD, Adam, RMSProp...\noptimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)<\/code><\/pre>\n<pre><code class=\"language-python\"># function to calculate the accuracy of the model\ndef calculate_accuracy(model, dataloader, device):\n    model.eval()  # put in evaluation mode: turn off Dropout, BatchNorm uses learned statistics\n    total_correct = 0\n    total_images = 0\n    confusion_matrix = np.zeros([10, 10], int)\n    with torch.no_grad():\n        for data in dataloader:\n            images, labels = data\n            images = images.to(device)\n            labels = labels.to(device)\n            outputs = model(images)\n            _, predicted = torch.max(outputs.data, 1)\n            total_images += labels.size(0)\n            total_correct += (predicted == labels).sum().item()\n            for i, l in enumerate(labels):\n                confusion_matrix[l.item(), predicted[i].item()] += 1 \n\n    model_accuracy = total_correct \/ total_images * 100\n    return model_accuracy, 
confusion_matrix<\/code><\/pre>\n<pre><code class=\"language-python\"># training loop\nfor epoch in range(1, epochs + 1):\n    model.train()  # put in training mode: turn on Dropout, BatchNorm uses the batch&#039;s statistics\n    running_loss = 0.0\n    epoch_time = time.time()\n    for i, data in enumerate(trainloader, 0):\n        # get the inputs\n        inputs, labels = data\n        # send them to device\n        inputs = inputs.to(device)\n        labels = labels.to(device)\n        # augmentation with `kornia` would happen here, e.g.: inputs = aug_list(inputs)\n\n        # forward + backward + optimize\n        outputs = model(inputs)  # forward pass\n        loss = criterion(outputs, labels)  # calculate the loss\n        # always the same 3 steps\n        optimizer.zero_grad()  # zero the parameter gradients\n        loss.backward()  # backpropagation\n        optimizer.step()  # update parameters\n\n        # print statistics\n        running_loss += loss.item()\n\n    # Normalizing the loss by the total number of train batches\n    running_loss \/= len(trainloader)\n\n    # Calculate training\/test set accuracy of the existing model\n    train_accuracy, _ = calculate_accuracy(model, trainloader, device)\n    test_accuracy, _ = calculate_accuracy(model, testloader, device)\n\n    log = &quot;Epoch: {} | Loss: {:.4f} | Training accuracy: {:.3f}% | Test accuracy: {:.3f}% | &quot;.format(epoch, running_loss, train_accuracy, test_accuracy)\n    # with f-strings\n    # log = f&quot;Epoch: {epoch} | Loss: {running_loss:.4f} | Training accuracy: {train_accuracy:.3f}% | Test accuracy: {test_accuracy:.3f}% |&quot;\n    epoch_time = time.time() - epoch_time\n    log += &quot;Epoch Time: {:.2f} secs&quot;.format(epoch_time)\n    # with f-strings\n    # log += f&quot;Epoch Time: {epoch_time:.2f} secs&quot;\n    print(log)\n\n    # save model\n    if epoch % 20 == 0:\n        print(&#039;==&gt; Saving model ...&#039;)\n        state = {\n            &#039;net&#039;: 
model.state_dict(),\n            &#039;epoch&#039;: epoch,\n        }\n        if not os.path.isdir(&#039;checkpoints&#039;):\n            os.mkdir(&#039;checkpoints&#039;)\n        torch.save(state, &#039;.\/checkpoints\/cifar_cnn_ckpt.pth&#039;)\n\nprint(&#039;==&gt; Finished Training ...&#039;)<\/code><\/pre>\n<pre><code>Epoch: 1 | Loss: 1.5280 | Training accuracy: 55.570% | Test accuracy: 57.590% | Epoch Time: 36.34 secs\nEpoch: 2 | Loss: 1.0987 | Training accuracy: 63.086% | Test accuracy: 63.620% | Epoch Time: 35.00 secs\nEpoch: 3 | Loss: 0.9198 | Training accuracy: 70.398% | Test accuracy: 69.990% | Epoch Time: 35.35 secs\nEpoch: 4 | Loss: 0.8134 | Training accuracy: 74.082% | Test accuracy: 73.670% | Epoch Time: 36.50 secs\nEpoch: 5 | Loss: 0.7299 | Training accuracy: 77.630% | Test accuracy: 77.340% | Epoch Time: 34.80 secs\nEpoch: 6 | Loss: 0.6658 | Training accuracy: 78.758% | Test accuracy: 77.900% | Epoch Time: 34.92 secs\nEpoch: 7 | Loss: 0.6087 | Training accuracy: 80.418% | Test accuracy: 78.830% | Epoch Time: 35.32 secs\nEpoch: 8 | Loss: 0.5652 | Training accuracy: 82.702% | Test accuracy: 80.890% | Epoch Time: 34.69 secs\nEpoch: 9 | Loss: 0.5366 | Training accuracy: 82.044% | Test accuracy: 79.730% | Epoch Time: 34.59 secs\nEpoch: 10 | Loss: 0.4996 | Training accuracy: 85.266% | Test accuracy: 82.950% | Epoch Time: 35.04 secs\nEpoch: 11 | Loss: 0.4750 | Training accuracy: 85.068% | Test accuracy: 81.760% | Epoch Time: 35.28 secs\nEpoch: 12 | Loss: 0.4455 | Training accuracy: 85.540% | Test accuracy: 82.720% | Epoch Time: 35.06 secs\nEpoch: 13 | Loss: 0.4245 | Training accuracy: 86.686% | Test accuracy: 83.560% | Epoch Time: 35.26 secs\nEpoch: 14 | Loss: 0.4018 | Training accuracy: 86.924% | Test accuracy: 83.880% | Epoch Time: 34.72 secs\nEpoch: 15 | Loss: 0.3828 | Training accuracy: 88.926% | Test accuracy: 84.940% | Epoch Time: 35.05 secs\nEpoch: 16 | Loss: 0.3644 | Training accuracy: 88.426% | Test accuracy: 85.050% | Epoch Time: 35.85 
secs\nEpoch: 17 | Loss: 0.3518 | Training accuracy: 89.314% | Test accuracy: 84.990% | Epoch Time: 35.33 secs\nEpoch: 18 | Loss: 0.3346 | Training accuracy: 90.434% | Test accuracy: 86.040% | Epoch Time: 34.66 secs\nEpoch: 19 | Loss: 0.3175 | Training accuracy: 90.606% | Test accuracy: 85.680% | Epoch Time: 35.01 secs\nEpoch: 20 | Loss: 0.3000 | Training accuracy: 91.394% | Test accuracy: 86.470% | Epoch Time: 34.60 secs\n==> Saving model ...\n==> Finished Training ...<\/code><\/pre>\n<pre><code class=\"language-python\"># load model, calculate accuracy and confusion matrix\nmodel = CifarCNN().to(device)\nstate = torch.load(&#039;.\/checkpoints\/cifar_cnn_ckpt.pth&#039;, map_location=device)\nmodel.load_state_dict(state[&#039;net&#039;])\n# note: `map_location` is necessary if you trained on the GPU and want to run inference on the CPU\n\ntest_accuracy, confusion_matrix = calculate_accuracy(model, testloader, device)\nprint(&quot;test accuracy: {:.3f}%&quot;.format(test_accuracy))\n\n# plot confusion matrix\nfig, ax = plt.subplots(1,1,figsize=(8,6))\nax.matshow(confusion_matrix, aspect=&#039;auto&#039;, vmin=0, vmax=1000, cmap=plt.get_cmap(&#039;Blues&#039;))\nplt.ylabel(&#039;Actual Category&#039;)\nplt.yticks(range(10), classes)\nplt.xlabel(&#039;Predicted Category&#039;)\nplt.xticks(range(10), classes)\nplt.show()<\/code><\/pre>\n<pre><code>test accuracy: 86.470%<\/code><\/pre>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913200436344.png\" style=\"height:300px\">\n<\/p>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/dusk\/64\/000000\/doughnut-chart.png\" style=\"height:50px;display:inline\"> Visualizing CNN Filters<\/h2>\n<hr \/>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/color\/96\/000000\/xray.png\" style=\"height:50px;display:inline\"> Visualizing Layer Output<\/h2>\n<hr \/>\n<ul>\n<li>We can see which neurons are active for 
every input image.<\/li>\n<li>This way we get a better understanding of what the network sees during the forward pass, which likely affects the final prediction.<\/li>\n<li>Let&#039;s see an example by <a href=\"https:\/\/github.com\/sar-gupta\/convisualize_nb\/blob\/master\/cnn-visualize.ipynb\">Sarthak Gupta<\/a>.<\/li>\n<\/ul>\n<pre><code class=\"language-python\"># helper functions\ndef to_grayscale(image):\n    &quot;&quot;&quot;\n    input is (d,w,h)\n    converts a 3D image tensor to a grayscale image by averaging over the channels\n    &quot;&quot;&quot;\n    num_channels = image.shape[0]  # save the channel count before summing, so we divide by the right value\n    image = torch.sum(image, dim=0)\n    image = torch.div(image, num_channels)\n    return image\n\ndef normalize(image, device=torch.device(&quot;cpu&quot;)):\n    normalize = transforms.Normalize(\n        mean=[0.485, 0.456, 0.406],\n        std=[0.229, 0.224, 0.225]\n    )\n    preprocess = transforms.Compose([\n        transforms.Resize((224,224)),\n        transforms.ToTensor(),\n        normalize\n    ])\n    image = preprocess(image).unsqueeze(0).to(device)\n    return image\n\ndef predict(image, model, labels=None):\n    _, index = model(image).data[0].max(0)\n    if labels is not None:\n        return str(index.item()), labels[str(index.item())][1]\n    else:\n        return str(index.item()) \n\ndef deprocess(image, device=torch.device(&quot;cpu&quot;)):\n    return image * torch.tensor([0.229, 0.224, 0.225]).to(device) + torch.tensor([0.485, 0.456, 0.406]).to(device)\n\ndef load_image(path):\n    image = Image.open(path)\n    plt.imshow(image)\n    plt.title(&quot;Image loaded successfully&quot;)\n    return image<\/code><\/pre>\n<pre><code class=\"language-python\"># load sample image\nkitten_img = load_image(&quot;.\/assets\/kitten.jpg&quot;)<\/code><\/pre>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913200603825.png\" style=\"height:400px\">\n<\/p>\n<h3><img decoding=\"async\" 
src=\"https:\/\/img.icons8.com\/nolan\/64\/download-from-cloud.png\" style=\"height:50px;display:inline\"> Torchvision Pre-Trained Models<\/h3>\n<hr \/>\n<ul>\n<li>The <code>torchvision.models<\/code> subpackage contains model definitions for different tasks, including: image classification, pixel-wise semantic segmentation, object detection, instance segmentation, person keypoint detection, video classification, and optical flow.<\/li>\n<li>You can see everything that is available here - <a href=\"https:\/\/pytorch.org\/vision\/stable\/models.html\">Models and pre-trained weights<\/a>.<\/li>\n<li>In code, you can use <a href=\"https:\/\/pytorch.org\/vision\/stable\/models.html#model-registration-mechanism\"><code>torchvision.models.list_models<\/code><\/a> to get a list of all available models.<\/li>\n<\/ul>\n<h4>Output of Each Layer<\/h4>\n<hr \/>\n<pre><code class=\"language-python\">def layer_outputs(image, model):\n    modulelist = list(model.features.modules())\n    outputs = []\n    names = []\n    for index, layer in enumerate(modulelist[1:]):\n        image = layer(image)\n        outputs.append(image)\n        names.append(str(index)+str(layer))\n\n    output_im = []\n    for i in outputs:\n        i = i.squeeze(0)\n        temp = to_grayscale(i)  # take the mean over the channels\n        output_im.append(temp.data.cpu().numpy())\n\n    fig = plt.figure(figsize=(10, 20))\n\n    for i in range(len(output_im)):\n        a = fig.add_subplot(8, 4, i+1)\n        plt.imshow(output_im[i])\n        a.set_axis_off()\n        a.set_title(names[i].partition(&#039;(&#039;)[0], fontsize=10)\n    plt.tight_layout()\n#     plt.savefig(&#039;layer_outputs.jpg&#039;, bbox_inches=&#039;tight&#039;)<\/code><\/pre>\n<pre><code 
class=\"language-python\">layer_outputs(prep_img, model)<\/code><\/pre>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913200654700.png\" style=\"height:400px\">\n<\/p>\n<p>The deeper the layer, the more abstract and less concrete the extracted features become. The network tries to capture higher-level semantic information rather than concrete pixel-level information.<\/p>\n<h4>Output of Each Filter for a Certain Layer<\/h4>\n<hr \/>\n<pre><code class=\"language-python\">def filter_outputs(image, model, layer_to_visualize):\n    modulelist = list(model.features.modules())\n    if layer_to_visualize &lt; 0:\n        layer_to_visualize += 31  # negative indices wrap around (this model&#039;s `features` has 31 modules)\n    output = None\n    name = None\n    for count, layer in enumerate(modulelist[1:]):\n        image = layer(image)\n        if count == layer_to_visualize: \n            output = image\n            name = str(layer)\n\n    filters = []\n    output = output.data.squeeze().cpu().numpy()\n    for i in range(output.shape[0]):\n        filters.append(output[i,:,:])\n\n    fig = plt.figure(figsize=(10, 10))\n\n    for i in range(int(np.sqrt(len(filters))) * int(np.sqrt(len(filters)))):\n        ax = fig.add_subplot(int(np.sqrt(len(filters))), int(np.sqrt(len(filters))), i+1)\n        imgplot = ax.imshow(filters[i])\n        ax.set_axis_off()\n    plt.tight_layout()<\/code><\/pre>\n<pre><code class=\"language-python\">filter_outputs(prep_img, model, 0)<\/code><\/pre>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913200743868.png\" style=\"height:400px\">\n<\/p>\n<h4>CNN Explainer Demo<\/h4>\n<hr \/>\n<p><a href=\"https:\/\/poloclub.github.io\/cnn-explainer\/\">Open Demo<\/a><\/p>\n<p align=\"center\">\n  <img 
decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913200825949.gif\" style=\"height:300px\">\n<\/p>\n<p><a href=\"https:\/\/arxiv.org\/pdf\/1311.2901\">ZFNet: Visualizing and Understanding Convolutional Networks<\/a><\/p>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/cute-clipart\/64\/000000\/easter-egg.png\" style=\"height:50px;display:inline\"> Are CNNs the Holy Grail? The Problem with CNNs<\/h2>\n<hr \/>\n<h4>Deep neural networks are sensitive to adversarial attacks.<\/h4>\n<hr \/>\n<ul>\n<li>For example: consider the images below. On the left is an image of a pig, which is correctly classified by a state-of-the-art convolutional neural network.<\/li>\n<li>After slightly perturbing the image (each pixel, in the [0, 1] range, is changed by at most 0.005), the network now returns the class &quot;airliner&quot; with high confidence.<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913200917886.png\" style=\"height:200px\">\n<\/p>\n<p><a href=\"http:\/\/gradientscience.org\/intro_adversarial\/\">Image Source<\/a><\/p>\n<h4>Recognition algorithms generalize poorly to new environments<\/h4>\n<hr \/>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913201003738.png\" style=\"height:400px\">\n<\/p>\n<p><a href=\"https:\/\/arxiv.org\/pdf\/1807.04975.pdf\">Recognition in Terra Incognita (Beery et al., 
2018)<\/a><\/p>\n<p>Generalization refers to a model's ability to perform on unseen data beyond its training set. The figure shows a recognition algorithm applied to the same object (a cow) in different environments (e.g., pasture, waterside, beach), revealing how the algorithm's accuracy and confidence vary with the environment.<\/p>\n<ul>\n<li>Environment A (pasture): the algorithm recognizes the cow, along with other relevant labels, with very high confidence.<\/li>\n<li>Environment B (waterside): confidence remains high, but more water-related labels are returned.<\/li>\n<li>Environment C (beach): recognition confidence drops slightly, and more beach-related labels appear.<\/li>\n<\/ul>\n<p>These results show that although the algorithm can still recognize the main object (the cow) in a new environment, it is affected by the change of scene and returns additional environment-related labels.<\/p>\n<h4>Neural networks often exhibit undesirable biases<\/h4>\n<hr \/>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913201051284.png\" 
style=\"height:400px\">\n<\/p>\n<ul>\n<li>Why models learn these biases is not yet well understood.<\/li>\n<li>One hypothesis is that the biases a neural network learns stem from imbalances in its training data. For instance, if pictures of certain athletes appear more frequently in the training set, the model's predictions for such images skew toward particular classes.<br \/>\nFactors such as image background, skin color, and sports uniforms can push the model toward misleading predictions.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/arxiv.org\/abs\/1711.11443\">ConvNets and ImageNet Beyond Accuracy: Understanding Mistakes and Uncovering Biases (Stock and Cisse, 2018)<\/a><\/p>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/clouds\/100\/000000\/lightning-bolt.png\" style=\"height:50px;display:inline\"> CNNs Applications in Computer Vision<\/h2>\n<hr \/>\n<p>See 1000+ computer vision tasks with benchmarks and papers on <a href=\"https:\/\/paperswithcode.com\/area\/computer-vision\">PapersWithCode.com<\/a>.<\/p>\n<ul>\n<li><strong>Object Detection<\/strong><\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913201149215.png\" style=\"height:400px\">\n<\/p>\n<p><a href=\"https:\/\/medium.com\/better-programming\/real-time-object-detection-on-gpus-in-10-minutes-6e8c9b857bb3\">Source<\/a><\/p>\n<ul>\n<li><strong>Semantic Segmentation<\/strong><\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913201239132.png\" 
style=\"height:400px\">\n<\/p>\n<p><a href=\"https:\/\/missinglink.ai\/guides\/computer-vision\/image-segmentation-deep-learning-methods-applications\/\">Source<\/a><\/p>\n<ul>\n<li><strong>Super Resolution<\/strong><\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913201332718.png\" style=\"height:400px\">\n<\/p>\n<p><a href=\"https:\/\/arxiv.org\/pdf\/1609.04802.pdf\">Source<\/a><\/p>\n<ul>\n<li><strong>Style Transfer<\/strong><\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913201418190.png\" style=\"height:400px\">\n<\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/light-on-math-machine-learning-intuitive-guide-to-neural-style-transfer-ef88e46697ee\">Source<\/a><\/p>\n<ul>\n<li><strong>Image Editing<\/strong><\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913201459288.png\" style=\"height:400px\">\n<\/p>\n<p><a href=\"http:\/\/people.csail.mit.edu\/junyanz\/projects\/gvm\/\">Source<\/a><\/p>\n<ul>\n<li><strong>Image Generation<\/strong><\/li>\n<\/ul>\n<p>StyleGAN (V1-3) - <a href=\"https:\/\/thispersondoesnotexist.com\/\">thispersondoesnotexist.com<\/a><\/p>\n<ul>\n<li><strong>Multi-Signals<\/strong><br \/>\nSynthesizing Obama: Learning Lip Sync from Audio<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/07\/20240913201534800.png\" style=\"height:400px\">\n<\/p>\n<p><a href=\"http:\/\/grail.cs.washington.edu\/projects\/AudioToObama\/\">Source<\/a><\/p>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/dusk\/64\/000000\/prize.png\" style=\"height:50px;display:inline\"> Credits<\/h2>\n<hr 
\/>\n<ul>\n<li>Icons made by <a href=\"https:\/\/www.flaticon.com\/authors\/becris\" title=\"Becris\">Becris<\/a> from <a href=\"https:\/\/www.flaticon.com\/\" title=\"Flaticon\">www.flaticon.com<\/a><\/li>\n<li>Icons from <a href=\"https:\/\/icons8.com\/\">Icons8.com<\/a> - <a href=\"https:\/\/icons8.com\">https:\/\/icons8.com<\/a><\/li>\n<li>Some slides from CS131 and CS231n (Stanford)<\/li>\n<li>Deep Learning with Pytorch on CIFAR10 Dataset - <a href=\"https:\/\/www.stefanfiott.com\/machine-learning\/cifar-10-classifier-using-cnn-in-pytorch\/\">Zhenye's Blog<\/a><\/li>\n<li>CIFAR-10 Classifier Using CNN in PyTorch - <a href=\"https:\/\/www.stefanfiott.com\/machine-learning\/cifar-10-classifier-using-cnn-in-pytorch\/\">Stefan Fiott<\/a><\/li>\n<li><a href=\"https:\/\/taldatech.github.io\">Tal Daniel<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Deep Learning create by Arwin Yu Tutorial 02 &#8211; Convolut [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1847,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[18,24],"tags":[16,19],"class_list":["post-1810","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-18","category-24","tag-python","tag-19"],"_links":{"self":[{"href":"http:\/\/gnn.club\/index.php?rest_route=\/wp\/v2\/posts\/1810","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/gnn.club\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/gnn.club\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/gnn.club\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/gnn.club\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1810"}],"version-history":[{"count":16,"href":"http:\/\/gnn.club\/index.php?rest_route=\/wp\/v2\/posts\/1810\/revisions"}],"predecessor-version":[{"id":1972,"href":"http:\/\/gnn.c
lub\/index.php?rest_route=\/wp\/v2\/posts\/1810\/revisions\/1972"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/gnn.club\/index.php?rest_route=\/wp\/v2\/media\/1847"}],"wp:attachment":[{"href":"http:\/\/gnn.club\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1810"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/gnn.club\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1810"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/gnn.club\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1810"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}