ResNets

Residual Block

$$
z^{[l+1]} = W^{[l+1]} a^{[l]} + b^{[l+1]}\\
a^{[l+1]} = g(z^{[l+1]})\\
z^{[l+2]} = W^{[l+2]} a^{[l+1]} + b^{[l+2]}\\
a^{[l+2]} = g(z^{[l+2]})
$$
  • Given a network like the one above, the forward pass can be summarized as follows:
    1. $a^{[l]}$
    2. Linear
    3. ReLU
    4. $a^{[l+1]}$
    5. Linear
    6. ReLU
    7. $a^{[l+2]}$
  • This sequence of steps is called the main path.
  • Adding $a^{[l]}$ to the output of the Linear step in 5, before the final ReLU, lets the activation bypass the intermediate layers. This is called a shortcut or skip connection.
    • → $a^{[l+2]} = g(z^{[l+2]} + a^{[l]})$
    • This helps mitigate the vanishing gradient problem, because the shortcut gives the gradient a direct path back to earlier layers.
    • As a result, training continues to work well even as the network gets deeper.
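
The residual block above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the helper names (`relu`, `residual_block`), the layer width `n = 4`, and the random weights are hypothetical and do not come from the notes; `g` is taken to be ReLU, and both layers have the same width so that $z^{[l+2]} + a^{[l]}$ is well defined.

```python
import numpy as np

def relu(z):
    # g(z) = max(z, 0), applied element-wise
    return np.maximum(z, 0.0)

def residual_block(a_l, W1, b1, W2, b2):
    """Two-layer residual block: a[l+2] = g(z[l+2] + a[l])."""
    z1 = W1 @ a_l + b1        # z[l+1] = W[l+1] a[l] + b[l+1]
    a1 = relu(z1)             # a[l+1] = g(z[l+1])
    z2 = W2 @ a1 + b2         # z[l+2] = W[l+2] a[l+1] + b[l+2]
    return relu(z2 + a_l)     # shortcut: add a[l] before the final ReLU

# Hypothetical sizes and values, chosen only for illustration.
rng = np.random.default_rng(0)
n = 4
a_l = relu(rng.standard_normal(n))   # a[l] is itself a ReLU output, so it is >= 0
W1, b1 = rng.standard_normal((n, n)), rng.standard_normal(n)
W2, b2 = rng.standard_normal((n, n)), rng.standard_normal(n)

out = residual_block(a_l, W1, b1, W2, b2)

# With all weights and biases at zero, the main path contributes nothing
# and the block passes a[l] through unchanged.
zero_W, zero_b = np.zeros((n, n)), np.zeros(n)
identity_out = residual_block(a_l, zero_W, zero_b, zero_W, zero_b)
```

The zero-weight case hints at why extra residual blocks tend not to hurt a deep network: when the main path outputs zero, the block reduces to $g(a^{[l]}) = a^{[l]}$, which holds here because $a^{[l]}$ is non-negative.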