📄 Semi-supervised Semantic Segmentation for Remote Sensing Images via Multi-scale Uncertainty Consistency and Cross-Teacher-Student Attention

Title: Semi-supervised Semantic Segmentation for Remote Sensing Images via Multi-scale Uncertainty Consistency and Cross-Teacher-Student Attention
Authors:
Affiliation:
Conference: IEEE Transactions on Geoscience and Remote Sensing (2025)
DOI: 10.1109/TGRS.2025.3585489
Keywords:

Research Background

  • Semantic Segmentation for Remote Sensing Images - plays a key role in applications such as disaster response, land-use analysis, and environmental monitoring
    • Growing number of Earth observation satellites
      • Large-scale remote sensing imagery is collected continuously
      • Wider scope for applying deep-learning-based analysis
    • Characteristics of remote sensing image data
      • Pixel-level labeling is required
      • Many classes and complex labeling rules
      • Fully supervised learning is hard to apply to large-scale data
    • As a result, semi-supervised learning is gaining attention as an effective alternative
  • Existing Semi-supervised Semantic Segmentation methods for natural images and their limitations
    • Pseudo-label based Self-training
      • pros:
        Uses model predictions as pseudo-labels so that unlabeled data can be used for training
      • cons:
        Errors accumulate because of incorrect pseudo-labels early in training
    • Feature perturbation based consistency learning
      • pros:
        Enforces prediction consistency under perturbations in the input or feature space
      • cons:
        Sensitive to the perturbation design and the label distribution
    • Teacher–Student consistency framework
      • pros:
        Uses the Teacher model's predictions as the training signal for the Student model
        Designed around consistency at the final output (prediction) level
      • cons:
        Considers only output-level consistency, so internal representations and scale differences are not sufficiently reflected
  • Differences in domain characteristics between Remote Sensing images and natural images
    • Objects of very different sizes appear together
      • e.g., large buildings, small cars, long continuous roads
    • High visual similarity between classes
      • e.g., road vs. parking lot, building rooftop vs. concrete ground
    • As a result
      • Ambiguous pixel boundaries, missed small objects, and class confusion
    • Existing Semi-supervised Semantic Segmentation methods
      were designed under natural-image domain assumptions
    • so they do not sufficiently account for the multi-scale nature and high class similarity of remote sensing imagery
    • Key limiting factor: the domain gap between natural images and remote sensing imagery
      is a major factor limiting performance
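
The pseudo-label self-training item above can be made concrete with a small sketch. It is only an illustration of the general technique, not code from the paper: the confidence threshold, the `ignore_index` value, and the function names are assumptions. Masking low-confidence pixels is one common way to limit the error accumulation noted under "cons".

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def pseudo_labels(logits, threshold=0.95, ignore_index=255):
    """Turn per-pixel logits of shape (H, W, C) into pseudo-labels of shape (H, W).

    Pixels whose top class probability is below `threshold` (an assumed
    hyperparameter) are set to `ignore_index` so they can be skipped in the loss.
    """
    probs = softmax(logits, axis=-1)
    confidence = probs.max(axis=-1)
    labels = probs.argmax(axis=-1)
    labels = np.where(confidence >= threshold, labels, ignore_index)
    return labels, confidence

# Toy 1x2 "image" with 3 classes: one confident pixel, one ambiguous pixel.
logits = np.array([[[4.0, 0.0, 0.0], [0.2, 0.1, 0.0]]])
labels, confidence = pseudo_labels(logits, threshold=0.9)
```

In this toy input the first pixel keeps its predicted class, while the second falls below the threshold and is masked out.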

Main Ideas

  • Multi-scale Uncertainty Consistency (MSUC) module
    • Limitations of existing semi-supervised learning
      • Regularizes only the output consistency between Teacher and Student
      • Makes little use of the rich representations in intermediate layers
    • Proposed approach
      • Constrains the consistency between different layers (feature maps) of the network, guided by uncertainty
      • Strengthens multi-scale representation learning on unlabeled data
    • Expected effects
      • Reduces the instability of representations at each scale
      • Enables learning of rich and stable multi-scale features
  • Cross-Teacher-Student Attention Mechanism
    • Problem in Remote Sensing images
      • Classes are hard to tell apart because of the high inter-class visual similarity in remote sensing imagery
    • Proposed approach
      • Student encoder output → Query, Teacher encoder output → Key and Value
      • Cross-network Attention passes the Teacher's complementary features to the Student
    • Structural characteristics
      • Extends the conventional output-level, one-way supervision to feature-level mutual learning
    • Expected effects
      • Strengthens mutually complementary representation learning between Teacher and Student
      • Improves discrimination between visually similar classes

Methodology (How was it implemented?)

  • ๋ฐ์ดํ„ฐ์…‹ / ์‹คํ—˜ ํ™˜๊ฒฝ
  • ๋ชจ๋ธ ๊ตฌ์กฐ, ํ•™์Šต ์„ค์ •, ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ
  • ๋น„๊ต ๋Œ€์ƒ (Baseline) Section III-A. Main Optimization Objectives

describes the main optimization objectives,

Section III-B. Multi-Scale Uncertainty Consistency Module

  • Uncertainty Estimation
  • Consistency Loss Functions for Multiscale Uncertainty

Introduces the principles of the multi-scale uncertainty consistency module.
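
As a rough illustration of the idea behind this module, the sketch below weights a teacher-student consistency term at several feature scales by how certain the teacher is at each pixel. The entropy-based uncertainty, the squared-error distance, the per-scale averaging, and all function names are assumptions made for illustration; the formulation in Section III-B of the paper may differ.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def entropy_uncertainty(probs, eps=1e-8):
    """Per-pixel predictive entropy, normalised to [0, 1]; probs has shape (H, W, C)."""
    num_classes = probs.shape[-1]
    entropy = -(probs * np.log(probs + eps)).sum(axis=-1)
    return entropy / np.log(num_classes)

def multiscale_uncertainty_consistency(student_logits, teacher_logits, eps=1e-8):
    """Average an uncertainty-weighted consistency loss over several scales.

    Both arguments are lists with one (H_s, W_s, C) logit map per scale.
    Pixels where the teacher is uncertain contribute less to the loss.
    """
    losses = []
    for s_logits, t_logits in zip(student_logits, teacher_logits):
        p_student, p_teacher = softmax(s_logits), softmax(t_logits)
        certainty = 1.0 - entropy_uncertainty(p_teacher)       # shape (H_s, W_s)
        sq_err = ((p_student - p_teacher) ** 2).sum(axis=-1)   # shape (H_s, W_s)
        losses.append((certainty * sq_err).sum() / (certainty.sum() + eps))
    return float(np.mean(losses))

# Toy example: two scales, four classes, student is a noisy copy of the teacher.
rng = np.random.default_rng(0)
teacher = [rng.normal(size=(8, 8, 4)), rng.normal(size=(4, 4, 4))]
student = [t + rng.normal(scale=0.5, size=t.shape) for t in teacher]
loss = multiscale_uncertainty_consistency(student, teacher)
```

The loss is zero when the student matches the teacher at every scale and grows as their per-scale predictions diverge.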

Section III-C. Cross-Teacher-Student Attention

Presents the basic principles of the cross-teacher-student attention module.
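
The query/key/value assignment described in these notes (Student encoder output as Query, Teacher encoder output as Key and Value) can be sketched as single-head scaled dot-product attention. This is a minimal sketch under assumptions: the projection matrices `w_q`, `w_k`, `w_v`, the single head, and the token layout are hypothetical and not taken from the paper.

```python
import numpy as np

def cross_teacher_student_attention(student_feat, teacher_feat, w_q, w_k, w_v):
    """Single-head cross-attention with student queries and teacher keys/values.

    student_feat, teacher_feat: (N, D) encoder features, e.g. an H*W grid
    flattened into N tokens. w_q, w_k, w_v: (D, D_k) projection matrices
    (hypothetical parameters for this sketch).
    """
    q = student_feat @ w_q            # queries come from the student
    k = teacher_feat @ w_k            # keys come from the teacher
    v = teacher_feat @ w_v            # values come from the teacher
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)        # each row sums to 1
    return attn @ v, attn

# Toy example: 16 tokens with 32-dimensional features projected to 8 dimensions.
rng = np.random.default_rng(0)
student_feat = rng.normal(size=(16, 32))
teacher_feat = rng.normal(size=(16, 32))
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(32, 8)) for _ in range(3))
out, attn = cross_teacher_student_attention(student_feat, teacher_feat, w_q, w_k, w_v)
```

Each output token is a weighted mix of teacher values, with the weights chosen by how well the student query matches each teacher key.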

The overall structure of our approach is shown in Fig. 2.


Experimental Results and Analysis

  • Summary of key performance metrics (as tables or graphs)
  • Summary of comparison results (improvements / limitations)
  • Comparison of the authors' interpretation with my own view

Conclusion and Implications

  • Key conclusions presented by the paper
  • Parts applicable to my own study and projects
  • Remaining questions or ideas for follow-up research

๊ฐœ์ธ ์ฝ”๋ฉ˜ํŠธ

  • Parts that were hard to understand
  • Concepts to look up again (terms used in the paper, references, etc.)
  • Additional papers to read

temp note

First, we propose, for the first time, a multi-scale uncertainty consistency (MSUC) module. By constraining the consistency between feature maps at different layers of the network, this module improves the multi-scale learning ability of semi-supervised algorithms on unlabeled data. Existing semi-supervised algorithms apply teacher-student consistency regularization only at the last layer, ignoring the rich information that intermediate layers learn from large amounts of unlabeled data. In contrast, the proposed semi-supervised method uses the MSUC module to learn efficient and rich multi-scale information from unlabeled RS images.

Second, the key to addressing high inter-class similarity is to improve the model's feature extraction ability. We therefore propose a cross-teacher-student attention (CTSA) module, in which the teacher network helps the student network reconstruct its encoder output. Specifically, the student's encoder output is used as the query and the teacher's output as the key and value. The advantage of this cross-network attention is that deep features can be learned from both the student and the teacher model, which helps segment hard-to-distinguish categories in RS images. For example, as shown in Fig. 1, our semi-supervised model (Fig. 1(h)) achieves more accurate segmentation on RS images than the SOTA Unimatch method (Fig. 1(g)).

  1. To learn rich multi-scale information, MUCA introduces multi-scale uncertainty consistency regularization, which constrains the consistency between feature maps at different layers of the network.

  2. To distinguish classes with high inter-class similarity, MUCA uses a novel cross-teacher-student attention that guides the student network, with the teacher's help, to reconstruct discriminative encodings.