hwkims/jetbot_IBM


PPT Content Summary (JetBot AI Project)

01 Project Overview

  • About JetBot:
    • An affordable, educational, open-source AI robot platform based on the NVIDIA Jetson Nano.
    • Highlights: reasonable price (under $250), educational value (basic robotics through advanced AI), easy setup (web-browser programming, Docker), and fun (exploring AI, image processing, and robotics).
    • [Image: JetBot photo or logo]
  • Project goals:
    • Develop and explore a range of AI-based robot control and interaction systems on the JetBot platform.
    • Implement them in stages, from basic AI driving to autonomous control using recent language and vision models.
  • Summary of sub-projects:
    1. Basic AI driving (JetBot AI Following):
      • Road following (Road Following)
      • Cup ramen following (Object Following - Cup Ramen)
    2. LLM-based control (JetBot Gemma Control):
      • Real-time video analysis and control using a local LLM (Ollama + gemma3:4b).
      • Voice feedback (Edge TTS, Korean).
    3. Vision-LLM-based control (JetBot IBM Control):
      • Uses a vision-specialized LLM (Ollama + IBM granite3.2-vision).
      • Implements several control modes: manual, autonomous, describe, and custom.
      • Streamlit-based web interface plus an extra feature (facial feature analysis demo).
  • Main technologies:
    • Hardware: NVIDIA Jetson Nano, JetBot kit
    • Software: Python, FastAPI, Streamlit, WebSockets, HTML/CSS/JS, Docker
    • AI/ML:
      • Deep learning (Road/Object Following models, Hugging Face)
      • Large language models (LLM): Ollama (gemma3:4b, IBM granite3.2-vision)
      • Speech synthesis (TTS): Edge TTS
      • Speech recognition: Web Speech API

02 Team Composition and Roles

  • Team: This project was planned and developed by [hwkims]. (Assumed to be an individual project, based on the repository owner.)
  • Roles:
    • Planning: devising ideas for using JetBot and setting the goals for each project stage.
    • Hardware: assembling and configuring the JetBot.
    • Software development:
      • JetBot control logic (Python).
      • Backend API (FastAPI).
      • Frontend interface (HTML/JS/CSS, Streamlit).
      • Real-time communication (WebSockets).
    • AI model usage:
      • Collecting and using datasets (Road/Object Following).
      • Using and integrating pre-trained models (Hugging Face).
      • Configuring Ollama-based LLM/Vision LLM and integrating its API.
      • Integrating TTS and speech recognition APIs.
    • Testing and debugging: feature testing and bug fixing.
    • Documentation and sharing: writing the GitHub README and sharing results (Hugging Face, YouTube).

03 Project Execution Procedure and Method

  • Stage 1: Environment setup and basic configuration
    • Assemble the JetBot hardware and set up the Jetson Nano OS.
    • Install the base JetBot software (using the SD card image or Docker).
    • Configure the network (Wi-Fi connection) and verify communication with the PC.
  • Stage 2: Basic AI driving features (jetbot.kr)
    • Collect driving data (road and cup ramen images).
    • Train an AI model or use a pre-trained one (Hugging Face).
    • Implement the basic behaviors (avoidance, path/object following) with reference to the Jupyter Notebook examples.
    • Share the results (models, datasets).
  • Stage 3: LLM-based control system (jetbot_gemma)
    • Install Ollama and download the gemma3:4b model.
    • Build the FastAPI backend server:
      • Receive the JetBot camera video stream.
      • Request image analysis through the Ollama API.
      • Generate JetBot control commands from the analysis result (forward, backward, turn, etc.).
      • Communicate in real time with the JetBot and the web client over WebSocket.
    • Integrate Edge TTS to speak the analysis result (Korean).
    • Build a basic web UI (HTML/JS/CSS) for control and for viewing results.
  • Stage 4: Advanced vision-LLM-based control system (jetbot_IBM)
    • Add the IBM granite3.2-vision model to Ollama.
    • Extend the FastAPI backend:
      • Implement the logic for the control modes (manual, autonomous, describe, custom).
      • Design prompts tailored to the vision model.
      • Add voice command recognition using the Web Speech API (optional).
    • Build an interactive web frontend with Streamlit:
      • Live video, control buttons, and visualization of analysis results.
    • (Extra feature) Implement a facial feature analysis demo as a separate Streamlit app.
  • Stage 5: Testing, improvement, and sharing
    • Run unit tests for each feature and integration tests.
    • Measure performance (response time, etc.) and identify improvements.
    • Update and share the source code and README on GitHub.
    • Produce and share demo videos (YouTube Shorts).
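The backend step in stages 3 and 4 that sends a camera frame to the Ollama API can be sketched as follows. This is a minimal illustration rather than the code in this repository: the host address is assumed to be Ollama's default local endpoint, and the function names `build_vision_request` and `analyze_frame` are hypothetical. It uses only the standard library; the actual backend may use a different HTTP client.

```python
import base64
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # assumption: Ollama's default local address
MODEL = "granite3.2-vision"


def build_vision_request(jpeg_bytes: bytes, prompt: str, model: str = MODEL) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint.

    Images are passed as base64-encoded strings in the "images" list.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(jpeg_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON reply instead of a stream
    }


def analyze_frame(jpeg_bytes: bytes, prompt: str,
                  host: str = OLLAMA_HOST, timeout: float = 60.0) -> str:
    """Send one frame to Ollama and return the model's text reply."""
    body = json.dumps(build_vision_request(jpeg_bytes, prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Keeping the payload construction separate from the network call makes the request format easy to inspect and test without a running Ollama server.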

04 Project Execution Results

  • Result 1: Basic AI driving features (jetbot.kr)
    • Implemented JetBot road following and published a demo video.
      • [Video: JetBot AI Road Following (YouTube Short link)]
    • Implemented JetBot cup ramen following and published a demo video.
      • [Video: JetBot Cup Ramen Following (YouTube Short link)]
    • Published the related datasets and trained models on Hugging Face.
      • [Link/screenshot: Hugging Face dataset and model pages]
  • Result 2: LLM-based control system (jetbot_gemma)
    • Completed a prototype JetBot control system based on real-time video analysis with the gemma3:4b model.
    • Supports remote control and viewing of live video and analysis results through a web interface.
    • Speaks the scene analysis result in Korean (Edge TTS).
    • [Screenshot: jetbot_gemma web interface]
  • Result 3: Advanced vision-LLM-based control system (jetbot_IBM)
    • Built an improved vision-based control system using the IBM granite3.2-vision model.
    • Supports multiple modes: manual control, autonomous driving, scene description, and user-defined commands.
    • Provides a visual, interactive control interface built with Streamlit.
    • [Screenshot: jetbot_IBM Streamlit interface]
    • (Extra result) Developed a Streamlit demo app for facial feature analysis.
  • Common results:
    • Published the source code and setup instructions for all projects on GitHub.
    • Demonstrated several ways of applying recent AI techniques with JetBot.

05 Self-Assessment

  • What went well (Strengths):
    • Successfully combined recent AI technologies (LLM, Vision LLM, TTS) with the low-cost JetBot platform.
    • Gained in-depth learning and implementation experience in AI robot control through the staged projects.
    • Built a user-friendly interface and a real-time control system with current web technologies such as FastAPI, WebSockets, and Streamlit.
    • Improved development efficiency by making active use of open source (Ollama, JetBot).
    • Organized and published the results (code, models, videos) systematically, which makes them easier to share and reproduce.
  • Difficulties and areas to improve (Weaknesses & Improvements):
    • Performance limits: real-time processing latency can occur depending on the Jetson Nano's compute capacity and the network environment.
    • LLM dependence and reliability: the LLM's analysis is not always accurate or consistent, which affects the stability of robot control (e.g., inaccurate obstacle recognition, unexpected commands).
    • Obstacle avoidance logic: the current logic is simplistic (e.g., it always turns left when the word 'obstacle' is detected). The LLM output needs to be analyzed more carefully so that the avoidance maneuver (left, right, backward, etc.) fits the situation (improvement for jetbot_gemma).
    • Prompt engineering: designing prompts that reliably elicit the desired robot behavior is difficult.
    • UI/UX: the interface needs improvement to further enhance the user experience.
    • Error handling: robustness against unexpected situations (communication errors, model response failures, etc.) needs to be strengthened.
  • Lessons learned (Lessons Learned):
    • Experienced first-hand the importance of deploying and optimizing AI models on an edge device (Jetson Nano).
    • Improved the ability to integrate hardware (robot), software (control logic), and AI (analysis/decision) into one coherent system.
    • Gained a clear understanding of the potential and current technical limits of applying LLMs and Vision LLMs to robotics.
    • Became more proficient at using open-source communities and documentation.
  • Future work (Future Work):
    • Introduce more sophisticated obstacle avoidance and path planning algorithms.
    • Improve recognition and tracking of specific objects.
    • Add more robot commands and interaction scenarios (e.g., control through spoken dialogue).
    • Optimize performance (lighter models, code optimization, etc.).
    • Test and compare a wider range of AI models.
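The obstacle avoidance weakness noted above (always turning left when the word 'obstacle' appears) could be addressed along the following lines. This is only a sketch under stated assumptions: the function name `choose_avoidance`, the keyword patterns, and the fallback to reversing are all hypothetical and are not taken from jetbot_gemma. It assumes the model describes obstacle positions in plain English phrases such as "on the left".

```python
import re

# Hypothetical phrase patterns; a real prompt would ideally constrain
# the model to a fixed vocabulary so this matching stays reliable.
LEFT_PATTERN = re.compile(r"\b(?:on|to) the left\b|\bleft side\b")
RIGHT_PATTERN = re.compile(r"\b(?:on|to) the right\b|\bright side\b")


def choose_avoidance(description: str) -> str:
    """Pick a movement command from a free-text scene description."""
    text = description.lower()
    if "obstacle" not in text and "blocked" not in text:
        return "forward"  # nothing reported in the way
    obstacle_left = bool(LEFT_PATTERN.search(text))
    obstacle_right = bool(RIGHT_PATTERN.search(text))
    if obstacle_left and not obstacle_right:
        return "right"  # steer away from the obstacle
    if obstacle_right and not obstacle_left:
        return "left"
    return "backward"  # obstacle ahead, on both sides, or position unknown


print(choose_avoidance("There is an obstacle on the left."))
```

Keyword matching on free text remains fragile; asking the model for a structured reply (for example a single command word) would likely be more robust.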

jetbot_IBM granite3.2-vision control

Model: https://ollama.com/library/granite3.2-vision
JetBot documentation: https://jetbot.org/master/index.html

JetBot Control with Vision and Ollama

This project provides a comprehensive system for controlling a JetBot robot using both direct commands and autonomous, vision-based navigation powered by Ollama. It leverages FastAPI for the backend API, WebSockets for real-time communication, and edge-tts for text-to-speech feedback. The system supports a Streamlit-based web interface for manual control and visualization of the autonomous process.

Features

  • Direct Control: Issue basic movement commands (forward, backward, left, right, stop, dance) to the JetBot via a web interface.
  • Vision-Based Autonomous Navigation: Uses a vision language model served by Ollama (specifically granite3.2-vision) for image analysis and autonomous decision-making. The JetBot can navigate, avoid obstacles, and describe its environment.
  • Real-time Image Streaming: View a live video stream from the JetBot's camera in the web interface.
  • Text-to-Speech (TTS) Feedback: Receive spoken feedback from the JetBot, describing its actions and observations, using edge-tts.
  • Custom Commands: Define custom actions based on the vision model's analysis of the scene.
  • Multiple Control Modes: Switch between manual control, descriptive mode (where the JetBot describes what it sees), custom command mode, and full autonomous mode.
  • Physiognomy Analysis: Includes a Streamlit application demonstrating how to analyze a person's face and classify different characteristics.
  • FastAPI Backend: A robust and efficient backend API built with FastAPI.
  • WebSocket Communication: Real-time communication between the web interface, the backend server, and the JetBot.
  • Streamlit Frontend: An interactive web interface built with Streamlit.

Project Structure

The project consists of the following main components:

  • app.py (FastAPI Backend): This is the core of the system. It handles:
    • WebSocket connections to both the JetBot and the web client.
    • Communication with the Ollama API for vision processing.
    • Text-to-speech generation using edge-tts.
    • Processing commands from the client and sending them to the JetBot.
    • Serving the static files for the web interface.
  • static/ (Web Interface): Contains the HTML, CSS, and JavaScript files for the basic web interface that interacts with the FastAPI backend.
  • physiognomy_app.py (Streamlit App): A separate Streamlit application demonstrating the use of Ollama for facial analysis.
  • requirements.txt: Lists the Python dependencies.

Prerequisites

  1. JetBot: A fully assembled and configured JetBot robot. You must have the JetBot's WebSocket server running on the JetBot itself (see JetBot documentation for details). The default WebSocket URL is ws://192.168.137.181:8766. You may need to adjust this based on your JetBot's IP address.
  2. Ollama: Ollama must be installed and running. Download and install Ollama from https://ollama.ai/. You'll need to pull the granite3.2-vision model:
    ollama pull granite3.2-vision
  3. Python 3.7+: This project requires Python 3.7 or higher.
  4. Dependencies: Install the required Python packages:
    pip install -r requirements.txt
  5. Edge-TTS: edge-tts is used for voice generation. It is included in the requirements.txt.

Setup and Running

  1. Clone the Repository:

    git clone https://github.com/hwkims/jetbot_IBM.git
    cd jetbot_IBM
  2. Configure (if necessary):

    • app.py: If your Ollama server address or JetBot WebSocket URL differs from the defaults, modify the OLLAMA_HOST and JETBOT_WEBSOCKET_URL variables in app.py.
    • physiognomy_app.py: If your Ollama server is different from the defaults, modify the OLLAMA_HOST in physiognomy_app.py.
  3. Start the FastAPI Server:

    uvicorn app:app --host 0.0.0.0 --port 8000

    This will start the backend server, making the web interface available at http://localhost:8000. The --host 0.0.0.0 makes the server accessible from other devices on your network.

  4. Start the Streamlit App (Optional - for physiognomy analysis): In a separate terminal, navigate to the project directory and run:

    streamlit run physiognomy_app.py

    This will start the Streamlit application, typically at http://localhost:8501.

  5. Connect your JetBot: Ensure your JetBot is powered on and the JetBot WebSocket server is running.

  6. Access the Web Interface: Open a web browser and go to http://localhost:8000 (or the appropriate address if you changed the host/port).
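As an alternative to editing the variables by hand in the Configure step, the two settings could be read from environment variables with fallbacks. This is a sketch of that idea, not how app.py currently works: `load_settings` is a hypothetical helper, the Ollama default shown is Ollama's standard local address (an assumption about this project), and the JetBot default is the URL given under Prerequisites.

```python
import os
from urllib.parse import urlparse

DEFAULT_OLLAMA_HOST = "http://localhost:11434"       # assumption: Ollama's default address
DEFAULT_JETBOT_WS_URL = "ws://192.168.137.181:8766"  # default stated under Prerequisites


def load_settings(env=None) -> dict:
    """Return the two connection settings, preferring environment overrides."""
    env = os.environ if env is None else env
    settings = {
        "OLLAMA_HOST": env.get("OLLAMA_HOST", DEFAULT_OLLAMA_HOST),
        "JETBOT_WEBSOCKET_URL": env.get("JETBOT_WEBSOCKET_URL", DEFAULT_JETBOT_WS_URL),
    }
    # Fail early on an obviously wrong WebSocket URL.
    scheme = urlparse(settings["JETBOT_WEBSOCKET_URL"]).scheme
    if scheme not in ("ws", "wss"):
        raise ValueError("JETBOT_WEBSOCKET_URL must start with ws:// or wss://")
    return settings
```

With something like this in place, the server could be started for a different robot by setting JETBOT_WEBSOCKET_URL in the shell before running uvicorn, without touching the source.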

Usage

Main Web Interface (http://localhost:8000)

The main interface provides buttons for direct control (Forward, Backward, Left, Right, Stop, Dance) and input fields for more complex interactions:

  • Direct Control Buttons: These send immediate commands to the JetBot.
  • Iterations: Specifies how many times a command should be repeated.
  • Text Input: Used for custom prompts and text input for Ollama.
  • Describe: Sends the current camera image to Ollama and displays/speaks a description of the scene.
  • Custom: Sends the current camera image and the text prompt to Ollama, then executes the returned commands.
  • Autonomous: Enters autonomous mode, where the JetBot repeatedly analyzes the scene and navigates based on Ollama's output.
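The Autonomous mode combined with the Iterations field can be pictured as a bounded analyze-then-act loop. The sketch below is an assumption about the general shape, not the project's actual implementation: `run_autonomous` and its three callables are hypothetical, and falling back to "stop" on an unrecognized reply is a design choice made here for safety.

```python
from typing import Callable, List

VALID_COMMANDS = {"forward", "backward", "left", "right", "stop"}


def run_autonomous(iterations: int,
                   capture: Callable[[], bytes],
                   analyze: Callable[[bytes], str],
                   send: Callable[[str], None]) -> List[str]:
    """Run a fixed number of capture -> analyze -> act cycles.

    capture() returns the current camera frame, analyze() returns the
    model's reply for that frame, and send() forwards a command to the robot.
    """
    executed = []
    for _ in range(iterations):
        frame = capture()
        reply = analyze(frame).strip().lower()
        # Anything outside the known vocabulary is treated as "stop".
        command = reply if reply in VALID_COMMANDS else "stop"
        send(command)
        executed.append(command)
    return executed
```

Passing the camera, model, and robot in as callables keeps the loop testable with simple stand-ins before it is connected to real hardware.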

Physiognomy App (http://localhost:8501)

The Streamlit app allows you to either upload an image or use your webcam to capture an image of a face. It then sends the image to Ollama with a prompt to analyze 32 facial features and provide a physiognomy reading in JSON format. The results are displayed, including a radar chart of the facial features.

Troubleshooting

  • JetBot Not Connecting:
    • Ensure the JetBot is powered on and connected to the same network as your computer.
    • Verify that the JetBot's WebSocket server is running.
    • Double-check the JETBOT_WEBSOCKET_URL in app.py.
  • Ollama Not Responding:
    • Make sure Ollama is running.
    • Verify that you have pulled the granite3.2-vision model (ollama pull granite3.2-vision).
    • Check the OLLAMA_HOST variable in app.py and physiognomy_app.py.
  • Web Interface Not Working:
    • Ensure the FastAPI server is running (uvicorn app:app ...).
    • Check your browser's developer console for any JavaScript errors.
  • TTS Not Working:
    • Ensure you have a working internet connection. edge-tts synthesizes speech through Microsoft's online service, so it needs network access whenever speech is generated.
  • Streamlit Webcam Issues:
    • Modern browsers often require secure contexts (HTTPS) for webcam access. If you are experiencing trouble with the webcam in the Streamlit app, you might need to set up HTTPS or use a workaround like streamlit run physiognomy_app.py --server.enableCORS=false --server.enableXsrfProtection=false. Note: Disabling CORS and XSRF protection is generally not recommended for production environments, but may be acceptable for local development and testing. A better solution for production would be to set up a proper HTTPS server.
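The "Ollama Not Responding" checks can also be scripted. The following sketch queries Ollama's model list endpoint and looks for granite3.2-vision. The helper names `has_model` and `check_ollama` are hypothetical, and the host is assumed to be Ollama's default local address; adjust it to match your OLLAMA_HOST.

```python
import json
import urllib.request


def has_model(tags_response: dict, model: str) -> bool:
    """Return True if the /api/tags response lists the given model.

    Ollama reports names with a tag suffix (e.g. "granite3.2-vision:latest"),
    so the name is compared both with and without the tag.
    """
    names = [entry.get("name", "") for entry in tags_response.get("models", [])]
    return any(name == model or name.split(":")[0] == model for name in names)


def check_ollama(host: str = "http://localhost:11434",
                 model: str = "granite3.2-vision") -> bool:
    """Contact the Ollama server and report whether the model is available.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    with urllib.request.urlopen(f"{host}/api/tags", timeout=5) as response:
        return has_model(json.load(response), model)
```

A connection error from `check_ollama` suggests the server is not running or the host is wrong, while a `False` result suggests the model still needs to be pulled.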

Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues on the GitHub repository.

License

This project is licensed under the MIT License - see the LICENSE file for details.

JetBot Control 🤖

JetBot Control is a control system for JetBot, an educational AI robot based on the NVIDIA Jetson Nano. It uses IBM's granite3.2-vision model for image analysis and autonomous driving, and also supports manual button control and user-defined commands. The project offers hands-on experience for students and developers interested in robotics, AI, and vision processing.


Key Features

  • Manual control: Run the "Forward", "Backward", "Left", "Right", "Stop", and "Dance" actions directly with buttons.
  • Vision-based autonomous driving: Drives autonomously after real-time image analysis with granite3.2-vision.
  • Describe: Analyzes the camera feed and describes the scene (e.g., "A table on the right, an open path ahead").
  • User-defined commands: Run free-form commands by text input or voice (e.g., "spin around").
  • Speech recognition: Supports Korean (ko-KR) and handles commands such as "안녕" ("hello").
  • Repeated execution: A repeat count can be set for every command.

Tech Stack

  • Hardware: NVIDIA Jetson Nano, JetBot platform
  • Backend: FastAPI (Python), WebSocket communication
  • Vision model: IBM granite3.2-vision (hosted with Ollama)
  • Frontend: HTML/CSS/JavaScript (modern UI)
  • Speech: Web Speech API, Edge TTS (Jenny Neural voice)
  • Robot control: JetBot Python library

Installation

Requirements

  • NVIDIA Jetson Nano and JetBot hardware
  • Python 3.6+ (on the JetBot), Python 3.8+ (for FastAPI)
  • Ollama 0.5.13+ (with granite3.2-vision installed)
  • ding.mp3 file (add it to the static/ directory)

JetBot Setup

  1. Connect to the JetBot:
    ssh jetbot@192.168.137.181
  2. Clone the repository:
    git clone https://github.com/hwkims/jetbot_IBM.git
    cd jetbot_IBM
  3. Install dependencies:
    pip install jetbot
  4. Run the JetBot script:
    python3.6 jetbot.py

Server Setup

  1. Clone the repository on your local machine:
    git clone https://github.com/hwkims/jetbot_IBM.git
    cd jetbot_IBM
  2. Create and activate a virtual environment:
    python -m venv venv
    source venv/bin/activate  # Linux/Mac
    venv\Scripts\activate     # Windows
  3. Install dependencies:
    pip install fastapi uvicorn httpx edge-tts websockets
  4. Start Ollama and install the model:
    ollama serve
    ollama run granite3.2-vision
  5. Start the server:
    uvicorn main:app --host 0.0.0.0 --port 8000

Web Interface

  • Open http://localhost:8000 in a browser.

Usage

  1. Manual control:

    • Click a button: "Forward", "Backward", "Left", "Right", "Stop", "Dance".
    • Set the repeat count in the "Iterations" field.
  2. Vision description:

    • Click the "Describe" button → the current scene description is displayed and played through TTS.
  3. User-defined commands:

    • Text input: type "spin around" and click "Execute".
    • Voice: click the "Voice" button and say "안녕" ("hello") or "go forward".
  4. Autonomous driving:

    • Set "Iterations" and click "Autonomous" → vision-based driving starts.

Example Commands

  • Manual: "Forward" button → moves forward for 1 second.
  • Describe: "Describe" → "A chair on the right, open space ahead."
  • Custom: "spin around" → rotates to the right for 2 seconds.
  • Autonomous: Iterations 3 → drives 3 times after image analysis (e.g., "forward", "left", "stop").
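The mapping from a button command and repeat count to timed motor actions, as in the "Forward" example, might look like the sketch below. It is illustrative only: `execute`, the default speed, and the default duration are assumptions. It expects a robot object exposing `forward`, `backward`, `left`, `right`, and `stop` methods, which is the interface the JetBot library's `Robot` class provides.

```python
import time


def execute(robot, command: str, iterations: int = 1,
            speed: float = 0.3, duration: float = 1.0, sleep=time.sleep) -> None:
    """Run a movement command `iterations` times, stopping after each run.

    `robot` is any object with forward/backward/left/right/stop methods.
    `sleep` is injectable so the function can be tested without waiting.
    """
    actions = {
        "forward": robot.forward,
        "backward": robot.backward,
        "left": robot.left,
        "right": robot.right,
    }
    command = command.lower()
    if command == "stop":
        robot.stop()
        return
    if command not in actions:
        raise ValueError(f"unknown command: {command}")
    for _ in range(iterations):
        actions[command](speed)  # start moving
        sleep(duration)          # keep moving for the requested time
        robot.stop()             # always stop between repetitions
```

Stopping after every repetition means a dropped connection or a failed later step leaves the robot stationary rather than still driving.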

Contributing

  1. Submit issues: report bugs or suggestions under Issues.
  2. Pull requests: fork the repository, create a branch, make your changes, and submit a PR.

License

This project is distributed under the MIT License.


Contact

  • GitHub: hwkims
  • Email: hwkims@example.com (replace with a real email address)


About

jetbot_IBM granite3.2-vision control
