chore: clean up root docs (create knowledge/ CANON source + isolate scattered docs outside the root)

Secure seed quality (block garbage-in, garbage-out). Split the ~150 documents that were scattered across the root by purpose.

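- knowledge/ created = the single CANON knowledge source (RAG/knowledge lookups reference only this directory)
  · Plant knowledge (7): structure descriptions, rounds 6-1/6-2; side-draw extraction relations and time delay; PGMEA general knowledge and operating precautions
  · 도면-데이터시트/ (drawings and datasheets): 15 As-Built drawings + 2 FCV datasheets (PDF binaries are listed in .gitignore and kept on disk)
- Plans, diagnostics, conversation logs, multi-model drafts (byQwen/byGemma, etc.), and completed work (dxf-graph/, fastTable/, plans/) are
  **isolated in a store outside the project root** (moved, not deleted; restorable):
    /home/windpacer/projects/ReferenceSources/ExperionCrawler/
  (ExperionCrawler.Tests/ is in the same location: completed/failed items, restore if needed)
- .gitignore: exclude large PDFs (knowledge 104M + src/Web/uploads 157M) and *.backup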
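
The ignore rules described above can be checked with `git check-ignore`. The sketch below is illustrative only: it builds a throwaway repository, and the three patterns and the sample file names are assumptions, since the actual .gitignore entries are not shown in this commit message.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so the check never touches a real working tree.
repo="$(mktemp -d)"
git -C "$repo" init -q

# Assumed patterns; the real .gitignore entries are not shown in the commit.
cat > "$repo/.gitignore" <<'EOF'
knowledge/**/*.pdf
src/Web/uploads/**/*.pdf
*.backup
EOF

# Hypothetical sample files standing in for the real documents.
mkdir -p "$repo/knowledge/drawings" "$repo/src/Web/uploads"
touch "$repo/knowledge/drawings/sample.pdf" \
      "$repo/knowledge/notes.md" \
      "$repo/src/Web/uploads/scan.pdf" \
      "$repo/config.backup"

# is_ignored <path>: succeeds when git would ignore the path.
is_ignored() {
  git -C "$repo" check-ignore -q "$1"
}

for path in knowledge/drawings/sample.pdf knowledge/notes.md \
            src/Web/uploads/scan.pdf config.backup; do
  if is_ignored "$path"; then
    echo "ignored: $path"
  else
    echo "tracked: $path"
  fi
done
```

Running the same `git check-ignore -q` call against the real repository, with real paths, would show whether the large PDFs stay on disk without being staged.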

Basis plan (archived): ReferenceSources/.../plans/online-lora-학습-파이프라인-실행계획-byOPUS.md, Phase -1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
windpacer
2026-05-26 09:55:19 +09:00
parent ab3e36680f
commit 3e9f3076ef
367 changed files with 1566 additions and 2525740 deletions

@@ -0,0 +1,55 @@
#!/usr/bin/env bash
set -euo pipefail
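# Container name, and the port to serve on (first argument, default 8001).
NAME="vllm_qwen35b"
PORT="${1:-8001}"
echo "Starting Qwen3.6-35B-A3B-FP8 on port ${PORT} (LoRA enabled)..."
# Remove any previous container with the same name; ignore "no such container".
docker rm -f "$NAME" 2>/dev/null || true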
docker run -d --name "$NAME" \
  --restart unless-stopped \
  --gpus all --network host --ipc host \
  --ulimit memlock=-1 --ulimit stack=67108864 \
  -v /home/windpacer/.cache/huggingface:/root/.cache/huggingface \
  -v /home/windpacer/.cache/vllm:/root/.cache/vllm \
  -v /home/windpacer/ai-models:/root/ai-models \
  --entrypoint "" \
  vllm-node-tf5 \
  bash -c "
    exec vllm serve /root/ai-models/Qwen3.6-35B-A3B-FP8 \
      --served-model-name Qwen3.6-35B-A3B-FP8 \
      --max-model-len 65536 \
      --max-num-seqs 4 \
      --gpu-memory-utilization 0.55 \
      --port ${PORT} --host 0.0.0.0 \
      --enable-chunked-prefill \
      --enable-auto-tool-choice \
      --tool-call-parser qwen3_coder \
      --reasoning-parser qwen3 \
      --trust-remote-code \
      --kv-cache-dtype fp8 \
      --default-chat-template-kwargs '{\"preserve_thinking\": true}' \
      --speculative-config '{\"method\": \"qwen3_next_mtp\", \"num_speculative_tokens\": 2}' \
      --override-generation-config '{\"temperature\": 0.6, \"top_p\": 0.95}' \
      --load-format instanttensor \
      --enable-lora \
      --max-lora-rank 64 \
      --max-loras 4 \
      --lora-dtype auto \
      -tp 1
  "
echo "Waiting for model to load..."
for i in $(seq 1 120); do
  if curl -sf "http://localhost:${PORT}/v1/models" > /dev/null 2>&1; then
    echo "✓ Ready on port ${PORT}"
    curl -s "http://localhost:${PORT}/v1/models" | python3 -m json.tool 2>/dev/null || true
    exit 0
  fi
  echo "  Waiting... (${i}/120)"
  sleep 5
done
echo "❌ Failed to start within 10 minutes; check: docker logs ${NAME}" >&2
exit 1
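
The readiness loop at the end of the script polls `/v1/models` up to 120 times, 5 seconds apart. The same pattern can be factored into a small reusable helper. This is a minimal sketch, not part of the commit: the `wait_for` name and its argument order are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# wait_for <attempts> <delay_seconds> <command...>
# Hypothetical helper: runs the command until it succeeds or the
# attempts run out. Returns 0 on success, 1 on timeout.
wait_for() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    # Probe output is discarded; only the exit status matters.
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    # No point sleeping after the final failed attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage mirroring the script's budget (120 attempts, 5 s apart):
#   wait_for 120 5 curl -sf "http://localhost:8001/v1/models"
```

Passing the probe as a command keeps the helper independent of curl, so the same function could wait on a different health check without changes.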