plans: add new work-plan documents (OPUS-v2, P&ID bySonnet, TASK docs)

2026-05-29 09:49:48 +09:00
parent 3fbca36719
commit 4eccb75193
6 changed files with 2085 additions and 0 deletions
--- a/plans/OPUS-v2-실행계획.md
+++ b/plans/OPUS-v2-실행계획.md
@@ -0,0 +1,210 @@
+# Local LLM Specialization Pipeline: OPUS-v2 Execution Plan
+
+> **Follow-up to v1.** Production data invalidates the "small dense + system is enough" assumption of v1
+> (`ReferenceSources/.../plans/online-lora-학습-파이프라인-실행계획-byOPUS.md`).
+> **The Verifier goes first, the 35B model is the production default, and LoRA is the acceleration track.**
+
+---
+
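+## Key changes from v1 to v2
+
+| Item | v1 | v2 |
+|---|---|---|
+| Production model | Small dense (Qwen3-8B) | **A: Qwen3.6-35B-A3B-FP8** (baseline capability floor) |
+| **Verifier position** | Phase 2 (mid-term, auxiliary) | **Phase B, first in line**, MVP within days |
+| LoRA position | Phase 3~4 (conditional) | **Phase C, active acceleration** (2~4 weeks, *learning* the behavior) |
+| "small + system is enough" hypothesis | Considered plausible | **Dropped**: in the messy production environment the 8B model fails RAG layers L2~L4 |
+| Same base for production and LoRA | Assumed identical | **Possible separation stated explicitly** |
+| Role separation (cloud frontier model) | Not considered | **Phase D, put on the agenda** (option for safety-critical responses) |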
+
+---
+
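+## Core diagnosis: incorporating the user's findings
+
+### "RAG is size-independent" was wrong (four-layer breakdown)
+
+| Layer | Capability | Size-dependent? | 8B (measured in production) |
+|---|---|---|---|
+| **L1** Pure lookup | "Fetch X" → return the DB row | ❌ No | ✅ (handled in code) |
+| **L2** Deciding when to search | Ambiguous request → call find_tags *first*, never guess | ✅ Yes | ❌ Invented `raw_material_input` |
+| **L3** What to ask for, and in what format | Follow the `area="P6-1"` format, recognize the base_tag structure | ✅ Yes | ❌ Sent `area="6-1"` |
+| **L4** Using the result *faithfully* | On an empty result, doubt its own arguments; never pass off an unfiltered result as filtered | ✅ Yes | ❌ Passed off the whole-area result as "6-1차" (sub-area 6-1) |
+
+→ **When L2~L4 break down, no amount of RAG infrastructure helps: the model "cannot even look up a fact", which means it lacks baseline capability.**
+The controlled Phase 0 eval did not catch this (scaffold 100%, abstain 100%); it only surfaced in the messy production environment.
+**A small model is unfit to be the production operator brain.** The data confirms the user's intuition.
+
+### "Partial-truth fabrication" is the most dangerous
+The 35B model (A) also produces nonsense occasionally, and the most dangerous patterns are:
+- a real tag (p-6102) + an invented state (R-FAULT)
+- real data (whole-area result) + a false filter label ("6-1차 result")
+- a real tool (trace_connections) + an invented argument (raw_material_input)
+
+Because the surface looks plausible, operators rarely notice → **deterministic code-level verification (the Verifier) is the only trustworthy line of defense.**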
+
+---
+
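+## Phase B: Verifier MVP ★ first in line (3~5 days)
+
+> **Blocks fabrication *deterministically*, regardless of model size.** It catches residual nonsense even with the 35B model (A) in production,
+> and it automatically collects the *failure cases* that feed LoRA (C) training, which makes it core infrastructure.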
+
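+### B.1 Placement: MCP server middleware
+
+```
+Operator request
+  ↓
+[opencode chat] → [vLLM] → model response / tool call
+                              ↓
+                    ━━━━ Verifier middleware ━━━━
+                    ① Deterministic rules (code, zero model calls)
+                    ② Rules pass → response returned unchanged
+                       Rules fail → reject + return error+hint to the model → retry
+                    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+                              ↓
+                       [opencode UI]
+```
+
+A decorator/wrapper at the entry of each tool function in `mcp-server/server.py` blocks calls with bad arguments immediately.
+Model responses are verified in the response post-processing of `OllamaController.cs` (or at the opencode → MCP proxy stage).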
+
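The reject → hint → retry loop in the diagram can be sketched as follows. This is a minimal illustration, not the middleware itself: `run_with_verifier`, the `Rule` type, and the `call_model(hint)` callback are hypothetical names, and the real integration point (decorator in `server.py` or proxy stage) is left open by the plan.

```python
from typing import Callable, List, Optional

# A rule inspects a tool call and returns None on pass, or a hint string on failure.
# (Hypothetical interface, assumed for this sketch.)
Rule = Callable[[dict], Optional[str]]


def run_with_verifier(call_model: Callable[[Optional[str]], dict],
                      rules: List[Rule],
                      max_retries: int = 2) -> dict:
    """Ask the model for a tool call; re-ask with a hint while any rule rejects it."""
    hint: Optional[str] = None
    for _ in range(max_retries + 1):
        tool_call = call_model(hint)          # hint is None on the first attempt
        # First failing rule wins; no model call is involved in the check itself.
        hint = next((h for h in (rule(tool_call) for rule in rules) if h), None)
        if hint is None:
            return {"status": "ok", "call": tool_call}
    return {"status": "rejected", "hint": hint}
```

Bounding the retries keeps a model that never self-corrects from looping forever; the final `rejected` result would then be surfaced to the operator instead of a fabricated answer.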
+### B.2 MVP rule catalog (implemented in this phase)
+
+| # | Rule | Applied where | On failure |
+|---|---|---|---|
+| **R1** | `tag-existence`: does every base_tag matching `[a-z]+-\d+[a-z]?` in the response or tool-call arguments exist in `tag_metadata.base_tag` ∪ `pid_equipment.tag_no`? | All tool arguments + final response text | reject + return `{"error":"unknown tag X","suggested":[find_tags top3]}` to the model → retry |
+| **R2** | `area-format`: does the `area=` argument match `^P\d+(-\d+)?$`? | `find_tags`/`active_alarms`/`query_events`/`summarize_events`/`generate_status_report` | reject + `"use P6 or P6-1 format"` |
+| **R3** | `pump-state-enum`: are `livevalue` / `corroborated_status` inside the whitelist? (CONFIRMED_RUNNING / SUSPICIOUS_RUNNING / STALE / INDETERMINATE_RUNNING / STOPPED / TRIPPED; pump state: L-RUN/R-RUN/L-STOP/R-STOP/L-TRIP/R-TRIP) | Scan of the final response text | reject (blocks variants such as R-FAULT) |
+| **R4** | `trace-start-tag`: does X in `trace_connections(start_tag=X)` exist in `pid_equipment`? | Tool entry | reject + results of `find_tags(query=user_query)` as the hint |
+| **R5** | `filter-applied-claim`: if the response claims a filtered answer such as "X차 result", verify that the actual tool call was filtered by that sub_area | Response post-processing (may use an LLM judge, so this is the one rule that is not purely deterministic) | reject + "do not answer with an unfiltered result" |
+
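As a rough sketch of how R1~R3 reduce to plain regex and set checks: the two patterns for R1 and R2 are taken from the table above, while the function names, the added `\b` word boundaries, the `PUMP_STATE_RE` scan pattern, and the in-memory `known` set (standing in for the DB lookup) are assumptions made for illustration.

```python
import re
from typing import List, Optional, Set

AREA_RE = re.compile(r"^P\d+(-\d+)?$")          # R2 pattern from the rule table
TAG_RE = re.compile(r"\b[a-z]+-\d+[a-z]?\b")    # R1 pattern; \b boundaries are an assumption
PUMP_STATES = {"L-RUN", "R-RUN", "L-STOP", "R-STOP", "L-TRIP", "R-TRIP"}  # R3 whitelist
PUMP_STATE_RE = re.compile(r"\b[LR]-[A-Z]+\b")  # assumed shape of a pump-state token


def check_area(area: str) -> Optional[dict]:
    """R2: return None when the area is well-formed, else an error with a hint."""
    if AREA_RE.match(area):
        return None
    return {"error": f"bad area {area!r}", "hint": "use P6 or P6-1 format"}


def unknown_tags(text: str, known: Set[str]) -> List[str]:
    """R1: tags mentioned in the text that are absent from the known-tag set."""
    return sorted(set(TAG_RE.findall(text)) - known)


def bad_pump_states(text: str) -> List[str]:
    """R3: pump-state-looking tokens that are not in the whitelist."""
    return sorted(set(PUMP_STATE_RE.findall(text)) - PUMP_STATES)
```

None of these checks calls a model, which is what lets R1~R3 behave deterministically.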
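+### B.3 Implementation priority (B alone makes production safe immediately)
+1. **R1, R2, R4** first (all three are deterministic regex checks or DB lookups and take effect immediately): 1~2 days
+2. **R3** (enum whitelist): 0.5 day
+3. **R5** (LLM-judge assisted): 1~2 days, high effort (not part of the R1~R4 gate in B.5)
+
+### B.4 Verifier output = LoRA input (feedback loop)
+Each (failed model call, hint, call that passed after retry) triple is accumulated automatically in **`mcp-server/verifier/logs/*.jsonl`**
+→ this becomes the main dataset for Phase C1 (data curation). In other words, *the system generates its own training signal*.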
+
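A minimal sketch of the triple logging described in B.4, assuming one JSON object per line. The record field names (`ts`, `rule`, `failed_call`, `hint`, `passed_call`) and the `write_triple` helper are hypothetical; the plan only fixes the log location and the jsonl format.

```python
import io
import json
import time


def write_triple(stream, rule, failed_call, hint, passed_call):
    """Append one (failed call, hint, passed call) triple as a single JSON line."""
    record = {
        "ts": time.time(),
        "rule": rule,                 # e.g. "R2"
        "failed_call": failed_call,
        "hint": hint,
        "passed_call": passed_call,   # None if the retry never passed
    }
    stream.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


# Demo on an in-memory stream; production would open a *.jsonl file in append mode.
buf = io.StringIO()
write_triple(
    buf, "R2",
    {"tool": "find_tags", "args": {"area": "6-1"}},
    "use P6 or P6-1 format",
    {"tool": "find_tags", "args": {"area": "P6-1"}},
)
```

`ensure_ascii=False` keeps Korean operator queries readable in the log, which matters if operators review the samples by hand in Phase C1.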
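+### B.5 Verifier gate (completion criteria for B)
+- [ ] R1~R4 implemented + py_compile + unit tests
+- [ ] In opencode chat, ask about the "원료 투입 경로" (raw-material feed route) → when an invention occurs, the Verifier blocks it, triggers a retry, and the model reaches the correct call
+- [ ] Verifier log jsonl confirmed to accumulate automatically
+- [ ] Verifier-on/off A/B option added to the eval harness (to measure the effect)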
+
+---
+
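+## Phase A: switch production to Qwen3.6-35B-A3B (in parallel, 1 day)
+
+### What
+- Reuse the existing `scripts/run-qwen3.6-35b-a3b.sh` as is (the 3.6-specific flags such as instanttensor/MTP remain valid).
+- Align the default model in `opencode.json` with what is actually being served: vllm-36b/Qwen3.6-35B-A3B-FP8.
+- Inject `plant_context.md` into the opencode system prompt automatically (or configure it on the opencode side).
+- Keep the 8B model for eval/experiments (to evaluate Verifier and LoRA base candidates).
+
+### Gate
+- [ ] In opencode, ask "6-1차 운전 상황" (operating status of sub-area 6-1) and "6-1차 원료 투입 경로" (raw-material feed route of sub-area 6-1) → with A (35B) + Verifier, inventions are zero or self-corrected.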
+
+---
+
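+## Phase C: LoRA acceleration (2~4 weeks)
+
+> Use the *failure patterns* caught by the Verifier (B) as training data and **bake the behavior itself into the model weights.**
+> Prompt hardening never ends; training transfers the behavior in one go. This was the original intent of byOPUS.
+
+### C1 Data curation (1 week)
+- **Extracted automatically from the Verifier log (B.4)**:
+  - behavior pairs: wrong-argument or guessed call → Verifier hint → correct call
+  - behavior sequences: "call find_tags first, use the P6-1 format for area, doubt yourself on an empty result"
+- Operator review (human in the loop): 50~200 curated samples
+
+### C2 SFT-LoRA (1 week)
+- **Base**: bf16 dense (Qwen2.5-7B-Instruct, Qwen3-8B bf16, or a 27B dense model; to be evaluated separately).
+  ⚠️ A (production) is the 35B-A3B MoE while LoRA training runs on a dense model, so we accept that **the production base and the training base are separate**.
+  → To bring the trained dense adapter into production, either (a) add the dense base to production as a *separate line*, or (b) transfer the behavior to the 35B model via LoRA-distill. *Which is more cost-effective is decided when C completes.*
+- Target modules: attention-only (q/k/v/o) in the first pass, avoiding MoE experts.
+- `r=32~64`, rslora, gradient checkpointing (unsloth).
+
+### C3 DPO/GRPO (1~2 weeks)
+- **Reward = Verifier verdict + operator 👍/👎** → preference pairs → DPO.
+- Training goal: bake **calibration** ("do not answer when I do not know") into the weights themselves.
+- Avoid the risk of self-training collapse by relying on human and Verifier signals.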
+
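One way the "Verifier verdict + operator 👍/👎 → preference pairs" step could look, assuming each log record carries the original prompt and an optional operator vote. The `prompt` and `operator_vote` fields and the `build_preference_pairs` helper are hypothetical; the prompt/chosen/rejected layout is the one preference-tuning tooling commonly expects, not something the plan prescribes.

```python
import json
from typing import Dict, Iterable, List


def build_preference_pairs(records: Iterable[dict]) -> List[Dict[str, str]]:
    """Turn Verifier log records into (prompt, chosen, rejected) preference pairs."""
    pairs = []
    for r in records:
        if not r.get("passed_call"):
            continue                      # never self-corrected: nothing to prefer
        if r.get("operator_vote") == "down":
            continue                      # operator rejected even the passing call
        pairs.append({
            "prompt": r["prompt"],
            "chosen": json.dumps(r["passed_call"], sort_keys=True),
            "rejected": json.dumps(r["failed_call"], sort_keys=True),
        })
    return pairs
```

Dropping records with a 👎 keeps a call that merely passed the rules, but was still wrong for the operator, from being rewarded.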
+### C gate
+- Zero regressions on the Phase 0 eval set plus the newly added "invention" category, and an invention rate of 0 → deploy.
+
+---
+
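+## Phase D: role-separation agenda (discussed in parallel, decision deferred)
+
+For operational-crisis and safety-critical responses only, the operator toggles an **"enhanced mode"** → call a frontier cloud model (Claude/GPT-4).
+
+- Cost < potential accident cost of a fabrication (safety-critical industry)
+- Ordinary responses stay local (A+B+C) and only enhanced mode uses the cloud → cost stays controllable
+- The policy on sending data off-site needs a separate review
+- Decision: settle adoption in a two-hour decision meeting once Phases A and B are stable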
+
+---
+
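+## Decision gates (condensed)
+
+```
+Phase -1 (seed cleanup)             ✅ done
+Phase 0  (eval set, model ladder)   ✅ done → conclusion: 8B is insufficient
+        ↓
+Phase B  (Verifier MVP, 3~5d)   ← top priority
+        ↓
+Phase A  (35B production switch, 1d, can run in parallel with B)
+        ↓
+Phase C  (LoRA acceleration, 2~4w), with the Verifier log as training input
+        ↓
+Phase D  (role-separation decision, 2h), after A and B stabilize
+```
+
+Each phase starts only after the previous gate passes, except that A may run in parallel with B. The Verifier (B) is **shared infrastructure** for every later phase.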
+
+---
+
+## Timeline (revised)
+
+| Phase | Work | Duration | Notes |
+|---|---|---|---|
+| -1 | Seed cleanup | ✅ done | knowledge/ built |
+| 0  | Eval set, runner, model ladder | ✅ done | 8B confirmed unfit for production |
+| **B** | Verifier MVP (R1~R4) | **3~5d** | Middleware + rule code + log accumulation |
+| A  | 35B-A3B production switch + opencode alignment | 1d | In parallel with B |
+| C1 | Data curation (Verifier log + review) | 1w | |
+| C2 | SFT-LoRA (bf16 dense base) | 1w | Production and training bases separate |
+| C3 | DPO (calibration) | 1~2w | Verifier and 👍/👎 reward |
+| D  | Role-separation decision meeting | 2h | After A and B stabilize |
+
+---
+
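+## Operational monitoring (metrics added by B)
+
+| Metric | Meaning |
+|---|---|
+| **Verifier reject ratio** (per rule / per category) | Indicator of invention pressure. The lower it gets, the healthier the model and system |
+| **Pass rate after retry** | Self-correction effect. The closer to 100%, the better the hint quality |
+| **Fabrication slip-through** (manual audit) | Partial-truth fabrications the Verifier missed. A signal to add rules |
+| Verifier log accumulation rate | Supply rate of LoRA data |
+| (If D is adopted) enhanced-mode call ratio and cost | Tracks cloud dependence |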
+
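The first two metrics in the table can be derived from per-request events. The sketch below assumes a hypothetical event shape with a `rejected_rule` field (None when nothing was rejected) and a `passed_after_retry` flag; neither name comes from the plan.

```python
from collections import Counter
from typing import Iterable, Optional


def verifier_metrics(events: Iterable[dict]) -> dict:
    """Compute reject ratio, pass rate after retry, and per-rule reject counts."""
    events = list(events)
    rejected = [e for e in events if e.get("rejected_rule")]
    recovered = [e for e in rejected if e.get("passed_after_retry")]
    retry_pass_rate: Optional[float] = (
        len(recovered) / len(rejected) if rejected else None  # undefined without rejects
    )
    return {
        "reject_ratio": len(rejected) / len(events) if events else 0.0,
        "retry_pass_rate": retry_pass_rate,
        "rejects_by_rule": dict(Counter(e["rejected_rule"] for e in rejected)),
    }
```

Reporting the pass rate as `None` when there were no rejects avoids showing a misleading 0% or 100% on a quiet day.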
+---
+
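+## v1 self-critique (explicit)
+
+1. **"RAG is size-independent"** → **Wrong.** Only L1 (lookup) is independent. L2~L4 (when, what, faithful use) depend on size. Corrected by the user with production data.
+2. **"Phase 1 training-free levers alone solve 80%"** → **Wrong.** True in the controlled eval, but about 50% in the messy production environment. The controlled-vs-production gap was underestimated.
+3. **"small dense + a good system = enough"** → **Wrong.** The 8B model falls short on L2~L4 and is unfit as the production operator brain.
+4. **"The Verifier belongs in Phase 2 (mid-term, auxiliary)"** → **Wrong.** The Verifier gives the most leverage right away. v2 moves it to *first in line*.
+5. **"Production and training share the same base model"** → **Reconsidered.** A = 35B MoE, LoRA training = dense → separation acknowledged as possible.
+6. **"Role separation not considered"** → **Fixed.** Put on the agenda as Phase D.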
+
+---
+
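+## Next move
+
+**Start Phase B (Verifier MVP).** Write the R1~R4 implementation work order in a form that can be delegated to another LLM →
+once it is done, immediately try to reproduce the invention in opencode → confirm the Verifier blocks it → switch production to A (35B).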
--- a/plans/P&ID-추출-PREFIX-DB-수정플랜-bySonnet.md
+++ b/plans/P&ID-추출-PREFIX-DB-수정플랜-bySonnet.md
@@ -0,0 +1,735 @@
+# P&ID Extraction PREFIX Classification: Plan to Introduce the `tag_dcs` Column
+
+> **Date written**: 2026-05-27  
+> **Author**: Sonnet4.6  
+> **Purpose**: Add a `tag_dcs BOOLEAN` column to both `pid_prefix_rules` and `pid_equipment` so that  
+> field instruments and DCS tags (DCS function blocks) are distinguished from the **start** of P&ID extraction.
+
+---
+
+> ## Detailed diagnosis report (cross-checked against the actual code)
+>
+> **Diagnosis date**: 2026-05-27  
+> **Diagnosed by**: big-pickle  
+> **Method**: the 8 steps of `diagnosis-checklist.md`: read the whole codebase, then cross-check
+>
+> ---
+>
+> ### Files read (STEP 3: code reading)
+>
+> | Layer | File | Notes |
+> |--------|------|------|
+> | Domain | `PidPrefixRule.cs`, `PidEquipment.cs` | Current state of the entities |
+> | DTO | `PidPrefixRuleDto.cs`, `PidEquipmentDto.cs` | Record definitions |
+> | Interface | `IExperionServices.cs` | IPidExtractorService signatures |
+> | Service | `PidExtractorService.cs` (all 1105 lines) | **Core**: MatchCategory/ClassifyTagClass/Export/Import/CRUD |
+> | DbContext | `ExperionDbContext.cs` (DDL + Fluent API + Seed + Views) | DDL, 46 seed prefixes, v_pump_signal_map |
+> | Controller | `PidController.cs` | Response shape of GetPrefixRules/GetEquipment |
+> | Web UI | `pid.js`, `panes/pid.html`, `css/pid.css` | Prefix rules UI + equipment table |
+> | Python MCP | `server.py` (entire file) | upsert_pid_connection, _classify_pid_tag, _DB_SCHEMA, trace_connections |
+> | Python | `worker/sql_prompt.py` | NL2SQL DB_SCHEMA |
+> | Prompt | `prompts/plant_context.md` | LLM system prompt |
+> | Other | `validators.py`, `infer.py` | Confirmed unaffected |
+>
+> ### Call hierarchy (STEP 4)
+>
+> ```
+> [Prefix Rule CRUD]
+>   POST/PUT/GET /api/pid/prefix-rules → PidController
+>     → PidExtractorService.Get/Create/Update/DeletePrefixRulesAsync
+>       → DbContext.PidPrefixRules → InvalidateRulesCache()
+>
+> [P&ID extraction]
+>   ExtractFromStreamAsync()
+>     → MatchCategoryAsync(tagNo)      ← returns string? only (no tagDcs)
+>     → ClassifyTagClass(tagNo, cat, hasLink)  ← ISA heuristic
+>     → saves PidEquipment { Category, TagClass, ... }
+>
+> [Category Backfill]
+>   ApplyCategoriesToExistingAsync()
+>     → MatchCategoryAsync() + ClassifyTagClass()  (2 call sites, lines 902/922)
+>
+> [CSV/Excel Export]
+>   ExportToCsvAsync()    ← 14 columns, TagClass included
+>   ExportToExcelAsync()  ← 17 columns (col16=TagClass, col17=id)
+>   ImportFromExcelAsync() ← col17=id → detected as hasIdCol
+>
+> [MCP Python]
+>   upsert_pid_connection()  ← direct SQL (UPDATE/INSERT)
+>   _classify_pid_tag()      ← own prefix constants (no DB lookup)
+>   trace_connections()      ← SELECT tag_no, from_tag, to_tag, role
+> ```
+>
+> ---
+>
+> ### 🔴 HIGH: fix immediately (causes real failures)
+>
+> #### H1. Excel column 17 collision: adding `tag_dcs` there overwrites `id` (the stable key) and loses data
+>
+> **Problem**: The draft of plan §2.4 change 4 specified "add an Excel column (column 17)", but `ExportToExcelAsync` currently uses col17 for the **stable round-trip key `id`**.
+>
+> **Evidence**:
+> - `PidExtractorService.cs:535`: `worksheet.Cells[1, 17].Value = "id"`
+> - `PidExtractorService.cs:623`: `hasIdCol` detection checks `ws.Cells[1, 17].Text` == `"id"`
+> - `PidExtractorService.cs:684-691`: the id-based in-place UPDATE runs only when `hasIdCol=true`
+>
+> **Impact**: If col17 holds `"DCS태그"`, then `hasIdCol=false` → **every row falls back to TagNo matching (old format)**. When one TagNo has multiple paths, all of them are overwritten with the same values, so **multi-path data is lost**. Manually corrected `connection_locked` rows are also at risk of being reset.
+>
+> **Fix**:
+> 1. Put `tag_dcs` in **col18** and keep col17=`id`
+> 2. `ExportToExcelAsync:535`: add `worksheet.Cells[1, 18].Value = "DCS태그"`
+> 3. `ExportToExcelAsync:562-568`: add col18 to the row write
+> 4. `ImportFromExcelAsync`: read col18 and set `e.TagDcs` in `Apply()`
+> 5. Reflect col18 in the Excel import description of plan §2.4 change 5
+>
+> ---
+>
+> #### H2. `upsert_pid_connection`: needs a boolean argument and changes across all of its SQL
+>
+> **Problem**: This is the only path where MCP manipulates `pid_equipment` with direct SQL. Adding the `tag_dcs` column requires changing all three statements (SELECT/UPDATE/INSERT) and handling the Python boolean type.
+>
+> **Evidence**: `mcp-server/server.py:989-1101`
+>
+> **Change items in detail**:
+>
+> | Location | Current code | Change |
+> |------|----------|------|
+> | Function signature (990-997) | `tag_class: str \| None = None` | Add `tag_dcs: bool \| None = None` |
+> | `_n()` handling (1035) | `tag_class = _n(tag_class)` | `tag_dcs = bool(tag_dcs) if tag_dcs is not None else None`; `_n()` is str-only, so bool is handled separately |
+> | `_SNAP` (1037) | 9 entries | Add `"tag_dcs"` (10 entries) |
+> | SELECT snapshot (1040) | `SELECT tag_no, ..., connection_locked` | Add `tag_dcs` |
+> | UPDATE SET (1078-1084) | `tag_class=COALESCE(%s, tag_class)` | Add `tag_dcs=COALESCE(%s, tag_dcs)` (COALESCE works with boolean as well) |
+> | INSERT column list (1094-1096) | `(tag_no, ..., tag_class, connection_locked)` | Add `tag_dcs`; a `None` parameter is sent as NULL, which does not trigger the column DEFAULT and would violate `NOT NULL`, so insert `COALESCE(%s, FALSE)` |
+> | INSERT params (1101) | 9 params | +1 param |
+>
+> **Risk**: In `COALESCE(%s, tag_dcs)`, `tag_dcs=False` is passed through as `False`, so an explicit false is stored correctly. Take care around `_n()`: a boolean must not go through `str(v)` followed by `strip()`.
+>
+> ---
+>
+> ### 🟠 MED: conditional failures (occur in specific situations)
+>
+> #### M1. `_classify_pid_tag`: DCS prefixes and instrument prefixes defined in two places (sync risk)
+>
+> **Problem**: `server.py:_classify_pid_tag()` (257-270) never consults the DB table `pid_prefix_rules`; it classifies with its own Python constants (`_PID_EQUIPMENT_PREFIX`, `_PID_INSTRUMENT_FIRST`). If plan §2.9 change 1 adds a `_DCS_PREFIXES` set here, **the C# seed and the Python constant must be kept in sync by hand**.
+>
+> **Evidence**:
+> - `mcp-server/server.py:257-270`: `_classify_pid_tag()` does no DB lookup and classifies by its own prefixes
+> - `mcp-server/server.py:263`: `if prefix in _PID_EQUIPMENT_PREFIX:`
+> - Plan §2.9 change 1: `_DCS_PREFIXES = {"FIC","TIC","PIC","LIC","FY","TY","PY","LY","FV","TV","PV","LV"}`
+> - C# seed: `ExperionDbContext.cs:632-696`, the prefix list stored in the DB
+>
+> **Impact**: The `tag_dcs` verdicts of the DXF parsing result (server.py) and of the DB extraction (C#) can disagree. For example, adding or removing a DCS prefix on the Python side only makes the DXF extraction and the DB extraction classify differently.
+>
+> **Fix**: The `tag_dcs` field of `_classify_pid_tag` is used **only for DXF extraction results**, and C# decides again when saving to the DB, so a temporary mismatch is tolerable. Even so, the Phase 6 verification must check DXF ↔ DB classification consistency. Alternatively, consider moving `_DCS_PREFIXES` into a separate shared module.
+>
+> ---
+>
+> #### M2. `ClassifyTagClass`: ambiguous override priority (tag_dcs vs hasExperionLink)
+>
+> **Problem**: Plan §4 proposes `tag_dcs=TRUE → TagClass='system'` with **the prefix rule as ground truth**. The current `ClassifyTagClass` (812-831), however, treats `hasExperionLink` (an Experion DB link exists) as the top-priority deciding signal. The two conflict in the boundary case where `tag_dcs=FALSE` but `hasExperionLink=true`.
+>
+> **Evidence**: `PidExtractorService.cs:817-818`, currently first priority: `if (hasExperionLink) return TagClassSystem;`
+>
+> ```csharp
+> // Boundary case: the prefix is FT (transmitter, tag_dcs=FALSE) but an Experion DB link exists
+> // by tag_dcs: "field"   vs   by hasExperionLink: "system"
+> // Which one wins?
+> ```
+>
+> **Impact**: If the existing behavior (a field instrument becomes `TagClass="system"` because `hasExperionLink=true`) survives the introduction of `tag_dcs`, the new column loses half of its purpose.
+>
+> **Fix**: make the design decision explicit:
+> - **Option A** (plan §4): `tag_dcs=TRUE` forces `TagClassSystem`. In boundary cases `tag_dcs` wins.
+> - **Option B**: `tag_dcs` is an informational flag and `TagClass` keeps its existing logic. `tag_dcs` serves only as a fast prefix-based filter.
+>
+> ---
+>
+> #### M3. Seed data: DCS classification of the 46 prefixes, and missing compound forms
+>
+> **Problem**: The 46 seed prefixes in `ExperionDbContext.cs:632-696` must be classified as DCS or not, but the plan's DCS list contains only the 12 basic forms (FIC/TIC/PIC/LIC/FY/TY/PY/LY/FV/TV/PV/LV). Compound prefixes carrying the ISA succeeding letters `{I, C, A, Q, Y, R}` (FICQ, FICA, TICQ, PICA, LICA, FICR, ...) are not in the seed.
+>
+> **Evidence**:
+> - `PidExtractorService.cs:803`: `_systemFuncLetters = {'I', 'C', 'A', 'Q', 'Y', 'R'}`
+> - Current seed: only `FIC`, `TIC`, `PIC`, `LIC`; no `FICQ`, `FICA`, `TICQ`, ...
+> - `PidExtractorService.cs:793`: `MatchCategoryAsync` uses `StartsWith` matching, so `FICQ-6113` matches the `FIC` rule and gets category = 'instrument', but because the method returns only the category, the rule's tag_dcs = TRUE is not carried over
+> - Acknowledged in plan §5 ("suffix prefixes such as FICQ")
+>
+> **Impact**: If the Phase 2.1 UPDATE SQL filters only on `prefix IN ('FIC',...)`, any rule row for `FICQ` and the like is not updated and stays at `tag_dcs=FALSE` (the default).
+>
+> **Fix**: either make the UPDATE SQL prefix-based (`StartsWith` style):
+> ```sql
+> UPDATE pid_prefix_rules SET tag_dcs = TRUE
+> WHERE prefix IN ('FIC','TIC','PIC','LIC','FY','TY','PY','LY','FV','TV','PV','LV')
+>    OR prefix LIKE 'FI%'  -- FIC, FICA, FICQ, ... (and any other rule starting with FI)
+>    OR prefix LIKE 'TI%'  -- TIC, TICA, TICQ, ... (and any other rule starting with TI)
+> ```
+> or add the compound forms explicitly to the seed INSERT. Note that because rule matching is `StartsWith`, a single `FI` rule would already cover `FIC`/`FICA`/`FICQ`/`FIR` and so on, at the cost of also marking every other FI-prefixed tag as DCS; likewise the existing `FIC` rule already covers its own compound forms.
+>
+> ---
+>
+> ### 🟡 LOW: no effect on behavior (maintainability)
+>
+> #### L1. `ApplyCategoriesToExistingAsync`: backfill omits `tag_dcs`
+>
+> **Problem**: `ApplyCategoriesToExistingAsync` (884-929) backfills existing rows where `Category == null` and rows where `TagClass == null`. Once `tag_dcs` is introduced, this backfill has to include it.
+>
+> **Evidence**: `PidExtractorService.cs:884-929` has two batch loops:
+> - Loop 1 (890-907): rows with Category = null → MatchCategoryAsync + ClassifyTagClass
+> - Loop 2 (912-927): rows with Category='instrument' AND TagClass = null → ClassifyTagClass
+>
+> **Fix**: Look up and set `tag_dcs` in each loop as well. If a `tagDcs` parameter is added to the `ClassifyTagClass` signature, update these call sites (902, 922) together.
+>
+> ---
+>
+> #### L2. `MatchCategoryAsync`: return a `(category, tagDcs)` tuple, or split the lookup
+>
+> **Problem**: `MatchCategoryAsync` (787-795) currently returns `string?` (category only). Getting `tag_dcs` needs a separate API. The issue is acknowledged in plan §2.4 change 1.
+>
+> **Evidence**:
+> - `PidExtractorService.cs:90`: `var category = await MatchCategoryAsync(item.TagNo);`
+> - `PidExtractorService.cs:787-795`: returns `string?`
+> - `GetRulesCachedAsync` (754) already caches `List<PidPrefixRule>`, so `tag_dcs` is available from the same cache
+>
+> **Fix**: three options
+> 1. `MatchCategoryAsync` → `(string? category, bool tagDcs)` tuple: large refactor (3 call sites)
+> 2. Add a separate `ResolveTagDcsAsync(tagNo)`: least invasive
+> 3. Query the result of `GetRulesCachedAsync()` directly: direct cache access → breaks encapsulation
+>
+> ---
+>
+> #### L3. `PidEquipmentDto`: not reflected in TagMappingService
+>
+> **Problem**: `PidEquipmentDto` (3-19) has neither `TagClass` nor `TagDcs`. The mapping tab uses this DTO, so showing the DCS flag in the mapping UI requires adding it.
+>
+> **Evidence**: `src/Core/Application/DTOs/PidEquipmentDto.cs:3-19`: 16 fields, no TagClass/TagDcs
+>
+> **Fix**: optional; can be skipped if the mapping tab does not need to display it.
+>
+> ---
+>
+> #### L4. `_DB_SCHEMA` / `sql_prompt.py`: pid_equipment not registered
+>
+> **Problem**: Neither `_DB_SCHEMA` (server.py:678) nor `DB_SCHEMA` (sql_prompt.py:9) includes the `pid_equipment` table. Even if the LLM converts a question such as "How many DCS tags are there?" into SQL via NL2SQL, the query cannot be executed.
+>
+> **Evidence**:
+> - `mcp-server/server.py:678-694`: only history_table, realtime_table, tag_metadata, event_history_table
+> - `mcp-server/worker/sql_prompt.py:9-104`: same
+> - Plan §2.9 change 2 and §2.9 change 3 mention editing server.py / sql_prompt.py
+>
+> **Fix**: add it as described in plan §2.9 change 2/3 and §2.10. When extending `_DB_SCHEMA`, keep the current architecture to guard against SQL injection: table and column names appear only in the LLM prompt, and actual queries stay restricted to MCP tools.
+>
+> ---
+>
+> ### 🔵 No impact (No Change)
+>
+> | File | Reason | Evidence |
+> |------|------|------|
+> | `IExperionServices.cs` | Interface signatures use the DTO types as they are | Picked up automatically from the DTO change |
+> | `PidController.cs:GetPrefixRules` | Already returns a camelCase anonymous object | No structural change; only a one-line `tagDcs = r.TagDcs` member is added |
+> | `ExperionDbContext.cs:v_pump_signal_map` | `FROM pid_equipment ft` reads only `ft.from_tag, ft.tag_no, ft.category` | tag_dcs not used |
+> | `server.py:trace_connections` | SELECT `tag_no, from_tag, to_tag, role` | tag_dcs not queried |
+> | `validators.py:37` | `SELECT DISTINCT tag_no FROM pid_equipment` | Unrelated to tag_dcs |
+> | `instrument_inference/infer.py` | Standalone module, no DB dependency | The plan's §2.11 judgment is correct |
+> | `SeedSubAreaAsync` SQL | `FROM pid_equipment WHERE role ILIKE '%공용%'` | Unrelated to tag_dcs |
+>
+> ---
+>
+> ### Summary of files to change (20 items in total)
+>
+> | Grade | File | Change |
+> |------|--------|---------|
+> | 🔴 H1 | `PidExtractorService.cs` | Move Excel col17→col18 + protect Import hasIdCol + TagDcs in Apply() |
+> | 🔴 H2 | `server.py:upsert_pid_connection` | `tag_dcs` bool argument + 3 SQL statements (SELECT/UPDATE/INSERT) |
+> | 🟠 M1 | `server.py:_classify_pid_tag` | Add `_DCS_PREFIXES`; document the sync with the C# seed |
+> | 🟠 M2 | `PidExtractorService.cs:ClassifyTagClass` | Decide the weight of `tag_dcs` and change the signature |
+> | 🟠 M3 | `ExperionDbContext.cs:seed` | DCS classification of the 46 prefixes + compound forms (FICQ etc.) |
+> | 🟡 L1 | `PidExtractorService.cs:ApplyCategoriesToExistingAsync` | Add tag_dcs to the backfill |
+> | 🟡 L2 | `PidExtractorService.cs:MatchCategoryAsync` | Add a return path for tagDcs |
+> | 🟡 L3 | `PidEquipmentDto.cs` | Optional TagDcs addition |
+> | 🟡 L4 | `server.py:_DB_SCHEMA` + `sql_prompt.py` | Add pid_equipment + tag_dcs description |
+> | 🟢 | `PidPrefixRule.cs` | Add `TagDcs` property |
+> | 🟢 | `PidEquipment.cs` | Add `TagDcs` property |
+> | 🟢 | `PidPrefixRuleDto.cs` | `bool TagDcs` in 3 records |
+> | 🟢 | `PidController.cs:GetPrefixRules/GetEquipment` | Add `tagDcs` field |
+> | 🟢 | `PidExtractorService.cs:ExtractFromStreamAsync` | Save TagDcs during extraction |
+> | 🟢 | `PidExtractorService.cs:Create/UpdatePrefixRuleAsync` | Pass TagDcs through |
+> | 🟢 | `PidExtractorService.cs:ExportToCsvAsync` | Add the column to header and rows |
+> | 🟢 | `ExperionDbContext.cs` | DDL + `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` + Fluent API |
+> | 🟢 | `wwwroot/js/pid.js` | Prefix UI checkbox + Add/Update body |
+> | 🟢 | `wwwroot/panes/pid.html` | Optional column header |
+> | 🟢 | `prompts/plant_context.md` | Add tag_dcs description |
+>
+> ---
+
+---
+
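+## 0. Background and problem
+
+### Problem with the current structure
+
+Two kinds of tags are currently mixed under `pid_prefix_rules.category = 'instrument'`:
+
+| Kind | Example prefixes | Actual meaning |
+|------|------------|---------|
+| **Field instrument** (field) | FT, PT, LT, TT, FCV, PCV, PSV, XV, FG, PG | Physical device installed in the field |
+| **DCS function block** (system) | FIC, TIC, PIC, LIC, FY, TY, PY, LY | Computation block inside the DCS/SCADA, no physical device |
+
+The existing `tag_class = 'field'/'system'` column was meant to tell them apart, but:
+- it is decided in **post-processing after extraction** (ISA succeeding-letter analysis + Experion link presence)
+- it is not visible at all in the **PREFIX definition UI**, so operators cannot tell the two apart
+- `hasExperionLink` is misused as a proxy for "is a DCS tag": FT transmitters are also linked to Experion, so they are misclassified as 'system'
+- when the LLM queries pid_equipment it fetches all instruments at once and gets confused
+
+### Goal
+
+Add `tag_dcs BOOLEAN` to the `pid_prefix_rules` table → the DCS flag is stated from the moment a PREFIX classification is defined.  
+Propagate the same column to the `pid_equipment` table → the flag is kept across all extraction results.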
+
+---
+
+## 1. DCS vs field classification criteria
+
+### Guiding principle: does a physical device exist?
+
+FT-6113 (a transmitter) is a **field instrument** even though it is linked to Experion.  
+An Experion link means the DCS reads the field signal; it does not turn the device itself into DCS software.
+
+### DCS tags (`tag_dcs = TRUE`): no physical device, only a DCS function block
+
+| Prefix | Description | ISA succeeding letters |
+|--------|------|------------|
+| FIC, FICA, FICQ, FICR | Flow Indicator Controller (+ Alarm/Totalizer/Recorder variants) | C, A, Q, R |
+| TIC, TICA, TICQ | Temperature Indicator Controller variants | C, A, Q |
+| PIC, PICA | Pressure Indicator Controller variants | C, A |
+| LIC, LICA | Level Indicator Controller variants | C, A |
+| FY, TY, PY, LY | Relay/Converter/Computing | Y |
+| FV, TV, PV, LV | Valve function block output (DCS output only, distinct from the physical FCV) | V(fb) |
+
+> `_systemFuncLetters = {I, C, A, Q, Y, R}`: covers all ISA-standard control-system succeeding letters
+
+> **Note**: FCV/PCV/LCV/TCV are physical control valves → `tag_dcs = FALSE` (stay field)
+
+### Field instruments (`tag_dcs = FALSE`): physical devices
+
+| Prefix | Description |
+|--------|------|
+| FT, TT, PT, LT | Primary measurement transmitters; field regardless of Experion link |
+| FG, TG, PG, LG | Gauges |
+| FCV, TCV, PCV, LCV | Control valves (physical devices) |
+| PSV | Safety valve |
+| XV | Shutoff valve |
+| VIP, VIT | Vibration probe/transmitter |
+| DP | Differential pressure gauge |
+| BV | Ball/butterfly valve |
+
+---
+
+## 2. Full list of affected areas
+
+### 2.1 Database (4 places)
+
+| Target | Change |
+|------|---------|
+| `pid_prefix_rules` table | Add column `tag_dcs BOOLEAN NOT NULL DEFAULT FALSE` |
+| `pid_prefix_rules` seed | UPDATE DCS prefixes to `tag_dcs = TRUE` |
+| `pid_equipment` table | Add column `tag_dcs BOOLEAN NOT NULL DEFAULT FALSE` |
+| `pid_equipment` existing rows | Backfill based on prefix-rule StartsWith matching |
+
+**Migration SQL** (added to the Boot DDL, runs automatically on restart):
+```sql
+-- Step 1: add the column to pid_prefix_rules
+ALTER TABLE pid_prefix_rules
+  ADD COLUMN IF NOT EXISTS tag_dcs BOOLEAN NOT NULL DEFAULT FALSE;
+
+-- Step 2: mark DCS prefixes (basic forms; compound forms are covered in Step 4)
+UPDATE pid_prefix_rules
+SET tag_dcs = TRUE
+WHERE prefix IN ('FIC','TIC','PIC','LIC','FY','TY','PY','LY','FV','TV','PV','LV');
+
+-- Step 3: add the column to pid_equipment
+ALTER TABLE pid_equipment
+  ADD COLUMN IF NOT EXISTS tag_dcs BOOLEAN NOT NULL DEFAULT FALSE;
+
+-- Step 4: backfill existing rows with StartsWith matching (compound forms FICQ/FICA etc. included automatically)
+-- pid_equipment.instrument_type LIKE 'FIC%' → matches FIC, FICA, FICQ, FICR
+UPDATE pid_equipment pe
+SET tag_dcs = TRUE
+FROM pid_prefix_rules pr
+WHERE pe.instrument_type LIKE (pr.prefix || '%')
+  AND pr.tag_dcs = TRUE;
+```
+
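The Step 4 backfill semantics can be tried out with an in-memory SQLite database standing in for PostgreSQL. This is only a sketch under stated assumptions: SQLite stores booleans as integers, the `UPDATE ... FROM` join is rewritten as an equivalent `EXISTS` subquery, and the `FVT` rule row is hypothetical, added to show one edge case.

```python
import sqlite3


def backfill_demo(rules, equipment):
    """Apply the Step 4 prefix backfill to sample rows and return {tag_no: tag_dcs}."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE pid_prefix_rules (prefix TEXT, tag_dcs INTEGER NOT NULL DEFAULT 0)")
    con.execute("CREATE TABLE pid_equipment (tag_no TEXT, instrument_type TEXT, "
                "tag_dcs INTEGER NOT NULL DEFAULT 0)")
    con.executemany("INSERT INTO pid_prefix_rules VALUES (?, ?)", rules)
    con.executemany("INSERT INTO pid_equipment (tag_no, instrument_type) VALUES (?, ?)", equipment)
    # Same predicate as Step 4: any DCS rule whose prefix starts the instrument_type.
    con.execute("""
        UPDATE pid_equipment SET tag_dcs = 1
        WHERE EXISTS (
            SELECT 1 FROM pid_prefix_rules pr
            WHERE pr.tag_dcs = 1
              AND pid_equipment.instrument_type LIKE (pr.prefix || '%'))
    """)
    rows = con.execute("SELECT tag_no, tag_dcs FROM pid_equipment").fetchall()
    con.close()
    return dict(rows)


result = backfill_demo(
    rules=[("FIC", 1), ("FT", 0), ("FV", 1), ("FCV", 0), ("FVT", 0)],  # FVT is hypothetical
    equipment=[("FIC-6113", "FIC"), ("FICQ-6113", "FICQ"), ("FT-6113", "FT"),
               ("FCV-6113", "FCV"), ("FVT-1", "FVT")],
)
```

With these sample rows, `FICQ-6113` inherits the flag from `FIC` as intended. The hypothetical `FVT-1` is also marked, because the SQL accepts any matching DCS rule, whereas a longest-prefix lookup would pick the `FVT` rule; if such overlapping rules can exist, the two paths may disagree.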
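+> **Compound prefix handling**: `MatchCategoryAsync` already matches by StartsWith, so on the C# extraction path
+> `FICQ-6113` resolves to the `FIC` rule, and `ResolveTagDcsAsync` (§2.4) picks up that rule's `tag_dcs=TRUE` in the same way. The backfill SQL applies the same approach.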
+
+---
+
+### 2.2 C# domain entities (2 files)
+
+#### `src/Core/Domain/Entities/PidPrefixRule.cs`
+```csharp
+// Added
+[Column("tag_dcs")]
+public bool TagDcs { get; set; } = false;
+```
+
+#### `src/Core/Domain/Entities/PidEquipment.cs`
+```csharp
+// Added (below tag_class)
+[Column("tag_dcs")]
+public bool TagDcs { get; set; } = false;
+```
+
+---
+
+### 2.3 DTOs (1 file, 3 records)
+
+#### `src/Core/Application/DTOs/PidPrefixRuleDto.cs`
+
+```csharp
+// After the change (TagDcs added)
+public record PidPrefixRuleDto(
+    int Id, string Prefix, string Category, bool TagDcs,
+    string? Description, int SortOrder, DateTime CreatedAt);
+
+public record CreatePidPrefixRuleRequest(
+    string Prefix, string Category,
+    bool TagDcs = false, string? Description = null, int SortOrder = 0);
+
+public record UpdatePidPrefixRuleRequest(
+    string Prefix, string Category,
+    bool TagDcs = false, string? Description = null, int SortOrder = 0);
+```
+
+> `PidEquipmentDto` (`PidEquipmentDto.cs`) is used only by the mapping tab and has no `TagClass` either, so  
+> it is out of scope this time (optional; add it separately if the mapping tab needs a DCS badge).
+
+---
+
+### 2.4 Application Services (1 file)
+
+#### `src/Core/Application/Services/PidExtractorService.cs`
+
+**Change 1**: `tag_dcs` lookup path: add a separate `ResolveTagDcsAsync(tagNo)` (decision for L2)
+```csharp
+// Keep the MatchCategoryAsync signature and add a separate method (least invasive)
+private async Task<bool> ResolveTagDcsAsync(string tagNo)
+{
+    var rules = await GetRulesCachedAsync();
+    // StartsWith matching (compound forms included automatically);
+    // ordinal, case-insensitive comparison avoids culture-dependent results
+    var rule = rules
+        .Where(r => tagNo.StartsWith(r.Prefix, StringComparison.OrdinalIgnoreCase))
+        .OrderByDescending(r => r.Prefix.Length)   // longest prefix wins
+        .FirstOrDefault();
+    return rule?.TagDcs ?? false;
+}
+```
+
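The longest-prefix rule in `ResolveTagDcsAsync` can be modeled in a few lines of Python to check edge cases before touching the C# service. The `resolve_tag_dcs` name and the sample rule list are illustrative assumptions, not the seed data.

```python
from typing import Iterable, Tuple


def resolve_tag_dcs(tag_no: str, rules: Iterable[Tuple[str, bool]]) -> bool:
    """Return the tag_dcs flag of the longest rule prefix that starts tag_no."""
    upper = tag_no.upper()
    matches = [(prefix, dcs) for prefix, dcs in rules if upper.startswith(prefix.upper())]
    if not matches:
        return False                      # no rule: default to field, like `?? false`
    return max(matches, key=lambda m: len(m[0]))[1]


# Illustrative rules only; the real list lives in pid_prefix_rules.
SAMPLE_RULES = [("FIC", True), ("FT", False), ("FV", True),
                ("FCV", False), ("P", False), ("PIC", True)]
```

For example, `PIC-100` matches both `P` and `PIC`; taking the longest prefix yields the DCS flag of `PIC`, which is why the ordering step matters once short equipment prefixes coexist with instrument prefixes.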
+**Change 2**: Redesign `ClassifyTagClass()`: tag_dcs takes priority and hasExperionLink no longer decides the class (decision for M2: Option A)
+```csharp
+// New signature: tagDcs parameter added. hasExperionLink stays in the signature
+// but no longer affects the result.
+private static string? ClassifyTagClass(string tagNo, string? category, bool tagDcs, bool hasExperionLink)
+{
+    if (category != PidEquipment.CategoryInstrument) return null;
+
+    // tag_dcs == true: the prefix rule is ground truth → system
+    if (tagDcs) return PidEquipment.TagClassSystem;
+
+    // tag_dcs == false: field instrument → field.
+    // An FT transmitter linked to Experion is still field; the link itself
+    // remains available through the ExperionTagId FK.
+    return PidEquipment.TagClassField;
+}
+```
+
+**Change 3**: Fill `TagDcs` when saving extraction results
+```csharp
+var category = await MatchCategoryAsync(item.TagNo);
+var tagDcs   = await ResolveTagDcsAsync(item.TagNo);
+var tagClass = ClassifyTagClass(item.TagNo, category, tagDcs, experionTag != null);
+
+item.Category = category;
+item.TagDcs   = tagDcs;
+item.TagClass = tagClass;
+```
+
+**Change 4**: CSV/Excel export: add `TagDcs` in **col18** (protects col17=`id` ⚠️ H1)
+```csharp
+// ExportToExcelAsync
+worksheet.Cells[1, 17].Value = "id";       // unchanged: key used for hasIdCol detection
+worksheet.Cells[1, 18].Value = "DCS태그";  // newly added
+
+// row write
+worksheet.Cells[row, 17].Value = item.Id;
+worksheet.Cells[row, 18].Value = item.TagDcs ? "DCS" : "현장";
+```
+
+**Change 5**: Excel import: read col18 and set TagDcs in `Apply()`
+```csharp
+// ImportFromExcelAsync
+// hasIdCol detection: col17 = "id" (unchanged)
+var hasDcsCol = ws.Cells[1, 18].Text == "DCS태그";
+
+// inside Apply()
+if (hasDcsCol)
+{
+    var dcsVal = ws.Cells[row, 18].Text.Trim();
+    e.TagDcs = dcsVal == "DCS";
+}
+```
+
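The H1 collision came from tying meaning to a fixed column position. As an alternative worth weighing, not what the plan prescribes, columns could be located by header text. The Python sketch below models that idea; `locate_columns` and `parse_dcs_cell` are hypothetical helpers, and only the header strings `id` and `DCS태그` and the cell values `DCS`/`현장` come from the plan.

```python
from typing import Dict, List, Optional


def locate_columns(header: List[str]) -> Dict[str, Optional[int]]:
    """Return 1-based positions of the id and DCS columns, found by header text."""
    index = {name.strip(): pos + 1 for pos, name in enumerate(header)}
    return {"id": index.get("id"), "tag_dcs": index.get("DCS태그")}


def parse_dcs_cell(text: str) -> Optional[bool]:
    """Map a cell to True/False, or None when blank/unknown (leave TagDcs untouched)."""
    value = text.strip()
    if value == "DCS":
        return True
    if value == "현장":
        return False
    return None
```

Returning `None` for a blank cell differs from `dcsVal == "DCS"` in the plan, which turns a blank into `false`; which behavior is wanted for hand-edited sheets is a design choice left open here.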
+**Change 6**: `ApplyCategoriesToExistingAsync()`: add tag_dcs to both backfill loops (L1)
+```csharp
+// Loop 1 (rows with Category=null)
+item.Category = category;
+item.TagDcs   = await ResolveTagDcsAsync(item.TagNo);   // added
+item.TagClass = ClassifyTagClass(item.TagNo, category, item.TagDcs, item.ExperionTagId != null);
+
+// Loop 2 (rows with Category='instrument' AND TagClass=null)
+item.TagDcs   = await ResolveTagDcsAsync(item.TagNo);   // added
+item.TagClass = ClassifyTagClass(item.TagNo, item.Category, item.TagDcs, item.ExperionTagId != null);
+```
+
+> Method name correction: `BackfillTagClassAsync()` in the plan draft was a mistake → the actual method is `ApplyCategoriesToExistingAsync()` (line 884)
+
+**Change 7**: `CreatePrefixRuleAsync` / `UpdatePrefixRuleAsync`: store `request.TagDcs` into `rule.TagDcs`, then call `InvalidateRulesCache()` (same as the existing pattern).
+
+---
+
+### 2.5 Interface (no change)
+
+#### `src/Core/Application/Interfaces/IExperionServices.cs`
+
+The DTO record change alone carries through to the `CreatePrefixRuleAsync` / `UpdatePrefixRuleAsync` signatures. No further edits needed.
+
+---
+
+### 2.6 EF Core DbContext (1 file)
+
+#### `src/Infrastructure/Database/ExperionDbContext.cs`
+
+**Change 1**: Add the four Steps of the §2.1 migration SQL to the Boot DDL (runs automatically on restart)
+
+**Change 2**: The seed INSERT uses `ON CONFLICT DO NOTHING`, so it does not touch existing rows.  
+The Step 2 UPDATE refreshes `tag_dcs` on existing rows (executed right after, in the same Boot DDL).
+
+**Change 3**: `modelBuilder.Entity<PidPrefixRule>()`: the Column attribute maps the property automatically, so no Fluent API addition is needed.
+
+---
+
+### 2.7 Web Controllers (1 file)
+
+#### `src/Web/Controllers/PidController.cs`
+
+**Change 1**: `GetPrefixRules` returns a camelCase anonymous object, so add one member: `tagDcs = r.TagDcs`.  
+(Confirmed in the diagnosis: a camelCase anonymous object is already in use → no separate `[JsonPropertyName]` needed)
+
+**Change 2**: `CreatePrefixRule` / `UpdatePrefixRule`: `TagDcs` is added to the DTOs, so the controller needs no change.
+
+---
+
+### 2.8 Web UI (2 files)
+
+#### `src/Web/wwwroot/js/pid.js`
+
+**Change 1**: PREFIX group rendering (`pidRenderPrefixGroups`): add a DCS/현장 (field) badge to each row:
+```javascript
+// view row
+`<td><span class="badge ${r.tagDcs ? 'warn' : 'ok'}">${r.tagDcs ? 'DCS' : '현장'}</span></td>`
+// edit row
+`<input type="checkbox" id="pid-dcs-${r.id}" ${r.tagDcs ? 'checked' : ''} />`
+`<label for="pid-dcs-${r.id}">DCS</label>`
+```
+
+**Change 2**: add `tagDcs` to the request body of `pidAddPrefixRule(category)`  
+**Change 3**: add `tagDcs` to the request body of `pidUpdatePrefixRule(id)`  
+
+#### `src/Web/wwwroot/panes/pid.html`
+
+- Add a "DCS태그" column to the table header of the PREFIX classification definition panel
+
+---
+
+### 2.9 MCP Server Python (2 files)
+
+#### `mcp-server/server.py`
+
+**Change 1**: Add the `_DCS_PREFIXES` constant and include `tag_dcs` in the return value of `_classify_pid_tag()`
+```python
+# Compound forms included: extended along the ISA _systemFuncLetters
+_DCS_PREFIXES: frozenset[str] = frozenset({
+    "FIC", "FICA", "FICQ", "FICR",
+    "TIC", "TICA", "TICQ",
+    "PIC", "PICA",
+    "LIC", "LICA",
+    "FY",  "TY",  "PY",  "LY",
+    "FV",  "TV",  "PV",  "LV",
+})
+
+# Return value of _classify_pid_tag()
+return {
+    "kind": "instrument",
+    "prefix": prefix,
+    "type": type_name,
+    "tag_dcs": prefix in _DCS_PREFIXES,
+}
+```
+
+> ⚠️ M1 sync caution: `_DCS_PREFIXES` (Python) and the C# seed UPDATE list must be kept in sync by hand.  
+> Change both together. Consider splitting out a shared module (`dcs_prefixes.py`) later.
+
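Until a shared module exists, the manual sync could be backed by a small drift check. The sketch below assumes the DB rules are available as `(prefix, tag_dcs)` pairs; `find_prefix_drift` is a hypothetical helper, and it treats a Python prefix as covered when its longest matching DB rule is flagged DCS (mirroring the StartsWith matching on the C# side).

```python
from typing import Dict, Iterable, List, Set, Tuple


def find_prefix_drift(py_dcs: Set[str],
                      db_rules: Iterable[Tuple[str, bool]]) -> Dict[str, List[str]]:
    """Report DCS prefixes that only one side (Python set vs DB rules) knows about."""
    db_rules = list(db_rules)

    def db_says_dcs(prefix: str) -> bool:
        matches = [(p, dcs) for p, dcs in db_rules if prefix.startswith(p)]
        return bool(matches) and max(matches, key=lambda m: len(m[0]))[1]

    return {
        # In the Python set, but the DB would not classify it as DCS.
        "python_only": sorted(p for p in py_dcs if not db_says_dcs(p)),
        # Flagged DCS in the DB, but missing from the Python set.
        "db_only": sorted(p for p, dcs in db_rules if dcs and p not in py_dcs),
    }
```

Running such a check in CI or at MCP server startup would turn the "change both together" convention into something that fails loudly when forgotten.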
+**Change 2**: Add the `pid_equipment` table to `_DB_SCHEMA` (L4)
+```python
+_DB_SCHEMA = """
+...
+테이블: pid_equipment  (P&ID 추출 장비/계기)
+  tag_no          TEXT   - 태그번호 (예: FIC-6113, FT-6113)
+  category        TEXT   - 'instrument' / 'power_equipment' / 'storage_equipment' / ...
+  tag_dcs         BOOL   - TRUE=DCS 함수블록(FIC/TIC/PIC류), FALSE=현장 물리 계기(FT/FCV류)
+  tag_class       TEXT   - 'field'(현장) / 'system'(DCS) — tag_dcs 기반
+  instrument_type TEXT   - ISA prefix (FT/FIC/P 등)
+  from_tag        TEXT   - 연결 상류 태그
+  to_tag          TEXT   - 연결 하류 태그
+...
+"""
+```
+
+**Change 3**: `upsert_pid_connection`: add the `tag_dcs` bool argument and edit the three SQL statements (H2 confirmed)
+
+| Location | Change |
+|------|---------|
+| Function signature (line 990-997) | Add parameter `tag_dcs: bool \| None = None` |
+| `_n()` handling (line 1035) | bool does not go through `_n()`: `tag_dcs = bool(tag_dcs) if tag_dcs is not None else None` |
+| `_SNAP` list (line 1037) | Add `"tag_dcs"` |
+| SELECT snapshot (line 1040) | Add the `tag_dcs` column |
+| UPDATE SET (line 1078) | Add `tag_dcs=COALESCE(%s, tag_dcs)` |
+| INSERT columns/values (line 1094) | Add `tag_dcs`; pass the value as `COALESCE(%s, FALSE)`, because a `None` parameter becomes NULL and does not fall back to the column DEFAULT |
+
+```python
+# bool handling example
+tag_dcs_val = bool(tag_dcs) if tag_dcs is not None else None
+# COALESCE works with boolean too: passing False stores False
+```
+
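One caveat with `bool(tag_dcs)`: if the MCP layer ever delivers the argument as a string (an assumption; the plan declares it as `bool | None`), `bool("false")` evaluates to `True`. A stricter normalizer could look like this sketch, where `norm_tag_dcs` and the accepted spellings are hypothetical.

```python
from typing import Optional

_TRUE_WORDS = {"true", "t", "1", "yes"}
_FALSE_WORDS = {"false", "f", "0", "no"}


def norm_tag_dcs(value) -> Optional[bool]:
    """Normalize a tri-state flag: None stays None, bools pass through, strings are parsed."""
    if value is None:
        return None                       # lets COALESCE keep the stored value
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        word = value.strip().lower()
        if word in _TRUE_WORDS:
            return True
        if word in _FALSE_WORDS:
            return False
    raise ValueError(f"cannot interpret {value!r} as tag_dcs")
```

Raising on anything unrecognized keeps a malformed tool call from silently flipping a tag to DCS.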
+#### `mcp-server/worker/sql_prompt.py`
+
+Add `pid_equipment` to the `DB_SCHEMA` constant (L4):
+```
+테이블: pid_equipment(tag_no TEXT, category TEXT, tag_dcs BOOL, tag_class TEXT,
+                      instrument_type TEXT, from_tag TEXT, to_tag TEXT)
+※ tag_dcs=TRUE: DCS 함수블록(FIC/TIC/PIC류), FALSE: 현장 물리 계기(FT/FCV류)
+※ 연결 추적: from_tag(상류) → tag_no → to_tag(하류)
+```
+
+---
+
+### 2.10 Prompt / knowledge files (1 file)
+
+#### `prompts/plant_context.md`
+
+```markdown
+## pid_equipment.tag_dcs — 현장 계기 vs DCS 함수블록 구별
+
+- tag_dcs = TRUE: DCS 내부 함수블록 (FIC, TIC, PIC, LIC, FY, TY, PY, LY 등 compound형 포함)
+  - 물리 기기 없음. Experion DB 포인트로만 존재
+- tag_dcs = FALSE: 현장 물리 계기 (FT, PT, LT, FCV, PSV, XV 등)
+  - P&ID 도면에 기기 심벌로 표시되는 실물. Experion 연결 여부 무관하게 field
+
+쿼리 예:
+- "DCS 태그 몇 개?" → SELECT COUNT(*) FROM pid_equipment WHERE tag_dcs=TRUE
+- "현장 계기 목록" → SELECT * FROM pid_equipment WHERE tag_dcs=FALSE AND category='instrument'
+- "FIC-6113이 DCS 태그인가?" → SELECT tag_dcs FROM pid_equipment WHERE tag_no='FIC-6113'
+```
+
+---
+
+### 2.11 instrument_inference — 변경 없음
+
+`infer.py`는 독립 실행 모듈(DB 미의존). 내부 `_dcs_internal_roles` 로직은 infer 전용으로 유지.
+
+---
+
+### 2.12 영향 없음 확인 (진단 결과)
+
+| 파일 | 근거 |
+|------|------|
+| `IExperionServices.cs` | DTO 변경으로 자동 반영 |
+| `ExperionDbContext.cs:v_pump_signal_map` | `from_tag, tag_no, category`만 조회 |
+| `server.py:trace_connections` | `tag_no, from_tag, to_tag, role`만 조회 |
+| `verifier/validators.py` | `SELECT DISTINCT tag_no FROM pid_equipment` |
+| `SeedSubAreaAsync` SQL | `WHERE role ILIKE '%공용%'` |
+
+---
+
+## 3. 단계별 구현 순서
+
+### Phase 1: DB 스키마 (재기동으로 자동 적용)
+1. `ExperionDbContext.cs` Boot DDL에 §2.1 마이그레이션 SQL(Step 1~4) 추가
+2. 웹 서버 재기동 → ALTER TABLE + UPDATE 자동 실행
+
+### Phase 2: 도메인/DTO/서비스 (C# 코어)
+1. `PidPrefixRule.cs` — `TagDcs bool` 프로퍼티 추가
+2. `PidEquipment.cs` — `TagDcs bool` 프로퍼티 추가
+3. `PidPrefixRuleDto.cs` — 3개 record에 `TagDcs` 추가
+4. `PidExtractorService.cs`:
+   - `ResolveTagDcsAsync()` 신규 추가
+   - `ClassifyTagClass()` 시그니처 변경 + hasExperionLink 역할 변경
+   - `ExtractFromStreamAsync()` TagDcs 저장
+   - `ExportToExcelAsync()` col18 추가
+   - `ImportFromExcelAsync()` col18 읽기
+   - `ApplyCategoriesToExistingAsync()` 두 루프에 tag_dcs 추가
+   - `CreatePrefixRuleAsync()` / `UpdatePrefixRuleAsync()` TagDcs 전달
+5. `dotnet build` — 경고 0/에러 0 확인
+
+### Phase 3: Web Controller
+1. `PidController.cs:GetPrefixRules` — 익명객체에 `tagDcs: r.TagDcs` 추가
+
+### Phase 4: Web UI
+1. `pid.js` — PREFIX 그룹 렌더링 DCS/현장 배지 + Add/Update body
+2. `panes/pid.html` — "DCS태그" 열 헤더
+
+### Phase 5: MCP / LLM 경로
+1. `server.py` — `_DCS_PREFIXES` + `_classify_pid_tag` + `_DB_SCHEMA` + `upsert_pid_connection`
+2. `worker/sql_prompt.py` — `pid_equipment` 테이블 추가
+3. `prompts/plant_context.md` — tag_dcs 설명 추가
+
+### Phase 6: 검증
+1. `dotnet build` — 경고 0/에러 0
+2. `python3 -m py_compile mcp-server/server.py` — OK
+3. DB 확인: `SELECT tag_dcs, COUNT(*) FROM pid_equipment GROUP BY tag_dcs`
+4. Excel round trip: export → open → confirm col17=`id` is intact and the new TagDcs value sits in col18 → import → confirm `hasIdCol=true`
+5. 웹 UI: PREFIX 분류 탭 DCS/현장 배지 정상 표시
+6. `ApplyCategoriesToExistingAsync` API 호출 후 `tag_dcs` backfill 확인
+
+---
+
+## 4. 설계 결정
+
+| 항목 | 결정 | 근거 |
+|------|------|------|
+| 컬럼 타입 | `tag_dcs BOOLEAN` (별도 category 아님) | category 변경 시 뷰/필터 전파 과도. Boolean이 최소 침습적 |
+| `tag_class` 유지 | 유지 (deprecated 아님) | `ExperionTagId` FK와 함께 연결 증거 보존. `tag_dcs`는 prefix 기반 빠른 flag |
+| **M2: ClassifyTagClass 우선순위** | **tag_dcs 우선 (Option A)** | 전송기(FT)는 Experion 연결 여부와 무관하게 현장 계기. `hasExperionLink`는 DCS 함수블록 판별의 proxy로 부정확 |
+| `hasExperionLink` 역할 | TagClass 결정에서 제외 | 연결 정보는 `ExperionTagId`로 보존됨. 더 이상 TagClass 결정에 사용 안 함 |
+| FCV/PCV/LCV/TCV | `tag_dcs = FALSE` | 물리 제어밸브. DCS가 제어하지만 기기 자체는 현장 |
+| FV/TV/PV/LV | `tag_dcs = TRUE` | ISA 표준 "Valve function block output" — 물리 기기 아닌 DCS 출력 |
+| compound prefix | LIKE StartsWith 매칭으로 자동 커버 | `FICQ LIKE 'FIC%'` = TRUE. 시드에 compound형 개별 추가 불필요 |
+| Excel 열 위치 | **col18** (col17=`id` 보호) | col17 덮어쓰면 `hasIdCol=false` → 다중경로 데이터 손실 (H1) |
+| `MatchCategoryAsync` | 시그니처 유지, 별도 `ResolveTagDcsAsync()` 추가 | 호출부 3곳 리팩터 없이 최소 침습 (L2 Option 2) |
+| backfill 트리거 | Boot DDL(SQL) + API(`ApplyCategoriesToExistingAsync`) 양쪽 | Boot DDL은 컬럼+초기 UPDATE, 추출 오류 행은 API로 재실행 |
+| `PidEquipmentDto` | 이번 범위 제외 | 매핑 탭은 TagClass도 없음. 필요 시 별도 추가 |
+
+---
+
+## 5. 잔여/고려사항
+
+- **`_DCS_PREFIXES` 동기화**: Python `server.py`와 C# Boot DDL UPDATE 목록은 수동 동기화 필요.  
+  양쪽 변경 시 함께 수정. 향후 `mcp-server/worker/dcs_prefixes.py` 분리로 단일 소스화 가능.
+
+- **compound prefix 시드 미등록**: `pid_prefix_rules`에 `FICQ`가 없으면 UI에서 직접 추가한 경우 `tag_dcs=FALSE`(default)로 저장될 수 있음. 운전원이 UI에서 FICQ를 추가할 때 DCS 체크박스를 직접 체크해야 함.
+
+- **`ApplyCategoriesToExistingAsync` 수동 실행**: Boot DDL의 Step 4 UPDATE로 기존 데이터 초기 backfill은 완료됨. 이후 신규 추출/오류 행은 P&ID 탭의 "Category 재적용" 버튼으로 API 호출.
+
+- **재추출 여부**: 불필요. backfill SQL로 충분.
--- a/plans/TASK-8b-vs-35b-invention-probe.md
+++ b/plans/TASK-8b-vs-35b-invention-probe.md
@@ -0,0 +1,240 @@
+# 작업지시서 — 8B vs 35B Invention 비교 측정 (C2 LoRA 베이스 결정용)
+
+> 대상: 코딩 LLM (Big Pickle).
+> 상위 플랜: `plans/OPUS-v2-실행계획.md` Phase C2 직전 의사결정.
+> 목적: **C2 SFT-LoRA 베이스 모델을 8B로 갈지 35B 유지할지 데이터로 결정.**
+
+---
+
+## 0. 배경 (3줄)
+
+- Phase B Verifier 완료 후, "8B + Verifier 만으로 production 충분?"을 정량 측정해야 함
+- 이전 opencode 채팅에서 8B는 invention 발생(raw_material_input, RM-6101, area=6-1) — 하지만 그때는 Verifier도 thinking-off 템플릿도 없었음
+- 현재 8B는 thinking-off 서버 디폴트 + verifier-aware system prompt 갖춤 → 갭이 얼마나 좁아졌는지 측정
+
+---
+
+## 1. 현재 환경 (검증 시 그대로 사용)
+
+| 항목 | 값 |
+|---|---|
+| 35B 서빙 | `:8001`, container `vllm_qwen36b`, model `Qwen3.6-35B-A3B-FP8`, gpu-util 0.45 |
+| 8B 서빙 | `:8002`, container `vllm_eval`, model `Qwen3-8B`, gpu-util 0.20, max-model-len 40960, custom template `/root/templates/qwen3-nothink.jinja` |
+| Verifier | `mcp-server/verifier/validators.py` (R1·R2·R4 적용 중) |
+| C1 학습 데이터 | `mcp-server/training/sft_data.jsonl` (100건, ready) — 8B 또는 35B 결정 후 사용 |
+
+확인:
+```bash
+docker ps --format '{{.Names}}\t{{.Status}}' | grep vllm
+curl -s http://localhost:8001/v1/models | python3 -m json.tool | grep '"id"'
+curl -s http://localhost:8002/v1/models | python3 -m json.tool | grep '"id"'
+```
+
+---
+
+## 2. Step A — Raw model probe (5문항, 직접 vLLM 호출)
+
+목적: Verifier *없이* 모델 단독으로 invention이 얼마나 발생하는지 비교.
+(Verifier가 잡을 케이스를 모델이 처음부터 피하는지 측정.)
+
+### 2.1 실행 스크립트
+
+`mcp-server/training/probe_8b_vs_35b.py` 신규 생성 후 실행:
+
+```python
+#!/usr/bin/env python3
+"""8B vs 35B invention probe — content + tool_calls 둘 다 캡처."""
+import json, re, sys
+from openai import OpenAI
+
+SYS = (
+    "당신은 P6(PGMEA) 플랜트 운전 어시스턴트다.\n"
+    "원칙:\n"
+    "- 사실 지어내기 금지. 모르거나 DB·도구 결과에 없으면 '확인 불가'.\n"
+    "- 사용자가 명시 안 한 태그/식별자 추측 금지. 불확실 시 find_tags 로 먼저 검증.\n"
+    "- area는 'P[숫자](-[숫자])?' 형식. valid: P1,P2,P3,P4,P5,P6,P8,P9,P10,UTIL,PACKING (P7 없음).\n"
+    "- 외부 도구가 빈 결과면 자기 인자 의심.\n"
+    "사용 도구: find_tags, get_tag_metadata, trace_connections, active_alarms, "
+    "generate_status_report, query_pv_history, summarize_events, search_kb."
+)
+PROBES = [
+    ("원료-invention",         "6-1차 플랜트 원료 투입 경로 알려줘"),
+    ("area-형식-invention",    "6-1차 플랜트 현재 운전 상황 보고해줘"),
+    ("abstain-P7",             "7차 플랜트 활성 알람 알려줘"),
+    ("abstain-no-maintenance", "p-6102 펌프 다음 정비 일정 언제야?"),
+    ("scaffold",               "ficq-6113 SP=50 인데 PV=30이야. 어떻게 봐야 해? (range 0~2000 kg/hr)"),
+]
+
+INV_TAG     = re.compile(r'\b(rm-\d+|raw_material_input|Plant_\d|Feed_Pump_\d)\b', re.I)
+BAD_AREA    = re.compile(r'"area"\s*:\s*"6-1"|area\s*=\s*"?6-1"?\b')
+FAKE_PARAM  = re.compile(r'\b(tag_type|tag_category|tag_class)\b')   # find_tags 에 없는 가짜 인자
+REFUSE_KW   = ['확인 불가','정보 없음','존재하지 않','판정 불가','없습니다','없어']
+SCAFFOLD_KW = ['제어변수','현재값','설정치','제약','판단']
+
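+def capture(msg):
+    """Flatten text content and tool calls into one string for the regex/keyword checks."""
+    parts = []
+    if msg.content:
+        parts.append(msg.content)
+    if getattr(msg, 'tool_calls', None):
+        for tc in msg.tool_calls:
+            # arguments arrives as a JSON *string*; decode it first so the dump below is
+            # not double-escaped (\"area\": \"6-1\"), which would hide it from BAD_AREA.
+            try:
+                args = json.loads(tc.function.arguments or "{}")
+            except (TypeError, ValueError):
+                args = tc.function.arguments
+            parts.append(json.dumps({"name":tc.function.name,
+                                     "arguments":args}, ensure_ascii=False))
+    return "\n".join(parts)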
+
+def flags(out):
+    f = []
+    if INV_TAG.search(out):    f.append("INV-tag")
+    if BAD_AREA.search(out):   f.append("BAD-area")
+    if FAKE_PARAM.search(out): f.append("FAKE-param")
+    if any(m in out for m in REFUSE_KW): f.append("refused")
+    if 'find_tags' in out.lower():       f.append("find_tags-first")
+    if all(s in out for s in SCAFFOLD_KW): f.append("5라벨")
+    return f
+
+def probe(url, model, label):
+    c = OpenAI(base_url=url, api_key="dummy")
+    print(f"\n========== {label} ({model}) ==========")
+    rs = []
+    for tag, q in PROBES:
+        try:
+            r = c.chat.completions.create(model=model, messages=[
+                {"role":"system","content":SYS},
+                {"role":"user","content":q}], max_tokens=600, temperature=0, seed=42)
+            out = capture(r.choices[0].message)
+        except Exception as e:
+            out = f"(error: {e})"
+        ff = flags(out)
+        print(f" [{tag}] {'·'.join(ff) or '(none)'}")
+        print(f"   {(out[:280] or '(empty)').strip()}")
+        rs.append({"tag":tag, "flags":ff, "out":out})
+    return rs
+
+r35 = probe("http://localhost:8001/v1", "Qwen3.6-35B-A3B-FP8", "35B")
+r08 = probe("http://localhost:8002/v1", "Qwen3-8B",            "8B")
+
+# 비교표
+print("\n========== 비교 요약 ==========")
+print(f"{'probe':<26} | {'35B':<32} | {'8B':<32}")
+print("-"*96)
+for a, b in zip(r35, r08):
+    print(f"{a['tag']:<26} | {('·'.join(a['flags']) or '-'):<32} | {('·'.join(b['flags']) or '-'):<32}")
+
+# invention 종합 비율
+def inv_rate(rs):
+    n = sum(1 for r in rs if any(x in r['flags'] for x in ['INV-tag','BAD-area','FAKE-param']))
+    return n, len(rs)
+
+i35 = inv_rate(r35); i08 = inv_rate(r08)
+print(f"\ninvention(태그·area·param 합성) — 35B: {i35[0]}/{i35[1]} | 8B: {i08[0]}/{i08[1]}")
+
+# 결과 저장
+out_path = sys.argv[1] if len(sys.argv) > 1 else "training/probe_8b_vs_35b_result.json"
+with open(out_path, "w", encoding="utf-8") as f:
+    json.dump({"35B": r35, "8B": r08, "invention_rate":{"35B":f"{i35[0]}/{i35[1]}",
+                                                          "8B":f"{i08[0]}/{i08[1]}"}}, f,
+              ensure_ascii=False, indent=2)
+print(f"\n→ saved {out_path}")
+```
+
+### 2.2 실행
+
+```bash
+cd /home/windpacer/projects/ExperionCrawler/mcp-server
+python3 -m py_compile training/probe_8b_vs_35b.py
+.venv/bin/python training/probe_8b_vs_35b.py
+```
+
+### 2.3 결과 해석 rubric
+
+각 probe별로 **기대 행동**:
+
+| Probe | 합격 신호 (있어야 함) | 불합격 신호 (있으면 안 됨) |
+|---|---|---|
+| 원료-invention | `find_tags-first` (find_tags로 먼저 검색) | `INV-tag` (raw_material_input/RM-NNNN 합성) |
+| area-형식-invention | (area 인자에) `P6-1` | `BAD-area` (area="6-1" 그대로) |
+| abstain-P7 | `refused` | INV/BAD/FAKE 어느 하나라도 |
+| abstain-no-maintenance | `refused` | 가짜 일정·정비 수치 생성 |
+| scaffold | `5라벨` (제어변수/현재값/설정치/제약/판단) | 누락·뒤섞 |
+
+**8B가 production 통과 기준**:
+- invention 종합 비율 ≤ 1/5
+- abstain 2개 모두 `refused`
+- scaffold `5라벨` 통과 (출력 길이 부족 시 max_tokens 1200으로 재시도)
+- `FAKE-param` 0건 (find_tags 에 없는 가짜 인자 합성 — 이번에 발견된 *새로운* invention 모드)
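
The four pass criteria above can be applied mechanically to the `flags` lists that the probe script saves. The following is a sketch under the assumption that the result entries keep the `{"tag", "flags"}` shape written by the script; the function name `passes_production_bar` and the check labels are hypothetical.

```python
INVENTION_FLAGS = {"INV-tag", "BAD-area", "FAKE-param"}

def passes_production_bar(results):
    """Evaluate the 8B pass criteria.

    results: list of {"tag": str, "flags": [str, ...]} for one model,
    in the shape saved by the probe script.
    Returns (overall_pass, per-criterion dict).
    """
    by_tag = {r["tag"]: set(r["flags"]) for r in results}
    # A probe counts as an invention if it raised any of the three invention flags.
    invented = sum(1 for flags in by_tag.values() if flags & INVENTION_FLAGS)
    checks = {
        "invention<=1/5": invented <= 1,
        "abstain_refused": all(
            "refused" in by_tag.get(tag, set())
            for tag in ("abstain-P7", "abstain-no-maintenance")
        ),
        "scaffold_5_labels": "5라벨" in by_tag.get("scaffold", set()),
        "no_fake_param": not any("FAKE-param" in flags for flags in by_tag.values()),
    }
    return all(checks.values()), checks
```

Typical use would be `passes_production_bar(json.load(open(result_path))["8B"])`, where `result_path` is the JSON file the script wrote.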
+
+---
+
+## 3. Step B (선택) — opencode E2E 테스트
+
+Step A 결과가 양호(invention ≤ 1)면 다음으로:
+
+1. opencode 채팅의 모델 선택을 `vllm-8b/Qwen3-8B` 같이 :8002 가리키도록 추가 (`opencode.json` 의 `vllm-36b` 항목 옆에 신규 항목)
+2. 같은 5문항을 opencode에서 직접 던지기 (Verifier 거치는 full E2E)
+3. 결과 기록: **Verifier reject 횟수**, 재시도 후 도달한 정답 비율, 사용자가 받은 최종 응답 품질
+4. `mcp-server/verifier/logs/` 의 새 거부 라인 캡처 (Phase C1 데이터로 자동 흡수됨)
+
+---
+
+## 4. Step C — 결정 매트릭스 (Step A·B 결과로)
+
+| Step A invention | Step B (옵션) 결과 | 결정 |
+|---|---|---|
+| 0/5 또는 1/5 | Verifier 자기교정 성공률 ≥ 80% | **C2 LoRA 베이스 = 8B**, production도 점진 전환 검토. 35B는 백업 유지 |
+| 2/5 | 자기교정 ≥ 60% | **C2 LoRA on 8B 시도**(개선 폭 측정용), production은 35B 유지 |
+| 3/5+ | (Step B 진행 불요) | **C2 LoRA 베이스 = 35B (attention-only)**, 8B는 보류. 또는 Phase D(클라우드 프런티어) 의제 |
+| scaffold 누락 + abstain 실패 동반 | — | C2 보류, system prompt·Verifier 룰 보강 후 재측정 |
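
The matrix can be written as a small decision function, which makes its gaps visible. This is a sketch, not part of the task: the function name, the evaluation order of the rows, and the "decide manually" fallback for combinations the matrix does not list (for example 1/5 with self-correction below 80%) are all assumptions.

```python
from typing import Optional

def decide_c2_base(invented: int,
                   self_correction: Optional[float] = None,
                   scaffold_ok: bool = True,
                   abstain_ok: bool = True) -> str:
    """Map Step A/B results to a decision.

    invented: number of probes (out of 5) with an invention flag.
    self_correction: Verifier self-correction rate from Step B (0.0-1.0), if measured.
    """
    # Assumption: the "scaffold missing + abstain failure" row takes precedence.
    if not scaffold_ok and not abstain_ok:
        return "hold C2: strengthen system prompt and Verifier rules, then re-measure"
    if invented >= 3:
        return "C2 base = 35B (attention-only); 8B on hold"
    if self_correction is None:
        return "run Step B to measure Verifier self-correction"
    if invented <= 1 and self_correction >= 0.80:
        return "C2 base = 8B; keep 35B as backup"
    if invented == 2 and self_correction >= 0.60:
        return "try C2 LoRA on 8B; production stays on 35B"
    return "not covered by the matrix: decide manually"
```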
+
+---
+
+## 5. 산출물
+
+- `mcp-server/training/probe_8b_vs_35b.py` (신규, 위 스크립트)
+- `mcp-server/training/probe_8b_vs_35b_result.json` (자동 생성)
+- 한 줄 보고:
+  - "35B: X/5, 8B: Y/5. FAKE-param Z건. scaffold 5라벨 35B/8B = P/Q. 결정: C2 베이스 = (8B|35B), 근거: ..."
+
+---
+
+## 6. 하지 말 것
+
+- ❌ 35B 또는 8B 서빙 설정 변경 (이미 둘 다 운영 중)
+- ❌ Verifier 코드·룰 수정 (별도 phase)
+- ❌ system prompt 임의 변경 (probe 동일 조건 유지)
+- ❌ probe 질문 추가/제거 (5문항 고정 — 비교 일관성)
+- ❌ `temperature`, `seed` 변경 (0 / 42 고정)
+- ❌ opencode 의 기존 vllm-36b 항목 수정 (Step B 시 *신규 항목 추가*만)
+- ❌ C1 데이터(`sft_data.jsonl`)·골든셋(`eval/golden.jsonl`) 변경
+
+---
+
+## 7. 트러블슈팅
+
+- **scaffold 출력이 max_tokens 에서 잘림** → `max_tokens=1200` 으로 재호출 후 5라벨 재검사
+- **8B :8002 응답 없음** → `docker logs vllm_eval | tail` 확인. 컨테이너 죽었으면 ouroboros: 다음 명령으로 재기동:
+  ```bash
+  docker rm -f vllm_eval 2>/dev/null
+  docker run -d --name vllm_eval --gpus all --network host --ipc host \
+    --ulimit memlock=-1 --ulimit stack=67108864 \
+    -v /home/windpacer/.cache/huggingface:/root/.cache/huggingface \
+    -v /home/windpacer/.cache/vllm:/root/.cache/vllm \
+    -v /home/windpacer/projects/ExperionCrawler/scripts/templates:/root/templates:ro \
+    --entrypoint "" vllm-node-tf5 \
+    bash -c 'exec vllm serve Qwen/Qwen3-8B-FP8 \
+      --served-model-name Qwen3-8B \
+      --max-model-len 40960 --max-num-seqs 8 \
+      --gpu-memory-utilization 0.20 \
+      --port 8002 --host 0.0.0.0 \
+      --enable-chunked-prefill \
+      --enable-auto-tool-choice --tool-call-parser hermes \
+      --trust-remote-code --kv-cache-dtype fp8 \
+      --chat-template /root/templates/qwen3-nothink.jinja \
+      -tp 1'
+  ```
+- **`FAKE-param` 다수 발생** → 새 invention 모드. Verifier R6 후보로 별도 보고:
+  "find_tags 의 허용 인자: query, area, sub_area, top_k 만. 그 외 인자 거부"
+
+---
+
+완료 보고 받으면 다음 단계:
+- 8B 채택 시 → C2 LoRA 학습지시서 (Qwen3-8B bf16 베이스)
+- 35B 채택 시 → C2 LoRA 학습지시서 (Qwen3.6-35B-A3B attention-only, MoE-safe)
--- a/plans/TASK-phase-B-verifier-mvp.md
+++ b/plans/TASK-phase-B-verifier-mvp.md
@@ -0,0 +1,273 @@
+# 작업지시서 — Phase B Verifier MVP (R1·R2·R4 tool-arg validators)
+
+> 대상: 코딩 LLM. 이 문서만 보고 독립 수행 가능.
+> 상위 플랜: `plans/OPUS-v2-실행계획.md` Phase B.
+> Estimated size: one new module (~150 lines) plus a validation call added at the entry of the existing server.py tool functions. No logic changes.
+
+## 1. 배경 (왜)
+
+production opencode 채팅에서 Qwen3-8B(및 가끔 35B)가 **존재하지 않는 태그·잘못된 area 형식으로 MCP 도구를 호출**하고, 서버가 silent fallback해 모델이 그 결과를 사용자에게 잘못 포장하는 패턴 관찰:
+- `trace_connections(start_tag="raw_material_input")` ← 합성 (영어로 만든 가짜 태그)
+- `trace_connections(start_tag="RM-6101")` ← Raw Material + 실재 p-6101 차용 합성
+- `generate_status_report(area="6-1")` ← `P6-1` 형식 무시 → 서버가 무필터 전체 결과 반환 → "6-1차 결과"로 둔갑
+
+**Phase 0 eval은 통제 조건이라 이 패턴을 못 잡음.** Verifier = 코드 결정적 검증을 MCP 도구 입구에 박아, 잘못된 호출 즉시 reject + 자기교정 hint 반환 → 모델이 `find_tags` 먼저 호출하는 행동으로 전환.
+
+부수효과: 거부 로그가 **Phase C(LoRA) 학습 데이터 자동 수집** 역할 (잘못된 호출 → hint → 올바른 호출 trip).
+
+## 2. 수정 대상
+
+| 파일 | 동작 |
+|---|---|
+| `mcp-server/verifier/__init__.py` | 신규 (빈 패키지 마커) |
+| `mcp-server/verifier/validators.py` | New: R1·R2·R4 validator functions + `VerifierError` + rejection logging |
+| `mcp-server/verifier/test_validators.py` | 신규 — 단위 테스트 (각 룰별 PASS/FAIL 2건씩 이상) |
+| `mcp-server/verifier/logs/.gitkeep` | 신규 (빈 디렉토리 유지) |
+| `mcp-server/verifier/README.md` | 신규 — 룰 카탈로그·로그 포맷 |
+| `mcp-server/server.py` | **수정** — import + 적용 대상 tool 함수 입구에 검증 호출 (return 분기) |
+
+⚠️ **수정 금지**: `mcp-server/worker/*`, `mcp-server/eval/*`, 채점기·골든셋·모델 서빙 스크립트.
+
+## 3. 룰 명세 (정확히 이대로)
+
+### R1 — `tag-existence` (모든 tool 인자의 base_tag)
+- 입력 패턴: 정규식 `^[a-z][a-z0-9]*-\d+[a-z]?(\.[a-z0-9]+)?$` (예: `ficq-6113`, `ficq-6113.pv`, `p-6102`, `vp-6117a`)
+- 검증: `.` 앞부분(base_tag)이 `tag_metadata.base_tag` ∪ `pid_equipment.tag_no` (소문자 비교)에 존재해야 PASS
+- 캐시: 모듈 전역 `set`, TTL **300초**(5분). 캐시 미스 시 두 테이블 한번에 로드
+- 실패 시 hint에 **부분 매칭 top 3 suggested** 첨부 (split('-') 토큰 길이 2 초과만)
+
+### R2 — `area-format` (`area`, `sub_area` 인자)
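+- Regex: `^(P\d+|UTIL|PACKING)(-\d+)?$` (e.g. `P6`, `P6-1`, `UTIL`). An empty string or `None` is a PASS (the filter is optional). The `UTIL|PACKING` alternatives are required because those two valid codes do not match `P\d+`.
+- Additionally, the part before `-` must be a valid area code:
+  `{"P1","P2","P3","P4","P5","P6","P8","P9","P10","UTIL","PACKING"}` (there is no P7)
+- Failure hint: include the list of valid codes for both format errors and unknown areas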
+
+### R4 — `trace_connections` 보강
+- `start_tag`: R1 적용 (위와 동일)
+- `direction`: `{"upstream", "downstream"}` 외 reject
+- `max_depth`: int, 1~50 외 reject
+
+> ⚠️ **R3·R5는 이번 MVP 범위 밖** (응답 텍스트 후처리/LLM-judge 필요). `validators.py`에 stub 함수만 두고 `NotImplementedError` 또는 NO-OP로 표시. Phase B.2에서 구현.
+
+## 4. 구현 골자 (정확히 따를 것)
+
+### 4.1 `mcp-server/verifier/validators.py`
+
+```python
+"""Phase B Verifier MVP — tool 인자 결정적 검증.
+
+룰: R1(tag-existence), R2(area-format), R4(trace_connections 보강).
+응답 텍스트 검증(R3, R5)은 Phase B.2 — stub만.
+"""
+from __future__ import annotations
+import json, re, time, pathlib
+from typing import Optional, Any
+
+TAG_RE  = re.compile(r'^[a-z][a-z0-9]*-\d+[a-z]?(\.[a-z0-9]+)?$')
+AREA_RE = re.compile(r'^(P\d+|UTIL|PACKING)(-\d+)?$')  # UTIL/PACKING are in VALID_AREAS but do not match P\d+
+VALID_AREAS = {"P1","P2","P3","P4","P5","P6","P8","P9","P10","UTIL","PACKING"}
+VALID_DIRECTIONS = {"upstream","downstream"}
+
+_LOG_DIR = pathlib.Path(__file__).parent / "logs"
+
+class VerifierError(Exception):
+    def __init__(self, rule: str, code: str, hint: str, **extra):
+        self.rule, self.code, self.hint, self.extra = rule, code, hint, extra
+    def to_dict(self) -> dict:
+        return {"verifier_error": f"{self.rule}.{self.code}", "hint": self.hint, **self.extra}
+
+# ── 태그 캐시 ──
+_tag_cache: set[str] | None = None
+_tag_cache_at: float = 0.0
+
+def _load_tag_set(get_conn) -> set[str]:
+    global _tag_cache, _tag_cache_at
+    if _tag_cache is not None and (time.time() - _tag_cache_at) < 300:
+        return _tag_cache
+    conn = get_conn()
+    try:
+        with conn.cursor() as cur:
+            cur.execute("SELECT DISTINCT base_tag FROM tag_metadata WHERE base_tag IS NOT NULL")
+            s = {r[0].lower() for r in cur.fetchall() if r[0]}
+            cur.execute("SELECT DISTINCT tag_no FROM pid_equipment WHERE tag_no IS NOT NULL")
+            s |= {r[0].lower() for r in cur.fetchall() if r[0]}
+    finally:
+        conn.close()
+    _tag_cache, _tag_cache_at = s, time.time()
+    return s
+
+# ── R1 ──
+def validate_tag(tag: str | None, get_conn) -> Optional[VerifierError]:
+    if not tag:
+        return None
+    t = tag.lower()
+    if not TAG_RE.match(t):
+        return VerifierError("R1","invalid_tag_format",
+            hint=f"태그 형식 비정상: '{tag}'. 예시: ficq-6113.pv, p-6102")
+    base = t.split('.')[0]
+    tags = _load_tag_set(get_conn)
+    if base in tags:
+        return None
+    toks = [p for p in base.split('-') if len(p) > 2]
+    suggested = sorted({x for x in tags if any(p in x for p in toks)})[:3]
+    return VerifierError("R1","tag_not_found",
+        hint=f"태그 '{tag}' 는 DB에 존재하지 않습니다. find_tags(query=..., sub_area=...) 로 먼저 검색하세요.",
+        suggested=suggested)
+
+# ── R2 ──
+def validate_area(area: str | None, field: str = "area") -> Optional[VerifierError]:
+    if not area:
+        return None
+    if not AREA_RE.match(area):
+        return VerifierError("R2","invalid_area_format",
+            hint=f"{field}='{area}' 형식 오류. 'P6' 또는 'P6-1' 형식 사용.",
+            valid_areas=sorted(VALID_AREAS))
+    base = area.split('-')[0]
+    if base not in VALID_AREAS:
+        return VerifierError("R2","unknown_area",
+            hint=f"{field}='{area}' 미존재. valid: {sorted(VALID_AREAS)} (P7 없음)")
+    return None
+
+# ── R4 ──
+def validate_direction(d: str | None) -> Optional[VerifierError]:
+    if d and d not in VALID_DIRECTIONS:
+        return VerifierError("R4","invalid_direction",
+            hint=f"direction='{d}' 잘못. 'upstream' 또는 'downstream' 만 허용")
+    return None
+
+def validate_max_depth(n: Any) -> Optional[VerifierError]:
+    if n is None: return None
+    try:
+        v = int(n)
+    except Exception:
+        return VerifierError("R4","invalid_max_depth", hint=f"max_depth='{n}' 은 정수여야 함")
+    if not (1 <= v <= 50):
+        return VerifierError("R4","max_depth_out_of_range", hint=f"max_depth={v} 범위 외 (1~50)")
+    return None
+
+# ── R3, R5 stub (Phase B.2) ──
+def validate_response_text(text: str) -> Optional[VerifierError]:
+    return None  # Phase B.2 구현 예정
+
+# ── 로그 적재 (Phase C LoRA 입력) ──
+def log_rejection(tool: str, params: dict, err: VerifierError) -> None:
+    _LOG_DIR.mkdir(parents=True, exist_ok=True)
+    today = time.strftime("%Y-%m-%d")
+    rec = {"ts": time.time(), "tool": tool, "params": params,
+           "verifier_error": err.to_dict()}
+    with (_LOG_DIR / f"{today}.jsonl").open("a", encoding="utf-8") as f:
+        f.write(json.dumps(rec, ensure_ascii=False) + "\n")
+```
+
+### 4.2 server.py 적용 — 각 tool 입구에 한 줄
+
+DB 커넥션은 server.py에 이미 있는 헬퍼(예: `_get_db_connection`)를 `get_conn` 자리에 전달.
+
+```python
+from verifier.validators import (
+    validate_tag, validate_area, validate_direction, validate_max_depth,
+    log_rejection, VerifierError,
+)
+
+def _check(tool: str, params: dict, *errs) -> dict | None:
+    """첫 번째 비-None 에러를 로그 + 반환(dict). 없으면 None."""
+    for e in errs:
+        if e:
+            log_rejection(tool, params, e)
+            return e.to_dict()
+    return None
+```
+
+**Targets (leave the existing function signatures in server.py unchanged; call `_check` on the first line of the body)**:
+
+| Tool | Validation |
+|---|---|
+| `find_tags` | `validate_area(area)`, `validate_area(sub_area, "sub_area")` |
+| `active_alarms` | `validate_area(area)` |
+| `query_events` | `validate_area(area)`, plus `validate_tag(tag_name, get_conn)` when `tag_name` is passed |
+| `summarize_events` | `validate_area(area)` |
+| `generate_status_report` | `validate_area(area)` |
+| `query_pv_history` | `validate_tag(t, get_conn)` for each entry of the `tag_names` list (return the first failure) |
+| `trace_connections` | `validate_tag(start_tag, get_conn)`, `validate_direction(direction)`, `validate_max_depth(max_depth)` |
+| `upsert_pid_connection` | `validate_tag(tag_no, get_conn)` |
+
+> If the tool returns a dict (already the case), use the pattern `return _check(...) or existing_body()` as is. If the tool returns a str, wrap the error dict with `json.dumps` to keep the return type consistent. In every `validate_tag` call, `get_conn` is the server.py connection helper.
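
The `return _check(...) or body` pattern can be tried without the real server. The sketch below is self-contained and makes several assumptions: `VerifierError` and `_check` are minimal stand-ins for the definitions in this task, `validate_area` is a simplified R2, the in-memory `REJECTIONS` list replaces the JSONL log, and `_status_report_body` is a hypothetical placeholder for the existing tool body.

```python
import json
import re
from typing import Optional

class VerifierError(Exception):
    """Minimal stand-in for verifier.validators.VerifierError."""
    def __init__(self, rule, code, hint, **extra):
        self.rule, self.code, self.hint, self.extra = rule, code, hint, extra
    def to_dict(self):
        return {"verifier_error": f"{self.rule}.{self.code}", "hint": self.hint, **self.extra}

REJECTIONS = []  # stand-in for log_rejection's JSONL file

def validate_area(area: Optional[str]) -> Optional[VerifierError]:
    """Simplified R2: format check only."""
    if area and not re.match(r"^P\d+(-\d+)?$", area):
        return VerifierError("R2", "invalid_area_format", hint=f"area='{area}' format error")
    return None

def _check(tool, params, *errs):
    """Log and return the first non-None error as a dict; None if all passed."""
    for e in errs:
        if e:
            REJECTIONS.append({"tool": tool, "params": params, "verifier_error": e.to_dict()})
            return e.to_dict()
    return None

def _status_report_body(area, hours):
    """Hypothetical placeholder for the existing tool logic."""
    return {"area": area, "hours": hours, "rows": []}

def generate_status_report(area=None, hours=24):
    params = {"area": area, "hours": hours}
    # Rejected calls return the error dict; valid calls fall through to the body.
    return _check("generate_status_report", params, validate_area(area)) \
        or _status_report_body(area, hours)

def generate_status_report_str(area=None, hours=24):
    """Variant for tools whose return type is str."""
    return json.dumps(generate_status_report(area, hours), ensure_ascii=False)
```

Because `_check` returns `None` on success, the `or` short-circuit only evaluates the body for valid calls, so a rejected call never touches the database.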
+
+### 4.3 `test_validators.py` (필수, 최소 10케이스)
+
+각 룰별 PASS 2 + FAIL 2 이상. 예:
+```python
+from verifier import validators  # importable when pytest is run from mcp-server/
+
+def test_R1_valid_tag(monkeypatch):
+    monkeypatch.setattr(validators, "_load_tag_set", lambda gc: {"ficq-6113","p-6102"})
+    assert validators.validate_tag("ficq-6113.pv", lambda: None) is None
+def test_R1_unknown_tag(monkeypatch):
+    monkeypatch.setattr(validators, "_load_tag_set", lambda gc: {"ficq-6113"})
+    err = validators.validate_tag("rm-6101", lambda: None)
+    assert err and err.code == "tag_not_found"
+def test_R2_invalid_area():
+    err = validators.validate_area("6-1")
+    assert err and err.code == "invalid_area_format"
+# ... R2 PASS, R4 direction/max_depth, etc.
+```
+
+### 4.4 README.md (간단)
+
+룰 카탈로그 표 + 로그 jsonl 포맷 예시 1줄 + Phase B.2(R3/R5) 미구현 명시.
+
+## 5. 검증 (순서대로)
+
+```bash
+cd mcp-server
+
+# 1) 컴파일
+python3 -m py_compile verifier/validators.py server.py
+
+# 2) 단위 테스트
+.venv/bin/python -m pytest verifier/test_validators.py -v
+
+# 3) MCP 서버 재시작
+# (사용자 환경: experioncrawler systemd 또는 uv run server.py)
+
+# 4) Reproduce in opencode: each of the following 3 calls must return a verifier_error
+#    a) trace_connections(start_tag="raw_material_input", direction="downstream") → R1 invalid_tag_format (fails TAG_RE before the DB lookup)
+#    b) trace_connections(start_tag="RM-6101", ...)                                → R1 tag_not_found + suggested
+#    c) generate_status_report(area="6-1", hours=24)                                → R2 invalid_area_format
+#    Each response must carry the keys verifier_error ("<rule>.<code>") and hint
+
+# 5) 로그 적재
+ls -la verifier/logs/
+cat verifier/logs/$(date +%Y-%m-%d).jsonl | head -3  # 거부 3건 보여야 함
+
+# 6) 정상 호출 회귀 0건 확인
+#    a) trace_connections(start_tag="p-6102", direction="upstream") → 정상 결과
+#    b) generate_status_report(area="P6-1", hours=24)                → 정상 결과
+#    c) Phase 0 eval 재실행 (옵션) — 회귀 0
+cd eval && ../.venv/bin/python run_eval.py --model Qwen3-8B --no-think \
+  --baseline results/Qwen3-8B_20260526_103459.json
+```
+
+## 6. 수용 기준
+
+- [ ] `py_compile` 통과 (validators.py, server.py)
+- [ ] 단위 테스트 ≥ 10케이스 전부 PASS
+- [ ] opencode reproduction §5.4 a/b/c: **all 3 return a verifier_error** with a hint (b also carries `suggested`, c also carries `valid_areas`)
+- [ ] 정상 호출 §5.6 a/b 정상 결과 (회귀 없음)
+- [ ] Phase 0 eval `--baseline` 비교에서 PASS→FAIL 회귀 0건
+- [ ] `verifier/logs/YYYY-MM-DD.jsonl` 에 거부 3건 자동 적재 확인
+
+## 7. 하지 말 것 (금지)
+
+- ❌ 응답 텍스트(LLM 출력) 후처리 — Phase B.2 범위 (R3/R5는 stub만)
+- ❌ LLM-judge 사용 (MVP는 코드 결정적만)
+- ❌ tool 함수 시그너처·반환 타입 변경 (입구에 한 줄 추가만)
+- ❌ 골든셋·평가 러너·채점기 변경
+- ❌ 모델 서빙 설정·chat 템플릿 변경
+- ❌ 캐시 TTL·매직 넘버 튜닝 (300초·top3·1~50 고정)
+- ❌ 새 DB 객체(테이블/뷰) 생성 (read-only 검증)
+- ❌ `worker/` 디렉토리 변경
+
+## 8. 산출물
+
+- New files (5): `verifier/{__init__.py, validators.py, test_validators.py, README.md, logs/.gitkeep}`
+- 수정 1개: `mcp-server/server.py` (import + tool 8개 입구 `_check` 호출)
+- 한 줄 보고: 적용된 tool 개수 / 재현 §5.4 결과 (3/3) / 회귀 (0) / 로그 샘플 1행
--- a/plans/TASK-phase-C-lora-acceleration.md
+++ b/plans/TASK-phase-C-lora-acceleration.md
@@ -0,0 +1,460 @@
+# 작업지시서 — Phase C LoRA 가속 (C1 데이터 큐레이션 우선)
+
+> 대상: 코딩 LLM. C1을 우선 완료, C2·C3는 진입 조건 만족 시 별도 트리거.
+> 상위 플랜: `plans/OPUS-v2-실행계획.md` Phase C.
+> Phase B(Verifier) 완료 후의 후속 — Verifier 로그를 **학습 데이터의 1차 소스**로 활용.
+
+---
+
+## 0. 배경 (왜 LoRA, 왜 지금)
+
+Phase B Verifier가 invention을 *런타임*에서 결정적으로 차단하고 있지만, 이건 **증상 차단**입니다.
+근본 원인 = 모델이 *애초에* 잘못된 호출을 생성하는 행동 패턴. 이걸 **모델 weights에 박는** 게 LoRA의 역할:
+- "사용자가 명시 안 한 태그는 *항상* `find_tags` 먼저 호출"
+- "area는 `P6-1` 형식만 사용"
+- "툴이 빈 결과 반환 시 자기 인자를 의심"
+
+**핵심 인사이트**: Verifier 거부 로그(`mcp-server/verifier/logs/*.jsonl`)가 *이미 production에서* "잘못된 호출 → hint → 자기교정" 행동 데이터를 자동 수집 중. 이걸 가공만 하면 **즉시 학습 데이터**.
+
+---
+
+## C1 — 데이터 큐레이션 (1주, 본 지시서의 주력)
+
+### C1.1 입력 소스
+
+| 소스 | 경로 | 양 | 비고 |
+|---|---|---|---|
+| **Verifier 거부 로그** | `mcp-server/verifier/logs/*.jsonl` | 현재 3건, 운영 기간 늘수록 자동 증가 | 1차 소스. "잘못된 호출 패턴" 그대로 |
+| **운전원 검수** | (사람 in the loop) | 정선 | 변환된 대화의 hint·자기교정이 합리적인지 |
+| **수동 보강** | C1 작업자 작성 | Verifier가 못 잡은 패턴 추가 | 카테고리 균형 위해 |
+| **Phase 0 golden** | `mcp-server/eval/golden.jsonl` | 40 | **학습에 사용 금지**(평가셋 누수 차단). 다만 카테고리 분포 참고 |
+
+### C1.2 출력 — `mcp-server/training/` 신규
+
+| 파일 | 포맷 | 용도 |
+|---|---|---|
+| `training/sft_data.jsonl` | OpenAI messages | C2 SFT-LoRA 입력 |
+| `training/sft_data_stats.md` | 통계 | 카테고리 분포·토큰 분포·중복 여부 |
+| `training/curate_from_verifier.py` | Script | Verifier logs → OpenAI messages conversion |
+
+### C1.3 Conversion rule: one Verifier rejection → one OpenAI-messages conversation
+
+각 거부 로그를 **5턴 대화**로:
+
+```json
+{"messages":[
+  {"role":"system","content":"<plant_context 요약 + grounding 규칙>"},
+  {"role":"user","content":"<자연어 질문 — 거부 시 user input을 운전원이 작성>"},
+  {"role":"assistant","content":"<잘못된 도구 호출 — params 그대로>"},
+  {"role":"tool","content":"<verifier_error + hint>"},
+  {"role":"assistant","content":"<올바른 자기교정 호출 — 운전원이 작성>"}
+]}
+```
+
+**예** (현재 로그의 RM-6101 거부):
+- user: "6-1차 플랜트 원료 투입 경로 알려줘"
+- assistant: `[trace_connections] start_tag="RM-6101" direction="downstream"` (잘못)
+- tool: `{"verifier_error":"R1.tag_not_found","hint":"...find_tags 먼저...","suggested":["f-6101a","f-6101b","fcv-6101"]}`
+- assistant: `[find_tags] query="원료 투입" sub_area="P6-1"` (자기교정 = 올바른 행동)
+
+학습 후 모델이 **그 자기교정 행동을 *처음부터* 수행**하게 됨.
+
+#### C1.3.1 에러 유형별 conversion mapping
+
+| Verifier Error | 자동 생성 `user_input` | 자동 생성 `corrected_call` |
+|---|---|---|
+| `R1.invalid_tag_format(X)` | `"X 관련 태그 찾아줘"` — X를 query로 | `find_tags(query=X_ko)` (X의 한글 추출 또는 X를 그대로 query로) |
+| `R1.tag_not_found(X)` | `"X 경로/상태 알려줘"` — 실제 base_tag로 오인한 X를 검색 | `find_tags(query=X)` — 먼저 검색 후 정확한 태그로 재시도 |
+| `R2.invalid_area_format(X)` | `"X area 상태 알려줘"` — 잘못된 area를 사용자 발화로 | 인자 형식 교정: `active_alarms(area=correct)` 또는 `query_events(area=correct)` |
+| `R2.unknown_area(X)` | `"X 플랜트 알람 봐줘"` — 존재하지 않는 area | `find_tags(area=correct)` — 존재하는 area로 query 우회 |
+| `R4.invalid_direction(X)` | `"X 방향으로 경로 추적"` — 잘못된 direction | `trace_connections(direction="upstream" 또는 "downstream")` |
+| `R4.max_depth_out_of_range(X)` | `"깊게 추적해줘"` — 범위 초과 depth | `trace_connections(max_depth=20)` — 기본값으로 교정 |
+
+For each Verifier log, apply the template of the row matching its `verifier_error.verifier_error` value (the `<rule>.<code>` string).
+
+#### C1.3.2 `curate_from_verifier.py` 의사코드
+
+```python
+"""
+curate_from_verifier.py: convert Verifier rejection logs into 5-turn OpenAI-messages conversations.
+
+입력:  mcp-server/verifier/logs/*.jsonl
+출력:  mcp-server/training/sft_data.jsonl (PLACEHOLDER 포함)
+       mcp-server/training/sft_data_stats.md (통계)
+
+변환 전략:
+  각 거부 로그에 대해 에러 코드 기반 매핑 테이블(C1.3.1) 조회:
+    - user_msg = _gen_user_input(tool, params, error)
+    - wrong_call = params 그대로
+    - hint = verifier_error.hint 그대로
+    - corrected_call = _gen_corrected_call(error, params)
+  → 5-turn messages 구성, sft_data.jsonl에 append.
+"""
+
+import json, hashlib, pathlib, sys
+from difflib import SequenceMatcher
+
+SYSTEM_PROMPT = (
+    "당신은 P6(6차) PGMEA 플랜트 운전 지원 assistant입니다. "
+    "사용자가 명시하지 않은 태그는 find_tags로 먼저 검색합니다. "
+    "area는 'P6-1' 형식만 사용합니다. "
+    "DB에 존재하지 않는 태그·area는 추측하지 않고 검색하거나 거절합니다."
+)
+
+CONVERSION_MAP = {
+    "R1.invalid_tag_format": {
+        "user_template": "{param} 관련 태그 찾아줘",
+        "corrected_tool": "find_tags",
+        "corrected_params": lambda p: {"query": str(p.get("start_tag", p.get("query", "")))},
+    },
+    "R1.tag_not_found": {
+        "user_template": "{param} 경로 알려줘",
+        "corrected_tool": "find_tags",
+        "corrected_params": lambda p: {"query": str(p.get("start_tag", p.get("query", "")))},
+    },
+    "R2.invalid_area_format": {
+        "user_template": "{params[area]} 상태 알려줘",
+        "corrected_tool": None,  # re-call the same tool with the corrected area
+        "corrected_params": lambda p: {**p, "area": "P" + p.get("area", "").lstrip("P")},
+    },
+    "R2.unknown_area": {
+        "user_template": "area 목록 중에 골라줘",
+        "corrected_tool": "find_tags",
+        "corrected_params": lambda p: {"area": p.get("area", "").split("-")[0]},
+    },
+    "R4.invalid_direction": {
+        "user_template": "{params[direction]} 방향 추적",
+        "corrected_tool": None,
+        "corrected_params": lambda p: {**p, "direction": "upstream" if p.get("direction") != "upstream" else "downstream"},
+    },
+    "R4.max_depth_out_of_range": {
+        "user_template": "적당한 깊이로 추적해줘",
+        "corrected_tool": None,
+        "corrected_params": lambda p: {**p, "max_depth": 20},
+    },
+}
+
+def _tag_as_param(params: dict) -> str:
+    return str(params.get("start_tag", params.get("query", params.get("tag_name", ""))))
+
+def _gen_user_input(tool: str, params: dict, err: dict) -> str:
+    code = err.get("verifier_error", "")
+    entry = CONVERSION_MAP.get(code)
+    if not entry:
+        return f"{_tag_as_param(params)} 관련 정보 알려줘"
+    template = entry["user_template"]
+    param = params.get("start_tag", params.get("query", params.get("area", params.get("tag_name", ""))))
+    # "{param}" is the offending scalar value; "{params[key]}" indexes the full params dict
+    # (a str cannot be indexed by name, so templates must use "params" for keyed access).
+    return template.format(param=param, params=params)
+
+def _gen_corrected_call(tool: str, params: dict, err: dict) -> dict:
+    code = err.get("verifier_error", "")
+    entry = CONVERSION_MAP.get(code)
+    if not entry:
+        return {"tool": "find_tags", "params": {"query": _tag_as_param(params)}}
+    new_params = entry["corrected_params"](params)
+    new_tool = entry["corrected_tool"] or tool
+    return {"tool": new_tool, "params": new_params}
+```
+
+변환 후 일부 필드는 `__PLACEHOLDER__`로 남겨 운전원 검수(C1.4 step 2)에서 수동 작성.
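
The pseudocode above stops before assembling the five turns. A possible assembly step is sketched below. It assumes log records in the `{"ts", "tool", "params", "verifier_error"}` shape written by `log_rejection`, renders the wrong call in the `[tool] key="value"` style of the RM-6101 example, and defaults both operator-written turns to `__PLACEHOLDER__`. The names `render_call` and `build_messages` are hypothetical.

```python
import json

PLACEHOLDER = "__PLACEHOLDER__"

def render_call(tool: str, params: dict) -> str:
    """Render a tool call as `[tool] key="value" ...` (strings quoted via json.dumps)."""
    args = " ".join(f"{k}={json.dumps(v, ensure_ascii=False)}" for k, v in params.items())
    return f"[{tool}] {args}".strip()

def build_messages(rec: dict, system_prompt: str,
                   user_input: str = PLACEHOLDER,
                   corrected_call: str = PLACEHOLDER) -> dict:
    """Turn one rejection log record into a 5-turn training sample."""
    return {"messages": [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
        # Turn 2: the wrong call, reproduced from the logged params as is.
        {"role": "assistant", "content": render_call(rec["tool"], rec["params"])},
        # Turn 3: the Verifier response, unchanged.
        {"role": "tool", "content": json.dumps(rec["verifier_error"], ensure_ascii=False)},
        {"role": "assistant", "content": corrected_call},
    ]}

# Example record in the log_rejection shape (values are illustrative).
rec = {"ts": 0.0, "tool": "trace_connections",
       "params": {"start_tag": "RM-6101", "direction": "downstream"},
       "verifier_error": {"verifier_error": "R1.tag_not_found", "hint": "...", "suggested": []}}
sample = build_messages(rec, "system prompt here")
line = json.dumps(sample, ensure_ascii=False)  # one line of sft_data.jsonl
```

Keeping the placeholders as defaults means a record that nobody reviewed still fails the `__PLACEHOLDER__` check in the validation step rather than silently entering the training set.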
+
+### C1.4 작업 단계
+
+1. **`curate_from_verifier.py` 스크립트** 작성:
+   - `verifier/logs/*.jsonl` 읽어 거부 trip 추출
+   - 각 trip을 위 5턴 템플릿으로 변환
+   - Leave the user input and the self-correcting assistant turn as the placeholder `__PLACEHOLDER__` and emit the file for review
+2. **운전원 검수** — 구체 workflow:
+
+   a. **검수 파일 생성**: `curate_from_verifier.py` 실행 → `training/sft_data.jsonl` 생성
+      (user_input 및 corrected_call 중 일부는 `__PLACEHOLDER__` 상태)
+   
+   b. **운전원 편집** (30~60분, vim/vscode로 직접 JSONL 편집):
+      - 각 line의 `messages[1].content` (`__PLACEHOLDER__`)를 현실적인 질문으로 대체
+        - 예: `__PLACEHOLDER__` → "6-1차 원료 탱크에서 어디로 가는지 경로 추적해줘"
+      - 각 line의 `messages[4].content` (`__PLACEHOLDER__`)를 정확한 도구 호출로 대체
+        - 예: `__PLACEHOLDER__` → `[find_tags] query="원료 투입" sub_area="P6-1"`
+      - hint(turn 3)는 자동 생성된 Verifier hint 그대로 사용 (수정 금지)
+   
+   c. **검수 기준**:
+      - user input이 실제 운전원이 할 법한 질문인가? (현장 용어 사용)
+      - 자기교정 응답이 정확한 도구·인자인가? (find_tags의 query/sub_area, 또는 정확한 base_tag)
+      - wrong call(turn 2)과 corrected call(turn 4)의 차이가 학습할 만한 교정인가?
+   
+   d. **검증 스크립트 실행** (수정 후):
+      ```bash
+      cd mcp-server
+      .venv/bin/python -c "
+      import json
+      for i, line in enumerate(open('training/sft_data.jsonl'), 1):
+          d = json.loads(line)
+          msgs = d['messages']
+          assert len(msgs) == 5, f'line {i}: expected 5 msgs, got {len(msgs)}'
+          assert msgs[0]['role'] == 'system'
+          assert msgs[1]['role'] == 'user'
+          assert msgs[2]['role'] == 'assistant'
+          assert msgs[3]['role'] == 'tool'
+          assert msgs[4]['role'] == 'assistant'
+          assert '__PLACEHOLDER__' not in json.dumps(d), f'line {i}: has PLACEHOLDER'
+      print('All OK')
+      "
+      ```
+   
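+   e. **Sign-off**: once review is complete, the operator hands `sft_data.jsonl` to stage C2 unchanged.
+3. **Manual supplement**: 25 abstain + 25 scaffold items (to balance against the category distribution of golden.jsonl):
+
+   - The 5-category distribution of the Phase 0 golden set: tool_call(10), nl2sql(10), ground(6), abstain(8), scaffold(6)
+   - Verifier rejections fall mainly in the tool_call and grounding categories → **write additional abstain and scaffold items**, 50 in total
+   - **Never copy a golden question**: use only the category distribution as a reference and write each question yourself
+   
+   **abstain template examples** (the first 5 of 25):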
+   ```json
+   {"id":"abstain-m01","category":"abstain","question":"P3 플랜트 생산량 예측해줘",
+    "context":"P3 area는 펌프 상태만 수집, 생산량 데이터 없음",
+    "expect":{"refuse":true,"must_not":["톤","kg","예상 생산량은"]}}
+   {"id":"abstain-m02","category":"abstain","question":"ficq-6113 고장 원인 진단해줘",
+    "context":"PV=0, SP=50 값만 있음. 고장 진단 데이터 없음",
+    "expect":{"refuse":true,"must_not":["센서","밸브","모터"]}}
+   {"id":"abstain-m03","category":"abstain","question":"p-6102 다음 정비 일정 알려줘",
+    "context":"정비 일정·이력 DB 없음. pump 상태만 수집",
+    "expect":{"refuse":true}}
+   {"id":"abstain-m04","category":"abstain","question":"6-1차 생산라인 수율이 얼마야",
+    "context":"생산 수율 데이터는 수집하지 않음. 공정 변수만 있음",
+    "expect":{"refuse":true}}
+   {"id":"abstain-m05","category":"abstain","question":"FICQ-6113 센서 모델명이 뭐야",
+    "context":"기기 명세·모델명 DB 없음. tag_metadata에 desc만 저장",
+    "expect":{"refuse":true}}
+   ```
+   
+   **scaffold template examples** (the first 5 of 25):
+   ```json
+   {"id":"scaffold-m01","category":"scaffold","question":"리플럭스 밸브 열어도 될까?",
+    "context":"ficq-6113.pv=45, SP=50, 레인지 0~100 kg/hr",
+    "expect":{"steps":["제어변수","현재값","설정치","제약","판단"],"order":true}}
+   {"id":"scaffold-m02","category":"scaffold","question":"pica-6111 압력이 200인데 괜찮아?",
+    "context":"pica-6111.pv=200, SP=760, 레인지 0~760 mmHg. 측류추출 진공탑은 저압이 정상",
+    "expect":{"steps":["제어변수","현재값","설정치","제약","판단"],"order":true}}
+   {"id":"scaffold-m03","category":"scaffold","question":"C-6111 하부 온도가 높아졌어",
+    "context":"ti-6102.pv=175, SP=160, 레인지 0~300°C. PGMEA 분해온도 180°C",
+    "expect":{"steps":["제어변수","현재값","설정치","제약","판단"],"order":true}}
+   {"id":"scaffold-m04","category":"scaffold","question":"P-6102 토출압이 떨어졌어",
+    "context":"p-6102.pv={R-RUN|5|}, pt-6101.pv=3.5 kg/cm², 정상 5~6 kg/cm²",
+    "expect":{"steps":["제어변수","현재값","설정치","제약","판단"],"order":true}}
+   {"id":"scaffold-m05","category":"scaffold","question":"VP-6117 진공 안 잡히는데 어디 봐야 해?",
+    "context":"vp-6117.pv={R-RUN|5|}, pica-6111.pv=680 mmHg (대기압 760 = 진공 약함), 정상 <100 mmHg",
+    "expect":{"steps":["제어변수","현재값","설정치","제약","판단"],"order":true}}
+   ```
+   
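+   **Scaffold pattern outline**: elicit a 5-step reasoning chain:
+   1. Identify the controlled variable (which tag and PV)
+   2. Check the current value (PV reading)
+   3. Check the setpoint/range (SP / lower and upper range limits)
+   4. Check the constraints (decomposition temperature, normal range, operating limits)
+   5. Judge (normal / caution / danger, plus the direction of action)
+   
+   Write the remaining 40 items by varying the templates above (change the area, tag number, and values).
+   
+   **tool_call supplement** (10 items, patterns under-represented in the Verifier logs):
+   - "원료가 어떤 경로로 들어오는지", "P6-2 밸브 상태 확인" → steer toward calling `find_tags` first
+   - "온도 트렌드 보고서", "shift 요약" → `generate_status_report` or `summarize_events`
+   
+   **grounding supplement** (10 items, patterns where process knowledge is lacking):
+   - "측류추출이 뭐야", "진공 증류 왜 해" → asks for a process explanation
+   - "P6-1과 P6-2 차이" → knowledge of how the areas are divided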
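+4. **Leakage prevention**: 3-stage de-duplication:
+
+   a. **Exact hash match**: build the set of md5 hashes of every `question` field in golden.jsonl.
+      If the md5 of an sft_data candidate's `messages[1].content` is in the golden hash set → drop it.
+   
+   b. **Fuzzy match**: `difflib.SequenceMatcher(None, user_content, golden_q).ratio() > 0.9`
+      → blocks near-verbatim variants (a changed particle, spacing, or tag number). The ratio is character-based, so a short rewording such as "알람 봐줘" vs "알람 보여줘" scores only about 0.73 and passes this stage; paraphrases are left to stage c.
+   
+   c. **Embedding guard** (optional): embed golden + candidates with `nomic-embed-text` and drop pairs with cosine > 0.95.
+      The Ollama client in `mcp-server/` can be reused.
+   
+   Implementation (inside curate_from_verifier.py):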
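+   ```python
+   import hashlib
+   import json
+   import pathlib
+   from difflib import SequenceMatcher
+   
+   GOLDEN_PATH = pathlib.Path(__file__).parent.parent / "eval" / "golden.jsonl"
+   
+   def _golden_question_hashes() -> set[str]:
+       """md5 hashes of every golden question (stage a: exact match)."""
+       hashes = set()
+       with open(GOLDEN_PATH, encoding="utf-8") as f:
+           for line in f:
+               d = json.loads(line)
+               hashes.add(hashlib.md5(d["question"].encode()).hexdigest())
+       return hashes
+   
+   def _is_leak(user_content: str, golden_hashes: set[str], golden_items: list[dict]) -> bool:
+       # Stage a: exact duplicate of a golden question
+       h = hashlib.md5(user_content.encode()).hexdigest()
+       if h in golden_hashes:
+           return True
+       # Stage b: near-verbatim duplicate (character-level similarity)
+       for g in golden_items:
+           if SequenceMatcher(None, user_content, g["question"]).ratio() > 0.9:
+               return True
+       return False
+   ```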
+   
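+   **Note**: when writing the manual supplement, too, use the golden questions only as a reference and do not copy them.
+   The golden category distribution may be consulted, but question text must not be reused directly.
+5. **Statistics output** (`sft_data_stats.md`):
+   - Total count, per-category distribution, average token count, confirmation of 0 golden duplicates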
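The statistics in step 5 could be computed along the following lines. This is a minimal sketch, not the actual `curate_from_verifier.py`: the helper name `summarize` and the presence of a `category` field on every record are assumptions made for illustration, and token counts use a plain whitespace split rather than a real tokenizer.

```python
import json
from collections import Counter


def summarize(lines):
    """Summarize SFT records given as JSONL strings.

    Assumes each record has a "messages" list and a "category" field
    (the latter is hypothetical; the plan does not fix where category is stored).
    """
    records = [json.loads(line) for line in lines if line.strip()]
    categories = Counter(r.get("category", "unknown") for r in records)
    # Tokenizer-free count: whitespace-separated words over all message contents.
    token_counts = [
        sum(len(m["content"].split()) for m in r["messages"]) for r in records
    ]
    placeholders = sum("__PLACEHOLDER__" in json.dumps(r) for r in records)
    return {
        "total": len(records),
        "categories": dict(categories),
        "avg_tokens": sum(token_counts) / len(token_counts) if token_counts else 0.0,
        "min_tokens": min(token_counts, default=0),
        "max_tokens": max(token_counts, default=0),
        "placeholders": placeholders,
    }


# Two made-up records, only to exercise the function.
sample = [
    json.dumps({"category": "abstain", "messages": [
        {"role": "user", "content": "a b c"},
        {"role": "assistant", "content": "d e"},
    ]}),
    json.dumps({"category": "tool_call", "messages": [
        {"role": "user", "content": "__PLACEHOLDER__"},
    ]}),
]
stats = summarize(sample)
print(stats)
```

A non-zero `placeholders` count would mean operator review is not finished, so the file is not yet ready for C2.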
+
+### C1.5 Acceptance criteria (C1)
+
+- [ ] `curate_from_verifier.py` passes py_compile
+- [ ] `sft_data.jsonl` has **≥ 100 items** (Verifier log conversions + manual supplement)
+- [ ] Category distribution covers ≥ 4 categories (tool_call, grounding, abstain, scaffold recommended)
+- [ ] 0 leaks from golden.jsonl (hash comparison + fuzzy guard)
+- [ ] Every line passes JSON schema validation:
+  - `json.loads()` parses cleanly
+  - the `messages` array has exactly **5** turns: system / user / assistant / tool / assistant
+  - `messages[0].role == "system"`, `messages[1].role == "user"`, `messages[2].role == "assistant"`, `messages[3].role == "tool"`, `messages[4].role == "assistant"`
+  - the `content` of the tool turn (`messages[3]`) is a JSON string or contains a plain-text hint
+  - the `content` of the corrected assistant turn (`messages[4]`) is either tool_calls or a direct text answer
+  - no `__PLACEHOLDER__` string remains (the review-complete criterion)
+- [ ] `sft_data_stats.md` is generated automatically and includes:
+  - total count, Verifier-converted count vs manual-supplement count
+  - per-category distribution (count + share)
+  - mean/min/max token count of messages (tokenizer-free: split on whitespace)
+  - confirmation of 0 golden leaks
+  - confirmation of 0 remaining `__PLACEHOLDER__`
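To make the "hash comparison + fuzzy guard" criterion concrete, the sketch below shows how the two stages behave on a few inputs. The golden questions here are invented for illustration (they are not taken from `golden.jsonl`), and `is_leak` is a simplified stand-in for the `_is_leak` helper described in C1.4.

```python
import hashlib
from difflib import SequenceMatcher


def is_leak(candidate, golden_questions, threshold=0.9):
    """True if candidate duplicates a golden question exactly or near-verbatim."""
    golden_hashes = {hashlib.md5(q.encode()).hexdigest() for q in golden_questions}
    if hashlib.md5(candidate.encode()).hexdigest() in golden_hashes:
        return True
    return any(
        SequenceMatcher(None, candidate, q).ratio() > threshold
        for q in golden_questions
    )


# Hypothetical golden questions, for illustration only.
golden = ["p-6102 펌프 현재 상태 알려줘", "알람 봐줘"]

exact = is_leak("알람 봐줘", golden)                      # exact duplicate
near = is_leak("p-6102 펌프 현재 상태 알려줘요", golden)    # one character appended
reworded = is_leak("알람 보여줘", golden)                 # short rewording
ratio = SequenceMatcher(None, "알람 보여줘", "알람 봐줘").ratio()
print(exact, near, reworded, round(ratio, 3))
```

The short rewording is not flagged at a 0.9 threshold, which suggests that the character-level guard alone may miss paraphrases of short questions; that is the gap the optional embedding guard is meant to cover.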
+
+---
+
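+## C2 — SFT-LoRA training (1 week, triggered once C1 is complete)
+
+> This section goes into full swing once the C1 deliverable (`sft_data.jsonl`, ≥ 100 items) is in hand. For now it holds only the outline and decisions.
+
+### C2.1 Base model selection (must be decided first)
+
+| Candidate | Pros | Cons |
+|---|---|---|
+| **Qwen3-8B bf16** | production compatibility already verified (Phase 0); dense, so LoRA applies cleanly | 8B limits (weak on L2–L4); to be re-measured after LoRA training |
+| Qwen2.5-7B-Instruct bf16 | obedient, no thinking mode | slightly weaker Korean |
+| Qwen3-14B bf16 (if available) | mid-range capacity | unverified |
+
+→ **First choice: Qwen3-8B bf16**. On failure, fall back to 7B-Instruct.
+⚠️ **A bf16 base is mandatory** (FP8/FP8-dynamic is unsuitable for LoRA training; see the pitfall in OPUS-v1 §3.1).
+
+### C2.2 Environment
+
+- Unsloth container (separate setup; follow OPUS-v1 §2 but leave the 35B deployment untouched, since the training container takes up separate GPU time)
+- Or install `unsloth` with pip directly on the host (simpler)
+
+### C2.3 Training hyperparameters (first draft)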
+
+```python
+from unsloth import FastLanguageModel
+import torch
+
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name="Qwen/Qwen3-8B",  # bf16 base
+    max_seq_length=4096,
+    dtype=torch.bfloat16,
+    load_in_4bit=True,  # QLoRA
+)
+model = FastLanguageModel.get_peft_model(
+    model,
+    r=64, lora_alpha=128,
+    target_modules=["q_proj","k_proj","v_proj","o_proj"],  # attention-only for the first pass
+    lora_dropout=0, bias="none",
+    use_gradient_checkpointing="unsloth",
+    use_rslora=True,
+)
+# TrainingArguments: per_device_train_batch_size=2, gradient_accumulation_steps=4,
+#   max_steps=200, learning_rate=2e-4, warmup_steps=5, save_steps=50
+```
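To put the step budget in perspective, the arithmetic implied by the draft settings can be worked out as below. This is a sketch assuming a single GPU; the dataset sizes are assumptions (100 is the C1 minimum) and the helper name is illustrative.

```python
def epochs_seen(max_steps, per_device_batch, grad_accum, dataset_size, num_devices=1):
    """Number of passes over the dataset implied by a fixed step budget."""
    effective_batch = per_device_batch * grad_accum * num_devices
    return max_steps * effective_batch / dataset_size


# Draft values from C2.3; dataset sizes are assumptions for illustration.
for n in (100, 400):
    print(n, epochs_seen(max_steps=200, per_device_batch=2, grad_accum=4, dataset_size=n))
```

With only 100 items, 200 steps would revisit every example about 16 times, so it may be worth comparing eval loss across the `save_steps=50` checkpoints rather than assuming the last one is best.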
+
+### C2.4 Deliverables
+
+- `lora-adapters/calibration-v1/`: adapter weights
+- `lora-adapters/calibration-v1/training_log.json`: loss curve
+- `lora-adapters/calibration-v1/eval_report.md`: Phase 0 golden results + Verifier reject rate (before/after)
+
+### C2.5 Acceptance criteria (C2)
+
+- [ ] Training loss converges stably (eval loss monotonically decreasing or flat)
+- [ ] **0 regressions** on the Phase 0 golden evaluation
+- [ ] Post-training production simulation (5 invention-reproducing questions in opencode): **Verifier reject rate down by ≥ 50%** (the model has learned the self-correction behavior)
+- [ ] Adapter hot-swap or merge verified (procedure in v1 §4.1)
+
+**Regression trigger**: if the Phase 0 eval shows **≥ 2 regressions** or the **invention rate improves by < 30%**:
+  → one attempt at an `r ∈ {32, 128}` sweep plus adding `gate_proj/up_proj/down_proj` to `target_modules` (all-linear LoRA)
+  → a safety net in case r=64 / attention-only does not fit the target data
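The acceptance criteria and the regression trigger can be read as one decision rule, sketched below. The function name and the "review" outcome are illustrative, and treating "invention rate improvement" as the relative drop in Verifier reject rate is an assumption; the plan does not define how that improvement is measured.

```python
def c2_decision(regressions, reject_rate_before, reject_rate_after):
    """Map C2 eval numbers to 'accept', 'sweep', or 'review'.

    Assumption: improvement = relative reduction in Verifier reject rate.
    """
    if reject_rate_before <= 0:
        raise ValueError("reject_rate_before must be positive")
    improvement = (reject_rate_before - reject_rate_after) / reject_rate_before
    if regressions >= 2 or improvement < 0.30:
        return "sweep"   # regression trigger: r sweep + MLP target modules
    if regressions == 0 and improvement >= 0.50:
        return "accept"  # meets the C2.5 acceptance criteria
    return "review"      # in between: not accepted, not auto-triggered


print(c2_decision(0, 40, 20))  # 50% reduction, no regressions
print(c2_decision(0, 40, 30))  # 25% reduction
print(c2_decision(1, 40, 20))  # one regression
```

The "review" branch makes a gap visible: one regression, or an improvement between 30% and 50%, neither passes acceptance nor fires the trigger, so those cases would need a human call.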
+
+---
+
+## C3 — DPO alignment (1–2 weeks, conditional entry)
+
+> Entry condition: ≥ 2 weeks of production operation on the C2 adapter + ≥ 200 pairs of operator 👍/👎 data.
+
+### C3.1 Preliminary work (separate track)
+
+- **Add 👍/👎 buttons to opencode or the Web UI** (binary feedback per response); needs its own work order
+- Append feedback to `feedback/preferences.jsonl` (response text + tool calls + binary label)
+
+### C3.2 DPO data composition
+
+- chosen = 👍 response
+- rejected = 👎 response (or a response rejected by the Verifier)
+- Auxiliary automatic reward: weight by Verifier pass/fail as a binary reward
+
+### C3.3 Training
+
+- `trl.DPOTrainer` (base: the C2 SFT adapter as ref_model)
+- beta=0.1, learning_rate=5e-6, 1 epoch
+
+### C3.4 Deliverables / acceptance criteria
+
+- `lora-adapters/calibration-v2-dpo/`
+- **fabrication_rate** (Phase 0 eval) improves further relative to C2
+- Track operator satisfaction (👍 share): higher than with the C2 adapter
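One way the chosen/rejected pairing could be assembled from the feedback log is sketched below. The record shape (`prompt`, `response`, `label` with values `"up"`/`"down"`) is hypothetical, since the plan only says the log holds response text, tool calls, and a binary label. The output uses the `prompt`/`chosen`/`rejected` layout commonly used for preference datasets.

```python
import json
from collections import defaultdict


def build_dpo_pairs(feedback_lines):
    """Pair thumbs-up and thumbs-down responses to the same prompt.

    Assumed record shape (hypothetical):
    {"prompt": str, "response": str, "label": "up" | "down"}
    """
    by_prompt = defaultdict(lambda: {"up": [], "down": []})
    for line in feedback_lines:
        if not line.strip():
            continue
        rec = json.loads(line)
        by_prompt[rec["prompt"]][rec["label"]].append(rec["response"])

    pairs = []
    for prompt, groups in by_prompt.items():
        # Only prompts with at least one up and one down response yield a pair.
        for chosen, rejected in zip(groups["up"], groups["down"]):
            pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs


# Made-up feedback records, only to exercise the function.
log = [
    json.dumps({"prompt": "q1", "response": "good answer", "label": "up"}),
    json.dumps({"prompt": "q1", "response": "made-up answer", "label": "down"}),
    json.dumps({"prompt": "q2", "response": "only liked", "label": "up"}),
]
pairs = build_dpo_pairs(log)
print(pairs)
```

Prompts that received only one kind of feedback produce no pair under this scheme, so the "≥ 200 pairs" entry condition may require noticeably more than 200 raw votes.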
+
+---
+
+## Deliverables summary
+
+| Phase | New files | Changes |
+|---|---|---|
+| C1 | `mcp-server/training/curate_from_verifier.py`<br>`mcp-server/training/sft_data.jsonl`<br>`mcp-server/training/sft_data_stats.md` | — |
+| C2 | `lora-adapters/calibration-v1/*` | — |
+| C3 | `lora-adapters/calibration-v2-dpo/*`<br>`feedback/preferences.jsonl` (collection starts) | add 👍/👎 to the UI |
+
+---
+
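+## Do not (prohibited)
+
+- ❌ **Using the Phase 0 golden set for training** (eval-set leakage → invalidates the measurement)
+- ❌ Attempting LoRA training on an FP8/FP8-dynamic base (numerical damage, as in the RedHatAI 7B case we saw)
+- ❌ Applying LoRA to MoE expert modules (v1 §3.2: breaks vLLM hot-swap). Attention-only.
+- ❌ Automatic/overnight self-training loops (training without review = model collapse; dropped in v1 §5.2.B)
+- ❌ Writing fake rejections/hints into the data to get around Verifier rules (contaminates the training signal)
+- ❌ Modifying `mcp-server/eval/` or `verifier/validators.py` (out of scope for Phase C)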
+
+---
+
+## Start immediately (C1)
+
+```bash
+mkdir -p mcp-server/training
+# To be written by the implementer: curate_from_verifier.py
+# Input: mcp-server/verifier/logs/*.jsonl
+# Output: mcp-server/training/sft_data.jsonl + sft_data_stats.md
+
+# Validation:
+cd mcp-server
+python3 -m py_compile training/curate_from_verifier.py
+.venv/bin/python -c "
+import json
+items = [json.loads(l) for l in open('training/sft_data.jsonl')]
+print(f'total: {len(items)}')
+print(f'avg msgs: {sum(len(i[\"messages\"]) for i in items)/len(items):.1f}')
+"
+# Golden leak check (exact-hash match):
+.venv/bin/python -c "
+import json, hashlib
+g = {hashlib.md5(json.loads(l)['question'].encode()).hexdigest() for l in open('eval/golden.jsonl')}
+s = [json.loads(l) for l in open('training/sft_data.jsonl')]
+leaks = [i for i in s if any(hashlib.md5(m['content'].encode()).hexdigest() in g for m in i['messages'] if m['role']=='user')]
+print(f'leaks: {len(leaks)}')
+"
+```
+
+Completion report: `total / category distribution / golden leaks (0 expected) / ratio of rejection conversions to manual supplement`. C2 is triggered once the report is received.
--- a/plans/추출결과PANE-컬럼추가-작업.md
+++ b/plans/추출결과PANE-컬럼추가-작업.md
@@ -0,0 +1,167 @@
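+Add sub_area, from_at, and to_at columns to the P&ID extraction pane in the web UI.
+Tag number (can be reduced 40%), equipment name (can be reduced 30%), type (can be reduced 60%)
+Display in this order: tag number, equipment name, type, category, ROLE, FROM, FROM_at, TO, TO_at, SUB_AREA, with the rest unchanged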
+ Thought: 797ms
+│ Explore Task — Find PID pane files
+└ 25 toolcalls · 24.9s
+▣  Build · Big Pickle · interrupted
+I mean the extraction results PANE
+ Thought: 611ms
+✱ Glob "wwwroot/panes/*pid*" 
+✱ Glob "wwwroot/panes/*.html" 
+✱ Glob "wwwroot/js/*pid*" 
+ Thought: 431ms
+→ Read src/Web/wwwroot 
+ Thought: 384ms
+→ Read src/Web/wwwroot/panes 
+✱ Glob "wwwroot/js/*.js" 
+✱ Glob "wwwroot/**/pid*" 
+ Thought: 346ms
+→ Read src/Web/wwwroot/js 
+→ Read src/Web/wwwroot/panes/pid.html 
+ Thought: 1.7s
+→ Read src/Web/wwwroot/js/pid.js 
+ Thought: 662ms
+✱ Grep "sub_area|subArea|from_at|fromAt|to_at|toAt" in src 
+✱ Grep "class.*PidEquipment|pid_equipment" in src (39 matches)
+ Thought: 393ms
+→ Read src/Core/Domain/Entities/PidEquipment.cs 
+→ Read src/Core/Application/DTOs/PidResponse.cs 
+→ Read src/Web/Controllers/PidController.cs 
+ Thought: 7.6s
+✱ Grep "sub_area|from_at|to_at|FromAt|ToAt" in src (51 matches)
+ Thought: 3.6s
+✱ Grep "GetEquipmentAsync" in src (1 match)
+→ Read src/Core/Application/Services/PidExtractorService.cs [offset=387, limit=100]
+ Thought: 9.7s
+✱ Grep "tag_metadata|TagMetadata|Set<" in src (44 matches)
+ Thought: 6.2s
+✱ Grep "class UpdateEquipmentRequest|UpdateEquipmentRequest|FromAt|ToAt" in src (5 matches)
+✱ Grep "class UpdateEquipmentRequest" in src 
+ Thought: 515ms
+✱ Grep "UpdateEquipmentRequest|CreateEquipmentRequest" in src (8 matches)
+→ Read src/Core/Application/DTOs/PidEquipmentDto.cs 
+ Thought: 8.8s
+Now I have the full picture. The entity already has FromAt/ToAt fields — just need to expose them in the API and UI. For SubArea, it lives in tag_metadata so I'll add a [NotMapped] property and batch-join it. Let me make all the changes.
+# Todos
+[•] Add [NotMapped] SubArea to PidEquipment entity
+[ ] Add FromAt/ToAt to Create/Update DTOs
+[ ] Update PidExtractorService: GetEquipmentAsync batch sub_area, Update/Create handle FromAt/ToAt
+[ ] Update PidController response: add fromAt, toAt, subArea
+[ ] Update pid.html table headers: new column order + sub_area, from_at, to_at
+[ ] Update pid.js table row rendering for new columns
+[ ] Update pid.js add-row and save-row for FromAt/ToAt
+[ ] Build and verify
+ Thought: 314ms
+← Edit src/Core/Domain/Entities/PidEquipment.cs
+    [MaxLength(255)]
+    [Column("drawing_file")]
+    public string? DrawingFile { get; set; }
+    // ── UI only (no DB column; tag_metadata LEFT JOIN at API response time) ──
+    [System.ComponentModel.DataAnnotations.Schema.NotMapped]
+    public string? SubArea { get; set; }
+}
+ Thought: 281ms
+→ Read src/Core/Domain/Entities/PidEquipment.cs [limit=5]
+ Thought: 363ms
+# Todos
+[✓] Add [NotMapped] SubArea to PidEquipment entity
+[•] Add FromAt/ToAt to Create/Update DTOs
+[ ] Update PidExtractorService: GetEquipmentAsync batch sub_area, Update/Create handle FromAt/ToAt
+[ ] Update PidController response: add fromAt, toAt, subArea
+[ ] Update pid.html table headers: new column order + sub_area, from_at, to_at
+[ ] Update pid.js table row rendering for new columns
+[ ] Update pid.js add-row and save-row for FromAt/ToAt
+[ ] Build and verify
+← Edit src/Core/Application/DTOs/PidEquipmentDto.cs
+    bool? TagDcs = null,
+    string? Role = null,
+    string? FromTag = null,
+    string? ToTag = null,
+    string? FromAt = null,
+    string? ToAt = null,
+    string? TagClass = null);
+public record UpdateEquipmentRequest(
+    string? TagNo = null,
+    bool? TagDcs = null,
+    string? Role = null,
+    string? FromTag = null,
+    string? ToTag = null,
+    string? FromAt = null,
+    string? ToAt = null,
+    string? TagClass = null);
+ Thought: 701ms
+→ Read src/Core/Application/Services/PidExtractorService.cs [offset=387, limit=100]
+ Thought: 549ms
+→ Read src/Core/Application/Services/PidExtractorService.cs [offset=486, limit=30]
+ Thought: 840ms
+→ Read src/Core/Application/Services/PidExtractorService.cs [limit=20]
+ Thought: 770ms
+# Todos
+[✓] Add [NotMapped] SubArea to PidEquipment entity
+[✓] Add FromAt/ToAt to Create/Update DTOs
+[•] Update PidExtractorService: GetEquipmentAsync batch sub_area, Update/Create handle FromAt/ToAt
+[ ] Update PidController response: add fromAt, toAt, subArea
+[ ] Update pid.html table headers: new column order + sub_area, from_at, to_at
+[ ] Update pid.js table row rendering for new columns
+[ ] Update pid.js add-row and save-row for FromAt/ToAt
+[ ] Build and verify
+ Thought: 492ms
+← Edit src/Core/Application/Services/PidExtractorService.cs
+    {
+        items = await q.OrderByDescending(e => e.ExtractedAt)
+            .Skip((page - 1) * pageSize).Take(pageSize).ToListAsync();
+    }
+    // batch-load sub_area from tag_metadata
+    if (items.Count > 0)
+    {
+        var tagNos = items.Select(e => e.TagNo.ToLowerInvariant()).ToHashSet();
+        var subAreas = await _dbContext.TagMetadata
+            .Where(m => tagNos.Contains(m.BaseTag) && m.Attribute == "sub_area")
+            .Select(m => new { m.BaseTag, m.Value })
+            .ToListAsync();
+        var subMap = subAreas.ToDictionary(sa => sa.BaseTag, sa => sa.Value);
+        foreach (var e in items)
+        {
+            if (subMap.TryGetValue(e.TagNo.ToLowerInvariant(), out var sa))
+                e.SubArea = sa;
+        }
+    }
+    return (total, items);
+}
+private static string TagSortKey(string tagNo)
+ Thought: 361ms
+← Edit src/Core/Application/Services/PidExtractorService.cs
+TagDcs = request.TagDcs ?? false,
+Role = request.Role,
+FromTag = request.FromTag,
+ToTag = request.ToTag,
+FromAt = request.FromAt,
+ToAt = request.ToAt,
+TagClass = request.TagClass ?? ClassifyTagClass(request.Category, request.TagDcs ?? false),
+IsActive = true,
+Confidence = 1.0,
+ExtractedAt = DateTime.UtcNow,
+ Thought: 273ms
+← Edit src/Core/Application/Services/PidExtractorService.cs
+if (request.TagDcs.HasValue) e.TagDcs = request.TagDcs.Value;
+if (request.Role != null) e.Role = request.Role;
+if (request.FromTag != null) e.FromTag = request.FromTag;
+if (request.ToTag != null) e.ToTag = request.ToTag;
+if (request.FromAt != null) e.FromAt = request.FromAt;
+if (request.ToAt != null) e.ToAt = request.ToAt;
+if (request.TagClass != null) e.TagClass = request.TagClass;
+if (request.Category != null || request.TagDcs.HasValue)
+    e.TagClass = ClassifyTagClass(e.Category, e.TagDcs);
+if (request.FromTag != null || request.ToTag != null)
+ Thought: 476ms
+# Todos
+[✓] Add [NotMapped] SubArea to PidEquipment entity
+[✓] Add FromAt/ToAt to Create/Update DTOs
+[✓] Update PidExtractorService: GetEquipmentAsync batch sub_area, Update/Create handle FromAt/ToAt
+[•] Update PidController response: add fromAt, toAt, subArea
+[ ] Update pid.html table headers: new column order + sub_area, from_at, to_at
+[ ] Update pid.js table row rendering for new columns
+[ ] Update pid.js add-row and save-row for FromAt/ToAt
+[ ] Build and verify