feat: P&ID connection analysis, LLM agent mode, KB expansion, MCP server refactoring

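- P&ID: connection analysis API, prefix rule management, category classification, DXF graph build
- LLM: conversation summarization, persistent tool cards, time-series charts (uPlot), agent mode
- KB: chunk preview, Field Instrument Inference, authentication/Qdrant client
- MCP: extended server features, pipeline fixes, timeout improvements
- Frontend: added P&ID UI, LLM UI, KB UI, and OPC UA Write tab
- Config: updated AGENTS.md, plant_context, README, opencode.json
- Cleanup: removed the diagnostic checklist document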
2026-05-21 23:36:57 +09:00
parent 960bda4a3c
commit 302183c97e
142 changed files with 2432231 additions and 1082 deletions
--- a/mcp-server/server.py
+++ b/mcp-server/server.py
@@ -104,6 +104,24 @@ def _llm():
    return OpenAI(base_url=VLLM_BASE_URL, api_key="dummy")


+def _strip_think(text: str) -> str:
+    """Strip <think>...</think> reasoning blocks emitted by Qwen/DeepSeek-family models."""
+    if not text:
+        return text
+    for tag in ("think", "skip", "reason"):
+        pat = re.compile(rf"<{tag}>.*?</{tag}>", re.DOTALL)
+        text = pat.sub("", text).strip()
+    # Tag opened but never closed (truncated output): drop everything from the tag onward.
+    for tag in ("think", "skip", "reason"):
+        if f"<{tag}>" in text and f"</{tag}>" not in text:
+            idx = text.index(f"<{tag}>")
+            text = text[:idx].strip()
+        if f"</{tag}>" in text and f"<{tag}>" not in text:
+            idx = text.index(f"</{tag}>") + len(f"</{tag}>")
+            text = text[idx:].strip()
+    return text
+
+
 # ── PaddleOCR singleton (for PDF fallback) ───────────────────────────────────

@lru_cache(maxsize=1)
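Review note on `_strip_think`: the sketch below is a standalone approximation of the helper's intent, not the code in this commit. The function name `strip_reasoning` and the sample strings are hypothetical. It removes complete `<tag>...</tag>` blocks with a strict open/close pattern, then handles output that was truncated mid-block or that begins after the opening tag.

```python
import re

# Tag names mirror the ones listed in the diff; everything else is illustrative.
_TAGS = ("think", "skip", "reason")


def strip_reasoning(text: str) -> str:
    """Remove <tag>...</tag> blocks, then clean up truncated or orphaned tags."""
    if not text:
        return text
    for tag in _TAGS:
        open_t, close_t = f"<{tag}>", f"</{tag}>"
        # Complete blocks: from an opening tag to the nearest closing tag.
        pattern = re.escape(open_t) + r".*?" + re.escape(close_t)
        text = re.sub(pattern, "", text, flags=re.DOTALL).strip()
        if open_t in text and close_t not in text:
            # Opened but never closed: drop from the opening tag to the end.
            text = text[: text.index(open_t)].strip()
        elif close_t in text and open_t not in text:
            # Only a closing tag survived: keep what follows it.
            text = text[text.index(close_t) + len(close_t):].strip()
    return text


if __name__ == "__main__":
    print(strip_reasoning("<think>plan the query</think>SELECT 1"))
    print(strip_reasoning("SELECT 1 <think>unfinished"))
    print(strip_reasoning("leftover</think>\nSELECT 2"))
```

A strict `<tag>` ... `</tag>` pattern avoids matching from a closing tag to a later opening tag, which would delete the text between two separate blocks.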
@@ -603,8 +621,8 @@ async def _get_db_connection():

 def _validate_sql(sql: str) -> tuple[bool, str]:
    """SQL 안전 검증 — SELECT/WITH만 허용, 위험 키워드 차단."""
-    if len(sql) > 2000:
-        return False, "쿼리 길이 2000자를 초과했습니다."
+    if len(sql) > 4000:
+        return False, "쿼리 길이 4000자를 초과했습니다."
    dangerous = ['EXEC', 'DROP', 'DELETE', 'UPDATE', 'INSERT', 'ALTER', 'CREATE', 'GRANT', 'REVOKE', 'TRUNCATE', 'COPY']
    sql_upper = sql.upper()
    for kw in dangerous:
@@ -635,85 +653,22 @@ def _apply_sql_guards(sql: str, max_rows: int = SQL_MAX_ROWS) -> str:
    return f"SELECT * FROM ({s}) _capped LIMIT {max_rows}"


-# DB 스키마 — LLM SQL 생성 시 컨텍스트로 사용
+# Compact DB schema for LLM SQL generation
 _DB_SCHEMA = """
-PostgreSQL 시계열 데이터베이스 스키마
+Tables:
+  history_table(tagname TEXT, value TEXT, recorded_at TIMESTAMPTZ)
+  realtime_table(tagname TEXT, livevalue TEXT, timestamp TIMESTAMPTZ)
+  tag_metadata(base_tag TEXT, attribute TEXT, value TEXT)
+  event_history_table(tagname TEXT, prev_value TEXT, curr_value TEXT, event_type TEXT, event_time TIMESTAMPTZ, duration_seconds INT)

-테이블: history_table  (시계열 이력)
-  tagname     TEXT         - 태그명 (모두 소문자, 예: 'ficq-6113.pv') — 대소문자 구분
-  node_id     TEXT         - OPC UA 노드 ID
-  value       TEXT         - 측정값, 수치 연산 시 ::double precision 캐스트 필요
-  recorded_at TIMESTAMPTZ  - 기록 시각(UTC), 스냅샷 주기 약 60초
+Views:
+  v_tag_summary(base_tag TEXT, pv TEXT, sp TEXT, op TEXT, description TEXT, area TEXT)

-테이블: realtime_table  (실시간 최신값)
-  tagname     TEXT         - 태그명 (모두 소문자)
-  node_id     TEXT         - OPC UA 노드 ID
-  livevalue   TEXT         - 현재값
-  timestamp   TIMESTAMPTZ  - 최종 갱신 시각
-
-테이블: tag_metadata  (태그 메타데이터 - 변경 드묾)
-  base_tag    TEXT         - 기본 태그명 (예: 'ficq-6101', 'xv-6124')
-  attribute   TEXT         - 속성명 ('desc', 'area')
-  value       TEXT         - 메타데이터 값
-  node_id     TEXT         - OPC UA 노드 ID
-  loaded_at   TIMESTAMPTZ  - 마지막 로드 시각
-
-테이블: event_history_table  (디지털 포인트 상태 변경 이벤트)
-  id               BIGSERIAL    - PK
-  tagname          TEXT         - 태그명 (소문자)
-  node_id          TEXT
-  prev_value       TEXT         - 직전 값
-  curr_value       TEXT         - 현재 값
-  event_type       TEXT         - 'ALARM' / 'TRIP' / 'NORMAL' / 'RUN' / 'CHANGE'
-  event_time       TIMESTAMPTZ  - 이벤트 발생 시각(UTC)
-  area             TEXT         - tag_metadata.area 복사본
-  section          TEXT         - 태그명 패턴에서 추출한 차수(예: '6-1차')
-  duration_seconds INT          - 직전 상태에서 머문 시간
-  metadata         JSONB        - 부가 정보 (interlock 등)
-  created_at       TIMESTAMPTZ
-
-뷰: v_tag_summary  (실시간값 + 메타데이터 통합 뷰)
-  base_tag          TEXT   - 기본 태그명
-  pv                TEXT   - 현재 프로세스 값
-  sp                TEXT   - 설정값
-  op                TEXT   - 출력값
-  instate0          TEXT   - 상태 비트 0 (true/false)
-  instate1          TEXT   - 상태 비트 1 (true/false)
-  instate2          TEXT   - 상태 비트 2 (true/false)
-  description       TEXT   - 장비 설명 (tag_metadata.desc)
-  area              TEXT   - 소속 플랜트 (tag_metadata.area)
-
-새로운 태그 타입:
-  - 아날로그: ficq-6101.pv/sp/op (Double)
-  - 디지털 XV: xv-6124.pv/op (Int32), xv-6124.instate0~7 (Boolean)
-  - Pump: p-6102.pv/op (Int32), p-6102.instate0~7 (Boolean)
-  - 메타데이터: desc (String), area (Enum)
-
-BCD 상태 조회 팁:
-  - instate0~7은 Boolean (true/false)
-  - pv 값이 EnumValueType 형식인 경우 `{코드 | DisplayName | }`에서 DisplayName으로 상태 확인 가능
-  - v_tag_summary 뷰를 사용하면 실시간값+메타데이터 한 번에 조회 가능
-
-N분 간격 집계 공식 (time_bucket 금지, date_trunc 사용):
-  1분 버킷: date_trunc('minute', recorded_at) AS bucket
-  2분 버킷: to_timestamp(FLOOR(EXTRACT(EPOCH FROM recorded_at)/120)*120) AS bucket
-  5분 버킷: to_timestamp(FLOOR(EXTRACT(EPOCH FROM recorded_at)/300)*300) AS bucket
-  10분 버킷: to_timestamp(FLOOR(EXTRACT(EPOCH FROM recorded_at)/600)*600) AS bucket
-  N분 버킷: to_timestamp(FLOOR(EXTRACT(EPOCH FROM recorded_at)/(N*60))*(N*60)) AS bucket
-
-예시 (2분 간격, 여러 태그):
-  SELECT to_timestamp(FLOOR(EXTRACT(EPOCH FROM recorded_at)/120)*120) AS bucket,
-         tagname, AVG(value::double precision) AS avg_val
-  FROM history_table
-  WHERE tagname IN ('tag1', 'tag2')
-    AND recorded_at >= NOW() - INTERVAL '3 hours'
-  GROUP BY bucket, tagname ORDER BY bucket, tagname
-
-규칙:
-  - SELECT만 허용 (INSERT/UPDATE/DELETE/DROP 등 불가)
-  - tagname은 모두 소문자로 정확히 입력
-  - value 컬럼은 TEXT이므로 집계 시 ::double precision 캐스트 필수
-  - time_bucket 함수 사용 금지 — 위의 to_timestamp/FLOOR/EPOCH 공식 사용
+Rules:
+  - SELECT only. tagname lowercase exact match.
+  - value is TEXT; cast ::double precision when aggregating.
+  - time_bucket() banned. For N-min buckets: to_timestamp(FLOOR(EXTRACT(EPOCH FROM recorded_at)/(N*60))*(N*60))
+  - DB stores UTC. KST input is UTC+9, so subtract 9 hours (KST 12:00 = UTC 03:00).
 """

 # ── RAG tools ────────────────────────────────────────────────────────────────
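Review note on the compact schema rules: the two time rules (KST input versus UTC storage, and the N-minute bucket formula) can be checked outside the database. The sketch below is a Python equivalent written for illustration; `kst_to_utc` and `bucket_start` are hypothetical names, and the sample timestamps are made up apart from the "KST 12:00 = UTC 03:00" example taken from the prompt text.

```python
from datetime import datetime, timedelta, timezone

KST = timezone(timedelta(hours=9))  # Korea Standard Time is UTC+9


def kst_to_utc(dt_kst: datetime) -> datetime:
    """Treat a naive datetime as KST and convert it to UTC (subtracts 9 hours)."""
    return dt_kst.replace(tzinfo=KST).astimezone(timezone.utc)


def bucket_start(ts: datetime, n_minutes: int) -> datetime:
    """Python counterpart of
    to_timestamp(FLOOR(EXTRACT(EPOCH FROM ts)/(N*60))*(N*60))."""
    width = n_minutes * 60
    floored = int(ts.timestamp()) // width * width
    return datetime.fromtimestamp(floored, tz=timezone.utc)


if __name__ == "__main__":
    # KST 12:00 maps to UTC 03:00 on the same day.
    print(kst_to_utc(datetime(2026, 4, 27, 12, 0)))
    sample = datetime(2026, 4, 27, 3, 7, 42, tzinfo=timezone.utc)
    print(bucket_start(sample, 5))  # start of the 5-minute bucket
    print(bucket_start(sample, 2))  # start of the 2-minute bucket
```

Because the Unix epoch starts on a whole hour, flooring epoch seconds to a multiple of `N*60` lines buckets up with clock minutes for any N that divides 60.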
@@ -774,7 +729,8 @@ def ask_iiot_llm(question: str, context: str = "") -> str:
        max_tokens=2048,
        temperature=0.1,
    )
-    return resp.choices[0].message.content or "(응답 없음)"
+    content = _strip_think(resp.choices[0].message.content or "")
+    return content or "(응답 없음)"


@mcp.tool()
@@ -1184,22 +1140,13 @@ async def query_with_nl(question: str) -> str:
    
    system = (
        "You are a PostgreSQL SQL expert.\n"
-        "Convert the user's question into a SELECT SQL using the schema below.\n"
-        "IMPORTANT rules:\n"
-        "- Use ONLY PostgreSQL syntax. No DATE_FORMAT, no INTERVAL N DAY.\n"
-        "- Time column is 'recorded_at' (TIMESTAMPTZ). Do NOT use 'timestamp'.\n"
-        "- NEVER use time_bucket(). For N-minute buckets use to_timestamp/FLOOR/EPOCH formula.\n"
-        "- INTERVAL rule:\n"
-        "    * If the question specifies an interval (e.g. '2분 간격', '5-minute interval'):\n"
-        "        use: to_timestamp(FLOOR(EXTRACT(EPOCH FROM recorded_at)/(N*60))*(N*60)) AS bucket\n"
-        "        with GROUP BY bucket, tagname and AVG(value::double precision) AS avg_val\n"
-        "    * If NO interval is specified: SELECT recorded_at, tagname, value — NO GROUP BY.\n"
-        "- Current year is 2026. '4월 27일' means 2026-04-27.\n"
-        "- All times in DB are UTC. Korean input is KST (UTC+9). Convert: KST 12:00 = UTC 03:00.\n"
-        "- value column is TEXT; cast with ::double precision only when aggregating.\n"
-        "- All tagnames are lowercase (e.g. 'ficq-6113.pv'). Match exactly.\n"
-        "- PostgreSQL LIKE: dot has no special meaning, no escaping needed.\n"
-        "- Return ONLY the SQL statement. No explanation, no markdown.\n\n"
+        "Convert the user's question into a SELECT SQL.\n"
+        "Return ONLY the SQL. No explanation, no markdown, NO <think> tags.\n"
+        "Use PostgreSQL syntax. tagname lowercase exact match.\n"
+        "value is TEXT; cast ::double precision when aggregating.\n"
+        "DB times are UTC; KST input is UTC+9, so subtract 9 hours. Example: KST 12:00 = UTC 03:00.\n"
+        "For N-min buckets: to_timestamp(FLOOR(EXTRACT(EPOCH FROM recorded_at)/(N*60))*(N*60)).\n"
+        "No GROUP BY if no interval specified.\n\n"
        f"{_DB_SCHEMA}"
    )
    
@@ -1216,7 +1163,7 @@ async def query_with_nl(question: str) -> str:
            )
        
        resp = await asyncio.to_thread(_call_llm)
-        sql = (resp.choices[0].message.content or "").strip()
+        sql = _strip_think(resp.choices[0].message.content or "").strip()
        # Strip Markdown code fences
        if sql.startswith("```"):
            lines = sql.splitlines()
@@ -1319,6 +1266,139 @@ async def find_tags(query: str, area: str | None = None, top_k: int = 20) -> str
            conn.close()


+@mcp.tool()
+async def trace_connections(start_tag: str, direction: str = "downstream", max_depth: int = 20) -> str:
+    """pid_equipment 테이블의 from_tag/to_tag를 활용해 장비 연결 경로를 추적.
+
+    사용 시점: "스팀 경로 설명해줘", "원료 흐름 따라가줘", "T-203에서 어디로 가?" 같은 질문.
+    개별 태그 검색 + SQL 조합(10+ 라운드) → trace_connections 1회 호출로 대체.
+
+    Args:
+        start_tag: 시작 태그명 (예: 'FT-6115', 'T-203')
+        direction: 'downstream'(하류) 또는 'upstream'(상류). 기본 downstream.
+        max_depth: 최대 추적 깊이 (기본 20)
+
+    Returns:
+        JSON: { success, start_tag, direction, path: [{step, from_tag, to_tag, role, tag_no}] }
+    """
+    conn = None
+    try:
+        start_tag = start_tag.strip().upper()
+        direction = direction.strip().lower()
+        if direction not in ("downstream", "upstream"):
+            return json.dumps({"success": False, "error": "direction은 'downstream' 또는 'upstream'"}, ensure_ascii=False)
+        
+        conn = await _get_db_connection()
+        with conn.cursor() as cur:
+            cur.execute(f"SET statement_timeout = {SQL_STATEMENT_TIMEOUT_MS}")
+            
+            def _split_tags(tag_str):
+                if not tag_str:
+                    return []
+                return [t.strip() for t in tag_str.split(',') if t.strip()]
+
+            def _build_or_condition(tags):
+                if not tags:
+                    return "", []
+                conditions = []
+                params = []
+                for t in tags:
+                    conditions.append("from_tag LIKE %s")
+                    params.append(f'%{t}%')
+                    conditions.append("tag_no = %s")
+                    params.append(t)
+                return f"({' OR '.join(conditions)})", params
+
+            def _trace_downstream(current_tags, visited, depth):
+                if depth > max_depth or not current_tags:
+                    return
+                or_clause, params = _build_or_condition(current_tags)
+                if not or_clause:
+                    return
+                cur.execute(f"""
+                    SELECT tag_no, from_tag, to_tag, role
+                    FROM pid_equipment
+                    WHERE {or_clause}
+                    ORDER BY tag_no
+                """, params)
+                rows = cur.fetchall()
+                for row in rows:
+                    tag_no = row[0]
+                    if tag_no in visited:
+                        continue
+                    visited.add(tag_no)
+                    path.append({
+                        "step": len(path) + 1,
+                        "tag_no": tag_no,
+                        "from_tag": row[1],
+                        "to_tag": row[2],
+                        "role": row[3],
+                    })
+                    next_tags = _split_tags(row[2])
+                    _trace_downstream(next_tags, visited, depth + 1)
+
+            path = []
+            visited = {start_tag}
+            start_tags = _split_tags(start_tag)
+            if direction == "downstream":  # skip the unused downstream pass for upstream requests
+                _trace_downstream(start_tags, visited, 0)
+            
+            # upstream
+            if direction == "upstream":
+                def _trace_upstream(current_tags, visited, depth):
+                    if depth > max_depth or not current_tags:
+                        return
+                    conditions = []
+                    params = []
+                    for t in current_tags:
+                        conditions.append("to_tag LIKE %s")
+                        params.append(f'%{t}%')
+                        conditions.append("tag_no = %s")
+                        params.append(t)
+                    if not conditions:
+                        return
+                    or_clause = f"({' OR '.join(conditions)})"
+                    cur.execute(f"""
+                        SELECT tag_no, from_tag, to_tag, role
+                        FROM pid_equipment
+                        WHERE {or_clause}
+                        ORDER BY tag_no
+                    """, params)
+                    rows = cur.fetchall()
+                    for row in rows:
+                        tag_no = row[0]
+                        if tag_no in visited:
+                            continue
+                        visited.add(tag_no)
+                        path.append({
+                            "step": len(path) + 1,
+                            "tag_no": tag_no,
+                            "from_tag": row[1],
+                            "to_tag": row[2],
+                            "role": row[3],
+                        })
+                        prev_tags = _split_tags(row[1])
+                        _trace_upstream(prev_tags, visited, depth + 1)
+                
+                path = []
+                visited = set()
+                visited.add(start_tag)
+                start_tags = _split_tags(start_tag)
+                _trace_upstream(start_tags, visited, 0)
+            
+            return json.dumps({
+                "success": True,
+                "start_tag": start_tag,
+                "direction": direction,
+                "path": path,
+            }, ensure_ascii=False, indent=2)
+    except Exception as e:
+        return json.dumps({"success": False, "error": f"연결 추적 실패: {e}"}, ensure_ascii=False)
+    finally:
+        if conn:
+            conn.close()
+
+
@mcp.tool()
 async def query_events(
    tag_name: str | None = None,
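Review note on `trace_connections`: the traversal can be prototyped in memory before touching the database. The sketch below is a simplified alternative, not the committed implementation: it walks breadth-first with a queue instead of recursing, and it matches comma-separated tags exactly instead of using `LIKE '%tag%'`. The rows in `ROWS` and the `via` field are made up for illustration and only imitate the `tag_no`, `from_tag`, `to_tag` columns of `pid_equipment`.

```python
from collections import deque

# Hypothetical rows shaped like pid_equipment; the plant data is invented.
ROWS = [
    {"tag_no": "T-203", "from_tag": "", "to_tag": "P-101"},
    {"tag_no": "P-101", "from_tag": "T-203", "to_tag": "FT-6115,XV-6124"},
    {"tag_no": "FT-6115", "from_tag": "P-101", "to_tag": ""},
    {"tag_no": "XV-6124", "from_tag": "P-101", "to_tag": "T-203"},  # loops back
]


def split_tags(tag_str):
    """Split a comma-separated tag field into clean tag names."""
    return [t.strip() for t in (tag_str or "").split(",") if t.strip()]


def trace(rows, start_tag, direction="downstream", max_depth=20):
    """Breadth-first walk over to_tag (downstream) or from_tag (upstream) links."""
    link = "to_tag" if direction == "downstream" else "from_tag"
    by_tag = {r["tag_no"]: r for r in rows}
    visited = {start_tag}  # guards against cycles in the connection data
    queue = deque([(start_tag, 0)])
    path = []
    while queue:
        tag, depth = queue.popleft()
        row = by_tag.get(tag)
        if row is None or depth >= max_depth:
            continue
        for nxt in split_tags(row[link]):
            if nxt in visited:
                continue
            visited.add(nxt)
            path.append({"step": len(path) + 1, "tag_no": nxt, "via": tag})
            queue.append((nxt, depth + 1))
    return path


if __name__ == "__main__":
    for hop in trace(ROWS, "T-203"):
        print(hop)
```

Exact matching avoids a pitfall of substring search, where a pattern such as `%T-203%` would also match a tag like `FT-2031`. Whether substring matching is intentional in the committed code is not stated in the diff.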
@@ -1540,7 +1620,7 @@ async def summarize_events(

    try:
        resp = await asyncio.to_thread(_call)
-        summary = resp.choices[0].message.content or "(요약 없음)"
+        summary = _strip_think(resp.choices[0].message.content) or "(요약 없음)"
    except Exception as e:
        summary = f"(LLM 요약 실패: {e})"

@@ -1626,7 +1706,7 @@ async def generate_status_report(area: str | None = None, hours: int = 24) -> st

    try:
        resp = await asyncio.to_thread(_call)
-        report = resp.choices[0].message.content or "(보고서 생성 실패)"
+        report = _strip_think(resp.choices[0].message.content) or "(보고서 생성 실패)"
    except Exception as e:
        report = f"(LLM 보고서 실패: {e})"

@@ -1845,7 +1925,7 @@ async def parse_pid_pdf(filepath: str, use_ocr: bool = True) -> str:
        
        resp = await asyncio.to_thread(_call_llm)
        
-        raw = (resp.choices[0].message.content or "").strip()
+        raw = _strip_think(resp.choices[0].message.content or "").strip()
        
        # Strip Markdown code fences
        if raw.startswith("```"):