ExperionCrawler/mcp-server/worker/pid_extract_prompts.py
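windpacer 960bda4a3c fix: correct P&ID pipe line number misclassification (power_equipment → pipings)
- _PID_LINENO_FULL_RE: fixed 7-field regex → unified 5-to-7-field regex (fixes phase-9 line numbers such as P-9107-25A-F-n not matching)
- _extract_pid_dxf_fast: removed hardcoded layer names → a FULL_RE match takes priority, with LINENO-family layer names as a secondary hint
- MatchCategoryAsync: run the pipe line number regex (_pipeLineNoRe) check before the prefix rules → fixes tags such as P-9117-20A-F-n being misclassified as power_equipment
- pump extractor prompt: added pipe line number SKIP/INCLUDE examples
- DB: reclassified 435 existing records as pipings (direct SQL)
- .claude/settings.json: removed hardcoded LLM model name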

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-17 10:36:30 +09:00
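
The ordering change described in the commit message (line-number regex first, prefix rules second) can be sketched in Python as follows. This is only an illustration: `LINE_NO_RE` is a hypothetical stand-in for `_PID_LINENO_FULL_RE`, whose real pattern is not shown here, and `PREFIX_RULES` and `classify_tag` are made-up names rather than the project's actual `MatchCategoryAsync` logic.

```python
import re

# Hypothetical stand-in for _PID_LINENO_FULL_RE: 5 to 7 hyphen-separated
# fields, where the third field is a pipe size such as 25A or 100A.
LINE_NO_RE = re.compile(r"[A-Z]{1,4}-\d{3,5}-\d{1,4}A(?:-[A-Za-z0-9]+){2,4}")

# Hypothetical prefix rules; the real rule table is not part of this file.
PREFIX_RULES = {
    "P": "power_equipment",
    "VP": "power_equipment",
    "DP": "power_equipment",
}


def classify_tag(tag: str) -> str:
    """Return a category, checking the line-number pattern before prefix rules."""
    tag = tag.strip()
    # Regex check first: a full pipe line number is always a piping.
    if LINE_NO_RE.fullmatch(tag):
        return "pipings"
    # Only then fall back to the prefix lookup.
    prefix = tag.split("-", 1)[0]
    return PREFIX_RULES.get(prefix, "unknown")
```

If the prefix lookup ran first, `P-9117-20A-F-n` would hit the `P` rule and come back as `power_equipment`, which is the misclassification the commit describes.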


"""Per-instrument-type prompt definitions for the P&ID extractor."""
# Common prompt header
_PROMPT_HEADER = """You are a P&ID (Piping and Instrumentation Diagram) expert.
Extract ONLY the specified instrument types from the provided DXF text.
Return ONLY a valid JSON array. Each element must have exactly these fields:
{"tagNo":"FCV-101","equipmentName":null,"instrumentType":"FCV","lineNumber":null,"pidDrawingNo":null,"confidence":0.95}
Rules:
- tagNo: any token matching [LETTERS]-[DIGITS] or [LETTERS]-[DIGITS]-[SUFFIX]
- instrumentType: leading letters of tagNo
- equipmentName: descriptive name if present near tag, else null
- lineNumber/pidDrawingNo: null unless explicitly associated
- confidence: 0.95 for clear tags, lower for ambiguous
- Output ONLY the JSON array, no markdown, no explanation.
- If no tags found, return: []
"""
# Sensors/instruments: FT, FIT, LT, PT, TE, PG, LG, TG
_SENSOR_PROMPT = _PROMPT_HEADER + """
Extract ONLY flow transmitters (FT/FIT), level transmitters (LT),
pressure transmitters (PT), temperature elements (TE),
pressure gauges (PG), level gauges (LG), temperature gauges (TG).
Target instrument types: FT, FIT, FIC, LIC, PIC, TIC, LT, PT, TE, PG, LG, TG,
and their variants (e.g., FIT-XXXX, FT-XXXX).
Examples: FT-101, FIT-10115, PT-201, LT-301, TE-401, PG-501, LG-601, TG-701
"""
# Valves: FCV, TCV, LCV, PCV, XV, FV, LV, PV, TV
_VALVE_PROMPT = _PROMPT_HEADER + """
Extract ONLY control valves and on/off valves.
Target instrument types: FCV, TCV, LCV, PCV, XV, FV, LV, PV, TV,
BCV, GV, and their variants (e.g., FCV-XXXX, PCV-XXXX, XV-XXXX).
Examples: FCV-101, TCV-201, LCV-301, PCV-401, XV-501, FV-601, LV-701, PV-801
"""
# Systems/controllers: LI, PI, TI, FIQ, FICQ, TICA, PICA, LICA
_SYSTEM_PROMPT = _PROMPT_HEADER + """
Extract ONLY indicating instruments, recorders, and controllers.
Target instrument types: LI, PI, TI, SI, HI, FIQ, FICQ, TICA, PICA, LICA,
FIC, LIC, PIC, TIC, and their variants.
Examples: LI-101, PI-201, TI-301, FIQ-401, FICQ-501, TICA-601, PICA-701, LICA-801
"""
# Gauges: PG, TG, LG
_GAUGE_PROMPT = _PROMPT_HEADER + """
Extract ONLY gauges (pressure, temperature, level).
Target instrument types: PG, TG, LG, SG, HG, and their variants.
Examples: PG-101, TG-201, LG-301, PG-10101, TG-10201
"""
# Pumps: P-10101, VP-10117, DP-10101, etc.
_PUMP_PROMPT = _PROMPT_HEADER + """
Extract ONLY pumps and compressors (simple equipment tags, NO pipe size suffix).
Target equipment types: P (pump), VP (vertical pump), DP (dual pump),
C (compressor), CP (centrifugal pump), BP (booster pump), SP (sump pump),
and their variants.
Examples (4 to 5 digit loop numbers): P-10101, VP-10117, DP-10101, C-10201, P-9101, P-9116, VP-9201
IMPORTANT: Do NOT extract pipeline/line numbers that have a pipe size suffix (e.g. 25A, 50A, 100A).
SKIP (pipeline, not a pump): P-10101-25A-F1A-n, P-9107-25A-F-n, CHR-9641-50A-F-C50
INCLUDE (pump tag): P-10101, VP-10117, P-9101
"""
# Prompt mapping: extractor category -> prompt text
PROMPTS = {
    "sensor": _SENSOR_PROMPT,
    "valve": _VALVE_PROMPT,
    "system": _SYSTEM_PROMPT,
    "gauge": _GAUGE_PROMPT,
    "pump": _PUMP_PROMPT,
}
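
A caller might use the mapping roughly as sketched below: select a prompt by category, append the DXF text, then check the model's reply against the rules stated in the prompt header. This is a minimal sketch under assumptions. `build_prompt`, `parse_response`, `TAG_RE`, and the `"DXF text:"` separator are hypothetical and do not come from the project; the mapping is passed in as a parameter so the snippet stays self-contained.

```python
import json
import re

# Fields the prompt header says every element must have.
REQUIRED_FIELDS = {
    "tagNo", "equipmentName", "instrumentType",
    "lineNumber", "pidDrawingNo", "confidence",
}

# [LETTERS]-[DIGITS] with at most one optional suffix, per the header rules.
TAG_RE = re.compile(r"[A-Z]+-\d+(?:-[A-Za-z0-9]+)?")


def build_prompt(prompts: dict, category: str, dxf_text: str) -> str:
    """Combine the category prompt with the DXF text (assumed layout)."""
    if category not in prompts:
        raise ValueError(f"unknown extractor category: {category!r}")
    return f"{prompts[category]}\nDXF text:\n{dxf_text}"


def parse_response(raw: str) -> list:
    """Parse the model reply and keep only elements that follow the rules."""
    items = json.loads(raw)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array")
    valid = []
    for item in items:
        if not isinstance(item, dict) or set(item) != REQUIRED_FIELDS:
            continue
        tag = item["tagNo"]
        if not isinstance(tag, str) or not TAG_RE.fullmatch(tag):
            continue
        # instrumentType must be the leading letters of tagNo.
        if item["instrumentType"] != tag.split("-", 1)[0]:
            continue
        valid.append(item)
    return valid
```

Because `TAG_RE` allows at most one suffix, a pipe line number such as `P-9107-25A-F-n` is dropped even if the model returns it, giving the pump extractor a second line of defense behind the SKIP/INCLUDE examples.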