fix: P&ID 배관번호 분류 오류 수정 (power_equipment → pipings)

- _PID_LINENO_FULL_RE: 7필드 고정 regex → 5~7필드 통합 (9차 P-9107-25A-F-n 등 미매칭 수정)
- _extract_pid_dxf_fast: 레이어 이름 하드코딩 제거 → FULL_RE 매칭 우선, LINENO 계열 레이어 힌트 보조
- MatchCategoryAsync: 배관번호 regex(_pipeLineNoRe) 체크를 prefix 룰보다 먼저 실행 → P-9117-20A-F-n 등이 power_equipment로 오분류되던 문제 수정
- pump extractor 프롬프트: 배관번호 SKIP/INCLUDE 예시 추가
- DB 기존 레코드 435건 pipings로 재분류 (직접 SQL)
- .claude/settings.json: LLM 모델명 하드코딩 제거

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
windpacer
2026-05-17 10:36:30 +09:00
parent 0ccec38c18
commit 960bda4a3c
4 changed files with 885 additions and 309 deletions

File diff suppressed because it is too large Load Diff

View File

@@ -59,13 +59,17 @@ Examples: PG-101, TG-201, LG-301, PG-10101, TG-10201
# 펌프: P-10101, VP-10117, DP-10101 등
_PUMP_PROMPT = _PROMPT_HEADER + """
Extract ONLY pumps and compressors.
Extract ONLY pumps and compressors (simple equipment tags, NO pipe size suffix).
Target equipment types: P (pump), VP (vertical pump), DP (dual pump),
C (compressor), CP (centrifugal pump), BP (booster pump),
Target equipment types: P (pump), VP (vertical pump), DP (dual pump),
C (compressor), CP (centrifugal pump), BP (booster pump), SP (sump pump),
and their variants.
Examples: P-10101, VP-10117, DP-10101, C-10201, CP-10301, BP-10401
Examples (4~5 digit loop numbers): P-10101, VP-10117, DP-10101, C-10201, P-9101, P-9116, VP-9201
IMPORTANT: Do NOT extract pipeline/line numbers that have a pipe size suffix (e.g. 25A, 50A, 100A).
SKIP (pipeline, not a pump): P-10101-25A-F1A-n, P-9107-25A-F-n, CHR-9641-50A-F-C50
INCLUDE (pump tag): P-10101, VP-10117, P-9101
"""
# 프롬프트 매핑