# 🛠️ Graph Pipeline Phase 1: Geometric Data Extraction

This document covers the detailed implementation plan for **geometric data extraction**, the first stage of the P&ID Graph Pipeline. The goal is to go beyond plain text extraction and preserve the **physical location (coordinates)** and **geometric properties** of every object in the drawing, so that the later topology modeling stage has the data it needs.

---

## 📦 1. Required Packages and Environment Setup

### 1.1 Python Packages

| Package | Purpose | Notes |
|---|---|---|
| `ezdxf` | DXF file parsing and entity extraction | Core library |
| `shapely` | Geometric operations (intersection, distance, bounding box) | Required for coordinate-based analysis |
| `numpy` | Bulk coordinate computation and matrix operations | Performance optimization |
| `pandas` | Structuring extracted object data and saving it as CSV/JSON | Data management |
| `pydantic` | Schema definition and validation for extracted data | Ensures data integrity |
| `pytesseract` / `pdf2image` | Region-based OCR extraction from PDF drawings | Needed only when processing PDFs |

### 1.2 Installation Command

```bash
pip install ezdxf shapely numpy pandas pydantic pytesseract pdf2image
```

---

## 📐 2. Detailed Design

### 2.1 Data Model (Schema)

Every extracted object follows the `GeometricEntity` model, which carries the following common attributes.

```python
from pydantic import BaseModel
from typing import List, Tuple


class BoundingBox(BaseModel):
    min_x: float
    min_y: float
    max_x: float
    max_y: float
    center: Tuple[float, float]


class GeometricEntity(BaseModel):
    entity_id: str
    entity_type: str  # TEXT, LINE, CIRCLE, POLYLINE, ARC
    layer: str
    bbox: BoundingBox
    properties: dict  # text value, color, line weight, etc.
    coordinates: List[Tuple[float, float]]  # start/end points or a list of vertices
```

### 2.2 Processing Pipeline Flow

1. **DXF Load:** Load the drawing with `ezdxf.readfile()`.
2. **Entity Iteration:** Iterate over the entities on every layer and classify them by type.
3. **Coordinate Extraction:**
    * `TEXT`: Compute the bounding box from the insertion point and the text length.
    * `LINE`: Extract the start and end points.
    * `POLYLINE`: Extract the full list of vertices.
    * `CIRCLE/ARC`: Extract the center point and radius.
4. **Spatial Normalization:** Normalize the drawing coordinate system to the analysis system's coordinate system.
5. **Structured Export:** Save the result as JSON or to a database (PostgreSQL/PostGIS).

---

## 💻 3. Implementation Guide (Example)

### 3.1 Core Code for DXF Geometric Extraction

```python
import ezdxf
from shapely.geometry import box
from typing import List


class PidGeometricExtractor:
    def __init__(self, file_path: str):
        self.doc = ezdxf.readfile(file_path)
        self.msp = self.doc.modelspace()

    def get_bbox(self, entity):
        """Compute the entity's bounding box and return it as a shapely box."""
        if entity.dxftype() == 'TEXT':
            # Simplified box anchored at the insertion point.
            # The 10 x 5 size is a placeholder; a real implementation
            # should derive it from the text height and length.
            p = entity.dxf.insert
            return box(p.x, p.y, p.x + 10, p.y + 5)
        elif entity.dxftype() == 'LINE':
            start = entity.dxf.start
            end = entity.dxf.end
            return box(min(start.x, end.x), min(start.y, end.y),
                       max(start.x, end.x), max(start.y, end.y))
        # ... other entity types to be implemented
        return None

    def extract_all(self) -> List[dict]:
        results = []
        for entity in self.msp:
            bbox_obj = self.get_bbox(entity)
            if bbox_obj:
                min_x, min_y, max_x, max_y = bbox_obj.bounds
                # Only TEXT entities carry a text value. Reading `text` from
                # other entity types raises an ezdxf error rather than
                # returning a default, so check the type first.
                is_text = entity.dxftype() == 'TEXT'
                results.append({
                    "id": entity.dxf.handle,
                    "type": entity.dxftype(),
                    "layer": entity.dxf.layer,
                    "bbox": {
                        "min_x": min_x,
                        "min_y": min_y,
                        "max_x": max_x,
                        "max_y": max_y
                    },
                    "value": entity.dxf.text if is_text else None
                })
        return results


# Usage example
extractor = PidGeometricExtractor("plant_drawing.dxf")
geometric_data = extractor.extract_all()
```

### 3.2 Utility Functions: Proximity Check (Proximity Utility)

These are the core utilities that Phase 2 (topology modeling) will rely on.

```python
from shapely.geometry import Point


def is_near(entity_a_bbox, entity_b_bbox, threshold=5.0):
    """Return True if the shortest distance between two bounding boxes is within the threshold."""
    return entity_a_bbox.distance(entity_b_bbox) <= threshold


def is_inside(point, bbox):
    """Return True if the point lies inside the bounding box.

    Points exactly on the boundary are not counted by `contains`;
    use `covers` instead if they should be.
    """
    return bbox.contains(Point(point))
```

---

## 🚀 4. Phase 1 Completion Criteria (Definition of Done)

- [ ] Are the coordinates of every `TEXT`, `LINE`, and `POLYLINE` in the DXF file extracted without omission?
- [ ] Is an accurate `Bounding Box` computed and stored for each object?
- [ ] Is the extracted data saved as JSON that conforms to the `GeometricEntity` schema?
- [ ] (Optional) For PDF drawings, are text coordinates extracted via OCR?
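
The `get_bbox` example in section 3.1 uses a fixed-size placeholder box for `TEXT` and leaves `CIRCLE` and `ARC` unimplemented. As a rough sketch of how those boxes might be derived from the values listed in section 2.2 (insertion point, text length, center, radius), the helpers below use only the standard library. The function names, the `char_width_ratio` default of 0.6, and the assumption that text is unrotated and left-aligned are illustrative choices, not part of the original plan. Arc angles are taken in degrees and swept counter-clockwise, which is how DXF stores them.

```python
import math
from typing import Tuple

# (min_x, min_y, max_x, max_y), matching the field order of BoundingBox
BBox = Tuple[float, float, float, float]


def text_bbox(insert: Tuple[float, float], text: str, height: float,
              char_width_ratio: float = 0.6) -> BBox:
    """Estimate the box of unrotated, left-aligned text.

    char_width_ratio is an assumed average glyph width relative to the
    text height; real fonts will differ.
    """
    x, y = insert
    width = len(text) * height * char_width_ratio
    return (x, y, x + width, y + height)


def circle_bbox(center: Tuple[float, float], radius: float) -> BBox:
    """Exact box of a full circle."""
    cx, cy = center
    return (cx - radius, cy - radius, cx + radius, cy + radius)


def arc_bbox(center: Tuple[float, float], radius: float,
             start_deg: float, end_deg: float) -> BBox:
    """Box of an arc swept counter-clockwise from start_deg to end_deg.

    Equal start and end angles are treated as a single point here.
    """
    cx, cy = center
    start = start_deg % 360.0
    sweep = (end_deg - start_deg) % 360.0
    # The extremes of an arc are its two endpoints plus any axis
    # direction (0, 90, 180, 270 degrees) that the sweep passes through.
    angles = [start, start + sweep]
    for quadrant in (0.0, 90.0, 180.0, 270.0):
        if (quadrant - start) % 360.0 <= sweep:
            angles.append(quadrant)
    xs = [cx + radius * math.cos(math.radians(a)) for a in angles]
    ys = [cy + radius * math.sin(math.radians(a)) for a in angles]
    return (min(xs), min(ys), max(xs), max(ys))


# Usage with made-up values
tag_box = text_bbox((10.0, 20.0), "P-101", height=2.5)
valve_box = circle_bbox((5.0, 5.0), 2.0)
elbow_box = arc_bbox((0.0, 0.0), 1.0, 0.0, 90.0)
```

Each helper returns a plain tuple, so the result can be passed to `shapely.geometry.box(*bbox)` or unpacked into the `BoundingBox` fields. Rotated or center-aligned text would need extra handling that this sketch does not attempt.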