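OllamaController.cs: Re-diagnosis Report (2nd Pass)

Target: src/Web/Controllers/OllamaController.cs (1,194 lines)
Criteria: diagnosis-checklist.md, 8 steps
Date: 2026-05-16

⚠️ Key changes since the 1st diagnosis:

Found a case-sensitivity bug in LoadConfig() (missed in the 1st diagnosis)

Former static cache issue (HIGH): re-evaluated as not a problem in the actual deployment environment (single systemd process), downgraded to MED

Former Headers.Append issue: downgraded to LOW (no duplication condition exists in the current architecture)

Former silent catch issue (HIGH): adjusted to MED after analyzing the actual fall-through path

New findings: reflection-based extraction in VllmChatStreamWithTools, thundering herd in GetModels

Blocking file I/O item: failed Q3/Q4 and was removed from the report entirely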

STEP 1: Context (unchanged)

Role: Ollama / vLLM LLM HTTP proxy + MCP tool-calling bridge (Web/API controller)
Main endpoints: GET /models, POST /chat, POST /chat/stream (Ollama native) + vllm/ variants + GET/POST /config + GET /ping

STEP 2: Structure survey (unchanged)

STEP 3: Code reading (full re-check completed)

STEP 4: Call hierarchy map (unchanged)

STEP 5: Pattern matching + STEP 6: Cross-validation

#	Finding	Q1 Already fixed?	Q2 Other layer?	Q3 Intentional?	Q4 Repro scenario?	Result
1	LoadConfig() case-sensitive deserialization → saved settings are lost	❌	❌	❌	✅ Save → refresh → default (localhost) shown	🔴 HIGH
2	`VllmChatStreamWithTools` extracts values from an anonymous type via reflection → slow and fragile	❌	❌	❌	✅ Multiple concurrent tool_calls under load	🟠 MED
3	Silent catch + fallthrough when JSON-text tool detection fails	❌	❌	❌	✅ LLM outputs `{"tool":"run_sql","parameters":{...}}` → parse exception → tool ignored	🟠 MED
4	static cache (`_capsCache`) memory leak: TTL-expired entries are never removed	❌	❌	❌	✅ Accumulates when 100+ models are registered in Ollama	🟠 MED
5	`GetModels()` queries every model's capabilities in parallel at once (thundering herd)	❌	❌	❌	✅ With 20+ models in Ollama, 20+ simultaneous /api/show requests	🟠 MED
6	Summarize ignores `LoadVllmModel()` and reads only the environment variable	❌	❌	❌	✅ `VLLM_MODEL` unset + `req.Model` null → vLLM request with an empty model string	🟠 MED
7	`Response.Headers.Append()`: duplication theoretically possible	❌	✅ Nothing earlier in the current middleware chain sets these headers	❌	❌ Not reproducible in practice (LOW condition)	🟡 LOW
8	HttpRequestMessage/HttpResponseMessage not disposed	❌	✅ Managed internally by .NET HttpClient	❌	❌ No measurable leak	🟡 LOW
9	ExtractFirstJsonObject does not handle braces inside strings	❌	❌	❌	✅ Rare, but parsing fails when LLM output contains `{`	🟡 LOW

STEP 7: Detailed diagnosis

[1]. LoadConfig() case-sensitive deserialization → settings lost after save (🔴 HIGH)

Problem: SetConfig() writes the JSON file with camelCase keys, e.g. {"host":"10.0.0.50","port":11434}, but LoadConfig() reads it with JsonSerializer.Deserialize<OllamaConfig>(json) using default options (PropertyNameCaseInsensitive = false), so host is never matched to Host. The OllamaConfig class has no [JsonPropertyName] attributes, so the defaults (localhost:11434) are always returned.

Evidence:

File write (SetConfig(), lines 492-496): the JSON keys are camelCase:

var json = JsonSerializer.Serialize(new
{
    host = cfg.Host,    // ← "host" (camelCase)
    port = cfg.Port     // ← "port" (camelCase)
}, new JsonSerializerOptions { WriteIndented = true });
System.IO.File.WriteAllText(path, json);

File read (LoadConfig(), lines 47-63): PropertyNameCaseInsensitive defaults to false:

OllamaConfig LoadConfig()
{
    var path = OllamaConfigPath;
    if (System.IO.File.Exists(path))
    {
        var json = System.IO.File.ReadAllText(path);
        return JsonSerializer.Deserialize<OllamaConfig>(json)  // ← case-sensitive!
            ?? new OllamaConfig();  // ← "host" != "Host" → defaults returned
    }
    return new OllamaConfig();  // localhost:11434
}

Target class (OllamaConfig, lines 1144-1150):

public class OllamaConfig
{
    public string Host { get; set; } = "localhost";  // ← PascalCase
    public int Port { get; set; } = 11434;            // ← PascalCase
    public string BaseUrl => $"http://{Host}:{Port}";
}

Note: ASP.NET Core [FromBody] model binding uses the System.Text.Json web defaults (JsonSerializerDefaults.Web), where PropertyNameCaseInsensitive = true, so SetConfig(OllamaConfig cfg) successfully receives { host: "..." }. However, the JsonSerializer.Deserialize<T>() call in LoadConfig() does not share those options.

Impact (user scenario):

On the settings screen, change the Ollama host to 10.0.0.50 → POST /api/ollama/config → SetConfig → right after the file is saved, the response is correct: {"host":"10.0.0.50","port":11434}
Frontend alert: "Refresh the page to apply the changes."
Refresh → llmLoadConfigToUI() → GET /api/ollama/config → GetConfig() → LoadConfig() → "host" != "Host" → Host = "localhost" is returned
The user wonders why the save did not stick and retries repeatedly; saving always appears to have failed
Worse, after a server restart the service runs against localhost:11434, so it cannot connect to the remote Ollama server

Fix (choose one of two):

Fix A (recommended: apply case-insensitive options in LoadConfig):

private static readonly JsonSerializerOptions _configJsonOptions = new()
{
    PropertyNameCaseInsensitive = true
};

OllamaConfig LoadConfig()
{
    try
    {
        var path = OllamaConfigPath;
        if (System.IO.File.Exists(path))
        {
            var json = System.IO.File.ReadAllText(path);
            return JsonSerializer.Deserialize<OllamaConfig>(json, _configJsonOptions)
                ?? new OllamaConfig();
        }
    }
    catch (Exception ex)
    {
        _logger.LogWarning(ex, "[OllamaController] Failed to load config, using defaults");
    }
    return new OllamaConfig();
}

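Fix B (save the file in PascalCase by changing SetConfig):

// Save in PascalCase from SetConfig.
// Note: the computed BaseUrl property is serialized too; being read-only, it is ignored on read.
System.IO.File.WriteAllText(path, JsonSerializer.Serialize(cfg, new JsonSerializerOptions { WriteIndented = true }));

Fix A is safer. Another code path may someday write this file in camelCase, so ignoring case on the read side is the more robust fix.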
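The mismatch can be reproduced outside the controller. The following is a minimal sketch, assuming only System.Text.Json; DemoConfig is a hypothetical stand-in that mirrors the OllamaConfig shape quoted above, not the real class.

```csharp
using System;
using System.Text.Json;

// JSON exactly as SetConfig() writes it: camelCase keys.
const string saved = "{\"host\":\"10.0.0.50\",\"port\":11434}";

// Default options are case-sensitive, so "host" never binds to Host.
var strict = JsonSerializer.Deserialize<DemoConfig>(saved) ?? new DemoConfig();

// Fix A: case-insensitive property matching on the read side.
var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
var relaxed = JsonSerializer.Deserialize<DemoConfig>(saved, options) ?? new DemoConfig();

Console.WriteLine($"default options:  {strict.BaseUrl}");   // http://localhost:11434
Console.WriteLine($"case-insensitive: {relaxed.BaseUrl}");  // http://10.0.0.50:11434

// Hypothetical stand-in for OllamaConfig (same property names and defaults).
public class DemoConfig
{
    public string Host { get; set; } = "localhost";
    public int Port { get; set; } = 11434;
    public string BaseUrl => $"http://{Host}:{Port}";
}
```

The first line printed corresponds to what the settings screen shows after a refresh; the second is the behavior expected once the read side ignores case.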

[2]. `VllmChatStreamWithTools`: tool calls extracted from an anonymous type via reflection (🟠 MED)

Problem: VllmChatStreamWithTools() (lines 894-899) stores anonymous objects in tcList and then, within the same method, pulls the values back out via reflection (GetType().GetProperty(...)). The values are already available in local variables (tcId, funcName, funcArgs), so the extraction is redundant.

Evidence (OllamaController.cs:887-899):

// 1. Store anonymous objects in tcList
tcList.Add(new
{
    id = tcId,           // ← the value is already in a local variable at this point
    type = "function",
    function = new
    {
        name = funcName,
        arguments = funcArgs
    }
});

// 2. Immediately afterwards, read them back via reflection (lines 894-899)
foreach (var tc in tcList)
{
    var tcId = tc.GetType().GetProperty("id")?.GetValue(tc) as string ?? "";
    var func = tc.GetType().GetProperty("function")?.GetValue(tc);
    var funcName = func?.GetType().GetProperty("name")?.GetValue(func) as string ?? "";
    var funcArgs = func?.GetType().GetProperty("arguments")?.GetValue(func) as string ?? "{}";

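Impact:

Reflection is markedly slower than direct member access (10-100x is a commonly quoted range, not a measurement from this codebase)
No compile-time type safety: if an anonymous type property is renamed, the lookup silently returns null
Wasted CPU under heavy load with many tool_calls

Fix (use the variables that are already known):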

// The loop above (lines 866-884) already knows tcId, funcName, funcArgs.
// → Either run the tool inside that loop, or keep the values in a separate
//   List<(string, string, string)> as shown here.
var toolCallInfos = new List<(string id, string name, string args)>();
foreach (var tc in toolCalls.EnumerateArray())
{
    var id = tc.GetProperty("id").GetString() ?? $"tc_{toolRound}_{Guid.NewGuid():N}";
    var func = tc.GetProperty("function");
    var name = func.GetProperty("name").GetString() ?? "";
    var args = func.GetProperty("arguments").GetString() ?? "{}";
    toolCallInfos.Add((id, name, args));
}

// Keep tcList only for messages.Add (lines 887-892); add the assistant message exactly once.
messages.Add(new
{
    role = "assistant",
    content = (string?)null,
    tool_calls = tcList
});

// Tool execution: use the values in toolCallInfos directly (no reflection; replaces lines 894-899)
foreach (var (tcId, funcName, funcArgs) in toolCallInfos)
{
    await EmitToolStart(tcId, funcName, funcArgs);
    // ... rest unchanged ...
}
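A typed variant of the same idea is sketched below: a single record holds each tool call, and the wire-format object for the assistant message is derived from it, so nothing has to be read back by property name. The ToolCall record, its ToWire() helper, and the sample payload are illustrative assumptions, not code from the controller.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Sample tool_calls array in the OpenAI-compatible shape (assumed for illustration).
const string payload =
    "[{\"id\":\"call_1\",\"type\":\"function\",\"function\":{\"name\":\"run_sql\",\"arguments\":\"{}\"}}]";

var calls = new List<ToolCall>();
using (var doc = JsonDocument.Parse(payload))
{
    foreach (var tc in doc.RootElement.EnumerateArray())
    {
        var func = tc.GetProperty("function");
        // TryGetProperty avoids an exception when the model omits "id".
        var callId = tc.TryGetProperty("id", out var idEl) ? idEl.GetString() : null;
        calls.Add(new ToolCall(
            callId ?? $"tc_{Guid.NewGuid():N}",
            func.GetProperty("name").GetString() ?? "",
            func.GetProperty("arguments").GetString() ?? "{}"));
    }
}

// The assistant message is built from the typed list, not the other way around.
var assistant = new { role = "assistant", content = (string?)null, tool_calls = calls.ConvertAll(c => c.ToWire()) };
Console.WriteLine(JsonSerializer.Serialize(assistant));

// Tool execution reads strongly typed fields; a rename is caught at compile time.
foreach (var (id, name, args) in calls)
    Console.WriteLine($"execute {name} (id={id}, args={args})");

// Hypothetical record; positional records generate Deconstruct automatically.
public record ToolCall(string Id, string Name, string Args)
{
    public object ToWire() => new { id = Id, type = "function", function = new { name = Name, arguments = Args } };
}
```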

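[3]. JSON-text tool detection: silent catch + fallthrough (🟠 MED)

Problem: In VllmChatStreamWithTools() (lines 944-1036), when the JSON inside the LLM response is parsed to detect a text-based tool call, catch { } (line 1035) swallows the exception and execution falls through. If stopContent is non-empty, it is sent over SSE as raw text, exposing the tool-call intent to the user. Nothing is logged, so the failure cannot be debugged.

Evidence (OllamaController.cs:944-1036):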

var jsonCandidate = ExtractFirstJsonObject(stopContent);

if (!string.IsNullOrEmpty(jsonCandidate))
{
    try
    {
        // ... JSON parsing, tool detection, execution ...
        if (detectedTool != null && args.Count > 0)
        {
            // ... tool call ...
            continue;  // ← only the success path continues
        }
        // (failure: detectedTool == null || args.Count == 0) → fallthrough!
    }
    catch { }  // ← parse exceptions also fall through → nothing logged
}

// 1038-1047: if stopContent is non-empty, send it over SSE as raw text
if (!string.IsNullOrEmpty(stopContent))
{
    var msgJson = JsonSerializer.Serialize(new { message = new { content = stopContent } });
    await Response.WriteAsync($"event: message\ndata: {msgJson}\n\n");
    // ...
    return;  // ← raw text exposed instead of the tool call!
}

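Impact:

The LLM outputs {"tool": "run_sql", "parameters": {"sql": "..."}} but JSON parsing fails → the SQL is not executed and the JSON text is delivered to the user verbatim
Multi-step reasoning in agent mode is interrupted
catch { } makes debugging impossible

Fix: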

if (!string.IsNullOrEmpty(jsonCandidate))
{
    bool toolExecuted = false;
    try
    {
        // ... existing parse/detect/execute logic ...
        if (detectedTool != null && args.Count > 0) toolExecuted = true;
    }
    catch (Exception ex)
    {
        _logger.LogWarning(ex, "[OllamaController] JSON tool detection parse failed, falling back to text: {Candidate}", jsonCandidate);
    }
    if (toolExecuted) continue;
}

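[4]. `_capsCache`: TTL-expired entries never removed, memory leak (🟠 MED)

Problem: _capsCache in GetModelCapabilitiesAsync() (lines 329-357) sets a TTL with DateTime.UtcNow.AddMinutes(5), but nothing removes expired entries. If new models keep being added to the Ollama server (experimental/development environments), the dictionary grows without bound.

Evidence (OllamaController.cs:329-357):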

private static readonly Dictionary<string, (DateTime Until, string[] Caps)> _capsCache = new();
private static readonly object _capsCacheLock = new();

private async Task<string[]> GetModelCapabilitiesAsync(string baseUrl, string model)
{
    lock (_capsCacheLock)
    {
        if (_capsCache.TryGetValue(model, out var hit) && hit.Until > DateTime.UtcNow)
            return hit.Caps;
    }
    // ... HTTP call ...
    lock (_capsCacheLock) { _capsCache[model] = (DateTime.UtcNow.AddMinutes(5), caps); }
    return caps;
}

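Impact:

Going from 10 to 200 models grows the cache 20x, with no upper bound
Entries are never removed during the app's lifetime, which can contribute to memory fragmentation

Fix: replace with IMemoryCache (TTL + automatic eviction of expired entries):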

using Microsoft.Extensions.Caching.Memory;

private readonly IMemoryCache _memoryCache;

// DI: register with builder.Services.AddMemoryCache(); in Program.cs
public OllamaController(..., IMemoryCache memoryCache)
{
    _memoryCache = memoryCache;
}

private async Task<string[]> GetModelCapabilitiesAsync(string baseUrl, string model)
{
    var cacheKey = $"caps_{model}";
    if (_memoryCache.TryGetValue(cacheKey, out string[]? cached) && cached is not null)
        return cached;

    try
    {
        // ... HTTP call ...
        _memoryCache.Set(cacheKey, caps, TimeSpan.FromMinutes(5));  // entry expires 5 minutes after it is set
        return caps;
    }
    catch { return Array.Empty<string>(); }
}
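If introducing IMemoryCache is not desirable, a smaller change is to prune expired entries whenever a new value is stored. The sketch below illustrates that alternative under stated assumptions: CapsTtlCache is a hypothetical helper, and the clock is passed in explicitly so the expiry behavior can be shown without waiting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var cache = new CapsTtlCache(TimeSpan.FromMinutes(5));
var t0 = DateTime.UtcNow;

cache.Set("model-a", new[] { "completion" }, t0);
// Six minutes later model-a has expired, so this write prunes it.
cache.Set("model-b", new[] { "embedding" }, t0.AddMinutes(6));

Console.WriteLine(cache.Count);                                       // 1
Console.WriteLine(cache.TryGet("model-a", t0.AddMinutes(6), out _));  // False

// Hypothetical helper: lock-guarded TTL dictionary with prune-on-write.
public sealed class CapsTtlCache
{
    private readonly Dictionary<string, (DateTime Until, string[] Caps)> _items = new();
    private readonly object _gate = new();
    private readonly TimeSpan _ttl;

    public CapsTtlCache(TimeSpan ttl) => _ttl = ttl;

    public int Count { get { lock (_gate) return _items.Count; } }

    public bool TryGet(string key, DateTime now, out string[] caps)
    {
        lock (_gate)
        {
            if (_items.TryGetValue(key, out var hit) && hit.Until > now)
            {
                caps = hit.Caps;
                return true;
            }
        }
        caps = Array.Empty<string>();
        return false;
    }

    public void Set(string key, string[] caps, DateTime now)
    {
        lock (_gate)
        {
            // Collect expired keys first; removing while enumerating is unsafe.
            var expired = _items.Where(kv => kv.Value.Until <= now).Select(kv => kv.Key).ToList();
            foreach (var k in expired) _items.Remove(k);
            _items[key] = (now.Add(_ttl), caps);
        }
    }
}
```

Pruning on write keeps the dictionary bounded by the number of models seen within one TTL window, at the cost of a linear scan per insert, which is small at the model counts discussed here.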

[5]. `GetModels()`: calls `/api/show` for every model in parallel at once (🟠 MED)

Problem: GetModels() (lines 300-305) uses allModels.Select(async n => ...) + Task.WhenAll(tasks) to query the capabilities of every model concurrently. With 20 models on the Ollama server, 20 /api/show requests are fired in a single burst.

Evidence (OllamaController.cs:300-305):

var tasks = allModels.Select(async n =>
{
    var caps = await GetModelCapabilitiesAsync(cfg.BaseUrl, n);
    return (name: n, isChat: caps.Contains("completion"));
}).ToList();
var results = await Task.WhenAll(tasks);

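Impact:

Load is concentrated on the Ollama server (responses are delayed further if a model is being loaded from disk)
On initial page load, /api/ollama/models can be delayed by tens of seconds
Server resources can be exhausted if the endpoint is called hundreds of times per minute

Fix: limit concurrency with SemaphoreSlim:

private static readonly SemaphoreSlim _modelCapSemaphore = new(3);  // at most 3 concurrent requests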

var tasks = allModels.Select(async n =>
{
    await _modelCapSemaphore.WaitAsync();
    try
    {
        var caps = await GetModelCapabilitiesAsync(cfg.BaseUrl, n);
        return (name: n, isChat: caps.Contains("completion"));
    }
    finally { _modelCapSemaphore.Release(); }
}).ToList();
var results = await Task.WhenAll(tasks);

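Or, more simply: since a cache exists, drop Task.WhenAll from GetModels and query sequentially:

// A 5-minute TTL cache already exists, so even with sequential lookups the second request onward hits the cache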
foreach (var n in allModels)
{
    var caps = await GetModelCapabilitiesAsync(cfg.BaseUrl, n);
    if (caps.Contains("completion")) chatModels.Add(n);
    else embeddingModels.Add(n);
}
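The effect of the semaphore can be checked in isolation. The following sketch assumes a fake /api/show call (a 20 ms delay) and a hypothetical ConcurrencyProbe class that records the peak number of calls in flight; neither exists in the controller.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var probe = new ConcurrencyProbe();
var gate = new SemaphoreSlim(3); // same limit as in the fix above

// 20 "models" requested at once, but only 3 may be in flight at a time.
var tasks = Enumerable.Range(0, 20).Select(async i =>
{
    await gate.WaitAsync();
    try { return await probe.FakeShowAsync($"model-{i}"); }
    finally { gate.Release(); }
}).ToList();

var results = await Task.WhenAll(tasks);
Console.WriteLine($"calls: {results.Length}, peak in flight: {probe.Peak}");

// Hypothetical probe standing in for the HTTP call to /api/show.
public sealed class ConcurrencyProbe
{
    private readonly object _lock = new();
    private int _inFlight;
    private int _peak;

    public int Peak { get { lock (_lock) return _peak; } }

    public async Task<string> FakeShowAsync(string model)
    {
        lock (_lock) { _inFlight++; _peak = Math.Max(_peak, _inFlight); }
        await Task.Delay(20); // simulated network latency
        lock (_lock) { _inFlight--; }
        return model;
    }
}
```

All 20 calls complete, and the reported peak never exceeds the semaphore's limit of 3. Removing the WaitAsync/Release pair lets the peak climb toward 20, which is the burst described above.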

[6]. Summarize ignores `LoadVllmModel()` (🟠 MED)

Problem: Summarize() (line 663) sets the default model to Environment.GetEnvironmentVariable("VLLM_MODEL") ?? "". Unlike LoadVllmModel(), which falls back in the order file → environment variable → default, it checks only the environment variable and then falls back to an empty string.

Evidence (OllamaController.cs:663):

var model = string.IsNullOrWhiteSpace(req.Model)
    ? Environment.GetEnvironmentVariable("VLLM_MODEL") ?? ""   // ← ignores LoadVllmModel()
    : req.Model;

Existing method in the same controller (OllamaController.cs:536-553):

string LoadVllmModel()
{
    // 1. Check the llm-model.json file
    // 2. If the file is missing → return the default "Qwen3.6-27B-FP8"
}

Impact:

Where the VLLM_MODEL environment variable is unset, a request without req.Model sends an empty model string and vLLM returns a 400 error
The model name configured in llm-model.json is ignored

Fix:

var model = string.IsNullOrWhiteSpace(req.Model)
    ? LoadVllmModel()    // ← reuse the existing method
    : req.Model;

[7]. `Response.Headers.Append()`: duplication theoretically possible (🟡 LOW)

Problem: The SSE streaming endpoints (lines 402-405, 702-705) set headers with Response.Headers.Append(). IHeaderDictionary.Append() adds a value when the header already exists (it does not overwrite), so duplicate-header conflicts are possible behind a reverse proxy such as nginx.

Evidence (OllamaController.cs:402-405):

Response.Headers.Append("Content-Type", "text/event-stream");
Response.Headers.Append("Cache-Control", "no-cache");
Response.Headers.Append("Connection", "keep-alive");
Response.Headers.Append("X-Accel-Buffering", "no");

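Impact: Not reproducible in the current architecture (served directly by Kestrel, with no earlier header setter in the middleware chain). It could occur if an nginx proxy is introduced.

Fix: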

Response.Headers["Content-Type"] = "text/event-stream";
Response.Headers["Cache-Control"] = "no-cache";
Response.Headers["Connection"] = "keep-alive";
Response.Headers["X-Accel-Buffering"] = "no";

[8]. HttpRequestMessage / HttpResponseMessage not disposed (🟡 LOW)

Problem: The streaming methods (OllamaChatStream lines 420-425, VllmChatStreamSimple lines 751-756, VllmChatStreamWithTools lines 832-836 and 1060-1065) are missing using on HttpRequestMessage and HttpResponseMessage.

Evidence: OllamaChatStream (lines 420-434):

var httpRequest = new HttpRequestMessage(HttpMethod.Post, $"{cfg.BaseUrl}/api/chat")
{
    Content = content
};  // ← no using
var res = await _httpClient.SendAsync(httpRequest, ...);  // ← no using
// ...
using var reader = new StreamReader(stream, Encoding.UTF8);  // only the StreamReader has using

Impact: .NET HttpClient (SocketsHttpHandler) manages connections internally and disposing the StreamReader closes the stream, so the practical leak is negligible. Still, code analysis tools flag it, and it can add GC pressure under extreme load.

Fix: apply using var consistently:

using var httpRequest = new HttpRequestMessage(HttpMethod.Post, ...) { Content = content };
using var res = await _httpClient.SendAsync(httpRequest, HttpCompletionOption.ResponseHeadersRead, HttpContext.RequestAborted);

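[9]. `ExtractFirstJsonObject`: `{` braces inside strings not handled (🟡 LOW)

Problem: ExtractFirstJsonObject (lines 1130-1141) counts brace depth only, ignoring escaping and string state. If a string literal in the LLM response contains {, the depth calculation goes wrong.

Impact: If the LLM emits brace-delimited text mid-response (e.g. Description: parameters {a, b, c}), the brace depth breaks and the wrong range is extracted. Low frequency.

Fix: same as in the 1st diagnosis (add string-state tracking logic).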
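The 1st report's code is not reproduced here, so the following is only a sketch of what string-state tracking could look like. JsonScan is a hypothetical wrapper class, and the string → string? signature is an assumption about the real ExtractFirstJsonObject.

```csharp
#nullable enable
using System;

// A '}' inside the SQL string literal must not close the outer object.
var sample = "call: {\"tool\":\"run_sql\",\"parameters\":{\"sql\":\"SELECT '}'\"}} done";
Console.WriteLine(JsonScan.ExtractFirstJsonObject(sample));

public static class JsonScan
{
    // Returns the first balanced {...} span, ignoring braces inside JSON strings.
    public static string? ExtractFirstJsonObject(string text)
    {
        int start = text.IndexOf('{');
        if (start < 0) return null;

        int depth = 0;
        bool inString = false, escaped = false;
        for (int i = start; i < text.Length; i++)
        {
            char c = text[i];
            if (inString)
            {
                if (escaped) escaped = false;        // character after a backslash is literal
                else if (c == '\\') escaped = true;
                else if (c == '"') inString = false; // unescaped quote ends the string
                continue;
            }
            if (c == '"') inString = true;
            else if (c == '{') depth++;
            else if (c == '}' && --depth == 0)
                return text.Substring(start, i - start + 1);
        }
        return null; // unbalanced input
    }
}
```

This handles braces and escaped quotes inside string literals only. Prose braces that appear before the JSON (as in the {a, b, c} example) would still be picked up as the first candidate and rejected later by the JSON parser, so trying subsequent candidates would be a separate change.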

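STEP 8: Self-verification

Each finding points directly to specific line numbers in the current file
The HIGH item (1) has a reproducible scenario that can be stated in one sentence
All four cross-validation questions passed
Performance figures without measurements excluded
Item missed in the previous report (LoadConfig case-sensitivity) added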

Final summary

Severity	Count	Items
🔴 HIGH	1	[1] `LoadConfig()` case-sensitive deserialization → settings lost after save
🟠 MED	5	[2] Reflection-based tool call extraction, [3] JSON detection silent catch + fallthrough, [4] cache memory leak, [5] thundering herd, [6] Summarize default missing
🟡 LOW	3	[7] Headers.Append duplication, [8] HttpMessage disposal missing, [9] ExtractFirstJsonObject braces

Recommended fix order: [1] → [2] → [3] → [5] → [4] → [6] → [9] → [7] → [8]

[1] is a service failure: settings cannot be saved, so the remote Ollama server is unreachable
[2] is a functional bug: if the reflection lookup breaks, tool calls are ignored
[3] blocks debugging: missed tool calls are silent
[5] is performance: page load is delayed by several seconds with 20 or more models
[4][6][9] are stability/consistency
[7][8] are best practice

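Comparison with the 1st report

Item	1st (severity)	2nd (severity)	Reason
static cache	🔴 HIGH	🟠 MED	Not a problem in a single-process systemd environment; only the memory leak was re-evaluated
JSON silent catch	🔴 HIGH	🟠 MED	Even after fallthrough, stopContent is still sent over SSE (not a crash)
Headers.Append	🟠 MED	🟡 LOW	No earlier middleware, so not reproducible in practice
HttpMessage disposal	🟠 MED	🟡 LOW	Managed internally by the .NET runtime
LoadConfig caching	🟡 LOW	🔴 HIGH	Case-sensitivity bug found (missed in the 1st pass)
Reflection tool call	-	🟠 MED	Missed in the 1st pass
Thundering herd	-	🟠 MED	Missed in the 1st pass
Summarize default	🟠 MED	🟠 MED	Unchanged
ExtractFirstJsonObject	🟡 LOW	🟡 LOW	Unchanged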
