MSoftech · Apr 2025 – Nov 2025

Clinical
Nursing EMR

An AI-powered EMR simulation platform that bridges the gap between clinical practice and nursing education. 6 AI Agents evaluate student performance across 55 criteria in real time.

6
AI Agents
55
Evaluation Criteria
40K+
Log Entries
90%
Time Reduction
Overview

Project Overview

Nursing students practice clinical workflows in a real hospital-grade EMR environment — covering OCS, ENR, Lab, and PACS — exactly as they would on an actual ward. Every action, every input, every decision is captured as a structured evaluation log in real time.

The evaluation engine was built on domain knowledge, not just code. We mapped actual clinical nursing protocols — ward rounds, medication administration, vital sign assessment, intake/output management — and translated them into 55 structured evaluation criteria that mirror real professional standards. Upon completing a simulation, 6 specialized AI Agents automatically assess each student's performance: identifying what they did well, what needs improvement, and why — with structured, actionable feedback.
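To make the protocol-to-criteria mapping concrete, here is a minimal sketch of how one protocol step might be encoded as a structured criterion. The field names, `id` scheme, and example values are illustrative assumptions, not the platform's actual schema:

```typescript
// Hypothetical shape for one of the 55 evaluation criteria.
// Field names and values are illustrative, not the real schema.
interface EvaluationCriterion {
  id: string;              // stable identifier, e.g. "VS-03" (assumed scheme)
  domain: "vitals" | "medication" | "records" | "io" | "assessment" | "actions";
  standard: string;        // clinical standard the criterion is codified from
  description: string;     // what the student must demonstrate
  maxScore: number;
}

const criterion: EvaluationCriterion = {
  id: "VS-03",
  domain: "vitals",
  standard: "Korean Nurses Association vital sign measurement standard",
  description: "Blood pressure recorded within ±5 mmHg of the reference value",
  maxScore: 2,
};
```

Encoding each criterion as data rather than ad-hoc prompt text is what makes the later claim of traceability possible: every score can point back at a specific criterion, and every criterion at a specific published standard.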

This architecture delivers measurable outcomes: evaluation time reduced by 90% compared to manual grading, full consistency across 100+ students per cohort, and professor workload cut by 40% — shifting their role from repetitive scoring to targeted clinical mentoring. The system has processed over 40,000 evaluation logs in production.

Period
Apr 2025 – Nov 2025
Type
MSoftech Product
Role
AI Solutions Architect · Full-Stack Engineer
Domain
Healthcare · AI · Education
Domain Foundation

Clinical Standards

Based on official nursing standards, not arbitrarily invented criteria

Evaluation systems in regulated domains cannot rely on approximations. Each of the 55 criteria in this platform is systematically codified from published Korean national nursing standards — the clinical references that underpin nursing licensure and professional practice guidelines. We design evaluation engines only after clinical judgment is formally mapped at the protocol level, because healthcare AI that settles for 'mostly correct' becomes a liability the moment it enters production.

Practice Guide
Fundamentals of Nursing Practice Guidelines
Vitals Standard
Korean Nurses Association Vital Sign Measurement Standard
Patient Safety
Patient Safety Indicators · Morse Fall Scale
VITAL SIGNS
Measurement Tolerances
BT: ±0.2°C
HR: ±3 bpm
RR: ±2/min
BP: ±5 mmHg
SpO2: ±1%
MEDICATION
The 5 Rights
Right Patient
Right Drug
Right Dose
Right Route
Right Time
NURSING RECORDS
F-DAR Format
Focus
Data
Action
Response
SAFETY
Risk Assessment
Morse Fall Scale
9-Category Initial
Pain NRS (0-10)
Danger Value Flag
How It Works

System Overview

[System flow diagram]
STUDENT ACTIVITY: Practice Session → Save Practice Output → Generate Action Log
HUMAN-IN-THE-LOOP (EVALUATION MANAGEMENT):
1. AI First Pass: Run 6 AI Agents
2. Review (Core): Monitor AI Output · Review AI Prompts & Feedback · Verify Student Action Logs
3. Confirmation: Approve Final Score
STUDENT RESULT: Grade Report & Detailed Feedback
A student completes a clinical simulation, and the system captures two outputs: structured practice data and a granular action log. These flow into 6 specialized AI Agents for automated first-pass evaluation. From there, the professor takes over — reviewing AI-generated scores alongside the full prompt context and raw student action logs — before approving or adjusting the final grade. Only after this human confirmation does the result reach the student.
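The flow above can be summarized as a small state machine in which a result never reaches the student without passing through human confirmation. The states and transitions below are assumptions drawn from the diagram, not the production implementation:

```typescript
// Minimal state machine for the human-in-the-loop flow.
// States and transitions are an assumed sketch of the described pipeline.
type EvalState = "LOGGED" | "AI_SCORED" | "UNDER_REVIEW" | "CONFIRMED";

const TRANSITIONS: Record<EvalState, EvalState[]> = {
  LOGGED: ["AI_SCORED"],                    // 6 agents run the first pass
  AI_SCORED: ["UNDER_REVIEW"],              // professor opens the review screen
  UNDER_REVIEW: ["CONFIRMED", "AI_SCORED"], // approve, or send back for re-run
  CONFIRMED: [],                            // terminal: result released to student
};

// Enforce that only legal transitions occur; anything else throws.
function advance(from: EvalState, to: EvalState): EvalState {
  if (!TRANSITIONS[from].includes(to)) {
    throw new Error(`Illegal transition ${from} -> ${to}`);
  }
  return to;
}
```

The key property is structural: there is no edge from `AI_SCORED` to `CONFIRMED`, so an AI score can never become a final grade without the review step in between.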
AI Agents

Evaluation Engine

6 specialized agents, each dedicated to an independent clinical domain

VitalSignsAgent
Measurement Accuracy
BT ±0.2°C · HR ±3 bpm
RR ±2/min · BP ±5 mmHg
MedicationAgent
5 Rights Protocol
Patient · Drug · Dose
Route · Time
NursingRecordAgent
F-DAR Documentation
Focus · Data
Action · Response
IoAgent
Intake / Output
Recording accuracy
and completeness
InitialAssessmentAgent
9-Category Assessment
Patient safety · Pain
Fall risk · Consciousness
ActionLogAgent
Behavioral Sequence
Action vs expected
clinical workflow

All 6 agents run in parallel against the same student session, producing 28 independent criterion evaluations in a single pass. Every criterion traces directly back to the clinical standards defined above.
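Because the agents are independent, the single-pass fan-out reduces to running them concurrently against one session. The sketch below uses stub agents (real ones would call Vertex AI); the agent names match the cards above, everything else is illustrative:

```typescript
// One agent takes a session and returns its scores.
interface AgentResult { agent: string; scores: Record<string, number>; }
type Agent = (sessionId: string) => Promise<AgentResult>;

// Stub factory for illustration; production agents call Vertex AI.
const makeAgent = (name: string): Agent =>
  async (_sessionId) => ({ agent: name, scores: { [`${name}:demo`]: 1 } });

const AGENTS: Agent[] = [
  "VitalSignsAgent", "MedicationAgent", "NursingRecordAgent",
  "IoAgent", "InitialAssessmentAgent", "ActionLogAgent",
].map(makeAgent);

// One parallel pass over the session. Promise.allSettled could be
// substituted so one failing agent does not block the other five.
async function evaluateSession(sessionId: string): Promise<AgentResult[]> {
  return Promise.all(AGENTS.map((a) => a(sessionId)));
}
```

Parallel fan-out is what makes the single-pass claim cheap: total latency is bounded by the slowest agent, not the sum of all six.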

Evaluation Philosophy

Human-in-the-Loop

AI proposes, humans decide: safety designed into the architecture itself

Every AI evaluation in this system is a draft, not a verdict. Before any score reaches a student, a professor reviews the full AI reasoning alongside raw action logs — and can approve, adjust, or override. This isn't a compliance checkbox added at the end. Human oversight was embedded into the architecture from day one, following the same structural principle that FDA SaMD and EU AI Act require: AI proposes, humans decide.

REVIEW
What the Professor Sees
→ Full prompt transmitted to the LLM — 28,873 characters, version-controlled
→ AI's complete reasoning and structured feedback, not just the final score
→ Every student action — logged at the keystroke level with timestamps
→ Clinically critical values auto-flagged with full contextual data
CONTROL
Professor's Control
→ Approve or override AI-generated scores
→ Adjust individual category scores without affecting others
→ Selectively re-execute any single agent for re-validation
→ Append domain-specific comments and learning points
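The control surface above can be sketched as three pure operations on an evaluation record: adjust one category without touching others, queue a single agent for re-execution, and approve. The types and field names are illustrative assumptions:

```typescript
// Hypothetical evaluation record under professor control.
interface Evaluation {
  scores: Record<string, number>;   // category -> score
  approved: boolean;
  rerunRequested: string[];         // agents queued for selective re-execution
}

// Adjust one category score; all other categories are left untouched.
function adjustCategory(ev: Evaluation, category: string, score: number): Evaluation {
  return { ...ev, scores: { ...ev.scores, [category]: score } };
}

// Queue a single agent for re-validation without affecting the rest.
function requestRerun(ev: Evaluation, agent: string): Evaluation {
  return { ...ev, rerunRequested: [...ev.rerunRequested, agent] };
}

// Final human confirmation; only now may the result reach the student.
function approve(ev: Evaluation): Evaluation {
  return { ...ev, approved: true };
}
```

Modeling each action as a pure function keeps an audit trail trivial: every intermediate state can be logged, which matches the traceability requirements described in the next section.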
Production Infrastructure

Full Observability

Reliability through transparent tracing, logging, and re-execution of every LLM call

TRACEABILITY
Full Pipeline Tracking
Session context, token usage, and schema validation — every step of the AI reasoning process transparently logged.
RE-EXECUTION
Independent Re-run
Any individual LLM call can be independently re-executed from logged data, enabling precise issue reproduction and root-cause analysis.
RELIABILITY
Enterprise-Grade Trust
Debugging — rapid error resolution
Incident Traceback — precise failure analysis
Compliance Audit — regulatory adherence proof
PROMPT ENGINEERING
Production-Scale Prompt Management
→ Structured prompts at production scale — 28,873 characters, versioned and reviewable
→ JSON Schema validation enforced on every response — zero tolerance for malformed output
→ Mode-branched prompts: GUIDED (learning) vs EVALUATION (assessment)
→ Built-in guardrails against hallucination and format drift
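The schema enforcement described above amounts to rejecting any agent response that fails to parse or does not match the expected shape. Production code would use a real JSON Schema validator (e.g. Ajv); the hand-rolled check below is only a sketch, and the response fields are assumed, not the platform's actual contract:

```typescript
// Assumed shape of one agent response; not the real contract.
interface AgentResponse {
  criterionId: string;
  score: number;
  feedback: string;
}

// Zero tolerance for malformed output: throw on bad JSON or a shape
// mismatch, so no unvalidated response ever enters the scoring pipeline.
function parseAgentResponse(raw: string): AgentResponse {
  const data = JSON.parse(raw); // throws on malformed JSON
  if (typeof data.criterionId !== "string" ||
      typeof data.score !== "number" ||
      typeof data.feedback !== "string") {
    throw new Error("Response does not match the expected schema");
  }
  return data as AgentResponse;
}
```

Failing loudly at the boundary is the design choice that makes the rest of the pipeline trustworthy: downstream code (scoring, dashboards, professor review) never has to defend against format drift.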
Features

Key Screens

Nursing Record (ENR) · GUIDED MODE
Nursing Record
Students practice in a real hospital-grade EMR environment, recording the Initial Assessment, Vital Signs, I/O, and Nursing Notes in real time with step-by-step progress tracking.
Lab Results (Lab)
Lab
CBC and biochemistry results with trend charts. Critical values are highlighted, and nursing action reports are auto-generated.
Imaging Reads (PACS)
PACS
X-Ray, CT, MRI images with diagnostic reports. Findings + Impression sections simulate real clinical reading.
AI Evaluation Run · Vertex AI (Gemini 2.0 Flash)
AI Evaluation
The full AI prompt, patient context, and JSON response are logged transparently. Each agent's score and detailed feedback are visible in real time.
Evaluation Management Dashboard · Professor View
Professor Dashboard
Professors review AI scores with category-level feedback (correct / caution / needs improvement). Final scores are approved or adjusted with one click.
Student Status · Student Competency Profile
Student Dashboard
Each student's clinical competency visualized as a 9-dimension radar profile — from patient safety to documentation accuracy. AI-generated analysis identifies strengths and specific areas for improvement, with score trends tracked across sessions.
Action Log Viewer
Action Log Viewer
Every student action — navigation, data entry, clinical decision — recorded with action type, screen context, evaluation score, and timestamp. Enables granular analysis of clinical reasoning patterns and serves as the evidence base for AI evaluation.
AI Evaluation Monitoring · LLMOps Dashboard
LLMOps Dashboard
Full LLMOps visibility — every Agent call surfaces its Call ID, token consumption, and processing latency. Reviewers can inspect any call's complete trace or trigger selective re-execution of individual Agents without affecting the rest of the pipeline.
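A call-trace record with enough context to replay a single call on its own is what makes the selective re-execution described above possible. The record shape below is a hypothetical sketch; field names are assumptions, not the dashboard's actual data model:

```typescript
// Hypothetical per-call trace behind the LLMOps view. Keeping the full
// prompt and parameters means any call can be replayed in isolation.
interface LlmCallTrace {
  callId: string;
  agent: string;
  promptVersion: string;    // prompts are version-controlled
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
  prompt: string;           // full prompt, enabling exact re-execution
}

// Selecting one call for re-run touches nothing else in the pipeline:
// the trace already holds everything the original call consumed.
function selectRerun(traces: LlmCallTrace[], callId: string): LlmCallTrace {
  const t = traces.find((x) => x.callId === callId);
  if (!t) throw new Error(`Unknown call: ${callId}`);
  return t;
}
```

Because each trace is self-contained, re-running `VitalSignsAgent` for one student never requires re-running the other five agents, which is the isolation property the dashboard exposes.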
Results

Impact

90%
Evaluation Time Reduction
100%
Evaluation Consistency
40%
Professor Workload Reduction
40K+
Evaluation Logs Processed
Technology

Tech Stack

Frontend
React 18 · TypeScript · Vite · AG Grid · WebSocket
Backend
Spring Boot 3.x · Java 17 · JPA · MariaDB · RESTful API
AI Engine
Google Vertex AI · Gemini 2.0 Flash · Prompt Engineering · JSON Schema
Infrastructure
GCP Cloud Run · Cloud SQL · Cloud Storage