---
title: GOFA AI Overview
sidebar_label: Overview
description: Overview of GOFA AI intelligent services and the RAG Chatbot API grouping.
---
GOFA AI provides intelligent health assistant capabilities centered on Retrieval‑Augmented Generation (RAG). All responses are required to be grounded in retrieved content, avoiding hallucination.
The first publicly available endpoints are the RAG Chatbot API (`/api/rag-chatbot`) and its accompanying text-to-speech endpoint (`/api/rag-chatbot/tts`); this section introduces both.
## Quick Navigation
- RAG Chatbot API – Interactive retrieval-augmented health consultation
- Upcoming features (planned): Personalized risk summaries, adaptive rehabilitation suggestions, explainability metrics, etc.
## Core Design Principles
| Principle | Description |
|---|---|
| Grounded Responses | Answers must cite retrieved document content, no speculation |
| Language Adaptation | Chinese input is answered in colloquial Cantonese; English input is answered in concise, professional English |
| Safety Compliance | Every response includes a medical disclaimer; the service is not a substitute for professional advice |
| Scope Restriction | Non-health-related questions are politely redirected and constrained |
| Traceability | Responses end with source document titles listed as references |
| Streaming First | Defaults to SSE streaming output to improve perceived response speed |
## Endpoint Overview
| Category | Endpoint | Description |
|---|---|---|
| Chat Completion | `POST /api/rag-chatbot` | RAG streaming/single-shot responses (with source sentinel) |
| Companion TTS | `POST /api/rag-chatbot/tts` | Synthesizes speech from response text (documented separately) |
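
As a first orientation, here is a minimal TypeScript sketch of a single-shot (JSON mode) call to the chat endpoint. The request body shape (a `message` field) and the idea that the JSON response carries the answer and source list are assumptions for illustration; see the RAG Chatbot API document for the exact contract.

```typescript
// Minimal single-shot (JSON mode) call. The body field `message` and the
// shape of the JSON response are assumptions for illustration only.
async function askOnce(idToken: string, question: string): Promise<unknown> {
  const res = await fetch("/api/rag-chatbot?format=json", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Firebase ID Token, assumed to be passed as a standard Bearer token.
      Authorization: `Bearer ${idToken}`,
    },
    body: JSON.stringify({ message: question }),
  });
  if (!res.ok) throw new Error(`rag-chatbot request failed: ${res.status}`);
  return res.json(); // expected to contain the grounded answer and source list
}
```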
## Recommended Integration Flow (High-Level)
- The client first obtains a Firebase ID Token for the signed-in user.
- Send `POST /api/rag-chatbot`, preferably without `?format=json`, to experience streaming (the full flow is sketched after this list).
- The frontend concatenates `text-delta` chunks incrementally and waits for the final source sentinel, then separates the answer from the sources.
- If speech is needed, call the TTS endpoint with the final clean text (the sentinel segment removed).
- Store the answer, source list, model, and timestamp in conversation records (to avoid re-parsing raw text).
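
A rough TypeScript sketch of this flow is shown below. It assumes the stream arrives as SSE `data:` lines carrying `text-delta` text, that the Firebase ID Token is sent as a Bearer token, that the request body is `{ message }`, and that the TTS endpoint accepts `{ text }` and returns audio bytes; all of these details are illustrative assumptions, not the confirmed contract.

```typescript
// Sketch of the recommended flow: stream the answer, split off the source
// sentinel, then synthesize speech. Event framing, header names, and body
// shapes are assumptions; adapt them to the RAG Chatbot API reference.
const SENTINEL = "[__SOURCE_FILES__]";

async function chatWithSpeech(idToken: string, question: string) {
  // 1. Start the streaming request (no ?format=json => SSE streaming).
  const res = await fetch("/api/rag-chatbot", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${idToken}`, // Firebase ID Token (assumed header)
    },
    body: JSON.stringify({ message: question }), // body shape is an assumption
  });
  if (!res.ok || !res.body) throw new Error(`stream failed: ${res.status}`);

  // 2. Concatenate text-delta payloads as they arrive. The `data:` line
  //    handling below is deliberately simplified SSE parsing.
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let pending = "";
  let raw = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    pending += decoder.decode(value, { stream: true });
    const lines = pending.split("\n");
    pending = lines.pop() ?? ""; // keep a partial line for the next chunk
    for (const line of lines) {
      if (line.startsWith("data:")) raw += line.slice(5).trimStart();
      // In a real UI, render each delta incrementally here.
    }
  }

  // 3. Separate the answer text from the trailing source sentinel + JSON array.
  const idx = raw.lastIndexOf(SENTINEL);
  const answer = idx === -1 ? raw : raw.slice(0, idx).trimEnd();
  const sources: string[] =
    idx === -1 ? [] : JSON.parse(raw.slice(idx + SENTINEL.length));

  // 4. Synthesize speech from the clean answer text (body shape assumed).
  const tts = await fetch("/api/rag-chatbot/tts", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${idToken}`,
    },
    body: JSON.stringify({ text: answer }),
  });
  if (!tts.ok) throw new Error(`tts request failed: ${tts.status}`);
  const audio = await tts.arrayBuffer();

  // 5. Persist answer, sources, model, and timestamp in conversation records.
  return { answer, sources, audio };
}
```

Persisting `answer`, `sources`, the model name, and a timestamp (step 5) avoids having to re-parse the raw streamed text later.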
## Versions and Compatibility
| Component | Status | Notes |
|---|---|---|
| Model | `gemini-2.5-flash` | May change with GCP upgrades; check the release notes |
| Source Sentinel | Stable | Pattern: `\n[__SOURCE_FILES__]` followed by a JSON array |
| JSON Mode | Stable | Suitable for batch, export, or atomic result scenarios |
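
Because the sentinel pattern is stable, separating the final answer from its source list can be done with a small helper like the sketch below; the example response text and file name in the comment are purely illustrative.

```typescript
// Split a completed response into answer text and source titles, based on the
// documented pattern: "\n[__SOURCE_FILES__]" followed by a JSON array.
const SOURCE_SENTINEL = "\n[__SOURCE_FILES__]";

function splitSourceSentinel(raw: string): { answer: string; sources: string[] } {
  const idx = raw.lastIndexOf(SOURCE_SENTINEL);
  if (idx === -1) return { answer: raw, sources: [] }; // no sentinel found
  return {
    answer: raw.slice(0, idx).trimEnd(),
    sources: JSON.parse(raw.slice(idx + SOURCE_SENTINEL.length)) as string[],
  };
}

// Illustrative example (the file name is made up):
// splitSourceSentinel('Stay hydrated.\n[__SOURCE_FILES__]["hydration_guide.pdf"]')
//   => { answer: "Stay hydrated.", sources: ["hydration_guide.pdf"] }
```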
## Related Documents
For new features or adjustments, please include use cases and expected behavior when submitting an issue to facilitate evaluation.