feat: Implement Intent Routing for Smart RAG Bypassing #65
Sudhanshu-NITR wants to merge 3 commits into sugarlabs:main
Conversation
Prior to this fix, the application skipped quota incrementation for brand-new API key requests. This change initializes the usage dictionary entry for a new key and lets the request fall through to the normal count-increment logic.
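The quota fix described above can be sketched as follows. This is a hypothetical reconstruction: the dictionary name, limit, and function shape are assumptions, since the actual code in app/routes/api.py is not shown here.

```python
# Hypothetical sketch of the check_quota() fix: instead of returning early
# for a brand-new API key, initialize its entry and fall through to the
# shared increment logic so the first request is counted too.

DAILY_LIMIT = 100  # assumed daily limit
usage = {}         # assumed store: api_key -> {"count": int}

def check_quota(api_key):
    if api_key not in usage:
        # Before the fix: an early return here skipped the increment below,
        # so the first request for a new key was never counted.
        usage[api_key] = {"count": 0}
    entry = usage[api_key]
    if entry["count"] >= DAILY_LIMIT:
        return False
    entry["count"] += 1
    return True
```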
Implement an intent router in prompts.py that intercepts queries and replies with exactly 'TECHNICAL' for coding questions. Update RAGAgent.run in ai.py to check this intent prompt first, allowing non-technical questions (such as greetings) to bypass the RAG search pipeline.
Pull request overview
Adds intent-based routing so Sugar-AI can respond conversationally to greetings/off-topic inputs while preserving the existing RAG pipeline for technical questions, and fixes daily quota counting for first-time API keys.
Changes:
- Fix `check_quota()` so the first request for a new API key is counted toward the daily quota.
- Introduce `INTENT_ROUTER_PROMPT`, used to classify inputs as technical vs. friendly/off-topic.
- Update `RAGAgent.run()` to perform an intent check via chat completions before running the RAG chain.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| app/routes/api.py | Fixes quota accounting by ensuring new API keys still go through the normal count increment logic. |
| app/prompts.py | Adds a new prompt used to route between “technical RAG” vs “friendly reply” behavior. |
| app/ai.py | Adds an intent-check step to run() using run_chat_completion() before executing the RAG pipeline. |
```python
# Check intent first using proper chat formatting
messages = [
    {"role": "system", "content": prompts.INTENT_ROUTER_PROMPT},
    {"role": "user", "content": question}
]
```
Intent routing calls run_chat_completion with the default generation settings (temperature=0.7, max_length=1024). For a classifier-style prompt that must output exactly "TECHNICAL" or a short reply, sampling + large max_length can make the output non-deterministic and unnecessarily slow/costly. Consider forcing deterministic settings (e.g., temperature=0 / do_sample=False) and a very small token/length budget for this intent check.
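The reviewer's suggestion could be applied roughly as follows. This is a hypothetical sketch: it assumes `run_chat_completion()` can accept generation-setting overrides as keyword arguments, which the real signature in app/ai.py may not support.

```python
# Hypothetical: deterministic, low-budget generation settings for the
# intent check. A classifier that must emit exactly "TECHNICAL" (or a
# short reply) should not use sampling or a large length budget.

INTENT_GEN_KWARGS = {
    "do_sample": False,   # greedy decoding -> deterministic output
    "temperature": 0.0,   # explicit, even though do_sample=False already disables sampling
    "max_new_tokens": 16, # the intent label is only a few tokens
}

def classify_intent(run_chat_completion, messages):
    """Run the intent router with deterministic, cheap settings (assumed API)."""
    raw = run_chat_completion(messages, **INTENT_GEN_KWARGS)
    return str(raw).strip().upper()
```

Normalizing with `.strip().upper()` also makes the `"TECHNICAL"` comparison robust to whitespace or casing drift in the model output.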
```python
intent_response = self.run_chat_completion(messages)
intent_text = str(intent_response).strip()
```
By making RAGAgent.run() always call run_chat_completion(), the /ask path now depends on the loaded model's tokenizer having a valid chat template. If the model is changed (e.g., via /change-model) to one without a chat template, apply_chat_template will raise and break all /ask requests. Consider adding a fallback path for intent routing when the tokenizer lacks a chat template (or skipping intent routing / using a plain prompt in that case) so the endpoint degrades gracefully.
```diff
-intent_response = self.run_chat_completion(messages)
-intent_text = str(intent_response).strip()
+try:
+    intent_response = self.run_chat_completion(messages)
+    intent_text = str(intent_response).strip()
+except Exception:
+    # Fallback: if chat-style intent routing fails (e.g., no chat template),
+    # treat the question as technical so we still run the RAG pipeline.
+    intent_text = "TECHNICAL_FALLBACK"
```
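Beyond catching the exception, the reviewer's point about degrading gracefully could also be addressed by checking for a chat template up front. The sketch below is hypothetical: the `chat_template` attribute follows the Hugging Face `transformers` tokenizer convention, and the plain-prompt fallback assumes `run_chat_completion` (or a sibling helper) can accept a raw string, which may not match the real app/ai.py API.

```python
# Hypothetical: route intent via chat messages when the tokenizer supports
# them, otherwise fall back to a plain concatenated prompt instead of letting
# apply_chat_template raise and break every /ask request.

def route_intent(tokenizer, run_chat_completion, system_prompt, question):
    if getattr(tokenizer, "chat_template", None):
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ]
        return str(run_chat_completion(messages)).strip()
    # No chat template: build a single plain-text prompt (assumed supported).
    plain_prompt = f"{system_prompt}\n\nUser: {question}"
    return str(run_chat_completion(plain_prompt)).strip()
```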
Description
This PR introduces an intent routing system designed to differentiate between technical coding questions and conversational/off-topic interactions (such as greetings).
Previously, every query was sent through the RAG pipeline and evaluated against the internal context. Now, we perform an intent check before triggering RAG, allowing Sugar-AI to respond appropriately to greetings in a kid-friendly manner without initiating unnecessary document retrieval.
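The overall flow described above can be sketched as below. The prompt wording and the `run_chat_completion`/`run_rag_chain` callables are stand-ins for the real implementations in app/prompts.py and app/ai.py.

```python
# Minimal sketch of the intent-routing flow (names are illustrative).

INTENT_ROUTER_PROMPT = (
    "If the user's message is a technical coding question, reply with exactly "
    "'TECHNICAL'. Otherwise, reply with a short, kid-friendly message."
)

def answer(question, run_chat_completion, run_rag_chain):
    messages = [
        {"role": "system", "content": INTENT_ROUTER_PROMPT},
        {"role": "user", "content": question},
    ]
    intent_text = str(run_chat_completion(messages)).strip()
    if intent_text == "TECHNICAL":
        # Technical question: retrieve documents and answer via the RAG chain.
        return run_rag_chain(question)
    # Greeting / off-topic: return the friendly reply directly, skipping RAG.
    return intent_text
```

Note that in the non-technical branch the router's own reply doubles as the final answer, so no second model call is needed.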
Why does this matter?
Screenshots / Testing