AI-powered interview assistant: generates hints, performs code reviews, and conducts interactive AI interviews.
## Overview
The AI-plane is a standalone NestJS application context (`NestFactory.createApplicationContext()`) with no HTTP server. It pulls jobs from BullMQ queues, calls LLM providers, and pushes results back through paired result queues. The control-plane is the only component that talks to it, and only via Redis.

The AI-plane handles three job types: interview responses (conversation plus follow-ups), hint generation, and code review.

## Architecture
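The pairing between job queues and result queues follows a simple naming convention. A sketch of that convention (the helper name is illustrative, not from the codebase):

```typescript
// Each BullMQ job queue is paired with a result queue named by appending
// ".results". The helper name `resultQueueOf` is illustrative.
const JOB_QUEUES = [
  "ai.interview-response",
  "ai.generate-hint",
  "ai.review-code",
] as const;

function resultQueueOf(queue: string): string {
  return `${queue}.results`;
}

// e.g. "ai.interview-response.results", "ai.generate-hint.results", ...
const RESULT_QUEUES = JOB_QUEUES.map(resultQueueOf);
```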
## Queue Contracts
### AI Interview Response
Covers AI interviewer conversation and code-aware follow-up questions.

| Property | Value |
|---|---|
| Job queue | `ai.interview-response` |
| Result queue | `ai.interview-response.results` |
| Concurrency | 3 |
| Job kind tag | `ai:interview` |
#### HTTP request body
What the frontend sends to `POST /rooms/:roomId/ai/message`.

#### BullMQ job payload
What the control-plane enqueues after enrichment.

#### Result payload
`weaknessSignals` is internal to the control-plane's weakness-aggregation pipeline; it is not returned in HTTP polling responses.
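As a rough sketch of that result shape: only `difficulty`, `audioUrl`, and `weaknessSignals` are named elsewhere on this page, so the remaining fields below are hypothetical, as is the stripping helper.

```typescript
// Hypothetical sketch of the interview response result. Only
// `difficulty`, `audioUrl`, and `weaknessSignals` are named in this
// doc; the other fields are illustrative.
interface InterviewResponseResult {
  message: string;                        // AI interviewer's reply text
  difficulty: "easy" | "medium" | "hard"; // tier the AI is targeting
  audioUrl?: string;                      // presigned TTS audio URL, if TTS enabled
  weaknessSignals: string[];              // internal-only, e.g. "edge_cases"
}

// The control-plane strips internal fields before serving polling responses.
function toHttpResponse(result: InterviewResponseResult) {
  const { weaknessSignals, ...publicFields } = result;
  return publicFields;
}
```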
#### Sequence diagram
### Generate Hint
Covers layered hint generation for candidates.

| Property | Value |
|---|---|
| Job queue | `ai.generate-hint` |
| Result queue | `ai.generate-hint.results` |
| Concurrency | 5 |
| Job kind tag | `ai:hint` |
#### HTTP request body
What the frontend sends to `POST /rooms/:roomId/ai/hint`.

#### BullMQ job payload

What the control-plane enqueues after enrichment.

#### Result payload
#### Sequence diagram
### Code Review
Covers structured evaluation reports, improvement suggestions, and evidence-based scoring.

| Property | Value |
|---|---|
| Job queue | `ai.review-code` |
| Result queue | `ai.review-code.results` |
| Concurrency | 3 |
| Job kind tag | `ai:review` |
#### HTTP request body
What the frontend sends to `POST /rooms/:roomId/ai/review`.

#### BullMQ job payload

What the control-plane enqueues after enrichment.

#### Result payload
Typed categories, line-level suggestions, and evidence-based scoring. As with interview responses, `weaknessSignals` is internal only and stripped from HTTP polling responses.
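A sketch of what such a review result might look like. The category names mirror the review dimensions mentioned on this page (correctness, readability, edge cases); the field names and the scoring helper are assumptions.

```typescript
// Hypothetical sketch of a code-review result. Categories mirror the
// review dimensions this page mentions; field names are illustrative.
type ReviewCategory = "correctness" | "readability" | "edge_cases";

interface LineSuggestion {
  line: number;             // 1-based line in the submitted code
  category: ReviewCategory;
  comment: string;          // improvement suggestion with evidence
}

interface CodeReviewResult {
  suggestions: LineSuggestion[];
  scores: Record<ReviewCategory, number>; // evidence-based, e.g. 0-10
  weaknessSignals: string[];              // internal-only, stripped from HTTP
}

// Overall score as a simple average of category scores (illustrative).
function overallScore(r: CodeReviewResult): number {
  const values = Object.values(r.scores);
  return values.reduce((a, b) => a + b, 0) / values.length;
}
```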
#### Sequence diagram
## Conversation History
Served by the control-plane directly from PostgreSQL; no AI-plane involvement.

| Property | Value |
|---|---|
| Endpoint | `GET /rooms/:roomId/ai/messages` |
| Auth | Bearer (requires `code:view` room capability) |
| Pagination | Cursor-based (`cursor`, `limit` default 50) |
Individual messages come through the interview response queue. The control-plane persists them and serves history from the DB.
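Cursor-based pagination of this kind is typically implemented by encoding the last-seen row's sort key. A minimal in-memory sketch; the cursor format (the last message id) is an assumption, not the control-plane's actual encoding:

```typescript
// Minimal sketch of cursor pagination over an ordered message list.
// An array stands in for an indexed DB query; the cursor encoding
// (the last-seen message id as a string) is assumed.
interface Message { id: number; role: "user" | "ai"; text: string }

function pageMessages(
  all: Message[],
  cursor?: string,  // opaque cursor from the previous page
  limit = 50,       // default page size from the table above
): { items: Message[]; nextCursor?: string } {
  const after = cursor ? Number(cursor) : -Infinity;
  const items = all
    .filter((m) => m.id > after)
    .sort((a, b) => a.id - b.id)
    .slice(0, limit);
  const last = items[items.length - 1];
  // Only hand back a cursor when a full page was returned.
  const nextCursor = items.length === limit && last ? String(last.id) : undefined;
  return { items, nextCursor };
}
```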
## AI Capabilities by Room Mode
| Capability | AI-mode rooms | Peer-mode rooms |
|---|---|---|
| AI interview conversation | Yes | No |
| AI follow-up questions | Yes | No |
| Adaptive difficulty | Yes | No |
| TTS voice output | Yes | No |
| STT voice input | Yes | No |
| Hint generation | Yes | Yes |
| Code review | Yes | Yes (post-session) |
| Weakness signal emission | Yes | Yes (from reviews) |
In AI-mode, the AI runs the interview. In peer-mode, hints and reviews are supplementary tools for the human participants.

## Adaptive Difficulty
Question difficulty adjusts per-session based on candidate performance. Each interview response job analyzes conversation history and code quality to decide the next difficulty tier.

Difficulty signals (inputs):

| Signal | Source | Indicates |
|---|---|---|
| Code correctness | Test case pass rate from execution results | Solution quality |
| Response time | Timestamps between messages | Comfort with topic |
| Hint usage | Count of hint requests in session | Struggle level |
| Conversation depth | Number of follow-ups without resolution | Difficulty |
| Code complexity | Cyclomatic complexity, nesting depth | Solution sophistication |
Difficulty adjustments (outputs):

| Current performance | Next question difficulty | Follow-up type |
|---|---|---|
| Solving quickly, no hints | `hard`: deeper algorithmic questions | `question` |
| Moderate pace, few hints | `medium`: standard follow-ups | `question` or `evaluation` |
| Struggling, multiple hints | `easy`: simpler sub-problems | `hint` or `encouragement` |
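The adjustment rules above could be sketched as a decision function. The threshold values below are invented for illustration; the document only specifies the qualitative rules, and in practice these signals feed the LLM rather than hard-coded branches.

```typescript
// Illustrative mapping from performance signals to the next difficulty
// tier. Threshold values are invented; only the qualitative rules come
// from the adjustments table.
type Difficulty = "easy" | "medium" | "hard";

interface PerformanceSignals {
  hintsUsed: number;      // hint requests this session
  avgResponseSec: number; // average time between messages
  passRate: number;       // test-case pass rate, 0..1
}

function nextDifficulty(s: PerformanceSignals): Difficulty {
  // Struggling: multiple hints or low pass rate -> simpler sub-problems.
  if (s.hintsUsed >= 3 || s.passRate < 0.5) return "easy";
  // Solving quickly with no hints -> deeper algorithmic questions.
  if (s.hintsUsed === 0 && s.passRate >= 0.9 && s.avgResponseSec < 60) return "hard";
  // Moderate pace, few hints -> standard follow-ups.
  return "medium";
}
```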
The `difficulty` field in `InterviewResponseResult` tells the frontend which tier the AI is targeting, so the UI can show progress indicators.

## Voice: TTS and STT
In AI-mode rooms, the AI interviewer communicates by voice. This involves two directions: text-to-speech (TTS) for the AI's spoken output, and speech-to-text (STT) for transcribing the candidate's voice input.

### TTS (AI speaks)
After the LLM generates a text response, the AI-plane sends it to a TTS provider, uploads the resulting audio to SeaweedFS with a presigned URL (1-hour expiry), and returns that URL as `audioUrl` in the job result. The frontend fetches and plays it.

The TTS provider is abstracted behind an `ITtsProvider` interface so implementations can be swapped without changing job-processing logic.

| Var | Description |
|---|---|
| `TTS_PROVIDER` | `openai` / `google` / `azure` / `none` |
| `TTS_VOICE` | Voice ID (provider-specific) |
| `TTS_AUDIO_FORMAT` | `mp3` / `ogg` (default: `mp3`) |
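The `ITtsProvider` abstraction might look roughly like this. Only the interface name appears on this page; the method shape and the null implementation are assumptions.

```typescript
// Sketch of the TTS provider abstraction. Only the name `ITtsProvider`
// comes from this page; the method shape is assumed.
interface TtsRequest {
  text: string;
  voice: string;          // TTS_VOICE
  format: "mp3" | "ogg";  // TTS_AUDIO_FORMAT
}

interface ITtsProvider {
  synthesize(req: TtsRequest): Promise<Uint8Array>; // raw audio bytes
}

// With TTS_PROVIDER=none, a null implementation lets job processing
// skip audio generation entirely.
class NoneTtsProvider implements ITtsProvider {
  async synthesize(): Promise<Uint8Array> {
    return new Uint8Array(0); // empty -> caller omits audioUrl
  }
}
```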
When `TTS_PROVIDER` is `none` or unset, audio generation is skipped and `audioUrl` is omitted from the result.

### STT (Candidate speaks)
The candidate speaks into their microphone. The browser captures audio via the Web Audio API / MediaRecorder, encodes it (Opus in WebM or raw PCM), and sends chunks to the control-plane. The control-plane forwards the audio to the AI-plane for transcription, and the resulting text is injected into the conversation as a user message before triggering the next interview response job.

| Property | Value |
|---|---|
| Job queue | `ai.transcribe` |
| Result queue | `ai.transcribe.results` |
| Concurrency | 5 |
| Job kind tag | `ai:transcribe` |
Like TTS, the STT provider is abstracted behind an `ISttProvider` interface.

| Var | Description |
|---|---|
| `STT_PROVIDER` | `openai` / `google` / `azure` / `none` |
| `STT_LANGUAGE` | Default language hint (BCP-47 tag, e.g., `en`) |
| `STT_MAX_AUDIO_SIZE_MB` | Maximum upload size (default: 25 MB, matching most provider limits) |
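A sketch of the STT abstraction and the upload-size guard implied by `STT_MAX_AUDIO_SIZE_MB`. Only the interface name and the env var come from this page; the method shape and helper are illustrative.

```typescript
// Sketch of the STT abstraction plus the upload-size guard. Only
// `ISttProvider` and STT_MAX_AUDIO_SIZE_MB are named on this page.
interface ISttProvider {
  transcribe(audio: Uint8Array, languageHint?: string): Promise<string>;
}

const DEFAULT_MAX_AUDIO_MB = 25; // STT_MAX_AUDIO_SIZE_MB default

function withinSizeLimit(sizeBytes: number, maxMb = DEFAULT_MAX_AUDIO_MB): boolean {
  return sizeBytes <= maxMb * 1024 * 1024;
}
```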
When `STT_PROVIDER` is `none` or unset, the voice-input endpoint returns `400` and voice input is unavailable; users type instead.

## Weakness Tracking
Cross-session weakness aggregation.

### Data flow
### How it works
The AI-plane tags interview and review results with `weaknessSignals`: short string identifiers like `edge_cases`, `time_complexity`, `off_by_one`. The control-plane's result consumer persists these to PostgreSQL, tied to the user and session. `GET /users/me/ai/weaknesses` aggregates across sessions.

Future enhancement: the AI-plane could receive the user's historical weaknesses as part of the job data, letting it probe known weak areas.

### Weakness categories
| Category | Description | Detected from |
|---|---|---|
| `edge_cases` | Missing boundary/edge-case handling | Code review, follow-up questions |
| `time_complexity` | Suboptimal algorithmic complexity | Code review, interview discussion |
| `space_complexity` | Excessive memory usage | Code review |
| `variable_naming` | Poor variable/function naming | Code review (readability) |
| `code_structure` | Deeply nested or poorly organized code | Code review (readability) |
| `off_by_one` | Off-by-one errors in loops/indices | Code review (correctness) |
| `input_validation` | Missing null/empty/type checks | Code review (edge cases) |
| `communication` | Unclear explanation of approach | Interview conversation analysis |
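The cross-session aggregation behind `GET /users/me/ai/weaknesses` amounts to a count-and-rank over the persisted signals. An in-memory sketch; the real query is a `GROUP BY` in PostgreSQL:

```typescript
// In-memory sketch of the weakness aggregation; in production this is
// a GROUP BY over persisted rows in PostgreSQL.
interface WeaknessRow { sessionId: string; signal: string }

function aggregateWeaknesses(rows: WeaknessRow[]): { signal: string; count: number }[] {
  const counts = new Map<string, number>();
  for (const row of rows) {
    counts.set(row.signal, (counts.get(row.signal) ?? 0) + 1);
  }
  // Most frequent weaknesses first.
  return [...counts.entries()]
    .map(([signal, count]) => ({ signal, count }))
    .sort((a, b) => b.count - a.count);
}
```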
## Rate Limiting
Rate limits are enforced by the control-plane before jobs are enqueued.

| Scope | Limit | Window | Enforced at |
|---|---|---|---|
| AI hints | 3 requests | 5 min | Per user per room |
| AI messages | 20 requests | 1 min | Per user per room |
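Per-user, per-room limits like these are often enforced with a fixed-window counter. A minimal in-memory sketch; the control-plane's actual implementation (presumably Redis-backed) and key shape are assumptions.

```typescript
// In-memory fixed-window rate limiter sketch. The real enforcement is
// in the control-plane (likely Redis-backed); the key shape is assumed.
class FixedWindowLimiter {
  private hits = new Map<string, { windowStart: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(userId: string, roomId: string, now = Date.now()): boolean {
    const key = `${userId}:${roomId}`;
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count < this.limit) {
      entry.count++;
      return true;
    }
    return false; // caller responds 429 with Retry-After
  }
}

// Limits from the table above: 3 hints / 5 min, 20 messages / 1 min.
const hintLimiter = new FixedWindowLimiter(3, 5 * 60_000);
const messageLimiter = new FixedWindowLimiter(20, 60_000);
```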
Over-limit requests get `429 Too Many Requests` with a `Retry-After` header; the job never hits the queue. The 429 body includes a user-facing message so the frontend can surface quota exhaustion clearly.

## Result Caching
The control-plane caches job results in Redis after consuming them from result queues.

| Property | Value |
|---|---|
| Cache key format | `ai-result:{jobId}` |
| TTL | 24 hours (86,400 seconds) |
| Storage | Redis (`ICacheService`) |
| Written by | Control-plane result queue consumer |
| Read by | Control-plane GET endpoint |
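The read side of this cache can be sketched as cache-first with a live queue-status fallback. Here a `Map` stands in for Redis/`ICacheService`, and the fallback function stands in for the `IAiClient` status calls; both are illustrative.

```typescript
// Cache-first read sketch: check the ai-result:{jobId} key, then fall
// back to live queue state. A Map stands in for Redis; the callback
// stands in for IAiClient job-status queries.
type JobState = "queued" | "running";
interface AiResult { status: "completed" | "failed"; payload?: unknown }

function getJobResult(
  cache: Map<string, AiResult>,
  queueStatus: (jobId: string) => JobState,
  jobId: string,
): AiResult | { status: JobState } {
  const cached = cache.get(`ai-result:${jobId}`);
  if (cached) return cached;
  return { status: queueStatus(jobId) };
}
```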
Same Redis instance as execution results and other caches.

Polling flow: the frontend polls the per-type endpoint (`GET /rooms/:roomId/ai/message/:jobId`, `GET /rooms/:roomId/ai/hint/:jobId`, or `GET /rooms/:roomId/ai/review/:jobId`) until it gets `completed` or `failed`. The control-plane checks the cache first; if nothing is cached yet, it queries BullMQ job status via `IAiClient.getHintJobStatus()` / `getReviewJobStatus()` / `getInterviewJobStatus()` and returns the queue state (`queued` | `running`).

## Error Handling
### LLM provider failures
| Error type | AI-plane behavior | Control-plane behavior |
|---|---|---|
| LLM API timeout | Job fails; BullMQ retries (exponential backoff) | Returns `queued` or `running` status to frontend |
| LLM API rate limit | Job fails with retryable error | Same as timeout |
| LLM API auth error | Job fails permanently (no retry) | Returns `failed` status with error message |
| Invalid LLM response | Logged; job fails and retries | Returns `failed` if retries exhausted |
| TTS failure | Logged; result returned without `audioUrl` | Transparent; result has no audio |
| STT failure | Job fails permanently (no retry; audio may be corrupted) | Returns `failed` status with error message |
### BullMQ retry configuration
| Setting | Value |
|---|---|
| Max attempts | 3 |
| Backoff type | Exponential |
| Backoff delay | 5000 ms (5 s, 10 s, 20 s) |
| Stall detection | 30 s lock duration |
| Dead letter | Jobs moved to DLQ after max attempts |
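With exponential backoff from a 5000 ms base, the delay doubles on each attempt, which yields the 5 s / 10 s / 20 s sequence above. A quick sketch of the arithmetic (the helper is illustrative; BullMQ computes this internally from the `backoff` job options):

```typescript
// Exponential backoff delays for the settings above: 5000 ms base,
// doubling per attempt -> 5 s, 10 s, 20 s across 3 attempts.
function backoffDelayMs(attempt: number, baseMs = 5000): number {
  // attempt is 1-based: 1 -> 5000, 2 -> 10000, 3 -> 20000
  return baseMs * 2 ** (attempt - 1);
}
```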
### Circuit breaker (control-plane side)
`IAiClient` is wrapped with a circuit breaker proxy. When the AI-plane is unresponsive:

- `CircuitBreakerOpenError` -> `503 Service Unavailable`
- `CircuitBreakerTimeoutError` -> `504 Gateway Timeout`
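This mapping might live in a small helper or NestJS exception filter. A sketch of the status translation; the error class names come from this page, but the helper itself (and its 500 default) is hypothetical:

```typescript
// Sketch of circuit-breaker error -> HTTP status translation. The
// error class names appear on this page; the helper is hypothetical.
class CircuitBreakerOpenError extends Error {}
class CircuitBreakerTimeoutError extends Error {}

function toHttpStatus(err: unknown): number {
  if (err instanceof CircuitBreakerOpenError) return 503;    // Service Unavailable
  if (err instanceof CircuitBreakerTimeoutError) return 504; // Gateway Timeout
  return 500; // anything else: generic server error (assumed default)
}
```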
`healthCheck()` bypasses the circuit breaker; it always makes a real call so the control-plane health endpoint can report AI-plane status accurately.

## Observability
OTel for traces and metrics; `pino-opentelemetry-transport` for structured log shipping.

### Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| `ai.queue.depth` | Gauge | `queue` | Jobs waiting in each queue |
| `ai.queue.active` | Gauge | `queue` | Jobs currently being processed |
| `ai.job.duration_ms` | Histogram | `queue`, `status` | End-to-end job processing time |
| `ai.llm.latency_ms` | Histogram | `provider`, `model` | LLM API call latency |
| `ai.llm.tokens.input` | Counter | `provider`, `model` | Input tokens consumed |
| `ai.llm.tokens.output` | Counter | `provider`, `model` | Output tokens generated |
| `ai.llm.failures` | Counter | `provider`, `error_type` | LLM API failures by type |
| `ai.tts.latency_ms` | Histogram | `provider` | TTS generation latency |
| `ai.tts.failures` | Counter | `provider` | TTS generation failures |
| `ai.stt.latency_ms` | Histogram | `provider` | STT transcription latency |
| `ai.stt.failures` | Counter | `provider` | STT transcription failures |
| `ai.stt.audio_duration_ms` | Histogram | `provider` | Duration of audio submitted for transcription |
### Tracing

Each job creates a span with:

- `ai.job.type`: `hint` / `review` / `interview` / `transcribe`
- Child spans for LLM calls, TTS generation, and STT transcription
### Logging

Structured logs via `nestjs-pino` with:

- Log level: `debug` in development, `info` in production
- OTel log shipping when `OTEL_EXPORTER_OTLP_ENDPOINT` is configured
Modified at 2026-03-12 05:26:10