Speech Transcription

Overview

The Speech Transcription feature provides two powerful tools for converting speech to text:

📝

Speech Transcription

General-purpose real-time transcription with sensitive data detection. Perfect for notes, dictation, and quick recordings.

👥

Meeting Transcription

Specialised for meetings with automatic minutes generation, action items extraction, and participant tracking.

💡 Privacy Reminder: This feature demonstrates what's possible when an app has microphone access. It's a practical tool, but also a reminder to be mindful of which apps you grant this permission to. All processing happens on-device.


1. Speech Transcription

Services: SpeechTranscriptionService, SensitiveInfoDetector, TranscriptStorageService

Features

Real-time speech-to-text using Apple's Speech framework
On-device recognition for privacy (when available)
Support for 50+ languages with automatic detection
Continuous transcription that grows as you speak
Auto-save every 30 seconds to prevent data loss
Pause and resume without losing progress

Sensitive Data Detection

Automatically detects and flags potentially sensitive information:

Credit Card Numbers — 16-digit card patterns
National ID Numbers — Norwegian fødselsnummer (11 digits)
Phone Numbers — 8+ digit sequences
Email Addresses — Standard email patterns
Passwords — Text following "password:", "PIN:", etc.
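
The categories above can be approximated with regular expressions. The following is a minimal sketch, not the app's actual SensitiveInfoDetector: the type names, the exact patterns, and the rule that an earlier pattern claims a span before later ones are all assumptions made for illustration.

```swift
import Foundation

/// Categories mirroring the list above (names are hypothetical).
enum SensitiveKind: String {
    case creditCard, nationalID, email, password, phone
}

struct SensitiveMatch: Equatable {
    let kind: SensitiveKind
    let text: String
}

/// Hypothetical detector. Patterns are illustrative guesses, not the app's rules.
struct SimpleSensitiveDetector {
    // Order matters: specific digit patterns run before the generic phone pattern.
    let patterns: [(SensitiveKind, String)] = [
        (.creditCard, #"\b(?:\d{4}[ -]?){3}\d{4}\b"#),          // 16 digits, optional separators
        (.nationalID, #"\b\d{11}\b"#),                          // 11 digits (length check only)
        (.email, #"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}"#),
        (.password, #"\b(?:password|pin)\s*:\s*\S+"#),
        (.phone, #"\b\d{8,}\b"#),                               // 8 or more digits
    ]

    func scan(_ text: String) -> [SensitiveMatch] {
        var matches: [SensitiveMatch] = []
        var claimed: [NSRange] = []   // spans already reported by an earlier pattern
        let full = NSRange(text.startIndex..., in: text)

        for (kind, pattern) in patterns {
            guard let regex = try? NSRegularExpression(pattern: pattern,
                                                       options: [.caseInsensitive]) else { continue }
            for result in regex.matches(in: text, options: [], range: full) {
                let r = result.range
                // Skip anything overlapping a span that was already flagged.
                let overlaps = claimed.contains {
                    $0.location < r.location + r.length && r.location < $0.location + $0.length
                }
                guard !overlaps, let swiftRange = Range(r, in: text) else { continue }
                claimed.append(r)
                matches.append(SensitiveMatch(kind: kind, text: String(text[swiftRange])))
            }
        }
        return matches
    }
}
```

Pattern matching of this kind only checks shape. It does not validate a card number's checksum or the control digits of a fødselsnummer, so false positives are expected and the results are best treated as flags for review.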

Live Statistics

Word count and character count
Words per minute (speaking rate)
Session duration
Average confidence score
Sensitive segment count
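
As a rough illustration of how these figures relate, the sketch below derives them from the transcript text, the session duration, and a list of per-segment confidence values. The type and function names are hypothetical, and splitting on whitespace is an assumed definition of a word.

```swift
import Foundation

/// Illustrative snapshot of the statistics listed above (hypothetical type).
struct TranscriptStats: Equatable {
    let wordCount: Int
    let characterCount: Int
    let wordsPerMinute: Double
    let averageConfidence: Double   // 0.0 ... 1.0
}

func computeStats(text: String,
                  duration: TimeInterval,
                  confidences: [Double]) -> TranscriptStats {
    // Assumption: a "word" is any run of non-whitespace characters.
    let words = text.split(whereSeparator: { $0.isWhitespace })

    // Speaking rate is words divided by elapsed minutes; guard against a zero duration.
    let minutes = duration / 60.0
    let wpm = minutes > 0 ? Double(words.count) / minutes : 0

    // Plain mean of the per-segment confidence values, 0 when none exist yet.
    let average = confidences.isEmpty
        ? 0
        : confidences.reduce(0, +) / Double(confidences.count)

    return TranscriptStats(wordCount: words.count,
                           characterCount: text.count,
                           wordsPerMinute: wpm,
                           averageConfidence: average)
}
```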

Audio Controls

Microphone Gain Slider — Manual adjustment of input sensitivity (0.5x to 2.0x)
AGC (Automatic Gain Control) — Automatically adjusts gain for optimal levels
Delta SNR Indicator — Shows real-time change in signal-to-noise ratio
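
One way to picture the interaction between the manual slider and AGC is a single gain value that is either set directly or nudged toward a target level. This is a sketch under stated assumptions: only the 0.5x to 2.0x range comes from the description above, while the target RMS level, the smoothing factor, and the idea that AGC shares the slider's range are invented for illustration.

```swift
import Foundation

/// Hypothetical gain stage, not the app's implementation.
struct GainStage {
    static let range: ClosedRange<Float> = 0.5...2.0   // from the slider description
    private(set) var gain: Float = 1.0
    var targetRMS: Float = 0.1    // assumed target level
    var smoothing: Float = 0.1    // assumed step toward the desired gain per buffer

    /// Manual slider: clamp the requested value into the supported range.
    mutating func setManualGain(_ value: Float) {
        gain = min(max(value, Self.range.lowerBound), Self.range.upperBound)
    }

    /// One AGC step: move the gain part of the way toward the value that
    /// would bring this buffer's RMS level to the target.
    mutating func adaptGain(to samples: [Float]) {
        guard !samples.isEmpty else { return }
        let meanSquare = samples.reduce(0) { $0 + $1 * $1 } / Float(samples.count)
        let rms = meanSquare.squareRoot()
        guard rms > 1e-6 else { return }   // leave the gain alone during silence
        let desired = targetRMS / rms
        setManualGain(gain + smoothing * (desired - gain))
    }

    /// Apply the current gain and hard-limit to the [-1, 1] sample range.
    func apply(to samples: [Float]) -> [Float] {
        samples.map { min(max($0 * gain, -1), 1) }
    }
}
```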


2. Meeting Transcription

Services: MeetingMinutesService, TranscriptExportService

Meeting Setup

Meeting title and location
Participant list management
Recording mode selection
Document sensitivity level (Personal, Public, Internal, Confidential)
Keep screen on during recording

Recording Modes

Mode	Best For	Description
Standard	General use	Default transcription settings
Meeting	Group discussions	Optimised for multiple speakers
Interview	Two-person conversations	Balanced for dialogue
Lecture	Presentations	Single speaker, long duration

Automatic Meeting Minutes

The app analyses your transcript and generates structured meeting minutes:

Summary — Automatic overview of the meeting content
Key Points — Important topics identified by keywords
Decisions — Statements containing "decided", "agreed", "approved"
Action Items — Tasks with assignee and priority detection

Action Item Detection

Automatically extracts action items based on keywords like:

"action item", "todo", "task", "will do", "need to"
"should", "must", "have to", "assigned to"
"follow up", "deadline", "responsible for"

Priority is determined by urgency keywords (urgent, critical, ASAP = High; when possible, eventually = Low).
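
The keyword rules for decisions, action items, and priority can be sketched as simple substring checks per sentence. The keyword lists below are copied from the text above; everything else (the type names, splitting sentences on ".", "!" and "?", and Medium as the fallback priority) is an assumption.

```swift
import Foundation

enum Priority: String { case high, medium, low }

struct ActionItem: Equatable {
    let text: String
    let priority: Priority
}

/// Hypothetical scanner, not the app's MeetingMinutesService.
struct MinutesKeywordScanner {
    let actionKeywords = ["action item", "todo", "task", "will do", "need to",
                          "should", "must", "have to", "assigned to",
                          "follow up", "deadline", "responsible for"]
    let decisionKeywords = ["decided", "agreed", "approved"]
    let highKeywords = ["urgent", "critical", "asap"]
    let lowKeywords = ["when possible", "eventually"]

    private func containsAny(_ sentence: String, _ keywords: [String]) -> Bool {
        let lower = sentence.lowercased()
        return keywords.contains { lower.contains($0) }
    }

    /// Assumed sentence splitting: break on ".", "!" and "?".
    func sentences(in transcript: String) -> [String] {
        transcript
            .components(separatedBy: CharacterSet(charactersIn: ".!?"))
            .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
            .filter { !$0.isEmpty }
    }

    func priority(of sentence: String) -> Priority {
        if containsAny(sentence, highKeywords) { return .high }
        if containsAny(sentence, lowKeywords) { return .low }
        return .medium   // assumed default; the text only names High and Low
    }

    func actionItems(in transcript: String) -> [ActionItem] {
        sentences(in: transcript)
            .filter { containsAny($0, actionKeywords) }
            .map { ActionItem(text: $0, priority: priority(of: $0)) }
    }

    func decisions(in transcript: String) -> [String] {
        sentences(in: transcript).filter { containsAny($0, decisionKeywords) }
    }
}
```

Plain substring matching is deliberately naive: "must" also matches "mustard", for example. A real implementation would likely match on word boundaries, and assignee detection is left out of this sketch entirely.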


3. Advanced Audio Processing

Services: AdaptiveLMSFilter, AudioEngineHandler

Two audio enhancement modes that can be used simultaneously for maximum quality:

🔇

Denoise (LMS Filter)

Adaptive Least Mean Squares filter that learns and removes background noise in real-time. Uses vDSP-accelerated processing for minimal CPU impact.

64-tap FIR filter (~2.9ms at 22kHz)
Conservative learning rate (β = 0.003) to preserve speech
Leakage factor and coefficient clamping for stability
Adapts to your environment using white noise reference
Removes constant noise (fans, AC, traffic)
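
To make the bullet points concrete, here is a minimal sketch of a leaky LMS noise canceller in plain Swift. It is not the app's AdaptiveLMSFilter: the tap count and β come from the list above, but the leakage value, the clamp limit, and the per-sample API are assumptions, and plain loops stand in for the vDSP-accelerated processing the description mentions.

```swift
/// Hypothetical leaky LMS noise canceller (illustrative only).
struct LeakyLMSFilter {
    let tapCount: Int
    let beta: Float         // learning rate, 0.003 per the description
    let leakage: Float      // assumed value
    let clampLimit: Float   // assumed value
    private(set) var weights: [Float]
    private var history: [Float]   // recent reference samples, newest first

    init(tapCount: Int = 64, beta: Float = 0.003,
         leakage: Float = 0.00001, clampLimit: Float = 2.0) {
        self.tapCount = tapCount
        self.beta = beta
        self.leakage = leakage
        self.clampLimit = clampLimit
        weights = [Float](repeating: 0, count: tapCount)
        history = [Float](repeating: 0, count: tapCount)
    }

    /// Processes one sample. `primary` is the microphone input and `reference`
    /// is the noise reference. Returns the error signal, which is the
    /// noise-reduced output.
    mutating func process(primary: Float, reference: Float) -> Float {
        // Shift the history and place the newest reference sample at index 0.
        history.removeLast()
        history.insert(reference, at: 0)

        // FIR output: the filter's current estimate of the noise in `primary`.
        var estimate: Float = 0
        for i in 0..<tapCount { estimate += weights[i] * history[i] }

        let error = primary - estimate

        // Leaky LMS update: shrink each weight slightly, step along the
        // error gradient, then clamp to keep the coefficients bounded.
        for i in 0..<tapCount {
            let updated = (1 - leakage) * weights[i] + beta * error * history[i]
            weights[i] = min(max(updated, -clampLimit), clampLimit)
        }
        return error
    }
}
```

The small learning rate means the weights move slowly, which is what lets the filter track steady noise without chasing the rapidly changing speech signal. The leakage term and the clamp both limit how far the coefficients can drift if the reference and the input stop being correlated.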

🎤

Clarity (Voice Enhancement)

6-band parametric EQ optimised for both male and female voice frequencies.

High-pass at 80Hz (removes sub-bass rumble)
Boost at 220Hz (voice fundamentals, male + female)
Boost at 700Hz (first formant, male voices)
Boost at 1kHz (first formant, female voices)
Boost at 2.8kHz (articulation and consonants)
Low-pass at 8kHz (removes high-frequency hiss)
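
The six bands can be written down as data and, on Apple platforms, applied to an AVAudioUnitEQ. This is a sketch under assumptions: the frequencies and filter roles come from the list above, but the source gives no gain or bandwidth values, so the +3 dB boost and 1-octave bandwidth below are placeholders, and the type and function names are hypothetical.

```swift
import Foundation
#if canImport(AVFoundation)
import AVFoundation
#endif

enum BandKind { case highPass, peak, lowPass }

struct VoiceBand: Equatable {
    let kind: BandKind
    let frequency: Float   // Hz
    let gainDB: Float      // only meaningful for .peak bands
}

enum ClarityPreset {
    static let assumedBoostDB: Float = 3.0   // placeholder; not stated in the source

    static let bands: [VoiceBand] = [
        VoiceBand(kind: .highPass, frequency: 80, gainDB: 0),
        VoiceBand(kind: .peak, frequency: 220, gainDB: assumedBoostDB),
        VoiceBand(kind: .peak, frequency: 700, gainDB: assumedBoostDB),
        VoiceBand(kind: .peak, frequency: 1_000, gainDB: assumedBoostDB),
        VoiceBand(kind: .peak, frequency: 2_800, gainDB: assumedBoostDB),
        VoiceBand(kind: .lowPass, frequency: 8_000, gainDB: 0),
    ]
}

#if canImport(AVFoundation)
/// Builds an equaliser node configured from the preset above.
func makeClarityEQ() -> AVAudioUnitEQ {
    let eq = AVAudioUnitEQ(numberOfBands: ClarityPreset.bands.count)
    for (band, spec) in zip(eq.bands, ClarityPreset.bands) {
        switch spec.kind {
        case .highPass:
            band.filterType = .highPass
        case .lowPass:
            band.filterType = .lowPass
        case .peak:
            band.filterType = .parametric
            band.gain = spec.gainDB
            band.bandwidth = 1.0   // octaves; assumed value
        }
        band.frequency = spec.frequency
        band.bypass = false
    }
    return eq
}
#endif
```

Keeping the band definitions as plain data separates the tuning values from the audio graph wiring, which makes them easy to inspect or adjust without touching AVAudioEngine code.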

Combined Mode: Denoise, Clarity, and AGC can all be active at the same time. AGC is applied at the input stage; the signal then passes through the EQ (voice shaping) and on to the LMS filter (noise removal). This chain gives the best results in noisy environments, and each stage can be toggled on or off during recording without interrupting it.

Signal-to-Noise Ratio (SNR) Indicator

Live SNR display shows audio quality during recording:

SNR Level	Quality	Colour
< 10 dB	Poor (noisy environment)	Red
10-20 dB	Acceptable	Orange
20-30 dB	Good	Yellow
> 30 dB	Excellent	Green
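
The table maps directly onto a small classification function. In this sketch the names are hypothetical, SNR is computed with the standard 10·log10(signal power / noise power) definition, and the handling of the exact boundary values is an assumption, since the table lists 20 dB in two rows: here 10 dB counts as Acceptable, while 20 dB and 30 dB both count as Good.

```swift
import Foundation

enum SNRQuality: String { case poor, acceptable, good, excellent }

/// SNR in decibels from average signal and noise power.
/// Returns nil when either power is not positive.
func snrDecibels(signalPower: Double, noisePower: Double) -> Double? {
    guard signalPower > 0, noisePower > 0 else { return nil }
    return 10 * log10(signalPower / noisePower)
}

/// Maps an SNR value to the quality bands in the table above.
func quality(forSNR db: Double) -> SNRQuality {
    switch db {
    case ..<10: return .poor          // red
    case ..<20: return .acceptable    // orange
    case ...30: return .good          // yellow
    default: return .excellent        // green
    }
}
```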

📎 See also: FAQ — Speech Transcription — detailed explanation of how the Denoise, Clarity, and AGC audio filters work.


4. Export & Sharing

Service: TranscriptExportService

Export Formats

Format	Extension	Best For
Plain Text	.txt	Simple sharing, compatibility
Markdown	.md	Documentation, formatted notes
PDF	.pdf	Professional documents, email attachments
JSON	.json	Data processing, backups

Sharing Options

Share via iOS share sheet (AirDrop, Messages, Mail, etc.)
Copy to clipboard with one tap
Export meeting minutes separately from transcript
PDF generation with professional formatting

Meeting Minutes Export

Meeting minutes can be exported with full formatting:

Header with meeting details (title, date, duration, participants)
Summary section
Key points as bullet list
Decisions with checkmarks
Action items table with priority, assignee, and status
Full transcript appendix
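
As an illustration of the layout described above, the sketch below renders minutes to Markdown, one of the listed export formats. The data types, field names, and the "- [x]" style for decision checkmarks are assumptions; the section order follows the list.

```swift
import Foundation

/// Hypothetical data model for exported minutes.
struct MinutesActionItem {
    let task: String
    let assignee: String
    let priority: String
    let status: String
}

struct MeetingMinutes {
    let title: String
    let date: String
    let duration: String
    let participants: [String]
    let summary: String
    let keyPoints: [String]
    let decisions: [String]
    let actionItems: [MinutesActionItem]
    let transcript: String
}

func renderMarkdown(_ minutes: MeetingMinutes) -> String {
    var lines: [String] = []

    // Header with meeting details.
    lines.append("# \(minutes.title)")
    lines.append("")
    lines.append("- Date: \(minutes.date)")
    lines.append("- Duration: \(minutes.duration)")
    lines.append("- Participants: \(minutes.participants.joined(separator: ", "))")

    lines += ["", "## Summary", "", minutes.summary]

    // Key points as a bullet list.
    lines += ["", "## Key Points", ""]
    lines += minutes.keyPoints.map { "- \($0)" }

    // Decisions rendered as checked items.
    lines += ["", "## Decisions", ""]
    lines += minutes.decisions.map { "- [x] \($0)" }

    // Action items as a table.
    lines += ["", "## Action Items", ""]
    lines.append("| Task | Assignee | Priority | Status |")
    lines.append("| --- | --- | --- | --- |")
    lines += minutes.actionItems.map {
        "| \($0.task) | \($0.assignee) | \($0.priority) | \($0.status) |"
    }

    // Full transcript as an appendix.
    lines += ["", "## Transcript", "", minutes.transcript]
    return lines.joined(separator: "\n")
}
```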

Corporate PDF Export

Professional PDF design for business use:

Dark blue header stripe with meeting title
Teal accent colours for section headers
Sensitivity watermark with logo (when set)
Clean white background
Proper typography and spacing


5. Search & Analysis

Search Functionality

Full-text search across entire transcript
Case-insensitive matching
Highlighted search results
Match count display
Search works on all segments (growing document)
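
Case-insensitive search with a match count can be sketched with Foundation's string search. The function names and the bracket-style highlighting below are hypothetical stand-ins; a real UI would apply text attributes to the matched ranges instead of inserting markers.

```swift
import Foundation

/// Returns the ranges of every non-overlapping, case-insensitive match.
func findMatches(of query: String, in transcript: String) -> [Range<String.Index>] {
    guard !query.isEmpty else { return [] }
    var results: [Range<String.Index>] = []
    var searchStart = transcript.startIndex

    while searchStart < transcript.endIndex,
          let found = transcript.range(of: query,
                                       options: .caseInsensitive,
                                       range: searchStart..<transcript.endIndex) {
        results.append(found)
        searchStart = found.upperBound   // continue after this match
    }
    return results
}

/// Wraps each match in markers so the hits are visible in plain text.
func highlighted(_ transcript: String, query: String,
                 open: String = "[", close: String = "]") -> String {
    var output = ""
    var cursor = transcript.startIndex
    for match in findMatches(of: query, in: transcript) {
        output.append(contentsOf: transcript[cursor..<match.lowerBound])
        output.append(open)
        output.append(contentsOf: transcript[match])
        output.append(close)
        cursor = match.upperBound
    }
    output.append(contentsOf: transcript[cursor...])
    return output
}
```

The match count shown to the user is then simply the number of ranges returned by `findMatches`. Because the transcript only grows, the search can be rerun over the joined segments whenever new text arrives.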

Statistics Dashboard

Duration — Total recording time
Word Count — Total words transcribed
Character Count — Total characters
Words per Minute — Speaking rate
Average Confidence — Recognition accuracy (0-100%)
Sensitive Segments — Count of flagged content
Language — Selected transcription language

Document Behaviour

Growing Document: The transcript is a continuous document that grows over time. Text never disappears — it only accumulates. When recognition restarts (due to Apple's 1-minute limit), all previous text is preserved and new text is appended.


6. Technical Implementation

Core Technologies

AVAudioEngine — Real-time audio capture and processing
SFSpeechRecognizer — Apple's on-device speech recognition
AVAudioUnitEQ — 6-band parametric equaliser
Custom 64-tap LMS adaptive filter with vDSP acceleration (Accelerate framework)

Swift 6 Concurrency

@MainActor for UI-related service classes
@unchecked Sendable with NSLock for thread-safe audio processing
nonisolated methods for cross-actor operations
Async/await for permission requests and recognition tasks
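
The `@unchecked Sendable` with `NSLock` pattern can be illustrated with a small shared-state box. This is a sketch, not the app's code: the class name and its fields are hypothetical, chosen to resemble values a UI thread and an audio callback might both touch.

```swift
import Foundation

/// Hypothetical settings box shared between the UI and an audio callback.
/// The compiler cannot verify thread safety here, so the class is marked
/// @unchecked Sendable and every access to mutable state goes through the lock.
final class AudioSettingsBox: @unchecked Sendable {
    private let lock = NSLock()
    private var storedGain: Float = 1.0
    private var storedBufferCount = 0

    var gain: Float {
        get {
            lock.lock(); defer { lock.unlock() }
            return storedGain
        }
        set {
            lock.lock(); defer { lock.unlock() }
            storedGain = newValue
        }
    }

    var processedBuffers: Int {
        lock.lock(); defer { lock.unlock() }
        return storedBufferCount
    }

    /// Called once per processed audio buffer, from any thread.
    func noteBufferProcessed() {
        lock.lock(); defer { lock.unlock() }
        storedBufferCount += 1
    }
}
```

`@unchecked` shifts the burden of proof from the compiler to the author: the annotation is only sound as long as no mutable property is ever read or written outside the lock. Whether a lock is acceptable on a real-time audio thread is a separate design question that this sketch does not address.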

Recognition Handling

Automatic restart when Apple's recognition times out (~1 minute)
Sequence ID system — each recognition task gets a unique ID; callbacks from old tasks are ignored deterministically, eliminating race conditions
Overlapping recognition — new task starts before old one fully stops, ensuring no audio buffers are lost during transitions
Pending text buffer to prevent loss during restarts
Partial results shown in real-time, converted to final on completion
Duplicate detection using Jaccard similarity (85% threshold) and containment checks
Improved sentence boundary detection — uses Apple's segment timestamps and count changes instead of simple word count heuristics
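
Two of the mechanisms above, the sequence ID gate and duplicate detection, are sketched below. The 85% threshold comes from the list; the names, the word-level tokenisation, and treating an empty candidate as a duplicate are assumptions for illustration.

```swift
import Foundation

/// Lowercased set of words, split on anything that is not a letter or digit.
func wordSet(_ text: String) -> Set<String> {
    Set(text.lowercased()
        .split(whereSeparator: { !$0.isLetter && !$0.isNumber })
        .map(String.init))
}

/// Jaccard similarity: size of the intersection divided by size of the union.
func jaccardSimilarity(_ a: String, _ b: String) -> Double {
    let setA = wordSet(a)
    let setB = wordSet(b)
    let union = setA.union(setB)
    guard !union.isEmpty else { return 1.0 }   // two empty texts count as identical
    return Double(setA.intersection(setB).count) / Double(union.count)
}

/// A candidate is a duplicate if it is contained in the existing text
/// or if the word sets are at least `threshold` similar.
func isDuplicate(_ candidate: String, of existing: String,
                 threshold: Double = 0.85) -> Bool {
    let c = candidate.lowercased().trimmingCharacters(in: .whitespacesAndNewlines)
    let e = existing.lowercased()
    guard !c.isEmpty else { return true }      // nothing new to append
    if e.contains(c) { return true }           // containment check
    return jaccardSimilarity(c, e) >= threshold
}

/// Hypothetical sequence gate: only callbacks from the newest task are accepted.
struct RecognitionSequence {
    private(set) var current = 0

    /// Call when a new recognition task starts; returns that task's ID.
    mutating func startNewTask() -> Int {
        current += 1
        return current
    }

    /// Callbacks carrying a stale ID are dropped.
    func shouldAccept(callbackFrom id: Int) -> Bool {
        id == current
    }
}
```

With overlapping tasks, the old task may still deliver results after the new one has started. The gate discards those by ID, and the duplicate check covers the remaining case where the new task re-recognises audio that the old task had already transcribed.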
