Designing a Hierarchical Summary System for a Lifelog Audio Workflow
Designing and implementing a hierarchical summary system with daily, weekly, monthly, quarterly, and yearly summaries. Defining summary generation rules and format specifications.
TL;DR
Designed a hierarchical summary system to make lifelog transcripts human-readable and easy to review:
- Multi-scale summaries: Daily/weekly/monthly/quarterly/yearly hierarchy
- Generation rules: Date boundaries, week start, aggregation logic
- Format specification: Documented structure for AI agents and humans
- Documentation: Reorganized project docs to clarify tasks
Goal
The main purpose of this summary system is to make lifelog transcripts human-readable and easy to review. Raw JSONL transcripts are machine-readable but hard to scan. Summaries are generated by AI and organized at multiple time scales (daily → weekly → monthly → quarterly → yearly), allowing me to quickly review what happened and identify patterns over time.
What I built today
1. Hierarchical summary directory structure
Created a multi-level summary system that mirrors the natural time hierarchy:
summaries/
  daily/
    YYYY/
      YYYY-MM-DD.md
  weekly/
    YYYY/
      YYYY-Www.md
  monthly/
    YYYY/
      YYYY-MM.md
  quarterly/
    YYYY/
      YYYY-Qq.md
  yearly/
    YYYY.md
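As a rough illustration of how this layout maps a date to file names, here is a minimal Python sketch. The function name `summary_paths` and the `ROOT` constant are hypothetical, and the sketch assumes the weekly folder uses the ISO week-numbering year (which can differ from the calendar year around New Year), something the layout itself does not spell out.

```python
from datetime import date
from pathlib import PurePosixPath

# Hypothetical root folder, matching the tree above.
ROOT = PurePosixPath("summaries")


def summary_paths(d: date) -> dict[str, PurePosixPath]:
    """Return the summary file path at each time scale for date `d`."""
    # ISO year/week: weeks start on Monday; the ISO year may differ
    # from the calendar year in late December / early January.
    iso_year, iso_week, _ = d.isocalendar()
    quarter = (d.month - 1) // 3 + 1  # 1..4
    year = f"{d.year:04d}"
    return {
        "daily": ROOT / "daily" / year / f"{d.isoformat()}.md",
        "weekly": ROOT / "weekly" / f"{iso_year:04d}" / f"{iso_year:04d}-W{iso_week:02d}.md",
        "monthly": ROOT / "monthly" / year / f"{year}-{d.month:02d}.md",
        "quarterly": ROOT / "quarterly" / year / f"{year}-Q{quarter}.md",
        "yearly": ROOT / "yearly" / f"{year}.md",
    }


paths = summary_paths(date(2025, 12, 13))
for scale, path in paths.items():
    print(f"{scale:10s} {path}")
```

Using `PurePosixPath` keeps the output identical on every operating system; a real script would likely switch to `Path` to touch the file system.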
Key decisions on date, week, and month boundaries:
- Date boundary: 04:00 cutoff (aligns with natural sleep cycles, so activity shortly after midnight stays with the previous day)
- Week start: Monday (ISO week number compliant, standard calendar alignment)
- Month aggregation: Week-based with daily fallback for cross-boundary weeks (balances detail with efficiency)
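The date and week rules above can be sketched in a few lines of Python. This is only an illustration: `logical_date`, `week_start`, and `DAY_CUTOFF_HOUR` are hypothetical names, and the sketch assumes timestamps are naive datetimes already expressed in local time.

```python
from datetime import date, datetime, timedelta

# Assumption: the cutoff is applied to local wall-clock time.
DAY_CUTOFF_HOUR = 4


def logical_date(ts: datetime) -> date:
    """Map a timestamp to its lifelog day using the 04:00 cutoff.

    Anything recorded before 04:00 is attributed to the previous day.
    """
    return (ts - timedelta(hours=DAY_CUTOFF_HOUR)).date()


def week_start(d: date) -> date:
    """Return the Monday that starts the week containing `d`."""
    return d - timedelta(days=d.weekday())  # Monday has weekday() == 0


late_night = datetime(2025, 12, 14, 2, 30)  # 02:30 on Sunday
print(logical_date(late_night))              # still counted as Saturday
print(week_start(logical_date(late_night)))  # Monday of that week
```

Shifting the timestamp back by four hours and then taking the calendar date keeps the rule in one place, so the weekly and monthly levels can work purely with dates.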
2. Summary format specification
Created docs/summary-format.md as a formal specification document that serves both:
- AI agents: Clear structure for AI-assisted summary generation (manual workflow)
- Humans: Readable format for manual review and editing
Key format features:
Front matter (YAML) for metadata:
- Tags, importance, categories, priority in table format
- TODO items as structured data
- Conversation highlights in structured format
Markdown body for narrative content:
- Natural language summaries
- Key insights and patterns
- Action items and follow-ups
This dual-purpose design allows the same file to be:
- Parsed programmatically by AI agents
- Read naturally by humans
- Version controlled and diff-friendly
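To make the dual-purpose idea concrete, here is a small Python sketch that separates the front matter from the Markdown body. The sample file and its keys (`date`, `tags`, `importance`, `todos`) are invented for illustration and are not the actual schema from docs/summary-format.md; `split_front_matter` is likewise a hypothetical helper.

```python
# Hypothetical sample summary; keys are illustrative only.
EXAMPLE = """---
date: 2025-12-13
tags: [design, summaries]
importance: high
todos:
  - text: Design summary hosting
    done: false
---
## Summary

Designed the hierarchical summary rules.
"""


def split_front_matter(text: str) -> tuple[str, str]:
    """Split a document into (front_matter, body).

    Front matter is the block between a leading '---' line and the
    next '---' line. Returns ("", text) when no such block exists.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            front = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return front, body
    return "", text  # opening delimiter without a closing one


front, body = split_front_matter(EXAMPLE)
print(front)
print("-----")
print(body)
```

The returned front matter is still plain text; it could then be handed to a YAML parser (for example `yaml.safe_load` from PyYAML) to get structured data, while the body stays readable as ordinary Markdown.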
3. AGENTS.md reorganization
Restructured AGENTS.md to clearly separate concerns:
Before: Mixed summary generation tasks with project editing tasks
After:
- Summary generation tasks: Clearly scoped for LLM-based summarization
- Project editing tasks: Separate section for code/content editing
- Format details: Moved detailed format specification to docs/summary-format.md
This makes it easier for AI agents to understand their role and scope.
Design decisions
Why 04:00 date boundary?
- Natural sleep cycle alignment
- Reduces edge cases around midnight transitions
- Matches common “day” perception (wake up = new day)
Why ISO week (Monday start)?
- A Monday start keeps Saturday and Sunday in the same weekly summary. Weekends often have more events, so this makes it easier to review weekend activities together
- ISO week numbers give each week a standard, unambiguous label
Why week-based monthly summaries?
- Token efficiency: Building a month from four or five weekly summaries instead of roughly 30 daily ones reduces LLM token consumption while still capturing the key content
- Fallback: Daily summaries handle month boundary weeks
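One possible reading of the week-based aggregation with daily fallback is sketched below in Python. The function `monthly_sources` is hypothetical; it assumes that a week lying entirely inside the month contributes its weekly summary, and that a week crossing a month boundary contributes only the daily summaries of the days that fall in the month.

```python
from datetime import date, timedelta


def monthly_sources(year: int, month: int) -> tuple[list[str], list[str]]:
    """Return (weekly_keys, daily_keys) used to build a monthly summary."""
    first = date(year, month, 1)
    if month == 12:
        next_first = date(year + 1, 1, 1)
    else:
        next_first = date(year, month + 1, 1)
    last = next_first - timedelta(days=1)

    weeks: list[str] = []
    days: list[str] = []
    # Start from the Monday of the week containing the 1st.
    monday = first - timedelta(days=first.weekday())
    while monday <= last:
        sunday = monday + timedelta(days=6)
        if monday >= first and sunday <= last:
            # Whole week is inside the month: use the weekly summary.
            iso_year, iso_week, _ = monday.isocalendar()
            weeks.append(f"{iso_year}-W{iso_week:02d}")
        else:
            # Week crosses a month boundary: fall back to daily summaries.
            d = max(monday, first)
            while d <= min(sunday, last):
                days.append(d.isoformat())
                d += timedelta(days=1)
        monday += timedelta(days=7)
    return weeks, days


weeks, days = monthly_sources(2025, 12)
print(weeks)  # full weeks inside December 2025
print(days)   # leftover days from the week shared with January
```

For December 2025 this yields four weekly summaries plus three daily ones, so the monthly prompt reads seven files instead of 31.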
Why manual workflow instead of automation?
- Development speed: During development, manual workflow allows me to quickly check results and iterate on content
- Right-sized solution: Using AI cloud services feels like overkill for what I want to achieve
Current status
✅ Completed:
- Directory structure design
- Summary generation rules defined
- Format specification document created
- AGENTS.md reorganized
🔄 Next steps:
- Design summary hosting and reminder notification system
Development log
| Date | Notes |
|---|---|
| 2025/12/13 | Designed hierarchical summary generation rules (daily/weekly/monthly/quarterly/yearly, date boundary, week start). Reorganized AGENTS.md to clarify generation rules. |
For the full development history, see the anchor post.