Bilingual Publishing Progress Report

Date: 2025-04-12
Status: 🔄 In Progress (Open API-based approach)

Executive Summary

Bilingual publishing currently relies on open APIs: Dashscope for translation and Edge TTS for audio. Local models would need training or fine-tuning first, so work is focused on the API-based approach for now.

Current Status

✅ Translation: EP001-EP010 complete (Dashscope API)
✅ Audio Generation: EP001-EP005 complete (Edge TTS)
🔄 Audio Generation: EP006-EP010 in progress
⚠️ Issue: EP011-EP015 do not exist in the episodes directory

Completed Tasks (4/9)

1. ✅ Translation (EP001-EP010)

API: Aliyun Dashscope (qwen-turbo model)
Files: EP001.md, EP002.md, ..., EP010.md (in episodes_translated/)
Quality: Good, but contains "part X of 75" markers (cleanup function not working properly)
Speed: ~1-2 minutes per episode (API rate limiting)

Output:

episodes_translated/
  EP001.md (13KB)
  EP002.md (14KB)
  EP003.md (15KB)
  EP004.md (15KB)
  EP005.md (13KB)
  EP006.md (12KB)
  EP007.md (12KB)
  EP008.md (13KB)
  EP009.md (14KB)
  EP010.md (14KB)

2. ✅ Audio Generation (EP001-EP005)

Engine: Edge TTS (Microsoft Edge online text-to-speech service)
Voices:
Guangda (Male): en-US-GuyNeural
Lingyi (Female): en-US-JennyNeural
Status: Complete

Output:

audio_en/
  EP001_en.mp3 (4.3 MB)
  EP002_en.mp3 (4.1 MB)
  EP003_en.mp3 (4.7 MB)
  EP004_en.mp3 (4.5 MB)
  EP005_en.mp3 (4.0 MB)

3. ✅ Script Updates

File: scripts/generate_english_audio.py
Changes:
Switched from Coqui TTS to Edge TTS library
Fixed speaker name recognition (Lingtong/Huixin → Guangda/Lingyi)
Improved dialogue parsing
Fixed file naming (EP{num}.md instead of EP{num}_en.md)
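
The speaker recognition and dialogue parsing described above can be sketched roughly as follows. This is a minimal illustration, not the code in generate_english_audio.py: the "Speaker: text" line format and the names VOICES, LINE_RE, and parse_dialogue are assumptions; only the speaker names and voice IDs come from this report.

```python
import re

# Voice assignment as listed in this report.
VOICES = {
    "Guangda": "en-US-GuyNeural",
    "Lingyi": "en-US-JennyNeural",
}

# Assumed line format: "Guangda: some text" (ASCII or full-width colon).
LINE_RE = re.compile(r"^(Guangda|Lingyi)\s*[:：]\s*(.+)$")


def parse_dialogue(script: str) -> list[tuple[str, str]]:
    """Return (voice, text) pairs for every recognised speaker line."""
    segments = []
    for raw in script.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:
            speaker, text = match.groups()
            segments.append((VOICES[speaker], text.strip()))
    return segments
```

Lines that do not start with a known speaker name are skipped, so headings and narration never reach the TTS step. Each (voice, text) pair could then be passed to whatever synthesis call the script uses.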

4. ✅ API Configuration

Dashscope API: Configured and tested
Edge TTS: Installed and working
No cost: Using free tiers for both APIs

In Progress (1/9)

5. 🔄 Audio Generation (EP006-EP010)

Status: Running in background (started 02:12)
Estimated completion: 02:14-02:16
Expected output: EP006_en.mp3 - EP010_en.mp3 (5 files)

Pending Tasks (4/9)

6. ⚠️ Bilingual Content Planning

Issue: EP011-EP015 episodes don't exist

Available episodes:

episodes/
  ep001-ep010 (10 episodes) ✅
  ep032 (1 episode)
  ep037-ep051 (15 episodes) ✅

Missing: ep011-ep031, ep033-ep036
User request: EP001-EP015 (English) + EP037-EP051 (Chinese)
Solution needed: Generate/create EP011-EP015 first
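
The gap check behind the list above can be reproduced with a short sketch. The function name missing_episodes and the "epNNN" naming pattern are assumptions for illustration; the available episode numbers are copied from the listing in this report.

```python
import re


def missing_episodes(names, first, last):
    """Return episode numbers in [first, last] that have no entry in names."""
    present = set()
    for name in names:
        match = re.match(r"ep(\d+)", name, re.IGNORECASE)
        if match:
            present.add(int(match.group(1)))
    return [n for n in range(first, last + 1) if n not in present]


# Directory contents as listed in this report; on disk this list could
# come from os.listdir("episodes") instead.
available = [f"ep{n:03d}" for n in list(range(1, 11)) + [32] + list(range(37, 52))]

# Requested English range: EP001-EP015.
print(missing_episodes(available, 1, 15))
```

Running the same check over 1 to 51 yields the full "Missing" list given above.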

7. ⏳ Platform Login Setup

Platforms: WeChat, Bilibili, Kuaishou, Douyin, Xiaohongshu
Requirement: User QR code scanning
Status: Pending (user intervention required)

8. ⏳ Auto Publisher Updates

File: scripts/auto_publisher.py
Updates needed:
Bilingual episode mapping (EP037→EP001, EP038→EP002, etc.)
Video platform integration
Chinese + English metadata handling
Estimated time: 30-60 minutes
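
The episode mapping listed above amounts to a fixed offset of 36 between the Chinese and English numbering. A minimal sketch, assuming that offset holds for all 15 pairs; the names OFFSET, english_for_chinese, and episode_id are hypothetical and not taken from auto_publisher.py:

```python
OFFSET = 36  # EP037 -> EP001 implies Chinese number minus 36


def english_for_chinese(chinese_num: int) -> int:
    """Map a Chinese episode number (37-51) to its English counterpart (1-15)."""
    if not 37 <= chinese_num <= 51:
        raise ValueError(f"no English counterpart planned for episode {chinese_num}")
    return chinese_num - OFFSET


def episode_id(num: int) -> str:
    """Format an episode number as a zero-padded ID such as EP001."""
    return f"EP{num:03d}"


mapping = {episode_id(n): episode_id(english_for_chinese(n)) for n in range(37, 52)}
```

Keeping the offset in one constant means a different pairing (for example, if only EP001-EP010 are used) needs a change in a single place.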

9. ⏳ Testing & Deployment

Dry-run test: EP037 Chinese + EP001 English
Bilingual schedule: 15-day daily release
Estimated time: 5-10 minutes
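
For the dry run, the 15-day schedule could be generated along these lines. This is a sketch under assumptions: the start date is a placeholder, build_schedule is a hypothetical helper, and the pairing assumes EP011-EP015 exist by the time the schedule runs.

```python
from datetime import date, timedelta


def build_schedule(start: date, days: int = 15):
    """Pair one Chinese and one English episode for each day of the release."""
    schedule = []
    for i in range(days):
        schedule.append({
            "date": (start + timedelta(days=i)).isoformat(),
            "chinese": f"EP{37 + i:03d}",  # EP037 onward
            "english": f"EP{1 + i:03d}",   # EP001 onward
        })
    return schedule


# Placeholder start date; the real one has not been decided in this report.
plan = build_schedule(date(2025, 5, 1))
```

Passing days=10 gives the shorter schedule that would apply if only EP001-EP010 are published in English.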

Issues & Solutions

Issue 1: EP011-EP015 Don't Exist

Problem: User requested EP001-EP015 for the English version, but within that range the episodes directory contains only EP001-EP010.

Current Options:
1. Generate EP011-EP015 using LingFlow AI (requires script generation time)
2. Use EP001-EP010 only for English version (10 episodes instead of 15)
3. Alternative mapping: Map EP037-EP051 to EP001-EP010 instead of EP001-EP015

Recommendation: Discuss with user about preferred approach.

Issue 2: Translation Quality Markers

Problem: Translated files contain "part X of 75" markers despite cleanup function.

Impact: Minor - audio generation works fine, markers are ignored.

Solution: Not critical for current work, can be cleaned up later if needed.
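
If the cleanup is picked up later, a regex pass along these lines might be enough. This is a sketch, not the existing cleanup function: the exact marker shape ("part 3 of 75", optionally in brackets or parentheses) is an assumption based on the description above, and strip_part_markers is a hypothetical name.

```python
import re

# Assumed marker shape: "part 3 of 75", optionally wrapped in () or [].
PART_MARKER = re.compile(
    r"[\[(]?\s*\bpart\s+\d+\s+of\s+\d+\s*[\])]?", re.IGNORECASE
)


def strip_part_markers(text: str) -> str:
    """Remove chunk markers and drop lines that held nothing but a marker."""
    cleaned_lines = []
    for line in text.splitlines():
        stripped = PART_MARKER.sub("", line)
        # Keep the line if text remains, or if it was blank to begin with.
        if stripped.strip() or not line.strip():
            cleaned_lines.append(stripped.rstrip())
    return "\n".join(cleaned_lines)
```

The pattern should be checked against a real translated file before use, since dialogue that happens to contain a phrase like "part 2 of 3" would also be removed.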

Next Steps (Immediate)

Monitor EP006-EP010 audio generation (in progress)
Discuss EP011-EP015 content with user
Start EP011-EP015 audio generation once files exist
Update auto_publisher.py for bilingual support
Configure platform logins (user QR code scanning)
Test bilingual dry-run workflow
Deploy bilingual publishing schedule

API Usage Summary

Dashscope Translation

Model: qwen-turbo
Cost: Free/Low cost
Rate: ~1-2 minutes per episode (rate limited)
Quality: Good for Chinese-to-English

Edge TTS Audio

Engine: Microsoft Edge TTS
Cost: Free
Speed: ~1 minute per episode
Quality: Natural, professional voices

File Structure

lingtongask/
├── scripts/
│   ├── translate_episodes.py (Dashscope API)
│   └── generate_english_audio.py (Edge TTS)
├── episodes_translated/
│   ├── EP001.md - EP010.md (translated scripts)
│   └── EP001.json - EP010.json (metadata)
├── audio_en/
│   ├── EP001_en.mp3 - EP005_en.mp3 (generated)
│   ├── EP006_en.mp3 - EP010_en.mp3 (in progress)
│   └── generation_overview.json
└── episodes/
    ├── ep001-ep010/ (source scripts)
    ├── ep032/
    └── ep037-ep051/ (Chinese versions)

Time Estimates

EP006-EP010 audio: 5-10 minutes ⏳
EP011-EP015 translation: N/A (files don't exist)
EP011-EP015 audio: N/A (files don't exist)
Auto publisher updates: 30-60 minutes
Platform login setup: 15-30 minutes (user)
Testing: 5-10 minutes
Deployment: 5 minutes

Total remaining: 1-2 hours (excluding EP011-EP015 creation)
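
The total can be checked by summing the per-task ranges listed above (the EP011-EP015 items are excluded, as stated). A small sketch of that arithmetic:

```python
# (min, max) minutes per remaining task, copied from the estimates above.
estimates = {
    "EP006-EP010 audio": (5, 10),
    "Auto publisher updates": (30, 60),
    "Platform login setup": (15, 30),
    "Testing": (5, 10),
    "Deployment": (5, 5),
}

low = sum(lo for lo, _ in estimates.values())
high = sum(hi for _, hi in estimates.values())
print(f"{low}-{high} minutes")  # 60-115 minutes
```

The sum of 60 to 115 minutes is consistent with the rounded figure of 1-2 hours.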

Notes for User

Open API approach is working well - Dashscope and Edge TTS are stable and free
EP011-EP015 issue needs resolution - currently only EP001-EP010 available
Audio generation is fast - ~1 minute per episode with Edge TTS
Translation quality is good - Dashscope handles Qigong terminology well
Platform logins require user action - QR code scanning needed

Last Updated: 2025-04-12 02:12
Next Review: After EP006-EP010 audio generation completes