跳转至

Bilingual Publishing Progress Report

Date: 2025-04-12 Status: 🔄 In Progress (Open API-based approach)


Executive Summary

Using open APIs (Dashscope, Edge TTS) for bilingual publishing. Local models need training/fine-tuning, so focusing on API-based approach for now.

Current Status

  • ✅ Translation: EP001-EP010 complete (Dashscope API)
  • ✅ Audio Generation: EP001-EP005 complete (Edge TTS)
  • 🔄 Audio Generation: EP006-EP010 in progress
  • ⚠️ Issue: EP011-EP015 episodes don't exist in episodes directory

Completed Tasks (4/9)

1. ✅ Translation (EP001-EP010)

  • API: Aliyun Dashscope (qwen-turbo model)
  • Files: EP001.md, EP002.md, ..., EP010.md (in episodes_translated/)
  • Quality: Good, but contains "part X of 75" markers (cleanup function not working properly)
  • Speed: ~1-2 minutes per episode (API rate limiting)
  • Output:
    episodes_translated/
      EP001.md (13KB)
      EP002.md (14KB)
      EP003.md (15KB)
      EP004.md (15KB)
      EP005.md (13KB)
      EP006.md (12KB)
      EP007.md (12KB)
      EP008.md (13KB)
      EP009.md (14KB)
      EP010.md (14KB)
    

2. ✅ Audio Generation (EP001-EP005)

  • Engine: Edge TTS (Microsoft Azure)
  • Voices:
  • Guangda (Male): en-US-GuyNeural
  • Lingyi (Female): en-US-JennyNeural
  • Status: Complete
  • Output:
    audio_en/
      EP001_en.mp3 (4.3 MB)
      EP002_en.mp3 (4.1 MB)
      EP003_en.mp3 (4.7 MB)
      EP004_en.mp3 (4.5 MB)
      EP005_en.mp3 (4.0 MB)
    

3. ✅ Script Updates

  • File: scripts/generate_english_audio.py
  • Changes:
  • Switched from Coqui TTS to Edge TTS library
  • Fixed speaker name recognition (Lingtong/Huixin → Guangda/Lingyi)
  • Improved dialogue parsing
  • Fixed file naming (EP{num}.md instead of EP{num}_en.md)

4. ✅ API Configuration

  • Dashscope API: Configured and tested
  • Edge TTS: Installed and working
  • No cost: Using free tiers for both APIs

In Progress (1/9)

5. 🔄 Audio Generation (EP006-EP010)

  • Status: Running in background (started 02:12)
  • Estimated completion: 02:14-02:16
  • Expected output: EP006_en.mp3 - EP010_en.mp3 (5 files)

Pending Tasks (4/9)

6. ⚠️ Bilingual Content Planning

  • Issue: EP011-EP015 episodes don't exist
  • Available episodes:
    episodes/
      ep001-ep010 (10 episodes) ✅
      ep032 (1 episode)
      ep037-ep051 (15 episodes) ✅
    
  • Missing: ep011-ep031, ep033-ep036
  • User request: EP001-EP015 (English) + EP037-EP051 (Chinese)
  • Solution needed: Generate/create EP011-EP015 first

7. ⏳ Platform Login Configuration

  • Platforms: WeChat, Bilibili, Kuaishou, Douyin, Xiaohongshu
  • Requirement: User QR code scanning
  • Status: Pending (user intervention required)

8. ⏳ Auto Publisher Updates

  • File: scripts/auto_publisher.py
  • Updates needed:
  • Bilingual episode mapping (EP37→EP1, EP38→EP2, etc.)
  • Video platform integration
  • Chinese + English metadata handling
  • Estimated time: 30-60 minutes

9. ⏳ Testing & Deployment

  • Dry-run test: EP37 Chinese + EP1 English
  • Bilingual schedule: 15-day daily release
  • Estimated time: 5-10 minutes

Issues & Solutions

Issue 1: EP011-EP015 Don't Exist

Problem: User requested EP001-EP015 for English version, but episodes directory only has EP001-EP010.

Current Options: 1. Generate EP011-EP015 using LingFlow AI (requires script generation time) 2. Use EP001-EP010 only for English version (10 episodes instead of 15) 3. Alternative mapping: Map EP037-EP051 to EP001-EP010 instead of EP001-EP015

Recommendation: Discuss with user about preferred approach.

Issue 2: Translation Quality Markers

Problem: Translated files contain "part X of 75" markers despite cleanup function.

Impact: Minor - audio generation works fine, markers are ignored.

Solution: Not critical for current work, can be cleaned up later if needed.


Next Steps (Immediate)

  1. Monitor EP006-EP010 audio generation (in progress)
  2. Start EP011-EP015 audio generation once files exist
  3. Discuss EP011-EP015 content with user
  4. Update auto_publisher.py for bilingual support
  5. Configure platform logins (user QR code scanning)
  6. Test bilingual dry-run workflow
  7. Deploy bilingual publishing schedule

API Usage Summary

Dashscope Translation

  • Model: qwen-turbo
  • Cost: Free/Low cost
  • Rate: ~1-2 minutes per episode (rate limited)
  • Quality: Good for Chinese-to-English

Edge TTS Audio

  • Engine: Microsoft Edge TTS
  • Cost: Free
  • Speed: ~1 minute per episode
  • Quality: Natural, professional voices

File Structure

lingtongask/
├── scripts/
│   ├── translate_episodes.py (Dashscope API)
│   └── generate_english_audio.py (Edge TTS)
├── episodes_translated/
│   ├── EP001.md - EP010.md (translated scripts)
│   └── EP001.json - EP010.json (metadata)
├── audio_en/
│   ├── EP001_en.mp3 - EP005_en.mp3 (generated)
│   ├── EP006_en.mp3 - EP010_en.mp3 (in progress)
│   └── generation_overview.json
└── episodes/
    ├── ep001-ep010/ (source scripts)
    ├── ep032/
    └── ep037-ep051/ (Chinese versions)

Time Estimates

  • EP006-EP010 audio: 5-10 minutes ⏳
  • EP011-EP015 translation: N/A (files don't exist)
  • EP011-EP015 audio: N/A (files don't exist)
  • Auto publisher updates: 30-60 minutes
  • Platform login setup: 15-30 minutes (user)
  • Testing: 5-10 minutes
  • Deployment: 5 minutes

Total remaining: 1-2 hours (excluding EP011-EP015 creation)


Notes for User

  1. Open API approach is working well - Dashscope and Edge TTS are stable and free
  2. EP011-EP015 issue needs resolution - currently only EP001-EP010 available
  3. Audio generation is fast - ~1 minute per episode with Edge TTS
  4. Translation quality is good - Dashscope handles Qigong terminology well
  5. Platform logins require user action - QR code scanning needed

Last Updated: 2025-04-12 02:12 Next Review: After EP006-EP010 audio generation completes