
# 👋 Hi, I'm Hongping Zhang

**Independent AI Researcher | Energy Efficiency & Sustainable Computing**


## 🎯 Core Assets

| Asset | Type | Impact | Link |
| --- | --- | --- | --- |
| 🤗 HuggingFace Optimum Integration | Official Documentation | Trusted by thousands of HF developers | View Docs → |
| 📊 Complete Energy Dataset | Research Benchmark | 360+ configurations, 5 precision methods | Explore Data → |
| 🦾 EcoCompute AI Assistant | Interactive Tool | Conversational energy advisor on ClawHub | Try EcoCompute → |
| 🏛️ MLCommons Power WG Discussion | Industry Recognition | Invited to contribute to MLPerf power measurement standards | View Discussion → |

## 🔬 Core Discovery

> Quantization saves energy only for models larger than 3.2–4.6B parameters.
> For smaller models, FP16 is actually more energy-efficient.
> — Measured on RTX 4090D, RTX 5090, and A800 GPUs with NVML power sampling.

This finding challenges the default assumption that "quantize everything = green." Our benchmark data is open and reproducible.
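The measurement approach behind these numbers can be sketched in a few lines. This is an illustrative example, not the actual benchmark harness: it shows how a stream of power samples (in practice read via NVML, e.g. `pynvml.nvmlDeviceGetPowerUsage`, which reports milliwatts) is turned into total energy by numerical integration.

```python
# Illustrative sketch: converting timestamped GPU power samples into energy.
# In a real harness the samples would come from NVML polling during inference.

def energy_joules(samples: list[tuple[float, float]]) -> float:
    """Trapezoidal integration of (timestamp_s, power_w) samples -> joules."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total

# Example: a steady 300 W draw sampled over 2 seconds -> 600 J.
samples = [(0.0, 300.0), (1.0, 300.0), (2.0, 300.0)]
print(energy_joules(samples))  # 600.0
```

Comparing this integral for an FP16 run against a quantized run of the same workload is what yields the per-configuration savings or overhead figures.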

**Key Findings:**

- NF4 crossover: 3.2–3.9B parameters (hardware-dependent)
- INT8 crossover: 4.0–4.6B parameters (hardware-dependent)
- Below the crossover: quantization adds 25–55% energy overhead
- Above the crossover: quantization saves 15–23% energy
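The findings above suggest a simple decision rule. The helper below is hypothetical, written only to encode the reported crossover ranges; the ranges themselves are hardware-dependent, so treat the verdicts as rough guidance rather than a guarantee.

```python
# Hypothetical helper encoding the reported crossover ranges (in billions of
# parameters). Not part of the published tooling; numbers are from the findings
# above and vary by GPU.
CROSSOVER_B = {"nf4": (3.2, 3.9), "int8": (4.0, 4.6)}

def quantization_verdict(params_b: float, method: str) -> str:
    lo, hi = CROSSOVER_B[method.lower()]
    if params_b < lo:
        return "no: expect 25-55% energy overhead vs FP16"
    if params_b > hi:
        return "yes: expect 15-23% energy savings vs FP16"
    return "borderline: hardware-dependent, benchmark it"

print(quantization_verdict(1.3, "nf4"))   # small model: quantization costs energy
print(quantization_verdict(7.0, "int8"))  # large model: quantization saves energy
```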

## 🚀 Try It Now

- 🌐 **Live Demo**: ecocompute-dynamic-eval →
- 📊 **What it does**: Compare AI models by Accuracy × Cost × Carbon in one dashboard
- ⚡ **Data source**: Real GPU benchmarks (PyTorch 2.10 + CUDA 12.8, 10 runs per config)
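The kind of three-way comparison the dashboard makes can be sketched as a Pareto filter: a model is worth showing only if no other model is at least as accurate while costing no more money and no more carbon. The model names and numbers below are made up for illustration; this is not the dashboard's implementation.

```python
# Illustrative Accuracy x Cost x Carbon comparison as a Pareto filter.
# Higher accuracy is better; lower cost and lower carbon are better.

def dominates(a: dict, b: dict) -> bool:
    """a dominates b: no worse on all three axes, strictly better on one."""
    no_worse = (a["acc"] >= b["acc"] and a["cost"] <= b["cost"]
                and a["co2"] <= b["co2"])
    strictly = (a["acc"] > b["acc"] or a["cost"] < b["cost"]
                or a["co2"] < b["co2"])
    return no_worse and strictly

def pareto_front(models: list[dict]) -> list[dict]:
    return [m for m in models if not any(dominates(o, m) for o in models)]

models = [  # hypothetical entries
    {"name": "small-fp16", "acc": 0.71, "cost": 0.2, "co2": 10},
    {"name": "large-int8", "acc": 0.83, "cost": 0.9, "co2": 35},
    {"name": "large-fp16", "acc": 0.83, "cost": 1.1, "co2": 50},  # dominated
]
print([m["name"] for m in pareto_front(models)])  # ['small-fp16', 'large-int8']
```

Here `large-fp16` drops out because `large-int8` matches its accuracy at lower cost and carbon, which mirrors the crossover finding for large models.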

## 📈 Recognition & Impact

| Achievement | Details |
| --- | --- |
| 🤗 HuggingFace Official | Quantization energy findings integrated into Optimum documentation |
| 🏛️ MLCommons Invited | Contributing to MLPerf Power Working Group on quantization energy metrics |
| 📊 Open Dataset | 360 configurations: 270 analyzed + 90 FP8 reserved for future work |
| 🌍 Zenodo Archive | Permanent DOI: 10.5281/zenodo.18900289 |
| 📝 Research Paper | "When Does Quantization Save Energy?" (arXiv submission in progress) |

## 🎯 2026 Roadmap

- ✅ **HuggingFace Integration**: official documentation published
- ✅ **MLCommons Engagement**: invited to the Power Working Group
- 🔄 **arXiv Publication**: seeking endorsement for a cs.LG submission
- 🛡️ **VS Code Extension**: real-time energy linting before code merges
- 🤝 **Enterprise Pilots**: seeking design partners for carbon-aware CI/CD

## 💚 How You Can Help

I'm looking for design partners, early adopters, arXiv endorsers, and grant sponsors to take EcoCompute from research to production.

| Action | Link |
| --- | --- |
| ⭐ Star the repo | quantization-energy-crossover |
| 🌐 Try the demo | Live Dashboard → |
| 📧 arXiv endorsement | Email me → |
| 🤝 Become a design partner | Email me → |
| 💼 Invest / grant | Email me → |

## 📚 Key Publications & Resources

*🌍 Making AI development more sustainable, one model at a time.*
