Image-editing performance inconsistent with the paper #11

I tried the interleaved generation task (image editing) by running the unmodified repo code under `MMaDA-Parallel-A/`:
```shell
python inference.py \
    --checkpoint tyfeld/MMaDA-Parallel-A \
    --vae_ckpt tyfeld/MMaDA-Parallel-A \
    --prompt "Replace beer with a cup of coffee and make the keyboard space gray" \
    --image_path examples/image.png \
    --height 512 \
    --width 512 \
    --timesteps 128 \
    --text_steps 256 \
    --text_gen_length 256 \
    --text_block_length 32 \
    --cfg_scale 0 \
    --cfg_img 4.0 \
    --temperature 1.0 \
    --text_temperature 0 \
    --seed 42 \
    --output_dir output/results_interleave
```
According to the paper, I should get an edited image that stays faithful to the original image, like this:

<img width="420" height="334" alt="Image" src="https://github.com/user-attachments/assets/57503a3c-c0b9-4ef5-b7f9-5a8950d58bb0" />

Instead, this is what I get:

<img width="512" height="512" alt="Image" src="https://github.com/user-attachments/assets/6f8c3a5e-e5a4-4aeb-8cfe-6e4b2a1498a1" />

Is there something wrong with my way of reproducing it?
