I tried the interleave generation task (image editing) by directly running the unchanged repo under MMaDA-Parallel-A/:
python inference.py \
    --checkpoint tyfeld/MMaDA-Parallel-A \
    --vae_ckpt tyfeld/MMaDA-Parallel-A \
    --prompt "Replace beer with a cup of coffee and make the keyboard space gray" \
    --image_path examples/image.png \
    --height 512 \
    --width 512 \
    --timesteps 128 \
    --text_steps 256 \
    --text_gen_length 256 \
    --text_block_length 32 \
    --cfg_scale 0 \
    --cfg_img 4.0 \
    --temperature 1.0 \
    --text_temperature 0 \
    --seed 42 \
    --output_dir output/results_interleave
According to the paper, I should get an edited image that stays faithful to the original image, like this:
But it turns out I get this:
Is there something wrong with my way of reproducing it?