Upload two images and input an action description. The model will predict which image is closer to the completed state of the task.