• tal@lemmy.today
      3 months ago

      I saw this neat project the other day, detextify.

      All it does is automate the following across a large batch of images: run OCR on an image, identify any text, take the bounding box of that text, and inpaint that area, repeating until it can’t detect any text there.
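
      A minimal sketch of that detect-and-inpaint loop, to make the control flow concrete. This is not detextify's actual API: `remove_text`, `detect_text`, `inpaint`, and `max_rounds` are names I made up, and the OCR and inpainting steps are passed in as plain callables so that any real backend could be plugged in. The toy stand-ins at the bottom only exist so the snippet runs on its own.

```python
from typing import Any, Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def remove_text(
    image: Any,
    detect_text: Callable[[Any], List[Box]],  # hypothetical OCR wrapper
    inpaint: Callable[[Any, Box], Any],       # hypothetical inpainting wrapper
    max_rounds: int = 10,
) -> Tuple[Any, bool]:
    """Inpaint every detected text box until none remain or rounds run out.

    Returns the final image and a flag saying whether it came out text-free.
    """
    for _ in range(max_rounds):
        boxes = detect_text(image)
        if not boxes:
            return image, True
        for box in boxes:
            image = inpaint(image, box)
    # Out of rounds: check one last time whether any text survived.
    return image, not detect_text(image)


# Toy stand-ins (assumptions for illustration only): an "image" is a dict
# mapping each text box to how many inpaint passes it takes to erase it.
def toy_detect(image: Dict[Box, int]) -> List[Box]:
    return [box for box, passes_left in image.items() if passes_left > 0]


def toy_inpaint(image: Dict[Box, int], box: Box) -> Dict[Box, int]:
    patched = dict(image)  # copy, so the caller's image is left untouched
    patched[box] -= 1
    return patched


toy = {(0, 0, 10, 10): 2, (5, 5, 8, 8): 1}
cleaned, ok = remove_text(toy, toy_detect, toy_inpaint)
```

      The `max_rounds` cap is my addition: without it, a region the inpainter keeps filling with text-like texture would loop forever.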

      It does kind of seem to me like that might be a generalizable approach. That is, it might be hard to write software to draw a good image of someone with just two hands and the right number of fingers. But… it might be an easier problem to identify, or at least flag likely, mismatches in the number of fingers.

      Like, the ideal would be to have the image generator produce the right number of fingers and stuff like that in the first place. But in the absence of that, having software automatically identify problematic features in the image and simply regenerate them might be a reasonably doable workaround.
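
      For the "flag likely mismatches" half of that idea, here is a rough sketch of what the checking step might look like. Everything in it is hypothetical: the `Hand` record stands in for the output of some hand detector I am assuming exists, the thresholds are arbitrary, and it assumes the image shows a single person. It only returns boxes worth regenerating; it can't tell a bad hand from one with fingers legitimately hidden.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Hand:
    """Hypothetical output record of an assumed hand detector."""
    box: Box
    finger_count: int
    confidence: float  # detector's confidence in this detection, 0..1


def flag_suspect_regions(
    hands: List[Hand],
    expected_fingers: int = 5,
    max_hands: int = 2,           # assumes exactly one person in the image
    min_confidence: float = 0.5,  # arbitrary cutoff for trusting a count
) -> List[Box]:
    """Return the boxes of hands that look wrong and are worth regenerating."""
    suspects = []
    # Most confident detections first, so surplus hands are the least certain.
    ranked = sorted(hands, key=lambda h: h.confidence, reverse=True)
    for rank, hand in enumerate(ranked):
        wrong_count = (
            hand.confidence >= min_confidence
            and hand.finger_count != expected_fingers
        )
        surplus_hand = rank >= max_hands
        if wrong_count or surplus_hand:
            suspects.append(hand.box)
    return suspects


detected = [
    Hand((0, 0, 10, 10), finger_count=5, confidence=0.9),
    Hand((20, 0, 30, 10), finger_count=6, confidence=0.8),
    Hand((40, 0, 50, 10), finger_count=5, confidence=0.3),
]
to_regenerate = flag_suspect_regions(detected)
```

      The returned boxes would then go to whatever inpainting step is in use, the same way the text boxes do in detextify's case.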