Last week we shared a number of updates with our community of users, and now we want to share them here: At Mozilla, we work hard to make Firefox the best
So, I don’t think Firefox would be generating captions for PDFs at the time the PDF is created.
But of the major ways that PDFs do get created (conversion from text editors or design software), I know that Microsoft Word automatically suggests captions when the document creator adds an image (but does not automatically apply them), and I believe that some design software does as well.
I think that, functionally, suggesting captions when the document is created and suggesting them when it is read are prone to the same issues: the software may not be smart enough to properly identify the object, and even if it is, it is not necessarily smart enough to explain it in context.
By way of example, a screenshot of a computer program will get the automatic suggestion “A graphical user interface” (or similar), but depending on the context and usage, the right description could be “A virus installer disguised as ___ video game installer” or “The ___ video game installer.” Of the document creator, the creation software, and the screen reader, only the document creator would really know the context for the image.
Which is all to say that I think Mozilla has the right idea with auto-tagging, but it will always fail on context. The only way to actually address the issue is to deal with it within the document creation software.
But I wouldn’t be opposed to ML in those tools that can auto-suggest descriptions or even critique how content authors write them.