Accessible documents
What does it mean for a document to be accessible? An accessible document is one whose content assistive technologies (screen readers, digital magnifiers, etc.) can interpret correctly. Digital accessibility is not only a technical matter but also an ethical, and in some cases legal, commitment to inclusion. Ensuring digital accessibility allows people with visual, cognitive, auditory, or motor disabilities to access content on equal terms. This is especially relevant in education, since it ensures that all students can participate fully in the learning process without barriers arising from the format of the material.
Among other things, an accessible document should include:
- A clear semantic structure (titles, sections, lists, tables).
- Alternative text for images.
- Tagging of mathematical formulas.
- A correct reading order.
- Descriptive metadata.
Accessible documents with LaTeX
Creating accessible documents is technically complicated. It becomes harder still when the output is a PDF, because the format is designed for software to “paint” the content on screen efficiently and carries little information about the semantic structure, alternative text, tagging, reading order, etc. that accessibility requires.
LaTeX is typesetting software that was designed without accessibility in mind. Even so, it is possible to create accessible documents with LaTeX, although authors have to make an extra effort: using additional packages and following some extra conventions on top of the usual good practices.
Helpful good practices
- Always use the built-in commands to create titles and sections rather than manual formatting, including the relevant declarations in the preamble.
- Always include \maketitle.
- Always use \section{}, \subsection{}, etc.
- Always include captions on tables.
- Use the hyperref package and set the document's language information so that screen readers can identify the language of the document.
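The practices above can be sketched in one small document. This is only an illustration under stated assumptions: the title, author, section names, and table contents are placeholders, and the choice of pdflang={en} assumes an English document (a Spanish one would use something like es-ES).

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[english]{babel}   % assumption: the document is written in English
\usepackage{hyperref}

% Metadata that PDF viewers and screen readers can query.
\hypersetup{
  pdftitle  = {Placeholder title},   % placeholder value
  pdfauthor = {Placeholder author},  % placeholder value
  pdflang   = {en}                   % language of the document
}

% Title information declared in the preamble.
\title{Placeholder title}
\author{Placeholder author}
\date{\today}

\begin{document}
\maketitle  % typeset the title block from the declarations above

\section{Introduction}       % structural commands, not manual bold text
\subsection{Background}
Some introductory text.

\begin{table}
  \centering
  \caption{Placeholder table with a descriptive caption.}
  \begin{tabular}{lr}
    Item & Value \\ \hline
    A    & 1     \\
    B    & 2     \\
  \end{tabular}
\end{table}

\end{document}
```

On its own this does not produce a tagged PDF; it only ensures that the source carries the structure and metadata that tagging tools can later build on.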
Packages that help with accessibility
Some packages help generate accessible documents that carry information about structure, alternative text, or the tagging of mathematical formulas. The two most important are accessibility and axessibility.
Accessibility
DO NOT USE: ITS OWN AUTHOR ADVISES AGAINST IT
To create a more accessible PDF you can load the package with the options [tagged, highstructure]:
\usepackage[tagged, highstructure]{accessibility}
Once the package is loaded, the \alt command can also be used to add alternative text to figures.
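A minimal sketch of how \alt might be used follows. The placement of \alt inside the figure environment is an assumption based on the package's description, not a verified listing from the source; the image name example-image comes from the mwe bundle and the alternative text is a placeholder.

```latex
\documentclass{article}
\usepackage{graphicx}
\usepackage[tagged, highstructure]{accessibility}

\begin{document}

\begin{figure}
  \centering
  % example-image ships with the mwe bundle; replace with your own file
  \includegraphics[width=0.5\textwidth]{example-image}
  % Assumed placement: \alt inside the figure, next to the graphic
  \alt{Placeholder description of what the image shows.}
  \caption{A placeholder figure.}
\end{figure}

\end{document}
```

Given that the package is unmaintained and discouraged by its author, this is best read as documentation of legacy files rather than as a recommendation.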
Axessibility
Load the package:
\usepackage{axessibility}
The generated document will contain accessibility information for the equation environment and for inline formulas ($...$).
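A small sketch of a document using axessibility follows. It uses \( ... \) for the inline formula because, as far as I recall from the package documentation, dollar-sign delimiters are not handled in every engine and version; treat that detail as an assumption and check the documentation of the version you have installed.

```latex
\documentclass{article}
\usepackage{amsmath}
\usepackage{axessibility}  % embeds the LaTeX source of formulas as hidden text

\begin{document}

% Inline formula, written with \( ... \) rather than $ ... $
The roots of \(ax^2 + bx + c = 0\) are given by
% Display formula in the equation environment
\begin{equation}
  x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
\end{equation}

\end{document}
```

With this approach a screen reader reads out the LaTeX code of each formula, so the result is most useful to readers who already know LaTeX syntax.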
TagPDF and LaTeX Prototype
- https://www.latex-project.org/news/2024/07/08/tagging/
- https://latex3.github.io/tagging-project/tagging-status/
- https://www.latex-project.org/publications/2024-FMi-DPC-UFi-JAW-doceng24.pdf
- https://latex3.github.io/tagging-project/documentation/prototype-usage-instructions.html
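As a rough sketch of what the prototype usage instructions describe, tagging is switched on with a \DocumentMetadata declaration placed before \documentclass. The key values below (the language, the PDF standard, and the list of testphase modules) are written from memory of those instructions and should be treated as assumptions to check against the linked pages; the feature is experimental and needs a recent TeX distribution, preferably compiled with LuaLaTeX.

```latex
% Must come before \documentclass.
\DocumentMetadata{
  lang        = en-US,  % assumption: English document
  pdfversion  = 2.0,
  pdfstandard = ua-2,   % request PDF/UA-2 metadata
  % Experimental tagging modules; names as recalled from the
  % prototype usage instructions, check them before relying on this.
  testphase   = {phase-III, title, table, math, firstaid}
}
\documentclass{article}

\title{Placeholder title}
\author{Placeholder author}

\begin{document}
\maketitle

\section{Introduction}
A paragraph with an inline formula \(a^2 + b^2 = c^2\).

\end{document}
```

Not every class or package is compatible with the tagging code yet; the tagging status page linked above lists what is currently supported.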
Helpful good practices
- Provide alternative text for images.
- Add tables after charts: the underlying data is the only way a screen reader can explain a chart.
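The two practices above could look as follows. This is a sketch under assumptions: it relies on the alt key that recent LaTeX releases accept in \includegraphics (it is only written to the PDF when tagging is active and is otherwise ignored), the image example-image comes from the mwe bundle, and the enrolment figures are invented purely for illustration.

```latex
\documentclass{article}
\usepackage{graphicx}

\begin{document}

\begin{figure}
  \centering
  % alt= carries the alternative text; the numbers are made-up sample data
  \includegraphics[width=0.6\textwidth,
    alt={Bar chart of enrolment per year, rising from 120 students
         in 2021 to 180 in 2023}]{example-image}
  \caption{Enrolment per year.}
\end{figure}

% The same (invented) data as a table, placed right after the chart
\begin{table}
  \centering
  \caption{Data behind the enrolment chart.}
  \begin{tabular}{lr}
    Year & Students \\ \hline
    2021 & 120 \\
    2022 & 150 \\
    2023 & 180 \\
  \end{tabular}
\end{table}

\end{document}
```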
Quick guidance
Convert to HTML and MathML
Since the PDFs produced by TeX engines are typically not accessible, you should provide an accessible alternative in the form of HTML and MathML. Because of the current limitations of LaTeX, many experts recommend this conversion, which can be done with a number of applications, including LaTeXML and Pandoc. Both offer free web apps that let you upload your TeX document and output an HTML file.
LaTeXML
LaTeXML is a command-line program that converts LaTeX documents to XML. Its companion, latexmlpost, converts this XML into other formats such as HTML or XHTML, with options to convert the math into MathML (currently presentation MathML only). See the LaTeXML documentation for details.
Pandoc
Pandoc is a command-line program that can also convert LaTeX to XML and HTML file types. See example 17 on the Pandoc demos page for an example of converting LaTeX to HTML.
PDF Tagging & Headings
There does not appear to be a way of reliably generating tagged PDFs using LaTeX. The two potential solutions I came across are both unsuitable:
- As outlined in the tagpdf documentation, the tagpdf package is not meant for normal document production. As such, the syntax required to use it is complicated and the package likely contains bugs.
- As outlined on the accessibility package's GitHub page, the accessibility package is also not suitable for production and is no longer maintained. Although it does produce tagged PDFs according to Ally, it sometimes prevents documents from compiling and sometimes causes unexpected behaviour. For example, one project compiles without the package, while an otherwise identical project that contains it does not. In another case, some of the text is duplicated in the incorrectly compiled project, whereas no duplication occurs in the correctly compiled one.
Tags can be added to a PDF after it has been created, using a few different services, namely Adobe Acrobat Pro DC, Microsoft Word, and PDFix. Since Acrobat Pro isn't free to use and Word often seems to ruin the formatting, I found PDFix's ‘Make PDF Accessible’ tool to be the best solution. It also allows metadata to be changed. The company appears reputable: the PDFix privacy policy states that they delete all provided files after 30 days and pass data to third parties “only within the extent necessary to meet its obligations”.
The only problem I found with this service was its inability to render a .pdf vector image. This image format is unusual, and the problem was easily fixed by converting the image to a .png file.
Maths
Making maths accessible in LaTeX does appear to be possible but is a little complex. Most sources seem to recommend converting LaTeX documents to HTML5 via a semi-automated process using various tools. The aforementioned Massie and Sarantsev paper provides a good overview of the topic.
I found Pandoc to be the easiest tool for this conversion. To render maths it uses MathJax, a JavaScript engine which creates “beautiful and accessible math in all browsers”. HTML documents are accessible by default since they are tagged and have conventions for setting alt text and metadata. See the MathJax documentation for information on the screen readers it supports for maths.
Once Pandoc is installed, LaTeX documents can be converted on Windows as follows:
1. Open Command Prompt (press Win+R, type cmd, press Enter).
2. Copy the location of the folder containing the .tex file you wish to convert. The .bib file should be in the same directory.
3. In Command Prompt, enter: cd "the folder location you copied"
4. Enter the following command, replacing myTex.tex and myBib.bib with your filenames:
   pandoc myTex.tex -f latex -t html -s -o output.html --bibliography myBib.bib --citeproc --mathjax
5. Move the new file output.html up one folder level, for example from C:/folder1/folder2/folder3/output.html to C:/folder1/folder2/output.html, so that the image paths are correct.
6. Open output.html.
What does this article cover?
The goal of this article is to provide an introduction to tagged PDF, together with an overview of some technical challenges faced by software, including TeX engines and LaTeX, that aims to produce tagged and accessible PDF files. Accessibility, particularly of PDFs, is a broad and complex topic with technical challenges that don't always have a single, universally accepted solution. One example is how to represent complex mathematics within PDFs in an accessible way: using MathML or LaTeX code?
Although we cannot take a deep dive into every topic, and have to simplify many details, we can take a look inside PDFs to show what tagging a PDF really involves. Overleaf hopes this article will serve as a useful introduction, providing sufficient background to enable readers to better understand the technical challenges and support their further reading and exploration of tagged PDF and accessibility. Resources within this article include:
- an 8-minute video exploring a tagged PDF produced by LaTeX;
- an Overleaf project to explore the use of space characters in LuaTeX;
- sound recordings demonstrating PDF accessibility issues.