ladybird/DocumentParser.cpp at e7f7c434f79adc199f08ca30f5f6b49d82e92a5f

mirror of https://github.com/fergalmoran/ladybird.git synced 2026-01-06 08:36:15 +00:00


Nico Weber e7f7c434f7 LibPDF: Don't check for startxref after trailer dict

Several files have a comment between the trailer dict and the
`startxref` that follows it.

We really should add a consume_whitespace_and_comments() function
and call that in most places we currently call consume_whitespace().
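
A minimal sketch of what such a helper could look like, operating on a plain `std::string_view` and a byte offset rather than on LibPDF's actual reader class. The signature and the `is_pdf_whitespace` helper are assumptions for illustration only. It treats NUL, TAB, LF, FF, CR and SPACE as whitespace, and a `%` as the start of a comment that runs to the end of the line:

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of LibPDF: the whitespace characters PDF defines.
static bool is_pdf_whitespace(char c)
{
    return c == '\0' || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ';
}

// Hypothetical sketch: returns the offset of the first byte at or after `pos`
// that is neither whitespace nor part of a `%` comment.
std::size_t consume_whitespace_and_comments(std::string_view data, std::size_t pos)
{
    while (pos < data.size()) {
        if (is_pdf_whitespace(data[pos])) {
            ++pos;
        } else if (data[pos] == '%') {
            // A comment runs up to (not including) the next end-of-line byte;
            // the end-of-line itself is consumed as whitespace on the next pass.
            while (pos < data.size() && data[pos] != '\n' && data[pos] != '\r')
                ++pos;
        } else {
            break;
        }
    }
    return pos;
}
```

With input such as `>>\n% comment\r\nstartxref` and a starting offset just past `>>`, this skips the comment and stops at the `s` of `startxref`.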

But in this case, for non-linearized files, we first jump to the
end of the file, read `startxref`, then jump to `xref` from the
offset there, and then read the trailer after the `xref`,
only to read `startxref` again. So we can just not do that.
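
The non-linearized flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in DocumentParser.cpp: `find_xref_offset` and `xref_table_starts_at` are hypothetical names, the input is a plain `std::string_view`, and the offset parse does not guard against overflow. Once the offset from the final `startxref` is known, the parser can jump to `xref` and read the trailer, and nothing after the trailer dict needs to be consulted again:

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper: locates the last `startxref` keyword in the file and
// parses the decimal byte offset that follows it.
std::optional<std::size_t> find_xref_offset(std::string_view data)
{
    constexpr std::string_view keyword = "startxref";
    auto keyword_pos = data.rfind(keyword);
    if (keyword_pos == std::string_view::npos)
        return std::nullopt;

    std::size_t pos = keyword_pos + keyword.size();
    // Skip the whitespace between the keyword and the number.
    while (pos < data.size() && (data[pos] == ' ' || data[pos] == '\t' || data[pos] == '\r' || data[pos] == '\n'))
        ++pos;

    // At least one digit is required.
    if (pos >= data.size() || data[pos] < '0' || data[pos] > '9')
        return std::nullopt;

    std::size_t offset = 0;
    while (pos < data.size() && data[pos] >= '0' && data[pos] <= '9') {
        offset = offset * 10 + static_cast<std::size_t>(data[pos] - '0');
        ++pos;
    }
    return offset;
}

// Hypothetical helper: checks that the `xref` keyword sits at the given offset.
bool xref_table_starts_at(std::string_view data, std::size_t offset)
{
    return offset <= data.size() && data.substr(offset, 4) == "xref";
}
```

A comment placed between the trailer dict and `startxref` does not affect this lookup, because the keyword is found by searching backwards from the end of the file rather than by parsing forward from the trailer.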

(For linearized files, we now completely ignore `startxref`.
But we don't use the data in `startxref` in linearized files
anyway, so it's fine to not read it there either.)

Reduces number of crashes on 300 random PDFs from the web (the first 300
from 0000.zip from
https://pdfa.org/new-large-scale-pdf-corpus-now-publicly-available/)
from 25 (8%) to 23 (7%).

2023-10-24 13:32:01 -04:00

33 KiB
