I'm trying to do the following:
1st step:
convert img.jpg img.pdf
2nd step:
pdfimages -j img.pdf img1
Comparing the source and extracted images in a hex viewer shows differences. How can I do this conversion without data loss?
One way is to use pdflatex instead of convert.
You need an extra file, here called image.tex:
\documentclass{article}
\usepackage[active,tightpage]{preview}
\usepackage{graphicx}
\PreviewMacro[{*[][]{}}]{\includegraphics}
\begin{document}
\includegraphics{img.jpg}
\end{document}
Then run pdflatex image.tex to generate image.pdf.
Are you sure the PDF contains the entire JPEG file, metadata as well as picture data, in JFIF/JPEG format? If not, then even if the image data is extracted verbatim, pdfimages has to reconstruct the container, and the result may not match the original.
You can get a similar situation with audio files and their tags: you can't compare checksums of whole files if the metadata has changed.
In that situation, you need to calculate hashes over just the data portion rather than the whole file.
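As a rough illustration of hashing only the data portion, here is a minimal Python sketch. The function name jpeg_data_hash is made up for this example, and "data portion" is assumed to mean everything except the APPn (JFIF/Exif) and COM (comment) segments. It also assumes a well-formed file where every segment before the first SOS marker carries a length field, and it makes no attempt to trim stray bytes after the end of the image.

```python
import hashlib
import struct


def jpeg_data_hash(data: bytes) -> str:
    """SHA-256 over a JPEG's coding tables and scan data, skipping APPn/COM.

    Hypothetical helper: assumes every segment before the first SOS marker
    has a two-byte big-endian length field, as in ordinary JPEG files.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    digest = hashlib.sha256()
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:
            # Fill byte in front of a marker: step over it.
            pos += 1
            continue
        if marker == 0xDA:
            # Start of scan: hash the rest of the file (scan data up to EOI).
            digest.update(data[pos:])
            return digest.hexdigest()
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segment = data[pos:pos + 2 + length]
        # APP0..APP15 and COM hold metadata; everything else affects decoding.
        is_metadata = 0xE0 <= marker <= 0xEF or marker == 0xFE
        if not is_metadata:
            digest.update(segment)
        pos += 2 + length
    raise ValueError("no SOS marker found")
```

To use it, read both files as bytes (for example with open("img.jpg", "rb").read()) and compare the two hex digests. If they match while the whole-file checksums differ, only the container or metadata changed, not the compressed image data.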