Moldable Emacs: capturing text from open images with an OCR mold
Too long; didn't read
Extract text from images with a mold! A mold is available that uses imgclip to transform an image into the text it contains.
The problem
I have been reading The Humane Interface by Jef Raskin. Tudor Girba mentioned this book in one of his talks about GT. It is an inspiring book, highly recommended! In explaining the ideal user interface, the author takes as an example the extraction of text from images. He basically says that a good interface would be contextual: it would act according to the object at hand. For example, if you have an image with text at your pointer, it should be easy for you to get its text.
Now, how difficult would it be to make a mold for this?
And there is a solution
Not difficult at all! First I looked for some OCR library that can extract text reliably. A result that seemed good is imgclip, which you can install with npm install -g imgclip.
Now say that I take a picture of this text while I am writing it and I open it in Emacs. When I call me/mold on the buffer with the image, this is the new mold I find!
And this shows the result I get from extracting the text.
The text is imperfect! There are a lot of wrong words: for example "I" gets translated as "1" (the number). Still, it is cool to get text out of an (unsearchable) image!
Also it was simple to implement the mold.
(me/register-mold
 :key "Image To Text"
 :docs "Extracts text from the image using `imgclip'."
 ;; Offer the mold only for image buffers, and only if imgclip is installed.
 :given (lambda ()
          (and (eq major-mode 'image-mode)
               (executable-find "imgclip")))
 :then (lambda ()
         (let* ((buffername (buffer-name))
                (self nil) ;; TODO what here?
                (buffer (get-buffer-create (format "Text from %s" buffername)))
                ;; Run imgclip asynchronously on the image file.
                (_ (async-map
                    `(lambda (s)
                       (shell-command-to-string
                        (format "imgclip -p '%s' --lang eng" s)))
                    ;; Use the visited file, or dump the buffer to /tmp.
                    (list (or (buffer-file-name)
                              (let ((path (concat "/tmp/" buffername)))
                                (write-region (point-min) (point-max) path)
                                path)))
                    ;; When done, replace the placeholder with the clipboard text.
                    `(lambda ()
                       (with-current-buffer ,buffer
                         (erase-buffer)
                         (clipboard-yank))))))
           ;; Placeholder shown while the extraction is running.
           (with-current-buffer buffer
             (erase-buffer)
             (insert "Loading text from image..."))
           buffer)))
This mold is offered only if the buffer shows an image and imgclip is available. When you run it, it extracts the text via imgclip and displays "Loading text from image..." while it is busy. Notice that I instructed imgclip to recognize English (--lang eng): you can redefine the mold for the language you need.
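One way to avoid editing the mold for every language is to keep the language in a variable. This is only a minimal sketch: my/imgclip-lang and my/imgclip-command are hypothetical names that are not part of moldable-emacs, and only the -p and --lang flags come from the mold above. It also swaps the single quotes around the path for shell-quote-argument, which copes better with file names containing quotes or spaces.

```elisp
;; Hypothetical helpers: neither name exists in moldable-emacs.
(defvar my/imgclip-lang "eng"
  "Language code passed to imgclip via --lang.
Only \"eng\" appears in the original mold; other codes are an assumption.")

(defun my/imgclip-command (path &optional lang)
  "Return the shell command that runs imgclip on the image at PATH.
LANG defaults to `my/imgclip-lang'."
  (format "imgclip -p %s --lang %s"
          ;; Quote both arguments so the shell sees them as single words.
          (shell-quote-argument path)
          (shell-quote-argument (or lang my/imgclip-lang))))

;; Example: build the command for an image, overriding the language.
;; "ita" is an assumed language code, not one taken from the post.
(my/imgclip-command "/tmp/photo.png" "ita")
```

The resulting string would take the place of the format call in the mold. How to get it into the backquoted lambda depends on how async-map evaluates its function, so it is probably safest to build the string first and splice it in with a comma.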
And now that I think about it, a natural extension of this mold is when you open a PDF with text you cannot select. It should be easy to extract an image of the page you are viewing and compose this mold on that. The UNIX saying is just true: worse is better!
Edit: I actually tried that! If you use pdf-tools this is just too easy. You want to call the interactive function pdf-view-extract-region-image. This generates a png view of the current PDF page. Then you can call our OCR mold!
Conclusion
The mold is in my package moldable-emacs: grab it and try it out (after installing imgclip)! The installation is already a bit easier because a nice user ("Tekakutli") started trying it out, which inspired me to put at least some effort into making my extension accessible to others.
So extract (imperfect) text from images with a mold if you wish!
Happy texting!