Extracting and Live-Translating Image Text with Python, OCR, and OpenCV

This post covers extracting text from images and translating it, using a handful of Python modules. It is mainly a record of ideas and problems I ran into in practice, along with some implementation code. The technical bar here is honestly not very high, but if you are just tinkering, it could earn you a decent placing in a casual competition, or pad out a student report…

For lack of time, much of this post serves only as a record of ideas.

Extracting image text with OCR

The Python module used here is pytesseract; for a brief download-and-setup introduction, see: Python - Text Recognition - Tesseract

The code:

import pytesseract
import cv2
image = cv2.imread('/Users/junjieliu/Desktop/1.png')
text = pytesseract.image_to_string(image)
print(text)

Here are the problems I hit along the way:

Problem 1:

Error: [Errno 2] No such file or directory using pytesser

I then consulted: https://stackoverflow.com/questions/35609773/oserror-errno-2-no-such-file-or-directory-using-pytesser

I tried the approaches from the first few answers there, which led to the following error…

Problem 2:

PermissionError: [Errno 13] Permission denied

I then consulted: https://github.com/madmaze/pytesseract/issues/62

but that still did not solve it.

Solution:

On the command line, run:

which tesseract

which revealed its location (apparently a copy was already on the Mac?):

/usr/local/bin/tesseract

After pointing pytesseract at that path (i.e. tesseract_cmd = "/usr/local/bin/tesseract"), the code did run, but that copy is awkward to keep using because it is hard to extend.

To have the freshly installed version shadow the original one on the PATH:

vi ~/.bash_profile

and add:

#tesseract
export PATH="/usr/local/Cellar/tesseract/4.0.0/bin:$PATH"

Apply it immediately:

source ~/.bash_profile

Running which tesseract again now shows the new location. After changing tesseract_cmd = "/usr/local/Cellar/tesseract/4.0.0/bin/tesseract", the program runs successfully and is open to further extension, such as choosing language packs.
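The effect of prepending a directory to PATH can be checked from Python as well. Below is a small sketch using only the standard library; resolve_after_prepend is a helper name of my own, not part of pytesseract:

```python
import os
import shutil

def resolve_after_prepend(name, extra_dir):
    """Resolve `name` the way `which` would after `extra_dir` is prepended to PATH."""
    search = extra_dir + os.pathsep + os.environ.get("PATH", "")
    return shutil.which(name, path=search)

# e.g. resolve_after_prepend("tesseract", "/usr/local/Cellar/tesseract/4.0.0/bin")
# returns the path of the binary that would now win, or None if it is not installed
```

This mirrors what the ~/.bash_profile edit does: whichever copy sits earliest on the PATH wins the lookup.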

A small image-to-text tool

I will not go into the details of the extraction step itself; here is a quick note of what could be built by combining it with other techniques:

Combine it with a PyQt5 GUI: type in the path to an image, and the extracted text appears below.

On top of that, add a crawler to translate the extracted text.

See an earlier article of mine: python3 crawler and GUI: a small dictionary tool based on Youdao Dictionary

With that, the little tool is done. I will leave it here; time is short and the implementation is not hard, so I will not elaborate.

  • For better extraction accuracy, consider the more powerful deep_ocr.
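The tool described above is really just two stages glued together. A minimal sketch of that glue, with the OCR and translation steps injected as plain callables (the names extract_and_translate, ocr, and translate are mine, not from any library):

```python
def extract_and_translate(image_path, ocr, translate):
    """Run the tool's two stages: OCR the image, then translate the text."""
    text = ocr(image_path)
    return text, translate(text)

# In the real tool, `ocr` would wrap pytesseract.image_to_string and
# `translate` would issue the Youdao request; a GUI would call this on
# the path the user types in and display both return values.
```

Keeping the two stages as parameters makes it easy to swap in deep_ocr or a different translation backend later.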

Real-time translation with OpenCV

This idea came from reading the article "Recognizing QR codes and barcodes with OpenCV and Python" and adapting it to my own needs.

Here is my modified version of that code (a few lines added and changed):

from imutils.video import VideoStream
from pyzbar import pyzbar
import datetime
import imutils
import time
import cv2

# initialize the video stream and allow the camera sensor to warm up
# (VideoStream already owns the camera, so no separate cv2.VideoCapture is needed)
vs = VideoStream(src=0).start()
time.sleep(2.0)

# open the output CSV file for writing and initialize the set of
# barcodes found thus far
csv = open("barcodes.csv", "w")
found = set()

# loop over the frames from the video stream
while True:
    # grab the frame from the threaded video stream and resize it to
    # have a maximum width of 400 pixels
    frame = vs.read()
    frame = imutils.resize(frame, width=400)
    # find the barcodes in the frame and decode each of the barcodes
    barcodes = pyzbar.decode(frame)
    # loop over the detected barcodes
    for barcode in barcodes:
        # extract the bounding box location of the barcode and draw
        # the bounding box surrounding the barcode on the image
        (x, y, w, h) = barcode.rect
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)
        # the barcode data is a bytes object, so convert it to a string
        # before drawing it on the output image
        barcodeData = barcode.data.decode("utf-8")
        barcodeType = barcode.type
        # draw the barcode data and barcode type on the image
        text = "{} ({})".format(barcodeData, barcodeType)
        cv2.putText(frame, text, (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
        # if the barcode text is not yet in our CSV file, write the
        # timestamp + barcode to disk and update the set
        if barcodeData not in found:
            csv.write("{},{}\n".format(datetime.datetime.now(), barcodeData))
            csv.flush()
            found.add(barcodeData)
    # show the output frame
    cv2.imshow("Barcode Scanner", frame)
    key = cv2.waitKey(1) & 0xFF
    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break

# close the output CSV file and do a bit of cleanup
print("[INFO] cleaning up...")
csv.close()
cv2.destroyAllWindows()
vs.stop()

Performance improved slightly, with a bit less code. The visible behavior is unchanged.
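One detail worth pulling out of the loop above is the write-each-result-once pattern: the found set guarding the CSV writes. A standalone sketch, with log_once being my own name for it:

```python
import datetime

def log_once(stream, seen, value):
    """Write value with a timestamp only the first time it is seen."""
    if value in seen:
        return False
    stream.write("{},{}\n".format(datetime.datetime.now(), value))
    seen.add(value)
    return True
```

At 30 frames per second the same barcode is decoded many times in a row; the set keeps the CSV to one line per distinct result.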

For the real-time translation effect, combine the Youdao crawler above with OpenCV. A few modifications are enough; the implementation is not too hard, and consulting the official docs and other people's work will get you there.
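The only fiddly part of the Youdao request is the salt/sign pair. Below is that scheme factored into a function, as a sketch; the client name and secret constant were lifted from the site's JS at the time of writing and may well have rotated since:

```python
import hashlib
import random
import time

def youdao_sign(word, client='fanyideskweb', secret='ebSeFb%=XZ%T[KZ)c(sy!'):
    """Build the salt and md5 sign that the translate_o endpoint expects."""
    # salt mimics the site's JS: millisecond timestamp plus a small random offset
    salt = str(int(time.time() * 1000 + random.randint(1, 10)))
    sign = hashlib.md5((client + word + salt + secret).encode('utf-8')).hexdigest()
    return salt, sign
```

Both values then go into the POST body alongside the word itself.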

A rough code sample:

# coding:utf-8
from imutils.video import VideoStream
import datetime
import imutils
import time
import cv2
import hashlib
import pytesseract
import requests
import json
import random


def youdao_translate(words):
    """Send recognized text to Youdao's web endpoint and return the translation."""
    r = str(int(time.time() * 1000 + random.randint(1, 10)))  # salt, mimicking the site's JS
    S = 'fanyideskweb'
    D = "ebSeFb%=XZ%T[KZ)c(sy!"  # constant found in the site's full JS
    o = hashlib.md5((S + words + r + D).encode('utf-8')).hexdigest()
    data = {
        'i': words,
        'from': 'AUTO',
        'to': 'AUTO',
        'smartresult': 'dict',
        'client': S,
        'salt': r,
        'sign': o,
        'doctype': 'json',
        'version': '2.1',
        'keyfrom': 'fanyi.web',
        'action': 'FY_BY_REALTIME',
        'typoResult': 'false'
    }
    url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
    # the cookies must be sent along, otherwise the API returns an error code
    header = {
        'Cookie': 'OUTFOX_SEARCH_USER_ID=432464843@10.168.8.76; _ntes_nnid=25aff2b1480f17471ca1585f6f2f4293,1512024136653; OUTFOX_SEARCH_USER_ID_NCOO=132154936.07902834; JSESSIONID=aaa3TFIg-JJJN4xEog6mw; ___rl__test__cookies=1525691300664',
        'Referer': 'http://fanyi.youdao.com/',
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36'
    }
    response = requests.post(url=url, headers=header, data=data)
    response.encoding = 'utf-8'
    return json.loads(response.text)["translateResult"][0][0]['tgt']


# initialize the video stream and allow the camera sensor to warm up
print("starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)
csv = open("barcodes.csv", "w")
found = set()

while True:
    # grab the frame from the threaded video stream and resize it to
    # have a maximum width of 400 pixels
    frame = vs.read()
    frame = imutils.resize(frame, width=400)
    # OCR the frame first, then translate whatever text was recognized
    words = pytesseract.image_to_string(frame).strip()
    if words:
        translateResult = youdao_translate(words)
        # draw the translation on the frame
        cv2.putText(frame, translateResult, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
        # if the text is not yet in our CSV file, write the
        # timestamp + text to disk and update the set
        if translateResult not in found:
            csv.write("{},{}\n".format(datetime.datetime.now(), translateResult))
            csv.flush()
            found.add(translateResult)
    # show the output frame
    cv2.imshow("Translate Discern", frame)
    key = cv2.waitKey(1) & 0xFF
    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break

# close the output CSV file and do a bit of cleanup
print("cleaning up...")
csv.close()
cv2.destroyAllWindows()
vs.stop()


Finally

A note of the message printed after installing tesseract, which may come in handy one day:

icu4c is keg-only, which means it was not symlinked into /usr/local,
because macOS provides libicucore.dylib (but nothing else).
If you need to have icu4c first in your PATH run:
echo 'export PATH="/usr/local/opt/icu4c/bin:$PATH"' >> ~/.bash_profile
echo 'export PATH="/usr/local/opt/icu4c/sbin:$PATH"' >> ~/.bash_profile
For compilers to find icu4c you may need to set:
export LDFLAGS="-L/usr/local/opt/icu4c/lib"
export CPPFLAGS="-I/usr/local/opt/icu4c/include"
For pkg-config to find icu4c you may need to set:
export PKG_CONFIG_PATH="/usr/local/opt/icu4c/lib/pkgconfig"
--------------- End of article ---------------

Author: 刘俊

Last updated: January 2, 2019, 14:01

License: please retain the original link and author when reposting.