File: //usr/lib/python3/dist-packages/ocrmypdf/__pycache__/_sync.cpython-38.pyc
U
��Z^$: � @ s� d dl Z d dlZ d dlZd dlZd dlZd dlZd dlZd dlmZ d dl m
Z
d dlmZ d dl
Z
d dlmZ ddlmZ ddlmZmZmZ dd lmZmZmZmZmZmZmZmZmZmZm Z m!Z!m"Z"m#Z#m$Z$m%Z%m&Z&m'Z'm(Z(m)Z)m*Z*m+Z+m,Z, dd
l-m.Z.m/Z/m0Z0 ddl1m2Z2m3Z3 ddl4m5Z5 dd
l6m7Z7 ddl8m9Z9 edd�Z:dd� Z;dd� Z<dd� Z=dd� Z>dd� Z?dd� Z@dd� ZAG dd � d eB�ZCd!d"� ZDd)d$d%�ZEd*d'd(�ZFdS )+� N)�
namedtuple)�Path)�mkdtemp)�tqdm� )�
OcrGrafter)�
PDFContext�cleanup_working_files�make_logger)�convert_to_pdfa�
copy_final�create_ocr_image�create_pdf_page_from_image�create_visible_page_jpg�generate_postscript_stub�get_orientation_correction�get_pdfinfo�is_ocr_required�merge_sidecars�metadata_fixup�ocr_tesseract_hocr�ocr_tesseract_textonly_pdf�optimize_pdf�preprocess_clean�preprocess_deskew�preprocess_remove_background� rasterize�rasterize_preview�render_hocr_page�!should_visible_page_image_use_jpg�triage�validate_pdfinfo_options)�check_requested_output_file�create_input_file�report_output_file_size)�ExitCode�ExitCodeException)�qpdf)�available_cpu_count)�file_claims_pdfa�
PageResultz>pageno, pdf_page_from_image, ocr, text, orientation_correctionc C s. |rt || �}|rt|| �}|r*t|| �}|S �N)r r r )�page_contextZimage�remove_background�deskew�clean� r0 �0/usr/lib/python3/dist-packages/ocrmypdf/_sync.py�
preprocessJ s
r2 c C s` | j }d}d }d }d }t| ��rL|jr<t| j| �}t|| �}t| j| |dd�}t|j|j |j
g�s~t| ||j|j
dd� }} nV|js�t| ||j|j
|j d�} |j
r�t| j| |ddd�}
n|}
t| |
|j|j
|jd�}t|| �}d }|j�s| }t| j��rt|| �}t|| �}|jdk�r2t|| �\}
}t|
| �}|jd k�rLt|| �\}}t| j||||d
�S )Nr F)�
correction�remove_vectors)r/ TZ_ocr)r3 r4 Z
output_tagZhocrZsandwich)�pagenoZpdf_page_from_imageZocr�text�orientation_correction)�optionsr Zrotate_pagesr �originr r �anyr/ Zclean_finalr4 r2 r- r. Zlossless_reconstructionr
r Zpageinfor r Zpdf_rendererr r r r* r5 )r, r8 r7 Zpdf_page_from_image_outZocr_outZtext_outZrasterize_preview_outZ
rasterize_outZ ocr_imageZpreprocess_outZrasterize_ocr_outZ
ocr_image_outZvisible_image_outZhocr_outr0 r0 r1 �exec_page_syncT s�
������
� �
��r; c C s: | }|j j�d�r&t|�}t|||�}t||�}t||�S )N�pdfa)r8 �output_type�
startswithr r r r )Zpdf_file�contextZpdf_outZps_stub_outr0 r0 r1 �post_process� s
r@ c C s@ t � t jt j� tj�| �}t�� }g |_|�|� |tj _
dS )z Initialize a process pool workerN)�signal�SIGINT�SIG_IGN�loggingZhandlersZQueueHandler� getLogger�
addHandler�PIL�Image�MAX_IMAGE_PIXELS)�queue�
max_pixels�h�rootr0 r0 r1 �worker_init� s
rN c C s |t j_d S r+ )rG rH rI )Z_queuerK r0 r0 r1 �worker_thread_init� s rO c C sn z.| � � }|dkrW qjt�|j�}|�|� W q tk
rf ddl}tdtj d� |j
tj d� Y q X q dS )a� Listen to the worker processes and forward the messages to logging
For simplicity this is a thread rather than a process. Only one process
should actually write to sys.stderr or whatever we're using, so if this is
made into a process the main application needs to be directed to it.
See https://docs.python.org/3/howto/logging-cookbook.html#logging-to-a-single-file-from-multiple-processes
Nr zLogging problem)�file)�getrD rE �nameZhandle� Exception� traceback�print�sys�stderr� print_exc)rJ �recordZloggerrT r0 r0 r1 �log_listener� s rZ c
C sN t t| j�| jj�}|dkr*| j�d|� t d| jj| �}| jjdkrVtj �
� | j_| jj�dt|�� |dkr�| j�d|� | jj
r�ddlm} t}n
tj}t}dgt| j� }t| �}t�d �}tjt|fd
�}|�� tdt| j� dd
d| jj d���} ||||tjjfd�}
z�zh|
�!t"| �#� �}z2|�$� }|j%||j&<