File: //lib/python3/dist-packages/bs4/__pycache__/diagnose.cpython-38.pyc
U

t�^)�@sdZdZddlZddlmZddlmZddlZddlmZm	Z	ddl
mZddlZddl
Z
ddlZddlZddlZddlZddlZddlZdd	�Zd#dd�ZGd
d�de�Zdd�ZdZdZd$dd�Zd%dd�Zd&dd�Zd'dd�Zd(d d!�Zed"k�reej� ��dS))z=Diagnostic functions, mainly for use when doing tech support.ZMIT�N)�StringIO)�
HTMLParser)�
BeautifulSoup�__version__)�builder_registrycCsJtdt�tdtj�dddg}|D]4}tjD]}||jkr2q(q2|�|�td|�q(d|kr�|�d�z*dd	l	m
}td
d�tt
|j���Wn*tk
r�}ztd�W5d
}~XYnXd|k�rzdd
l}td|j�Wn,tk
�r}ztd�W5d
}~XYnXt|d��r.|��}n�|�d��sF|�d��r^td|�td�d
Sz:tj�|��r�td|�t|��}|��}W5QRXWntk
�r�YnXt�|D]�}td|�d}	zt||d�}
d}	Wn8tk
�r}ztd|�t��W5d
}~XYnX|	�r:td|�t|
���td��q�d
S)z�Diagnostic suite for isolating common problems.

    :param data: A string containing markup that needs to be explained.
    :return: None; diagnostics are printed to standard output.
    z'Diagnostic running on Beautiful Soup %szPython version %s�html.parser�html5lib�lxmlz;I noticed that %s is not installed. Installing it may help.zlxml-xmlr��etreezFound lxml version %s�.z.lxml is not installed or couldn't be imported.NzFound html5lib version %sz2html5lib is not installed or couldn't be imported.�readzhttp:zhttps:z<"%s" looks like a URL. Beautiful Soup is not an HTTP client.zpYou need to use some other library to get the document behind the URL, and feed that document to Beautiful Soup.z7"%s" looks like a filename. Reading data from the file.z#Trying to parse your markup with %sF)�featuresT�%s could not parse the markup.z#Here's what %s did with the markup:zP--------------------------------------------------------------------------------)�printr�sys�versionrZbuildersr�remove�appendr	r�join�map�strZLXML_VERSION�ImportErrorr�hasattrr
�
startswith�os�path�exists�open�
ValueErrorr�	Exception�	traceback�	print_excZprettify)�dataZ
basic_parsers�nameZbuilderr�er�fp�parser�success�soup�r*�./usr/lib/python3/dist-packages/bs4/diagnose.py�diagnosesr



��
�
�

r,TcKsJddlm}|jt|�fd|i|��D]\}}td||j|jf�q&dS)a�Print out the lxml events that occur during parsing.

    This lets you see how lxml parses a document when no Beautiful
    Soup code is running. You can use this to determine whether
    an lxml-specific problem is in Beautiful Soup's lxml tree builders
    or in lxml itself.

    :param data: Some markup.
    :param html: If True, markup will be parsed with lxml's HTML parser.
       if False, lxml's XML parser will be used.
    rr
�htmlz%s, %4s, %sN)r	rZ	iterparserr�tag�text)r#r-�kwargsrZevent�elementr*r*r+�
lxml_trace]s"r2c@s`eZdZdZdd�Zdd�Zdd�Zdd	�Zd
d�Zdd
�Z	dd�Z
dd�Zdd�Zdd�Z
dS)�AnnouncingParserz�Subclass of HTMLParser that announces parse events, without doing
    anything else.

    You can use this to get a picture of how html.parser sees a given
    document. The easiest way to do this is to call `htmlparser_trace`.
    cCst|�dS)N)r)�self�sr*r*r+�_puszAnnouncingParser._pcCs|�d|�dS)Nz%s START�r6)r4r$Zattrsr*r*r+�handle_starttagxsz AnnouncingParser.handle_starttagcCs|�d|�dS)Nz%s ENDr7�r4r$r*r*r+�
handle_endtag{szAnnouncingParser.handle_endtagcCs|�d|�dS)Nz%s DATAr7�r4r#r*r*r+�handle_data~szAnnouncingParser.handle_datacCs|�d|�dS)Nz
%s CHARREFr7r9r*r*r+�handle_charref�szAnnouncingParser.handle_charrefcCs|�d|�dS)Nz%s ENTITYREFr7r9r*r*r+�handle_entityref�sz!AnnouncingParser.handle_entityrefcCs|�d|�dS)Nz
%s COMMENTr7r;r*r*r+�handle_comment�szAnnouncingParser.handle_commentcCs|�d|�dS)Nz%s DECLr7r;r*r*r+�handle_decl�szAnnouncingParser.handle_declcCs|�d|�dS)Nz%s UNKNOWN-DECLr7r;r*r*r+�unknown_decl�szAnnouncingParser.unknown_declcCs|�d|�dS)Nz%s PIr7r;r*r*r+�	handle_pi�szAnnouncingParser.handle_piN)�__name__�
__module__�__qualname__�__doc__r6r8r:r<r=r>r?r@rArBr*r*r*r+r3msr3cCst�}|�|�dS)z�Print out the HTMLParser events that occur during parsing.

    This lets you see how HTMLParser parses a document when no
    Beautiful Soup code is running.

    :param data: Some markup.
    N)r3Zfeed)r#r'r*r*r+�htmlparser_trace�srGZaeiouZbcdfghjklmnpqrstvwxyz�cCs:d}t|�D](}|ddkr"t}nt}|t�|�7}q|S)z#Generate a random word-like string.��r)�range�_consonants�_vowels�random�choice)�lengthr5�i�tr*r*r+�rword�srS�cCsd�dd�t|�D��S)z'Generate a random sentence-like string.� css|]}tt�dd��VqdS)rT�	N)rSrN�randint)�.0rQr*r*r+�	<genexpr>�szrsentence.<locals>.<genexpr>)rrK)rPr*r*r+�	rsentence�srZ��cCs�dddddddg}g}t|�D]r}t�dd	�}|dkrPt�|�}|�d
|�q|dkrp|�tt�dd���q|d
krt�|�}|�d|�qdd�|�dS)z+Randomly generate an invalid HTML document.�pZdiv�spanrQ�bZscript�tabler�z<%s>�rTrJz</%s>z<html>�
z</html>)rKrNrWrOrrZr)�num_elementsZ	tag_names�elementsrQrOZtag_namer*r*r+�rdoc�s

re順c
Cs$tdt�t|�}tdt|��dddgddfD]z}d}z"t��}t||�}t��}d}Wn6tk
r�}ztd	|�t��W5d
}~XYnX|r4td|||f�q4dd
l	m
}t��}|�|�t��}td||�dd
l}	|	�
�}t��}|�|�t��}td||�d
S)z.Very basic head-to-head performance benchmark.z1Comparative parser benchmark on Beautiful Soup %sz3Generated a large invalid HTML document (%d bytes).r	r-rrFTrNz"BS4+%s parsed the markup in %.2fs.rr
z$Raw lxml parsed the markup in %.2fs.z(Raw html5lib parsed the markup in %.2fs.)rrre�len�timerr r!r"r	rZHTMLrr�parse)
rcr#r'r(�ar)r^r%rrr*r*r+�benchmark_parsers�s4


rkr	cCsXt��}|j}t|�}tt||d�}t�d|||�t�	|�}|�
d�|�dd�dS)z7Use Python's profiler on a randomly generated document.)�bs4r#r'zbs4.BeautifulSoup(data, parser)Z
cumulativez
_html5lib|bs4�2N)�tempfileZNamedTemporaryFiler$re�dictrl�cProfileZrunctx�pstatsZStatsZ
sort_statsZprint_stats)rcr'Z
filehandle�filenamer#�varsZstatsr*r*r+�profile�s

rt�__main__)T)rH)rT)r[)rf)rfr	)!rFZ__license__rp�iorZhtml.parserrrlrrZbs4.builderrrrqrNrnrhr!rr,r2r3rGrMrLrSrZrerkrtrC�stdinr
r*r*r*r+�<module>s8G
&