contego / home / tvault / .virtenv / lib / python2.7 / site-packages / chardet / universaldetector.pyc
Module containing the UniversalDetector detector class, which is the primary
class a user of ``chardet`` should use.

:author: Mark Pilgrim (initial port to Python)
:author: Shy Shalom (original C code)
:author: Dan Blanchard (major refactoring for 3.0)
:author: Ian Cordasco

Names imported by the module: CharSetGroupProber, InputState, LanguageFilter, ProbingState, EscCharSetProber, Latin1Prober, MBCSGroupProber, SBCSGroupProber.

class UniversalDetector:

    The ``UniversalDetector`` class underlies the ``chardet.detect`` function
    and coordinates all of the different charset probers.

    To get a ``dict`` containing an encoding and its confidence, you can simply
    run:

    .. code::

            u = UniversalDetector()
            u.feed(some_bytes)
            u.close()
            detected = u.result
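
The docstring's example feeds all bytes at once. For input that arrives in pieces, the same calls can be wrapped in a loop that stops early once ``done`` is set. This is a minimal sketch: ``detect_chunks`` is a hypothetical helper name, not part of chardet, and it assumes only the ``reset``, ``feed``, ``done`` and ``close`` members described in this file.

```python
def detect_chunks(detector, chunks):
    """Feed byte chunks to a detector until it is done, then close it.

    `detector` is assumed to expose reset(), feed(bytes), a boolean
    `done` attribute, and close() returning the result dict.
    """
    detector.reset()              # start from a clean state for this document
    for chunk in chunks:
        detector.feed(chunk)
        if detector.done:         # a prediction was made; stop feeding
            break
    return detector.close()       # close() returns the result dict
```

With chardet installed, passing ``UniversalDetector()`` as ``detector`` and an iterable of ``bytes`` objects as ``chunks`` should return the same kind of ``dict`` as ``u.result`` above.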

    Class constants:

        MINIMUM_THRESHOLD  = 0.2
        HIGH_BYTE_DETECTOR = re.compile(b'[\x80-\xFF]')
        ESC_DETECTOR       = re.compile(b'(\033|~{)')
        WIN_BYTE_DETECTOR  = re.compile(b'[\x80-\x9F]')
        ISO_WIN_MAP:
            iso-8859-1   ->  Windows-1252
            iso-8859-2   ->  Windows-1250
            iso-8859-5   ->  Windows-1251
            iso-8859-6   ->  Windows-1256
            iso-8859-7   ->  Windows-1253
            iso-8859-8   ->  Windows-1255
            iso-8859-9   ->  Windows-1254
            iso-8859-13  ->  Windows-1257

    __init__(self, lang_filter): sets ``_esc_charset_prober``, ``result``, ``done``, ``_got_data``, ``_input_state``, ``_last_char`` and ``_has_win_bytes`` to ``None``, sets ``_charset_probers`` to an empty list, stores ``lang_filter``, creates ``logger`` with ``logging.getLogger(__name__)``, and then calls ``reset()``.

    reset(self):

        Reset the UniversalDetector and all of its probers back to their
        initial states.  This is called by ``__init__``, so you only need to
        call this directly in between analyses of different documents.

    ``reset`` sets ``result`` to a dict with ``encoding`` = ``None``, ``confidence`` = 0.0 and ``language`` = ``None``; sets ``done``, ``_got_data`` and ``_has_win_bytes`` to ``False``; sets ``_input_state`` to ``InputState.PURE_ASCII``; empties ``_last_char``; and calls ``reset()`` on the escape prober (if one exists) and on every prober in ``_charset_probers``.

    feed(self, byte_str):

        Takes a chunk of a document and feeds it through all of the relevant
        charset probers.

        After calling ``feed``, you can check the value of the ``done``
        attribute to see if you need to continue feeding the
        ``UniversalDetector`` more data, or if it has made a prediction
        (in the ``result`` attribute).

        .. note::
           You should always call ``close`` when you're done feeding in your
           document if ``done`` is not already ``True``.
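
One case in which ``done`` can become ``True`` after the very first ``feed`` call is a leading byte-order mark: the encoding names UTF-8-SIG, UTF-32, X-ISO-10646-UCS-4-3412, X-ISO-10646-UCS-4-2143 and UTF-16 appear among this method's constants, together with the ``codecs`` BOM names. The check can be sketched as below. ``sniff_bom`` is a hypothetical standalone helper, not a chardet function, and the two raw four-byte sequences are an assumption based on upstream chardet, since the null bytes are not visible in this dump.

```python
import codecs


def sniff_bom(byte_str):
    """Return the encoding implied by a leading BOM, or None if there is none.

    Order matters: the UTF-32 and UCS-4 marks begin with the same bytes as
    the UTF-16 marks, so the longer marks are tested first.
    """
    if byte_str.startswith(codecs.BOM_UTF8):
        return "UTF-8-SIG"
    if byte_str.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "UTF-32"
    if byte_str.startswith(b"\xFE\xFF\x00\x00"):   # assumed byte sequence
        return "X-ISO-10646-UCS-4-3412"
    if byte_str.startswith(b"\x00\x00\xFF\xFE"):   # assumed byte sequence
        return "X-ISO-10646-UCS-4-2143"
    if byte_str.startswith((codecs.BOM_LE, codecs.BOM_BE)):
        return "UTF-16"
    return None
```

In the detector itself a BOM hit is reported with confidence 1.0, which is why no prober needs to run in that case.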
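
    ``feed`` returns immediately if ``done`` is already set or ``byte_str`` is empty, and converts its argument to a ``bytearray`` if necessary. On the first chunk it checks for a byte-order mark and, if one is found, sets ``result`` with confidence 1.0 and sets ``done`` to ``True``:

        codecs.BOM_UTF8                     ->  UTF-8-SIG
        codecs.BOM_UTF32_LE, BOM_UTF32_BE   ->  UTF-32
        FE FF 00 00                         ->  X-ISO-10646-UCS-4-3412
        00 00 FF FE                         ->  X-ISO-10646-UCS-4-2143
        codecs.BOM_LE, codecs.BOM_BE        ->  UTF-16

    Otherwise, while the input state is ``PURE_ASCII``, a match of ``HIGH_BYTE_DETECTOR`` switches it to ``HIGH_BYTE``, and a match of ``ESC_DETECTOR`` against ``_last_char`` plus the new chunk switches it to ``ESC_ASCII``. In ``ESC_ASCII`` the chunk goes to an ``EscCharSetProber``; in ``HIGH_BYTE`` it goes to an ``MBCSGroupProber``, an ``SBCSGroupProber`` (only when ``lang_filter`` includes ``LanguageFilter.NON_CJK``) and a ``Latin1Prober``. The first prober to report ``ProbingState.FOUND_IT`` supplies ``result`` (its ``charset_name``, ``get_confidence()`` and ``language``) and sets ``done``. A match of ``WIN_BYTE_DETECTOR`` sets ``_has_win_bytes``.

    close(self):
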
        Stop analyzing the current document and come up with a final
        prediction.

        :returns:  The ``result`` attribute, a ``dict`` with the keys
                   `encoding`, `confidence`, and `language`.
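
The ``encoding`` value returned by ``close`` may be a Windows code page name where a prober reported an ISO-8859 name: this file's constants pair eight ISO-8859 names with Windows-125x names, and the method's names include ``lower``, ``startswith``, ``_has_win_bytes`` and ``ISO_WIN_MAP``. A sketch of that remapping follows. ``remap_iso_to_windows`` is a hypothetical helper, and the condition that bytes in the 0x80-0x9F range were seen is an assumption based on upstream chardet.

```python
# Pairs as they appear in this file's constants.
ISO_WIN_MAP = {
    "iso-8859-1": "Windows-1252",
    "iso-8859-2": "Windows-1250",
    "iso-8859-5": "Windows-1251",
    "iso-8859-6": "Windows-1256",
    "iso-8859-7": "Windows-1253",
    "iso-8859-8": "Windows-1255",
    "iso-8859-9": "Windows-1254",
    "iso-8859-13": "Windows-1257",
}


def remap_iso_to_windows(charset_name, has_win_bytes):
    """Swap an ISO-8859 name for its Windows counterpart when warranted.

    Assumption: the swap applies only if bytes in 0x80-0x9F were seen,
    since those are control codes in ISO-8859 but printable in Windows-125x.
    Names without a mapping are returned unchanged.
    """
    lowered = charset_name.lower()
    if has_win_bytes and lowered.startswith("iso-8859"):
        return ISO_WIN_MAP.get(lowered, charset_name)
    return charset_name
```

The fallback to the original name keeps ISO-8859 variants that have no entry in the map, such as iso-8859-3, untouched.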

    ``close`` returns ``result`` immediately if ``done`` is already set; otherwise it sets ``done`` to ``True`` and chooses a final answer. If no data was received, it logs "no data received!" at debug level. If the input state is ``PURE_ASCII``, the result is ``ascii`` with confidence 1.0 and an empty language. If the input state is ``HIGH_BYTE``, it takes the prober with the highest confidence and uses it only if that confidence exceeds ``MINIMUM_THRESHOLD``; when the lowercased charset name starts with ``iso-8859`` and ``_has_win_bytes`` is set, the name is replaced through ``ISO_WIN_MAP``.

    If the logger is at ``DEBUG`` level and no encoding was chosen, ``close`` logs "no probers hit minimum threshold" followed by one "%s %s confidence = %s" line per prober (charset name, language, confidence), descending into the probers of any ``CharSetGroupProber``. It then returns ``result``.

Module-level imports: ``codecs``, ``logging`` and ``re``, plus the sibling modules ``charsetgroupprober``, ``enums``, ``escprober``, ``latin1prober``, ``mbcsgroupprober`` and ``sbcsgroupprober``.