
wiperf / usr / local / lib / python3.7 / dist-packages / chardet / __pycache__ / charsetprober.cpython-37.pyc

ŒYu‘s¶œÈã@s0ddlZddlZddlmZGdd„deƒZdS)éNé)ÚProbingStatec@sneZdZdZddd„Zdd„Zedd„ƒZd	d
„Zedd„ƒZ	d
d„Z
edd„ƒZedd„ƒZ
edd„ƒZdS)Ú
CharSetProbergffffffî?NcCsd|_||_t t¡|_dS)N)Ú_stateÚlang_filterÚloggingÚ	getLoggerÚ__name__Úlogger)Úselfr©rú:/tmp/pip-install-fdhvs41_/chardet/chardet/charsetprober.pyÚ__init__'szCharSetProber.__init__cCstj|_dS)N)rÚ	DETECTINGr)rrrr
Úreset,szCharSetProber.resetcCsdS)Nr)rrrr
Úcharset_name/szCharSetProber.charset_namecCsdS)Nr)rÚbufrrr
Úfeed3szCharSetProber.feedcCs|jS)N)r)rrrr
Ústate6szCharSetProber.statecCsdS)Ngr)rrrr
Úget_confidence:szCharSetProber.get_confidencecCst dd|¡}|S)Ns([-])+ó )ÚreÚsub)rrrr
Úfilter_high_byte_only=sz#CharSetProber.filter_high_byte_onlycCs`tƒ}t d|¡}xH|D]@}| |dd…¡|dd…}| ¡sN|dkrNd}| |¡qW|S)u9
        We define three types of bytes:
        alphabet: English letters [a-zA-Z]
        international: international characters [\x80-\xFF]
        marker: everything else [^a-zA-Z\x80-\xFF]

        The input buffer can be thought of as a series of words delimited
        by markers. This function keeps only the words that contain at
        least one international character. Each contiguous sequence of
        markers is replaced by a single ASCII space character.

        This filter applies to all scripts which do not use English characters.
        s%[a-zA-Z]*[€-ÿ]+[a-zA-Z]*[^a-zA-Z€-ÿ]?Néÿÿÿÿó€r)Ú	bytearrayrÚfindallÚextendÚisalpha)rÚfilteredÚwordsÚwordÚ	last_charrrr
Úfilter_international_wordsBs
z(CharSetProber.filter_international_wordscCs¨tƒ}d}d}x~tt|ƒƒD]n}|||d…}|dkr>d}n|dkrJd}|dkr| ¡s||kr‚|s‚| |||…¡| d¡|d}qW|s¤| ||d	…¡|S)
aÈ
        Returns a copy of ``buf`` that retains only the sequences of English
        alphabet and high byte characters that are not between <> characters.
        Also retains English alphabet and high byte characters immediately
        before occurrences of >.

        This filter can be applied to all scripts which contain both English
        characters and extended ASCII characters, but is currently only used by
        ``Latin1Prober``.
        Frró>ó<TrrN)rÚrangeÚlenrr)rr Úin_tagÚprevÚcurrÚbuf_charrrr
Úfilter_with_english_lettersgs"
z)CharSetProber.filter_with_english_letters)N)r	Ú
__module__Ú__qualname__ÚSHORTCUT_THRESHOLDrrÚpropertyrrrrÚstaticmethodrr$r-rrrr
r#s
%r)rrÚenumsrÚobjectrrrrr
Ú<module>s
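
The two filters described in the docstrings above can be sketched in plain Python. This is a reconstruction under stated assumptions, not decompiled source: the regular expression, the local names (`filtered`, `words`, `last_char`, `in_tag`, `prev`, `curr`, `buf_char`) and the calls (`bytearray`, `findall`, `extend`, `isalpha`) are the ones visible in the dump, while the control flow is inferred from the docstrings and may differ in detail from the shipped module. The functions are written standalone here rather than as `CharSetProber` static methods.

```python
import re

# Assumed pattern: optional ASCII letters, at least one high byte
# (0x80-0xFF), optional ASCII letters, then at most one marker byte.
_INTERNATIONAL_WORD = re.compile(
    rb"[a-zA-Z]*[\x80-\xFF]+[a-zA-Z]*[^a-zA-Z\x80-\xFF]?"
)


def filter_international_words(buf: bytes) -> bytearray:
    """Keep only words containing a high byte; markers become spaces."""
    filtered = bytearray()
    for word in _INTERNATIONAL_WORD.findall(buf):
        # Everything except the final byte is part of the word itself.
        filtered.extend(word[:-1])
        # The final byte is either part of the word or a trailing marker.
        last_char = word[-1:]
        if not last_char.isalpha() and last_char < b"\x80":
            last_char = b" "
        filtered.extend(last_char)
    return filtered


def filter_with_english_letters(buf: bytes) -> bytearray:
    """Keep runs of ASCII letters and high bytes that end outside <...>."""
    filtered = bytearray()
    in_tag = False
    prev = 0  # start index of the current run of letters / high bytes
    for curr in range(len(buf)):
        buf_char = buf[curr:curr + 1]  # slice so we get bytes, not int
        if buf_char == b">":
            in_tag = False
        elif buf_char == b"<":
            in_tag = True
        # A marker byte (ASCII, not a letter) ends the current run.
        if buf_char < b"\x80" and not buf_char.isalpha():
            if curr > prev and not in_tag:
                filtered.extend(buf[prev:curr])
                filtered.extend(b" ")
            prev = curr + 1
    # Flush a trailing run if the buffer does not end inside a tag.
    if not in_tag:
        filtered.extend(buf[prev:])
    return filtered
```

With this sketch, `filter_international_words(b"abc caf\xe9, xyz")` yields `bytearray(b"caf\xe9 ")`: the pure-ASCII words are dropped and the comma becomes a space. `filter_with_english_letters(b"<b>caf\xe9 au lait")` yields `bytearray(b"b caf\xe9 au lait")`, which shows the behaviour the second docstring mentions: letters immediately before `>` are retained because the tag flag is cleared before the run is flushed.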