`words`

This module contains a function to retrieve words from HTML text.

Functions:

Name	Description
`get_words`	Get words in HTML text.

`get_words(html: str, *, known_words: set[str] | None = None, min_length: int = 2, max_capital: int = 1, ignore_code: bool = True, allow_unicode: bool = True) -> list[str]`

Get words in HTML text.

Parameters:

Name	Type	Description	Default
`html`	`str`	The HTML text.	required
`known_words`	`set[str] | None`	Words to exclude from the result. Words are lowercased before the comparison, so entries should be lowercase.	`None`
`min_length`	`int`	Minimum word length; shorter words are dropped.	`2`
`max_capital`	`int`	Maximum number of capital letters in a word; words with more are dropped.	`1`
`ignore_code`	`bool`	Ignore words inside code tags.	`True`
`allow_unicode`	`bool`	Keep Unicode characters.	`True`

Returns:

Type	Description
`list[str]`	A sorted list of unique, lowercased words, with `known_words` removed.

Source code in src/mkdocs_spellcheck/words.py

def get_words(
    html: str,
    *,
    known_words: set[str] | None = None,
    min_length: int = 2,
    max_capital: int = 1,
    ignore_code: bool = True,
    allow_unicode: bool = True,
) -> list[str]:
    """Get words in HTML text.

    Parameters:
        html: The HTML text.
        known_words: Words to exclude.
        min_length: Words minimum length.
        max_capital: Maximum number of capital letters.
        ignore_code: Ignore words in code tags.
        allow_unicode: Keep unicode characters.

    Returns:
        A list of words.
    """
    known_words = known_words or set()
    keep = partial(_keep_word, min_length=min_length, max_capital=max_capital)
    filtered = filter(keep, _normalize(_strip_tags(html, ignore_code), allow_unicode).split("-"))
    words = {word.lower() for word in filtered}
    return sorted(words - known_words)
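
The listing above relies on three private helpers (`_strip_tags`, `_normalize`, `_keep_word`) that are not shown on this page. The following is a minimal, self-contained sketch of the same pipeline that uses only the standard library. The helpers `strip_tags`, `normalize`, `keep_word` and the wrapper `get_words_sketch` are hypothetical stand-ins written for illustration; the real helpers in `mkdocs_spellcheck` may behave differently (for example in how they treat digits or nested tags).

```python
import re
import unicodedata
from functools import partial
from html.parser import HTMLParser
from typing import Optional


class _TextExtractor(HTMLParser):
    """Collect text nodes, optionally skipping text inside <code> tags."""

    def __init__(self, ignore_code: bool) -> None:
        super().__init__()
        self.ignore_code = ignore_code
        self.code_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            self.code_depth += 1

    def handle_endtag(self, tag):
        if tag == "code" and self.code_depth:
            self.code_depth -= 1

    def handle_data(self, data):
        if not (self.ignore_code and self.code_depth):
            self.chunks.append(data)


def strip_tags(html: str, ignore_code: bool) -> str:
    """Hypothetical stand-in for `_strip_tags`: return the text content only."""
    parser = _TextExtractor(ignore_code)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def normalize(text: str, allow_unicode: bool) -> str:
    """Hypothetical stand-in for `_normalize`: join words with hyphens."""
    if not allow_unicode:
        # Assumption: non-ASCII characters are reduced to ASCII or dropped.
        text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Replace every run of non-word characters with a single hyphen.
    return re.sub(r"[^\w]+", "-", text).strip("-")


def keep_word(word: str, min_length: int, max_capital: int) -> bool:
    """Hypothetical stand-in for `_keep_word`: apply length and capital limits."""
    if len(word) < min_length:
        return False
    return sum(char.isupper() for char in word) <= max_capital


def get_words_sketch(
    html: str,
    *,
    known_words: Optional[set] = None,
    min_length: int = 2,
    max_capital: int = 1,
    ignore_code: bool = True,
    allow_unicode: bool = True,
) -> list:
    """Same steps as the listing above, built on the stand-in helpers."""
    known_words = known_words or set()
    keep = partial(keep_word, min_length=min_length, max_capital=max_capital)
    filtered = filter(keep, normalize(strip_tags(html, ignore_code), allow_unicode).split("-"))
    words = {word.lower() for word in filtered}
    return sorted(words - known_words)


sample = "<p>Hello wrold, see the README and <code>get_words</code>.</p>"
result = get_words_sketch(sample, known_words={"hello", "see", "the", "and"})
print(result)  # ['wrold']
```

In this sketch, `README` is dropped because it has more than one capital letter, `get_words` is skipped because it sits inside a code tag, and the remaining known words are subtracted after lowercasing, which leaves only the misspelling `wrold`.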
