Markdown

This section contains the reference for the implementation of translate-md's MarkdownProcessor and helper functions.

MarkdownProcessor ¶

MarkdownProcessor(markdown_content: str) -> None

Class for working with a markdown file and extracting the text content to be translated.

The expected format of the markdown file is the one used in Hugo for blogging.

Parameters:

Name	Type	Description	Default
`markdown_content`	`str`	The content of a markdown file as a string.	required

Notes

See the Hugo (gohugo) documentation for the expected type of markdown files.

tokens `property` ¶

tokens: list[Token]

Parsed pieces of the markdown file. The content is extracted from these pieces, updated, and then rebuilt from them.

get_pieces ¶

get_pieces() -> list[str]

Gets the pieces of the markdown file to be translated.

The relevant pieces are the tokens of type 'inline' that are not front matter, a figure, code, or a markdown comment.

Internally stores the position of the corresponding tokens for later use.
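The selection described above can be sketched as follows. This is a minimal illustration and not the library's implementation: `ToyToken`, `select_pieces` and the `skip_checks` argument are hypothetical names, and the single skip check used in the example is a simplification of the front matter, figure, code and comment checks.

```python
from dataclasses import dataclass


@dataclass
class ToyToken:
    """Hypothetical stand-in for a parser token; not the library's Token class."""

    type: str
    content: str


def select_pieces(tokens, skip_checks):
    """Return the translatable texts and the positions of their tokens."""
    pieces, positions = [], []
    for position, token in enumerate(tokens):
        if token.type != "inline":
            continue  # only inline tokens carry translatable text
        if any(check(token.content) for check in skip_checks):
            continue  # skipped content is left untouched
        pieces.append(token.content)
        positions.append(position)  # remembered so texts can be written back later
    return pieces, positions


tokens = [
    ToyToken("paragraph_open", ""),
    ToyToken("inline", "Hello world"),
    ToyToken("paragraph_close", ""),
    ToyToken("inline", "<!-- draft note -->"),
    ToyToken("inline", "Second paragraph"),
]
# One hypothetical skip check: treat HTML comments as non-translatable.
pieces, positions = select_pieces(tokens, [lambda text: text.startswith("<!--")])
print(pieces)     # ['Hello world', 'Second paragraph']
print(positions)  # [1, 4]
```

Keeping the positions alongside the texts is what allows a later step to put each translated text back in the right token.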

render ¶

render() -> str

Get the new markdown content, with the paragraphs translated, as a string.

update ¶

update(texts: list[str]) -> None

Update the content with the translated pieces.

Parameters:

Name	Type	Description	Default
`texts`	`list[str]`	List of texts to insert back to the	required

Raises:

Type	Description
`ValueError`	If the number of texts to update doesn't match the number of texts obtained from the get_pieces method.
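The length check behind this `ValueError` could look roughly like the sketch below. It assumes the token contents are held in a plain list and that the positions were recorded by an earlier get_pieces-like pass; `apply_translations` is a hypothetical helper, not part of the library.

```python
def apply_translations(contents, positions, texts):
    """Write each translated text back at its stored position."""
    if len(texts) != len(positions):
        raise ValueError(
            f"Expected {len(positions)} texts, got {len(texts)}."
        )
    updated = list(contents)  # copy so the caller's list is not mutated
    for position, text in zip(positions, texts):
        updated[position] = text
    return updated


contents = ["", "Hello world", "", "Second paragraph"]
positions = [1, 3]  # positions of the translatable entries
result = apply_translations(contents, positions, ["Hola mundo", "Otro texto"])
print(result)  # ['', 'Hola mundo', '', 'Otro texto']
```

Failing early on a count mismatch avoids silently pairing a translation with the wrong paragraph.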

write_to ¶

write_to(filename: Path) -> None

Write the content of the updated markdown to disk.

Parameters:

Name	Type	Description	Default
`filename`	`Path`	Name of the new file.	required

read_file ¶

read_file(filename: Path) -> str

Helper function that reads a whole markdown file into a string.
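A helper of this kind can be written with `pathlib` alone. The sketch below is an assumption about how such a function might look, not the library's code; the name `read_markdown` and the UTF-8 encoding are choices made for the example.

```python
import tempfile
from pathlib import Path


def read_markdown(filename: Path) -> str:
    """Read the whole file into a string (UTF-8 is assumed here)."""
    return filename.read_text(encoding="utf-8")


# Write a small throwaway file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    post = Path(tmp) / "post.md"
    post.write_text("# Title\n\nSome text.\n", encoding="utf-8")
    content = read_markdown(post)

print(content)
```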

is_front_matter ¶

is_front_matter(text: str) -> bool

Check if a token belongs to the front matter.

The check looks for a string that starts with '---' followed, after a single newline, by the word 'title' (it fails if any space is inserted between them), and that ends with '---'.

Parameters:

Name	Type	Description	Default
`text`	`str`	Text obtained from the Token's content. Expected to be applied to the tokens of a parsed markdown file.	required

Returns:

Type	Description
`bool`	True if the text is front matter.
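Read literally, the rule described for this check amounts to two string tests. The function below is a sketch of that reading under a hypothetical name; the actual implementation may differ in details such as how trailing whitespace is handled.

```python
def looks_like_front_matter(text: str) -> bool:
    """Sketch of the described rule: '---', a newline, 'title', and a closing '---'."""
    return text.startswith("---\ntitle") and text.endswith("---")


front_matter = "---\ntitle: My post\ndate: 2023-01-01\n---"
print(looks_like_front_matter(front_matter))            # True
print(looks_like_front_matter("--- \ntitle: x\n---"))   # False: space after '---'
print(looks_like_front_matter("Just a paragraph."))     # False
```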

is_figure ¶

is_figure(text: str) -> bool

Check if a paragraph is just a picture in the document.

Some lines may contain just a picture, and there is no reason to translate those, e.g. ' helpner '. The check is not perfect; it just fits my needs.

Parameters:

Name	Type	Description	Default
`text`	`str`	Text obtained from the Token's content.	required

Returns:

Type	Description
`bool`	True if the text is just a picture.
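One way such a check could work is a regular expression that matches text consisting of a single Markdown image and nothing else. This is only an assumption about the approach: the pattern, the name `looks_like_figure` and the example file name are made up for illustration, and Hugo figure shortcodes are not covered.

```python
import re

# Assumption: a "figure" is text holding exactly one Markdown image, ![alt](path).
IMAGE_ONLY = re.compile(r"!\[[^\]]*\]\([^)]*\)")


def looks_like_figure(text: str) -> bool:
    """Return True when the text is a single image and nothing else."""
    return IMAGE_ONLY.fullmatch(text.strip()) is not None


print(looks_like_figure("![diagram](diagram.png)"))            # True
print(looks_like_figure("See ![diagram](diagram.png) above"))  # False: has prose too
print(looks_like_figure("Plain paragraph"))                    # False
```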

is_code ¶

is_code(text: str) -> bool

Check if a blob of text is a chunk of code.

Parameters:

Name	Type	Description	Default
`text`	`str`	Text obtained from the Token's content.	required

Returns:

Type	Description
`bool`	True if the text is a chunk of code.

is_comment ¶

is_comment(text: str) -> bool
