Markdown
This section contains the reference for the implementation of translate-md's MarkdownProcessor
and helper functions.
MarkdownProcessor
Class for working with a markdown file, extracting the text content to be translated.
The expected format of the markdown file is the one used in Hugo for blogging.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
markdown_content | str | The content of a markdown file as a string. | required |
Notes
See gohugo for the type of markdown files expected.
tokens (property)
Parsed pieces of the markdown file. The content is extracted from these pieces, updated, and then rendered back into markdown.
get_pieces
Gets the pieces of the markdown file to be translated.
The relevant pieces are the tokens of type 'inline' that are not the front matter, a figure, code, or a markdown comment.
Internally stores the position of the corresponding tokens for later use.
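The selection described above can be sketched as follows. This is only an illustration under stated assumptions, not translate-md's actual code: `Token`, `select_pieces` and `is_comment` are hypothetical names, and the token objects are reduced to a `type` and a `content` field.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Token:
    """Hypothetical stand-in for a parsed markdown token."""

    type: str
    content: str


def select_pieces(
    tokens: list[Token],
    skip_checks: Iterable[Callable[[str], bool]],
) -> tuple[list[str], list[int]]:
    """Return the translatable texts and the positions of their tokens."""
    checks = list(skip_checks)
    pieces: list[str] = []
    positions: list[int] = []
    for position, token in enumerate(tokens):
        if token.type != "inline":
            continue  # only inline tokens are considered
        if any(check(token.content) for check in checks):
            continue  # e.g. front matter, figures, code, comments
        pieces.append(token.content)
        positions.append(position)  # remembered for the later update step
    return pieces, positions


def is_comment(text: str) -> bool:
    """Hypothetical check: treat HTML-style comments as non-translatable."""
    return text.startswith("<!--")


tokens = [
    Token("paragraph_open", ""),
    Token("inline", "Hello world"),
    Token("paragraph_close", ""),
    Token("inline", "<!-- a comment -->"),
    Token("inline", "Second paragraph"),
]
pieces, positions = select_pieces(tokens, [is_comment])
```

Keeping the positions next to the texts is what allows translated strings to be written back to the right tokens afterwards.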
render
Get a new markdown file with the paragraphs translated.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
texts | list[str] | List of texts to insert back to the | required |
update
Update the content with the translated pieces.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
texts | list[str] | List of texts to insert back to the | required |
Raises:
Type | Description |
---|---|
ValueError | If the number of texts to update doesn't match the number of texts obtained from the get_pieces method. |
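A minimal sketch of the count check behind this error, assuming the processor remembers how many pieces it handed out. `PieceStore` is a hypothetical stand-in written for this example, not the real MarkdownProcessor, and the error message is invented.

```python
class PieceStore:
    """Hypothetical stand-in holding the pieces extracted for translation."""

    def __init__(self, pieces: list[str]) -> None:
        self._pieces = list(pieces)

    def get_pieces(self) -> list[str]:
        return list(self._pieces)

    def update(self, texts: list[str]) -> None:
        expected = len(self._pieces)
        if len(texts) != expected:
            # Mirrors the documented behaviour: mismatched counts are rejected.
            raise ValueError(f"Expected {expected} texts, got {len(texts)}.")
        self._pieces = list(texts)


store = PieceStore(["Hello", "World"])
store.update(["Hola", "Mundo"])  # same count: accepted

try:
    store.update(["Hola"])  # one text missing: rejected
except ValueError as error:
    message = str(error)
```

Failing early here avoids silently writing a translation into the wrong paragraph when a translator drops or merges a piece.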
See Also
write_to
Write the content of the updated markdown to disk.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filename | Path | Name of the new file. | required |
read_file
Helper function that reads a whole markdown file into a string.
is_front_matter
Check whether a token belongs to the front matter.
The check tests whether the string starts with '---' followed by the word title after a single line break (it fails if any whitespace is inserted between them), and ends with '---'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text | str | Text obtained from the Token's content. Expected to be applied to the tokens of a parsed markdown file. | required |
Returns:
Type | Description |
---|---|
bool | bool |
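The rule described for is_front_matter can be approximated with two string checks. This is a sketch of the documented behaviour only; `looks_like_front_matter` is a hypothetical name and the library's real implementation may differ.

```python
def looks_like_front_matter(text: str) -> bool:
    """Sketch of the described check; not the library's actual code."""
    # '---', exactly one line break, then the word 'title', and a closing '---'.
    return text.startswith("---\ntitle") and text.endswith("---")


front_matter = "---\ntitle: My post\ndate: 2023-01-01\n---"
spaced = "--- \ntitle: My post\n---"  # space between '---' and the line break
paragraph = "Just a regular paragraph."

results = [
    looks_like_front_matter(front_matter),
    looks_like_front_matter(spaced),
    looks_like_front_matter(paragraph),
]
```

The `spaced` case shows the limitation mentioned above: a single stray space after the opening '---' is enough for the check to fail.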
is_figure
Check if a paragraph is just a picture in the doc.
Some lines may contain just a picture, and there is no
reason to translate those.
i.e.
''
The check is not perfect; it just fits my needs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text | str | Text obtained from the Token's content. | required |
Returns:
Name | Type | Description |
---|---|---|
bool | bool | |
is_code
Check if a blob of text is a chunk of code.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text | str | Text obtained from the Token's content. | required |
Returns:
Type | Description |
---|---|
bool | bool |