Developer Interface
Main Interface
Examples for the most relevant API functions can be found in the test file. md_toc’s API uses type hints instead of assertions to check input and output types.
Important
If you are a developer and need a quick way to generate a TOC, the function you probably want is build_toc.
- md_toc.get_atx_heading(line: str, keep_header_levels: int = 3, parser: str = 'github', no_links: bool = False) → list
Given a line, extract the link label and its type.
- Parameters
line (str) – the line to be examined. This string may include newline characters in between.
keep_header_levels (int) – the maximum level of headers to be considered as such when building the table of contents. Defaults to 3.
parser (str) – decides rules on how to generate the anchor text. Defaults to github.
no_links (bool) – disables the use of links.
- Returns
struct, a list of dictionaries with header type (int), header text trimmed (str) and visible (bool) as keys. header type and header text trimmed are set to None if the line does not contain header elements according to the rules of the selected markdown parser. visible is set to True if the line needs to be saved, False if it is just needed for duplicate counting.
- Return type
list
- Raises
GithubEmptyLinkLabel or GithubOverflowCharsLinkLabel or a built-in exception.
Note
license B applies for the github part. See docs/copyright_license.rst
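The return structure described for get_atx_heading can be illustrated with a standalone sketch. It does not call md_toc: `sketch_atx_heading` is a hypothetical, heavily simplified stand-in that handles only plain single-line ATX headings, and the `visible` values it assigns (for non-headings and for headings deeper than keep_header_levels) are assumptions.

```python
import re

# Simplified ATX heading pattern: up to 3 leading spaces, 1-6 '#' characters,
# then either end of line or whitespace followed by the heading text.
ATX_RE = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+(.*?))?[ \t]*$")


def sketch_atx_heading(line, keep_header_levels=3):
    """Hypothetical stand-in that mirrors the documented return shape."""
    match = ATX_RE.match(line)
    if match is None:
        # Not a heading: both header fields are None.
        # Assumption: such lines are not visible.
        return [{"header type": None, "header text trimmed": None, "visible": False}]
    level = len(match.group(1))
    text = (match.group(2) or "").strip()
    # Drop an optional closing sequence of '#' characters.
    text = re.sub(r"(?:^|[ \t]+)#+$", "", text).strip()
    return [{
        "header type": level,
        "header text trimmed": text,
        # Assumption: headings deeper than keep_header_levels are kept only
        # for duplicate counting, so they are marked as not visible.
        "visible": level <= keep_header_levels,
    }]


result = sketch_atx_heading("## Hello world")
```

Escapes, link label length limits and multi-line input are deliberately left out of this sketch; the real function covers cases this one does not.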
- md_toc.get_md_header(header_text_line: str, header_duplicate_counter: dict, keep_header_levels: int = 3, parser: str = 'github', no_links: bool = False) → list
Build a data structure with the elements needed to create a TOC line.
- Parameters
header_text_line (str) – a single markdown line that needs to be transformed into a TOC line. This line may include multiple newline characters in between.
header_duplicate_counter (dict) – a data structure that contains the number of occurrences of each header anchor link. This is used to avoid duplicate anchor links and it is meaningful only for certain values of parser.
keep_header_levels (int) – the maximum level of headers to be considered as such when building the table of contents. Defaults to 3.
parser (str) – decides rules on how to generate anchor links. Defaults to github.
- Returns
a list with elements None if the input line does not correspond to one of the designated cases, or a list of data structures containing the necessary components to create a table of contents.
- Return type
list
- Raises
a built-in exception.
Note
This works like a wrapper to other functions.
- md_toc.build_toc_line(toc_line_no_indent: str, no_of_indentation_spaces: int = 0) → str
Build the TOC line.
- Parameters
toc_line_no_indent (str) – the TOC line without indentation.
no_of_indentation_spaces (int) – the number of indentation spaces. Defaults to 0.
- Returns
toc_line, a single line of the table of contents.
- Return type
str
- Raises
a built-in exception.
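As a rough illustration of what build_toc_line's parameters imply, the following standalone sketch prefixes an unindented TOC line with a given number of spaces. `sketch_build_toc_line` is a hypothetical stand-in, not the library function, and the explicit check for negative values is an assumption.

```python
def sketch_build_toc_line(toc_line_no_indent: str, no_of_indentation_spaces: int = 0) -> str:
    """Hypothetical stand-in: prefix a TOC line with indentation spaces."""
    if no_of_indentation_spaces < 0:
        # Assumption: a negative indentation is treated as an error.
        raise ValueError("no_of_indentation_spaces must be non-negative")
    return " " * no_of_indentation_spaces + toc_line_no_indent


lines = [
    sketch_build_toc_line("- [Main](#main)", 0),
    sketch_build_toc_line("- [Sub](#sub)", 2),
]
print("\n".join(lines))
```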
- md_toc.increase_index_ordered_list(header_type_count: dict, header_type_prev: int, header_type_curr: int, parser: str = 'github')
Compute the current index for ordered list table of contents.
- Parameters
header_type_count (dict) – the count of each header type.
header_type_prev (int) – the previous type of header (h[1,…,Inf]).
header_type_curr (int) – the current type of header (h[1,…,Inf]).
parser (str) – decides rules on how to generate ordered list markers. Defaults to github.
- Returns
None
- Return type
None
- Raises
GithubOverflowOrderedListMarker or a built-in exception.
- md_toc.build_anchor_link(header_text_trimmed: str, header_duplicate_counter: dict, parser: str = 'github') → str
Apply the specified slug rule to build the anchor link.
- Parameters
header_text_trimmed (str) – the text that needs to be transformed into a link.
header_duplicate_counter (dict) – a data structure that keeps track of possible duplicate header links in order to avoid them. This is meaningful only for certain values of parser.
parser (str) – decides rules on how to generate anchor links. Defaults to github.
- Returns
None if the specified parser is not recognized, the anchor link otherwise.
- Return type
str
- Raises
a built-in exception.
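The role of header_duplicate_counter can be shown with a standalone sketch. The slug rule below is a simplified, GitHub-like approximation chosen for illustration; `sketch_anchor_link` is hypothetical and does not reproduce md_toc's actual rules for any parser.

```python
import re


def sketch_anchor_link(header_text_trimmed: str, header_duplicate_counter: dict) -> str:
    """Hypothetical, simplified GitHub-like slug; not md_toc's actual rules."""
    slug = header_text_trimmed.lower()
    # Keep word characters, spaces and hyphens; drop everything else.
    slug = re.sub(r"[^\w\- ]", "", slug)
    slug = slug.replace(" ", "-")
    # Track how many times this slug has been seen (the dict is updated in place).
    seen = header_duplicate_counter.get(slug, 0)
    header_duplicate_counter[slug] = seen + 1
    # The first occurrence is left as is; later ones get a numeric suffix.
    return slug if seen == 0 else f"{slug}-{seen}"


counter = {}
first = sketch_anchor_link("Getting started!", counter)
second = sketch_anchor_link("Getting started!", counter)
```

Because the counter is shared across calls, the same dictionary has to be passed for every header of a document; a fresh dictionary would make duplicate headers collide.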
- md_toc.build_toc(filename: str, ordered: bool = False, no_links: bool = False, no_indentation: bool = False, no_list_coherence: bool = False, keep_header_levels: int = 3, parser: str = 'github', list_marker: str = '-', skip_lines: int = 0, constant_ordered_list: bool = False, newline_string: str = '\n') → str
Build the table of contents of a single file.
- Parameters
filename (str) – the file that needs to be read.
ordered (bool) – decides whether to build an ordered list or not. Defaults to False.
no_links (bool) – disables the use of links. Defaults to False.
no_indentation (bool) – disables indentation in the list. Defaults to False.
no_list_coherence (bool) – if set to False, checks header levels for consecutiveness. If they are not consecutive, an exception is raised. For example, # ONE\n### TWO\n are not consecutive header levels, while # ONE\n## TWO\n are. Defaults to False.
keep_header_levels (int) – the maximum level of headers to be considered as such when building the table of contents. Defaults to 3.
parser (str) – decides rules on how to generate anchor links. Defaults to github.
list_marker (str) – a string that contains some of the first characters of the list element. Defaults to -.
skip_lines (int) – the number of lines to be skipped from the start of file before parsing for table of contents. Defaults to 0.
constant_ordered_list (bool) – use a single integer as list marker. This sets ordered to True.
newline_string (str) – the newline separator. Defaults to os.linesep.
- Returns
toc, the corresponding table of contents of the file.
- Return type
str
- Raises
a built-in exception.
Warning
In case of ordered TOCs you must explicitly pass one of the supported ordered list markers.
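The consecutiveness rule described for no_list_coherence can be sketched on its own. The function below is a hypothetical, simplified check that only looks at ATX headings followed by a space and ignores code fences; the library's own check may differ in detail.

```python
import re


def sketch_levels_are_consecutive(markdown_text: str) -> bool:
    """Return False if a heading is more than one level deeper than the previous one."""
    previous = 0
    for line in markdown_text.splitlines():
        match = re.match(r"^(#{1,6})[ \t]", line)
        if match is None:
            continue
        current = len(match.group(1))
        # Going deeper by more than one level breaks the list nesting.
        if previous != 0 and current > previous + 1:
            return False
        previous = current
    return True


skipping = sketch_levels_are_consecutive("# ONE\n### TWO\n")
stepping = sketch_levels_are_consecutive("# ONE\n## TWO\n")
```

With the two inputs from the parameter description, the first call reports a gap (h1 followed by h3) and the second does not.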
- md_toc.build_multiple_tocs(filenames: list, ordered: bool = False, no_links: bool = False, no_indentation: bool = False, no_list_coherence: bool = False, keep_header_levels: int = 3, parser: str = 'github', list_marker: str = '-', skip_lines: int = 0, constant_ordered_list: bool = False, newline_string: str = '\n') → list
Parse files by line and build the table of contents of each file.
- Parameters
filenames (list) – the files that need to be read.
ordered (bool) – decides whether to build an ordered list or not. Defaults to False.
no_links (bool) – disables the use of links. Defaults to False.
no_indentation (bool) – disables indentation in the list. Defaults to False.
keep_header_levels (int) – the maximum level of headers to be considered as such when building the table of contents. Defaults to 3.
parser (str) – decides rules on how to generate anchor links. Defaults to github.
skip_lines (int) – the number of lines to be skipped from the start of file before parsing for table of contents. Defaults to 0.
list_marker (str) – a string that contains some of the first characters of the list element. Defaults to -.
constant_ordered_list (bool) – use a single integer as list marker. This sets ordered to True.
newline_string (str) – the newline separator. Defaults to os.linesep.
- Returns
toc_struct, the corresponding table of contents for each input file.
- Return type
list
- Raises
a built-in exception.
Warning
In case of ordered TOCs you must explicitly pass one of the supported ordered list markers.
- md_toc.write_string_on_file_between_markers(filename: str, string: str, marker: str, newline_string: str = '\n') → bool
Write the table of contents on a single file.
- Parameters
filename (str) – the file that needs to be read or modified.
string (str) – the string that will be written on the file.
marker (str) – a marker that will identify the start and the end of the string.
newline_string (str) – the new line separator. Defaults to os.linesep.
- Returns
True if the new TOC is the same as the existing one, False otherwise.
- Return type
bool
- Raises
StdinIsNotAFileToBeWritten or an fpyutils exception or a built-in exception.
- md_toc.write_strings_on_files_between_markers(filenames: list, strings: list, marker: str, newline_string: str = '\n') → bool
Write the table of contents on multiple files.
- Parameters
filenames (list) – the files that need to be read or modified.
strings (list) – the strings that will be written on the files. Each string is associated with one file.
marker (str) – a marker that will identify the start and the end of the string.
newline_string (str) – the new line separator. Defaults to os.linesep.
- Returns
True if all TOCs are the same as the existing ones, False otherwise.
- Return type
bool
- Raises
an fpyutils exception or a built-in exception.
- md_toc.init_indentation_log(parser: str = 'github', list_marker: str = '-') → dict
Create a data structure that holds list marker information.
- Parameters
parser (str) – decides rules on how to compute indentations. Defaults to github.
list_marker (str) – a string that contains some of the first characters of the list element. Defaults to -.
- Returns
indentation_log, the data structure.
- Return type
dict
- Raises
a built-in exception.
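The shape of the indentation log can be inferred from the default value of indentation_log displayed in the compute_toc_line_indentation_spaces signature. The sketch below rebuilds a dictionary of that shape; `sketch_init_indentation_log` and its `max_levels` parameter are hypothetical, and six header levels are assumed.

```python
def sketch_init_indentation_log(list_marker: str = "-", max_levels: int = 6) -> dict:
    """Hypothetical stand-in that builds a dict shaped like the documented default."""
    return {
        level: {"indentation spaces": 0, "index": 0, "list marker": list_marker}
        for level in range(1, max_levels + 1)
    }


log = sketch_init_indentation_log("-")
```

Each header level gets its own entry, so indentation, index and marker can be tracked independently per level while a TOC is built.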
- md_toc.compute_toc_line_indentation_spaces(header_type_curr: int = 1, header_type_prev: int = 0, parser: str = 'github', ordered: bool = False, list_marker: str = '-', indentation_log: dict = {1: {'indentation spaces': 0, 'index': 0, 'list marker': '-'}, 2: {'indentation spaces': 0, 'index': 0, 'list marker': '-'}, 3: {'indentation spaces': 0, 'index': 0, 'list marker': '-'}, 4: {'indentation spaces': 0, 'index': 0, 'list marker': '-'}, 5: {'indentation spaces': 0, 'index': 0, 'list marker': '-'}, 6: {'indentation spaces': 0, 'index': 0, 'list marker': '-'}}, index: int = 1)
Compute the number of indentation spaces for the TOC list element.
- Parameters
header_type_curr (int) – the current type of header (h[1,…,Inf]). Defaults to 1.
header_type_prev (int) – the previous type of header (h[1,…,Inf]). Defaults to 0.
parser (str) – decides rules on how to compute indentations. Defaults to github.
ordered (bool) – if set to True, numbers will be used as list ids instead of dash characters. Defaults to False.
list_marker (str) – a string that contains some of the first characters of the list element. Defaults to -.
indentation_log (dict) – a data structure that holds list marker information for ordered lists. Defaults to init_indentation_log('github', '.').
index (int) – a number that will be used as list id in case of an ordered table of contents. Defaults to 1.
- Returns
None
- Return type
None
- Raises
a built-in exception.
Warning
In case of ordered TOCs you must explicitly pass one of the supported ordered list markers.
- md_toc.build_toc_line_without_indentation(header: dict, ordered: bool = False, no_links: bool = False, index: int = 1, parser: str = 'github', list_marker: str = '-') → str
Return a list element of the table of contents.
- Parameters
header (dict) – a data structure that contains the original text, the trimmed text and the type of header.
ordered (bool) – if set to True, numbers will be used as list ids, otherwise a dash character. Defaults to False.
no_links (bool) – disables the use of links. Defaults to False.
index (int) – a number that will be used as list id in case of an ordered table of contents. Defaults to 1.
parser (str) – decides rules on how to compute indentations. Defaults to github.
list_marker (str) – a string that contains some of the first characters of the list element. Defaults to -.
- Returns
toc_line_no_indent, a single line of the table of contents without indentation.
- Return type
str
- Raises
a built-in exception.
Warning
In case of ordered TOCs you must explicitly pass one of the supported ordered list markers.
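To make the interplay of ordered, no_links, index and list_marker concrete, here is a standalone sketch of how one list element might be assembled. `sketch_toc_line_without_indentation` is hypothetical, and the header keys `text_original` and `text_anchor_link` are assumptions, since the key names are not given in this reference.

```python
def sketch_toc_line_without_indentation(header: dict, ordered: bool = False,
                                        no_links: bool = False, index: int = 1,
                                        list_marker: str = "-") -> str:
    """Hypothetical stand-in; the header keys used here are assumptions."""
    # Ordered lists put the index before the marker (for example "1."),
    # unordered lists use the marker alone (for example "-").
    prefix = f"{index}{list_marker}" if ordered else list_marker
    if no_links:
        body = header["text_original"]
    else:
        body = f"[{header['text_original']}](#{header['text_anchor_link']})"
    return f"{prefix} {body}"


header = {"type": 2, "text_original": "Usage", "text_anchor_link": "usage"}
unordered = sketch_toc_line_without_indentation(header)
ordered = sketch_toc_line_without_indentation(header, ordered=True, index=3, list_marker=".")
plain = sketch_toc_line_without_indentation(header, no_links=True)
```

Note that the ordered call passes `"."` explicitly: with the default `-` the prefix would be `3-`, which is why the warning asks for an explicit ordered list marker.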
- md_toc.is_valid_code_fence_indent(line: str, parser: str = 'github') → bool
Determine if the given line has valid indentation for a code block fence.
- Parameters
line (str) – a single markdown line to evaluate.
parser (str) – decides rules on how to generate the anchor text. Defaults to github.
- Returns
True if the given line has valid indentation or False otherwise.
- Return type
bool
- Raises
a built-in exception.
- md_toc.is_opening_code_fence(line: str, parser: str = 'github')
Determine if the given line is possibly the opening of a fenced code block.
- Parameters
line (str) – a single markdown line to evaluate.
parser (str) – decides rules on how to generate the anchor text. Defaults to github.
- Returns
None if the input line is not an opening code fence. Otherwise, returns the string which will identify the closing code fence according to the input parser’s rules.
- Return type
Optional[str]
- Raises
a built-in exception.
- md_toc.is_closing_code_fence(line: str, fence: str, is_document_end: bool = False, parser: str = 'github') → bool
Determine if the given line is the end of a fenced code block.
- Parameters
line (str) – a single markdown line to evaluate.
fence (str) – a sequence of backticks or tildes marking the start of the current code block. This is usually the return value of the is_opening_code_fence function.
is_document_end (bool) – tells the function that the end of the file has been reached. Defaults to False.
parser (str) – decides rules on how to generate the anchor text. Defaults to github.
- Returns
True if the line ends the current code block, False otherwise.
- Return type
bool
- Raises
a built-in exception.
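The opening and closing fence functions are meant to be used together: the value returned when a fence opens is later passed back to detect the matching close. The standalone sketch below shows that pattern with two hypothetical, simplified stand-ins (`sketch_opening_fence`, `sketch_closing_fence`) based on the general CommonMark fence rules; it omits is_document_end and parser-specific behaviour.

```python
import re
from typing import Optional

# Simplified opening fence: up to 3 spaces, then 3+ backticks or 3+ tildes.
OPENING_RE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def sketch_opening_fence(line: str) -> Optional[str]:
    """Return the fence string if the line opens a code block, else None."""
    match = OPENING_RE.match(line)
    if match is None:
        return None
    fence = match.group(1)
    # A backtick fence may not have backticks in its info string.
    if fence[0] == "`" and "`" in line[match.end():]:
        return None
    return fence


def sketch_closing_fence(line: str, fence: str) -> bool:
    """Return True if the line closes the block opened by `fence`."""
    stripped = line.strip()
    leading_spaces = len(line) - len(line.lstrip(" "))
    # Same fence character, at least as long as the opening fence,
    # nothing else on the line, and at most 3 spaces of indentation.
    return (len(stripped) >= len(fence)
            and set(stripped) == {fence[0]}
            and leading_spaces <= 3)


def lines_outside_code_blocks(lines):
    """Yield only the lines that are not inside a fenced code block."""
    fence = None
    for line in lines:
        if fence is None:
            fence = sketch_opening_fence(line)
            if fence is None:
                yield line
        elif sketch_closing_fence(line, fence):
            fence = None


doc = ["# Title", "```python", "# not a heading", "```", "## Real"]
kept = list(lines_outside_code_blocks(doc))
```

Tracking the open fence this way keeps lines such as `# not a heading` inside a code block from being mistaken for headers.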
- md_toc.init_indentation_status_list(parser: str = 'github')
Create a data structure that holds the state of indentations.
- Parameters
parser (str) – decides the length of the list. Defaults to github.
- Returns
indentation_list, a list that contains the state of indentations given a header type.
- Return type
list
- Raises
a built-in exception.
- md_toc.tocs_equal(current_toc: str, filename: str, marker: str) → bool
Check if the TOC already present in a file is the same as the one passed to this function.
- Parameters
current_toc (str) – the new or current TOC. Do not include the <!--TOC-->\n\n and \n\n<!--TOC-->.
filename (str) – the file that already contains the TOC to compare against.
marker (str) – the TOC marker.
- Returns
True if the two TOCs are the same, False otherwise.
- Return type
bool
- Raises
a built-in exception.
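The comparison described for tocs_equal can be sketched without the library: read the file, take the text between the first two markers and compare it with the candidate TOC. `sketch_tocs_equal` is a hypothetical stand-in; the assumption that the stored TOC is padded by exactly two newlines on each side is taken from the current_toc description.

```python
import os
import tempfile


def sketch_tocs_equal(current_toc: str, filename: str, marker: str) -> bool:
    """Hypothetical stand-in: compare a TOC with the text between two markers."""
    with open(filename, "r", encoding="utf-8") as f:
        content = f.read()
    start = content.find(marker)
    end = content.find(marker, start + len(marker)) if start != -1 else -1
    if start == -1 or end == -1:
        # Fewer than two markers: there is no existing TOC to compare with.
        return False
    existing = content[start + len(marker):end]
    # Assumption: the stored TOC is padded by two newlines on each side.
    return existing == "\n\n" + current_toc + "\n\n"


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.md")
    with open(path, "w", encoding="utf-8") as f:
        f.write("# Doc\n\n<!--TOC-->\n\n- [A](#a)\n\n<!--TOC-->\n\n## A\n")
    same = sketch_tocs_equal("- [A](#a)", path, "<!--TOC-->")
    different = sketch_tocs_equal("- [B](#b)", path, "<!--TOC-->")
```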
- md_toc.toc_renders_as_coherent_list(header_type_curr: int = 1, header_type_first: int = 1, indentation_list: list = [False, False, False, False, False, False], parser: str = 'github') → bool
Check if the TOC will render as a working list.
- Parameters
header_type_curr (int) – the current type of header (h[1,…,Inf]).
header_type_first (int) – the type of header first encountered (h[1,…,Inf]). This must correspond to the one with the least indentation.
indentation_list (list) – a list that holds the state of indentations.
parser (str) – decides rules on how to generate ordered list markers.
- Returns
renders_as_list
- Return type
bool
- Raises
a built-in exception.
- md_toc.remove_html_tags(line: str, parser: str = 'github') → str
Remove HTML tags.
- Parameters
line (str) – a string.
parser (str) – decides rules on how to remove HTML tags. Defaults to github.
- Returns
the input string without HTML tags.
- Return type
str
- Raises
a built-in exception.
- md_toc.remove_emphasis(line: str, parser: str = 'github') → str
Remove markdown emphasis.
- Parameters
line (str) – a string.
parser (str) – decides rules on how to find delimiters. Defaults to github.
- Returns
the input line without emphasis.
- Return type
str
- Raises
a built-in exception.
Note
Backslashes are preserved.
- md_toc.replace_and_split_newlines(line: str) → list
Replace all the newline characters with line feeds and separate the components.
- Parameters
line (str) – a string.
- Returns
a list of newline separated strings.
- Return type
list
- Raises
a built-in exception.
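A minimal standalone sketch of the idea behind replace_and_split_newlines: normalize every newline variant to a line feed, then split. `sketch_replace_and_split_newlines` is hypothetical; whether the library also trims leading or trailing empty segments is not stated here, so this sketch keeps them.

```python
def sketch_replace_and_split_newlines(line: str) -> list:
    """Hypothetical stand-in: normalize newlines to line feeds, then split."""
    # Replace Windows (\r\n) endings first so they are not split twice,
    # then replace bare carriage returns.
    normalized = line.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.split("\n")


parts = sketch_replace_and_split_newlines("a\r\nb\rc\nd")
```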
- md_toc.filter_indices_from_line(line: str, ranges: list) → str
Given a line and a list of Python ranges, remove the characters in the ranges.
- Parameters
line (str) – a string.
ranges (list) – a list of Python ranges.
- Returns
the line without the specified indices.
- Return type
str
- Raises
a built-in exception.
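The behaviour described for filter_indices_from_line can be sketched as follows. `sketch_filter_indices_from_line` is a hypothetical stand-in that treats each element of ranges as a Python range object of character indices to drop.

```python
def sketch_filter_indices_from_line(line: str, ranges: list) -> str:
    """Hypothetical stand-in: drop the characters whose index is in any range."""
    # Collect every index covered by at least one range.
    drop = set()
    for r in ranges:
        drop.update(r)
    return "".join(ch for i, ch in enumerate(line) if i not in drop)


filtered = sketch_filter_indices_from_line("0123456789", [range(0, 2), range(5, 7)])
```

Indices 0, 1, 5 and 6 are removed in this call; overlapping ranges are harmless because the indices are gathered in a set.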
Exceptions
- exception md_toc.GithubOverflowCharsLinkLabel
Cannot parse link label.
- exception md_toc.GithubEmptyLinkLabel
The link label contains only whitespace characters or is empty.
- exception md_toc.GithubOverflowOrderedListMarker
The ordered list marker number is too big.
- exception md_toc.StdinIsNotAFileToBeWritten
stdin cannot be written to.
- exception md_toc.TocDoesNotRenderAsCoherentList
TOC list indentations are either wrong or not what the user intended.
- exception md_toc.StringCannotContainNewlines
The specified string cannot contain newlines.
- exception md_toc.CannotTreatUnicodeString
Cannot treat unicode string.
Note
This exception is deprecated and will be removed in the next major release.