
API Reference

md_spreadsheet_parser

ConversionSchema dataclass

Configuration for converting string values to Python types.

Attributes:

boolean_pairs (tuple[tuple[str, str], ...]): Pairs of strings representing (True, False). Case-insensitive. Example: (("yes", "no"), ("on", "off")).

custom_converters (dict[type, Callable[[str], Any]]): Dictionary mapping any Python type to a conversion function str -> Any. You can specify built-in types (int, float, bool) to override default behavior, standard library types (Decimal, datetime, date, ZoneInfo), or custom classes (MyClass, Product).

field_converters (dict[str, Callable[[str], Any]]): Dictionary mapping field names (str) to conversion functions. Takes precedence over custom_converters.

Source code in src/md_spreadsheet_parser/schemas.py
@dataclass(frozen=True)
class ConversionSchema:
    """
    Configuration for converting string values to Python types.

    Attributes:
        boolean_pairs: Pairs of strings representing (True, False). Case-insensitive.
                       Example: `(("yes", "no"), ("on", "off"))`.
        custom_converters: Dictionary mapping ANY Python type to a conversion function `str -> Any`.
                           You can specify:
                           - Built-in types: `int`, `float`, `bool` (to override default behavior)
                           - Standard library types: `Decimal`, `datetime`, `date`, `ZoneInfo`
                           - Custom classes: `MyClass`, `Product`
        field_converters: Dictionary mapping field names (str) to conversion functions.
                          Takes precedence over `custom_converters`.
    """

    boolean_pairs: tuple[tuple[str, str], ...] = (
        ("true", "false"),
        ("yes", "no"),
        ("1", "0"),
        ("on", "off"),
    )
    custom_converters: dict[type, Callable[[str], Any]] = field(default_factory=dict)
    field_converters: dict[str, Callable[[str], Any]] = field(default_factory=dict)
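As an illustrative sketch only (the library's actual conversion logic lives elsewhere in the package), boolean_pairs can be read as a case-insensitive lookup: the first string of each pair maps to True, the second to False.

```python
# Illustrative helper, not the library's implementation.
# Default pairs, matching ConversionSchema.boolean_pairs above.
BOOLEAN_PAIRS = (("true", "false"), ("yes", "no"), ("1", "0"), ("on", "off"))

def parse_bool(value, pairs=BOOLEAN_PAIRS):
    """Case-insensitive lookup: first string of a pair -> True, second -> False."""
    v = value.strip().lower()
    for true_s, false_s in pairs:
        if v == true_s:
            return True
        if v == false_s:
            return False
    return None  # not a recognized boolean spelling
```

Custom converters plug in analogously: a dict such as {Decimal: Decimal} routes every Decimal-typed field through decimal.Decimal(raw_string), while field_converters overrides that for specific field names.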

ExcelParsingSchema dataclass

Configuration for parsing Excel-exported data (TSV/CSV or openpyxl).

Attributes:

header_rows (int): Number of header rows (1 or 2). If 2, headers are flattened to "Parent - Child" format.

fill_merged_headers (bool): Whether to forward-fill empty header cells (for merged cells in Excel exports).

delimiter (str): Column separator for TSV/CSV parsing. Default is tab.

header_separator (str): Separator used when flattening 2-row headers.

Source code in src/md_spreadsheet_parser/schemas.py
@dataclass(frozen=True)
class ExcelParsingSchema:
    """
    Configuration for parsing Excel-exported data (TSV/CSV or openpyxl).

    Attributes:
        header_rows: Number of header rows (1 or 2).
                     If 2, headers are flattened to "Parent - Child" format.
        fill_merged_headers: Whether to forward-fill empty header cells
                             (for merged cells in Excel exports).
        delimiter: Column separator for TSV/CSV parsing. Default is tab.
        header_separator: Separator used when flattening 2-row headers.
    """

    header_rows: int = 1
    fill_merged_headers: bool = True
    delimiter: str = "\t"
    header_separator: str = " - "

    def __post_init__(self):
        if self.header_rows not in (1, 2):
            raise ValueError("header_rows must be 1 or 2")
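A minimal sketch of what header_rows=2 with fill_merged_headers=True implies (a hypothetical helper, not the library's code): empty parent cells inherit from the left, then parent and child are joined with header_separator.

```python
def flatten_headers(parent_row, child_row, sep=" - ", fill=True):
    """Forward-fill empty parent cells, then join as 'Parent - Child'."""
    flattened = []
    last_parent = ""
    for parent, child in zip(parent_row, child_row):
        if fill and not parent:
            parent = last_parent  # merged cell in the export: inherit from the left
        last_parent = parent
        flattened.append(f"{parent}{sep}{child}" if parent else child)
    return flattened
```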

MultiTableParsingSchema dataclass

Bases: ParsingSchema

Configuration for parsing multiple tables (workbook mode). Inherits from ParsingSchema.

Attributes:

root_marker (str): The marker indicating the start of the data section. Defaults to "# Tables".

sheet_header_level (int): The markdown header level for sheets. Defaults to 2 (e.g. ## Sheet).

table_header_level (int | None): The markdown header level for tables. If None, table names are not extracted. Defaults to 3.

capture_description (bool): Whether to capture text between the table header and the table as a description. Defaults to True.

Source code in src/md_spreadsheet_parser/schemas.py
@dataclass(frozen=True)
class MultiTableParsingSchema(ParsingSchema):
    """
    Configuration for parsing multiple tables (workbook mode).
    Inherits from ParsingSchema.

    Attributes:
        root_marker (str): The marker indicating the start of the data section. Defaults to "# Tables".
        sheet_header_level (int): The markdown header level for sheets. Defaults to 2 (e.g. `## Sheet`).
        table_header_level (int | None): The markdown header level for tables. If None, table names are not extracted. Defaults to 3.
        capture_description (bool): Whether to capture text between the table header and the table as a description. Defaults to True.
    """

    root_marker: str = "# Tables"
    sheet_header_level: int = 2
    table_header_level: int | None = 3
    capture_description: bool = True

    def __post_init__(self):
        if self.capture_description and self.table_header_level is None:
            raise ValueError(
                "capture_description=True requires table_header_level to be set"
            )
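With the defaults above (root marker "# Tables", sheets at header level 2, tables at level 3, descriptions captured), an input document is expected to look roughly like this sketch:

```markdown
# Tables

## Inventory

### Products

Current product catalog.

| SKU | Name | Price |
| --- | ---- | ----- |
| A1  | Bolt | 0.10  |
```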

ParsingSchema dataclass

Configuration for parsing markdown tables. Designed to be immutable and passed to pure functions.

Attributes:

column_separator (str): Character used to separate columns. Defaults to "|".

header_separator_char (str): Character used in the separator row. Defaults to "-".

require_outer_pipes (bool): Whether tables must have outer pipes (e.g. | col |). Defaults to True.

strip_whitespace (bool): Whether to strip whitespace from cell values. Defaults to True.

convert_br_to_newline (bool): Whether to convert <br> tags in cell values to newlines. Defaults to True.

Source code in src/md_spreadsheet_parser/schemas.py
@dataclass(frozen=True)
class ParsingSchema:
    """
    Configuration for parsing markdown tables.
    Designed to be immutable and passed to pure functions.

    Attributes:
        column_separator (str): Character used to separate columns. Defaults to "|".
        header_separator_char (str): Character used in the separator row. Defaults to "-".
        require_outer_pipes (bool): Whether tables must have outer pipes (e.g. `| col |`). Defaults to True.
        strip_whitespace (bool): Whether to strip whitespace from cell values. Defaults to True.
        convert_br_to_newline (bool): Whether to convert `<br>` tags in cell values to newlines. Defaults to True.
    """

    column_separator: str = "|"
    header_separator_char: str = "-"
    require_outer_pipes: bool = True
    strip_whitespace: bool = True
    convert_br_to_newline: bool = True
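How a single row is split under these options can be sketched as follows (an illustrative helper only; the real parser handles escaping and edge cases):

```python
def split_row(line, sep="|", outer_pipes=True, strip=True):
    """Split one markdown table row into cells, ParsingSchema-style."""
    cells = line.split(sep)
    if outer_pipes:
        cells = cells[1:-1]  # drop the empty strings outside '| ... |'
    return [c.strip() for c in cells] if strip else cells
```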

Sheet dataclass

Represents a single sheet containing tables.

Attributes:

name (str): Name of the sheet.

tables (list[Table]): List of tables contained in this sheet.

metadata (dict[str, Any] | None): Arbitrary metadata (e.g. layout). Defaults to None.

Source code in src/md_spreadsheet_parser/models.py
@dataclass(frozen=True)
class Sheet:
    """
    Represents a single sheet containing tables.

    Attributes:
        name (str): Name of the sheet.
        tables (list[Table]): List of tables contained in this sheet.
        metadata (dict[str, Any] | None): Arbitrary metadata (e.g. layout). Defaults to None.
    """

    name: str
    tables: list[Table]
    metadata: dict[str, Any] | None = None

    def __post_init__(self):
        if self.metadata is None:
            # Hack to allow default value for mutable type in frozen dataclass
            object.__setattr__(self, "metadata", {})

    @property
    def json(self) -> SheetJSON:
        """
        Returns a JSON-compatible dictionary representation of the sheet.

        Returns:
            SheetJSON: A dictionary containing the sheet data.
        """
        return {
            "name": self.name,
            "tables": [t.json for t in self.tables],
            "metadata": self.metadata if self.metadata is not None else {},
        }

    def get_table(self, name: str) -> Table | None:
        """
        Retrieve a table by its name.

        Args:
            name (str): The name of the table to retrieve.

        Returns:
            Table | None: The table object if found, otherwise None.
        """
        for table in self.tables:
            if table.name == name:
                return table
        return None

    def to_markdown(self, schema: ParsingSchema = DEFAULT_SCHEMA) -> str:
        """
        Generates a Markdown string representation of the sheet.

        Args:
            schema (ParsingSchema, optional): Configuration for formatting.

        Returns:
            str: The Markdown string.
        """
        return generate_sheet_markdown(self, schema)

json property

Returns a JSON-compatible dictionary representation of the sheet.

Returns:

SheetJSON: A dictionary containing the sheet data.

get_table(name)

Retrieve a table by its name.

Parameters:

name (str, required): The name of the table to retrieve.

Returns:

Table | None: The table object if found, otherwise None.

Source code in src/md_spreadsheet_parser/models.py
def get_table(self, name: str) -> Table | None:
    """
    Retrieve a table by its name.

    Args:
        name (str): The name of the table to retrieve.

    Returns:
        Table | None: The table object if found, otherwise None.
    """
    for table in self.tables:
        if table.name == name:
            return table
    return None

to_markdown(schema=DEFAULT_SCHEMA)

Generates a Markdown string representation of the sheet.

Parameters:

schema (ParsingSchema, default DEFAULT_SCHEMA): Configuration for formatting.

Returns:

str: The Markdown string.

Source code in src/md_spreadsheet_parser/models.py
def to_markdown(self, schema: ParsingSchema = DEFAULT_SCHEMA) -> str:
    """
    Generates a Markdown string representation of the sheet.

    Args:
        schema (ParsingSchema, optional): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    return generate_sheet_markdown(self, schema)

Table dataclass

Represents a parsed table with optional metadata.

Attributes:

headers (list[str] | None): List of column headers, or None if the table has no headers.

rows (list[list[str]]): List of data rows.

alignments (list[AlignmentType] | None): List of column alignments ('left', 'center', 'right'). Defaults to None.

name (str | None): Name of the table (e.g. from a header). Defaults to None.

description (str | None): Description of the table. Defaults to None.

metadata (dict[str, Any] | None): Arbitrary metadata. Defaults to None.

start_line (int | None): Line number in the source document where the table starts. Defaults to None.

end_line (int | None): Line number in the source document where the table ends. Defaults to None.

Source code in src/md_spreadsheet_parser/models.py
@dataclass(frozen=True)
class Table:
    """
    Represents a parsed table with optional metadata.

    Attributes:
        headers (list[str] | None): List of column headers, or None if the table has no headers.
        rows (list[list[str]]): List of data rows.
        alignments (list[AlignmentType] | None): List of column alignments ('left', 'center', 'right'). Defaults to None.
        name (str | None): Name of the table (e.g. from a header). Defaults to None.
        description (str | None): Description of the table. Defaults to None.
        metadata (dict[str, Any] | None): Arbitrary metadata. Defaults to None.
        start_line (int | None): Line number in the source document where the table starts. Defaults to None.
        end_line (int | None): Line number in the source document where the table ends. Defaults to None.
    """

    headers: list[str] | None
    rows: list[list[str]]
    alignments: list[AlignmentType] | None = None
    name: str | None = None
    description: str | None = None
    metadata: dict[str, Any] | None = None
    start_line: int | None = None
    end_line: int | None = None

    def __post_init__(self):
        if self.metadata is None:
            # Hack to allow default value for mutable type in frozen dataclass
            object.__setattr__(self, "metadata", {})

    @property
    def json(self) -> TableJSON:
        """
        Returns a JSON-compatible dictionary representation of the table.

        Returns:
            TableJSON: A dictionary containing the table data.
        """
        return {
            "name": self.name,
            "description": self.description,
            "headers": self.headers,
            "rows": self.rows,
            "metadata": self.metadata if self.metadata is not None else {},
            "start_line": self.start_line,
            "end_line": self.end_line,
            "alignments": self.alignments,
        }

    def to_models(
        self,
        schema_cls: type[T],
        conversion_schema: ConversionSchema = DEFAULT_CONVERSION_SCHEMA,
    ) -> list[T]:
        """
        Converts the table rows into a list of dataclass instances, performing validation and type conversion.

        Args:
            schema_cls (type[T]): The dataclass type to validate against.
            conversion_schema (ConversionSchema, optional): Configuration for type conversion.

        Returns:
            list[T]: A list of validated dataclass instances.

        Raises:
            ValueError: If schema_cls is not a dataclass.
            TableValidationError: If validation fails for any row or if the table has no headers.
        """
        return validate_table(self, schema_cls, conversion_schema)

    def to_markdown(self, schema: ParsingSchema = DEFAULT_SCHEMA) -> str:
        """
        Generates a Markdown string representation of the table.

        Args:
            schema (ParsingSchema, optional): Configuration for formatting.

        Returns:
            str: The Markdown string.
        """
        return generate_table_markdown(self, schema)

    def update_cell(self, row_idx: int, col_idx: int, value: str) -> "Table":
        """
        Return a new Table with the specified cell updated.
        """
        # Handle header update
        if row_idx == -1:
            if self.headers is None:
                # Determine width from rows if possible, or start fresh
                width = len(self.rows[0]) if self.rows else (col_idx + 1)
                new_headers = [""] * width
                # Ensure width enough
                if col_idx >= len(new_headers):
                    new_headers.extend([""] * (col_idx - len(new_headers) + 1))
            else:
                new_headers = list(self.headers)
                if col_idx >= len(new_headers):
                    new_headers.extend([""] * (col_idx - len(new_headers) + 1))

            # Update alignments if headers grew
            new_alignments = list(self.alignments) if self.alignments else []
            if len(new_headers) > len(new_alignments):
                # Expand alignments to the new width only when they are
                # already tracked; if self.alignments is None it stays None.
                if self.alignments is not None:
                    # Cast or explicit type check might be needed for strict type checkers with literals
                    # Using a typed list to satisfy invariant list[AlignmentType]
                    extension: list[AlignmentType] = ["default"] * (
                        len(new_headers) - len(new_alignments)
                    )
                    new_alignments.extend(extension)

            final_alignments = new_alignments if self.alignments is not None else None

            new_headers[col_idx] = value

            return replace(self, headers=new_headers, alignments=final_alignments)

        # Handle Body update
        # 1. Ensure row exists
        new_rows = [list(r) for r in self.rows]

        # Grow rows if needed
        if row_idx >= len(new_rows):
            # Calculate width
            width = (
                len(self.headers)
                if self.headers
                else (len(new_rows[0]) if new_rows else 0)
            )
            if width == 0:
                width = col_idx + 1  # At least cover the new cell

            rows_to_add = row_idx - len(new_rows) + 1
            for _ in range(rows_to_add):
                new_rows.append([""] * width)

        # If columns expanded due to row update, we might need to expand alignments too
        current_width = len(new_rows[0]) if new_rows else 0
        if col_idx >= current_width:
            # This means we are expanding columns
            if self.alignments is not None:
                width_needed = col_idx + 1
                current_align_len = len(self.alignments)
                if width_needed > current_align_len:
                    new_alignments = list(self.alignments)
                    extension: list[AlignmentType] = ["default"] * (
                        width_needed - current_align_len
                    )
                    new_alignments.extend(extension)
                    return replace(
                        self,
                        rows=self._update_rows_cell(new_rows, row_idx, col_idx, value),
                        alignments=new_alignments,
                    )

        return replace(
            self, rows=self._update_rows_cell(new_rows, row_idx, col_idx, value)
        )

    def _update_rows_cell(self, new_rows, row_idx, col_idx, value):
        target_row = new_rows[row_idx]
        if col_idx >= len(target_row):
            target_row.extend([""] * (col_idx - len(target_row) + 1))
        target_row[col_idx] = value
        return new_rows

    def delete_row(self, row_idx: int) -> "Table":
        """
        Return a new Table with the row at index removed.
        """
        new_rows = [list(r) for r in self.rows]
        if 0 <= row_idx < len(new_rows):
            new_rows.pop(row_idx)
        return replace(self, rows=new_rows)

    def delete_column(self, col_idx: int) -> "Table":
        """
        Return a new Table with the column at index removed.
        """
        new_headers = list(self.headers) if self.headers else None
        if new_headers and 0 <= col_idx < len(new_headers):
            new_headers.pop(col_idx)

        new_rows = []
        for row in self.rows:
            new_row = list(row)
            if 0 <= col_idx < len(new_row):
                new_row.pop(col_idx)
            new_rows.append(new_row)

        new_alignments = None
        if self.alignments is not None:
            new_alignments = list(self.alignments)
            if 0 <= col_idx < len(new_alignments):
                new_alignments.pop(col_idx)

        return replace(
            self, headers=new_headers, rows=new_rows, alignments=new_alignments
        )

    def clear_column_data(self, col_idx: int) -> "Table":
        """
        Return a new Table with data in the specified column cleared (set to empty string),
        but headers and column structure preserved.
        """
        # Headers remain unchanged

        new_rows = []
        for row in self.rows:
            new_row = list(row)
            if 0 <= col_idx < len(new_row):
                new_row[col_idx] = ""
            new_rows.append(new_row)

        return replace(self, rows=new_rows)

    def insert_row(self, row_idx: int) -> "Table":
        """
        Return a new Table with an empty row inserted at row_idx.
        Subsequent rows are shifted down.
        """
        new_rows = [list(r) for r in self.rows]

        # Determine width
        width = (
            len(self.headers) if self.headers else (len(new_rows[0]) if new_rows else 0)
        )
        if width == 0:
            width = 1  # Default to 1 column if table is empty

        new_row = [""] * width

        if row_idx < 0:
            row_idx = 0
        if row_idx > len(new_rows):
            row_idx = len(new_rows)

        new_rows.insert(row_idx, new_row)
        return replace(self, rows=new_rows)

    def insert_column(self, col_idx: int) -> "Table":
        """
        Return a new Table with an empty column inserted at col_idx.
        Subsequent columns are shifted right.
        """
        new_headers = list(self.headers) if self.headers else None

        if new_headers:
            if col_idx < 0:
                col_idx = 0
            if col_idx > len(new_headers):
                col_idx = len(new_headers)
            new_headers.insert(col_idx, "")

        new_alignments = None
        if self.alignments is not None:
            new_alignments = list(self.alignments)
            # Pad alignments up to col_idx before inserting the new entry
            if col_idx > len(new_alignments):
                extension: list[AlignmentType] = ["default"] * (
                    col_idx - len(new_alignments)
                )
                new_alignments.extend(extension)
            new_alignments.insert(col_idx, "default")  # Default alignment

        new_rows = []
        for row in self.rows:
            new_row = list(row)
            # If col_idx is past the end of this row, pad with empty
            # cells so the new column still lands at col_idx.
            current_len = len(new_row)
            target_idx = col_idx
            if target_idx > current_len:
                # Pad up to target
                new_row.extend([""] * (target_idx - current_len))
                target_idx = len(new_row)  # Append

            new_row.insert(target_idx, "")
            new_rows.append(new_row)

        return replace(
            self, headers=new_headers, rows=new_rows, alignments=new_alignments
        )
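The editing methods above never mutate the table in place; each copies the affected lists and returns a fresh instance via dataclasses.replace. The same pattern on a minimal stand-in dataclass (hypothetical Grid, not the real Table):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Grid:
    rows: list

    def update_cell(self, r, c, value):
        new_rows = [list(row) for row in self.rows]  # copy every row
        if c >= len(new_rows[r]):                    # grow the row if needed
            new_rows[r].extend([""] * (c - len(new_rows[r]) + 1))
        new_rows[r][c] = value
        return replace(self, rows=new_rows)          # original stays intact

g = Grid(rows=[["a", "b"]])
g2 = g.update_cell(0, 3, "x")
```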

json property

Returns a JSON-compatible dictionary representation of the table.

Returns:

TableJSON: A dictionary containing the table data.

clear_column_data(col_idx)

Return a new Table with data in the specified column cleared (set to empty string), but headers and column structure preserved.

Source code in src/md_spreadsheet_parser/models.py
def clear_column_data(self, col_idx: int) -> "Table":
    """
    Return a new Table with data in the specified column cleared (set to empty string),
    but headers and column structure preserved.
    """
    # Headers remain unchanged

    new_rows = []
    for row in self.rows:
        new_row = list(row)
        if 0 <= col_idx < len(new_row):
            new_row[col_idx] = ""
        new_rows.append(new_row)

    return replace(self, rows=new_rows)

delete_column(col_idx)

Return a new Table with the column at index removed.

Source code in src/md_spreadsheet_parser/models.py
def delete_column(self, col_idx: int) -> "Table":
    """
    Return a new Table with the column at index removed.
    """
    new_headers = list(self.headers) if self.headers else None
    if new_headers and 0 <= col_idx < len(new_headers):
        new_headers.pop(col_idx)

    new_rows = []
    for row in self.rows:
        new_row = list(row)
        if 0 <= col_idx < len(new_row):
            new_row.pop(col_idx)
        new_rows.append(new_row)

    new_alignments = None
    if self.alignments is not None:
        new_alignments = list(self.alignments)
        if 0 <= col_idx < len(new_alignments):
            new_alignments.pop(col_idx)

    return replace(
        self, headers=new_headers, rows=new_rows, alignments=new_alignments
    )

delete_row(row_idx)

Return a new Table with the row at index removed.

Source code in src/md_spreadsheet_parser/models.py
def delete_row(self, row_idx: int) -> "Table":
    """
    Return a new Table with the row at index removed.
    """
    new_rows = [list(r) for r in self.rows]
    if 0 <= row_idx < len(new_rows):
        new_rows.pop(row_idx)
    return replace(self, rows=new_rows)

insert_column(col_idx)

Return a new Table with an empty column inserted at col_idx. Subsequent columns are shifted right.

Source code in src/md_spreadsheet_parser/models.py
def insert_column(self, col_idx: int) -> "Table":
    """
    Return a new Table with an empty column inserted at col_idx.
    Subsequent columns are shifted right.
    """
    new_headers = list(self.headers) if self.headers else None

    if new_headers:
        if col_idx < 0:
            col_idx = 0
        if col_idx > len(new_headers):
            col_idx = len(new_headers)
        new_headers.insert(col_idx, "")

    new_alignments = None
    if self.alignments is not None:
        new_alignments = list(self.alignments)
        # Pad alignments up to col_idx before inserting the new entry
        if col_idx > len(new_alignments):
            extension: list[AlignmentType] = ["default"] * (
                col_idx - len(new_alignments)
            )
            new_alignments.extend(extension)
        new_alignments.insert(col_idx, "default")  # Default alignment

    new_rows = []
    for row in self.rows:
        new_row = list(row)
        # If col_idx is past the end of this row, pad with empty
        # cells so the new column still lands at col_idx.
        current_len = len(new_row)
        target_idx = col_idx
        if target_idx > current_len:
            # Pad up to target
            new_row.extend([""] * (target_idx - current_len))
            target_idx = len(new_row)  # Append

        new_row.insert(target_idx, "")
        new_rows.append(new_row)

    return replace(
        self, headers=new_headers, rows=new_rows, alignments=new_alignments
    )
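Ragged rows are handled by padding: when col_idx lies past a row's end, the row is first extended with empty cells so the insertion lands at col_idx for every row. That padding step can be sketched on a plain list (illustrative helper, not the library code):

```python
def insert_at(row, col_idx, value=""):
    """Insert value at col_idx, padding shorter rows with empty cells first."""
    row = list(row)
    if col_idx > len(row):
        row.extend([""] * (col_idx - len(row)))  # pad up to the target index
    row.insert(col_idx, value)
    return row
```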

insert_row(row_idx)

Return a new Table with an empty row inserted at row_idx. Subsequent rows are shifted down.

Source code in src/md_spreadsheet_parser/models.py
def insert_row(self, row_idx: int) -> "Table":
    """
    Return a new Table with an empty row inserted at row_idx.
    Subsequent rows are shifted down.
    """
    new_rows = [list(r) for r in self.rows]

    # Determine width
    width = (
        len(self.headers) if self.headers else (len(new_rows[0]) if new_rows else 0)
    )
    if width == 0:
        width = 1  # Default to 1 column if table is empty

    new_row = [""] * width

    if row_idx < 0:
        row_idx = 0
    if row_idx > len(new_rows):
        row_idx = len(new_rows)

    new_rows.insert(row_idx, new_row)
    return replace(self, rows=new_rows)

to_markdown(schema=DEFAULT_SCHEMA)

Generates a Markdown string representation of the table.

Parameters:

schema (ParsingSchema, default DEFAULT_SCHEMA): Configuration for formatting.

Returns:

str: The Markdown string.

Source code in src/md_spreadsheet_parser/models.py
def to_markdown(self, schema: ParsingSchema = DEFAULT_SCHEMA) -> str:
    """
    Generates a Markdown string representation of the table.

    Args:
        schema (ParsingSchema, optional): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    return generate_table_markdown(self, schema)
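A rough sketch of the pipe-table shape this renders to under the default schema (the real generate_table_markdown lives in the library and also handles alignments and escaping):

```python
def render_table(headers, rows):
    """Render headers + rows as a pipe table with a '---' separator row."""
    lines = ["| " + " | ".join(headers) + " |"]
    lines.append("| " + " | ".join("---" for _ in headers) + " |")
    for row in rows:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)
```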

to_models(schema_cls, conversion_schema=DEFAULT_CONVERSION_SCHEMA)

Converts the table rows into a list of dataclass instances, performing validation and type conversion.

Parameters:

schema_cls (type[T], required): The dataclass type to validate against.

conversion_schema (ConversionSchema, default DEFAULT_CONVERSION_SCHEMA): Configuration for type conversion.

Returns:

list[T]: A list of validated dataclass instances.

Raises:

ValueError: If schema_cls is not a dataclass.

TableValidationError: If validation fails for any row or if the table has no headers.

Source code in src/md_spreadsheet_parser/models.py
def to_models(
    self,
    schema_cls: type[T],
    conversion_schema: ConversionSchema = DEFAULT_CONVERSION_SCHEMA,
) -> list[T]:
    """
    Converts the table rows into a list of dataclass instances, performing validation and type conversion.

    Args:
        schema_cls (type[T]): The dataclass type to validate against.
        conversion_schema (ConversionSchema, optional): Configuration for type conversion.

    Returns:
        list[T]: A list of validated dataclass instances.

    Raises:
        ValueError: If schema_cls is not a dataclass.
        TableValidationError: If validation fails for any row or if the table has no headers.
    """
    return validate_table(self, schema_cls, conversion_schema)
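Conceptually, to_models zips the headers with each row and feeds the converted values to the dataclass constructor. A stdlib-only sketch of that idea (Product is a hypothetical schema class; the real validate_table adds the full ConversionSchema machinery and error reporting):

```python
from dataclasses import dataclass, fields

@dataclass
class Product:
    sku: str
    price: float

def rows_to_models(headers, rows, cls):
    """Build one cls instance per row, converting values to the field types."""
    types = {f.name: f.type for f in fields(cls)}
    out = []
    for row in rows:
        kwargs = {}
        for name, raw in zip(headers, row):
            typ = types[name]
            # f.type may be the class itself or its string name
            kwargs[name] = float(raw) if typ in (float, "float") else raw
        out.append(cls(**kwargs))
    return out

models = rows_to_models(["sku", "price"], [["A1", "9.99"]], Product)
```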

update_cell(row_idx, col_idx, value)

Return a new Table with the specified cell updated.

Source code in src/md_spreadsheet_parser/models.py
def update_cell(self, row_idx: int, col_idx: int, value: str) -> "Table":
    """
    Return a new Table with the specified cell updated.
    """
    # Handle header update
    if row_idx == -1:
        if self.headers is None:
            # Determine width from rows if possible, or start fresh
            width = len(self.rows[0]) if self.rows else (col_idx + 1)
            new_headers = [""] * width
            # Ensure width enough
            if col_idx >= len(new_headers):
                new_headers.extend([""] * (col_idx - len(new_headers) + 1))
        else:
            new_headers = list(self.headers)
            if col_idx >= len(new_headers):
                new_headers.extend([""] * (col_idx - len(new_headers) + 1))

        # Update alignments if headers grew
        new_alignments = list(self.alignments) if self.alignments else []
        if len(new_headers) > len(new_alignments):
            # Only expand alignments when they are already being tracked;
            # if self.alignments is None it stays None.
            if self.alignments is not None:
                # Cast or explicit type check might be needed for strict type checkers with literals
                # Using a typed list to satisfy invariant list[AlignmentType]
                extension: list[AlignmentType] = ["default"] * (
                    len(new_headers) - len(new_alignments)
                )
                new_alignments.extend(extension)

        final_alignments = new_alignments if self.alignments is not None else None

        new_headers[col_idx] = value

        return replace(self, headers=new_headers, alignments=final_alignments)

    # Handle Body update
    # 1. Ensure row exists
    new_rows = [list(r) for r in self.rows]

    # Grow rows if needed
    if row_idx >= len(new_rows):
        # Calculate width
        width = (
            len(self.headers)
            if self.headers
            else (len(new_rows[0]) if new_rows else 0)
        )
        if width == 0:
            width = col_idx + 1  # At least cover the new cell

        rows_to_add = row_idx - len(new_rows) + 1
        for _ in range(rows_to_add):
            new_rows.append([""] * width)

    # If columns expanded due to row update, we might need to expand alignments too
    current_width = len(new_rows[0]) if new_rows else 0
    if col_idx >= current_width:
        # This means we are expanding columns
        if self.alignments is not None:
            width_needed = col_idx + 1
            current_align_len = len(self.alignments)
            if width_needed > current_align_len:
                new_alignments = list(self.alignments)
                extension: list[AlignmentType] = ["default"] * (
                    width_needed - current_align_len
                )
                new_alignments.extend(extension)
                return replace(
                    self,
                    rows=self._update_rows_cell(new_rows, row_idx, col_idx, value),
                    alignments=new_alignments,
                )

    return replace(
        self, rows=self._update_rows_cell(new_rows, row_idx, col_idx, value)
    )
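The growth logic above (padding headers, rows, and alignments out to cover `col_idx`) can be sketched in isolation. `pad_to` below is a hypothetical helper for illustration, not part of the library:

```python
def pad_to(cells: list[str], col_idx: int, fill: str = "") -> list[str]:
    """Return a copy of cells extended so that col_idx is a valid index."""
    out = list(cells)
    if col_idx >= len(out):
        out.extend([fill] * (col_idx - len(out) + 1))
    return out


# Writing to column 4 of a 2-column header row grows it to width 5 first
headers = pad_to(["A", "B"], 4)
headers[4] = "E"
```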

TableValidationError

Bases: Exception

Exception raised when table validation fails. Contains a list of errors found during validation.

Source code in src/md_spreadsheet_parser/validation.py
class TableValidationError(Exception):
    """
    Exception raised when table validation fails.
    Contains a list of errors found during validation.
    """

    def __init__(self, errors: list[str]):
        self.errors = errors
        super().__init__(
            f"Validation failed with {len(errors)} errors:\n" + "\n".join(errors)
        )
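A minimal usage sketch, with the class re-declared exactly as above so the snippet runs standalone; the error strings are made-up examples:

```python
class TableValidationError(Exception):
    """Carries the individual validation messages on .errors."""

    def __init__(self, errors: list[str]):
        self.errors = errors
        super().__init__(
            f"Validation failed with {len(errors)} errors:\n" + "\n".join(errors)
        )


try:
    raise TableValidationError(["row 1: 'x' is not an int", "row 3: missing 'name'"])
except TableValidationError as exc:
    count = len(exc.errors)  # individual messages stay accessible
    message = str(exc)       # aggregated summary for logging/display
```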

Workbook dataclass

Represents a collection of sheets (multi-table output).

Attributes:

Name Type Description
sheets list[Sheet]

List of sheets in the workbook.

metadata dict[str, Any] | None

Arbitrary metadata. Defaults to None.

Source code in src/md_spreadsheet_parser/models.py
@dataclass(frozen=True)
class Workbook:
    """
    Represents a collection of sheets (multi-table output).

    Attributes:
        sheets (list[Sheet]): List of sheets in the workbook.
        metadata (dict[str, Any] | None): Arbitrary metadata. Defaults to None.
    """

    sheets: list[Sheet]
    metadata: dict[str, Any] | None = None

    def __post_init__(self):
        if self.metadata is None:
            # Hack to allow default value for mutable type in frozen dataclass
            object.__setattr__(self, "metadata", {})

    @property
    def json(self) -> WorkbookJSON:
        """
        Returns a JSON-compatible dictionary representation of the workbook.

        Returns:
            WorkbookJSON: A dictionary containing the workbook data.
        """
        return {
            "sheets": [s.json for s in self.sheets],
            "metadata": self.metadata if self.metadata is not None else {},
        }

    def get_sheet(self, name: str) -> Sheet | None:
        """
        Retrieve a sheet by its name.

        Args:
            name (str): The name of the sheet to retrieve.

        Returns:
            Sheet | None: The sheet object if found, otherwise None.
        """
        for sheet in self.sheets:
            if sheet.name == name:
                return sheet
        return None

    def to_markdown(self, schema: MultiTableParsingSchema) -> str:
        """
        Generates a Markdown string representation of the workbook.

        Args:
            schema (MultiTableParsingSchema): Configuration for formatting.

        Returns:
            str: The Markdown string.
        """
        return generate_workbook_markdown(self, schema)

    def add_sheet(self, name: str) -> "Workbook":
        """
        Return a new Workbook with a new sheet added.
        """
        # Create new sheet with one empty table as default
        new_table = Table(headers=["A", "B", "C"], rows=[["", "", ""]])
        new_sheet = Sheet(name=name, tables=[new_table])

        new_sheets = list(self.sheets)
        new_sheets.append(new_sheet)

        return replace(self, sheets=new_sheets)

    def delete_sheet(self, index: int) -> "Workbook":
        """
        Return a new Workbook with the sheet at index removed.
        """
        if index < 0 or index >= len(self.sheets):
            raise IndexError("Sheet index out of range")

        new_sheets = list(self.sheets)
        new_sheets.pop(index)

        return replace(self, sheets=new_sheets)

json property

Returns a JSON-compatible dictionary representation of the workbook.

Returns:

Name Type Description
WorkbookJSON WorkbookJSON

A dictionary containing the workbook data.

add_sheet(name)

Return a new Workbook with a new sheet added.

Source code in src/md_spreadsheet_parser/models.py
def add_sheet(self, name: str) -> "Workbook":
    """
    Return a new Workbook with a new sheet added.
    """
    # Create new sheet with one empty table as default
    new_table = Table(headers=["A", "B", "C"], rows=[["", "", ""]])
    new_sheet = Sheet(name=name, tables=[new_table])

    new_sheets = list(self.sheets)
    new_sheets.append(new_sheet)

    return replace(self, sheets=new_sheets)

delete_sheet(index)

Return a new Workbook with the sheet at index removed.

Source code in src/md_spreadsheet_parser/models.py
def delete_sheet(self, index: int) -> "Workbook":
    """
    Return a new Workbook with the sheet at index removed.
    """
    if index < 0 or index >= len(self.sheets):
        raise IndexError("Sheet index out of range")

    new_sheets = list(self.sheets)
    new_sheets.pop(index)

    return replace(self, sheets=new_sheets)

get_sheet(name)

Retrieve a sheet by its name.

Parameters:

Name Type Description Default
name str

The name of the sheet to retrieve.

required

Returns:

Type Description
Sheet | None

Sheet | None: The sheet object if found, otherwise None.

Source code in src/md_spreadsheet_parser/models.py
def get_sheet(self, name: str) -> Sheet | None:
    """
    Retrieve a sheet by its name.

    Args:
        name (str): The name of the sheet to retrieve.

    Returns:
        Sheet | None: The sheet object if found, otherwise None.
    """
    for sheet in self.sheets:
        if sheet.name == name:
            return sheet
    return None

to_markdown(schema)

Generates a Markdown string representation of the workbook.

Parameters:

Name Type Description Default
schema MultiTableParsingSchema

Configuration for formatting.

required

Returns:

Name Type Description
str str

The Markdown string.

Source code in src/md_spreadsheet_parser/models.py
def to_markdown(self, schema: MultiTableParsingSchema) -> str:
    """
    Generates a Markdown string representation of the workbook.

    Args:
        schema (MultiTableParsingSchema): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    return generate_workbook_markdown(self, schema)

generate_sheet_markdown(sheet, schema=DEFAULT_SCHEMA)

Generates a Markdown string representation of the sheet.

Parameters:

Name Type Description Default
sheet Sheet

The Sheet object.

required
schema ParsingSchema

Configuration for formatting.

DEFAULT_SCHEMA

Returns:

Name Type Description
str str

The Markdown string.

Source code in src/md_spreadsheet_parser/generator.py
def generate_sheet_markdown(
    sheet: "Sheet", schema: ParsingSchema = DEFAULT_SCHEMA
) -> str:
    """
    Generates a Markdown string representation of the sheet.

    Args:
        sheet: The Sheet object.
        schema (ParsingSchema, optional): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    lines = []

    if isinstance(schema, MultiTableParsingSchema):
        lines.append(f"{'#' * schema.sheet_header_level} {sheet.name}")
        lines.append("")

    for i, table in enumerate(sheet.tables):
        lines.append(generate_table_markdown(table, schema))
        if i < len(sheet.tables) - 1:
            lines.append("")  # Empty line between tables

    # Append Sheet Metadata if present (at the end)
    if isinstance(schema, MultiTableParsingSchema) and sheet.metadata:
        lines.append("")
        metadata_json = json.dumps(sheet.metadata)
        comment = f"<!-- md-spreadsheet-sheet-metadata: {metadata_json} -->"
        lines.append(comment)

    return "\n".join(lines)

generate_table_markdown(table, schema=DEFAULT_SCHEMA)

Generates a Markdown string representation of the table.

Parameters:

Name Type Description Default
table Table

The Table object.

required
schema ParsingSchema

Configuration for formatting.

DEFAULT_SCHEMA

Returns:

Name Type Description
str str

The Markdown string.

Source code in src/md_spreadsheet_parser/generator.py
def generate_table_markdown(
    table: "Table", schema: ParsingSchema = DEFAULT_SCHEMA
) -> str:
    """
    Generates a Markdown string representation of the table.

    Args:
        table: The Table object.
        schema (ParsingSchema, optional): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    lines = []

    # Handle metadata (name and description) if MultiTableParsingSchema
    if isinstance(schema, MultiTableParsingSchema):
        if table.name and schema.table_header_level is not None:
            lines.append(f"{'#' * schema.table_header_level} {table.name}")
            lines.append("")  # Empty line after name

        if table.description and schema.capture_description:
            lines.append(table.description)
            lines.append("")  # Empty line after description

    # Build table
    sep = f" {schema.column_separator} "

    def _prepare_cell(cell: str) -> str:
        """Prepare cell for markdown generation."""
        if schema.convert_br_to_newline and "\n" in cell:
            return cell.replace("\n", "<br>")
        return cell

    # Headers
    if table.headers:
        # Add outer pipes if required
        processed_headers = [_prepare_cell(h) for h in table.headers]
        header_row = sep.join(processed_headers)
        if schema.require_outer_pipes:
            header_row = (
                f"{schema.column_separator} {header_row} {schema.column_separator}"
            )
        lines.append(header_row)

        # Separator row
        separator_cells = []
        for i, _ in enumerate(table.headers):
            alignment = "default"
            if table.alignments and i < len(table.alignments):
                # Ensure we handle potentially None values if list has gaps (unlikely by design but safe)
                alignment = table.alignments[i] or "default"

            # Construct separator cell based on alignment
            # Use 3 hyphens as base
            if alignment == "left":
                cell = ":" + schema.header_separator_char * 3
            elif alignment == "right":
                cell = schema.header_separator_char * 3 + ":"
            elif alignment == "center":
                cell = ":" + schema.header_separator_char * 3 + ":"
            else:
                # default
                cell = schema.header_separator_char * 3

            separator_cells.append(cell)

        separator_row = sep.join(separator_cells)
        if schema.require_outer_pipes:
            separator_row = (
                f"{schema.column_separator} {separator_row} {schema.column_separator}"
            )
        lines.append(separator_row)

    # Rows
    for row in table.rows:
        processed_row = [_prepare_cell(cell) for cell in row]
        row_str = sep.join(processed_row)
        if schema.require_outer_pipes:
            row_str = f"{schema.column_separator} {row_str} {schema.column_separator}"
        lines.append(row_str)

    # Append Metadata if present
    if table.metadata and "visual" in table.metadata:
        metadata_json = json.dumps(table.metadata["visual"])
        comment = f"<!-- md-spreadsheet-table-metadata: {metadata_json} -->"
        lines.append("")
        lines.append(comment)

    return "\n".join(lines)

generate_workbook_markdown(workbook, schema)

Generates a Markdown string representation of the workbook.

Parameters:

Name Type Description Default
workbook Workbook

The Workbook object.

required
schema MultiTableParsingSchema

Configuration for formatting.

required

Returns:

Name Type Description
str str

The Markdown string.

Source code in src/md_spreadsheet_parser/generator.py
def generate_workbook_markdown(
    workbook: "Workbook", schema: MultiTableParsingSchema
) -> str:
    """
    Generates a Markdown string representation of the workbook.

    Args:
        workbook: The Workbook object.
        schema (MultiTableParsingSchema): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    lines = []

    if schema.root_marker:
        lines.append(schema.root_marker)
        lines.append("")

    for i, sheet in enumerate(workbook.sheets):
        lines.append(generate_sheet_markdown(sheet, schema))
        if i < len(workbook.sheets) - 1:
            lines.append("")  # Empty line between sheets

    # Append Workbook Metadata if present
    if workbook.metadata:
        # Ensure separation from last sheet
        if lines and lines[-1] != "":
            lines.append("")

        metadata_json = json.dumps(workbook.metadata)
        comment = f"<!-- md-spreadsheet-workbook-metadata: {metadata_json} -->"
        lines.append(comment)

    return "\n".join(lines)

parse_excel(source, schema=DEFAULT_EXCEL_SCHEMA)

Parse Excel data from various sources.

Parameters:

Name Type Description Default
source ExcelSource

One of: - openpyxl.Worksheet (if openpyxl is installed) - str: TSV/CSV text content - list[list[str]]: Pre-parsed 2D array

required
schema ExcelParsingSchema

Configuration for parsing.

DEFAULT_EXCEL_SCHEMA

Returns:

Type Description
Table

Table object with processed headers and data.

Raises:

Type Description
TypeError

If source type is not supported.

Source code in src/md_spreadsheet_parser/excel.py
def parse_excel(
    source: ExcelSource,
    schema: ExcelParsingSchema = DEFAULT_EXCEL_SCHEMA,
) -> Table:
    """
    Parse Excel data from various sources.

    Args:
        source: One of:
            - openpyxl.Worksheet (if openpyxl is installed)
            - str: TSV/CSV text content
            - list[list[str]]: Pre-parsed 2D array
        schema: Configuration for parsing.

    Returns:
        Table object with processed headers and data.

    Raises:
        TypeError: If source type is not supported.
    """
    rows: list[list[str]]

    # Check for openpyxl Worksheet (duck typing via hasattr)
    if HAS_OPENPYXL and hasattr(source, "iter_rows"):
        # At runtime, source is a Worksheet with iter_rows method
        ws: Any = source
        rows = [
            [_safe_str(cell) for cell in row] for row in ws.iter_rows(values_only=True)
        ]

    # Check for string (TSV/CSV content)
    elif isinstance(source, str):
        rows = _parse_tsv(source, schema.delimiter)

    # Check for pre-parsed 2D array
    elif isinstance(source, list):
        # Assume it's already list[list[str]]
        rows = source

    else:
        supported = "openpyxl.Worksheet, str, or list[list[str]]"
        if not HAS_OPENPYXL:
            supported = (
                "str or list[list[str]] (install openpyxl for Worksheet support)"
            )
        raise TypeError(
            f"Unsupported source type: {type(source).__name__}. Expected {supported}."
        )

    return parse_excel_text(rows, schema)

parse_excel_text(rows, schema=DEFAULT_EXCEL_SCHEMA)

Parse a 2D string array into a Table with merged cell and header handling.

Parameters:

Name Type Description Default
rows list[list[str]]

2D list of strings (e.g., from csv.reader or worksheet iteration).

required
schema ExcelParsingSchema

Configuration for header processing.

DEFAULT_EXCEL_SCHEMA

Returns:

Type Description
Table

Table object with processed headers and data rows.

Source code in src/md_spreadsheet_parser/excel.py
def parse_excel_text(
    rows: list[list[str]],
    schema: ExcelParsingSchema = DEFAULT_EXCEL_SCHEMA,
) -> Table:
    """
    Parse a 2D string array into a Table with merged cell and header handling.

    Args:
        rows: 2D list of strings (e.g., from csv.reader or worksheet iteration).
        schema: Configuration for header processing.

    Returns:
        Table object with processed headers and data rows.
    """
    if not rows:
        return Table(headers=None, rows=[])

    if schema.header_rows == 1:
        # Single header row
        header_row = rows[0]
        if schema.fill_merged_headers:
            header_row = _forward_fill(header_row)
        headers = header_row
        data_rows = rows[1:]

    elif schema.header_rows == 2:
        # Two header rows: Parent-Child flattening
        if len(rows) < 2:
            # Not enough rows for 2-row header
            return Table(headers=rows[0] if rows else None, rows=[])

        parent_row = rows[0]
        child_row = rows[1]

        if schema.fill_merged_headers:
            parent_row = _forward_fill(parent_row)

        headers = _flatten_headers(parent_row, child_row, schema.header_separator)
        data_rows = rows[2:]

    else:
        # Should not reach here due to schema validation
        raise ValueError(f"Invalid header_rows: {schema.header_rows}")

    # Convert data_rows to list[list[str]] ensuring all are strings
    processed_rows = [[_safe_str(cell) for cell in row] for row in data_rows]

    return Table(headers=headers, rows=processed_rows)
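The helpers `_forward_fill` and `_flatten_headers` are internal and not shown here, so the following is a plausible sketch of their behavior — forward-filling merged header cells and joining parent/child headers — under assumed semantics, not the library's actual implementation:

```python
def forward_fill(cells: list[str]) -> list[str]:
    """Carry the last non-empty value into subsequent empty cells."""
    out, last = [], ""
    for cell in cells:
        if cell:
            last = cell
        out.append(cell or last)
    return out


def flatten_headers(parent: list[str], child: list[str], sep: str = " - ") -> list[str]:
    """Join parent/child header cells, skipping empty halves.

    Simplification: zip() silently truncates to the shorter row.
    """
    return [sep.join(x for x in (a, b) if x) for a, b in zip(parent, child)]
```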

parse_sheet(markdown, name, schema, start_line_offset=0)

Parse a sheet (section) containing one or more tables.

Source code in src/md_spreadsheet_parser/parsing.py
def parse_sheet(
    markdown: str,
    name: str,
    schema: MultiTableParsingSchema,
    start_line_offset: int = 0,
) -> Sheet:
    """
    Parse a sheet (section) containing one or more tables.
    """
    metadata: dict[str, Any] | None = None

    # Scan for sheet metadata
    # We prioritize the first match if multiple exist (though usually only one)
    metadata_match = re.search(
        r"^<!-- md-spreadsheet-sheet-metadata: (.*) -->$", markdown, re.MULTILINE
    )
    if metadata_match:
        try:
            metadata = json.loads(metadata_match.group(1))
        except json.JSONDecodeError:
            pass  # Ignore invalid JSON

    tables = _extract_tables(markdown, schema, start_line_offset)
    return Sheet(name=name, tables=tables, metadata=metadata)

parse_table(markdown, schema=DEFAULT_SCHEMA)

Parse a markdown table into a Table object.

Parameters:

Name Type Description Default
markdown str

The markdown string containing the table.

required
schema ParsingSchema

Configuration for parsing.

DEFAULT_SCHEMA

Returns:

Type Description
Table

Table object with headers and rows.

Source code in src/md_spreadsheet_parser/parsing.py
def parse_table(markdown: str, schema: ParsingSchema = DEFAULT_SCHEMA) -> Table:
    """
    Parse a markdown table into a Table object.

    Args:
        markdown: The markdown string containing the table.
        schema: Configuration for parsing.

    Returns:
        Table object with headers and rows.
    """
    lines = markdown.strip().split("\n")
    headers: list[str] | None = None
    rows: list[list[str]] = []
    alignments: list[AlignmentType] | None = None
    visual_metadata: dict | None = None

    # Buffer for a potential header row until a separator row confirms it
    potential_header: list[str] | None = None

    for line in lines:
        line = line.strip()
        if not line:
            continue

        # Check for metadata comment
        metadata_match = re.match(
            r"^<!-- md-spreadsheet-table-metadata: (.*) -->$", line
        )
        if metadata_match:
            try:
                json_content = metadata_match.group(1)
                visual_metadata = json.loads(json_content)
                continue
            except json.JSONDecodeError:
                # Invalid JSON: drop the line rather than letting it leak
                # into table data (graceful handling of manual edits)
                continue

        parsed_row = parse_row(line, schema)

        if parsed_row is None:
            continue

        if headers is None and potential_header is not None:
            detected_alignments = parse_separator_row(parsed_row, schema)
            if detected_alignments is not None:
                headers = potential_header
                alignments = detected_alignments
                potential_header = None
                continue
            else:
                # Previous row was not a header, treat as data
                rows.append(potential_header)
                potential_header = parsed_row
        elif headers is None and potential_header is None:
            potential_header = parsed_row
        else:
            rows.append(parsed_row)

    if potential_header is not None:
        rows.append(potential_header)

    # Normalize rows to match header length
    if headers:
        header_len = len(headers)
        normalized_rows = []
        for row in rows:
            if len(row) < header_len:
                # Pad with empty strings
                row.extend([""] * (header_len - len(row)))
            elif len(row) > header_len:
                # Truncate
                row = row[:header_len]
            normalized_rows.append(row)
        rows = normalized_rows

    metadata: dict[str, Any] = {"schema_used": str(schema)}
    if visual_metadata:
        metadata["visual"] = visual_metadata

    return Table(headers=headers, rows=rows, metadata=metadata, alignments=alignments)
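The row-normalization step above (pad short rows with empty strings, truncate long ones) in isolation; `normalize_row` is a hypothetical name:

```python
def normalize_row(row: list[str], header_len: int) -> list[str]:
    """Return row padded or truncated so that len(result) == header_len."""
    if len(row) < header_len:
        return row + [""] * (header_len - len(row))
    return row[:header_len]
```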

parse_table_from_file(source, schema=DEFAULT_SCHEMA)

Parse a markdown table from a file.

Parameters:

Name Type Description Default
source Union[str, Path, TextIO]

File path (str/Path) or file-like object.

required
schema ParsingSchema

Parsing configuration.

DEFAULT_SCHEMA
Source code in src/md_spreadsheet_parser/loader.py
def parse_table_from_file(
    source: Union[str, Path, TextIO], schema: ParsingSchema = DEFAULT_SCHEMA
) -> Table:
    """
    Parse a markdown table from a file.

    Args:
        source: File path (str/Path) or file-like object.
        schema: Parsing configuration.
    """
    content = _read_content(source)
    return parse_table(content, schema)

parse_workbook(markdown, schema=MultiTableParsingSchema())

Parse a markdown document into a Workbook.

Source code in src/md_spreadsheet_parser/parsing.py
def parse_workbook(
    markdown: str, schema: MultiTableParsingSchema = MultiTableParsingSchema()
) -> Workbook:
    """
    Parse a markdown document into a Workbook.
    """
    lines = markdown.split("\n")
    sheets: list[Sheet] = []
    metadata: dict[str, Any] | None = None

    # Scan for workbook metadata anywhere in the file and filter those lines
    # out so they don't interfere with sheet content
    filtered_lines: list[str] = []
    wb_metadata_pattern = re.compile(
        r"^<!-- md-spreadsheet-workbook-metadata: (.*) -->$"
    )

    for line in lines:
        stripped = line.strip()
        match = wb_metadata_pattern.match(stripped)
        if match:
            try:
                metadata = json.loads(match.group(1))
            except json.JSONDecodeError:
                pass
            # Skip adding this line to filtered_lines
        else:
            filtered_lines.append(line)

    lines = filtered_lines

    # Find root marker
    start_index = 0
    in_code_block = False
    if schema.root_marker:
        found = False
        for i, line in enumerate(lines):
            stripped = line.strip()
            if stripped.startswith("```"):
                in_code_block = not in_code_block

            if not in_code_block and stripped == schema.root_marker:
                start_index = i + 1
                found = True
                break
        if not found:
            return Workbook(sheets=[], metadata=metadata)

    # Split by sheet headers
    header_prefix = "#" * schema.sheet_header_level + " "

    current_sheet_name: str | None = None
    current_sheet_lines: list[str] = []
    current_sheet_start_line = start_index

    # Reset code block state for the second pass; the root-marker search
    # above already guaranteed the marker itself is not inside a code block
    in_code_block = False

    for idx, line in enumerate(lines[start_index:], start=start_index):
        stripped = line.strip()

        if stripped.startswith("```"):
            in_code_block = not in_code_block

        if in_code_block:
            # Just collect lines if we are in a sheet
            if current_sheet_name:
                current_sheet_lines.append(line)
            continue

        # Check if line is a header
        if stripped.startswith("#"):
            # Count header level
            level = 0
            for char in stripped:
                if char == "#":
                    level += 1
                else:
                    break

            # If header level is less than sheet_header_level (e.g. # vs ##),
            # it indicates a higher-level section, so we stop parsing the workbook.
            if level < schema.sheet_header_level:
                break

        if stripped.startswith(header_prefix):
            if current_sheet_name:
                sheet_content = "\n".join(current_sheet_lines)
                # current_sheet_lines holds the lines AFTER the header, so
                # the content offset is current_sheet_start_line + 1
                sheets.append(
                    parse_sheet(
                        sheet_content,
                        current_sheet_name,
                        schema,
                        start_line_offset=current_sheet_start_line + 1,
                    )
                )

            current_sheet_name = stripped[len(header_prefix) :].strip()
            current_sheet_lines = []
            current_sheet_start_line = idx
        else:
            if current_sheet_name:
                current_sheet_lines.append(line)

    if current_sheet_name:
        sheet_content = "\n".join(current_sheet_lines)
        sheets.append(
            parse_sheet(
                sheet_content,
                current_sheet_name,
                schema,
                start_line_offset=current_sheet_start_line + 1,
            )
        )

    return Workbook(sheets=sheets, metadata=metadata)
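The header-level counting used above to decide when a higher-level section ends the workbook can be factored out as a small function; `header_level` is a hypothetical name for illustration:

```python
def header_level(line: str) -> int:
    """Count the leading '#' characters of a (stripped) markdown line."""
    level = 0
    for ch in line.strip():
        if ch == "#":
            level += 1
        else:
            break
    return level
```

A `## Sheet1` line matches `sheet_header_level == 2`, while a `# Title` line (level 1 < 2) stops workbook parsing.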

parse_workbook_from_file(source, schema=MultiTableParsingSchema())

Parse a markdown workbook from a file.

Parameters:

Name Type Description Default
source Union[str, Path, TextIO]

File path (str/Path) or file-like object.

required
schema MultiTableParsingSchema

Parsing configuration.

MultiTableParsingSchema()
Source code in src/md_spreadsheet_parser/loader.py
def parse_workbook_from_file(
    source: Union[str, Path, TextIO],
    schema: MultiTableParsingSchema = MultiTableParsingSchema(),
) -> Workbook:
    """
    Parse a markdown workbook from a file.

    Args:
        source: File path (str/Path) or file-like object.
        schema: Parsing configuration.
    """
    content = _read_content(source)
    return parse_workbook(content, schema)

scan_tables(markdown, schema=None)

Scan a markdown document for all tables, ignoring sheet structure.

Parameters:

Name Type Description Default
markdown str

The markdown text.

required
schema MultiTableParsingSchema | None

Optional schema. If None, uses default MultiTableParsingSchema.

None

Returns:

Type Description
list[Table]

A list of Table objects found in the document.

Source code in src/md_spreadsheet_parser/parsing.py
def scan_tables(
    markdown: str, schema: MultiTableParsingSchema | None = None
) -> list[Table]:
    """
    Scan a markdown document for all tables, ignoring sheet structure.

    Args:
        markdown: The markdown text.
        schema: Optional schema. If None, uses default MultiTableParsingSchema.

    Returns:
        A list of Table objects found in the document.
    """
    if schema is None:
        schema = MultiTableParsingSchema()

    return _extract_tables(markdown, schema)
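Sheet structure aside, table scanning reduces to grouping contiguous non-blank lines into blocks and keeping the blocks that contain the column separator. A simplified pure-Python sketch of that idea (function name hypothetical, not the library's internal `_extract_tables`):

```python
def find_table_blocks(markdown: str, column_separator: str = "|") -> list[list[str]]:
    """Group non-blank lines into blocks and keep blocks that look like tables."""
    blocks: list[list[str]] = []
    current: list[str] = []
    # Trailing "" sentinel flushes the final block.
    for line in markdown.splitlines() + [""]:
        if line.strip():
            current.append(line)
        elif current:
            if any(column_separator in cell_line for cell_line in current):
                blocks.append(current)
            current = []
    return blocks

doc = "intro text\n\n| a | b |\n| - | - |\n| 1 | 2 |\n\nmore text\n"
```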

scan_tables_from_file(source, schema=None)

Scan a markdown file for all tables.

Parameters:

Name Type Description Default
source Union[str, Path, TextIO]

File path (str/Path) or file-like object.

required
schema MultiTableParsingSchema | None

Optional schema.

None
Source code in src/md_spreadsheet_parser/loader.py
def scan_tables_from_file(
    source: Union[str, Path, TextIO], schema: MultiTableParsingSchema | None = None
) -> list[Table]:
    """
    Scan a markdown file for all tables.

    Args:
        source: File path (str/Path) or file-like object.
        schema: Optional schema.
    """
    content = _read_content(source)
    return scan_tables(content, schema)

scan_tables_iter(source, schema=None)

Stream tables from a source (file path, file object, or iterable) one by one. This allows processing files larger than memory, provided that individual tables fit in memory.

Parameters:

Name Type Description Default
source Union[str, Path, TextIO, Iterable[str]]

File path, open file object, or iterable of strings.

required
schema MultiTableParsingSchema | None

Parsing configuration.

None

Yields:

Type Description
Table

Table objects found in the stream.

Source code in src/md_spreadsheet_parser/loader.py
def scan_tables_iter(
    source: Union[str, Path, TextIO, Iterable[str]],
    schema: MultiTableParsingSchema | None = None,
) -> Iterator[Table]:
    """
    Stream tables from a source (file path, file object, or iterable) one by one.
    This allows processing files larger than memory, provided that individual tables fit in memory.

    Args:
        source: File path, open file object, or iterable of strings.
        schema: Parsing configuration.

    Yields:
        Table objects found in the stream.
    """
    if schema is None:
        schema = MultiTableParsingSchema()

    header_prefix = None
    if schema.table_header_level is not None:
        header_prefix = "#" * schema.table_header_level + " "

    current_lines: list[str] = []
    current_name: str | None = None
    # We track line number manually for metadata
    current_line_idx = 0
    # Start of the current block
    block_start_line = 0

    def parse_and_yield(
        lines: list[str], name: str | None, start_offset: int
    ) -> Iterator[Table]:
        if not lines:
            return

        # Check if block looks like a table (has separator)
        block_text = "".join(lines)

        if schema.column_separator not in block_text:
            return

        # Extraction logic mirrors process_table_block, reusing parse_table.

        # Split description lines from table lines, stripping trailing newlines first.
        stripped_lines = [line_val.rstrip("\n") for line_val in lines]

        table_start_idx = -1
        for idx, line in enumerate(stripped_lines):
            if schema.column_separator in line:
                table_start_idx = idx
                break

        if table_start_idx != -1:
            desc_lines = stripped_lines[:table_start_idx]
            table_lines = stripped_lines[table_start_idx:]

            table_text = "\n".join(table_lines)
            table = parse_table(table_text, schema)

            if table.rows or table.headers:
                description = None
                if schema.capture_description:
                    desc_text = "\n".join(d.strip() for d in desc_lines if d.strip())
                    if desc_text:
                        description = desc_text

                table = replace(
                    table,
                    name=name,
                    description=description,
                    start_line=start_offset + table_start_idx,
                    end_line=start_offset + len(lines),
                )
                yield table

    for line in _iter_lines(source):
        # normalize: file iter yields line with \n
        stripped_line = line.strip()

        is_header = header_prefix and stripped_line.startswith(header_prefix)

        if is_header:
            # New section starts. Yield previous buffer if any.
            yield from parse_and_yield(current_lines, current_name, block_start_line)

            assert header_prefix is not None
            current_name = stripped_line[len(header_prefix) :].strip()
            current_lines = []
            block_start_line = current_line_idx

        elif stripped_line == "":
            # Blank line.
            yield from parse_and_yield(current_lines, current_name, block_start_line)
            current_lines = []
            # block_start_line for NEXT block will be current_line_idx + 1
            block_start_line = current_line_idx + 1

        else:
            current_lines.append(line)

        current_line_idx += 1

    # End of stream
    yield from parse_and_yield(current_lines, current_name, block_start_line)
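The streaming behavior above comes down to buffering one blank-line-delimited block at a time and yielding it before starting the next, so only the current block is ever held in memory. A stripped-down sketch of that buffering pattern (names hypothetical):

```python
from typing import Iterable, Iterator

def iter_blocks(lines: Iterable[str]) -> Iterator[list[str]]:
    """Yield blank-line-delimited blocks, holding only one block in memory."""
    buffer: list[str] = []
    for raw in lines:
        if raw.strip():
            buffer.append(raw.rstrip("\n"))
        elif buffer:
            yield buffer
            buffer = []
    if buffer:  # flush the final block at end of stream
        yield buffer

blocks = list(iter_blocks(["| a |\n", "| - |\n", "\n", "text\n"]))
```

Because the source can be any iterable of strings, the same generator works over an open file, a socket, or a list in tests.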

md_spreadsheet_parser.schemas

ConversionSchema dataclass

Configuration for converting string values to Python types.

Attributes:

Name Type Description
boolean_pairs tuple[tuple[str, str], ...]

Pairs of strings representing (True, False). Case-insensitive. Example: (("yes", "no"), ("on", "off")).

custom_converters dict[type, Callable[[str], Any]]

Dictionary mapping ANY Python type to a conversion function str -> Any. You can specify: - Built-in types: int, float, bool (to override default behavior) - Standard library types: Decimal, datetime, date, ZoneInfo - Custom classes: MyClass, Product

field_converters dict[str, Callable[[str], Any]]

Dictionary mapping field names (str) to conversion functions. Takes precedence over custom_converters.

Source code in src/md_spreadsheet_parser/schemas.py
@dataclass(frozen=True)
class ConversionSchema:
    """
    Configuration for converting string values to Python types.

    Attributes:
        boolean_pairs: Pairs of strings representing (True, False). Case-insensitive.
                       Example: `(("yes", "no"), ("on", "off"))`.
        custom_converters: Dictionary mapping ANY Python type to a conversion function `str -> Any`.
                           You can specify:
                           - Built-in types: `int`, `float`, `bool` (to override default behavior)
                           - Standard library types: `Decimal`, `datetime`, `date`, `ZoneInfo`
                           - Custom classes: `MyClass`, `Product`
        field_converters: Dictionary mapping field names (str) to conversion functions.
                          Takes precedence over `custom_converters`.
    """

    boolean_pairs: tuple[tuple[str, str], ...] = (
        ("true", "false"),
        ("yes", "no"),
        ("1", "0"),
        ("on", "off"),
    )
    custom_converters: dict[type, Callable[[str], Any]] = field(default_factory=dict)
    field_converters: dict[str, Callable[[str], Any]] = field(default_factory=dict)

ExcelParsingSchema dataclass

Configuration for parsing Excel-exported data (TSV/CSV or openpyxl).

Attributes:

Name Type Description
header_rows int

Number of header rows (1 or 2). If 2, headers are flattened to "Parent - Child" format.

fill_merged_headers bool

Whether to forward-fill empty header cells (for merged cells in Excel exports).

delimiter str

Column separator for TSV/CSV parsing. Default is tab.

header_separator str

Separator used when flattening 2-row headers.

Source code in src/md_spreadsheet_parser/schemas.py
@dataclass(frozen=True)
class ExcelParsingSchema:
    """
    Configuration for parsing Excel-exported data (TSV/CSV or openpyxl).

    Attributes:
        header_rows: Number of header rows (1 or 2).
                     If 2, headers are flattened to "Parent - Child" format.
        fill_merged_headers: Whether to forward-fill empty header cells
                             (for merged cells in Excel exports).
        delimiter: Column separator for TSV/CSV parsing. Default is tab.
        header_separator: Separator used when flattening 2-row headers.
    """

    header_rows: int = 1
    fill_merged_headers: bool = True
    delimiter: str = "\t"
    header_separator: str = " - "

    def __post_init__(self):
        if self.header_rows not in (1, 2):
            raise ValueError("header_rows must be 1 or 2")
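With `header_rows=2` and `fill_merged_headers=True`, empty parent cells (left behind by merged cells in Excel exports) are forward-filled before the two rows are joined with `header_separator`. A sketch of that flattening step under those assumptions (function name hypothetical):

```python
def flatten_headers(parent_row, child_row, sep=" - "):
    """Forward-fill merged parent cells, then join parent and child."""
    filled, last = [], ""
    for cell in parent_row:
        last = cell if cell else last  # carry the last non-empty parent forward
        filled.append(last)
    return [f"{p}{sep}{c}" if p and c else (p or c)
            for p, c in zip(filled, child_row)]

headers = flatten_headers(["Q1", "", "Q2", ""], ["Rev", "Cost", "Rev", "Cost"])
```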

MultiTableParsingSchema dataclass

Bases: ParsingSchema

Configuration for parsing multiple tables (workbook mode). Inherits from ParsingSchema.

Attributes:

Name Type Description
root_marker str

The marker indicating the start of the data section. Defaults to "# Tables".

sheet_header_level int

The markdown header level for sheets. Defaults to 2 (e.g. ## Sheet).

table_header_level int | None

The markdown header level for tables. If None, table names are not extracted. Defaults to None.

capture_description bool

Whether to capture text between the table header and the table as a description. Defaults to False.

Source code in src/md_spreadsheet_parser/schemas.py
@dataclass(frozen=True)
class MultiTableParsingSchema(ParsingSchema):
    """
    Configuration for parsing multiple tables (workbook mode).
    Inherits from ParsingSchema.

    Attributes:
        root_marker (str): The marker indicating the start of the data section. Defaults to "# Tables".
        sheet_header_level (int): The markdown header level for sheets. Defaults to 2 (e.g. `## Sheet`).
        table_header_level (int | None): The markdown header level for tables. If None, table names are not extracted. Defaults to None.
        capture_description (bool): Whether to capture text between the table header and the table as a description. Defaults to False.
    """

    root_marker: str = "# Tables"
    sheet_header_level: int = 2
    table_header_level: int | None = 3
    capture_description: bool = True

    def __post_init__(self):
        if self.capture_description and self.table_header_level is None:
            raise ValueError(
                "capture_description=True requires table_header_level to be set"
            )

ParsingSchema dataclass

Configuration for parsing markdown tables. Designed to be immutable and passed to pure functions.

Attributes:

Name Type Description
column_separator str

Character used to separate columns. Defaults to "|".

header_separator_char str

Character used in the separator row. Defaults to "-".

require_outer_pipes bool

Whether tables must have outer pipes (e.g. | col |). Defaults to True.

strip_whitespace bool

Whether to strip whitespace from cell values. Defaults to True.

Source code in src/md_spreadsheet_parser/schemas.py
@dataclass(frozen=True)
class ParsingSchema:
    """
    Configuration for parsing markdown tables.
    Designed to be immutable and passed to pure functions.

    Attributes:
        column_separator (str): Character used to separate columns. Defaults to "|".
        header_separator_char (str): Character used in the separator row. Defaults to "-".
        require_outer_pipes (bool): Whether tables must have outer pipes (e.g. `| col |`). Defaults to True.
        strip_whitespace (bool): Whether to strip whitespace from cell values. Defaults to True.
    """

    column_separator: str = "|"
    header_separator_char: str = "-"
    require_outer_pipes: bool = True
    strip_whitespace: bool = True
    convert_br_to_newline: bool = True
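The `require_outer_pipes` and `strip_whitespace` options describe how a single row line is tokenized: outer pipes are dropped if present, then each cell is optionally stripped. A sketch of that cell-splitting step (the `split_row` helper is illustrative, not the library's internal parser):

```python
def split_row(line: str, sep: str = "|", strip: bool = True) -> list[str]:
    """Split a table row into cells, dropping the outer pipes if present."""
    line = line.strip()
    if line.startswith(sep):
        line = line[len(sep):]
    if line.endswith(sep):
        line = line[:-len(sep)]
    cells = line.split(sep)
    return [c.strip() for c in cells] if strip else cells

cells = split_row("| Name | Qty |")
```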

md_spreadsheet_parser.models

Sheet dataclass

Represents a single sheet containing tables.

Attributes:

Name Type Description
name str

Name of the sheet.

tables list[Table]

List of tables contained in this sheet.

metadata dict[str, Any] | None

Arbitrary metadata (e.g. layout). Defaults to None.

Source code in src/md_spreadsheet_parser/models.py
@dataclass(frozen=True)
class Sheet:
    """
    Represents a single sheet containing tables.

    Attributes:
        name (str): Name of the sheet.
        tables (list[Table]): List of tables contained in this sheet.
        metadata (dict[str, Any] | None): Arbitrary metadata (e.g. layout). Defaults to None.
    """

    name: str
    tables: list[Table]
    metadata: dict[str, Any] | None = None

    def __post_init__(self):
        if self.metadata is None:
            # Hack to allow default value for mutable type in frozen dataclass
            object.__setattr__(self, "metadata", {})

    @property
    def json(self) -> SheetJSON:
        """
        Returns a JSON-compatible dictionary representation of the sheet.

        Returns:
            SheetJSON: A dictionary containing the sheet data.
        """
        return {
            "name": self.name,
            "tables": [t.json for t in self.tables],
            "metadata": self.metadata if self.metadata is not None else {},
        }

    def get_table(self, name: str) -> Table | None:
        """
        Retrieve a table by its name.

        Args:
            name (str): The name of the table to retrieve.

        Returns:
            Table | None: The table object if found, otherwise None.
        """
        for table in self.tables:
            if table.name == name:
                return table
        return None

    def to_markdown(self, schema: ParsingSchema = DEFAULT_SCHEMA) -> str:
        """
        Generates a Markdown string representation of the sheet.

        Args:
            schema (ParsingSchema, optional): Configuration for formatting.

        Returns:
            str: The Markdown string.
        """
        return generate_sheet_markdown(self, schema)

json property

Returns a JSON-compatible dictionary representation of the sheet.

Returns:

Name Type Description
SheetJSON SheetJSON

A dictionary containing the sheet data.

get_table(name)

Retrieve a table by its name.

Parameters:

Name Type Description Default
name str

The name of the table to retrieve.

required

Returns:

Type Description
Table | None

Table | None: The table object if found, otherwise None.

Source code in src/md_spreadsheet_parser/models.py
def get_table(self, name: str) -> Table | None:
    """
    Retrieve a table by its name.

    Args:
        name (str): The name of the table to retrieve.

    Returns:
        Table | None: The table object if found, otherwise None.
    """
    for table in self.tables:
        if table.name == name:
            return table
    return None

to_markdown(schema=DEFAULT_SCHEMA)

Generates a Markdown string representation of the sheet.

Parameters:

Name Type Description Default
schema ParsingSchema

Configuration for formatting.

DEFAULT_SCHEMA

Returns:

Name Type Description
str str

The Markdown string.

Source code in src/md_spreadsheet_parser/models.py
def to_markdown(self, schema: ParsingSchema = DEFAULT_SCHEMA) -> str:
    """
    Generates a Markdown string representation of the sheet.

    Args:
        schema (ParsingSchema, optional): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    return generate_sheet_markdown(self, schema)

SheetJSON

Bases: TypedDict

JSON-compatible dictionary representation of a Sheet.

Source code in src/md_spreadsheet_parser/models.py
class SheetJSON(TypedDict):
    """
    JSON-compatible dictionary representation of a Sheet.
    """

    name: str
    tables: list[TableJSON]
    metadata: dict[str, Any]

Table dataclass

Represents a parsed table with optional metadata.

Attributes:

Name Type Description
headers list[str] | None

List of column headers, or None if the table has no headers.

rows list[list[str]]

List of data rows.

alignments list[AlignmentType] | None

List of column alignments ('left', 'center', 'right'). Defaults to None.

name str | None

Name of the table (e.g. from a header). Defaults to None.

description str | None

Description of the table. Defaults to None.

metadata dict[str, Any] | None

Arbitrary metadata. Defaults to None.

Source code in src/md_spreadsheet_parser/models.py
@dataclass(frozen=True)
class Table:
    """
    Represents a parsed table with optional metadata.

    Attributes:
        headers (list[str] | None): List of column headers, or None if the table has no headers.
        rows (list[list[str]]): List of data rows.
        alignments (list[AlignmentType] | None): List of column alignments ('left', 'center', 'right'). Defaults to None.
        name (str | None): Name of the table (e.g. from a header). Defaults to None.
        description (str | None): Description of the table. Defaults to None.
        metadata (dict[str, Any] | None): Arbitrary metadata. Defaults to None.
    """

    headers: list[str] | None
    rows: list[list[str]]
    alignments: list[AlignmentType] | None = None
    name: str | None = None
    description: str | None = None
    metadata: dict[str, Any] | None = None
    start_line: int | None = None
    end_line: int | None = None

    def __post_init__(self):
        if self.metadata is None:
            # Hack to allow default value for mutable type in frozen dataclass
            object.__setattr__(self, "metadata", {})

    @property
    def json(self) -> TableJSON:
        """
        Returns a JSON-compatible dictionary representation of the table.

        Returns:
            TableJSON: A dictionary containing the table data.
        """
        return {
            "name": self.name,
            "description": self.description,
            "headers": self.headers,
            "rows": self.rows,
            "metadata": self.metadata if self.metadata is not None else {},
            "start_line": self.start_line,
            "end_line": self.end_line,
            "alignments": self.alignments,
        }

    def to_models(
        self,
        schema_cls: type[T],
        conversion_schema: ConversionSchema = DEFAULT_CONVERSION_SCHEMA,
    ) -> list[T]:
        """
        Converts the table rows into a list of dataclass instances, performing validation and type conversion.

        Args:
            schema_cls (type[T]): The dataclass type to validate against.
            conversion_schema (ConversionSchema, optional): Configuration for type conversion.

        Returns:
            list[T]: A list of validated dataclass instances.

        Raises:
            ValueError: If schema_cls is not a dataclass.
            TableValidationError: If validation fails for any row or if the table has no headers.
        """
        return validate_table(self, schema_cls, conversion_schema)

    def to_markdown(self, schema: ParsingSchema = DEFAULT_SCHEMA) -> str:
        """
        Generates a Markdown string representation of the table.

        Args:
            schema (ParsingSchema, optional): Configuration for formatting.

        Returns:
            str: The Markdown string.
        """
        return generate_table_markdown(self, schema)

    def update_cell(self, row_idx: int, col_idx: int, value: str) -> "Table":
        """
        Return a new Table with the specified cell updated.
        """
        # Handle header update
        if row_idx == -1:
            if self.headers is None:
                # Determine width from rows if possible, or start fresh
                width = len(self.rows[0]) if self.rows else (col_idx + 1)
                new_headers = [""] * width
                # Ensure width enough
                if col_idx >= len(new_headers):
                    new_headers.extend([""] * (col_idx - len(new_headers) + 1))
            else:
                new_headers = list(self.headers)
                if col_idx >= len(new_headers):
                    new_headers.extend([""] * (col_idx - len(new_headers) + 1))

            # Update alignments if headers grew
            new_alignments = list(self.alignments) if self.alignments else []
            if len(new_headers) > len(new_alignments):
                # If alignments is None it stays None; if it is set,
                # expand it with "default" entries up to the new width.
                if self.alignments is not None:
                    # A typed list keeps the invariant list[AlignmentType] for strict checkers.
                    extension: list[AlignmentType] = ["default"] * (
                        len(new_headers) - len(new_alignments)
                    )
                    new_alignments.extend(extension)

            final_alignments = new_alignments if self.alignments is not None else None

            new_headers[col_idx] = value

            return replace(self, headers=new_headers, alignments=final_alignments)

        # Handle Body update
        # 1. Ensure row exists
        new_rows = [list(r) for r in self.rows]

        # Grow rows if needed
        if row_idx >= len(new_rows):
            # Calculate width
            width = (
                len(self.headers)
                if self.headers
                else (len(new_rows[0]) if new_rows else 0)
            )
            if width == 0:
                width = col_idx + 1  # At least cover the new cell

            rows_to_add = row_idx - len(new_rows) + 1
            for _ in range(rows_to_add):
                new_rows.append([""] * width)

        # If columns expanded due to row update, we might need to expand alignments too
        current_width = len(new_rows[0]) if new_rows else 0
        if col_idx >= current_width:
            # This means we are expanding columns
            if self.alignments is not None:
                width_needed = col_idx + 1
                current_align_len = len(self.alignments)
                if width_needed > current_align_len:
                    new_alignments = list(self.alignments)
                    extension: list[AlignmentType] = ["default"] * (
                        width_needed - current_align_len
                    )
                    new_alignments.extend(extension)
                    return replace(
                        self,
                        rows=self._update_rows_cell(new_rows, row_idx, col_idx, value),
                        alignments=new_alignments,
                    )

        return replace(
            self, rows=self._update_rows_cell(new_rows, row_idx, col_idx, value)
        )

    def _update_rows_cell(self, new_rows, row_idx, col_idx, value):
        target_row = new_rows[row_idx]
        if col_idx >= len(target_row):
            target_row.extend([""] * (col_idx - len(target_row) + 1))
        target_row[col_idx] = value
        return new_rows

    def delete_row(self, row_idx: int) -> "Table":
        """
        Return a new Table with the row at index removed.
        """
        new_rows = [list(r) for r in self.rows]
        if 0 <= row_idx < len(new_rows):
            new_rows.pop(row_idx)
        return replace(self, rows=new_rows)

    def delete_column(self, col_idx: int) -> "Table":
        """
        Return a new Table with the column at index removed.
        """
        new_headers = list(self.headers) if self.headers else None
        if new_headers and 0 <= col_idx < len(new_headers):
            new_headers.pop(col_idx)

        new_rows = []
        for row in self.rows:
            new_row = list(row)
            if 0 <= col_idx < len(new_row):
                new_row.pop(col_idx)
            new_rows.append(new_row)

        new_alignments = None
        if self.alignments is not None:
            new_alignments = list(self.alignments)
            if 0 <= col_idx < len(new_alignments):
                new_alignments.pop(col_idx)

        return replace(
            self, headers=new_headers, rows=new_rows, alignments=new_alignments
        )

    def clear_column_data(self, col_idx: int) -> "Table":
        """
        Return a new Table with data in the specified column cleared (set to empty string),
        but headers and column structure preserved.
        """
        # Headers remain unchanged

        new_rows = []
        for row in self.rows:
            new_row = list(row)
            if 0 <= col_idx < len(new_row):
                new_row[col_idx] = ""
            new_rows.append(new_row)

        return replace(self, rows=new_rows)

    def insert_row(self, row_idx: int) -> "Table":
        """
        Return a new Table with an empty row inserted at row_idx.
        Subsequent rows are shifted down.
        """
        new_rows = [list(r) for r in self.rows]

        # Determine width
        width = (
            len(self.headers) if self.headers else (len(new_rows[0]) if new_rows else 0)
        )
        if width == 0:
            width = 1  # Default to 1 column if table is empty

        new_row = [""] * width

        if row_idx < 0:
            row_idx = 0
        if row_idx > len(new_rows):
            row_idx = len(new_rows)

        new_rows.insert(row_idx, new_row)
        return replace(self, rows=new_rows)

    def insert_column(self, col_idx: int) -> "Table":
        """
        Return a new Table with an empty column inserted at col_idx.
        Subsequent columns are shifted right.
        """
        new_headers = list(self.headers) if self.headers else None

        if new_headers:
            if col_idx < 0:
                col_idx = 0
            if col_idx > len(new_headers):
                col_idx = len(new_headers)
            new_headers.insert(col_idx, "")

        new_alignments = None
        if self.alignments is not None:
            new_alignments = list(self.alignments)
        # Pad alignments up to col_idx before inserting, if needed.
            if col_idx > len(new_alignments):
                extension: list[AlignmentType] = ["default"] * (
                    col_idx - len(new_alignments)
                )
                new_alignments.extend(extension)
            new_alignments.insert(col_idx, "default")  # Default alignment

        new_rows = []
        for row in self.rows:
            new_row = list(row)
            # list.insert appends when the index exceeds the length,
            # so pad shorter rows up to col_idx first.
            current_len = len(new_row)
            target_idx = col_idx
            if target_idx > current_len:
                # Pad up to target
                new_row.extend([""] * (target_idx - current_len))
                target_idx = len(new_row)  # Append

            new_row.insert(target_idx, "")
            new_rows.append(new_row)

        return replace(
            self, headers=new_headers, rows=new_rows, alignments=new_alignments
        )
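Because `Table` is a frozen dataclass, every editing method above returns a new instance instead of mutating in place. The same copy-on-write pattern can be sketched on a bare list-of-lists grid (this standalone `update_cell` is illustrative, not the `Table` method itself):

```python
def update_cell(rows, row_idx, col_idx, value):
    """Return a new grid with one cell changed, growing it as needed."""
    width = max((len(r) for r in rows), default=0)
    width = max(width, col_idx + 1)
    # Copy every row so the caller's grid is never mutated.
    new_rows = [list(r) + [""] * (width - len(r)) for r in rows]
    while row_idx >= len(new_rows):
        new_rows.append([""] * width)
    new_rows[row_idx][col_idx] = value
    return new_rows

grid = [["a"]]
grid2 = update_cell(grid, 1, 1, "x")
```

Keeping edits pure makes undo/redo trivial: earlier grids remain valid snapshots, which is presumably why the `Table` methods follow the same `dataclasses.replace` pattern.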

json property

Returns a JSON-compatible dictionary representation of the table.

Returns:

Name Type Description
TableJSON TableJSON

A dictionary containing the table data.

clear_column_data(col_idx)

Return a new Table with data in the specified column cleared (set to empty string), but headers and column structure preserved.

Source code in src/md_spreadsheet_parser/models.py
def clear_column_data(self, col_idx: int) -> "Table":
    """
    Return a new Table with data in the specified column cleared (set to empty string),
    but headers and column structure preserved.
    """
    # Headers remain unchanged

    new_rows = []
    for row in self.rows:
        new_row = list(row)
        if 0 <= col_idx < len(new_row):
            new_row[col_idx] = ""
        new_rows.append(new_row)

    return replace(self, rows=new_rows)

delete_column(col_idx)

Return a new Table with the column at index removed.

Source code in src/md_spreadsheet_parser/models.py
def delete_column(self, col_idx: int) -> "Table":
    """
    Return a new Table with the column at index removed.
    """
    new_headers = list(self.headers) if self.headers else None
    if new_headers and 0 <= col_idx < len(new_headers):
        new_headers.pop(col_idx)

    new_rows = []
    for row in self.rows:
        new_row = list(row)
        if 0 <= col_idx < len(new_row):
            new_row.pop(col_idx)
        new_rows.append(new_row)

    new_alignments = None
    if self.alignments is not None:
        new_alignments = list(self.alignments)
        if 0 <= col_idx < len(new_alignments):
            new_alignments.pop(col_idx)

    return replace(
        self, headers=new_headers, rows=new_rows, alignments=new_alignments
    )

delete_row(row_idx)

Return a new Table with the row at index removed.

Source code in src/md_spreadsheet_parser/models.py
def delete_row(self, row_idx: int) -> "Table":
    """
    Return a new Table with the row at index removed.
    """
    new_rows = [list(r) for r in self.rows]
    if 0 <= row_idx < len(new_rows):
        new_rows.pop(row_idx)
    return replace(self, rows=new_rows)

insert_column(col_idx)

Return a new Table with an empty column inserted at col_idx. Subsequent columns are shifted right.

Source code in src/md_spreadsheet_parser/models.py, lines 301-345
def insert_column(self, col_idx: int) -> "Table":
    """
    Return a new Table with an empty column inserted at col_idx.
    Subsequent columns are shifted right.
    """
    new_headers = list(self.headers) if self.headers else None

    if new_headers:
        if col_idx < 0:
            col_idx = 0
        if col_idx > len(new_headers):
            col_idx = len(new_headers)
        new_headers.insert(col_idx, "")

    new_alignments = None
    if self.alignments is not None:
        new_alignments = list(self.alignments)
        # Pad alignments with "default" entries if the insertion index is past the end
        if col_idx > len(new_alignments):
            extension: list[AlignmentType] = ["default"] * (
                col_idx - len(new_alignments)
            )
            new_alignments.extend(extension)
        new_alignments.insert(col_idx, "default")  # Default alignment

    new_rows = []
    for row in self.rows:
        new_row = list(row)
        # list.insert would simply append when the index exceeds the row length,
        # so pad short rows with empty cells first to keep the column position aligned.
        current_len = len(new_row)
        target_idx = col_idx
        if target_idx > current_len:
            # Pad up to target
            new_row.extend([""] * (target_idx - current_len))
            target_idx = len(new_row)  # Append

        new_row.insert(target_idx, "")
        new_rows.append(new_row)

    return replace(
        self, headers=new_headers, rows=new_rows, alignments=new_alignments
    )

insert_row(row_idx)

Return a new Table with an empty row inserted at row_idx. Subsequent rows are shifted down.

Source code in src/md_spreadsheet_parser/models.py, lines 277-299
def insert_row(self, row_idx: int) -> "Table":
    """
    Return a new Table with an empty row inserted at row_idx.
    Subsequent rows are shifted down.
    """
    new_rows = [list(r) for r in self.rows]

    # Determine width
    width = (
        len(self.headers) if self.headers else (len(new_rows[0]) if new_rows else 0)
    )
    if width == 0:
        width = 1  # Default to 1 column if table is empty

    new_row = [""] * width

    if row_idx < 0:
        row_idx = 0
    if row_idx > len(new_rows):
        row_idx = len(new_rows)

    new_rows.insert(row_idx, new_row)
    return replace(self, rows=new_rows)

to_markdown(schema=DEFAULT_SCHEMA)

Generates a Markdown string representation of the table.

Parameters:

Name Type Description Default
schema ParsingSchema

Configuration for formatting.

DEFAULT_SCHEMA

Returns:

Name Type Description
str str

The Markdown string.

Source code in src/md_spreadsheet_parser/models.py, lines 125-135
def to_markdown(self, schema: ParsingSchema = DEFAULT_SCHEMA) -> str:
    """
    Generates a Markdown string representation of the table.

    Args:
        schema (ParsingSchema, optional): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    return generate_table_markdown(self, schema)

to_models(schema_cls, conversion_schema=DEFAULT_CONVERSION_SCHEMA)

Converts the table rows into a list of dataclass instances, performing validation and type conversion.

Parameters:

Name Type Description Default
schema_cls type[T]

The dataclass type to validate against.

required
conversion_schema ConversionSchema

Configuration for type conversion.

DEFAULT_CONVERSION_SCHEMA

Returns:

Type Description
list[T]

list[T]: A list of validated dataclass instances.

Raises:

Type Description
ValueError

If schema_cls is not a dataclass.

TableValidationError

If validation fails for any row or if the table has no headers.

Source code in src/md_spreadsheet_parser/models.py, lines 104-123
def to_models(
    self,
    schema_cls: type[T],
    conversion_schema: ConversionSchema = DEFAULT_CONVERSION_SCHEMA,
) -> list[T]:
    """
    Converts the table rows into a list of dataclass instances, performing validation and type conversion.

    Args:
        schema_cls (type[T]): The dataclass type to validate against.
        conversion_schema (ConversionSchema, optional): Configuration for type conversion.

    Returns:
        list[T]: A list of validated dataclass instances.

    Raises:
        ValueError: If schema_cls is not a dataclass.
        TableValidationError: If validation fails for any row or if the table has no headers.
    """
    return validate_table(self, schema_cls, conversion_schema)

update_cell(row_idx, col_idx, value)

Return a new Table with the specified cell updated.

Source code in src/md_spreadsheet_parser/models.py, lines 137-218
def update_cell(self, row_idx: int, col_idx: int, value: str) -> "Table":
    """
    Return a new Table with the specified cell updated.
    """
    # Handle header update
    if row_idx == -1:
        if self.headers is None:
            # Determine width from rows if possible, or start fresh
            width = len(self.rows[0]) if self.rows else (col_idx + 1)
            new_headers = [""] * width
            # Grow the header row if col_idx is past the current end
            if col_idx >= len(new_headers):
                new_headers.extend([""] * (col_idx - len(new_headers) + 1))
        else:
            new_headers = list(self.headers)
            if col_idx >= len(new_headers):
                new_headers.extend([""] * (col_idx - len(new_headers) + 1))

        # Update alignments if headers grew
        new_alignments = list(self.alignments) if self.alignments else []
        if len(new_headers) > len(new_alignments):
            # Expand alignments to match the new header width. If alignments
            # was None it stays None; only an existing list is extended.
            if self.alignments is not None:
                # A typed list keeps the extension compatible with list[AlignmentType]
                extension: list[AlignmentType] = ["default"] * (
                    len(new_headers) - len(new_alignments)
                )
                new_alignments.extend(extension)

        final_alignments = new_alignments if self.alignments is not None else None

        new_headers[col_idx] = value

        return replace(self, headers=new_headers, alignments=final_alignments)

    # Handle Body update
    # 1. Ensure row exists
    new_rows = [list(r) for r in self.rows]

    # Grow rows if needed
    if row_idx >= len(new_rows):
        # Calculate width
        width = (
            len(self.headers)
            if self.headers
            else (len(new_rows[0]) if new_rows else 0)
        )
        if width == 0:
            width = col_idx + 1  # At least cover the new cell

        rows_to_add = row_idx - len(new_rows) + 1
        for _ in range(rows_to_add):
            new_rows.append([""] * width)

    # If columns expanded due to row update, we might need to expand alignments too
    current_width = len(new_rows[0]) if new_rows else 0
    if col_idx >= current_width:
        # This means we are expanding columns
        if self.alignments is not None:
            width_needed = col_idx + 1
            current_align_len = len(self.alignments)
            if width_needed > current_align_len:
                new_alignments = list(self.alignments)
                extension: list[AlignmentType] = ["default"] * (
                    width_needed - current_align_len
                )
                new_alignments.extend(extension)
                return replace(
                    self,
                    rows=self._update_rows_cell(new_rows, row_idx, col_idx, value),
                    alignments=new_alignments,
                )

    return replace(
        self, rows=self._update_rows_cell(new_rows, row_idx, col_idx, value)
    )

TableJSON

Bases: TypedDict

JSON-compatible dictionary representation of a Table.

Source code in src/md_spreadsheet_parser/models.py, lines 23-35
class TableJSON(TypedDict):
    """
    JSON-compatible dictionary representation of a Table.
    """

    name: str | None
    description: str | None
    headers: list[str] | None
    rows: list[list[str]]
    metadata: dict[str, Any]
    start_line: int | None
    end_line: int | None
    alignments: list[AlignmentType] | None

Workbook dataclass

Represents a collection of sheets (multi-table output).

Attributes:

Name Type Description
sheets list[Sheet]

List of sheets in the workbook.

metadata dict[str, Any] | None

Arbitrary metadata. Defaults to None.

Source code in src/md_spreadsheet_parser/models.py, lines 410-491
@dataclass(frozen=True)
class Workbook:
    """
    Represents a collection of sheets (multi-table output).

    Attributes:
        sheets (list[Sheet]): List of sheets in the workbook.
        metadata (dict[str, Any] | None): Arbitrary metadata. Defaults to None.
    """

    sheets: list[Sheet]
    metadata: dict[str, Any] | None = None

    def __post_init__(self):
        if self.metadata is None:
            # Hack to allow default value for mutable type in frozen dataclass
            object.__setattr__(self, "metadata", {})

    @property
    def json(self) -> WorkbookJSON:
        """
        Returns a JSON-compatible dictionary representation of the workbook.

        Returns:
            WorkbookJSON: A dictionary containing the workbook data.
        """
        return {
            "sheets": [s.json for s in self.sheets],
            "metadata": self.metadata if self.metadata is not None else {},
        }

    def get_sheet(self, name: str) -> Sheet | None:
        """
        Retrieve a sheet by its name.

        Args:
            name (str): The name of the sheet to retrieve.

        Returns:
            Sheet | None: The sheet object if found, otherwise None.
        """
        for sheet in self.sheets:
            if sheet.name == name:
                return sheet
        return None

    def to_markdown(self, schema: MultiTableParsingSchema) -> str:
        """
        Generates a Markdown string representation of the workbook.

        Args:
            schema (MultiTableParsingSchema): Configuration for formatting.

        Returns:
            str: The Markdown string.
        """
        return generate_workbook_markdown(self, schema)

    def add_sheet(self, name: str) -> "Workbook":
        """
        Return a new Workbook with a new sheet added.
        """
        # Create new sheet with one empty table as default
        new_table = Table(headers=["A", "B", "C"], rows=[["", "", ""]])
        new_sheet = Sheet(name=name, tables=[new_table])

        new_sheets = list(self.sheets)
        new_sheets.append(new_sheet)

        return replace(self, sheets=new_sheets)

    def delete_sheet(self, index: int) -> "Workbook":
        """
        Return a new Workbook with the sheet at index removed.
        """
        if index < 0 or index >= len(self.sheets):
            raise IndexError("Sheet index out of range")

        new_sheets = list(self.sheets)
        new_sheets.pop(index)

        return replace(self, sheets=new_sheets)

json property

Returns a JSON-compatible dictionary representation of the workbook.

Returns:

Name Type Description
WorkbookJSON WorkbookJSON

A dictionary containing the workbook data.

add_sheet(name)

Return a new Workbook with a new sheet added.

Source code in src/md_spreadsheet_parser/models.py, lines 468-479
def add_sheet(self, name: str) -> "Workbook":
    """
    Return a new Workbook with a new sheet added.
    """
    # Create new sheet with one empty table as default
    new_table = Table(headers=["A", "B", "C"], rows=[["", "", ""]])
    new_sheet = Sheet(name=name, tables=[new_table])

    new_sheets = list(self.sheets)
    new_sheets.append(new_sheet)

    return replace(self, sheets=new_sheets)

delete_sheet(index)

Return a new Workbook with the sheet at index removed.

Source code in src/md_spreadsheet_parser/models.py, lines 481-491
def delete_sheet(self, index: int) -> "Workbook":
    """
    Return a new Workbook with the sheet at index removed.
    """
    if index < 0 or index >= len(self.sheets):
        raise IndexError("Sheet index out of range")

    new_sheets = list(self.sheets)
    new_sheets.pop(index)

    return replace(self, sheets=new_sheets)

get_sheet(name)

Retrieve a sheet by its name.

Parameters:

Name Type Description Default
name str

The name of the sheet to retrieve.

required

Returns:

Type Description
Sheet | None

Sheet | None: The sheet object if found, otherwise None.

Source code in src/md_spreadsheet_parser/models.py, lines 441-454
def get_sheet(self, name: str) -> Sheet | None:
    """
    Retrieve a sheet by its name.

    Args:
        name (str): The name of the sheet to retrieve.

    Returns:
        Sheet | None: The sheet object if found, otherwise None.
    """
    for sheet in self.sheets:
        if sheet.name == name:
            return sheet
    return None

to_markdown(schema)

Generates a Markdown string representation of the workbook.

Parameters:

Name Type Description Default
schema MultiTableParsingSchema

Configuration for formatting.

required

Returns:

Name Type Description
str str

The Markdown string.

Source code in src/md_spreadsheet_parser/models.py, lines 456-466
def to_markdown(self, schema: MultiTableParsingSchema) -> str:
    """
    Generates a Markdown string representation of the workbook.

    Args:
        schema (MultiTableParsingSchema): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    return generate_workbook_markdown(self, schema)

WorkbookJSON

Bases: TypedDict

JSON-compatible dictionary representation of a Workbook.

Source code in src/md_spreadsheet_parser/models.py
48
49
50
51
52
53
54
class WorkbookJSON(TypedDict):
    """
    JSON-compatible dictionary representation of a Workbook.
    """

    sheets: list[SheetJSON]
    metadata: dict[str, Any]

md_spreadsheet_parser.validation

TableValidationError

Bases: Exception

Exception raised when table validation fails. Contains a list of errors found during validation.

Source code in src/md_spreadsheet_parser/validation.py, lines 14-24
class TableValidationError(Exception):
    """
    Exception raised when table validation fails.
    Contains a list of errors found during validation.
    """

    def __init__(self, errors: list[str]):
        self.errors = errors
        super().__init__(
            f"Validation failed with {len(errors)} errors:\n" + "\n".join(errors)
        )

validate_table(table, schema_cls, conversion_schema=DEFAULT_CONVERSION_SCHEMA)

Validates a Table object against a dataclass OR Pydantic schema.

Parameters:

Name Type Description Default
table Table

The Table object to validate.

required
schema_cls Type[T]

The dataclass or Pydantic model type to validate against.

required
conversion_schema ConversionSchema

Configuration for type conversion.

DEFAULT_CONVERSION_SCHEMA

Returns:

Type Description
list[T]

list[T]: A list of validated instances.

Raises:

Type Description
ValueError

If schema_cls is not a valid schema.

TableValidationError

If validation fails.

Source code in src/md_spreadsheet_parser/validation.py, lines 297-348
def validate_table(
    table: "Table",
    schema_cls: Type[T],
    conversion_schema: ConversionSchema = DEFAULT_CONVERSION_SCHEMA,
) -> list[T]:
    """
    Validates a Table object against a dataclass OR Pydantic schema.

    Args:
        table: The Table object to validate.
        schema_cls: The dataclass or Pydantic model type to validate against.
        conversion_schema: Configuration for type conversion.

    Returns:
        list[T]: A list of validated instances.

    Raises:
        ValueError: If schema_cls is not a valid schema.
        TableValidationError: If validation fails.
    """
    # Check for Pydantic Model
    if HAS_PYDANTIC and BaseModel and issubclass(schema_cls, BaseModel):
        if not table.headers:
            raise TableValidationError(["Table has no headers"])
        # Import the adapter lazily so pydantic is only required when actually used
        from .pydantic_adapter import validate_table_pydantic

        return validate_table_pydantic(table, schema_cls, conversion_schema)  # type: ignore

    # Check for Dataclass
    if is_dataclass(schema_cls):
        if not table.headers:
            raise TableValidationError(["Table has no headers"])
        return _validate_table_dataclass(table, schema_cls, conversion_schema)

    # Check for TypedDict
    if is_typeddict(schema_cls):
        if not table.headers:
            raise TableValidationError(["Table has no headers"])
        return _validate_table_typeddict(table, schema_cls, conversion_schema)

    # Check for simple dict
    # We compare schema_cls against dict type
    if schema_cls is dict:
        if not table.headers:
            raise TableValidationError(["Table has no headers"])
        return _validate_table_dict(table, conversion_schema)  # type: ignore

    raise ValueError(
        f"{schema_cls} must be a dataclass, Pydantic model, TypedDict, or dict"
    )

md_spreadsheet_parser.generator

generate_sheet_markdown(sheet, schema=DEFAULT_SCHEMA)

Generates a Markdown string representation of the sheet.

Parameters:

Name Type Description Default
sheet Sheet

The Sheet object.

required
schema ParsingSchema

Configuration for formatting.

DEFAULT_SCHEMA

Returns:

Name Type Description
str str

The Markdown string.

Source code in src/md_spreadsheet_parser/generator.py, lines 102-133
def generate_sheet_markdown(
    sheet: "Sheet", schema: ParsingSchema = DEFAULT_SCHEMA
) -> str:
    """
    Generates a Markdown string representation of the sheet.

    Args:
        sheet: The Sheet object.
        schema (ParsingSchema, optional): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    lines = []

    if isinstance(schema, MultiTableParsingSchema):
        lines.append(f"{'#' * schema.sheet_header_level} {sheet.name}")
        lines.append("")

    for i, table in enumerate(sheet.tables):
        lines.append(generate_table_markdown(table, schema))
        if i < len(sheet.tables) - 1:
            lines.append("")  # Empty line between tables

    # Append Sheet Metadata if present (at the end)
    if isinstance(schema, MultiTableParsingSchema) and sheet.metadata:
        lines.append("")
        metadata_json = json.dumps(sheet.metadata)
        comment = f"<!-- md-spreadsheet-sheet-metadata: {metadata_json} -->"
        lines.append(comment)

    return "\n".join(lines)

generate_table_markdown(table, schema=DEFAULT_SCHEMA)

Generates a Markdown string representation of the table.

Parameters:

Name Type Description Default
table Table

The Table object.

required
schema ParsingSchema

Configuration for formatting.

DEFAULT_SCHEMA

Returns:

Name Type Description
str str

The Markdown string.

Source code in src/md_spreadsheet_parser/generator.py, lines 10-99
def generate_table_markdown(
    table: "Table", schema: ParsingSchema = DEFAULT_SCHEMA
) -> str:
    """
    Generates a Markdown string representation of the table.

    Args:
        table: The Table object.
        schema (ParsingSchema, optional): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    lines = []

    # Handle metadata (name and description) if MultiTableParsingSchema
    if isinstance(schema, MultiTableParsingSchema):
        if table.name and schema.table_header_level is not None:
            lines.append(f"{'#' * schema.table_header_level} {table.name}")
            lines.append("")  # Empty line after name

        if table.description and schema.capture_description:
            lines.append(table.description)
            lines.append("")  # Empty line after description

    # Build table
    sep = f" {schema.column_separator} "

    def _prepare_cell(cell: str) -> str:
        """Prepare cell for markdown generation."""
        if schema.convert_br_to_newline and "\n" in cell:
            return cell.replace("\n", "<br>")
        return cell

    # Headers
    if table.headers:
        # Add outer pipes if required
        processed_headers = [_prepare_cell(h) for h in table.headers]
        header_row = sep.join(processed_headers)
        if schema.require_outer_pipes:
            header_row = (
                f"{schema.column_separator} {header_row} {schema.column_separator}"
            )
        lines.append(header_row)

        # Separator row
        separator_cells = []
        for i, _ in enumerate(table.headers):
            alignment = "default"
            if table.alignments and i < len(table.alignments):
                # Ensure we handle potentially None values if list has gaps (unlikely by design but safe)
                alignment = table.alignments[i] or "default"

            # Construct separator cell based on alignment
            # Use 3 hyphens as base
            if alignment == "left":
                cell = ":" + schema.header_separator_char * 3
            elif alignment == "right":
                cell = schema.header_separator_char * 3 + ":"
            elif alignment == "center":
                cell = ":" + schema.header_separator_char * 3 + ":"
            else:
                # default
                cell = schema.header_separator_char * 3

            separator_cells.append(cell)

        separator_row = sep.join(separator_cells)
        if schema.require_outer_pipes:
            separator_row = (
                f"{schema.column_separator} {separator_row} {schema.column_separator}"
            )
        lines.append(separator_row)

    # Rows
    for row in table.rows:
        processed_row = [_prepare_cell(cell) for cell in row]
        row_str = sep.join(processed_row)
        if schema.require_outer_pipes:
            row_str = f"{schema.column_separator} {row_str} {schema.column_separator}"
        lines.append(row_str)

    # Append Metadata if present
    if table.metadata and "visual" in table.metadata:
        metadata_json = json.dumps(table.metadata["visual"])
        comment = f"<!-- md-spreadsheet-table-metadata: {metadata_json} -->"
        lines.append("")
        lines.append(comment)

    return "\n".join(lines)

generate_workbook_markdown(workbook, schema)

Generates a Markdown string representation of the workbook.

Parameters:

Name Type Description Default
workbook Workbook

The Workbook object.

required
schema MultiTableParsingSchema

Configuration for formatting.

required

Returns:

Name Type Description
str str

The Markdown string.

Source code in src/md_spreadsheet_parser/generator.py, lines 136-170
def generate_workbook_markdown(
    workbook: "Workbook", schema: MultiTableParsingSchema
) -> str:
    """
    Generates a Markdown string representation of the workbook.

    Args:
        workbook: The Workbook object.
        schema (MultiTableParsingSchema): Configuration for formatting.

    Returns:
        str: The Markdown string.
    """
    lines = []

    if schema.root_marker:
        lines.append(schema.root_marker)
        lines.append("")

    for i, sheet in enumerate(workbook.sheets):
        lines.append(generate_sheet_markdown(sheet, schema))
        if i < len(workbook.sheets) - 1:
            lines.append("")  # Empty line between sheets

    # Append Workbook Metadata if present
    if workbook.metadata:
        # Ensure separation from last sheet
        if lines and lines[-1] != "":
            lines.append("")

        metadata_json = json.dumps(workbook.metadata)
        comment = f"<!-- md-spreadsheet-workbook-metadata: {metadata_json} -->"
        lines.append(comment)

    return "\n".join(lines)