[Feature]: Semantic diff that ignores pure formatting #541

@qdrddr

Description

Currently, git diff captures pure text changes, including formatting-only changes that are mostly irrelevant.
This consumes LLM tokens and bloats the generated commit message, potentially hiding important changes in a tall pile of garbage messages.

Suggested Solution

I’d like to have a --semantic-diff parameter for oco. When enabled, it would use diffsitter, which computes diffs based on AST changes rather than on raw text.
https://github.com/afnanenayet/diffsitter

Alternatives

No response

Additional Context

This idea could be extended:

By still capturing the remaining, less important changes with the usual git diff, but clearly identifying and separating them as less important, so that we could ask the LLM to pay less attention to them and produce a less detailed summary.

This way I could get the full picture of the changes, while the LLM focuses on summarizing the important changes in more detail. The semantically unimportant formatting changes would be kept, but fewer output LLM tokens would be spent on them (potentially even using another, cheaper model).
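To make the split concrete, here is a rough sketch of how changes could be sorted into "semantic" and "formatting-only" buckets before they reach the LLM. This is not how oco or diffsitter work: it uses Python's built-in `ast` module as a stand-in for a tree-sitter based comparison, so it only handles Python sources, and the function names `is_formatting_only` and `split_changes` are hypothetical.

```python
import ast


def is_formatting_only(old_src: str, new_src: str) -> bool:
    """Return True when both versions parse to the same AST.

    Stand-in for an AST-aware diff: whitespace, comments and quote style
    do not appear in the dumped tree, so they are ignored.
    """
    try:
        old_tree = ast.dump(ast.parse(old_src))
        new_tree = ast.dump(ast.parse(new_src))
    except SyntaxError:
        # Unparseable input: treat it as a real change to stay on the safe side.
        return False
    return old_tree == new_tree


def split_changes(changes):
    """Split changed files into (semantic, formatting_only) path lists.

    `changes` maps a file path to a tuple (old_source, new_source).
    """
    semantic, formatting_only = [], []
    for path, (old_src, new_src) in changes.items():
        if is_formatting_only(old_src, new_src):
            formatting_only.append(path)
        else:
            semantic.append(path)
    return semantic, formatting_only
```

The semantic bucket could then go to the main model with a request for a detailed summary, and the formatting-only bucket could get a one-line mention or be sent to a cheaper model.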
