ast, read_verilog: ownership in AST, use C++ styles for parser and lexer by widlarizer · Pull Request #5135 · YosysHQ/yosys · GitHub

ast, read_verilog: ownership in AST, use C++ styles for parser and lexer #5135


Draft pull request: 6 commits to be merged into main

widlarizer
Collaborator
@widlarizer widlarizer commented May 21, 2025

The purpose of this PR is to clarify resource ownership (responsibility for construction and deletion) in the AST by distinguishing between owning pointers (std::unique_ptr<AstNode>) and non-owning pointers (AstNode*).

Core idea

struct AstNode now holds these members:

std::vector<std::unique_ptr<AstNode>> children;
std::map<RTLIL::IdString, std::unique_ptr<AstNode>> attributes;

There are no more new AstNode and corresponding delete statements present in the codebase. std::unique_ptr<AstNode> falling out of scope triggers ~AstNode, destroying its children, too. Assigning nullptr to it behaves the same way. This should eliminate AstNode allocations being reported as leaking on yosys termination by valgrind.
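
The ownership model described above can be sketched in isolation. This is a minimal illustration, not code from the PR: `Node`, `live_nodes`, `make_tree`, and `live_after_release` are hypothetical names, and only the `std::vector<std::unique_ptr<...>> children` shape follows the description. It shows that releasing the root destroys the whole subtree without any explicit delete.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Counts live nodes; exists only to make destruction observable in this sketch.
int live_nodes = 0;

// Hypothetical stand-in for AstNode. Only the ownership shape follows the PR text.
struct Node {
    std::vector<std::unique_ptr<Node>> children;  // owning: destroyed with this node
    Node() { ++live_nodes; }
    ~Node() { --live_nodes; }
    Node(const Node&) = delete;             // nodes are never copied implicitly
    Node& operator=(const Node&) = delete;
};

// Builds a root with two children, one of which has a child of its own (4 nodes).
std::unique_ptr<Node> make_tree() {
    auto root = std::make_unique<Node>();
    auto child = std::make_unique<Node>();
    child->children.push_back(std::make_unique<Node>());
    root->children.push_back(std::move(child));
    root->children.push_back(std::make_unique<Node>());
    return root;
}

// Returns the number of live nodes after the tree has been released.
int live_after_release() {
    auto root = make_tree();
    Node* observer = root->children[0].get();  // non-owning: must not outlive root
    (void)observer;
    root = nullptr;  // runs ~Node on the root, which destroys every child
    return live_nodes;
}
```

The raw `Node*` here plays the role of the non-owning pointer: it can look at a node but carries no responsibility for deleting it, so it has to be dropped before the owner goes away.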

Damage

This meant returning std::unique_ptr<AstNode> from parse rules. Prior to this PR, the return data type for parse rules was auto-generated with the bison %union directive. The auto-generated code handling it relies on each union member type having a T& operator=(T&);, but std::unique_ptr<...> deletes its copy assignment operator for good reason.

That's why I refactored the parser and lexer to use their C++ modes, which makes for a larger code change.
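
The constraint behind this refactor can be checked with plain C++, independent of bison. The sketch below is an assumption-laden illustration: `Node`, `SemanticValue`, and `take_node` are hypothetical names and do not reflect the generated parser's actual types. It only demonstrates that a std::unique_ptr can be moved but not copy-assigned, so any value holder carrying one has to hand it over by move.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical placeholder for AstNode.
struct Node {
    int value = 0;
};

// std::unique_ptr can be moved but not copied.
static_assert(!std::is_copy_assignable<std::unique_ptr<Node>>::value,
              "copy assignment is deleted");
static_assert(std::is_move_assignable<std::unique_ptr<Node>>::value,
              "move assignment is available");
// A raw pointer, by contrast, is freely copy-assignable.
static_assert(std::is_copy_assignable<Node*>::value,
              "raw pointers can be copy-assigned");

// Hypothetical semantic-value holder, standing in for what a parse rule returns.
struct SemanticValue {
    std::unique_ptr<Node> node;
};

// Transfers ownership out of the holder; the holder is left with a null pointer.
std::unique_ptr<Node> take_node(SemanticValue& v) {
    return std::move(v.node);
}
```

A holder with a std::unique_ptr member is itself move-only, which is why code that copy-assigns semantic values no longer compiles once such a member is introduced.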

Locations in AST and parser values now use the same custom data type with std::shared_ptr<std::string> filename;. This reduces filename copying in the frontend greatly. I'm seeing 6-20% memory usage improvements in read_verilog as a result, at a cost of a 1% performance regression, which may be avoidable.
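
Why a shared filename reduces copying can be sketched as follows. Only the `std::shared_ptr<std::string> filename;` member comes from the description above; the `Location` layout, its other fields, and `locations_for_file` are hypothetical. Every location created for one file points at the same string, so the name is stored once no matter how many nodes refer to it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical location type; only the shared filename member follows the PR text.
struct Location {
    std::shared_ptr<std::string> filename;  // shared by all locations in one file
    int first_line = 0;
    int first_column = 0;
};

// Creates `count` locations that all share a single copy of the file name.
std::vector<Location> locations_for_file(const std::string& name, std::size_t count) {
    auto shared_name = std::make_shared<std::string>(name);  // one allocation for the name
    std::vector<Location> locs;
    locs.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        Location loc;
        loc.filename = shared_name;  // copies the pointer, not the string
        loc.first_line = static_cast<int>(i) + 1;
        loc.first_column = 1;
        locs.push_back(loc);
    }
    return locs;
}
```

Copying a `Location` still costs a reference-count update on the shared pointer, which may be one contributor to the small slowdown mentioned above, though that is speculation rather than a measured result.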

Superior location tracking without global state will allow proper column ranges in errors in a simple followup PR.

Syntax errors briefly stopped quoting characters, so you got ERROR: syntax error, unexpected ; instead of ERROR: syntax error, unexpected ';'. This is now fixed.

Testing

This PR has a high risk of creating new yosys segfaults in edge cases not hit by our test suite. More extensive testing will be needed. However, I would argue that crashes are more desirable than a mix of "everything somehow works out" and straight up silent data corruption.

TODOs

  • Source locations are temporarily completely broken. This breaks tests
  • Extend source location diagnostics testing on main prior to merging this. See logger: add -expect types prefix-log, prefix-warning, prefix-error #5183
  • Can global variables be eliminated from the parser by moving them into the generated parser class?
  • Ensure there are no Makefile dependency errors
  • Extend ownership in ast and parser to attribute lists and strings. Containers are already basically owned pointers, but maybe they should still be held through std::unique_ptr to avoid accidental copies
  • Play around with memory and address sanitizers to see if we're cutting down on the leaks
  • Try to reduce the perf regression, which is likely caused by mandatory source location construction for all nodes

@widlarizer widlarizer force-pushed the emil/ast-ownership branch from 6a7ecf1 to 5af4e05 Compare June 17, 2025 13:33