Skip to content

Conversation

@KaanOzkan
Copy link

@KaanOzkan KaanOzkan commented Dec 10, 2025

Extracted from #681

Equivalent file using whitequark AST https://github.com/sorbet/sorbet/blob/master/rbs/SigsRewriter.cc

/**
* Returns true if the body contains an `extend T::Helpers` call already.
*/
bool containsExtendTHelper(pm_statements_node_t *body, const parser::Prism::Parser &prismParser,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some methods take const parser::Prism::Parser &prismParser as an argument and others take parser::Prism::Parser &parser, maybe these should be consistent?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use const whenever we can. There was one function that could take const but wasn't, it's now updated.

}

ENFORCE(*body != nullptr && PM_NODE_TYPE_P(*body, PM_STATEMENTS_NODE), "Body must be a statements node");
auto *stmts = down_cast<pm_statements_node_t>(*body);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think statements is used as a variable name a lot more than stmts in this file

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of both names. Since statements is long we'd have to break the lines in some cases. So I went with using stmts for short lived variables that are easy to understand. But for cases like this one it's migrated to statements. I think this is better than doing only stmts or only statements.

std::map<pm_node_t *, std::vector<CommentNodePrism>> &commentsByNode);

SigsRewriterPrism(core::MutableContext ctx, parser::Prism::Parser &parser,
std::unordered_map<pm_node_t *, std::vector<rbs::CommentNodePrism>> &commentsByNode)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does using unordered maps vs ordered maps affect performance

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, std::map will have to keep itself ordered so it's implemented as a tree (Red black tree probably). Meanwhile unordered_map can be a simple hash table. Looks ups are going to be more performant, O(log n) vs O(1)

Copy link

@amomchilov amomchilov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working my way through it, here's the first batch of comments

*/
vector<pm_node_t *> extractHelpers(core::MutableContext ctx, absl::Span<const Comment> annotations,
parser::Prism::Parser &parser) {
Factory prism(parser);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linter doesn't strictly require it, but try to use uniform initialization sytnax whenever you can

Suggested change
Factory prism(parser);
Factory prism{parser};

Copy link

@amomchilov amomchilov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done for now

Comment on lines 259 to 260
if (commentsByNode != nullptr && node != nullptr) {
if (auto it = commentsByNode->find(node); it != commentsByNode->end()) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of these can be inverted into early-returning guard clauses

case PM_ELSE_NODE: {
auto *else_ = down_cast<pm_else_node_t>(node);
if (auto *stmts = else_->statements) {
else_->statements = down_cast<pm_statements_node_t>(rewriteBody(up_cast(stmts)));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern comes up a lot. Can you add an overload of rewriteBody that takes and returns pm_statements_node_t *?

Copy link

@amomchilov amomchilov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more thing

Comment on lines +432 to +457

// Save old statements before modifying (pm_node_list_append can realloc, invalidating pointers)
pm_node_list_t oldStmts = statements->body;
statements->body = (pm_node_list_t){.size = 0, .capacity = 0, .nodes = nullptr};

// For each statement, insert signatures before it if it's a signature target
for (size_t i = 0; i < oldStmts.size; i++) {
pm_node_t *stmt = oldStmts.nodes[i];

if (canHaveSignature(stmt, parser)) {
if (auto signatures = signaturesForNode(stmt)) {
// Add all signatures
for (auto sig : *signatures) {
pm_node_list_append(&statements->body, sig);
}
}
}

// Add the rewritten statement
pm_node_list_append(&statements->body, rewriteNode(stmt));
}

// Free the old list structure (not the nodes themselves, as they were moved)
free(oldStmts.nodes);

return node;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Save old statements before modifying (pm_node_list_append can realloc, invalidating pointers)
pm_node_list_t oldStmts = statements->body;
statements->body = (pm_node_list_t){.size = 0, .capacity = 0, .nodes = nullptr};
// For each statement, insert signatures before it if it's a signature target
for (size_t i = 0; i < oldStmts.size; i++) {
pm_node_t *stmt = oldStmts.nodes[i];
if (canHaveSignature(stmt, parser)) {
if (auto signatures = signaturesForNode(stmt)) {
// Add all signatures
for (auto sig : *signatures) {
pm_node_list_append(&statements->body, sig);
}
}
}
// Add the rewritten statement
pm_node_list_append(&statements->body, rewriteNode(stmt));
}
// Free the old list structure (not the nodes themselves, as they were moved)
free(oldStmts.nodes);
return node;
return rewriteBody(statements)

Comment on lines +471 to +472
pm_statements_node_t *SigsRewriterPrism::rewriteBody(pm_statements_node_t *stmts) {
return down_cast<pm_statements_node_t>(rewriteBody(up_cast(stmts)));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest inverting this. Make rewriteBody(pm_node_t *node) call rewriteBody(pm_statements_node_t *stmts). It's more "direct", and you don't need the up/down cast

Suggested change
pm_statements_node_t *SigsRewriterPrism::rewriteBody(pm_statements_node_t *stmts) {
return down_cast<pm_statements_node_t>(rewriteBody(up_cast(stmts)));
pm_statements_node_t *SigsRewriterPrism::rewriteBody(pm_statements_node_t *statements) {
// Save old statements before modifying (pm_node_list_append can realloc, invalidating pointers)
pm_node_list_t oldStmts = statements->body;
statements->body = (pm_node_list_t){.size = 0, .capacity = 0, .nodes = nullptr};
// For each statement, insert signatures before it if it's a signature target
for (size_t i = 0; i < oldStmts.size; i++) {
pm_node_t *stmt = oldStmts.nodes[i];
if (canHaveSignature(stmt, parser)) {
if (auto signatures = signaturesForNode(stmt)) {
// Add all signatures
for (auto sig : *signatures) {
pm_node_list_append(&statements->body, sig);
}
}
}
// Add the rewritten statement
pm_node_list_append(&statements->body, rewriteNode(stmt));
}
// Free the old list structure (not the nodes themselves, as they were moved)
free(oldStmts.nodes);
return statements;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants