-
Notifications
You must be signed in to change notification settings - Fork 167
Improve performance: invariant_booleans #501
Improve performance: invariant_booleans #501
Conversation
[boolean satisfiability](https://en.wikipedia.org/wiki/Boolean_satisfiability_problem) is hard. | ||
Performance improvements are planned ([#434](https://github.com/dart-lang/linter/issues/434)) but | ||
in the meantime, this lint should be sparingly enabled in large projects or when lint performance | ||
_**WARNING:** this lint is comparatively expensive as, in general, calculating |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer needed :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Benchmark in my machine:
|
e26d57a
to
1d77944
Compare
_findConditionsOfElseStatementAncestor(node.parent, nodesInDFS))) | ||
.toSet(); | ||
return new TestedExpressions(node, conjunctions, negations); | ||
Element _getRealElement(AstNode node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename this method to "Canonical" as well?
: leftPart is PropertyAccess | ||
? DartTypeUtilities | ||
.getCanonicalElement(leftPart.propertyName.bestElement) | ||
: null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe move _getRealElement
from invariant_booleans.dart
to here and convert this to => _getRealElement(assignment.leftHandSide);
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed (good idea)
bool _isBreakStatement(Statement statement) => (statement is BreakStatement || | ||
(statement is Block && | ||
statement.statements.length == 1 && | ||
statement.statements.first is BreakStatement)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth a recursive call here to handle strange, but possible, code like this?
{
{
break;
}
}
Probably not, but it's your call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
statement.statements.first is ReturnStatement)); | ||
|
||
abstract class ConditionScopeVisitor extends RecursiveAstVisitor { | ||
final Queue<Queue<_ExpressionBox>> environments = new Queue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still seems odd to me to use a Queue
, especially when we're treating the top-level queue as a stack. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact all of them are stacks xd, I am trying to figure out what is exactly what we need in this kind of "scope"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the few issues that I have not solved, what do you think is a better option here @bwilkerson?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would start by writing a class like the following:
class ConditionScope {
/**
* The outer scope, of `null` if this is the outermost scope.
*/
final Scope outer;
/**
* Initialize a newly created scope to have the given [outer] scope.
*/
ConditionScope(this.outer);
}
Then I would add fields and methods to support adding _ExpressionBox
s and performing lookup (which would replace one of the method in ConditionScopeVisitor
). Given that you're not really treating the inner queues as queues, I'd probably change to using a list. The environments
field would be replaced by
ConditionScope currentScope;
pushes look like
currentScope = new ConditionScope(currentScope);
and pops look like
currentScope = currentScope.outer;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented my own Stack the other day to see if that improved the benchmark of my iterative version of traverseNodeInDfs jaja, I got it, I will be focused on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, I used a List but I had to reversed it when getting condition expression, PTAL, maybe is better use a Queue internally in the ConditionScope
|
||
@override | ||
visitAssignmentExpression(AssignmentExpression node) { | ||
_addElementToEnvironment(new _UndefinedExpression(_getLeftElement(node))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to guard against nodes that cannot be resolved by not adding anything to the environment (here and elsewhere):
Element element = _getLeftElement(node);
if (element != null) {
_addElementToEnvironment(new _UndefinedExpression(node);
}
We should minimally add a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed this creating a factory constructor, so if the parameter is null, the constructor return null and _addElementToEnvironment does not add null objects, so FIXED, but what kind of test could add?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, we need an assignment expression whose left-hand-side is not resolved:
notDefined = 3;
(any name that isn't defined will work). Of course, you'll want to back out your fix to ensure that the test case actually triggers the bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but, in that case, we would add to the scope something like, null is undefined, and that would not change the behavior of invariant booleans, at least we do something like:
if (notDefined2 < 2) {
notDefined = 3; // we undefine null
if (notDefined2 < 2) { // this won't be linted, because notDefined2 is null and null (notDefined) was undefined just before this condition is evaluated
}
}
My point is that checking those nulls wont change the behavior in code with not null elements in conditions inside if/while/do/for, so we should take care about conditions with null elements inside too, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given how few undefined names there usually are in code, if it doesn't cause exceptions to be thrown then I don't suppose it's worth doing a lot of work to guard against having a null
element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bwilkerson this is my last issue, and I can not figure out a test where the rule do something and without checking adding null elements would do something different. (I mean lints)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's quite possible that having an instance of _UndefinedExpression
whose element is null
is harmless. It just wasn't clear to me that that would be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is harmless, but other things like:
if (notDefined < 2) {
if (notDefined < 2) { // there is a lint here
}
}
can decrease the UX level, because there will be a linter and an error in the same place, I fixed this locally but it is necessary to check for types and it is really expensive (the benchmark in the rule moved from ~55 to >140), so I think it is a cost I would pay to have a faster linter as a user.
If you think this is OK the branch is ready to merge, if you want me to fix this even that means have a worse performance just tell me and I will do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll merge it, and we can re-visit this after the fact.
if (_isReturnStatement(statement) || | ||
_isBreakStatement(statement) || | ||
_isContinueStatement(statement)) { | ||
_addElementToEnvironment(expressionBox); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do something similar for dead code detection. Check out https://github.com/dart-lang/sdk/blob/master/pkg/analyzer/lib/src/generated/resolver.dart#L2201. Our checking is a bit more complete, and you might be able to make use of it here (although we'd need to make it available).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
} | ||
} | ||
|
||
void _addLocalEnvironment() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under what conditions do we need to add a local environment? (Adding a comment would be good.) It seems like we might be adding and removing too many, but I'm not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, but I am not sure yet, we are trying to refactor the way we use the scope
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized that it is necessary to create new local environments, I created an unifying visitor so now it asks which clases just creates local environments, PTAL at the methods visitNode and _needsLocalEnvironment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replicated issue
57ce820
to
15128e2
Compare
node is CompilationUnit || | ||
node is ConstructorDeclaration || | ||
node is FunctionDeclaration || | ||
node is MethodDeclaration; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The value analysis performed by this rule can only be used on local variables, so it isn't clear to me why it makes sense to test for ClassDeclaration
or CompilationUnit
. Please add documentation somewhere explaining when and/or why we need a local environment.
I believe that the last three conditions could be replaced by node is FunctionBody
, although testing for FunctionBody
will also cause a local environment to be created when visiting local functions, so it's not strictly equivalent.
Also, given that blocks define a name scope, do we need to define a local environment when we enter a block? Asked another way, should the conditions tested inside a block still be considered when we leave the block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was not necessary, and yes, now they only is FunctionBody and trying to resolve what to do with blocks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is not necessary to create new local environments in blocks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just so it isn't forgotten, I'll repeat the request:
Please add documentation somewhere explaining when and/or why we need a local environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, PTAL
bool hasBreak = DartTypeUtilities | ||
.traverseNodesInDFS(node, excludeCriteria: _isLoopStatement) | ||
.any((e) => e is BreakStatement); | ||
if (!hasBreak) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that this will not work correctly with labeled break statements:
outer: for (int i = 0; i < 10; i++) {
for (int j = 0; j < 10; j++) {
if (i > j) {
break outer;
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
} | ||
|
||
@override | ||
visitNode(AstNode node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will work, and I'm fine with it if you want to leave it, but there are two factors to consider. First, is
tests are surprisingly slow, so it would be faster if you take advantage of the polymorphism that's going to be done either way. Second, if anyone ever creates a subclass and overwrites a visit...
method to visit a node for which _needsLocalEnvironment
would return true
, it will be harder for them to see that they need to invoke either the super visit...
method or visitNode
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the overridden method do exactly what this one does? (I think it does, but it's hard for me to check right now.) If so, let's just delete this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method made this Visitor have the behavior of a RecursiveAstVisitor, I deleted the method and replace the extends UnifyingAstVisitor by extends RecursiveAstVisitor.
|
||
@override | ||
visitVariableDeclaration(VariableDeclaration node) { | ||
_addElementToEnvironment(new _UndefinedExpression(node.element)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there plans to use either the initialization expression (or the right-hand side in an assignment) when the value assigned is known (a constant expression)? For example, will we create a lint here:
var x = 0;
// code that doesn't modify x
if (x > 0) {
...
}
(It's ok if the answer is "no", or to do that as a follow-on to this CL if the answer is "yes".)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not supported by now, and before this optimization it was not contemplated, but it could be implemented in the future, I would like to do it, what do you think @alexeieleusis ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is definitely nice to have, I wonder if now is the best time. We need to consider the cost/benefit, I lean towards leaving it for later, there are rules that require less effort and provide more value.
TestedExpressions _findPreviousTestedExpressions(Expression node) { | ||
final elements = _getRealElementsInExpression(node); | ||
Iterable<Expression> conjunctions = getTrueExpressions(elements) | ||
.map(_splitConjunctions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be more efficient to split conjunctions when adding the conditions rather than when finding tested expressions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
aba4d3e
to
d7f2817
Compare
bd96969
to
a9b2c55
Compare
Iterable<_ExpressionBox> getUndefinedExpressions() => | ||
environment.where((e) => e is _UndefinedExpression); | ||
|
||
static void _recursiveGetExpressions(ConditionScope conditionScope, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you make this an instance method, then the conditionScope
parameter can be removed in favor of using this
. The last line would need to change to invoke this method on outer
, and only when outer
is non-null. But I think it will make the code a little smaller. (Not a requirement, though.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
} | ||
|
||
@override | ||
visitEmptyFunctionBody(EmptyFunctionBody node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that an empty function body doesn't have any children, this method should be unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
node.updaters.accept(this); | ||
node.body?.accept(this); | ||
_propagateUndefinedExpressions(_removeLastScope()); | ||
bool hasBreak = DartTypeUtilities |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to add a comment here related to the handling of break statements so that we remember the issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
node.body?.accept(this); | ||
_propagateUndefinedExpressions(_removeLastScope()); | ||
bool hasBreak = DartTypeUtilities | ||
.traverseNodesInDFS(node, excludeCriteria: _isLoopStatement) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we're not handling break statements with labels correctly, I think that excluding loop statements will cause false positives. Consider:
loop: for (...) {
while (...) {
break loop;
}
}
Also, not excluding function bodies might well cause false negatives. Consider:
while (...) {
void func() {
while (...) {
break;
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About this:
while (...) {
void func() {
while (...) {
break;
}
}
}
The first while search for breaks excluding other for and while, so in that case, the break won't be see because is inside other while, an option would be something like this:
while (...) {
void func() {
break;
}
}
But that is not possible, the analyzer would say 'A break statement cannot be used outside of a loop or switch statement'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, replace the body of the function with a switch statement and stick a break in one case. Then I think we can get to the break when we shouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is a bug that I made, I should exclude loop statements and switch statements, and that's it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
loop: for (...) {
while (...) {
break loop;
}
}
I have been thinking a lot about this, can you help me?
How do you think this would produce a false positive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that hasBreak
should be true
(in both the for
and while
cases) if there is any way for control to reach the statement following the loop other than for the condition to evaluate to false
. But if you don't look inside the nested while
loop (in the example above) then you won't find the break
statement that applies to the for
loop and hasBreak
will be false
despite the fact that the condition doesn't have to be false
. Here's a more concrete example:
String s = '...';
loop: for (; s != null; s += '.') {
while (true != false) {
break loop;
}
}
// I think there will be a lint on the following line,
// even though the test is reasonable.
if (s != null) {
print(s);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woooow!! I see now, I thought a break with label worked like a goto, I am so sorry, my bad, I understood perfectly now, and I will work fixing this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed (finally :D)
_visitIfStatement(node); | ||
_propagateUndefinedExpressions(elseScope); | ||
if (_isLastStatementAnExitStatement(node.thenStatement)) { | ||
_addFalseCondition(node.condition); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens with the following (pathological) case:
if (x == null) {
return false;
} else {
return true;
}
if (x == null) {
...
}
(and with the sense of the conditions reversed)?
I think we only want to record either true or false conditions only when exactly one block terminates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case the last if is dead code, in the environment we would see that (x == null) is true and false at the same time, and probably would lint the last if. But it still correct that if a then statement is return the condition is false after de if/else block the same if the else is a return block. There is only one exception when you have a label after the if/else statement, in that case, when we see a label, we should undefine all expression, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the case you're thinking of. Can you give me an example involving a label after an if statement?
That said, I do think that we might want to forget everything we think we know when we hit a label because we don't know what the state of any variable will be when execution reaches the break or continue that will transfer control to that label. Unfortunately, the same is true for any do, for, switch or while statement because they all have an implicit empty label.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forget what I said, in that case x == null is true and false in the environment, so it will be resolved to a lint, but we can easily detect those kind of cases, because they are dead code and do not lint if you want, what do you think is better lint or not to lint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to have one diagnostic ("dead code") than to have multiple diagnostics for the same problem. That's why I think we shouldn't produce a lint in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
bool _isLastStatementAnExitStatement(Statement statement) { | ||
if (statement is Block) { | ||
return _isLastStatementAnExitStatement( | ||
DartTypeUtilities.getLastStatementInBlock(statement)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does ExitDetector not handle blocks? (I'm wondering whether you need to special case blocks here.)
If you do need to special case blocks, I think you need to search for a statement within the block that can exit in order to handle pathological cases such as
if (...) {
return;
print("Shouldn't get here");
}
Otherwise we'll think that the last statement means that we can fall through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the analyzer find the dead code there?, I think in that case we would be doing something that is already done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, analyzer will find and report the dead code. The question is what we need to do in order to avoid false positives for cases like these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An easy fix would be add an UndefinedAllExpressions when we see a return statement :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A return, a throw, or a rethrow (assuming I'm not forgetting anything).
As long as we've thought hard about possible false positives and worked to prevent them, and document in the code how we've done that, I'm fine with most solutions. (Including undefining everything after an exit from the current function body.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, I will work in documentation now, Alexei will help me also.
c517a88
to
d7ff56b
Compare
} | ||
|
||
@override | ||
visitReturnStatement(ReturnStatement node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to create test for this, and throw and rethrow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
d7ff56b
to
aa528c5
Compare
d38232e
to
766d0ff
Compare
@bwilkerson : could you take a quick look at @JPaulsen's latest? Looks like the main issues are addressed. Thanks in advance! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should get this landed. As a future enhancement we might want to handle the conditional expression:
int i = 0;
return i == 0 ? (i ==0 ? false : true) : false;
|
||
@override | ||
visitAssignmentExpression(AssignmentExpression node) { | ||
_addElementToEnvironment(new _UndefinedExpression(_getLeftElement(node))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's quite possible that having an instance of _UndefinedExpression
whose element is null
is harmless. It just wasn't clear to me that that would be the case.
766d0ff
to
0999a4b
Compare
Fixes dart-lang/sdk#57442