-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Enable vectorized boolean logic in OpenCL C #948
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
if (is_zero(cond_neq->b)) { | ||
condition = cond_neq->a; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nested ifs should probably use &&. But can either of these conditions be false? If not it seems like it should be an assert instead of an if.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to keep it optional. The reason is maybe a bit hacky: currently, CodeGen_C implements many intrinsic functions by generating IR on the fly. This IR doesn't get processed by EliminateBoolVectors. As long as these expressions generated by CodeGen_C don't involve a logical op that isn't itself the select condition, just optionally stripping away the != 0 handles these cases.
However, I did add an assert after this to ensure the condition is in fact an integer vector. This will catch it if we ever do add an implementation of an intrinsic to CodeGen_C that requires bool vector elimination beyond the "root" of the condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the assert is even too aggressive to handle the selects generated by CodeGen_C. For example, absd generates a0 > a1 as the condition. This is fine because operator > in OpenCL C for vectors returns an integer vector. However, Halide thinks its a bool vector, which triggers the assert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually x2, it turns out this logic isn't needed at all. The OpenCL compiler should be able to simplify the extra != 0 away.
@@ -16,6 +17,131 @@ using std::sort; | |||
|
|||
static ostringstream nil; | |||
|
|||
// OpenCL doesn't support vectors of bools, this mutator rewrites IR | |||
// to use signed integer vectors instead. | |||
class EliminateBoolVectors : public IRMutator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expand on this comment a little - I think this mutator converts all trees of logical ops on bool vectors to bitwise ops on int vectors, wrapping the comparisons at the leaves in upcasts, and wrapping the use in a select argument in a downcast (by comparing to zero). The actual codegen class then peepholes through the downcast and upcast to use OpenCL's comparisons (which return masks) and OpenCL's select (which takes a mask)
// To represent bool vectors, OpenCL uses vectors of signed | ||
// integers with the same width as the types being compared. | ||
t.code = Type::Int; | ||
expr = T::make(a, b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd restructure this to put the if that tests if the lower down mutates changed anything above this so you can reuse the existing node here if they are. I.e.:
if (!a.same_as(op->a) || !b.same_as(op->b)) {
expr = T::make(a, b);
} else {
expr = op;
}
if (t.width > 1) {
t.code = Type::Int;
expr = Cast::make(t, expr);
}
You may have mentioned it in the follow up comments, but I think comparing vectors of bools is legal in Halide and this code likely fails if the type of the arguments is Int(1). I think to handle that case you'd want to insert casts to 8-bit before doing the comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, done.
I'm not seeing why this code would fail for bool vector inputs... Do you mean if the arguments to the kernel are bool and vectorized in the kernel? OpenCL doesn't support that at all, so I don't think we need to worry about that case. Maybe an error if the user attempts to do that would be a good idea, but we have lots of other things like this that we aren't currently checking for either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I have not had time to try and craft a test case. Here is a sketch to consider however:
Param test;
ImageParam foo(...);
... select((foo(x, y) < 1) == test), ...) ...
(Scheduled with vectorization of course.)
In this case I would expect the inner comparison to be promoted to an integer of width > 1, but not the Param, which results in badly formed IR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Param actually will get promoted to an int (it will get changed from a broadcast of a bool to a broadcast of an integer cast). However, you're right that the comparison ops should also widen both operands to the wider of the two types.
Feel free to merge once travis completes |
Enable vectorized boolean logic in OpenCL C
This PR enables vectorized boolean logic in OpenCL C (fixes #939).
This revealed a few other bugs/issues that needed to be fixed/improved:
To test this, I just added vectorization/GPU schedules to a few of the tests (div_mod, boundary_conditions).