-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] [FPGA] Add template parameter support for work_group_size attributes #2906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This patch adds support for template parameter on triple arguments SYCL function attributes such as: 1. [[intel::reqd_work_group_size()]] 2. [[cl::reqd_work_group_size()]] 3. [[intel::max_work_group_size()]] Default arguments in ReqWorkGroupSize can be used only with intel::reqd_work_group_size spelling. Checks correctness of mutual usage of different work_group_size attributes: reqd_work_group_size, max_work_group_size and max_global_work_dim. Values of reqd_work_group_size arguments shall be equal or less than values coming from max_work_group_size. In case the value of 'max_global_work_dim' attribute equals to 0 we shall ensure that if max_work_group_size and reqd_work_group_size attributes exist, they hold equal values (1, 1, 1). This patch does not add template parameter support on WorkGroupSizeHintAttr since the attribute does not have a [[]] spelling because it is an OpenCL-related attribute. updates sema/codegen tests with mock headers on device. splits tests of 'intel-max-work-group-size' and 'intel-reqd-work-group-size' attributes into host and device. Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice there seem to be a lot of unrelated formatting changes in the review. FWIW, it's easier to review if those changes are separated out. (Formatting changes shouldn't require a review, so you can do an NFC push to fix the formatting and then base your patch on top of the cleaned code.)
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Thanks @AaronBallman for taking a look at the patch. I was fixing the format issues before i send you a code-review request. I have fixed all errors now. |
Ah, I jumped the gun, sorry about that. :-) |
clang/lib/Sema/SemaDeclAttr.cpp
Outdated
// Checks correctness of mutual usage of different work_group_size attributes: | ||
// reqd_work_group_size, max_work_group_size and max_global_work_dim. | ||
// Values of reqd_work_group_size arguments shall be equal or less than values | ||
// coming from max_work_group_size. | ||
// In case the value of 'max_global_work_dim' attribute equals to 0 we shall | ||
// ensure that if max_work_group_size and reqd_work_group_size attributes exist, | ||
// they hold equal values (1, 1, 1). | ||
|
||
static int64_t getIntExprValue(const Expr *E, ASTContext &Ctx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know that this really holds its weight given that it's a one-liner around the same thing. How ugly does the code get if we sink this code into the call sites instead? (This way we don't have to try to think of a better name than getIntExprValue()
, which doesn't convey signed vs unsigned semantics clearly.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed getIntExprValue() and added this code into the call site.
@AaronBallman, does it look too ugly ? Since we have lost accessing the attribute value directly (A->getXDim()) due to template support, we need to get the attribute value here like codegen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking on this one as well, and I sort of wonder along the same lines as adding an additional member for the expressions, perhaps it'd make even more sense to make the additional member yield uint64_t
s instead. e.g.,
let AdditionalMembers = [{
ArrayRef<uint64_t> dimensions(ASTContext &Ctx) const {
makeArrayRef(getXDim()->getIntegerConstantExpr(Ctx)->getZExtValue(), ... );
}
}];
Then we don't need to sprinkle these helper functions all over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added lambda function: getExprValue() that covers this part.
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
clang/lib/Sema/SemaDeclAttr.cpp
Outdated
Result &= checkZeroDim(A, A->getXDim(), A->getYDim(), A->getZDim()); | ||
if (const auto *A = D->getAttr<ReqdWorkGroupSizeAttr>()) | ||
Result &= checkZeroDim(A, A->getXDim(), A->getYDim(), A->getZDim()); | ||
if (const auto *A = D->getAttr<SYCLIntelMaxWorkGroupSizeAttr>()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I am missing something but it looks like there is a some code duplication here. Can we combine the if-else block here by using hasAttr() || hasAttr().
I guess however this means you will have a getAttr() after hasAttr() which @AaronBallman mentioned is a bad coding practice. So I'm not sure. Tagging @AaronBallman
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we could use C++20 features, this would be easier to remove the duplication from because we could use a templated lambda. As it stands, I think the code is reasonable enough for the moment (we'd need the attributes to have a common base class otherwise, and I'm not certain if that's reasonable or not).
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that 's what i thought to have bad coding style if we have getAttr after hasAttr(). I did not make any change here.
@AaronBallman, any thought? since we decided to remove getIntExprValue() in our previous discussion (that removed some of the code duplication here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, I think we're all reviewing the same code at roughly the same time, so some of our comments come across in a fun order. :-D
There are a few approaches we can consider, but y'all know the domain better than I do and so some of these suggestions may be a bad idea. If these attributes are doing the same thing but are just spelled differently, we could use a single semantic attribute in Attd.td with different spellings. Then anywhere we do D->getAttr<>()
there'd only be the one semantic attribute to query (and we could add an accessor to it if we need to know which spelling the attribute was parsed with). If the two attributes are not doing the same thing but instead are doing highly related things, then perhaps a common attribute base class that both of the semantic attributes inherit from would be a good approach. But if the attributes are doing drastically different things, then I think this code is sufficient as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The attributes are doing different things here:
- max_global_work_dim() attribute take only single argument
reqd_work_group_size, max_work_group_size take triple attribute argument values. - Values of reqd_work_group_size arguments shall be equal or less than values
coming from max_work_group_size. - In case the value of 'max_global_work_dim' attribute equals to 0 we shall
ensure that if max_work_group_size and reqd_work_group_size attributes exist,
they hold equal values (1, 1, 1). If the attribute values does not hold (1,1,1), then we give error. - ReqdWorkGroupSizeAttr gives conflicts errors in case parsed attribute value >= Decl attribute.
if (const auto *A = D->getAttr()) {
if (!(WGSize[0] >= A->getXDim() && WGSize[1] >= A->getYDim() &&
WGSize[2] >= A->getZDim())) {
Since the attributes are doing different things here, we should keep the current codes here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I understanding properly that reqd_work_group_size
and max_work_group_size
are both specifying similar properties? Namely that one is a request and the other is the max request that can be fulfilled?
If so, would it make sense for there to be a single semantic attribute named WorkGroupSizeAttr
that looks something like this:
def WorkGroupSize : InheritableAttr {
let Spellings = [CXX11<"intelfpga","max_work_group_size">,
CXX11<"intel","max_work_group_size">,
GNU<"reqd_work_group_size">,
CXX11<"intel","reqd_work_group_size">,
CXX11<"cl","reqd_work_group_size">];
let Args = [ExprArgument<"XDim">,
ExprArgument<"YDim">,
ExprArgument<"ZDim">];
let Subjects = SubjectList<[Function], ErrorDiag>;
let Accessors = [Accessor<"isMax",
[CXX11<"intelfpga","max_work_group_size">,
CXX11<"intel","max_work_group_size">]>,
Accessor<"isRequest",
[GNU<"reqd_work_group_size">,
CXX11<"intel","reqd_work_group_size">,
CXX11<"cl","reqd_work_group_size">]>];
}
Then all of the code using the semantic attributes would use WorkGroupSizeAttr
instead, and if they need to care about max vs reqd, they can call isMax()
or isRequest()
as needed.
(Note, I'm speculating here. This idea may or may not make sense.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE: There are few differences between them:
- "max_work_group_size" is similar to reqd_work_group_size, but allows work groups that are smaller or equal to the specified sizes.
- Default arguments in ReqWorkGroupSize can be used only with
intel::reqd_work_group_size spelling. - ReqdWorkGroupSizeAttr gives conflicts errors in case parsed attribute value >= Decl attribute but not "max_work_group_size".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another idea that came to me last night while I should have been sleeping was that we could add an additional member to provide an array interface to the attribute arguments. This would at least let you do things in a loop where you're currently repeating the same code once for each dimension. Something like:
let AdditionalMembers = [{
ArrayRef<const Expr *> dimensions() const {
return {getXDim(), getYDim(), getZDim()};
}
ArrayRef<Expr *> dimensions() {
return {getXDim(), getYDim(), getZDim()};
}
}];
and this would allow you to do something like:
for (auto Item : zip(OldDeclAttr->dimensions(), NewDeclAttr->dimensions()) {
Expr *Old = std::get<0>(Item), *New = std::get<1>(Item);
...
}
or other such things. I've not tested the code out, but it seems like a plausible improvement that doesn't muck about with the type hierarchy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
clang/lib/Sema/SemaDeclAttr.cpp
Outdated
@@ -2948,8 +2985,21 @@ static bool checkWorkGroupSizeValues(Sema &S, Decl *D, const ParsedAttr &Attr, | |||
<< "'intel::max_work_group_size'"; | |||
|
|||
if (const auto *A = D->getAttr<ReqdWorkGroupSizeAttr>()) { | |||
if (!(WGSize[0] >= A->getXDim() && WGSize[1] >= A->getYDim() && | |||
WGSize[2] >= A->getZDim())) { | |||
Optional<llvm::APSInt> AttrXDimVal = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same duplication comment here as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. removed duplicate codes by adding getExprValue().
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
clang/lib/Sema/SemaDeclAttr.cpp
Outdated
Result &= checkZeroDim(A, A->getXDim(), A->getYDim(), A->getZDim()); | ||
if (const auto *A = D->getAttr<ReqdWorkGroupSizeAttr>()) | ||
Result &= checkZeroDim(A, A->getXDim(), A->getYDim(), A->getZDim()); | ||
if (const auto *A = D->getAttr<SYCLIntelMaxWorkGroupSizeAttr>()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we could use C++20 features, this would be easier to remove the duplication from because we could use a templated lambda. As it stands, I think the code is reasonable enough for the moment (we'd need the attributes to have a common base class otherwise, and I'm not certain if that's reasonable or not).
Done
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
sorry for the several submission. Clang-format took wrong formats. |
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Signed-off-by: Soumi Manna <soumi.manna@intel.com>
Thank you everyone for the reviews and discussion. |
…or semantic checks This is a followup comments on intel#2906 (comment) and intel#2906 (comment). Currently For a SYCLDevice WorkGroupAttr arguments are reversed. if (S.getLangOpts().SYCLIsDevice) std::swap(XDimExpr, ZDimExpr); There are few problems with this approach (swapping order in AST): 1. A user who writes the order as X, Y, Z in their source will get Z, Y, X when they pretty print the AST. 2. The same will happen when dumping the AST to text or JSON or generating diagnostic messages since the fist and third arguments of WorkGroupAttr are reversed. 3. This is very confusing to see the different order than the oroginal source codes. max_work_group_size() follows same rules as reqd_work_group_size() in Sema and LLVM IR generation. This patch keeps the arguments in the same order as they were parsed in doing semantic checks, and when performing code generation, we would change the argument order when lowering to LLVM IR and not when creating the semantic attribute. Signed-off-by: Soumi Manna <soumi.manna@intel.com>
…ments for semantic checks (#2962) This is a followup comments on #2906 (comment) and #2906 (comment). Currently For a SYCLDevice WorkGroupAttr arguments are reversed. if (S.getLangOpts().SYCLIsDevice) std::swap(XDimExpr, ZDimExpr); There are few problems with this approach (swapping order in AST): 1. A user who writes the order as X, Y, Z in their source will get Z, Y, X when they pretty print the AST. 2. The same will happen when dumping the AST to text or JSON or generating diagnostic messages since the fist and third arguments of WorkGroupAttr are reversed. 3. This is very confusing to see the different order than the oroginal source codes. max_work_group_size() follows same rules as reqd_work_group_size() in Sema and LLVM IR generation. This patch keeps the arguments in the same order as they were parsed in doing semantic checks, and when performing code generation, we would change the argument order when lowering to LLVM IR and not when creating the semantic attribute. Signed-off-by: Soumi Manna <soumi.manna@intel.com>
@@ -11207,6 +11207,10 @@ def warn_attribute_spelling_deprecated : Warning< | |||
InGroup<DeprecatedAttributes>; | |||
def note_spelling_suggestion : Note< | |||
"did you mean to use %0 instead?">; | |||
def warn_attribute_requires_non_negative_integer_argument : Warning< | |||
"the resulting value of the %0 attribute %select{first|second|third}1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay in review. I was on vacation without email access.
I think this diagnostic is too long. Can't we just say - "xyz parameter is non-negative after implicit conversion"
.
I don't think we need to mention "the resulting value of the %0 attribute"
since the diagnostic is generated at attribute location anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible alternative: we could also reuse the existing diagnostic text about implicit conversion, but put it under the new attribute grouping:
def warn_attribute_requires_non_negative_integer_argument : Warning<warn_impcast_integer_sign.Text>, InGroup<AcceptedAttributes>;
One oddity from this would be that users who pass -Wno-sign-compare
will still get what looks like a sign comparison warning, but that's the same behavior as with the current patch anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @AaronBallman and @elizabethandrews for the comments on improving the diagnostic message.
Does the new diagnostic print like this?
implicit conversion changes signedness: 'unsigned int' to ' int'
Example: [[intel::max_work_group_size(8, 8, -8)]]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The types look backwards in the diagnostic. It's changing from int
to unsigned int
.
Also, you should check that the diagnostic points to the attribute argument that changes sign (the -8
). Does the caret appear in a place where it's obvious what the diagnostic is talking about?
Btw, a second test would be [[intel::max_work_group_size(-8, 8, -8)]]
to ensure that the diagnostic triggers twice (once for each -8
) and is sufficiently clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also you should check that the diagnostic points to the attribute argument that changes sign (the -8).
Yes, it points to the place (-8)
Btw, a second test would be [[intel::max_work_group_size(-8, 8, -8)]] to ensure that the diagnostic triggers twice (once for each -8) and is sufficiently clear.
Yes, it triggers twice and correct place (once for each -8)
I have created #2988 for this change.
…ork_group_size attributes A linking error was introduced on PR intel#2906 where template declaration and definition was added in different files. This patch fixes the issue by moving the template definition for addIntelSYCLTripleArgFunctionAttr to Sema.h. Signed-off-by: Soumi Manna <soumi.manna@intel.com>
…ork_group_size attributes (#2975) A linking error was introduced on PR #2906 where template declaration and definition was added in different files. This patch fixes the issue by moving the template definition for addIntelSYCLTripleArgFunctionAttr to Sema.h. Signed-off-by: Soumi Manna <soumi.manna@intel.com>
…eAttribute Template parameter support for work_group_size attributes was added on intel#2906. Unsigned WGSize array which was used on function checkWorkGroupSizeValues() befor the support for the attributes dimension values to check correctness of mutual usage of different work_group_size attributes: reqd_work_group_size, max_work_group_size and max_global_work_dim, does not use anymore. This patch removes instances of WGSize array from the WorkGroupSizeAttribute. Signed-off-by: Soumi Manna <soumi.manna@intel.com>
…s of work_group_size attributes Template parameter support for work_group_size attributes was added on intel#2906 This is a followup comments on intel#2906 This patch improves the diagnostic message for "warn_attribute_requires_non_negative_integer_argument" and updates tests related to this diagnostic. Signed-off-by: Soumi Manna <soumi.manna@intel.com>
) Template parameter support for work_group_size attributes was added on #2906 This is a followup comments on #2906 This patch improves the diagnostic message for "warn_attribute_requires_non_negative_integer_argument" and updates tests related to this diagnostic. Signed-off-by: Soumi Manna <soumi.manna@intel.com>
#2983) Template parameter support for work_group_size attributes was added on #2906. Unsigned WGSize array which was used on function checkWorkGroupSizeValues() befor the support for the attributes dimension values to check correctness of mutual usage of different work_group_size attributes: reqd_work_group_size, max_work_group_size and max_global_work_dim, does not use anymore. This patch removes instances of WGSize array from the WorkGroupSizeAttribute. Signed-off-by: Soumi Manna <soumi.manna@intel.com>
This patch updates tests related to work_group_size attributes by fixing typos and adding CHECK that happened on commit intel#2906 Signed-off-by: Soumi Manna <soumi.manna@intel.com>
This patch updates tests related to work_group_size attributes by fixing typos and adding CHECK that happened on commit #2906 Signed-off-by: Soumi Manna <soumi.manna@intel.com>
… in conjunction with the reqd_work_group_size attribute This is a follow up comments on intel#2906 (comment) There is a rule on FPGA document that the num_simd_work_items attribute we specify must evenly divide the work-group size we specify for the reqd_work_group_size attribute, was missed during initial implemention of the num_simd_work_items attribute. OpenCL spelling for num_simd_work_items attribute supports this rule. This patch 1. addes support for mutual diagnostic in Sema for num_simd_work_items attribute interacting with reqd_work_group_size attribute. 2. addes tests 3. updates documenation about the num_simd_work_items attribute with the note and exmaples. Current implementation of SYCL attribute “num_simd_work_items” and “reqd_work_group_size” does not support this. Signed-off-by: Soumi Manna <soumi.manna@intel.com>
For in-tree builds with shared libraries, setting the rpath not only works, but is required. Fixes #2906 Original commit: KhronosGroup/SPIRV-LLVM-Translator@7c0db925422d1f5
This patch
adds support for template parameter on triple arguments
SYCL function attributes such as:
Default arguments in ReqWorkGroupSize can be used only with
intel::reqd_work_group_size spelling.
Checks correctness of mutual usage of different work_group_size attributes:
reqd_work_group_size, max_work_group_size and max_global_work_dim.
Values of reqd_work_group_size arguments shall be equal or less than values
coming from max_work_group_size.
In case the value of 'max_global_work_dim' attribute equals to 0 we shall
ensure that if max_work_group_size and reqd_work_group_size attributes exist,
they hold equal values (1, 1, 1).
This patch does not add template parameter support on WorkGroupSizeHintAttr
since the attribute does not have a [[]] spelling because it is an OpenCL-related attribute.
updates sema/codegen tests with mock headers on device.
splits tests of 'intel-max-work-group-size' and
'intel-reqd-work-group-size' attributes into host and device.
adds new group and new diagnostic (does not match with others) since
the attribute accepts negative argument value with a warning
(keeps treating the value as being unsigned).
Signed-off-by: Soumi Manna soumi.manna@intel.com