8303762: Optimize vector slice operation with constant index using VPALIGNR instruction #24104
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -437,6 +437,31 @@ CallGenerator* CallGenerator::for_mh_late_inline(ciMethod* caller, ciMethod* cal | |
| return cg; | ||
| } | ||
|
|
||
| class LateInlineVectorCallGenerator : public LateInlineCallGenerator { | ||
| protected: | ||
| CallGenerator* _inline_cg; | ||
|
|
||
| public: | ||
| LateInlineVectorCallGenerator(ciMethod* method, CallGenerator* intrinsic_cg, CallGenerator* inline_cg) : | ||
| LateInlineCallGenerator(method, intrinsic_cg) , _inline_cg(inline_cg) {} | ||
|
|
||
| CallGenerator* inline_cg2() const { return _inline_cg; } | ||
| bool inline_fallback(); | ||
| virtual bool is_vector_late_inline() const { return true; } | ||
| }; | ||
|
|
||
| bool LateInlineVectorCallGenerator::inline_fallback() { | ||
| switch (method()->intrinsic_id()) { | ||
| case vmIntrinsics::_VectorSlice: return true; | ||
| default : return false; | ||
| } | ||
| } | ||
|
|
||
| CallGenerator* CallGenerator::for_vector_late_inline(ciMethod* m, CallGenerator* intrinsic_cg, CallGenerator* inline_cg) { | ||
| return new LateInlineVectorCallGenerator(m, intrinsic_cg, inline_cg); | ||
| } | ||
|
|
||
|
|
||
| // Allow inlining decisions to be delayed | ||
| class LateInlineVirtualCallGenerator : public VirtualCallGenerator { | ||
| private: | ||
|
|
@@ -673,6 +698,14 @@ void CallGenerator::do_late_inline_helper() { | |
|
|
||
| // Now perform the inlining using the synthesized JVMState | ||
| JVMState* new_jvms = inline_cg()->generate(jvms); | ||
| // Attempt inlining fallback implementation in case of | ||
| // intrinsification failure. | ||
| if (new_jvms == nullptr && is_vector_late_inline()) { | ||
|
Member
This may be problematic if intrinsification fails only because the arguments have not yet been constant-folded. The order in which methods are processed during incremental inlining is not deterministic.
Member
Author
Hi @merykitty, intrinsification failure for any such reason behaves the same with and without this patch. When the slice intrinsic fails, we simply inline the fallback implementation, which is composed of Vector API calls. The VectorSupport* entry points of those APIs then go through intrinsification attempts independently, and each may succeed or fail depending on whether its constraints are met.
Member
Author
I think there is no harm in inlining the fallback after the first unsuccessful intrinsification attempt for sliceOp. The fallback is composed of Vector API calls, and we are giving them an opportunity for intrinsification. This saves a costly boxing operation, and performance will be on par with what we have today. WDYT?
Member
But this will affect other intrinsics, too, and they are not implemented using other Vector API operations.
Member
Author
We only perform fallback inlining on the first intrinsification failure for sliceOp; this is a very localized change.
Member
Thanks for pointing out the change. I think that's more hacky than I expected.
Member
Author
This selectively enables inlining for failed intrinsics whose fallback implementation uses the Vector API. |
||
| LateInlineVectorCallGenerator* late_inline_vec_cg = static_cast<LateInlineVectorCallGenerator*>(this); | ||
| if (late_inline_vec_cg->inline_fallback()) { | ||
| new_jvms = late_inline_vec_cg->inline_cg2()->generate(jvms); | ||
| } | ||
| } | ||
| if (new_jvms == nullptr) return; // no change | ||
| if (C->failing()) return; | ||
|
|
||
|
|
||
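The control flow debated in the review thread (try the intrinsic first, and only for the slice operation inline the fallback body when that attempt yields nothing) can be sketched outside HotSpot as follows. This is a minimal illustration, not the patch itself: `IntrinsicId`, `Generator`, and `LateInlineVectorCall` are hypothetical stand-ins for `vmIntrinsics`, `CallGenerator`, and `LateInlineVectorCallGenerator`, and an empty `std::optional` plays the role of a `nullptr` `JVMState`.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical stand-ins; none of these are real HotSpot types.
enum class IntrinsicId { VectorSlice, Other };

// A generator returns a label for the code it produced, or an empty
// optional when it cannot produce anything (analogous to a null JVMState).
using Generator = std::function<std::optional<std::string>()>;

struct LateInlineVectorCall {
  IntrinsicId id;
  Generator intrinsic;  // attempts intrinsification
  Generator fallback;   // inlines the fallback implementation

  // Mirrors the switch in the diff: only the slice intrinsic opts in.
  bool inline_fallback() const { return id == IntrinsicId::VectorSlice; }

  std::optional<std::string> do_late_inline() const {
    std::optional<std::string> result = intrinsic();
    // Fall back only when intrinsification failed and this intrinsic opted in.
    if (!result && inline_fallback()) {
      result = fallback();
    }
    return result;  // still empty means "no change"
  }
};
```

Under these assumptions, a failed slice intrinsic takes the fallback path on its first failure, while any other failed intrinsic is left unchanged. That is the "localized" behavior the author describes, and it is also why the reviewer's ordering concern applies: the sketch never retries the intrinsic after the fallback has been taken.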