NEON performance for 11.11
We have a list of near-term vectoriser improvements. Discuss those that adapt existing vectoriser features to NEON, those that add new general-purpose features exposed by the ARM investigation, and those that are ARM-specific.
This blueprint contains tasks that don't warrant a blueprint of their own. For the other topics that have been spun out, see:
https:/
Blueprint information
- Status: Complete
- Approver: Michael Hope
- Priority: High
- Drafter: Ira Rosen
- Direction: Needs approval
- Assignee: Ira Rosen
- Definition: Approved
- Series goal: Accepted for 4.6
- Implementation: Implemented
- Milestone target: None
- Started by: Michael Hope
- Completed by: Michael Hope
Whiteboard
The blocks have been split into blueprints to match the new style.
Below are the original blocks before splitting, kept here because
blueprints don't have history.
Doubling multiply:
Use NEON doubling multiply instructions: DONE
Implement, upstream, and backport: DONE
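For illustration only, a scalar loop of the kind this item is about might look like the sketch below. It is an assumption and not taken from the blueprint: q15_mul and q15_mul_array are hypothetical names, and the comparison with the NEON saturating doubling multiply (VQDMULH.S16) describes that instruction's arithmetic, not a pattern that any particular GCC release is known to recognise.

```c
#include <stddef.h>
#include <stdint.h>

/* Q15 fixed-point multiply: saturate((2 * a * b) >> 16).
 * This matches the per-lane arithmetic of VQDMULH.S16.
 * Assumes arithmetic right shift of negative values, as GCC provides. */
static int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;

    /* Doubling overflows only when a == b == INT16_MIN. */
    if (p == INT32_C(0x40000000))
        return INT16_MAX;

    return (int16_t)((p * 2) >> 16);
}

/* Hypothetical candidate loop for the vectoriser. */
void q15_mul_array(int16_t *restrict out, const int16_t *restrict a,
                   const int16_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = q15_mul(a[i], b[i]);
}
```

With 16384 standing for 0.5 in Q15, q15_mul(16384, 16384) gives 8192 (0.25), and the single overflowing case q15_mul(INT16_MIN, INT16_MIN) saturates to INT16_MAX.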
Over-promotion:
Reduce over-promotion in multiplication: DONE
Implement, upstream, and backport: DONE
Fix vectorizer testsuite failures upstream: DONE
Reduce over-promotion of vector operations that could be done with narrower elements: DONE
Implement, upstream, and backport: DONE
Use NEON widening shift left instruction: INPROGRESS
Implement, upstream, and backport: TODO
Change the default vector size for NEON to 128 bits: INPROGRESS
Implement and upstream: DONE
Backport: TODO
Investigate how effectively GCC uses the NEON narrowing arithmetic instructions: DONE
Implement, upstream, and backport: POSTPONED
Investigate excessive use of vmov instructions: TODO
Implement, upstream, and backport: TODO
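As a rough sketch of what over-promotion means here: C promotes 8-bit operands to int before arithmetic, so a naive vectorisation would widen to 32-bit lanes even when the result is stored in 16 bits. The loops below are hypothetical examples, not code from the blueprint. Whether GCC maps them to NEON widening instructions such as VMULL or VSHLL depends on the compiler version and flags (for example -O3 -mfpu=neon), so treat that mapping as an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* The product of two 8-bit values always fits in 16 bits (255 * 255 = 65025),
 * so 16-bit lanes are wide enough even though C computes it as int. */
static uint16_t mul_u8(uint8_t a, uint8_t b)
{
    return (uint16_t)(a * b);
}

/* An 8-bit value shifted left by 8 also fits in 16 bits (255 << 8 = 65280). */
static uint16_t shl_u8(uint8_t a)
{
    return (uint16_t)(a << 8);
}

/* Hypothetical candidate loop for a widening multiply. */
void mul_u8_array(uint16_t *restrict out, const uint8_t *restrict a,
                  const uint8_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = mul_u8(a[i], b[i]);
}

/* Hypothetical candidate loop for a widening shift left. */
void shl_u8_array(uint16_t *restrict out, const uint8_t *restrict a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = shl_u8(a[i]);
}
```

In both loops the input and output element widths differ by exactly one step (8 to 16 bits), which is the shape the widening forms operate on.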
Peeling:
Improve peeling heuristic in the vectorizer - without cost model: DONE
Implement, upstream, and backport: DONE
Investigate if peeling is effective for NEON both with and without cost model: TODO
Implement any improvements, upstream, and backport: TODO
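For readers unfamiliar with peeling, the sketch below shows the transformation written out by hand: a few leading iterations run one element at a time until the pointer reaches a 16-byte boundary, then the main body proceeds in groups of four. This is an illustrative assumption, not GCC's implementation or its heuristic; peel_elems, add_const_peeled and VEC_BYTES are hypothetical names, and the 16-byte figure simply assumes 128-bit vectors.

```c
#include <stddef.h>
#include <stdint.h>

#define VEC_BYTES 16u /* assumed vector size: 128 bits */

/* Number of leading elements to handle one by one so that the next
 * element starts on a VEC_BYTES boundary. Assumes addr is a multiple
 * of elem_size. The result is capped at n. */
static size_t peel_elems(uintptr_t addr, size_t elem_size, size_t n)
{
    size_t misalign = (size_t)(addr % VEC_BYTES);
    size_t peel = misalign ? (VEC_BYTES - misalign) / elem_size : 0;

    return peel < n ? peel : n;
}

/* Adds c to every element, structured as prologue, aligned body, epilogue. */
void add_const_peeled(uint32_t *a, size_t n, uint32_t c)
{
    size_t i = 0;
    size_t peel = peel_elems((uintptr_t)a, sizeof a[0], n);

    for (; i < peel; i++)          /* peeled prologue */
        a[i] += c;

    for (; i + 4 <= n; i += 4) {   /* body: a[i] is 16-byte aligned here */
        a[i]     += c;
        a[i + 1] += c;
        a[i + 2] += c;
        a[i + 3] += c;
    }

    for (; i < n; i++)             /* epilogue for the remainder */
        a[i] += c;
}
```

The prologue and epilogue cost scalar iterations, which is why the benefit of peeling depends on trip count and on how expensive unaligned accesses are on the target.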
arm_neon.h:
Check for any upstream enhancement requests: TODO
Check if further work is needed: TODO
Do round 1: TODO
Do round 2: TODO
Coverage:
Add tests for NEON instructions that can be directly expressed in C: INPROGRESS
Document the current vectoriser coverage: INPROGRESS
Document the current NEON backend coverage: TODO
See http://