The talk will explain unfamiliar concepts in more common terms like:
Vector registers are just registers in which a CPU can store multiple numbers that belong together and are processed independently of each other, all in the same operation. This allows higher processing performance, similar to how moving a pallet of same-sized boxes can be quicker than moving the boxes one by one.
It will then use those new terms to draw comparisons like:
The largest vector registers available in any other CPU today are 512 bits long, whereas each VE core has 64 vector registers that are each 16384 bits long. This puts it in a class of its own among CPUs.
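To make the pallet comparison a bit more concrete, here is a minimal Python sketch that only simulates the idea; it is not VE code. It assumes 64-bit elements and a VE register of 16384 bits (256 elements of 64 bits each). The names `lanes` and `vector_add` are made up for this illustration.

```python
ELEMENT_BITS = 64  # assumption: each number is a 64-bit value


def lanes(register_bits, element_bits=ELEMENT_BITS):
    """How many numbers fit side by side in one vector register."""
    return register_bits // element_bits


def vector_add(a, b, register_bits):
    """Add two equal-length lists one register-sized chunk at a time.

    Returns the result and the number of simulated vector instructions,
    where one instruction handles one full register worth of elements.
    """
    width = lanes(register_bits)
    out = []
    instructions = 0
    for start in range(0, len(a), width):
        chunk_a = a[start:start + width]
        chunk_b = b[start:start + width]
        # Every lane is independent: no element depends on its neighbour.
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1
    return out, instructions


a = list(range(1024))
b = list(range(1024))
_, n512 = vector_add(a, b, 512)      # a 512-bit register
_, n16384 = vector_add(a, b, 16384)  # a 16384-bit register
print(lanes(512), lanes(16384))      # 8 256
print(n512, n16384)                  # 128 4
```

Under these assumptions, adding 1024 numbers takes 128 simulated operations with 512-bit registers and 4 with 16384-bit registers. Real hardware performance depends on far more than this count, so treat it purely as an illustration of the width difference.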
If you weren't scared off by this, you shouldn't find the talk too technical. If you have a deep grasp of computing technology and wonder whether this talk might be interesting, you will hear about some implementation choices by NEC that draw reactions from deep within the Kübler-Ross stages of grief.
There will be a short introduction to the VE instruction set, highlighting a few instructions which are "fun" or otherwise "interesting" and might have some general computing trivia associated with them (https://en.wikipedia.org/wiki/Fast_inverse_square_root, https://vaibhavsagar.com/blog/2019/09/08/popcount/). The different offloading modes of a VE are introduced, one of which is entirely novel and emphasizes the VE's uniqueness and sheer quirkiness.
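For readers who have not met the two pieces of trivia linked above, here is a rough Python sketch of both tricks. It shows the classic software versions only and makes no claim about how the VE instruction set implements or exposes them. The function names are chosen for this illustration.

```python
import struct


def fast_inverse_sqrt(x):
    """Approximate 1/sqrt(x) for positive x using the well-known bit trick."""
    # Reinterpret the 32-bit float as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Magic constant minus half the bit pattern gives a first guess.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer as a float again.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson step to refine the guess.
    return y * (1.5 - 0.5 * x * y * y)


def popcount64(x):
    """Count the set bits of a 64-bit value without looping over the bits."""
    mask64 = 0xFFFFFFFFFFFFFFFF
    x &= mask64
    x = x - ((x >> 1) & 0x5555555555555555)                         # 2-bit sums
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)  # 4-bit sums
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F                         # 8-bit sums
    # The multiplication adds all byte sums into the top byte.
    return ((x * 0x0101010101010101) & mask64) >> 56


print(fast_inverse_sqrt(4.0))  # close to 0.5
print(popcount64(0xFF))        # 8
```

Both functions trade readability for a fixed, branch-free sequence of simple operations, which is the kind of work that tends to end up as a single hardware instruction.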
Programs executing on a Vector Engine run in a Linux environment, so many applications could be made to run on this accelerator, unlocking GPU-like performance for them without rewrites, provided the code can make use of these big vector registers and the massive memory bandwidth available to them. So it's unsurprising that it is enormously fun to touch up identified bottlenecks and see an application get 200x faster with a handful of fixes. We can call hardware homebrewed if we make 2048 run on it, can't we?
The presentation of the hacks done by people who joined my "vect.or.at" Vector Engine PUBNIX (basically a shared Linux computer) will cover such speedups, mention the state of an ongoing attempt to port the Rust programming language to it, attempts at digital preservation, and progress towards making the Vector Engine truly yours by "rooting" it to mess with hardware settings that are otherwise unavailable.
The introduction to HPC portion will be structured as an argument claiming "A NEC Vector Engine would turn your (Linux) computer into a small supercomputer" and use this as motivation to introduce what such a supercomputer or HPC cluster is, how you can make it work for you, and which software packages are commonly used. A few performance "tripping" hazards are also mentioned.