Notices by Fabian Giesen (rygorous@mastodon.gamedev.place)

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 02-Jun-2025 03:54:36 JST Fabian Giesen
in reply to
- ✧✦Catherine✦✧
@whitequark they're all like that
Attachments
1. Untitled attachment
  https://cdn.masto.host/mastodongamedevplace/media_attachments/files/114/609/608/469/298/294/original/566fa5a17b0515c9.png
Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:17:01 JST Fabian Giesen
in reply to
@regehr @resistor @barrelshifter I originally knew about them as one of the standard topologies for implementing a generic shifter.

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:17:00 JST Fabian Giesen
in reply to
@regehr @resistor @barrelshifter If you want to just shift right or just shift left, that's straightforward, but if you need both (plus variants like arithmetic shifts), you end up making two or more full shifters, which is not great.
The main standard solutions are:
1. build a rotator, reduce everything to rotate + masking
2. funnel shifter: set up the inputs, then do a unidirectional funnel shift
3. data-reversal shifter: optional bit reverse, unidirectional shifter, optional bit reverse.
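
A minimal C sketch of option 2, assuming a 32-bit datapath: every ordinary shift is reduced to one funnel shift right over a 64-bit concatenation, and only the input setup differs. The function names (`fshr32`, `lsr32`, and so on) are made up for illustration and are not taken from any particular design.

```c
#include <assert.h>
#include <stdint.h>

/* Funnel shift right: form the 64-bit value hi:lo, shift it right by
   n (0..31), and keep the low 32 bits. */
static inline uint32_t fshr32(uint32_t hi, uint32_t lo, unsigned n)
{
    assert(n < 32);
    uint64_t wide = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(wide >> n);
}

/* Logical shift right: funnel in zeros from above. */
static inline uint32_t lsr32(uint32_t x, unsigned n)
{
    return fshr32(0, x, n);
}

/* Arithmetic shift right: funnel in copies of the sign bit. */
static inline uint32_t asr32(uint32_t x, unsigned n)
{
    uint32_t sign_fill = (x >> 31) ? 0xFFFFFFFFu : 0u;
    return fshr32(sign_fill, x, n);
}

/* Rotate right: funnel in x itself. */
static inline uint32_t ror32(uint32_t x, unsigned n)
{
    return fshr32(x, x, n);
}

/* Shift left by n: put x in the high half, zeros in the low half, and
   funnel-shift right by 32 - n. n == 0 is handled separately because
   a shift amount of 32 is out of range for fshr32. */
static inline uint32_t lsl32(uint32_t x, unsigned n)
{
    assert(n < 32);
    return n ? fshr32(x, 0, 32 - n) : x;
}
```

The shifter itself only ever moves data in one direction; the left shift and the arithmetic variant fall out of what gets fed into the two input halves.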

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:16:59 JST Fabian Giesen
in reply to
@regehr @resistor @barrelshifter actual funnel shift as an exposed operation is much rarer, especially the variable-shift-amount variant, because it absolutely needs a 3-operand architecture on the datapath. GPUs usually pervasively have that, and it's fairly common to see native funnel shift operations exposed on GPUs. (e.g. AMD has v_alignbit_b32, NV has SHF https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#maxwell-pascal)
Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:16:58 JST Fabian Giesen
in reply to
@regehr @resistor @barrelshifter anyway I know about the funnel variants mostly from the HW side, e.g. stuff like this https://iccd.et.tudelft.nl/2008/proceedings/626huntzicker.pdf ("Energy-Delay Tradeoffs in 32-bit Static Shifter Designs", Huntzicker et al. 2008). Data-reversal shifters are e.g. in https://web.archive.org/web/20170706054724id_/https://www.princeton.edu/~rblee/ELE572Papers/Fall04Readings/Shifter_Schulte.pdf ("Design alternatives for barrel shifters", Pillmeier et al. 2002)
Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:16:57 JST Fabian Giesen
in reply to
@regehr @resistor @barrelshifter For some examples:
- PPC and ARM A64 very visibly go for the rotate+mask approach. PPC does everything with rot left + mask (see e.g. https://devblogs.microsoft.com/oldnewthing/20180810-00/?p=99465), A64 is rot right + mask (https://www.scs.stanford.edu/~zyedidia/arm64/ubfm.html), and the A64 logic immediates use basically the same mask generation logic. (So it can be shared if desired, in a small design.)
- I have seen datapath details for some Intel designs that used funnel shift right for all usual shifts.
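
To make the rotate + mask approach concrete, here is a small C sketch in the rotate-left style described for PPC, assuming 32-bit values. The helper names are hypothetical, and this models only the arithmetic, not any actual instruction encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by n (0..31). The "-n & 31" form avoids a shift by 32
   when n == 0. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    assert(n < 32);
    return (x << n) | (x >> (-n & 31));
}

/* Logical shift left: rotate left by n, then clear the n low bits
   that wrapped around from the top. */
static inline uint32_t lsl32_rotmask(uint32_t x, unsigned n)
{
    assert(n < 32);
    return rotl32(x, n) & (0xFFFFFFFFu << n);
}

/* Logical shift right: rotate left by (32 - n) mod 32, then clear the
   n high bits that wrapped around from the bottom. */
static inline uint32_t lsr32_rotmask(uint32_t x, unsigned n)
{
    assert(n < 32);
    return rotl32(x, -n & 31) & (0xFFFFFFFFu >> n);
}

/* Bitfield extract: rotate the field down to bit 0, then keep only
   `width` low bits (1..32). */
static inline uint32_t extract_bits(uint32_t x, unsigned lsb, unsigned width)
{
    assert(lsb < 32 && width >= 1 && width <= 32);
    return rotl32(x, -lsb & 31) & (0xFFFFFFFFu >> (32 - width));
}
```

All three operations share the same rotator and differ only in the rotate amount and the mask, which is why the mask generation logic is the part worth sharing.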
Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:16:55 JST Fabian Giesen
in reply to
@regehr @resistor @barrelshifter If you're wondering why they're called funnel shifters, just look at the wiring diagram.
Attachments
1. Untitled attachment
  https://cdn.masto.host/mastodongamedevplace/media_attachments/files/114/586/730/520/813/697/original/4b3ee4c307f95e44.png
Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 02:15:05 JST Fabian Giesen
in reply to
- Jessica Paquette
@barrelshifter this is now years ago, but I vividly recall how much resistance there was to adding rotates to LLVM IR too because "can't you just build them out of simpler shifts?"
(in short: yes, but not reliably for variable shifts; the patterns to match for rotates with variable shift amount are quite fragile and easily - and frequently - broken by unrelated opts in the middle end)
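
For reference, a sketch of the kind of source-level pattern in question, written in C rather than LLVM IR. The obvious form `(x << n) | (x >> (32 - n))` shifts by 32 when n == 0, which is undefined in C, so the variable-amount idiom needs an extra masking step; the function names here are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Variable rotate left built from two shifts. The right-shift amount
   is computed as (-n & 31) so that n == 0 yields a shift by 0 rather
   than by 32. */
static inline uint32_t rotl32_var(uint32_t x, uint32_t n)
{
    assert(n < 32);
    return (x << n) | (x >> (-n & 31));
}

/* Variable rotate right, same idea with the shift directions swapped. */
static inline uint32_t rotr32_var(uint32_t x, uint32_t n)
{
    assert(n < 32);
    return (x >> n) | (x << (-n & 31));
}
```

A pattern matcher has to recognize the two shifts, the negate, the mask, and the OR as one unit; if an unrelated transform rewrites any one of those pieces first, the match can fail and the rotate is emitted as separate shifts.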

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 02:15:04 JST Fabian Giesen
in reply to
- Jessica Paquette
- Owen Anderson
@resistor @barrelshifter The ADD->OR thing is another frequent monkey wrench especially wrt matching things that could be addressing modes, yeah.
It's not a worthless transformation for larger-than-native-register integers (where not having cross-limb carries is a solid win) but for <=pointer size, I'd argue it hurts more than it helps.
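
A tiny C illustration of why the rewrite is legal at all, assuming a base whose low 4 bits are known to be zero and an offset below 16: with no set bit in common there are no carries, so add and OR give the same result. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* base16 must have its low 4 bits clear and offset must be < 16, so
   the two operands cannot have a set bit in common. */
static inline uint32_t element_addr_add(uint32_t base16, uint32_t offset)
{
    assert((base16 & 15u) == 0 && offset < 16u);
    return base16 + offset;   /* reads as base + displacement */
}

static inline uint32_t element_addr_or(uint32_t base16, uint32_t offset)
{
    assert((base16 & 15u) == 0 && offset < 16u);
    return base16 | offset;   /* same value, but no longer shaped like an address computation */
}
```

The values are identical; the cost is that the OR form no longer looks like the base + offset shape an addressing-mode matcher is looking for.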
Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 02:15:02 JST Fabian Giesen
in reply to
- Jessica Paquette
- Owen Anderson
@resistor @barrelshifter Either way, I like the way the funnel shift solution worked out. (And will always be grateful for Sanjay doing the legwork!)
They're a good normal form, funnel shift<->rotate (where applicable) is canonical and trivial in both directions, they can be formed early (which avoids destroying the pattern), they're reasonably common in target ISAs in their own right, and are still pretty straightforward to lower where they're not available.
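
A short C sketch of both points, assuming 32-bit operands and a shift amount already reduced to 0..31 (the LLVM funnel shift intrinsics take the amount modulo the bit width): a funnel shift left with both inputs equal is a rotate, and on a target without the operation it lowers to two shifts and an OR. The names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Funnel shift left: take the top 32 bits of (hi:lo) << n, lowered to
   plain shifts. The right shift is split into (lo >> 1) >> (31 - n)
   so that n == 0 never produces a shift by 32. */
static inline uint32_t fshl32(uint32_t hi, uint32_t lo, uint32_t n)
{
    assert(n < 32);
    return (hi << n) | ((lo >> 1) >> (31 - n));
}

/* Rotate left is the funnel shift with both inputs equal. */
static inline uint32_t rotl32_via_fshl(uint32_t x, uint32_t n)
{
    return fshl32(x, x, n);
}
```

Going the other way is just as direct: a funnel shift whose two inputs are the same value can be emitted as a rotate instruction where the target has one.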

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 22-May-2025 09:11:02 JST Fabian Giesen
in reply to

This instruction:
mov [rDest + <index>], ch
under these conditions, when overclocked a bit, once the machine has "warmed up", seems to have around a 1/10000 chance of actually storing the contents of CL instead of CH to memory.
(this was "fun" to debug.)
The workaround: when we detect Raptor Lake CPUs, we now do
shr ecx, 8
mov [rDest + <index>], cl
instead. This takes more FE and uop bandwidth, but this loop is mainly latency-limited, and this is off the critical path.
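
In C terms, both instruction sequences store the same thing: bits 8..15 of the 32-bit register. The sketch below only models that intended result; the function and parameter names are hypothetical, and which instructions a compiler emits for it is not specified here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the second-lowest byte of `ecx` (what CH holds on x86) to
   dest[index]. The workaround gets there by shifting right by 8 and
   storing the low byte (CL) instead of reading CH directly. */
static inline void store_second_byte(uint8_t *dest, size_t index, uint32_t ecx)
{
    assert(dest != NULL);
    dest[index] = (uint8_t)(ecx >> 8);
}
```

The failure described above corresponds to the store occasionally writing `(uint8_t)ecx` instead, i.e. the low byte rather than the second byte.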

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 22-May-2025 09:10:57 JST Fabian Giesen

What's that mysterious workaround?
Core Huff6 decode step is described in https://fgiesen.wordpress.com/2023/10/29/entropy-decoding-in-oodle-data-x86-64-6-stream-huffman-decoders/
A customer managed to get a fairly consistent repro for transient decode errors by overclocking an i7-14700KF by about 5% from stock settings ("performance" multiplier 56->59).
It took weeks of back and forth and forensic debugging to figure out what actually happens, but TL;DR: the observed decode errors are all consistent with a single instruction misbehaving.
Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 19-May-2025 17:51:34 JST Fabian Giesen

I've been around _just_ long enough for pretty much the entire history of the etymology of computer storage device names to get hella confusing

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 19-May-2025 17:51:33 JST Fabian Giesen
in reply to

When I was a small kid, floppy disks were actually floppy (5.25") and contained actual spinning disks, floppy disk drives contained the thing that drove the disks (i.e. the motor), and hard disk drives had hard disks (metal platters) and the thing that drove them all in one package. Fair enough.
Not long after, we got 3.5" floppies that had a hard plastic shell and were not actually floppy anymore. Still called them floppies.

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 19-May-2025 17:51:32 JST Fabian Giesen
in reply to

So a 3.5" floppy drive is actually a drive for a disk in a 3.5" hard shell but whatever.
Then we got other magnetic removable storage formats that were also all hard-shelled but whatever.
Then audio CDs got adapted for data storage and we got CD-ROMs with CDs (and later DVDs) that were actually disc-shaped, unlike floppies where the actual storage medium was disc-shaped but the overall package wasn't.

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 19-May-2025 17:51:30 JST Fabian Giesen
in reply to

and then we got solid-state storage and the entire terminology is just terminally bonkers now
we have "disks" that are neither diskettes nor disc-shaped (the actual chips are rectangular), "disk drives" that don't interact with dis[ck]s and don't drive anything (not in a mechanical sense anyway), and the "solid-state" part refers to no moving parts but of course the stuff with moving parts is also all in a solid state of matter, generally

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 19-May-2025 17:51:29 JST Fabian Giesen
in reply to

in short, two of the three words in "solid-state drive" refer to the thing it's got in common with literally all the other competing storage technologies and the remaining word describes the one thing it doesn't actually do, definitionally

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Tuesday, 13-May-2025 03:10:13 JST Fabian Giesen

FOR IMMEDIATE RELEASE
Santa celebrates 50 years "out of list-making business"
NORTH POLE. May 12, 2025
Santa is celebrating 50 years without the naughty/nice list. "We stopped doing it in 1975 since it felt out of touch with the times then; these days, with frequent data breaches, privacy regulations, COPPA, GDPR... frankly it feels like a liability nightmare, I don't think anyone would even seriously consider doing this now. I mean, a naughty list leak. Can you imagine?"

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Sunday, 01-Dec-2024 00:07:57 JST Fabian Giesen

a barometer is just a mic with a narrow frequency response centered at 0Hz and I'm tired of pretending it's not
don't @ me

Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 22-Jul-2024 21:45:24 JST Fabian Giesen

so I don't want to advocate for opening cans of worms but I _would_ like to know who makes these canned worms, and who stocks them everywhere
