GNU social JP
GNU social JP is a Japanese GNU social server.

Notices by Fabian Giesen (rygorous@mastodon.gamedev.place)

  1. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 02-Jun-2025 03:54:36 JST
    in reply to
    • ✧✦Catherine✦✧

    @whitequark they're all like that


    Attachments


    1. https://cdn.masto.host/mastodongamedevplace/media_attachments/files/114/609/608/469/298/294/original/566fa5a17b0515c9.png
  2. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:17:01 JST
    in reply to
    • Jessica Paquette
    • John Regehr
    • Owen Anderson

    @regehr @resistor @barrelshifter I know about them originally as one of the standard topologies to implement a generic shifter.

  3. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:17:00 JST
    in reply to
    • Jessica Paquette
    • John Regehr
    • Owen Anderson

    @regehr @resistor @barrelshifter If you want to just shift right or just shift left, that's straightforward, but if you need both (plus variants like arithmetic shifts), you end up making two or more full shifters, which is not great.

    The main standard solutions are:
    1. build a rotator, reduce everything to rotate+masking
    2. build a unidirectional funnel shifter and set up its inputs so every shift maps onto it
    3. data reversal shifter: optional bit reverse, uni-directional shifter, optional bit reverse.
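
    A minimal C sketch of the three options, assuming 32-bit values and shift counts of 0..31. All function names here are made up for illustration, and this says nothing about how any particular CPU actually wires its shifter; it only shows that each option gets both shift directions out of a single shifting primitive.

```c
#include <assert.h>
#include <stdint.h>

/* Option 1: one rotator; every shift is rotate + mask. */
static uint32_t rotr32(uint32_t x, unsigned n) {
    assert(n < 32);
    return (x >> n) | (x << ((32 - n) & 31));
}

static uint32_t shr_via_rotate(uint32_t x, unsigned n) {
    assert(n < 32);
    /* Rotate right, then clear the n bits that wrapped into the top. */
    return rotr32(x, n) & (0xFFFFFFFFu >> n);
}

static uint32_t shl_via_rotate(uint32_t x, unsigned n) {
    assert(n < 32);
    /* Rotating right by 32 - n rotates left by n; clear the wrapped low bits. */
    return rotr32(x, (32 - n) & 31) & (0xFFFFFFFFu << n);
}

/* Option 2: one right funnel shifter over the 64-bit value hi:lo. */
static uint32_t funnel_shr(uint32_t hi, uint32_t lo, unsigned n) {
    assert(n < 32);
    return (uint32_t)((((uint64_t)hi << 32) | lo) >> n);
}

static uint32_t shr_via_funnel(uint32_t x, unsigned n) {
    return funnel_shr(0, x, n);        /* zeros funnel in from the top */
}

static uint32_t sar_via_funnel(uint32_t x, unsigned n) {
    uint32_t sign = (x >> 31) ? 0xFFFFFFFFu : 0u;
    return funnel_shr(sign, x, n);     /* copies of the sign bit funnel in */
}

static uint32_t shl_via_funnel(uint32_t x, unsigned n) {
    assert(n < 32);
    /* x << n is the 32-bit window n bits below the top of x:0. This sketch
       only accepts counts 0..31, so n == 0 is special-cased. */
    return n == 0 ? x : funnel_shr(x, 0, 32 - n);
}

/* Option 3: optional bit reverse, right shifter, optional bit reverse. */
static uint32_t bitrev32(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

static uint32_t shl_via_reversal(uint32_t x, unsigned n) {
    assert(n < 32);
    return bitrev32(bitrev32(x) >> n); /* a left shift is a mirrored right shift */
}
```

    In hardware the bit reversal in option 3 is just wiring plus a mux, which is why it can be cheaper than a second shifter.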

  4. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:16:59 JST
    in reply to
    • Jessica Paquette
    • John Regehr
    • Owen Anderson

    @regehr @resistor @barrelshifter actual funnel shift as an exposed operation is much rarer, especially the variable-shift-amount variant, because it absolutely needs a 3-operand architecture on the datapath. GPUs usually pervasively have that, and it's fairly common to see native funnel shift operations exposed on GPUs. (e.g. AMD has v_alignbit_b32, NV has SHF https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#maxwell-pascal)
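
    To make the 3-operand point concrete, here is a small C sketch of a variable funnel shift right: the result depends on two data words plus a count, so the datapath has to read three inputs. `fshr32` is a generic model and is not meant to reproduce the exact semantics of `v_alignbit_b32` or `SHF`; `read_bits32` is a hypothetical helper showing one typical use, reading a 32-bit window at an arbitrary bit offset from packed words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Three inputs: two data words and a variable shift count. */
static uint32_t fshr32(uint32_t hi, uint32_t lo, unsigned n) {
    assert(n < 32);
    return (uint32_t)((((uint64_t)hi << 32) | lo) >> n);
}

/* Hypothetical helper: read 32 bits starting at bit position bitpos from
   words packed LSB-first. The caller must guarantee that
   words[bitpos / 32 + 1] is readable. */
static uint32_t read_bits32(const uint32_t *words, size_t bitpos) {
    size_t i = bitpos / 32;
    return fshr32(words[i + 1], words[i], (unsigned)(bitpos % 32));
}
```

    Without a native funnel shift, the same read needs two shifts and an OR, plus care around a bit offset of zero.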

  5. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:16:58 JST
    in reply to
    • Jessica Paquette
    • John Regehr
    • Owen Anderson

    @regehr @resistor @barrelshifter anyway I know about the funnel variants mostly from the HW side, e.g. stuff like this https://iccd.et.tudelft.nl/2008/proceedings/626huntzicker.pdf ("Energy-Delay Tradeoffs in 32-bit Static Shifter Designs", Huntzicker et al. 2008). Data-reversal shifters are e.g. in https://web.archive.org/web/20170706054724id_/https://www.princeton.edu/~rblee/ELE572Papers/Fall04Readings/Shifter_Schulte.pdf ("Design alternatives for barrel shifters", Pillmeier et al. 2002)

  6. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:16:57 JST
    in reply to
    • Jessica Paquette
    • John Regehr
    • Owen Anderson

    @regehr @resistor @barrelshifter For some examples:
    - PPC and ARM A64 very visibly go for the rotate+mask approach. PPC does everything with rot left + mask (see e.g. https://devblogs.microsoft.com/oldnewthing/20180810-00/?p=99465), A64 is rot right + mask (https://www.scs.stanford.edu/~zyedidia/arm64/ubfm.html), and the A64 logic immediates use basically the same mask generation logic. (So it can be shared if desired, in a small design.)
    - I have seen datapath details for some Intel designs that used funnel shift right for all usual shifts.
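
    As a rough illustration of the rotate right + mask style, here is a C sketch loosely modeled on the 32-bit form of A64 `UBFM`. The name `ubfm_like32` and the helper functions are assumptions for this example, and the authoritative definition is the Arm pseudocode linked above. The point is only that plain shifts and bitfield extracts all fall out of one rotate plus one generated mask.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rotr32(uint32_t x, unsigned n) {
    assert(n < 32);
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Mask with the low `width` bits set, width in 0..32. */
static uint32_t low_mask(unsigned width) {
    assert(width <= 32);
    return width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

/* Rotate right by immr, then keep only the selected field. */
static uint32_t ubfm_like32(uint32_t x, unsigned immr, unsigned imms) {
    assert(immr < 32 && imms < 32);
    uint32_t mask = (imms >= immr)
        ? low_mask(imms - immr + 1)           /* bits [imms:immr] moved down to bit 0 */
        : low_mask(imms + 1) << (32 - immr);  /* bits [imms:0] moved up to bit 32 - immr */
    return rotr32(x, immr) & mask;
}

/* The ordinary shifts are special cases of the same operation. */
static uint32_t lsr32(uint32_t x, unsigned n) {
    assert(n < 32);
    return ubfm_like32(x, n, 31);
}

static uint32_t lsl32(uint32_t x, unsigned n) {
    assert(n < 32);
    return ubfm_like32(x, (32 - n) & 31, 31 - n);
}
```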

  7. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 03:16:55 JST
    in reply to
    • Jessica Paquette
    • John Regehr
    • Owen Anderson

    @regehr @resistor @barrelshifter If you're wondering why they're called funnel shifters, just look at the wiring diagram.


    Attachments


    1. https://cdn.masto.host/mastodongamedevplace/media_attachments/files/114/586/730/520/813/697/original/4b3ee4c307f95e44.png
  8. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 02:15:05 JST
    in reply to
    • Jessica Paquette

    @barrelshifter this is now years ago, but I vividly recall how much resistance there was to adding rotates to LLVM IR too because "can't you just build them out of simpler shifts?"

    (in short: yes, but not reliably for variable shifts; the patterns to match for rotates with variable shift amount are quite fragile and easily - and frequently - broken by unrelated opts in the middle end)
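
    For reference, this is roughly what "building a rotate out of simpler shifts" looks like in C when the count is variable. The naive `x >> (32 - n)` would shift by 32 when `n == 0`, which is undefined behavior in C, so the count has to be masked. It is this masked shape (and its many equivalent spellings) that a compiler has to recognize to emit a single rotate. A sketch, with counts assumed to be 0..31:

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n) {
    assert(n < 32);
    /* -n & 31 equals 32 - n for n in 1..31 and 0 for n == 0. */
    return (x << n) | (x >> (-n & 31));
}

static uint32_t rotr32(uint32_t x, unsigned n) {
    assert(n < 32);
    return (x >> n) | (x << (-n & 31));
}
```

    Any unrelated rewrite of the subtraction, the mask, or the OR can leave something that computes the same value but no longer looks like a rotate to a pattern matcher.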

  9. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 02:15:04 JST
    in reply to
    • Jessica Paquette
    • Owen Anderson

    @resistor @barrelshifter The ADD->OR thing is another frequent monkey wrench especially wrt matching things that could be addressing modes, yeah.

    It's not a worthless transformation for larger-than-native-register integers (where not having cross-limb carries is a solid win) but for <=pointer size, I'd argue it hurts more than it helps.
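
    The arithmetic behind ADD->OR, as a small C sketch: `a + b == (a | b) + (a & b)` always holds, so when the operands share no set bits there are no carries and the add can be rewritten as an OR. The address example is hypothetical (a 16-byte-aligned base plus an offset below 16) and only illustrates why the two forms compute the same value; a backend looking for `base + offset` to fold into an addressing mode then has to re-prove that the OR is really an add.

```c
#include <assert.h>
#include <stdint.h>

/* Holds for all inputs (mod 2^32): the AND term carries all the carries. */
static uint32_t add_via_or_and(uint32_t a, uint32_t b) {
    return (a | b) + (a & b);
}

/* Hypothetical address computation in its "natural" form. */
static uint32_t addr_add(uint32_t base, uint32_t off) {
    assert((base & 15u) == 0 && off < 16);
    return base + off;
}

/* The same computation after an ADD->OR rewrite; valid only because
   base and off have no set bits in common. */
static uint32_t addr_or(uint32_t base, uint32_t off) {
    assert((base & 15u) == 0 && off < 16);
    return base | off;
}
```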

  10. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 29-May-2025 02:15:02 JST
    in reply to
    • Jessica Paquette
    • Owen Anderson

    @resistor @barrelshifter Either way, I like the way the funnel shift solution worked out. (And will always be grateful for Sanjay doing the legwork!)

    They're a good normal form: funnel shift<->rotate (where applicable) is canonical and trivial in both directions, they can be formed early (which avoids destroying the pattern), they're reasonably common in target ISAs in their own right, and they're still pretty straightforward to lower where they're not available.
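
    A C sketch of both claims, with counts assumed to be 0..31 and all names chosen for this example: a rotate is a funnel shift with the same value on both data inputs, and a funnel shift lowers to two ordinary shifts and an OR on targets that lack it. LLVM's funnel shift intrinsics are `llvm.fshl` and `llvm.fshr`; the code below is a plain model of a left funnel shift, not LLVM's actual lowering code.

```c
#include <assert.h>
#include <stdint.h>

/* Funnel shift left: the top 32 bits of (hi:lo) << n. */
static uint32_t fshl32(uint32_t hi, uint32_t lo, unsigned n) {
    assert(n < 32);
    uint64_t wide = ((uint64_t)hi << 32) | lo;
    return (uint32_t)((wide << n) >> 32);
}

/* Funnel shift -> rotate: feed the same value to both data inputs. */
static uint32_t rotl32(uint32_t x, unsigned n) {
    return fshl32(x, x, n);
}

/* Lowering with 32-bit shifts only. Splitting the right shift into
   ">> 1" and ">> (31 - n)" keeps every count in range, so n == 0
   needs no special case. */
static uint32_t fshl32_lowered(uint32_t hi, uint32_t lo, unsigned n) {
    assert(n < 32);
    return (hi << n) | ((lo >> 1) >> (31 - n));
}
```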

  11. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 22-May-2025 09:11:02 JST
    in reply to

    This instruction:
    mov [rDest + <index>], ch

    under these conditions, when overclocked a bit, once the machine has "warmed up", seems to have around a 1/10000 chance of actually storing the contents of CL instead of CH to memory.

    (this was "fun" to debug.)

    The workaround: when we detect Raptor Lake CPUs, we now do

    shr ecx, 8
    mov [rDest + <index>], cl

    instead. This takes more FE and uop bandwidth, but this loop is mainly latency-limited, and this is off the critical path.

  12. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Thursday, 22-May-2025 09:10:57 JST

    What's that mysterious workaround?

    Core Huff6 decode step is described in https://fgiesen.wordpress.com/2023/10/29/entropy-decoding-in-oodle-data-x86-64-6-stream-huffman-decoders/

    A customer managed to get a fairly consistent repro for transient decode errors by overclocking an i7-14700KF by about 5% from stock settings ("performance" multiplier 56->59).

    It took weeks of back and forth and forensic debugging to figure out what actually happens, but TL;DR: the observed decode errors are all consistent with a single instruction misbehaving.

  13. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 19-May-2025 17:51:34 JST

    I've been around _just_ long enough for pretty much the entire history of the etymology of computer storage device names to get hella confusing

  14. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 19-May-2025 17:51:33 JST
    in reply to

    When I was a small kid, floppy disks were actually floppy (5.25") and contained actual spinning disks, floppy disk drives contained the thing that drove the disks (i.e. the motor), and hard disk drives had hard disks (metal platters) and the thing that drove them all in one package. Fair enough.

    Not long after, we got 3.5" floppies that had a hard plastic shell and were not actually floppy anymore. Still called them floppies.

  15. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 19-May-2025 17:51:32 JST
    in reply to

    So a 3.5" floppy drive is actually a drive for a disk in a 3.5" hard shell but whatever.

    Then we got other magnetic removable storage formats that were also all hard-shelled but whatever.

    Then audio CDs got adapted for data storage and we got CD-ROMs with CDs (and later DVDs) that were actually disc-shaped, unlike floppies where the actual storage medium was disc-shaped but the overall package wasn't.

  16. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 19-May-2025 17:51:30 JST
    in reply to

    and then we got solid-state storage and the entire terminology is just terminally bonkers now

    we have "disks" that are neither diskettes nor disc-shaped (the actual chips are rectangular), "disk drives" that don't interact with dis[ck]s and don't drive anything (not in a mechanical sense anyway), and the "solid-state" part refers to no moving parts but of course the stuff with moving parts is also all in a solid state of matter, generally

  17. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 19-May-2025 17:51:29 JST
    in reply to

    in short, two of the three words in "solid-state drive" refer to the thing it's got in common with literally all the other competing storage technologies and the remaining word describes the one thing it doesn't actually do, definitionally

  18. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Tuesday, 13-May-2025 03:10:13 JST

    FOR IMMEDIATE RELEASE

    Santa celebrates 50 years "out of list-making business"

    NORTH POLE. May 12, 2025

    Santa is celebrating 50 years without the naughty/nice list. "We stopped doing it in 1975 since it felt out of touch with the times then; these days, with frequent data breaches, privacy regulations, COPPA, GDPR... frankly it feels like a liability nightmare, I don't think anyone would even seriously consider doing this now. I mean, a naughty list leak. Can you imagine?"

  19. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Sunday, 01-Dec-2024 00:07:57 JST

    a barometer is just a mic with a narrow frequency response centered at 0Hz and I'm tired of pretending it's not

    don't @ me

  20. Fabian Giesen (rygorous@mastodon.gamedev.place)'s status on Monday, 22-Jul-2024 21:45:24 JST

    so I don't want to advocate for opening cans of worms but I _would_ like to know who makes these canned worms, and who stocks them everywhere

    Fabian Giesen

    Abstraction maker, abstraction breaker. FUN FACT: things I prefix with FUN FACT are sometimes fun and sometimes factual, but very rarely both.

GNU social JP is a social network, courtesy of the GNU social JP administrator. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.