Notices by Garrett Wollman (wollman@mastodon.social)

Garrett Wollman (wollman@mastodon.social)'s status on Wednesday, 18-Jun-2025 15:16:11 JST
@feld We've always set up gmultipath just because it was the easiest way to label the drives by physical location. But having read up on the weird-ass way these expanders are set up, I definitely want to do more work to make sure we're not accessing the drives over a congested link!

Garrett Wollman (wollman@mastodon.social)'s status on Wednesday, 18-Jun-2025 13:03:02 JST
My guess is that the Data60 shares its IOM design with a larger enclosure -- I think they have a 90-drive one? -- where both expanders are equally, or more nearly equally, oversubscribed. But anyway, this suggests that there's not much point in connecting more than 10 lanes per host.
This also raises the question of whether #FreeBSD can distinguish between the short path, HBA to IOM to disk, and the longer and lower-bandwidth path, HBA to IOM-A to IOM-B to disk.

Garrett Wollman (wollman@mastodon.social)'s status on Wednesday, 18-Jun-2025 13:03:02 JST
I spent a while reading the manual the other night, particularly staring at the block diagram. Each IOM has 24 host-facing lanes, 20 drive-facing lanes, 3 lanes of cross-connect to the other IOM, and 1 lane for the enclosure service processor. The drive-facing lanes are then split evenly, 10 lanes each to 2 expanders -- one of which services only 9 drives and the other of which services the remaining 51. And they don't tell you which slots are the undersubscribed ones. WTF?
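For a rough sense of what that 9/51 split means, here's the back-of-the-envelope arithmetic (the 12 Gb/s per lane figure is my assumption about this hardware):

```
# Per-drive share of the 10-lane uplink into each drive-side expander,
# assuming 12 Gb/s SAS-3 lanes and every drive streaming at once.
LANE_GBPS = 12
UPLINK_GBPS = 10 * LANE_GBPS  # 10 lanes from the IOM to each expander

for drives in (9, 51):
    print(f"{drives:2d}-drive expander: {UPLINK_GBPS / drives:5.2f} Gb/s per drive")
```

So the nine slots behind the small expander can never saturate their uplink, while the other 51 drives share roughly 2.4 Gb/s apiece if they're all busy at once.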

Garrett Wollman (wollman@mastodon.social)'s status on Wednesday, 18-Jun-2025 13:03:02 JST
Big Storage on #FreeBSD question: suppose you have an eight-lane LSI/Avago/Broadcom SAS card (mpr(4)) and a WD Ultrastar Data60 enclosure. If you connect both miniSAS connectors on the HBA to the same IOM on the enclosure, does the HBA actually treat it as a single eight-lane link, or is it functionally indistinguishable from connecting each four-lane cable to a different IOM? (We have a decent number of these; despite some peccadilloes it's been our preferred JBOD enclosure for some time.)

Garrett Wollman (wollman@mastodon.social)'s status on Wednesday, 18-Jun-2025 13:03:01 JST
`camcontrol smpphylist` conveniently gives the #FreeBSD device names associated with WWNs it knows about, so for example if you run that command against a drive, it will show you the device names of all the other drives on that same expander. If I look at, say, `da5` on one server, it tells me that it's on the expander that is only wired to nine drives, which happened to end up being assigned `da1` through `da10` except `da2`.
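If you'd rather not eyeball the output, here's a rough sketch in Python of the same lookup; it deliberately just pattern-matches the daN names out of whatever smpphylist prints rather than assuming anything about its column layout:

```
# List the da devices that `camcontrol smpphylist` reports behind the same
# expander as the given drive.  Format-agnostic: it just collects daN tokens.
import re
import subprocess

def expander_peers(dev: str) -> list[str]:
    out = subprocess.run(["camcontrol", "smpphylist", dev],
                         capture_output=True, text=True, check=True).stdout
    return sorted(set(re.findall(r"\bda\d+\b", out)), key=lambda d: int(d[2:]))

print(expander_peers("da5"))
```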

Garrett Wollman (wollman@mastodon.social)'s status on Wednesday, 18-Jun-2025 13:03:01 JST
And if you look at the WWNs, you can clearly see ports 0-23 are connected to the host ports on the back panel -- right now the one I'm looking at only has four lanes wired. Then ports 24-43 are wired to the two drive-side expanders, ports 44-46 are the cross-connect to the other IOM, port 47 is unconnected, and port 48 is the service processor.

Garrett Wollman (wollman@mastodon.social)'s status on Wednesday, 18-Jun-2025 13:03:01 JST
Ok, so I've figured out a bit more about how to map out the topology of these things under #FreeBSD, using everyone's favorite utility, camcontrol(8)!
The key is that there are six SAS expanders in these boxes: each IOM has one on the host side and two on the drive side -- but the enclosure service processor is connected to the host-side expander. So, if you do `camcontrol smpphylist sesX` you'll see the SAS neighbors of all 48 ports of the host-side expander of that IOM.
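A rough sketch of that first step, in Python; it assumes the WWNs in the output print as 16-digit hex, optionally 0x-prefixed:

```
# For each ses device, collect the SAS addresses visible in its smpphylist
# output, i.e. the neighbors of that IOM's host-side expander.
import glob
import re
import subprocess

ADDR = re.compile(r"\b(?:0x)?(5[0-9a-fA-F]{15})\b")

neighbors = {}
for path in sorted(glob.glob("/dev/ses[0-9]*")):
    ses = path.removeprefix("/dev/")
    proc = subprocess.run(["camcontrol", "smpphylist", ses],
                          capture_output=True, text=True)
    if proc.returncode != 0:
        continue  # not an SMP-capable SES device; skip it
    neighbors[ses] = {m.group(1).lower() for m in ADDR.finditer(proc.stdout)}

for ses, addrs in sorted(neighbors.items()):
    print(ses, sorted(addrs))
```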

Garrett Wollman (wollman@mastodon.social)'s status on Wednesday, 18-Jun-2025 13:03:00 JST
That information *does* exist, in the protocol, at least for drives: `smartctl -l sasphy` on a drive will tell you the WWPN of the drive and of the expander it's attached to. But this is now getting a lot more involved than I expected it to be.
It would be great if someone smarter than me who actually knew the protocol could implement a tool that would explore the SAS topology and output it as a graph in a standard format.
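In the meantime, here's the sort of starting point I have in mind, as a Python sketch: it walks the da devices, pulls the "SAS address" and "attached SAS address" fields out of `smartctl -l sasphy` (those labels are an assumption about smartmontools' output format), and emits the drive-to-expander edges as Graphviz DOT:

```
# Emit a Graphviz DOT graph of drive-port -> expander attachments, using the
# WWPNs reported by `smartctl -l sasphy`.  Field labels are assumed to be
# "SAS address" and "attached SAS address".
import glob
import re
import subprocess

def sasphy_edges(dev):
    out = subprocess.run(["smartctl", "-l", "sasphy", dev],
                         capture_output=True, text=True).stdout
    local, attached = [], []
    for line in out.splitlines():
        line = line.strip()
        if line.startswith("attached SAS address"):
            attached.append(line.split("=")[-1].strip())
        elif line.startswith("SAS address"):
            local.append(line.split("=")[-1].strip())
    return list(zip(local, attached))  # one pair per phy; dual-ported -> two

print("digraph sas {")
for dev in sorted(glob.glob("/dev/da[0-9]*")):
    if not re.fullmatch(r"/dev/da\d+", dev):
        continue  # skip partitions, labels, etc.
    for port_wwpn, expander_wwpn in sasphy_edges(dev):
        print(f'  "{dev}" -> "{expander_wwpn}" [label="{port_wwpn}"];')
print("}")
```

Run that through `dot -Tsvg` and you get at least the drive side of the picture; the expander-to-expander links are exactly the part that's still missing.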

Garrett Wollman (wollman@mastodon.social)'s status on Wednesday, 18-Jun-2025 13:03:00 JST
I spent a bit of time trying to write a simple Perl script to parse the output of `camcontrol smpphylist` and ran into a wall: there is no way to get the *local* WWPN of the expander. You can see all the WWPNs of the devices connected to it (which might include other expanders), but given two phylists from expanders that are physically connected to each other, there's no way to tell (except by heuristic matching) that 0x5000ccab054d0b3d and 0x5000ccab054d0b7d are in fact wired together.
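The heuristic I mean is roughly: addresses inside one enclosure look like they're allocated from a contiguous block, so two expander WWPNs that agree in everything but the low-order bits are probably the two ends of the same wire. A sketch (the mask width is a guess, not anything from the spec):

```
# Heuristic, not protocol: treat two expander SAS addresses as paired if they
# differ only in the low-order byte.  The mask width is a guess.
def probably_paired(a: int, b: int, mask: int = ~0xFF) -> bool:
    return a != b and (a & mask) == (b & mask)

print(probably_paired(0x5000ccab054d0b3d, 0x5000ccab054d0b7d))  # True
```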

Garrett Wollman (wollman@mastodon.social)'s status on Wednesday, 18-Jun-2025 13:03:00 JST
This should then be enough information to balance utilization across all available PHYs. gmultipath(8) isn't smart enough to do this on its own; it needs to be told explicitly (and probably again at every reboot). Can plz hav "smart" multipathing?
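Concretely, "telling it explicitly" would look something like this: once you know which IOM each provider reaches a drive through, alternate the preferred member across volumes and hand that to gmultipath. A sketch that just prints the commands; it assumes gmultipath(8)'s prefer verb in active/passive mode, and the volume and provider names are made up:

```
# Print `gmultipath prefer` commands that alternate the preferred provider
# (and therefore the IOM used) across multipath volumes.  Assumes the
# `prefer` verb from gmultipath(8); names below are hypothetical.
volumes = {
    # multipath name: (provider reached via IOM-A, provider via IOM-B)
    "disk01": ("da5", "da65"),
    "disk02": ("da6", "da66"),
    "disk03": ("da7", "da67"),
}

for i, (name, providers) in enumerate(sorted(volumes.items())):
    preferred = providers[i % 2]  # alternate IOMs volume by volume
    print(f"gmultipath prefer {name} {preferred}")
```

(And yes, that would have to be re-run at every boot, which is exactly the nuisance I'm complaining about.)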

Garrett Wollman (wollman@mastodon.social)'s status on Monday, 16-Jun-2025 14:15:52 JST
@icm In 1994, there were still a couple of 3650s on the 5th floor as well, used by PIs in the Advanced Network Architecture group (for whom I worked at the time). I'm not sure when those left, but they were certainly gone by 1997 when I took a different job and was rebuilding the network in NE43. There were a lot more Lispms still in use in the AI Lab but I never saw them until ca. 2000 when we started planning for the new building.

Garrett Wollman (wollman@mastodon.social)'s status on Monday, 16-Jun-2025 14:15:34 JST
@icm I was involved in the second shrink of NE43-250, the room where the CM-5 was. By the time I started, LCS no longer had any equipment in the old 9th floor machine room (see the document "How to turn off the 9th floor") which was mostly empty raised floor except for the southwest corner where the AI Sun-4 servers and the CM-2 were. In the late 1990s the old air conditioners and most of the raised floor were gutted to make a new AI Lab headquarters suite.

Garrett Wollman (wollman@mastodon.social)'s status on Monday, 16-Jun-2025 14:15:33 JST
@icm I was also involved in the shrink of NE43-351 (these happened almost at the same time), when the old CIA vault was turned into office space and the machines that had been in the vault were moved into the rump machine room. This involved turfing out all of the remaining Symbolics 3650s, of which there were four: Zermatt, the file and Namespace server, and three others that belonged to the Clinical Decision-Making Group. All except Zermatt had remote displays connected over fiber.

Garrett Wollman (wollman@mastodon.social)'s status on Monday, 16-Jun-2025 14:15:31 JST
@icm The export license was in a file cabinet outside my office in NE43 until 2003, at which point I don't know what happened to it (I moved offices and the files didn't; we vacated the building shortly thereafter and the paperwork may have gone into the recycling). I didn't start work at LCS until 1994 so much of what I know is stories retold by those who came before.
I know EECS kept their PDP-10 much longer, but I never used or saw it. There was less competition for space in 38 than NE43.

Garrett Wollman (wollman@mastodon.social)'s status on Monday, 16-Jun-2025 14:14:53 JST
@icm So how did this machine end up in western Washington from its home in Cambridge? All of the PDP-10s were hauled out of NE43-901 in the late 1980s (I used to have the export license to send one of them to KTH in Sweden), so obviously it's been floating around for a while. (And is the CM-2 in the collection the one that also used to be in NE43-901, which was still there but unused until the big renovation ca. 1997? I think the NE43-250 CM-5 went to CHM in Mountain View.)

Garrett Wollman (wollman@mastodon.social)'s status on Sunday, 08-Jun-2025 09:07:47 JST
@ireneista I'm hoping there may be something in the DMI or ACPI data.

Garrett Wollman (wollman@mastodon.social)'s status on Friday, 09-May-2025 05:14:21 JST
@feld I have 15 years of Cook's Illustrated bound volumes!

Garrett Wollman (wollman@mastodon.social)'s status on Friday, 09-May-2025 04:59:28 JST
@EugestShirley @MsMerope @ai6yr I have about a hundred cookbooks, but I am very lazy and do little scratch cooking, so the ones I use most are baking books. (Some of them, like Joanne Chang's FLOUR and FLOUR TOO, have cooking recipes in them too.) Probably Claire Ptak's THE VIOLET BAKERY COOKBOOK, King Arthur Flour WHOLE GRAIN BAKING, and Alice Medrich's SERIOUSLY BITTER SWEET are the ones I refer to most in addition to Chang's. (I do have 4 editions of JOY as well, more for reference than use.)

Garrett Wollman (wollman@mastodon.social)'s status on Saturday, 03-May-2025 07:48:29 JST
@emaste You know, I'm still contemplating how I'm going to upgrade from 13 to 14 so that's pretty far off my radar at this point. I'm still waiting for pkgbase to be complete (including an etcupdate replacement and a reliable way to adopt existing systems) but I may be retired before that happens.

Garrett Wollman (wollman@mastodon.social)'s status on Friday, 02-May-2025 11:04:02 JST
After supper tonight, I'm going to try upgrading three #PostgreSQL "clusters" from 14 to 15. I don't think there are any anticipated gotchas, but that's why we do these things on little personal databases before taking a wrecking ball to prod.
Still going to have to research how to upgrade a "cluster" that is running streaming replication. That'll be a hassle for sure.