There's a feature added in Linux 6.9 that I think people should become more aware of: there's finally an identifier for processes that doesn't wrap around as easily as UNIX pid_t PIDs do. The pidfd file descriptors have been moved onto their own proper file system (pidfs), which at the same time enabled unique inode numbers for them.
Lennart Poettering (pid_eins@mastodon.social)'s status on Monday, 28-Oct-2024 21:18:23 JST Lennart Poettering
-
NetworkManager (networkmanager@fosstodon.org)'s status on Monday, 28-Oct-2024 21:18:21 JST NetworkManager
@pid_eins can we have an IPv6 address for every process
-
翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Monday, 28-Oct-2024 21:18:21 JST 翠星石
@NetworkManager The ability to have an IPv6 address for every process has already been implemented.
-
Lennart Poettering (pid_eins@mastodon.social)'s status on Monday, 28-Oct-2024 21:18:22 JST Lennart Poettering
These inode numbers are (at least on 64bit archs, i.e. anything modern) unique during the entire runtime of a system. And that's fantastic: there's finally a way to reference a process race-free, with the ability to pass that reference around over any form of IPC, without risking that it suddenly starts to refer to some unintended other process.
-
Lennart Poettering (pid_eins@mastodon.social)'s status on Monday, 28-Oct-2024 21:18:22 JST Lennart Poettering
To query the inode number from a pidfd, you use a simple fstat() call, and look at the .st_ino field.
There's currently no way to get from a pidfd inode number directly to a process, however. Hence, for now you always have to pass around a combination of the classic PID and the new pidfd inode number. This pair can be safely and correctly turned into a pidfd: 1. First acquire a pidfd from the PID via pidfd_open(). 2. Then fstat() the fd, and check if .st_ino matches the expected value.
-
Lennart Poettering (pid_eins@mastodon.social)'s status on Monday, 28-Oct-2024 21:18:22 JST Lennart Poettering
If you want a world-wide unique identifier for a process it makes sense to combine the pair of pid_t and pidfd inode number with the system's boot ID (i.e. /proc/sys/kernel/random/boot_id). This triplet is awesome, because for the first time we can uniquely identify a Linux process, globally in this universe.
In systemd we are making use of this heavily now: internally we always store a triplet of pid, pidfd, pidfd inode for referencing processes we manage and…
-
Erin 💽✨ (erincandescent@akko.erincandescent.net)'s status on Friday, 01-Nov-2024 08:14:15 JST Erin 💽✨
@pid_eins My perhaps controversial opinion is that from userland the magic file descriptors should Just Work like actual file descriptors in every regard except that you can’t close() them or dup2 to them, but I guess that ship has sailed anyway.
(i.e. you should be able to stuff PIDFD_SELF into an SCM_RIGHTS control message and out of the other end pops a pidfd for your process; no need to pidfd_open(getpid()) and close() it)
-
Lennart Poettering (pid_eins@mastodon.social)'s status on Friday, 01-Nov-2024 08:14:17 JST Lennart Poettering
@erincandescent cgroupfs actually exposes the cgroupid via name_to_handle_at() and open_by_handle_at() (ntha() and obha() for short). So yes, there's prior art for doing the same in pidfs. But it's a bit weird, because unlike cgroupfs pidfs is not an fs you can mount, hence you don't really have anything to invoke obha() on. You'd probably have to get a pidfd on your own pid first, before you can use it with obha() to get to the pidfd you actually want.
-
Erin 💽✨ (erincandescent@akko.erincandescent.net)'s status on Friday, 01-Nov-2024 08:14:17 JST Erin 💽✨
@pid_eins I was about to say “we’re getting PIDFD_SELF and you could use that in the same way as e.g. AT_FDCWD” except both PIDFD_SELF and AT_FDCWD are defined as -100. This sucks. Maybe we could get them made into different numbers before 6.13 drops? ;_;
-
Lennart Poettering (pid_eins@mastodon.social)'s status on Friday, 01-Nov-2024 08:14:19 JST Lennart Poettering
It took a long time, but thanks to @brauner after all those years the limitations of UNIX pid_t are addressed! Thanks, Christian!
-
Lennart Poettering (pid_eins@mastodon.social)'s status on Friday, 01-Nov-2024 08:14:19 JST Lennart Poettering
Two caveats though. The concept is not universal: it's a Linux thing, and it requires kernel 6.9 or newer and a 64bit architecture. On 32bit the inode number range is too small to provide unique IDs.
To properly check if the feature is available, allocate a pidfd and check whether statfs() on it reports an .f_type of 0x50494446. Also verify that sizeof(ino_t) is >= 8.
-
Erin 💽✨ (erincandescent@akko.erincandescent.net)'s status on Friday, 01-Nov-2024 08:14:19 JST Erin 💽✨
@pid_eins wtf, the kernel tracks this unique 64 bit number on 32 bit systems but won't let you see it. Infuriating.
This would be a basically perfect use case for name_to_handle_at (and maybe open_by_handle_at?)...
-
Lennart Poettering (pid_eins@mastodon.social)'s status on Friday, 01-Nov-2024 08:14:20 JST Lennart Poettering
… when we pass around information about processes via IPC we have started to do so via the triplet pid, pidfd inode, boot id.
And I'd recommend everyone dealing with low-level process management to do the same.
-
Lennart Poettering (pid_eins@mastodon.social)'s status on Friday, 01-Nov-2024 08:14:20 JST Lennart Poettering
I think the pair of PID and pidfd inode number would be great to support in the various tools that currently deal with PIDs. For example, I filed an RFE bug against util-linux' kill tool to add just that:
-
Haelwenn /элвэн/ :triskell: (lanodan@queer.hacktivis.me)'s status on Friday, 01-Nov-2024 08:22:11 JST Haelwenn /элвэн/ :triskell:
@erincandescent @pid_eins I guess it could be more appropriate to have a pidfd_self variable just like how there is stdin/stdout/…
Plus then dup(pidfd_self) would work fine as it would be equivalent to using pidfd_getfd().