
Is it? Most HPC (if GPU clusters count) is probably in industry and managed with containers.


HPC admin here.

Yes. First, we use user-level container systems like Apptainer/Singularity, and these containers run as the user who launches them.

The same is true for non-academic HPC systems.

From schedulers to accounting, everything is done at user level, and we have many, many users.

It won’t change anytime soon.


I thought most containers shared the same user, i.e., `dockremap` in the case of Docker.

I understand academia has lots of different accounts.


Nope, full user-mode containers (e.g., Apptainer) run under the user's own context and, at least on HPC/Slurm systems, inside a cgroup that restricts the user's resources to what they requested in their job file.

Hence all containers are isolated from each other, not only at process level, but at user + cgroup level too.
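To see both layers from inside a process (in a container or not), here is a minimal Python sketch that reports the UID it runs as and the cgroups it belongs to. It only reads `/proc/self/cgroup`, which is standard on Linux; the helper names are my own, and the cgroup path a scheduler assigns to a job is site-specific, so treat the output as something to inspect rather than a fixed format.

```python
import os
from pathlib import Path


def parse_cgroup_lines(text):
    """Parse /proc/<pid>/cgroup content into (hierarchy_id, controllers, path) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        # On cgroup v2 the controller list is empty ("0::/some/path").
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append((hierarchy_id, controllers, path))
    return entries


def describe_process_context():
    """Return the UID and cgroup membership of the current process (Linux only)."""
    cgroup_file = Path("/proc/self/cgroup")
    cgroups = parse_cgroup_lines(cgroup_file.read_text()) if cgroup_file.exists() else []
    return {"uid": os.getuid(), "euid": os.geteuid(), "cgroups": cgroups}


if __name__ == "__main__":
    ctx = describe_process_context()
    print(f"uid={ctx['uid']} euid={ctx['euid']}")
    for hierarchy_id, controllers, path in ctx["cgroups"]:
        print(f"hierarchy={hierarchy_id} controllers={controllers or '(v2 unified)'} path={path}")
```

Running this on a login node and again inside a batch job should show the same UID but a different cgroup path, which is the user + cgroup separation described above.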

Apptainer: https://apptainer.org


I think an admin would better understand the system if there was only one subsystem doing a particular type of security and not two. Two subsystems doing security would lead to more problems down the road.


For HPC, there are two different contexts where users need to be considered: interactive use and batch job processing. First, there is interactive use. Users log in to a cluster, write their scripts, work with files, etc. This is your typical user account stuff. But they also submit jobs here.

Second, there are the jobs users submit. These are often executed on separate nodes and the usage is managed. Here you have both user and cgroup limits in place. The cgroups make sure that the jobs only have the required resources. The user authentication makes sure that the job can read/write data as the user. This way the user can work with their data on the interactive nodes.

So the two different systems have different rationales, and both are needed. It all depends on the context.


If we forget how the current system is architected, we are looking at two problems. The first problem is that Linux capabilities also deal with isolating processes, limiting what they can do, because user-based isolation is not enough. The second problem is that local identity has no relation to cloud identity, which is undesirable. If we remove user-based authentication and rely only on capabilities, with identity served by the cloud or Kubernetes, it could be a simpler way to do authentication and authorization.


I'm not sure I even follow...

The primary point of user-authentication is that we need to be able to read/write data and programs. So you have to have a user-level authentication mechanism someplace to be able to read and write data. cgroups are used primarily for restricting resources, so those two sets of restrictions are largely orthogonal to each other.
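As a rough illustration of how separate the two checks are, here is a small Python sketch. One function asks "may this user write here?" (identity), the other asks "how much memory may this process group use?" (cgroup). It assumes a cgroup v2 unified hierarchy mounted at `/sys/fs/cgroup`; the function names are made up for this example, and the lookup only reads the process's own cgroup, not limits set on ancestor cgroups.

```python
import os
from pathlib import Path

# Assumption: cgroup v2 unified hierarchy mounted at the conventional location.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def parse_memory_max(text):
    """Convert the content of a cgroup v2 memory.max file to bytes, or None if unlimited."""
    value = text.strip()
    return None if value == "max" else int(value)


def current_memory_limit():
    """Best-effort lookup of this process's own cgroup v2 memory limit in bytes."""
    try:
        for line in Path("/proc/self/cgroup").read_text().splitlines():
            hierarchy_id, controllers, path = line.split(":", 2)
            if hierarchy_id == "0" and controllers == "":  # the cgroup v2 entry
                limit_file = CGROUP_ROOT / path.lstrip("/") / "memory.max"
                return parse_memory_max(limit_file.read_text())
    except (OSError, ValueError):
        pass  # not Linux, not cgroup v2, or the limit file is missing/unreadable
    return None


def can_write(path):
    """Identity check: may the current (real) user write to this path?"""
    return os.access(path, os.W_OK)


if __name__ == "__main__":
    print("can write to home dir:", can_write(os.path.expanduser("~")))
    print("memory limit (bytes, None = unlimited or unknown):", current_memory_limit())
```

Neither answer depends on the other: changing the job's memory request does not change which files the user can write, and changing file ownership does not change the memory limit.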

Second, user authentication is almost always backed (at least on interactive nodes) by LDAP or some other networked mechanism, so I'm not sure what "cloud" or "k8s" really adds here.

If you're trying to say that we should just run HPC jobs in the cloud, that's an option. It's not necessarily a great option from a long-term budget perspective, but it's an option.


There is no reason for users to be maintained in the kernel.


Can you elaborate on that?


Containers rely on many privilege separation systems to do what they do. They are in fact a rather extreme case of multi-user systems, but they tend to present as “single” user environments to the container’s processes.


> they are in fact a rather extreme case of multi-user systems

Are they? My understanding was that by default, the `dockerd` (or whatever) is root and then all containers map to the same non-privileged user.


Good software hides complexity. The user does not have to understand users, groups, permissions, suid, etc.



