Authenticating a Docker container against the host’s UNIX accounts

Recently, with the advent of Docker and similar technologies, there’s been an effort to containerize setups that previously ran on a single machine or a set of tightly coupled machines. In cases where the components communicate over a network connection, containerization might not be that hard, but what about cases where the components talk over a strictly local interface, such as a UNIX socket?

The other day, a colleague of mine asked me to help set up a container that authenticates against the host’s UNIX accounts using the PAM API. It turned out to be doable, but maybe not obvious to anyone not familiar with some of the more esoteric SSSD options, so I’d like to write down the instructions in a blog post.

In our setup, there was a PAM service, sss_test, that previously ran on the host and authenticated against accounts stored locally on the host in /etc/passwd and /etc/shadow. The same setup could in principle be used to authenticate against a remote database using SSSD; only the host’s SSSD settings would be different.

So how does this work with containers? One possibility, especially with a remote database such as IPA or AD, would be to run an SSSD instance in every container and authenticate to the remote store. But that doesn’t help us with cases where we’d like to authenticate against the local database stored on the host, since it’s not exposed outside the host. Moreover, putting SSSD into each container might also mean putting a keytab into each container, and then each container would open its own connection to the remote server… it just gets tedious. But let’s focus on the local accounts.

The trick we’ll use is bind-mounting. It’s possible to mount part of the host’s filesystem into the container’s filesystem, and we’ll leverage this to bind-mount the UNIX sockets SSSD communicates over into the container. This will allow the SSSD client-side libraries to authenticate against the SSSD running on the host. Then we’ll set up SSSD on the host to authenticate against the host’s UNIX files with the proxy back end.

This is the traditional schema:
application -> libpam -> pam_authenticate -> pam_unix.so -> /etc/passwd

If we add SSSD to the mix, it becomes:
application -> libpam -> pam_authenticate -> pam_sss.so -> SSSD -> pam_unix.so -> /etc/passwd

Let’s configure the host and the container, step by step.

  1. Host config
    1. Install packages
      # yum -y install sssd-common sssd-proxy

    2. create a PAM service for the container. The name doesn’t matter, it just needs to be referenced from sssd.conf later.
      # cat /etc/pam.d/sss_proxy
      auth required pam_unix.so
      account required pam_unix.so
      password required pam_unix.so
      session required pam_unix.so

    3. create the SSSD config file, /etc/sssd/sssd.conf. Please note that the permissions must be 0600 and the file must be owned by root:root.
      # cat /etc/sssd/sssd.conf
      [sssd]
      services = nss, pam
      config_file_version = 2
      domains = proxy
      [nss]
      [pam]
      [domain/proxy]
      id_provider = proxy
      # The proxy provider will look into /etc/passwd for user info
      proxy_lib_name = files
      # The proxy provider will authenticate against /etc/pam.d/sss_proxy
      proxy_pam_target = sss_proxy

    4. start sssd
      # systemctl start sssd

    5. verify a user can be retrieved with sssd
      $ getent passwd -s sss localuser

  2. Container setup
    It’s important to bind-mount the /var/lib/sss/pipes directory from the host to the container since the SSSD UNIX sockets are located there.

    # docker run -i -t -v=/var/lib/sss/pipes/:/var/lib/sss/pipes/:rw --name sssd-cli fedora:latest /bin/sh

  3. Container config
    The container runs the PAM-aware application that authenticates against the host. I used a simple program in C that comes from the SSSD source; a minimal sketch of a similar client is shown after these steps. It pretty much just runs pam_authenticate() against a service called sss_test. Your application might need a different service, but then you just need to adjust the filename in /etc/pam.d/ in the container. All the steps below should be executed in the container itself.

    1. install only the sss client libraries
      # yum -y install sssd-client

    2. make sure sss is configured for the passwd and group databases in /etc/nsswitch.conf
      # grep sss /etc/nsswitch.conf

    3. configure the PAM service that the application uses to call into SSSD
      # cat /etc/pam.d/sss_test
      auth required pam_sss.so
      account required pam_sss.so
      password required pam_sss.so
      session required pam_sss.so

    4. authenticate using your PAM application. In my case that was:
      # ./pam_test_client auth localtest
      action: auth
      user: localtest
      testing pam_authenticate
      Password:
      pam_authenticate: Success

  4. Profit!
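
For reference, below is a minimal sketch of such a PAM test client. It is not the exact program from the SSSD tree – it only performs the auth action and takes the user name as its single argument – but it shows the core of what happens: pam_start() with the sss_test service followed by pam_authenticate().

/*
 * Minimal PAM test client (a sketch, not the program shipped with SSSD).
 * Build with: gcc pam_test_client.c -o pam_test_client -lpam -lpam_misc
 */
#include <stdio.h>
#include <security/pam_appl.h>
#include <security/pam_misc.h>

int main(int argc, char **argv)
{
    pam_handle_t *pamh = NULL;
    /* misc_conv is libpam_misc's text-mode conversation function;
     * it prompts for the password on the terminal. */
    struct pam_conv conv = { misc_conv, NULL };
    const char *user = (argc > 1) ? argv[1] : "localtest";
    int ret;

    /* "sss_test" must match a file in /etc/pam.d/ inside the container */
    ret = pam_start("sss_test", user, &conv, &pamh);
    if (ret != PAM_SUCCESS) {
        fprintf(stderr, "pam_start failed: %d\n", ret);
        return 1;
    }

    printf("testing pam_authenticate\n");
    ret = pam_authenticate(pamh, 0);
    printf("pam_authenticate: %s\n", pam_strerror(pamh, ret));

    pam_end(pamh, ret);
    return (ret == PAM_SUCCESS) ? 0 : 1;
}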

It would be possible to authenticate against any database, just by changing what the SSSD on the host authenticates against. There are several gotchas, though, especially if you require that only certain containers are allowed to retrieve users from certain domains. Multitenancy doesn’t really work well, because we don’t have a good mechanism to retrieve the identity of the container.
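
For example, swapping the host’s proxy domain for a plain LDAP domain might look roughly like this (the server URI and search base are placeholders, and the new domain also has to be listed on the domains line of the [sssd] section); the container side would stay exactly the same:

[domain/ldap]
id_provider = ldap
auth_provider = ldap
ldap_uri = ldap://ldap.example.com
ldap_search_base = dc=example,dc=com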

Anatomy of SSSD user lookup

This blog post describes how a user lookup request is handled in SSSD. It should help you understand what the SSSD architecture looks like and how the data flows in SSSD, and as a result help you identify which part might not be functioning correctly on your system. It is aimed mostly at users and administrators – for developers, there is a separate document about SSSD internals on the SSSD wiki, written by Yassir Elley. This post re-uses some of the info from the internals document.

We’ll look at the most common operation, looking up user info on a remote server. I won’t go into server-specific details, so most of the info should be equally true for LDAP, Active Directory or FreeIPA servers. There’s also more functionality in SSSD than looking up users, such as sudo or autofs integration, but that is out of the scope of this post as well.

Before going into SSSD details, let’s do a really quick intro into what happens on the system in general when you request a user from a remote server. Let’s say the admin configured SSSD and tests the configuration by requesting the admin user:

$ getent passwd admin

When information is requested about a user (with getent, id or similar), typically one of the functions of the Name Service Switch, such as getpwnam() or initgroups(), is called in glibc. There’s lots of information about the Name Service Switch in the libc manual, but for our purposes, it’s enough to know that libc opens and reads the config file /etc/nsswitch.conf to find out which modules should be contacted in which order. The module that all of us have on our Linux machines is files, which reads user info from /etc/passwd and group info from /etc/group. There also exists an ldap module that would read the info directly from an LDAP server and of course an sss module that talks to SSSD. So how does that work?
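
To make the entry point concrete, here is a small sketch of roughly what getent does internally – a plain getpwnam() call; glibc takes care of walking the modules listed in /etc/nsswitch.conf:

/* Sketch: resolve a user through the Name Service Switch,
 * the same glibc call that getent passwd uses underneath.
 * Build with: gcc nss_lookup.c -o nss_lookup */
#include <stdio.h>
#include <sys/types.h>
#include <pwd.h>

int main(void)
{
    /* glibc reads /etc/nsswitch.conf and asks each passwd module
     * (files, sss, ...) in turn until one of them knows the user. */
    struct passwd *pwd = getpwnam("admin");

    if (pwd == NULL) {
        fprintf(stderr, "user not found\n");
        return 1;
    }

    /* print the entry in the same format getent passwd uses */
    printf("%s:%s:%u:%u:%s:%s:%s\n",
           pwd->pw_name, pwd->pw_passwd,
           (unsigned int) pwd->pw_uid, (unsigned int) pwd->pw_gid,
           pwd->pw_gecos, pwd->pw_dir, pwd->pw_shell);
    return 0;
}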

The first thing to keep in mind is that, unlike nss_ldap or pam_ldap, SSSD is not just a module that is loaded in the context of the application, but rather a daemon that the modules communicate with. Almost no logic is implemented in the modules; all the functionality happens in the daemon. A user-visible effect during debugging is that using strace is not too helpful, as it would only show whether the request made it to the SSSD. For debugging the rest, the SSSD debug logs should be used.
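
This is easy to see for yourself: assuming strace is installed, tracing the getent command only shows the request being handed over a UNIX socket under /var/lib/sss/pipes (the same directory we bind-mounted in the previous post), and nothing about what SSSD does with it afterwards.

$ strace -e trace=connect getent passwd admin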

Earlier I said that SSSD is a daemon. That’s really not too precise; SSSD is actually a set of daemons that communicate with one another. There are three kinds of SSSD processes. One is the sssd process itself. Its purpose is to read the config file after startup and spawn the other processes according to the config file. Then there are responder or front end processes that listen for queries from the applications, like the query that would come from the getent command. If the responder process needs to contact the remote service for data, it talks to the last SSSD process type, the data provider or back end process. This architecture allows for a pluggable setup where different back end processes talk to different remote servers, while all these remote servers can be accessed from a range of applications or subsystems by the same lookup code in the responders.

Each process is represented by a section in the sssd.conf config file. The main sssd process is represented by the [sssd] section. The front end processes are defined on the services line in the [sssd] section and each can be configured in a section named after the service. Finally, the back end processes are those configured in the [domain] sections. Each process also logs to its own logfile.
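
As a rough illustration (using the ipa.example.com domain from the case study later in this post), the mapping between config sections, processes and log files in /var/log/sssd looks like this:

[sssd]                      # main sssd process     -> sssd.log
services = nss, pam
domains = ipa.example.com

[nss]                       # sssd_nss responder    -> sssd_nss.log
[pam]                       # sssd_pam responder    -> sssd_pam.log

[domain/ipa.example.com]    # back end process      -> sssd_ipa.example.com.log
id_provider = ipa           # (a real IPA client needs more options here)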

Let’s continue with the getent passwd admin example. To illustrate the flow, there is a diagram that the text follows. The full arrows represent local I/O operations (like opening a file), the empty arrows represent local IPC over UNIX sockets and the dotted arrow represents network I/O.

[diagram: sssd-lookup – the flow of a user lookup through the nss_sss module, the responder and the back end]

The user issued the getent command, which calls libc’s getpwnam (diagram step 1); libc then opens the nss_sss module as per nsswitch.conf and passes in the request. First, the nss_sss memory-mapped cache is consulted, that’s step 2 on the diagram. If the data is present in the cache, it is just returned without even contacting the SSSD, which is extremely fast. Otherwise, the request is passed to the SSSD’s responder process (step 3), in particular sssd_nss. The responder first looks into the SSSD on-disk cache (step 4). If the data is present in the cache and valid, the nss responder reads it from the cache and returns it to the application.
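
Both caches are plain files on disk, which is useful to know when debugging; by default the fast memory-mapped cache lives under /var/lib/sss/mc and the on-disk cache under /var/lib/sss/db:

# ls /var/lib/sss/mc/     # memory-mapped cache consulted directly by nss_sss
# ls /var/lib/sss/db/     # on-disk LDB cache (cache_<domain>.ldb) read by the responders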

If the data is not present in the cache at all, or if it’s expired, sssd_nss queries the appropriate back end process (step 5) and waits for a reply. The back end process connects to the remote server, runs the search (step 6) and stores the resulting data in the cache (step 7). When the search request is finished, the provider process signals back to the responder process that the cache has been updated (step 8). At that point, the front end responder process checks the cache again. If there’s any data in the cache after the back end has updated it, the data is returned to the application – even in cases when the back end failed to update the cache for some reason, it’s better to return stale data than none. Of course, if no data is found in the cache after the back end has finished, an empty result is returned. This final cache check is represented by step 9 in the diagram.
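
One way to observe the whole flow end to end is to invalidate a cached entry and run the lookup again, which forces the responder to go all the way to the back end; the sss_cache tool marks cache entries as expired:

# sss_cache -u admin      # expire the cached entry for a single user
# sss_cache -E            # or expire all cached entries
$ getent passwd admin     # this lookup now has to go through the back end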

When I said the back end “runs a search” against the server, I really simplified the matter a lot. The search can involve many different steps, such as resolving the server to connect to, authenticating to the server, performing the search itself and storing the resulting data in the database. Some of the steps might even require a helper process; for instance, authenticating against a remote server using a keytab is done in a helper process called ldap_child that logs into its own logfile, /var/log/sssd/ldap_child.log.

Given that most steps happen in the back end itself, the problem or misconfiguration most often lies in the back end part. But it is still very important to know the overall architecture and be able to identify if and how the request made it to the back end at all. In the next part, we’ll apply this new information to a small case study and repair a buggy SSSD setup.

Troubleshooting a failing SSSD user lookup

With the SSSD architecture in mind, we can try a case study. Consider we have an IPA client, but no users show up, not even the admin:

$ getent passwd admin
$ echo $?
2

The admin user was not found! Given our knowledge of the architecture, let’s first see if the system is configured to query sssd for user information at all:

$ grep passwd /etc/nsswitch.conf
passwd: files sss

It is. Then the request must have been passed on to the nss responder process, since the only other possibility is a successful return from the memory cache. We need to raise the debug_level in the [nss] section like this:

[nss]
debug_level = 7

and restart sssd:

# systemctl restart sssd

Then we’ll request the admin user again and inspect the NSS logs:

[sssd[nss]] [accept_fd_handler] (0x0400): Client connected!
[sssd[nss]] [sss_cmd_get_version] (0x0200): Received client version [1].
[sssd[nss]] [sss_cmd_get_version] (0x0200): Offered version [1].
[sssd[nss]] [nss_cmd_getbynam] (0x0400): Running command [17] with input [admin].
[sssd[nss]] [sss_parse_name_for_domains] (0x0200): name 'admin' matched without domain, user is admin
[sssd[nss]] [nss_cmd_getbynam] (0x0100): Requesting info for [admin] from []
[sssd[nss]] [nss_cmd_getpwnam_search] (0x0100): Requesting info for [admin@ipa.example.com]
[sssd[nss]] [sss_dp_issue_request] (0x0400): Issuing request for [0x4266f9:1:admin@ipa.example.com]
[sssd[nss]] [sss_dp_get_account_msg] (0x0400): Creating request for [ipa.example.com][4097][1][name=admin]
[sssd[nss]] [sss_dp_internal_get_send] (0x0400): Entering request [0x4266f9:1:admin@ipa.example.com]
[sssd[nss]] [sss_dp_get_reply] (0x1000): Got reply from Data Provider - DP error code: 1 errno: 11 error message: Fast reply - offline
[sssd[nss]] [nss_cmd_getby_dp_callback] (0x0040): Unable to get information from Data Provider
Error: 1, 11, Fast reply - offline
Will try to return what we have in cache

Well, apparently the request for the admin user was received and passed on to the back end process, but the back end replied that it switched to offline mode. That means we need to also enable debugging in the domain part and continue the investigation there. We need to add debug_level to the [domain] section and restart sssd again. Then run the getent command and inspect the domain log file, /var/log/sssd/sssd_ipa.example.com.log, starting at the time that corresponds to the NSS responder issuing the request (as indicated by sss_dp_issue_request in the nss log). In the domain log we see:

[sssd[be[ipa.example.com]]] [fo_resolve_service_done] (0x0020): Failed to resolve server 'master.ipa.example.com:389': Domain name not found
[sssd[be[ipa.example.com]]] [set_server_common_status] (0x0100): Marking server 'master.ipa.example.com:389' as 'not working'
[sssd[be[ipa.example.com]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (master.ipa.example.com:389), resolver returned (11)
[sssd[be[ipa.example.com]]] [be_resolve_server_process] (0x1000): Trying with the next one!
[sssd[be[ipa.example.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'IPA'
[sssd[be[ipa.example.com]]] [get_server_status] (0x1000): Status of server 'master.ipa.example.com:389' is 'not working'
[sssd[be[ipa.example.com]]] [get_port_status] (0x1000): Port status of port 0 for server '(no name)' is 'not working'
[sssd[be[ipa.example.com]]] [get_server_status] (0x1000): Status of server 'master.ipa.example.com:389' is 'not working'
[sssd[be[ipa.example.com]]] [fo_resolve_service_send] (0x0020): No available servers for service 'IPA'
[sssd[be[ipa.example.com]]] [be_resolve_server_done] (0x1000): Server resolution failed: 5
[sssd[be[ipa.example.com]]] [sdap_id_op_connect_done] (0x0020): Failed to connect, going offline (5 [Input/output error])
[sssd[be[ipa.example.com]]] [be_ptask_create] (0x0400): Periodic task [Check if online (periodic)] was created
[sssd[be[ipa.example.com]]] [be_ptask_schedule] (0x0400): Task [Check if online (periodic)]: scheduling task 70 seconds from now [1426087775]
[sssd[be[ipa.example.com]]] [be_run_offline_cb] (0x0080): Going offline. Running callbacks.

OK, that gets us somewhere. Indeed, our /etc/resolv.conf file was pointing to a bad nameserver. After fixing the resolver settings and restarting SSSD, everything seems to be working:

$ getent passwd admin
admin:*:1546600000:1546600000:Administrator:/home/admin:/bin/bash

Awesome, we were able to repair a broken SSSD setup!