Every once in a while, when debugging an SSSD performance problem for a user or a customer, I see that even experienced users tend to measure login performance by running id(1). That’s not really the best approach, for one reason: the plain id command does much more than what happens during login, and that extra work comes at a cost.
Typically, a login program such as ssh or gdm needs to perform a couple of tasks aside from verifying the user’s credentials. These include finding out whether the user exists, what shell and home directory the user has, and what groups the user is a member of, so that the user can access the files he should be allowed to (and none that he shouldn’t). These tasks boil down to two corresponding glibc calls: getpwnam to find the details about the user, and getgrouplist to retrieve the list of groups the user is a member of.
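To make the two lookups concrete, here is a minimal sketch in Python, whose pwd.getpwnam and os.getgrouplist wrap the libc calls of the same names. The function name login_lookups is made up for illustration, and a real login program of course does far more than this.

```python
import os
import pwd


def login_lookups(username):
    """Perform the two NSS lookups described above for one user.

    Hypothetical helper for illustration only.
    """
    # getpwnam: does the user exist, and what are the shell and home directory?
    entry = pwd.getpwnam(username)  # raises KeyError if the user is unknown

    # getgrouplist: numeric GIDs of every group the user is a member of.
    # The primary GID passed in is always part of the result.
    gids = os.getgrouplist(username, entry.pw_gid)

    return entry.pw_shell, entry.pw_dir, gids


if __name__ == "__main__":
    # Look up whoever is running the script.
    me = pwd.getpwuid(os.getuid()).pw_name
    shell, home, gids = login_lookups(me)
    print(me, shell, home, gids)
```

Note that nothing here asks for group names or group members; only numeric GIDs come back.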
Because these two library functions are used so often, we take special care in the SSSD to make sure that we use any optimization that is available, such as the transitive memberOf attribute when IPA is used or the tokenGroups attribute when the AD provider is configured. To measure the performance of the Name Service Switch calls that the login program makes, you can run "id -G $username" from the command line.
Notice the extra "-G" switch. It makes quite a difference, because the getgrouplist operation returns only a list of numerical IDs of the groups the user is a member of. That’s usually good enough for the login programs to set the groups for the user who logs in, but not really friendly for the admin inspecting the output of the id command.
In contrast to "id -G $username", "id $username" does one operation that sounds trivial but can be extremely expensive: it resolves the group GIDs to group names. While that sounds like a really easy operation, it involves calling getgrgid for each of the GIDs returned by getgrouplist. And there comes the slowdown, because the getgrgid operation is kind of an all-or-nothing call. It retrieves not only the information about the group itself, such as its name, but also the full list of the group’s members, including all the users. This can get quite expensive. Consider a university setup where each student is a member of a group “students” that consists of all students at the university.
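The difference between the two code paths can be sketched with a short Python script that times getgrouplist on its own and then the per-GID getgrgid calls that name resolution requires. The function name time_group_lookups is hypothetical, and the numbers are only indicative: caches along the way, SSSD’s own included, will affect repeated runs.

```python
import grp
import os
import pwd
import time


def time_group_lookups(username):
    """Time the GID-only lookup against the per-GID name resolution.

    Hypothetical helper; returns (seconds_gids_only, seconds_names, names).
    """
    entry = pwd.getpwnam(username)

    # Roughly what "id -G" needs: numeric GIDs only.
    start = time.perf_counter()
    gids = os.getgrouplist(username, entry.pw_gid)
    gids_only = time.perf_counter() - start

    # The extra work plain "id" does: one getgrgid per GID.
    # Each call returns the whole group entry, member list included.
    start = time.perf_counter()
    names = []
    for gid in gids:
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # The GID has no resolvable group entry; fall back to the number.
            names.append(str(gid))
    resolving = time.perf_counter() - start

    return gids_only, resolving, names


if __name__ == "__main__":
    me = pwd.getpwuid(os.getuid()).pw_name
    fast, slow, names = time_group_lookups(me)
    print(f"getgrouplist: {fast:.4f}s, getgrgid x {len(names)}: {slow:.4f}s")
```

With large groups and a cold cache, the second figure is where the time tends to go.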
A legitimate question might be whether it’s possible to restrict the amount of information the get-by-GID call retrieves. The answer is both yes and no: the POSIX interface doesn’t allow any such query directly, but the SSSD offers several means to speed up the overall processing. One quite recent addition is the ignore_group_members configuration option that was contributed by Paul Henson. Setting this option to True causes all groups to effectively appear empty, avoiding the need to download the members. Keep in mind that with many server implementations, the members might also include other nested groups, which causes the whole operation to recurse.
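As a rough illustration, the option goes into the domain section of sssd.conf. The domain name example.com below is a placeholder, and the rest of the domain configuration is left out.

```ini
# /etc/sssd/sssd.conf (fragment; "example.com" is a placeholder domain name)
[domain/example.com]
# Return groups without their member lists
ignore_group_members = True
```

With this set, a getgrgid call still returns the group name and GID, only with an empty member list. The output of getgrouplist, and therefore of "id -G", is not affected.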