Wednesday, January 1, 2014

Basics: The Centrify Agent Cache

Background

In a previous post we described the Centrify agent (adclient) and mentioned that its cache serves two purposes: performance and offline access (availability when AD is not reachable). Understanding the cache lets any system administrator use the solution more effectively. The client caches DNS, object, and authorization information.

The NSCD service

NSCD stands for Name Service Cache Daemon. It is the best method to retrieve cached information, since it's implemented and optimized for each version of UNIX/Linux. As a basic computer science principle, retrieving items from memory is faster than retrieving them from permanent storage. The rule of thumb is to always implement NSCD on your systems. This is a very important design consideration.
The best thing you can do to improve the performance of any application that uses the Name Service Switch (NSS) is to implement NSCD. This applies to Centrify for UNIX/Linux.

The Object cache

This cache stores the UNIX-enabled users and managed groups that belong to the client's zone. It's implemented as a database stored in the /var/centrifydc folder. It exists so the client does not have to go to Active Directory for each query, and it also provides support for offline access.

Object expiration and refresh intervals

When objects are retrieved, they are tagged with a time-stamp and some of the AD information is stored with them (mainly the update sequence number, or USN). While the refresh interval is current, the object is returned "as is" (no queries to AD are made). Once the refresh interval expires, the object's USN in AD is compared to the one in the cache: only if the numbers are different is the object refreshed; otherwise just the time-stamp is renewed.
The notable exception to this logic is when an end user attempts to log in. At that point (among other things) the validity of the account is checked (e.g. is the account disabled, locked, expired, allowed to log in at this time, etc.), and those checks are performed in real time.
Although there are more parameters to fine-tune the cache, the adclient.cache.expires interval governs when objects expire. Traditionally, the default has been 1 hour.
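
To make the expiration logic above concrete, here is a minimal Python sketch of the decision the agent makes on each lookup. It is only an illustration of the description in this post, not Centrify's implementation: the CachedObject class, the lookup function and the two query callbacks are hypothetical names, and expressing the 1-hour default as 3600 seconds is an assumption.

```python
from dataclasses import dataclass

# Assumption: the 1-hour default mentioned above, expressed in seconds.
CACHE_EXPIRES_SECONDS = 3600


@dataclass
class CachedObject:
    """Hypothetical stand-in for a cached user or group."""
    name: str
    attributes: dict
    usn: int          # update sequence number seen when the object was cached
    cached_at: float  # time-stamp (in seconds) of the last fetch or renewal


def lookup(entry, now, query_usn, query_attributes, expires=CACHE_EXPIRES_SECONDS):
    """Apply the refresh logic to one cached entry and report what happened.

    query_usn and query_attributes are placeholder callbacks that stand in
    for the queries the agent would send to Active Directory.
    """
    if now - entry.cached_at < expires:
        # Refresh interval still current: return the object "as is".
        return "served-from-cache"

    current_usn = query_usn(entry.name)
    if current_usn == entry.usn:
        # Nothing changed in AD: only the time-stamp is renewed.
        entry.cached_at = now
        return "timestamp-renewed"

    # The USN differs: refresh the object itself.
    entry.attributes = query_attributes(entry.name)
    entry.usn = current_usn
    entry.cached_at = now
    return "refreshed"
```

Note that the sketch deliberately leaves out the log-in path, where the account validity checks are made in real time regardless of the cache state.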

The Authorization Cache

This cache is used for the purposes of authorization (roles, rights, etc.) and is separate from the object cache.
It uses the AD azman schema and Centrify algorithms to determine whether there are changes to the group memberships (or principals) that have roles and rights tied to them. It relies on the USN and the adclient.azman.refresh.interval parameter (set to 30 minutes by default).

Depending on the complexity of the AD infrastructure, this operation can be expensive in terms of LDAP queries. Corporate domains are complex, and in certain scenarios it's possible that groups are merged. Since the memberships are fully enumerated, this is something to think about.

How the Authorization Cache affects the User's ability to log in

As we have mentioned before, for a user to log in they require two things: a UNIX identity and a role that allows them to log in. One component relates to identification, the other to authorization.
This means that it's possible to have a proper login, UID, GID, etc., but still be unknown to a system because the user does not have a role assigned, or the role that is assigned doesn't have a PAM login right.
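
The two-part requirement can be sketched in a few lines of Python. Everything here is hypothetical and only mirrors the reasoning above: the Role class, the can_log_in function and the "pam-login" label are illustrative names, not Centrify objects or APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """Hypothetical role: a name plus the set of rights it grants."""
    name: str
    rights: set = field(default_factory=set)


def can_log_in(unix_identity, roles):
    """A user needs BOTH an identity and a role carrying a PAM login right."""
    if unix_identity is None:
        return False  # identification is missing
    # Authorization: at least one assigned role must grant the login right.
    return any("pam-login" in role.rights for role in roles)


identity = {"login": "alice", "uid": 1001, "gid": 1001}
auditor = Role("auditor", {"run-reports"})          # no PAM login right
operator = Role("operator", {"pam-login", "sudo"})  # includes the login right

print(can_log_in(identity, []))          # identity only, no role
print(can_log_in(identity, [auditor]))   # role without the login right
print(can_log_in(identity, [operator]))  # identity plus a login-capable role
```

Only the last call succeeds; the first two fail even though the UNIX identity is complete.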

Note that users can be "listed" in a system: their identity is visible there even though they cannot log in.
There are other types of cache, but the two most relevant are the Object and Authorization caches.

What does this all mean?

It means that there's a fine balance to be understood in order to provide both performance and high availability. It also means that the workings of the agent have planning implications. Questions like:

  • What is this system used for?
    A shared developer system calls for very different settings than a slave node in a number-crunching system.
  • What are the current provisioning/de-provisioning SLAs?
    This opens a great question. The SLA dictates a lot of settings. For example, in a scheme in which users are provisioned by an IdM solution from an HR system, with workflows involved to approve access to UNIX/Linux systems, all components have to be accounted for, including Active Directory replication intervals.
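
As a rough worked example of the SLA question, the sketch below adds up a worst-case delay between a change being requested and the agent acting on it. The two cache intervals are the defaults mentioned in this post; the workflow and replication figures are made-up placeholders, and treating the upstream stages as sequential while the two agent caches refresh independently is a simplifying assumption.

```python
def worst_case_delay_minutes(sequential_stages, agent_cache_intervals):
    """Upstream stages add up; the slower of the agent caches bounds the rest."""
    return sum(sequential_stages.values()) + max(agent_cache_intervals.values())


# Placeholder figures: replace them with the numbers from your environment.
upstream = {
    "approval workflow": 240,
    "AD replication": 15,
}

# Defaults quoted in this post, in minutes.
agent_caches = {
    "object cache (adclient.cache.expires)": 60,
    "authorization cache (adclient.azman.refresh.interval)": 30,
}

total = worst_case_delay_minutes(upstream, agent_caches)
print(f"Worst case: {total} minutes")
```

If the total exceeds the SLA, the intervals (or the upstream process) have to change; that is the planning implication in a nutshell.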

Adflush described

Adflush is a command line utility that allows system administrators to manipulate the cache.  It shouldn't be used without understanding how the cache works.  

usage: adflush [options]
options:
  -f, --force             flush cache even adclient is in disconnected mode
  -a, --auth               flush cached authorization data
  -d, --dns                flush the adclient dns cache and DC locator cache
  -e, --expire             expire everything in the DC and GC object cache
  -o, --objects            flush only the DC and GC object cache
  -t, --trusts             rediscover the trusted domains
  -b, --bindings           force refresh binding
  -v, --version            print version information
  -V, --verbose            print debugging information
  -h, --help               print this help information and exit.

The type of adflush you run can have implications ranging from slowing things down a bit, to consuming resources, to actually eliminating the ability to log in offline. This is important to know for roaming users on Macintosh platforms: a user who runs a forced adflush while at home will have to wait until they are back at the office to log in!
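
One way to keep scripted flushes deliberate is to build the command line explicitly, so that --force never slips in by accident. The helper below is a hypothetical Python sketch: the long option names come from the usage text above, while the function name, the one-target-per-call rule, and the idea that --force can be combined with a target option are assumptions.

```python
import shlex

# Long options copied from the adflush usage text above.
TARGETS = {"auth", "dns", "expire", "objects", "trusts", "bindings"}


def build_adflush_command(target=None, force=False):
    """Return the argv list for one adflush invocation (nothing is executed).

    Assumption: --force may be combined with a single target option.
    """
    argv = ["adflush"]
    if target is not None:
        if target not in TARGETS:
            raise ValueError(f"unknown adflush target: {target!r}")
        argv.append(f"--{target}")
    if force:
        # Forced flushes proceed even in disconnected mode, which can remove
        # the cached data that offline logins depend on.
        argv.append("--force")
    return argv


# Print the command for review instead of running it.
print(shlex.join(build_adflush_command("auth")))
```

Passing the resulting list to subprocess.run would execute it; the sketch stops at printing so the command can be reviewed first.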
