January 19, 2021
By Chris Hyzer, University of Pennsylvania
Grouper, the access management component of the InCommon Trusted Access Platform, continues to evolve to meet the community’s needs. Our most recent focus has been a reimagining of the way Grouper handles provisioning. The new Grouper provisioning framework revolutionizes how data flows between Grouper and external systems.
The intent of the framework is not to turn Grouper into a provisioning engine like MidPoint. Instead, Grouper needs to quickly and reliably provision authorization data to other IAM middleware, such as MidPoint, LDAP, or SQL, as a core feature. In addition, if an institution does not have a provisioning system, or wants a different way to provision entitlements to applications, the Grouper provisioning framework can be used for that too.
Caption: Diagram of the New Grouper Provisioning Framework
When Grouper 2.6 is released in the next six months, all provisioning is planned to be migrated to the new framework. Until then, each Grouper 2.5 release will contain more and more provisioning features that can be used to help solidify and polish the framework. Please let the Grouper team know if you have provisioning use cases, and if you are keeping up with the latest Grouper 2.5 containers and want to volunteer to kick the tires on the new technology. All Grouper supported provisioners are included in the Grouper container so they can be easily leveraged. Most items in the write-up below exist in Grouper 2.5 now. Some things are a work in progress.
Evolution of Grouper Provisioning
I would like to start by briefly describing the previous provisioning in Grouper. Grouper has for years successfully provisioned to many different targets. In some cases we have rewritten provisioners repeatedly to try to improve features and performance (e.g. LDAP). A saying often attributed to Einstein defines “insanity” as doing the same thing over and over and expecting different results. The configuration varies widely across the existing provisioners, and the property files are difficult to “get right”. Some provisioners lack core features that others have (e.g. the Microsoft Azure provisioner cannot full sync). Many provisioners do not have unit tests, since it is difficult or impossible to replicate the target without buying licenses. A lot of the logic that makes provisioning happen is written over and over (in slightly different ways), which results in unnecessary technical debt and inconsistent behavior. Some Grouper provisioners have had serious performance problems. It was not possible to easily analyze provisioning issues when troubleshooting helpdesk tickets about access problems.
Over the last year, each member of the Grouper development team has spent some or all of their time happily working on the new provisioning framework. The goal of the effort is to address all the previously mentioned shortcomings of provisioning. This investment will pay off for the Grouper team and for institutions that use Grouper.
Configuration in the New Provisioning Framework
I don’t really know where to start when describing the features of the new framework, since they are all equally my favorite. So I will start with the configuration. All parts of provisioning configuration are performed in the Grouper UI, with helpful documentation, wizard-like interfaces, descriptive validations, and diagnostic tests to further ensure that things are correct. Configuring new provisioners or editing existing ones does not require Grouper to be redeployed or restarted. Provisioning configuration starts with an “external system”, which holds the connection information for the target being provisioned to. External systems can be reused among provisioners, or for other parts of Grouper (e.g. loader, “custom UI”). The provisioner itself is configured next.
In the new Grouper provisioning framework, all provisioners have consistent configuration concepts and terms so adding the next provisioner will be easy. Finally there are the daemon jobs for the full or incremental sync which are scheduled. Let me mention the validation and diagnostics one more time. When a provisioner is configured or when Grouper is upgraded or when the target changes, the provisioning diagnostics will quickly determine if the provisioning will work and help prevent rogue actions like erroneously removing data (access information) from the target. This will help you not get reprimanded by the Change Control Board.
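As a rough illustration of the layering described above (an external system shared by several provisioners, with validation before anything runs), here is a minimal Python sketch. The class names, fields, and messages are invented for illustration and are not Grouper's actual configuration keys or API; in Grouper itself all of this is done in the UI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalSystem:
    """Connection details for one target (hypothetical fields)."""
    system_id: str
    url: str
    username: str


@dataclass
class ProvisionerConfig:
    """A provisioner refers to an external system by id instead of copying its settings."""
    name: str
    external_system_id: str
    provision_groups: bool = True
    provision_memberships: bool = True


def validate(config: ProvisionerConfig, systems: dict) -> list:
    """Return human-readable problems; an empty list means the config looks usable."""
    problems = []
    if not config.name.strip():
        problems.append("provisioner name is required")
    if config.external_system_id not in systems:
        problems.append(f"unknown external system '{config.external_system_id}'")
    if not (config.provision_groups or config.provision_memberships):
        problems.append("nothing is selected to provision")
    return problems


# One external system, keyed by id, that any number of provisioners can reuse.
systems = {"ldapProd": ExternalSystem("ldapProd", "ldaps://ldap.example.edu", "grouper")}

ldap_groups = ProvisionerConfig("ldapGroups", "ldapProd")
broken = ProvisionerConfig("ldapTypo", "ldapProdd")  # typo in the external system id
```

Calling `validate(ldap_groups, systems)` returns an empty list, while `validate(broken, systems)` reports the unknown external system before any sync is attempted.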
Translating Data using a Standard Object Model
All provisioners have a standard object model. Translating data from Grouper to the target format is consistent across all provisioners. In addition to configuring which data from Grouper gets sent to the target and how it is formatted, there are provisioning-specific screens to identify which objects (groups, users, memberships, attributes) are sent to the target. Beyond marking objects as provisionable, provisioner-specific metadata can be assigned to Grouper objects to inform provisioning actions. For example, a Grouper group destined for Azure could be marked as provisionable, identified as public or private, and assigned a “security” or “unified” type. These metadata are documented and validated in the UI. In this case, the UI for assigning Azure metadata can be delegated to the team responsible for managing Azure at your institution, or to power users in sub-organizations. This is a downside if you are compensated by the volume of helpdesk tickets!
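To make the Azure example above concrete, here is a small Python sketch of a target-neutral group object carrying provisioner-specific metadata that is validated before translation. The class, the metadata keys, and the payload shape are assumptions made for illustration; they are neither Grouper's object model nor the real Azure schema.

```python
from dataclasses import dataclass, field

# Allowed metadata values, taken from the Azure example in the text.
VISIBILITIES = {"public", "private"}
GROUP_TYPES = {"security", "unified"}


@dataclass
class ProvisioningGroup:
    """Target-neutral view of a Grouper group (hypothetical shape)."""
    name: str
    display_name: str
    provisionable: bool = False
    metadata: dict = field(default_factory=dict)


def translate_for_azure_like_target(group: ProvisioningGroup) -> dict:
    """Validate the metadata, then build an illustrative target payload."""
    if not group.provisionable:
        raise ValueError(f"{group.name} is not marked as provisionable")
    visibility = group.metadata.get("visibility", "private")
    group_type = group.metadata.get("group_type", "security")
    if visibility not in VISIBILITIES:
        raise ValueError(f"visibility must be one of {sorted(VISIBILITIES)}")
    if group_type not in GROUP_TYPES:
        raise ValueError(f"group_type must be one of {sorted(GROUP_TYPES)}")
    return {"name": group.display_name, "visibility": visibility, "type": group_type}


group = ProvisioningGroup(
    name="app:collab:policy:team_a",
    display_name="Team A",
    provisionable=True,
    metadata={"visibility": "public", "group_type": "unified"},
)
payload = translate_for_azure_like_target(group)
```

Because the validation lives next to the translation, a group that is not marked provisionable, or that carries an unknown type, is rejected before anything is sent to the target.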
Verification and Debugging for Provisioning
Configuring a provisioner is only the first part of the user interface story. When the provisioner is up and running, data propagation needs to be verified, errors need to be readily available (without looking in logs), target identifiers need to be displayed, and detailed logs and audits need to be available. All of this and more is available in the new framework (minus the kitchen sync). Problems can also be quickly addressed, since each object type can be recalculated and fixed from the UI. For example, if a person or a group has a ticket open, they can be provisioned again right away once the network issue is resolved. The framework has helpful, consistent logs and failsafe controls.
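The post does not spell out how the failsafe controls work, so the following Python sketch shows just one plausible approach: refuse a sync that would remove more than a set fraction of what is currently in the target. The function name, the 20% threshold, and the minimum size are assumptions for illustration, not Grouper settings.

```python
def failsafe_allows_sync(in_target, desired, max_removal_fraction=0.2, min_size=10):
    """Return True if the sync may proceed, False if it looks like a rogue mass removal.

    in_target and desired are sets of (group_name, user_id) memberships.
    """
    removals = in_target - desired
    if len(in_target) < min_size:
        # For very small targets a percentage is not meaningful, so allow the sync.
        return True
    return len(removals) / len(in_target) <= max_removal_fraction


in_target = {("app:zoom:licensed", f"user{i}") for i in range(100)}

# A normal run removes two memberships out of one hundred.
normal_run = in_target - {("app:zoom:licensed", "user0"), ("app:zoom:licensed", "user1")}

# A broken run, e.g. a subject source that returned nothing by mistake.
empty_run = set()
```

With these inputs, `failsafe_allows_sync(in_target, normal_run)` allows the sync, while `failsafe_allows_sync(in_target, empty_run)` blocks it so a person can review the situation first.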
Filtering by Types
One of the reasons existing provisioners have performance problems is that extraneous data is sent to the target, since it is inconvenient to identify exactly which groups are needed in the target. Not anymore. We are introducing more “types” of groups in Grouper and integrating that with the part of the provisioning framework that retrieves data from Grouper. For example, a provisioner can be configured to monitor only “policy” groups, or it could filter out “basis”, “reference”, and “intermediate” groups. Grouper can make you do what you should be doing!
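The two filtering styles mentioned above (include only certain types, or exclude certain types) can be sketched in a few lines of Python. The group names and the dictionary shape are made up for the example; only the type names come from the text.

```python
def select_groups(groups, include_types=None, exclude_types=frozenset()):
    """Return the names of groups a provisioner would monitor.

    include_types=None means "any type"; exclude_types always wins.
    """
    selected = []
    for group in groups:
        group_type = group.get("type")
        if include_types is not None and group_type not in include_types:
            continue
        if group_type in exclude_types:
            continue
        selected.append(group["name"])
    return selected


groups = [
    {"name": "ref:employees", "type": "reference"},
    {"name": "basis:hr:fulltime", "type": "basis"},
    {"name": "app:vpn:policy:allowed", "type": "policy"},
    {"name": "app:vpn:intermediate:staff", "type": "intermediate"},
]

# Style 1: only monitor "policy" groups.
policy_only = select_groups(groups, include_types={"policy"})

# Style 2: filter out the building-block groups.
without_building_blocks = select_groups(
    groups, exclude_types={"basis", "reference", "intermediate"}
)
```

In this small example both styles select the same single policy group, which is the point: the target only receives the groups it actually needs.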
A common task when interacting with targets is to not only send data to the target, but also bring data back to Grouper. For example, I want to provision groups to Zoom (and maybe only through SAML so accounts are just-in-time), but I also want a group in Grouper which contains Zoom users with licensed accounts. There are no plans for the provisioning framework to replace the Grouper loader, but it is possible to have data flow through the provisioning framework from the target to Grouper (bi-directional).
All provisioners (SQL, LDAP, web service) where possible will have extensive unit tests using the new framework. There is a web service mock facility to simulate various targets to make sure provisioning works and is not negatively affected during upgrades. An LDAP container is used in Grouper CI/CD to ensure LDAP provisioning is reliable.
Data from the target or from subject sources is cached in the database by the provisioning framework for a few reasons. Real-time updates need to quickly send data to the target without resolving references. Full syncs need to perform as fast as possible. Deletes from Grouper need to know about the identifiers in the target. Auditors and helpdesk workers need to be able to get a clearer picture of how data is interconnected. Where does Grouper keep its cache? In a river bank!
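One of the reasons above, that deletes need to know the identifier in the target, is easy to show with a tiny database-backed cache. This Python sketch uses an in-memory SQLite database; the table and column names are hypothetical and do not reflect Grouper's real schema.

```python
import sqlite3


def open_cache():
    """Create an in-memory cache mapping Grouper ids to target identifiers."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cached_group ("
        " grouper_id TEXT PRIMARY KEY,"
        " target_id TEXT NOT NULL)"
    )
    return conn


def remember(conn, grouper_id, target_id):
    """Record the identifier the target assigned when the group was provisioned."""
    conn.execute(
        "INSERT OR REPLACE INTO cached_group (grouper_id, target_id) VALUES (?, ?)",
        (grouper_id, target_id),
    )


def forget(conn, grouper_id):
    """Return the target id for a deleted group and drop it from the cache.

    Returns None if the group was never provisioned.
    """
    row = conn.execute(
        "SELECT target_id FROM cached_group WHERE grouper_id = ?", (grouper_id,)
    ).fetchone()
    if row is None:
        return None
    conn.execute("DELETE FROM cached_group WHERE grouper_id = ?", (grouper_id,))
    return row[0]


cache = open_cache()
remember(cache, "grouper-uuid-1", "cn=team_a,ou=groups,dc=example,dc=edu")

# The group is gone from Grouper, but the cache still knows what to delete in the target.
dn_to_delete = forget(cache, "grouper-uuid-1")
```

The same lookup serves real-time updates, which can send the cached identifier straight to the target instead of resolving it again.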
The framework is also intended to be easily pluggable so institution-specific provisioning (which might currently be in the form of change log consumers, web service clients, or messaging consumers) can take advantage of the new framework. Grouper 2.6 will not have legacy provisioners (all will be in the new framework), but other provisioning structures/interfaces will still be in place. The change log, ESB, messaging, and Web Services will still be there and will be backwards compatible as usual. If you are using these things to provision data between Grouper and a target, they will still work in 2.6, but I recommend that you consider moving this logic into the provisioning framework.
When you click on a group and select “provisioning”, don’t you want to see this custom target listed there? Do you want to see logs/audits in a consistent way from the Grouper UI? Do you want to be able to fix problems from the UI? Do you want to have the latest and greatest advice on your contrib page? Do you want to play nice in our sandbox?
To Learn More
Preliminary documentation on the new Grouper Provisioning Framework is available here. We invite your input and feedback. We appreciate all our community partners to date who have provided use cases and testing, with a special acknowledgement to the University of Michigan and the University of Arizona.
Join us for Grouper Training in February!
There is a remote Grouper training on February 9-12, 2021, where we will be teaching Grouper as usual, but this time with the new provisioning framework instead of the older PSPNG. We have new content regarding the Grouper Deployment Guide and improved training use cases. This training is useful for those new to Grouper and for those experienced with it. Please sign up and join us! After you finish the training, read this blog again and it might resonate a little more. 🙂
Chris Hyzer, Grouper lead