I was recently asked how to pitch an internal secrets management service to a company that doesn’t have one. I am providing the fruit of my efforts here in the hope that my fellow travelers find it useful. That said, I provide no guarantee or warranty as to the accuracy or usefulness of the information contained herein for any purpose.

The Problem

As companies move more and more of their operations to interconnected systems, many challenges arise around identification and authentication for access to various data and services. Traditional access control mechanisms pair a publicly known component (usernames, public keys, etc.) with a secret (passwords, private keys, etc.). These mechanisms shift the problem from identification and authentication (id/authn) to management of the secrets.

In the simplest cases, users memorize passwords. This mechanism quickly breaks down for three reasons: fundamentally flawed yet still enforced ideas about what makes a strong password, the limits of human memory, and the basic assumption that there will always be a human in the system in the first place.

Requirements

The natural solution is a Secrets Management Service. This service must meet several requirements:

  • Secrets (passwords, keys, certificates, etc.) must be stored securely
  • It must be possible to create, read, update, delete, and verify secrets
  • Secrets must be versioned and have a managed lifecycle
  • All operations must be accessible only to authenticated and authorized entities (human or otherwise)
  • Granular permissions must support both individual and role-based access controls
  • All operations must be recorded in a tamper-resistant / tamper-evident audit trail
  • All operations must be accessible through an easy-to-use API
  • The service must be ACID-compliant and highly available
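
To make a few of these requirements concrete, here is a minimal in-memory sketch in Python. It is my own illustration, not any particular product's design: the class name `SecretStore`, its methods, and the role names are all hypothetical. It shows versioned writes, a role check on every operation, and an audit trail where each entry carries the hash of the previous one, so that edits to history are evident. It deliberately ignores encryption at rest, real authentication, persistence, and high availability.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class SecretStore:
    """Toy store: versioned secrets, role checks, hash-chained audit log."""
    role_permissions: dict                          # role -> set of allowed actions (hypothetical roles)
    _versions: dict = field(default_factory=dict)   # secret name -> list of values, oldest first
    audit_log: list = field(default_factory=list)   # audit entries, oldest first

    def _audit(self, actor, action, name, allowed):
        # Each entry stores the previous entry's hash, so changing or
        # removing an earlier entry invalidates the chain from that point on.
        prev_hash = self.audit_log[-1]["hash"] if self.audit_log else "0" * 64
        entry = {"actor": actor, "action": action, "name": name,
                 "allowed": allowed, "prev": prev_hash}
        payload = json.dumps(entry, sort_keys=True).encode()
        entry["hash"] = hashlib.sha256(payload).hexdigest()
        self.audit_log.append(entry)

    def _authorize(self, actor, role, action, name):
        # Denied attempts are audited too, before the error is raised.
        allowed = action in self.role_permissions.get(role, set())
        self._audit(actor, action, name, allowed)
        if not allowed:
            raise PermissionError(f"{actor} ({role}) may not {action} {name}")

    def put(self, actor, role, name, value):
        """Create or update a secret; every write appends a new version."""
        self._authorize(actor, role, "write", name)
        self._versions.setdefault(name, []).append(value)
        return len(self._versions[name])  # 1-based version number

    def get(self, actor, role, name, version=None):
        """Read the latest version, or a specific 1-based version."""
        self._authorize(actor, role, "read", name)
        history = self._versions[name]
        if version is None:
            return history[-1]
        if not 1 <= version <= len(history):
            raise KeyError(f"{name} has no version {version}")
        return history[version - 1]

    def delete(self, actor, role, name):
        """Remove a secret and all of its versions."""
        self._authorize(actor, role, "delete", name)
        del self._versions[name]

    def verify_audit_log(self):
        """Recompute the hash chain; False means the log was altered."""
        prev_hash = "0" * 64
        for entry in self.audit_log:
            body = {k: v for k, v in entry.items() if k != "hash"}
            payload = json.dumps(body, sort_keys=True).encode()
            if body["prev"] != prev_hash:
                return False
            if hashlib.sha256(payload).hexdigest() != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

A caller would construct it with something like `SecretStore(role_permissions={"admin": {"read", "write", "delete"}, "app": {"read"}})`, write with `put`, read with `get`, and periodically call `verify_audit_log()`. A hash chain only makes tampering evident, not impossible; a real service would also need to protect the log itself.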

Build vs. Buy

Of course, in 2018 we’re not the first ones to identify the need for this type of service. Many very intelligent and capable folks have created secrets management services, and several are available on the public market, some as commercial products and others as open source projects. To appropriately evaluate the marketplace options and compare procurement, integration, and development efforts along with direct costs, the company should assemble the smallest possible task force that includes representation from:

Who | Why
Security Development / Engineering | Owns developing tools that are not bought or acquired as open source
Security Operations / DevOps | Owns operating the Secrets Management Service and integrating with existing systems
Product / Platform Engineering | Primary consumer of the Secrets Management Service
Procurement / Finance | Owns the business and legal processes around acquiring commercial or open source software

The decision of which pre-existing product or project (or none at all) to base the new service upon should not be taken lightly. It is well worth some extra time and effort upfront to run a proof of value (PoV) with several potential solutions and compare their ease of deployment, integration, use, and maintenance under real-world conditions, both against each other and against the effort and advantages of developing a tailored solution in-house.

A non-exhaustive list of commercially available Secrets Management Services:

A non-exhaustive list of open source Secrets Management Services:

Deliverables

Phase | Deliverables
Potential Base Product Selection
  • A short list of commercial and open source solutions to run through PoV
  • A well-defined timeline and resource requirements breakdown for each PoV
  • A detailed requirements document / checklist to enable quantitative comparison
Proof of Value
  • Limited deployment and integration of each potential solution, modeling as much of production workload as possible
  • Scored requirements documents for each tested solution
  • A decision of which solution to move forward with
Initial Production
  • Development of any missing pieces (or the whole thing) to meet the requirements defined above
  • Deployment and integration with one production system
  • Feedback from the initial consumer
  • Self-audit by the Security Team to ensure all requirements are met in production
  • Initial evangelism collateral
Iteration
  • Serial (or at least not massively parallel) deployment and integration with remaining production programs
  • Continued feedback from all live consumers
Codification / Evangelism
  • Training, policy, enforcement, and audit mechanisms to ensure all future engineering stores secrets in the Secrets Management Service unless there is a deliberate and documented business reason not to
  • Continuous feedback review from all consumers to ensure the service is providing continued value and not being used only because policy says so
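
The "quantitative comparison" and "scored requirements documents" deliverables above can be as simple as a weighted average. Here is a rough Python sketch of one way to do it. The function names, the weights, the 0 to 5 scoring scale, and the candidate names are all made up for illustration; your task force would pick its own.

```python
def score_solution(weights, scores):
    """Weighted average of PoV scores, on the same scale as the raw scores.

    weights: requirement -> importance (any positive number)
    scores:  requirement -> score from the PoV; a missing requirement counts as 0
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(w * scores.get(req, 0) for req, w in weights.items())
    return weighted / total_weight


def rank_solutions(weights, scored_documents):
    """Return (solution name, weighted score) pairs, best first."""
    results = {name: score_solution(weights, doc)
               for name, doc in scored_documents.items()}
    return sorted(results.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights and scores (0-5 scale), purely for illustration.
weights = {"secure storage": 5, "audit trail": 3, "easy-to-use API": 2}
scored_documents = {
    "Candidate A": {"secure storage": 4, "audit trail": 5, "easy-to-use API": 2},
    "Candidate B": {"secure storage": 3, "audit trail": 3, "easy-to-use API": 5},
    "In-house build": {"secure storage": 4, "audit trail": 2},  # API not yet scored
}

for name, score in rank_solutions(weights, scored_documents):
    print(f"{name}: {score:.2f}")
```

Treating an unscored requirement as 0 is a deliberately harsh choice; it keeps a solution from ranking well just because nobody got around to testing part of it. The number is an input to the decision, not the decision itself.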

Success Criteria

I break project success down into three categories:

Criteria | Details | The Path
Voluntary Adoption
  Details:
  • In the Iteration phase, future consumers are asking how soon they can onboard
  • Existing consumers are providing feedback and suggestions
  • No one is "working the system" to avoid using the service
  • Very few consumers are using the service only because it is policy to do so
  The Path:
  • Displaying immediate value by solving problems future consumers have today
  • Providing unobtrusive, quick feedback mechanisms
  • Avoiding sacrificing one use-case to serve another
  • Positioning the service as an efficiency aid first and security improvement second
Makes Things Better
  Details:
  • Secrets are stored more securely than before
  • Secrets are easier to manage, so engineers develop simpler, more secure, and more maintainable applications in less time
  The Path:
  • Encryption at rest and in flight, RBAC, audit trail, etc.
  • Logical, standards-compliant, easy-to-use API
Doesn't Make Things Worse
  Details:
  • Minimizes new attack surfaces
  • Minimizes operational overhead
  • Does not cause confusion or frustration with developers or users
  The Path:
  • RBAC, least privilege, well-defined arena where secrets live and where they do not
  • Logical, modern deployment and upgrade framework (containerization, blue/green deployments, etc., depending on existing processes)
  • Logical, standards-compliant, easy-to-use API and purposefully designed and A/B tested UX

Logistics / Expenses

This section is very company-specific, so I’m only able to provide some high-level guidance.

Phase | Logistics / Expenses
Potential Base Product Selection
  • This phase requires time commitment primarily from the task force discussed above. Their deliverables could be achieved through independent research with followup presentation and discussion of findings or potentially one or two substantial group research sessions.
Proof of Value
  • This phase requires substantial time commitment from the task force as well as future service consumers and integration resources. Depending on the size of the Security and Product/Project Management Teams, these PoVs could be done serially or in parallel. Parallel PoVs have the advantage of side-by-side comparison, but they also increase complexity and thrash in the project, which could lead to poorer decisions.
Initial Production
  • This phase requires time commitment from the Security Team, the initial consumers, and potentially procurement to secure any outside software that will be used. This phase is the inflection point where concepts become reality, so all resources directly engaged in delivery should have little else on their dockets. This phase is also where software costs become a reality if a commercial solution is chosen as the base.
Iteration
  • This phase requires time commitment from the Security Team, the onboarding consumers, and the existing consumers (as consultants to the new). Onboarding should be more efficient than the previous phase and become even more efficient with each additional consumer onboarded. If basing on a commercial SaaS, software costs will increase as consumers are added.
Codification / Evangelism
  • This phase requires time commitment from the Security Team, Compliance, Training, and Tech Writing as this is where the point-in-time state is transformed into business as usual. As this phase involves more integration with existing process and procedure than project-specific work, I would expect the resources working in the phase to be multi-tasking more than those in other phases.

Bibliography

…aka: stuff other than the inline links that I watched and read while putting together this post