This post is from the CollabNet VersionOne blog and has not been updated since the original publish date.
Migrating from Subversion to Git: What Your PCI-DSS Guy Will Not Tell You, Part 1
It is the time of the year when consumers begin to open their wallets. The retail and credit card processing industries are busy preparing their systems to handle the projected volume of credit card transactions from Black Friday through the New Year. It is no wonder IT folks are now revisiting their compliance with the infamous PCI-DSS (Payment Card Industry Data Security Standard) to ensure that their companies develop and store code dealing with credit cards in a secure way. If you are not familiar with PCI-DSS, in a nutshell, it is a rigorous set of standards for protecting cardholder data anywhere it is transmitted, processed, or stored. PCI-DSS is enforced by the credit card brands and banks, and violations can bring stiff fines and strict consequences, including the suspension of card processing privileges. Any business that accepts credit cards and/or processes card data must validate its compliance with PCI-DSS through a yearly assessment. The standard has been evolving for a while, has real teeth, and is now in its 3rd revision.
So you may be wondering at this point how exactly PCI-DSS relates to source code repositories. Is it even remotely relevant to your business? As a developer or engineering manager, why should you care? Last week I received a PCI-related question from an old friend of mine who develops a credit card processing service for retail clients and large banks. After we talked, it occurred to me how quickly PCI compliance stuff can confuse even the smartest of engineers, so I decided to share this story in a blog.
The code my friend is working on processes credit cards. The service itself is hosted by his company and accessed by clients remotely. It does not store any credit card data, and the company uses SVN for version control. The repository is properly secured, and only the developers who need checkout/commit access have it. They also allow contractors to access their repositories over VPN, with the repository being PCI-DSS 2.0 certified.
All was under control until they hired a new engineering VP. He commanded his team to move from their beloved SVN to one of the popular open source Git implementations in order to accelerate development cycles and meet the feature turnaround demands of the customers. The security team at the company expressed reservations about using a distributed SCM tool due to the lack of access control on remote clones. Specifically, they claimed this would violate PCI-DSS compliance. My friend was wondering if it was possible at all to move to Git without breaking the PCI-DSS regulations. Since the PCI-DSS standard doesn’t really tell you how it maps to a specific technology, he rolled up his sleeves and started digging in depth into Git security, or rather the lack thereof.
His research found that the Git implementation they wanted to use did not offer any access control options other than basic OS-level file access. It also didn’t protect code history in the way he was accustomed to. Now he was concerned that his contractors could potentially destroy code in the central repository. Preparing for the worst, he decided to see if he could implement his own security controls at the OS level to ensure the data was protected.
Before my friend knew it, he found himself on a quest for an advanced role-based access control model for code repositories on Linux. Ouch!
Let’s leave my friend in his pain for a minute and take a look at the PCI-DSS 3.0 requirements to understand the problem better. For the purpose of this blog, I would like to highlight a couple of requirements that are of particular interest.
Requirement 8: Identify and authenticate access to system components
“Assigning a unique identification (ID) to each person with access ensures that each individual is uniquely accountable for their actions. When such accountability is in place, actions taken on critical data and systems are performed by, and can be traced to, known and authorized users and processes.” In our particular case, the code repository falls into that “critical data and systems” category because it handles credit card transactions for clients. Note: according to the standard, “this includes accounts used by vendors and other third parties (for example, for support or maintenance)”. This means remote contractors, as well as your own developers.
Requirement 10: Track and monitor all access to network resources and cardholder data
“Logging mechanisms and the ability to track user activities are critical in preventing, detecting, or minimizing the impact of a data compromise. The presence of logs in all environments allows thorough tracking, alerting, and analysis when something does go wrong. Determining the cause of a compromise is very difficult, if not impossible, without system activity logs.” At a more practical level, it means that if 1) one of your developers accidentally introduces a security hole, and 2) that code somehow gets pushed into the master branch and into production, you are out of compliance if you don’t know who this guy (or gal) actually was.
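To make the “who was it” question concrete, here is a minimal Python sketch of a server-side push log. It assumes the server already knows the authenticated user ID for each push; the function names and the record fields (`user_id`, `ref`, `old_sha`, `new_sha`) are hypothetical and not taken from the standard or from any particular Git product.

```python
import json
from datetime import datetime, timezone


def push_audit_record(user_id, ref, old_sha, new_sha, when=None):
    """Build one JSON audit-log line describing a single push."""
    when = when or datetime.now(timezone.utc)
    record = {
        "timestamp": when.isoformat(),
        "user_id": user_id,   # authenticated identity, not the Git committer name
        "ref": ref,           # e.g. refs/heads/master
        "old_sha": old_sha,   # tip of the ref before the push
        "new_sha": new_sha,   # tip of the ref after the push
    }
    return json.dumps(record, sort_keys=True)


def who_pushed(log_lines, sha):
    """Return the user IDs recorded as having pushed the given tip commit."""
    records = [json.loads(line) for line in log_lines]
    return [r["user_id"] for r in records if r["new_sha"] == sha]
```

With records like these appended to a log on every push, finding out who moved a branch to a given commit becomes a simple lookup rather than guesswork.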
I hope these two samples help to illustrate that the applicability of PCI requirements to Git is really all about having the right controls in place, rather than mandating one SCM technology over another. If you meet the requirements, you are covered. The caveat, however, is that neither the security nor the compliance experts at your company can translate the requirements into the Git policies that you implement. It is on you. Get a grip on Git. Understand the risks, research your options for meeting the standard, talk to an SCM vendor or two, and come up with a compelling PCI proposal.
When developing your Git strategy, usually the best practice is to think about three types of controls for your code repositories to make both PCI and security guys happy no matter what:
- Role-based access: Clearly defined segregation of duties between test, development, and production systems
- User activity tracking: Ability to detect code changes, where they originated, when, and by whom
- Enforceable workflow: Having change control gates when pushing to live systems
As long as you have these controls in place and documented, nothing can stop you from using the kind of repository management you want. This is really important to understand as you make your decision.
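As a rough illustration of the first and third controls, the Python sketch below combines a role table with a change-control gate on the production branch. The role names, ref patterns, and the `change_approved` flag are all assumptions made for this example; a real setup would take them from your own documented policy.

```python
from fnmatch import fnmatchcase

# Hypothetical role table: which roles may push to which ref patterns.
ROLE_PUSH_RULES = {
    "developer": ["refs/heads/feature/*"],
    "release_manager": ["refs/heads/feature/*", "refs/heads/master"],
}

# Refs that feed live systems and therefore need an approved change.
GATED_REFS = {"refs/heads/master"}


def may_push(role, ref, change_approved=False):
    """Return True if this role may push to this ref under the policy."""
    patterns = ROLE_PUSH_RULES.get(role, [])  # unknown roles get no access
    if not any(fnmatchcase(ref, pattern) for pattern in patterns):
        return False
    if ref in GATED_REFS and not change_approved:
        return False  # workflow gate: no approval, no push to live branch
    return True
```

The point of keeping the policy in one small table is that it can be reviewed and signed off by the security and compliance teams without anyone having to read hook scripts.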
I suggested that my friend, as a first step, document and clarify the requirements for his “SVN to Git migration” project and discuss them with the security and compliance teams. This may seem like overhead, but believe me, if you are in a highly regulated industry, it is worth it. Compliance and security get to call the shots on the Git policies, even though they don’t understand Git. Confusing? Sorry to break the news, but you need to work through this with them. And it applies to any SCM, not just Git.
Here is the kind of stuff you need to talk to your security guys about.
Policy on Read Permission
Control over what a developer can read is limited with SVN, just as with any DVCS. Even with a central SVN server, there is typically no restriction on reading old revisions of paths you have permission to read. So you could dump the old history revision by revision into a local store. There is nothing magical in SVN that prevents someone from downloading every single revision and distributing the copies.
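To see how little stands in the way, consider this small Python sketch. It only builds the argument lists for the standard `svn export -r REV URL PATH` command, one per revision; the helper name, repository URL, and destination directory are made up for illustration, and the sketch assumes the path exists in every revision it asks for.

```python
def export_commands(repo_url, latest_revision, dest_root):
    """Return one `svn export` argument list per revision, 1..latest_revision."""
    return [
        ["svn", "export", "-r", str(rev), repo_url, f"{dest_root}/r{rev}"]
        for rev in range(1, latest_revision + 1)
    ]


# Example: commands that would copy revisions 1 to 3 into local folders.
commands = export_commands("https://svn.example.com/repo/trunk", 3, "/tmp/history")
```

Anyone with read access and a few lines of script ends up with the same full history that a Git clone would give them.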
The bottom line: if you can’t restrict access to the remote copies at the user side, you have no read restrictions at all.
Policy on Write Permission
This is a different game altogether, since Git allows you to set any name you want as the code committer. The name comes from client-side configuration that each developer controls, so you need to add some controls at the server level to protect your code from false-flag revisions. This will stop developers from committing under the name of a fellow developer and then pushing the revisions to the server. While Subversion sets the username by itself on the server, a Git server must be instrumented to keep track of the user names for all incoming changesets. There are many ways to do it, but you really need to understand all the scenarios well and have your requirements written and reviewed by others. Essentially, the lack of an enforceable policy can cause you to lose a trustworthy history of who submitted what. And you really need that history to deal with Requirement 10 of PCI.
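One possible shape for such a server-side control is a check run from a Git `pre-receive` hook, which receives one `<old-sha> <new-sha> <ref-name>` line per updated ref on standard input and rejects the push when it exits with a non-zero status. The Python sketch below shows only the comparison logic. It assumes the hook has already listed the incoming commits with something like `git log --format='%H %ce' <old-sha>..<new-sha>` and that the server can supply the authenticated pusher's email; how that identity is obtained depends on your server setup, and both function names are hypothetical.

```python
def parse_commit_lines(log_output):
    """Parse lines of '<sha> <committer email>' into (sha, email) pairs."""
    pairs = []
    for line in log_output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        sha, _, email = line.partition(" ")
        pairs.append((sha, email))
    return pairs


def false_flag_commits(commits, pusher_email):
    """Return the SHAs whose committer email is not the authenticated pusher."""
    expected = pusher_email.strip().lower()
    return [sha for sha, email in commits if email.strip().lower() != expected]
```

If `false_flag_commits` returns anything, the hook would print the offending commits and exit non-zero. Whether you also want to allow legitimate cases, such as pushing a merge that contains a colleague's commits, is exactly the kind of scenario that belongs in your written requirements.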
By the way, there is still a way to change the committer name in Subversion too, but your developers have to be real hackers and know how to enable the pre-revprop-change hook on the server to make it happen.
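For reference, Subversion calls that hook with the repository path, revision, user, property name, and action as arguments, and refuses the property change unless the hook exists and exits with status 0. A minimal Python sketch of a conservative policy, assuming you want to allow edits to existing log messages and nothing else, might look like this (the function names are hypothetical):

```python
def allow_revprop_change(user, propname, action):
    """Allow only modification ('M') of svn:log; reject everything else."""
    # svn:author is deliberately not allowed, so the recorded committer
    # cannot be rewritten after the fact. `user` is kept for logging.
    return propname == "svn:log" and action == "M"


def main(argv):
    """Return the hook exit status for arguments REPOS REV USER PROPNAME ACTION."""
    _repos, _rev, user, propname, action = argv[1:6]
    return 0 if allow_revprop_change(user, propname, action) else 1
```

Installed as the repository's hook script, this would be wired up by passing `sys.argv` to `main` and exiting with its return value.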
After you get on the same page with your security and compliance peers on the policies regarding access permissions, you can take the next step. I will discuss it next week. In the meantime, if you want to know how the story unfolds at my friend’s company, or want to hear more about PCI-DSS implications for your Git strategy, just give me a shout. If you post your questions in the comments, I will address them in my next blog!