I am still excited about moving to a distributed source control management system. I think a lot of benefits would come along with it; however, there are also a number of obvious costs that I haven't been paying attention to.
Our current SCM, Perforce, is used to store both our code and our content. My original goal was to have a single SCM system for both. By keeping code and content in the same system, I thought, we could more easily manage releasing code and content together. However, our content publishing system allows us to abstract the deployment of content to a certain extent using publishing "packages" (i.e. p4 labels), which should allow us to run separate systems and still meet our release management requirements. So, while I would still prefer to have one system to manage, it is not a hard requirement.
My biggest concern in moving is that we have a large number of developers and, because our content management system uses Perforce as its data store, a large number of non-developer users. It will take time and training to move the hundred or so people to a new SCM. The money spent moving people needs to be weighed against the efficiencies gained via Git (or Mercurial). Some of these costs can be reduced by preserving Perforce as our content repository and moving only our code to Git (or Mercurial). At the end of the day, moving to a new system needs to provide more benefit than the cost and time the move consumes. Right now I am becoming skeptical that this is the case, and I am beginning to think that there are more pressing problems for us to tackle, such as continuous deployments.
This weekend I picked up a Jon Wegener Alaia. Riding a finless board is very difficult and requires relearning how to surf. However, it is teaching me a lot about how my "regular" boards work. It could be that I just want change for the sake of change, but I feel that trying different boards helps me become a better surfer.
Software is the same way. I've spent the last decade writing Java applications that use a relational database to store data. In the last 18 months or so I have been leading a change throughout the development organization to dismantle our RDBMS infrastructure and move our production systems towards technologies such as Solr and Coherence. Additionally, I've started to use Python to test MapReduce jobs, and we moved our build system from Ant to Maven. We've had a lot of change over the last year and we have more change coming. My hope is that the change forces us to rethink our assumptions and leads us to create faster, more resilient software.
One thing I have learned from my new board and from the changes we have instituted over the last year: you have to be comfortable wiping out. The trick is to make sure you don't give up and to realize that things will get better. Remember, change requires practice and, most importantly, have fun wiping out.
I've been reading a very good book from our chairman's reading list called Open Leadership by Charlene Li. The book has got me thinking about being a manager in technology. Many of us have risen through the ranks due to our technical expertise and our ability to design and implement solutions for technical and business problems. As we become managers we are constantly faced with letting go and having others do the work we prided ourselves on being so good at. I struggled for a long time with letting go of coding; now my struggle is letting go of designing the solutions.
Open Leadership is really driving home the point that I need to let go of the solution. As a technology leader I need to ensure that my team is focused on the right problems and empower them to come up with solutions. Part of empowering them is allowing them to fail safely. As someone who has prided themselves on delivery, it is scary to step back and give up the reins. The irony is that my hold on the reins is purely imaginary. Once I stopped coding every day (I still write some code), I had already given over the reins to developers more talented and capable than I.
I've been struggling for a long time trying to figure out how to get my team to really embrace owning quality. It is almost as if, as long as there is a QA engineer on the team, the developers rely on the QA engineer to find defects and enforce quality. While that is ostensibly the job of the QA engineer, I'd like our developers to ensure that their code is defect free before handing it off to the QA engineer. Often QA finds simple bugs that lead me to believe that the code was never really tested in any rigorous way prior to the developer being "done." After reading some of the Kaching team's blog posts on continuous deployments, I started thinking that perhaps their approach really focuses the technical leads and developers on the quality of their work. If you know that within a few minutes your committed code will be live, you will probably spend more time ensuring it is defect free. If you don't, it will become readily apparent, and after a few late-night incidents you will.
What I need to figure out is what the code management strategy is for continuous deployments, what sort of tools we'll need, and how Git fits in.
There has been a lot of talk around the office the last few weeks about source control management. Since I have been at Edmunds we have always used Perforce as our version control system. It seems to work for most users, and we have even used it as the storage engine for our content management system.
However, Perforce does have its problems. The one I find most annoying is that the state of my local repository is stored on the server as part of my workspace specification. This "feature" has caused us to implement a number of hacks within our CMS to support our centralized publishing service, which can publish content from any branch or any label. The way workspaces keep track of versions means we do a lot of work to allow the central service to meet our needs. Our first implementation used Subversion, which has its own set of problems. One thing we did like, however, was that we did not need to sync to disk for anything: we could request a specific version of a file from the Subversion server directly and send it out over our publishing bus without ever hitting disk.
Along comes Git. Several of our developers have been using Git on their own, with git-p4 keeping their repositories in sync with Perforce. There are several features of Git I find interesting. The primary feature I like is the ability to have multiple repositories. We've struggled for a long time trying to figure out how to implement an open-source-style model whereby developers become the curators of code bases. The goal is to ensure that an API stays true to its intentions and does not accumulate too much cruft. We've never come up with a good way of ensuring that the lead for an API is informed of changes, and Git's repository model seems like it would be a good fit. A lead for an API would have the authoritative repository for that API, and only that repository could push changes to the central build repository.
Moving a large development group to a new source control system is a lot of work. Training and tooling need to be created, and our code and build systems have to be migrated. I'm thinking that we could use git-p4 to have a single team test it. We're starting a new greenfield project in the next few weeks; I just need to convince the leads that this is a good additional risk to take on. I also have to research the Git Java APIs to ensure that the functionality we need for our CMS is supported, as I'd like to keep one source control system in use.
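To make the CMS requirement concrete: what we liked about Subversion was reading one version of one file straight from the repository, with no workspace synced to disk. Here is a minimal sketch of that idea, in Python rather than Java to keep it short. It shells out to the standard `svn cat -r` and `git show <revision>:<path>` commands; the function names, the URL, and the repository path are all hypothetical, and this is not how our publishing service is actually written.

```python
import subprocess
from typing import List


def svn_cat_command(url: str, revision: int) -> List[str]:
    """Build the `svn cat` call that prints one file at one revision.

    `url` is a hypothetical repository URL for the file.
    """
    return ["svn", "cat", "-r", str(revision), url]


def git_show_command(repo_dir: str, revision: str, path: str) -> List[str]:
    """Build the `git show` call that prints one file as of a commit, tag, or branch.

    `repo_dir` is a hypothetical local (possibly bare) repository;
    `revision:path` is Git's syntax for "this file as of that revision".
    """
    return ["git", "-C", repo_dir, "show", f"{revision}:{path}"]


def read_version(command: List[str]) -> bytes:
    """Run the command and return the file content from stdout, held in memory.

    Raises subprocess.CalledProcessError if the tool exits with an error.
    """
    result = subprocess.run(command, check=True, capture_output=True)
    return result.stdout
```

Something like `read_version(git_show_command("/repos/content.git", "release-label", "articles/review.xml"))` would hand back the bytes to put on the publishing bus, with a tag standing in for a p4 label. Whether the Git Java APIs offer an equivalent that fits our CMS is exactly what I still need to research.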
This is the first post to my new technology-focused blog. I am not a newcomer to blogging; however, I have never sat down to compose my thoughts on building software in such a public way. I am hoping that this blog will become a platform where I can share my current musings and conundrums in designing and building a large public web site. I've been at Edmunds.com for almost six years now, and while this blog is not directly work related and is not endorsed by Edmunds, there are a lot of problems and topics we discuss regularly here that will influence my blog posts.
Santa Monica, CA