Two lessons the Infrastructure team stole from Ops, Dev and other departments at STYLIGHT

Posts by SysAdmins are usually rants, or they patronize you with someone’s philosophy. This blog post won’t be either. It definitely could be both, but I will instead admit to shamelessly stealing from my colleagues. In the end, we use some pretty nice technologies and I’d like to give you a glimpse of what we’re doing at STYLIGHT Infra. These are two of the lessons I learned (stole and declared my own) from my peers.

Lesson 1: Configuration management is not exclusively for servers.

In Ops, our colleagues have used Puppet for quite a while. They still use it, like it and will promote it if asked. However, as powerful as Puppet is, it has an equally steep learning curve, and it took some time until support for Windows arrived. Taking a closer look at the competition, Chef might have set the bar for integrating Windows by collaborating with Microsoft on their Desired State Configuration (DSC) system for PowerShell, but it still has to be considered a static Configuration Management System.

Being more recent players in the game, Ansible and Salt created some buzz as they took what was good from Puppet/Chef and extended it with a Remote Execution Engine. In a scenario that requires not only reliable configurations but also ad hoc troubleshooting and fast access to remote machines, this comes in more than handy, and it seemed perfectly suited for our environment right away. For reasons I can’t fully remember anymore, I ignored Ansible entirely at the time, spun up a Salt-Master and pushed the first Salt-Minions to my testing OU (GPO + script = woohoo). After the promising testing phase, I pushed it to all clients in the Active Directory. Even our (ever increasing) fleet of Macs is now made up of Minions.

Right now Salt serves us in two ways:

  1. Leveraging Salt States, we push custom applications to people and apply base sets of software to certain departments on login. We also set up a couple of neat helpers such as an ELK stack or prototypes of potential tools.
  2. The real juice, however, lies in the Remote Execution Engine. Troubleshooting in a Windows world usually involves the GUI, and while it is nice to stay in touch with your colleagues by simply swinging by their desks, sometimes you just don’t want to or can’t afford the time. PowerShell is neat for automating repetitive troubleshooting, and the Salt-Master serves as the central storage for the scripts. When necessary, we simply use Salt to run scripts on client machines to fix problems of all sorts: installing fonts (and every Windows admin can tell you about the pain of doing so remotely!), or installing/uninstalling Office or other applications. Even fine-grained configuration changes can be achieved either out of the box using one of the (great) modules of Salt or by writing a script in PowerShell.
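To give a rough idea of the second point, here is a minimal sketch of how such a call could be put together. It only assembles the `salt` command line and does not execute anything. The Minion glob `marketing-*` and the script path `salt://scripts/install_fonts.ps1` are made-up placeholders, not our actual setup; `cmd.script` is the Salt execution function that fetches a script from the Master and runs it on the Minion.

```python
import shlex


def build_salt_script_command(target, script_url, shell="powershell"):
    """Assemble a `salt` CLI call that runs a Master-hosted script on Minions.

    target     -- Minion ID glob (hypothetical example: "marketing-*")
    script_url -- salt:// URL of the script on the Master (hypothetical)
    shell      -- shell the Minion should use to run the script
    """
    if not script_url.startswith("salt://"):
        raise ValueError("script must be served by the Salt-Master (salt:// URL)")
    return ["salt", target, "cmd.script", script_url, f"shell={shell}"]


# Hypothetical example: install fonts on all marketing machines.
command = build_salt_script_command(
    "marketing-*", "salt://scripts/install_fonts.ps1"
)
print(shlex.join(command))
```

Keeping the scripts on the Master and passing only a `salt://` URL means every Minion runs the same, centrally maintained version of a fix.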

Admittedly, we do not yet use Salt as extensively as we could. The possibilities are there, and we are looking to migrate step by step from GPOs to Salt States where possible.

Lesson 2: Agile, SCRUM and Kanban preserve your sanity.

By their origins, Agile principles and SCRUM are nothing that SysAdmins, especially those working on help desk or infrastructure tasks, would naturally see as tailored for their daily work. You would see the developers at that fancy startup across the street sticking Post-it after Post-it to their walls and windows and keep wondering what the hell they are doing.

Well, not for us. The nature of the job dictates the distinction between projects and help desk. This is fairly easy: project-related work screams to be tracked and managed using SCRUM (our awesome analog corner board is pictured below), whereas help desk work begs to be handled in a (digital) Kanban board.

awesome analog board

For both we use JIRA. Emails from our colleagues requesting help are converted automatically to tickets in our help desk Kanban, and all projects are handled using some parts of the SCRUM tool palette. To be honest with you, our Agile coach has told me multiple times already that we created some mutated and only remotely related offspring of SCRUM, but it still works for us and has had a couple of positive effects:

  • The split solves the problem of justifying your attention. Should you handle help desk tickets first? Or is project work more important? Help desk tickets can now be tracked against an SLA and your project work is prioritized, so it should always be clear which fire to extinguish first.
  • Transparency and awe for users. Walk-ins are often amazed by the projects currently displayed on our physical board and show appreciation.
  • Staying in sync with the business. A sprint planning every two weeks helps the product owner and other stakeholders to get their projects reprioritized when needed and still allows the bigger picture to be kept in mind.
  • Getting rid of hazardous processes. Even though the help desk runs as Kanban, we use the periodic retrospectives to reflect on the last two weeks and to weed out processes that get in our way.

As a final note, I want to put a strong emphasis on something that a lot of people working in the industry simply forget. As a SysAdmin, you will find your colleagues to be a constant source of work. They might not understand networks (at all), probably bitch about the WiFi and crash your file server. However, don’t doubt their competence in their own field of expertise: learn best practices and processes from your surrounding departments and evaluate them. Chances are the silver bullet is still for you to forge, but they might give you all the resources you will need.

Cross-functionality is a function over time

In my last blog post I described why we formed cross-functional business teams. In this blog post I am writing about team composition, how it changes over time and what the consequences of that are.

When we talk about the composition of cross-functional teams we usually have something like this in mind. The labels usually read developer, tester, designer and UX researcher or something along those lines. For the sake of this article we abstract from the specific role and just call them red, orange, yellow and brown experts.

constant expertise

This visualisation of the team composition ignores the fact that the amount of expertise needed to build a product changes over time. In reality it looks more like this.

needed expertise varies

During product development there might be a phase where there is a lot of yellow work needed (Feb) while sometime later there is almost none (Apr) and then it’s picking up again. 

There is a certain threshold up to which it makes sense to have someone with a specific expertise full-time on the team. If the needed expertise drops below that threshold, that expert won’t be fully utilised. That is OK if it’s just a dip, but it will get boring and frustrating if it persists.

threshold

Looking at this, one might argue that the yellow expert should leave the team by mid-February and just be available to the team as needed. The orange expert joins the team around that time. The brown expert would leave sometime later, around mid-March. This makes it effectively impossible to form a stable team that has the chance to gel and perform at its peak effectiveness.

Having T-shaped people on the team helps with this, since they can help out in disciplines other than their own. This lowers the threshold in our graphical visualisation.

T-shaped: T-shaped people have two kinds of characteristics, hence the use of the letter “T” to describe them. The vertical stroke of the “T” is a depth of skill that allows them to contribute to the creative process. That can be from any number of different fields: an industrial designer, an architect, a social scientist, a business specialist or a mechanical engineer. The horizontal stroke of the “T” is the disposition for collaboration across disciplines.
IDEO CEO Tim Brown 

lower threshold

Now it makes sense to keep the yellow and brown experts for longer and bring on the orange one sooner.

Having M-shaped people on the team helps even more since they combine two or more needed disciplines. This makes it easier to stay above the threshold.

M-shaped: Building on top of the metaphor of T-shaped persons, M-shaped persons have expertise in two or more fields.

So, if our yellow expert was also an expert in the brown discipline, she would combine the areas below both of these lines resulting in the green line.

combined disciplines

Now, leaving the team because of under-utilisation is out of the picture.
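The reasoning behind these charts can be played through with a few numbers. The following is only a sketch: the monthly figures (needed expertise as a percentage of one full-time person) and both threshold values are invented for illustration and are not read off the charts above.

```python
# Needed expertise per month, as a percentage of one full-time person.
# All numbers are invented for illustration only.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May"]
NEEDED = {
    "yellow": [85, 90, 30, 10, 40],
    "brown": [10, 10, 55, 75, 50],
}
THRESHOLD = 80  # assumed: below this, a full-time seat stops making sense
LOWER_THRESHOLD = 35  # assumed: threshold once T-shaped team mates help out


def months_below(demand, threshold):
    """Return the months in which the demand falls under the threshold."""
    return [month for month, need in zip(MONTHS, demand) if need < threshold]


def combine(*demands):
    """Add up the demand curves of all disciplines one person covers."""
    return [sum(values) for values in zip(*demands)]


# An M-shaped expert covering yellow and brown: the "green line".
green = combine(NEEDED["yellow"], NEEDED["brown"])

print("yellow under-utilised:", months_below(NEEDED["yellow"], THRESHOLD))
print("with lower threshold: ", months_below(NEEDED["yellow"], LOWER_THRESHOLD))
print("green under-utilised: ", months_below(green, THRESHOLD))
```

With these made-up numbers, lowering the threshold shortens the period in which the yellow expert is under-utilised, and the combined green curve never falls below the original threshold at all.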

Apart from having people on the team that are valuable in more than one discipline there are of course other options to deal with slack than leaving the team. How about some Kaizen? Helping someone else working on a stuck task, going to that conference, reading that book, finally doing that refactoring or writing that blog post are just a few of them.

Specialist teams are another option for experts who have an effect here and there, but are not constantly needed on a team. In order not to create dependencies and thereby cripple the autonomy of teams, these specialist teams should be enablers and teachers helping teams. This means ownership stays with the teams, not with the specialists. At STYLIGHT we have, for instance, a platform team.

Conclusion

Forming stable cross-functional teams in the face of changing needs of expertise over time is not trivial. Being aware of this and having strategies for dealing with slack in a specific discipline (T-shaped people, Kaizen) still makes it a viable strategy though. For us the advantages of cross-functional teams outweigh these difficulties.

The 3 Commandments of UX Research

Earlier this week, we decided to remove our UX researcher from the Scrum teams she was in. The primary reason was that the main person in charge of research (me!) is working with three teams: the Magazine, the Shopping, and the Mobile team. Joining the Scrum meetings of all teams is time-consuming and unnecessary, unless we are working directly together on a feature.
This will also free up time for improving the quality of the research. Which brings me to the point of this post: Having time to reflect on my experience in research so far, I’ve come up with 3 commandments that we will apply to UX research from now on:

1- We don’t do unplanned research, and we don’t ask unstudied questions. Research is there to help us inform design decisions (example: Does the text need to be highlighted? Should there be a time stamp?). Therefore, we need to plan the research at least 2 days ahead of time, to design better questions and recruit participants.

2- Research should not answer “Like/hate” questions. People constantly engage in things they claim to hate. Instead, research should focus on gathering useful insights.

3- The researcher does not work in isolation. Research should be conducted in tandem with a designer, developer, or product owner from the team, for two reasons. A) The interview partner allows the researcher to focus on what the user is saying (or not saying) and to let the conversation flow naturally, rather than running through a list of questions, half-writing, half-listening. Without that focus, the researcher is likely to miss out on some really valuable stuff. B) People who have a hand in collecting the insights will look for opportunities to apply them.

We have a responsibility to the people we’re designing for, and this starts with asking better questions. Then listening, really listening, to those who give us their time and feedback.

Why we formed cross-functional business teams

swiss-army-knife

Late last year we decided to divide our three development teams and have them join the six business teams. Previously we had a hard time coming up with meaningful OKRs for the development teams. We figured only goals with business outcomes would make sense. Having the engineers work together with business experts in the same teams seemed like the logical consequence.

The previous setup made it hard to prioritize the requests to the development teams from the different business teams. This got frustrating for pretty much everyone involved, especially those whose requests didn’t make the top of the backlog. Today this issue has all but disappeared.

We also hoped this would make the business teams more independent and thus allow them to move faster while at the same time allowing the engineers to focus on one area of our business model. This feels a whole lot more like the spirit from the founding days.

The former teams included developers (backend, frontend, iOS, Android), designers and UX experts. One might call those cross-functional development teams in themselves. Including business experts seemed again like the logical next step, effectively turning them into cross-functional business teams. From my experience in working with teams in several companies, a common progression for cross-functionality tends to be: developers + testers + design + UX experts + business experts, in that order.

Having gained some experience with the new setup, we realised that the choice between specialist teams and cross-functional teams is not black or white, but rather a tradeoff — as per usual.

Pro cross-functional business teams

  • no handovers
  • faster learning about business
  • more innovation
  • shorter development cycles
  • broadened perspectives through diversity of experiences, expertise and knowledge
  • greater sense of purpose by working on the full value stream (or at least a greater part of it)

Pro specialist teams (aka silos)

  • get work done more efficiently when it can be described precisely and handovers are cheap
  • learning from specialists in same field
  • higher consistency of outcomes within silos
  • easier agreement with people that speak the same lingo

How to remedy the shortcomings of cross-functional feature teams

  • use communities of practice (CoP) for knowledge sharing amongst specialists
  • express yourself in the lingo of the addressed person when talking to a specialist in another field
  • get to a novice level of understanding in the specialist fields of your team mates (“become” T-shaped)

Cross-functional teams rock!
Christina, Online Marketing Manager SEA

By now, nobody is questioning the general setup of having dedicated developers working with business teams anymore. What we have realized, though, is that the set of skills needed within a team is not constant but varies. The need for a UX expert is much higher at the beginning of an epic (a bunch of coherent user stories) than it is towards the end; therefore, the concrete team composition is a choice we have to keep making. As we believe that permanent teams outperform short-lived ones, we are trying to make as few changes in team membership as possible. Having T- or M-shaped people helps with that. But more on that in another blog post.

Image by James Case

Enable Your Teams to Rapidly Ship and Operate Quality Software

How often do your development teams release to production? Who gets the alert in the middle of the night when everything crashes and burns? Do these questions make you uncomfortable, or rather their answers? Or maybe you are already discussing changes to your current deploy process? Because it sucks, right? If you’re honest, it will always suck because it constantly needs to be adapted to the current business requirements.

Enter the “Platform Team”: a group of build & deploy experts that jumpstart your teams down the road to operational success while providing a safety net. And, no, I’m not referring to a System Administrator with a pager. Instead, I’m suggesting a three-ply construction of automation, containerization and monitoring.
Continue reading

Supporting Millions of Rewrites in Nginx with Lua and Redis

About a year ago, I was tasked with greatly expanding our URL rewrite capabilities. Our file-based nginx rewrites were becoming a performance bottleneck and we needed to make an architectural leap that would take us to the next level of SEO wizardry.

In comparison to the total number of product categories in our database, Stylight supports a handful of “pretty URLs” – those understandable by a human being. Take http://www.stylight.com/Sandals/Women/ – pretty obvious what’s going to be on that page, right?
Continue reading

Ideation Camp @ STYLIGHT

Three months ago, we had our first ShipIt day in the Product and Engineering departments of STYLIGHT. Employees had the chance to work on anything that relates to our products, and deliver it during ShipIt Day, our 24-hour hackathon. We created teams, everyone felt energized and motivated, and the results were pretty fascinating.

That sparked our interest in posing a similar challenge for the entire company, in order to create an open space to bring ideas, criticism, and feedback. In the first Ideation Camp at STYLIGHT, we followed the approach of Design Thinking to leverage the multi-disciplinary, international staff we have and to show them methods to shape their ideas.

Design Thinking is a methodology that helps you solve problems like a designer. It emphasizes empathy building, problem framing, ideation, and validating potential solutions.

chart

 

Continue reading

Why work at STYLIGHT

The other day I set out to revamp our job ads for engineers. One part of that is to inform possible candidates about STYLIGHT and make it appealing to them. I wanted to provide arguments for “why work at STYLIGHT?”. Being a co-founder, I might have different answers than my fellow STYLIGHTers. I believe authentic reasons are the strongest, so I just went ahead and asked them (anonymously). Here’s what they responded (I grouped them and fixed typos):

Continue reading

The blog about the technical challenges and solutions of STYLIGHT