Docker and CoreOS Meetups

As early adopters of Docker, and since Johann – who organizes the Docker and CoreOS meetups in Munich – recently joined the team as our Engineering Evangelist, we felt that we could do more for the community. As such, we’re happy to announce that we’ll be hosting the forthcoming container events in Munich.

Our next event will feature Jonathan Boulle from CoreOS, who will be speaking on August 26th about “Designing an open standard for running applications in containers”. RSVP

Here is the abstract for Jon’s talk:

2015 has been declared by some to be the “year of the production-ready container.” The open container specifications are an attempt to write down an interoperable set of specifications for use by different application container runtimes; on Linux this includes Docker, kurma, and rkt.

Be sure to check out our event page and subscribe to our newsletter to keep track of all the events happening at STYLIGHT.

We will host a Docker meetup in September as well, and we’re actively looking for speakers. Feel free to ping johann.romefort (at) stylight.com for talk proposals!

Here is the video from the last Docker meetup at STYLIGHT, featuring Dominic Philipps talking about how we deployed containers in production.

Have you heard about LEGO® SERIOUS PLAY®?

At STYLIGHT we like to experiment and play at the same time. One of our Agile Coaches has a very interesting tool in his toolbox called LEGO® SERIOUS PLAY®, and we decided to ask him more about what it is and how the team benefits from it.

Some background info on LSP (LEGO® SERIOUS PLAY®).

from Wikipedia:

Lego Serious Play is a consultant service offered by a Lego Serious Play Certified Facilitator. Its goal is fostering creative thinking through team building metaphors of their organizational identities and experiences using Lego bricks. Participants work through imaginary scenarios using visual three-dimensional Lego constructions, hence the name “serious play”.

The method is described as “a passionate and practical process for building confidence, commitment and insight”. The approach is based on research which suggests that hands-on, “minds-on” learning produces a deeper, more meaningful understanding of the world and its possibilities. It is claimed that participants come away with skills to communicate more effectively, to engage their imaginations more readily, and to approach their work with increased confidence, commitment and insight.[2]

Leveraging concepts such as play, constructionism and imagination, this method allows people to interact with a visual model to analyse situations or problems and collaboratively find solutions.

If you’re interested in learning more, just leave a comment and Torsten will be in touch. As mentioned in the video as well, Torsten hosts the LEGO® SERIOUS PLAY® Meetup in Munich. You can RSVP here: RSVP

Time to get that old LEGO® box from your attic, and start exploring a new world of possibilities!

Engineering Diversity Meetup

When we asked Johann, our new Engineering Evangelist who started just a week ago, why he decided to work at STYLIGHT, one of the things he pointed out was how diverse the team is. A few facts about our diversity:

  • 50% of our team is from outside of Germany
  • We speak English as a default language.
  • We encourage minorities to jump on the tech bandwagon by, for example, sponsoring the last Rails Girls workshop and employing people based on their talent, regardless of their gender.

As a company embracing diversity, we immediately jumped on the #ILookLikeAnEngineer movement and interviewed our own female engineers to ask them about their take on diversity in technology, the stereotypes they face and how they react to them.

Today we’d like to push our commitment forward and take a leap towards encouraging tech diversity by organizing the first #ILookLikeAnEngineer community gathering in Europe. Ever been told that you don’t look like an engineer? Always felt that you don’t fit the standard engineer stereotype? Maybe you’re part of an underrepresented group in tech, or simply have ideas on how to tear apart the stereotypes plaguing the tech community? We want to hear your story!

We will get our photo booth ready to let you show the world what an engineer looks like!

Join us for an evening of drinks, brainstorming, networking and talks in the beautiful STYLIGHT Atrium. Everyone is welcome, Klingons, Vulcans and Vogons included.

RSVP

Join Us

10 things I learned as Rocket Internet CTO, by Christian Hardenberg (dahoam talk)

At the Daho.am conference last June, we had the honour of having Christian Hardenberg give a talk on the 10 things he learned as Rocket Internet CTO. The talk was so insightful that we decided to transcribe the best parts of Christian’s talk below:


1. Page speed matters:

Conversion rate is directly correlated to page speed.

Use not only PageSpeed but also New Relic for deeper insights. We aggregate the page speed metrics across all our companies (400+). Slow pages are sometimes caused by slow local internet (as in the Philippines).

  • Defer loading external scripts
  • Prioritize content above the fold
  • Split CSS in critical and non-critical part
  • Reduce DOM size
  • Page type specific JS/CSS
  • JS/CSS prefetching
  • Stream HTML header early

2. Test Driven Optimization

Optimisation is Dangerous.

1. Define your objectives
2. Build realistic test data sets
3. Run jMeter load tests

The enemy is generally not where you think it is: in one of our companies, Memcache was the culprit because of the large amount of data read and written on each request.
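To get a feel for what such a load test measures without setting up jMeter, here is a minimal sketch in Python. Everything here is illustrative: `fake_request` is a stand-in for a real HTTP call, and the latencies are simulated rather than measured against a live site.

```python
import time
import statistics
from concurrent.futures import ThreadPoolExecutor

def fake_request():
    """Stand-in for a real HTTP call (jMeter would hit the actual site)."""
    start = time.perf_counter()
    time.sleep(0.01)  # simulated server latency
    return time.perf_counter() - start

def load_test(n_users=20, requests_per_user=5):
    """Fire requests from concurrent 'users' and collect per-request latency."""
    with ThreadPoolExecutor(max_workers=n_users) as pool:
        futures = [pool.submit(fake_request)
                   for _ in range(n_users * requests_per_user)]
        latencies = [f.result() for f in futures]
    return {
        "requests": len(latencies),
        "median_s": statistics.median(latencies),
        "p95_s": sorted(latencies)[int(0.95 * len(latencies))],
    }

result = load_test()
print(result)
```

Watching how the median and 95th-percentile latency change as you raise `n_users` is exactly the kind of signal a real jMeter run gives you against realistic test data.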

3. Test Driven Security

Don’t fix it before you test it. Your worst enemy should be inside: we hired hackers and then had them show the team their findings. ‘Showing’ breaches is a good way to make engineers aware of the importance of security.

Security learnings:

  • DDoS Attacks: only solution is external providers (CloudFlare, MyraCloud, Neustar, Prolexic)
  • WordPress / Jenkins – Separate from Production environment because very vulnerable
  • Password: 2-factor authentication
  • Public Github Repos: Automated monitoring
  • Credentials storage in Puppet: hiera-eyaml encryption
  • Password hashes – Sanitized views on DB
  • Offboarding – Google Apps as Identity Provider

4. Hardware is stable, Networks are not.

How we improved stability:

1. Minimize time to recovery
2. Detect Network Bottlenecks with Testing
3. Detect Issues early with sensors
4. 24/7 Monitoring and health checks

5. Technologies that work for us

6. Monolith First vs Micro Services

Disadvantages of going with Microservices:

  • Overhead is significant (but decreasing)
  • Early on, the requirements are never clear; refactoring microservices is hard
  • Hard to do microservices with a KISS strategy

Alternative:

  • Split along very obvious boundaries with thin interfaces (TMS, Shop, Accounting)
  • Merge systems with non-obvious boundaries (Shop, Customer, Order) but avoid tight coupling
  • Start unbundling once your team grows beyond about 30 developers.

7. Timeboxes create Efficiency

Time-boxed Rocket Master Launch Process. Change scope rather than timeline. Better to keep the launch date fixed, as it creates pressure to get things done. We prefer SCRUM to Kanban because it’s time-boxed.

8. Efficiency is fun

What do we do to make things go so fast? We’re very productive and efficient:

Efficiency = Standardization + Inconsistency + Pragmatism + Learning

  • Standardization: Sky Rocket Tech Framework.
  • Eventual Inconsistency: same as with databases – if you want consistency, you have a lot of overhead. 90% works well; accept the 10% inconsistency and fix it along the way
  • Pragmatism: “Do we really need it?”: KISS + YAGNI, Bias for Action – Don’t ask just do. Combine with a culture of learning.
  • Correct Mistakes Fast: measure and benchmark everything. Accept new data and act. When we moved to AWS, it was much more expensive, crashed a lot, and was slower. We rolled back. After a while we re-evaluated, and we’re now on AWS for all new companies we launch.

9. Simple is beautiful

Leave the campground cleaner than you found it.
Whenever you touch code, clean it up a little.

  • Ongoing Refactoring
  • Pull Requests with Code Review
  • Pair Programming
  • Unit Testing
  • Scrutinizer
  • Quality Culture

Being productive is fun. Find technical debt and fix it so you don’t lose your good people.

10. The Mobile Moment

There are now more computers in people’s pockets than on desktops. We will talk more about this topic next year.


Watch Christian’s video:

Categorizing Products the STYLIGHT Way

Introduction

Machine learning is quite a hot topic nowadays, being used for a variety of purposes – from recommending products and movies to analysing business data. Here at STYLIGHT, we use machine learning for a completely different purpose.
Our partner shops send us thousands of products every day, and we are putting them into our own fashion categories ranging from very broad ones, e.g. clothing, to quite detailed ones, e.g. mini skirts. With almost 800 classes, this task can be very time-consuming when conducted manually; therefore, we have implemented a machine learning system which allows us to predict the fashion categories of the products in a fast, scalable and automated way.

System Overview

A classical machine learning architecture consists of the steps shown in Figure 1. For each block, an individually specialized algorithm can be chosen depending on which type of data is available (images, documents, numbers, etc.).

Machine learning @STYLIGHT

Figure 1: Classical machine learning architecture

Feature Extraction

This is the process of obtaining features which might help to classify a given product. For example, to predict the size of a T-shirt, one would take its name, brand, width, height, color and price. These attributes form a feature vector, as can be seen in Figure 2. However, some of this information might be completely unnecessary, and getting rid of it is part of the next step.

[name, brand, width, height, color, price]^T

Figure 2: Feature vector of a T-shirt
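As a rough illustration of how such a feature vector could be built, here is a sketch in Python. The attribute values and the index-based encodings below are invented for illustration and are not STYLIGHT’s actual pipeline:

```python
# A product's raw attributes become a numeric feature vector.
# All values below are made up for illustration.
product = {
    "name": "Basic Tee",
    "brand": "SomeBrand",
    "width": 48.0,   # cm
    "height": 68.0,  # cm
    "color": "blue",
    "price": 19.99,
}

# Categorical attributes need encoding before most classifiers can use them;
# a simple (if naive) approach maps each known value to an index.
COLORS = ["red", "green", "blue"]
BRANDS = ["SomeBrand", "OtherBrand"]

def to_feature_vector(p):
    # Free-text fields like the name need their own treatment
    # (e.g. bag-of-words) and are skipped in this sketch.
    return [
        float(BRANDS.index(p["brand"])),
        p["width"],
        p["height"],
        float(COLORS.index(p["color"])),
        p["price"],
    ]

print(to_feature_vector(product))  # [0.0, 48.0, 68.0, 2.0, 19.99]
```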

 

Feature Selection

As stated above, in this step only the important features are selected; features that carry no information are discarded. Imagine we want to predict the size of a T-shirt (S, M, L), and let us assume the size depends only on the width and length of the shirt. All additional information, like the price or brand, is then completely unnecessary. Fewer features usually lead to less noise, less memory usage and faster classification. However, the tricky part is finding these relevant features.
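A minimal sketch of this idea is a variance filter: a column whose value never changes across samples cannot help the classifier, so it can be dropped. The data below is invented (every shirt shares the same brand and price, so only width and height survive); real feature selection is considerably more sophisticated:

```python
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def select_features(vectors, names, min_variance=1e-9):
    """Keep only columns whose values actually vary across samples."""
    columns = list(zip(*vectors))
    keep = [i for i, col in enumerate(columns) if variance(col) > min_variance]
    return keep, [names[i] for i in keep]

# Every T-shirt here has the same brand and price, so those columns
# carry no information for predicting size (values are illustrative).
names = ["brand", "width", "height", "price"]
shirts = [
    [0.0, 46.0, 66.0, 19.99],
    [0.0, 50.0, 70.0, 19.99],
    [0.0, 54.0, 74.0, 19.99],
]
idx, kept = select_features(shirts, names)
print(kept)  # ['width', 'height']
```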

Classification

Well-known classifiers include k-NN, Naive Bayes, Decision Trees, Random Forests, Support Vector Machines and Neural Networks, just to name a few. Each of these classifiers has its own strengths and weaknesses. Some have long training phases, some cannot deal with high-dimensional feature spaces, and others make assumptions about the a priori distribution of the data points. The choice of classifier is a crucial part of the machine learning pipeline and can greatly influence the final outcome. Therefore, one should think carefully before choosing one.
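To make one of those classifiers concrete, here is a toy k-NN for the T-shirt size example, implemented from scratch. The training data is invented, and a real system would of course use a library implementation rather than this sketch:

```python
from collections import Counter
import math

def knn_predict(train, labels, point, k=3):
    """Classify `point` by majority vote of its k nearest training points."""
    neighbours = sorted((math.dist(x, point), y) for x, y in zip(train, labels))
    votes = Counter(y for _, y in neighbours[:k])
    return votes.most_common(1)[0][0]

# Toy data: [width, height] of shirts labelled by size (values invented).
train = [[44, 64], [46, 66], [50, 70], [52, 72], [56, 76], [58, 78]]
labels = ["S", "S", "M", "M", "L", "L"]

print(knn_predict(train, labels, [49, 69]))  # "M"
```

Note the trade-offs mentioned above in miniature: k-NN has no training phase at all, but every prediction scans the whole training set, which scales poorly with thousands of products.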

ml_classifier

 

Figure 3: Example of a classifier

Conclusion

While the basic principles of machine learning tasks, as shown above, are quite straightforward, the devil is, as always, in the details.

For example, the outcome of a machine learning experiment is hardly predictable: you have to run an actual experiment in order to know how well your system will perform. Designing and conducting such tests is very time-consuming, but absolutely necessary for proper model selection and cross-validation. It is especially important for making assumptions about the generalizability of the machine learning system.

Here at STYLIGHT, we perform nested cross-validation to find the best model and estimate its likely performance on new, unseen data. We tried different splits for training and test sets and found that a classical 80/20 ratio works best for us. This number can vary depending on the data, but it can serve as a rule of thumb.
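The 80/20 split mentioned above can be sketched in a few lines. The function name and fixed seed below are ours for illustration; a full setup would additionally wrap this in the inner and outer loops of a nested cross-validation:

```python
import random

def train_test_split(data, test_ratio=0.2, seed=42):
    """Shuffle, then split off the last test_ratio of samples."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

data = list(range(100))  # stand-ins for 100 labelled products
train, test = train_test_split(data)
print(len(train), len(test))  # 80 20
```

Shuffling before splitting matters: product feeds often arrive grouped by shop or category, and an unshuffled split would leak that ordering into the evaluation.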

As we have an enormous number of products to process, we developed our own fully automated testing and validation environment for generating new models. It is heavily parallelized and has a low memory footprint, which allows us to test new ideas very quickly. In addition, we use confusion matrices, among other things, to visualize our results: there we can directly see the effect of the changes we make and identify fashion categories that might be harder to predict than others.
Last but not least, our experiments with machine learning are quite promising and we definitely want to do more in this direction. This is why we are continually exploring new ways to apply machine learning to make our daily work more efficient.

 

The Millennial Woman Codes: STYLIGHTers take on Rails Girls Munich

Rails Girls is a global, non-profit volunteer community which hosts one-day workshops all over the world for women to learn Ruby on Rails, with the aim to “give tools and a community for women to understand technology and to build their ideas.”  When we at STYLIGHT found out about the Rails Girls event taking place dahoam in Munich, we immediately felt compelled to get involved, as it goes perfectly hand in hand with STYLIGHT’s vision of supporting the millennial woman and our values as an engineering company focused on diversity. We therefore became sponsors of this great event, and in addition to supplying customized STYLIGHT “Hallo World!” goodie bags and STYLIGHT cake pops, we contributed as coaches and as attendees.


My fellow STYLIGHT techie María and I decided to be coaches at Rails Girls, and we had five ambitious STYLIGHTers from the HR, PR, and QA departments who joined as eager-to-learn participants! Additionally, after finding out about STYLIGHT’s focus on engineering and that we had quite a few women in our IT department, the organizer asked that one of us present at the workshop about our experiences as women in tech.

 

STYLIGHT girl developers ready to go!

 

By 9am(!) Saturday morning, all the Rails Girls attendees, from high school girls to women in their thirties and forties, had arrived at the Wayra office in Munich, and with their STYLIGHT goodie bags in hand, munched on some Brezn and installed Ruby on Rails on their laptops.  After introductions and some beginner exploratory exercises, the girls were split into groups and coaches were assigned. Later in the morning, the coaches helped guide the girls through the beginning of the Rails Girls App Tutorial and answered their many insightful questions along the way!  By lunch many of the girls had already made good progress on the tutorial and were beginning to branch out with their own ideas.

 

STYLIGHT’s Software Developer Julie

 

During the lightning talks I was able to speak to the girls about my own late transition into computer science halfway through university and how my experiences had ultimately led me to STYLIGHT. I was also able to share with them what a day in the life of a STYLIGHT software developer is like and why at the end of the day, I can proudly say I love my job.

María and I held a short Q&A afterwards, and we were so happy to have girls and other coaches coming up to us throughout the rest of the day to ask more about our experiences as women in engineering, as well as more specific questions about our jobs at STYLIGHT.  It was awesome to help the girls finish up their apps: many of them even got their apps deployed online – amazing to think that girls with no prior programming experience could make their own live website in one short day!

 

Training session with María, Junior Android Developer at STYLIGHT

 

The conclusion of the workshop involved the presentations of the finished apps and an after party where the girls got to enjoy champagne and STYLIGHT cake pops!

 

Rails Girls want sweet moments, too!

We at STYLIGHT were so proud to be involved in the success of the Rails Girls Munich workshop.  We feel so lucky as employees to have such supportive mentorship at STYLIGHT, and we were humbled to have had the opportunity to give back to the community.  It was an inspiration to meet so many amazing and ambitious women and other female developers from companies based in Munich, and we are furthermore encouraged that events like these do so much to inspire women to pursue careers often seen as inaccessible or traditionally male-dominated. Just consider that in the US the percentage of computing jobs held by women has actually fallen over the past 23 years, according to a new study by the American Association of University Women: in 2013, just 26 percent of computing jobs were held by women.

That’s something we at STYLIGHT  are definitely working on! And maybe we’ll even see some of the girls who joined the event as developers at STYLIGHT in the next few years!

In the meantime, we still have many open engineering positions.  Know any ladies up for the challenge?


Velocity vs. Cycle time – or ‘How to predict how much work will get done by a given time’

Both velocity and cycle time make predictions based on completed work in the past. This approach is sometimes referred to as ‘yesterday’s weather’. For the remainder of this post I will assume that this work is chunked into user stories. Basing predictions on past events assumes a stable (enough) context. In our case the most important factor here is a stable team.

In either case, these user stories can be estimated – usually with story points – or split to roughly the same size and then simply counted, the latter being faster and more outcome-focused. Assuming same-size stories simplifies measurement and calculation in both cases and is just as accurate as using more precise estimates, but it requires more care when creating stories. For the remainder of this post I will assume that user stories are roughly the same size and counted, instead of using more detailed estimates. If you are a firm believer in estimates, the following still holds true, but requires an extra step of converting user story points into a number of stories.

Velocity measures completed stories per iteration. The unit is work per time, e.g. 4 stories per 2 week iteration.

Cycle time measures the amount of time passed working on a story. The unit is time per work, e.g. 2.5 days per story.

Velocity and cycle time can be (sorta) converted into each other (neglecting times in between finishing and starting user stories and in the simplified case of WIP = 1) and are therefore (kinda) equivalent in the information they carry. They merely represent different points of view on the same thing – how long does it take to get chunks of work done. For a more in depth discussion on the actual math, check out the comments on this blog post.
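Under the simplifications just stated (WIP = 1, no dead time between stories), the conversion is a single division. A sketch using the post’s own example numbers; the function names are ours:

```python
def velocity_to_cycle_time(stories_per_iteration, iteration_days, wip=1):
    """Days per story, assuming WIP = 1 and no dead time between
    finishing one story and starting the next (the post's simplification)."""
    return iteration_days * wip / stories_per_iteration

def cycle_time_to_velocity(days_per_story, iteration_days, wip=1):
    """Stories per iteration, under the same simplification."""
    return iteration_days * wip / days_per_story

# 4 stories per 2-week iteration (10 working days) <-> 2.5 days per story
print(velocity_to_cycle_time(4, 10))    # 2.5
print(cycle_time_to_velocity(2.5, 10))  # 4.0
```

The round trip landing back on the same numbers is exactly the sense in which the two metrics carry equivalent information.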

velocity-vs-cycle-time

Another planning concept is commonly tangled up in discussions about predictions, but is actually orthogonal to them: iteration planning vs. variable input queues.

Iteration planning fixes the scope for the next iteration, e.g. 4 stories in the next 2 weeks. Velocity is often used in this context, since the unit of velocity, work per time, can be used to support the decision how much to forecast for the next iteration.

A variable input queue provides a steady stream of work. User stories can be re-sorted, added or removed anytime as long as work on them hasn’t started yet. This provides more flexibility compared to iteration planning. Measuring time per user story (cycle time) matches this well.

When using a variable input queue and no iteration planning, iteration goals and the associated commitment also disappear. This can be replaced with commitments to OKRs, which we do quarterly. Three months are long enough to accomplish significant things, while still short enough to have a sense of urgency.

Fixing the length of the input queue provides a mechanism to trigger refilling the queue, as well as preventing planning too far into the future. The optimal queue length depends on story size, team capacity and the predictability of stories. A number of stories that everyone can easily keep in their heads (around 7) prevents overhead in reviewing and re-prioritizing the queue. With a fixed-length input queue, planning is triggered by an empty slot in the queue rather than at specific intervals. It is also finer-grained – possibly one story at a time. By integrating this into the daily standup, an extra planning meeting can be avoided.

In addition to tracking the time a story is worked on (cycle time), one can also easily measure the time from when a story is added to the queue until it is done (lead time). This makes it easier to estimate when a story will be finished once it is added. This is very effective when visualized at the end of the queue with something like “Your expected wait time today from this point is between x and y days.” – aka Disneyland wait time.
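That Disneyland-style sign could be driven by something as simple as percentiles over historical lead times. A sketch with invented history; the quartile choice is arbitrary, not a recommendation from the post:

```python
def wait_time_range(lead_times_days, low_pct=0.25, high_pct=0.75):
    """Estimate a wait-time range from historical lead times.
    The percentile choices here are arbitrary illustration."""
    xs = sorted(lead_times_days)
    def pick(p):
        return xs[min(int(p * len(xs)), len(xs) - 1)]
    return pick(low_pct), pick(high_pct)

# Hypothetical history: lead times (days) of recently finished stories.
history = [3, 4, 4, 5, 6, 6, 7, 8, 9, 12]
lo, hi = wait_time_range(history)
print(f"Your expected wait time today from this point is between {lo} and {hi} days.")
```

Using a range of percentiles rather than a single average communicates the variability honestly, just like the sign at the theme park.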

fixed-length-input-queue

Velocity is usually used with iteration planning and cycle time with variable input queues. This feels more natural because measuring work per time with iterations is a good match, as is measuring time per work on a variable input queue. But there is nothing preventing you from combining it differently if it makes more sense in your context.

Discussions about velocity vs cycle time are a substitute discussion for iteration planning vs variable input queues. This is where the real differences are. Velocity and cycle time are merely different points of view on the same thing. They just happen to match one approach more than the other.

 

Two lessons the Infrastructure team stole from Ops, Dev and other departments at STYLIGHT

Posts by SysAdmins are usually rants. Or patronize you about someone’s philosophy. This blogpost won’t be either. It definitely could be both, but I will instead admit to shamelessly stealing from my colleagues. In the end, we use some pretty nice technologies and I’d like to give you a glimpse of what we’re doing at STYLIGHT Infra. These are two of the lessons I learned (stole and dubbed to be my own) from my peers.

Lesson 1: Configuration management is not exclusively for servers.

In Ops, our colleagues have used Puppet for quite a while. They still use it, like it, and will promote it if asked. However, as powerful as Puppet is, it has an equally steep learning curve, and it took some time until support for Windows arrived. Taking a closer look at the competition: Chef might have set the bar for integrating Windows by collaborating with Microsoft on their Desired State Configuration system for Powershell, but it is still to be considered a static configuration management system.

Being more recent players in the game, Ansible and Salt created some buzz as they took what was good from Puppet/Chef and extended it with a remote execution engine. In a scenario that requires you not only to maintain reliable configurations but also to do ad hoc troubleshooting with fast access to remote machines, this comes in more than handy, and it seemed perfectly suited for our environment right away. For reasons I can’t fully remember anymore, I ignored Ansible entirely at the time, spun up a Salt master and pushed the first Salt minions to my testing OU (GPO + script = wohoo). After the promising testing phase, I pushed it to all clients in the Active Directory – even our (ever-increasing) fleet of Macs are now minions.

Right now Salt serves us in two ways:

  1. Leveraging Salt states, we push custom applications to people and apply base sets of software to certain departments on login. We have also set up a couple of neat helpers, such as an ELK stack and prototypes of potential tools.
  2. The real juice, however, lies in the remote execution engine. Troubleshooting in a Windows world usually involves the GUI, and while it is nice to stay in touch with your colleagues by simply swinging by their desks, sometimes you just don’t want to or can’t afford the time. Powershell is neat for automating repetitive troubleshooting, and the Salt master serves as the central storage for the scripts. When necessary, we simply use Salt to run scripts on client machines to fix problems of all sorts: installing fonts (and every Windows admin can tell you about the pain of doing so remotely!), or installing and uninstalling Office and other applications. Even fine-grained configuration changes can be achieved either out of the box using one of the (great) Salt modules or by writing a Powershell script.

The usage of Salt is not yet as extensive as it could be – that has to be admitted. But the possibilities are there, and we are looking to migrate step by step from GPOs to Salt states where possible.

Lesson 2: Agile, SCRUM and Kanban preserve your sanity.

Agile principles and SCRUM are, by their origins, nothing SysAdmins – especially those working on help desk or infrastructure tasks – would naturally see as tailored to their daily work. You would see the developers at that fancy startup across the street sticking Post-it after Post-it to their walls and windows and keep wondering what the hell they were doing.

Well, not for us. The nature of the job dictates a distinction between projects and help desk, and the split is fairly easy: project-related work screams to be tracked and managed using SCRUM (our awesome analog corner board is pictured below), whereas help desk work begs to be handled on a (digital) Kanban board.

awesome analog board

For both we use JIRA. Emails from colleagues requesting help are automatically converted to tickets in our help desk Kanban, and all projects are handled using some parts of the SCRUM tool palette. To be honest with you, our Agile coach has told me multiple times already that we created some mutated and only remotely related offspring of SCRUM, but it works for us and has had a couple of positive effects:

  • The split solves the problem of justifying your attention. Should you handle help desk tickets first, or is project work more important? Help desk tickets can now be tracked against an SLA, and your project work is prioritized, so you should always be clear about which fire to extinguish first.
  • Transparency and awe for users. Walk-ins are often baffled by the projects currently displayed on our physical board and show appreciation.
  • Staying in sync with the business. A sprint planning every two weeks helps the product owner and other stakeholders get their projects reprioritized when needed, while still keeping the bigger picture in mind.
  • Getting rid of hazardous processes. Even though the help desk runs as Kanban, we use the periodic retrospectives to reflect on the last two weeks.

As a final note, I want to put strong emphasis on something often simply forgotten by a lot of people working in the industry. As a SysAdmin, your colleagues are a constant source of work. They might not understand networks (at all), probably bitch about the WiFi and crash your file server. However, don’t doubt their competence in their own fields of expertise – learn best practices and processes from the departments around you and evaluate them. Chances are the silver bullet is still yours to forge, but they might give you all the resources you need.

Cross-functionality is a function over time

In my last blog post I described why we formed cross-functional business teams. In this blog post I write about team composition, how it changes over time, and the consequences of that.

When we talk about the composition of cross-functional teams, we usually have something like this in mind. The labels usually read developer, tester, designer and UX researcher, or something along those lines. For the sake of this article, we abstract from the specific roles and just call them red, orange, yellow and brown experts.

constant expertise

This visualisation of the team composition ignores the fact that the amount of expertise needed to build a product changes over time. In reality it looks more like this.

needed expertise varies

During product development there might be a phase where there is a lot of yellow work needed (Feb) while sometime later there is almost none (Apr) and then it’s picking up again. 

There is a certain threshold above which it makes sense to have someone with a specific expertise full-time on the team. If the needed expertise drops below that threshold, that expert won’t be fully utilised – which is OK if it’s just a dip, but will get boring and frustrating if it persists.

threshold

Looking at this, one might argue that the yellow expert should leave the team by mid-February and just be available to the team as needed, that the orange expert should join around that time, and that the brown expert should leave sometime later, around mid-March. This makes it effectively impossible to form a stable team that has the chance to gel and perform at its peak effectiveness.

Having T-shaped people on the team helps with this, since they can help out in disciplines other than their own. This lowers the threshold in our graphical visualisation.

T-shaped: T-shaped people have two kinds of characteristics, hence the use of the letter “T” to describe them. The vertical stroke of the “T” is a depth of skill that allows them to contribute to the creative process. That can be from any number of different fields: an industrial designer, an architect, a social scientist, a business specialist or a mechanical engineer. The horizontal stroke of the “T” is the disposition for collaboration across disciplines.
IDEO CEO Tim Brown 

lower threshold

Now it makes sense to keep the yellow and brown experts for longer and bring on the orange one sooner.

Having M-shaped people on the team helps even more since they combine two or more needed disciplines. This makes it easier to stay above the threshold.

M-shaped: Building on top of the metaphor of T-shaped persons, M-shaped persons have expertise in two or more fields.

So, if our yellow expert was also an expert in the brown discipline, she would combine the areas below both of these lines resulting in the green line.

combined disciplines

Now, leaving the team because of under-utilisation is out of the picture.

Apart from having people on the team who are valuable in more than one discipline, there are of course other options for dealing with slack than leaving the team. How about some Kaizen? Helping someone with a stuck task, going to that conference, reading that book, finally doing that refactoring or writing that blog post are just a few of them.

Specialist teams are another option for experts who are needed here and there but not constantly on one team. In order not to create dependencies and thereby cripple the autonomy of teams, these specialist teams should be enablers and teachers helping the teams. This means ownership stays with the teams, not with the specialists. At STYLIGHT, for instance, we have a platform team.

Conclusion

Forming stable cross-functional teams in the face of changing expertise needs over time is not trivial. Being aware of this and having strategies for dealing with slack in a specific discipline (T-shaped people, Kaizen) still makes it a viable approach, though. For us the advantages of cross-functional teams outweigh these difficulties.

Batch Size DOES matter

How does the batch size of work influence the performance of a (production) process? As a child I wanted to be a mad scientist, so now as an agile coach I conducted a little experiment to find out. Repeat it on your own and post us your findings.

I first saw it at one of the numerous agile gatherings I attended and have repeated it several times since.

Experiment Setup

Here is the list of things and people you need for the experiment.

  • a table
  • 20 coins
  • some cardboard
  • 4 people (the workers)
  • either
    • a video camera (a mobile phone would do) or
    • another 4 people (managers) with stopwatches

So here is our setup. I asked Julie (developer) and Marina (office management), Ben (our infrastructure guy), and Anselm (one of our founders) if they wanted to be part of a little game of coin flipping. Thanks again for your 10 minutes!

Our setup

The Coin flipping Rules

There are just three simple rules.

  • Work in batches
  • Flip every coin of the batch
  • Pass on flipped batch to the next worker

An experiment is no experiment without collection of some data. :-)

What to Measure?

Let’s face it, the only thing that counts is value to the customer.

  • first value delivered to customer: the time between the first coin entering the system and the first coin coming out
  • full value delivered to customer: the time from the first coin in to the last coin out

And because we are a business we like to know the utilization of our workers.

  • Utilization: the time from the first coin in to the last coin out for each worker

All set?

Flip!

Just watch the videos for an impression. The hard facts I extracted from the videos come later.

What do you think? Just by looking at it, I was amazed by the tremendous increase in throughput as batch size is reduced!

Hard Facts

Here are the results extracted from the videos by watching them over and over and over again.

It took 48.9 seconds to deliver the single batch of 20 coins. At the other extreme, it took only 24 seconds for the full value (20 coins) to be delivered when working in batches of 2. And the first value (2 coins) was delivered after just 6 short seconds!

CoinflippingGameResult001

How much faster were we compared with the single batch of 20 coins?

CoinflippingGameResult003

That means the first batch of 2 coins was delivered in only 12.3% of the time it took to deliver value in the 20-coin batch. And it took less than half the time (49.1%) to deliver the full value in batches of 2 coins.

CoinflippingGameResult002

Now look at it as value over time. With the single 20-coin batch, over the course of 50 seconds you would have 20 coins for 2 seconds (20 × 2 value over time). At the other extreme, you would have the same value over time (even more) after just 14 seconds with batches of 2!

CoinflippingGameResult004

So, knowing that time is money :-), how much more value would you have over time with the different batch sizes? Hold your breath!

CoinflippingGameCCVT

OK, with the single batch of 20 we had 20 coins for 2 seconds after 50 seconds. By reducing the batch size to 10, we got 10 coins for 8 seconds and 20 coins for 13 seconds, so the value over time cumulates to 340. By cutting the big batch in half we produced a gain in value over time of 850%! The smallest batch size of 2 would bring us to a plus of 1760%. Are you still a friend of big, long-running projects?
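The pattern in these measurements can be approximated with a simple pipeline model: each worker spends (batch size × per-coin time) on a batch before passing it on, and the four workers process different batches in parallel. The per-coin time below is a guess rather than a value measured from the videos, so the absolute numbers differ from the experiment (real humans add handover overhead), but the trend is the same:

```python
def delivery_times(total=20, batch=2, workers=4, flip_s=0.5):
    """Pipeline model of the coin game. flip_s (seconds per coin)
    is a guessed constant, not measured from the videos."""
    n_batches = total // batch
    stage = batch * flip_s                    # time one worker holds a batch
    first = workers * stage                   # first batch exits the pipeline
    full = (workers + n_batches - 1) * stage  # last batch trails the first
    return first, full

for b in (20, 10, 5, 2):
    first, full = delivery_times(batch=b)
    print(f"batch size {b:2d}: first value after {first:5.1f}s, "
          f"full value after {full:5.1f}s")
```

In the model, halving the batch size shortens the stage time every batch must wait for, which is exactly why both first and full delivery speed up so dramatically.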

It comes at a Cost

CoinflippingGameResult006

The cost is a higher utilization of each individual worker. But from a company’s perspective this is not really bad, is it? I’d rather concentrate on a single task and get work done than be idle for 75% of my time.

Bottom Line

Smaller batches deliver value to the customer faster. Much faster. As a result there is more value over time for the customer. The utilization is also much better. So smaller batch sizes serve both your company and your customer.

The blog about the technical challenges and solutions of STYLIGHT