by Michael zur Muehlen (mzurmuehlen@stevens.edu) and Jan Recker (j.recker@qut.edu.au)
BPMN is the de facto standard for graphical process modeling. While there are other graphical languages to represent processes (EPCs, IDEF, Flowcharts, Petri Nets, among others), no other notation has seen such an uptake in such a short time as BPMN has. It is widely supported by both free and commercial process modeling tools, the WfMC has made XPDL 2.0 and 2.1 a de facto persistence format for BPMN diagrams, and a large number of courses on modeling processes with BPMN are being offered.
Now, BPMN is a complex language. The current incarnation (BPMN 1.1) consists of 52 distinct graphical elements: 41 flow objects, 6 connecting objects, 2 grouping objects, and 3 artifacts. That’s a lot of vocabulary to learn, given that each graphical element has meaning and rules associated with it. So what is the minimum subset of BPMN that a process modeler should know? The answer: Less than you think.
To answer this question we collected a large number of BPMN 1.0 diagrams (126 in total), from consultants, seminar participants, and online sources. We analyzed which BPMN symbols were actually used in these diagrams. The full version of our research, which we will present at the Conference on Advanced Information Systems Engineering in June, can be found here. But since this is an academic paper, here are the practical highlights of our study.
None of the diagrams we looked at used more than 15 different BPMN constructs, and none used fewer than 3. The models themselves contained considerably more elements, but a model with, e.g., 5 tasks connected by sequence flow was recorded as using the task symbol and the sequence flow symbol. The average subset of BPMN used in these models consisted of just 9 different symbols. That means that the average BPMN model uses less than 20% of the available vocabulary.
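To make the counting rule concrete, here is a minimal Python sketch. The model data and the function names are hypothetical and only illustrate the idea; the vocabulary size of 52 is the BPMN 1.1 figure quoted above, and we assume it as the denominator for the "less than 20%" statement.

```python
# Hypothetical illustration of the counting rule: a model is recorded by the
# distinct construct types it uses, not by how many elements it contains.
VOCABULARY_SIZE = 52  # element count quoted in this post for BPMN 1.1


def distinct_constructs(model_elements):
    """Return the set of distinct construct types used in one model."""
    return set(model_elements)


def vocabulary_share(n_constructs, vocabulary_size=VOCABULARY_SIZE):
    """Fraction of the vocabulary covered by n_constructs distinct symbols."""
    return n_constructs / vocabulary_size


# Made-up model: 5 tasks connected by 4 sequence flows.
example_model = ["task"] * 5 + ["sequence flow"] * 4
used = distinct_constructs(example_model)
print(sorted(used))  # ['sequence flow', 'task']

# An average of 9 distinct symbols out of 52.
average_share = round(vocabulary_share(9) * 100, 1)
print(average_share)  # 17.3
```

Nine symbols out of 52 come to roughly 17%, which is where the "less than 20%" figure comes from under this assumption.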
Figure 1 shows which construct we found across which percentage of the diagrams we collected.
Figure 1: Frequency distribution of BPMN construct usage
The results of our study are:
- Only five elements (normal flow, task, end event, start event, and pool) were used in more than 50% of the models we analyzed. These, plus the data-based XOR gateway, form what we call the common core of BPMN (marked in yellow in fig. 1).
- Six additional elements were found in at least 25% of the models: gateways (parallel and unmarked XOR), lanes, text annotations, message flow, and start messages. We call these the extended core of BPMN (marked in green in fig. 1).
- 17 elements were used in fewer than 3 models: seven elements occurred in just two models, five in just one, and five elements were not used in any of the models we studied.
We then looked at the co-occurrence of BPMN symbols - i.e., are certain constructs used in combination, and how frequently? The combination of certain elements is mandated by the BPMN specification - you cannot use lanes without pools, or data objects without associations. But if there is a common subset used by many models, this would constitute a true “common core”. A detailed analysis revealed that BPMN elements fall into several well-defined groups. Figure 2 shows these groups as frames around the respective BPMN elements. The numbers within each frame represent the number of models (out of 126) that contain all elements within the frame.
Figure 2: Grouping of BPMN elements
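The number in each frame is simply a count of models whose construct set contains every element in the frame. A minimal Python sketch of that count follows; the four toy models and the function name are made up for illustration and are not our study data.

```python
# Hypothetical toy data: each model is represented by the set of distinct
# construct types it uses. These are NOT the 126 models from the study.
models = [
    {"task", "sequence flow"},
    {"task", "sequence flow", "start event", "end event"},
    {"task", "sequence flow", "start event", "end event", "parallel gateway"},
    {"task", "sequence flow", "pool", "lane"},
]


def models_containing(models, required):
    """Count the models that contain all of the required constructs."""
    required = set(required)
    return sum(1 for used in models if required <= used)


# Growing the required set can only keep or shrink the count.
basic = {"task", "sequence flow"}
with_events = basic | {"start event", "end event"}
with_parallel = with_events | {"parallel gateway"}

print(models_containing(models, basic))          # 4
print(models_containing(models, with_events))    # 2
print(models_containing(models, with_parallel))  # 1
```

Because each frame adds requirements to the one it encloses, the count can never go up as the frame grows. How fast it drops is what the figure shows.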
Our findings are:
- The common core of BPMN is very small. The subset of BPMN across the different models varied considerably. While nearly all models contain tasks and sequence flow, adding symbols to this set leads to a near exponential drop in models that share the (bigger) set of symbols. For example, while 65 models contain tasks, sequence flow, start and end events, only 25 also contain parallel gateways, and just 10 contain parallel gateways and data-based XOR gateways.
- There are two types of BPMN modelers. While our sample is too small to explore this proposition in detail, we found anecdotal evidence that two groups of modelers use BPMN: those who use pools and lanes to represent organizational responsibility for tasks, and those who use gateways to represent the control-flow rules of the process in detail. In other words, one group uses BPMN to specify inter-organizational settings (process choreography). Mostly, these users will be consultants or process analysts working on organizational (re-)engineering and process improvement. The other BPMN user group is leaning more towards workflow engineering (process orchestration). These users will likely be designers and analysts seeking to articulate precise flow conditions, for instance, in the context of workflow engineering or process simulation.
Implications
Our findings have implications for practitioners, software vendors, and standards makers alike.
- Practitioners can begin studying the use of BPMN by focusing on the most commonly used symbols first, leaving more specialized and lesser-used constructs for those who need more specialized BPMN training (e.g. systems analysts).
- Software vendors that are not supporting the entire BPMN vocabulary can assess what percentage of BPMN diagrams can be represented in their tool, and where enhancements should be made.
- Finally, standards makers should review whether a more complete, but also more complex language is a desirable result of the standardization process. Creating BPMN took six years. How much time was spent on defining those seventeen symbols that we found are hardly used? And will the extensions of BPMN 1.1 entice users to expand their commonly used vocabulary, or will they go unused?
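For the vendor point above, the assessment could look roughly like the following Python sketch. It assumes a tool can represent a diagram only if it supports every construct the diagram uses; the toy models, the supported set, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy data: construct sets of four diagrams (not study data).
models = [
    {"task", "sequence flow"},
    {"task", "sequence flow", "start event", "end event"},
    {"task", "sequence flow", "start event", "end event", "parallel gateway"},
    {"task", "sequence flow", "pool", "lane"},
]

# Hypothetical set of constructs that some tool supports.
supported = {"task", "sequence flow", "start event", "end event"}


def coverage(models, supported):
    """Fraction of models that use only supported constructs."""
    representable = sum(1 for used in models if used <= supported)
    return representable / len(models)


def missing_constructs(models, supported):
    """Count, per unsupported construct, how many models need it."""
    counts = Counter()
    for used in models:
        counts.update(used - supported)
    return counts


share = coverage(models, supported)
gaps = missing_constructs(models, supported)
print(share)       # 0.5
print(dict(gaps))  # unsupported constructs and the number of models using each
```

The second function is the more useful one for planning enhancements: it ranks the unsupported constructs by how many diagrams they would unlock.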
If you would like to learn more about this research, we encourage you to read the full version of our paper:
- Michael zur Muehlen, Jan Recker. (Jun 16, 2008). “How Much Language is Enough? Theoretical and Practical Use of the Business Process Modeling Notation”, 20th International Conference on Advanced Information Systems Engineering (CAiSE 2008), Montpellier, France, June 16-20, 2008, Springer LNCS. Download (657 kb PDF)
You can find additional research on process modeling and process management in the publications section of this site, and in Jan’s QUT eprints directory.
As always, your questions or comments are much appreciated.
Tags: BPMN, modeling, simplicity, usage

March 4th, 2008 at 6:23 pm
[…] BPM Research » How much BPMN do you need? This is a drilldown post on a slide that I saw Michael present last year on what actually gets used in BPMN. He says “What is the minimum subset of BPMN that a process modeler should know? The answer: Less than you think.” Actually, five. (tags: bpmn) […]
March 5th, 2008 at 12:50 am
As a BPMS vendor we have found the adoption of “strict” BPMN very slow, and sometimes it creates more conflict than the discussion of the processes themselves. We have always had the view that there is some correlation between the actual BPMN elements that are used and the Visio audit diagram/swimlanes that most people use. It looks like your research supports this, as well as the fact that the BPMN rules are not being adhered to. We have seen new tools like Process Master emerge that use the graphical representation of certain BPMN elements but that don’t enforce the BPMN rules. Great work and I will continue to follow your research.
March 9th, 2008 at 4:46 pm
[…] zur Muehlen posts a strange bit of analysis called How Much BPMN Do You Need? The method of research consisted of collecting 126 BPMN 1.0 diagrams from “consultants, […]
March 13th, 2008 at 9:21 am
Really good article!
When will people find out that simplicity adds more value than complexity?
March 14th, 2008 at 6:36 am
[…] The original article, nicely illustrated, details what your knowledge of BPMN can be (must be?) depending on what you plan to do as a “Process Modeler”. […]
March 14th, 2008 at 2:08 pm
[…] then blogged about the paper but went further by listing three implications that were not expressed in the […]
March 14th, 2008 at 5:51 pm
Very interesting article and research indeed. Norms, especially in IT, are mandatory but often prone to generate unnecessary complexity, so reviewing what’s really core in terms of value for the users is great. In my opinion the fact that BPMN can be segmented so that the very core is pretty small illustrates why BPMN is nicely thought out: you can adopt an iterative approach, get value out of it with a minimal investment, and enrich your process afterwards with subtleties.
Two remarks:
1) If BPMN is not only used for high level process representation, but also to define operational process execution logic, does that somehow impact the core set of BPMN elements? I mean, if one is to run the process that was designed for real, are there additional elements that would become core, just because the level of precision required is different?
2) I think it would be very interesting to go deeper into the analysis of the use patterns of BPMN by different populations of users. For me BPMN is above all a candidate shared language between business users and IT folks: that’s rare enough to investigate what parts of the language are actually shared, should be shared and (possibly) can’t be shared.
Regards
March 14th, 2008 at 10:03 pm
Matthieu,
Thanks for your comments.
As for your first remark, I have a gut feeling (but don’t have the data to prove or disprove it) that when you move closer to the execution level there is a crossover point where implementing process semantics relates closely to programming. If you look at the development environments of most BPM vendors these days they are all integrated in Eclipse, giving you easy access to the source code of the adapters / data structures / programs / rule sets etc. you may need to tap into in order to make your process work. While you can create a very detailed specification using a more comprehensive subset of BPMN, the necessary overhead created by the graphical representation of a programming concept may just be too much. For example: In a BPMS I know well you can specify multiple timer events per activity. You could - say - increase the priority of the activity after 2 hours, reassign the activity to somebody else after 4 hours, and notify the manager if it has not been completed after 2 days. Now, when the activity gets reassigned to a different person that holds the same role as the first assignee, do we model this as a different activity in a different swimlane? And how do we represent the notification of the manager (who may reassign the task, but may choose not to work on it)? Details like this today are hidden in the property pages of BPMS. It is an open question whether we should make them visible in our process models.
As for the second remark, we tried a cluster analysis on the models themselves, to see whether we could find commonalities in BPMN usage between models that were created for a similar purpose. Unfortunately, the models did not cluster at all - the reason being that each model was created with a fairly distinct subset of BPMN. If we had a larger base to draw from we might be able to find multiple cores - I would like to think that several well defined sets of constructs will emerge over time.
March 21st, 2008 at 12:07 am
[…] BPMN to draw business processes, and have counted the occurrence rate of various elements. He summarized this in a blog post, which came to the conclusion that practitioners could focus on learning and using a small subset of […]
April 24th, 2008 at 8:52 pm
[…] was a surprise, therefore, to read the article “How much BPMN do you need” by Michael zur Muehlen and Jan Recker. These folks analyzed 126 BPMN diagrams drawn by various companies and consultants and sought […]
May 1st, 2008 at 9:43 pm
Hi there,
I think your first two conclusions, recommending pragmatism on the part of practitioners and vendors, make a lot of sense. Why would you spend a lot of time and effort, not to mention expense, learning things that will only form a small subset of your work, particularly when you can fill in the gaps as needed?
The third seems fine on the face of it but consider an analogy. Would the technicians designing and building an airbag system or anti-lock braking system for a car gloss over it because it’s only ever used in 0.1% of car journeys? Obviously not. To have a complete car, which functions as you would like it to in all circumstances, you need all those systems. So saying we don’t need a complete modelling language based on a statistical analysis of diagrams may not be valid.
Using those data to say that the OMG should not spend a lot of time understanding and defining the most complicated and intricate, if least used, elements of a modelling notation is, in my view, a mistake.
May 9th, 2008 at 7:41 am
[…] process; therefore it may not be a surprise that recent research conducted by Michael zur Muehlen: How much BPMN do you need? found that “the average BPMN model uses less than 20% of the available vocabulary”; suggesting […]
May 21st, 2008 at 12:58 pm
YES AND NO
I am the CEO of Process Master, and the vast majority of users are only using 20% of the available stencil.
However, BAs and process specialists are using it all - which is the beauty of BPMN.
So using BPMN for the production of maps and documentation only needs a fraction of the stencil - and it expands naturally to 100% if you are using it for automation or in-depth BPI.
There is a video of all this on http://www.ProcessMaster.com
Cheers
Alan