Moving Through Berlin By Bike

Analysing Accidents and Bicycle Infrastructure in Berlin

Rethinking mobility is crucial to improving air quality and reducing CO2 emissions from traffic. For this reason, Berlin has set itself the goal of moving away from a car-based concept of mobility. Instead, Berlin wants to promote cycling as an environmentally friendly and sustainable mode of transportation.

At the same time, cyclists are the most at-risk group of all traffic participants. According to the accident statistics, around a quarter of all accidents involve cyclists. While accident numbers in Berlin have generally declined over the last fifteen years, they have recently remained rather stable, with little improvement. In contrast, accident numbers have continued to fall in other large German cities. The data also reveal that most bike accidents happen on roads without any sort of bike lane or on advisory bike lanes. Cars are involved in around 80% of these accidents, putting cyclists at serious risk of being injured or even killed.

While many people want to switch to cycling, safety concerns deter them from using bikes more often. The perceived and actual risk of cycling is a key determinant of mobility decisions. The importance of these risk considerations became visible during the Corona epidemic, when commuters switched from public transit (where infection seems more likely) to cars and bicycles. To get people onto bikes, safety therefore has to be improved, and the recent change in mobility patterns has further increased this need. But first, we need to understand where and why cyclists are at risk of having accidents.

Using publicly available data on accidents with injuries from the German Federal Statistical Office (provided by the datenguidepy wrapper and the “Unfallatlas”), as well as information on bicycle infrastructure and traffic cells, the relevant areal unit for data collection and planning in Berlin, we explore these questions on the following pages.


Trends in reported accidents with injuries in major German cities


Next: Who is most frequently involved in accidents with injuries in Berlin?

Participants involved in Reported Accidents with Injuries in Berlin in 2019


Next: Explore Reported Bike Accidents with Injuries and Bicycle Infrastructures in Berlin

Next: Compare Reported Bike Accidents with Injuries by Bicycle Infrastructure and Opponent

Reported Bike Accidents in 2019 by Bicycle Infrastructure and Opponent


Next: Explore areas of high bike accident rates


Curious about the project? Read more about our motivation and methodology!

About the Project

A Shiny app created with ☕ and ❤️ for the CorrelAidX Challenge 2020
“Analysing and visualising German regional statistics with datenguidepy”
by Cédric Scherer, Andreas Neumann, Saleh Hamed & Steffen Reinhold

CorrelAidX Berlin, a local chapter of CorrelAid
Good Causes. Better Effects. Local Implementation.

CorrelAid is a non-partisan non-profit network of data science enthusiasts who want to change the world through data science. We dedicate our work to the social sector and those organizations that strive for making the world a better place. In order to improve data literacy in society, we share our knowledge within our network and beyond and are always looking for ways to broaden our horizons.

You want more?

Learn more about CorrelAid
Read more about our methodology
Get the code from GitHub


For our assessment we used three major data sources:

  • datenguidepy – A Python package created by CorrelAid that provides regional data (NUTS1-3) on various topics such as economics, finance, and environmental and social affairs. The data source is the German Federal Statistical Office, which collects, checks and verifies the data on a yearly basis.

  • Unfallatlas – An ongoing project by the German Federal Statistical Office (Destatis), running since 2016. For each year, Destatis collects regional data on car, bike, motorcycle, pedestrian, and lorry accidents with personal injuries on a NUTS1-3 level from several German states and converts them into an open-source geographical dataset. Additional data such as the month, day, time and geographical location of the incident, as well as road and light conditions during the accident, are also included. For our work we selected the latest data on (bike) accidents in Berlin. In 2019 there were 5,005 recorded bike accidents within the city-state.

  • Technologiestiftung Berlin – A non-profit foundation that is committed to the digitization of Berlin by providing open information, software and infrastructure. We used geodata on traffic cells (“Teilverkehrszellen”) and bike traffic infrastructures (“Radverkehrsanlagen”) in Berlin. Overall, there are 15 different bike lane categories. For our analysis, we aggregated these categories into five major classes, based on structural appearance (true bike paths, bike paths combined with sidewalk, bike lanes on roads and bus lanes) and on whether their use is mandatory or advisory.

Step 1: We matched the bike lane data with the accident data from 2019. We added a 4-meter buffer to each bike lane, which is represented by a geographical linestring. Any accident point that falls within the buffer is associated with the specific bike lane segment and therefore with that type of bicycle infrastructure. All accidents that did not fall into any of the five categories were classified as “road only”. We used the merged data sets to explore the bike accidents and the bicycle infrastructure in Berlin.
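The buffer matching in Step 1 can be illustrated with a minimal sketch in plain Python. It is not the project's actual code: it assumes coordinates are already projected to a metric system (so distances are in meters), the class names in the example are hypothetical placeholders, and picking the nearest lane when several buffers overlap is our assumption, since the text does not say how such ties are resolved.

```python
import math

BUFFER_M = 4.0  # buffer width stated in the text


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b; all are (x, y) in meters."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_linestring(p, coords):
    """Smallest distance from p to a linestring with at least two vertices."""
    return min(point_segment_distance(p, a, b) for a, b in zip(coords, coords[1:]))


def classify_accident(p, lanes, buffer_m=BUFFER_M):
    """Return the class of the nearest lane within the buffer, else 'road only'.

    Choosing the nearest lane for overlapping buffers is an assumption.
    """
    best_class, best_dist = "road only", buffer_m
    for lane_class, coords in lanes:
        d = distance_to_linestring(p, coords)
        if d <= best_dist:
            best_class, best_dist = lane_class, d
    return best_class


# Hypothetical example data: two straight lanes and three accident points.
lanes = [
    ("mandatory bike path", [(0.0, 0.0), (100.0, 0.0)]),
    ("advisory bike lane", [(0.0, 50.0), (100.0, 50.0)]),
]
accidents = [(10.0, 3.0), (50.0, 48.5), (50.0, 25.0)]
labels = [classify_accident(p, lanes) for p in accidents]
print(labels)
```

A geospatial library would normally handle the buffering and the spatial join; the hand-written distance test is used here only to keep the sketch self-contained.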

Step 2: Next, we investigated what proportion of all accidents occurred on bike infrastructure and where the hotspots are located. We matched the dataset obtained in Step 1 with 1,223 subtraffic cells. Most bike accidents happened on roads and not on bike infrastructure. However, in both cases the hotspots can be found near Berlin city centre.
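The aggregation in Step 2 could look roughly like the sketch below, again in plain Python and under stated assumptions: each accident already carries the infrastructure class from Step 1, each cell is a simple polygon in the same projected coordinates, and the cell identifiers and data are hypothetical. The ray-casting test stands in for whatever spatial join was actually used.

```python
from collections import Counter


def point_in_polygon(p, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices (not closed)."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through p?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def summarise_by_cell(labelled_accidents, cells):
    """Count accidents per cell and the share that fell on bike infrastructure."""
    totals = Counter()
    on_infra = Counter()
    for point, infra_class in labelled_accidents:
        for cell_id, polygon in cells.items():
            if point_in_polygon(point, polygon):
                totals[cell_id] += 1
                if infra_class != "road only":
                    on_infra[cell_id] += 1
                break  # each accident is assigned to at most one cell
    return {
        cell_id: {
            "accidents": totals[cell_id],
            "share_on_infrastructure": on_infra[cell_id] / totals[cell_id],
        }
        for cell_id in totals
    }


# Hypothetical example: two adjacent square cells and four labelled accidents.
cells = {
    "cell_A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "cell_B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
labelled_accidents = [
    ((2, 2), "road only"),
    ((5, 5), "mandatory bike path"),
    ((3, 8), "road only"),
    ((15, 5), "advisory bike lane"),
]
summary = summarise_by_cell(labelled_accidents, cells)
# Cells ordered by accident count, highest first, as a simple hotspot ranking.
hotspots = sorted(summary, key=lambda c: summary[c]["accidents"], reverse=True)
print(hotspots, summary)
```

Ranking cells by raw counts is only a first look at hotspots; it does not account for how much cycling takes place in each cell.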

Future steps: While these data already reveal interesting patterns, some open questions remain. In future steps we would like to explore whether and by how much certain bicycle infrastructures reduce the risk of accidents. To analyze this question, we want to relate the number of accidents with injuries in each subtraffic cell to the coverage of bike infrastructure and to traffic volumes. We are therefore exploring data sources on traffic volumes (by separate modes of transportation). Publicly available counter data is one potential source, but there are only a few spots where traffic volumes were actually counted. Private providers of fitness or routing apps may be another source of traffic data.