Marcel Boersma (PhD candidate at KPMG and the Computational Science Lab), Dr. Sumit Sourabh and Prof. Dr. Drona Kandhai of the Computational Science Lab, together with Aleksei Maliutin and Lucas Hoogduin from industry partner KPMG, have developed a novel data-driven method for financial audits. The method derives a company's primary monetary flows from its transaction data, and auditors can use these flows in financial audits. The article was recently published in the Nature Portfolio journal Scientific Reports.

The importance of financial audits

Recent accounting scandals, such as Enron (2001), Petrobras (2014) and Wirecard (2019), underline the consequences of untrustworthy financial information. Yet new data-driven audit methods are scarce and not yet widely applied in the industry, even though these techniques make it possible to analyze vast amounts of data consistently and objectively and could thereby improve audit quality. Other potential applications include fraud analytics.

Financial audit methods

Despite the growing interest in applying data-driven methods in audit and the availability of large volumes of digitally recorded data, audit methods remain mostly manual, and developing an algorithmic equivalent proves challenging. The data used by such algorithms are recorded at the most granular level: the individual transactions that pertain to the company. Auditors, however, provide assurance on aggregated representations, not on individual data points. To develop data-driven methods, it is therefore essential to understand the link between the low-level recorded data and its aggregate representation.

Network theory

We developed a method that turns low-level financial transaction data (journal entries) into a bipartite network representation (Boersma et al. 2018, Boersma et al. 2020). The network represents both what flows through the financial system (e.g. revenue value or inventory value) and why it flows: the underlying business process that drives the change (e.g. selling goods or dispatching goods).
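To make the idea concrete, here is a minimal sketch of how journal entries could be turned into a bipartite network of account nodes and business process nodes. It is an illustration under stated assumptions, not the published implementation: the sample data, the function name build_bipartite_network, the sign convention (positive = debit, negative = credit) and the rule that a business process is identified by the set of credited and debited accounts are all hypothetical choices made for this example.

```python
from collections import defaultdict

# Hypothetical journal entry lines: (entry id, account, signed amount).
# Assumed convention: positive amounts are debits, negative amounts are credits.
JOURNAL_LINES = [
    ("JE1", "Trade receivables", 121.0),
    ("JE1", "Revenue", -100.0),
    ("JE1", "Tax payable", -21.0),
    ("JE2", "Trade receivables", 242.0),
    ("JE2", "Revenue", -200.0),
    ("JE2", "Tax payable", -42.0),
    ("JE3", "Cost of sales", 60.0),
    ("JE3", "Inventory", -60.0),
]


def build_bipartite_network(lines):
    """Return weighted, directed edges between account and process nodes.

    Nodes are tagged tuples, ("account", name) or ("process", label), so the
    two node sets of the bipartite network stay distinct.
    """
    # Group the individual lines by journal entry.
    entries = defaultdict(list)
    for entry_id, account, amount in lines:
        entries[entry_id].append((account, amount))

    edges = defaultdict(float)
    for entry_lines in entries.values():
        credited = sorted({a for a, amt in entry_lines if amt < 0})
        debited = sorted({a for a, amt in entry_lines if amt > 0})
        # Assumption: entries with the same credited and debited accounts
        # belong to the same business process.
        label = " + ".join(credited) + " -> " + " + ".join(debited)
        process = ("process", label)
        for account, amount in entry_lines:
            if amount < 0:
                # Value flows out of the credited account into the process.
                edges[(("account", account), process)] += -amount
            elif amount > 0:
                # Value flows from the process into the debited account.
                edges[(process, ("account", account))] += amount
    return dict(edges)


network = build_bipartite_network(JOURNAL_LINES)
for (source, target), amount in sorted(network.items()):
    print(source, "->", target, amount)
```

In this toy data set, the two sales entries collapse into a single process node, so the edge weights are the total value moved by that process across all its journal entries.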

This network enables us to capture the interconnectedness of the company’s financial system from individual data points and to link it, in a data-driven way, to the high-level aggregates presented in the annual report.

On the left of Figure 1, we see the detailed bipartite network extracted from transaction data and, on the right, the simplified version (obtained by grouping similar nodes together) that provides a high-level summary of the company’s monetary flows. Initially, the simplified representation was obtained with a hybrid approach in which the auditor manually groups similar nodes. Thanks to our recent contribution (Boersma et al. 2020), we can now automate the simplification procedure with network embeddings. Within the simplified network structure, we look for process-level dynamics that result in relationships between monetary flows. These relationships are used to predict monetary flows and to check for anomalies when the actual monetary flows are recorded by the company. This information helps the auditor determine the high-risk areas of an audit.
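The two steps described here, simplifying the network and checking recorded flows against an expected relationship, can be sketched as follows. This is only an illustration under assumptions: it uses the manual grouping of the hybrid approach rather than network embeddings, and it stands in a fixed ratio between two flows for the relationships derived in the research. The node names, the GROUPS mapping, the functions simplify and flag_anomaly, the 21% ratio and the 5% tolerance are all hypothetical.

```python
from collections import defaultdict

# Hypothetical detailed monetary flows: (source node, target node) -> amount.
DETAILED_FLOWS = {
    ("Revenue NL", "Sales NL"): 100.0,
    ("Revenue BE", "Sales BE"): 50.0,
    ("Sales NL", "Receivables NL"): 100.0,
    ("Sales BE", "Receivables BE"): 50.0,
}

# Hypothetical grouping of similar nodes, as an auditor might choose it manually.
GROUPS = {
    "Revenue NL": "Revenue", "Revenue BE": "Revenue",
    "Sales NL": "Sales", "Sales BE": "Sales",
    "Receivables NL": "Receivables", "Receivables BE": "Receivables",
}


def simplify(flows, groups):
    """Merge nodes that share a group and sum the flows between groups."""
    simplified = defaultdict(float)
    for (source, target), amount in flows.items():
        key = (groups.get(source, source), groups.get(target, target))
        simplified[key] += amount
    return dict(simplified)


def flag_anomaly(driver_flow, expected_ratio, recorded_flow, tolerance=0.05):
    """Compare a recorded flow with the flow predicted from a driver flow.

    Assumption: the relationship is a fixed ratio, e.g. tax as a share of
    revenue. Returns (is_anomalous, expected_flow).
    """
    expected = expected_ratio * driver_flow
    deviation = abs(recorded_flow - expected)
    return deviation > tolerance * abs(expected), expected


simplified = simplify(DETAILED_FLOWS, GROUPS)
revenue_flow = simplified[("Revenue", "Sales")]

# Assumed relationship: the tax flow is 21% of the revenue flow.
for recorded_tax in (31.5, 20.0):
    anomalous, expected = flag_anomaly(revenue_flow, 0.21, recorded_tax)
    print(recorded_tax, "expected about", round(expected, 2), "anomalous:", anomalous)
```

A flow that deviates from its prediction by more than the tolerance is only a signal for the auditor to investigate further, not evidence of an error in itself.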

Data-driven financial audits

With data-driven methods, we can analyze vast amounts of data consistently and objectively and thereby provide assurance on financial data in a new manner. We believe that methods such as the one presented here can pave the way towards a more data-driven audit.


Boersma, M., Sourabh, S., Hoogduin, L., & Kandhai, D. (2018). Financial statement networks: an application of network theory in audit. The Journal of Network Theory in Finance, 4.

Boersma, M., et al. (2020). Reducing the complexity of financial networks using network embeddings. Scientific Reports, 10(1), 1-15.