Use Cases of Blockchain Analysis
Before diving into the analytics part, I would like to briefly explain why analyzing blockchain data is of interest. Several use cases require the analysis of blockchains.
Forensics and Fraud Detection
Blockchains and cryptocurrencies are still lightly regulated, although governments are looking into it, so they can be used for criminal activities. Their anonymous (strictly speaking, pseudonymous) nature makes them attractive for black-market trades or money laundering. Analysing blockchain data in a behavioral context could help law enforcement agencies fight cybercrime, insider trading or other criminal activities.
Price Forecasting
Predicting the price of crypto assets is a hot topic for retail traders as well as for institutional investment companies. For financial analysis, raw blockchain data could be fused with price data and with news data, which could include possible government regulations or community announcements about updates or forks. Social media activity from crypto advisors or influencers could also be merged in, with the goal of accurately predicting the price of cryptocurrencies.
Data Sources for Blockchain Analytics
Once the analytics-related questions are formulated, the relevant blockchain data has to be gathered. There are various data sources and procedures for getting hold of the data, and the right choice depends on the specific problem you are working on.
Download Blockchain Data
Probably the fastest and easiest approach is to download the relevant blockchain data. One example is this thread on the bitcointalk forum, where bitcoin data can be downloaded.
APIs for Blockchain Data
There are also API providers which offer access to blockchain data. For the Ethereum blockchain, the Etherscan API might be a good starting point. It can be used to get account balances and to list transactions and event logs. The API is designed in a REST style and uses the HTTP protocol. The websocket protocol is also used to exchange information about events in real time. Another option for the Ethereum blockchain is Ethplorer.
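As an illustration of such a REST-style query, here is a minimal Python sketch for an account-balance lookup. The base URL, the parameter names (module, action, address, tag, apikey) and the response shape (status, message, result) are assumptions based on Etherscan's publicly documented account module and may have changed, so check the current documentation before relying on them. The helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: endpoint and parameters as in Etherscan's documented
# "account" module; verify against the current API documentation.
ETHERSCAN_URL = "https://api.etherscan.io/api"
WEI_PER_ETHER = 10**18


def build_balance_url(address: str, api_key: str) -> str:
    """Build the GET URL for an account-balance query."""
    query = {
        "module": "account",
        "action": "balance",
        "address": address,
        "tag": "latest",
        "apikey": api_key,
    }
    return f"{ETHERSCAN_URL}?{urlencode(query)}"


def parse_balance(payload: dict) -> float:
    """Convert the returned balance (a decimal string in wei) to ether."""
    if payload.get("status") != "1":
        raise ValueError(f"API error: {payload.get('message')}")
    return int(payload["result"]) / WEI_PER_ETHER


def fetch_balance(address: str, api_key: str) -> float:
    """Run the HTTP request; needs network access and a valid API key."""
    with urlopen(build_balance_url(address, api_key), timeout=10) as response:
        return parse_balance(json.load(response))
```

Keeping URL construction and response parsing separate from the network call makes both parts easy to check offline, for example against a saved response.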
Interacting with the Blockchain
In some situations, relying on external parties to download blockchain data or using their API for queries might not be desirable, as it introduces dependencies and arguably runs against the paradigm of decentralization. In such cases one might interact directly with nodes of the blockchain.
For the Ethereum blockchain there is a JavaScript API called web3. It allows you to pull data such as transactions from the blockchain, store the data and run your analytics scripts.
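Libraries such as web3 talk to a node through Ethereum's JSON-RPC interface, so the same data can be pulled from any language. The following Python sketch uses only the standard library and the JSON-RPC method eth_getBlockByNumber. The node URL is something you have to supply yourself, and the function names and the selection of fields are illustrative choices, not part of any library.

```python
import json
from urllib.request import Request, urlopen


def build_block_request(block: str = "latest", request_id: int = 1) -> dict:
    """JSON-RPC payload asking for a block with full transaction objects."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getBlockByNumber",
        "params": [block, True],  # True = full transactions, not just hashes
        "id": request_id,
    }


def extract_transactions(block: dict) -> list[dict]:
    """Flatten a block result into rows; quantities arrive as hex strings."""
    rows = []
    for tx in block["transactions"]:
        rows.append({
            "block_number": int(block["number"], 16),
            "hash": tx["hash"],
            "sender": tx["from"],
            "recipient": tx["to"],  # None for contract creations
            "value_wei": int(tx["value"], 16),
        })
    return rows


def fetch_latest_transactions(node_url: str) -> list[dict]:
    """POST the request to a node (assumed reachable at node_url)."""
    request = Request(
        node_url,
        data=json.dumps(build_block_request()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request, timeout=10) as response:
        return extract_transactions(json.load(response)["result"])
```

The rows returned by extract_transactions are plain dictionaries, so they can be written to a file or database and fed into whatever analytics scripts you use.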
Modelling Blockchain Data
You might ask yourself how best to model blockchain-related data, and again it depends very much on your use case and your experience. I would say that the most prominent approaches are the following:
You might use a relational database with tables for nodes, addresses, blocks and transactions. For a first study, an easy-to-use SQL database system such as PostgreSQL might be a good candidate.
You can use a graph-based approach where you model the addresses as nodes and the transactions as links between nodes. Neo4j would be an example of a graph database management system that could be used in this way.
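To make the two modelling approaches concrete, here is a small Python sketch that loads the same toy transactions into both. It uses the standard library's sqlite3 module as a stand-in for PostgreSQL and a plain adjacency list as a stand-in for a graph database such as Neo4j. The table layout, column names and sample data are hypothetical, and the table for nodes is left out for brevity.

```python
import sqlite3
from collections import defaultdict

# Hypothetical toy data: (tx_hash, block_number, sender, recipient, value)
TRANSACTIONS = [
    ("tx1", 100, "addr_a", "addr_b", 5.0),
    ("tx2", 100, "addr_b", "addr_c", 2.0),
    ("tx3", 101, "addr_a", "addr_c", 1.5),
]

# Relational model: one table each for blocks, addresses and transactions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blocks (number INTEGER PRIMARY KEY);
    CREATE TABLE addresses (address TEXT PRIMARY KEY);
    CREATE TABLE transactions (
        hash         TEXT PRIMARY KEY,
        block_number INTEGER NOT NULL REFERENCES blocks(number),
        sender       TEXT NOT NULL REFERENCES addresses(address),
        recipient    TEXT NOT NULL REFERENCES addresses(address),
        value        REAL NOT NULL
    );
""")
for tx_hash, block_number, sender, recipient, value in TRANSACTIONS:
    conn.execute("INSERT OR IGNORE INTO blocks VALUES (?)", (block_number,))
    conn.executemany(
        "INSERT OR IGNORE INTO addresses VALUES (?)", [(sender,), (recipient,)]
    )
    conn.execute(
        "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
        (tx_hash, block_number, sender, recipient, value),
    )

# Typical relational question: how much did each address send in total?
total_sent = dict(
    conn.execute("SELECT sender, SUM(value) FROM transactions GROUP BY sender")
)

# Graph model: addresses are nodes, each transaction is a directed link.
graph = defaultdict(list)
for _, _, sender, recipient, value in TRANSACTIONS:
    graph[sender].append((recipient, value))


def reachable(start: str) -> set[str]:
    """Addresses reachable from start by following transactions."""
    seen, stack = set(), [start]
    while stack:
        for neighbour, _ in graph.get(stack.pop(), []):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen
```

Aggregations such as total_sent are natural in SQL, while path-style questions such as reachable("addr_a") are where a graph model tends to be more convenient.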
Processing Blockchain Data in a Pipeline
Ideally you set up an automated pipeline that handles all the data processing for you. The pipeline should cover all the required steps, from taking the data from the source, through transformations and analysis, up to the presentation of the result in a dashboard or report.
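The stages described above can be sketched as a chain of small Python functions. This is only a toy outline under stated assumptions: the extract step returns hard-coded, hypothetical rows instead of calling a real source, and the "report" is a plain string rather than a dashboard. All function names are made up for this example.

```python
from collections import Counter

WEI_PER_ETHER = 10**18


def extract() -> list[dict]:
    """Stand-in for a download, API call or node query (hypothetical rows)."""
    return [
        {"sender": "addr_a", "recipient": "addr_b", "value_wei": 2 * 10**18},
        {"sender": "addr_a", "recipient": "addr_c", "value_wei": 10**18},
        {"sender": "addr_b", "recipient": "addr_c", "value_wei": 0},
    ]


def transform(rows: list[dict]) -> list[dict]:
    """Drop zero-value transfers and add the value converted to ether."""
    return [
        {**row, "value_eth": row["value_wei"] / WEI_PER_ETHER}
        for row in rows
        if row["value_wei"] > 0
    ]


def analyze(rows: list[dict]) -> dict:
    """Sum the ether sent per address."""
    totals = Counter()
    for row in rows:
        totals[row["sender"]] += row["value_eth"]
    return dict(totals)


def report(totals: dict) -> str:
    """Render the result as plain text, one address per line."""
    return "\n".join(
        f"{address}: {amount:.2f} ETH" for address, amount in sorted(totals.items())
    )


def run_pipeline(source, *stages):
    """Call the source, then feed each stage's output into the next stage."""
    result = source()
    for stage in stages:
        result = stage(result)
    return result


summary = run_pipeline(extract, transform, analyze, report)
```

Because each stage only depends on the output of the previous one, a stage can be swapped out (for example, replacing extract with a real API client) without touching the rest.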
Specialist Software and Tools for Blockchain Analytics
Since blockchain technology is still fairly new, there is not yet much tooling available specifically for blockchain analytics.
Please do not hesitate to contact us for further information on this topic.