Open Sourcing our NFT Smart Contract Bytecode Risk Analyzer AI

Protecting the NFT Market: Early Identification of Security Issues in Decentralized Finance

Leveraging Machine Learning for a Safer NFT Ecosystem


The Non-Fungible Token (NFT) market has exploded in popularity within the decentralized finance (DeFi) ecosystem, presenting new investment opportunities for traders and investors. However, it has also brought with it a surge of fraudulent activities, such as scams and rug-pulls, which can lead to substantial financial losses for investors. The lack of regulatory oversight and complexity of the technology make it difficult for investors to detect and avoid these types of fraud.

To address these challenges, we propose a novel approach for early identification of security issues in NFTs using machine learning techniques. Our approach leverages Control Flow Graphs (CFGs) and Graph2Vec embeddings to distinguish between malicious and non-malicious smart contracts. 

Today we are open sourcing our research on this “Smart Contract Bytecode Risk Analyzer AI”.
You can find the source code and the docs here:

Background: The Rise of NFTs and Associated Risks

The Non-Fungible Token (NFT) market has experienced substantial growth in recent years, reaching a value of over $2.5 billion in 2021 from less than $1 million in 2019 (according to NFT projections).

This growth can be attributed to increasing interest from individual and institutional investors, as well as the development of NFT and metaverse-related products.

The popularity of NFTs has been driven by their use in areas such as digital art, fashion, and gaming, where virtual ownership is common. Additionally, the rise of virtual real estate and digital art has also played a significant role in driving the growth of the NFT market.

However, with the growth of the NFT market, several risks and challenges have emerged, with scams and rug-pulls being the most prominent. Scams refer to fraudulent activities where an individual or group creates and promotes a fake NFT project or token with the intention of defrauding investors. On the other hand, rug-pulls refer to a form of exit scam in the NFT market, where the project owner maliciously designs smart contracts to perform detrimental actions such as draining wallets, stealing collectibles, and other malicious activities.

Moreover, lack of regulation and oversight is another major challenge in the NFT market. The lack of clear rules and guidelines makes it challenging for investors to understand the risks and potential returns associated with investing in NFTs. This makes it more difficult for investors to make informed investment decisions, and there are no legal protections in place for investors who fall victim to scams or rug-pulls. Additionally, the complexity of the technology used in NFTs and DeFi can present several challenges for investors and market participants, making it difficult for them to comprehend and analyze potential scams and rug-pulls.

Given the risks and challenges that investors face in identifying and avoiding fraudulent activities, maintaining a stable and secure market environment becomes increasingly difficult. One type of fraud investors may encounter is the rug-pull, which comes in two forms: hard and soft.

Hard rug-pulls involve intentional malicious code in a smart contract, while soft rug-pulls occur when founders fail to deliver on promises and take all profits. To mitigate the risks associated with NFT scams, detecting security issues early on is essential. Our approach specifically targets early detection of hard rug-pulls, aiming to protect investors and promote a more secure NFT market.

Developing a Framework to Detect Security Issues

Detecting malicious smart contracts in the NFT market is crucial for preventing fraudulent activities such as rug-pulls. To achieve this, we developed a framework consisting of multiple steps that involved data collection, cleaning, and analysis, as well as machine learning techniques.

Data Collection and Preprocessing: The first step in our approach is data collection and preprocessing. To train our model to distinguish between malicious and non-malicious smart contracts, we required a comprehensive dataset of smart contract bytecode. The bytecode of smart contracts is publicly accessible on the blockchain and can be obtained from various sources. However, identifying malicious contracts posed a significant challenge, as it required extensive manual research to identify past rug pulls and other fraudulent activities.

To address this, we adopted a hybrid approach that involved combining publicly available datasets with manual research from news articles, tweets, and audit reports. We scraped contracts from sources such as Opensea and Etherscan. After collecting the smart contract addresses, we cleaned the data to ensure it was relevant and appropriate for analysis. This involved removing duplicates and irrelevant information and checking for missing or incomplete information. We then extracted the bytecode for each smart contract address, which allowed us to easily reference the bytecode during the analysis stage.

Labeling: With a clean dataset in hand, we were able to move forward with our research design and begin analyzing the bytecode of each contract. After gathering and extracting the necessary data, we proceeded with the task of labeling each contract in our dataset as either malicious or non-malicious. This process was based on the information we had gathered during the data collection stage, including any relevant external sources of information.

Addressing Class Imbalance: We ended up with 225 malicious contracts and 992 non-malicious contracts. To address the imbalanced nature of the dataset, we adopted a malicious bias approach, undersampling the non-malicious contracts and oversampling the malicious ones.
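
As a rough illustration of this resampling step (not the code from the released repository), the sketch below oversamples the minority class with replacement and undersamples the majority class without replacement. The function name `rebalance` and the target of 500 samples per class are hypothetical; the post does not state the final class sizes.

```python
import random

def rebalance(malicious, benign, target_per_class, seed=0):
    """Bring both classes to target_per_class samples (illustrative only)."""
    rng = random.Random(seed)
    # Oversampling: keep every malicious sample, then add random duplicates.
    missing = max(0, target_per_class - len(malicious))
    oversampled = list(malicious) + [rng.choice(malicious) for _ in range(missing)]
    # Undersampling: draw a random subset without replacement.
    undersampled = rng.sample(benign, min(target_per_class, len(benign)))
    return oversampled[:target_per_class], undersampled

# Class sizes from the text; the contract identifiers are placeholders.
malicious = [f"malicious_{i}" for i in range(225)]
benign = [f"benign_{i}" for i in range(992)]
over, under = rebalance(malicious, benign, target_per_class=500)
print(len(over), len(under))
```

When resampling is combined with cross-validation, it is usually applied to the training portion only, so that duplicated malicious contracts do not appear in both the training and the test split.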

Bytecode Analysis: Since not all NFT smart contracts are open source, we decided to focus on the bytecode of smart contracts to detect malicious ones as soon as possible. We disassembled the bytecode into a sequence of opcodes. Opcodes are easier to read than raw bytecode, but a flat opcode listing still makes it hard to follow the execution flow and spot anomalies.
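
To make the disassembly step concrete, here is a minimal, self-contained sketch. It is not the pyevmasm-based code used in the project: the opcode table is deliberately partial and the function name `disassemble` is our own choice for illustration. The one structural detail it shows is that PUSH1 to PUSH32 (byte values 0x60 to 0x7f) carry 1 to 32 immediate bytes that must be skipped rather than decoded as instructions.

```python
# Partial opcode table for illustration; the EVM defines many more opcodes.
OPCODES = {
    0x00: "STOP", 0x01: "ADD", 0x15: "ISZERO", 0x34: "CALLVALUE",
    0x52: "MSTORE", 0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST",
    0x80: "DUP1", 0xF3: "RETURN", 0xFD: "REVERT",
}

def disassemble(bytecode_hex):
    """Return a list of (pc, mnemonic, operand); operand is an int for PUSHn, else None."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    pc, instructions = 0, []
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSHn: the next n bytes are data, not instructions.
            size = op - 0x5F
            operand = int.from_bytes(code[pc + 1 : pc + 1 + size], "big")
            instructions.append((pc, f"PUSH{size}", operand))
            pc += 1 + size
        else:
            instructions.append((pc, OPCODES.get(op, f"UNKNOWN_0x{op:02x}"), None))
            pc += 1
    return instructions

# PUSH1 0x80, PUSH1 0x40, MSTORE
for pc, name, operand in disassemble("0x6080604052"):
    print(pc, name, "" if operand is None else hex(operand))
```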

Constructing Control Flow Graphs (CFGs): To analyze the bytecode of smart contracts, we constructed Control Flow Graphs (CFGs) using a Python script that consisted of a bytecode parser and a CFG builder. The construction process involved a series of carefully implemented steps to ensure the accuracy and effectiveness of the resulting graphs. Firstly, we disassembled the bytecode into individual opcodes and their corresponding arguments using the pyevmasm library. This enabled us to examine each instruction and understand how it operated. Secondly, we grouped the opcodes into basic blocks using a simple rule that allowed us to segment the bytecode into smaller, more manageable units. Thirdly, we identified the edges between basic blocks by analyzing the opcodes for jump, call, and branch instructions, allowing us to visualize the program’s control flow. Finally, to improve the semantic expressiveness of the CFGs, we performed normalization on the opcodes and operands. By following these steps, we were able to create normalized CFGs that accurately represented the bytecode’s control flow.
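
The block-splitting and edge-finding steps can be sketched as follows. This is a simplified illustration under our own assumptions, not the project's CFG builder: blocks are split at JUMPDEST and after terminating instructions, jump targets are resolved only when a PUSH immediately precedes the jump, and call handling and the normalization step are left out. The function name `build_cfg` and the tuple layout `(pc, mnemonic, operand)` are hypothetical.

```python
TERMINATORS = {"JUMP", "JUMPI", "STOP", "RETURN", "REVERT", "INVALID", "SELFDESTRUCT"}

def build_cfg(instructions):
    """Return (blocks, edges): blocks maps a start pc to its instructions,
    edges is a set of (source start pc, destination start pc)."""
    blocks, current = {}, []
    for ins in instructions:
        # A JUMPDEST can be a jump target, so it starts a new block.
        if ins[1] == "JUMPDEST" and current:
            blocks[current[0][0]] = current
            current = []
        current.append(ins)
        # A terminator closes the block it belongs to.
        if ins[1] in TERMINATORS:
            blocks[current[0][0]] = current
            current = []
    if current:
        blocks[current[0][0]] = current

    starts = sorted(blocks)
    edges = set()
    for i, start in enumerate(starts):
        block = blocks[start]
        last = block[-1][1]
        following = starts[i + 1] if i + 1 < len(starts) else None
        # Static jump edge: only when the target is pushed right before the jump.
        if last in ("JUMP", "JUMPI") and len(block) >= 2 and block[-2][1].startswith("PUSH"):
            target = block[-2][2]
            if target in blocks and blocks[target][0][1] == "JUMPDEST":
                edges.add((start, target))
        # Fall-through edge: JUMPI not taken, or a block cut short by a JUMPDEST.
        if following is not None and (last == "JUMPI" or last not in TERMINATORS):
            edges.add((start, following))
    return blocks, edges

program = [
    (0, "CALLVALUE", None), (1, "PUSH1", 5), (3, "JUMPI", None),
    (4, "STOP", None),
    (5, "JUMPDEST", None), (6, "STOP", None),
]
blocks, edges = build_cfg(program)
print(sorted(blocks), sorted(edges))
```

Jumps whose target is computed at runtime are not resolved by this sketch; a real builder needs some form of stack tracking to recover those edges.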

Graph Embedding: To capture the closeness or similarity between graph instances and systematically extract features for classification, we utilized the Graph2Vec algorithm to embed the Control Flow Graphs (CFGs) as fixed-length vectors. This algorithm is designed to learn distributed representations of graphs and is unsupervised, making it a powerful tool for recognizing complex patterns in the CFGs using a neural network.

To generate embeddings for the CFGs, we utilized the karateclub.graph2vec library, which employs the Weisfeiler-Lehman tree features to capture the local node structure of the graph and a document-feature co-occurrence matrix to generate embeddings. This approach is task agnostic and captures structural equivalence, allowing the embeddings learned from Graph2Vec to be used for various downstream tasks, such as classification and clustering, with or without fine-tuning.
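
The Weisfeiler-Lehman feature extraction mentioned above can be illustrated with a short, library-free sketch. It is not karateclub's implementation and it stops before the embedding step: it only shows how each node's label is repeatedly combined with its neighbours' labels, so that a graph becomes a bag of structural "words". The function name `wl_features`, the hash truncation, and the toy graphs are assumptions made for the example.

```python
import hashlib

def wl_features(adjacency, labels, iterations=2):
    """adjacency: {node: [neighbours]}, labels: {node: initial label}.
    Returns every label produced across all refinement rounds."""
    current = dict(labels)
    features = list(current.values())
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            # New label = own label plus the sorted labels of the neighbours.
            neighbour_part = ",".join(sorted(current[n] for n in neighbours))
            signature = current[node] + "|" + neighbour_part
            updated[node] = hashlib.md5(signature.encode()).hexdigest()[:8]
        current = updated
        features.extend(current.values())
    return features

# Two differently numbered path graphs and one triangle, all nodes labelled "x".
path_a = {0: [1], 1: [0, 2], 2: [1]}
path_b = {9: [7], 7: [9, 5], 5: [7]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

def features_of(graph):
    return sorted(wl_features(graph, {node: "x" for node in graph}))

print(features_of(path_a) == features_of(path_b))    # same structure
print(features_of(path_a) == features_of(triangle))  # different structure
```

In Graph2Vec, feature bags of this kind play the role of documents, and a document-embedding model then maps each graph to a fixed-length vector.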

Overall, our use of Graph2Vec allowed us to effectively represent the CFGs as fixed-length vectors and leverage the power of neural networks to distinguish between malicious and non-malicious smart contracts.

Custom Neural Network Classifier: After generating Graph2Vec embeddings for the CFGs, we employed a custom neural network classifier to distinguish between malicious and non-malicious smart contracts. The neural network model consisted of multiple layers of neurons that were trained on a labeled dataset, with inputs being the Graph2Vec embeddings of the CFGs, and the outputs being the classification of the smart contracts as malicious or non-malicious.
The classifier consisted of an input linear layer, a hidden layer with Leaky ReLU activation, and an output layer that produced a single value (either 0 or 1).
The model was trained in batches using stochastic gradient descent, and K-fold cross-validation was used to check that the results generalized across data splits rather than reflecting overfitting to a single split. Performance was measured using standard metrics such as precision, recall, and F1 score.
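
The forward pass of such a classifier can be sketched in a few lines of plain Python. This is an illustration only: the sigmoid on the output, the 0.5 threshold, the Leaky ReLU slope of 0.01, and the hand-picked weights are our assumptions, and training is omitted entirely.

```python
import math

def leaky_relu(x, slope=0.01):
    """Pass positive values through, scale negative values by a small slope."""
    return x if x > 0 else slope * x

def linear(vector, weights, biases):
    """Dense layer; weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]

def predict(embedding, params, threshold=0.5):
    """Linear -> Leaky ReLU -> linear -> sigmoid, thresholded to a 0/1 label."""
    hidden = [leaky_relu(h) for h in linear(embedding, params["w1"], params["b1"])]
    logit = linear(hidden, params["w2"], params["b2"])[0]
    probability = 1.0 / (1.0 + math.exp(-logit))
    return (1 if probability >= threshold else 0), probability

# Toy 2-dimensional "embedding" and hand-picked weights, for illustration only.
params = {
    "w1": [[1.0, 0.0], [0.0, 1.0]], "b1": [0.0, 0.0],
    "w2": [[1.0, -1.0]], "b2": [0.0],
}
print(predict([2.0, 1.0], params))
print(predict([-1.0, 3.0], params))
```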

Optimizing and Evaluating the Model

After training the neural network classifier on the Graph2Vec embeddings with an initial set of hyperparameters, we implemented several optimizations to improve its performance. These included implementing a dynamic learning rate for the neural network, experimenting with different optimization algorithms such as Stochastic Gradient Descent (SGD) and Adam, and developing a hyperparameter optimizer to automatically tune the hyperparameters for our use case.

We evaluated the model’s performance using standard metrics such as precision, recall, and F1 score. The precision metric measures the percentage of predicted malicious contracts that are actually malicious, while the recall metric measures the percentage of actual malicious contracts that are correctly identified as such. The F1 score is the harmonic mean of the precision and recall metrics, providing a balanced evaluation of the model’s performance.
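
These three metrics follow directly from the counts of true positives, false positives, and false negatives. The short sketch below computes them for a made-up set of labels, where 1 stands for malicious and 0 for non-malicious; the function name and the example labels are ours.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the positive (malicious = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
print(precision_recall_f1(y_true, y_pred))  # (0.75, 0.75, 0.75)
```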


Overall, the custom neural network classifier performed well in identifying malicious smart contracts. The results were obtained through 10-fold cross-validation, in which the model was trained and tested on different subsets of the data, and we report the minimum and maximum of each metric across folds. The F1 score ranged from 0.649 to 0.857. Recall, which measures the percentage of malicious contracts correctly identified by the model, ranged from 0.571 to 0.875: the model identified the majority of malicious contracts in every fold, although in the weakest fold roughly 43% of them were missed. Accuracy ranged from 0.861 to 0.967, indicating a low overall error rate. The spread between the minimum and maximum values shows how much performance depends on the particular split, which is expected with a dataset of this size.
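
For readers unfamiliar with the procedure, the splitting behind 10-fold cross-validation can be sketched as follows. This is an illustration, not the project's evaluation code: the function name `k_fold_indices` is hypothetical, and the split is a plain shuffled one, since the post does not say whether the folds were stratified by class.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 225 malicious + 992 non-malicious contracts = 1217 samples before resampling.
for fold_number, (train, test) in enumerate(k_fold_indices(1217, k=10), start=1):
    print(fold_number, len(train), len(test))
```

Each sample lands in the test set of exactly one fold, so every reported per-fold metric is computed on data the model did not see during that fold's training.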


Our approach provides an efficient and effective way to detect potential scams and rug-pulls in smart contracts, and has the potential to be utilized by various organizations dealing with smart contracts. However, as with any approach, there are limitations to be addressed. The small dataset used for training and evaluation may limit its generalizability, and our approach may not be effective in detecting certain types of rug-pulls that require off-chain data and social network analysis.

Future research can explore alternative approaches and techniques to improve the accuracy and generalizability of our approach. These can include identifying different types of malicious smart contracts, using explainable AI techniques to improve interpretability, and investigating reinforcement learning techniques to develop self-improving smart contract analysis tools.

Despite these limitations, our approach has shown promising results in detecting potential rug-pulls in smart contracts. By implementing frameworks like the one developed in this study, investors can better protect themselves from the risks associated with NFTs and decentralized finance systems. As the technology continues to evolve and attract more attention from investors, effective tools and strategies are essential to protect users from potential scams and fraud. Continued research and collaboration between researchers, industry professionals, and regulators are crucial to identifying and addressing emerging threats in the NFT market, ensuring a safer and more reliable experience for all participants.


We have open sourced our “AISafeGuard: AI-Powered Smart Contract Risk Analysis”.
You can find the source code and the docs here:

Contributions are welcome!


Big thanks go to Nader Bennour as the main developer of this prototype, to Jens Ernstberger from the TUM for the amazing scientific support, and to Antonius Gress for the supervision. The Blockbrain team is very grateful for the cooperation with the TUM on this project!