While searching for Web3 credit scoring models, I came across RociFi, a protocol that has gradually evolved from a scoring model embedded in a lending protocol into a service that delivers credit scores through a data feed.
What is RociFi?
RociFi (https://roci.fi/) started as an all-in-one protocol that combined a credit scoring solution with a lending protocol. Recently, RociFi merged with Gora and shifted its business model to become an oracle data feed provider. As the Web3 ecosystem increasingly focuses on building fundamental primitives rather than all-in-one solutions, RociFi followed this trend and transformed into a lending risk oracle.
However, some tools are no longer functional, such as the analytics tool (https://roci.fi/app/analytics), and the platform’s user interface for scoring any address is currently non-operational.

Analyzing RociFi’s Machine Learning Model
Despite these changes, RociFi’s model is still available in a public repository for analysis. The machine learning model behind the credit scoring is explained in detail here: Building an ML Model for Undercollateralized DeFi. Running the “CSM Playground” code locally can be tricky, so I recommend using Google Colab for easier execution. You’ll need to upload the example.csv file and the model to your Colab instance.
RociFi’s Credit Score Scale
RociFi’s credit score scale ranges from 1 to 10, where:
- 1 indicates the lowest credit risk (best score)
- 10 indicates the highest credit risk (worst score)
The data consumed by the model is grouped into five main categories:
The example.csv file attached to the repository contains 98 features, but the model consumes only 39 of them. The 39 features used by the model are:
total_borrow, count_borrow, avg_borrow_amount, std_borrow_amount, borrow_amount_cv, total_repay, count_repay, avg_repay_amount, std_repay_amount, repay_amount_cv, total_deposit, count_deposit, avg_deposit_amount, std_deposit_amount, deposit_amount_cv, total_redeem, count_redeem, avg_redeem_amount, std_redeem_amount, redeem_amount_cv, total_liquidation, count_liquidation, avg_liquidation_amount, std_liquidation_amount, liquidation_amount_cv, days_since_first_borrow, net_outstanding, int_paid, net_deposits, count_repays_to_count_borrows, avg_repay_to_avg_borrow, net_outstanding_to_total_borrowed, net_outstanding_to_total_repaid, count_redeems_to_count_deposits, total_redeemed_to_total_deposits, avg_redeem_to_avg_deposit, net_deposits_to_total_deposits, net_deposits_to_total_redeemed, avg_liquidation_to_avg_borrow.
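The feature names follow a regular pattern: a total, count, average, standard deviation, and coefficient of variation (cv) per event type, plus ratios between event types. The sketch below shows one way such features could be derived from raw event amounts. It is an illustration only: the helper names `amount_features` and `safe_ratio`, the use of the population standard deviation, and the zero-handling are assumptions, not RociFi's actual pipeline.

```python
import statistics


def amount_features(prefix, amounts):
    """Summarise event amounts into total/count/avg/std/cv features (hypothetical helper)."""
    count = len(amounts)
    total = sum(amounts)
    avg = total / count if count else 0.0
    # Assumption: population standard deviation; the original may use the sample version.
    std = statistics.pstdev(amounts) if count else 0.0
    # Coefficient of variation = std / mean; defined here as 0 when the mean is 0.
    cv = std / avg if avg else 0.0
    return {
        f"total_{prefix}": total,
        f"count_{prefix}": count,
        f"avg_{prefix}_amount": avg,
        f"std_{prefix}_amount": std,
        f"{prefix}_amount_cv": cv,
    }


def safe_ratio(numerator, denominator):
    """Ratio that falls back to 0 when the denominator is 0 (an assumption)."""
    return numerator / denominator if denominator else 0.0


# Toy data for a single address.
borrows = [100.0, 300.0]
repays = [100.0, 100.0]

features = {**amount_features("borrow", borrows), **amount_features("repay", repays)}
features["count_repays_to_count_borrows"] = safe_ratio(
    features["count_repay"], features["count_borrow"]
)
features["avg_repay_to_avg_borrow"] = safe_ratio(
    features["avg_repay_amount"], features["avg_borrow_amount"]
)
```

With this toy data, the borrow amounts vary (cv of 0.5) while the repay amounts are constant (cv of 0), which is the kind of behavioral signal the `*_cv` columns appear to capture.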
Scoring and Normalization Process
The model’s raw outputs are not normalized, so they are post-processed before becoming a final score. Here’s how it works:
- convert_prob_to_score: Normalizes predicted probabilities into a score between 1 and 10.
- count_borrow: Penalizes users with a high number of borrowing events, factoring in statistical uncertainty. High or frequent borrowing is considered a risk factor and lowers the score.
- transform_predictions: Combines penalization and normalization to produce a final score, ensuring that users with fewer borrowing events are treated differently.
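To make the normalization step concrete, here is a minimal sketch of mapping a predicted probability onto the 1 to 10 scale. It assumes the model outputs a probability of default in [0, 1] and uses equal-width bins. The function name `convert_prob_to_score_sketch` is hypothetical, and the repository's own convert_prob_to_score may use different cut-offs. The borrow-count penalization is not reproduced here.

```python
import math


def convert_prob_to_score_sketch(prob, low=1, high=10):
    """Map a probability in [0, 1] to an integer score in [low, high].

    Assumption: equal-width bins, where a low probability of default
    gives a low (better) score and a high probability a high (worse) one.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must be in [0, 1]")
    n_bins = high - low + 1
    # floor() picks the bin; min() clamps prob == 1.0 into the last bin.
    return min(high, low + math.floor(prob * n_bins))


example_scores = [convert_prob_to_score_sketch(p) for p in (0.0, 0.05, 0.55, 1.0)]
```

Under these assumptions, a predicted probability of 0.05 lands in the best bucket and 0.55 lands in the sixth.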
Correlation Analysis
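With the credit score in hand, we can explore the correlation between the dataset’s columns and the final score. Because a higher score means higher risk, a positive correlation means the column tends to rise with risk. Below are the 10 strongest positive correlations (“Not in features” means the column is not among the 39 model inputs):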
- eth_coef_activity: 0.168041 (Not in features).
- eth_min_sent_value: 0.162103 (Not in features).
- eth_coef_activity_send: 0.139009 (Not in features).
- eth_avg_days_delay_between_sent: 0.138705 (Not in features).
- liquidated_60_days: 0.113128 (Not in features).
- eth_count_of_contract: 0.102769 (Not in features).
- eth_contract_value_pct_change_60_days: 0.099055 (Not in features).
- eth_receive_value_pct_change_60_days: 0.096795 (Not in features).
- eth_contract_value_pct_change_14_days: 0.093838 (Not in features).
- eth_contract_value_pct_change_7_days: 0.093838 (Not in features).
Inverse Relationships
Below are the 10 columns with the strongest negative correlations (inverse relationships) with the final score. For these, higher values go with a lower, that is better, score:
- days_since_first_borrow: -0.562672 (In features).
- repay_amount_cv: -0.515483 (In features).
- borrow_amount_cv: -0.514606 (In features).
- deposit_amount_cv: -0.477170 (In features).
- redeem_amount_cv: -0.417422 (In features).
- total_redeemed_to_total_deposits: -0.301257 (In features).
- count_borrow: -0.299121 (In features).
- total_redeem: -0.278000 (In features).
- count_redeem: -0.255720 (In features).
- total_borrow: -0.249974 (In features).
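The two rankings above can be reproduced with a few lines of code. The sketch below assumes a Pearson correlation between each numeric column and the final score. The post does not state which coefficient was used, so treat that as an assumption. The toy values are made up for illustration and are not taken from example.csv.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; NaN if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def rank_correlations(columns, scores):
    """Return (column, correlation) pairs sorted from most positive to most negative."""
    corr = {name: pearson(values, scores) for name, values in columns.items()}
    # Constant columns have no defined correlation, so drop them.
    corr = {name: value for name, value in corr.items() if not math.isnan(value)}
    return sorted(corr.items(), key=lambda item: item[1], reverse=True)


# Made-up toy data: four addresses and their final scores.
toy_columns = {
    "days_since_first_borrow": [400, 300, 200, 100],
    "eth_coef_activity": [1.0, 2.0, 2.5, 4.0],
    "constant_column": [1, 1, 1, 1],
}
toy_scores = [2, 4, 6, 8]

ranking = rank_correlations(toy_columns, toy_scores)
```

The head of `ranking` gives the positive correlations and the tail gives the inverse relationships.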
Supported Lending Protocols
The borrow and repay data is gathered from the following lending protocols:
- Aave
- Compound
- Cream
- RociFi
- Venus
- MakerDAO
- GMX
- Radiant
For address-specific data, the model uses data from Ethereum and protocols across Arbitrum, Fantom, Polygon, Optimism, BSC, and Avalanche.
Final Notes:
- An initial exploratory analysis using Sweetviz can be found in example_report.html.
- The dataset does not differentiate between EOA and Smart Contract addresses. In fact, both types of addresses are listed in the example.
- How the model responds to Safe-type addresses is unclear.
- Integrating additional databases like ETHs, RNS, or the Arkham database could be considered to enrich the model.
- Acquiring DEX data can be challenging, as decentralized exchanges are increasingly becoming underlying tools. Often, users do not interact directly with DEXes but rather with bridges or other tools that use DEXes behind the scenes. In any case, the model does not consume this DEX data.
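On the EOA versus smart contract note above: one common way to tell the two apart is the standard `eth_getCode` JSON-RPC method, which returns `0x` for an address with no deployed code. The sketch below only builds the request payload and interprets a response. The helper names are hypothetical, the zero address is a placeholder, and sending the request to a node is left out.

```python
import json


def build_get_code_request(address, request_id=1):
    """Build the JSON-RPC payload for eth_getCode at the latest block."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_getCode",
        "params": [address, "latest"],
        "id": request_id,
    })


def is_contract(code_hex):
    """True when eth_getCode returned non-empty bytecode."""
    return code_hex not in ("0x", "")


# Placeholder address for illustration only.
payload = build_get_code_request("0x0000000000000000000000000000000000000000")
```

A Safe is itself a contract account, so this check would label it as a smart contract rather than an EOA. A column like this could be added to the dataset before scoring.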