CorneliusLM is a project that aims to fine-tune a base model (Qwen/Qwen2.5-1.5B-Instruct) to give it the voice of a pretentious philosophical contrarian. The base model was quantized and a LoRA adapter was trained on a synthetic dataset. The dataset of 5k chat message sequences was originally taken from the OpenAssistant/oasst2 dataset and transformed in batches using LangChain and OpenAI's gpt-4o.
It still needs more work, but you can interact with the current version of Cornelius here.
Tech stack:
[ transformers ] [ trl ] [ bitsandbytes ] [ datasets ] [ LangChain ] [ openrouter.ai ]
Project link:
CodeMentat is an attempt to improve coding AI agents by elevating their code analysis and understanding capabilities. The approach is based on decomposing the code repository into a tree of code snippets (based on the Abstract Syntax Trees of individual languages, optionally leveraging Microsoft's Language Server Protocol) and examining simple relationships between the snippets.
By organising the entire code repository into a tree in which the relationships between nodes take the form "A uses B and C", it's possible to recursively request an interpretation of code snippets that don't use anything (the most basic building blocks, e.g. B and C) and, once these explanations are available, to use them when interpreting more complex snippets (snippet A). This way, by exploring the tree, the functionality of the entire application can be understood.
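The bottom-up traversal described above can be sketched as a memoized post-order walk over the "uses" relationships. This is only an illustration, not CodeMentat's actual code: the `Snippet` shape, the `explainAll` function, and the `explain` callback (a synchronous stand-in for what would be an LLM request) are all hypothetical names introduced here.

```typescript
// Hypothetical snippet node: `uses` lists the ids of the snippets it depends on.
interface Snippet {
  id: string;
  code: string;
  uses: string[];
}

// Stand-in for an LLM call (assumption): it receives the snippet together
// with the already-computed explanations of everything the snippet uses.
type Explain = (snippet: Snippet, deps: Map<string, string>) => string;

function explainAll(snippets: Snippet[], explain: Explain): Map<string, string> {
  const byId = new Map<string, Snippet>();
  for (const s of snippets) byId.set(s.id, s);

  const done = new Map<string, string>(); // finished explanations, keyed by id
  const inProgress = new Set<string>();   // ids on the current path, for cycle detection

  function visit(id: string): string {
    const cached = done.get(id);
    if (cached !== undefined) return cached;

    const snippet = byId.get(id);
    if (snippet === undefined) throw new Error(`Unknown snippet: ${id}`);
    if (inProgress.has(id)) throw new Error(`Cyclic dependency at: ${id}`);

    inProgress.add(id);
    // Explain the dependencies first; leaves (empty `uses`) skip this loop.
    const deps = new Map<string, string>();
    for (const depId of snippet.uses) deps.set(depId, visit(depId));
    inProgress.delete(id);

    // Only now explain the snippet itself, with its dependencies as context.
    const explanation = explain(snippet, deps);
    done.set(id, explanation);
    return explanation;
  }

  for (const s of snippets) visit(s.id);
  return done;
}
```

With snippets A (uses B and C), B and C, the callback is invoked for B and C before A, and each snippet is explained only once even if several others use it. The cycle check is an addition of this sketch: mutually recursive code would need separate handling that the description above does not cover.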
Tech stack:
[ TypeScript ] [ Claude ] [ LangChain ] [ openrouter.ai ]
Project link:
This project is part of my article series on threat modelling for blockchains, "Layered threat model for Web3 applications, part 1: distribution of responsibility". The goal of this project is to provide a basis for discussing potential threats and vulnerabilities in the various layers of a blockchain network's functional stack. This project should not be used for production purposes without first mitigating all of its internal weaknesses and vulnerabilities.
Tech stack:
[ Rust ] [ Alloy ] [ Tokio ]
Project link: