About Me

Who Am I?

Hi, I am Aryan Dua, a Computer Science student at IIT Delhi with a passion for exploring the limitless possibilities of technology. My journey is driven by an insatiable curiosity for Machine Learning, Artificial Intelligence, and their transformative real-world applications. From designing AI-driven personalized treatment pathways to fine-tuning language models, I thrive at the intersection of innovation and impact.

I've delved into diverse areas such as natural language processing, image recognition, and graph algorithms, honing my expertise in building neural networks and deploying large-scale models on HPC clusters. My academic and project experiences have equipped me to tackle end-to-end challenges, from data preprocessing and feature engineering to model optimization and deployment.

What excites me most is solving complex problems through elegant code and collaborative teamwork. Whether developing AI products, contributing to ML research, or pushing the boundaries of deep learning, I aim to create technology that not only innovates but also benefits society at large.

Beyond the screen, I find joy in making technology more accessible and impactful, always looking for ways to give back to the community. Let's connect and explore how we can drive meaningful change together!

Here, check out some of my interests:

AI

Competitive Programming

NLP

Automation

8 Dept Rank
750+ Competitive Programming Problems Solved
516 All India Rank

Education

Academic life

I am in the 9th semester of my degree at IIT Delhi. I have completed the B.Tech part of my degree with a GPA of 9.00, while the M.Tech part is ongoing. I have also made the merit list of the "top 7% students in a semester" twice so far.

I cleared the prestigious JEE Advanced entrance examination in 2020 with an All India Rank of 516, out of the 250,000 students who took it.

I cleared the JEE Main entrance examination in 2020 with an All India Rank of 534, out of the 1,000,000 students who took it.

I cleared the prestigious KVPY scholarship examination in 2020 with an All India Rank of 677, and was awarded the scholarship. I also scored 418/450 in the BITSAT entrance examination that tests English, science, maths and logic. I was in the top 100 students among those who appeared for it.

I was one of the few who cleared the NSEC (National Standard Examination in Chemistry) and qualified for the Indian National Chemistry Olympiad 2020. The qualification rate in the exam was 1%. I also cleared the regional round of the Maths Olympiad with a state rank of 12.

Courses Completed

COL106

Data Structures and Algorithms

COL333

Artificial Intelligence

COL774

Machine Learning

COL864

Advanced AI

ELL884

NLP

AIL821

Advances in LLMs

Master's Thesis

AI-Based Personalized Treatment Pathways
January 2024 - Present

  • Ongoing

Academic Projects

Graph-to-Text Conversion Using GPT-2 October - November 2024

Replicated and extended the graph-to-text conversion methodology using GPT-2 on the WebNLG dataset. Adapted the original framework, which utilized BERT and T5, to fine-tune GPT-2 for generating coherent text from structured graphs. Conducted experiments to evaluate model performance and text quality.

Hate Span Detection February - March 2024

Developed a system to identify hateful spans within sentences classified as hateful, leveraging BERT and Conditional Random Fields (CRF) models for improved sequence labeling.

  • BERT: Utilized pre-trained BERT for tokenization, numerical embedding transformation, and performance optimization over multiple epochs.
  • CRF: Implemented CRFs for sequence labeling tasks, extracting features such as word casing, suffixes, and numerical properties to capture contextual patterns indicative of hate speech.
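
The hand-crafted CRF features mentioned above (word casing, suffixes, numerical properties) could be extracted roughly as follows. This is a minimal sketch, not the project's actual code: the function names `token_features` and `sentence_features`, the feature keys, and the choice of neighbouring-word context are all assumptions made for illustration.

```python
def token_features(tokens, i):
    """Build a feature dict for the token at position i (hypothetical feature set)."""
    word = tokens[i]
    features = {
        "lower": word.lower(),
        "is_upper": word.isupper(),      # word casing
        "is_title": word.istitle(),
        "suffix2": word[-2:],            # suffixes
        "suffix3": word[-3:],
        "is_digit": word.isdigit(),      # numerical properties
        "has_digit": any(ch.isdigit() for ch in word),
    }
    # Context from neighbouring tokens, with markers for sentence boundaries.
    if i > 0:
        features["prev_lower"] = tokens[i - 1].lower()
    else:
        features["BOS"] = True
    if i < len(tokens) - 1:
        features["next_lower"] = tokens[i + 1].lower()
    else:
        features["EOS"] = True
    return features


def sentence_features(tokens):
    """Feature dicts for every token, in the per-token shape CRF toolkits usually expect."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

Each sentence becomes a list of feature dicts, one per token, which a sequence labeller can pair with per-token span labels.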

CRDT-Based Collaborative Editor October - November 2023

A CRDT-based collaborative Jupyter notebook-styled editor that facilitates seamless real-time document collaboration, ensuring conflict-free concurrent edits, offline editing capabilities, and efficient conflict resolution.

Experimented with PROGPROMPT October - November 2023

Experimented with an implementation of the PROGPROMPT paper, which presents a programmatic LLM prompt structure that enables plan generation across situated environments, robot capabilities, and tasks. The key insight is to prompt the LLM with program-like specifications of the available actions and objects in an environment, as well as with example programs that can be executed.

Distributed Streaming Word Count Application September - October 2023

Implemented a mini-batching streaming system for counting words in a stream of tweets and stored the counts in Redis. Ensured Redis fault tolerance by checkpointing, and worker fault tolerance by updating Redis atomically using Lua.
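
One way such atomic updates can give worker fault tolerance is to make each batch update idempotent, so that a batch replayed after a crash is not counted twice. The sketch below illustrates that idea only; it is not the project's implementation. `FakeStore` is a hypothetical in-memory stand-in for Redis, and its `apply_batch` method plays the role that a single Lua script would play on the server.

```python
from collections import Counter


class FakeStore:
    """In-memory stand-in for Redis (an assumption made for illustration)."""

    def __init__(self):
        self.counts = Counter()
        self.applied_batches = set()

    def apply_batch(self, batch_id, batch_counts):
        # In Redis, this check-and-update would run inside one Lua script,
        # so no client could observe a half-applied batch.
        if batch_id in self.applied_batches:
            return False  # replayed batch: skip to avoid double counting
        self.counts.update(batch_counts)
        self.applied_batches.add(batch_id)
        return True


def mini_batches(tweets, batch_size):
    """Yield (batch_id, tweets) pairs of at most batch_size tweets each."""
    for start in range(0, len(tweets), batch_size):
        yield start // batch_size, tweets[start:start + batch_size]


def count_words(batch):
    """Count whitespace-separated, lower-cased words in one mini-batch."""
    counts = Counter()
    for tweet in batch:
        counts.update(tweet.lower().split())
    return counts
```

A worker that crashes after sending a batch can simply resend it on restart; the store ignores batch ids it has already applied.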

Built a Covid-19 Data Visualiser April - May 2023

From open source data, I built a web application using Django with a Postgres-12 server at the backend. The application lets you choose the filters to apply for certain statistics of a country, displays the list of countries that satisfy the filter and plots Covid-19 graphs for those countries.

Tinkering with the Linux Kernel February - April 2023

Attempted the hard track (rather than the easy track) of assignments based on the latest Linux kernel. Implemented features ranging from scheduling to memory management to device drivers in the kernel. Was privileged to be a part of the first Operating Systems course in the world to teach concepts practically through the Linux kernel. We spent 4 months looking at the intricacies of the kernel code during lectures and applied these concepts in assignments, for which we had to look through the kernel documentation and elixir.bootlin. The most challenging aspect of this project was implementing OS task scheduling. We implemented a thread-safe data structure in the Linux kernel to track the number of context switches of processes. We added system calls to the kernel for registering and de-registering processes and returning the number of switches.

Multi-Class Image Classifier October - November 2022

Implemented a general neural network architecture to learn a model for multi-class classification, working on the Fashion-MNIST dataset (a dataset of Zalando's article images, where each example is a 28 x 28 grayscale image with a label from 10 classes).

Naive-Bayes Text Classifier September - October 2022

Implemented a text classifier using the Naive Bayes algorithm that works on the Large Movie Review dataset and predicts the sentiment of a movie review as positive or negative. Also generated a word cloud of the most frequently occurring words in the positive and negative classes.
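
For readers unfamiliar with the technique, here is a minimal sketch of a multinomial Naive Bayes sentiment classifier. It is illustrative only: the class name `NaiveBayes`, the whitespace tokenisation, and the use of add-one (Laplace) smoothing are assumptions, not details taken from the project.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative sketch)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in doc.lower().split():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.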

PSP Content Distribution Network September - October 2022

Designed a scalable and efficient multithreaded client-server architecture for the distribution of a decentralized file to users.

IITD Maze Game March - April 2022

A multiplayer maze game based on the IIT Delhi map, built using the SDL2 library in C++. The multiplayer design is implemented using sockets. The game is a basic simulator of life at IITD. You have 4 coefficients that determine your "score", and you must be sure not to collide with the angry professor. Instructions to download and play are in the README.

Data-Driven Selection In Artificial Muscle February - April 2022

Built a multi-label SVM classifier which can classify the type of actuator required based on input features like Stress, Strain, Efficiency, etc. The data collected from the research site was sparse and incomplete, so I had to implement a novel approach to build the classifier, which has an accuracy of 72% as of now. Instead of using all 5 feature columns to build the classifier, I used them 2 at a time to maximise the number of training examples with all non-null features, which gave 10 classifiers. I then found the final prediction by compiling the predictions from all the classifiers.
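
The pairwise-feature idea above can be sketched as follows. This is a hedged illustration, not the project's code: a tiny nearest-centroid classifier (`CentroidClassifier`) stands in for the SVM so the example stays dependency-free, and combining predictions by majority vote is an assumption about how the final prediction was compiled. The names `train_pairwise` and `predict_vote` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


class CentroidClassifier:
    """Nearest-centroid classifier on 2D points; a stand-in for an SVM."""

    def fit(self, points, labels):
        sums = defaultdict(lambda: [0.0, 0.0])
        counts = Counter(labels)
        for (x, y), label in zip(points, labels):
            sums[label][0] += x
            sums[label][1] += y
        self.centroids = {
            label: (s[0] / counts[label], s[1] / counts[label])
            for label, s in sums.items()
        }
        return self

    def predict(self, point):
        x, y = point
        return min(
            self.centroids,
            key=lambda l: (x - self.centroids[l][0]) ** 2
            + (y - self.centroids[l][1]) ** 2,
        )


def train_pairwise(rows, labels, n_features):
    """Train one model per feature pair, using only rows where both are non-null."""
    models = {}
    for i, j in combinations(range(n_features), 2):
        usable = [
            ((row[i], row[j]), label)
            for row, label in zip(rows, labels)
            if row[i] is not None and row[j] is not None
        ]
        if usable:
            points, pair_labels = zip(*usable)
            models[(i, j)] = CentroidClassifier().fit(points, pair_labels)
    return models


def predict_vote(models, row):
    """Majority vote over the models whose two features are present in row."""
    votes = Counter()
    for (i, j), model in models.items():
        if row[i] is not None and row[j] is not None:
            votes[model.predict((row[i], row[j]))] += 1
    return votes.most_common(1)[0][0] if votes else None
```

With 5 feature columns, `combinations` yields the 10 pairs mentioned above, and each pair's model sees every row that has those two values, even if other columns are missing.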

An ARMv4T CPU Implementation January - March 2022

Developed a fully operational processor using VHDL, capable of executing a comprehensive range of machine-level instructions. Upon translating assembly programs into their corresponding binary representations, the processor reliably performs the specified sequence of commands.

A Compiler for the WHILE language January - March 2022

Built a full compiler which first converts any given program written in the WHILE programming language to an Abstract Syntax Tree, then evaluates the program from that AST using a VMC (Value - Memory - Command) stack machine implementation. The compiler was written in SML (ML-Lex and ML-Yacc were used for lexing and parsing the program).

ML in Speech Processing January - February 2022

Developed a fundamental audio processing library capable of detecting a specified set of twelve predetermined keywords from a one-second audio segment. The machine learning algorithm employed for this task was crafted using C++. However, the project encompassed multiple programming languages including C++, Python, bash scripting, and Makefiles.

Merge Sort in Assembly January - February 2022

Implemented the merge sort algorithm in Assembly language, sorting sets of strings given in input files, lexicographically. This was simulated using ARMSim.
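
For reference, the same algorithm in a high-level language looks like the sketch below. It only illustrates the merge sort logic on strings; the original project was written in ARM assembly, and the function name `merge_sort` is chosen here for illustration.

```python
def merge_sort(strings):
    """Return a new list with the strings sorted lexicographically."""
    if len(strings) <= 1:
        return list(strings)
    mid = len(strings) // 2
    left = merge_sort(strings[:mid])
    right = merge_sort(strings[mid:])

    # Merge the two sorted halves; "<=" keeps equal strings in input order.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```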

Edge Detection January - March 2022

Built a program to detect edges in an image using the properties of high and low-pass filters, 2D convolutions, Gaussian kernels, Fourier transforms, etc. The novel part of this project was that we did not use the built-in OpenCV library functions; we implemented the whole program ourselves, from scratch.
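
A from-scratch 2D convolution of the kind described above can be sketched in a few lines. The example is an illustration under assumptions: it uses Sobel kernels as the high-pass filter (the project description does not name the kernels used), represents images as lists of lists, and computes only the "valid" region without padding.

```python
def convolve2d(image, kernel):
    """'Valid' 2D convolution of a list-of-lists image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    # True convolution flips the kernel in both directions.
    flipped = [row[::-1] for row in kernel[::-1]]
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        out_row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * flipped[i][j]
            out_row.append(acc)
        out.append(out_row)
    return out


# Sobel kernels: an assumed choice of high-pass filter for this sketch.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def edge_magnitude(image):
    """Gradient magnitude per pixel; large values indicate edges."""
    gx = convolve2d(image, SOBEL_X)
    gy = convolve2d(image, SOBEL_Y)
    return [
        [(x * x + y * y) ** 0.5 for x, y in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]
```

On an image with a vertical step from dark to bright, the magnitude is large in the columns that straddle the step and zero in flat regions. Smoothing with a Gaussian kernel first, using the same `convolve2d`, would reduce the response to noise.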

DSCoin, A Blockchain Implementation August - October 2021

Developed and implemented my own cryptocurrency, along with the software for processing transactions, using data structures in Java.

Developing machine learning models July - August 2021

Developed machine learning models to solve problems based on classification, neural networks, and supervised and unsupervised learning algorithms.

Internship Experience

Quant Intern - Quadeye May - July 2024

  • Worked in Equity Derivatives to recognise profitable opportunities and devised a high-frequency arbitrage strategy.
  • Produced competitive market returns for cash and carry strategy in multiple underlyings with minimal delta.
  • Analysed performance, risk and market exposure through advanced quantitative techniques and simulation.
  • Analysed Volume, Trades and Order-book changes to generate signals for making and liquidating market positions.

ML Engineer - Ripik.ai May - July 2023

  • Contributed to the development and refinement of Ripik Optimus, a state-of-the-art AI production planner and scheduling tool tailored for pharmaceutical drug manufacturing and quality testing built for SunPharma.
  • Architected task-based graph optimization algorithms, resulting in optimal drug production schedules.
  • Designed robust pipelines to streamline data preprocessing and feature engineering of raw plant data.

Research Intern - Zuse Institute Berlin June - August 2022

Programmed asynchronous algorithms involving gradients, projections, and proximity operators. Determined hyperparameter selection strategies for the synchronous and asynchronous versions of the algorithms and compared the two.

My Skills

Some technical skills I have learnt along my journey are:

Python

C

C++

Prolog

SML

VHDL

Assembly

Java

HTML5, CSS3

LaTeX

Extra-curriculars

Recommendations

Himanshu
July 27, 2023

Himanshu Mittal, Mentor, Ripik.ai

"I am thrilled to offer my wholehearted recommendation for Aryan. Throughout his time with us, he consistently showcased strong analytical abilities, a keen grasp of complex concepts, and a genuine enthusiasm for learning. Witnessing his exceptional growth and unwavering dedication as a data scientist was a privilege. Aryan's relentless pursuit of excellence and positive attitude left a significant impact on our team, and I am confident that he will flourish in his data science career."

Zev
July 29, 2022

Zev Woodstock, Supervisor, Zuse Institute Berlin

"Aryan is an excellent programmer and researcher -- I was repeatedly impressed by how quickly he grasped onto graduate-level topics in mathematics, as well as how he was very self-driven in both learning theory and developing code."

Taruna Aswani
June 29, 2022

Taruna Aswani, Senior Professor, Bakliwal Tutorials

"Aryan's determination to success led him to excel in class 12th. One thing which differentiates Aryan from other students is his curiosity and the habit of asking questions. He is very organised, consistent and persistent. Aryan has a growth mindset and able to connect learning to life. He always sets his goal and also knows how to deal with failure."

Vaibhav Bakliwal
March 10, 2022

Vaibhav Bakliwal, Director, Bakliwal Tutorials

"Aryan was an outstanding student. He was always a self-learner. Prior to each class, he would read and try to understand the material by himself and ask doubts to deepen his understanding. From my understanding, Aryan is a sincere learner, affectionate towards his friends and teachers and hardworking individual."

Contact Me

Zanskar Hostel, IIT Delhi - 110016