Jivitesh Jain
Tagline: Graduate Researcher at Carnegie Mellon. All things NLP, LLMs, and Responsible AI.
Pittsburgh, PA, USA
Hello
👋 I'm a research master's student in Language Technologies (MLT) at Carnegie Mellon University in the Steel City of Pittsburgh. My current research with Prof. Mona Diab revolves around Responsible AI. Projects range from using interpretability techniques to predict the chances that an LLM is hallucinating (which, in turn, gives us confidence when using LLMs in factuality-critical applications) to using LLMs for regulatory compliance of software. I'm also interested in training and fine-tuning LLMs, building end-to-end workflows such as RAG, and making LLMs run efficiently on smaller devices.
Previously, I spent two amazing years in London as a software engineer at Palantir. There, I built high-scale, high-availability backend systems that enabled data and ML workflows in Palantir Foundry and AIP -- Palantir's super capable platforms for doing cool (and useful) things with your organization's data using AI. This, this, this, and this are examples of projects that I led, worked on, or contributed to.
Before that, I graduated with honors [, two gold medals, and many lifelong friends] from IIIT Hyderabad with a bachelor's degree in Computer Science. While there, I worked on problems in computational social science, especially those revolving around online social media networks, with Prof. Ponnurangam Kumaraguru and Prof. Joyojeet Pal at the Precog Research Group.
I also explored the field of 3D computer vision at Brown University’s Interactive 3D Vision and Learning Lab (IVL) with Prof. Srinath Sridhar. Our work dealt with the category-level pose canonicalization of full as well as partial objects using tensor field networks. During that time, I also interned at Google and participated in Major League Hacking’s Fellowship. Look around here to find projects I’ve worked on or my involvement in hackathons and programming competitions.
After slamming down my laptop's lid, I like to spend time creating art, discovering the best hot chocolate places in town, and sipping hot chocolate while creating art.
If you find me a good fit for your team, want help, or just want to say hello, please feel free to reach out via the links below!
Publications
Urbanization and Literacy as Factors in Politicians’ Social Media Use in a Largely Rural State: Evidence from Uttar Pradesh, India
Conference Paper
Publisher: ACM SIGCAS/SIGCHI Conference on Computing and Sustainable Societies (COMPASS)
Date: 2022
Authors: Asmit Kumar Singh, Jivitesh Jain, Lalitha Kameswari, Ponnurangam Kumaraguru, Joyojeet Pal
Description: With Twitter growing as a preferred channel for outreach among major politicians, there have been focused efforts on online communication, even in election campaigns in primarily rural regions. In this paper, we examine the relationship between politicians’ use of social media and the level of urbanization and literacy by compiling a comprehensive list of Twitter handles of political party functionaries and election candidates in the run-up to the 2022 State Assembly elections in Uttar Pradesh, India. We find statistically significant relationships between political Twitter presence and levels of urbanization and with levels of literacy. We also find a strong correlation between vote share and Twitter presence in the winning party, a relationship that is even stronger in urban districts. This provides empirical evidence that social media is already a central part of electoral outreach processes in the Global South, but that this is still selectively more relevant to voters in, and politicians standing for elections from urban and higher-educated regions.
ConDor: Self-Supervised Canonicalization of 3D Pose for Partial Shapes
Conference Paper
Publisher: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Date: 2022
Authors: Rahul Sajnani, Adrien Poulenard, Jivitesh Jain, Radhika Dua, Leonidas J Guibas, Srinath Sridhar
Description: Progress in 3D object understanding has relied on manually “canonicalized” shape datasets that contain instances with consistent position and orientation (3D pose). This has made it hard to generalize these methods to in-the-wild shapes, e.g., from internet model collections or depth sensors. ConDor is a self-supervised method that learns to Canonicalize the 3D orientation and position for full and partial 3D point clouds. We build on top of Tensor Field Networks (TFNs), a class of permutation- and rotation-equivariant, and translation-invariant 3D networks. During inference, our method takes an unseen full or partial 3D point cloud at an arbitrary pose and outputs an equivariant canonical pose. During training, this network uses self-supervision losses to learn the canonical pose from an un-canonicalized collection of full and partial 3D point clouds. ConDor can also learn to consistently co-segment object parts without any supervision. Extensive quantitative results on four new metrics show that our approach out-performs existing methods while enabling new applications such as operation on depth images and annotation transfer.
Capitol (Pat)riots: A Comparative Study of Twitter and Parler
Report
Date: 2021
Authors: Hitkul, Avinash Prabhu, Dipanwita Guhathakurta, Jivitesh Jain, Mallika Subramanian, Manvith Reddy, Shradha Sehgal, Tanvi Karandikar, Amogh Gulati, Udit Arora, Rajiv Ratn Shah, Ponnurangam Kumaraguru
Description: On 6 January 2021, a mob of right-wing conservatives stormed the USA Capitol Hill interrupting the session of congress certifying 2020 Presidential election results. Immediately after the start of the event, posts related to the riots started to trend on social media. A social media platform which stood out was a free speech endorsing social media platform Parler; it is being claimed as the platform on which the riots were planned and talked about. Our report presents a contrast between the trending content on Parler and Twitter around the time of riots. We collected data from both platforms based on the trending hashtags and draw comparisons based on what are the topics being talked about, who are the people active on the platforms and how organic is the content generated on the two platforms. While the content trending on Twitter had strong resentments towards the event and called for action against rioters and inciters, Parler content had a strong conservative narrative echoing the ideas of voter fraud similar to the attacking mob. We also find a disproportionately high manipulation of traffic on Parler when compared to Twitter.
Battling Hateful Content in Indic Languages (HASOC Shared Task 2021)
Conference Paper
Publisher: ACM Forum for Information Retrieval Evaluation (FIRE)
Date: 2021
Authors: Aditya Kadam, Anmol Goel, Jivitesh Jain, Jushaan Singh Kalra, Mallika Subramanian, Manvith Reddy, Prashant Kodali, TH Arjun, Manish Shrivastava, Ponnurangam Kumaraguru
Description: The extensive rise in consumption of online social media (OSMs) by a large number of people poses a critical problem of curbing the spread of hateful content on these platforms. With the growing usage of OSMs in multiple languages, the task of detecting and characterizing hate becomes more complex. The subtle variations of code-mixed texts along with switching scripts only add to the complexity. This paper presents a solution for the HASOC 2021 Multilingual Twitter Hate-Speech Detection challenge by team Precog IIIT Hyderabad. We adopt a multilingual transformer based approach and describe our architecture for all 6 subtasks as part of the challenge. Out of the 6 teams that participated in all the subtasks, our submissions rank 3rd overall.
What's Kooking? Characterizing India's Emerging Social Network, Koo
Conference Paper
Publisher: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)
Date: 2021
Authors: Asmit Kumar Singh, Chirag Jain, Jivitesh Jain, Rishi Raj Jain, Shradha Sehgal, Tanisha Pandey, Ponnurangam Kumaraguru
Description: [Best Student Paper Award]
Social media has grown exponentially in a short period, coming to the forefront of communications and online interactions. Despite their rapid growth, social media platforms have been unable to scale to different languages globally and remain inaccessible to many. In this paper, we characterize Koo, a multilingual micro-blogging site that rose in popularity in 2021, as an Indian alternative to Twitter. We collected a dataset of 4.07 million users, 163.12 million follower-following relationships, and their content and activity across 12 languages. We study the user demographic along the lines of language, location, gender, and profession. The prominent presence of Indian languages in the discourse on Koo indicates the platform’s success in promoting regional languages. We observe Koo’s follower-following network to be much denser than Twitter’s, comprising of closely-knit linguistic communities. An N-gram analysis of posts on Koo shows a #KooVsTwitter rhetoric, revealing the debate comparing the two platforms. Our characterization highlights the dynamics of the multilingual social network and its diverse Indian user base.
Projects
Implementation and Analysis of state-of-the-art Retrieval-Augmented Generation Techniques for Question Answering
Date: 2024
Organization: Carnegie Mellon University
Description: This project explores all aspects of Retrieval-Augmented Generation (RAG), including collecting data for the knowledge corpus, generating synthetic Q/A pairs on that data for parameter-efficient quantized model fine-tuning using LoRA, annotating a gold test set, using SOTA embedding and indexing methods, retrieved document reranking and summarization, and query rewording using Hypothetical Document Embeddings (HyDE). We perform comprehensive ablation studies to analyze the effect of each component of the pipeline in detail, and release our dataset and code publicly.
Mini Llama
Date: 2024
Organization: Carnegie Mellon University
Description: A miniaturized implementation of the Llama 3.1 Language Model in PyTorch, with Q-LoRA parameter-efficient fine-tuning, trained on the TinyStories dataset.
SimpleRA Database Management System
Date: 2021
Organization: International Institute of Information Technology, Hyderabad
Description: A DBMS implemented from scratch in C++, with query parsing and planning, linear hash and B+ tree indexing, 2-way merge sort, caching, and memory page management.
Multilingual Sparse Indexing and Search for Wikipedia
Date: 2021
Organization: International Institute of Information Technology, Hyderabad
Description: An efficient search and indexing engine for English and Hindi Wikipedia that uses blocked-sort indexing and distributed search to reduce index size by 75% and search 80 GB of XML in under 1 second.
Image Segmentation and Background Removal using GrabCut
Date: 2021
Organization: International Institute of Information Technology, Hyderabad
Description: This project implements major portions of the GrabCut algorithm, a Markov random field-based image segmentation algorithm, a variant of which you might have seen in Microsoft Office products as the background removal tool. The algorithm is iterative and interactive, and is known for the minimal and simple user interaction it requires, as well as its ability to fix the results using further rounds of user input. It is based on multiple iterations of GraphCut, and uses each iteration to improve its model of the foreground and background color distributions, which are modeled as Gaussian mixtures.
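The alternating loop described above (re-fit the color models, then re-label the pixels) can be sketched in a few lines. This is a heavily simplified illustration and not the project's code: it works on a flat list of grayscale intensities, uses a single Gaussian per class instead of a Gaussian mixture, and replaces the GraphCut step with a per-pixel likelihood comparison, so there is no smoothness term between neighboring pixels. The function names (`fit_gaussian`, `refine_labels`) and the `fixed` parameter for user-pinned pixels are hypothetical.

```python
import math


def fit_gaussian(values):
    """Return (mean, variance) of the values, with a small variance floor."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-6)


def neg_log_likelihood(x, mean, var):
    """Negative log-density of x under a 1D Gaussian."""
    return 0.5 * math.log(2 * math.pi * var) + (x - mean) ** 2 / (2 * var)


def refine_labels(pixels, labels, fixed=frozenset(), iterations=10):
    """Alternate between model fitting and relabeling.

    pixels: grayscale intensities; labels: initial guess (1 = foreground,
    0 = background); fixed: indices whose label the user has pinned.
    Simplification: real GrabCut relabels with a graph cut, not per pixel.
    """
    labels = list(labels)
    for _ in range(iterations):
        fg = [p for p, l in zip(pixels, labels) if l == 1]
        bg = [p for p, l in zip(pixels, labels) if l == 0]
        if not fg or not bg:
            break  # one class is empty, nothing left to model
        fg_model, bg_model = fit_gaussian(fg), fit_gaussian(bg)
        new_labels = []
        for i, p in enumerate(pixels):
            if i in fixed:
                new_labels.append(labels[i])  # keep user-pinned labels
            elif neg_log_likelihood(p, *fg_model) < neg_log_likelihood(p, *bg_model):
                new_labels.append(1)
            else:
                new_labels.append(0)
        if new_labels == labels:
            break  # converged
        labels = new_labels
    return labels


# Indices 0 and 1 lie outside the user's rectangle (pinned background);
# everything inside the rectangle starts as foreground.
pixels = [0.10, 0.15, 0.14, 0.80, 0.85, 0.90, 0.12, 0.88]
print(refine_labels(pixels, [0, 0, 1, 1, 1, 1, 1, 1], fixed={0, 1}))
```

Pinning more indices in `fixed` and calling the function again mimics the "further rounds of user input" step, since the pinned labels are never overwritten.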
Citation Intent Classification
Date: 2021
Organization: NAACL
Description: An NLP system to classify citation intent in scientific literature using fine-tuned transformer models like RoBERTa, for the 3C SDP shared task at NAACL 2021, implemented using PyTorch.
Kishmish - A shell for Linux
Date: 2019
Organization: International Institute of Information Technology, Hyderabad
Description: A bash-like shell for Linux systems, implemented from scratch in C, with support for process management, piping and redirection, and both built-in and system commands.
Improvements to the XV-6 Operating System
Date: 2019
Organization: International Institute of Information Technology, Hyderabad
Description: Improvements to MIT’s xv-6 operating system with additional system calls, and implementations of FCFS, Priority-Based and Multi-Level Feedback Queue process scheduling, written in C.
Resume
Download
Contact
Address
5412 Gates and Hillman Centers
Carnegie Mellon University
4902 Forbes Ave
Pittsburgh, PA 15213