About
I’m Jonathan Lim Siu Chi (@jonathanlimsc). The "sc" in my handle stands for the romanized initials of my Chinese first name, 修齐.
I am passionate about AI and building scalable AI-driven products that impact real-world users.
On the research side, I am broadly interested in improving the generalizability of ML models, across AI sub-fields such as reinforcement learning, neuro-AI and multi-modal models.
In my 7 years of work experience, I have built:
- Scalable, high-throughput LLM and vector DB RAG systems that help thousands of customers generate pitch-perfect proposal answers from their uploaded documents and library entries.
- A GPT2-based LLM, trained from scratch, for novel molecule generation. I also co-authored Gotta be SAFE: A New Framework for Molecular Design.
- Sentiment classification of customer service chat logs, using RoBERTa trained from scratch on in-house datasets.
- Real-time machine translation between >15 language pairs including English, Chinese, Thai, Vietnamese and Bahasa Indonesia, using FairSeq, deployed on GCP GPU instances, serving >200 QPS, managed using Kubernetes.
- A recommendation system based on a Graph Neural Network (LightGCN) to recommend brands to customers based on domain browsing behaviour extracted from telco logs.
- Convolutional Neural Network models for tumor detection and disease classification in MRI brain scans.
- A real-time OCR pipeline based on Attention-OCR to extract text information from user identity cards, productionized with docker-compose. For monitoring model prediction distribution, I deployed the Elasticsearch-Logstash-Kibana (ELK) stack.