Posts by Collection

portfolio

publications

Crossing Linguistic Horizons: Finetuning and Comprehensive Evaluation of Vietnamese Large Language Models

Published in NAACL 2024, 2023

Recent advancements in large language models (LLMs) have underscored their importance in the evolution of artificial intelligence. However, despite extensive pretraining on multilingual datasets, available open-source LLMs exhibit limited effectiveness in processing Vietnamese. The challenge is exacerbated by the absence of systematic benchmark datasets and metrics tailored for Vietnamese LLM evaluation. To mitigate these issues, we have fine-tuned LLMs specifically for Vietnamese and developed a comprehensive evaluation framework encompassing 10 common tasks and 31 metrics. Our evaluation results reveal that the fine-tuned LLMs exhibit enhanced comprehension and generative capabilities in Vietnamese. Moreover, our analysis indicates that models with more parameters can introduce more biases and uncalibrated outputs, and that the key factor influencing LLM performance is the quality of the training or fine-tuning datasets. These insights underscore the significance of meticulous fine-tuning with high-quality datasets in enhancing LLM performance.

Download here

An Attention Graph Neural Network for Stereo-active Molecules

Published in GEM workshop, ICLR 2024, 2024

Molecules can show stereochemistry: two molecules with the same atomic connectivity may exhibit different bioactivity due to different spatial arrangements. We propose a graph neural network architecture that uses a chiral-sensitive aggregation function and a self-attention mechanism to improve molecular property prediction by exploiting chiral information. Unlike many black-box deep learning models, our network can be interpreted by visualizing the learned weights of its attention layers, which provides better support for drug discovery.

Download here

Hybrid Transformer and Holt-Winter’s Method for Time Series Forecasting

Published in Time Series for Health Workshop, 2024

Time series forecasting is an important research topic in machine learning due to its prevalence in social and scientific applications. The multi-model forecasting paradigm, which includes model hybridization and model combination, was shown to be more effective than single-model forecasting in the M4 competition. In this study, we hybridize exponential smoothing with a transformer architecture to capture both level and seasonal patterns while exploiting the complex non-linear trend in time series data. We show that our model can capture complex trends and seasonal patterns, with a moderate improvement over the state-of-the-art results from the M4 competition.

Download here

talks

URA RESEARCH GROUP

Published:

This is a description of your talk, which is a markdown file that can be markdown-ified like any other post. Yay markdown!

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.