Profile
Senior in Computer Engineering from KUT
Alternative Military Service Status: on duty (2020/11/27 ~ 2023/09/26)
Links
Email | kozistr@gmail.com |
Github | https://github.com/kozistr |
Kaggle | https://www.kaggle.com/kozistr |
LinkedIn | https://www.linkedin.com/in/kozistr |
Interests
- Lots of challenges like Kaggle
- (Lightweight) Single Image Super Resolution (SISR)
- End to End Speaker Diarization (E2E SD)
Previously, I was also interested in offensive security, such as reverse engineering and Linux kernel exploitation.
Challenges & Awards
Machine Learning
- Kaggle Challenges :: Competition Expert
  - Cornell Birdcall Identification - team, top 2% (24 / 1395), Private 0.631 (2020.)
  - ALASKA2 Image Steganalysis - solo, top 9% (93 / 1095), Private 0.917 (2020.)
  - Tweet Sentiment Extraction - solo, top 4% (84 / 2227), Private 0.71796 (2020.)
  - Flower Classification with TPUs - solo, top 4% (27 / 848), Private 0.98734 (2020.)
  - Kaggle Bengali.AI Handwritten Grapheme Classification - solo, top 4% (67 / 2059), Private 0.9372 (2020.)
  - Kaggle Kannada MNIST Challenge - solo, top 3% (28 / 1214), Private 0.99100 (2019.)
- NAVER NLP Challenge :: NAVER NLP Challenge 2018
  - Final - Semantic Role Labeling (SRL) 6th place :: Presentation
- A.I R&D Challenge :: A.I R&D Challenge 2018
  - Final - Fake or Real Detection - as Digital Forensic Team
- NAVER A.I Hackathon :: NAVER A.I Hackathon 2018
  - Final - Kin 4th place, Movie Review 13th place :: summary, paper
- TF-KR Challenge :: Facebook TF-KR MNIST Challenge
  - TF-KR MNIST Challenge - Top 9, 3rd prize, ACC 0.9964
Hacking
- Boot2Root CTF 2018 :: 2nd place (Demon + alpha)
- Harekaze CTF 2017 :: 3rd place (SeoulWesterns)
- WhiteHat League 1 (2017) :: 2nd place (Demon)
  - Awarded by the Korea Information Technology Research Institute (KITRI); received a $3,000 prize
Work Experience
Company
Machine Learning Researcher, Watcha, (2020.06.22 ~ Present)
- Working full-time.
- Developed training recipes to train a sequential recommendation architecture.
  - Built a Future module for better understanding at inference time.
  - Applied augmentations to various features, leading to performance gains & robustness.
  - In A/B (online) test (statistically significant, p-value < 0.05)
    - coming soon
- Developed a model to predict users' expected view-time of the contents.
  - Predicts how many people will watch the content, and for how long, before the content is supplied.
  - Finds out which features impact users' viewing.
- Developed a pipeline to recognize main actors from the poster and still-cut images.
  - Utilized SOTA face detector & recognizer.
  - Optimized pre/post-processing routines for low latency.
- Developed a novel sequential recommendation architecture to recommend what content to watch next.
  - In A/B (online) test (statistically significant, p-value < 0.05)
    - Paid Conversion : improved 1.39%p
    - Viewing Days : improved 0.25%p
    - Viewing Minutes (median) : improved 4.10%
    - Click Ratio : improved 4.30%p
    - Play Ratio : improved 2.32%p
- Developed an Image Super Resolution model to upscale movie & TV poster and still-cut images.
  - Optimized the code for fast inference time & memory-efficiency on CPU.
  - In internal evaluation (qualitative evaluation by the designers), it catches details better, handles higher resolution, and takes little time.
Machine Learning Engineer, Rainist, (2019.11.11 ~ 2020.06.19)
- Worked full-time.
- Developed the card & bank account transaction category classification models, designed to be lightweight for low latency (now in service).
  - In A/B (online) test (statistically significant, p-value < 0.05)
    - *Accuracy : improved about 25 ~ 30%p
- Developed the RESTful API server to serve deep learning model (utilized k8s and open source project)
  - zero failure rate (0 40x, 50x errors)
- Developed a classification model to forecast the possibility of loan overdue.

*Accuracy : how many people don't update/change their transactions' category.
Machine Learning Engineer, VoyagerX, (2019.01.07 ~ 2019.10.04)
- Worked as an intern.
- Developed speaker verification and diarization models & logic for recognizing arbitrary speakers recorded in noisy (real-world) environments.
- Developed a hair image semantic segmentation / image inpainting / i2i domain transfer model for swapping hair domains naturally.
Penetration Tester, ELCID, (2016.07 ~ 2016.08)
- Worked part-time.
- Performed penetration tests on network firewall and anti-virus products.
Outsourcing
- Developed Korean University Course Information Web Parser (About 40 Universities). 2 times, (2017.7 ~ 2018.3)
- Developed AWS CloudTrail logger analyzer / formatter. (2019.09 ~ 2019.10)
Lab
HPC Lab, KoreaTech, Undergraduate Researcher, (2018.09 ~ 2018.12)
- Wrote a paper on an improved TextCNN model for predicting movie ratings.
Publications
Paper
[1] Kim et al, CNN Architecture Predicting Movie Rating, 2020. 01.
- Wrote about a CNN architecture that applies a channel-attention method (SE Module) to the TextCNN model, bringing performance gains on the task while generally keeping its latency.
- Handles un-normalized text w/ various convolution kernel sizes and dropout.
Conferences/Workshops
[1] kozistrteam, [NAVER NLP Challenge 2018 SRL Task](https://github.com/naver/nlp-challenge/raw/master/slides/Naver.NLP.Workshop.SRL.kozistrteam.pdf)
- SRL Task, challenged w/o any domain knowledge. Presented the trials & errors during the competition.
Journals
[1] zer0day, Windows Anti-Debugging Techniques (CodeEngn 2016) Sep. 2016. PDF
- Wrote about lots of anti-reversing / debugging (A to Z) techniques available on Windows executable binaries.
Posts
[1] kozistr (as a part of team Dragonsong), towardsdatascience
- Wrote about a deep learning audio classifier based on the Kaggle challenge we participated in.
Personal Projects
Computer Languages
Python
> C/C++
Assembly (x86, x86-64, arm, ...)
> experienced with more than 10 languages
Machine Learning
Generative Models
- GANs-tensorflow :: Lots of GAN codes :) :: Generative Adversarial Networks
  - ACGAN-tensorflow :: Auxiliary Classifier GAN in tensorflow :: code
  - StarGAN-tensorflow :: Unified GAN for multi-domain :: code
  - LAPGAN-tensorflow :: Laplacian Pyramid GAN in tensorflow :: code
  - BEGAN-tensorflow :: Boundary Equilibrium GAN in tensorflow :: code
  - DCGAN-tensorflow :: Deep Convolutional GAN in tensorflow :: code
  - SRGAN-tensorflow :: Super Resolution GAN in tensorflow :: code
  - WGAN-GP-tensorflow :: Wasserstein GAN w/ gradient penalty in tensorflow :: code
  - ... lots of GANs (over 20) :)
Super Resolution
- Single Image Super Resolution (SISR)
I2I Translation
- Improved Content Disentanglement :: tuned version of 'Content Disentanglement' in pytorch :: code
Style Transfer
- Image-Style-Transfer :: Image Neural Style Transfer
  - style-transfer-tensorflow :: Image Style-Transfer in tensorflow :: code
Text Classification/Generation
Speech Synthesis
- Tacotron-tensorflow :: Text To Speech (TTS)
  - tacotron-tensorflow :: lots of TTS models in tensorflow :: code
- Speech Recognition
  - [private] :: noisy acoustic speech recognition system in tensorflow :: code
Optimizer
- AdaBound :: Optimizer that trains as fast as Adam and as good as SGD
  - AdaBound-tensorflow :: AdaBound Optimizer implementation in tensorflow :: code
- RAdam :: On The Variance Of The Adaptive Learning Rate And Beyond in tensorflow
  - RAdam-tensorflow :: RAdam Optimizer implementation in tensorflow :: code
R.L
- Rosetta Stone :: Hearthstone simulator using C++ with some reinforcement learning :: code
Plug-Ins
IDA pro plug-in - Golang ELF binary (x86, x86-64), RTTI parser
- Recovers stripped symbols & information and patches byte-codes so that the binary can be decompiled with Hex-Rays
Open Source Contributions
- syzkaller :: New Generation of Linux Kernel Fuzzer :: Minor contribution #575
- simpletransformers :: Transformers made simple w/ training, evaluating, and prediction possible w/ one line each. :: Minor contribution #290
Security, Hacking
CTFs, Conferences
- POC 2016 Conference Staff
- HackingCamp 15 CTF Staff, Challenge Maker
- CodeGate 2017 OpenCTF Staff, Challenge Maker
- HackingCamp 16 CTF Staff, Challenge Maker
- POX 2017 CTF Staff, Challenge Maker
- KID 2017 CTF Staff, Challenge Maker
- Belluminar 2017 CTF Staff
- HackingCamp 17 CTF Staff, Challenge Maker
- HackingCamp 18 CTF Staff, Challenge Maker
Teams
Hacking Team, Fl4y. Since 2017.07 ~
Hacking Team, Demon by POC. 2014.02 ~ 2018.08
Presentations
2018
[2] Artificial Intelligence ZeroToAll, Apr 2018.
[1] Machine Learning ZeroToAll, Mar 2018.
2015
[1] Polymorphic Virus VS AV Detection, Oct 2015.
2014
[1] Network Sniffing & Detection, Oct 2014.