Site Reliability Engineer • 2023-05 — Present
I am currently working on the Central Platform Technology & Service (CPTS) team.
I -
- Build and introduce several internal systems for the team and all developers of MG brands, including central documentation, monitoring, artifact storage, and ticketing systems.
- Introduce monitoring systems and an on-call culture to the team, along with documentation; 100% of internal applications now use these new monitoring systems.
- Build internal tools and systems for developers, such as a CLI to log in to AWS accounts with ECR, decrypt data for debugging, generate authentication tokens, and more.
- Standardize procedures for infrastructure-related topics, such as Terraform/Terragrunt code, tag policies, billing reports, and AWS account management.
- Coordinate closely with the security team on matters related to AWS resources and third-party integration management, including cross-team reviews, R&R discussions and migrations.
- Lead a few migration projects (CDNs, internal VPNs, etc.) within the team; participate in every step, including the initial PoC, migration planning, the migration itself, and documentation.
- Develop the hiring process, take-home assignments, and pre-screen and interview questions for hiring backend engineers for the team; participate in interviews.
DevOps Engineer • 2019-04 — 2023-04
As a core member of the DevOps team, I played a significant role in building and enhancing the overall infrastructure.
I -
- Migrated ~50% of Kubernetes workloads to ARM64-based (Graviton) nodes.
- Introduced Kubernetes and Istio to production services, including Azar and Hakuna; helped developers migrate EC2-based legacy microservices to Kubernetes.
- Deployed 30+ Kubernetes clusters for production, development, and management, and also deployed infrastructure components, including cluster-autoscaler, node-exporter and Zabbix agents.
- Developed and managed CI/CD pipelines for 300+ microservices across 5+ products and 4+ environments using Spinnaker, GitHub Actions, Jenkins, Helm, and Vault.
- Provided an automated monitoring and alerting system using Prometheus, Mimir, Grafana, and Zabbix; developed a GitOps system for configuring alert conditions and thresholds that covers more than 90% of alerts.
- Researched and ran PoCs, including bringing Istio, Vault, Mimir, SonarQube, Bottlerocket, and Harbor to production; found zero-downtime upgrade paths for Kubernetes, Istio, and other components.
- Upgraded legacy systems, including OS, Kubernetes, Jenkins, and Zabbix; transitioned VPC peering connections to inter-region peered transit gateways.
- Developed a GitOps-based permission management system using Okta, Active Directory, AWS SSO (30+ accounts), and AWS Client VPN (50+ VPCs). This system was also integrated with JetBrains IDE products, Grafana, and Harbor.
- Developed management tools and helpful internal CLIs, including a production database query executor, Spinnaker pipeline generator, and configuration/credentials fetcher for multiple AWS accounts and Kubernetes clusters.
- Participated in several rebuilding projects of business-critical microservices; set up initial architectures with developers to meet high-availability and multi-regional requirements.
- Served as a site reliability engineer: took on-call duty for Azar and shared infrastructure, and planned and trained for disaster recovery under an AWS region failure scenario.
- Assisted the security team in achieving SOX compliance in our systems, including CI/CD and access management.
- Contributed to some open source projects, including Bottlerocket, aws-ebs-csi-driver and Istio.
- Wrote 10+ tech blog articles (in Korean) and 50+ internal manuals and tutorials; gave public presentations: EKS Cluster Migration (in Korean) and Microservice CI/CD (in Korean).
Developer • 2018-06 — 2019-04
I developed an automated ads manager and migrated a set of APIs.
I -
- Made an automated ads manager; it automatically creates ads from the content management tools, measures various metrics (CPC, ROI, etc.) using Google DFP and Facebook Ad APIs, and adjusts the ad budgets based on those metrics.
- Migrated a set of image processing APIs from Google App Engine to Google Kubernetes Engine.
- Organized duplicated build and deploy scripts spread across many Git repositories.
Software Engineer • 2017-07 — 2018-05
I developed various features for Lendit's web service, back office, and infrastructure.
I -
- Improved the build and deploy infrastructure for automatic deployments using Travis CI and AWS CodeDeploy.
- Refactored and optimized code; removed a lot of legacy code from both the backend and frontend (converting Java Date to LocalDate and LocalDateTime, converting JSP pages to Vue, converting Java to Kotlin).
- Wrote a program to transfer telegrams to Shinhan Bank using Python 3.6 and aiohttp.
- Opened a tech blog and posted work-related articles on topics including interviews and the build infrastructure.
Undergraduate Intern • 2017-01 — 2017-06
I participated in small projects related to security issues.
I -
- Improved the accuracy and speed of a TLS/SSL vulnerability scanner using Go.
- Made a dynamic crawler and a reflected XSS detector using headless Chrome.
Undergraduate Intern • 2016-02 — 2016-12
I participated in a lab project.
I -
- Developed a TLS/SSL vulnerability scanner using Python.
- Performed static and dynamic analysis of Android apps.
• Cloud Products: AWS (EC2, EKS, VPC, S3, RDS, ElastiCache and more), GCP* (GKE)
• Tools: Kubernetes, Istio, Terragrunt, Terraform, Ansible, Spinnaker
• Programming Languages: Python, JavaScript, Go, Java*, Kotlin*
• Web Backends: Django, Spring Boot*
• Web Frontends*: HTML, CSS, Vue*, Nuxt.js*
• Basics: Linux, Git, MySQL, PyCharm, Vim (and no Emacs!)
• Natural Languages: Korean (native), English (sometimes, it's broken)
• Hello, World!*: TypeScript, React, C#