Hadoop and Big Data Systems
M.S. Computer Engineering, Johns Hopkins University
B.S. Computer Engineering, UMBC
Large Pharmaceutical Company
Large-scale Clinical Trial Data Lake on Hadoop
Served as project lead and platform architect for a large-scale graph datastore built with Accumulo, Hortonworks Data Platform (Hadoop), and Amazon Web Services.
Civilian Government Agency
Greenplum Data Warehouse in AWS
Designed and implemented a large-scale Greenplum database on AWS that used Apache Spark to ingest data from disparate sources. Built several ETL pipelines for the system, implemented security controls with AWS KMS, and integrated Splunk for logging.
- Hadoop: Spark, MapReduce, Ambari, Accumulo, HBase, Hive
- AWS: EC2, Lambda, CloudFormation, RDS, SQS, KMS
- Databases: Greenplum, DynamoDB, MySQL, PostgreSQL, Oracle
- DevOps: Jenkins, Docker, Vagrant