
VINÍCIUS

Senior Data Engineer

Vinícius is a senior data engineer with 16 years of experience building end-to-end data pipelines. He leverages Python, Apache Pulsar, DBT, Snowflake, Docker, and Kubernetes to deliver scalable, maintainable solutions, and his expertise in SQL Server performance optimization and AWS services ensures efficient data processing and storage. He excels at mentoring junior engineers and implementing data validation techniques, and his proficiency in data modeling and CDC-based ingestion pipelines enables timely insights and informed decision-making.


  • Expert in building end-to-end data pipelines with modern technologies such as Python, Apache Pulsar, DBT, Snowflake, Docker, and Kubernetes, as well as pipelines built with Python, Confluent Kafka, Stitch, DBT, and Snowflake on Snowflake's cloud data platform. He has optimized SQL Server performance by monitoring, analyzing, and rewriting complex queries, and he is AWS-certified with extensive experience using AWS services for data processing and storage (see the sketches after this list).


  • Proven track record in designing and implementing scalable data warehouses and data platforms, demonstrating the ability to collaborate effectively and drive successful project outcomes. 



  • Extensive experience with industry-leading tools such as Airflow, DBT, and Kubernetes, and with pipeline monitoring through New Relic, Grafana, CloudWatch, DBT, and Databand. He has built robust, reliable data pipelines using Python, Apache Pulsar, DBT, Snowflake, Docker, and Kubernetes, enabling efficient data ingestion and transformation and a smooth flow of data through each pipeline (see the orchestration sketch after this list).


  • Proficient in data validation, he has built validation solutions using Schema Registry, Great Expectations, and DBT tests to ensure accuracy and integrity throughout the pipeline. These checks guarantee that data meets the defined schemas, quality standards, and expectations, enabling reliable, trustworthy processing (see the validation sketch after this list).



  • Demonstrated a strong ability to mentor and guide junior engineers, recommending best practices and contributing to standards that improve developer workflows. By sharing knowledge and promoting continuous learning, he fosters the technical growth of the team and helps increase its efficiency and productivity.


  • With a deep understanding of data modeling principles and best practices, he has developed optimized data warehousing and data lake solutions. In collaboration with a Data Architect, he defined data modeling standards for the Data Warehouse and Data Lake that empower data analysts and scientists to explore data autonomously, extract valuable insights with ease, and drive informed decision-making.



  • Extensive experience with CDC (Change Data Capture) ingestion pipelines built on Debezium and Confluent Kafka. By applying CDC to highly transactional databases in RDS, he captures and propagates changes into Snowflake in near real time, improving data availability and freshness and enabling decisions based on the most up-to-date data (see the consumer sketch directly below).
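
As a rough illustration of the CDC ingestion described in the last bullet, the sketch below reads Debezium change events from Kafka with the confluent-kafka Python client before they would be staged into Snowflake. The broker address, consumer group, and topic name are hypothetical placeholders.

```python
# Minimal sketch: reading Debezium change events from Kafka (hypothetical
# broker, group, and topic; requires the confluent-kafka package).
import json

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "snowflake-loader",       # hypothetical consumer group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["pg.public.orders"])  # Debezium topic: <server>.<schema>.<table>

while True:
    msg = consumer.poll(1.0)
    if msg is None:
        continue
    if msg.error():
        print(msg.error())
        continue

    # A Debezium envelope carries the operation type (c/u/d) and the row
    # image after the change; downstream code would load this into Snowflake.
    envelope = json.loads(msg.value())
    payload = envelope.get("payload", envelope)
    print(payload["op"], payload["after"])
```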

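The first bullet mentions pipelines built on Apache Pulsar; below is a minimal produce/consume sketch using the pulsar-client library, assuming a local broker and a hypothetical topic and subscription name.

```python
# Minimal Apache Pulsar sketch (hypothetical broker URL, topic, and
# subscription; requires the pulsar-client package).
import pulsar

client = pulsar.Client("pulsar://localhost:6650")

# Publish a raw event onto an ingestion topic.
producer = client.create_producer("persistent://public/default/orders-raw")
producer.send(b'{"order_id": 42, "status": "created"}')

# Read the event back on a named subscription and acknowledge it so the
# broker can mark it as processed.
consumer = client.subscribe(
    "persistent://public/default/orders-raw",
    subscription_name="orders-loader",
)
msg = consumer.receive()
print(msg.data())
consumer.acknowledge(msg)

client.close()
```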

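For the orchestration work described above, a minimal Airflow DAG that runs and then tests a DBT project might look like the sketch below. The project path and schedule are assumptions, and it presumes Airflow 2.x with the dbt CLI available on the worker.

```python
# Minimal Airflow DAG sketch orchestrating dbt (hypothetical project path
# and schedule; assumes Airflow 2.x and the dbt CLI on the worker).
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_dbt_models",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/analytics",
    )
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/analytics",
    )

    # Only test the models once they have been rebuilt.
    dbt_run >> dbt_test
```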

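To illustrate the data validation bullet, the sketch below applies two Great Expectations checks to an in-memory frame using the classic pre-1.0 great_expectations API; the column names and thresholds are hypothetical.

```python
# Minimal Great Expectations sketch (classic pre-1.0 API; hypothetical
# column names and thresholds).
import great_expectations as ge
import pandas as pd

df = pd.DataFrame({"order_id": [1, 2, 3], "amount": [10.0, 25.5, 99.0]})
ge_df = ge.from_pandas(df)

# Declare expectations, then validate the whole suite at once.
ge_df.expect_column_values_to_not_be_null("order_id")
ge_df.expect_column_values_to_be_between("amount", min_value=0, max_value=100000)

results = ge_df.validate()
print(results.success)  # True only if every expectation passed
```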
He consistently researches and implements best practices in data engineering, keeping up with the latest trends, tools, and methodologies to deliver high-quality, scalable, and maintainable data solutions.




Tech tools:


Python

PySpark

Spark Scala

SQL

Power BI

Tableau

Looker

Amplitude

AWS (Amazon Web Services)

Docker

Kubernetes

Spring

Java

PostgreSQL

GraphQL

MongoDB

Lambda functions

DynamoDB

Apache

API Gateway

React

Amplify Framework

Apollo

Cognito


Strong expertise in building end-to-end data pipelines using modern technologies like Python, SQL, AWS, and Snowflake.
