👋 Hey, I'm

Nidhi Vichare

I am a Data and AI Executive

About Me

My name is Nidhi Vichare! I am an innovative Data Leader, Change Agent, Speaker, Mentor, Blogger, Artist and Entrepreneur. I am an accomplished Data and AI Executive with 20+ years of experience across diverse domains such as AdTech, Networking, and Trading. I specialize in distributed systems, big data, microservices, cloud computing, product innovation, application development, agile methodologies, infrastructure, security, and privacy. I have led cloud transformations, database architecture, engineering and operations teams at Samsung Ads, Cisco, Esignal, Lycos and other startups. I am passionate about cloud architecture, dataops, observability, multi-dimensional databases, data migrations and serverless applications for managing large-scale data applications cross-functionally.

My “Architecture first” mentality helped me drive multi-billion-dollar transformations of engineering operations into a DataSecOps model that delivers just-in-time insights with automated systems and self-serve capabilities. During this journey I have built strong organizations and teams with a bias for action. I am a continuous learner who seamlessly adapts to new and emerging technologies.

I 💙 to mentor and have had opportunities to realize this passion by sharing my insights via blogging, personal coaching and staying involved with various women in technology organizations (Cisco WISE and FairyGodBoss). I bring organizational effectiveness to global teams by creating frameworks, strengthening employee relationships and fostering a culture of innovation that encourages everyone to contribute new ideas and insights.

My Technology Blogs 📝

Featured
07 • 16 • 2023
Web3 and Generative AI Driving Innovation Forward
The convergence of Generative AI and Web3 holds immense potential to revolutionize various sectors, including Decentralized Autonomous Organizations (DAOs), gaming and virtual worlds, and AI-generated NFTs and digital art. Generative AI and Web3 are two promising technologies that are rapidly evolving to reshape industries in novel ways. Generative AI focuses on creating new data, such as text, images, and music, while Web3 is the next generation of the internet, built on blockchain technology.
Featured
07 • 02 • 2023
Chatting Over Documents with OpenAI, LangChain and Pinecone
Explore the creation of an advanced document-based question-answering system using LangChain and Pinecone, built on the latest advancements in large language models (LLMs) like OpenAI GPT-4.
Featured
05 • 01 • 2023
Data Strategy Navigator: The Pyramid Mission Framework
A Global Data Strategy framework serves as a guiding light for the organization's data strategy, helping it to stay on course and navigate towards its goals. It is a reliable tool that helps organizations navigate their way through the complex terrain of data strategy, with the pyramid structure representing a solid foundation and the mission framework serving as a clear guide towards achieving their goals.
Featured
04 • 28 • 2023
Privacy By Design - A Key Aspect of a Global Data Strategy
A Global Data Strategy centered around a Privacy by Design Methodology means that the organization's data strategy is built with privacy in mind from the start, rather than added as an afterthought.
Featured
05 • 15 • 2023
Securing Software Development: Exploring the DevSecOps Pipeline and Shift Left Security
The DevSecOps pipeline is a methodology that emphasizes the integration of security practices into every stage of the software development lifecycle (SDLC). Identify the tools that provide advanced security features to help security-focused organizations enhance their overall security posture, detect vulnerabilities, and ensure the robustness of their applications and infrastructure.
Featured
04 • 28 • 2023
Maximize Productivity and Minimize Meeting Fatigue: A Guide to Leading Effective Team Meetings
Meeting fatigue is real, and when calendars become a solid block of meetings, it can be challenging to find time to complete essential tasks. Using the framework will help maximize productivity and minimize meeting fatigue.
Featured
02 • 18 • 2023
Options Analysis of Database Build Tools
A database build tool is a software tool designed to manage the creation and modification of database schema and objects, and to automate the deployment of those changes to different environments. DBT vs Flyway.
Featured
03 • 12 • 2023
AVRO vs JSON
Avro and JSON are both data serialization formats used in distributed computing systems, but they have several differences. Avro is a binary format that is more compact and efficient than JSON, making it more suitable for use in distributed systems. It also supports schema evolution and is language independent. On the other hand, JSON is a text-based format that is more human-readable than Avro, and it is more widely used because it is supported by many programming languages and frameworks.
Featured
03 • 12 • 2023
Snowflake Table Types 2023
What are the new table types in Snowflake in 2023? GA, PuPr and PrPr.
Featured
03 • 12 • 2023
Hashing - A Primer
Hashing is a technique used to map data of arbitrary size to a fixed-size value. It is used in a variety of applications such as data storage, data transmission, data compression, data indexing, and data encryption. A cryptographic hash is a one-way function, which means that it is easy to compute the hash value for a given input, but it is computationally infeasible to determine the input given the hash value. This makes hashing a useful technique for data security.
Featured
03 • 12 • 2023
Debezium - A Primer
Debezium is a powerful platform that can be used in a variety of use cases where real-time data capture and streaming are required. Its flexibility, scalability, and extensibility make it a popular choice among organizations that need to build real-time data pipelines and microservices-based architectures.
Featured
02 • 19 • 2023
Accurately Estimating Project Completion Dates - A Key Aspect of Effective Leadership
Understanding the Phases of a Project; Focusing on Outcomes instead of Activity; Using the Cone of Uncertainty Framework
Featured
02 • 18 • 2023
Concurrency benefits and pitfalls
Concurrency allows a system to execute multiple tasks or processes simultaneously, which can improve performance, resource utilization, responsiveness, and scalability. However, there are potential pitfalls such as deadlocks, race conditions, synchronization overhead, debugging and testing challenges, and resource contention. To leverage concurrency effectively, it is important to design and implement concurrent systems carefully and use appropriate synchronization mechanisms and testing approaches to identify and mitigate potential issues.
Featured
01 • 06 • 2023
Auto-Clustering within Snowflake
Clustering in Snowflake is a way of organizing data in tables to make querying more efficient. It is based on the unique concept of micro-partitions, which is different from the static partitioning of tables used in traditional data warehouses.
Featured
02 • 19 • 2023
Spark optimizations
Spark optimizations are techniques used to improve the performance and efficiency of Spark applications. Key optimizations include memory management, data partitioning, caching, parallelism, resource management, and optimization libraries. These techniques enable faster and more efficient processing of large datasets, making Spark a popular choice for big data processing.
Featured
03 • 12 • 2023
Build confidence with Kafka
Apache Kafka is a powerful open-source streaming platform that enables businesses to manage data streams effectively. However, building enterprise-grade solutions with Kafka requires a comprehensive understanding of its key components. In this article, we will explore the four core components of Kafka and their purpose in developing a robust streaming platform.
Featured
01 • 30 • 2021
Which Data Warehouse is the right choice - Redshift or Snowflake?
Snowflake is a cloud-native, SQL data warehouse built to let users put all their data in one place for ease of access and analysis. Amazon Redshift boasts low maintenance costs, high speed, strong performance, and high availability.
Featured
11 • 22 • 2020
Build confidence with Snowflake
Snowflake is an analytic data warehouse that was built from the ground up for the cloud to optimize loading, processing and query performance for very large volumes of data. It features storage, compute, and global services layers that are physically separated but logically integrated.
Featured
11 • 23 • 2020
AWS Security Best Practices
How do you build a secure environment in AWS Cloud? There are a few good security practices and guidelines that must be incorporated into a full, end-to-end secure design.
09 • 19 • 2020
Getting Started with Data Lakes (Part 1)
Why are data lakes central to the modern data architecture?
09 • 22 • 2020
Getting Started with Data Lakes (Part 2)
Sample architecture for creating a data pipeline. The architecture depicts the components and the data flow needed for an event-driven batch analytics system.
09 • 23 • 2020
Getting Started with Data Lakes (Part 3)
A security primer for Data Lakes; Data Security and Data Cataloging for data lakes
09 • 24 • 2020
Getting Started with Data Lakes (Part 4)
Learn about the final and crucial considerations for setting up your data lakes. Confirm whether data lakes are the best choice and implement the right level of discipline.
10 • 02 • 2020
Anatomy of an Analyzer
Text analysis relies on special algorithms that determine how a string field in a document is transformed into terms in an inverted index. This blog talks about analyzers, which are combinations of tokenizers, token filters, and character filters.
10 • 05 • 2020
How I led a Dynamic Data Engineering Team?
I will take you through my journey of overcoming obstacles by embracing hybrid cloud environments, modern tools and technologies for digital transformation so we could reap the benefits of a solid, long-term solution.
10 • 10 • 2020
Best Practices on Database Design
How do you design good databases? Experienced database designers perform all the necessary design functions, in their proper sequence, leaving out nothing. There are a few good database design techniques and guidelines that must be incorporated into a full, end-to-end database design method.
10 • 15 • 2020
Distributed Storage Systems
What are some of the core attributes of a distributed storage system? With a strong understanding of the fundamentals of distributed storage systems, we are able to categorize and evaluate new systems.
11 • 11 • 2020
My Favorite Database Technologies
My experiences over the last few years have made Cassandra, Elasticsearch, Snowflake, Spark and others my favorites. Learn more about the technologies that will continue to grow in 2021.
11 • 11 • 2020
Build confidence with Cassandra
Apache Cassandra is a distributed, NoSQL database management system (DBMS) designed for high volumes of data. Learn more about the architecture and the de-risking approach when working with Cassandra.
11 • 22 • 2020
Build confidence with Spark
Apache Spark is a unified analytics engine for big data processing with a myriad of built-in modules for Machine Learning, Streaming, SQL, and Graph Processing. Learn about the best practices when creating Spark integration projects.
11 • 22 • 2020
Build confidence with Elasticsearch
Elasticsearch is a search engine based on the Lucene library that provides a distributed, multitenant-capable full-text search engine. Learn about Elasticsearch and the best practices for using it.