Project Overview

Data Overview

Clarifying Field Definitions
- The original dataset lacked clear definitions for many fields and values. To ensure proper interpretation, I refined the definitions using a combination of online research and logical inference.
Data Preprocessing
- Based on the refined field definitions, I filtered out rows with invalid or unjustifiable null values to maintain data integrity.
Multi-dimensional Visualization Design
- Although the analysis relied primarily on bar charts, I encoded multiple dimensions within each figure.
- For example, in Figure 2, bars were grouped by country (x-axis), subdivided by interaction type, and color-coded by discovery path, enabling layered insights in a single plot.
Extracting Business Insights
- Based on the patterns observed in exploratory visualizations, I drew insights for improving user engagement.

As a preliminary check, I ran a simple missing-value analysis, which showed that two fields, Query Typed and Displayed Name, contained null values (Table 1). Both fields appeared to be semantically related to the Section field, which indicates how the result was generated.
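The missing-value check described above can be sketched in pandas as follows. This is a minimal illustration, not the original analysis code: the helper name `missing_value_summary` and the toy rows are assumptions made for the example, and only the column names come from the dataset.

```python
import pandas as pd


def missing_value_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return null counts and percentages for columns that contain nulls."""
    null_count = df.isna().sum()
    summary = pd.DataFrame({
        "null_count": null_count,
        "null_pct": (null_count / len(df) * 100).round(1),
    })
    # Keep only the fields that actually have missing values.
    return summary[summary["null_count"] > 0]


# Toy rows for illustration only; these are not values from the real dataset.
sample = pd.DataFrame({
    "Section": ["Prequery Results", "Suggestion Results",
                "Title Results", "Suggestion Results"],
    "Query Typed": [None, "ab", "abc", None],
    "Displayed Name": [None, "abc", None, "abd"],
})

summary = missing_value_summary(sample)
print(summary)
```

Crossing the null flags with Section (for example with `pd.crosstab`) is one way to see which categories the missing values fall into.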

The Section field is a categorical variable with three values: Prequery Results, Suggestion Results, and Title Results. Because the dataset did not clearly define these categories, I clarified their meanings through online research and logical inference before proceeding with the analysis (Table 2).

According to the clarified definitions in Table 2, the three cases listed in Table 3 were logically inconsistent, so I removed the corresponding rows from the dataset.
1. Cases 1 and 2: The user did not type a query (Query Typed = null), but the result was categorized as generated based on user input (Section = Suggestion Results or Title Results).
2. Case 3: There was no record of an autocomplete result (Displayed Name = null), yet the result was categorized as system-generated (Section = Suggestion Results).
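
The three removal rules above can be expressed as boolean masks in pandas. The sketch below is illustrative rather than the original preprocessing code: the function name `drop_inconsistent_rows` and the toy rows are assumptions, and it treats a Title Results row with a null Displayed Name as valid, since that combination is not listed among the inconsistent cases.

```python
import pandas as pd


def drop_inconsistent_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Remove rows whose null pattern contradicts their Section value."""
    no_query = df["Query Typed"].isna()
    no_name = df["Displayed Name"].isna()

    # Cases 1 and 2: no typed query, yet the Section implies user input.
    needs_query = df["Section"].isin(["Suggestion Results", "Title Results"])
    # Case 3: no displayed name, yet the Section is Suggestion Results.
    needs_name = df["Section"].eq("Suggestion Results")

    inconsistent = (no_query & needs_query) | (no_name & needs_name)
    return df.loc[~inconsistent].reset_index(drop=True)


# Toy rows for illustration only; these are not values from the real dataset.
sample = pd.DataFrame({
    "Section": ["Prequery Results", "Suggestion Results", "Title Results",
                "Suggestion Results", "Suggestion Results", "Title Results"],
    "Query Typed": [None, None, None, "ab", "ab", "abc"],
    "Displayed Name": [None, "abc", None, None, "abc", None],
})

cleaned = drop_inconsistent_rows(sample)
print(cleaned)
```

Keeping each rule as a named mask makes it straightforward to count how many rows each case removes before the filter is applied.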